[Core] Optimizing cross-attention `QKVParallelLinear` computation (#12325)
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: NickLucche <nick@nlucches-4xa100.c.openshift-330514.internal>
Co-authored-by: NickLucche <nick@nlucches-4xa100.c.openshift-330514.internal>