Commit 86a32615 (Unverified)
Authored Dec 12, 2025 by Matthew Bonanni
Committed by GitHub, Dec 13, 2025
[Bugfix] Pass FA version in `MultiHeadAttention` (#30575)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Parent: 08f8a562
Showing 1 changed file with 10 additions and 0 deletions
vllm/attention/layer.py (+10, -0)
@@ -2,6 +2,7 @@
 # SPDX-FileCopyrightText: Copyright contributors to the vLLM project
 """Attention layer."""
+import functools
 from collections.abc import Callable
 from typing import cast
@@ -17,6 +18,7 @@ from vllm.attention.backends.abstract import (
 )
 from vllm.attention.backends.registry import AttentionBackendEnum
 from vllm.attention.selector import get_attn_backend
+from vllm.attention.utils.fa_utils import get_flash_attn_version
 from vllm.attention.utils.kv_sharing_utils import validate_kv_sharing_target
 from vllm.attention.utils.kv_transfer_utils import maybe_transfer_kv_layer
 from vllm.config import CacheConfig, get_current_vllm_config
@@ -524,6 +526,14 @@ class MultiHeadAttention(nn.Module):
             AttentionBackendEnum.ROCM_AITER_FA,
         }
+        self.fa_version = None
+        if self.attn_backend == AttentionBackendEnum.FLASH_ATTN:
+            self.fa_version = get_flash_attn_version()
+            assert self._flash_attn_varlen_func is not None
+            self._flash_attn_varlen_func = functools.partial(
+                self._flash_attn_varlen_func, fa_version=self.fa_version
+            )
         logger.info_once(
             f"Using {self.attn_backend} for MultiHeadAttention in multimodal encoder."
         )
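The last hunk uses `functools.partial` to bind the `fa_version` keyword once, when the layer is constructed, so later calls to the attention function do not need to pass it. A minimal sketch of that pattern follows. It assumes a hypothetical stand-in function, `varlen_attention_stub`, and a made-up version value of 3; neither is vLLM's actual kernel wrapper or the value returned by `get_flash_attn_version()`.

```python
import functools


def varlen_attention_stub(q, k, v, *, fa_version=None):
    """Hypothetical stand-in for a FlashAttention varlen wrapper.

    It only echoes its arguments so the bound keyword can be inspected.
    """
    return {"inputs": (q, k, v), "fa_version": fa_version}


def bind_fa_version(attn_func, fa_version):
    """Return attn_func with the fa_version keyword pre-filled."""
    # Mirrors the assert in the diff: the function must be set before binding.
    assert attn_func is not None
    return functools.partial(attn_func, fa_version=fa_version)


# Bind once at setup time (3 is an illustrative value only).
attn = bind_fa_version(varlen_attention_stub, fa_version=3)

# Call sites pass only the tensors; the version travels with the partial.
result = attn("q", "k", "v")
```

Binding at construction keeps the call sites unchanged, which fits a bugfix that adds lines without deleting any.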