OpenDAS / vllm_cscc
Commit 3c900b76
Authored Mar 17, 2026 by fanwl
Add import check for FA unified attn
Parent: eb35ba1b
Showing 1 changed file with 8 additions and 3 deletions
vllm/v1/attention/ops/triton_unified_attention.py
@@ -13,9 +13,10 @@ from vllm.logger import init_logger
 from vllm.platforms import current_platform
 from vllm.triton_utils import tl, triton
 from vllm import envs
-from flash_attn import (
-    varlen_fwd_unified,
-)
+try:
+    from flash_attn import varlen_fwd_unified
+except Exception:
+    varlen_fwd_unified = None

 logger = init_logger(__name__)
 float8_info = torch.finfo(current_platform.fp8_dtype())
...
...
@@ -1045,6 +1046,10 @@ def unified_attention(
             USE_FP8=output_scale is not None,
         )
     else:
+        if varlen_fwd_unified is None:
+            raise RuntimeError(
+                "flash_attn.varlen_fwd_unified is not available in this flash-attn version"
+            )
         # print("Running FA kernel")
         varlen_fwd_unified(
             q=q,
...
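Review note: the two hunks together form an optional-import guard. The import is allowed to fail at module load, and the failure is reported only when the FA code path is actually taken. Below is a minimal, standalone sketch of that pattern. The helpers `optional_attr` and `require` are hypothetical names used for illustration and are not part of vLLM or flash-attn. The sketch catches only `ImportError`, whereas the commit catches `Exception`, which also swallows other load-time failures.

```python
import importlib


def optional_attr(module_name, attr):
    """Return module.attr, or None if the module or the attribute is missing.

    Hypothetical helper: mirrors the try/except around the import in the diff.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


def require(fn, name):
    """Return fn, or raise RuntimeError if the optional import failed.

    Hypothetical helper: mirrors the `is None` check before the kernel call.
    """
    if fn is None:
        raise RuntimeError(f"{name} is not available in this environment")
    return fn


# Resolved once at import time; stays None when flash_attn is not installed
# or does not export varlen_fwd_unified.
varlen_fwd_unified = optional_attr("flash_attn", "varlen_fwd_unified")


def run_fa_kernel(**kwargs):
    # The error surfaces only if this code path is reached.
    kernel = require(varlen_fwd_unified, "flash_attn.varlen_fwd_unified")
    return kernel(**kwargs)
```

With this shape, importing the module succeeds on installations without a matching flash-attn build, and only callers that reach the FA branch see the `RuntimeError`.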