sglang, commit 4418f599 (Unverified)
Authored Apr 22, 2025 by JieXin Liang
Committed by GitHub, Apr 22, 2025

Fix FA3 DeepSeek prefill performance regression (#5624)

Co-authored-by: ispobock <ispobaoke@gmail.com>

Parent: 04f2abcb

Showing 1 changed file with 6 additions and 2 deletions

python/sglang/srt/models/deepseek_v2.py (+6 -2)
@@ -583,13 +583,17 @@ class DeepseekV2AttentionMLA(nn.Module):
                 return AttnForwardMethod.MLA
         elif self.attention_backend == "fa3":
             # Flash Attention: Use MHA with chunked KV cache when prefilling on long sequences.
+            if forward_batch.extend_prefix_lens_cpu is not None:
+                sum_extend_prefix_lens = sum(forward_batch.extend_prefix_lens_cpu)
             if (
                 forward_batch.forward_mode.is_extend()
                 and not self.disable_chunked_prefix_cache
                 and not forward_batch.forward_mode.is_target_verify()
                 and not forward_batch.forward_mode.is_draft_extend()
-                and sum(forward_batch.extend_prefix_lens_cpu)
-                >= self.chunked_prefix_cache_threshold
+                and (
+                    sum_extend_prefix_lens >= self.chunked_prefix_cache_threshold
+                    or sum_extend_prefix_lens == 0
+                )
             ):
                 return AttnForwardMethod.MHA_CHUNKED_KV
             else:
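
The effect of the new condition can be checked in isolation. The following is a minimal standalone sketch, not the sglang implementation: `PrefillState`, `pick_fa3_method`, and the threshold value 8192 are hypothetical names and numbers introduced for illustration. The fallback to `MLA` is also an assumption, because the diff elides the body of the `else` branch, and the sketch additionally treats a missing `extend_prefix_lens_cpu` as "do not use the chunked path", which the commit itself does not state.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional


class AttnForwardMethod(Enum):
    # Stand-in for the enum named in the diff; only two members are modeled.
    MLA = auto()
    MHA_CHUNKED_KV = auto()


@dataclass
class PrefillState:
    # Hypothetical flattened view of the flags the diff reads from
    # forward_batch.forward_mode, forward_batch, and self.
    is_extend: bool
    is_target_verify: bool = False
    is_draft_extend: bool = False
    disable_chunked_prefix_cache: bool = False
    extend_prefix_lens_cpu: Optional[List[int]] = None


def pick_fa3_method(state: PrefillState, threshold: int) -> AttnForwardMethod:
    # Assumption: with no prefix lengths available, fall back to MLA.
    if state.extend_prefix_lens_cpu is None:
        return AttnForwardMethod.MLA

    sum_extend_prefix_lens = sum(state.extend_prefix_lens_cpu)
    use_chunked = (
        state.is_extend
        and not state.disable_chunked_prefix_cache
        and not state.is_target_verify
        and not state.is_draft_extend
        # New in this commit: a zero-length prefix also selects the chunked path.
        and (sum_extend_prefix_lens >= threshold or sum_extend_prefix_lens == 0)
    )
    # Assumption: the elided else branch is modeled as returning MLA.
    return AttnForwardMethod.MHA_CHUNKED_KV if use_chunked else AttnForwardMethod.MLA


if __name__ == "__main__":
    for lens in ([0, 0], [100, 200], [5000, 4000]):
        state = PrefillState(is_extend=True, extend_prefix_lens_cpu=lens)
        print(lens, pick_fa3_method(state, threshold=8192).name)
```

Under these assumptions, an extend batch with no cached prefix (sum of 0) now selects `MHA_CHUNKED_KV`, whereas the pre-change condition selected it only when the summed prefix length reached the threshold.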