sglang
Commit e98d9346 (Unverified)
Authored Sep 28, 2025 by Lifu Huang; committed by GitHub on Sep 28, 2025
[1/2] Support FA4 for MHA Prefill in sgl-kernel (#10940)
parent 0c917410
Changes: 5 changed files with 41 additions and 5 deletions
sgl-kernel/pyproject.toml  +1 -1
sgl-kernel/pyproject_cpu.toml  +1 -1
sgl-kernel/pyproject_rocm.toml  +1 -1
sgl-kernel/python/sgl_kernel/flash_attn.py  +37 -1
sgl-kernel/python/sgl_kernel/version.py  +1 -1
sgl-kernel/pyproject.toml
@@ -8,7 +8,7 @@ build-backend = "scikit_build_core.build"
 [project]
 name = "sgl-kernel"
-version = "0.3.12"
+version = "0.3.13"
 description = "Kernel Library for SGLang"
 readme = "README.md"
 requires-python = ">=3.10"
sgl-kernel/pyproject_cpu.toml
@@ -8,7 +8,7 @@ build-backend = "scikit_build_core.build"
 [project]
 name = "sgl-kernel"
-version = "0.3.12"
+version = "0.3.13"
 description = "Kernel Library for SGLang"
 readme = "README.md"
 requires-python = ">=3.10"
sgl-kernel/pyproject_rocm.toml
@@ -9,7 +9,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "sgl-kernel"
-version = "0.3.12"
+version = "0.3.13"
 description = "Kernel Library for SGLang"
 readme = "README.md"
 requires-python = ">=3.10"
sgl-kernel/python/sgl_kernel/flash_attn.py
@@ -153,7 +153,43 @@ def flash_attn_with_kvcache(
             normalization factor).
     """
     if ver == 4:
-        raise NotImplementedError("haven't implemented flash_attn_with_kvcache for fa4")
+        assert (
+            flash_attn_varlen_func_v4 is not None
+        ), "FA4 is not available, please check your installation."
+        # Using `(-1, -1)` as no sliding window causes correctness issues for FA4.
+        assert (
+            k is None and v is None
+        ), "FA4 does not support updating KV cache in-place."
+        assert (
+            rotary_cos is None
+            and rotary_sin is None
+            and rotary_interleaved is None
+            and rotary_seqlens is None
+        ), "FA4 does not support rotary embedding."
+        assert (
+            cache_batch_idx is None and cache_leftpad is None
+        ), "FA4 does not support non-consecutive batch indices or left padding."
+        assert (
+            q_descale is None and k_descale is None and v_descale is None
+        ), "FA4 does not support descale."
+        if window_size == (-1, -1):
+            window_size = (None, None)
+        return flash_attn_varlen_func_v4(
+            q=q,
+            k=k_cache,
+            v=v_cache,
+            cu_seqlens_q=cu_seqlens_q,
+            seqused_k=cache_seqlens,
+            softmax_scale=softmax_scale,
+            causal=causal,
+            window_size=window_size,
+            softcap=softcap,
+            pack_gqa=pack_gqa,
+            return_softmax_lse=return_softmax_lse,
+            learnable_sink=sinks,
+            page_table=page_table,
+        )
 
     assert k_cache.stride(-1) == 1, "k_cache must have contiguous last dimension"
     assert v_cache.stride(-1) == 1, "v_cache must have contiguous last dimension"
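The FA4 branch above does two things before dispatching: it rejects arguments the FA4 kernel cannot honor, and it rewrites the `(-1, -1)` "no sliding window" sentinel to `(None, None)`. A minimal standalone sketch of that pattern follows. The helper names `normalize_window_size` and `check_unsupported` are hypothetical and not part of sgl-kernel, and the sketch raises `ValueError` where the commit uses `assert`, since `assert` statements are skipped when Python runs with `-O`.

```python
from typing import Optional, Tuple

Window = Tuple[Optional[int], Optional[int]]


def normalize_window_size(window_size: Tuple[int, int]) -> Window:
    """Map the (-1, -1) 'no sliding window' sentinel to (None, None).

    Hypothetical helper mirroring the `if window_size == (-1, -1)` step
    in the diff; any other window is passed through unchanged.
    """
    if window_size == (-1, -1):
        return (None, None)
    return window_size


def check_unsupported(**kwargs: object) -> None:
    """Raise if any argument the FA4 path cannot honor was supplied.

    Hypothetical helper: every keyword whose value is not None is
    treated as an unsupported feature request.
    """
    supplied = sorted(name for name, value in kwargs.items() if value is not None)
    if supplied:
        raise ValueError(f"FA4 path does not support: {', '.join(supplied)}")


if __name__ == "__main__":
    print(normalize_window_size((-1, -1)))
    print(normalize_window_size((128, 0)))
    check_unsupported(k=None, v=None, rotary_cos=None)  # nothing supplied, passes
```

Collecting the unsupported names into one error reports every offending argument at once, whereas the chained asserts in the commit stop at the first failing group.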
sgl-kernel/python/sgl_kernel/version.py
-__version__ = "0.3.12"
+__version__ = "0.3.13"
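This commit bumps the same version string in four places (three `pyproject*.toml` files and `version.py`). As a rough sketch of how such strings could be checked for agreement, the snippet below extracts the version from file contents passed in as text. The function names and the regular expression are assumptions for illustration, not tooling from the sglang repository.

```python
import re
from typing import Iterable

# Matches `version = "X"` or `__version__ = "X"` at the start of a line.
VERSION_RE = re.compile(
    r'^\s*(?:__version__|version)\s*=\s*"([^"]+)"', re.MULTILINE
)


def extract_version(text: str) -> str:
    """Return the first version string assigned in `text`."""
    match = VERSION_RE.search(text)
    if match is None:
        raise ValueError("no version assignment found")
    return match.group(1)


def versions_agree(texts: Iterable[str]) -> bool:
    """True when every text declares the same version string."""
    return len({extract_version(t) for t in texts}) == 1


if __name__ == "__main__":
    pyproject = '[project]\nname = "sgl-kernel"\nversion = "0.3.13"\n'
    version_py = '__version__ = "0.3.13"\n'
    print(versions_agree([pyproject, version_py]))
```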