sglang
Commit 0ab3f437, authored May 08, 2025 by Trevor Morris. Committed by GitHub, May 08, 2025.
Cutlass MLA: Disable split kv due to https://github.com/NVIDIA/cutlass/issues/2274 (#6101)
Parent: cec98f10
Showing 2 changed files with 5 additions and 2 deletions:
sgl-kernel/csrc/attention/cutlass_mla_kernel.cu (+4 -1)
sgl-kernel/tests/test_cutlass_mla.py (+1 -1)
sgl-kernel/csrc/attention/cutlass_mla_kernel.cu
@@ -151,7 +151,10 @@ typename T::Fmha::Arguments args_from_options(
        page_size},
       {static_cast<ElementOut*>(out.data_ptr()), stride_O, static_cast<ElementAcc*>(nullptr), stride_LSE},
       hw_info,
-      -1,       // split_kv
+      // TODO(trevor-m): Change split_kv back to -1 when
+      // https://github.com/NVIDIA/cutlass/issues/2274 is fixed. Split_kv=1 will
+      // perform worse with larger context length and smaller batch sizes.
+      1,        // split_kv
       nullptr,  // is_var_split_kv
   };
   // TODO(kaixih@nvidia): When split_kv=-1 and is_var_split_kv=false, we compute
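The TODO in the hunk above notes that split_kv=1 is expected to perform worse with larger context lengths and smaller batch sizes. As background only, here is a minimal pure-Python sketch of the general split-KV idea: attention for one query is computed over separate chunks of the KV sequence, and the partial results are merged using each chunk's log-sum-exp. This is not the CUTLASS kernel. The function names `attend` and `attend_split_kv` are hypothetical, and the chunking scheme is an assumption made for illustration.

```python
import math


def attend(q, keys, values):
    """Single-query softmax attention over one KV chunk.

    Returns (output vector, log-sum-exp of the scores).
    """
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)]
    return out, m + math.log(total)


def attend_split_kv(q, keys, values, split_kv):
    """Same attention, with the KV sequence cut into up to `split_kv` chunks.

    Hypothetical helper: each chunk could be processed independently
    (for example by a different thread block), then merged.
    """
    n = len(keys)
    chunk = (n + split_kv - 1) // split_kv  # ceil(n / split_kv)
    partials = [
        attend(q, keys[i:i + chunk], values[i:i + chunk])
        for i in range(0, n, chunk)
    ]
    # Merge: weight each partial output by exp(lse_chunk - lse_max).
    lse_max = max(lse for _, lse in partials)
    scales = [math.exp(lse - lse_max) for _, lse in partials]
    denom = sum(scales)
    dim = len(values[0])
    return [
        sum(s * out[j] for s, (out, _) in zip(scales, partials)) / denom
        for j in range(dim)
    ]
```

With `split_kv=1` the whole sequence is handled as a single chunk, so a long context offers no extra parallelism along the KV dimension. That is consistent with the performance caveat in the TODO, though the commit itself does not spell out the mechanism.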
sgl-kernel/tests/test_cutlass_mla.py
@@ -67,7 +67,7 @@ def test_cutlass_mla_decode(
     pack_factor = 128 // block_size
     block_num = ((block_num + pack_factor - 1) // pack_factor) * pack_factor
-    q = torch.randn(bs, h_q, d)
+    q = torch.randn(bs, h_q, d) * 100.0
     block_table = torch.randint(0, bs * block_num, (bs, block_num), dtype=torch.int32)
     kv_cache = torch.randn(block_table.numel(), block_size, d)
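The context lines of the test hunk round `block_num` up to a multiple of `pack_factor`. A small standalone sketch of that arithmetic follows. The helper name `round_up_block_num` and the parameter name `unit` are hypothetical. The value 128 is taken from the test, and the diff does not say what it represents.

```python
def round_up_block_num(block_num, block_size, unit=128):
    """Round block_num up to the next multiple of pack_factor.

    Mirrors the two context lines in the test. `unit` stands in for the
    literal 128 used there (hypothetical name). Assumes block_size <= unit.
    """
    pack_factor = unit // block_size
    # Integer ceil-division, then scale back up to a multiple of pack_factor.
    return ((block_num + pack_factor - 1) // pack_factor) * pack_factor


if __name__ == "__main__":
    for block_num, block_size in [(5, 32), (8, 32), (3, 64), (1, 128)]:
        print(block_num, block_size, round_up_block_num(block_num, block_size))
```

For example, with `block_size=32` the pack factor is 4, so a `block_num` of 5 is padded up to 8, while 8 stays at 8.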