OpenDAS / FlashMLA
Commit 38421051, authored Jan 29, 2026 by zhanghj2
Reduce LDS usage, improve parallelism
Parent: 6d68e3d1
Showing 1 changed file with 1 addition and 1 deletion
csrc/sm90/decode/sparse_fp8/splitkv_mla.cuh
@@ -725,7 +725,7 @@ void KernelTemplate<MODEL_TYPE, NUM_HEADS>::run(const SparseAttnDecodeParams &pa
         KU_ASSERT(params.stride_kv_row == 656);   // number of bytes per token (512 fp8 + 4 float32 + 64 bfloat16)
     }
     auto mla_kernel = &flash_fwd_splitkv_mla_fp8_sparse_kernel<KernelTemplate<MODEL_TYPE, NUM_HEADS>>;
-    constexpr size_t smem_size = sizeof(SharedMemoryPlan);
+    constexpr size_t smem_size = 32768; // LDS reuse
     // zhj debug
     // printf("NUM_M_BLOCKS = %d smem_size = %d \n",NUM_M_BLOCKS, smem_size);
     mla_kernel<<<dim3(NUM_M_BLOCKS, params.s_q, params.num_sm_parts), NUM_THREADS, smem_size, params.stream>>>(params);
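
The 656 in the `KU_ASSERT` context line follows from the per-token layout named in its comment. A minimal sketch of that arithmetic in plain C++ is shown here; the constant and function names are hypothetical and do not come from the FlashMLA source, and only the element counts are taken from the comment.

```cpp
#include <cstddef>

// Hypothetical names for illustration; element counts come from the diff comment.
constexpr std::size_t kNumFp8      = 512; // fp8 values, 1 byte each
constexpr std::size_t kNumFloat32  = 4;   // float32 values, 4 bytes each
constexpr std::size_t kNumBfloat16 = 64;  // bfloat16 values, 2 bytes each

// Bytes occupied by one token's KV row under the layout above.
constexpr std::size_t kv_row_bytes() {
    return kNumFp8 * 1 + kNumFloat32 * 4 + kNumBfloat16 * 2;
}

// 512 + 16 + 128 = 656, matching the asserted stride_kv_row.
static_assert(kv_row_bytes() == 656, "unexpected bytes per token");
```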
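
The commit message ties the smaller launch-time shared-memory (LDS) request to higher parallelism. One plausible reading is that fewer bytes per block let more blocks stay resident on a compute unit at once. The sketch below shows only that arithmetic; the 64 KiB capacity and the 49152-byte "before" figure are assumptions for illustration, since the page states neither the hardware limit nor the value of `sizeof(SharedMemoryPlan)`.

```cpp
#include <cstddef>

// Upper bound on resident blocks per compute unit imposed by LDS alone.
// Registers, thread limits and other occupancy constraints are ignored.
// lds_per_block must be nonzero.
constexpr std::size_t blocks_limited_by_lds(std::size_t lds_per_cu,
                                            std::size_t lds_per_block) {
    return lds_per_cu / lds_per_block;
}

// Assumed capacity, for illustration only (not stated in the commit).
constexpr std::size_t kAssumedLdsPerCu = 64 * 1024;

// New launch size from the diff: 32768 bytes per block.
static_assert(blocks_limited_by_lds(kAssumedLdsPerCu, 32768) == 2, "");

// Hypothetical larger request, standing in for the unknown old plan size.
static_assert(blocks_limited_by_lds(kAssumedLdsPerCu, 49152) == 1, "");
```

Because the new size is a literal rather than `sizeof(SharedMemoryPlan)`, the kernel must not touch more than 32768 bytes of dynamic shared memory; the "LDS reuse" comment suggests regions of the plan are overlapped to make that hold, but the page does not show how.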