OpenDAS / vllm_cscc
Commit 1e60f87b (Unverified)
Authored Jan 22, 2025 by Jinzhen Lin
Committed by GitHub, Jan 21, 2025
[Kernel] fix moe_align_block_size error condition (#12239)
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Parent: 9705b90b
Showing 1 changed file with 6 additions and 4 deletions
csrc/moe/moe_align_sum_kernels.cu
...
...
@@ -234,14 +234,16 @@ void moe_align_block_size(torch::Tensor topk_ids, int64_t num_experts,
   bool use_global_memory = false;
   bool use_i16 = false;  // Use uint16_t for shared memory token counts
-  if (shared_mem_i16 > device_max_shared_mem) {
-    use_global_memory = true;
-  } else if (shared_mem_i32 > device_max_shared_mem &&
+  if (shared_mem_i32 < device_max_shared_mem) {
+    // Do nothing in this case. We're all set to use int32_t token counts
+  } else if (shared_mem_i16 < device_max_shared_mem &&
              topk_ids.numel() <= 65535) {
     // when nelements of topk_ids is smaller than 65535 (max value of uint16),
     // element value of token_cnts would also smaller than 65535,
     // so we can use uint16 as dtype of token_cnts
     use_i16 = true;
+  } else {
+    use_global_memory = true;
   }
   if (use_global_memory) {
...
...
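The effect of the changed condition can be seen by isolating the branch logic in a small host-side sketch. The names `TokenCountStorage`, `choose_storage_old`, and `choose_storage_new` are hypothetical and do not appear in the commit; the shared-memory sizes are passed in as plain byte counts rather than computed the way the kernel launcher does.

```cpp
#include <cstdint>

// Hypothetical enum for illustration; the commit itself uses the two flags
// use_global_memory and use_i16 instead.
enum class TokenCountStorage { SharedInt32, SharedUint16, GlobalMemory };

// Branch structure before the commit, transcribed from the removed lines.
TokenCountStorage choose_storage_old(int64_t shared_mem_i32,
                                     int64_t shared_mem_i16,
                                     int64_t device_max_shared_mem,
                                     int64_t numel) {
  if (shared_mem_i16 > device_max_shared_mem) {
    return TokenCountStorage::GlobalMemory;
  } else if (shared_mem_i32 > device_max_shared_mem && numel <= 65535) {
    return TokenCountStorage::SharedUint16;
  }
  // Falls through to int32 even when shared_mem_i32 does not fit.
  return TokenCountStorage::SharedInt32;
}

// Branch structure after the commit, transcribed from the added lines.
TokenCountStorage choose_storage_new(int64_t shared_mem_i32,
                                     int64_t shared_mem_i16,
                                     int64_t device_max_shared_mem,
                                     int64_t numel) {
  if (shared_mem_i32 < device_max_shared_mem) {
    return TokenCountStorage::SharedInt32;
  } else if (shared_mem_i16 < device_max_shared_mem && numel <= 65535) {
    return TokenCountStorage::SharedUint16;
  }
  // Anything that does not fit in shared memory goes to global memory.
  return TokenCountStorage::GlobalMemory;
}
```

With illustrative values of a 49152-byte limit, 60000 bytes needed for int32 counts, 30000 bytes for uint16 counts, and 100000 elements in `topk_ids`, the old logic selects int32 counts in shared memory even though they do not fit, while the new logic falls back to global memory.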