Commit 75f64d8b authored by Cody Yu, committed by GitHub

[Bugfix] Fix illegal memory access in FP8 MoE kernel (#6382)

parent 21b2dced
@@ -492,12 +492,14 @@ def fused_experts(hidden_states: torch.Tensor,
         if tokens_in_chunk == 0:
             break
 
-        if tokens_in_chunk < CHUNK_SIZE:
-            # will only happen in the last chunk
+        if tokens_in_chunk < CHUNK_SIZE and chunk > 0:
+            # Adjust the intermediate cache size and config for the last
+            # chunk. Note that in most cases we only have one chunk
+            # so the cache size and config are already set correctly and
+            # do not need to be adjusted.
             intermediate_cache1 = intermediate_cache1[:tokens_in_chunk]
             intermediate_cache2 = intermediate_cache2[:tokens_in_chunk]
             intermediate_cache3 = intermediate_cache3[:tokens_in_chunk]
-            # reload config to get better performance on the last chunk
             config = get_config_func(tokens_in_chunk)
 
         curr_topk_ids = topk_ids[begin_chunk_idx:end_chunk_idx]
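The effect of the added `chunk > 0` condition can be seen with a small, self-contained sketch of the chunk loop. This is not the vLLM code: `plan_chunks` is a hypothetical helper, and the loop bounds and index arithmetic are assumptions made for illustration, since the hunk above does not show them. It only records, for each chunk, whether the "adjust cache size and config" branch would run.

```python
def plan_chunks(num_tokens: int, chunk_size: int) -> list[tuple[int, int, bool]]:
    """Return (begin, end, adjusted) for each chunk of a guarded chunk loop.

    Hypothetical helper: loop bounds and index math are assumed, not taken
    from the diff. `adjusted` is True when the branch guarded by
    `tokens_in_chunk < CHUNK_SIZE and chunk > 0` would execute.
    """
    plan = []
    # Assumed iteration count: one more than the number of full chunks.
    for chunk in range(num_tokens // chunk_size + 1):
        begin_chunk_idx = chunk * chunk_size
        end_chunk_idx = min(begin_chunk_idx + chunk_size, num_tokens)
        tokens_in_chunk = end_chunk_idx - begin_chunk_idx

        if tokens_in_chunk == 0:
            break

        # With the guard, a single short chunk (chunk == 0) is left alone;
        # only a short trailing chunk after at least one full chunk adjusts.
        adjusted = tokens_in_chunk < chunk_size and chunk > 0
        plan.append((begin_chunk_idx, end_chunk_idx, adjusted))
    return plan


if __name__ == "__main__":
    print(plan_chunks(5, 8))   # single short chunk: no adjustment
    print(plan_chunks(20, 8))  # two full chunks, then a short adjusted one
```

Under these assumptions, an input that fits in one chunk never enters the adjustment branch, which matches the new comment's point that the cache size and config are already set correctly in the common single-chunk case.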
......