Commit 2f6bacee authored by Cheng Wan, committed by GitHub

[moe] fix: correct the cache size in the last chunk (#3679)


Co-authored-by: Abatom <abzhonghua@gmail.com>
parent 40148041
@@ -1064,7 +1064,9 @@ def fused_experts_impl(
 # so the cache size and config are already set correctly and
 # do not need to be adjusted.
 intermediate_cache1 = intermediate_cache1[:tokens_in_chunk]
-intermediate_cache2 = intermediate_cache2[:tokens_in_chunk]
+intermediate_cache2 = intermediate_cache2[
+    : tokens_in_chunk * topk_ids.shape[1]
+]
 intermediate_cache3 = intermediate_cache3[:tokens_in_chunk]
 config = get_config_func(tokens_in_chunk)
......
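
The effect of the change can be illustrated with a minimal sketch. It assumes, as the new slice bound implies, that `intermediate_cache2` holds one row per (token, selected expert) pair while the other two caches hold one row per token. The helper `slice_caches_for_last_chunk`, the list-based stand-ins for tensors, and the sizes are hypothetical and are not taken from the repository.

```python
def slice_caches_for_last_chunk(cache1, cache2, cache3, tokens_in_chunk, topk):
    """Trim preallocated caches to fit a final chunk that is shorter than the rest.

    Assumption: cache1 and cache3 have one row per token, while cache2 has
    one row per (token, expert) pair, i.e. tokens * topk rows.
    """
    cache1 = cache1[:tokens_in_chunk]
    cache2 = cache2[: tokens_in_chunk * topk]  # the corrected bound
    cache3 = cache3[:tokens_in_chunk]
    return cache1, cache2, cache3


# Hypothetical sizes: full chunks hold 8 tokens, each routed to 2 experts.
chunk_size, topk = 8, 2
cache1 = [[None] * topk for _ in range(chunk_size)]
cache2 = [None] * (chunk_size * topk)
cache3 = [[None] * topk for _ in range(chunk_size)]

# The last chunk holds only 3 tokens.
tokens_in_chunk = 3
c1, c2, c3 = slice_caches_for_last_chunk(cache1, cache2, cache3, tokens_in_chunk, topk)

rows_needed = tokens_in_chunk * topk
old_slice = cache2[:tokens_in_chunk]  # bound used before the fix

print(len(c2), rows_needed)  # corrected slice matches the rows needed
print(len(old_slice))        # old slice is too short whenever topk > 1
```

Under this assumption the two bounds coincide only when `topk_ids.shape[1]` is 1; for any larger top-k the old slice leaves `intermediate_cache2` with fewer rows than the last chunk writes.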