Commit 17f39880 authored by Varun Sundar Rabindranath, committed by Kevin H. Luu

[BugFix] Workspace allocation during profile run : DeepEPHighThroughput + DeepGEMM (#30899)

(cherry picked from commit e3fc374a)
parent 682c3858
...
@@ -795,7 +795,10 @@ class FusedMoEModularKernel(torch.nn.Module):
                 top_k,
                 global_num_experts,
                 local_num_experts,
-                expert_tokens_meta,
+                # expert_tokens_meta help in allocating optimal/minimal
+                # amount of workspace. Mark it None, so we allocate for
+                # the worst-case scenario.
+                expert_tokens_meta=None,
             )
         )
...
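
The comment in the hunk above says that per-expert token metadata lets the kernel size its workspace tightly, and that passing None forces a worst-case allocation. A minimal sketch of that idea follows. It is not vLLM's actual sizing logic: `workspace_rows`, its parameters, and the `num_tokens * top_k` bound are hypothetical and only illustrate why dropping the metadata yields a safe upper bound.

```python
from typing import Optional, Sequence


def workspace_rows(
    num_tokens: int,
    top_k: int,
    tokens_per_expert: Optional[Sequence[int]] = None,
) -> int:
    """Hypothetical helper: rows of workspace to reserve for one MoE call."""
    if tokens_per_expert is None:
        # No metadata: assume every token is routed to top_k experts.
        return num_tokens * top_k
    # With metadata: size only for the tokens actually routed locally.
    return sum(tokens_per_expert)


# Illustrative case: the metadata reports only a few routed tokens.
with_meta = workspace_rows(num_tokens=8, top_k=2, tokens_per_expert=[1, 0, 2, 0])
# Same call without metadata, mirroring expert_tokens_meta=None in the diff.
worst_case = workspace_rows(num_tokens=8, top_k=2)

print(with_meta, worst_case)
```

In this simplified model, a buffer sized from the metadata of one particular run could be too small for a later batch that routes more tokens to the local experts, whereas the metadata-free size covers any routing of the same batch.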