Unverified commit c5d2b01c, authored by zk-lover, committed by GitHub

[LongCat] Optimize zero_experts_compute_triton by changing mask (#10303)

parent 46ccbed2
@@ -1416,7 +1416,7 @@ def zero_experts_compute_triton(
 zero_expert_scales[zero_expert_mask] = 0.0
 normal_expert_mask = expert_indices >= num_experts
-expert_indices[normal_expert_mask] = 0
+expert_indices[normal_expert_mask] = -1
 expert_scales[normal_expert_mask] = 0.0
 output = torch.zeros_like(hidden_states).to(hidden_states.device)
......
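
The one-line change above swaps the value written into masked slots of `expert_indices` from `0` to `-1`. A minimal sketch of what that does to the routing table follows. It uses plain Python lists instead of torch tensors so that it runs anywhere, and the helper names `mask_out_of_range` and `slots_pointing_at` are hypothetical, not part of the patched file. The diff does not show the downstream code, so the reading that a later kernel can skip `-1` entries is an assumption drawn from the commit title, not something the diff confirms.

```python
def mask_out_of_range(expert_indices, expert_scales, num_experts, sentinel):
    """Mimic the masking step from the diff on plain lists (hypothetical helper).

    Every slot whose index is >= num_experts gets its index replaced by
    `sentinel` and its scale set to 0.0; all other slots are left unchanged.
    """
    new_indices, new_scales = [], []
    for idx, scale in zip(expert_indices, expert_scales):
        if idx >= num_experts:
            new_indices.append(sentinel)
            new_scales.append(0.0)
        else:
            new_indices.append(idx)
            new_scales.append(scale)
    return new_indices, new_scales


def slots_pointing_at(expert_id, indices):
    """Count how many routing slots still name `expert_id` (hypothetical helper)."""
    return sum(1 for idx in indices if idx == expert_id)


# Illustrative routing table: indices 4 and 5 fall outside num_experts = 4.
indices = [0, 5, 2, 4, 1]
scales = [0.5, 0.2, 0.1, 0.1, 0.1]
num_experts = 4

# Before the commit: masked slots are rewritten to index 0.
old_indices, old_scales = mask_out_of_range(indices, scales, num_experts, sentinel=0)
# After the commit: masked slots are rewritten to index -1.
new_indices, new_scales = mask_out_of_range(indices, scales, num_experts, sentinel=-1)

print(old_indices, slots_pointing_at(0, old_indices))
print(new_indices, slots_pointing_at(0, new_indices))
```

In both variants the masked scales are zero, so the weighted sum of expert outputs is the same. The difference is that with `0` the masked slots still look like valid assignments to expert 0, while `-1` is not a valid expert id, which is what would let later code tell them apart from real assignments.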