OpenDAS / vllm_cscc
Commit e8eb0490 (Unverified)
Authored Apr 24, 2026 by Netanel Haber
Committed by GitHub, Apr 24, 2026
[Bugfix][MoE] Unpad routed output before shared expert add [Fixes #35949] (#40794)
Signed-off-by: Netanel Haber <nhaber@nvidia.com>
Parent: e8ee2a78
Changes: 1 changed file with 6 additions and 0 deletions

vllm/model_executor/layers/fused_moe/runner/moe_runner.py (+6 -0)
@@ -550,10 +550,14 @@ class MoERunner(MoERunnerInterface):
             hidden_states
         )
 
+        # Record before `_maybe_pad_hidden_states` pads activations to match
+        # `moe_config.hidden_dim`, e.g. after `align_trtllm_fp4_moe_hidden_dim_for_fi`
+        routed_hidden_dim = hidden_states.shape[-1]
         hidden_states, og_hidden_dim = self._maybe_pad_hidden_states(
             shared_experts_input,
             hidden_states,
         )
+        hidden_dim_was_padded = hidden_states.shape[-1] > routed_hidden_dim
 
         result = self._forward_entry(
             hidden_states,
@@ -573,6 +577,8 @@ class MoERunner(MoERunnerInterface):
 
         # Extract outputs from result
         shared_output, fused_output = _unpack(result)
+        if hidden_dim_was_padded:
+            fused_output = fused_output[..., :routed_hidden_dim]
 
         # If combine kernel already reduced fused, reduce shared to match.
         # See note above re: the two all-reduce points.
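The change records the hidden width before padding and slices the routed output back to that width, so it can be added to the shared expert output. The sketch below is a minimal illustration of that idea, not the vLLM implementation: it uses plain Python lists instead of tensors, and `pad_last_dim`, `toy_routed_experts`, and `forward` are hypothetical names. It also assumes the shared expert output keeps the unpadded width, which the commit title suggests but the diff does not show.

```python
# Toy stand-in for the fix: plain lists instead of tensors.
# All function names here are hypothetical, not vLLM APIs.

def pad_last_dim(rows, target_dim):
    """Zero-pad every row to target_dim columns (no-op if already wide enough)."""
    return [row + [0.0] * max(0, target_dim - len(row)) for row in rows]


def toy_routed_experts(rows):
    """Placeholder for the routed MoE kernel: output width equals input width."""
    return [[2.0 * x for x in row] for row in rows]


def forward(hidden_states, shared_output, padded_dim):
    # Record the width before padding, as the commit does with `routed_hidden_dim`.
    routed_hidden_dim = len(hidden_states[0])
    padded = pad_last_dim(hidden_states, padded_dim)
    hidden_dim_was_padded = len(padded[0]) > routed_hidden_dim

    fused_output = toy_routed_experts(padded)
    if hidden_dim_was_padded:
        # List equivalent of `fused_output[..., :routed_hidden_dim]`.
        fused_output = [row[:routed_hidden_dim] for row in fused_output]

    # Widths must agree before the element-wise add with the shared output.
    assert len(fused_output[0]) == len(shared_output[0])
    return [
        [f + s for f, s in zip(f_row, s_row)]
        for f_row, s_row in zip(fused_output, shared_output)
    ]


hidden = [[1.0, 2.0, 3.0]]
shared = [[10.0, 20.0, 30.0]]
out = forward(hidden, shared, padded_dim=4)
print(out)
```

Without the slice, the routed output in this toy would have four columns and the shared output three, so the add would silently drop a column (with `zip`) or fail on a shape mismatch (with tensors). Comparing widths before and after padding keeps the unpadded path untouched.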