OpenDAS / vllm_cscc
Commit 081b5594
Authored Sep 25, 2025 by Shu Wang; committed by GitHub, Sep 25, 2025

Fix routing_bias dtype (#25711)

Signed-off-by: Shu Wang. <shuw@nvidia.com>

parent 57329a8c

Showing 1 changed file with 4 additions and 1 deletion

vllm/model_executor/layers/quantization/modelopt.py (+4 -1)

@@ -1454,10 +1454,13 @@ class ModelOptNvFp4FusedMoE(FusedMoEMethodBase):
             routing_method_type = flashinfer.RoutingMethodType.DeepSeekV3
             if use_llama4_routing:
                 routing_method_type = flashinfer.RoutingMethodType.Llama4
+            routing_bias = e_score_correction_bias
+            if routing_bias is not None:
+                routing_bias = routing_bias.to(torch.bfloat16)
             out = flashinfer.fused_moe.trtllm_fp4_block_scale_moe(
                 routing_logits=router_logits
                 if use_llama4_routing else router_logits.to(torch.float32),
-                routing_bias=e_score_correction_bias,
+                routing_bias=routing_bias,
                 hidden_states=hidden_states_fp4,
                 hidden_states_scale=hidden_states_scale_linear_fp4.view(
                     torch.float8_e4m3fn).flatten(),
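
The guard this commit adds can be looked at in isolation. The sketch below is not code from vLLM: the helper name `prepare_routing_bias` is hypothetical, and it assumes only that PyTorch is installed. It shows the None-safe cast to `torch.bfloat16` that now runs before the bias is passed to `trtllm_fp4_block_scale_moe`; the commit itself does not state why the kernel needs that dtype.

```python
from typing import Optional

import torch


def prepare_routing_bias(
    e_score_correction_bias: Optional[torch.Tensor],
) -> Optional[torch.Tensor]:
    """Hypothetical helper mirroring the guard added in this commit."""
    routing_bias = e_score_correction_bias
    # The bias is optional, so only cast when one is actually present.
    if routing_bias is not None:
        routing_bias = routing_bias.to(torch.bfloat16)
    return routing_bias


# A float32 bias is converted; a missing bias passes through unchanged.
bias_fp32 = torch.zeros(8, dtype=torch.float32)
assert prepare_routing_bias(bias_fp32).dtype == torch.bfloat16
assert prepare_routing_bias(None) is None
```

When the bias is already bfloat16, `Tensor.to` returns the same tensor, so the cast adds no copy in that case.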