OpenDAS / vllm_cscc
Commit 12449488 (Unverified)
Authored Sep 30, 2025 by Wentao Ye
Committed by GitHub, Sep 30, 2025

[Log] Optimize Log for FP8MOE (#25709)

Signed-off-by: yewentao256 <zhyanwentao@126.com>

Parent: a73f6491
Showing 1 changed file with 4 additions and 4 deletions
vllm/model_executor/layers/quantization/fp8.py
@@ -467,7 +467,8 @@ class Fp8MoEMethod(FusedMoEMethodBase):
                 logger.info_once("DeepGemm disabled: FlashInfer MOE is"
                                  " enabled.")
             elif (is_deep_gemm_supported()):
-                logger.info_once("Using DeepGemm kernels for Fp8MoEMethod.")
+                logger.debug_once(
+                    "DeepGemm kernels available for Fp8MoEMethod.")
                 self.allow_deep_gemm = True
             else:
                 logger.warning_once(
@@ -481,9 +482,8 @@ class Fp8MoEMethod(FusedMoEMethodBase):
         elif (current_platform.is_cuda()
               and current_platform.is_device_capability(100)
               and not self.flashinfer_moe_backend):
-            logger.info_once(
-                "Using CutlassBlockScaledGroupedGemm kernels for Fp8 MOE "
-                "on SM100.")
+            logger.debug_once(
+                "CutlassBlockScaledGroupedGemm available for Fp8MoEMethod.")
             self.allow_cutlass_block_scaled_grouped_gemm = True
 
     def create_weights(self, layer: Module, num_experts: int, hidden_size: int,
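
The diff demotes two "kernel available" messages from `info_once` to `debug_once`, so they appear only when debug logging is enabled. As a rough illustration of how such a log-once helper can behave, here is a minimal sketch built on the standard library. It is not vLLM's logger: the logger name `fp8_moe_demo` and the helper `_log_once` are hypothetical, and only the `info_once` / `debug_once` names mirror the calls in the diff.

```python
import functools
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("fp8_moe_demo")


@functools.lru_cache(maxsize=None)
def _log_once(level: int, msg: str) -> None:
    # lru_cache remembers each (level, msg) pair, so this body runs
    # only on the first call with a given pair.
    logger.log(level, msg)


def info_once(msg: str) -> None:
    # Emitted at most once per message when the logger level is INFO or lower.
    _log_once(logging.INFO, msg)


def debug_once(msg: str) -> None:
    # Emitted at most once per message, and only when DEBUG is enabled.
    _log_once(logging.DEBUG, msg)
```

With the logger at INFO level, `debug_once("DeepGemm kernels available for Fp8MoEMethod.")` emits nothing, while a repeated `info_once(...)` call emits a single record. One caveat of this particular sketch: a message first passed to `debug_once` while DEBUG is disabled is still cached, so it will not be emitted later even if the level is lowered.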