OpenDAS / vllm_cscc
Commit 8a74165f, authored Apr 24, 2026 by 王敏
w4a8: use deepgemm's masked interface by default
parent d04137d6
Changes: 1 changed file with 2 additions and 2 deletions
vllm/model_executor/layers/fused_moe/batched_deep_gemm_moe.py  +2 -2
vllm/model_executor/layers/fused_moe/batched_deep_gemm_moe.py
...
@@ -41,9 +41,9 @@ from vllm.model_executor.layers.activation import SiluAndMul
 from lightop import fuse_silu_mul_quant_ep, m_grouped_w4a8_gemm_nt_masked
 from lmslim.layers.gemm.int8_utils import per_token_quant_int8
 if has_deep_gemm():
-    from deepgemm import m_grouped_w8a8_gemm_nt_masked
+    from deepgemm import m_grouped_w8a8_gemm_nt_masked, m_grouped_w4a8_gemm_nt_masked
 else:
-    from lightop import m_grouped_w8a8_gemm_nt_masked
+    from lightop import m_grouped_w8a8_gemm_nt_masked, m_grouped_w4a8_gemm_nt_masked
...
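The hunk above selects a kernel by rebinding a name: whichever `import` runs last decides which module's `m_grouped_w4a8_gemm_nt_masked` the rest of the file calls. As a rough illustration of that prefer-one-backend-else-fall-back pattern, here is a minimal sketch. The helper `resolve_kernel` is hypothetical and is not part of vllm, deepgemm, or lightop, and the standard-library module `math` stands in for a real backend so the snippet runs anywhere.

```python
import importlib
import importlib.util


def resolve_kernel(name, preferred, fallback):
    """Hypothetical helper: fetch `name` from `preferred` if that module
    is importable, otherwise from `fallback`.

    Returns the attribute together with the module name it came from.
    """
    # find_spec returns None when a top-level module cannot be found
    if importlib.util.find_spec(preferred) is not None:
        module_name = preferred
    else:
        module_name = fallback
    module = importlib.import_module(module_name)
    return getattr(module, name), module_name


# Stand-in demonstration: the preferred module does not exist,
# so the lookup falls back to the standard-library `math` module.
sqrt_fn, source = resolve_kernel("sqrt", "no_such_backend_module", "math")
print(source, sqrt_fn(9.0))
```

Unlike this helper, the commit relies on a project-specific `has_deep_gemm()` check rather than a bare importability test, so this sketch shows only the general shape of the selection, not the project's actual logic.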