OpenDAS / vllm_cscc
Commit 9a521c23, authored Sep 01, 2025 by zhuwenwen
update fused_moe,py
Parent: 90c5cc41
Showing 1 changed file with 4 additions and 3 deletions

vllm/model_executor/layers/fused_moe/fused_moe.py (+4 -3)
@@ -644,7 +644,7 @@ def invoke_fused_moe_kernel(A: torch.Tensor,
         expert_ids,
         num_tokens_post_padded,
         B.size(1) if not use_nn_moe else B.size(2),
-        B.size(1),
+        A.size(1),
         EM,
         num_tokens,
         A.stride(0),
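The first hunk changes where one of the size arguments passed to the kernel comes from. As a minimal sketch, assuming these two positional arguments are the GEMM dimensions N and K, that `A` has shape `(num_tokens, K)`, and that `B` is stored as `(num_experts, N, K)` by default and `(num_experts, K, N)` when `use_nn_moe` is set (none of which the diff itself confirms), the selection can be modelled on plain shape tuples. The helper name `infer_gemm_dims` is hypothetical.

```python
from typing import Tuple


def infer_gemm_dims(a_shape: Tuple[int, int],
                    b_shape: Tuple[int, int, int],
                    use_nn_moe: bool = False) -> Tuple[int, int]:
    """Return (N, K) using the same expressions as the hunk.

    Assumed layouts (not confirmed by the diff):
      A: (num_tokens, K)
      B: (num_experts, N, K) by default, (num_experts, K, N) if use_nn_moe.
    """
    # N is read from B; the axis depends on the assumed weight layout.
    n = b_shape[1] if not use_nn_moe else b_shape[2]
    # K is read from the activations, so it is independent of B's layout.
    k = a_shape[1]
    return n, k


if __name__ == "__main__":
    a = (4, 8)  # 4 tokens, hidden size K = 8
    print(infer_gemm_dims(a, (2, 16, 8)))                   # default layout
    print(infer_gemm_dims(a, (2, 8, 16), use_nn_moe=True))  # assumed NN layout
```

Under these assumptions, taking K from `A.size(1)` yields the same value for both weight layouts, whereas any fixed axis of `B` would only be right for one of them.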
@@ -1081,7 +1081,7 @@ def inplace_fused_experts(
                        use_int8_w8a8, use_int8_w8a16, use_int4_w4a16,
                        use_mxfp4_w4a4, per_channel_quant, global_num_experts,
                        expert_map, w1_scale, w2_scale, w1_zp, w2_zp, a1_scale,
-                       a2_scale, block_shape, w1_bias, w2_bias)
+                       a2_scale, block_shape, w1_bias, w2_bias, use_nn_moe)


 def inplace_fused_experts_fake(hidden_states: torch.Tensor,
@@ -1108,7 +1108,8 @@ def inplace_fused_experts_fake(hidden_states: torch.Tensor,
         a2_scale: Optional[torch.Tensor] = None,
         block_shape: Optional[list[int]] = None,
         w1_bias: Optional[torch.Tensor] = None,
-        w2_bias: Optional[torch.Tensor] = None) -> None:
+        w2_bias: Optional[torch.Tensor] = None,
+        use_nn_moe: Optional[bool] = False) -> None:
     pass
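The last hunk adds `use_nn_moe` to `inplace_fused_experts_fake` so that its parameter list keeps pace with the extra argument now forwarded by `inplace_fused_experts`. Assuming the fake function serves as a no-op stand-in that must accept the same arguments as the real one (the diff does not show how the two are wired together), a quick consistency check could look like the sketch below. `params_match` and the three shortened stand-in functions are hypothetical and exist only for illustration.

```python
import inspect
from typing import Callable, Optional


def params_match(real: Callable, fake: Callable) -> bool:
    """True if both callables declare the same parameter names, in order."""
    real_names = list(inspect.signature(real).parameters)
    fake_names = list(inspect.signature(fake).parameters)
    return real_names == fake_names


# Hypothetical, heavily shortened stand-ins for the real/fake pair.
def real_op(hidden_states, w2_bias=None,
            use_nn_moe: Optional[bool] = False) -> None:
    pass


def fake_op_old(hidden_states, w2_bias=None) -> None:
    pass


def fake_op_new(hidden_states, w2_bias=None,
                use_nn_moe: Optional[bool] = False) -> None:
    pass


if __name__ == "__main__":
    print(params_match(real_op, fake_op_old))  # signatures diverge
    print(params_match(real_op, fake_op_new))  # signatures agree
```

A check like this only compares names and order; it says nothing about types or defaults, which would need a stricter comparison.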