OpenDAS / vllm_cscc
Commit c3666f56 (Unverified)
Authored Dec 26, 2025 by Jee Jee Li
Committed by GitHub, Dec 26, 2025
Parent: c79dbfa9

[Misc] Fix Qwen2-MoE shared_expert_gate (#31339)

Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>

Showing 2 changed files with 8 additions and 3 deletions

vllm/lora/request.py                      +0 -1
vllm/model_executor/models/qwen2_moe.py   +8 -2

vllm/lora/request.py
@@ -20,7 +20,6 @@ class LoRARequest(
     lora_name: str
     lora_int_id: int
     lora_path: str = ""
-    long_lora_max_len: int | None = None
     base_model_name: str | None = msgspec.field(default=None)
     tensorizer_config_dict: dict | None = None

vllm/model_executor/models/qwen2_moe.py
@@ -111,7 +111,7 @@ class Qwen2MoeMLP(nn.Module):
         out, _ = self.down_proj(out)
         if self.expert_gate is not None:
-            out = F.sigmoid(self.expert_gate(x)) * out
+            out = F.sigmoid(self.expert_gate(x)[0]) * out
         return out
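
The `[0]` index in this hunk pairs with the switch to `ReplicatedLinear` in the next hunk of the same file. The neighbouring `out, _ = self.down_proj(out)` line indicates that vLLM's linear layers return a pair rather than a bare tensor, so the gate's result has to be unpacked before it reaches `F.sigmoid`. The following is a dependency-free sketch of that failure mode, not vLLM code: `PlainLinear` and `TupleLinear` are hypothetical stand-ins (for a layer that returns its output directly and for one that returns an `(output, bias)` pair), and Python floats stand in for tensors.

```python
import math


def sigmoid(value: float) -> float:
    """Logistic function on a plain float (stand-in for F.sigmoid)."""
    return 1.0 / (1.0 + math.exp(-value))


class PlainLinear:
    """Hypothetical stand-in for a layer that returns its output directly."""

    def __init__(self, weight: list[float]) -> None:
        self.weight = weight

    def __call__(self, x: list[float]) -> float:
        return sum(w * xi for w, xi in zip(self.weight, x))


class TupleLinear(PlainLinear):
    """Hypothetical stand-in for a layer that returns (output, bias)."""

    def __call__(self, x: list[float]) -> tuple[float, None]:
        return super().__call__(x), None


x = [1.0, -2.0, 0.5]
weight = [0.2, 0.1, 0.4]

old_gate = PlainLinear(weight)
new_gate = TupleLinear(weight)

# The old call pattern works with the plain layer.
old_value = sigmoid(old_gate(x))

# The same call pattern breaks once the layer returns a tuple...
try:
    sigmoid(new_gate(x))
    old_pattern_failed = False
except TypeError:
    old_pattern_failed = True

# ...and works again after taking the first element, as in the diff.
new_value = sigmoid(new_gate(x)[0])
```

In this sketch the unpacked value matches the old one exactly, which is the behaviour the one-line change is meant to preserve.
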
@@ -140,7 +140,13 @@ class Qwen2MoeSparseMoeBlock(nn.Module):
             prefix=f"{prefix}.gate",
         )
-        self.shared_expert_gate = torch.nn.Linear(config.hidden_size, 1, bias=False)
+        self.shared_expert_gate = ReplicatedLinear(
+            config.hidden_size,
+            1,
+            bias=False,
+            quant_config=None,
+            prefix=f"{prefix}.shared_expert_gate",
+        )
         if config.shared_expert_intermediate_size > 0:
             self.shared_expert = Qwen2MoeMLP(
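
Because the gate projects `config.hidden_size` down to a single output with no bias, each token gets one scalar in (0, 1) that scales its whole shared-expert output. A minimal pure-Python sketch of that arithmetic follows. `gate_shared_expert` and its argument names are illustrative, not part of vLLM, and lists stand in for tensors.

```python
import math


def gate_shared_expert(
    hidden: list[float],
    gate_weight: list[float],
    expert_out: list[float],
) -> list[float]:
    """Scale one token's shared-expert output by sigmoid(gate_weight . hidden)."""
    # A bias-free projection to size 1 is a single dot product per token.
    logit = sum(w * h for w, h in zip(gate_weight, hidden))
    gate = 1.0 / (1.0 + math.exp(-logit))
    # The scalar gate applies to every element of the expert output.
    return [gate * value for value in expert_out]


# With all-zero gate weights the logit is 0, so the gate is exactly 0.5.
halved = gate_shared_expert([1.0, 2.0], [0.0, 0.0], [4.0, -6.0])
```
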