sglang, commit ef8ec07b
Support tuning moe for llama 4 model (#6042)
Authored May 13, 2025 by fzyzcjy. Committed by GitHub, May 12, 2025.
Parent: f24fc5b8
Showing 1 changed file with 7 additions and 1 deletion
benchmark/kernels/fused_moe_triton/tuning_fused_moe_triton.py
@@ -408,6 +408,12 @@ def main(args: argparse.Namespace):
         topk = config.num_experts_per_tok
         intermediate_size = config.moe_intermediate_size
         shard_intermediate_size = 2 * intermediate_size // args.tp_size
+    elif config.architectures[0] == "Llama4ForConditionalGeneration":
+        n_share_fusion_experts = args.n_share_experts_fusion
+        E = config.text_config.num_local_experts + n_share_fusion_experts
+        topk = config.text_config.num_experts_per_tok
+        intermediate_size = config.text_config.intermediate_size
+        shard_intermediate_size = 2 * intermediate_size // args.tp_size
     elif config.architectures[0] in [
         "Grok1ForCausalLM",
         "Grok1ImgGen",
@@ -424,7 +430,7 @@ def main(args: argparse.Namespace):
         intermediate_size = config.intermediate_size
         shard_intermediate_size = 2 * intermediate_size // args.tp_size
 
-    hidden_size = config.hidden_size
+    hidden_size = getattr(config, "hidden_size", None) or config.text_config.hidden_size
     dtype = config.torch_dtype
     use_fp8_w8a8 = args.dtype == "fp8_w8a8"
     use_int8_w8a8 = args.dtype == "int8_w8a8"
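The two hunks above read the MoE dimensions from the nested `text_config` of a Llama 4 checkpoint and fall back to it when the top-level config has no `hidden_size`. The sketch below replays that logic on a stand-in config object so it can be checked without loading a model. The helper name `moe_shapes`, the `SimpleNamespace` stand-ins, and all numeric values are assumptions made for illustration; none of them come from the commit or from a real checkpoint.

```python
from types import SimpleNamespace


def moe_shapes(config, tp_size, n_share_experts_fusion=0):
    """Hypothetical helper mirroring the Llama 4 branch added in this commit.

    Returns (E, topk, shard_intermediate_size, hidden_size).
    """
    text = config.text_config
    # Fused shared experts are counted on top of the routed (local) experts.
    E = text.num_local_experts + n_share_experts_fusion
    topk = text.num_experts_per_tok
    intermediate_size = text.intermediate_size
    # Same expression as the diff: (2 * intermediate_size) split across TP ranks.
    shard_intermediate_size = 2 * intermediate_size // tp_size
    # Prefer a top-level hidden_size; otherwise use the nested text_config value.
    hidden_size = getattr(config, "hidden_size", None) or text.hidden_size
    return E, topk, shard_intermediate_size, hidden_size


# Stand-in for a multimodal config whose text settings live under text_config.
# The numbers are made up for this example.
nested = SimpleNamespace(
    architectures=["Llama4ForConditionalGeneration"],
    text_config=SimpleNamespace(
        num_local_experts=16,
        num_experts_per_tok=1,
        intermediate_size=8192,
        hidden_size=5120,
    ),
)

# Same text_config, but with a top-level hidden_size that should take priority.
flat = SimpleNamespace(hidden_size=4096, text_config=nested.text_config)

nested_shapes = moe_shapes(nested, tp_size=8)
flat_shapes = moe_shapes(flat, tp_size=8, n_share_experts_fusion=1)
```

Note that the `getattr(...) or ...` form also falls through when the top-level attribute exists but is `None` or `0`, which a plain `hasattr` check would not do.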