OpenDAS / vllm_cscc
Commit 5f15bdb5
Authored Mar 26, 2025 by gaoqiong
Add optimization for block int8 support
Parent: f3deca99
Showing 1 changed file with 6 additions and 6 deletions
vllm/model_executor/layers/fused_moe/fused_moe.py
@@ -1734,14 +1734,14 @@ def fused_experts_impl(hidden_states: torch.Tensor,
         torch.ops._C.silu_and_mul(intermediate_cache2,
                                   intermediate_cache1.view(-1, N))
         if use_int8_w8a8:
-            m1 = intermediate_cache2.shape[0]
-            if m1 <= 16:
-                config = stage2_best_config[m1-1]
-            elif m1 <= 32:
+            m = curr_hidden_states.shape[0]
+            if m <= 16:
+                config = stage2_best_config[m-1]
+            elif m <= 32:
                 config = stage2_best_config[15]
-            elif m1 <= 64:
+            elif m <= 64:
                 config = stage2_best_config[16]
-            elif m1 < 256:
+            elif m < 256:
                 config = {
                     "BLOCK_SIZE_M": 16,
                     "BLOCK_SIZE_N": 32,
 ...
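The hunk above changes which row count drives the stage-2 config lookup: it now reads `curr_hidden_states.shape[0]` instead of `intermediate_cache2.shape[0]`. The thresholds themselves are unchanged. A minimal standalone sketch of that bucketing follows. The function name `select_stage2_config` and the `mid_config` and `default_config` parameters are hypothetical and do not appear in the commit. The diff elides the rest of the `m < 256` dict and everything that happens for `m >= 256`, so both are passed in by the caller here rather than guessed.

```python
from typing import Any, Dict, List

Config = Dict[str, Any]


def select_stage2_config(m: int,
                         best_configs: List[Config],
                         mid_config: Config,
                         default_config: Config) -> Config:
    """Pick a kernel config from the row count m (hypothetical helper).

    best_configs stands in for stage2_best_config and needs at least
    17 entries (indices 0..16), matching the indices used in the diff.
    """
    if m < 1:
        raise ValueError("m must be a positive row count")
    if m <= 16:
        # One tuned entry per row count: m = 1..16 maps to index 0..15.
        return best_configs[m - 1]
    if m <= 32:
        # Rows 17..32 reuse the entry tuned for m = 16.
        return best_configs[15]
    if m <= 64:
        return best_configs[16]
    if m < 256:
        # The diff builds an inline dict here; its full contents are elided.
        return mid_config
    # Assumption: behaviour for m >= 256 is not visible in the diff.
    return default_config
```

With a placeholder table, `select_stage2_config(20, table, mid, default)` returns `table[15]`, the same entry as for `m = 16`.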