ModelZoo / Qwen3.5_vllm

Commit 360fd436
Authored Mar 12, 2026 by weishb
Parent: 1c72076a

update README.md
Showing 1 changed file with 2 additions and 2 deletions

README.md (+2 -2)
@@ -62,7 +62,7 @@ pip install numpy==1.25.0
### vllm
#### Single-node inference
- **Note**: When starting the service on `K100 AI`, add the `--disable-custom-all-reduce` flag. When starting the service with an 8W8A model, also add `-cc.mode=3` and `-cc.inductor_compile_config='{"combo_kernels": false, "benchmark_combo_kernel": false}'`
+ **Note**: When starting the service on `K100 AI`, add the `--disable-custom-all-reduce` flag. When starting the service with a W8A8 model, also add `-cc.mode=3` and `-cc.inductor_compile_config='{"combo_kernels": false, "benchmark_combo_kernel": false}'`
```bash
## Start with serve
...
```
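The note in this hunk makes the extra flags conditional on the hardware and on the model type. A minimal sketch of how those conditions might be assembled into a `vllm serve` argument list is shown below. The function name `build_serve_args`, its three parameters, the device string `K100AI`, and the model path are hypothetical; the flags themselves are copied from the note. The script only prints the arguments (a dry run) and does not start a server.

```shell
#!/bin/sh
# Print the `vllm serve` arguments, one per line, for a given setup.
# $1 = model path (placeholder), $2 = DCU model, $3 = 1 if the model is W8A8.
build_serve_args() {
  model_path=$1
  device=$2
  w8a8=$3

  set -- vllm serve "$model_path"

  # Per the note: K100 AI needs custom all-reduce disabled.
  if [ "$device" = "K100AI" ]; then
    set -- "$@" --disable-custom-all-reduce
  fi

  # Per the note: W8A8 models need the two compilation-config options.
  if [ "$w8a8" = "1" ]; then
    set -- "$@" -cc.mode=3 \
      '-cc.inductor_compile_config={"combo_kernels": false, "benchmark_combo_kernel": false}'
  fi

  printf '%s\n' "$@"
}

# Dry run with a placeholder path: prints the arguments instead of launching.
build_serve_args /path/to/Qwen3.5-397B-A17B-W8A8 K100AI 1
```

Keeping each argument on its own line avoids re-quoting the JSON value, which contains spaces. Other options the README's full `serve` command may use are not visible in this hunk and are left out here.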
@@ -176,7 +176,7 @@ DCU and GPU accuracy are consistent; inference framework: vllm.
| Model name | Weight size | DCU model | Minimum cards required | Download link |
|:------:|:----:|:----------:|:------:|:---------------------:|
| Qwen3.5-397B-A17B | 397B | K100AI, BW1000 | 16 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-397B-A17B) |
- | Qwen3.5-397B-A17B-W8A8 | 397B | K100AI, BW1000 | 8 | [Modelscope](https://www.modelscope.cn/models/metax-tech/Qwen3.5-397B-A17B-W8A8) |
+ | Qwen3.5-397B-A17B-INT8 | 397B | K100AI, BW1000 | 8 | [Modelscope](https://www.modelscope.cn/models/metax-tech/Qwen3.5-397B-A17B-W8A8) |
| Qwen3.5-122B-A10B | 122B | K100AI, BW1000 | 8 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-122B-A10B) |
| Qwen3.5-35B-A3B | 35B | K100AI, BW1000 | 2 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-35B-A3B) |
| Qwen3.5-27B | 27B | K100AI, BW1000 | 2 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-27B) |
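The minimum card counts in the table can be checked before launching. The sketch below encodes the table as a lookup, with hypothetical helper names `min_cards` and `check_cards`. It accepts both the `W8A8` and `INT8` names for the quantized 397B model, since this commit renames that row. How the number of available cards is detected is left to the caller.

```shell
#!/bin/sh
# Minimum number of DCU cards per model, as listed in the table.
min_cards() {
  case "$1" in
    Qwen3.5-397B-A17B) echo 16 ;;
    Qwen3.5-397B-A17B-W8A8|Qwen3.5-397B-A17B-INT8) echo 8 ;;
    Qwen3.5-122B-A10B) echo 8 ;;
    Qwen3.5-35B-A3B|Qwen3.5-27B) echo 2 ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# $1 = model name, $2 = number of cards available on this machine.
check_cards() {
  need=$(min_cards "$1") || return 1
  if [ "$2" -lt "$need" ]; then
    echo "$1 needs at least $need cards, got $2" >&2
    return 1
  fi
  echo "ok: $1 fits on $2 cards (minimum $need)"
}

check_cards Qwen3.5-35B-A3B 2
# prints: ok: Qwen3.5-35B-A3B fits on 2 cards (minimum 2)
```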