ModelZoo / Qwen3.6_vllm
Commit 97223342, authored Apr 17, 2026 by raojy
Update README.md
Parent: 9904ad79
Showing 1 changed file with 6 additions and 3 deletions

README.md (+6 -3)
@@ -65,14 +65,17 @@ vllm serve Qwen/Qwen3.6-35B-A3B \
     --gpu-memory-utilization 0.925
 ## Client access
-curl -X POST "http://localhost:8001/v1/chat/completions" -H "Content-Type: application/json" -d '{
-    "model": "Qwen/Qwen3.6-35B-A3B",
+curl -X POST "http://localhost:8001/v1/chat/completions" -H "Content-Type: application/json" -d '{
+    "model": "/public/home/raojy/project/model_code/qwen36",
     "messages": [
         {"role": "system", "content": "You are a helpful assistant."},
         {"role": "user", "content": "Hello, please give a brief self-introduction."}
     ],
     "max_tokens": 512,
     "temperature": 0.7,
     "stream": false
 }'
 ```
 ## Results
...
@@ -87,7 +90,7 @@ Accuracy on DCU matches GPU; inference framework: vllm.
 | Model name | Weight size | DCU model | Minimum cards required | Download link |
 |:------:|:----:|:----------:|:------:|:---------------------:|
-| Qwen3.6-35B-A3B | 35 | BW1000 | 4 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.6-35B-A3B) |
+| Qwen3.6-35B-A3B-FP8 | 35B | BW1000 | 2 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.6-35B-A3B-FP8) |
 ## Source repository and issue feedback
...
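
For readers who prefer a script over curl, the client request in the README can be sketched in Python using only the standard library. This is a minimal sketch, not the repository's own client: the URL and model path are copied from the curl command in the diff, the payload fields mirror it, and the helper names `build_chat_request` and `send_chat` are hypothetical. It assumes the server returns an OpenAI-style chat completion response.

```python
import json
import urllib.request

# Assumption: endpoint and model value are copied from the README's curl command;
# change both to match how the server was actually launched.
BASE_URL = "http://localhost:8001/v1/chat/completions"
MODEL = "/public/home/raojy/project/model_code/qwen36"


def build_chat_request(user_text, model=MODEL, url=BASE_URL):
    """Build (but do not send) a POST request equivalent to the curl command."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": 512,
        "temperature": 0.7,
        "stream": False,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(user_text, timeout=60):
    """Send the request and return the assistant's reply text (hypothetical helper)."""
    request = build_chat_request(user_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumes the OpenAI-style response shape: choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `send_chat("Hello, please give a brief self-introduction.")` should return the reply text. If the server rejects the request, the `model` value is the first thing to check, since it has to match the name the server was started with.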