ModelZoo / Qwen3_vllm
Commit 75fb8aaa, authored Oct 30, 2025 by laibao

Update README.md: adjust the example commands to remove redundant arguments, and reformat the model accuracy table for better readability.

Parent: a75f7b9d
Showing 1 changed file with 23 additions and 7 deletions

README.md (view file @ 75fb8aaa)
@@ -100,7 +100,7 @@ export VLLM_RANK7_NUMA=7
### Offline batch inference
```bash
-python examples/offline_inference/basic/basic.py -tp 8 --model_path xxx
+python examples/offline_inference/basic/basic.py --tp 8 --model_path xxx
```
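The example script defines `prompts` directly in the code and sets `temperature=0.8`, `top_p=0.95`, and `max_tokens=16`; to change these values, edit the parameters in the script. `model_path` is set in the script to a local model path; `tensor_parallel_size=1` means a single card is used; `dtype="float16"` is the inference data type (adjust it accordingly if the weights are bfloat16). This example does not use the `quantization` parameter; for quantized inference, see the performance test examples below.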
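The settings described above can be collected in one place, as in the minimal sketch below. It is not the contents of `basic.py`, which is not shown in this diff; `OfflineConfig` and `run_offline` are hypothetical names introduced for illustration, and the defaults simply mirror the values stated in the paragraph. The vLLM import is deferred so that the configuration can be inspected on a machine without vLLM installed.

```python
from dataclasses import dataclass


@dataclass
class OfflineConfig:
    """Hypothetical container for the settings named in the README paragraph."""
    model_path: str                 # local model path (placeholder)
    tensor_parallel_size: int = 1   # number of cards
    dtype: str = "float16"          # use "bfloat16" if the weights are bfloat16
    temperature: float = 0.8
    top_p: float = 0.95
    max_tokens: int = 16


def run_offline(cfg: OfflineConfig, prompts: list[str]) -> list[tuple[str, str]]:
    """Generate one completion per prompt and return (prompt, text) pairs."""
    # Imported lazily: vLLM is only needed when inference actually runs.
    from vllm import LLM, SamplingParams

    sampling = SamplingParams(
        temperature=cfg.temperature,
        top_p=cfg.top_p,
        max_tokens=cfg.max_tokens,
    )
    llm = LLM(
        model=cfg.model_path,
        tensor_parallel_size=cfg.tensor_parallel_size,
        dtype=cfg.dtype,
    )
    outputs = llm.generate(prompts, sampling)
    return [(out.prompt, out.outputs[0].text) for out in outputs]
```

Calling `run_offline(OfflineConfig(model_path="/your/model/path"), ["What is deep learning?"])` would then run a single prompt with the default sampling settings, assuming the path points to a model that vLLM can load.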
...
...
@@ -130,7 +130,7 @@ export VLLM_RANK7_NUMA=7
1. Start the service:
```bash
-vllm serve --model /your/model/path --enforce-eager --dtype float16 --trust-remote-code --tensor-parallel-size 8 --gpu-memory-utilization 0.98
+vllm serve /your/model/path --enforce-eager --dtype float16 --trust-remote-code --tensor-parallel-size 8 --gpu-memory-utilization 0.98
```
2. Start the client
...
...
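When the service from step 1 is launched from a script rather than typed by hand, the command line can be assembled programmatically. The sketch below passes the model path as a positional argument to `vllm serve` and uses only the flags that appear in the README command; `build_serve_command` is a hypothetical helper name, and the default values are the ones from the README.

```python
import shlex


def build_serve_command(
    model_path: str,
    tensor_parallel_size: int = 8,
    dtype: str = "float16",
    gpu_memory_utilization: float = 0.98,
) -> list[str]:
    """Return the argv list for `vllm serve` with the README's flags."""
    return [
        "vllm", "serve", model_path,  # model path is positional, no --model flag
        "--enforce-eager",
        "--dtype", dtype,
        "--trust-remote-code",
        "--tensor-parallel-size", str(tensor_parallel_size),
        "--gpu-memory-utilization", str(gpu_memory_utilization),
    ]


if __name__ == "__main__":
    # Print the shell-quoted command; pass the list to subprocess.run to launch it.
    print(shlex.join(build_serve_command("/your/model/path")))
```

Keeping the arguments as a list avoids shell-quoting problems if the model path contains spaces.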
@@ -200,6 +200,7 @@ chmod +x frpc_linux_amd64_v0.*
```
ssh -L 8000:<compute-node-IP>:8000 -L 8001:<compute-node-IP>:8001 <username>@<login-node> -p <login-node-port>
```
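This forwards ports through the jump host (the login node), so that you can reach services running on the internal compute node (such as the vLLM API) from your local machine.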
3. Start the OpenAI-compatible service
...
...
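Once the SSH tunnel is up and the OpenAI-compatible service is running, a request sent to the local port reaches the compute node. The following is a minimal sketch using only the Python standard library. It assumes that the service listens on port 8000 and exposes the usual `/v1/completions` route, and that the model name equals the path given to `vllm serve` (the default when no served model name is configured). `build_completion_request` and `complete` are hypothetical helper names, and `max_tokens=16` is an illustrative value.

```python
import json
import urllib.request


def build_completion_request(
    prompt: str,
    model: str,
    base_url: str = "http://localhost:8000",  # local end of the SSH tunnel (assumed port)
    max_tokens: int = 16,
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, model: str, **kwargs) -> str:
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(prompt, model, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

For example, `complete("What is deep learning?", "/your/model/path")` would return the generated continuation, provided the tunnel and the service are both active.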
@@ -225,11 +226,26 @@ Prompt: 'What is deep learning?', Generated text: ' Deep learning is a subset of
## Accuracy
-| Model | Dataset | Score |
-| --------------- | --------- | ----- |
-| | gsm8k | 95.83 |
-| Qwen3-235B-A22B | math500 | 94.2 |
-| | humaneval | 95.73 |
+<table>
+  <tr><th>Model</th><th>Dataset</th><th>Score</th></tr>
+  <tr><td rowspan="3" align="center">Qwen3-235B-A22B</td><td>gsm8k</td><td>95.83</td></tr>
+  <tr><td>math500</td><td>94.2</td></tr>
+  <tr><td>humaneval</td><td>95.73</td></tr>
+</table>
## Application scenarios
...
...