ModelZoo / GLM-4.7_vllm
Commit 80303715, authored Mar 04, 2026 by chenych
Add glm-4.7-flash
Parent: de15f105
Showing 4 changed files with 64 additions and 21 deletions:
README.md (+64 -21)
doc/result-glm4.7-flash.png (+0 -0)
doc/result-glm4.7.png (+0 -0)
doc/result.png (+0 -0)
README.md
@@ -21,13 +21,13 @@ GLM-4.7 is Zhipu's latest flagship model; for Agentic Coding scenarios, GLM-4.7 strengthens
## Environment dependencies
| Software | Version |
| :------: | :------: |
-| DTK | 25.04.2 |
-| python | 3.10 |
-| transformers | 4.57.3 |
-| vllm | 0.9.2+das.opt1.dtk25042 |
-| torch | 2.5.1+das.opt1.dtk25041 |
+| DTK | 26.04 |
+| python | 3.10.12 |
+| transformers | 5.2.0 |
+| vllm | 0.15.1+das.opt1.alpha.dtk2604.20260220.g2799735a |
+| torch | 2.9.0+das.opt1.dtk2604.20260206.g275d08c2 |
-Recommended image: image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.2-das1.7-py3.10-20251203
+Recommended image: harbor.sourcefind.cn:5443/dcu/admin/base/custom:vllm0.15.1-ubuntu22.04-dtk26.04-0130-py3.10-20260220
- Adjust the `-v` mount paths to match where your model is actually stored
...
...
@@ -46,20 +46,52 @@ docker run -it \
    -u root \
    -v /opt/hyhal/:/opt/hyhal/:ro \
    -v /path/your_code_data/:/path/your_code_data/ \
-    image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.2-das1.7-py3.10-20251203 bash
+    harbor.sourcefind.cn:5443/dcu/admin/base/custom:vllm0.15.1-ubuntu22.04-dtk26.04-0130-py3.10-20260220 bash
```
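More images are available for download from [光源](https://sourcefind.cn/#/service-list).
-The special deep-learning libraries that this project needs for DCU cards can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community; install the other packages according to requirements.txt:
+The special deep-learning libraries that this project needs for DCU cards can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community; install the other packages as follows: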
```
pip install pycountry
pip install -U transformers
```
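After installing, it can be useful to confirm that the container actually carries the versions listed in the environment table. The sketch below is one way to do that; `base_version` and `version_ge` are hypothetical helpers, not part of this repository, and the comparison assumes GNU `sort -V` is available (as it is on the Ubuntu 22.04 base image).

```bash
# Hypothetical helpers (not part of this repo) for comparing package versions.

# Drop a local build suffix such as "+das.opt1.dtk2604..." from a version string.
base_version() {
    printf '%s\n' "${1%%+*}"
}

# Succeed when version $1 is at least version $2 (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example, run inside the container:
#   installed="$(python -c 'import vllm; print(vllm.__version__)')"
#   version_ge "$(base_version "$installed")" 0.15.1 && echo "vllm OK"
```

The suffix is stripped first because the DCU builds append a local version tag after `+`, which would otherwise take part in the comparison.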
## Dataset
-None yet
+`None yet`
## Training
-None yet
+`None yet`
## Inference
### vllm
#### Single-node inference
```bash
## Start the server
export ALLREDUCE_STREAM_WITH_COMPUTE=1
vllm serve ZhipuAI/GLM-4.7-Flash \
    --tensor-parallel-size 4 \
    --speculative-config.method mtp \
    --speculative-config.num_speculative_tokens 1 \
    --gpu-memory-utilization 0.95 \
    --tool-call-parser glm47 \
    --reasoning-parser glm45 \
    --enable-auto-tool-choice \
    --port 8001

# Client request
# Note: no --served-model-name is set above, so the "model" field below
# may need to be ZhipuAI/GLM-4.7-Flash instead for the server to accept it.
curl http://localhost:8001/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "glm-4.7-flash",
        "messages": [
            {
                "role": "user",
                "content": "hello"
            }
        ]
    }'
```
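The `"model"` field of a request has to match a name the server reports, so querying `/v1/models` before sending prompts can save a failed call. The following is a minimal sketch under that assumption; `chat_payload` and `ask` are hypothetical helpers introduced here for illustration, and `chat_payload` does no JSON escaping, so it only suits simple prompts.

```bash
# Hypothetical helper: build the JSON body for /v1/chat/completions.
# Assumes the model name and prompt contain no characters that need JSON escaping.
chat_payload() {
    printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}\n' "$1" "$2"
}

# Hypothetical helper: send one prompt to a local vLLM server.
# $1 = port, $2 = model name as served, $3 = prompt
ask() {
    curl -s "http://localhost:$1/v1/chat/completions" \
        -H "Content-Type: application/json" \
        -d "$(chat_payload "$2" "$3")"
}

# Example (requires a running server):
#   curl -s http://localhost:8001/v1/models    # lists the model names the server accepts
#   ask 8001 ZhipuAI/GLM-4.7-Flash "hello"
```

Keeping the payload builder separate from the `curl` call makes it possible to inspect the request body without a server running.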
#### Multi-node inference
1. Configure environment variables
```bash
...
...
@@ -106,14 +138,16 @@ ray start --address='x.x.x.x:6379' --num-gpus=8 --num-cpus=32
3. Start the vllm server
```bash
vllm serve ZhipuAI/GLM-4.7 \
    --trust-remote-code \
    --distributed-executor-backend ray \
    --dtype bfloat16 \
    --tensor-parallel-size 16 \
    --max-model-len 32768 \
    --speculative-config.method mtp \
    --speculative-config.num_speculative_tokens 1 \
    --tool-call-parser glm47 \
    --reasoning-parser glm45 \
    --enable-auto-tool-choice \
    --gpu-memory-utilization 0.95 \
    --port 8001 \
-    --served-model-name glm-4.7 \
-    --kv-cache-dtype auto
+    --served-model-name glm-4.7
```
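The `--tensor-parallel-size 16` in the command above has to be covered by the GPUs registered in the Ray cluster, and the `ray start` line in this README registers 8 per node (`--num-gpus=8`). A rough sketch of that arithmetic, using a hypothetical helper `nodes_needed` that is not part of this repository:

```bash
# Hypothetical helper: how many nodes are needed to reach a given tensor-parallel size.
# $1 = tensor-parallel size, $2 = GPUs (DCUs) registered per node
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))   # ceiling division
}

nodes_needed 16 8   # prints 2
```

Running `ray status` on the head node before launching the server is one way to check that the expected number of GPUs has actually joined the cluster.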
Once startup completes, the server can be accessed as follows:
...
...
@@ -133,17 +167,26 @@ curl http://localhost:8001/v1/chat/completions \
```
## Results
+- GLM-4.7-Flash inference results
+<div align=center>
+    <img src="./doc/result-glm4.7-flash.png"/>
+</div>
+- GLM-4.7 inference results
<div align=center>
-    <img src="./doc/result.png"/>
+    <img src="./doc/result-glm4.7.png"/>
</div>
### Accuracy
-Accuracy on DCU matches GPU; inference framework: vllm.
+`Accuracy on DCU matches GPU; inference framework: vllm.`
## Pretrained weights
| Model | Weight size | DCU model | Minimum cards required | Download |
|:-----:|:----------:|:----------:|:---------------------:|:----------:|
-| GLM-4.7 | 358B | BW1000 | 16 | [Modelscope](https://modelscope.cn/models/ZhipuAI/GLM-4.7) |
+| GLM-4.7 | 355B-A32B | BW1000 | 16 | [Modelscope](https://modelscope.cn/models/ZhipuAI/GLM-4.7) |
+| GLM-4.7-Flash | 30B-A3B | BW1000 | 2 | [Modelscope](https://modelscope.cn/models/ZhipuAI/GLM-4.7-Flash) |
## Source repository and issue feedback
- https://developer.sourcefind.cn/codes/modelzoo/glm-4.7_vllm
...
...
doc/result-glm4.7-flash.png  0 → 100644  150 KB
doc/result-glm4.7.png  0 → 100644  307 KB
doc/result.png  deleted  100644 → 0  163 KB