Commit 20f4e124 authored by chenych

Update GLM5

parent dc06c77b
...@@ -20,7 +20,7 @@ ...@@ -20,7 +20,7 @@
```bash ```bash
docker run -it \ docker run -it \
--shm-size 60g \ --shm-size 200g \
--network=host \ --network=host \
--name glm-5 \ --name glm-5 \
--privileged \ --privileged \
...@@ -110,21 +110,24 @@ ray start --address='x.x.x.x:6379' --num-gpus=8 --num-cpus=32 ...@@ -110,21 +110,24 @@ ray start --address='x.x.x.x:6379' --num-gpus=8 --num-cpus=32
3. Start the vLLM server 3. Start the vLLM server
```bash ```bash
vllm serve zai-org/GLM-5 \ vllm serve zai-org/GLM-5 \
--port 8001 \ --port 8001 \
--trust-remote-code \ --trust-remote-code \
--tensor-parallel-size 32 \ --tensor-parallel-size 32 \
--gpu-memory-utilization 0.85 \ --gpu-memory-utilization 0.85 \
--speculative-config.method mtp \ --distributed-executor-backend ray \
--speculative-config.num_speculative_tokens 1 \ --dtype bfloat16 \
--tool-call-parser glm47 \ --max-model-len 32768 \
--reasoning-parser glm45 \ --speculative-config.method mtp \
--enable-auto-tool-choice \ --speculative-config.num_speculative_tokens 1 \
--served-model-name glm-5 --tool-call-parser glm47 \
--reasoning-parser glm45 \
--enable-auto-tool-choice \
--served-model-name glm-5
``` ```
Once startup completes, the server can be accessed as follows: Once startup completes, the server can be accessed as follows:
```bash ```bash
curl http://localhost:8001/v1/chat/completions \ curl http://12.12.12.83:8001/v1/chat/completions \
-H "Content-Type: application/json" \ -H "Content-Type: application/json" \
-d '{ -d '{
"model": "glm-5", "model": "glm-5",
...@@ -132,14 +135,14 @@ curl http://localhost:8001/v1/chat/completions \ ...@@ -132,14 +135,14 @@ curl http://localhost:8001/v1/chat/completions \
{"role": "system", "content": "You are a helpful assistant."}, {"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Summarize GLM-5 in one sentence."} {"role": "user", "content": "Summarize GLM-5 in one sentence."}
], ],
"max_tokens": 4096, "max_tokens": 200,
"temperature": 1 "temperature": 0.7
}' }'
``` ```
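The same request can be sent from Python. This is a minimal sketch, not part of the commit: it assumes the server exposes the OpenAI-compatible `/v1/chat/completions` route used by the curl command above, and the names `BASE_URL`, `build_chat_request`, and `chat` are hypothetical helpers introduced for illustration. The default values mirror the updated side of the diff (`max_tokens` 200, `temperature` 0.7).

```python
import json
import urllib.request

# Assumption: same host and port as the curl example; change this to the
# address of the node running `vllm serve` (e.g. the head node's IP).
BASE_URL = "http://localhost:8001/v1"


def build_chat_request(prompt, model="glm-5", max_tokens=200, temperature=0.7):
    """Build the JSON body for a chat completion call (hypothetical helper)."""
    return {
        "model": model,  # must match --served-model-name
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def chat(prompt, base_url=BASE_URL, timeout=120):
    """POST the request and return the first choice's message content."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.load(resp)
    # The content may be empty if the token budget is spent before the
    # final answer is produced, so use .get() instead of indexing.
    return reply["choices"][0]["message"].get("content")
```

Calling `chat("Summarize GLM-5 in one sentence.")` against a running server should return the model's reply as a string.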
## Results ## Results
<div align=center> <div align=center>
<img src="./doc/xxx.png"/> <img src="./doc/result.png"/>
</div> </div>
### Accuracy ### Accuracy
......
doc/result.png (image changed: 31.7 KB → 104 KB)