Commit 933b0e34 authored by shihm

update code

parent ba8c0ea1
......@@ -57,31 +57,6 @@ docker run -it \
## Inference
### vllm
#### Single-node inference
Start the service:
```bash
vllm serve /path/to/baichuan-inc/Baichuan-M3-235B --tensor-parallel-size 8 --max-model-len 8192 --gpu-memory-utilization 0.9 --served-model-name baichuan-m3 --reasoning-parser deepseek_r1
```
Once startup completes, the service can be queried as follows:
```bash
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "baichuan-m3",
"messages": [
{
"role": "user",
"content": "下午头痛怎么办?"
}
]
}'
```
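
The same request can be issued from Python. The following is a minimal sketch using only the standard library, not the repository's own client code. The URL and model name are copied from the curl example above; the helper names (`build_request`, `extract_answer`, `ask`) are hypothetical, and the `reasoning_content` field is an assumption about how the reasoning parser exposes its output, so it is read as optional.

```python
import json
import urllib.request

# Values mirrored from the curl example above.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL_NAME = "baichuan-m3"


def build_request(prompt, url=BASE_URL, model=MODEL_NAME):
    """Build (but do not send) the same POST request as the curl call."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_answer(response_body):
    """Split a chat-completion response into (reasoning, answer).

    Assumption: the reasoning text, if any, sits in a `reasoning_content`
    field on the message. A missing field yields None.
    """
    message = response_body["choices"][0]["message"]
    return message.get("reasoning_content"), message.get("content")


def ask(prompt, timeout=600):
    """Send the prompt to a running server and return (reasoning, answer)."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return extract_answer(json.load(resp))
```

With the service running, `reasoning, answer = ask("...")` sends one user message and returns the two parts separately. Building the request and parsing the response are kept apart from the network call so that both can be checked without a live server.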
## Results
<div align=center>
<img src="./doc/result.png"/>
</div>
#### Multi-node inference
Add the environment variables:
......@@ -132,7 +107,6 @@ vllm serve /path/to/baichuan-inc/Baichuan-M3-235B
--distributed-executor-backend ray
--tensor-parallel-size 8
--pipeline-parallel-size 2
--max-model-len 32768
--gpu-memory-utilization 0.9
--served-model-name baichuan-m3
--reasoning-parser deepseek_r1
......@@ -202,7 +176,7 @@ print(response)
## Pretrained weights
| Model name | Weight size | DCU model | Minimum number of cards | Download |
|:-----:|:----------:|:----------:|:---------------------:|:----------:|
| Baichuan-M3-235B | 235B | BW1000 | 8 | [ModelScope](https://modelscope.cn/models/baichuan-inc/Baichuan-M3-235B) |
| Baichuan-M3-235B | 235B | BW1000 | 16 | [ModelScope](https://modelscope.cn/models/baichuan-inc/Baichuan-M3-235B) |
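
As a rough illustration of how the card counts in the table relate to the parallelism flags, the sketch below computes the number of cards implied by `--tensor-parallel-size 8` and `--pipeline-parallel-size 2`, and the weight footprint per card. It assumes 2 bytes per parameter (bf16/fp16), which the README does not state, and it counts weights only; KV cache, activations, and runtime overhead are excluded. The function names are hypothetical.

```python
def cards_required(tensor_parallel, pipeline_parallel):
    """Total cards used when tensor and pipeline parallelism are combined."""
    return tensor_parallel * pipeline_parallel


def weight_gb_per_card(num_params, bytes_per_param, num_cards):
    """Weight memory per card in GB (1e9 bytes), assuming an even split."""
    return num_params * bytes_per_param / num_cards / 1e9


NUM_PARAMS = 235e9    # 235B parameters, from the table above
BYTES_PER_PARAM = 2   # assumption: bf16/fp16 weights

multi_node_cards = cards_required(8, 2)  # flags from the multi-node command

print(multi_node_cards)
print(weight_gb_per_card(NUM_PARAMS, BYTES_PER_PARAM, 8))
print(weight_gb_per_card(NUM_PARAMS, BYTES_PER_PARAM, multi_node_cards))
```

Under these assumptions the weights alone come to about 58.75 GB per card on 8 cards and about 29.4 GB per card on 16 cards. Whether either figure fits depends on the memory of the specific DCU and on the configured `--max-model-len`.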
## Source repository and issue feedback
- https://developer.sourcefind.cn/codes/modelzoo/baichuan-m3-235b_vllm
doc/result1.png: binary image changed (741 KB to 1010 KB)