Commit 646210d2 authored by shihm's avatar shihm
Browse files

updata

parent a293067a
......@@ -114,7 +114,7 @@ ray start --address='x.x.x.x:6379' --num-gpus=8 --num-cpus=32
```
启动vllm server
```bash
vllm serve /path/to/baichuan-inc/Baichuan-M3-235B
vllm serve /path/to/Baichuan-M3-235B
--host x.x.x.x --port 8000
--distributed-executor-backend ray
--tensor-parallel-size 8
......@@ -147,10 +147,10 @@ curl http://localhost:8000/v1/chat/completions \
### transformers
```python
#### 单机推理
```bash
from transformers import AutoTokenizer, AutoModelForCausalLM
model_path = "/path/to/baichuan-inc/Baichuan-M3-235B"
model_path = "/path/to/Baichuan-M3-235B"
import os
import torch
os.environ['TRANSFORMERS_OFFLINE'] = '1'
......@@ -188,7 +188,7 @@ print(response)
## 预训练权重
| 模型名称 | 权重大小 | DCU型号 | 最低卡数需求 |下载地址|
|:-----:|:----------:|:----------:|:---------------------:|:----------:|
| Baichuan-M3-235B | 235B | BW1000 | 16 | [ModelScope](https://modelscope.cn/models/baichuan-inc/Baichuan-M3-235B) |
| Baichuan-M3-235B | 235B | BW1000 | 16 | [Modelscope](https://modelscope.cn/models/baichuan-inc/Baichuan-M3-235B) |
## 源码仓库及问题反馈
- https://developer.sourcefind.cn/codes/modelzoo/baichuan-m3-235b_vllm
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment