Commit a9ee04b7 authored by shihm

Update README

parent 26aebea5
......@@ -114,7 +114,7 @@ ray start --address='x.x.x.x:6379' --num-gpus=8 --num-cpus=32
```
Start the vLLM server
```bash
-vllm serve /path/to/Baichuan-M3-235B
+vllm serve /baichuan-inc/Baichuan-M3-235B
--host x.x.x.x --port 8000
--distributed-executor-backend ray
--tensor-parallel-size 16
......@@ -183,7 +183,7 @@ print(response)
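### Accuracy
-`DCU accuracy is consistent with GPU; inference frameworks: vllm, transformer`
+`DCU accuracy is consistent with GPU; inference frameworks: vllm, transformers`
## Pretrained weights
| Model name | Weight size | DCU model | Minimum number of cards | Download link |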
......
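# Unique model identifier
modelCode=2152
# Model name
-modelName=Baichuan-M3_vllm
+modelName=Baichuan-M3_pytorch
# Model description
modelDescription=Baichuan-M3 is a new-generation medically enhanced large language model from Baichuan (百川智能) and an important milestone following Baichuan-M2.
# Run process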
......@@ -9,6 +9,6 @@ processType=inference
# Algorithm category
appCategory=conversational Q&A
# Framework type
-frameType=vllm
+frameType=vllm,transformers
# Accelerator card type
accelerateType=BW1000
\ No newline at end of file
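
The metadata file changed above is a flat list of `key=value` lines with `#` comments, and `frameType` now holds a comma-separated list rather than a single name. A consumer that previously compared `frameType` against one string might need to split it. The following is a minimal sketch of that, assuming plain `key=value` syntax with no escapes or line continuations; the helper names `parse_properties` and `frameworks` are hypothetical and not part of this repository.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def frameworks(props: dict[str, str]) -> list[str]:
    """Split the comma-separated frameType value into a list of names."""
    return [f.strip() for f in props.get("frameType", "").split(",") if f.strip()]


# Sample input mirroring the post-commit values shown in the diff.
sample = """\
# Unique model identifier
modelCode=2152
modelName=Baichuan-M3_pytorch
frameType=vllm,transformers
accelerateType=BW1000
"""

props = parse_properties(sample)
print(frameworks(props))
```

Splitting on commas keeps the old single-value form (`frameType=vllm`) working as a one-element list.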