Commit 646210d2 authored by shihm's avatar shihm
Browse files

updata

parent a293067a
...@@ -114,7 +114,7 @@ ray start --address='x.x.x.x:6379' --num-gpus=8 --num-cpus=32 ...@@ -114,7 +114,7 @@ ray start --address='x.x.x.x:6379' --num-gpus=8 --num-cpus=32
``` ```
启动vllm server 启动vllm server
```bash ```bash
vllm serve /path/to/baichuan-inc/Baichuan-M3-235B vllm serve /path/to/Baichuan-M3-235B
--host x.x.x.x --port 8000 --host x.x.x.x --port 8000
--distributed-executor-backend ray --distributed-executor-backend ray
--tensor-parallel-size 8 --tensor-parallel-size 8
...@@ -147,10 +147,10 @@ curl http://localhost:8000/v1/chat/completions \ ...@@ -147,10 +147,10 @@ curl http://localhost:8000/v1/chat/completions \
### transformers ### transformers
#### 单机推理
```python ```bash
from transformers import AutoTokenizer, AutoModelForCausalLM from transformers import AutoTokenizer, AutoModelForCausalLM
model_path = "/path/to/baichuan-inc/Baichuan-M3-235B" model_path = "/path/to/Baichuan-M3-235B"
import os import os
import torch import torch
os.environ['TRANSFORMERS_OFFLINE'] = '1' os.environ['TRANSFORMERS_OFFLINE'] = '1'
...@@ -188,7 +188,7 @@ print(response) ...@@ -188,7 +188,7 @@ print(response)
## 预训练权重 ## 预训练权重
| 模型名称 | 权重大小 | DCU型号 | 最低卡数需求 |下载地址| | 模型名称 | 权重大小 | DCU型号 | 最低卡数需求 |下载地址|
|:-----:|:----------:|:----------:|:---------------------:|:----------:| |:-----:|:----------:|:----------:|:---------------------:|:----------:|
| Baichuan-M3-235B | 235B | BW1000 | 16 | [ModelScope](https://modelscope.cn/models/baichuan-inc/Baichuan-M3-235B) | | Baichuan-M3-235B | 235B | BW1000 | 16 | [Modelscope](https://modelscope.cn/models/baichuan-inc/Baichuan-M3-235B) |
## 源码仓库及问题反馈 ## 源码仓库及问题反馈
- https://developer.sourcefind.cn/codes/modelzoo/baichuan-m3-235b_vllm - https://developer.sourcefind.cn/codes/modelzoo/baichuan-m3-235b_vllm
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment