Commit c5d122cc authored by weishb

Update README version

parent ea4701c5
@@ -46,6 +46,21 @@ docker run -it \
The special deep learning libraries required by the DCU cards for this project can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community.
+## Pretrained Weights
+**Choose the model to download according to the `Supported DCU Models` column. FP8 models are supported only on BW1100/BW1101; do not use them on other models!**
+| Model Name | Weight Size | Data Type | Supported DCU Models | Minimum Cards Required | Download Link |
+|:------:|:----:|:----------:|:----------:|:------:|:---------------------:|
+| Qwen3.5-397B-A17B | 397B | BF16 | K100AI,BW1000 | 16 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-397B-A17B) |
+| Qwen3.5-397B-A17B-INT8 | 397B | INT8 | BW1000 | 8 | [ModelScope](https://www.modelscope.cn/models/metax-tech/Qwen3.5-397B-A17B-W8A8) |
+| Qwen3.5-122B-A10B | 122B | BF16 | K100AI,BW1000 | 8 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-122B-A10B) |
+| Qwen3.5-35B-A3B | 35B | BF16 | K100AI,BW1000 | 2 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-35B-A3B) |
+| Qwen3.5-27B | 27B | BF16 | K100AI,BW1000 | 2 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-27B) |
+| Qwen3.5-9B | 9B | BF16 | K100AI,BW1000 | 1 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-9B) |
+| Qwen3.5-4B | 4B | BF16 | K100AI,BW1000 | 1 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-4B) |
+| Qwen3.5-2B | 2B | BF16 | K100AI,BW1000 | 1 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-2B) |
+| Qwen3.5-0.8B | 0.8B | BF16 | K100AI,BW1000 | 1 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-0.8B) |
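As a reading aid for the table above, the minimum card counts can be captured in a small lookup. This is a minimal sketch: the `min_cards` helper is hypothetical and not part of this repository, and using the minimum card count as `--tensor-parallel-size` is an assumption (the README only shows that pairing for Qwen3.5-397B-A17B with 16 cards). The script only prints a command; it does not start a server.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): return the minimum
# number of DCU cards for a model, using the values listed in the table.
min_cards() {
  case "$1" in
    Qwen3.5-397B-A17B)                             echo 16 ;;
    Qwen3.5-397B-A17B-INT8|Qwen3.5-122B-A10B)      echo 8 ;;
    Qwen3.5-35B-A3B|Qwen3.5-27B)                   echo 2 ;;
    Qwen3.5-9B|Qwen3.5-4B|Qwen3.5-2B|Qwen3.5-0.8B) echo 1 ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Assumption: tensor parallel size is set to the minimum card count.
MODEL="Qwen3.5-35B-A3B"
echo "vllm serve Qwen/${MODEL} --port 8001 --tensor-parallel-size $(min_cards "$MODEL")"
```

An unknown model name makes the helper return a non-zero status, so a wrapper script can stop before launching anything.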
## Dataset
None yet
@@ -64,7 +79,7 @@ docker run -it \
#### Single-node inference
```bash
-## Start the server
+# Start the server
vllm serve Qwen/Qwen3.5-35B-A3B \
    --port 8001 \
    --trust-remote-code \
@@ -76,7 +91,7 @@ vllm serve Qwen/Qwen3.5-35B-A3B \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_coder
-## Client request
+# Client request
curl http://localhost:8001/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
@@ -138,7 +153,7 @@ ray start --address='x.x.x.x:6379' --num-gpus=8 --num-cpus=32
3. Start the vLLM server
```bash
-## Start the server
+# Start the server
vllm serve Qwen/Qwen3.5-397B-A17B \
    --port 8001 \
    --tensor-parallel-size 16 \
@@ -151,7 +166,7 @@ vllm serve Qwen/Qwen3.5-397B-A17B \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_coder
-## Client request
+# Client request
curl http://localhost:8001/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
@@ -165,25 +180,12 @@ curl http://localhost:8001/v1/chat/completions \
## Results
<div align=center>
-    <img src="./doc/result-dcu.jpg"/>
+    <img src="./doc/result-dcu.png"/>
</div>
### Accuracy
Accuracy on DCU matches that on GPU. Inference framework: vLLM.
-## Pretrained Weights
-| Model Name | Weight Size | DCU Models | Minimum Cards Required | Download Link |
-|:------:|:----:|:----------:|:------:|:---------------------:|
-| Qwen3.5-397B-A17B | 397B | K100AI,BW1000 | 16 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-397B-A17B) |
-| Qwen3.5-397B-A17B-INT8 | 397B | BW1000 | 8 | [Modelscope](https://www.modelscope.cn/models/metax-tech/Qwen3.5-397B-A17B-W8A8) |
-| Qwen3.5-122B-A10B | 122B | K100AI,BW1000 | 8 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-122B-A10B) |
-| Qwen3.5-35B-A3B | 35B | K100AI,BW1000 | 2 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-35B-A3B) |
-| Qwen3.5-27B | 27B | K100AI,BW1000 | 2 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-27B) |
-| Qwen3.5-9B | 9B | K100AI,BW1000 | 1 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-9B) |
-| Qwen3.5-4B | 4B | K100AI,BW1000 | 1 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-4B) |
-| Qwen3.5-2B | 2B | K100AI,BW1000 | 1 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-2B) |
-| Qwen3.5-0.8B | 0.8B | K100AI,BW1000 | 1 | [Hugging Face](https://huggingface.co/Qwen/Qwen3.5-0.8B) |
## Source repository and issue feedback
- https://developer.sourcefind.cn/codes/modelzoo/qwen3.5_vllm