Commit 3e1bb3cb authored by raojy

Update README.md

parent 3df6e522
......@@ -64,7 +64,7 @@ pip install numpy==1.26.1
```bash
## Launch with serve
-vllm serve /public/home/raojy/project/model_code/Qwen3-Omni-30B-A3B-Instruct \
+vllm serve Qwen/Qwen3-Omni-30B-A3B-Instruct \
--trust-remote-code \
--tensor-parallel-size 4 \
--dtype bfloat16
......@@ -74,7 +74,7 @@ curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer EMPTY" \
-d '{
-"model": "/public/home/raojy/project/model_code/Qwen3-Omni-30B-A3B-Instruct",
+"model": "Qwen/Qwen3-Omni-30B-A3B-Instruct",
"messages": [
{
"role": "user",
......@@ -110,8 +110,7 @@ DCU and GPU accuracy are consistent; inference framework: vllm.
## Pretrained weights
| Model name | Weight size | DCU model | Minimum cards required | Download link |
|:------:|:----:|:----------:|:------:|:---------------------:|
-|
-Qwen3-Omni-30B-A3B-Instruct | 30B | BW1000 | 2 | [Hugging Face](https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct) |
+| Qwen3-Omni-30B-A3B-Instruct | 30B | BW1000 | 2 | [Hugging Face](https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct) |
## Source repository and issue feedback
......
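
The first two hunks change the model identifier in both the `vllm serve` command and the request body. The `model` field of a request has to match the name the server was started with, so the two must change together. A minimal Python sketch of keeping them in sync is shown here. It assumes the server was started without a custom served model name. `MODEL_ID`, `BASE_URL`, `build_chat_request`, and the plain-text `"Hello"` prompt are illustrative names and values, not part of the README; the message content of the README's own request is elided in this diff and is not reproduced.

```python
import json

# Assumption: the server was launched with
#   vllm serve Qwen/Qwen3-Omni-30B-A3B-Instruct ...
# and no custom served model name, so requests use the same string.
MODEL_ID = "Qwen/Qwen3-Omni-30B-A3B-Instruct"
BASE_URL = "http://localhost:8000/v1"  # host and port as in the curl example


def build_chat_request(prompt: str, model: str = MODEL_ID) -> dict:
    """Build an OpenAI-style chat completion payload (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


# "Hello" is a placeholder prompt, not the message used in the README.
payload = build_chat_request("Hello")
body = json.dumps(payload)
print(body)
```

The printed JSON can be sent as the `-d` argument of the `curl` call to `/v1/chat/completions`. Defining the identifier once avoids the mismatch that this commit corrects by hand in two places.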