Commit 58ab51f3 authored by raojy
Update README.md

parent 91544c88
@@ -76,9 +76,7 @@ vllm serve Qwen/Qwen3-Coder-Next \
 --served-model-name Qwen3-Coder-Next \
 --dtype bfloat16 \
 --trust-remote-code \
---tensor-parallel-size 4 \
---gpu-memory-utilization 0.95 \
---max-model-len 8192 \
+--tensor-parallel-size 8 \
 --port 8000
 ## Client access
@@ -110,7 +108,7 @@ curl http://localhost:8000/v1/chat/completions \
 | **Model name** | **Weight size** | **DCU model** | **Minimum cards required** | **Download link** |
 | :------------------: | :----------: | :-----------: | :--------------: | :----------------------------------------------------------: |
-| Qwen3-Coder-Next | 80B | BW1000 | 4 | [Hugging Face](https://huggingface.co/Qwen/Qwen3-Coder-Next) |
+| Qwen3-Coder-Next | 80B | BW1000 | 8 | [Hugging Face](https://huggingface.co/Qwen/Qwen3-Coder-Next) |
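
The change from 4 to 8 cards can be sanity-checked with rough arithmetic. The sketch below is illustrative only and is not part of the commit. It reads the table's 80B as a parameter count, assumes 2 bytes per parameter for `bfloat16`, and assumes the weights shard evenly across cards. KV cache, activations, and runtime overhead are ignored, so real usage will be higher. The function name `weight_gb_per_device` is hypothetical.

```python
# Back-of-the-envelope estimate of per-card weight memory under tensor parallelism.
# Assumptions (not from the README): weights shard evenly across cards, and
# KV cache, activations, and runtime overhead are ignored.

BYTES_PER_PARAM = {"bfloat16": 2, "float16": 2, "float32": 4}


def weight_gb_per_device(num_params: float, dtype: str, tensor_parallel_size: int) -> float:
    """Return the approximate weight footprint per card in GB (1 GB = 1e9 bytes)."""
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be at least 1")
    total_bytes = num_params * BYTES_PER_PARAM[dtype]
    return total_bytes / tensor_parallel_size / 1e9


if __name__ == "__main__":
    # Compare the old and new --tensor-parallel-size values from the diff.
    for tp in (4, 8):
        gb = weight_gb_per_device(80e9, "bfloat16", tp)
        print(f"tensor-parallel-size {tp}: ~{gb:.0f} GB of weights per card")
```

Under these assumptions, doubling the card count halves the weight shard held by each card, which leaves more headroom for KV cache now that `--max-model-len` is no longer capped in the command.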