Update README.md

Commit 58ab51f3 authored Mar 31, 2026 by raojy

Showing with 2 additions and 4 deletions

README.md +2 -4

--- a/README.md
+++ b/README.md
@@ -76,9 +76,7 @@ vllm serve Qwen/Qwen3-Coder-Next \
    --served-model-name Qwen3-Coder-Next \
    --dtype bfloat16 \
    --trust-remote-code \
-    --tensor-parallel-size 4 \
+    --tensor-parallel-size 8 \
-    --gpu-memory-utilization 0.95 \
-    --max-model-len 8192 \
    --port 8000
 ## Client access
@@ -110,7 +108,7 @@ curl http://localhost:8000/v1/chat/completions   \
 |     **Model name**     | **Weight size** |  **DCU model**  | **Minimum number of cards** |                         **Download link**                         |
 | :------------------: | :----------: | :-----------: | :--------------: | :----------------------------------------------------------: |
-| Qwen3-Coder-Next |     80B      | BW1000 |        4         | [Hugging Face](https://huggingface.co/Qwen/Qwen3-Coder-Next) |
+| Qwen3-Coder-Next |     80B      | BW1000 |        8         | [Hugging Face](https://huggingface.co/Qwen/Qwen3-Coder-Next) |