Update README.md

ed315608 · raojy · 313413b4 · ed315608
Commit ed315608 authored May 09, 2026 by raojy 💬

Showing with 8 additions and 8 deletions

README.md +8 -8

--- a/README.md
+++ b/README.md
@@ -13,11 +13,11 @@ Qwen3.6 uses a Mixture-of-Experts (MoE) architecture, includes a vision encoder, and is a multimodal
 | Software |                  Version                  |
 | :------: |:-----------------------------------------:|
 | DTK |                   26.04                   |
-| python |                  3.10.12                  |
-| transformers |            5.5.0               |
-| vllm |      0.18.1+das.fa71803.dtk2604     |
-| triton |      3.6.0+gitc73250c4.staging      |
-| torch |   2.10.0+das.opt1.dtk2604.20260325.g6b060a   |
+| Python |                  3.10.12                  |
+| Transformers |            5.5.0               |
+| vLLm |      0.18.1+das.fa71803.dtk2604     |
+| Triton |      3.6.0+gitc73250c4.staging      |
+| Torch |   2.10.0+das.opt1.dtk2604.20260325.g6b060a   |

 Currently recommended image: harbor.sourcefind.cn:5443/dcu/admin/base/custom:vllm018-ubuntu22.04-dtk26.04-qwen3.6-20260423

@@ -56,7 +56,7 @@ docker run -it \
 None yet

 ## Inference
-### vllm
+### vLLm
 #### Single-node inference
 ```bash
 ## Launch with serve
@@ -65,7 +65,7 @@ vllm serve Qwen/Qwen3.6-27B \
    --trust-remote-code \
    --dtype bfloat16 \
    --tensor-parallel-size 2 \
-    --gpu-memory-utilization 0.925 \
+    --gpu-memory-utilization 0.9 \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_coder

@@ -103,7 +103,7 @@ curl http://localhost:8001/v1/chat/completions \
 </div>

 ### Accuracy
- Inference framework: vllm
+- Inference framework: vLLm
 - Test data: humaneval, gsm8k
 - Accelerator used: bw1000
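
The only functional change in this commit lowers `--gpu-memory-utilization` from 0.925 to 0.9. In vLLM this flag is the fraction of each device's memory that the engine may claim, so the change leaves more headroom per card. The sketch below estimates that headroom. The 64 GiB device size is a hypothetical value chosen for illustration (the diff does not state the memory of the bw1000 card), and `reserved_gib` is an illustrative helper, not part of vLLM.

```python
def reserved_gib(total_gib: float, utilization: float) -> float:
    """Memory (GiB) one device may hand to the engine at a given utilization fraction."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in the range (0, 1]")
    return total_gib * utilization


# Hypothetical per-device memory; replace with the real size of your accelerator.
TOTAL_GIB = 64.0

old_budget = reserved_gib(TOTAL_GIB, 0.925)  # value before this commit
new_budget = reserved_gib(TOTAL_GIB, 0.9)    # value after this commit
freed = old_budget - new_budget              # extra headroom per device

print(f"old budget: {old_budget:.2f} GiB")
print(f"new budget: {new_budget:.2f} GiB")
print(f"headroom gained per device: {freed:.2f} GiB")
```

With `--tensor-parallel-size 2`, as in the serve command, the same headroom would apply to each of the two devices.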