Commit 61f51a8e authored Mar 13, 2026 by shihm

Showing 1 changed file with 8 additions and 7 deletions

README.md +8 -7

--- a/README.md
+++ b/README.md
 # Baichuan-M3
 ## Paper
-[Modeling Clinical Inquiry for Reliable Medical Decision-Making](https://arxiv.org/abs/2602.06570)
+[Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making](https://arxiv.org/abs/2602.06570)
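 ## Model Overview
 Baichuan-M3 is a new-generation medically enhanced large language model from Baichuan Intelligence (百川智能) and an important milestone following Baichuan-M2.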
@@ -61,6 +61,7 @@ docker run -it \
 ```bash
 python inference.py
 ```
+Transformers inference does not support the Baichuan-M3-235B-GPTQ-INT4 model.
 ### vllm
@@ -68,12 +69,12 @@ python inference.py
 Start the vllm server:
 ```bash
-vllm serve /baichuan-inc/Baichuan-M3-235B
+vllm serve baichuan-inc/Baichuan-M3-235B \
-    --reasoning-parser qwen3 
+    --reasoning-parser qwen3 \
-    --tensor-parallel-size 8 
+    --tensor-parallel-size 8  \
-    --trust-remote-code
+    --trust-remote-code \
-    --port 8000
+    --port 8000 \
-    --gpu-memory-utilization 0.95 
+    --gpu-memory-utilization 0.95 \
    --served-model-name baichuan-m3 
 ```
 Once startup completes, the server can be accessed as follows: