Commit 61f51a8e authored by shihm's avatar shihm

readme

parent b80e3d36
# Baichuan-M3
## Paper
[Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making](https://arxiv.org/abs/2602.06570)
## Model Overview
Baichuan-M3 is the new-generation medically enhanced large language model from Baichuan Intelligence, a major milestone following Baichuan-M2.
```bash
python inference.py
```
Transformers inference does not support the Baichuan-M3-235B-GPTQ-INT4 model.
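As a companion to `inference.py`, the sketch below shows what a typical Transformers-based inference script for this model might look like. It is an assumption based on standard Transformers chat usage, not the repository's actual script; the model id `baichuan-inc/Baichuan-M3-235B` is taken from the vLLM serve command in this README.

```python
# Hypothetical Transformers inference sketch for Baichuan-M3-235B.
# The repository's own inference.py is authoritative; this only
# illustrates the standard load -> chat-template -> generate flow.

def build_messages(question: str) -> list[dict]:
    """Wrap a user question in the chat-message format expected by
    tokenizer.apply_chat_template."""
    return [{"role": "user", "content": question}]

def main() -> None:
    # Heavy imports and model loading are kept inside main() so the
    # helper above can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "baichuan-inc/Baichuan-M3-235B"  # assumed Hub id
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
    )
    inputs = tokenizer.apply_chat_template(
        build_messages("What are common causes of chest pain?"),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```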
### vLLM
Start the vLLM server:
```bash
vllm serve baichuan-inc/Baichuan-M3-235B \
--reasoning-parser qwen3 \
--tensor-parallel-size 8 \
--trust-remote-code \
--port 8000 \
--gpu-memory-utilization 0.95 \
--served-model-name baichuan-m3
```
Once the server has started, it can be accessed as follows:
......
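The access details are elided above; as a hedged sketch, vLLM exposes an OpenAI-compatible chat-completions API, so a request can be built against the port and `--served-model-name` from the serve command. The endpoint path and payload shape below follow the OpenAI convention that vLLM implements:

```python
# Minimal client sketch for the vLLM OpenAI-compatible server started
# above (port 8000, served model name "baichuan-m3"). Uses only the
# standard library; assumes the server is running locally.
import json
from urllib import request

def chat_payload(prompt: str, model: str = "baichuan-m3") -> dict:
    """Build a chat-completions request body for the served model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def ask(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST the prompt to the running server and return the reply text."""
    req = request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("What are common causes of chest pain?"))
```

The same endpoint also works with the official `openai` Python client by pointing its `base_url` at `http://localhost:8000/v1`.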