# Baichuan-M3
## Paper
[Modeling Clinical Inquiry for Reliable Medical Decision-Making](https://arxiv.org/abs/2602.06570)

## Model Overview
Baichuan-M3 is Baichuan Intelligence's new-generation medically enhanced large language model, and a major milestone following Baichuan-M2.
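Unlike earlier approaches that focus mainly on static question answering or superficial role-play, Baichuan-M3 is trained to explicitly model the clinical decision-making process, with the goal of improving usability and reliability in real medical settings. Rather than merely producing answers that "sound plausible" or repeatedly giving vague advice such as "you should see a doctor as soon as possible", the model is trained to proactively gather key clinical information, build a coherent chain of medical reasoning, and systematically constrain hallucination-prone behavior.

Highlights:

- Surpasses GPT-5.2: outperforms OpenAI's latest model across HealthBench, HealthBench-Hard, hallucination evaluation, and SCAN-bench, setting a new SOTA for medical AI.
- High-fidelity clinical inquiry: the only model ranked first in all three SCAN-bench dimensions (clinical inquiry, laboratory testing, and diagnosis).
- Low hallucination, high reliability: with Fact-Aware RL, achieves a lower hallucination rate than GPT-5.2 without external tool assistance.
- Efficient deployment: W4 quantization cuts memory usage to 26% of the original; Gated Eagle3 speculative decoding delivers a 96% speedup.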

## Environment Dependencies

|   Software   |                    Version                     |
| :----------: | :--------------------------------------------: |
|     DTK      |                    26.04.2                     |
|    python    |                    3.10.12                     |
| transformers |                     4.57.6                     |
|     vllm     | 0.11.0+das.opt1.rc2.dtk2604.20260128.g0bf89b0c | 
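
The Python packages in the table above can be checked from inside the container. The following is a minimal sketch, not part of the original project: the helper names `installed_version` and `find_mismatches` are hypothetical, and the exact version string reported for the vendor build of `vllm` may be formatted differently from the table.

```python
from importlib import metadata

# Expected versions, copied from the table above.
EXPECTED = {
    "transformers": "4.57.6",
    "vllm": "0.11.0+das.opt1.rc2.dtk2604.20260128.g0bf89b0c",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Print one line per package that does not match the table.
    for package, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{package}: expected {wanted}, found {found}")
```

An empty output means both packages match the table; a `found None` entry means the package is not installed in the current environment.
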

Recommended image: harbor.sourcefind.cn:5443/dcu/admin/base/vllm:0.11.0-ubuntu22.04-dtk26.04-0130-py3.10-20260204

- Adjust the `-v` mount paths to match where your model is actually stored

```bash
docker run -it \
    --shm-size 200g \
    --network=host \
    --name baichuan_m3 \
    --privileged \
    --device=/dev/kfd \
    --device=/dev/dri \
    --device=/dev/mkfd \
    --group-add video \
    --cap-add=SYS_PTRACE \
    --security-opt seccomp=unconfined \
    -u root \
    -v /opt/hyhal/:/opt/hyhal/:ro \
    -v /path/your_code_data/:/path/your_code_data/ \
    harbor.sourcefind.cn:5443/dcu/admin/base/vllm:0.11.0-ubuntu22.04-dtk26.04-0130-py3.10-20260204 bash
```
More images are available for download from [Guangyuan (光源)](https://sourcefind.cn/#/service-list).

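The special deep learning libraries that this project requires for DCU accelerators can be downloaded and installed from the [Guanghe (光合)](https://developer.sourcefind.cn/tool/) developer community.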


## Dataset

`Not available yet`

## Training

`Not available yet`

## Inference

### vllm

#### Multi-node inference
Set the following environment variables on every node:

```bash
export ALLREDUCE_STREAM_WITH_COMPUTE=1
export VLLM_HOST_IP=x.x.x.x # IP of this compute node; use the address of the IB interface named in the *_SOCKET_IFNAME variables
export NCCL_SOCKET_IFNAME=ibxxxx
export GLOO_SOCKET_IFNAME=ibxxxx
export NCCL_IB_HCA=mlx5_0:1 # name of the IB adapter in your environment
unset NCCL_ALGO
export NCCL_MIN_NCHANNELS=16
export NCCL_MAX_NCHANNELS=16
export NCCL_NET_GDR_READ=1
export HIP_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export VLLM_SPEC_DECODE_EAGER=1
export VLLM_MLA_DISABLE=0
export VLLM_USE_FLASH_MLA=1
export VLLM_RPC_TIMEOUT=1800000

# Additional environment variable recommended for K100_AI clusters:
export VLLM_ENFORCE_EAGER_BS_THRESHOLD=44

# Bind each rank to a NUMA node on Hygon CPUs
export VLLM_NUMA_BIND=1
export VLLM_RANK0_NUMA=0
export VLLM_RANK1_NUMA=1
export VLLM_RANK2_NUMA=2
export VLLM_RANK3_NUMA=3
export VLLM_RANK4_NUMA=4
export VLLM_RANK5_NUMA=5
export VLLM_RANK6_NUMA=6
export VLLM_RANK7_NUMA=7
```
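The NUMA binding lines above follow a simple pattern: rank `i` is pinned to NUMA node `i`. As a rough sketch, assuming that one-to-one mapping also suits your host (it requires at least as many NUMA nodes as ranks), the exports can be generated rather than typed by hand. The helper name `numa_bind_exports` is hypothetical.

```python
def numa_bind_exports(num_ranks):
    """Build shell export lines that bind rank i to NUMA node i."""
    lines = ["export VLLM_NUMA_BIND=1"]
    for rank in range(num_ranks):
        lines.append(f"export VLLM_RANK{rank}_NUMA={rank}")
    return lines


if __name__ == "__main__":
    # Eight ranks per node, matching HIP_VISIBLE_DEVICES=0,1,2,3,4,5,6,7.
    print("\n".join(numa_bind_exports(8)))
```

The printed lines can be pasted into the shell or written to a file and sourced before starting Ray.
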
Start the Ray cluster.
`x.x.x.x` is the head node's `VLLM_HOST_IP` from the first step.

```bash
# Run on the head node
ray start --head --node-ip-address=x.x.x.x --port=6379 --num-gpus=8 --num-cpus=32
# Run on each worker node
ray start --address='x.x.x.x:6379' --num-gpus=8 --num-cpus=32
```
Start the vLLM server:
```bash
vllm serve /path/to/baichuan-inc/Baichuan-M3-235B \
    --host x.x.x.x --port 8000 \
    --distributed-executor-backend ray \
    --tensor-parallel-size 8 \
    --pipeline-parallel-size 2 \
    --gpu-memory-utilization 0.9 \
    --served-model-name baichuan-m3 \
    --reasoning-parser deepseek_r1
```
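The parallelism flags determine how many accelerators the Ray cluster must provide: the number of workers is the tensor-parallel size multiplied by the pipeline-parallel size. A small sketch of that arithmetic follows; the function names are hypothetical, and the figure of 8 devices per node is taken from the `--num-gpus=8` value passed to `ray start`.

```python
def required_gpus(tensor_parallel_size, pipeline_parallel_size):
    """Total accelerators needed: one worker per (TP rank, PP stage) pair."""
    return tensor_parallel_size * pipeline_parallel_size


def nodes_needed(total_gpus, gpus_per_node):
    """Smallest number of nodes that can host total_gpus (ceiling division)."""
    return -(-total_gpus // gpus_per_node)


if __name__ == "__main__":
    total = required_gpus(tensor_parallel_size=8, pipeline_parallel_size=2)
    print(total, nodes_needed(total, gpus_per_node=8))
```

With the values used in this README the result is 16 accelerators spread over 2 nodes, which agrees with the minimum card count listed in the pretrained weights table.
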
Once the server is up, it can be queried as follows:
```bash
curl http://localhost:8000/v1/chat/completions   \
    -H "Content-Type: application/json"  \
    -d '{
        "model": "baichuan-m3",
        "messages": [
            {
                "role": "user",
                "content": "下午头痛怎么办?"
            }
        ]
}'
```
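The same request can be sent from Python using only the standard library. This is a hedged sketch rather than an official client: `build_payload`, `extract_answer`, and `ask` are hypothetical helper names, and the `reasoning_content` field is an assumption about how a server started with `--reasoning-parser` exposes the model's reasoning, so it is read defensively with `.get`.

```python
import json
import urllib.request


def build_payload(question, model="baichuan-m3"):
    """Build the chat-completions request body used in the curl example."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }


def extract_answer(response):
    """Return (reasoning, answer) from a parsed chat-completions response.

    reasoning is None when the server does not return a separate
    reasoning_content field (assumed field name, see the note above).
    """
    message = response["choices"][0]["message"]
    return message.get("reasoning_content"), message.get("content")


def ask(question, base_url="http://localhost:8000"):
    """POST the question to a running server and return (reasoning, answer)."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(question)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_answer(json.load(reply))
```

Calling `ask("下午头痛怎么办?")` against a running server should return the reasoning text (if the server provides it) and the final answer as separate strings.
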
## Results
<div align=center>
    <img src="./doc/result1.png"/>
</div>





### transformers

```python
import os

# Force offline mode so weights are loaded only from the local path.
# Set these before importing transformers so they are sure to take effect.
os.environ['TRANSFORMERS_OFFLINE'] = '1'
os.environ['MODELSCOPE_OFFLINE'] = '1'

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "/path/to/baichuan-inc/Baichuan-M3-235B"

# Load the model in bfloat16 and let device_map="auto" place it across the visible devices.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

messages = [{"role": "user", "content": "I've been having headaches lately, especially worse in the afternoon. What should I do?"}]
# Render the conversation as prompt text; thinking_mode is forwarded to the chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    thinking_mode='on'
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    temperature=0.6
)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(generated_ids[0][len(model_inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
```


### Accuracy
`Accuracy on DCU matches that on GPU. Inference frameworks: vllm, transformers`

## Pretrained Weights
| Model | Weight size | DCU model | Minimum cards required | Download |
|:-----:|:----------:|:----------:|:---------------------:|:----------:|
| Baichuan-M3-235B | 235B | BW1000  | 16  | [ModelScope](https://modelscope.cn/models/baichuan-inc/Baichuan-M3-235B) |

## Source Repository and Issue Reporting
- https://developer.sourcefind.cn/codes/modelzoo/baichuan-m3-235b_vllm

## References
- https://www.baichuan-ai.com/blog/baichuan-M3