# DeepSeek-R1-Distill

## Paper

`DeepSeek-R1`

* https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf

## Model Architecture

The distilled models are built on three base model families: Llama 3.1, Llama 3.3, and Qwen2.5. All three are decoder-only architectures.

![alt text](readme_imgs/arch.png)
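
As a rough illustration of what "decoder-only" implies, the sketch below builds the causal attention mask such models rely on: each position may attend only to itself and to earlier positions. The function name `causal_mask` is hypothetical and is not part of any of the base models' code.

```python
def causal_mask(seq_len: int) -> list[list[bool]]:
    """Return a seq_len x seq_len mask.

    mask[i][j] is True when query position i is allowed to attend to key
    position j, which in a decoder-only model means j <= i.
    """
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


if __name__ == "__main__":
    # Print the lower-triangular pattern for a 4-token sequence.
    for row in causal_mask(4):
        print("".join("x" if allowed else "." for allowed in row))
```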

## Algorithm

The DeepSeek-R1-Distill models take strong open-source base models and apply supervised fine-tuning (SFT) on high-quality data generated by `DeepSeek-R1`.
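
A minimal sketch of how one teacher-generated sample might be turned into an SFT training record. Two things here are assumptions rather than facts from this project: the alpaca-style keys (`instruction`, `input`, `output`) and the choice to wrap the reasoning in `<think>` ... `</think>`, which simply mirrors the `</think>` tag visible in the sample response in this README. The helper name `build_sft_record` is hypothetical.

```python
import json


def build_sft_record(question: str, reasoning: str, answer: str) -> dict:
    """Pack one teacher sample into an alpaca-style record (assumed schema)."""
    # The student is trained to reproduce the reasoning followed by the answer.
    target = f"<think>\n{reasoning.strip()}\n</think>\n\n{answer.strip()}"
    return {"instruction": question.strip(), "input": "", "output": target}


if __name__ == "__main__":
    record = build_sft_record(
        question="What is 2 + 3?",
        reasoning="Add the two numbers: 2 + 3 = 5.",
        answer="5",
    )
    print(json.dumps(record, ensure_ascii=False, indent=2))
```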


## Environment Setup

### Docker (Option 1)
    
    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-ubuntu22.04-dtk24.04.3-py3.10

    docker run --shm-size 500g --network=host --name=dpskv3 --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute_path_to_project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash

    pip install https://download.sourcefind.cn:65024/directlink/4/lmslim/DAS1.3/lmslim-0.1.2+das.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl

    pip install https://download.sourcefind.cn:65024/directlink/4/vllm/DAS1.3/vllm-0.6.2+das.opt1.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl

### Dockerfile (Option 2)

    docker build -t <IMAGE_NAME>:<TAG> .

    docker run --shm-size 500g --network=host --name=dpskv3 --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute_path_to_project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash

    pip install https://download.sourcefind.cn:65024/directlink/4/lmslim/DAS1.3/lmslim-0.1.2+das.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl

    pip install https://download.sourcefind.cn:65024/directlink/4/vllm/DAS1.3/vllm-0.6.2+das.opt1.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl
  
  
Environment variables:  
export ALLREDUCE_STREAM_WITH_COMPUTE=1  
export VLLM_NUMA_BIND=1  
export VLLM_RANK0_NUMA=0  
export VLLM_RANK1_NUMA=1  
export VLLM_RANK2_NUMA=2  
export VLLM_RANK3_NUMA=3  
export VLLM_RANK4_NUMA=4  
export VLLM_RANK5_NUMA=5  
export VLLM_RANK6_NUMA=6  
export VLLM_RANK7_NUMA=7  
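
The list above binds rank `i` to NUMA node `i` for eight ranks. The sketch below generates the same `export` lines for an arbitrary number of ranks. It assumes the one-to-one rank-to-node mapping also suits your host, which is worth checking against the machine's actual NUMA layout; the helper names are hypothetical.

```python
def numa_bind_env(num_ranks: int) -> dict[str, str]:
    """Build the environment variables listed above for `num_ranks` ranks."""
    env = {"ALLREDUCE_STREAM_WITH_COMPUTE": "1", "VLLM_NUMA_BIND": "1"}
    for rank in range(num_ranks):
        # Assumption: rank i is bound to NUMA node i, as in the list above.
        env[f"VLLM_RANK{rank}_NUMA"] = str(rank)
    return env


def as_exports(env: dict[str, str]) -> str:
    """Render the variables as shell `export` lines."""
    return "\n".join(f"export {key}={value}" for key, value in env.items())


if __name__ == "__main__":
    print(as_exports(numa_bind_env(8)))
```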
## Dataset

The training data must be generated with `DeepSeek-R1`. This project provides a sample dataset for testing; see `examples/toy.json`.
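
LLaMA-Factory looks datasets up by name in its `data/dataset_info.json`, so the sample file likely needs an entry there before the `dataset: deepseek-r1_distill` setting used in the SFT config can resolve. The sketch below prints such an entry. It assumes `toy.json` has been copied into LLaMA-Factory's `data/` directory and uses alpaca-style columns; neither is confirmed by this project, so check the file and the `data/README.md` of the LLaMA-Factory fork first. The function name is hypothetical.

```python
import json


def dataset_info_entry(file_name: str) -> dict:
    """Build a dataset_info.json entry for an alpaca-style file (assumed schema)."""
    return {
        "file_name": file_name,
        # Map LLaMA-Factory's roles onto the column names in the data file.
        "columns": {
            "prompt": "instruction",
            "query": "input",
            "response": "output",
        },
    }


if __name__ == "__main__":
    # Merge this object into data/dataset_info.json by hand or with a script.
    entry = {"deepseek-r1_distill": dataset_info_entry("toy.json")}
    print(json.dumps(entry, ensure_ascii=False, indent=2))
```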

## Training

Training can be done with [LLaMA-Factory](https://developer.sourcefind.cn/codes/OpenDAS/llama-factory). Install it as follows:

```bash
git clone http://developer.sourcefind.cn/codes/OpenDAS/llama-factory.git

cd llama-factory && pip install -e ".[torch,metrics]"
```

### SFT

deepseek_r1_distill.yaml
```yaml
# Single-node, N-GPU training config (modify as needed)
model_name_or_path: /path/to/your/model

stage: sft
do_train: true
finetuning_type: full
deepspeed: examples/deepspeed/ds_z3_config.json

dataset: deepseek-r1_distill
template: qwen
cutoff_len: 2048
max_samples: 5000
overwrite_cache: true
preprocessing_num_workers: 16

output_dir: /path/to/save/checkpoints
logging_steps: 10
save_steps: 500
overwrite_output_dir: true

per_device_train_batch_size: 1
gradient_accumulation_steps: 4
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 1800

val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
```

```bash
cd llama-factory
llamafactory-cli train /path/to/deepseek_r1_distill.yaml
```
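
To sanity-check `save_steps` and `eval_steps` against the configuration above, it can help to estimate how many optimizer steps a run will take. The sketch below is an approximation only: it assumes 8 GPUs and a dataset with at least `max_samples` (5000) examples, and the trainer's own rounding may differ by a step or two. The function name is hypothetical.

```python
import math


def estimate_steps(
    num_samples: int,
    val_size: float,
    num_gpus: int,
    per_device_batch: int,
    grad_accum: int,
    epochs: float,
) -> tuple[int, int, int]:
    """Return (effective batch size, steps per epoch, total steps), approximately."""
    # Hold out the validation split first.
    train_samples = num_samples - round(num_samples * val_size)
    # Samples consumed per optimizer step across all devices.
    effective_batch = num_gpus * per_device_batch * grad_accum
    steps_per_epoch = math.ceil(train_samples / effective_batch)
    return effective_batch, steps_per_epoch, math.ceil(steps_per_epoch * epochs)


if __name__ == "__main__":
    batch, per_epoch, total = estimate_steps(5000, 0.1, 8, 1, 4, 3.0)
    print(f"effective batch={batch}, steps/epoch={per_epoch}, total steps={total}")
```

Under these assumptions the estimated total is below 500, so with `save_steps: 500` and `eval_steps: 500` no intermediate checkpoint or evaluation would trigger before training ends; smaller values may be preferable in that case.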

## Inference

### vLLM Server

```bash
vllm serve /path/to/distill_model --tensor-parallel-size 2 --max-model-len 32768 --enforce-eager
```

```bash
curl http://localhost:8000/v1/completions \
-H "Content-Type: application/json" \
-d '{
    "model": "model_id",
    "prompt": "your prompt",
    "max_tokens": 512,
    "temperature": 0
}'
```
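
The same request can be sent from Python using only the standard library. This is a sketch that mirrors the `curl` call above: the endpoint, field names, and default values are copied from it, while the helper names are hypothetical. Building the request is kept separate from sending it so the payload can be inspected without a running server.

```python
import json
import urllib.request


def build_completion_request(
    base_url: str,
    model: str,
    prompt: str,
    max_tokens: int = 512,
    temperature: float = 0.0,
) -> urllib.request.Request:
    """Build a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


if __name__ == "__main__":
    # Requires the vLLM server started above to be reachable on localhost:8000.
    req = build_completion_request("http://localhost:8000", "model_id", "your prompt")
    print(complete(req))
```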

## Results

```bash
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "/home/modelzoo/DeepSeek-R1-Distill-Qwen-14B/",
        "prompt": "甲乙两班共有学生98人,甲班比乙班多6人,求两班各有多少人?",
        "max_tokens": 300,
        "temperature": 0
    }'
```

```json
{"id":"cmpl-5473237b46054a98ba27906a4b099e33","object":"text_completion","created":1737515343,"model":"/home/modelzoo/DeepSeek-R1-Distill-Qwen-14B/","choices":[{"index":0,"text":"(用方程解)\n\n首先,设乙班有x人,那么甲班就有x + 6人。\n\n根据总人数,可以列出方程:x + (x + 6) = 98。\n\n解这个方程,得到x = 41。\n\n因此,乙班有41人,甲班有47人。\n</think>\n\n**解答:**\n\n设乙班有 \\( x \\) 人,则甲班有 \\( x + 6 \\) 人。\n\n根据题意,两班共有学生98人,可以列出方程:\n\n\\[\nx + (x + 6) = 98\n\\]\n\n解方程:\n\n\\[\n2x + 6 = 98\n\\]\n\n\\[\n2x = 98 - 6\n\\]\n\n\\[\n2x = 92\n\\]\n\n\\[\nx = 46\n\\]\n\n因此,乙班有46人,甲班有:\n\n\\[\nx + 6 = 46 + 6 = 52\n\\]\n\n**答案:**\n\n甲班有 \\(\\boxed{52}\\) 人,乙班有 \\(\\boxed{46}\\) 人。","logprobs":null,"finish_reason":"stop","stop_reason":null,"prompt_logprobs":null}],"usage":{"prompt_tokens":27,"total_tokens":285,"completion_tokens":258}}
```
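
The completion text above contains a draft reasoning segment, a closing `</think>` tag, and then the final answer. A small sketch of separating the two parts, assuming the tag appears at most once; the function name `split_reasoning` is hypothetical.

```python
def split_reasoning(text: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split a completion into (reasoning, answer) at the closing think tag."""
    reasoning, separator, answer = text.partition(close_tag)
    if not separator:
        # No tag found: treat the whole text as the answer.
        return "", text.strip()
    return reasoning.strip(), answer.strip()


if __name__ == "__main__":
    sample = "draft reasoning\n</think>\n\nfinal answer"
    reasoning, answer = split_reasoning(sample)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

In the sample response above, the draft before `</think>` states x = 41 while the final answer derives x = 46 (and 46 + 52 = 98), so it is usually the part after the tag that should be shown to users or scored.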

### Accuracy

Accuracy is consistent with that on NVIDIA GPUs.

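## Application Scenarios

### Algorithm Category

`Conversational Q&A`

### Key Application Industries

`E-commerce, education, broadcast media`

## Pretrained Weights

|Model|Download|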
|:---:|:---:|
|DeepSeek-R1-Distill-Qwen-1.5B|  [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) \| [SCNet]|
|DeepSeek-R1-Distill-Qwen-7B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) \| [SCNet] |
|DeepSeek-R1-Distill-Llama-8B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) \| [SCNet]|
|DeepSeek-R1-Distill-Qwen-14B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) \| [SCNet]|
|DeepSeek-R1-Distill-Qwen-32B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) \| [SCNet]|
|DeepSeek-R1-Distill-Llama-70B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Llama-70B) \| [SCNet]|
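
The checkpoints in the table can also be fetched from a script. The sketch below assumes the `huggingface_hub` package is installed (it is not listed among this project's dependencies) and points it at the same mirror the table links to via the `HF_ENDPOINT` environment variable. The helper names are hypothetical.

```python
import os


def distill_repo_id(base: str, size: str) -> str:
    """Return the repository id for a row of the table, e.g. ("Qwen", "14B")."""
    return f"deepseek-ai/DeepSeek-R1-Distill-{base}-{size}"


def download(base: str, size: str, local_dir: str) -> str:
    """Download one checkpoint into local_dir and return the local path."""
    # The endpoint is read when huggingface_hub is first imported, so set it
    # before the import; it has no effect if the package was imported earlier.
    os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=distill_repo_id(base, size), local_dir=local_dir)


if __name__ == "__main__":
    # Only prints the repository id; call download(...) to fetch the weights.
    print(distill_repo_id("Qwen", "14B"))
```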


## Source Repository and Issue Feedback

* https://developer.sourcefind.cn/codes/modelzoo/deepseek-r1-distill_vllm

## References

* https://github.com/deepseek-ai/DeepSeek-R1