# DeepSeek-R1

## Paper

`DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning`

* https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf


## Model Architecture

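The model is based on the Transformer and adopts Multi-Head Latent Attention (MLA) and the DeepSeekMoE architecture. MLA shrinks the KV cache, which lowers memory usage and enables efficient inference, while DeepSeekMoE balances the load across experts with an auxiliary-loss-free strategy.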

![alt text](readme_imgs/arch.png)
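
To make the KV-cache saving concrete, the sketch below compares how many values are cached per token per layer under standard multi-head attention (a key and a value vector for every head) and under MLA (one compressed latent vector plus a small shared positional key). The dimensions used (128 heads, head dimension 128, latent dimension 512, positional dimension 64) are illustrative assumptions rather than values stated in this README, and the function names are hypothetical.

```shell
# Values cached per token per layer with standard multi-head attention:
# one key vector and one value vector for every head.
mha_cache_per_token() {
    n_heads=$1; head_dim=$2
    echo $((2 * n_heads * head_dim))
}

# Values cached per token per layer with MLA: a single compressed KV latent
# plus a shared positional (RoPE) key, independent of the head count.
mla_cache_per_token() {
    latent_dim=$1; rope_dim=$2
    echo $((latent_dim + rope_dim))
}

# Illustrative dimensions (assumed, not taken from this README).
mha=$(mha_cache_per_token 128 128)
mla=$(mla_cache_per_token 512 64)
echo "MHA: $mha values, MLA: $mla values, ratio: about $((mha / mla))x"
```

Because the MLA cache size does not grow with the number of heads, the saving becomes larger as the head count increases.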

## Algorithm

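DeepSeek-R1's architecture combines MLA, DeepSeekMoE, an auxiliary-loss-free load-balancing strategy, multi-token prediction, and FP8 mixed-precision training to improve model performance and training efficiency. The model is then trained with reinforcement learning to strengthen its reasoning ability. Together, these designs let DeepSeek-R1 maintain high performance while substantially reducing training cost.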


## Environment Setup

### Docker (Option 1)
    
    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-py3.10-dtk24.04.3-ubuntu20.04

    docker run --shm-size 500g --network=host --name=dpskr1 --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash
    

    # Set up the model environment
    
    cd inference
    pip install -r requirements.txt
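
The `docker run` command above passes `/dev/kfd` and `/dev/dri` into the container and mounts `/opt/hyhal` read-only, so it can help to confirm those paths exist on the host before starting the container. The following is a minimal preflight sketch; the helper name `missing_paths` is hypothetical and not part of this project.

```shell
# Print every path from the argument list that does not exist on this host.
missing_paths() {
    for p in "$@"; do
        [ -e "$p" ] || echo "$p"
    done
    return 0
}

# Paths the docker run command passes through or mounts.
missing=$(missing_paths /dev/kfd /dev/dri /opt/hyhal)
if [ -n "$missing" ]; then
    echo "Missing on host:" $missing
else
    echo "All required device and driver paths are present."
fi
```

If any path is reported missing, the DCU driver stack is probably not installed on the host and the container will not see the devices.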


### Dockerfile (Option 2)

    docker build -t <IMAGE_NAME>:<TAG> .

    docker run --shm-size 500g --network=host --name=dpskr1 --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash

    cd inference
    pip install -r requirements.txt

### Anaconda (Option 3)
The detailed steps for local configuration and compilation are as follows:

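The special deep learning libraries this project requires for DCU cards can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community.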
```
DTK driver: dtk24.04.3
python: 3.10
torch: 2.3.0
```
`Tip: the versions of the DTK driver, python, torch, and other DCU-related tools listed above must match each other exactly.`
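
The version requirement above can be checked mechanically. This is a minimal sketch, assuming `python3` and `torch` are importable in the active environment. The helper name `version_matches` is hypothetical, and the exact version string a DCU build of torch reports (for example, a build suffix) is not stated in this README, so the check accepts any suffix after the expected version.

```shell
# Succeed when the actual version equals the expected one or extends it
# with a further component or build suffix (3.10 matches 3.10.12, not 3.11.0).
version_matches() {
    expected=$1; actual=$2
    case "$actual" in
        "$expected"|"$expected".*|"$expected"+*) return 0 ;;
        *) return 1 ;;
    esac
}

# Query the installed versions; an empty string means the query failed.
py_ver=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null) || py_ver=""
torch_ver=$(python3 -c 'import torch; print(torch.__version__)' 2>/dev/null) || torch_ver=""

version_matches 3.10 "$py_ver" || echo "python: expected 3.10, found '${py_ver}'"
version_matches 2.3.0 "$torch_ver" || echo "torch: expected 2.3.0, found '${torch_ver}'"
```

The script prints nothing when both versions match and one line per mismatch otherwise.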

Install the other, non-deep-learning libraries from requirements.txt:
    
    conda create -n Deepseek-R1 python=3.10
    conda activate Deepseek-R1 

    cd /path_code/inference
    pip install -r requirements.txt



## Dataset



## Training



## Inference

### Set up the ollama environment

```
git clone -b 0.5.7 http://developer.sourcefind.cn/codes/OpenDAS/ollama.git --depth=1

cd ollama

# Build

wget https://go.dev/dl/go1.23.4.linux-amd64.tar.gz
tar -C /usr/local -xzf go1.23.4.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin

# Switch the Go module proxy to speed up downloads (optional)
go env -w GOPROXY=https://goproxy.cn,direct

# Run the build

export LIBRARY_PATH=/opt/dtk/lib:$LIBRARY_PATH
make -j 16
go build .
```

### Run

#### deepseek-r1 model inference (for other models, see [ollama.com](https://ollama.com/library))

##### Start the server
```
# Set this to the version that matches your device:
#   Z100L (gfx906) -> 9.0.6; K100 (gfx926) -> 9.2.6; K100AI (gfx928) -> 9.2.8
export HSA_OVERRIDE_GFX_VERSION=<device version>

# For example: export HSA_OVERRIDE_GFX_VERSION=9.2.8

# Comma-separated indices of the cards to use (0,1,2,3,4,5,6,...)
export ROCR_VISIBLE_DEVICES=<card indices>

# For example: export ROCR_VISIBLE_DEVICES=0,1,2,3


./ollama serve

```
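
The three device mappings listed in the server block follow one pattern: the three characters after `gfx` become the dotted version. A small sketch of that conversion is below; the helper name `gfx_to_hsa_version` is hypothetical, and it assumes the architecture name is `gfx` followed by exactly three characters, as in the listed examples.

```shell
# Convert an architecture name such as gfx928 into the dotted form 9.2.8.
# Assumes "gfx" followed by exactly three characters; any other input
# is printed back without dots.
gfx_to_hsa_version() {
    digits=${1#gfx}
    echo "$digits" | sed 's/^\(.\)\(.\)\(.\)$/\1.\2.\3/'
}

export HSA_OVERRIDE_GFX_VERSION=$(gfx_to_hsa_version gfx928)
echo "HSA_OVERRIDE_GFX_VERSION=$HSA_OVERRIDE_GFX_VERSION"
```
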
##### Start the chat client

Open a new terminal and enter the container.


```
cd ollama

./ollama run deepseek-r1:671b
```
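
Besides the interactive chat, a running `ollama serve` process also accepts HTTP requests. The sketch below sends a single prompt to ollama's `/api/generate` endpoint with `curl`. The helper names and the `OLLAMA_URL` variable are hypothetical, the address assumes ollama's default port 11434 on the local machine, and no JSON escaping is performed, so it is only suitable for simple prompts.

```shell
# Build the JSON body for ollama's /api/generate endpoint.
# No JSON escaping is done, so the prompt must not contain quotes or backslashes.
build_generate_payload() {
    printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# Send one prompt to a running server and print the raw JSON response.
# OLLAMA_URL is a hypothetical override; 11434 is ollama's default port.
ask_ollama() {
    curl -s "${OLLAMA_URL:-http://localhost:11434}/api/generate" \
        -d "$(build_generate_payload "$1" "$2")"
}

# Usage (requires `./ollama serve` to be running):
#   ask_ollama deepseek-r1:671b "Why is the sky blue?"
```

Setting `"stream":false` asks the server to return one JSON object instead of a stream of partial responses, which is easier to handle in a script.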


## Result

![alt text](readme_imgs/result1.png)

### Accuracy



## Application Scenarios

### Algorithm Category

`Conversational question answering`

### Key Application Industries

`E-commerce, education, broadcast media, transportation, government`

## Pretrained Weights
SCNet high-speed download channel:
[DeepSeek-R1-GGUF](https://www.scnet.cn/ui/aihub/models/acona0zft8/DeepSeek-R1-GGUF)

## Source Repository and Issue Feedback

* https://developer.sourcefind.cn/codes/modelzoo/Deepseek-r1_ollama

## References

* https://github.com/deepseek-ai/DeepSeek-R1

* https://github.com/ollama/ollama