"magic_pdf/pre_proc/fix_table.py.bak" did not exist on "f99149b8ddc24251dce6de33cfc4ec09e18821c2"
README.md 4.3 KB
Newer Older
Rayyyyy's avatar
Rayyyyy committed
1
# deepseek-v2
Rayyyyy's avatar
Rayyyyy committed
2
3
4
5
6
7
8
9
## 论文
[deepseek-v2](https://arxiv.org/abs/2405.04434)

## 模型结构
<div align=center>
    <img src="./doc/model.png"/>
</div>

Rayyyyy's avatar
Rayyyyy committed
10
11
## 算法原理
DeepSeek-V2对模型框架进行了全方位的创新,提出了媲美MHA的MLA(Multi-head Latent Attention)架构,大幅减少计算量和推理显存;自研Sparse结构DeepSeekMoE进一步将计算量降低到极致,两者结合最终实现模型性能跨级别的提升。
Rayyyyy's avatar
Rayyyyy committed
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29

## 环境配置
-v 路径、docker_name和imageID根据实际情况修改

### Docker(方法一)
```bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-centos7.6-dtk24.04-py310
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=80G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash

cd /your_code_path/deepseek-v2_pytorch
pip install -r requirements.txt
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com
```

### Dockerfile(方法二)
```bash
cd docker
Rayyyyy's avatar
Rayyyyy committed
30
docker build --no-cache -t deepseek-v2:latest .
Rayyyyy's avatar
Rayyyyy committed
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=80G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash

cd /your_code_path/deepseek-v2_pytorch
pip install -r requirements.txt
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com
```

### Anaconda(方法三)
关于本项目DCU显卡所需的特殊深度学习库可从[光合](https://developer.hpccube.com/tool/)开发者社区下载安装。
```bash
DTK驱动: dtk24.04
python: python3.10
torch: 2.1.0
```
`Tips:以上dtk驱动、python、torch等DCU相关工具版本需要严格一一对应`

其它非深度学习库安装方式如下:
```bash
pip install -r requirements.txt
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com
```

## 数据集
暂无

## 训练
Rayyyyy's avatar
Rayyyyy committed
59
暂无
Rayyyyy's avatar
Rayyyyy committed
60
61

## 推理
Rayyyyy's avatar
Rayyyyy committed
62
基于**Huggingface's Transformers**进行推理,根据本地模型地址设置`model_name_or_path`参数。
Rayyyyy's avatar
Rayyyyy committed
63
64

如未下载预训练模型,代码会根据选择自动进行下载,当前可用模型为:"deepseek-ai/DeepSeek-V2-Lite"、"deepseek-ai/DeepSeek-V2-Lite-Chat"。
Rayyyyy's avatar
Rayyyyy committed
65

Rayyyyy's avatar
Rayyyyy committed
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
### 文本扩写
```bash
export HSA_FORCE_FINE_GRAIN_PCIE=1
export USE_MIOPEN_BATCHNORM=1

python text_completion.py
```

### 对话
```bash
export HSA_FORCE_FINE_GRAIN_PCIE=1
export USE_MIOPEN_BATCHNORM=1

python chat_completion.py
```
Rayyyyy's avatar
Rayyyyy committed
81

Rayyyyy's avatar
Rayyyyy committed
82
83
84
85
86
87
## result

</div>
    <img src="./doc/chat_completion.png" width="500" height="300"/>
</div>

Rayyyyy's avatar
Rayyyyy committed
88
89
90
91
92
93
94
95
### 精度
暂无

## 应用场景
### 算法类别
对话问答

### 热点应用行业
Rayyyyy's avatar
Rayyyyy committed
96
金融,广媒,教育
Rayyyyy's avatar
Rayyyyy committed
97
98

## 预训练权重
Rayyyyy's avatar
Rayyyyy committed
99
100
[Huggingface-deepseek-ai](https://huggingface.co/deepseek-ai)

Rayyyyy's avatar
Rayyyyy committed
101
102
103
104
105
106
107
108
109
模型目录结构如下:
```bash
├── model_save_path
│   ├── DeepSeek-V2
│       ├── LICENSE
│       ├── README.md
│       ├── config.json
│       ├── configuration_deepseek.py
│       ├── generation_config.json
Rayyyyy's avatar
Rayyyyy committed
110
111
112
113
114
│       ├── model-00001-of-000055.safetensors
│       ├── model-00002-of-000055.safetensors
│       ...
│       ├── model-00054-of-000055.safetensors
│       ├── model-00055-of-000055.safetensors
Rayyyyy's avatar
Rayyyyy committed
115
116
117
118
119
120
121
122
123
124
125
│       ├── model.safetensors.index.json
│       ├── modeling_deepseek.py
│       ├── tokenization_deepseek_fast.py
│       ├── tokenizer.json
│       └── tokenizer_config.json
│   ├── DeepSeek-V2-Lite
│       ├── LICENSE
│       ├── README.md
│       ├── config.json
│       ├── configuration_deepseek.py
│       ├── generation_config.json
Rayyyyy's avatar
Rayyyyy committed
126
127
128
129
│       ├── model-00001-of-000004.safetensors
│       ├── model-00002-of-000004.safetensors
│       ├── model-00003-of-000004.safetensors
│       ├── model-00004-of-000004.safetensors
Rayyyyy's avatar
Rayyyyy committed
130
131
132
133
134
135
136
137
138
139
140
141
│       ├── model.safetensors.index.json
│       ├── modeling_deepseek.py
│       ├── tokenization_deepseek_fast.py
│       ├── tokenizer.json
│       └── tokenizer_config.json
```

## 源码仓库及问题反馈
- https://developer.hpccube.com/codes/modelzoo/deepseek-v2_pytorch

## 参考资料
- https://github.com/deepseek-ai/DeepSeek-V2
Rayyyyy's avatar
Rayyyyy committed
142
- https://huggingface.co/deepseek-ai