ModelZoo / DeepSeek-Coder_pytorch

Commit 50406f0b, authored May 24, 2024 by dengjb: "first commit" (parent 793db024)
Showing 6 changed files with 154 additions and 2 deletions:

- README.md (+112, -2)
- assets/dataset.png (+0, -0)
- data/test_dataset_download_here.txt (+0, -0)
- finetune/train_ft.sh (+29, -0)
- inference.py (+13, -0)
- weights/model_download_here.txt (+0, -0)
README.md

# DeepSeek-Coder
The DeepSeek Coder family includes 1B, 5.7B, 6.7B, and 33B variants, covering a wide range of code and natural-language-processing tasks.
## Paper
`DeepSeek-Coder: When the Large Language Model Meets Programming - The Rise of Code Intelligence`
[deepseek-coder](https://arxiv.org/pdf/2401.14196)

DeepSeek-Coder is a family of code-generation models.
## Model Architecture
The DeepSeek-Coder LLM architecture largely follows LLaMA and is built on the same foundation as DeepSeek LLM: each model is a decoder-only Transformer. At comparable sizes, DeepSeek-Coder outperforms models such as CodeLlama, achieving state-of-the-art results.

## Algorithm
The 33B model uses grouped-query attention (GQA), which retains most of the model's representational capacity while improving inference efficiency, whereas the 6.7B and smaller models use multi-head attention (MHA) for maximum representational capacity. All models in the series use RoPE rotary position embeddings, which give the models better length extrapolation.
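RoPE's extrapolation property comes from encoding each token's position as a rotation of consecutive dimension pairs of the query/key vectors, so attention scores depend on relative offsets. A minimal illustrative sketch (not the model's actual implementation, which operates on batched tensors):

```python
import math

def rope(x, pos, base=10000.0):
    """Apply rotary position embedding to a single vector `x` at position `pos`.

    Each dimension pair (x[2i], x[2i+1]) is rotated by the angle
    pos * base**(-2i/d), so position information becomes a rotation."""
    d = len(x)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x[i] * c - x[i + 1] * s,
                    x[i] * s + x[i + 1] * c])
    return out

q = [1.0, 0.0, 0.5, 0.5]
q0 = rope(q, pos=0)   # position 0: zero rotation, vector unchanged
q5 = rope(q, pos=5)   # rotated, but the norm is preserved

norm = lambda v: math.sqrt(sum(t * t for t in v))
```

Because rotations preserve inner products between vectors rotated by the same position, the dot product of a rotated query and key depends only on their positional offset.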

## Environment Setup
Adjust the `-v` mount paths, `docker_name`, and `imageID` below to match your environment.
### Docker (Method 1)
```bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-centos7.6-dtk24.04-py310
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=80G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
cd /your_code_path/deepseek-coder_pytorch
pip install -r requirements.txt
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com
```
### Dockerfile (Method 2)
```bash
cd docker
docker build --no-cache -t deepseek_coder:latest .
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=80G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
cd /your_code_path/deepseek-coder_pytorch
pip install -r requirements.txt
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com
```
### Anaconda (Method 3)
The DCU-specific deep-learning libraries this project requires can be downloaded from the [光合](https://developer.hpccube.com/tool/) developer community.
```
DTK driver: dtk24.04
python: python3.10
torch: 2.1.0
```
`Tip: the DTK driver, Python, and torch versions above must match each other exactly.`

The remaining (non-DCU-specific) dependencies can be installed as follows:
```bash
pip install -r requirements.txt
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com
```
## Dataset
Fine-tuning uses the nickrosh/Evol-Instruct-Code-80k-v1 sample dataset:
[download](https://hf-mirror.com/datasets/nickrosh/Evol-Instruct-Code-80k-v1)
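The dataset is distributed as a JSON array of instruction/output records (field names assumed here from the dataset card; verify against the downloaded file). A sketch of what a record looks like and how the training script would load it, using a hypothetical `sample.json`:

```python
import json

# A record shaped like the Evol-Instruct-Code-80k-v1 entries
# (instruction/output fields; assumed from the dataset card).
sample = [
    {
        "instruction": "Write a Python function that reverses a string.",
        "output": "def reverse(s):\n    return s[::-1]",
    }
]

# Round-trip through a file, as the fine-tuning script reads a .json path.
with open("sample.json", "w") as f:
    json.dump(sample, f)

with open("sample.json") as f:
    data = json.load(f)

print(len(data), sorted(data[0].keys()))
```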

## Training
Single node, four GPUs.

Adjust parameters in `train_ft.sh`; the following are required:
- DATA_PATH="{dataset path}"
- OUTPUT_PATH="{training output path}"
- MODEL_PATH="{pretrained model path}"
```bash
cd finetune
./train_ft.sh
```
## Inference
Inference is based on Hugging Face Transformers.

After downloading, the model is expected in the `weights` folder by default; alternatively, change the `model_name` parameter in `inference.py`.
```bash
HIP_VISIBLE_DEVICES=0 python inference.py
```
prompt: Write a FIFO module with read and write ports in Verilog (用verilog写一个读和写的FIFO模块)

result: ![result.png](assets/result.png)
### Accuracy
Not yet available.

## Application Scenarios
### Algorithm Category
Code generation
### Key Industries
Manufacturing, energy, education
## Pretrained Weights
The model directory is laid out as follows:
```bash
# deepseek-coder-6.7b-instruct/
├── config.json
├── generation_config.json
├── LICENSE
├── model-00001-of-00002.safetensors
├── model-00002-of-00002.safetensors
├── model.safetensors.index.json
├── pytorch_model-00001-of-00002.bin
├── pytorch_model-00002-of-00002.bin
├── pytorch_model.bin.index.json
├── README.md
├── tokenizer_config.json
└── tokenizer.json
```
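Before loading, it can help to confirm the directory actually contains the files Transformers needs. A small sketch, with the file list taken from the tree above and `missing_files` a hypothetical helper (not part of this repo):

```python
from pathlib import Path

# Minimum files a safetensors checkpoint directory should contain,
# taken from the directory tree above.
REQUIRED = [
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "model.safetensors.index.json",
]

def missing_files(weights_dir):
    """Return the required files that are absent from weights_dir."""
    root = Path(weights_dir)
    return [name for name in REQUIRED if not (root / name).is_file()]

# e.g. missing_files("weights/DeepSeek-Coder/deepseek-coder-6.7b-instruct")
# returns [] when the download is complete.
```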
## Source Repository and Issue Reporting
- https://developer.hpccube.com/codes/modelzoo/deepseek-coder_pytorch

## References
- https://github.com/deepseek-ai/DeepSeek-Coder
- https://huggingface.co/deepseek-ai
assets/dataset.png (new file, 26.6 KB)

data/test_dataset_download_here.txt (new file)

finetune/train_ft.sh (new file)
```bash
#!/bin/bash
export HIP_VISIBLE_DEVICES=0,1,2,3

# test dataset
DATA_PATH="../data/Evol-Instruct-Code-80k-v1/EvolInstruct-Code-80k.json"
# output dir
OUTPUT_PATH="../outputs"
MODEL_PATH="../weights/DeepSeek-Coder/deepseek-coder-6.7b-instruct"

deepspeed --include="localhost:0,1,2,3" finetune_deepseekcoder.py \
    --model_name_or_path $MODEL_PATH \
    --data_path $DATA_PATH \
    --output_dir $OUTPUT_PATH \
    --num_train_epochs 3 \
    --model_max_length 2048 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 100 \
    --save_total_limit 100 \
    --learning_rate 2e-5 \
    --warmup_steps 10 \
    --logging_steps 1 \
    --lr_scheduler_type "cosine" \
    --gradient_checkpointing True \
    --report_to "all" \
    --deepspeed configs/ds_config_zero3.json \
    --bf16 True
```
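With the settings above, the effective global batch size is the per-device batch size times the number of GPUs times the gradient-accumulation steps. A quick check of the arithmetic:

```python
# Values taken from train_ft.sh
per_device_train_batch_size = 2
num_gpus = 4                      # HIP_VISIBLE_DEVICES=0,1,2,3
gradient_accumulation_steps = 4

effective_batch = per_device_train_batch_size * num_gpus * gradient_accumulation_steps
print(effective_batch)  # 32 samples per optimizer step
```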
inference.py (new file)
```python
import torch
import time
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "./weights/DeepSeek-Coder/deepseek-coder-6.7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, trust_remote_code=True, torch_dtype=torch.bfloat16
).cuda()

messages = [
    {'role': 'user', 'content': "用verilog写一个读和写的FIFO模块。"}
]
print(tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False))

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(
    inputs,
    max_new_tokens=1024,
    do_sample=False,
    top_k=50,
    top_p=0.95,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
```
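`model.generate` returns the prompt tokens followed by the newly generated tokens, which is why the script decodes only `outputs[0][len(inputs[0]):]`. A toy illustration of that slicing with plain lists standing in for token-id tensors:

```python
prompt_ids = [101, 7, 8, 9]              # stand-ins for the tokenized prompt
generated_ids = [42, 43, 44, 2]          # stand-ins for newly generated tokens
output_ids = prompt_ids + generated_ids  # what generate() returns for one sequence

# Slice off the prompt so only the model's completion is decoded.
completion = output_ids[len(prompt_ids):]
print(completion)  # [42, 43, 44, 2]
```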
weights/model_download_here.txt (new file)