ModelZoo / GPT2_pytorch
Commit 81a090ed, authored Oct 11, 2023 by hepj987
Add inference result
Parent: d0d55509
Pipeline #588 failed
Showing 3 changed files with 25 additions and 2 deletions
README.md (+24 -2)
requirements.txt (+1 -0)
result.jpg (+0 -0)
README.md
@@ -45,6 +45,8 @@ source env.sh
# Install the DTK builds of the dependencies
pip install torch-1.10.0+gite378c3c.abi0.dtk2304-cp37-cp37m-manylinux2014_x86_64.whl
pip install deepspeed-0.9.2+git25d5540.abi0.dtk2304.torch1.10.0-cp37-cp37m-manylinux2014_x86_64.whl
pip install apex-0.1+f49ddd4.abi0.dtk2304.torch1.10-cp37-cp37m-manylinux2014_x86_64.whl
pip install torchvision-0.10.0+git48e6bbb.abi0.dtk2304.torch1.10-cp37-cp37m-manylinux2014_x86_64.whl
# Install the remaining dependencies
pip install -r requirements.txt -i http://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn
```
...
...
@@ -88,7 +90,7 @@ sh creat-data.sh
-## GPT2 pre-training
+## Training
### GPT2 single-node training
...
...
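@@ -129,7 +131,19 @@ SAVE_INTERVAL checkpoint save frequency
sh mpi-run-16B.sh  (the main parameters are in single-16B.sh and are of the same types as for single-node training; training uses fp32 precision by default; to train in fp16, run sh mpi-16B-fp16.sh instead)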
```
-## GPT2 text generation
+## Inference
### Notes
```
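Note: for inference, the pp (pipeline-parallel) size must be 1, and the tp (tensor-parallel) size must either match the value used in training or be 1. (tp greater than 1 means multi-GPU inference; tp equal to 1 means single-GPU inference.)
tools/convert_checkpoint/deepspeed_to_deepspeed.py   converts the model's tp size
tools/convert_checkpoint/deepspeed_to_megatron.py   converts the model's pp size and produces an inference-ready format (this step is required for inference)
The examples below convert a model trained on multiple nodes with 4 tp / 4 pp to 4 tp / 1 pp for multi-GPU inference, and a model trained on a single node with 4 tp / 1 pp to 1 tp / 1 pp for single-GPU inference.
The 16B model does not fit in the memory of a single GPU for inference, so no example is given for it. If you need to convert a multi-tp, multi-pp model for single-GPU inference, refer to the single-GPU script conver-model-1tp.sh.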
```
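The constraints in the note above can be checked before any conversion script is run. The following is a minimal sketch, not part of the repository: the function `plan_inference_conversion` and its parameter names are hypothetical, and it only reports which of the two converters named above would be needed. It does not invoke them, and the order of the returned list simply follows the order in which the note lists the scripts.

```python
from typing import List

# Script paths as given in the note above.
TP_CONVERTER = "tools/convert_checkpoint/deepspeed_to_deepspeed.py"
PP_CONVERTER = "tools/convert_checkpoint/deepspeed_to_megatron.py"


def plan_inference_conversion(train_tp: int, train_pp: int,
                              infer_tp: int, infer_pp: int = 1) -> List[str]:
    """Hypothetical helper: validate the target layout and list the converters needed."""
    if min(train_tp, train_pp, infer_tp, infer_pp) < 1:
        raise ValueError("tp and pp sizes must be positive integers")
    # Rule 1: inference requires pp == 1.
    if infer_pp != 1:
        raise ValueError("pp must be 1 for inference")
    # Rule 2: inference tp must equal the training tp or be 1.
    if infer_tp not in (train_tp, 1):
        raise ValueError("inference tp must equal the training tp or be 1")

    steps = []
    if infer_tp != train_tp:
        # The tp size changes, so the tp converter is needed.
        steps.append(TP_CONVERTER)
    # The note says this step is always required for inference.
    steps.append(PP_CONVERTER)
    return steps


if __name__ == "__main__":
    # 4 tp / 4 pp training -> 4 tp / 1 pp multi-GPU inference
    print(plan_inference_conversion(train_tp=4, train_pp=4, infer_tp=4))
    # 4 tp / 1 pp training -> 1 tp / 1 pp single-GPU inference
    print(plan_inference_conversion(train_tp=4, train_pp=1, infer_tp=1))
```

A layout such as 4 tp in training and 2 tp in inference is rejected, because the note only allows the training value or 1.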
### Converting for multi-GPU inference
...
...
@@ -180,6 +194,14 @@ mpirun -np 1 run-inf.sh
## result
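An inference example is shown below. (The malformed words in the generated text come from token-merging problems caused by insufficient training.)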

## Accuracy
Training loss of the 16B model:
| GPUs | Configuration | lm loss |
...
...
requirements.txt
@@ -11,3 +11,4 @@ transformers
black==21.4b0
isort>=5.5.4
ninja
+mpi4py
result.jpg (new file, mode 0 → 100644, 25.1 KB)