ModelZoo / GPT2_pytorch
Commit d0d55509, authored Sep 28, 2023 by hepj987
Fix tool
Parent: 6bd15ea7
Pipeline #579 failed
Showing 7 changed files with 19 additions and 17 deletions (+19 -17)
Changed files:
README.md (+10 -10)
model.properties (+1 -1)
mpi-run-16B-fp16.sh (+1 -1)
mpi-run-16B.sh (+1 -1)
single-16B-fp16.sh (+1 -1)
single-16B.sh (+1 -1)
tools/convert_checkpoint/deepspeed_to_megatron.py (+4 -2)
README.md

    @@ -178,16 +178,6 @@ mpirun -np 1 run-inf.sh
         --num-samples number of samples to generate
         ```
        -## Application scenarios
        -### Algorithm category
        -`Text generation`
        -### Key application industries
        -`Internet`
         ## result
         16B model training loss:
    @@ -208,6 +198,16 @@ mpirun -np 1 run-inf.sh
         
        +## Application scenarios
        +### Algorithm category
        +`Text generation`
        +### Key application industries
        +`Internet`
         ## Source repository and issue feedback
         https://developer.hpccube.com/codes/modelzoo/gpt2-pytorch/
model.properties

    @@ -5,6 +5,6 @@ modelName=gpt2_pytorch
         # Model description
         modelDescription=GPT-2 model based on the PyTorch training framework
         # Application scenarios
        -appScenario=training,inference,train,inference,nlp,intelligent chat assistant
        +appScenario=training,inference,text generation,Internet
         # Framework type
         frameType=Pytorch,Deepspeed
mpi-run-16B-fp16.sh

    @@ -7,6 +7,6 @@ np=$(($np*8))
         nodename=$(cat $hostfile |sed -n "1p")
         dist_url=`echo $nodename | awk '{print $1}'`
         which mpirun
        -mpirun -np $np --allow-run-as-root --hostfile hostfile --bind-to none --mca btl_tcp_if_include $dist_url single-16B-fp16.sh
        +mpirun -np $np --allow-run-as-root --hostfile $hostfile --bind-to none --mca btl_tcp_if_include $dist_url single-16B-fp16.sh $dist_url
         echo "END TIME: $(date)"
mpi-run-16B.sh

    @@ -7,6 +7,6 @@ np=$(($np*8))
         nodename=$(cat $hostfile |sed -n "1p")
         dist_url=`echo $nodename | awk '{print $1}'`
         which mpirun
        -mpirun -np $np --allow-run-as-root --hostfile hostfile --bind-to none --mca btl_tcp_if_include $dist_url single-16B.sh
        +mpirun -np $np --allow-run-as-root --hostfile $hostfile --bind-to none --mca btl_tcp_if_include $dist_url single-16B.sh $dist_url
         echo "END TIME: $(date)"
single-16B-fp16.sh

    @@ -53,7 +53,7 @@ GPT_ARGS=" \
             --max-position-embeddings $SEQ_LEN \
             --micro-batch-size $MICRO_BATCH_SIZE \
             --global-batch-size $GLOBAL_BATCH_SIZE \
        -    --train_iters 7000 \
        +    --train-iters 7000 \
             --loss-scale 12 \
             --vocab-file gpt2-vocab.json \
             --merge-file gpt2-merges.txt \
single-16B.sh

    @@ -53,7 +53,7 @@ GPT_ARGS=" \
             --max-position-embeddings $SEQ_LEN \
             --micro-batch-size $MICRO_BATCH_SIZE \
             --global-batch-size $GLOBAL_BATCH_SIZE \
        -    --train_iters 7000 \
        +    --train-iters 7000 \
             --loss-scale 12 \
             --vocab-file gpt2-vocab.json \
             --merge-file gpt2-merges.txt \
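Note on the `--train_iters` to `--train-iters` rename in the single-node scripts: the diff does not show how the training script declares this option. Assuming it is declared with `argparse` under the hyphenated name, like the neighbouring flags, the underscore spelling would not match it. The sketch below uses a hypothetical stand-in parser (not the repository's real one) to show that behaviour:

```python
import argparse

# Hypothetical stand-in parser; NOT the training script's real argument parser.
parser = argparse.ArgumentParser()
parser.add_argument("--train-iters", type=int, default=None)

# The hyphenated spelling matches the declared option.
ok, unknown_ok = parser.parse_known_args(["--train-iters", "7000"])

# The underscore spelling is not recognized and is returned as leftover input.
bad, unknown_bad = parser.parse_known_args(["--train_iters", "7000"])

print(ok.train_iters, unknown_ok)
print(bad.train_iters, unknown_bad)
```

With `parse_args` instead of `parse_known_args`, the underscore spelling would make `argparse` exit with an "unrecognized arguments" error rather than silently leave the value unset.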
tools/convert_checkpoint/deepspeed_to_megatron.py

    @@ -4,8 +4,10 @@ import argparse
         import os
         import torch
         from collections import OrderedDict
        -from .deepspeed_checkpoint import ARGS_KEY, DeepSpeedCheckpoint
        +from deepspeed.checkpoint.deepspeed_checkpoint import (
        +    ARGS_KEY,
        +    DeepSpeedCheckpoint,
        +)
         MODEL_KEY = 'model'
         ARGS_KEY = 'args'
         LANGUGAGE_MODEL_KEY = 'language_model'
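Note on the import change: the commit message does not say why the relative import was replaced. One effect, independent of the author's intent, is that a file containing `from .module import ...` cannot be run directly as a script, because Python has no parent package against which to resolve the leading dot. The sketch below reproduces that failure with throwaway files; `helper.py` and `tool.py` are hypothetical names, not files from this repository.

```python
import pathlib
import subprocess
import sys
import tempfile


def run_script_with_relative_import() -> str:
    """Run a throwaway script that uses a relative import; return its stderr."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = pathlib.Path(tmp)
        # Hypothetical sibling module and script, standing in for real files.
        (tmp_path / "helper.py").write_text("VALUE = 1\n")
        (tmp_path / "tool.py").write_text("from .helper import VALUE\n")
        result = subprocess.run(
            [sys.executable, str(tmp_path / "tool.py")],
            capture_output=True,
            text=True,
        )
        return result.stderr


stderr = run_script_with_relative_import()
# The last line of the traceback names the ImportError.
print(stderr.strip().splitlines()[-1])
```

The absolute form in the new code avoids this, provided the installed `deepspeed` package exposes `deepspeed.checkpoint.deepspeed_checkpoint`.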