transformers: commit 4d8c4337

Authored Jun 18, 2019 by thomwolf

test barrier in distrib training

Parent: 33599556
Showing 3 changed files with 23 additions and 17 deletions:
- README.md (+8, -15)
- examples/run_classifier.py (+11, -2)
- pytorch_pretrained_bert/modeling.py (+4, -0)
README.md @ 4d8c4337
@@ -1272,27 +1272,20 @@ python run_classifier.py \

**Distributed training**

Removed:

Here is an example using distributed training on 8 V100 GPUs and Bert Whole Word Masking model to reach a F1 > 93 on SQuAD:

```bash
python -m torch.distributed.launch --nproc_per_node=8 \
  run_classifier.py \
  --bert_model bert-large-cased-whole-word-masking \
  --task_name MRPC \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir $GLUE_DIR/MRPC/ \
  --max_seq_length 128 \
  --train_batch_size 64 \
  --learning_rate 2e-5 \
  --num_train_epochs 3.0 \
  --output_dir /tmp/mrpc_output/
```

Added:

Here is an example using distributed training on 8 V100 GPUs and Bert Whole Word Masking model to reach a F1 > 92 on MRPC:

```bash
python -m torch.distributed.launch --nproc_per_node 8 run_classifier.py --bert_model bert-large-uncased-whole-word-masking --task_name MRPC --do_train --do_eval --do_lower_case --data_dir $GLUE_DIR/MRPC/ --max_seq_length 128 --train_batch_size 8 --learning_rate 2e-5 --num_train_epochs 3.0 --output_dir /tmp/mrpc_output/
```

Training with these hyper-parameters gave us the following results:

Removed:

```bash
{"exact_match": 86.91579943235573, "f1": 93.1532499015869}
```

Added:

```bash
acc = 0.8823529411764706
acc_and_f1 = 0.901702786377709
eval_loss = 0.3418912578906332
f1 = 0.9210526315789473
global_step = 174
loss = 0.07231863956341798
```

#### SQuAD

...
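The added README example uses `torch.distributed.launch` to spawn one worker per GPU (`--nproc_per_node 8`), and each worker receives its own `--local_rank` argument. The sketch below is illustrative only and is not code from this repository: it shows the usual per-process setup a launched script performs, assuming the standard `--local_rank` convention and the NCCL backend.

```python
# Illustrative sketch: per-process setup expected by torch.distributed.launch.
# Not taken from examples/run_classifier.py; argument names and defaults are assumptions.
import argparse

import torch
import torch.distributed as dist

parser = argparse.ArgumentParser()
# torch.distributed.launch passes a distinct --local_rank (0..7 for --nproc_per_node 8)
parser.add_argument("--local_rank", type=int, default=-1)
args = parser.parse_args()

if args.local_rank == -1:
    # Single-process (non-distributed) run
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
else:
    # One process per GPU: pin this process to its device, then join the NCCL group
    torch.cuda.set_device(args.local_rank)
    dist.init_process_group(backend="nccl")
    device = torch.device("cuda", args.local_rank)
```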
examples/run_classifier.py @ 4d8c4337
```diff
@@ -50,6 +50,12 @@ else:
 logger = logging.getLogger(__name__)
 
+
+def barrier():
+    t = torch.randn((), device='cuda')
+    torch.distributed.all_reduce(t)
+    torch.cuda.synchronize()
+
 def main():
     parser = argparse.ArgumentParser()
```

```diff
@@ -201,10 +207,13 @@ def main():
     label_list = processor.get_labels()
     num_labels = len(label_list)
 
+    if args.local_rank not in [-1, 0]:
+        barrier() # Make sure only the first process in distributed training will download model & vocab
+
     tokenizer = BertTokenizer.from_pretrained(args.bert_model, do_lower_case=args.do_lower_case)
 
-    # Prepare model
     model = BertForSequenceClassification.from_pretrained(args.bert_model, num_labels=num_labels)
+    if args.local_rank == 0:
+        barrier()
 
     if args.fp16:
         model.half()
     model.to(device)
```

...
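The two `barrier()` call sites added above form a simple rendezvous: every distributed process except rank 0 blocks before `from_pretrained`, rank 0 downloads and caches the model and vocab, and then rank 0 enters the barrier to release the others, which load from the local cache. The commit's helper implements the barrier with an `all_reduce` on a scalar CUDA tensor followed by `torch.cuda.synchronize()`. Below is a compact sketch of the same pattern; it uses the built-in `torch.distributed.barrier()` instead of that helper, and `load_pretrained` is a placeholder, not a function in this repository.

```python
# Sketch of the "first process downloads, the rest wait" pattern introduced above.
# Assumptions: the default process group is initialized, and load_pretrained is any
# callable that downloads to a shared local cache on first use (e.g. a from_pretrained call).
import torch.distributed as dist


def load_with_single_download(local_rank, load_pretrained):
    if local_rank not in [-1, 0]:
        # Every process except rank 0 (and non-distributed runs) waits here
        dist.barrier()
    obj = load_pretrained()  # rank 0 downloads and fills the cache; others read the cache
    if local_rank == 0:
        # Rank 0 reaches the same collective, releasing the waiting processes
        dist.barrier()
    return obj
```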
pytorch_pretrained_bert/modeling.py @ 4d8c4337
```diff
@@ -44,6 +44,10 @@ PRETRAINED_MODEL_ARCHIVE_MAP = {
     'bert-base-german-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-cased-pytorch_model.bin",
     'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin",
     'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-pytorch_model.bin",
+    'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin",
+    'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin",
+    'bert-base-uncased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-finetuned-mrpc-pytorch_model.bin",
+    'bert-large-uncased-whole-word-masking-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-mrpc-pytorch_model.bin",
 }
 PRETRAINED_CONFIG_ARCHIVE_MAP = {
     'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json",
```

...
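With these entries in `PRETRAINED_MODEL_ARCHIVE_MAP`, a shortcut name resolves to an S3 weights URL that `from_pretrained` downloads and caches on first use. The sketch below is a hedged usage example, not part of the commit; it assumes a matching entry also exists in `PRETRAINED_CONFIG_ARCHIVE_MAP` (only one config entry is visible in the hunk above), and `num_labels=2` is just an example value.

```python
# Illustrative only: load one of the newly mapped checkpoints by its shortcut name.
# Whether a given name resolves depends on the archive maps at this commit.
from pytorch_pretrained_bert import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    'bert-large-uncased-whole-word-masking',  # resolved via PRETRAINED_MODEL_ARCHIVE_MAP
    num_labels=2)
```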