chenpangpang / transformers

Commit d82e5dee, authored Jun 18, 2019 by thomwolf (parent a59abedf)

set find_unused_parameters=True in DDP
Showing 2 changed files with 8 additions and 4 deletions:

README.md (+4 −3)
examples/run_squad.py (+4 −1)
README.md
````diff
@@ -1468,12 +1468,13 @@ python -m torch.distributed.launch --nproc_per_node=8 \
   --do_lower_case \
   --train_file $SQUAD_DIR/train-v1.1.json \
   --predict_file $SQUAD_DIR/dev-v1.1.json \
-  --train_batch_size 12 \
   --learning_rate 3e-5 \
-  --num_train_epochs 2.0 \
+  --num_train_epochs 2 \
   --max_seq_length 384 \
   --doc_stride 128 \
-  --output_dir /tmp/debug_squad/
+  --output_dir /tmp/debug_squad/ \
+  --train_batch_size 24 \
+  --gradient_accumulation_steps 2
 ```

 ## Notebooks
````
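For context on the README change above: run_squad.py divides --train_batch_size by --gradient_accumulation_steps before building batches, so the new settings (24 with 2 accumulation steps) keep the same per-forward micro-batch of 12 as before, while each optimizer step still uses gradients from 24 examples. A minimal runnable sketch of that accumulation pattern, with a hypothetical toy model, optimizer, and dataloader standing in for the script's own:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for run_squad.py's model/optimizer/dataloader:
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=3e-5)
train_dataloader = [(torch.randn(12, 10), torch.randn(12, 1)) for _ in range(4)]
gradient_accumulation_steps = 2  # mirrors --gradient_accumulation_steps 2

optimizer.zero_grad()
for step, (inputs, targets) in enumerate(train_dataloader):
    loss = nn.functional.mse_loss(model(inputs), targets)
    # Scale the loss so the accumulated gradient is an average over the
    # effective batch (2 micro-batches of 12 = 24 examples):
    (loss / gradient_accumulation_steps).backward()
    if (step + 1) % gradient_accumulation_steps == 0:
        optimizer.step()       # one parameter update per 24 examples
        optimizer.zero_grad()
```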
examples/run_squad.py
```diff
@@ -907,7 +907,10 @@ def main():
         # except ImportError:
         #     raise ImportError("Please install apex from https://www.github.com/nvidia/apex to use distributed and fp16 training.")
-        model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.local_rank], output_device=args.local_rank)
+        model = torch.nn.parallel.DistributedDataParallel(model,
+                                                          device_ids=[args.local_rank],
+                                                          output_device=args.local_rank,
+                                                          find_unused_parameters=True)
     elif n_gpu > 1:
         model = torch.nn.DataParallel(model)
```
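Background on the flag itself (explanatory note, not part of the commit): DDP's reducer expects every parameter that requires gradients to receive one in each backward pass, so a forward pass that leaves some parameters unused, such as a head the task never exercises, can make the gradient all-reduce error out or stall. With find_unused_parameters=True, DDP walks the autograd graph after each forward and marks untouched parameters as ready. A self-contained single-process sketch (toy TwoHeads module of my own, gloo backend on CPU):

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn

class TwoHeads(nn.Module):
    """Toy module where one head never participates in the forward pass."""
    def __init__(self):
        super().__init__()
        self.trunk = nn.Linear(8, 8)
        self.head_a = nn.Linear(8, 2)
        self.head_b = nn.Linear(8, 2)  # never called below -> gets no gradient

    def forward(self, x):
        return self.head_a(self.trunk(x))

if __name__ == "__main__":
    # Single-process process group, just to make the example self-contained.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)

    # Without find_unused_parameters=True, DDP would complain that head_b
    # never produced a gradient; with it, the unused head is detected and
    # marked ready after each forward.
    model = nn.parallel.DistributedDataParallel(
        TwoHeads(), find_unused_parameters=True)  # CPU: no device_ids needed

    loss = model(torch.randn(4, 8)).sum()
    loss.backward()
    dist.destroy_process_group()
```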