chenpangpang / transformers

Commit d55c3ae8 authored Nov 04, 2018 by VictorSanh
Small logger bug (multi-gpu, distribution) in training
parent 3d291dea
Showing 2 changed files with 2 additions and 2 deletions:

    run_classifier.py    +1 -1
    run_squad.py         +1 -1
run_classifier.py
@@ -420,7 +420,7 @@ def main():
         n_gpu = 1
         # Initializes the distributed backend which will take care of sychronizing nodes/GPUs
         torch.distributed.init_process_group(backend='nccl')
-    logger.info("device ", device, "n_gpu", n_gpu, " distributed training ", bool(args.local_rank != -1))
+    logger.info("device %s n_gpu %d distributed training %r", device, n_gpu, bool(args.local_rank != -1))
     if args.accumulate_gradients < 1:
         raise ValueError("Invalid accumulate_gradients parameter: {}, should be >= 1".format(
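The bug being fixed: Python's logging module does not join positional arguments the way print() does. logger.info() treats its first argument as a %-style format template and the remaining arguments as values for that template, so the old print-style call ends in a logging error instead of the intended message. A minimal, self-contained sketch of the two calling conventions (the variable values below are placeholders, not taken from the scripts):

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Placeholder values purely for illustration.
device, n_gpu, distributed = "cuda", 4, True

# Before the fix (print-style): logging tries "device " % (device, ...) when the
# record is emitted, which fails with "not all arguments converted during string
# formatting" and is reported as a logging error rather than the intended line.
logger.info("device ", device, "n_gpu", n_gpu, " distributed training ", distributed)

# After the fix (%-style, as in this commit): one template, formatted lazily
# only if the record is actually emitted.
logger.info("device %s n_gpu %d distributed training %r", device, n_gpu, distributed)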
run_squad.py
@@ -750,7 +750,7 @@ def main():
         n_gpu = 1
         # Initializes the distributed backend which will take care of sychronizing nodes/GPUs
         torch.distributed.init_process_group(backend='nccl')
-    logger.info("device ", device, "n_gpu", n_gpu, " distributed training ", bool(args.local_rank != -1))
+    logger.info("device %s n_gpu %d distributed training %r", device, n_gpu, bool(args.local_rank != -1))
     if args.accumulate_gradients < 1:
         raise ValueError("Invalid accumulate_gradients parameter: {}, should be >= 1".format(
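For context, a minimal sketch of the setup code around the changed line in both scripts, reconstructed from the hunk's context lines plus the usual PyTorch distributed idiom rather than copied verbatim from the repository; the --local_rank / --no_cuda flags and the single-process branch are assumptions, and the NCCL process group init expects the rendezvous environment variables that torch.distributed.launch provides:

import argparse
import logging

import torch

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=-1)  # set by torch.distributed.launch
parser.add_argument("--no_cuda", action="store_true")      # assumed flag for CPU-only runs
args = parser.parse_args()

if args.local_rank == -1 or args.no_cuda:
    # Single-process run: use every visible GPU, or fall back to CPU.
    device = torch.device("cuda" if torch.cuda.is_available() and not args.no_cuda else "cpu")
    n_gpu = torch.cuda.device_count()
else:
    # One process per GPU under torch.distributed.launch.
    device = torch.device("cuda", args.local_rank)
    n_gpu = 1
    # Initializes the distributed backend which will take care of synchronizing nodes/GPUs.
    torch.distributed.init_process_group(backend='nccl')

# The corrected logging call from this commit.
logger.info("device %s n_gpu %d distributed training %r", device, n_gpu, bool(args.local_rank != -1))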