OpenDAS / Megatron-LM

Commit 9044bc42, authored Jun 08, 2022 by Lawrence McAfee

removed count-zeros debuggables.

Parent: 7fccd6a1
Showing 1 changed file with 3 additions and 15 deletions.

megatron/optimizer/clip_grads.py  (+3, -15)
@@ -124,10 +124,7 @@ def count_zeros_fp32(parameters, model_parallel_group):
     # - grad should not be none
     # - parameter should not be shared
     # - should not be a replica due to tensor model parallelism
-    # >>>
-    # total_num_zeros = 0.0
     total_num_zeros = torch.cuda.FloatTensor([0.0])
-    # <<<
     for param in parameters:
         grad_not_none = param.grad is not None
         is_not_shared = param_is_not_shared(param)
...
@@ -138,18 +135,9 @@ def count_zeros_fp32(parameters, model_parallel_group):
             total_num_zeros = num_zeros + total_num_zeros

     # Sum across all model-parallel GPUs.
-    # >>>
-    try:
-        torch.distributed.all_reduce(total_num_zeros,
-                                     op=torch.distributed.ReduceOp.SUM,
-                                     group=model_parallel_group)
-    except:
-        from lutil import pax
-        pax({
-            "total_num_zeros" : total_num_zeros,
-            "parameters" : parameters,
-        })
-    # <<<
+    torch.distributed.all_reduce(total_num_zeros,
+                                 op=torch.distributed.ReduceOp.SUM,
+                                 group=model_parallel_group)
     total_num_zeros = total_num_zeros.item()
...
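
For context, the code that survives this commit follows a common pattern for distributed statistics: accumulate the per-rank count into a one-element tensor, SUM-reduce it across the model-parallel group with torch.distributed.all_reduce, and call .item() only at the end. The sketch below is illustrative rather than Megatron-LM's code: it runs as a single process over the "gloo" backend with a CPU tensor (Megatron-LM uses torch.cuda.FloatTensor([0.0]) and the model-parallel group), and it omits the param_is_not_shared / tensor-parallel-duplicate filters visible in the diff.

# Minimal standalone sketch of the count-zeros-then-all-reduce pattern
# (illustrative; not the Megatron-LM implementation).
import os
import torch
import torch.distributed as dist

def count_zeros(parameters, group=None):
    # Accumulate into a one-element tensor so each step stays a
    # tensor op; no host sync happens until .item() below.
    total_num_zeros = torch.zeros(1, dtype=torch.float32)
    for param in parameters:
        if param.grad is not None:
            grad = param.grad.detach()
            total_num_zeros += grad.numel() - torch.count_nonzero(grad)
    # Sum the per-rank counts across the process group.
    dist.all_reduce(total_num_zeros, op=dist.ReduceOp.SUM, group=group)
    return total_num_zeros.item()

if __name__ == "__main__":
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)
    p = torch.nn.Parameter(torch.ones(4))
    p.grad = torch.tensor([0.0, 1.0, 0.0, 2.0])
    print(count_zeros([p]))  # prints 2.0
    dist.destroy_process_group()

Keeping the accumulator as a tensor until the final .item() is the point of the kept line in the first hunk: in the CUDA version, the per-parameter counting and the all-reduce all stay on-device, and the single device-to-host synchronization happens only after the reduction.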