OpenDAS / Megatron-LM

Commit de593298, authored May 24, 2022 by Lawrence McAfee
Parent: cf1c7848

    allreduce layernorm fixes.
Showing 2 changed files with 2 additions and 3 deletions:

    megatron/optimizer/optimizer.py   (+2, -2)
    megatron/training.py              (+0, -1)
megatron/optimizer/optimizer.py

@@ -21,6 +21,7 @@ from apex.multi_tensor_apply import multi_tensor_applier
 import amp_C
 import torch
 from torch.nn.parallel.distributed import DistributedDataParallel as torchDDP
+from torch._utils import _flatten_dense_tensors, _unflatten_dense_tensors
 from megatron import get_timers
 from megatron import mpu
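
The new import pairs _flatten_dense_tensors with _unflatten_dense_tensors: the first packs a list of same-dtype tensors into one contiguous 1-D buffer, the second splits such a buffer back into chunks shaped like the originals. Coalescing this way lets many small layernorm gradients share a single collective call instead of one call per tensor. A minimal round-trip sketch (tensor sizes are illustrative, and the all-reduce placement is an assumption about how the buffer is used here):

    import torch
    from torch._utils import _flatten_dense_tensors, _unflatten_dense_tensors

    # Two small gradient-like tensors, e.g. a layernorm weight and bias grad.
    grads = [torch.randn(4096), torch.randn(4096)]

    # Pack into one contiguous 1-D buffer.
    coalesced = _flatten_dense_tensors(grads)

    # (In the optimizer, this buffer would be all-reduced here, e.g.
    #  torch.distributed.all_reduce(coalesced, group=...).)

    # Split the buffer back and copy the results into the original tensors.
    for buf, synced in zip(grads, _unflatten_dense_tensors(coalesced, grads)):
        buf.copy_(synced)
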
@@ -273,9 +274,8 @@ class MegatronOptimizer(ABC):
         # when sequence parallelism is used
         if mpu.get_tensor_model_parallel_world_size() > 1 and \
                 args.sequence_parallel:
-            raise Exception("hi.")
             grads = []
-            for model_module in model:
+            for model_module in self.models:
                 unwrapped_model = unwrap_model(
                     model_module, (torchDDP, LocalDDP, Float16Module))
                 for param in unwrapped_model.parameters():
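
The hunk cuts off at the parameter loop, but together with the new import above the intended pattern is visible: collect the gradients of layernorm parameters (which sequence parallelism replicates on every tensor-parallel rank rather than sharding), flatten them into one buffer, all-reduce that buffer across the tensor-model-parallel group, and copy the results back. A hedged sketch of that pattern, not the verbatim routine from this file; the sequence_parallel parameter attribute and the tp_group argument are assumptions:

    import torch
    from torch._utils import _flatten_dense_tensors, _unflatten_dense_tensors

    def allreduce_layernorm_grads(models, tp_group):
        # Sketch: sum replicated layernorm grads across tensor-parallel ranks.
        grads = []
        for model_module in models:
            for param in model_module.parameters():
                # Attribute name assumed; Megatron tags replicated params.
                if getattr(param, 'sequence_parallel', False) \
                        and param.grad is not None:
                    grads.append(param.grad.data)
        if not grads:
            return
        # One buffer, one collective, instead of one all_reduce per tensor.
        coalesced = _flatten_dense_tensors(grads)
        torch.distributed.all_reduce(coalesced, group=tp_group)
        for buf, synced in zip(grads,
                               _unflatten_dense_tensors(coalesced, grads)):
            buf.copy_(synced)
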
megatron/training.py

@@ -23,7 +23,6 @@ import time
 _TRAIN_START_TIME = time.time()
 import torch
 from torch.nn.parallel.distributed import DistributedDataParallel as torchDDP
-from torch._utils import _flatten_dense_tensors, _unflatten_dense_tensors
 from megatron import get_args
 from megatron import get_signal_handler