OpenDAS / Megatron-LM

Commit ade99d61
Authored Dec 30, 2021 by Vijay Korthikanti

avoid allocation of word embedding for head in T5 pp=2 case

Parent: 26ea8314

Changes: 1 changed file with 3 additions and 1 deletion
megatron/model/module.py  (+3, -1)

@@ -85,7 +85,9 @@ class MegatronModule(torch.nn.Module):
         # 3. In the training loop, before an all-reduce between the grads of
         #    the two word_embeddings layers to ensure that every applied weight
         #    update is the same on both stages.
-        if mpu.is_pipeline_last_stage():
+        if mpu.is_pipeline_last_stage() and \
+                (not hasattr(self.language_model, 'embedding') or
+                 self.language_model.embedding is None):
             assert not mpu.is_pipeline_first_stage()
             self._word_embeddings_for_head_key = 'word_embeddings_for_head'
             # set word_embeddings weights to 0 here, then copy first
...
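For context, here is a minimal, standalone sketch (not Megatron-LM code) of the guard this commit introduces: the last pipeline stage only allocates the extra word_embeddings_for_head copy when its language model does not already hold an embedding; in the T5 pipeline-parallel-size-2 case the last stage already has one, so the redundant allocation is skipped. `FakeMPU`, `DummyLanguageModel`, and `allocates_word_embeddings_for_head` are hypothetical stand-ins for `megatron.mpu` and the module's `language_model`, used only to illustrate the condition.

```python
# Minimal sketch, not Megatron-LM code: FakeMPU and DummyLanguageModel are
# hypothetical stand-ins for megatron.mpu and a model's language_model.
class FakeMPU:
    """Answers the two pipeline-stage queries used by the guard."""
    def __init__(self, first_stage, last_stage):
        self._first, self._last = first_stage, last_stage

    def is_pipeline_first_stage(self):
        return self._first

    def is_pipeline_last_stage(self):
        return self._last


class DummyLanguageModel:
    """Optionally carries an `embedding` attribute, like a T5 pp=2 last stage."""
    def __init__(self, embedding=None):
        if embedding is not None:
            self.embedding = embedding


def allocates_word_embeddings_for_head(mpu, language_model):
    # Mirrors the condition added by this commit: only the last pipeline
    # stage needs the extra embedding copy, and only when it does not
    # already own the shared embedding itself.
    return mpu.is_pipeline_last_stage() and \
        (not hasattr(language_model, 'embedding') or
         language_model.embedding is None)


# Last stage with no local embedding: the head copy is still allocated.
assert allocates_word_embeddings_for_head(
    FakeMPU(first_stage=False, last_stage=True), DummyLanguageModel())

# Last stage that already holds an embedding (the T5 pp=2 case in the
# commit message): the redundant allocation is skipped.
assert not allocates_word_embeddings_for_head(
    FakeMPU(first_stage=False, last_stage=True),
    DummyLanguageModel(embedding=object()))
```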