OpenDAS / Megatron-LM

Commit 24369dd6
Authored Jan 25, 2022 by Lawrence McAfee

limit 'make_viewless_tensor()' to case of micro_batch_size == 1; added comment

Parent: 0c8e8cce

Showing 1 changed file with 12 additions and 5 deletions

megatron/model/transformer.py (+12, -5)

@@ -557,6 +557,7 @@ class ParallelTransformer(MegatronModule):
         self.pre_process = pre_process
         self.post_process = post_process
         self.input_tensor = None
+        self.micro_batch_size = args.micro_batch_size
 
         # Store activation checkpoiting flag.
         self.activations_checkpoint_method = args.activations_checkpoint_method
...
@@ -696,11 +697,17 @@ class ParallelTransformer(MegatronModule):
             hidden_states = self.input_tensor
 
         # Viewless tensor
-        hidden_states = mpu.make_viewless_tensor(
-            hidden_states,
-            requires_grad = True,
-            keep_graph = True,
-        )
+        # We only need to create a viewless tensor in the case of micro batch
+        # size (mbs) == 1, since in this case, 'hidden_states.transpose()'
+        # above creates a view tensor, and '.contiguous()' is a pass-through.
+        # For mbs >= 2, '.contiguous()' creates a new tensor, eliminating
+        # the need to make it viewless.
+        if self.micro_batch_size == 1:
+            hidden_states = mpu.make_viewless_tensor(
+                hidden_states,
+                requires_grad = True,
+                keep_graph = True,
+            )
 
         if encoder_output is not None:
             encoder_output = encoder_output.transpose(0, 1).contiguous()
...
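
The reasoning in the added comment can be illustrated with a small stride model. This is a sketch in plain Python, not Megatron-LM or PyTorch code: the helper names (`contiguous_strides`, `transpose`, `is_contiguous`, `contiguous_makes_copy`) are hypothetical, and the [batch, sequence, hidden] input layout is an assumption made for illustration. It models a row-major contiguity check that skips size-1 dimensions, which is the behavior the comment relies on: with a batch dimension of 1, the transposed tensor still counts as contiguous, so '.contiguous()' would have nothing to copy and the result would remain a view.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) of a freshly allocated tensor."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def transpose(shape, strides, dim0, dim1):
    """Swap two dimensions by swapping sizes and strides; no data is copied."""
    shape, strides = list(shape), list(strides)
    shape[dim0], shape[dim1] = shape[dim1], shape[dim0]
    strides[dim0], strides[dim1] = strides[dim1], strides[dim0]
    return tuple(shape), tuple(strides)


def is_contiguous(shape, strides):
    """Row-major contiguity check that ignores size-1 dimensions."""
    expected = 1
    for size, stride in zip(reversed(shape), reversed(strides)):
        if size == 1:
            continue  # the stride of a size-1 dimension never affects layout
        if stride != expected:
            return False
        expected *= size
    return True


def contiguous_makes_copy(micro_batch_size, seq_len=4, hidden=8):
    """Would '.contiguous()' need a copy after transposing [b, s, h] to [s, b, h]?"""
    shape = (micro_batch_size, seq_len, hidden)  # assumed input layout
    t_shape, t_strides = transpose(shape, contiguous_strides(shape), 0, 1)
    return not is_contiguous(t_shape, t_strides)


for mbs in (1, 2):
    print(f"mbs={mbs}: copy needed = {contiguous_makes_copy(mbs)}")
```

Under this model, only the mbs == 1 case leaves a view behind after the transpose, which matches the condition the commit places around 'make_viewless_tensor()'.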