Unverified commit 61cf1020 authored by Liyuan Liu, committed by GitHub

bug fix for using `return_layernorm_output=True` (#1382)



The current implementation releases the layernorm output even when the caller requested it, leading to an error when `return_layernorm_output=True` is set.
Signed-off-by: Liyuan Liu <llychinalz@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
parent b898cbe1
@@ -373,7 +373,7 @@ class _LayerNormMLP(torch.autograd.Function):
             ub=ub_obj_lnout if ub_overlap_ag else None,
             extra_output_tensor=ln_out if ub_overlap_ag else None,
         )
-        if not is_grad_enabled:
+        if not is_grad_enabled and not return_layernorm_output:
             clear_tensor_data(ln_out_total)
         if bias_gelu_nvfusion:
...
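To illustrate why the extra guard matters, here is a minimal, self-contained sketch of the control flow (not the actual Transformer Engine code): the layernorm output is freed early as a memory optimization, but with the old condition it would also be freed when the caller asked for it back. All names below stand in for the real implementation, and `clear_tensor_data` is a simplified stand-in for the library utility of the same name.

```python
def clear_tensor_data(buf):
    """Stand-in for clear_tensor_data: release the buffer's storage early."""
    buf.clear()

def layernorm_mlp_forward(x, is_grad_enabled=False, return_layernorm_output=False):
    # Hypothetical sketch of _LayerNormMLP.forward; tensors are plain lists here.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    ln_out_total = [(v - mean) / (var + 1e-5) ** 0.5 for v in x]
    mlp_out = [2.0 * v for v in ln_out_total]  # placeholder for the MLP math

    # The fix from this commit: only release the layernorm output when the
    # caller does NOT need it. With the old `if not is_grad_enabled:` guard,
    # ln_out_total would be emptied here and then returned below, broken.
    if not is_grad_enabled and not return_layernorm_output:
        clear_tensor_data(ln_out_total)

    if return_layernorm_output:
        return mlp_out, ln_out_total
    return mlp_out
```

With the guard in place, `layernorm_mlp_forward(x, return_layernorm_output=True)` returns an intact layernorm output, while the default path still frees the buffer eagerly.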