Commit c20f4d48 authored by Mohammad Shoeybi's avatar Mohammad Shoeybi
Browse files

Merge branch 'checkpointing-fix' into 'master'

small fix for CheckpointFunction's backward() method when some args may be NoneType

See merge request ADLR/megatron-lm!92
parents 46a536cc 4ee0537a
...@@ -234,7 +234,9 @@ class CheckpointFunction(torch.autograd.Function): ...@@ -234,7 +234,9 @@ class CheckpointFunction(torch.autograd.Function):
if isinstance(outputs, torch.Tensor): if isinstance(outputs, torch.Tensor):
outputs = (outputs,) outputs = (outputs,)
torch.autograd.backward(outputs, args) torch.autograd.backward(outputs, args)
return (None,) + tuple(inp.grad for inp in detached_inputs) grads = tuple(inp.grad if isinstance(inp, torch.Tensor) else inp
for inp in detached_inputs)
return (None,) + grads
def checkpoint(function, *args): def checkpoint(function, *args):
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or sign in to comment