Fix unbalanced gradients bug in ZeRO-2 gradient accumulation (#545)
* Use zero-tensors for missing gradients to avoid size mismatch
* Unit test for unbalanced gradients in ZeRO
* Formatting fixes
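
The diff itself is not reproduced here, so the following is only a minimal sketch of the idea in the first bullet, not the code from this commit. It assumes gradients are flattened into one buffer per partition, and it uses plain Python lists in place of tensors so it runs without dependencies. The names `flatten_grads`, `param_sizes`, `rank0`, and `rank1` are hypothetical.

```python
from typing import List, Optional, Sequence


def flatten_grads(
    param_sizes: Sequence[int],
    grads: Sequence[Optional[List[float]]],
) -> List[float]:
    """Flatten per-parameter gradients into a single buffer.

    A parameter that received no gradient (None) contributes zeros of
    its own size, so the buffer length always equals sum(param_sizes).
    Skipping it instead would shorten the buffer and cause a size
    mismatch when buffers are combined.
    """
    flat: List[float] = []
    for size, grad in zip(param_sizes, grads):
        if grad is None:
            # Hypothetical stand-in for a zero tensor shaped like the parameter.
            flat.extend([0.0] * size)
        else:
            if len(grad) != size:
                raise ValueError("gradient size does not match parameter size")
            flat.extend(grad)
    return flat


# Three parameters with 2, 3, and 1 elements.
param_sizes = [2, 3, 1]

# "Unbalanced" case: the second parameter has no gradient on rank 0.
rank0 = flatten_grads(param_sizes, [[1.0, 2.0], None, [3.0]])
rank1 = flatten_grads(param_sizes, [[0.5, 0.5], [1.0, 1.0, 1.0], [1.0]])

# Both buffers have the same length, so an element-wise reduction is well defined.
reduced = [a + b for a, b in zip(rank0, rank1)]
```

With zero-filling, every buffer has length `sum(param_sizes)` regardless of which parameters received gradients, which is presumably what keeps the accumulation step consistent.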
tests/unit/test_zero.py
0 → 100755