"projects/nerf/train_nerf.py" did not exist on "5b7491188156955a7a6fb9fe333350d43405f6cb"
Add features to distributed Adam for Megatron support (#1414)
* Add features to distributed Adam for Megatron support

  Support gradient clipping, gradient scaling, FP32 grad accumulation, and multiple dtypes and devices.

* Restore closure arg to distributed Adam

  Review suggestion from @crcrpar
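A minimal sketch of how the features named above might fit into a training loop, assuming `DistributedFusedAdam` follows the standard `torch.optim.Optimizer` interface and that `torch.cuda.amp.GradScaler` can unscale its gradients (that interoperation is an assumption, not something stated in this commit). The model, data, and hyperparameters are placeholders; nothing here is taken from the PR's own code.

```python
# Sketch only, not code from this PR. Assumes DistributedFusedAdam exposes
# the standard torch.optim.Optimizer interface and that GradScaler can
# unscale its gradients in place.
import torch
from apex.contrib.optimizers.distributed_fused_adam import DistributedFusedAdam

model = torch.nn.Linear(1024, 1024).cuda()
# Per the commit message, FP32 grad accumulation and multi-dtype/device
# support are handled inside the optimizer; no extra setup is shown here.
optimizer = DistributedFusedAdam(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()  # gradient scaling for mixed precision

# Placeholder data so the loop below is runnable on a CUDA device.
loader = [(torch.randn(8, 1024, device="cuda"),
           torch.randn(8, 1024, device="cuda"))]

for inputs, targets in loader:
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
    scaler.scale(loss).backward()        # backward pass with scaled grads
    scaler.unscale_(optimizer)           # unscale before clipping (assumption above)
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # gradient clipping
    scaler.step(optimizer)               # skips the step if grads overflowed
    scaler.update()
```

The restored closure argument would follow the usual `Optimizer` contract: `loss = optimizer.step(closure)`, where `closure` re-evaluates the model and returns the loss.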