LARC clipping+documentation (#6)
* Proper implementation of LARC clipping
* Documentation of LARC class
* Modification of FP16_Optimizer to absorb optimizer instance that's being wrapped instead of creating new optimizer instance of same class.
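
For reference, a minimal sketch of what LARC clipping usually means, assuming the common formulation of layer-wise adaptive rate control rather than the exact code in this commit. The function name `larc_gradient_scale`, its parameters, and the default values are hypothetical and chosen for illustration only. The sketch computes a per-layer factor to multiply into the gradient before the wrapped optimizer applies its own learning rate; in clipping mode the effective rate becomes min(local_lr, lr).

```python
import math


def larc_gradient_scale(param, grad, lr, trust_coefficient=0.02,
                        weight_decay=0.0, clip=True, eps=1e-8):
    """Return the factor to multiply one layer's gradient by.

    Hypothetical helper: param and grad are flat lists of floats
    standing in for a single layer's weights and gradients.
    """
    param_norm = math.sqrt(sum(p * p for p in param))
    grad_norm = math.sqrt(sum(g * g for g in grad))

    # Leave the gradient untouched when either norm is zero.
    if param_norm == 0.0 or grad_norm == 0.0:
        return 1.0

    # Layer-wise (local) learning rate from the trust ratio.
    local_lr = trust_coefficient * param_norm / (
        grad_norm + weight_decay * param_norm + eps
    )

    if clip:
        # Clipping mode: the optimizer still multiplies by lr afterwards,
        # so scaling by min(local_lr / lr, 1) gives an effective rate of
        # min(local_lr, lr).
        return min(local_lr / lr, 1.0)

    # Scaling mode: the effective rate becomes local_lr * lr.
    return local_lr
```

With a weight norm of 5 and a gradient norm of 1, the local rate is about 0.1, so a global `lr` of 0.01 is left unchanged (factor 1.0) while a global `lr` of 1.0 is scaled down by a factor of about 0.1.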