Jun Ru Anderson authored
Implement scaling of optimizer state when using pure-fp16 training to avoid underflow. Update benchmark to use pure-fp16. Modify state_dict methods to store and load the optimizer state scale.

Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
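
To illustrate why the optimizer state is scaled, here is a minimal sketch, not the code from this commit. It assumes an Adam-style second-moment update and emulates fp16 storage with the standard library's half-precision `struct` format. The names `to_fp16`, `second_moment_step`, `state_dict`, `load_state_dict`, and the `optim_scale` key are hypothetical, as is the scale factor of 2^16. The arithmetic is done in double precision and only the stored result is rounded to fp16, which simplifies what real fp16 kernels do.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def second_moment_step(stored: float, grad: float,
                       beta2: float = 0.999, scale: float = 1.0) -> float:
    """One Adam-style second-moment update, stored in fp16.

    `stored` holds scale * v, so the new gradient term is multiplied
    by `scale` before it is added (hypothetical scheme for illustration).
    """
    return to_fp16(beta2 * stored + (1.0 - beta2) * scale * grad * grad)


def state_dict(stored: float, scale: float) -> dict:
    """Save the scale next to the state so it can be unscaled after loading."""
    return {"exp_avg_sq": stored, "optim_scale": scale}


def load_state_dict(saved: dict) -> tuple:
    """Return the stored (scaled) state together with its scale."""
    return saved["exp_avg_sq"], saved["optim_scale"]


grad = 1e-4  # a small gradient; its square is far below the fp16 range

# Without scaling, the update rounds to zero in fp16.
unscaled = second_moment_step(0.0, grad)

# With scaling, the stored value stays nonzero.
scaled = second_moment_step(0.0, grad, scale=2.0 ** 16)

# Round-trip through the checkpoint, then divide by the scale to recover v.
stored, scale = load_state_dict(state_dict(scaled, 2.0 ** 16))
recovered = stored / scale

print(unscaled, scaled, recovered)
```

Storing the scale in the checkpoint matters because the saved values are only meaningful relative to it: loading the state with a different or missing scale would misinterpret the second moment by that factor.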
5251a69a