- 28 Oct, 2020 1 commit
  msbaines authored
- 03 Sep, 2020 1 commit
  Jun Ru Anderson authored
  Add GradScaler to Fairscale, subclassing PyTorch's GradScaler. Use GradScaler in the pipe benchmark; though it is not needed in this case, it is a good example of how to use gradient scaling for larger models that do require it in order to converge. Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
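The dynamic loss-scaling loop that a GradScaler runs can be sketched in plain Python. This is an illustrative sketch only, not the Fairscale or PyTorch implementation; the class name, method names, and default constants below are assumptions chosen for the example:

```python
import math

class SimpleGradScaler:
    """Minimal sketch of dynamic loss scaling. Illustrative names only;
    this is not the Fairscale/PyTorch GradScaler API."""

    def __init__(self, init_scale=2.0 ** 16, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._good_steps = 0

    def scale_loss(self, loss):
        # Multiply the loss so that small fp16 gradients produced by
        # backprop do not underflow to zero.
        return loss * self.scale

    def step(self, grads):
        # If any gradient overflowed, skip this step and back off the scale.
        if any(math.isinf(g) or math.isnan(g) for g in grads):
            self.scale *= self.backoff_factor
            self._good_steps = 0
            return None  # caller skips optimizer.step()
        # After enough consecutive good steps, grow the scale again.
        self._good_steps += 1
        if self._good_steps % self.growth_interval == 0:
            self.scale *= self.growth_factor
        # Unscale gradients before the optimizer consumes them.
        return [g / self.scale for g in grads]
```

The key property is that an overflowed step is dropped rather than applied, so occasional inf/nan gradients cost one iteration instead of corrupting the weights.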
- 22 Aug, 2020 1 commit
  Jun Ru Anderson authored
  Implement scaling of the optimizer state when using pure-fp16 training, to avoid underflow. Update the benchmark to use pure fp16. Modify the state_dict methods to store and load the optimizer state scale. Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
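The underflow problem this commit addresses can be shown with a toy example. The `to_fp16` helper below is a crude stand-in for fp16 storage (it only flushes values below the smallest fp16 subnormal to zero, without modeling rounding), and the variable names are hypothetical, not Fairscale's:

```python
SMALLEST_FP16 = 2.0 ** -24   # smallest positive fp16 subnormal
STATE_SCALE = 2.0 ** 16      # illustrative optimizer-state scale

def to_fp16(x):
    # Crude fp16 stand-in: magnitudes below the smallest subnormal
    # flush to zero, as they would when stored in half precision.
    return 0.0 if 0.0 < abs(x) < SMALLEST_FP16 else x

grad = 1e-9  # a tiny gradient contribution to a running average

# Stored directly in fp16, the state update underflows and is lost:
exp_avg_plain = to_fp16(0.1 * grad)

# Stored pre-multiplied by a scale, it stays representable:
exp_avg_scaled = to_fp16(0.1 * grad * STATE_SCALE)

# The scale is divided out whenever the state is actually used:
recovered = exp_avg_scaled / STATE_SCALE
```

Because the scale must survive checkpointing, it makes sense that the commit also stores it in `state_dict`.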
- 21 Aug, 2020 1 commit
  Jun Ru Anderson authored
  Set the torch seed for tests. Mark the mixed-precision and memory-efficient mixed-precision state_dict tests as xfail, since their states are cast to FP16 and back to FP32 during load_state_dict. Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
- 19 Aug, 2020 1 commit
  Jun Ru Anderson authored
  Refactor the tests to remove duplicated code. Fix the state_dict test to instantiate the second optimizer with the correct precision. Fix Adam.load_state_dict to cast the loaded optimizer state to the right type. Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
- 18 Aug, 2020 1 commit
  Jun Ru Anderson authored
  Allow training with the optimizer state in fp16. Use an enum to select among full precision, mixed precision, memory-efficient mixed precision, and pure fp16. Improve the clarity of the testing code. Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
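The four modes named in this commit differ in which of the three tensors (params, grads, optimizer state) are kept in half precision. A sketch of such an enum, with member names and dtype assignments that are assumptions for illustration rather than Fairscale's actual API:

```python
from enum import Enum

class Precision(Enum):
    # Illustrative members mirroring the four modes named in the commit;
    # the names and values here are assumptions, not Fairscale's API.
    FULL = "full"
    MIXED = "mixed"
    MEMORY_EFFICIENT_MIXED = "memory_efficient_mixed"
    PURE_FP16 = "pure_fp16"

def dtypes_for(mode):
    """Return (param_dtype, grad_dtype, optimizer_state_dtype)."""
    table = {
        Precision.FULL: ("fp32", "fp32", "fp32"),
        Precision.MIXED: ("fp16", "fp32", "fp32"),
        Precision.MEMORY_EFFICIENT_MIXED: ("fp16", "fp16", "fp32"),
        Precision.PURE_FP16: ("fp16", "fp16", "fp16"),
    }
    return table[mode]
```

The param/grad assignments for the two mixed modes follow the 14 Aug commit below (half-precision params with full-precision grads vs. half-precision both); only pure fp16 also demotes the optimizer state.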
- 14 Aug, 2020 1 commit
  Jun Ru Anderson authored
  Add support for mixed-precision (half-precision params, full-precision gradients) and memory-efficient (half-precision params and gradients) training with Adam. Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
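The usual reason to keep a full-precision copy alongside half-precision params is that small updates round away entirely in fp16. The toy below makes that visible; `round_fp16` is a crude stand-in for fp16 rounding near 1.0 (fp16 carries about 11 bits of mantissa), and the whole setup is an illustration of the master-weights pattern, not Fairscale's Adam:

```python
def round_fp16(x, step=2.0 ** -11):
    # Crude stand-in for fp16 rounding near 1.0: snap to the nearest
    # multiple of the fp16 spacing (~4.9e-4) at that magnitude.
    return round(x / step) * step

lr, grad = 1.0, 1e-4   # an update smaller than the fp16 spacing near 1.0

master = 1.0           # full-precision "master" weight
fp16_only = 1.0        # weight updated directly in (simulated) fp16

for _ in range(10):
    # Direct fp16 update: each step rounds back to 1.0 and is lost.
    fp16_only = round_fp16(fp16_only - lr * grad)
    # Master-weight update: accumulate in full precision...
    master = master - lr * grad

# ...and re-derive the half-precision copy, which now reflects progress.
fp16_view = round_fp16(master)
```

After ten steps the directly-updated fp16 weight has not moved at all, while the fp32 master has accumulated the full update and its fp16 view finally crosses a representable value.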
- 31 Jul, 2020 1 commit
  Jun Ru Anderson authored
  Add FusedAdam, update the benchmark, and add tests. Co-authored-by: Jun Ru Anderson <andersonic@fb.com>