- 27 May, 2020 1 commit
-
-
Jeff Rasley authored
* updates to support fp32 grad clipping and disable max_grad_norm
-
- 20 May, 2020 1 commit
-
-
Jeff Rasley authored
-
- 19 May, 2020 1 commit
-
-
Jeff Rasley authored
Updates for ZeRO stage 2 + ZeRO stage 1 w. RS
Co-authored-by: Tunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: Elton Zheng <eltonz@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: yuxionghe <yuxhe@microsoft.com>
Co-authored-by: Arash Ashari <arashari@microsoft.com>
-
- 18 May, 2020 1 commit
-
-
Arash Ashari authored
* adding BingSquad e2e test
* updating the draft test; bring final step under try section
* finalizing test for base deepspeed and deepspeed with ZeRO
* applying the comment (thanks Jeff); fixed formatting
-
- 30 Apr, 2020 1 commit
-
-
Jeff Rasley authored
* update apex version to feb 5th commit
* use gradient clipping instead of max grad norm in tests
* add warning when user provides max_grad_norm
* update examples commit
-
- 27 Mar, 2020 1 commit
-
-
Olatunji Ruwase authored
* Push to remote
* Correctly handle multi output models by doing loss scaling in backward(). Unit tests for multi output models.
* Fix formatting issues
* Formatting issues fix
* Fix formatting
* Update DeepSpeedExamples submodule. Enable Megatron model tests.
-
- 10 Mar, 2020 1 commit
-
-
Samyam Rajbhandari authored
* Enhancement: Ability to load a checkpoint without loading the optimizer states. Unit test covering saving and loading checkpoints with the fused, unfused, and ZeRO optimizers. The unit test takes about 165s.
-
- 22 Feb, 2020 1 commit
-
-
Olatunji Ruwase authored
* Support legacy optimizer fusion as config option
* Configure for legacy optimizer fusion
* Update configuration jsons for new apex
-
- 14 Feb, 2020 1 commit
-
-
Shaden Smith authored
* Porting BingBertSquad test
* Updating default paths.
* Enable model tests.
* Updating DeepSpeedExamples submodule
* Adding BingBertSquad's log uploads.
* Messed up the submodule again :-)
-
- 12 Feb, 2020 1 commit
-
-
eltonzheng authored
-
- 07 Feb, 2020 1 commit
-
-
Samyam Rajbhandari authored
* simplifying the batch config, using a single assert to test for validity and allowing for specifying only the micro batch size
* Simplifying Batch Config, Adding ability to specify batch using just micro_batch, and adding a bunch of unit tests
* ran formatting
* Typo fixes and added the config file
* reformatting
* path fixes
* removing print statements
-
- 04 Feb, 2020 1 commit
-
-
Shaden Smith authored
-
- 03 Feb, 2020 3 commits
-
-
Shaden Smith authored
-
Shaden Smith authored
-
Elton Zheng authored
-