- 08 Mar, 2021 1 commit
-
-
Samyam Rajbhandari authored
* Squash stage3 v1 (#146) Co-authored-by:
Samyam <samyamr@microsoft.com> Co-authored-by:
Jeff Rasley <jerasley@microsoft.com> Co-authored-by:
Samyam Rajbhandari <samyamr@microsoft.com> Co-authored-by:
Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by:
Shaden Smith <Shaden.Smith@microsoft.com> Co-authored-by:
Shaden Smith <ShadenTSmith@gmail.com> Co-authored-by:
eltonzheng <eltonz@microsoft.com> * Fix correctness bug (#147) * formatting fix (#150) * stage3 bugfix (API) update and simplified FP16 Z3 tests (#151) * fp16 Z3 API update and bugfix * revert debug change * ZeRO-3 detach and race condition bugfixes (#149) * trying out ZeRO-3 race condition fix * CUDA sync instead of stream * reduction stream sync * remove commented code * Fix optimizer state_dict KeyError (#148) Co-authored-by:
Jeff Rasley <jerasley@microsoft.com> * fix for smaller SGS sizes, ensures each grad is backed by unique tensors (#152) * Simplifying the logic for getting averaged gradients (#153) * skip for now * Z3 Docs redux (#154) * removing some TODOs and commented code (#155) * New Z3 defaults (#156) Co-authored-by:
Jeff Rasley <jerasley@microsoft.com> * formatting * megatron external params Co-authored-by:
Jeff Rasley <jerasley@microsoft.com> Co-authored-by:
Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by:
Shaden Smith <Shaden.Smith@microsoft.com> Co-authored-by:
Shaden Smith <ShadenTSmith@gmail.com> Co-authored-by:
eltonzheng <eltonz@microsoft.com>
-
- 10 Feb, 2021 1 commit
-
-
Olatunji Ruwase authored
* Fix docstring * Make screenshots clickable for easier viewing
-
- 20 Jan, 2021 1 commit
-
-
Stas Bekman authored
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
- 05 Jan, 2021 1 commit
-
-
brett koonce authored
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
- 11 Nov, 2020 1 commit
-
-
Samyam Rajbhandari authored
* Update zero.md Update to ZeRO tutorial to specify the use of activation checkpointing * Update zero-offload.md Use activation checkpointing with ZeRO-Offload Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
- 24 Sep, 2020 1 commit
-
-
Conglong Li authored
* url fix * revert absolute path but keep some actual fix * add real readme
-
- 10 Sep, 2020 1 commit
-
-
Olatunji Ruwase authored
Co-authored-by:
Shaden Smith <Shaden.Smith@microsoft.com> Co-authored-by:
Jeff Rasley <jerasley@microsoft.com>
-