"sgl-kernel/python/vscode:/vscode.git/clone" did not exist on "aca1101a13965a6d461b0625ed933c78475a7382"
- 08 Mar, 2021 1 commit
-
-
Samyam Rajbhandari authored
* Squash stage3 v1 (#146)
Co-authored-by: Samyam <samyamr@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: eltonzheng <eltonz@microsoft.com>
* Fix correctness bug (#147)
* formatting fix (#150)
* stage3 bugfix (API) update and simplified FP16 Z3 tests (#151)
* fp16 Z3 API update and bugfix
* revert debug change
* ZeRO-3 detach and race condition bugfixes (#149)
* trying out ZeRO-3 race condition fix
* CUDA sync instead of stream
* reduction stream sync
* remove commented code
* Fix optimizer state_dict KeyError (#148)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* fix for smaller SGS sizes, ensures each grad is backed by unique tensors (#152)
* Simplifying the logic for getting averaged gradients (#153)
* skip for now
* Z3 Docs redux (#154)
* removing some TODOs and commented code (#155)
* New Z3 defaults (#156)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* formatting
* megatron external params
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: eltonzheng <eltonz@microsoft.com>
-
- 10 Sep, 2020 2 commits
-
-
Jeff Rasley authored
-
Jeff Rasley authored
-
- 02 Sep, 2020 3 commits
-
-
Jeff Rasley authored
Remove llvm/cmake install for now, causing pyyaml issues
-
Jeff Rasley authored
-
Jeff Rasley authored
* Sparse attn + ops/runtime refactor + v0.3.0
Co-authored-by: Arash Ashari <arashari@microsoft.com>
Co-authored-by: Arash Ashari <arashari@microsoft.com>
-
- 28 Jul, 2020 1 commit
-
-
Jeff Rasley authored
* fix nv_peer_mem version in dockerfile
* fix security issue, remove pillow dependency (this is only needed for cifar example which has its own requirements.txt)
-
- 13 May, 2020 1 commit
-
-
Jeff Rasley authored
-
- 09 Mar, 2020 1 commit
-
-
Incomplete authored
* Add --no_sudo to run without sudo
* Add --pip_mirror to set the pip mirror
* Default to running pip without sudo
* Typo
* Add --pip_sudo to Dockerfile and azure-pipelines.yml
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 13 Feb, 2020 1 commit
-
-
Jeff Rasley authored
* bump tf version in dockerfile
* Update install.sh
-
- 10 Feb, 2020 1 commit
-
-
Jeff Rasley authored
-
- 05 Feb, 2020 1 commit
-
-
Jeff Rasley authored
* update examples submodule
* install requirements.txt with install script
* add dockerfile
-