- 07 Jan, 2021 2 commits
-
-
dependabot[bot] authored
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.10.10 to 1.11.0.
- [Release notes](https://github.com/sparklemotion/nokogiri/releases)
- [Changelog](https://github.com/sparklemotion/nokogiri/blob/master/CHANGELOG.md)
- [Commits](https://github.com/sparklemotion/nokogiri/compare/v1.10.10...v1.11.0)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Xingjian Shi authored
-
- 06 Jan, 2021 1 commit
-
-
Jeff Rasley authored
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 05 Jan, 2021 4 commits
-
-
Olatunji Ruwase authored
-
brett koonce authored
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Ammar Ahmad Awan authored
-
gcooper-isi authored
Allow DeepSpeed models to be initialized with optimizer=None
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
-
- 04 Jan, 2021 2 commits
-
-
Olatunji Ruwase authored
-
Jeff Rasley authored
-
- 23 Dec, 2020 1 commit
-
-
Jeff Rasley authored
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
-
- 18 Dec, 2020 1 commit
-
-
Jeff Rasley authored
-
- 17 Dec, 2020 1 commit
-
-
Reza Yazdani authored
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 15 Dec, 2020 2 commits
-
-
Jeff Rasley authored
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
-
Stas Bekman authored
* [doc] xref to hostfile discussion: it wasn't clear where to find what was meant by `hostfile`, so this adds a link to where it's discussed.
* remove whitespace
-
- 14 Dec, 2020 1 commit
-
-
Stas Bekman authored
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 11 Dec, 2020 5 commits
-
-
Jeff Rasley authored
* Update launch.py
* formatting
-
carefree0910 authored
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Stas Bekman authored
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Stas Bekman authored
* fix arch flags, add PTX
* bug fix
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Jeff Rasley authored
-
- 09 Dec, 2020 4 commits
-
-
Jeff Rasley authored
-
Jeff Rasley authored
-
Jeff Rasley authored
-
Jeff Rasley authored
-
- 08 Dec, 2020 1 commit
-
-
Shaden Smith authored
* Switch from deprecated allreduce interface.
* Make pipeline checkpoint files portable.
-
- 07 Dec, 2020 2 commits
-
-
Stas Bekman authored
RTX-30 series are compute_86
```
python -c "import torch; print(torch.cuda.get_device_capability())"
```
This PR adds support for this compute capability.
Reference: https://developer.nvidia.com/cuda-gpus
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
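For readers unfamiliar with how a compute capability maps to compiler flags, here is a minimal sketch. It assumes a `(major, minor)` tuple such as the one `torch.cuda.get_device_capability()` returns and turns it into `nvcc`-style `-gencode` flags. The helper name `capability_to_gencode` and its `add_ptx` parameter are hypothetical, chosen for illustration; this is not the code from the PR.

```python
from typing import List


def capability_to_gencode(major: int, minor: int, add_ptx: bool = False) -> List[str]:
    """Hypothetical helper: build nvcc -gencode flags for one compute capability."""
    cc = f"{major}{minor}"  # e.g. (8, 6) -> "86"
    # Native machine code (SASS) for exactly this architecture.
    flags = [f"-gencode=arch=compute_{cc},code=sm_{cc}"]
    if add_ptx:
        # Also embed PTX so newer GPUs can JIT-compile the kernel.
        flags.append(f"-gencode=arch=compute_{cc},code=compute_{cc}")
    return flags


if __name__ == "__main__":
    # (8, 6) is the capability the commit message gives for the RTX-30 series.
    for flag in capability_to_gencode(8, 6, add_ptx=True):
        print(flag)
```

On a machine with PyTorch and a GPU, the tuple could be passed straight through, for example `capability_to_gencode(*torch.cuda.get_device_capability())`.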
-
Stas Bekman authored
-
- 04 Dec, 2020 1 commit
-
-
Zhun authored
* 1) Register layout as buffer of module so that we can save/load checkpoint; 2) Add a broadcast of layout at the beginning to ensure different processes will have consistent layout during distributed training.
* Add docstring for max_seq_length argument in SparseSelfAttention
Co-authored-by: Zhun Liu <zhunliu@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 03 Dec, 2020 3 commits
-
-
Stas Bekman authored
-
Jeff Rasley authored
-
Stas Bekman authored
-
- 02 Dec, 2020 2 commits
-
-
Jeff Rasley authored
-
Stas Bekman authored
* [cifar tutorial] improve readability
-
- 01 Dec, 2020 2 commits
-
-
Reza Yazdani authored
* tracking optimizer step in cpu-adam when loading checkpoint
* add warning/error message for updating optimizer step count
* resolve build issue
* supporting state update from the python side
* track step from python in all cases
* remove comma
-
Reza Yazdani authored
* supporting different hidden dimensions
* add support for larger hidden dimensions (greater than 8K)
* remove empty line
* add loop unrolling factor for dropout kernels
* update different kernels based on the reviews
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 28 Nov, 2020 1 commit
-
-
Stas Bekman authored
This PR:
* fixes a misspelled method name
* also `( () )` doesn't read too well, until one reads the code and understands that it's not a formatting bug. I proposed to simply say that it's a callable object.
-
- 25 Nov, 2020 4 commits
-
-
Jeff Rasley authored
-
Jeff Rasley authored
-
Jeff Rasley authored
-
Shaden Smith authored
-