- 10 Sep, 2020 9 commits
-
-
Minjia Zhang authored
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
Arash Ashari authored
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
Jeff Rasley authored
-
Olatunji Ruwase authored
Co-authored-by:Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
Jeff Rasley authored
-
Jeff Rasley authored
Fixes a datatype issue with softmax where the number of blocks sent to the Triton kernel source was a torch.Tensor but should have been a Python integer. On some environments (e.g., conda) this left Triton unable to serialize the input, which crashed our tests. Switching to the datatype that Triton expects seems to have solved the issue.
Co-authored-by:Shaden Smith <Shaden.Smith@microsoft.com>
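The tensor-versus-integer mismatch described in this commit message can be illustrated with a minimal sketch. It is not the code from the commit: `as_python_int` and `FakeScalar` are hypothetical names introduced here, and the only library behavior assumed is that a one-element `torch.Tensor` exposes `.item()`, which returns a plain Python number.

```python
def as_python_int(value):
    """Return `value` as a built-in int (hypothetical helper, not DeepSpeed code).

    Scalar tensor-like objects, such as a one-element torch.Tensor, expose
    .item(), which yields a plain Python number. Plain ints pass through.
    """
    item = getattr(value, "item", None)
    if callable(item):
        value = item()
    result = int(value)
    if result != value:
        # Refuse to silently truncate a non-integral value such as 2.5.
        raise ValueError(f"expected an integral value, got {value!r}")
    return result


class FakeScalar:
    """Stand-in for a one-element tensor so the sketch runs without torch."""

    def __init__(self, v):
        self._v = v

    def item(self):
        return self._v


num_blocks = as_python_int(FakeScalar(12))
print(type(num_blocks).__name__, num_blocks)  # prints "int 12"
```

Normalizing the value at the call site keeps the kernel launch arguments to built-in types, which is the kind of input the message says Triton expects.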
-
Ammar Ahmad Awan authored
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
Shaden Smith authored
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
Jeff Rasley authored
* ZeRO-Offload (squash) (#381)
Co-authored-by:Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by:Reza Yazdani <reyazda@microsoft.com>
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
Co-authored-by:Jie <37380896+jren73@users.noreply.github.com>
Co-authored-by:Arash Ashari <arashari@microsoft.com>
Co-authored-by:Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by:Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by:arashashari <arashashari@ArashMSLaptop.redmond.corp.microsoft.com>
Co-authored-by:RezaYazdaniAminabadi <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-
- 09 Sep, 2020 5 commits
-
-
Jeff Rasley authored
-
Arash Ashari authored
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
Ammar Ahmad Awan authored
* 1-bit adam (#353)
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
Co-authored-by:Your Name <you@example.com>
Co-authored-by:tanghl1994 <htang14@ur.rochester.edu>
Co-authored-by:Hank <tanghl1994@gmail.com>
Co-authored-by:root <root@node2x12b.cs.rochester.edu>
Co-authored-by:Ammar Ahmad Awan <awan.ammar@microsoft.com>
-
Jeff Rasley authored
-
Arash Ashari authored
-
- 06 Sep, 2020 2 commits
-
-
Arash Ashari authored
* adding BingSquad e2e test
* updating the draft test; bring final step under try section
* finalizing test for base DeepSpeed and DeepSpeed with ZeRO
* applying the comment (thanks Jeff); fixed formatting
* update Sparse Attention Tutorial
* fixed a few issues and applied comments for better organization and readability
* updated Sparse Attention Tutorial, making the how-to-use section incremental; applying more comments
Co-authored-by:arashashari <arashashari@ArashMSLaptop.redmond.corp.microsoft.com>
-
Olatunji Ruwase authored
-
- 05 Sep, 2020 2 commits
-
-
Shaden Smith authored
-
Arash Ashari authored
-
- 04 Sep, 2020 1 commit
-
-
Shaden Smith authored
-
- 03 Sep, 2020 2 commits
-
-
Arash Ashari authored
* adding link to Sparse Attention in Navigation page
-
Jeff Rasley authored
-
- 02 Sep, 2020 4 commits
-
-
Jeff Rasley authored
-
Jeff Rasley authored
Remove llvm/cmake install for now, causing pyyaml issues
-
Jeff Rasley authored
-
Jeff Rasley authored
* Sparse attn + ops/runtime refactor + v0.3.0
Co-authored-by:Arash Ashari <arashari@microsoft.com>
-
- 01 Sep, 2020 6 commits
-
-
Shaden Smith authored
-
Jeff Rasley authored
-
Samyam Rajbhandari authored
Renaming config files to gas3
-
Samyam Rajbhandari authored
-
Samyam Rajbhandari authored
-
Samyam Rajbhandari authored
* Adding gradient accumulation support for ZeRO Stage 2. Changing all Megatron-LM tests to also test gradient accumulation
* Gradient accumulation support for Stage 2. Model tests added to test the feature
* formatting
* Update deepspeed_light.py, removing comment
* Update ds_config_func_bs8_zero1.json, reverting this file back. It's not needed for this PR
* defining baseline prefix
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
- 31 Aug, 2020 1 commit
-
-
Samyam Rajbhandari authored
* Update deepspeed_checkpointing.py
* formatting
Co-authored-by:Jeff Rasley <jerasley@microsoft.com>
-
- 28 Aug, 2020 1 commit
-
-
Jeff Rasley authored
-
- 27 Aug, 2020 1 commit
-
-
Jeff Rasley authored
* Create CODEOWNERS
-
- 18 Aug, 2020 1 commit
-
-
Jeff Rasley authored
* turn off multi-node launch if only 1 node
-
- 14 Aug, 2020 1 commit
-
-
Jeff Rasley authored
-
- 13 Aug, 2020 2 commits
-
-
Jeff Rasley authored
-
Jeff Rasley authored
* update fan out flag for pdsh
-
- 12 Aug, 2020 2 commits
-
-
Jeff Rasley authored
-
Shaden Smith authored
-