1. 27 May, 2020 2 commits
    • Samyamr/cpu memory bloat fix zero (#233) · d24d3de9
      Samyam Rajbhandari authored
      * Fix for the CPU memory bloat caused by PyTorch backward graph creation in all_gather. Fixed by calling detach on tensors before calling all_gather.
    • Support fp32 grad clipping and fix max_grad_norm confusion (#232) · abe2204d
      Jeff Rasley authored
      * updates to support fp32 grad clipping and disable max_grad_norm
  2. 26 May, 2020 1 commit
  3. 25 May, 2020 1 commit
  4. 21 May, 2020 2 commits
  5. 20 May, 2020 1 commit
  6. 19 May, 2020 6 commits
  7. 18 May, 2020 1 commit
    • adding BingSquad e2e test (#214) · c61e23b4
      Arash Ashari authored
      * adding BingSquad e2e test
      
      * updating the draft test; bring final step under try section
      
      * finalizing test for base DeepSpeed and DeepSpeed with ZeRO
      
      * applying the comment (thanks Jeff); fixed formatting
  8. 15 May, 2020 1 commit
  9. 13 May, 2020 1 commit
  10. 12 May, 2020 1 commit
  11. 11 May, 2020 1 commit
  12. 06 May, 2020 2 commits
  13. 05 May, 2020 1 commit
  14. 04 May, 2020 1 commit
  15. 30 Apr, 2020 2 commits
  16. 29 Apr, 2020 1 commit
    • CSR+FP32 fix (#206) · 6cb332f1
      Samyam Rajbhandari authored
      1) CSR parameter names should end with .weight. 
      2) When using the basic optimizer directly, DeepSpeed should handle zero_grad. Letting the basic optimizer do the zero_grad left residual gradients in the embedding layer, for reasons not yet understood.
  17. 27 Apr, 2020 1 commit
  18. 25 Apr, 2020 1 commit
  19. 24 Apr, 2020 1 commit
  20. 22 Apr, 2020 2 commits
  21. 21 Apr, 2020 1 commit
  22. 20 Apr, 2020 1 commit
  23. 16 Apr, 2020 1 commit
  24. 12 Apr, 2020 1 commit
  25. 10 Apr, 2020 1 commit
  26. 09 Apr, 2020 1 commit
  27. 07 Apr, 2020 1 commit
  28. 06 Apr, 2020 1 commit
  29. 03 Apr, 2020 1 commit
  30. 28 Mar, 2020 1 commit