1. 17 Apr, 2021 1 commit
  2. 15 Apr, 2021 1 commit
    • Add unit tests for Fused NovoGrad (#1065) · 59d2f7ac
      Sudhakar Singh authored
      * Add unit tests for fused-novograd
      
      * Fix: tensors should reside on the same device
      
      * Fix: the CUDA stream should be obtained for the same device on which the tensors reside. Found while debugging the fused NovoGrad multi-device unit test
      
      * Fix issues mentioned in the review comments
  3. 19 Oct, 2020 1 commit
    • Optimize the sync batchnorm by batching the communication (#980) · 8a1ed9e8
      lly-zero-one authored
      This PR optimizes the performance of SyncBatchNorm and fixes a potential issue in the welford_parallel kernel implementation.
      
      Changes:
      * Batch the mean/var/count all_gather communication so it is sent once in the forward path.
      * Batch the all_reduce in the backward path.
      * Add a contiguous call on the input of the welford_parallel kernel.
      
      If there is a standard perf benchmark, I would be happy to run it.
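The batching idea in this commit can be illustrated with a small, framework-free sketch. It is not the code from the PR: `fake_all_gather`, `pack`, and `merge` are hypothetical names, and the collective is simulated with plain Python lists instead of `torch.distributed`. Each rank packs its mean, variance, and count into one buffer so a single gather suffices, and the per-rank statistics are then merged with the standard parallel-variance formula, which also covers ranks with different batch sizes.

```python
from typing import List, Tuple

Stats = Tuple[float, float, int]  # (mean, biased variance, element count)


def local_stats(xs: List[float]) -> Stats:
    """Per-rank statistics over the local batch."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var, n


def pack(stats: Stats) -> List[float]:
    """Pack mean/var/count into one buffer so one collective call is enough."""
    mean, var, count = stats
    return [mean, var, float(count)]


def fake_all_gather(buffers: List[List[float]]) -> List[List[float]]:
    """Stand-in for a single all_gather: every rank sees every packed buffer."""
    return [list(b) for b in buffers]


def merge(gathered: List[List[float]]) -> Stats:
    """Combine per-rank stats; counts may differ from rank to rank."""
    total = sum(int(b[2]) for b in gathered)
    mean = sum(b[0] * b[2] for b in gathered) / total
    # Sum of squared deviations: n_i * (var_i + (mean_i - mean)^2) per rank.
    m2 = sum(b[2] * (b[1] + (b[0] - mean) ** 2) for b in gathered)
    return mean, m2 / total, total


# Three simulated ranks with non-uniform batch sizes.
per_rank = [[1.0, 2.0, 3.0], [10.0, 20.0], [5.0]]
gathered = fake_all_gather([pack(local_stats(xs)) for xs in per_rank])
mean, var, count = merge(gathered)
```

The merged result should match the statistics computed directly over the concatenated batches, which is the property a synchronized batch norm relies on.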
  4. 05 Aug, 2020 1 commit
  5. 06 Jul, 2020 1 commit
    • [sync BN] (#792) · 1ff54b8f
      jjsjann123 authored
      * [sync BN]
      
      Support non-uniform batch sizes across the process group.
      
      TODO: add tests once the code is cleaned up.
      
      * updating unit tests
      
      * new unit tests for different inputs
      
      * cleaning
  6. 23 May, 2020 1 commit
  7. 22 May, 2020 5 commits
  8. 21 May, 2020 1 commit
  9. 14 May, 2020 1 commit
  10. 30 Apr, 2020 3 commits
  11. 28 Apr, 2020 1 commit
  12. 22 Apr, 2020 1 commit
  13. 10 Apr, 2020 1 commit
  14. 27 Feb, 2020 1 commit
  15. 04 Oct, 2019 1 commit
  16. 06 Sep, 2019 1 commit
    • Fix for #456 (#477) · 325f5a0b
      mcarilli authored
      * Pushing for build tests
      
      * Contrib files
      
      * Removing deprecated checks
  17. 20 Aug, 2019 1 commit
  18. 17 Aug, 2019 1 commit
  19. 16 Aug, 2019 2 commits
    • clean up variance options support by all fused optimizers: · 18062b69
      Deyu Fu authored
      * Do not apply bias correction to epsilon (same as the recent upstream change)
      * Do not apply bias correction to weight decay (consistent with upstream AdamW)
      * Add adam_w_mode to FusedAdam/LAMB to choose between L2 regularization and decoupled weight decay (Adam vs. AdamW)
      * Correct the documentation: reg_inside_moment in FusedNovoGrad is documented separately from adam_w_mode
      * Remove the legacy eps_mode from FusedAdam
      * Use float as the internal math type across fused optimizers
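To make the L2 versus decoupled weight decay distinction concrete, here is a minimal scalar sketch of one Adam step. It follows the textbook Adam and AdamW formulas and is not taken from the fused kernels; `adam_step` and its defaults are hypothetical, and only the `adam_w_mode` flag name comes from the commit message. Epsilon is added after bias correction and is not itself corrected, and the decoupled decay term is applied without bias correction.

```python
import math


def adam_step(p, g, m, v, step, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, adam_w_mode=True):
    """One scalar Adam update; returns the new (p, m, v)."""
    if not adam_w_mode:
        # L2 mode: decay is folded into the gradient, so it enters the moments.
        g = g + weight_decay * p
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** step)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** step)  # bias-corrected second moment
    update = m_hat / (math.sqrt(v_hat) + eps)  # eps is not bias corrected
    if adam_w_mode:
        # AdamW mode: decay is added to the update directly, uncorrected.
        update = update + weight_decay * p
    return p - lr * update, m, v


# Zero gradient isolates the effect of weight decay in each mode.
p_w, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, step=1,
                      weight_decay=0.1, adam_w_mode=True)
p_l2, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, step=1,
                       weight_decay=0.1, adam_w_mode=False)
```

With a zero gradient, the decoupled mode shrinks the parameter by `lr * weight_decay * p`, while the L2 mode routes the decay through the moment estimates and so takes a normalized step of roughly `lr`.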
    • add fused lamb, put lamb kernels into one file · c8f9cceb
      Deyu Fu authored
  20. 08 Aug, 2019 1 commit
  21. 06 Aug, 2019 1 commit
    • Clean up layer norm tests (#418) · 3ef01fae
      ngimel authored
      * Bug fix for non-affine layer-norm + add backward unit test
      
      * clean up tests and add tests for a large batch
  22. 01 Aug, 2019 1 commit
  23. 26 Jul, 2019 1 commit
  24. 12 Jul, 2019 1 commit
  25. 03 Jul, 2019 4 commits
  26. 28 Jun, 2019 1 commit
  27. 14 Jun, 2019 1 commit
  28. 11 Jun, 2019 1 commit
  29. 31 May, 2019 2 commits