  1. 16 Dec, 2020 1 commit
  2. 15 Dec, 2020 1 commit
  3. 21 Aug, 2020 1 commit
  4. 01 Jun, 2020 1 commit
  5. 29 May, 2020 2 commits
  6. 10 Oct, 2019 1 commit
  7. 08 Oct, 2019 1 commit
  8. 27 Aug, 2019 1 commit
    • Enable Checkpointing (#420) · dec4fdd6
      ptrblck authored
      * add state_dict, load_state_dict
      
      * add test_restoring, test_loss_scale_decrease
      
      * disable amp outputs for checkpoint tests
      
      * add test for amp.state_dict, cleanup
      
      * add state_dict patch, add test
      
      * fixed testing, cleanup
      
      * add readme for checkpointing
      
      * add docs to source/amp
      
      * add review changes to doc
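
      The resulting workflow: amp.state_dict() / amp.load_state_dict() save
      and restore the loss scaler alongside the model and optimizer state. A
      minimal sketch following the pattern this PR documents (the model,
      optimizer, and checkpoint filename are placeholders):

          import torch
          from apex import amp

          model = torch.nn.Linear(4, 4).cuda()
          optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
          model, optimizer = amp.initialize(model, optimizer, opt_level='O1')

          # Save amp's loss-scaler state together with model and optimizer.
          checkpoint = {
              'model': model.state_dict(),
              'optimizer': optimizer.state_dict(),
              'amp': amp.state_dict(),
          }
          torch.save(checkpoint, 'amp_checkpoint.pt')

          # Restore: amp.initialize must run first (with the same opt_level)
          # so the scaler exists to receive its saved state.
          checkpoint = torch.load('amp_checkpoint.pt')
          model.load_state_dict(checkpoint['model'])
          optimizer.load_state_dict(checkpoint['optimizer'])
          amp.load_state_dict(checkpoint['amp'])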
  9. 13 Aug, 2019 1 commit
  10. 24 Jun, 2019 1 commit
  11. 09 May, 2019 1 commit
  12. 30 Apr, 2019 1 commit
  13. 18 Apr, 2019 1 commit
  14. 11 Apr, 2019 1 commit
  15. 12 Mar, 2019 1 commit
  16. 07 Mar, 2019 2 commits
  17. 04 Mar, 2019 1 commit
  18. 01 Mar, 2019 2 commits
  19. 28 Feb, 2019 1 commit
    • typo · 519ff816
      vfdev authored
  20. 20 Feb, 2019 5 commits
  21. 28 Jan, 2019 1 commit
  22. 31 Oct, 2018 1 commit
  23. 30 Oct, 2018 1 commit
  24. 23 Oct, 2018 1 commit
    • [syncBN] (#48) · 81eef1ef
      jjsjann123 authored
      * [syncBN]
        added syncBN in native pure Python apex
        added fused CUDA kernels for sync BN, using Welford's algorithm for
          mean/var (an illustrative sketch follows this entry); optional
          installation via 'python setup.py install --cuda_ext'
        added a unit test with a side-by-side comparison between apex sync BN
          and PyTorch BN. Note that the PyTorch BN output will be slightly off
          because of numerical issues in its mean/var computation.
      
      * [syncBN PR]
        added fp16 support
        addressed review comments on:
          1. updating the last pow 2 computation
          2. catching the ImportError when importing the syncBN kernel
      
      * [syncBN PR]
        added a convert function to insert SyncBatchNorm (usage sketch after
          this entry)
        refactored some kernel code
      
      * fixed type issues (fp16/fp32/fp64)
        added Kahan summation
        updated the unit test to use PyTorch primitive ops in double
          precision; tests now pass within reasonable tolerances
      
      * updated tensor creation calls

      * fixed contiguous tensor handling in all_reduce

      * transposed the all_reduce results
      
      * [syncBN]
        support fp16 input with fp32 layers for apex fp16
        partially fixed launch configs
        enabled the imagenet example to run with --sync_bn
      
      * [syncBN PR]
        added documentation

      * adjusted the README

      * adjusted the README again

      * added some documentation to the imagenet example
      
      * [syncBN]
        warp-level reduction
        bug fix: updated the warp reduction logic; check for the dummy element
          to avoid NaN
        improved the launch config for better reduction kernels; a further
          improvement would be to increase the grid size

      * [syncBN]
        fixed undefined behavior in __shfl_down_sync caused by divergent
          threads in the warp reduction
        changed at::native::empty to at::empty (upstream comments)
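
      Two illustrative sketches for the features described above. First, the
      convert function is exposed as apex.parallel.convert_syncbn_model,
      which recursively replaces torch.nn BatchNorm layers with
      apex.parallel.SyncBatchNorm. A minimal usage sketch, assuming an NCCL
      process group has already been initialized (the model is a placeholder):

          import torch
          from apex.parallel import DistributedDataParallel, convert_syncbn_model

          # SyncBatchNorm reduces batch statistics across the process group,
          # so torch.distributed.init_process_group(backend='nccl') must have
          # been called before the converted model is used.
          model = torch.nn.Sequential(
              torch.nn.Conv2d(3, 8, 3),
              torch.nn.BatchNorm2d(8),  # swapped for apex.parallel.SyncBatchNorm
              torch.nn.ReLU(),
          ).cuda()
          model = convert_syncbn_model(model)
          model = DistributedDataParallel(model)

      Second, on the numerics: the fused kernels accumulate mean/var with
      Welford's online algorithm, which avoids the catastrophic cancellation
      of the naive E[x^2] - E[x]^2 formulation. A pure-Python illustration of
      the recurrence (not the kernel code itself):

          def welford_mean_var(xs):
              # One pass over the data (assumed non-empty); m2 accumulates
              # the sum of squared deviations from the running mean.
              mean, m2, n = 0.0, 0.0, 0
              for x in xs:
                  n += 1
                  delta = x - mean
                  mean += delta / n
                  m2 += delta * (x - mean)
              return mean, m2 / n  # population variance, as batch norm uses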
  25. 08 Oct, 2018 1 commit
  26. 22 Aug, 2018 1 commit
  27. 19 Aug, 2018 1 commit
  28. 05 Jul, 2018 1 commit
  29. 15 Jun, 2018 3 commits
  30. 14 Jun, 2018 2 commits