1. 23 Apr, 2023 2 commits
      Updating BLOCK_SIZE to 1024 in all optimizers. (#103) · 06053e19
      aspanday authored
      * Updating BLOCK_SIZE to 1024.
      tests/L0/run_optimizers/test_fused_optimizer.py passes except for the bfloat16 case for Adam; that test appears to have a bug that still needs to be resolved.
      For now, test_bfloat16 for Adam is skipped in the unit test.
      The other 17 tests all pass.
      More details on the effects of these changes can be found here: https://confluence.amd.com/display/MLSE/Apex+Kernel+Optimization
      
      This commit sets BLOCK_SIZE=1024 only for the optimizer kernels.
      The L2-norm kernels (part of the LAMB optimizer algorithm) keep BLOCK_SIZE=512; otherwise the allclose check fails.
      
      * Updated tests/L0/run_optimizers/test_fused_optimizer.py with @skipIfRocm to skip test_bfloat16 for Adam; a hedged sketch of this parity-and-skip pattern follows below.
      Co-authored-by: aspanday <aspanday@amd.com>
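      A minimal sketch of the parity-plus-skip pattern described above, assuming a CUDA/ROCm apex build with apex.optimizers.FusedAdam available; the model shape, step count, tolerances, and the IS_ROCM check standing in for the skipIfRocm helper are all illustrative, not taken from the repository.

      import copy
      import unittest

      import torch

      IS_ROCM = torch.version.hip is not None  # crude stand-in for a skipIfRocm-style helper


      class TestFusedAdamParity(unittest.TestCase):
          @unittest.skipIf(IS_ROCM, "bfloat16 Adam case reportedly fails on ROCm for now")
          def test_bfloat16(self):
              self._run(torch.bfloat16)

          def test_float32(self):
              self._run(torch.float32)

          def _run(self, dtype):
              from apex.optimizers import FusedAdam  # needs an apex build with CUDA extensions

              torch.manual_seed(0)
              ref_model = torch.nn.Linear(64, 64).cuda().to(dtype)
              tst_model = copy.deepcopy(ref_model)
              ref_opt = torch.optim.Adam(ref_model.parameters(), lr=1e-3)
              tst_opt = FusedAdam(tst_model.parameters(), lr=1e-3)

              for _ in range(5):
                  x = torch.randn(8, 64, device="cuda", dtype=dtype)
                  for model, opt in ((ref_model, ref_opt), (tst_model, tst_opt)):
                      opt.zero_grad()
                      model(x).sum().backward()
                      opt.step()

              # The fused and reference updates should agree within a loose tolerance.
              for p_ref, p_tst in zip(ref_model.parameters(), tst_model.parameters()):
                  torch.testing.assert_close(p_ref, p_tst, rtol=1e-2, atol=1e-2)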
      Unskip some unit tests related to issue #82 (#98) · 2951440a
      Hubert Lu authored
      * Unskip some unit tests related to issue #82
      
      * Ensure test_state_dict uses capturable=True for torch.optim.Adam (a hedged sketch follows below)
      
      * Fix TestFusedAdam tests in test_fused_optimizer.py
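      A minimal sketch of the capturable=True detail above, assuming torch.optim.Adam with its capturable flag (available in recent PyTorch releases); the model size and learning rate are illustrative. Building both the original and the reloaded reference optimizer with capturable=True keeps the step state as a tensor, matching what a capturable fused optimizer stores.

      import torch

      model = torch.nn.Linear(16, 16).cuda()
      ref_opt = torch.optim.Adam(model.parameters(), lr=1e-3, capturable=True)

      # Take one step so optimizer state exists, then round-trip the state_dict.
      model(torch.randn(4, 16, device="cuda")).sum().backward()
      ref_opt.step()

      state = ref_opt.state_dict()
      new_opt = torch.optim.Adam(model.parameters(), lr=1e-3, capturable=True)
      new_opt.load_state_dict(state)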
  2. 10 Aug, 2022 1 commit
  3. 08 Aug, 2022 1 commit
  4. 22 Jun, 2022 1 commit
  5. 14 Dec, 2021 1 commit
  6. 19 Oct, 2021 1 commit
  7. 15 Apr, 2021 1 commit
      Add unit tests for Fused NovoGrad (#1065) · 59d2f7ac
      Sudhakar Singh authored
      * Add unit tests for fused-novograd
      
      * Fix: tensors should reside on the same device
      
      * Fix: the CUDA stream must be obtained on the same device where the tensors reside; found this while debugging the fused NovoGrad multi-device unit test (see the sketch after this entry)
      
      * Fixed the issues mentioned in the comments
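      A minimal sketch of the multi-device point above, assuming apex.optimizers.FusedNovoGrad and at least one GPU; the second-GPU selection, model shape, and learning rate are illustrative. The point is that the current stream, and the kernels launched on it, must come from the same device the tensors live on.

      import torch
      from apex.optimizers import FusedNovoGrad  # needs an apex build with CUDA extensions

      device = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cuda:0")
      model = torch.nn.Linear(32, 32).to(device)
      opt = FusedNovoGrad(model.parameters(), lr=1e-3)

      with torch.cuda.device(device):  # make this device's default stream the current one
          loss = model(torch.randn(4, 32, device=device)).sum()
          loss.backward()
          opt.step()
          torch.cuda.current_stream(device).synchronize()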
  8. 21 Jan, 2021 1 commit
  9. 18 Jan, 2021 1 commit
  10. 05 Aug, 2020 1 commit
  11. 26 May, 2020 1 commit
  12. 03 Sep, 2019 1 commit
      Fix issues in fused_adam (#469) · 7fa74925
      Deyu Fu authored
      * Move the import of amp_C into __init__()
      
      * Keep fp16 and fp32 params in separate lists to support mixed param types; disable the double-precision test (a hedged sketch of the split follows below)
      
      * Make zero_grad consistent across adam/novograd/lamb
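      A minimal sketch of the fp16/fp32 split mentioned above; the grouping helper is illustrative, not apex's actual code. Grouping parameters by dtype before handing them to a fused kernel is what lets mixed parameter types coexist in one optimizer.

      import torch


      def split_params_by_dtype(params):
          """Partition parameters into fp16 and fp32 lists for per-dtype fused updates."""
          fp16_params, fp32_params = [], []
          for p in params:
              if p.dtype == torch.float16:
                  fp16_params.append(p)
              elif p.dtype == torch.float32:
                  fp32_params.append(p)
              else:
                  raise RuntimeError(f"unsupported dtype for fused optimizer: {p.dtype}")
          return fp16_params, fp32_params


      # Usage: each list becomes its own tensor group for the fused kernel.
      params = [torch.zeros(4, dtype=torch.float16), torch.zeros(4)]
      fp16, fp32 = split_params_by_dtype(params)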
  13. 17 Aug, 2019 1 commit
  14. 13 Aug, 2019 1 commit
      Revert to Fused* naming, clean up accordingly: · 007c5947
      Deyu Fu authored
      FusedSGD now works as before.
      FusedAdam now works with O1/O2 and no longer fuses scaling and casting (see the sketch after this entry).
      Removed the special backend handling for FusedAdam.
      Moved and updated the FusedAdam test into run_optimizers.
      Removed the legacy tests for optimizers.FP16_Optimizer and FusedAdam in run_mixed_adam.
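      A minimal sketch of what "works with O1/O2" refers to above, assuming apex.amp and apex.optimizers.FusedAdam are installed; the opt_level, model size, and learning rate are illustrative. After this change, FusedAdam goes through amp.initialize like any other optimizer, with amp handling loss scaling instead of the optimizer fusing the scale and cast.

      import torch
      from apex import amp
      from apex.optimizers import FusedAdam

      model = torch.nn.Linear(128, 128).cuda()
      optimizer = FusedAdam(model.parameters(), lr=1e-3)
      model, optimizer = amp.initialize(model, optimizer, opt_level="O2")

      loss = model(torch.randn(16, 128, device="cuda")).sum()
      with amp.scale_loss(loss, optimizer) as scaled_loss:
          scaled_loss.backward()
      optimizer.step()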
  15. 12 Aug, 2019 1 commit
  16. 08 Aug, 2019 1 commit
  17. 13 Mar, 2019 1 commit
  18. 26 Feb, 2019 1 commit
  19. 30 Oct, 2018 1 commit
      Adam tests (#67) · d594826c
      ngimel authored
      * Add unittest for FusedAdam.
      
      * Fix some bugs.
      
      * Set the seed for the Adam test (a short seeding sketch follows below)
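      A minimal sketch of the seeding point above; the seed value is illustrative. Fixing both the CPU and CUDA RNGs keeps the reference and fused optimizers working on identical inputs and initial weights across runs.

      import torch

      torch.manual_seed(9876)           # seeds the CPU RNG (and, on current PyTorch, all GPUs)
      torch.cuda.manual_seed_all(9876)  # explicit CUDA seeding for older PyTorch versions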