1. 23 Apr, 2023 1 commit
    • Updating BLOCK_SIZE to 1024 in all optimizers. (#103) · 06053e19
      aspanday authored
      * Updating BLOCK_SIZE to 1024.
      tests/L0/run_optimizers/test_fused_optimizer.py passes except for the bfloat16 case for Adam. There seems to be a bug in that test that still needs to be resolved.
      For now, test_bfloat16 for Adam is skipped in the unit test.
      The 17 other tests were run and all of them pass.
      More details on the effects of these changes can be found here: https://confluence.amd.com/display/MLSE/Apex+Kernel+Optimization
      
      This commit sets BLOCK_SIZE=1024 only for the optimizer kernels.
      The L2norm kernels (part of the LAMB optimizer algorithm) keep BLOCK_SIZE=512; otherwise the allclose check fails.
      
      * Updating tests/L0/run_optimizers/test_fused_optimizer.py with @skipifRocm to skip test_bfloat16 in Adam.
      Co-authored-by: aspanday <aspanday@amd.com>
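The skip described in this commit can be pictured with a minimal sketch built on the standard `unittest.skipIf` decorator. The helper `skip_if_rocm`, its `is_rocm` argument, and the `FakeAdamTests` class are hypothetical stand-ins for illustration; they are not the repository's actual `@skipifRocm` implementation or test code.

```python
import unittest


def skip_if_rocm(reason, is_rocm):
    """Hypothetical stand-in for the @skipifRocm decorator named in the commit.

    `is_rocm` is passed in explicitly here; a real helper would detect the
    platform itself (that detection is not shown in the commit message).
    """
    return unittest.skipIf(is_rocm, reason)


class FakeAdamTests(unittest.TestCase):
    """Illustrative test case, not the real test_fused_optimizer.py."""

    @skip_if_rocm("bfloat16 Adam comparison fails; suspected test bug", is_rocm=True)
    def test_bfloat16(self):
        # Never reached while the skip is active.
        self.fail("should not run when is_rocm is True")

    def test_float(self):
        self.assertTrue(True)


def run_suite():
    """Run the illustrative tests and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FakeAdamTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Calling `run_suite()` reports `test_bfloat16` as skipped rather than failed, so the remaining tests still decide whether the run succeeds.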
  2. 25 Jan, 2023 1 commit
    • Updating BLOCK_SIZE to 1024 in all optimizers. (#103) · 14db5c27
      aspanday authored
      * Updating BLOCK_SIZE to 1024.
      tests/L0/run_optimizers/test_fused_optimizer.py passes except for the bfloat16 case for Adam. There seems to be a bug in that test that still needs to be resolved.
      For now, test_bfloat16 for Adam is skipped in the unit test.
      The 17 other tests were run and all of them pass.
      More details on the effects of these changes can be found here: https://confluence.amd.com/display/MLSE/Apex+Kernel+Optimization
      
      This commit sets BLOCK_SIZE=1024 only for the optimizer kernels.
      The L2norm kernels (part of the LAMB optimizer algorithm) keep BLOCK_SIZE=512; otherwise the allclose check fails.
      
      * Updating tests/L0/run_optimizers/test_fused_optimizer.py with @skipifRocm to skip test_bfloat16 in Adam.
      Co-authored-by: aspanday <aspanday@amd.com>
  3. 22 Jun, 2022 1 commit
  4. 25 Feb, 2021 1 commit
  5. 21 May, 2020 1 commit
  6. 12 May, 2020 1 commit
  7. 16 Aug, 2019 1 commit
    • clean up variance options support by all fused optimizers: · 18062b69
      Deyu Fu authored
      Do not apply bias correction to epsilon (same as the recent upstream change)
      Do not apply bias correction to weight decay (consistent with upstream AdamW)
      Add adam_w_mode to FusedAdam/LAMB to choose between L2 regularization and decoupled weight decay (Adam vs AdamW)
      Correct the documentation so that reg_inside_moment in FusedNovoGrad is described as distinct from adam_w_mode
      Remove legacy eps_mode from FusedAdam
      Make the internal math type float across fused optimizers
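The difference between the two modes named in this commit can be illustrated with a scalar Adam step. This is a minimal sketch that assumes the standard Adam and AdamW update rules; the function `adam_step` and its signature are hypothetical and are not the fused kernel's actual code. Note where epsilon and weight decay enter: epsilon is added after the bias-corrected square root, and in AdamW mode the decay term is added to the update without any bias correction.

```python
import math


def adam_step(p, g, m, v, step, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, adam_w_mode=True):
    """One scalar Adam/AdamW step (illustrative sketch, not the fused kernel)."""
    if not adam_w_mode:
        # L2 mode (classic Adam): fold the decay into the gradient, so it
        # also flows through the moment estimates.
        g = g + weight_decay * p

    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g

    # Bias correction applies to the moments only.
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)

    # Epsilon is added after bias correction, so it is not itself corrected.
    update = m_hat / (math.sqrt(v_hat) + eps)

    if adam_w_mode:
        # Decoupled weight decay (AdamW): added directly, not bias-corrected.
        update = update + weight_decay * p

    p = p - lr * update
    return p, m, v
```

With `weight_decay=0.0` the two modes coincide; with a nonzero decay, L2 mode has the decay rescaled by the adaptive denominator while AdamW mode applies it to the parameter directly.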
  8. 08 Aug, 2019 1 commit