Commits · 7a428776bb94a231af6111855a52afbb1b4d604c · OpenDAS / apex

23 Mar, 2023 1 commit

Add FusedLARS optimizer (#109) · 7a428776

luise.chen authored Mar 24, 2023

* Add fused_lars optimizer

* Update primitive fused_lars optimizer, working for resnet50 with NHWC/NCHW

* Add flow of using nesterov in FusedLARS

7a428776

09 Dec, 2021 2 commits

Add fused mixed precision lamb optimizer. (#1237) · d11ddccf

Kevin Stephano authored Dec 08, 2021

* Add fused mixed precision lamb optimizer.

* Fix device usage in constructor.

* Fix sending param_group tensor state to device.

* Remove unneeded device set.

d11ddccf

Add fused mixed precision lamb optimizer. (#1237) · 3c8f5161

Kevin Stephano authored Dec 08, 2021

* Add fused mixed precision lamb optimizer.

* Fix device usage in constructor.

* Fix sending param_group tensor state to device.

* Remove unneeded device set.

3c8f5161

01 Sep, 2021 1 commit

Seryilmaz/fuse norm into scale (#1149) · 4d190db6

Burc Eryilmaz authored Sep 01, 2021



* fuse norm into scale

* add fused norm into dlamb
Co-authored-by: Sukru Eryilmaz <seryilmaz@computelab-dgx1v-32.nvidia.com>

4d190db6

23 May, 2020 1 commit
- fix function signature · 2be773d3
  Kexin Yu authored May 23, 2020
  
  2be773d3
14 May, 2020 1 commit
- Add FusedAdagrad (#822) · 3bae8c83
  Andrew Tulloch authored May 14, 2020
  
  3bae8c83
30 Apr, 2020 1 commit
- fix function signature for LAMBStage2Functor · c8bcfff8
  Kexin Yu authored Apr 30, 2020
  
  c8bcfff8
28 Apr, 2020 1 commit
- LAMB: global grad clipping & more flexibility in adaptive lr · 5b300119
  Kexin Yu authored Apr 28, 2020
  
  5b300119
20 Aug, 2019 1 commit
- add back lamb stage1/2 to amp_C python · b9f0995b
  Deyu Fu authored Aug 20, 2019
  
  b9f0995b
16 Aug, 2019 2 commits

clean up variance options support by all fused optimizers: · 18062b69

Deyu Fu authored Aug 16, 2019

correctly not apply bias correction to epsilon(same as recent upstream change)
correctly not apply bias correction to weight decay(consistent with upstream AdamW)
Make adam_w_mode for FusedAdam/LAMB, to do L2 or Weight Decay (Adam vs AdamW)
Correct document reg_inside_moment differently from adam_w_mode in FusedNovoGrad
Removed legacy eps_mode from FusedAdam
Make internal math type float across fused optimizers

18062b69

add fused lamb, put lamb kernels into one file · c8f9cceb
Deyu Fu authored Aug 16, 2019

c8f9cceb

08 Aug, 2019 1 commit
- initial commit to make fused optimizers compatible with AMP · 690b1f71
  Deyu Fu authored Aug 08, 2019
  
  690b1f71
31 May, 2019 2 commits

Multi tensor lamb optimizer (#334) · 8be5b6be

Thor Johnsen authored May 31, 2019

* First draft, for discussion

* Fix mistakes in LAMB equations

* Add loop over chunk

* Bug fix

* Bug fix

* Bug fix

* Undo bug fix

* Bug fix

* Add multi tensor LAMB optimizer to setup.py

* Rename step_size to learning_rate

* Fix compilation errors

8be5b6be

Give multi-tensor L2 norm the ability to compute norms per-tensor as well as globally (#333) · 93338e62

mcarilli authored May 31, 2019

* Existing tests passing, still need to add per-tensor tests

* Test is passing, still need to measure performance

* ILP for l2norm functor

93338e62

10 May, 2019 1 commit
- materialize_master_weights for FusedSGD · c763f0fe
  Michael Carilli authored May 09, 2019
  
  c763f0fe
09 Apr, 2019 1 commit
- Simple cut of the kernel in place · e57f5d0e
  Michael Carilli authored Apr 09, 2019
  
  e57f5d0e
04 Apr, 2019 1 commit

WIP: Handle arbitrary combinations of optimizers/models/losses (#232) · 3f87614f

mcarilli authored Apr 03, 2019

* Refactor to allow more flexible treatment of multiple optimizers/models/losses

* Adding _process_optimizers.py

* Created L0 tests (now passing).

* fix: minor print typo (#234)

* make L1 results easier to read

* L0 multiple model/optimizer/loss test fleshed out

* Adding test that master params remain synced across distributed processes

* Docstring updates

* Docstring updates

3f87614f

19 Mar, 2019 1 commit
- Multi-tensor axpby kernel for more flexible unscaling (groundwork for #163 and #179 fix) · 5e552004
  Michael Carilli authored Mar 18, 2019
  
  5e552004
11 Mar, 2019 1 commit

Fix dispatch, add wd after momentum option · b265b0b5

Simon Layton authored Mar 11, 2019

Fix dispatch where we have a parameter group with multiple
combinations of types
Optionally apply weight decay after momentum

b265b0b5

10 Mar, 2019 1 commit
- Removing deprecated scale_check_overflow kernel · 8f53411a
  Michael Carilli authored Mar 10, 2019
  
  8f53411a
08 Mar, 2019 1 commit
- Fused multi-tensor SGD · cadad920
  Simon Layton authored Mar 08, 2019
```
Initial implementation, all fp32
Tested against torch.optim.sgd
```
  cadad920
19 Feb, 2019 1 commit
- Reworked multi tensor apply, added tests · 6763a8be
  Michael Carilli authored Feb 18, 2019
  
  6763a8be