1. 06 Jul, 2020 1 commit
    • [sync BN] (#792) · 1ff54b8f
      jjsjann123 authored
      * [sync BN]
      
      Support non-uniform batch sizes across the process group.
      
      TODO: tests should be added once this is cleaned up.
      
      * updating unit tests
      
      * new unit tests for different inputs
      
      * cleaning
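      A minimal sketch of the capability this commit describes, assuming a
      torch.distributed process group is already initialized and apex is built
      with its CUDA extensions; the layer and batch sizes are illustrative:

          import torch
          from apex.parallel import convert_syncbn_model

          # Swap ordinary BatchNorm layers for apex's synchronized version.
          model = torch.nn.Sequential(
              torch.nn.Conv2d(3, 8, kernel_size=3),
              torch.nn.BatchNorm2d(8),
          ).cuda()
          model = convert_syncbn_model(model)

          # After this change, each rank may contribute a different batch size;
          # statistics are still reduced correctly across the process group.
          rank = torch.distributed.get_rank()
          local_batch = 2 + rank  # illustrative non-uniform batch size
          x = torch.randn(local_batch, 3, 16, 16, device='cuda')
          out = model(x)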
  2. 23 Jun, 2020 3 commits
  3. 14 May, 2020 1 commit
  4. 30 Apr, 2020 1 commit
    • Improvements to apex.mlp (#804) · 31aceeaa
      Deyu Fu authored
      * update fused bias relu backward kernel
      
      * add support for not requiring the first layer's dgrad
      
      * fix bug: wrong layer used in the requires-grad check
      
      * add infrastructure for optional bias and activation; currently only no bias and no relu are supported
      
      * make bias and relu optional separately
      
      * add sigmoid activation option
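      A hedged sketch of the extended interface suggested by these notes; the
      keyword names below (bias, activation) follow the commit messages but may
      not match the merged signature exactly:

          import torch
          from apex.mlp import MLP

          # Bias and activation are now independently optional; 'sigmoid' is a
          # new activation choice alongside relu and none (names assumed).
          mlp = MLP([784, 1024, 10], bias=False, activation='sigmoid').cuda().half()

          x = torch.randn(32, 784, device='cuda', dtype=torch.half)
          y = mlp(x)  # fused forward through all layers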
  5. 22 Apr, 2020 2 commits
    • Fix LARC with mixed precision (#793) · 2ec84ebd
      Vinicius Reis authored
      The LARC optimizer wraps an underlying optimizer and then needs to be passed
      to amp.initialize for mixed precision. Three different crashes could occur in
      this situation; this fixes all of them and adds a unit test.
      
      I don't know whether the 'LARC' in sys.modules check ever worked; in my setup,
      the sys.modules entry is 'apex.parallel.LARC'. Checking whether the variable
      is defined seems more reliable.
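      A minimal sketch of the now-working pattern, assuming a CUDA build of
      apex; the model and hyperparameters are placeholders:

          import torch
          from apex import amp
          from apex.parallel.LARC import LARC

          model = torch.nn.Linear(10, 10).cuda()
          base_optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

          # LARC wraps the base optimizer; the wrapper is what amp.initialize sees.
          optimizer = LARC(base_optimizer)
          model, optimizer = amp.initialize(model, optimizer, opt_level='O2')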
  6. 31 Mar, 2020 1 commit
  7. 27 Feb, 2020 1 commit
  8. 06 Nov, 2019 1 commit
  9. 03 Oct, 2019 1 commit
  10. 06 Sep, 2019 1 commit
    • Fix for #456 (#477) · 325f5a0b
      mcarilli authored
      * Pushing for build tests
      
      * Contrib files
      
      * Removing deprecated checks
  11. 03 Sep, 2019 1 commit
    • Fix issues in fused_adam (#469) · 7fa74925
      Deyu Fu authored
      * move import of amp_C to __init__()
      
      * make separate fp16/fp32 lists to support mixed param types; disable the double test
      
      * make zero_grad consistent between adam/novograd/lamb
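      A hedged sketch of what the fixes cover, assuming FusedAdam accepts a
      mixed-dtype parameter list directly as the separate-lists change implies;
      the tensors are placeholders:

          import torch
          from apex.optimizers import FusedAdam

          # Mixed fp16/fp32 parameters in one optimizer.
          p16 = torch.nn.Parameter(torch.zeros(10, device='cuda', dtype=torch.half))
          p32 = torch.nn.Parameter(torch.zeros(10, device='cuda'))
          optimizer = FusedAdam([p16, p32], lr=1e-3)

          (p16.float().sum() + p32.sum()).backward()
          optimizer.step()
          optimizer.zero_grad()  # now consistent with novograd/lamb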
  12. 27 Aug, 2019 1 commit
    • Enable Checkpointing (#420) · dec4fdd6
      ptrblck authored
      * add state_dict, load_state_dict
      
      * add test_restoring, test_loss_scale_decrease
      
      * disable amp outputs for checkpoint tests
      
      * add test for amp.state_dict, cleanup
      
      * add state_dict patch, add test
      
      * fixed testing, cleanup
      
      * add readme for checkpointing
      
      * add docs to source/amp
      
      * add review changes to doc
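      The usage pattern this PR enables, following the checkpointing section it
      adds to the docs; the model, path, and hyperparameters are placeholders:

          import torch
          from apex import amp

          model = torch.nn.Linear(10, 10).cuda()
          optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
          model, optimizer = amp.initialize(model, optimizer, opt_level='O1')

          # Save amp state alongside the usual model/optimizer state.
          checkpoint = {
              'model': model.state_dict(),
              'optimizer': optimizer.state_dict(),
              'amp': amp.state_dict(),
          }
          torch.save(checkpoint, 'amp_checkpoint.pt')

          # Restore (after calling amp.initialize with the same opt_level).
          checkpoint = torch.load('amp_checkpoint.pt')
          model.load_state_dict(checkpoint['model'])
          optimizer.load_state_dict(checkpoint['optimizer'])
          amp.load_state_dict(checkpoint['amp'])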
  13. 17 Aug, 2019 1 commit
  14. 15 Aug, 2019 1 commit
  15. 13 Aug, 2019 2 commits
  16. 12 Aug, 2019 1 commit
  17. 08 Aug, 2019 1 commit
  18. 06 Aug, 2019 1 commit
    • Clean up layer norm tests (#418) · 3ef01fae
      ngimel authored
      * Bug fix for non-affine layer-norm + add backward unit test
      
      * clean up tests and add tests for a large batch
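      A minimal sketch exercising the fixed non-affine path; the sizes are
      illustrative:

          import torch
          from apex.normalization import FusedLayerNorm

          # elementwise_affine=False (no learnable weight/bias) is the case
          # the bug fix targets.
          ln = FusedLayerNorm(512, elementwise_affine=False).cuda()

          x = torch.randn(64, 512, device='cuda', requires_grad=True)
          y = ln(x)
          y.sum().backward()  # the backward pass is what the new unit test covers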
  19. 26 Jul, 2019 1 commit
    • [sbn update] (#384) · 896ecdd6
      jjsjann123 authored
      fixing the empty return from the Python implementation;
      adding a proper test to verify functional correctness of the Python implementation
  20. 12 Jul, 2019 1 commit
    • [sbn update] (#384) · 574fe244
      jjsjann123 authored
      fixing the empty return from the Python implementation;
      adding a proper test to verify functional correctness of the Python implementation
  21. 03 Jul, 2019 3 commits
  22. 31 May, 2019 1 commit
  23. 27 May, 2019 2 commits
  24. 16 May, 2019 1 commit
  25. 13 May, 2019 1 commit
  26. 02 May, 2019 1 commit
  27. 01 May, 2019 1 commit
  28. 30 Apr, 2019 1 commit
  29. 10 Apr, 2019 3 commits
  30. 04 Apr, 2019 1 commit
    • WIP: Handle arbitrary combinations of optimizers/models/losses (#232) · 3f87614f
      mcarilli authored
      * Refactor to allow more flexible treatment of multiple optimizers/models/losses
      
      * Adding _process_optimizers.py
      
      * Created L0 tests (now passing).
      
      * fix: minor print typo (#234)
      
      * make L1 results easier to read
      
      * L0 multiple model/optimizer/loss test fleshed out
      
      * Adding test that master params remain synced across distributed processes
      
      * Docstring updates
      
      * Docstring updates
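      A sketch of the resulting API, following apex's advanced-use documentation
      for multiple models/optimizers/losses; the models, sizes, and losses are
      placeholders:

          import torch
          from apex import amp

          model0 = torch.nn.Linear(10, 10).cuda()
          model1 = torch.nn.Linear(10, 10).cuda()
          opt0 = torch.optim.SGD(model0.parameters(), lr=1e-3)
          opt1 = torch.optim.SGD(model1.parameters(), lr=1e-3)

          # Lists of models/optimizers, with a loss-scaler per loss.
          [model0, model1], [opt0, opt1] = amp.initialize(
              [model0, model1], [opt0, opt1], opt_level='O1', num_losses=2)

          x = torch.randn(4, 10, device='cuda')
          loss0 = model0(x).sum()
          loss1 = model1(x).sum()

          # loss_id selects the scaler for each loss.
          with amp.scale_loss(loss0, opt0, loss_id=0) as scaled_loss:
              scaled_loss.backward()
          with amp.scale_loss(loss1, opt1, loss_id=1) as scaled_loss:
              scaled_loss.backward()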
  31. 22 Mar, 2019 1 commit
    • Check cuda version (#216) · 5b8faa29
      mcarilli authored
      * Adding Torch + bare-metal nvcc version check and container build tests
      
      * Putting a canary in the coalmine
      
      * canary proved elusive
      
      * Trying direct setup.py install
      
      * this should work
      
      * Removing canary
      
      * hopefully this works
      5b8faa29