Commits · 14fe2b67d96ea5c7791263baf7f1de2b29613055 · OpenDAS / apex

02 Jul, 2019 1 commit
- Update scaler.py · 14fe2b67
  mcarilli authored Jul 01, 2019
28 Jun, 2019 2 commits
- Merge pull request #383 from NVIDIA/lamb_add_fp16_support_update_term · ff6b8bb0
  Thor Johnsen authored Jun 28, 2019
```
Add support for fp16 update term (new UPD_T typename in template)
```
- Add support for fp16 update term (new UPD_T typename in template) · 3aeea0d8
  Thor Johnsen authored Jun 28, 2019
27 Jun, 2019 1 commit
- Clarifying documentation on gradient accumulation · 18f2eaee
  Michael Carilli authored Jun 26, 2019
25 Jun, 2019 1 commit
- grid_sample should be fp32 for now · 656d14b0
  Michael Carilli authored Jun 25, 2019
24 Jun, 2019 4 commits
- Merge branch 'master' of https://github.com/NVIDIA/apex · 9ce80178
  Michael Carilli authored Jun 24, 2019
- Docstring for multiple losses · f8557569
  Michael Carilli authored Jun 24, 2019
- Update README.md · f17cd953
  mcarilli authored Jun 24, 2019
- Updating gradient accumulation guidance · ca35aa79
  Michael Carilli authored Jun 24, 2019
21 Jun, 2019 2 commits
- Make main_amp.py more profiling-friendly · f29b3f8d
  Michael Carilli authored Jun 21, 2019
- Don't need to blacklist mean for pytorch >= 1.1 · 4b9858ec
  Michael Carilli authored Jun 21, 2019
20 Jun, 2019 1 commit
- Fix end-of-epoch with record_stream · 90e5b05a
  Michael Carilli authored Jun 20, 2019
19 Jun, 2019 2 commits
- Merge branch 'master' of https://github.com/NVIDIA/apex · 1ccaaf4a
  Michael Carilli authored Jun 18, 2019
- Fix for https://github.com/NVIDIA/apex/issues/361 · 68c850d3
  Michael Carilli authored Jun 18, 2019
18 Jun, 2019 2 commits
- Give custom to method a higher priority (#364) · cd6e46c2
  mcarilli authored Jun 18, 2019
- Fix rare caching allocator race condition in imagenet prefetcher · d5e2bb4b
  Michael Carilli authored Jun 17, 2019
17 Jun, 2019 1 commit
- More helpful message for unexpected opt_level · c3bcf18e
  Michael Carilli authored Jun 17, 2019
14 Jun, 2019 4 commits
- Merge branch 'master' of https://github.com/NVIDIA/apex · a9f5f711
  Michael Carilli authored Jun 14, 2019
- Removing gradient_average_split_factor · 41c98511
  Michael Carilli authored Jun 14, 2019
- Separate LDG/STG from compute loop (#359) · 121a2500
  Thor Johnsen authored Jun 13, 2019
- Adding delay_overflow_check=False ninja control point · 4a9c2a53
  Michael Carilli authored Jun 13, 2019
13 Jun, 2019 2 commits
- disable_allreduce -> _disable_allreduce · ae7f0def
  Michael Carilli authored Jun 13, 2019
- Add option to turn on/off allreduce in DDP (useful for gradient accumulation) (#356) · 1c2ba890
  Thor Johnsen authored Jun 13, 2019
11 Jun, 2019 1 commit
- Allow multi_tensor_lamb to update fp16 params · 47e3367f
  Michael Carilli authored Jun 11, 2019
07 Jun, 2019 1 commit
- Fix for https://github.com/NVIDIA/apex/issues/344 · 04667139
  Michael Carilli authored Jun 07, 2019
06 Jun, 2019 1 commit
- Making O1 the default opt level · 1dca16cc
  Michael Carilli authored Jun 06, 2019
04 Jun, 2019 1 commit
- Adding min_loss_scale and max_loss_scale arguments to amp.initialize · b82c6bd7
  Michael Carilli authored Jun 03, 2019
31 May, 2019 2 commits
- Multi tensor lamb optimizer (#334) · 8be5b6be
  Thor Johnsen authored May 31, 2019
  * First draft, for discussion
  * Fix mistakes in LAMB equations
  * Add loop over chunk
  * Bug fix
  * Bug fix
  * Bug fix
  * Undo bug fix
  * Bug fix
  * Add multi tensor LAMB optimizer to setup.py
  * Rename step_size to learning_rate
  * Fix compilation errors
- Give multi-tensor L2 norm the ability to compute norms per-tensor as well as globally (#333) · 93338e62
  mcarilli authored May 31, 2019
  * Existing tests passing, still need to add per-tensor tests
  * Test is passing, still need to measure performance
  * ILP for l2norm functor
28 May, 2019 1 commit
- Fix for https://github.com/NVIDIA/apex/issues/332 · a151575c
  Michael Carilli authored May 28, 2019
24 May, 2019 1 commit
- add backwards compatibility for PyTorch 0.4 for named_buffers (#331) · 14e34f7f
  ptrblck authored May 24, 2019
23 May, 2019 1 commit
- Changing error message · e6eec3ba
  Michael Carilli authored May 23, 2019
22 May, 2019 3 commits
- Hard error on Pytorch Cuda + Cuda toolkit version mismatch (#323) · 50689f6a
  mcarilli authored May 22, 2019
- Fixing second line for 321. · ccffa71c
  Michael Carilli authored May 22, 2019
- use value in assert statement (#321) · 9bd61cc1
  ptrblck authored May 22, 2019
21 May, 2019 1 commit
- Enable LARC for use with amp (#306) · c490bd36
  blisc authored May 20, 2019
  * update larc
    Signed-off-by: Jason <jasoli@nvidia.com>
  * scale_loss fix
    Signed-off-by: Jason <jasoli@nvidia.com>
  * typo
    Signed-off-by: Jason <jasoli@nvidia.com>
  * revert LARC
17 May, 2019 2 commits
- [syncbn update] (#287) · a5289067
  jjsjann123 authored May 17, 2019

  update input size check to fix github issue #262

  update SyncBatchNorm count check so that size 1 input with cross GPU
  synchronization runs fine.
- [SyncBatchNorm update] (#285) · ffbb52ba
  jjsjann123 authored May 17, 2019

  resolves issue #254

  Added input casting for pure python implementation, this supports mismatched
  input and layer dtype.
16 May, 2019 1 commit
- Support add_param_group (#310) · 4d325d2f
  mcarilli authored May 15, 2019
```
* Support add_param_group

* syntax

* Test added and passing
```
15 May, 2019 1 commit
- use verbose parameter to control print of grad overflow (#300) · cfb628ba
  Michael Glass authored May 15, 2019