Commits · d24c25b92ad05281f018aa489122756afdf3607d · OpenDAS / apex

28 Feb, 2019 1 commit
- Comprehensive tests for cross product of options · d24c25b9
  Michael Carilli authored Feb 27, 2019
26 Feb, 2019 1 commit
- No need for casts during optimizer step · 613997ea
  Michael Carilli authored Feb 26, 2019
25 Feb, 2019 1 commit
- Fix for unscale usage in fp16_utils.FP16_Optimizer · ed8236fa
  Michael Carilli authored Feb 25, 2019
24 Feb, 2019 1 commit
- Stashing work · d137b800
  Michael Carilli authored Feb 24, 2019
22 Feb, 2019 1 commit
- Allow multi-tensor unscale to handle FP16 output, so it can also be used for... · 80a3f3ca
  Michael Carilli authored Feb 21, 2019
```
Allow multi-tensor unscale to handle FP16 output, so it can also be used for copy-scatter. Rename some options.
```
19 Feb, 2019 2 commits
- Adding small python wrapper for multi_tensor_apply · 4cc1c1b4
  Michael Carilli authored Feb 18, 2019
- Reworked multi tensor apply, added tests · 6763a8be
  Michael Carilli authored Feb 18, 2019
13 Feb, 2019 1 commit
- New API tentatively works on resnet50, ready for stress testing. · 889d1712
  Michael Carilli authored Feb 12, 2019
11 Feb, 2019 1 commit
- Stashing work · fad78c16
  Michael Carilli authored Feb 10, 2019
08 Feb, 2019 2 commits
- Merge branch 'master' into api_refactor · 8db3f95c
  Michael Carilli authored Feb 08, 2019
- stashing work · 1f693b92
  Michael Carilli authored Feb 08, 2019
06 Feb, 2019 7 commits
- Merge pull request #144 from jma127/master · 1b903852
  ngimel authored Feb 06, 2019
```
Better FP16 support in pytorch fp16 utils.
```
- Some documentation cleanup · b2f63c48
  Michael Carilli authored Feb 06, 2019
- Merge branch 'master' into api_refactor · 2cbca1a4
  Michael Carilli authored Feb 06, 2019
- Tests for the fused downscale kernel · 340e71a4
  Michael Carilli authored Feb 05, 2019
- Merge branch 'new_downscale_kernel' · 8818ba9e
  Michael Carilli authored Feb 05, 2019
- Tests and resnet50 example work · a5bc76db
  Michael Carilli authored Feb 05, 2019
- ready for testing · 6e9159d8
  Michael Carilli authored Feb 05, 2019
05 Feb, 2019 8 commits
- Better FP16 support in pytorch fp16 utils. · 713e0fb8
  Jerry Ma authored Feb 01, 2019
```
This commit adds an FP16Model class as a successor to network_to_half.

The benefits of this class are:

- Preservation of single-precision for BatchNorm layers. The models
  generated by network_to_half() convert BatchNorm moment tensors to
  half-precision, then back to single-precision, which hurts the
  accuracy of the moment estimators and occasionally results in NaNs.
- Support for multi-argument nn.Modules (self-explanatory from code).
```
- Merge branch 'master' of https://github.com/NVIDIA/apex · 9288ba5c
  Michael Carilli authored Feb 05, 2019
- Removing patching of loss.backward, which appears to cause memory leaks... · a11c45a4
  Michael Carilli authored Feb 05, 2019
```
Removing patching of loss.backward, which appears to cause memory leaks (reference cycles?) in some models
```
- New downscale kernel is working but not perf tested · 337056c1
  Michael Carilli authored Feb 05, 2019
- Merge pull request #123 from donglixp/patch-1 · 57ad1840
  mcarilli authored Feb 05, 2019
```
apex.optimizers.FP16_Optimizer: add state_dict() and load_state_dict()
```
- Merge pull request #146 from NVIDIA/restore_fused_kernel · 45537d34
  mcarilli authored Feb 04, 2019
```
Restore fused kernel
```
- Only warn once in LossScaler constructor · 03b0eeb8
  Michael Carilli authored Feb 04, 2019
- FP16 grad downscale (which shouldn't happen in user code) fallback + warning · a153c41a
  Michael Carilli authored Feb 04, 2019
04 Feb, 2019 2 commits
- Merge pull request #143 from NVIDIA/sbn_no_affine · d81ed26d
  mcarilli authored Feb 04, 2019
```
allowing syncBN to run with affine = False
```
- Restoring fused inf/nan check + downscale kernel · fd03f26a
  Michael Carilli authored Feb 03, 2019
03 Feb, 2019 1 commit
- Lazy imports to reduce error spam · 48299b0d
  Michael Carilli authored Feb 02, 2019
01 Feb, 2019 8 commits
- async->non_blocking, module-specific logging · cc85a2e5
  Michael Carilli authored Feb 01, 2019
- Stashing work · a9a3fe57
  Michael Carilli authored Feb 01, 2019
- Making note of loss scaling in README · 859f528b
  Michael Carilli authored Feb 01, 2019
- Merge branch 'master' of https://github.com/NVIDIA/apex · ae5982cb
  Michael Carilli authored Feb 01, 2019
- Making static loss scale the default, and clipping master grads when running with --fp16 · 43522e63
  Michael Carilli authored Feb 01, 2019
- Update README.md · 33512f93
  mcarilli authored Jan 31, 2019
- Update README.md · b83e38a6
  mcarilli authored Jan 31, 2019
- allowing syncBN to run with affine = False · 223a47e9
  jiej authored Jan 31, 2019
31 Jan, 2019 2 commits
- Merging in master · aed3086a
  Michael Carilli authored Jan 31, 2019
- Removing spurious references to Penn Tree Bank results · b5465fe6
  Michael Carilli authored Jan 31, 2019
30 Jan, 2019 1 commit
- Merge pull request #142 from NVIDIA/update_word_language_model · 9041a868
  mcarilli authored Jan 30, 2019
```
Update default dims in word_language_model to be multiples of 8 to enable Tensor Core use
```