- 15 May, 2019 3 commits
  - Michael Glass authored
  - ptrblck authored
  - ptrblck authored
    * fix URLs
    * Update distributed.py
- 13 May, 2019 2 commits
  - Michael Carilli authored
  - mcarilli authored
- 09 May, 2019 1 commit
  - Tim Zaman authored
- 30 Apr, 2019 5 commits
  - Michael Carilli authored
  - Michael Carilli authored
  - Michael Carilli authored
  - ptrblck authored
    * remove unused tens tensor in example/imagenet/main_amp.py
    * remove unused tens tensor in deprecated examples and tests/L1
  - mcarilli authored
- 29 Apr, 2019 2 commits
  - Michael Carilli authored
  - Michael Carilli authored
- 26 Apr, 2019 1 commit
  - ptrblck authored
    * change .type().ScalarType() to .scalar_type() + at::ScalarType::X to at::kX
    * revert scalar_type() to type() for AT_DISPATCH_FLOATING_TYPES_AND_HALF
    * revert scalar_type() to type() in AT_DISPATCH_FLOATING_TYPES
    * revert scalar_type() to type() for AT_DISPATCH_FLOATING_TYPES_AND_HALF in welford.cu
    * revert scalar_type() to type() in layer_norm_cuda_kernel.cu
    * revert at::kType to at::ScalarType::Type
    * use DISPATCH_FLOAT_AND_HALF to get rid of warnings
    * add dispatch mechanisms for double+float and double+float+half
- 23 Apr, 2019 1 commit
  - ptrblck authored
- 18 Apr, 2019 2 commits
  - ptrblck authored
  - Glenn Jocher authored
- 16 Apr, 2019 1 commit
  - Michael Carilli authored
- 11 Apr, 2019 1 commit
  - henrymai authored
    The main use of these functions (e.g. `torch.{conv*, prelu}`) is via their `torch.nn` wrapper layers. The `torch.nn` layers hold the weights and call into these lower-level functions, passing the weights as a parameter from their `forward()` method. The `torch.conv*` functions are already in the `FP16_CASTS` list because of amp's philosophy of casting the parameters rather than the model/layer weights. Conceptually, `torch.prelu` is the same as the `torch.conv*` case: its weight parameter is passed in from its wrapper layer `torch.nn.PReLU`.
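The commit note above describes amp's approach for functions on its `FP16_CASTS` list: cast the arguments at the function boundary rather than converting the stored layer weights. A minimal, torch-free sketch of that pattern; the `Tensor` class, the `fp16_cast` decorator, and the toy `prelu` are all hypothetical illustrations, not apex's actual implementation:

```python
import functools

FP16_CASTS = []  # names of functions whose arguments are cast on entry


class Tensor:
    """Toy stand-in for a real tensor: a payload plus a dtype tag."""
    def __init__(self, data, dtype="fp32"):
        self.data = data
        self.dtype = dtype

    def half(self):
        return Tensor(self.data, "fp16")


def fp16_cast(fn):
    """Register fn and wrap it so every Tensor argument arrives as fp16."""
    FP16_CASTS.append(fn.__name__)

    @functools.wraps(fn)
    def wrapper(*args):
        cast = [a.half() if isinstance(a, Tensor) else a for a in args]
        return fn(*cast)

    return wrapper


@fp16_cast
def prelu(x, weight):
    # The wrapper layer (think torch.nn.PReLU) owns `weight` in fp32 and
    # passes it in from forward(); the decorator casts it before we see it.
    w = weight.data[0]
    return Tensor([v if v >= 0 else w * v for v in x.data], x.dtype)


out = prelu(Tensor([1.0, -2.0]), Tensor([0.25]))
# out.dtype is "fp16"; negative inputs are scaled by the shared weight
```

The point of the registry is that the weights themselves stay in full precision in the module, so optimizer updates are unaffected; only the values crossing the function boundary are halved.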
- 10 Apr, 2019 5 commits
  - ngimel authored
    quick fix: make FusedLayerNorm compatible with cpu
  - Lam Dang authored
  - Lam Dang authored
  - Michael Carilli authored
  - Michael Carilli authored
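The "make FusedLayerNorm compatible with cpu" fix above amounts to routing CPU inputs to a non-fused path instead of a CUDA-only kernel. A hedged, torch-free sketch of that dispatch shape, where the `Tensor` class and both function names are illustrative rather than apex's real code:

```python
import math


class Tensor:
    """Toy tensor: a data list plus a device flag (illustrative only)."""
    def __init__(self, data, is_cuda=False):
        self.data = data
        self.is_cuda = is_cuda


def layer_norm_reference(x, eps=1e-5):
    """Plain layer norm: subtract the mean, divide by the std."""
    mean = sum(x.data) / len(x.data)
    var = sum((v - mean) ** 2 for v in x.data) / len(x.data)
    inv = 1.0 / math.sqrt(var + eps)
    return Tensor([(v - mean) * inv for v in x.data], x.is_cuda)


def fused_layer_norm(x, eps=1e-5):
    if not x.is_cuda:
        # CPU input: fall back to the reference path rather than
        # invoking a CUDA-only fused kernel, which would fail here.
        return layer_norm_reference(x, eps)
    raise NotImplementedError("fused CUDA kernel not modeled in this sketch")
```

Dispatching on the input's device keeps a single module usable in both settings, at the cost of the fused kernel's speed on CPU.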
- 09 Apr, 2019 1 commit
  - Michael Carilli authored
- 08 Apr, 2019 1 commit
  - Michael Carilli authored
- 05 Apr, 2019 3 commits
  - Michael Carilli authored
  - Michael Carilli authored
- 04 Apr, 2019 3 commits
  - ngimel authored
    Run interpolation in fp32 because it's faster
  - Marek Kolodziej authored
  - mcarilli authored
    * Refactor to allow more flexible treatment of multiple optimizers/models/losses
    * Adding _process_optimizers.py
    * Created L0 tests (now passing)
    * fix: minor print typo (#234)
    * make L1 results easier to read
    * L0 multiple model/optimizer/loss test fleshed out
    * Adding test that master params remain synced across distributed processes
    * Docstring updates
    * Docstring updates
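The refactor above generalizes the entry point to handle a single model/optimizer/loss or lists of each. One common way to write that normalization, shown as a hedged sketch; `initialize`, `_to_list`, and `_unwrap` are hypothetical names, not apex's API:

```python
def _to_list(x):
    """Wrap a single object in a list; pass lists/tuples through."""
    return list(x) if isinstance(x, (list, tuple)) else [x]


def _unwrap(original, processed):
    """Return a list only if the caller passed one in."""
    return processed if isinstance(original, (list, tuple)) else processed[0]


def initialize(models, optimizers):
    model_list = _to_list(models)
    optimizer_list = _to_list(optimizers)
    # ... per-model casting and per-optimizer patching would go here,
    # always iterating over lists so one code path serves both cases ...
    return _unwrap(models, model_list), _unwrap(optimizers, optimizer_list)


# A single object round-trips as a single object:
m, o = initialize("net", "sgd")
# Lists round-trip as lists, independently per argument:
ms, os_ = initialize(["net1", "net2"], "sgd")
```

Normalizing to lists at the boundary keeps the internal patching logic free of single-vs-multiple special cases while preserving the caller's calling convention.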
- 03 Apr, 2019 1 commit
  - mcarilli authored
- 01 Apr, 2019 1 commit
  - jjsjann123 authored
    Fix a typo in optimized_sync_batchnorm_kernel.py
- 31 Mar, 2019 1 commit
  - Bingchen Zhao authored
    In line 54, `running_var` should be `running_variance`.
- 27 Mar, 2019 2 commits
- 26 Mar, 2019 2 commits
  - Michael Carilli authored
  - Michael Carilli authored
- 23 Mar, 2019 1 commit
  - Cubbee authored