Commits · 93338e624c57a8e565bc7c5da07753bfab683092 · OpenDAS / apex

31 May, 2019 1 commit

Give multi-tensor L2 norm the ability to compute norms per-tensor as well as globally (#333) · 93338e62

mcarilli authored May 31, 2019

* Existing tests passing, still need to add per-tensor tests

* Test is passing, still need to measure performance

* ILP for l2norm functor
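
The relationship between the per-tensor and global norms named in this commit title can be sketched in plain Python. This is only an illustration of the arithmetic, not the CUDA multi-tensor kernel; the function name `l2_norms` and the `per_tensor` flag are hypothetical, and tensors are modelled as flat lists of floats.

```python
import math

def l2_norms(tensors, per_tensor=False):
    """Return (global_norm, per_tensor_norms or None).

    Hypothetical helper: each "tensor" is a flat list of floats.
    """
    # Sum of squares for each tensor, computed once and reused.
    sq_sums = [sum(x * x for x in t) for t in tensors]
    # The global norm is the square root of the total sum of squares.
    global_norm = math.sqrt(sum(sq_sums))
    if per_tensor:
        return global_norm, [math.sqrt(s) for s in sq_sums]
    return global_norm, None

global_norm, norms = l2_norms([[3.0, 4.0], [12.0]], per_tensor=True)
```

Both results come from the same per-tensor sums of squares, so the global norm equals the L2 norm of the vector of per-tensor norms.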


28 May, 2019 1 commit
- Fix for https://github.com/NVIDIA/apex/issues/332 · a151575c
  Michael Carilli authored May 28, 2019
  
24 May, 2019 1 commit
- add backwards compatibility for PyTorch 0.4 for named_buffers (#331) · 14e34f7f
  ptrblck authored May 24, 2019
  
23 May, 2019 1 commit
- Changing error message · e6eec3ba
  Michael Carilli authored May 23, 2019
  
22 May, 2019 3 commits
- Hard error on Pytorch Cuda + Cuda toolkit version mismatch (#323) · 50689f6a
  mcarilli authored May 22, 2019
  
- Fixing second line for 321. · ccffa71c
  Michael Carilli authored May 22, 2019
  
- use value in assert statement (#321) · 9bd61cc1
  ptrblck authored May 22, 2019
  
21 May, 2019 1 commit

Enable LARC for use with amp (#306) · c490bd36

blisc authored May 20, 2019



* update larc
Signed-off-by: Jason <jasoli@nvidia.com>

* scale_loss fix
Signed-off-by: Jason <jasoli@nvidia.com>

* typo
Signed-off-by: Jason <jasoli@nvidia.com>

* revert LARC


17 May, 2019 2 commits

[syncbn update] (#287) · a5289067

jjsjann123 authored May 17, 2019

update input size check to fix github issue #262

update SyncBatchNorm count check so that size 1 input with cross GPU
synchronization runs fine.


[SyncBatchNorm update] (#285) · ffbb52ba

jjsjann123 authored May 17, 2019

resolves issue #254

Added input casting for the pure Python implementation; this supports mismatched
input and layer dtypes.


16 May, 2019 1 commit
- Support add_param_group (#310) · 4d325d2f
  mcarilli authored May 15, 2019
```
* Support add_param_group

* syntax

* Test added and passing
```
15 May, 2019 3 commits
- use verbose parameter to control print of grad overflow (#300) · cfb628ba
  Michael Glass authored May 15, 2019
  
- raise exception if cudnn is disabled (#305) · a3169768
  ptrblck authored May 15, 2019
  
- fix URLs in docs of apex.parallel (#309) · df099a4b
  ptrblck authored May 15, 2019
```
* fix URLs

* Update distributed.py
```
13 May, 2019 2 commits
- Adding docker build test for 1.1 container · f2b3a62c
  Michael Carilli authored May 13, 2019
  
- Fix for #302 · 54b8a852
  mcarilli authored May 13, 2019
  
09 May, 2019 1 commit
- Fix link to distributed samples (#298) · 4ff153cd
  Tim Zaman authored May 09, 2019
  
30 Apr, 2019 5 commits
- Clarifying docker launch · 39e153a3
  Michael Carilli authored Apr 30, 2019
  
- resolving delete conflicts · 86bd6c79
  Michael Carilli authored Apr 30, 2019
  
- Remove deprecated examples and update Docker guidance · d2ac4872
  Michael Carilli authored Apr 30, 2019
  
- Remove unused tensor in fast_collate (#281) · ca2baffb
  ptrblck authored Apr 30, 2019
```
* remove unused tens tensor in example/imagenet/main_amp.py

* remove unused tens tensor in deprecated examples and tests/L1
```
- Casting logic should reflatten RNN parameters · 03a25ba8
  mcarilli authored Apr 29, 2019
  
29 Apr, 2019 2 commits
- Adding warning for amp.scale_loss · 1b8303d8
  Michael Carilli authored Apr 29, 2019
  
- Warning for inception_v3 · 7b245dba
  Michael Carilli authored Apr 29, 2019
  
26 Apr, 2019 1 commit

Replace type().ScalarType() with scalar_type() (#272) · 855808f3

ptrblck authored Apr 26, 2019

* change .type().ScalarType() to .scalar_type() + at::ScalarType::X to at::kX

* revert scalar_type() to type() for AT_DISPATCH_FLOATING_TYPES_AND_HALF

* revert scalar_type() to type() in AT_DISPATCH_FLOATING_TYPES

* revert scalar_type() to type() for AT_DISPATCH_FLOATING_TYPES_AND_HALF in welford.cu

* revert scalar_type() to type() in layer_norm_cuda_kernel.cu

* revert at::kType  to at::ScalarType::Type

* use DISPATCH_FLOAT_AND_HALF to get rid of warnings

* add dispatch mechanisms for double+float and double+float+half


23 Apr, 2019 1 commit
- move and fix check_optimizers (#268) · 1c464b48
  ptrblck authored Apr 23, 2019
  
18 Apr, 2019 2 commits
- initial commit, add CUDA warning to check_params_fp32 (#263) · 28097c99
  ptrblck authored Apr 18, 2019
  
- Update README.md (#261) · cd2708cc
  Glenn Jocher authored Apr 18, 2019
  
16 Apr, 2019 1 commit
- Adding option to ensure that model outputs are a desired type · 0b5dd020
  Michael Carilli authored Apr 16, 2019
  
11 Apr, 2019 1 commit

prelu belongs in FP16_CASTS (#257) · 4dc711bc

henrymai authored Apr 11, 2019

The main use of these functions (e.g.: `torch.{conv*, prelu}`) is via their `torch.nn`
wrapping layers.

The `torch.nn` layers are what contain the weights and call into these lower level
functions with the weights as a parameter in their `forward()` method.

The `torch.conv*` functions are already in the `FP16_CASTS` list due to amp's philosophy of
casting the parameters rather than the model/layer weights.

Conceptually `torch.prelu` is the same as the `torch.conv*` case, where its weight parameter
is passed in from its wrapper layer `torch.nn.PReLU`.
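
The reasoning above can be illustrated with a toy sketch that assumes nothing about amp's actual internals. `FakeTensor`, `cast_args_to_half`, and `PReLULayer` are hypothetical stand-ins, not apex or PyTorch APIs; the point is only that casting happens on the arguments at call time, so the layer's stored weight is left untouched.

```python
import functools
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for a tensor: it only tracks a dtype label."""
    name: str
    dtype: str = "float32"

def cast_args_to_half(fn):
    """Wrap fn so every FakeTensor argument is replaced by a float16 copy."""
    @functools.wraps(fn)
    def wrapper(*args):
        cast = [replace(a, dtype="float16") if isinstance(a, FakeTensor) else a
                for a in args]
        return fn(*cast)
    return wrapper

def prelu(inp, weight):
    """Toy low-level function: report the dtypes it actually received."""
    return (inp.dtype, weight.dtype)

# Analogous to listing the function among the half-precision casts:
# the low-level function is patched, not the layer that owns the weight.
prelu = cast_args_to_half(prelu)

class PReLULayer:
    """Toy wrapper layer that owns the weight and passes it to prelu."""
    def __init__(self):
        self.weight = FakeTensor("weight")  # stored as float32

    def forward(self, inp):
        return prelu(inp, self.weight)

layer = PReLULayer()
seen_dtypes = layer.forward(FakeTensor("x"))
```

In this sketch the wrapped function sees float16 copies of both the input and the weight, while `layer.weight` itself remains float32.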


10 Apr, 2019 5 commits
- Merge pull request #256 from LamDang/master · 2c18651b
  ngimel authored Apr 10, 2019
```
quick fix: make FusedLayerNorm compatible with cpu
```
- add new tests to run_test.py · 6d40465a
  Lam Dang authored Apr 10, 2019
  
- quick fix: make FusedLayerNorm compatible with cpu · d130ec1f
  Lam Dang authored Apr 10, 2019
  
- Quick kernel to clean up l2norm · 683b6e0e
  Michael Carilli authored Apr 10, 2019
  
- Kernel + sizes stress test · 1a48b26b
  Michael Carilli authored Apr 09, 2019
  
09 Apr, 2019 1 commit
- Simple cut of the kernel in place · e57f5d0e
  Michael Carilli authored Apr 09, 2019
  
08 Apr, 2019 1 commit
- Fix for #246 · 03100f46
  Michael Carilli authored Apr 08, 2019
  
05 Apr, 2019 3 commits
- Merge branch 'master' of https://github.com/NVIDIA/apex · e9bbfa59
  Michael Carilli authored Apr 04, 2019
  
- docstring · 2eccdbd2
  Michael Carilli authored Apr 04, 2019
  
- delay_unscale is never necessary and generally discouraged, but should still work for some cases · 0750a757
  Michael Carilli authored Apr 04, 2019
  