Commits · ef209a74e6359a5af5cf357cc9885da6ab10cd3b · OpenDAS / apex

09 Dec, 2020 1 commit
- update setup file for rocm due to newer hipify changes · ef209a74
  lcskrishna authored Dec 08, 2020
  
  ef209a74
04 Nov, 2020 1 commit

Fix LayerNorm op on ROCm (#36) · 7eed38aa

Ashish Farmer authored Nov 04, 2020

* fix warp size in WARP_SHFL* in layernorm

* enable fused_layer_norm tests on ROCm

7eed38aa

21 Aug, 2020 1 commit
- update readme with ninja build instruction and pip3.6 install (#35) · e9c43d67
  Chaitanya Sri Krishna Lolla authored Aug 21, 2020
  
  e9c43d67
18 Aug, 2020 1 commit

[contrib] Support for xentropy extension. (#34) · 3344233f

Chaitanya Sri Krishna Lolla authored Aug 18, 2020

* enable deprecated fused adam optimizer

* enable deprecated fused lamb

* enable xentropy extension

* add warpsize 32 for nv and 64 for amd

* update compiler arguments

* update the syncwarp conditions

* update syncwarp condition

3344233f
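
The warp-size item in the commit above (32 lanes on NVIDIA, 64 on AMD) matters for any shuffle-based reduction. The pure-Python simulation below is only an illustration and is not code from this repository: `warp_reduce_sum` and its parameters are hypothetical names, and the shuffle-down tree reduction is a generic pattern assumed for the sketch.

```python
def warp_reduce_sum(lane_values, assumed_warp_size):
    """Simulate a shuffle-down tree reduction and return lane 0's result.

    lane_values: one value per hardware lane (32 on NVIDIA, 64 on AMD).
    assumed_warp_size: the width the kernel code was written for (hypothetical knob).
    """
    lanes = list(lane_values)
    hw_width = len(lanes)
    offset = assumed_warp_size // 2
    while offset > 0:
        # All lanes read lane i + offset simultaneously; a lane whose source
        # is out of range gets its own value back.
        lanes = [
            lanes[i] + (lanes[i + offset] if i + offset < hw_width else lanes[i])
            for i in range(hw_width)
        ]
        offset //= 2
    return lanes[0]


values = list(range(64))                 # one value per lane of a 64-wide wavefront
correct = warp_reduce_sum(values, 64)    # reduction sized for the hardware
truncated = warp_reduce_sum(values, 32)  # width hard-coded to 32
```

In this simulation `correct` equals `sum(values)`, while `truncated` covers only the first 32 lanes, which is the kind of mismatch a per-platform warp size avoids.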

17 Aug, 2020 1 commit

[contrib] Support optimizers on rocm. (#33) · 17fbbf91

Chaitanya Sri Krishna Lolla authored Aug 17, 2020

* enable deprecated fused adam optimizer

* enable deprecated fused lamb

* reset the compiler arguments

* syntax error

* aligning the compiler arguments

17fbbf91

05 Aug, 2020 1 commit

Enable mlp_cuda extension. (#28) · d2f6d04a

Chaitanya Sri Krishna Lolla authored Aug 05, 2020

* enable mlp cuda

* add setup changes and tests

* skip the unit tests

* updated conditions for empty array

* removed hip platform conditions

d2f6d04a

01 Aug, 2020 1 commit
- Merge pull request #30 from lcskrishna/ifu_07272020 · 4116ed66
  Ashish Farmer authored Jul 31, 2020
```
IFU-master 07/27/2020.
```
  4116ed66
31 Jul, 2020 2 commits
- Merge branch 'master' of https://github.com/ROCmSoftwarePlatform/apex into ifu_07272020 · 1c664582
  lcskrishna authored Jul 31, 2020
  
  1c664582
- skipping bfloat16 mgpu tests (#32) · 8dd19e3b
  Chaitanya Sri Krishna Lolla authored Jul 31, 2020
  
  8dd19e3b
27 Jul, 2020 1 commit
- Merge remote-tracking branch 'rocm_upstream/master' into ifu_07272020 · 6f7a8b39
  lcskrishna authored Jul 27, 2020
  
  6f7a8b39
23 Jul, 2020 1 commit
- Merge pull request #918 from a-maci/ASP_sparse_param_dict_update · 459de22d
  Thor Johnsen authored Jul 23, 2020
```
Asp sparse param dict update
```
  459de22d
22 Jul, 2020 2 commits

Accept custom (layer type:param name) to include in sparse_parameter dictionary · b3c16411

Asit authored Jul 22, 2020

1. Support including a user-supplied custom layer type and its parameter name in sparse_parameter_list. This is useful when users have their own implementation of nn.Linear or nn.Conv2d. For example, the huggingface repo has a custom implementation of nn.Linear called LinearActivation.
2. Print info about layers in the model that are not pruned.

b3c16411
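
The idea described in b3c16411 can be sketched as a lookup from layer type to parameter names that the user is allowed to extend. This is a minimal sketch under stated assumptions, not the ASP implementation: `collect_sparse_parameters`, `custom_layer_dict`, and the stand-in layer classes are hypothetical names used only for illustration.

```python
class Linear:
    """Stand-in for a built-in linear layer (hypothetical, no torch needed)."""
    def __init__(self):
        self.weight = [[0.0]]


class LinearActivation:
    """Stand-in for a user-defined linear layer (hypothetical)."""
    def __init__(self):
        self.weight = [[0.0]]


class LayerNorm:
    """Stand-in for a layer type that is not pruned (hypothetical)."""
    def __init__(self):
        self.weight = [1.0]


def collect_sparse_parameters(named_layers, whitelist, custom_layer_dict=None):
    """Return (selected, skipped) for a list of (name, layer) pairs.

    whitelist / custom_layer_dict map a layer type to its prunable parameter names.
    """
    eligible = dict(whitelist)
    eligible.update(custom_layer_dict or {})  # user-supplied types extend the defaults
    selected, skipped = [], []
    for layer_name, layer in named_layers:
        param_names = eligible.get(type(layer))
        if param_names is None:
            skipped.append(layer_name)  # reported as "not pruned"
            continue
        for p_name in param_names:
            selected.append((layer_name, p_name, getattr(layer, p_name)))
    return selected, skipped


layers = [("fc", Linear()), ("act_fc", LinearActivation()), ("ln", LayerNorm())]
selected, skipped = collect_sparse_parameters(
    layers,
    whitelist={Linear: ["weight"]},
    custom_layer_dict={LinearActivation: ["weight"]},
)
for name in skipped:
    print(f"layer not pruned: {name}")
```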

Merge pull request #3 from NVIDIA/master · eb95950d
Asit authored Jul 22, 2020
```
Merge pull request #917 from a-maci/master
```
eb95950d

21 Jul, 2020 1 commit
- Merge pull request #917 from a-maci/master · 0ac5dd62
  Thor Johnsen authored Jul 20, 2020
```
Fixing the case when grads are None
```
  0ac5dd62
20 Jul, 2020 3 commits
- Merge pull request #2 from a-maci/a-maci-patch-1 · 089149d3
  Asit authored Jul 20, 2020
```
Fixing mask multiplication with grad tensors
```
  089149d3
- Fixing mask multiplication with grad tensors · 774de913
  Asit authored Jul 20, 2020
```
Grads can be None. This fix skips multiplication with masks in that case.
```
  774de913
- Merge pull request #1 from NVIDIA/master · 3dd36070
  Asit authored Jul 20, 2020
```
Updating my repo
```
  3dd36070
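
The guard described in 774de913 (skip the mask multiplication when a gradient is None) can be sketched as follows. This is an illustration only, assuming a hypothetical `Param` container with plain Python lists in place of tensors; it is not the code from that commit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Param:
    """Hypothetical stand-in for a parameter that carries a sparsity mask."""
    mask: List[float]
    grad: Optional[List[float]] = None


def mask_gradients(params):
    """Multiply each gradient by its mask, skipping parameters without a gradient."""
    for p in params:
        if p.grad is None:
            continue  # nothing to mask, e.g. a parameter unused in this step
        p.grad = [g * m for g, m in zip(p.grad, p.mask)]


params = [
    Param(mask=[1.0, 0.0], grad=[0.5, 0.25]),
    Param(mask=[0.0, 1.0]),  # gradient was never populated
]
mask_gradients(params)
```

Without the `None` check, the second parameter would raise a `TypeError` when the multiplication is attempted.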
16 Jul, 2020 2 commits
- Merge pull request #910 from szmigacz/smigacz/mha_xavier_init_gain_fix · 3104fd59
  Thor Johnsen authored Jul 16, 2020
```
Fixed weight init for fused weight matrices in fused MHA by adding correct gain factor
```
  3104fd59
- Merge pull request #904 from ksivaman/sparsity · 4027bcba
  Thor Johnsen authored Jul 16, 2020
```
Fixed variable name
```
  4027bcba
10 Jul, 2020 1 commit

Enable sync batchnorm extension. (#27) · 9c80f6d3

Chaitanya Sri Krishna Lolla authored Jul 10, 2020

* Enable sync batchnorm

* enable syncbn properly

* update the unit tests

* update tests

* update conditions for welford_merge_element

* updated conditions based on comments.

9c80f6d3

09 Jul, 2020 1 commit
- Fixed weight init for fused weight matrices in fused MHA by adding correct gain factor. · a0d99fdb
  Szymon Migacz authored Jul 09, 2020
  
  a0d99fdb
08 Jul, 2020 1 commit
- Merge pull request #26 from lcskrishna/cl/ifu_07072020 · 33a3a667
  Chaitanya Sri Krishna Lolla authored Jul 08, 2020
```
IFU-07072020
```
  33a3a667
07 Jul, 2020 2 commits
- skip newer tests · eba809d7
  lcskrishna authored Jul 07, 2020
  
  eba809d7
- fixed merge conflicts ifu_07072020 · 8d5c2624
  lcskrishna authored Jul 07, 2020
  
  8d5c2624
06 Jul, 2020 1 commit

[sync BN] (#792) · 1ff54b8f

jjsjann123 authored Jul 06, 2020

* [sync BN]

support non-uniform batch size across process group.

TODO: test should be added once cleaned up.

* updating unit tests

* new unit tests for different inputs

* cleaning

1ff54b8f

01 Jul, 2020 1 commit
- name -> p_name (name variable is out of scope) · 59995c76
  Kirthi Sivamani authored Jul 01, 2020
  
  59995c76
30 Jun, 2020 1 commit

Don't patch tensor ops that aren't present (#899) · 43a6f9fe

mcarilli authored Jun 30, 2020



* Only attempt to patch Tensor methods if defined

* syntax

Co-authored-by: Michael Carilli <mcarilli@nvidia.com>

43a6f9fe
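
The approach named in 43a6f9fe, patching a method only when the class actually defines it, can be sketched with `hasattr`. The names below (`FakeTensor`, `patch_if_defined`, `tag_calls`) are hypothetical and chosen for illustration; this is not the amp patching code itself.

```python
import functools


class FakeTensor:
    """Hypothetical stand-in for a tensor class with a single method."""
    def add(self, other):
        return ("add", other)


def patch_if_defined(cls, method_names, wrapper):
    """Wrap each named method on cls, silently skipping names cls does not define."""
    patched = []
    for name in method_names:
        if not hasattr(cls, name):
            continue  # method absent in this version of the class: leave it alone
        setattr(cls, name, wrapper(getattr(cls, name)))
        patched.append(name)
    return patched


def tag_calls(fn):
    """Example wrapper that marks results as coming from a patched method."""
    @functools.wraps(fn)
    def wrapped(*args, **kwargs):
        return ("patched", fn(*args, **kwargs))
    return wrapped


patched = patch_if_defined(FakeTensor, ["add", "not_a_method"], tag_calls)
```

Here only `add` is wrapped; `not_a_method` is skipped rather than raising an `AttributeError`.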

23 Jun, 2020 5 commits
- Merge pull request #892 from kexinyu/master · 44532b30
  Kexin Yu authored Jun 23, 2020
```
add unit tests for FusedLAMB optimizer
```
  44532b30
- add test case for non-zero weight decay · ad50ce9a
  Kexin Yu authored Jun 23, 2020
  
  ad50ce9a
- test nvlamb; hyperparams consistent with adam/adagrad tests · cd3d6d12
  Kexin Yu authored Jun 23, 2020
  
  cd3d6d12
- Merge pull request #23 from ashishfarmer/launch_bounds_fix · 7e099371
  Peng authored Jun 22, 2020
```
Fix launch bounds for cleanup(...) call
```
  7e099371
- add test for FusedLAMB · 9774ce0d
  Kexin Yu authored Jun 22, 2020
  
  9774ce0d
22 Jun, 2020 1 commit
- fix launch bounds for cleanup · a640c63b
  ashishfarmer authored Jun 22, 2020
  
  a640c63b
18 Jun, 2020 1 commit
- Merge pull request #21 from rohithkrn/layernorm_bf16_fix · f2fcce58
  rohithkrn authored Jun 18, 2020
```
fix bf16 layernorm bug
```
  f2fcce58
15 Jun, 2020 5 commits
- fix bf16 layernorm bug · c9d35a49
  rohithkrn authored Jun 15, 2020
  
  c9d35a49
- Merge pull request #885 from a-maci/2dmasking_sparsity · c3fad1ad
  Thor Johnsen authored Jun 15, 2020
```
2d masking and sparsity
```
  c3fad1ad
- Updating comment · 73ff00ea
  Asit authored Jun 15, 2020
```
Minor edit
```
  73ff00ea
- adding comments for 2d pruning · 8923046f
  Asit authored Jun 15, 2020
```
Importance and usage of 2d masking
```
  8923046f
- editing comments · ceca097f
  Asit authored Jun 15, 2020
  
  ceca097f
11 Jun, 2020 1 commit
- Merge pull request #883 from NVIDIA/schetlur/stream_bug_fix · 02a33875
  schetlur authored Jun 11, 2020
```
Update softmax.h
```
  02a33875