- 09 Dec, 2020 1 commit
-
-
lcskrishna authored
-
- 04 Nov, 2020 1 commit
-
-
Ashish Farmer authored
* fix warp size in WARP_SHFL* in layernorm
* enable fused_layer_norm tests on ROCm
-
- 21 Aug, 2020 1 commit
-
-
Chaitanya Sri Krishna Lolla authored
-
- 18 Aug, 2020 1 commit
-
-
Chaitanya Sri Krishna Lolla authored
* enable deprecated fused adam optimizer
* enable deprecated fused lamb
* enable xentropy extension
* add warpsize 32 for nv and 64 for amd
* update compiler arguments
* update the syncwarp conditions
* update syncwarp condition
-
- 17 Aug, 2020 1 commit
-
-
Chaitanya Sri Krishna Lolla authored
* enable deprecated fused adam optimizer
* enable deprecated fused lamb
* reset the compiler arguments
* syntax error
* aligning the compiler arguments
-
- 05 Aug, 2020 1 commit
-
-
Chaitanya Sri Krishna Lolla authored
* enable mlp cuda
* add setup changes and tests
* skip the unit tests
* updated conditions for empty array
* removed hip platform conditions
-
- 01 Aug, 2020 1 commit
-
-
Ashish Farmer authored
IFU-master 07/27/2020.
-
- 31 Jul, 2020 2 commits
-
-
-
Chaitanya Sri Krishna Lolla authored
-
- 27 Jul, 2020 1 commit
-
-
lcskrishna authored
-
- 23 Jul, 2020 1 commit
-
-
Thor Johnsen authored
ASP sparse param dict update
-
- 22 Jul, 2020 2 commits
-
-
Asit authored
1. Support including a user-supplied custom layer type and its parameter name in sparse_parameter_list. This is useful when users have their own implementation of nn.Linear or nn.Conv2d. For example, the huggingface repo has a custom implementation of nn.Linear called LinearActivation.
2. Print info about the layers in the model that are not pruned.
-
Asit authored
Merge pull request #917 from a-maci/master
-
- 21 Jul, 2020 1 commit
-
-
Thor Johnsen authored
Fixing the case when grads are None
-
- 20 Jul, 2020 3 commits
- 16 Jul, 2020 2 commits
-
-
Thor Johnsen authored
Fixed weight init for fused weight matrices in fused MHA by adding correct gain factor
-
Thor Johnsen authored
Fixed variable name
-
- 10 Jul, 2020 1 commit
-
-
Chaitanya Sri Krishna Lolla authored
* Enable sync batchnorm
* enable syncbn properly
* update the unit tests
* update tests
* update conditions for welford_merge_element
* updated conditions based on comments.
-
- 09 Jul, 2020 1 commit
-
-
Szymon Migacz authored
-
- 08 Jul, 2020 1 commit
-
-
Chaitanya Sri Krishna Lolla authored
IFU-07072020
-
- 07 Jul, 2020 2 commits
-
-
lcskrishna authored
-
lcskrishna authored
-
- 06 Jul, 2020 1 commit
-
-
jjsjann123 authored
* [sync BN] support non-uniform batch size across process group. TODO: test should be added once cleaned up.
* updating unit tests
* new unit tests for different inputs
* cleaning
-
- 01 Jul, 2020 1 commit
-
-
Kirthi Sivamani authored
-
- 30 Jun, 2020 1 commit
-
-
mcarilli authored
* Only attempt to patch Tensor methods if defined
* syntax

Co-authored-by: Michael Carilli <mcarilli@nvidia.com>
-
- 23 Jun, 2020 5 commits
- 22 Jun, 2020 1 commit
-
-
ashishfarmer authored
-
- 18 Jun, 2020 1 commit
-
-
rohithkrn authored
fix bf16 layernorm bug
-
- 15 Jun, 2020 5 commits
-
-
rohithkrn authored
-
Thor Johnsen authored
2d masking and sparsity
-
Asit authored
Minor edit
-
Asit authored
Importance and usage of 2d masking
-
Asit authored
-
- 11 Jun, 2020 1 commit
-
-
schetlur authored
Update softmax.h
-