- 29 Jul, 2022 1 commit
-
-
hubertlu-tw authored
-
- 07 Jul, 2022 1 commit
-
-
Masaki Kozuki authored
* remove pyprof
* remove reparameterization
* remove pyprof test
* clean up
-
- 02 Oct, 2021 1 commit
-
-
Masaki Kozuki authored
Co-authored-by: Piotr Bialecki <pbialecki@nvidia.com>
Co-authored-by: Eddie Yan <eddiey@nvidia.com>
Co-authored-by: Rishi Puri <riship@nvidia.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
-
- 21 Jan, 2021 1 commit
-
-
Jeff Daily authored
use __launch_bounds__(1024) for multi_tensor_apply, re-enable skipped tests
-
- 04 Nov, 2020 1 commit
-
-
Ashish Farmer authored
* fix warp size in WARP_SHFL* in layernorm
* enable fused_layer_norm tests on ROCm
-
- 05 Aug, 2020 1 commit
-
-
Chaitanya Sri Krishna Lolla authored
* enable mlp cuda
* add setup changes and tests
* skip the unit tests
* updated conditions for empty array
* removed hip platform conditions
-
- 19 May, 2020 4 commits
-
-
lcskrishna authored
-
lcskrishna authored
-
lcskrishna authored
-
lcskrishna authored
-
- 22 Apr, 2020 1 commit
-
-
Deyu Fu authored
-
- 31 Mar, 2020 1 commit
-
-
Jeff Bowles authored
-
- 13 Aug, 2019 2 commits
-
-
Deyu Fu authored
* FusedSGD now works as before
* FusedAdam now works with o1/o2 and no longer fuses scaling and casting
* Removed special backend handling for FusedAdam
* Moved and updated test for FusedAdam into run_optimizers
* Removed legacy tests for optimizers.FP16_optimizer and FusedAdam in run_mixed_adam
-
Marek Kolodziej authored
Co-authored-by: Aditya Agrawal <aditya.iitb@gmail.com>
Co-authored-by: Marek Kolodziej <mkolod@gmail.com>
-
- 10 Apr, 2019 1 commit
-
-
Lam Dang authored
-
- 26 Feb, 2019 1 commit
-
-
Michael Carilli authored
-
- 05 Feb, 2019 1 commit
-
-
Jerry Ma authored
This commit adds an FP16Model class as a successor to network_to_half. The benefits of this class are:
- Preservation of single-precision for BatchNorm layers. The models generated by network_to_half() convert BatchNorm moment tensors to half-precision, then back to single-precision, which hurts the accuracy of the moment estimators and occasionally results in NaNs.
- Support for multi-argument nn.Modules (self-explanatory from code).
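The precision loss described for BatchNorm moment tensors can be illustrated without PyTorch. The following is a minimal sketch, not code from this commit: it uses only Python's standard `struct` module to round a value to IEEE 754 half precision and back. The helper name `roundtrip_half` and the sample statistics are hypothetical and chosen purely for illustration.

```python
import struct


def roundtrip_half(x: float) -> float:
    """Round a Python float to IEEE 754 half precision and convert it back."""
    # The "e" format character is the 16-bit binary16 float.
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical BatchNorm running statistics (illustrative values only).
samples = [("running_mean", 0.1234567), ("running_var", 1.0e-5)]

for name, value in samples:
    restored = roundtrip_half(value)
    rel_err = abs(restored - value) / abs(value)
    print(f"{name}: {value!r} -> {restored!r} (relative error {rel_err:.2e})")
```

Neither sample value is exactly representable in half precision, so the round trip changes both. The small variance falls into the half-precision subnormal range, where the relative error is larger, which is consistent with the accuracy concern stated in the commit message.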
-
- 01 Feb, 2019 1 commit
-
-
Michael Carilli authored
-
- 30 Oct, 2018 1 commit
-
-
ngimel authored
* Add unittest for FusedAdam.
* Fix some bugs.
* set seed for adam test
-
- 13 Sep, 2018 1 commit
-
-
Michael Carilli authored
-