- 05 Aug, 2022 1 commit
Hubert Lu authored
FusedRMSNorm/"T5LayerNorm" based on FusedLayerNorm (#1274)
* FusedRMSNorm based on FusedLayerNorm
* refactor duplicated kernels
* delete comments
* cleanup, fixed clobbering forward_affine_mixed_dtypes
* fix pybind naming and add MixedFused test
* undo skipping
* check elementwise_affine
* Update tests/L0/run_fused_layer_norm/test_fused_layer_norm.py
* fix and generate docs for FusedRMSNorm (#1285)
* [FusedRMSNorm doc] document where epsilon is added (#1295): add epsilon to the formula, wording corrections
* Fix some bugs
* Optimize HostRMSNormGradient and HostApplyRMSNorm for AMD GPUs
* Fix NaN issues in FusedRMSNorm
* Update test_fused_layer_norm.py
* Skip test_fused_layer_norm.TestAutocastFusedRMSNorm on ROCm
* Use at::cuda::warp_size() instead of at::cuda::getCurrentDeviceProperties()->warpSize
Co-authored-by: Masaki Kozuki <masaki.kozuki.2014@gmail.com>
Co-authored-by: eqy <eddiey@nvidia.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
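For reference, a sketch of the normalization that #1274 and the doc fix in #1295 concern, in the standard RMSNorm formulation; per the clarification above, epsilon is added inside the root mean square, and g is the optional elementwise_affine weight:

$$
y_i \;=\; \frac{x_i}{\sqrt{\frac{1}{n}\sum_{j=1}^{n} x_j^2 \;+\; \epsilon}}\; g_i
$$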
- 02 Oct, 2021 1 commit
Masaki Kozuki authored
Co-authored-by: Piotr Bialecki <pbialecki@nvidia.com>
Co-authored-by: Eddie Yan <eddiey@nvidia.com>
Co-authored-by: Rishi Puri <riship@nvidia.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
- 15 Jun, 2020 1 commit
rohithkrn authored
- 03 Jul, 2019 2 commits
Michael Carilli authored
Michael Carilli authored
- 26 Apr, 2019 1 commit
ptrblck authored
* change .type().ScalarType() to .scalar_type() + at::ScalarType::X to at::kX
* revert scalar_type() to type() for AT_DISPATCH_FLOATING_TYPES_AND_HALF
* revert scalar_type() to type() in AT_DISPATCH_FLOATING_TYPES
* revert scalar_type() to type() for AT_DISPATCH_FLOATING_TYPES_AND_HALF in welford.cu
* revert scalar_type() to type() in layer_norm_cuda_kernel.cu
* revert at::kType to at::ScalarType::Type
* use DISPATCH_FLOAT_AND_HALF to get rid of warnings
* add dispatch mechanisms for double+float and double+float+half
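For context, a minimal hypothetical sketch (not apex's actual kernel; the names below are invented) of the ATen dispatch pattern these bullets touch. In current ATen the AT_DISPATCH_* macros accept a c10::ScalarType, so call sites pass tensor.scalar_type(); at the time of this commit some of the macros still expected tensor.type(), which is why several call sites were reverted.

```cpp
#include <ATen/ATen.h>
#include <ATen/Dispatch.h>
#include <ATen/cuda/CUDAContext.h>

// Hypothetical elementwise kernel, templated on the tensor's scalar type.
template <typename scalar_t>
__global__ void scale_kernel(const scalar_t* in, scalar_t* out,
                             float alpha, int64_t n) {
  const int64_t i = blockIdx.x * static_cast<int64_t>(blockDim.x) + threadIdx.x;
  if (i < n) {
    out[i] = static_cast<scalar_t>(static_cast<float>(in[i]) * alpha);
  }
}

void scale_cuda(const at::Tensor& in, at::Tensor& out, float alpha) {
  const int64_t n = in.numel();
  const int threads = 256;
  const int blocks = static_cast<int>((n + threads - 1) / threads);
  auto stream = at::cuda::getCurrentCUDAStream();
  // The macro instantiates the lambda for float, double and half;
  // the call site passes a ScalarType (in.scalar_type()) rather than in.type().
  AT_DISPATCH_FLOATING_TYPES_AND_HALF(in.scalar_type(), "scale_cuda", [&] {
    scale_kernel<scalar_t><<<blocks, threads, 0, stream>>>(
        in.data_ptr<scalar_t>(), out.data_ptr<scalar_t>(), alpha, n);
  });
}
```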
- 21 Mar, 2019 2 commits
Syed Tousif Ahmed authored
Syed Tousif Ahmed authored
- 12 Mar, 2019 1 commit
Michael Carilli authored
- 31 Oct, 2018 1 commit
Thor Johnsen authored
* Pre-release of fused layer norm apex extension
* Remove half and __half2 specializations
* Code changes from review
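For reference, the operation this pre-released extension fuses is standard layer normalization; a sketch in the usual formulation, with mean and variance taken over the normalized dimensions and gamma, beta the affine parameters:

$$
y \;=\; \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}}\;\gamma + \beta
$$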