- 02 Jan, 2025 1 commit
-
-
Kirthi Shankar Sivamani authored
Signed-off-by:Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 21 Aug, 2024 1 commit
-
-
Tim Moon authored
* Perform scale-inv update in cast-transpose kernels Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Perform scale-inv update in cast and activation kernels Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Perform sclae-inv update in LayerNorm and RMSNorm kernels Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Perform scale-inv update after FP8 GEMMs Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Fuse casts and scale-inv updates in linear module Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Fuse casts and scale-inv updates in layernorm-linear module Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Simplify kernel to update FP8 scale-inv Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Fix typos Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Debug amax update in layernorm kernels Signed-off-by:
Tim Moon <tmoon@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Debug test failures Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Debug ONNX export Use quantization scaling factor in ONNX quantize op. Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Review suggestion from @ptrendx Signed-off-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> * Debug mismatched dtypes Signed-off-by:
Tim Moon <tmoon@nvidia.com> --------- Signed-off-by:
Tim Moon <tmoon@nvidia.com> Signed-off-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-
- 14 Jun, 2024 1 commit
-
-
Kirthi Shankar Sivamani authored
* Apply formatting Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Apply formatting Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 13 May, 2024 1 commit
-
-
Phuong Nguyen authored
* renamed gelu to act * added relu, srelu, qgelu * fixes initialization for layernorm_fp8_mlp tests * moved activation_fp8 prim into testunit file * Moved NVTE_Activation_Enum to common/.../activation.h --------- Signed-off-by:Phuong Nguyen <phuonguyen@nvidia.com>
-
- 24 Apr, 2024 1 commit
-
-
Phuong Nguyen authored
* Implemented swiglu and silu Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com> * Renamed nvte-*silu to nvte-*swish + generalized GetDBiasDact functions Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com>
-
- 17 Feb, 2024 1 commit
-
-
Alp Dener authored
* Added QuickGELUActivation from HuggingFace/Transformers to common and pytorch Signed-off-by:
Alp Dener <adener@nvidia.com> * Removing 'qgelu' from double-size activations list in LayerNormMLP. Signed-off-by:
Alp Dener <adener@nvidia.com> * indent fix Signed-off-by:
Alp Dener <adener@nvidia.com> --------- Signed-off-by:
Alp Dener <adener@nvidia.com> Co-authored-by:
Przemyslaw Tredak <ptredak@nvidia.com>
-
- 03 Jan, 2024 1 commit
-
-
Przemyslaw Tredak authored
Signed-off-by:Przemek Tredak <ptredak@nvidia.com>
-
- 13 Jun, 2023 1 commit
-
-
Przemyslaw Tredak authored
* Added ReLU and GLU variants to common Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * pyTorch changes Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * PyTorch C++ lint Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * fix Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Bug fix Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * More fixes Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix storage errors Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Compute bgrad Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix numerical tests Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix ONNX export tests Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Review comments Co-authored-by:
Przemyslaw Tredak <ptrendx@gmail.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 17 Jan, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* Move scale inverse calculation to framework Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * cleanup Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix RMSNorm Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix tests Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix gated kernel/geglu Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 12 Jan, 2023 1 commit
-
-
Przemyslaw Tredak authored
* Add NVTX to TE modules Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Fix Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Fix pylint Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Fix NVTX in _prepare_backward Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Add NVTX to C API Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Fix cpplint and link nvToolsExt Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Add NVTX to GeGlu Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> Signed-off-by:
Przemek Tredak <ptredak@nvidia.com>
-
- 10 Jan, 2023 1 commit
-
-
zlsh80826 authored
* Add GeGLU and DGeGLU Signed-off-by:
Reese Wang <rewang@nvidia.com> * Add DGeGLUCT Signed-off-by:
Reese Wang <rewang@nvidia.com> * Update copyright year Signed-off-by:
Reese Wang <rewang@nvidia.com> * Refine shape check Signed-off-by:
Reese Wang <rewang@nvidia.com> * Code refine Signed-off-by:
Reese Wang <rewang@nvidia.com> Signed-off-by:
Reese Wang <rewang@nvidia.com>
-
- 03 Jan, 2023 1 commit
-
-
Przemyslaw Tredak authored
Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> Signed-off-by:
Przemek Tredak <ptredak@nvidia.com>
-
- 08 Dec, 2022 1 commit
-
-
Przemyslaw Tredak authored
* Move the amax/scale/scale_inv into the TE Tensor struct. Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Handle multi_cast_transpose Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Changed softmax to new Tensor Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * First pass at the cpp tests Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Round of fixes Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Fix multi_cast_transpose Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Fix cast_to_fp8 Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Przemyslaw Tredak <ptrendx@gmail.com> Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> Signed-off-by:
Przemyslaw Tredak <ptrendx@gmail.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 28 Sep, 2022 1 commit
-
-
Przemek Tredak authored
Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Przemek Tredak <ptredak@nvidia.com>
-