- 02 Jan, 2026 1 commit
-
-
Kirthi Shankar Sivamani authored
Update copyright to include 2026 Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 26 Nov, 2025 1 commit
-
-
Paweł Gadziński authored
* init Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * lines length Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * subtitle --- fix in many files: Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * cross entropy _input -> input rename Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * cross entropy _input -> input rename Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * a lot of small fixes Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * torch_version() change Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing module and fix warnings Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * removed trailing whitespace: Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * Update docs/api/pytorch.rst Co-authored-by:
greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by:
Paweł Gadziński <62263673+pggPL@users.noreply.github.com> * Fix import Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix more imports Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix NumPy docstring parameter spacing and indentation - Standardize parameter documentation to use 'param : type' format (space before and after colon) per NumPy style guide - Fix inconsistent indentation in cpu_offload.py docstring - Modified 51 Python files across transformer_engine/pytorch Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> --------- Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> Signed-off-by:
Paweł Gadziński <62263673+pggPL@users.noreply.github.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by:
greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 24 Jul, 2025 1 commit
-
-
Evgeny Tsykunov authored
* Increase intermediate precision and reuse tensors from fwd Signed-off-by:
Evgeny <etsykunov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * JIT warmup only when required Signed-off-by:
Evgeny <etsykunov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Recompute only rsqrt_norm Signed-off-by:
Evgeny <etsykunov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by:
Evgeny <etsykunov@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-
- 16 Jul, 2025 1 commit
-
-
Paweł Gadziński authored
* some initial code Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * onnx support Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * mxfp8 support Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fixed returning layernorm etc Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * formatting Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * lint fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * license fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * tests passing Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refactor Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * lint Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fixes Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * added pip install to test.sh Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * Update transformer_engine/pytorch/export.py Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Paweł Gadziński <62263673+pggPL@users.noreply.github.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * float8currentscaling quantizer exception Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added to wheels Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * onnx versions Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * installations in tests Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by:
root <root@prenyx0221.a51.clusters.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * lint fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by:
root <pgadzinski@nvidia.com> * fixes Signed-off-by:
root <pgadzinski@nvidia.com> * fixes Signed-off-by:
root <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixes Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixes Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * Update setup.py Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Paweł Gadziński <62263673+pggPL@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * onnxscript version change Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixes Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix CI Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * fix Signed-off-by:
Pawel Gadzinski <pgadzinski@gmail.com> * Update build.yml Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Update pytorch.py Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Pawel Gadzinski <pgadzinski@nvidia.com> Signed-off-by:
Paweł Gadziński <62263673+pggPL@users.noreply.github.com> Signed-off-by:
root <root@prenyx0221.a51.clusters.nvidia.com> Signed-off-by:
root <pgadzinski@nvidia.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Pawel Gadzinski <pgadzinski@gmail.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Co-authored-by:
root <root@prenyx0221.a51.clusters.nvidia.com> Co-authored-by:
Pawel Gadzinski <pgadzinski@gmail.com>
-
- 12 Jun, 2025 1 commit
-
-
Evgeny Tsykunov authored
* Support L2Norm basic op Signed-off-by:
Evgeny <etsykunov@nvidia.com> * Add L2Norm module wrapper Signed-off-by:
Evgeny <etsykunov@nvidia.com> * Expose qk_norm to MHA and transformer layer Signed-off-by:
Evgeny <etsykunov@nvidia.com> * Move tests into separate file Signed-off-by:
Evgeny <etsykunov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pass Signed-off-by:
Evgeny <etsykunov@nvidia.com> * Add license Signed-off-by:
Evgeny <etsykunov@nvidia.com> * Remove module Signed-off-by:
Evgeny <etsykunov@nvidia.com> * Resolve comments Signed-off-by:
Evgeny <etsykunov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by:
Evgeny <etsykunov@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-
- 03 May, 2025 1 commit
-
-
Xin Yao authored
* Fix autocast deprecation warnings Signed-off-by:
Xin Yao <xiny@nvidia.com> * merge main Signed-off-by:
Xin Yao <xiny@nvidia.com> * update Signed-off-by:
Xin Yao <xiny@nvidia.com> * resolve comments Signed-off-by:
Xin Yao <xiny@nvidia.com> --------- Signed-off-by:
Xin Yao <xiny@nvidia.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 25 Mar, 2025 1 commit
-
-
Peter St. John authored
* Defer torch compilation steps until first function call Signed-off-by:
Peter St. John <pstjohn@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix function call in smoke test Signed-off-by:
Peter St. John <pstjohn@nvidia.com> --------- Signed-off-by:
Peter St. John <pstjohn@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com>
-
- 20 Mar, 2025 1 commit
-
-
yuguo authored
-
- 28 Feb, 2025 1 commit
-
-
Kirthi Shankar Sivamani authored
* Enforce torch 2.0 and run attn tests with torch.compile Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * replace torch.compile with jit_fuser Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fixes Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 02 Jan, 2025 1 commit
-
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 16 Oct, 2024 1 commit
-
-
Kirthi Shankar Sivamani authored
* Upgrade pylint and first round formatting Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * round 2 Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * round 3 Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Format and fixes Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Paddle lint Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Reviews Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fixes Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * More linting Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Run formatter Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Paddle lint Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fixes Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 04 Oct, 2024 1 commit
-
-
Tim Moon authored
* CPU perf optimization in linear autograd function Avoid enable_grad context when possible in cast function. Cache distributed group properties. Signed-off-by:
Tim Moon <tmoon@nvidia.com> * CPU perf optimization in prepare_forward function Avoid torch.nn.Module impl of __setattr__. Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Avoid module import in TE module forwards Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Use fast getter for params Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Reuse tensor dims in linear autograd func Signed-off-by:
Tim Moon <tmoon@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Apply optimizations to grouped linear Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Debug test failures Signed-off-by:
Tim Moon <tmoon@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Debug test failures Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Fix linter warnings Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Avoid deepcopy in tests Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Move _fast_setattr logic to __setattr__ method Signed-off-by:
Tim Moon <tmoon@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by:
Tim Moon <tmoon@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-
- 14 Jun, 2024 1 commit
-
-
Kirthi Shankar Sivamani authored
* Apply formatting Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Apply formatting Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 12 Jun, 2024 1 commit
-
-
Sudhakar Singh authored
Skip switching to nvfuser for torch >= 2.2 Signed-off-by: Sudhakar Singh <sudhakars@nvidia.com>
-
- 24 Jan, 2024 1 commit
-
-
Przemyslaw Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
- 19 Jan, 2024 1 commit
-
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 08 Jan, 2024 1 commit
-
-
Jaemin Choi authored
* Use jit_fuser for bias-dropout-add fusion Signed-off-by:
Jaemin Choi <jaeminc@nvidia.com> * Use jit_fuser for CP FA kernel Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Jaemin Choi <jaeminc@nvidia.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Co-authored-by:
Jaemin Choi <jaeminc@nvidia.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 03 Jan, 2024 1 commit
-
-
Przemyslaw Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
- 15 Dec, 2023 1 commit
-
-
Przemyslaw Tredak authored
* Disable dynamo for Fused Attention Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> * Added test Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> --------- Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> Co-authored-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com>
-
- 19 Jul, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 15 Jul, 2023 1 commit
-
-
Tim Moon authored
* Disable TorchDynamo optimizations in PyTorch modules Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Add test for Torch Dynamo Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Add torch.dynamo test to qa Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Skip torch.compile test for <v2.0 Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Tim Moon <tmoon@nvidia.com> Signed-off-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 07 Jun, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* Use torch.compile for version 2.0 and higher Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Address review Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Remove unused import Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * use torch.__version__ Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Use NVFuser for dropout fusions Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix onnx tests Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 12 May, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* deterministic JIT warmup Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * review comments Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 07 Apr, 2023 1 commit
-
-
ngoyal2707 authored
* made bias configurable Signed-off-by:
Naman Goyal <naman@fb.com> * removed commented lines Signed-off-by:
Naman Goyal <naman@fb.com> * Update transformer_engine/pytorch/jit.py Co-authored-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> Signed-off-by:
ngoyal2707 <ngoyal2707@users.noreply.github.com> * Update transformer_engine/pytorch/jit.py Co-authored-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> Signed-off-by:
ngoyal2707 <ngoyal2707@users.noreply.github.com> * fixed incorrect call to fused bias dropout add kernel Signed-off-by:
Naman Goyal <naman@fb.com> * Update transformer_engine/pytorch/jit.py Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> * Separate FC1 and FC2 use_bias args; solves all ci errors Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * jit fusion improvement Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Docs Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Naman Goyal <naman@fb.com> Signed-off-by:
ngoyal2707 <ngoyal2707@users.noreply.github.com> Signed-off-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Co-authored-by:
Naman Goyal <naman@fb.com> Co-authored-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 03 Jan, 2023 1 commit
-
-
Przemyslaw Tredak authored
Signed-off-by:
Przemek Tredak <ptredak@nvidia.com> Signed-off-by:
Przemek Tredak <ptredak@nvidia.com>
-
- 28 Sep, 2022 1 commit
-
-
Przemek Tredak authored
Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Przemek Tredak <ptredak@nvidia.com>
-