- 10 Oct, 2024 1 commit
Przemyslaw Tredak authored
* Fixes to Float8Tensor
  Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
---------
Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-
- 06 Jun, 2024 1 commit
Kirthi Shankar Sivamani authored
Cleanup
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 30 Apr, 2024 1 commit
Tim Moon authored
* Fix linter warnings from unused args
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Update .gitignore
  Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
---------
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 29 Jan, 2024 1 commit
Alp Dener authored
* Removed cudaMalloc/WorkspaceManager in JAX csrc. JAX custom ops now request buffers from XLA for their workspace tensors.
  Signed-off-by: Alp Dener <adener@nvidia.com>
* removed unused GEMM C++ API in TE-JAX
  Signed-off-by: Alp Dener <adener@nvidia.com>
* fixed typo in layernorm_geglu_fp8_mlp and removed unnecessary shape reductions in primitives
  Signed-off-by: Alp Dener <adener@nvidia.com>
* fixed import order for linting
  Signed-off-by: Alp Dener <adener@nvidia.com>
* fixed custom op errors due to incorrect static arg nums in JAX jit
  Signed-off-by: Alp Dener <adener@nvidia.com>
* shifted cudnnSetStream further down the kernel to avoid error when executing dummy kernel call with nullptr stream
  Signed-off-by: Alp Dener <adener@nvidia.com>
* fixed linting errors for blank lines
  Signed-off-by: Alp Dener <adener@nvidia.com>
---------
Signed-off-by: Alp Dener <adener@nvidia.com>
-
- 02 Jun, 2023 1 commit
Jan Bielak authored
* Ignore IDE files
  Signed-off-by: Jan Bielak <jbielak@nvidia.com>
* Fix typing errors
  Signed-off-by: Jan Bielak <jbielak@nvidia.com>
* Ignore devcontainer files
  Signed-off-by: Jan Bielak <jbielak@nvidia.com>
* Avoid import from private module
  Signed-off-by: Jan Bielak <jbielak@nvidia.com>
* Apply @timmoon10's suggestions
  Signed-off-by: Jan Bielak <jbielak@nvidia.com>
---------
Signed-off-by: Jan Bielak <jbielak@nvidia.com>
-
- 18 Jan, 2023 1 commit
asfiyab-nvidia authored
* Add ONNX export support for TE modules (#1)
* Add TorchScript Operators
* Add symbolic methods to ONNX exporter
* Add tests for the ONNX export
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* fixes for pylint tests
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* fix pylint warning in softmax.py
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* move FP8 ORT lib inside tests/
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* enable cross attention tests
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* refactor code by @nzmora
* Increase layernorm FP16 threshold
* Normalize onnx file names: _ separates configs; - separates words in a single config
* Add get_attn_mask_str and fix mask string
* Add missing ONNX files
* Moved generated ONNX files to tests/gen_onnx_models/
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* fix merge conflict changes
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* fix Q/DQ scale input
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* enable FP16 config when bias is disabled
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* fix pylint check errors
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* updates
  1. remove List import for pylint failure
  2. address comments: remove state tensors from GPU
  3. address comments: Update reverse_map_dtype function and add to namespace
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* minor fix: coding guidelines
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* changes:
  1. skip FP8 tests on non-hopper devices
  2. minor fix for C++ lint check
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* fix onnxruntime version
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* minor fix: add space between code and comment
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* changes
  1. update copyrights
  2. update path to ORT .so
  Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* Apply suggestions from code review
  Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: asfiyab-nvidia <117682710+asfiyab-nvidia@users.noreply.github.com>
Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
Signed-off-by: asfiyab-nvidia <117682710+asfiyab-nvidia@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 28 Sep, 2022 1 commit
Przemek Tredak authored
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-