- 20 Jan, 2023 (1 commit)
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

- 18 Jan, 2023 (1 commit)
asfiyab-nvidia authored
* Add ONNX export support for TE modules (#1)
* Add TorchScript Operators
* Add symbolic methods to ONNX exporter
* Add tests for the ONNX export
* fixes for pylint tests
* fix pylint warning in softmax.py
* move FP8 ORT lib inside tests/
* enable cross attention tests
* refactor code by @nzmora
* Increase layernorm FP16 threshold
* Normalize onnx file names: _ separates configs; - separates words in a single config
* Add get_attn_mask_str and fix mask string
* Add missing ONNX files
* Moved generated ONNX files to tests/gen_onnx_models/
* fix merge conflict changes
* fix Q/DQ scale input
* enable FP16 config when bias is disabled
* fix pylint check errors
* updates
  1. remove List import for pylint failure
  2. address comments: remove state tensors from GPU
  3. address comments: Update reverse_map_dtype function and add to namespace
* minor fix: coding guidelines
* changes:
  1. skip FP8 tests on non-hopper devices
  2. minor fix for C++ lint check
* fix onnxruntime version
* minor fix: add space between code and comment
* changes
  1. update copyrights
  2. update path to ORT .so
* Apply suggestions from code review
Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
Signed-off-by: asfiyab-nvidia <117682710+asfiyab-nvidia@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

- 03 Jan, 2023 (1 commit)
Przemyslaw Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>

- 08 Dec, 2022 (1 commit)
Przemyslaw Tredak authored
* Move the amax/scale/scale_inv into the TE Tensor struct.
* Handle multi_cast_transpose
* Changed softmax to new Tensor
* First pass at the cpp tests
* Round of fixes
* Fix multi_cast_transpose
* Fix cast_to_fp8
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

- 01 Dec, 2022 (1 commit)
Kirthi Shankar Sivamani authored
* Make fused softmax kernels PyTorch independent
* Address review comments
* move get_batch_per_block to python
* fix license in softmax.h
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Sean Lee <selee@nvidia.com>

- 28 Nov, 2022 (1 commit)
Tim Moon authored
* Add kernel for multi-tensor cast-transpose
* Fix incorrect test function in multi-tensor cast-transpose unit test
* Remove std::vector from multi-tensor cast-transpose function signature
  Makes sure the main header is C-compatible.
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Przemyslaw Tredak <ptredak@nvidia.com>

- 12 Oct, 2022 (1 commit)
Przemyslaw Tredak authored
* Remove fp8_out from LN API
* fix LN test
* Fixes
Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
Co-authored-by: ksivamani <ksivamani@nvidia.com>

- 28 Sep, 2022 (1 commit)
Przemek Tredak authored
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>