- 19 Apr, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* Port initial changes
  Co-authored-by: Sangkug Lym <slym@nvidia.com>
  Co-authored-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* readd FA include for PyTorch
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Re-enable sm_70 + cleanup
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* LICENSE, cleanup header
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* 5k -> 173 errors
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* license and fixes in userbuffers-host
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* next round fixes
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* final cpp cleanup
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* pylinting
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix from linting
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Turn off default async amax reduction (#148)
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* remove unused code path
  Signed-off-by: Sangkug Lym <slym@nvidia.com>
* cleanup Macros
  Signed-off-by: Sangkug Lym <slym@nvidia.com>
* fix conflict resolution bug
  Signed-off-by: Sangkug Lym <slym@nvidia.com>
* Fix gencode flags in setup (#145)
* Fix gencode flags based on cuda version
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* review suggestions
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* revert append_nvcc_threads change
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Change overlap config dict error message
  Signed-off-by: Sangkug Lym <slym@nvidia.com>
* simplify ub initialization
  Signed-off-by: Sangkug Lym <slym@nvidia.com>
* lint
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix sanity imports
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* cpplint
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix TensorFlow build
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix TE macros in public header
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix lint
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* More fixes
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* compiles with and w/o MPI
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fixes for python side annotations for conditional compile
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* link gdrAPI only when MPI found
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix comments for dummy var
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fix linking
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Review comments
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* load MPI before TE
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add Py side argument checks
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* remove unused code and catch silent failures
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fix cpp tests
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix find_lib path for tests
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
-
- 17 Apr, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* use upstream flash-attn
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* get correct FA for linting
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 07 Apr, 2023 1 commit
-
-
ngoyal2707 authored
* made bias configurable
  Signed-off-by: Naman Goyal <naman@fb.com>
* removed commented lines
  Signed-off-by: Naman Goyal <naman@fb.com>
* Update transformer_engine/pytorch/jit.py
  Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
  Signed-off-by: ngoyal2707 <ngoyal2707@users.noreply.github.com>
* Update transformer_engine/pytorch/jit.py
  Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
  Signed-off-by: ngoyal2707 <ngoyal2707@users.noreply.github.com>
* fixed incorrect call to fused bias dropout add kernel
  Signed-off-by: Naman Goyal <naman@fb.com>
* Update transformer_engine/pytorch/jit.py
  Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
* Separate FC1 and FC2 use_bias args; solves all ci errors
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* jit fusion improvement
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Docs
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Naman Goyal <naman@fb.com>
Signed-off-by: ngoyal2707 <ngoyal2707@users.noreply.github.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Naman Goyal <naman@fb.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 29 Mar, 2023 1 commit
-
-
tcherckez-nvidia authored
Signed-off-by: Tal Cherckez <tcherckez@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 28 Mar, 2023 2 commits
-
-
Tim Moon authored
* Remove zombie process from querying TE install path
  Co-authored-by: Naman Goyal <naman@fb.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Fix FA version checking
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix unused import error
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fix lint warning
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
---------
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Naman Goyal <naman@fb.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
Kirthi Shankar Sivamani authored
* fix usage of return_bias argument
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* review comments
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 22 Mar, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
FA doesn't support compute 8.6 with head_dim>64
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 11 Mar, 2023 2 commits
-
-
Przemyslaw Tredak authored
* Change from AutoDoc to AutoAPI
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fixes
  Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
* WAR for the wrong autosummary generation
  Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
* Change common to be in line with pytorch API docs
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Add GitHub Action to build docs
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Trying to fix the versions
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
---------
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
-
Kirthi Shankar Sivamani authored
* deprecate qk layer scaling and fp32 softmax args
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* apply QK layer scaling for fp16 training
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* address review comments
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 07 Mar, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* ignore self attention mask for causal type
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* further relax checks to run FA, update docs
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix pytorch softmax path
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fixes
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* minimum ampere requirement for fa
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 02 Mar, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* fix qkv weight unfused path
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix non FA non interleaved case
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 22 Feb, 2023 1 commit
-
-
cyanguwa authored
* add flash attention to TransformerLayer
  Signed-off-by: Charlene Yang <charleney@nvidia.com>
* Add docs for FP8 calibration (#61)
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: Charlene Yang <charleney@nvidia.com>
* Fix the integer overflow in fused softmax (#60)
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
  Signed-off-by: Charlene Yang <charleney@nvidia.com>
* prefix flash attn env var with NVTE_
  Signed-off-by: Charlene Yang <charleney@nvidia.com>
* Address steady memory increase and bloated checkpoints (#63)
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: Charlene Yang <charleney@nvidia.com>
* fix env var logic
  Signed-off-by: cyanguwa <cyang.uwa@gmail.com>
  Signed-off-by: Charlene Yang <charleney@nvidia.com>
* fix flash attn env var logic again
  Signed-off-by: cyanguwa <cyang.uwa@gmail.com>
  Signed-off-by: Charlene Yang <charleney@nvidia.com>
* remove d2d copies (#64)
* remove d2d copies
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* cleanup
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Charlene Yang <charleney@nvidia.com>
* Increase number of FP8 tensors per GEMM (#22)
* Increase number of FP8 tensors per GEMM
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* Enable FP8 output tensor for fp8_gemm
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* [BERT FP8] Initial TE review comments
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* Temporary fix for cuda graph non convergence
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* Address review comments-2
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* Review comments-3
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* Cleanup
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* Change for New API
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* Remove unnecessary clone for D_scale, D_amax
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* Avoid Roll for AMAX history size = 1
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* Update onnx_te_gemm API
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
* Fix Lint errors
  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
---------
Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
Signed-off-by: Charlene Yang <charleney@nvidia.com>
* Bug fixes from PR 22 (#65)
* Bug fixes from PR 22
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add FP8 tests to ci
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* bundle unittests for ci
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Charlene Yang <charleney@nvidia.com>
* replace rearrange with transpose
  Signed-off-by: cyanguwa <cyang.uwa@gmail.com>
  Signed-off-by: Charlene Yang <charleney@nvidia.com>
* QKV parameters unfused path fixes and optimization (#66)
* Bug fixes from PR 22
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add FP8 tests to ci
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Better QKV parameter fusion
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* small fix
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* keep original param for unfused case to retain externally set attrs
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* lint fix
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fix ONNX exports
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* improve arg naming
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* No need to set data pointers
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* lint
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Assert memory loc in NoopCat
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Handle case of different memory in param and buffer
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix assert always true
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Reassign params memory to avoid more concats
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Charlene Yang <charleney@nvidia.com>
* Fix gradients when using AMP (#70) retain grad related attrs while casting
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: Charlene Yang <charleney@nvidia.com>
* fix pylint violations fixed pyline violations such as trailing white spaces and too long lines
  Signed-off-by: cyanguwa <cyang.uwa@gmail.com>
* fix pylint violation on line 264 with R1719
  Signed-off-by: cyanguwa <cyang.uwa@gmail.com>
* fix two more pylint violations
  Signed-off-by: cyanguwa <cyang.uwa@gmail.com>
* DotProductAttention API
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add docs for attention
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix assert always true
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* check for correct flash-attn version
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* address review comments
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* lint+build fixes, correct settings for default flash-attn
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* correct version
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* review comments and fixes
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix onnx and disable flash-attn export test
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* remove einops dependency
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* cleanup internal API; rm duplication
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* only install TE wheel (exclude flash-attn to rm conflicts)
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* forgot to change install wheel path
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* next round review comments
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix flash_attn output
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix QK layer scaling
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* update docs
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* review comments and fixes to selective checkpointing
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Charlene Yang <charleney@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: cyanguwa <cyang.uwa@gmail.com>
Co-authored-by: Charlene Yang <charleney@nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 15 Feb, 2023 1 commit
-
-
Przemyslaw Tredak authored
* C++ implementation of LayerNorm1P
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Expose zero centered gamma to pyTorch
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix ONNX export
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix ONNX export and tests
  Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
* Fix lint
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix backward handling - C++ part
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix for backward - Python side
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix FP8 path
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Reenable the pylint check
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix the NVTX marker
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Change in the bwd kernel
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
---------
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
-
- 10 Feb, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* Bug fixes from PR 22
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add FP8 tests to ci
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Better QKV parameter fusion
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* small fix
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* keep original param for unfused case to retain externally set attrs
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* lint fix
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fix ONNX exports
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* improve arg naming
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* No need to set data pointers
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* lint
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Assert memory loc in NoopCat
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Handle case of different memory in param and buffer
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix assert always true
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Reassign params memory to avoid more concats
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 05 Jan, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* Enforce boolean attention mask type
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix tests
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 03 Jan, 2023 1 commit
-
-
Przemyslaw Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
- 17 Dec, 2022 1 commit
-
-
Kirthi Shankar Sivamani authored
fix unfused qkv param Xattn path
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 07 Dec, 2022 1 commit
-
-
Kirthi Shankar Sivamani authored
ensure contiguous inputs
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 04 Oct, 2022 1 commit
-
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 28 Sep, 2022 1 commit
-
-
Przemek Tredak authored
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-