- 08 Dec, 2022 1 commit
Przemyslaw Tredak authored
* Move the amax/scale/scale_inv into the TE Tensor struct.
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Handle multi_cast_transpose
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Changed softmax to new Tensor
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* First pass at the cpp tests
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Round of fixes
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix multi_cast_transpose
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix cast_to_fp8
  Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
- 28 Nov, 2022 1 commit
Tim Moon authored
* Add kernel for multi-tensor cast-transpose
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Fix incorrect test function in multi-tensor cast-transpose unit test
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Remove std::vector from multi-tensor cast-transpose function signature
  Makes sure the main header is C-compatible.
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Przemyslaw Tredak <ptredak@nvidia.com>
- 03 Nov, 2022 1 commit
schetlur-nv authored
* Conditional dgrad/wgrad support
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* Fixing the change to depend only on requires_grad. Also updating LayerNorm MLP
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* Minor fixes.
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* Adding conditional wgrad for LayerNormLinear
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* bug fix and remove conditional dgrad
  Co-authored-by: schetlur-nv <schetlur@nvidia.com>
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Adding unit test for wgrad disabled path
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* Adding more unit tests for wgrad disabled path
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* Adding unit tests for fp8 wgrad disabling, and cleaning up the code.
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* fix lint errors
  Co-authored-by: Sharan Chetlur <schetlur@nvidia.com>
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
- 31 Oct, 2022 1 commit
Przemyslaw Tredak authored
* Build the wheel as GitHub action
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Change the sanity test
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
- 12 Oct, 2022 1 commit
Przemyslaw Tredak authored
* Remove fp8_out from LN API
  Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
* fix LN test
  Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
* Fixes
  Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
Co-authored-by: ksivamani <ksivamani@nvidia.com>
- 28 Sep, 2022 1 commit
Przemek Tredak authored
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>