- 15 Dec, 2022 1 commit
-
-
Przemek Tredak authored
-
- 08 Dec, 2022 1 commit
-
-
Przemyslaw Tredak authored
* Move the amax/scale/scale_inv into the TE Tensor struct.
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Handle multi_cast_transpose
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Changed softmax to new Tensor
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* First pass at the cpp tests
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Round of fixes
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix multi_cast_transpose
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Fix cast_to_fp8
  Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 07 Dec, 2022 1 commit
-
-
Kirthi Shankar Sivamani authored
ensure contiguous inputs
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 06 Dec, 2022 1 commit
-
-
Kirthi Shankar Sivamani authored
* Softmax docs and type fixes
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* lint whitespace
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* change API, better naming, const fixes
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 02 Dec, 2022 1 commit
-
-
Przemyslaw Tredak authored
Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
-
- 01 Dec, 2022 3 commits
-
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
Kirthi Shankar Sivamani authored
* Make fused softmax kernels PyTorch independent
  Co-authored-by: Sean Lee <selee@nvidia.com>
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Address review comments
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* move get_batch_per_block to python
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix license in softmax.h
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Sean Lee <selee@nvidia.com>
-
Przemyslaw Tredak authored
* Add pylint to Lint action
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Test Ubuntu 20.04
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Pylint inside the container
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Update transformer_engine/pytorch/distributed.py
  Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 30 Nov, 2022 1 commit
-
-
Tim Moon authored
Fix illegal memory access in layernorm backward kernel
Signed-off-by: Tim Moon <tmoon@nvidia.com>
-
- 28 Nov, 2022 1 commit
-
-
Tim Moon authored
* Add kernel for multi-tensor cast-transpose
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Fix incorrect test function in multi-tensor cast-transpose unit test
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Remove std::vector from multi-tensor cast-transpose function signature
  Makes sure the main header is C-compatible.
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Przemyslaw Tredak <ptredak@nvidia.com>
-
- 23 Nov, 2022 1 commit
-
-
Kirthi Shankar Sivamani authored
fix checkpoint loading bug for FAR
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 18 Nov, 2022 2 commits
-
-
Przemek Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
Tim Moon authored
* Documentation for advanced perf optimizations
  Fix bug where we were doing backward passes inside fp8_autocast in example notebooks.
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Minor tweaks to advanced perf optimization docs
  Review suggestions from @ptrendx
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Rewording sequence parallelism in advanced perf optimization docs
  Review suggestion from @ksivaman
  Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 17 Nov, 2022 1 commit
-
-
Kirthi Shankar Sivamani authored
* Make amax reduction optional
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* remove setup for global amax redux for optional case
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Improve documentation
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Address documentation review
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Documentation fix
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* better FP8 checkpointing
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Making checkpointing backwards compatible
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add deprecation warning for old checkpoint loading
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix checkpointing for fp8 recompute case
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* improvements to deprecation warning
  Co-authored-by: Przemyslaw Tredak <ptrendx@gmail.com>
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Przemyslaw Tredak <ptredak@nvidia.com>
Co-authored-by: Przemyslaw Tredak <ptrendx@gmail.com>
-
- 16 Nov, 2022 1 commit
-
-
Kirthi Shankar Sivamani authored
* Fix bugs for full activation recompute in FP8
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Ensure identical numerics in recomputation for pipeline parallelism
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* expose checkpoint API and add docs
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* complete checkpointing docs
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 15 Nov, 2022 1 commit
-
-
Kirthi Shankar Sivamani authored
addressed LayerNormMLP bias issue #26
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 08 Nov, 2022 1 commit
-
-
Przemyslaw Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
- 04 Nov, 2022 1 commit
-
-
Przemek Tredak authored
-
- 03 Nov, 2022 2 commits
-
-
schetlur-nv authored
* Conditional dgrad/wgrad support
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* Fixing the change to depend only on requires_grad. Also updating LayerNorm MLP
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* Minor fixes.
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* Adding conditional wgrad for LayerNormLinear
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* bug fix and remove conditional dgrad
  Co-authored-by: schetlur-nv schetlur@nvidia.com
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Adding unit test for wgrad disabled path
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* Adding more unit tests for wgrad disabled path
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* Adding unit tests for fp8 wgrad disabling, and cleaning up the code.
  Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
* fix lint errors
  Co-Authored-By: Sharan Chetlur <schetlur@nvidia.com>
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Sharan Chetlur <schetlur@dlcluster.nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
nzmora-nvidia authored
Fix the sample code so it compiles after the signature of `te.Linear` has changed.
Signed-off-by: nzmora-nvidia <96238833+nzmora-nvidia@users.noreply.github.com>
-
- 31 Oct, 2022 1 commit
-
-
Przemyslaw Tredak authored
* Build the wheel as GitHub action
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* Change the sanity test
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
- 28 Oct, 2022 1 commit
-
-
Przemek Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
- 20 Oct, 2022 2 commits
-
-
Przemek Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
Przemyslaw Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
- 12 Oct, 2022 1 commit
-
-
Przemyslaw Tredak authored
* Remove fp8_out from LN API
  Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
* fix LN test
  Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
* Fixes
  Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
Signed-off-by: Przemyslaw Tredak <ptredak@nvidia.com>
Co-authored-by: ksivamani <ksivamani@nvidia.com>
-
- 10 Oct, 2022 1 commit
-
-
Przemyslaw Tredak authored
Add lint test as GitHub action
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
- 07 Oct, 2022 1 commit
-
-
Przemyslaw Tredak authored
Add blossom-ci.yml
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
- 05 Oct, 2022 2 commits
-
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 04 Oct, 2022 4 commits
-
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
Przemyslaw Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
- 28 Sep, 2022 2 commits
-
-
Przemek Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-
Przemek Tredak authored
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
-