- 11 Sep, 2025 2 commits
-
-
wenjh authored
-
wenjh authored
Signed-off-by:wenjh <wenjh@sugon.com>
-
- 09 Sep, 2025 2 commits
-
-
wenjh authored
-
wenjh authored
Signed-off-by:wenjh <wenjh@sugon.com>
-
- 03 Sep, 2025 2 commits
-
-
wenjh authored
-
wenjh authored
Signed-off-by:wenjh <wenjh@sugon.com>
-
- 02 Sep, 2025 4 commits
-
-
wenjh authored
-
wenjh authored
Signed-off-by:wenjh <wenjh@sugon.com>
-
wenjh authored
-
wenjh authored
Signed-off-by:wenjh <wenjh@sugon.com>
-
- 28 Aug, 2025 4 commits
-
-
yuguo authored
Merge branch 'develop_v2.7' of http://10.16.6.30/dcutoolkit/deeplearing/TransformerEngine into release_v2.7
-
yuguo authored
-
yuguo authored
Merge branch 'develop_v2.7' of http://10.16.6.30/dcutoolkit/deeplearing/TransformerEngine into release_v2.7
-
yuguo authored
-
- 27 Aug, 2025 4 commits
-
-
yuguo authored
Merge branch 'develop_v2.7' of http://10.16.6.30/dcutoolkit/deeplearing/TransformerEngine into release_v2.7
-
yuguo authored
-
-
yuguo authored
Merge commit '734bcedd' of https://github.com/NVIDIA/TransformerEngine
-
- 26 Aug, 2025 13 commits
-
-
jberchtold-nvidia authored
Revert "[Common] PDL for Blockwise Quantization (#2066)" This reverts commit ebca6153 . Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com>
-
vcherepanov-nv authored
* Bump cuDNN FE to 1.14.0 Signed-off-by:
Vladimir Cherepanov <vcherepanov@nvidia.com> * Change submodule hash Signed-off-by:
Vladimir Cherepanov <vcherepanov@nvidia.com> * Pick up a cuDNN FE fix Signed-off-by:
Vladimir Cherepanov <vcherepanov@nvidia.com> * New model configs in tests Signed-off-by:
Vladimir Cherepanov <vcherepanov@nvidia.com> * Exclude cuDNN backend for some configs Signed-off-by:
Vladimir Cherepanov <vcherepanov@nvidia.com> --------- Signed-off-by:
Vladimir Cherepanov <vcherepanov@nvidia.com>
-
jberchtold-nvidia authored
Revert "[Common] PDL for Quantization Kernels (#2001)" This reverts commit bfab8c67 . Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com>
-
Phuong Nguyen authored
* added shardy warning Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com> --------- Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com>
-
Tim Moon authored
* Return dummy wgrad tensors when requested by Mcore Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Apply suggestions from code review Co-authored-by:
Jan Bielak <janekb04@icloud.com> Signed-off-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> --------- Signed-off-by:
Tim Moon <tmoon@nvidia.com> Signed-off-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> Co-authored-by:
Jan Bielak <janekb04@icloud.com>
-
Md Fahim Faysal Khan authored
* added cp strategy arg to DPA api Signed-off-by:
Md Fahim Faysal Khan <mdfahimfaysa@nvidia.com> * converted DPA cp_strategy to string Signed-off-by:
Md Fahim Faysal Khan <mdfahimfaysa@nvidia.com> --------- Signed-off-by:
Md Fahim Faysal Khan <mdfahimfaysa@nvidia.com>
-
Tim Moon authored
* Fix incorrect version checks for atomic GEMM Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Fix typo Signed-off-by:
Tim Moon <tmoon@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by:
Tim Moon <tmoon@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-
Tim Moon authored
Avoid garbage collection when capturing a CUDA Graph Signed-off-by:Tim Moon <tmoon@nvidia.com>
-
jberchtold-nvidia authored
[JAX] Error checking for mesh resource and update GemmPrimitive to use global_mesh_resource().fsdp_resource (#2088) * Enforce global MeshResource is set Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Use global_mesh_resource().fsdp_resource in gemm primitive Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Update tests Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Update gemm.py Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Update test_layer.py Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> --------- Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com>
-
-
yuguo authored
-
-
yuguo authored
-
- 25 Aug, 2025 4 commits
- 23 Aug, 2025 3 commits
- 21 Aug, 2025 2 commits
-
-
-
yuguo authored
-