- 12 Aug, 2025 2 commits
-
-
wenjh authored
-
wenjh authored
Signed-off-by: wenjh <wenjh@sugon.com>
-
- 11 Aug, 2025 2 commits
-
-
wenjh authored
-
wenjh authored
Signed-off-by: wenjh <wenjh@sugon.com>
-
- 08 Aug, 2025 4 commits
- 07 Aug, 2025 2 commits
-
-
-
yuguo authored
-
- 06 Aug, 2025 6 commits
- 05 Aug, 2025 2 commits
-
-
-
yuguo authored
-
- 31 Jul, 2025 2 commits
-
-
-
yuguo authored
-
- 29 Jul, 2025 1 commit
-
-
- 25 Jul, 2025 4 commits
-
-
wenjh authored
-
wenjh authored
Signed-off-by: wenjh <wenjh@sugon.com>
-
wenjh authored
-
wenjh authored
Signed-off-by: wenjh <wenjh@sugon.com>
-
- 22 Jul, 2025 3 commits
-
-
yuguo authored
-
yuguo authored
-
yuguo authored
Merge commit '7a9a0825' of https://github.com/NVIDIA/TransformerEngine
-
- 21 Jul, 2025 1 commit
-
-
Kshitij Lakhani authored
Signed-off-by: Kshitij Janardan Lakhani <klakhani@nvidia.com>
-
- 19 Jul, 2025 1 commit
-
-
jberchtold-nvidia authored
Update tolerance of distributed layernorm MLP for FP8
Signed-off-by: Jeremy Berchtold <jberchtold@nvidia.com>
-
- 18 Jul, 2025 9 commits
-
-
Phuong Nguyen authored
* enable cudnn norm tests
* exclude tests on pre-Hopper
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
-
Phuong Nguyen authored
* set precision=HIGHEST for the ref_grouped_gemm impl in the unit test
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
-
-
yuguo authored
-
-
yuguo authored
-
wenjh authored
-
wenjh authored
Signed-off-by: wenjh <wenjh@sugon.com>
-
Charlene Yang authored
* update cudnn-frontend to 1.13.0
* disable 9.11 for a bug
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* fix selection logic
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-
- 17 Jul, 2025 1 commit
-
-
Charlene Yang authored
* optimize kv_cache reindex and copy kernels
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* avoid reindexing from python side
* rename variable from previous commit
* minor fix
* minor fix
Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-