- 15 Dec, 2025 1 commit
-
-
Yashaswi Karnati authored
* fix ce loss with ignore idx Signed-off-by:
ykarnati <ykarnati@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by:
ykarnati <ykarnati@nvidia.com> * remove fix comments Signed-off-by:
ykarnati <ykarnati@nvidia.com> * fallback divisor to 1 Signed-off-by:
ykarnati <ykarnati@nvidia.com> * have arg for n_rows and n_non_ignore Signed-off-by:
ykarnati <ykarnati@nvidia.com> * fuse n_non_ignore to softmax kernel Signed-off-by:
ykarnati <ykarnati@nvidia.com> * fix incorrect arg Signed-off-by:
ykarnati <ykarnati@nvidia.com> --------- Signed-off-by:
ykarnati <ykarnati@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-
- 10 Nov, 2025 1 commit
-
-
Teddy Do authored
* move triton to common and change paths Signed-off-by:
tdophung <tdophung@nvidia.com> * Formatting Signed-off-by:
tdophung <tdophung@nvidia.com> --------- Signed-off-by:
tdophung <tdophung@nvidia.com>
-
- 04 Sep, 2025 1 commit
-
-
Casper authored
* fix cross entropy Signed-off-by:
Casper <casperbh.96@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by:
Casper <casperbh.96@gmail.com> * fix comments Signed-off-by:
Casper <casperbh.96@gmail.com> * fix: few more style issues Signed-off-by:
Casper <casperbh.96@gmail.com> * fix: remove grad_output_stride (unnecessary) Signed-off-by:
Casper <casperbh.96@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: only backward was broken Signed-off-by:
Casper <casperbh.96@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Generalize cross entropy backward kernel to handle reduced and unreduced loss Signed-off-by:
Tim Moon <tmoon@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by:
Casper <casperbh.96@gmail.com> Signed-off-by:
Tim Moon <tmoon@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> Co-authored-by:
Tim Moon <tmoon@nvidia.com>
-
- 13 Aug, 2025 1 commit
-
-
Kate Cheng authored
* Remove if-else and torch.tensor to meet cudagraph requirement Signed-off-by:
Kate Cheng <yunhsuanc@nvidia.com> * Add is_cg_capturable flag to guard the if-else statement Signed-off-by:
Kate Cheng <yunhsuanc@nvidia.com> --------- Signed-off-by:
Kate Cheng <yunhsuanc@nvidia.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 16 Jun, 2025 1 commit
-
-
Li Tao authored
* Fix an issue when mcore uses te fusion ce implementation Signed-off-by:
lit <lit@nvidia.com> * simplify unit test code Signed-off-by:
lit <lit@nvidia.com> * Update tests/pytorch/test_parallel_cross_entropy.py Signed-off-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> --------- Signed-off-by:
lit <lit@nvidia.com> Signed-off-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com> Co-authored-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com>
-
- 16 May, 2025 1 commit
-
-
Selvaraj Anandaraj authored
* Added token ignoring for CE loss Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Added tests Signed-off-by:
root <root@cw-dfw-h100-004-210-013.cm.cluster> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> Co-authored-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 26 Feb, 2025 1 commit
-
-
Selvaraj Anandaraj authored
* Added parallel cross entropy loss implementation using online softmax Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Added tests Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Added reshape of loss output Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Added to test list Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * Added Triton dependency Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * Added copyright Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * Fixed lint errors Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update setup.py Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Selvaraj Anandaraj <anandaraj@wisc.edu> * Fixed lint and triton failure Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * Removed flattening for scalars Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * Skip tests on Blackwell due to TE CI caveat Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Added reason arg Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Do not register Triton dependency with setuptools Signed-off-by:
Tim Moon <tmoon@nvidia.com> --------- Signed-off-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> Signed-off-by:
Selvaraj Anandaraj <anandaraj@wisc.edu> Signed-off-by:
Tim Moon <tmoon@nvidia.com> Co-authored-by:
Selvaraj Anandaraj <selvaraja@cw-dfw-cs-001-login-01.cm.cluster> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Co-authored-by:
Tim Moon <tmoon@nvidia.com> Co-authored-by:
Tim Moon <4406448+timmoon10@users.noreply.github.com>
-