[Common] MXFP8 kernel for grouped tensors (#2586)

* Rebased to main

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixed the year to 2026

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* Added compilation guards

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added BWD pass

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added dbias and dact tests. Refactoring.

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added grouped MXFP8 DACT and ACT API and tests

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixed a typo

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* Fixes per the review

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* More fixes from the review

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* Fixes per the review

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* Relaxed requirement for last dim from mod128 to mod32

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* Added alignment checks when tensor descriptors are modified

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: vthumbe1503 <vthumbe@nvidia.com>