Add support for head_dim > 128 (#1797)
* add support for head dim > 128 Signed-off-by:Charlene Yang <charleney@nvidia.com> * remove debugging Signed-off-by:
Charlene Yang <charleney@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * raise tols slightly to tolerate 1/2048 mismatches Signed-off-by:
Charlene Yang <charleney@nvidia.com> * fix is_training for test_te_layer Signed-off-by:
Charlene Yang <charleney@nvidia.com> * add bprop support for blackwell Signed-off-by:
Charlene Yang <charleney@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * minor tweak for format Signed-off-by:
Charlene Yang <charleney@nvidia.com> * fix backend selection results Signed-off-by:
Charlene Yang <charleney@nvidia.com> * bump sm100 to sm100+ Signed-off-by:
Charlene Yang <charleney@nvidia.com> * add sq=1 test for MLA Signed-off-by:
Charlene Yang <charleney@nvidia.com> * enable sq=1 for bprop Signed-off-by:
Charlene Yang <charleney@nvidia.com> * minor tweak in comments Signed-off-by:
Charlene Yang <charleney@nvidia.com> * fix head_dim logic and remove pytest skip Signed-off-by:
Charlene Yang <charleney@nvidia.com> * add FE fix for d>128 Signed-off-by:
Charlene Yang <charleney@nvidia.com> * update FE again to take in small fixes Signed-off-by:
Charlene Yang <charleney@nvidia.com> * add cuDNN version info in L0 tests Signed-off-by:
Charlene Yang <charleney@nvidia.com> * increase tols for Unfused + large dim Signed-off-by:
Charlene Yang <charleney@nvidia.com> * Revert "add cuDNN version info in L0 tests" This reverts commit 3e1b426ca5319a2c0540b9e73bba7047d0e583e5. Signed-off-by:
Charlene Yang <charleney@nvidia.com> * fix tols for Unfused Signed-off-by:
Charlene Yang <charleney@nvidia.com> --------- Signed-off-by:
Charlene Yang <charleney@nvidia.com> Signed-off-by:
Charlene Yang <8636796+cyanguwa@users.noreply.github.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Showing
Please register or sign in to comment