mxfp8 (for all gemm layouts) is not supported on 120+ arch yet (#1939)

* mxfp8 is not supported on 120+ arch yet

Signed-off-by: Sudhakar Singh <sudhakars@nvidia.com>

* change the default recipe for arch 120

Signed-off-by: Sudhakar Singh <sudhakars@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Sudhakar Singh <sudhakars@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

Commit fa91ed72, authored Jul 16, 2025 by Sudhakar Singh; committed by GitHub Jul 16, 2025

Showing with 7 additions and 1 deletion

transformer_engine/pytorch/fp8.py +7 -1

--- a/transformer_engine/pytorch/fp8.py
+++ b/transformer_engine/pytorch/fp8.py
@@ -46,6 +46,8 @@ def check_fp8_support() -> Tuple[bool, str]:
 def check_mxfp8_support() -> Tuple[bool, str]:
    """Return if fp8 support is available"""
+    if get_device_compute_capability() >= (12, 0):
+        return False, "MXFP8 (for all gemm layouts) is not supported on 12.0+ architectures yet."
    if get_device_compute_capability() >= (10, 0):  # blackwell and above
        return True, ""
    return False, "Device compute capability 10.0 or higher required for MXFP8 execution."
@@ -64,7 +66,11 @@ def check_fp8_block_scaling_support() -> Tuple[bool, str]:
 def get_default_fp8_recipe() -> Recipe:
    """FP8 recipe with default args."""
-    if get_device_compute_capability() >= (10, 0):  # blackwell and above
+    if check_mxfp8_support()[0]:
+        # This is a temporary restriction until MXFP8 is supported for all
+        # gemm layouts.
+        if get_device_compute_capability() >= (12, 0):
+            return Float8BlockScaling()
        return MXFP8BlockScaling()
    return DelayedScaling()
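
The control flow in the two patched functions can be traced without a GPU. The following is a minimal sketch, not the library's code: it copies the branch structure from the diff above, but passes the compute capability in as an explicit argument (the `cc` parameter and the `default_recipe_name` helper are hypothetical, introduced here for illustration) and returns recipe names as strings instead of constructing `Recipe` objects.

```python
from typing import Tuple

Capability = Tuple[int, int]


def check_mxfp8_support(cc: Capability) -> Tuple[bool, str]:
    """Same branches as the patched check, with the capability passed in (assumption)."""
    if cc >= (12, 0):
        return False, "MXFP8 (for all gemm layouts) is not supported on 12.0+ architectures yet."
    if cc >= (10, 0):  # blackwell and above
        return True, ""
    return False, "Device compute capability 10.0 or higher required for MXFP8 execution."


def default_recipe_name(cc: Capability) -> str:
    """Hypothetical stand-in for get_default_fp8_recipe(); returns the recipe class name."""
    if check_mxfp8_support(cc)[0]:
        # Nested exactly as in the diff: this inner test only runs when the
        # MXFP8 check above has already succeeded.
        if cc >= (12, 0):
            return "Float8BlockScaling"
        return "MXFP8BlockScaling"
    return "DelayedScaling"


if __name__ == "__main__":
    for cc in [(9, 0), (10, 0), (12, 0)]:
        supported, reason = check_mxfp8_support(cc)
        print(cc, supported, default_recipe_name(cc), reason)
```

Read this way, a 12.0 device fails `check_mxfp8_support` first, so the nested `>= (12, 0)` test is never reached and the sketch falls through to `DelayedScaling` rather than `Float8BlockScaling`. Whether that matches the intent stated in the commit message ("change the default recipe for arch 120") is worth checking against the actual library.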