[Kernel][Bugfix] Fixup some warnings in nvfp4_blockwise_moe when CUDA < 12.8 (#20324)

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

3be8d312 · Tyler Michael Smith · GitHub · 3abfe221 · 3be8d312
Commit 3be8d312, authored Jul 01, 2025 by Tyler Michael Smith, committed by GitHub Jul 01, 2025

Showing 1 changed file with 2 additions and 0 deletions

csrc/quantization/fp4/nvfp4_blockwise_moe_kernel.cu +2 -0

--- a/csrc/quantization/fp4/nvfp4_blockwise_moe_kernel.cu
+++ b/csrc/quantization/fp4/nvfp4_blockwise_moe_kernel.cu
@@ -335,8 +335,10 @@ void run_fp4_blockwise_scaled_group_mm(
  TORCH_CHECK(status == cutlass::Status::kSuccess, "Failed to run GEMM");
 }
+#if defined ENABLE_NVFP4 && ENABLE_NVFP4
 constexpr auto FLOAT4_E2M1X2 = at::ScalarType::Byte;
 constexpr auto SF_DTYPE = at::ScalarType::Float8_e4m3fn;
+#endif
 #define CHECK_TYPE(x, st, m) \
  TORCH_CHECK(x.scalar_type() == st, ": Inconsistency of Tensor type:", m)
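The guard added above can be sketched in isolation. This is a minimal illustration, not the vLLM code: `ENABLE_FEATURE`, `kPackedDtypeTag`, and both functions are hypothetical stand-ins for `ENABLE_NVFP4` and the dtype constants. It assumes the warnings in question are of the unused-variable kind, which some compilers report when a file-scope constant is referenced only from code that is itself compiled out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature macro standing in for ENABLE_NVFP4. In a real build it
// would be supplied by the build system (for example -DENABLE_FEATURE=1).
#ifndef ENABLE_FEATURE
#define ENABLE_FEATURE 0
#endif

// Same guard form as the diff: the macro must be defined AND non-zero.
#if defined ENABLE_FEATURE && ENABLE_FEATURE
// Referenced only by the feature-gated branch below, so it is declared under
// the same guard and never exists as an unused constant.
constexpr std::uint8_t kPackedDtypeTag = 1;
#endif

// Returns the tag when the feature is compiled in, and -1 otherwise.
int packed_dtype_tag_or_default() {
#if defined ENABLE_FEATURE && ENABLE_FEATURE
  return kPackedDtypeTag;
#else
  return -1;
#endif
}

// Self-check: the returned value must agree with the compile-time setting.
void check_guard_consistency() {
  assert(packed_dtype_tag_or_default() == (ENABLE_FEATURE ? 1 : -1));
}
```

Compiling with `-DENABLE_FEATURE=1` switches both the constant and its only user on together, which is the property the `#if` / `#endif` pair in the diff restores.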