Unverified Commit 93548eb3 authored by Varun Sundar Rabindranath's avatar Varun Sundar Rabindranath Committed by GitHub
Browse files

[Kernel] Enable FP8 Cutlass for Ada Lovelace (#6950)


Co-authored-by: default avatarVarun Sundar Rabindranath <varun@neuralmagic.com>
parent 460c1884
...@@ -38,13 +38,7 @@ bool cutlass_scaled_mm_supports_fp8(int64_t cuda_device_capability) { ...@@ -38,13 +38,7 @@ bool cutlass_scaled_mm_supports_fp8(int64_t cuda_device_capability) {
if (cuda_device_capability >= 90) { if (cuda_device_capability >= 90) {
return CUDA_VERSION >= 12000; return CUDA_VERSION >= 12000;
} else if (cuda_device_capability >= 89) { } else if (cuda_device_capability >= 89) {
// CUTLASS Kernels have not been tuned for Ada Lovelace systems return CUDA_VERSION >= 12040;
// and are slower than torch.mm. Return false unconditionally in this case.
return false;
// Once the CUTLASS kernels have been optimized for Lovelace systems,
// use the following check:
// return CUDA_VERSION >= 12040;
} }
#endif #endif
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment