Unverified commit 36239f79, authored by Michael Goin, committed by GitHub

Fix FA2 fallback for Blackwell V1 (#19781)


Signed-off-by: mgoin <mgoin64@gmail.com>
parent dfada85e
@@ -255,7 +255,7 @@ class CudaPlatformBase(Platform):
                         "install FlashInfer for better performance.")
                     pass
             # FlashAttention is the default for SM 8.0+ GPUs
-            elif cls.has_device_capability(80):
+            if cls.has_device_capability(80):
                 logger.info_once("Using Flash Attention backend on V1 engine.")
                 return ("vllm.v1.attention.backends."
                         "flash_attn.FlashAttentionBackend")
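The effect of changing `elif` to `if` can be illustrated with a small standalone sketch. This is not the vLLM code: the function names, the integer capability argument, and the `flashinfer_installed` flag are hypothetical stand-ins. The branch that precedes the hunk is not shown in the diff, so the sketch assumes (based on the commit title and the visible log message) that it is a Blackwell (SM 10.0) check that returns FlashInfer when it can be imported and otherwise does nothing.

```python
from typing import Optional


def select_backend_before(capability: int,
                          flashinfer_installed: bool) -> Optional[str]:
    """Pre-fix flow (assumed): the SM 8.0+ check is an `elif` of the Blackwell check."""
    if capability == 100:  # assumed Blackwell branch, not shown in the diff
        if flashinfer_installed:
            return "FlashInferBackend"
        # FlashInfer missing: nothing is returned here, and the `elif`
        # below is skipped because the `if` above already matched.
    elif capability >= 80:
        return "FlashAttentionBackend"
    return None  # stand-in for whatever the real method does next


def select_backend_after(capability: int,
                         flashinfer_installed: bool) -> Optional[str]:
    """Post-fix flow: the SM 8.0+ check is an independent `if`."""
    if capability == 100:  # assumed Blackwell branch, not shown in the diff
        if flashinfer_installed:
            return "FlashInferBackend"
    if capability >= 80:
        # Blackwell without FlashInfer now reaches this branch.
        return "FlashAttentionBackend"
    return None


if __name__ == "__main__":
    for cap, has_fi in [(100, True), (100, False), (90, False), (75, False)]:
        print(cap, has_fi,
              select_backend_before(cap, has_fi),
              select_backend_after(cap, has_fi))
```

Under these assumptions, the two versions differ only for a capability of 100 with FlashInfer unavailable, which is the case the commit title describes.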