Commit 5c796324, authored by Mengqing Cao, committed by GitHub

[attn][tiny fix] fix attn backend in MultiHeadAttention (#11463)


Signed-off-by: Mengqing Cao <cmq0113@163.com>
parent 461cde20
@@ -191,6 +191,7 @@ class MultiHeadAttention(nn.Module):
                                        kv_cache_dtype=None,
                                        block_size=16,
                                        is_attention_free=False)
        attn_backend = backend_name_to_enum(attn_backend.get_name())
        if attn_backend in {_Backend.FLASH_ATTN, _Backend.FLASH_ATTN_VLLM_V1}:
            attn_backend = _Backend.XFORMERS
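The hunk above compares `attn_backend` against `_Backend` enum members only after passing its name through `backend_name_to_enum`. A minimal sketch of why that conversion matters, assuming the backend selector returns a backend class rather than an enum member (an assumption; the diff does not show the call). `Backend`, `FlashAttentionBackend`, `name_to_enum`, and `resolve_mha_backend` are hypothetical stand-ins for illustration, not vLLM's actual definitions:

```python
from enum import Enum, auto
from typing import Optional


class Backend(Enum):
    """Hypothetical stand-in for the _Backend enum in the diff."""
    FLASH_ATTN = auto()
    FLASH_ATTN_VLLM_V1 = auto()
    XFORMERS = auto()
    TORCH_SDPA = auto()


class FlashAttentionBackend:
    """Hypothetical backend class, as a selector might return it."""

    @staticmethod
    def get_name() -> str:
        return "FLASH_ATTN"


def name_to_enum(name: str) -> Optional[Backend]:
    """Hypothetical counterpart of backend_name_to_enum: name -> member."""
    return Backend.__members__.get(name)


def resolve_mha_backend(backend_cls) -> Optional[Backend]:
    # Normalize the class to an enum member before any membership test.
    backend = name_to_enum(backend_cls.get_name())
    # Mirror the diff: fall back from the flash-attention variants.
    if backend in {Backend.FLASH_ATTN, Backend.FLASH_ATTN_VLLM_V1}:
        backend = Backend.XFORMERS
    return backend


flash_set = {Backend.FLASH_ATTN, Backend.FLASH_ATTN_VLLM_V1}
# A class object never equals an enum member, so this test is False.
class_matches = FlashAttentionBackend in flash_set
resolved = resolve_mha_backend(FlashAttentionBackend)
```

In this sketch, testing the class directly against the enum set silently fails, so the fallback branch would never run; converting through the name first makes the comparison meaningful.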