Commit 5f7209a7 authored by Bram Wasti, committed by GitHub

[tiny] Remove unsupported TRITON_MLA backend from batch invariance (#28832)


Signed-off-by: Bram Wasti <bwasti@meta.com>
Signed-off-by: Bram Wasti <bwasti@fb.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
parent 2d4978a5
@@ -805,11 +805,11 @@ def override_envs_for_invariance():
         "FLASH_ATTN",  # best supported backend
         "FLASHINFER",
         "FLASH_ATTN_MLA",
-        "TRITON_MLA",
         # Not yet supported MLA backends
         # "FLASHMLA",
         # "FLEX_ATTENTION",  # IMA issue even if we disable batch invariance
         # "FLASHINFER_MLA",  https://github.com/vllm-project/vllm/pull/28967
+        # "TRITON_MLA",
     ]
     if curr_attn_backend not in supported_backends:
         warning = (
......
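
For reviewers, the effect of the change can be sketched as a plain allowlist check. This is a minimal, standalone illustration and not the project's actual code: `check_backend`, `SUPPORTED_BACKENDS`, and the warning text are hypothetical names chosen for the example, and the real function's handling after the warning is elided in the hunk above, so it is not reproduced here.

```python
import logging

logger = logging.getLogger("batch_invariance_sketch")

# Mirrors the uncommented entries of the list after this commit.
# TRITON_MLA is no longer present, so it now fails the membership test.
SUPPORTED_BACKENDS = [
    "FLASH_ATTN",
    "FLASHINFER",
    "FLASH_ATTN_MLA",
]


def check_backend(curr_attn_backend: str) -> bool:
    """Hypothetical helper: return True if the backend is on the allowlist.

    Logs a warning and returns False otherwise. What the real code does
    after building its warning is not shown in the diff.
    """
    if curr_attn_backend not in SUPPORTED_BACKENDS:
        logger.warning(
            "Backend %s is not on the batch-invariance allowlist %s",
            curr_attn_backend,
            SUPPORTED_BACKENDS,
        )
        return False
    return True
```

With this sketch, `check_backend("TRITON_MLA")` takes the warning path, while `check_backend("FLASH_ATTN")` passes.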