Commit a9c53382 authored by zhuwenwen

Remove the _ROCM_SWA_REASON warning for Qwen2, Mixtral and Mistral models

parent 220e6456
@@ -197,12 +197,12 @@ _ROCM_SWA_REASON = ("Sliding window attention (SWA) is not yet supported in "
                     "please use CK flash attention by setting "
                     "`VLLM_USE_TRITON_FLASH_ATTN=0`")
 _ROCM_PARTIALLY_SUPPORTED_MODELS: Dict[str, str] = {
-    "Qwen2ForCausalLM":
-    _ROCM_SWA_REASON,
-    "MistralForCausalLM":
-    _ROCM_SWA_REASON,
-    "MixtralForCausalLM":
-    _ROCM_SWA_REASON,
+    # "Qwen2ForCausalLM":
+    # _ROCM_SWA_REASON,
+    # "MistralForCausalLM":
+    # _ROCM_SWA_REASON,
+    # "MixtralForCausalLM":
+    # _ROCM_SWA_REASON,
     "PaliGemmaForConditionalGeneration":
     ("ROCm flash attention does not yet "
      "fully support 32-bit precision on PaliGemma"),
...