OpenDAS / vllm_cscc
Commit a9c53382
Authored Jan 06, 2025 by zhuwenwen
Remove the _ROCM_SWA_REASON warning for the Qwen2, Mistral, and Mixtral models
Parent: 220e6456
Showing 1 changed file with 6 additions and 6 deletions
vllm/model_executor/models/registry.py (+6 -6)
@@ -197,12 +197,12 @@ _ROCM_SWA_REASON = ("Sliding window attention (SWA) is not yet supported in "
                     "please use CK flash attention by setting "
                     "`VLLM_USE_TRITON_FLASH_ATTN=0`")
 _ROCM_PARTIALLY_SUPPORTED_MODELS: Dict[str, str] = {
-    "Qwen2ForCausalLM":
-    _ROCM_SWA_REASON,
-    "MistralForCausalLM":
-    _ROCM_SWA_REASON,
-    "MixtralForCausalLM":
-    _ROCM_SWA_REASON,
+    # "Qwen2ForCausalLM":
+    # _ROCM_SWA_REASON,
+    # "MistralForCausalLM":
+    # _ROCM_SWA_REASON,
+    # "MixtralForCausalLM":
+    # _ROCM_SWA_REASON,
     "PaliGemmaForConditionalGeneration":
     ("ROCm flash attention does not yet "
      "fully support 32-bit precision on PaliGemma"),
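The effect of commenting out these entries can be illustrated with a minimal sketch. It assumes the dictionary is consulted by architecture name and that a warning is logged only when an entry exists; the lookup code is not part of this diff, so `partial_support_reason` and `warn_if_partially_supported` are hypothetical helpers, and the `_ROCM_SWA_REASON` text is a placeholder because the full string is not shown in the hunk.

```python
import logging
from typing import Dict, Optional

logger = logging.getLogger("rocm_registry_sketch")

# Placeholder: the complete reason string is not visible in this diff.
_ROCM_SWA_REASON = "placeholder for the SWA reason text"

# State of the mapping after this commit: the three SWA entries are
# commented out, so only the PaliGemma entry remains.
_ROCM_PARTIALLY_SUPPORTED_MODELS: Dict[str, str] = {
    # "Qwen2ForCausalLM":
    # _ROCM_SWA_REASON,
    # "MistralForCausalLM":
    # _ROCM_SWA_REASON,
    # "MixtralForCausalLM":
    # _ROCM_SWA_REASON,
    "PaliGemmaForConditionalGeneration":
    ("ROCm flash attention does not yet "
     "fully support 32-bit precision on PaliGemma"),
}


def partial_support_reason(model_arch: str) -> Optional[str]:
    """Hypothetical helper: return the reason string, or None if absent."""
    return _ROCM_PARTIALLY_SUPPORTED_MODELS.get(model_arch)


def warn_if_partially_supported(model_arch: str) -> bool:
    """Hypothetical helper: log a warning and return True if an entry exists."""
    reason = partial_support_reason(model_arch)
    if reason is None:
        return False
    logger.warning("Model architecture %s is partially supported by ROCm: %s",
                   model_arch, reason)
    return True
```

Under these assumptions, looking up `Qwen2ForCausalLM`, `MistralForCausalLM`, or `MixtralForCausalLM` no longer produces a warning, while `PaliGemmaForConditionalGeneration` still does.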