OpenDAS / vllm_cscc
Commit 9c34e9d2 (Unverified)
Authored Mar 11, 2026 by Michael Goin
Committed by GitHub, Mar 11, 2026
Disable cascade attention by default (#36318)
Parent: 09b6f998
Showing 1 changed file with 5 additions and 4 deletions
vllm/config/model.py (+5 -4)
@@ -217,12 +217,13 @@ class ModelConfig:
     """Whether to disable sliding window. If True, we will disable the sliding
     window functionality of the model, capping to sliding window size. If the
     model does not support sliding window, this argument is ignored."""
-    disable_cascade_attn: bool = False
+    disable_cascade_attn: bool = True
     """Disable cascade attention for V1. While cascade attention does not
     change the mathematical correctness, disabling it could be useful for
-    preventing potential numerical issues. Note that even if this is set to
-    False, cascade attention will be only used when the heuristic tells that
-    it's beneficial."""
+    preventing potential numerical issues. This defaults to True, so users
+    must opt in to cascade attention by setting this to False. Even when this
+    is set to False, cascade attention will only be used when the heuristic
+    tells that it's beneficial."""
     skip_tokenizer_init: bool = False
     """Skip initialization of tokenizer and detokenizer. Expects valid
     `prompt_token_ids` and `None` for prompt from the input. The generated
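The behavior the updated docstring describes can be sketched in a few lines. This is a minimal illustration, not vLLM's code: `ModelConfigSketch` and `may_use_cascade_attn` are hypothetical names invented here, and the `heuristic_says_beneficial` argument stands in for the heuristic the docstring mentions, whose logic is not shown in this commit. Only the field name `disable_cascade_attn` and its new default come from the diff.

```python
from dataclasses import dataclass


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in holding only the field this commit touches."""

    # New default after this commit: cascade attention is off unless opted in.
    disable_cascade_attn: bool = True


def may_use_cascade_attn(
    config: ModelConfigSketch, heuristic_says_beneficial: bool
) -> bool:
    """Illustrative gate (an assumption, not vLLM's implementation):
    the flag must be False AND the heuristic must agree."""
    return (not config.disable_cascade_attn) and heuristic_says_beneficial


default_cfg = ModelConfigSketch()
opted_in_cfg = ModelConfigSketch(disable_cascade_attn=False)

print(may_use_cascade_attn(default_cfg, True))    # False: disabled by default
print(may_use_cascade_attn(opted_in_cfg, True))   # True: opted in, heuristic agrees
print(may_use_cascade_attn(opted_in_cfg, False))  # False: heuristic declines
```

The practical consequence for users is that code relying on the old default now has to set `disable_cascade_attn` to `False` explicitly to keep cascade attention eligible.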