Commit 9c34e9d2 authored by Michael Goin, committed by GitHub

Disable cascade attention by default (#36318)

parent 09b6f998
...
@@ -217,12 +217,13 @@ class ModelConfig:
     """Whether to disable sliding window. If True, we will disable the sliding
     window functionality of the model, capping to sliding window size. If the
     model does not support sliding window, this argument is ignored."""
-    disable_cascade_attn: bool = False
+    disable_cascade_attn: bool = True
     """Disable cascade attention for V1. While cascade attention does not
     change the mathematical correctness, disabling it could be useful for
-    preventing potential numerical issues. Note that even if this is set to
-    False, cascade attention will be only used when the heuristic tells that
-    it's beneficial."""
+    preventing potential numerical issues. This defaults to True, so users
+    must opt in to cascade attention by setting this to False. Even when this
+    is set to False, cascade attention will only be used when the heuristic
+    tells that it's beneficial."""
     skip_tokenizer_init: bool = False
     """Skip initialization of tokenizer and detokenizer. Expects valid
     `prompt_token_ids` and `None` for prompt from the input. The generated
...
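
The new docstring describes a two-stage gate: the flag must be explicitly set to False, and the heuristic must also judge cascade attention beneficial. The following is a minimal sketch of that logic only, not the project's implementation. `ModelConfigSketch`, `may_use_cascade`, and `heuristic_says_beneficial` are hypothetical names introduced for illustration; only the field name `disable_cascade_attn` and its new default of True come from the diff.

```python
from dataclasses import dataclass


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in for the real ModelConfig (illustration only)."""

    # Mirrors the new default from the diff: cascade attention is disabled
    # unless the user opts in by setting this to False.
    disable_cascade_attn: bool = True


def may_use_cascade(config: ModelConfigSketch, heuristic_says_beneficial: bool) -> bool:
    """Return True only if the user opted in AND the heuristic approves.

    `heuristic_says_beneficial` is an assumed placeholder for whatever
    runtime heuristic the engine applies; its actual inputs are not shown
    in this commit.
    """
    return (not config.disable_cascade_attn) and heuristic_says_beneficial


default_cfg = ModelConfigSketch()                           # new default: disabled
opted_in_cfg = ModelConfigSketch(disable_cascade_attn=False)  # explicit opt-in

default_result = may_use_cascade(default_cfg, heuristic_says_beneficial=True)
opted_in_result = may_use_cascade(opted_in_cfg, heuristic_says_beneficial=True)
opted_in_no_benefit = may_use_cascade(opted_in_cfg, heuristic_says_beneficial=False)
```

Under these assumptions, only `opted_in_result` is True: the default configuration never uses cascade attention, and opting in still leaves the final decision to the heuristic.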