Update max_seq_len_to_capture to Optional[int]

d560429c · zhuwenwen · 988fc31c · d560429c
Commit d560429c authored Jul 22, 2025 by zhuwenwen
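
The annotation change can be illustrated with a minimal sketch. `SketchModelConfig` and `resolve_capture_len` are hypothetical names, not vLLM's API; the 8192 fallback is taken from the trailing comment in the diff, and the cap at `max_model_len` is an assumption made for illustration. The point is only that a field defaulting to `None` should be typed `Optional[int]`, so that type checkers and readers see that "unset" is a valid state that must be resolved before use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchModelConfig:
    """Hypothetical stand-in for ModelConfig; not vLLM's actual class."""

    max_model_len: int
    # None means "not set by the user"; Optional[int] makes that explicit.
    # Annotating this as plain `int` with a None default would be flagged
    # by a static type checker.
    max_seq_len_to_capture: Optional[int] = None


def resolve_capture_len(cfg: SketchModelConfig, default: int = 8192) -> int:
    """Assumed resolution: fall back to `default` when unset, then cap the
    result at the model's maximum length."""
    if cfg.max_seq_len_to_capture is None:
        requested = default
    else:
        requested = cfg.max_seq_len_to_capture
    return min(requested, cfg.max_model_len)
```

With this shape, callers never compare `None` against an integer: the unset case is handled once, in one place.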

Showing with 1 addition and 1 deletion

vllm/config.py +1 -1

--- a/vllm/config.py
+++ b/vllm/config.py
@@ -313,7 +313,7 @@ class ModelConfig:
    graph and always execute the model in eager mode. If False, we will use
    CUDA graph and eager execution in hybrid for maximal performance and
    flexibility."""
-    max_seq_len_to_capture: int = None # 8192
+    max_seq_len_to_capture: Optional[int] = None # 8192
    """Maximum sequence len covered by CUDA graphs. When a sequence has context
    length larger than this, we fall back to eager mode. Additionally for
    encoder-decoder models, if the sequence length of the encoder input is