OpenDAS / vllm_cscc
Commit d560429c, authored Jul 22, 2025 by zhuwenwen
update max_seq_len_to_capture to option int
parent 988fc31c
Showing 1 changed file with 1 addition and 1 deletion

vllm/config.py (+1, -1)
@@ -313,7 +313,7 @@ class ModelConfig:
     graph and always execute the model in eager mode. If False, we will use
     CUDA graph and eager execution in hybrid for maximal performance and
     flexibility."""
-    max_seq_len_to_capture: int = None  # 8192
+    max_seq_len_to_capture: Optional[int] = None  # 8192
     """Maximum sequence len covered by CUDA graphs. When a sequence has context
     length larger than this, we fall back to eager mode. Additionally for
     encoder-decoder models, if the sequence length of the encoder input is
 ...
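
The change only widens the annotation so that it agrees with the `None` default. The following is a minimal, standard-library-only sketch of why that matters and how a `None` value might be consumed. `OldConfig`, `NewConfig`, `accepts_none`, and `should_use_cuda_graph` are hypothetical names for illustration and are not part of vLLM. The fallback of `None` to 8192 is an assumption taken from the trailing `# 8192` comment; the diff does not show how the project actually resolves `None`.

```python
from dataclasses import dataclass
from typing import Optional, Union, get_args, get_origin


def accepts_none(annotation) -> bool:
    """Return True if the annotation explicitly allows None, e.g. Optional[int]."""
    if annotation is type(None):
        return True
    return get_origin(annotation) is Union and type(None) in get_args(annotation)


@dataclass
class OldConfig:
    # Hypothetical stand-in for the field before the commit:
    # the annotation says int, but the default is None.
    max_seq_len_to_capture: int = None  # 8192


@dataclass
class NewConfig:
    # Hypothetical stand-in for the field after the commit:
    # the annotation now admits the None default.
    max_seq_len_to_capture: Optional[int] = None  # 8192


def should_use_cuda_graph(
    seq_len: int,
    max_seq_len_to_capture: Optional[int],
    assumed_default: int = 8192,  # assumption, taken from the "# 8192" comment
) -> bool:
    """Assumed policy: None means 'use assumed_default'; longer sequences run eagerly."""
    limit = assumed_default if max_seq_len_to_capture is None else max_seq_len_to_capture
    return seq_len <= limit


if __name__ == "__main__":
    old_hint = OldConfig.__annotations__["max_seq_len_to_capture"]
    new_hint = NewConfig.__annotations__["max_seq_len_to_capture"]
    print("old annotation accepts None:", accepts_none(old_hint))
    print("new annotation accepts None:", accepts_none(new_hint))
    print("capture 4096 tokens:", should_use_cuda_graph(4096, NewConfig().max_seq_len_to_capture))
```

Plain dataclasses do not validate field types at runtime, so both classes construct without error. The mismatch in `OldConfig` would only surface under a static type checker or any tooling that validates values against annotations, which is the situation the `Optional[int]` annotation avoids.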