Commit d560429c authored by zhuwenwen

Update max_seq_len_to_capture to Optional[int]

parent 988fc31c
...
@@ -313,7 +313,7 @@ class ModelConfig:
     graph and always execute the model in eager mode. If False, we will use
     CUDA graph and eager execution in hybrid for maximal performance and
     flexibility."""
-    max_seq_len_to_capture: int = None  # 8192
+    max_seq_len_to_capture: Optional[int] = None  # 8192
     """Maximum sequence len covered by CUDA graphs. When a sequence has context
     length larger than this, we fall back to eager mode. Additionally for
     encoder-decoder models, if the sequence length of the encoder input is
......
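
Note on the change: annotating a field as `int` while defaulting it to `None` is inconsistent for type checkers, and `Optional[int]` makes the "unset" state explicit. The following is a minimal sketch of how an `Optional[int]` field like this might be resolved and used, not the project's actual implementation. `SketchConfig`, `resolved_capture_len`, `falls_back_to_eager`, and `ASSUMED_DEFAULT_CAPTURE_LEN` are hypothetical names, and treating 8192 as the fallback value is an assumption based only on the `# 8192` comment in the diff.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: 8192 is taken from the "# 8192" comment in the diff; the real
# code may resolve an unset value differently.
ASSUMED_DEFAULT_CAPTURE_LEN = 8192


@dataclass
class SketchConfig:
    """Hypothetical stand-in for ModelConfig, for illustration only."""

    # None means "not set by the user"; Optional[int] states that explicitly.
    max_seq_len_to_capture: Optional[int] = None

    def resolved_capture_len(self) -> int:
        """Return the user's value, or the assumed default when unset."""
        if self.max_seq_len_to_capture is None:
            return ASSUMED_DEFAULT_CAPTURE_LEN
        return self.max_seq_len_to_capture

    def falls_back_to_eager(self, context_len: int) -> bool:
        """True when the context length exceeds the captured maximum."""
        return context_len > self.resolved_capture_len()


if __name__ == "__main__":
    unset = SketchConfig()
    custom = SketchConfig(max_seq_len_to_capture=4096)
    print(unset.resolved_capture_len(), unset.falls_back_to_eager(9000))
    print(custom.resolved_capture_len(), custom.falls_back_to_eager(4096))
```

Checking `is None` rather than truthiness keeps an explicit value of `0` distinct from "unset", which is the main practical benefit of the `Optional[int]` annotation here.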