Fix test_max_model_len in tests/entrypoints/llm/test_generate.py (#19451)

Signed-off-by: Lu Fang <lufang@fb.com>

Commit 2b1e2111 authored Jun 11, 2025 by Lu Fang, committed by GitHub Jun 11, 2025

Showing with 4 additions and 1 deletion

tests/entrypoints/llm/test_generate.py +4 -1

--- a/tests/entrypoints/llm/test_generate.py
+++ b/tests/entrypoints/llm/test_generate.py
@@ -125,4 +125,7 @@ def test_max_model_len():
    for output in outputs:
        num_total_tokens = len(output.prompt_token_ids) + len(
            output.outputs[0].token_ids)
-        assert num_total_tokens == max_model_len
+        # Total tokens must not exceed max_model_len.
+        # It can be less if generation finishes due to other reasons (e.g., EOS)
+        # before reaching the absolute model length limit.
+        assert num_total_tokens <= max_model_len
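
The reasoning in the new comment can be illustrated with a toy generation loop. This is a minimal sketch, not vLLM's implementation: `toy_generate`, the `EOS` token id, and the token values are all hypothetical, chosen only to show why `<=` holds in both cases while `==` holds only when the length cap is what stops generation.

```python
# Illustrative only: a toy generation loop, not vLLM's implementation.
EOS = 0  # hypothetical end-of-sequence token id


def toy_generate(prompt_token_ids, next_tokens, max_model_len):
    """Emit tokens from `next_tokens` until EOS or the length cap is hit."""
    output_token_ids = []
    for token in next_tokens:
        if len(prompt_token_ids) + len(output_token_ids) >= max_model_len:
            break  # stopped by the model length limit
        output_token_ids.append(token)
        if token == EOS:
            break  # stopped early for another reason
    return output_token_ids


max_model_len = 8
prompt = [5, 6, 7]

# Case 1: no EOS is produced, so generation runs into the length cap.
capped = toy_generate(prompt, [1, 2, 3, 4, 9, 9, 9], max_model_len)
# Case 2: EOS arrives before the cap, so the total stays below it.
early = toy_generate(prompt, [1, 2, EOS, 4], max_model_len)

for out in (capped, early):
    num_total_tokens = len(prompt) + len(out)
    # The relaxed assertion from the diff holds in both cases.
    assert num_total_tokens <= max_model_len
```

With the old `==` assertion, the second case would fail even though the length limit is respected, which is the failure mode this commit removes from the test.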