[Docs] Update `CacheConfig` block_size docstring to remove inaccurate limit when using CUDA (#35632)

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Unverified commit e2b31243, authored Mar 04, 2026 by Seiji Eicher; committed by GitHub Mar 05, 2026

Showing with 1 addition and 2 deletions

vllm/config/cache.py +1 -2

--- a/vllm/config/cache.py
+++ b/vllm/config/cache.py
@@ -40,8 +40,7 @@ class CacheConfig:
    """Configuration for the KV cache."""

    block_size: SkipValidation[BlockSize] = None  # type: ignore[assignment]
-    """Size of a contiguous cache block in number of tokens. On CUDA devices,
-    only block sizes up to 32 are supported.
+    """Size of a contiguous cache block in number of tokens.

    This config has no static default. If left unspecified by the user, it will
    be set in `Platform.check_and_update_config()` based on the current