Commit e2b31243 authored by Seiji Eicher, committed by GitHub


[Docs] Update `CacheConfig` block_size docstring to remove inaccurate limit when using CUDA (#35632)
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
parent c3598d02
@@ -40,8 +40,7 @@ class CacheConfig:
     """Configuration for the KV cache."""
 
     block_size: SkipValidation[BlockSize] = None # type: ignore[assignment]
-    """Size of a contiguous cache block in number of tokens. On CUDA devices,
-    only block sizes up to 32 are supported.
+    """Size of a contiguous cache block in number of tokens.
 
     This config has no static default. If left unspecified by the user, it will
     be set in `Platform.check_and_update_config()` based on the current
......
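
The docstring above describes a field with no static default that a platform hook fills in later. A minimal sketch of that pattern follows. `SketchCacheConfig`, `SketchPlatform`, and the placeholder value 16 are hypothetical names and values chosen for illustration; they are not the project's actual classes or defaults, and only the method name `check_and_update_config` is taken from the docstring.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchCacheConfig:
    """Hypothetical stand-in for a cache config (not the real class)."""

    # No static default: None means "the user did not specify a value".
    block_size: Optional[int] = None


class SketchPlatform:
    """Hypothetical platform hook; 16 is a placeholder, not a real default."""

    default_block_size = 16

    @classmethod
    def check_and_update_config(cls, cache_config: SketchCacheConfig) -> None:
        # Fill in the platform-specific value only when none was given,
        # so an explicit user choice is never overwritten.
        if cache_config.block_size is None:
            cache_config.block_size = cls.default_block_size


user_set = SketchCacheConfig(block_size=64)  # explicit user value
unset = SketchCacheConfig()                  # left unspecified

SketchPlatform.check_and_update_config(user_set)
SketchPlatform.check_and_update_config(unset)
```

In this sketch the explicit value survives the hook and the unspecified one receives the platform's placeholder, which is the behavior the revised docstring describes without tying it to a fixed upper limit.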