Unverified commit d00f9607, authored by Yan Ru Pei, committed by GitHub

docs: clarify the usage of LRU for mocker evictor (#6053)


Signed-off-by: PeaBrane <yanrpei@gmail.com>
parent 4715005b
@@ -139,7 +139,11 @@ The following diagram illustrates the block lifecycle, based on vLLM's block man
 ### Evictor
-The LRU evictor maintains blocks ordered by their last access time, enabling O(1) eviction of the oldest unused block. It supports both normal insertion (for completed sequences) and front-insertion (for preempted sequences that should be evicted first if memory pressure continues).
+The LRU evictor maintains blocks ordered by a monotonic counter, enabling O(log n) eviction of the lowest-priority block. Each `insert` assigns the next counter value, so blocks inserted later have higher counters and survive longer.
+This produces a **depth-aware eviction policy**: when a sequence completes, `free_signal` releases its blocks in reverse order (tail first). Deeper suffix blocks therefore receive lower counters and are evicted before shallower prefix blocks. This keeps shared prefixes cached longer, improving cache hit rates across requests with common prefixes.
+The evictor also supports front-insertion (negative counters) for marking blocks for immediate eviction, though this is not currently used in the scheduler.
 ### Sequence Tracking
......
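The counter-based ordering described in the updated Evictor text can be illustrated with a small sketch. This is not the mocker's actual implementation: the class name `CounterEvictor`, the methods `push_front`, `remove`, and `evict`, and the helper `free_sequence` are hypothetical names chosen for illustration, and a binary heap with lazy deletion is assumed here as one way to get O(log n) (amortized) eviction of the lowest counter.

```python
import heapq
from typing import Dict, Hashable, List, Optional, Sequence, Tuple


class CounterEvictor:
    """Illustrative evictor: the block with the lowest counter is evicted first."""

    def __init__(self) -> None:
        self._next = 0    # next counter for normal insertion (grows upward)
        self._front = -1  # next counter for front-insertion (grows downward)
        self._priority: Dict[Hashable, int] = {}       # block -> current counter
        self._heap: List[Tuple[int, Hashable]] = []    # may hold stale entries

    def insert(self, block: Hashable) -> None:
        # Later insertions get higher counters and therefore survive longer.
        self._priority[block] = self._next
        heapq.heappush(self._heap, (self._next, block))
        self._next += 1

    def push_front(self, block: Hashable) -> None:
        # Negative counters sort below every normally inserted block.
        self._priority[block] = self._front
        heapq.heappush(self._heap, (self._front, block))
        self._front -= 1

    def remove(self, block: Hashable) -> bool:
        # Lazy deletion: the stale heap entry is skipped later by evict().
        return self._priority.pop(block, None) is not None

    def evict(self) -> Optional[Hashable]:
        while self._heap:
            counter, block = heapq.heappop(self._heap)
            if self._priority.get(block) == counter:  # skip stale entries
                del self._priority[block]
                return block
        return None


def free_sequence(evictor: CounterEvictor, blocks: Sequence[Hashable]) -> None:
    # Release tail blocks first so that prefix blocks receive higher counters.
    for block in reversed(blocks):
        evictor.insert(block)


evictor = CounterEvictor()
free_sequence(evictor, ["b0", "b1", "b2", "b3"])  # b0 is the shallowest prefix block
order = [evictor.evict() for _ in range(4)]
print(order)  # ['b3', 'b2', 'b1', 'b0']
```

In this sketch the deepest block `b3` is evicted first and the shared-prefix block `b0` last, which matches the depth-aware behavior the commit describes. A block passed to `push_front` would be evicted before any normally inserted block.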