- 05 Dec, 2025 1 commit
-
-
Max Hu authored
Signed-off-by:
Max Hu <hyoung2991@gmail.com> Signed-off-by:
Max Hu <maxhu@nvidia.com> Co-authored-by:
Max Hu <maxhu@nvidia.com>
-
- 03 Dec, 2025 1 commit
-
-
Yong Hoon Shin authored
Signed-off-by:Yong Hoon Shin <yhshin@meta.com>
-
- 01 Dec, 2025 1 commit
-
-
shivampr authored
Introduces three new Prometheus histograms for fine-grained observability of KV cache residency behavior: vllm:kv_block_lifetime_seconds — total lifetime from allocation to free vllm:kv_block_idle_before_evict_seconds — idle duration before eviction vllm:kv_block_reuse_gap_seconds — time between consecutive reuses of the same block These metrics help operators analyze KV cache efficiency, reuse patterns, and eviction timing beyond simple utilization rates. Implementation uses monotonic timestamps for accuracy, 1% sampling for minimal overhead (~48 bytes/block), and is fully thread-safe with zero runtime cost when disabled. Two new runtime flags are introduced: --kv-cache-metrics – enable KV cache residency metrics --kv-cache-metrics-sample – control sampling ratio (default: 0.01) Signed-off-by:Shivam <shivamprasad91@gmail.com>
-
- 26 Nov, 2025 1 commit
-
-
George D. Torres authored
Signed-off-by:
George D. Torres <gdavtor@gmail.com> Signed-off-by:
George D. Torres <41129492+geodavic@users.noreply.github.com> Signed-off-by:
Russell Bryant <rbryant@redhat.com> Co-authored-by:
Russell Bryant <rbryant@redhat.com>
-
- 15 Oct, 2025 1 commit
-
-
Sam/Samuel authored
Signed-off-by:
Samuel Wu <cernunnos1710@gmail.com> Signed-off-by:
Sam/Samuel <57896620+cern1710@users.noreply.github.com> Signed-off-by:
Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by:
Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 12 Oct, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 05 Oct, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 24 Sep, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-