- 13 Feb, 2026 1 commit
-
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 10 Feb, 2026 4 commits
-
-
junuxyz authored
Signed-off-by:junuxyz <216036880+junuxyz@users.noreply.github.com>
-
Krish Gupta authored
Signed-off-by:KrxGu <krishom70@gmail.com>
-
Chen Zhang authored
Signed-off-by:Chen Zhang <zhangch99@outlook.com>
-
Roger Wang authored
Signed-off-by:Roger Wang <hey@rogerw.io>
-
- 07 Feb, 2026 1 commit
-
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 05 Feb, 2026 1 commit
-
-
Mark McLoughlin authored
Signed-off-by:Mark McLoughlin <markmc@redhat.com>
-
- 04 Feb, 2026 2 commits
-
-
Nick Hill authored
Signed-off-by:Nick Hill <nickhill123@gmail.com>
-
Or Ozeri authored
Fixes a not-yet-reported case where it was possible for blocks to be freed by an abort before an async transfer completed, resulting in corrupted KV data. Signed-off-by:Or Ozeri <oro@il.ibm.com>
-
- 03 Feb, 2026 1 commit
-
-
Nick Hill authored
Signed-off-by:Nick Hill <nickhill123@gmail.com>
-
- 02 Feb, 2026 1 commit
-
-
Yifan Qiao authored
Signed-off-by:Yifan Qiao <yifanqiao@berkeley.edu>
-
- 31 Jan, 2026 1 commit
-
-
jma99_2333 authored
Signed-off-by:
Roger Wang <hey@rogerw.io> Co-authored-by:
Roger Wang <hey@rogerw.io>
-
- 29 Jan, 2026 1 commit
-
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 28 Jan, 2026 1 commit
-
-
Wentao Ye authored
[Feature] Fully support for async scheduling + PP, 30.8% E2E throughput improvement, 31.8% TPOT improvement (#32618) Signed-off-by:
yewentao256 <zhyanwentao@126.com> Signed-off-by:
Wentao Ye <44945378+yewentao256@users.noreply.github.com> Signed-off-by:
Nick Hill <nickhill123@gmail.com> Co-authored-by:
Nick Hill <nickhill123@gmail.com>
-
- 24 Jan, 2026 1 commit
-
-
Joshua Deng authored
Signed-off-by:
Joshua Deng <joshuakdeng@gmail.com> Signed-off-by:
Patrick von Platen <patrick.v.platen@gmail.com> Signed-off-by:
Nick Hill <nickhill123@gmail.com> Signed-off-by:
Roger Wang <hey@rogerw.io> Co-authored-by:
Roger Wang <hey@rogerw.io> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Nick Hill <nickhill123@gmail.com>
-
- 23 Jan, 2026 1 commit
-
-
Harry Huang authored
Signed-off-by:
huanghaoyan.hhy <huanghaoyan.hhy@alibaba-inc.com> Signed-off-by:
Chen Zhang <zhangch99@outlook.com> Co-authored-by:
Chen Zhang <zhangch99@outlook.com>
-
- 22 Jan, 2026 1 commit
-
-
knlnguyen1802 authored
Signed-off-by:knlnguyen1802 <knlnguyen1802@gmail.com>
-
- 15 Jan, 2026 1 commit
-
-
Wentao Ye authored
Signed-off-by:yewentao256 <zhyanwentao@126.com>
-
- 14 Jan, 2026 1 commit
-
-
Lumosis authored
Signed-off-by:Lihao Ran <imlihao.ran@gmail.com>
-
- 12 Jan, 2026 1 commit
-
-
Or Ozeri authored
Signed-off-by:Or Ozeri <oro@il.ibm.com>
-
- 10 Jan, 2026 1 commit
-
-
Or Ozeri authored
Signed-off-by:Or Ozeri <oro@il.ibm.com>
-
- 09 Jan, 2026 1 commit
-
-
Yifan Qiao authored
Signed-off-by:Yifan Qiao <yifanqiao@berkeley.edu>
-
- 08 Jan, 2026 1 commit
-
-
Lumosis authored
Signed-off-by:
Lihao Ran <imlihao.ran@gmail.com> Signed-off-by:
Lumosis <30372757+Lumosis@users.noreply.github.com>
-
- 05 Jan, 2026 1 commit
-
-
Nick Hill authored
Signed-off-by:njhill <nickhill123@gmail.com>
-
- 27 Dec, 2025 1 commit
-
-
Yifan Qiao authored
Signed-off-by:
Yifan Qiao <yifanqiao@berkeley.edu> Co-authored-by:
KuntaiDu <kuntai@uchicago.edu>
-
- 24 Dec, 2025 2 commits
-
-
Michael Goin authored
Signed-off-by:mgoin <mgoin64@gmail.com>
-
Chen Zhang authored
Signed-off-by:
Chen Zhang <zhangch99@outlook.com> Signed-off-by:
Yifan Qiao <yifanqiao@berkeley.edu> Co-authored-by:
Yifan Qiao <yifanqiao@berkeley.edu>
-
- 16 Dec, 2025 1 commit
-
-
Roger Wang authored
Signed-off-by:
Roger Wang <hey@rogerw.io> Co-authored-by:
Sun Kim <sunytokki@gmail.com>
-
- 09 Dec, 2025 1 commit
-
-
Or Ozeri authored
Signed-off-by:Or Ozeri <oro@il.ibm.com>
-
- 07 Dec, 2025 2 commits
-
-
Cyrus Leung authored
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 05 Dec, 2025 1 commit
-
-
rasmith authored
Signed-off-by:
Randall Smith <ransmith@amd.com> Co-authored-by:
Randall Smith <ransmith@amd.com>
-
- 04 Dec, 2025 1 commit
-
-
Mark McLoughlin authored
Signed-off-by:Mark McLoughlin <markmc@redhat.com>
-
- 02 Dec, 2025 3 commits
-
-
Chauncey authored
Signed-off-by:
chaunceyjiang <chaunceyjiang@gmail.com> Signed-off-by:
Nick Hill <nhill@redhat.com> Co-authored-by:
Nick Hill <nhill@redhat.com>
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Zhuohan Li authored
Signed-off-by:
Zhuohan Li <zhuohan123@gmail.com> Signed-off-by:
Nick Hill <nhill@redhat.com> Co-authored-by:
Nick Hill <nhill@redhat.com>
-
- 01 Dec, 2025 2 commits
-
-
shivampr authored
Introduces three new Prometheus histograms for fine-grained observability of KV cache residency behavior: vllm:kv_block_lifetime_seconds — total lifetime from allocation to free vllm:kv_block_idle_before_evict_seconds — idle duration before eviction vllm:kv_block_reuse_gap_seconds — time between consecutive reuses of the same block These metrics help operators analyze KV cache efficiency, reuse patterns, and eviction timing beyond simple utilization rates. Implementation uses monotonic timestamps for accuracy, 1% sampling for minimal overhead (~48 bytes/block), and is fully thread-safe with zero runtime cost when disabled. Two new runtime flags are introduced: --kv-cache-metrics – enable KV cache residency metrics --kv-cache-metrics-sample – control sampling ratio (default: 0.01) Signed-off-by:Shivam <shivamprasad91@gmail.com>
-
Marcin Ostrowski authored
Signed-off-by:Marcin Ostrowski <marcinx.ostrowski@intel.com>
-
- 28 Nov, 2025 1 commit
-
-
maang-h authored
Signed-off-by:maang <maang_h@163.com>
-
- 25 Nov, 2025 1 commit
-
-
Yifan Qiao authored
Signed-off-by:
Yifan Qiao <yifanqiao@berkeley.edu> Co-authored-by:
Chen Zhang <zhangch99@outlook.com>
-