- 15 Jan, 2026 1 commit
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
- 14 Jan, 2026 1 commit
-
-
Lumosis authored
Signed-off-by: Lihao Ran <imlihao.ran@gmail.com>
-
- 12 Jan, 2026 1 commit
-
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
- 10 Jan, 2026 1 commit
-
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
- 09 Jan, 2026 1 commit
-
-
Yifan Qiao authored
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
-
- 08 Jan, 2026 1 commit
-
-
Lumosis authored
Signed-off-by: Lihao Ran <imlihao.ran@gmail.com>
Signed-off-by: Lumosis <30372757+Lumosis@users.noreply.github.com>
-
- 05 Jan, 2026 1 commit
-
-
Nick Hill authored
Signed-off-by: njhill <nickhill123@gmail.com>
-
- 27 Dec, 2025 1 commit
-
-
Yifan Qiao authored
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
Co-authored-by: KuntaiDu <kuntai@uchicago.edu>
-
- 24 Dec, 2025 2 commits
-
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
Chen Zhang authored
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
Co-authored-by: Yifan Qiao <yifanqiao@berkeley.edu>
-
- 16 Dec, 2025 1 commit
-
-
Roger Wang authored
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Sun Kim <sunytokki@gmail.com>
-
- 09 Dec, 2025 1 commit
-
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
- 07 Dec, 2025 2 commits
-
-
Cyrus Leung authored
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 05 Dec, 2025 1 commit
-
-
rasmith authored
Signed-off-by: Randall Smith <ransmith@amd.com>
Co-authored-by: Randall Smith <ransmith@amd.com>
-
- 04 Dec, 2025 1 commit
-
-
Mark McLoughlin authored
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
-
- 02 Dec, 2025 3 commits
-
-
Chauncey authored
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Zhuohan Li authored
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
-
- 01 Dec, 2025 2 commits
-
-
shivampr authored
Introduces three new Prometheus histograms for fine-grained observability of KV cache residency behavior:

    vllm:kv_block_lifetime_seconds - total lifetime from allocation to free
    vllm:kv_block_idle_before_evict_seconds - idle duration before eviction
    vllm:kv_block_reuse_gap_seconds - time between consecutive reuses of the same block

These metrics help operators analyze KV cache efficiency, reuse patterns, and eviction timing beyond simple utilization rates.

Implementation uses monotonic timestamps for accuracy, 1% sampling for minimal overhead (~48 bytes/block), and is fully thread-safe with zero runtime cost when disabled.

Two new runtime flags are introduced:

    --kv-cache-metrics - enable KV cache residency metrics
    --kv-cache-metrics-sample - control sampling ratio (default: 0.01)

Signed-off-by: Shivam <shivamprasad91@gmail.com>
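The three residency measurements described in this commit message can be illustrated with a minimal sketch. It is not vLLM's implementation: the class `BlockResidencyTracker`, its method names, and the plain Python lists standing in for Prometheus histograms are all hypothetical, and locking is omitted. It only shows how monotonic timestamps and per-block sampling at allocation time could yield lifetime, idle-before-evict, and reuse-gap values.

```python
import random
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class _BlockState:
    allocated_at: float
    last_access_at: float


@dataclass
class BlockResidencyTracker:
    """Hypothetical sampler for illustration; not vLLM's actual code."""

    sample_rate: float = 0.01                      # mirrors the 0.01 default ratio
    clock: Callable[[], float] = time.monotonic    # monotonic, immune to wall-clock jumps
    rng: Callable[[], float] = random.random
    # Plain lists stand in for the three histograms.
    lifetimes: list = field(default_factory=list)
    idle_before_evict: list = field(default_factory=list)
    reuse_gaps: list = field(default_factory=list)
    _tracked: dict = field(default_factory=dict)

    def on_alloc(self, block_id: int) -> None:
        # Decide once, at allocation, whether this block is sampled.
        if self.rng() < self.sample_rate:
            now = self.clock()
            self._tracked[block_id] = _BlockState(now, now)

    def on_access(self, block_id: int) -> None:
        state = self._tracked.get(block_id)
        if state is not None:
            now = self.clock()
            self.reuse_gaps.append(now - state.last_access_at)
            state.last_access_at = now

    def on_evict(self, block_id: int) -> None:
        state = self._tracked.pop(block_id, None)
        if state is not None:
            now = self.clock()
            self.lifetimes.append(now - state.allocated_at)
            self.idle_before_evict.append(now - state.last_access_at)


# Deterministic walk-through with a fake clock: alloc at t=0, reuse at t=2, evict at t=5.
ticks = iter([0.0, 2.0, 5.0])
tracker = BlockResidencyTracker(sample_rate=1.0, clock=lambda: next(ticks))
tracker.on_alloc(7)
tracker.on_access(7)
tracker.on_evict(7)
print(tracker.lifetimes, tracker.reuse_gaps, tracker.idle_before_evict)
# prints: [5.0] [2.0] [3.0]
```

Unsampled blocks never enter the dictionary, so every later hook for them is a single failed lookup; that is one plausible way to keep per-block overhead low at a 1% sampling ratio.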
-
Marcin Ostrowski authored
Signed-off-by: Marcin Ostrowski <marcinx.ostrowski@intel.com>
-
- 28 Nov, 2025 1 commit
-
-
maang-h authored
Signed-off-by: maang <maang_h@163.com>
-
- 25 Nov, 2025 2 commits
-
-
Yifan Qiao authored
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
- 24 Nov, 2025 1 commit
-
-
Chen Zhang authored
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
-
- 22 Nov, 2025 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 21 Nov, 2025 2 commits
-
-
Mark McLoughlin authored
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
-
Jialin Ouyang authored
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
-
- 15 Nov, 2025 3 commits
-
-
Cyrus Leung authored
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
-
Nick Hill authored
-
Jialin Ouyang authored
[Core] Performance: Use list[np.ndarray] instead of list[list[int]] for output tokens for GC optimization (#26368)
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
-
- 14 Nov, 2025 3 commits
-
-
Marcin Ostrowski authored
Signed-off-by: Marcin Ostrowski <marcinx.ostrowski@intel.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 13 Nov, 2025 1 commit
-
-
Mark McLoughlin authored
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
-
- 12 Nov, 2025 2 commits
-
-
Andy Lo authored
Signed-off-by: Andy Lo <andy@mistral.ai>
-
Chenguang Zheng authored
Signed-off-by: n00909098 <nguyen.kha.long@huawei.com>
Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Signed-off-by: herotai214 <herotai214@gmail.com>
Signed-off-by: Khuong Le <khuong.le.manh@huawei.com>
Signed-off-by: Khuong Le <lemanhkhuong2611@gmail.com>
Co-authored-by: n00909098 <nguyen.kha.long@huawei.com>
Co-authored-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Co-authored-by: herotai214 <herotai214@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Khuong Le <khuong.le.manh@huawei.com>
Co-authored-by: Khuong Le <lemanhkhuong2611@gmail.com>
-
- 05 Nov, 2025 1 commit
-
-
Kuntai Du authored
[Hybrid allocator + kv connector] revert connector test changes related to hybrid allocator (#28011)
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
-
- 04 Nov, 2025 1 commit
-
-
Nick Hill authored
Signed-off-by: Nick Hill <nhill@redhat.com>
-
- 03 Nov, 2025 1 commit
-
-
Biswa Panda authored
Signed-off-by: Biswa Panda <biswa.panda@gmail.com>
-