- 05 Feb, 2026 1 commit
-
-
zhuwenwen authored
-
- 27 Jan, 2026 1 commit
-
-
Robert Shaw authored
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Amir Klein <203507526+amirkl94@users.noreply.github.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: amirkl94 <203507526+amirkl94@users.noreply.github.com>
-
- 26 Jan, 2026 1 commit
-
-
Alex Brooks authored
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
-
- 22 Jan, 2026 1 commit
-
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
- 21 Jan, 2026 4 commits
-
-
whx authored
Signed-off-by: whx-sjtu <2952154980@qq.com>
-
Robert Shaw authored
-
Lucas Kabela authored
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
-
Robert Shaw authored
-
- 20 Jan, 2026 2 commits
-
-
whx authored
Signed-off-by: whx-sjtu <2952154980@qq.com>
-
杨朱 · Kiki authored
This PR completes the removal of the vllm:time_per_output_token_seconds metric, which was deprecated in v0.11, hidden in v0.12, and scheduled for removal in v0.13; the removal was delayed until v0.15.
Signed-off-by: carlory <baofa.fan@daocloud.io>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
-
- 19 Jan, 2026 1 commit
-
-
lon authored
Signed-off-by: lon <114724657+longregen@users.noreply.github.com>
Signed-off-by: Russell Bryant <russell.bryant@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Russell Bryant <russell.bryant@gmail.com>
-
- 18 Jan, 2026 1 commit
-
-
Robert Shaw authored
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
-
- 14 Jan, 2026 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 13 Jan, 2026 1 commit
-
-
Andrew Bennett authored
Signed-off-by: Andrew Bennett <potatosaladx@meta.com>
-
- 12 Jan, 2026 2 commits
-
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
XlKsyt authored
Signed-off-by: minimAluminiumalism <caixuesen@outlook.com>
-
- 09 Jan, 2026 2 commits
-
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Shanshan Shen authored
Signed-off-by: shen-shanshan <467638484@qq.com>
-
- 08 Jan, 2026 3 commits
-
-
Lucas Kabela authored
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
-
Robert Shaw authored
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Pavani Majety <pmajety@nvidia.com>
-
Robert Shaw authored
-
- 07 Jan, 2026 1 commit
-
-
weiyu authored
Signed-off-by: Wei-Yu Lin <weiyulin@google.com>
Signed-off-by: weiyu <62784299+weiyu0824@users.noreply.github.com>
-
- 06 Jan, 2026 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 05 Jan, 2026 1 commit
-
-
wangxiyuan authored
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
-
- 29 Dec, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 22 Dec, 2025 1 commit
-
-
Steve Westerhouse authored
Signed-off-by: westers <steve.westerhouse@origami-analytics.com>
-
- 18 Dec, 2025 1 commit
-
-
Elizabeth Thomas authored
Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 17 Dec, 2025 1 commit
-
-
rongfu.leng authored
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
-
- 14 Dec, 2025 1 commit
-
-
Didier Durand authored
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Didier Durand <2927957+didier-durand@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
- 11 Dec, 2025 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 10 Dec, 2025 1 commit
-
-
Mark McLoughlin authored
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 06 Dec, 2025 1 commit
-
-
redwrasse authored
Signed-off-by: redwrasse <mail@redwrasse.io>
-
- 05 Dec, 2025 1 commit
-
-
Yanan Cao authored
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
-
- 04 Dec, 2025 2 commits
-
-
TimWang authored
Signed-off-by: Tim <tim.wang03@sap.com>
-
CYJiang authored
Prometheus counters always expose their numeric value under a metric name that ends in _total. We should document the base name, as this is what appears in the get_metrics() API.
Signed-off-by: CYJiang <86391540+googs1025@users.noreply.github.com>
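The naming rule in this commit message can be illustrated with a small standard-library sketch. The helper names `exposed_name` and `base_name` and the metric `vllm:example_requests` are hypothetical, chosen only for illustration; they are not part of vLLM or of any Prometheus client library.

```python
SUFFIX = "_total"


def exposed_name(base: str, metric_type: str) -> str:
    """Name a scrape would show: counters get a _total suffix (hypothetical helper)."""
    if metric_type == "counter" and not base.endswith(SUFFIX):
        return base + SUFFIX
    return base


def base_name(exposed: str, metric_type: str) -> str:
    """Inverse mapping: strip _total from a counter to recover its base name."""
    if metric_type == "counter" and exposed.endswith(SUFFIX):
        return exposed[: -len(SUFFIX)]
    return exposed


# Illustrative metric name only; not taken from the vLLM metric list.
scraped = exposed_name("vllm:example_requests", "counter")
documented = base_name(scraped, "counter")
```

Under this assumption, documentation keyed on the base name matches what a programmatic metrics API reports, while a Prometheus scrape shows the suffixed form.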
-
- 03 Dec, 2025 1 commit
-
-
bnellnm authored
Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: bnellnm <49004751+bnellnm@users.noreply.github.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
-
- 02 Dec, 2025 1 commit
-
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 01 Dec, 2025 2 commits
-
-
shivampr authored
Introduces three new Prometheus histograms for fine-grained observability of KV cache residency behavior:
  - vllm:kv_block_lifetime_seconds: total lifetime from allocation to free
  - vllm:kv_block_idle_before_evict_seconds: idle duration before eviction
  - vllm:kv_block_reuse_gap_seconds: time between consecutive reuses of the same block
These metrics help operators analyze KV cache efficiency, reuse patterns, and eviction timing beyond simple utilization rates.
Implementation uses monotonic timestamps for accuracy, 1% sampling for minimal overhead (~48 bytes/block), and is fully thread-safe with zero runtime cost when disabled.
Two new runtime flags are introduced:
  - --kv-cache-metrics: enable KV cache residency metrics
  - --kv-cache-metrics-sample: control sampling ratio (default: 0.01)
Signed-off-by: Shivam <shivamprasad91@gmail.com>
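The sampling idea described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: the class `BlockLifetimeSampler` and its methods are hypothetical, only the allocation-to-free lifetime is tracked, and the locking that the commit message mentions is omitted.

```python
import random
import time
from typing import Callable, Dict, List


class BlockLifetimeSampler:
    """Hypothetical sketch: record allocation-to-free lifetimes for a sample of blocks."""

    def __init__(
        self,
        sample_rate: float = 0.01,  # mirrors the default sampling ratio in the message
        clock: Callable[[], float] = time.monotonic,  # monotonic: immune to wall-clock jumps
        rng: Callable[[], float] = random.random,
    ) -> None:
        self.sample_rate = sample_rate
        self._clock = clock
        self._rng = rng
        self._alloc_time: Dict[int, float] = {}  # only sampled blocks are stored
        self.lifetimes: List[float] = []  # stand-in for histogram observations

    def on_alloc(self, block_id: int) -> None:
        # Decide once, at allocation, whether this block is tracked.
        if self._rng() < self.sample_rate:
            self._alloc_time[block_id] = self._clock()

    def on_free(self, block_id: int) -> None:
        # Unsampled blocks have no entry, so freeing them records nothing.
        start = self._alloc_time.pop(block_id, None)
        if start is not None:
            self.lifetimes.append(self._clock() - start)
```

Because only sampled blocks get a dictionary entry, the bookkeeping cost scales with the sampling ratio rather than with the total number of blocks. A real implementation would feed the measured durations into a histogram instead of a list.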
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
- 30 Nov, 2025 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-