- 07 Dec, 2025 1 commit
-
-
jeremyteboul authored
Signed-off-by:Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by:Jeremy Teboul <jeremyteboul@fb.com>
-
- 06 Dec, 2025 2 commits
-
-
Viacheslav authored
Signed-off-by:Viacheslav Barinov <viacheslav.teh@gmail.com>
-
redwrasse authored
Signed-off-by:redwrasse <mail@redwrasse.io>
-
- 05 Dec, 2025 4 commits
-
-
Russell Bryant authored
Signed-off-by:Russell Bryant <rbryant@redhat.com>
-
Yanan Cao authored
Signed-off-by:Yanan Cao <gmagogsfm@gmail.com>
-
Tiger Xu / Zhonghu Xu authored
Signed-off-by:Zhonghu Xu <xuzhonghu@huawei.com>
-
Hubert de La Jonquiere authored
Signed-off-by:hdlj-h <hubert@hcompany.ai>
-
- 04 Dec, 2025 8 commits
-
-
TimWang authored
Signed-off-by:Tim <tim.wang03@sap.com>
-
Tao Yun authored
Signed-off-by:taoyun <1069423820@qq.com>
Signed-off-by:Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by:Cyrus Leung <cyrus.tl.leung@gmail.com>
-
Shengqi Chen authored
Signed-off-by:Shengqi Chen <harry-chen@outlook.com>
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
wang.yuqi authored
Signed-off-by:wang.yuqi <noooop@126.com>
Signed-off-by:wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by:gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
dtc authored
Signed-off-by:Tianchen Ding <dtcccc@linux.alibaba.com>
Signed-off-by:dtc <dtcccc@linux.alibaba.com>
Co-authored-by:Nicolò Lucchesi <nicolo.lucchesi@gmail.com>
-
CYJiang authored
In Prometheus, counters always expose their actual numeric value under a metric name that ends in _total. We should document the base name, as this is what appears in the get_metrics() API.
Signed-off-by:CYJiang <86391540+googs1025@users.noreply.github.com>
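To illustrate the naming point in this commit message, here is a minimal sketch of mapping an exposed counter name back to its base name. The helper `counter_base_name` and the metric name `vllm:example_requests` are hypothetical and used only for illustration; this is not code from the commit.

```python
def counter_base_name(exposed_name: str) -> str:
    """Return the base name of a counter given its exposed sample name.

    Hypothetical helper for illustration only: Prometheus counters are
    exposed with a "_total" suffix, while documentation would list the
    base name without it.
    """
    suffix = "_total"
    if exposed_name.endswith(suffix):
        # Drop the suffix that the exposition format appends to counters.
        return exposed_name[: -len(suffix)]
    # Names without the suffix are assumed to already be base names.
    return exposed_name


# "vllm:example_requests" is a made-up metric name, not a real vLLM metric.
base = counter_base_name("vllm:example_requests_total")
```

Under these assumptions, documentation would list `base`, while a Prometheus scrape would show the name with `_total` appended.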
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 03 Dec, 2025 7 commits
-
-
bnellnm authored
Signed-off-by:Bill Nell <bnell@redhat.com>
Signed-off-by:bnellnm <49004751+bnellnm@users.noreply.github.com>
Co-authored-by:Tyler Michael Smith <tyler@neuralmagic.com>
-
Lumis Chen authored
Signed-off-by:LuminolT <lumischen01@gmail.com>
Signed-off-by:Lumis Chen <lumischen01@gmail.com>
Co-authored-by:Russell Bryant <rbryant@redhat.com>
-
ioana ghiban authored
Signed-off-by:Ioana Ghiban <ioana.ghiban@arm.com>
-
ioana ghiban authored
Signed-off-by:Ioana Ghiban <ioana.ghiban@arm.com>
-
Amr Mahdi authored
Signed-off-by:Amr Mahdi <amrmahdi@meta.com>
-
Fadi Arafeh authored
Signed-off-by:Fadi Arafeh <fadi.arafeh@arm.com>
-
Russell Bryant authored
Signed-off-by:Russell Bryant <rbryant@redhat.com>
-
- 02 Dec, 2025 4 commits
-
-
wang.yuqi authored
Signed-off-by:wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Julien Denize authored
Signed-off-by:Julien Denize <julien.denize@mistral.ai>
Signed-off-by:Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by:Mickael Seznec <mickael@mistral.ai>
Signed-off-by:Roger Wang <hey@rogerw.io>
Co-authored-by:Roger Wang <hey@rogerw.io>
Co-authored-by:Mickael Seznec <mickael@mistral.ai>
-
Louie Tsai authored
Signed-off-by:Tsai, Louie <louie.tsai@intel.com>
Signed-off-by:Louie Tsai <louie.tsai@intel.com>
Co-authored-by:gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by:Li, Jiang <bigpyj64@gmail.com>
-
Shengqi Chen authored
Signed-off-by:Shengqi Chen <harry-chen@outlook.com>
-
- 01 Dec, 2025 7 commits
-
-
Kevin H. Luu authored
-
Finbarr Timbers authored
Signed-off-by:Finbarr Timbers <finbarrtimbers@gmail.com>
-
shivampr authored
Introduces three new Prometheus histograms for fine-grained observability of KV cache residency behavior:
  vllm:kv_block_lifetime_seconds (total lifetime from allocation to free)
  vllm:kv_block_idle_before_evict_seconds (idle duration before eviction)
  vllm:kv_block_reuse_gap_seconds (time between consecutive reuses of the same block)
These metrics help operators analyze KV cache efficiency, reuse patterns, and eviction timing beyond simple utilization rates. The implementation uses monotonic timestamps for accuracy and 1% sampling for minimal overhead (~48 bytes/block); it is fully thread-safe, with zero runtime cost when disabled.
Two new runtime flags are introduced:
  --kv-cache-metrics (enable KV cache residency metrics)
  --kv-cache-metrics-sample (control sampling ratio; default: 0.01)
Signed-off-by:Shivam <shivamprasad91@gmail.com>
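The approach described in this commit message (monotonic timestamps, sampling a fraction of blocks, thread safety) could be sketched roughly as follows. This is a standalone illustration and not vLLM's implementation: the class `BlockResidencySampler`, its method names, and the plain lists standing in for histograms are all assumptions made for the example.

```python
import random
import threading
import time
from typing import Callable, Dict, List, Optional


class BlockResidencySampler:
    """Hypothetical sketch of sampled KV block residency tracking."""

    def __init__(
        self,
        sample_rate: float = 0.01,  # mirrors the 0.01 default in the message
        clock: Callable[[], float] = time.monotonic,
        rng: Optional[random.Random] = None,
    ) -> None:
        if not 0.0 <= sample_rate <= 1.0:
            raise ValueError("sample_rate must be in [0, 1]")
        self.sample_rate = sample_rate
        self._clock = clock
        self._rng = rng or random.Random()
        self._lock = threading.Lock()
        # block_id -> [allocation time, last access time]
        self._tracked: Dict[int, List[float]] = {}
        # Plain lists stand in for the three histograms.
        self.lifetimes: List[float] = []
        self.idle_before_evict: List[float] = []
        self.reuse_gaps: List[float] = []

    def on_alloc(self, block_id: int) -> None:
        with self._lock:
            # Track only a sampled fraction of blocks to bound overhead.
            if self._rng.random() >= self.sample_rate:
                return
            now = self._clock()
            self._tracked[block_id] = [now, now]

    def on_access(self, block_id: int) -> None:
        with self._lock:
            state = self._tracked.get(block_id)
            if state is None:
                return  # block was not sampled
            now = self._clock()
            self.reuse_gaps.append(now - state[1])
            state[1] = now

    def on_free(self, block_id: int, evicted: bool = False) -> None:
        with self._lock:
            state = self._tracked.pop(block_id, None)
            if state is None:
                return  # block was not sampled
            now = self._clock()
            self.lifetimes.append(now - state[0])
            if evicted:
                self.idle_before_evict.append(now - state[1])
```

In this sketch a caller would invoke `on_alloc`, `on_access`, and `on_free` from the block manager's corresponding events. Injecting `clock` and `rng` keeps the sampler deterministic under test, which is a design choice of the example rather than something stated in the commit.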
-
sangbumlikeagod authored
Signed-off-by:sangbumlikeagod <oironese@naver.com>
Signed-off-by:sangbumlikeagod <98077576+sangbumlikeagod@users.noreply.github.com>
-
Shengqi Chen authored
Signed-off-by:Shengqi Chen <harry-chen@outlook.com>
-
wang.yuqi authored
Signed-off-by:wang.yuqi <yuqi.wang@daocloud.io>
-
Yifei Zhang authored
Signed-off-by:Yifei Zhang <yifei.zhang1992@outlook.com>
-
- 30 Nov, 2025 1 commit
-
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 29 Nov, 2025 3 commits
-
-
Jinzhen Lin authored
Signed-off-by:Jinzhen Lin <jinzhen.ljz@antgroup.com>
Signed-off-by:Michael Goin <mgoin64@gmail.com>
Signed-off-by:Jinzhen Lin <linjinzhen@hotmail.com>
Co-authored-by:Michael Goin <mgoin64@gmail.com>
Co-authored-by:Michael Goin <mgoin@redhat.com>
-
dublc authored
Signed-off-by:dublc <jdublc0x@gmail.com>
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 28 Nov, 2025 3 commits
-
-
Yanan Cao authored
Signed-off-by:Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by:Claude <noreply@anthropic.com>
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Wilson Wu authored
Signed-off-by:Wilson Wu <iwilsonwu@gmail.com>
Co-authored-by:Cyrus Leung <cyrus.tl.leung@gmail.com>
-