- 04 Dec, 2025 2 commits
-
-
TimWang authored
Signed-off-by: Tim <tim.wang03@sap.com>
-
CYJiang authored
In Prometheus, Counters always expose their actual numeric value under a metric name that ends in _total. We should document the base name, as this is what appears in the get_metrics() API.
Signed-off-by: CYJiang <86391540+googs1025@users.noreply.github.com>
-
- 03 Dec, 2025 1 commit
-
-
bnellnm authored
Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: bnellnm <49004751+bnellnm@users.noreply.github.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
-
- 02 Dec, 2025 1 commit
-
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 01 Dec, 2025 2 commits
-
-
shivampr authored
Introduces three new Prometheus histograms for fine-grained observability of KV cache residency behavior:
  vllm:kv_block_lifetime_seconds: total lifetime from allocation to free
  vllm:kv_block_idle_before_evict_seconds: idle duration before eviction
  vllm:kv_block_reuse_gap_seconds: time between consecutive reuses of the same block
These metrics help operators analyze KV cache efficiency, reuse patterns, and eviction timing beyond simple utilization rates.
Implementation uses monotonic timestamps for accuracy, 1% sampling for minimal overhead (~48 bytes/block), and is fully thread-safe with zero runtime cost when disabled.
Two new runtime flags are introduced:
  --kv-cache-metrics: enable KV cache residency metrics
  --kv-cache-metrics-sample: control sampling ratio (default: 0.01)
Signed-off-by: Shivam <shivamprasad91@gmail.com>
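The three residency measurements described in this commit message can be illustrated with a minimal sketch. This is not the vLLM implementation: the class `KVBlockResidencySampler`, its hook names (`on_alloc`, `on_access`, `on_evict`), and the use of plain lists in place of Prometheus histograms are all hypothetical, and the sketch treats "free" and "evict" as a single event for brevity. It only shows how monotonic timestamps and per-block sampling could yield lifetime, idle-before-evict, and reuse-gap observations.

```python
import random
import threading
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class BlockTimes:
    """Timestamps kept only for sampled blocks (hypothetical layout)."""
    allocated_at: float
    last_access_at: float


class KVBlockResidencySampler:
    """Hypothetical sampler; lists stand in for Prometheus histograms."""

    def __init__(
        self,
        sample_rate: float = 0.01,  # mirrors the documented default ratio
        clock: Callable[[], float] = time.monotonic,  # immune to wall-clock jumps
        rng: Optional[random.Random] = None,
    ) -> None:
        self.sample_rate = sample_rate
        self._clock = clock
        self._rng = rng or random.Random()
        self._lock = threading.Lock()
        self._tracked: Dict[int, BlockTimes] = {}
        self.lifetimes: List[float] = []
        self.idle_before_evict: List[float] = []
        self.reuse_gaps: List[float] = []

    def on_alloc(self, block_id: int) -> None:
        # Decide once, at allocation time, whether this block is sampled.
        if self._rng.random() < self.sample_rate:
            now = self._clock()
            with self._lock:
                self._tracked[block_id] = BlockTimes(now, now)

    def on_access(self, block_id: int) -> None:
        with self._lock:
            times = self._tracked.get(block_id)
            if times is None:
                return  # unsampled block: no bookkeeping at all
            now = self._clock()
            self.reuse_gaps.append(now - times.last_access_at)
            times.last_access_at = now

    def on_evict(self, block_id: int) -> None:
        with self._lock:
            times = self._tracked.pop(block_id, None)
            if times is None:
                return
            now = self._clock()
            self.lifetimes.append(now - times.allocated_at)
            self.idle_before_evict.append(now - times.last_access_at)
```

Injecting the clock keeps the sketch testable without sleeping; a real deployment would presumably feed these observations into histogram buckets rather than keep raw lists.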
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
- 30 Nov, 2025 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 29 Nov, 2025 2 commits
-
-
Jinzhen Lin authored
Signed-off-by: Jinzhen Lin <jinzhen.ljz@antgroup.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin@redhat.com>
-
dublc authored
Signed-off-by: dublc <jdublc0x@gmail.com>
-
- 28 Nov, 2025 1 commit
-
-
Yanan Cao authored
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
-
- 27 Nov, 2025 1 commit
-
-
Morrison Turnansky authored
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: adabeyta <aabeyta@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Co-authored-by: adabeyta <aabeyta@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 26 Nov, 2025 1 commit
-
-
Lucas Wilkinson authored
-
- 25 Nov, 2025 1 commit
-
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
-
- 24 Nov, 2025 1 commit
-
-
Laith Sakka authored
Add option to use unbacked and backed size-oblivious dynamic shapes for more sound compilation. (#26199)
Signed-off-by: Laith Sakka <lsakka@meta.com>
-
- 22 Nov, 2025 1 commit
-
-
Angela Yi authored
Signed-off-by: angelayi <yiangela7@gmail.com>
-
- 21 Nov, 2025 2 commits
-
-
wangxiyuan authored
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 19 Nov, 2025 3 commits
-
-
Didier Durand authored
Signed-off-by: Didier Durand <durand.didier@gmail.com>
-
Michael Yao authored
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
-
Didier Durand authored
Signed-off-by: Didier Durand <durand.didier@gmail.com>
-
- 18 Nov, 2025 1 commit
-
-
Didier Durand authored
Signed-off-by: Didier Durand <durand.didier@gmail.com>
-
- 16 Nov, 2025 1 commit
-
-
Didier Durand authored
Signed-off-by: Didier Durand <durand.didier@gmail.com>
-
- 15 Nov, 2025 1 commit
-
-
Didier Durand authored
Signed-off-by: Didier Durand <durand.didier@gmail.com>
-
- 14 Nov, 2025 2 commits
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Chen Wang authored
Signed-off-by: Chen Wang <Chen.Wang1@ibm.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 12 Nov, 2025 2 commits
-
-
Benjamin Chislett authored
[Perf] Refactor cudagraph_support to enable full CUDA graphs for spec decoding with FlashInfer (#28479)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 10 Nov, 2025 1 commit
-
-
vllmellm authored
[RFC][ROCm][AITER] Keep all AITER kernels in `_aiter_ops` class like `_custom_ops` and `_ipex_ops` (#24490)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
-
- 05 Nov, 2025 2 commits
-
-
Richard Zou authored
Signed-off-by: Richard Zou <zou3519@gmail.com>
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 03 Nov, 2025 1 commit
-
-
ahao-anyscale authored
[BUG] Make 'binary' default option for saving torch compile artifacts when using standalone_compile (#27616)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
-
- 30 Oct, 2025 1 commit
-
-
wang.yuqi authored
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
- 24 Oct, 2025 2 commits
-
-
Lifans authored
Signed-off-by: Lifan Shen <lifans@meta.com>
-
fhl2000 authored
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
-
- 23 Oct, 2025 1 commit
-
-
wang.yuqi authored
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Christian Pinto <christian.pinto@ibm.com>
-
- 22 Oct, 2025 2 commits
-
-
William Song authored
Signed-off-by: William Song <jinwook@umich.edu>
-
Mark McLoughlin authored
Signed-off-by: Simon Mo <simon.mo@hey.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: atalhens <sneh.lata@nutanix.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: atalhens <sneh.lata@nutanix.com>
-
- 18 Oct, 2025 1 commit
-
-
Tova Movshovitz authored
Signed-off-by: tovam <tovam@pliops.com>
-
- 17 Oct, 2025 2 commits
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-