- 02 Dec, 2025 23 commits
-
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Julien Denize authored
Signed-off-by:Julien Denize <julien.denize@mistral.ai>
Signed-off-by:Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by:Mickael Seznec <mickael@mistral.ai>
Signed-off-by:Roger Wang <hey@rogerw.io>
Co-authored-by:Roger Wang <hey@rogerw.io>
Co-authored-by:Mickael Seznec <mickael@mistral.ai>
-
Louie Tsai authored
Signed-off-by:Tsai, Louie <louie.tsai@intel.com>
Signed-off-by:Louie Tsai <louie.tsai@intel.com>
Co-authored-by:gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by:Li, Jiang <bigpyj64@gmail.com>
-
Boyuan Feng authored
Signed-off-by:Boyuan Feng <boyuan@meta.com>
-
杰兮 authored
Signed-off-by:zhyajie <yajizhan@amd.com>
Co-authored-by:zhyajie <yajizhan@amd.com>
-
Boyuan Feng authored
Signed-off-by:Boyuan Feng <boyuan@meta.com>
-
Wushi Dong authored
Signed-off-by:Wushi Dong <dongws@meta.com>
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Zhang Xiangze authored
Signed-off-by:Zhang Xiangze <Xiangze.Zhang@arm.com>
-
Shengqi Chen authored
Signed-off-by:Shengqi Chen <harry-chen@outlook.com>
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
Divakar Verma authored
Signed-off-by:Divakar Verma <divakar.verma@amd.com>
-
Divakar Verma authored
Signed-off-by:Divakar Verma <divakar.verma@amd.com>
Co-authored-by:Cyrus Leung <tlleungac@connect.ust.hk>
-
usberkeley authored
Signed-off-by:Bradley <bradley.b.pitt@gmail.com>
-
Zuyi Zhao authored
Signed-off-by:Zuyi Zhao <zhaozuy@amazon.com>
-
Johnny Yang authored
Signed-off-by:Johnny Yang <johnnyyang@google.com>
-
Seiji Eicher authored
Signed-off-by:Seiji Eicher <seiji@anyscale.com>
Co-authored-by:rongfu.leng <1275177125@qq.com>
-
Wei Wei authored
Signed-off-by:Wei Wei <wwei6@meta.com>
-
Zhuohan Li authored
Signed-off-by:Zhuohan Li <zhuohan123@gmail.com>
Signed-off-by:Nick Hill <nhill@redhat.com>
Co-authored-by:Nick Hill <nhill@redhat.com>
-
Andrew Xia authored
Signed-off-by:Andrew Xia <axia@fb.com>
Co-authored-by:Andrew Xia <axia@fb.com>
-
Divakar Verma authored
Signed-off-by:Divakar Verma <divakar.verma@amd.com>
Signed-off-by:Divakar Verma <137818590+divakar-amd@users.noreply.github.com>
Co-authored-by:gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
Hendrik Holtmann authored
Signed-off-by:mgoin <mgoin64@gmail.com>
Co-authored-by:mgoin <mgoin64@gmail.com>
-
Nick Hill authored
Signed-off-by:Nick Hill <nhill@redhat.com>
-
- 01 Dec, 2025 17 commits
-
-
Alexei-V-Ivanov-AMD authored
Signed-off-by:Alexei V. Ivanov <alexei.ivanov@amd.com>
Signed-off-by:Alexei-V-Ivanov-AMD <156011006+Alexei-V-Ivanov-AMD@users.noreply.github.com>
Co-authored-by:Kevin H. Luu <khluu000@gmail.com>
-
Kevin H. Luu authored
-
Nengjun Ma authored
Signed-off-by:leo-pony <nengjunma@outlook.com>
-
Finbarr Timbers authored
Signed-off-by:Finbarr Timbers <finbarrtimbers@gmail.com>
-
shivampr authored
Introduces three new Prometheus histograms for fine-grained observability of KV cache residency behavior:
  vllm:kv_block_lifetime_seconds: total lifetime from allocation to free
  vllm:kv_block_idle_before_evict_seconds: idle duration before eviction
  vllm:kv_block_reuse_gap_seconds: time between consecutive reuses of the same block
These metrics help operators analyze KV cache efficiency, reuse patterns, and eviction timing beyond simple utilization rates.
Implementation uses monotonic timestamps for accuracy, 1% sampling for minimal overhead (~48 bytes/block), and is fully thread-safe with zero runtime cost when disabled.
Two new runtime flags are introduced:
  --kv-cache-metrics: enable KV cache residency metrics
  --kv-cache-metrics-sample: control sampling ratio (default: 0.01)
Signed-off-by:Shivam <shivamprasad91@gmail.com>
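The sampling and timestamp scheme described in this commit message can be illustrated with a minimal Python sketch. It is not the code from the commit: the class `SampledBlockTracker`, its hook names (`on_alloc`, `on_access`, `on_evict`), and the plain lists standing in for Prometheus histograms are all hypothetical, and the sketch assumes that free and eviction can be treated as a single event. It omits locking, so it does not reproduce the thread-safety the commit claims.

```python
import random
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class _BlockState:
    # Monotonic timestamps (seconds) for one sampled block.
    allocated_at: float
    last_access_at: float


@dataclass
class SampledBlockTracker:
    """Hypothetical tracker for illustration; not vLLM's implementation."""

    sample_rate: float = 0.01  # mirrors the documented default ratio
    clock: Callable[[], float] = time.monotonic  # immune to wall-clock jumps
    rng: random.Random = field(default_factory=random.Random)
    # Plain lists stand in for the three histograms.
    lifetimes: List[float] = field(default_factory=list)
    idle_before_evict: List[float] = field(default_factory=list)
    reuse_gaps: List[float] = field(default_factory=list)
    _tracked: Dict[int, _BlockState] = field(default_factory=dict)

    def on_alloc(self, block_id: int) -> None:
        # Only a sampled fraction of blocks pays the bookkeeping cost.
        if self.rng.random() < self.sample_rate:
            now = self.clock()
            self._tracked[block_id] = _BlockState(now, now)

    def on_access(self, block_id: int) -> None:
        state = self._tracked.get(block_id)
        if state is None:
            return  # block was not sampled
        now = self.clock()
        # Gap since the previous access (or since allocation on first reuse).
        self.reuse_gaps.append(now - state.last_access_at)
        state.last_access_at = now

    def on_evict(self, block_id: int) -> None:
        state = self._tracked.pop(block_id, None)
        if state is None:
            return  # block was not sampled
        now = self.clock()
        self.lifetimes.append(now - state.allocated_at)
        self.idle_before_evict.append(now - state.last_access_at)
```

With `sample_rate=0.0` no block is ever tracked, so each hook reduces to one random draw or one dictionary lookup. A real implementation would more likely skip the hooks entirely when the feature is disabled.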
-
Kevin H. Luu authored
Signed-off-by:Kevin H. Luu <khluu000@gmail.com>
-
knlnguyen1802 authored
Signed-off-by:knlnguyen1802 <knlnguyen1802@gmail.com>
Co-authored-by:Chenguang Zheng <645327136@qq.com>
Co-authored-by:Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by:Cyrus Leung <tlleungac@connect.ust.hk>
-
BADAOUI Abdennacer authored
Signed-off-by:badaoui <abdennacerbadaoui0@gmail.com>
-
sangbumlikeagod authored
Signed-off-by:sangbumlikeagod <oironese@naver.com>
Signed-off-by:sangbumlikeagod <98077576+sangbumlikeagod@users.noreply.github.com>
-
FredericOdermatt authored
Signed-off-by:Frederic Odermatt <frederic.odermatt@44ai.ch>
-
Shengqi Chen authored
Signed-off-by:Shengqi Chen <harry-chen@outlook.com>
-
Liu Jinyi authored
Signed-off-by:KKKZOZ <kkkzoz@qq.com>
-
Shengqi Chen authored
Signed-off-by:Shengqi Chen <harry-chen@outlook.com>
-
Marcin Ostrowski authored
Signed-off-by:Marcin Ostrowski <marcinx.ostrowski@intel.com>
-
Isotr0py authored
Signed-off-by:Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by:baonudesifeizhai <baonudesifeizhai@gmail.com>
Co-authored-by:baonudesifeizhai <baonudesifeizhai@gmail.com>
-
Zhengxu Chen authored
Signed-off-by:zhxchen17 <zhxchen17@fb.com>
Co-authored-by:Cyrus Leung <tlleungac@connect.ust.hk>
-
Fanli Lin authored
Signed-off-by:Fanli Lin <fanli.lin@intel.com>
-