- 06 Dec, 2025 3 commits
-
-
Nick Hill authored
Signed-off-by: Nick Hill <nhill@redhat.com>
-
Rohan Potdar authored
Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
-
Harry Mellor authored
Better error when world size is larger than node and `distributed_executor_backend` is not set (#30140)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 05 Dec, 2025 7 commits
-
-
Bangsheng Tang authored
-
Ilya Markov authored
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Signed-off-by: Luka Govedič <luka.govedic@gmail.com>
Signed-off-by: ProExpertProg <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Luka Govedič <luka.govedic@gmail.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
-
Alec S authored
Signed-off-by: Alec Solder <alecs@fb.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Alec Solder <alecs@fb.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Max Hu authored
Signed-off-by: Max Hu <hyoung2991@gmail.com>
Signed-off-by: Max Hu <maxhu@nvidia.com>
Co-authored-by: Max Hu <maxhu@nvidia.com>
-
amitz-nv authored
[Frontend][Model] Add 'float16' to possible mamba cache dtype values, override mamba SSM cache dtype value for NemotronH (#29978)
Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
-
Qiu authored
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
- 04 Dec, 2025 4 commits
-
-
Mercykid-bash authored
Signed-off-by: Che Ruan <cr623@ic.ac.uk>
Signed-off-by: mengxingkongzhouhan <117415539+mengxingkongzhouhan@users.noreply.github.com>
Signed-off-by: Mercykid-bash <ruanche0218@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Che Ruan <cr623@ic.ac.uk>
Co-authored-by: mengxingkongzhouhan <117415539+mengxingkongzhouhan@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
wang.yuqi authored
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
Arpit Khandelwal authored
Signed-off-by: arpitkh101 <arpit5khandelwal@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
Xieyang Xu authored
-
- 03 Dec, 2025 4 commits
-
-
Lumis Chen authored
Signed-off-by: LuminolT <lumischen01@gmail.com>
Signed-off-by: Lumis Chen <lumischen01@gmail.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
-
Chauncey authored
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
-
Yong Hoon Shin authored
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
-
Arpit Khandelwal authored
Signed-off-by: arpitkh101 <arpit5khandelwal@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
-
- 02 Dec, 2025 5 commits
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Julien Denize authored
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Mickael Seznec <mickael@mistral.ai>
-
Boyuan Feng authored
Signed-off-by: Boyuan Feng <boyuan@meta.com>
-
Wei Wei authored
Signed-off-by: Wei Wei <wwei6@meta.com>
-
- 01 Dec, 2025 4 commits
-
-
Nengjun Ma authored
Signed-off-by: leo-pony <nengjunma@outlook.com>
-
shivampr authored
Introduces three new Prometheus histograms for fine-grained observability of KV cache residency behavior:

  `vllm:kv_block_lifetime_seconds`: total lifetime from allocation to free
  `vllm:kv_block_idle_before_evict_seconds`: idle duration before eviction
  `vllm:kv_block_reuse_gap_seconds`: time between consecutive reuses of the same block

These metrics help operators analyze KV cache efficiency, reuse patterns, and eviction timing beyond simple utilization rates.

Implementation uses monotonic timestamps for accuracy, 1% sampling for minimal overhead (~48 bytes/block), and is fully thread-safe with zero runtime cost when disabled.

Two new runtime flags are introduced:

  `--kv-cache-metrics`: enable KV cache residency metrics
  `--kv-cache-metrics-sample`: control sampling ratio (default: 0.01)

Signed-off-by: Shivam <shivamprasad91@gmail.com>
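
The three measurements described in this commit message can be illustrated with a small standalone sketch. This is not the code from the commit: the class `KVBlockResidencyTracker`, its hook names (`on_alloc`, `on_access`, `on_evict`), and the plain lists standing in for Prometheus histograms are all hypothetical, and the sketch treats "free" and "evict" as a single event. It only shows how monotonic timestamps and per-block sampling could produce lifetime, idle-before-evict, and reuse-gap observations. Unlike the implementation described above, it is not thread-safe.

```python
import random
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class _BlockTimes:
    allocated_at: float
    last_access_at: float


@dataclass
class KVBlockResidencyTracker:
    """Hypothetical sampled tracker; an illustration, not vLLM's implementation."""

    sample_rate: float = 0.01                       # assumed to mirror the 0.01 default
    clock: Callable[[], float] = time.monotonic     # monotonic: immune to wall-clock jumps
    rng: random.Random = field(default_factory=random.Random)
    # Plain lists stand in for the three histograms.
    lifetimes: List[float] = field(default_factory=list)
    idle_before_evict: List[float] = field(default_factory=list)
    reuse_gaps: List[float] = field(default_factory=list)
    _tracked: Dict[int, _BlockTimes] = field(default_factory=dict)

    def on_alloc(self, block_id: int) -> None:
        # Decide once, at allocation, whether this block is sampled.
        # Unsampled blocks cost only a dict lookup in the later hooks.
        if self.rng.random() < self.sample_rate:
            now = self.clock()
            self._tracked[block_id] = _BlockTimes(now, now)

    def on_access(self, block_id: int) -> None:
        times = self._tracked.get(block_id)
        if times is None:
            return
        now = self.clock()
        # Gap since the previous access (or since allocation for the first reuse).
        self.reuse_gaps.append(now - times.last_access_at)
        times.last_access_at = now

    def on_evict(self, block_id: int) -> None:
        times = self._tracked.pop(block_id, None)
        if times is None:
            return
        now = self.clock()
        self.lifetimes.append(now - times.allocated_at)
        self.idle_before_evict.append(now - times.last_access_at)
```

Injecting `clock` and `rng` keeps the sketch deterministic under test; in a real exporter each `append` would presumably become a histogram observation.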
-
FredericOdermatt authored
Signed-off-by: Frederic Odermatt <frederic.odermatt@44ai.ch>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 30 Nov, 2025 2 commits
-
-
Xingyu Liu authored
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 28 Nov, 2025 3 commits
-
-
Zhengxu Chen authored
Signed-off-by: zhxchen17 <zhxchen17@fb.com>
-
Yanan Cao authored
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
- 27 Nov, 2025 5 commits
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Injae Ryou authored
Signed-off-by: Injae Ryou <injaeryou@gmail.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Didier Durand authored
Signed-off-by: Didier Durand <durand.didier@gmail.com>
-
Morrison Turnansky authored
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: adabeyta <aabeyta@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Co-authored-by: adabeyta <aabeyta@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 26 Nov, 2025 2 commits
-
-
George D. Torres authored
Signed-off-by: George D. Torres <gdavtor@gmail.com>
Signed-off-by: George D. Torres <41129492+geodavic@users.noreply.github.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
-
Lucia Fang authored
Signed-off-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com>
-
- 25 Nov, 2025 1 commit
-
-
Zhengxu Chen authored
Signed-off-by: zhxchen17 <zhxchen17@fb.com>
-