- 14 Apr, 2026 1 commit
-
-
Mark McLoughlin authored
[Core][Metrics][BugFix] Replace num_cached_tokens/num_external_computed_tokens with PrefillStats (#37460)
Related to `Counters can only be incremented by non-negative amounts` error with the `vllm:prompt_tokens_by_source_total` metric.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Or Ozeri <or@ozery.com>
-
- 13 Apr, 2026 3 commits
-
-
Giancarlo Delfin authored
Signed-off-by: Giancarlo Delfin <gdelfin@inferact.ai>
-
mukesh-hai authored
Signed-off-by: Mukesh Baphna <mukesh@hippocraticai.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
-
- 12 Apr, 2026 3 commits
-
-
Mark McLoughlin authored
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
-
Martin Hickey authored
Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Co-authored-by: Or Ozeri <or@ozery.com>
-
Xinyu Chen authored
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
-
- 11 Apr, 2026 1 commit
-
-
Tianyu Guo authored
Signed-off-by: Tianyu Guo <guoty9@mail2.sysu.edu.cn>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
-
- 10 Apr, 2026 14 commits
-
-
Fynn Schmitt-Ulms authored
Signed-off-by: Rahul-Tuli <rtuli@redhat.com>
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
Co-authored-by: Rahul-Tuli <rtuli@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
-
yzong-rh authored
Signed-off-by: Yifan Zong <yzong@redhat.com>
-
zhrrr authored
Signed-off-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>
-
Peter Nguyen authored
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
-
xaguilar-amd authored
Signed-off-by: xaguilar-amd <xaguilar@amd.com>
-
Elvir Crnčević authored
Signed-off-by: Elvir Crncevic <elvircrn@gmail.com>
Co-authored-by: Claude Sonnet 4 <noreply@anthropic.com>
-
Richard Zou authored
Signed-off-by: Richard Zou <zou3519@gmail.com>
-
jackwang2120 authored
Signed-off-by: jackcfwang <jackcfwang@tencent.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
Srreyansh Sethi authored
Signed-off-by: vnadathur <glvikramn@gmail.com>
Signed-off-by: WorldExplored <srreyansh.sethi@gmail.com>
Signed-off-by: Srreyansh Sethi <107075589+WorldExplored@users.noreply.github.com>
Co-authored-by: vnadathur <glvikramn@gmail.com>
Co-authored-by: vnadathur <236933696+vnadathur@users.noreply.github.com>
-
Ronen Schaffer authored
Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
-
Kyungmin Lee authored
Signed-off-by: lkm2835 <lkm2835@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
-
Ganesh R authored
Signed-off-by: R <Ganesh.R@amd.com>
-
Chauncey authored
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
-
- 09 Apr, 2026 13 commits
-
-
zzaebok authored
Signed-off-by: Jaebok Lee <jaebok9541@naver.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
-
Xinyu Chen authored
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Lucas Kabela authored
[Performance Improvement] Update `batched_count_greater_than` to handle batch size 1 without recompile (#38933)
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
Andrew Barnes authored
Signed-off-by: Bortlesboat <bortstheboat@gmail.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Yongye Zhu authored
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Ajay Anubolu authored
Signed-off-by: AjAnubolu <anuboluajay@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
-
- 08 Apr, 2026 5 commits
-
-
Richard Zou authored
Signed-off-by: Richard Zou <zou3519@gmail.com>
-
triangleXIV authored
[BugFix] --max-model-len=-1 causes over-limit requests to hang and starve the entire service (#39102)
Signed-off-by: triangle14 <y1019026570@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
-
Lain authored
Signed-off-by: Siyuan Fu <siyuanf@nvidia.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Roberto L. Castro authored
[Perf][Kernel] Persistent TopK scheduler: unified CUDAGraph-safe kernel with dynamic per-row dispatch - DeepSeek-V3.2 DSA decode (#37421)
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
-
Shengqi Chen authored
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Co-authored-by: Jason Li <jasonlizhengjian@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
-