- 14 Apr, 2026 17 commits
-
bnellnm authored
Signed-off-by: Bill Nell <bnell@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
-
bhargav-patel-29 authored
[Bugfix] Fix mismatch between global and local attention heads in tensor-parallel mode for param2moe model (#39707)
Signed-off-by: bhargav-patel-29 <bhargav.patel@tihiitb.org>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
Yiyang "Ian" Liu authored
Signed-off-by: Yiyang Liu <37043548+ianliuy@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
-
Matthias Gehre authored
Signed-off-by: Matthias Gehre <matthias.gehre@amd.com>
-
Thomas authored
Signed-off-by: thomasmaindron <thomasmaindron@users.noreply.github.com>
Co-authored-by: thomasmaindron <thomasmaindron@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
-
fxmarty-amd authored
[fix][MOE] Fix MOE experts `intermediate_size` dimension not being narrowed before weight loading (#39688)
Signed-off-by: Felix Marty <Felix.Marty@amd.com>
-
xiangdong authored
Signed-off-by: zengxian <xiangdong.zeng@intel.com>
-
Julien Debache authored
Signed-off-by: jdebache <jdebache@nvidia.com>
-
Shanshan Shen authored
Signed-off-by: shen-shanshan <467638484@qq.com>
Signed-off-by: Shanshan Shen <87969357+shen-shanshan@users.noreply.github.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
-
wang.yuqi authored
[Frontend] Offload blocking preprocessing & postprocessing ops to thread pool for pooling entrypoints. (#39763)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
lalit10 authored
Signed-off-by: Lalit Laxminarayan Bangad <lalitbangad@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
Mark McLoughlin authored
[Core][Metrics][BugFix] Replace num_cached_tokens/num_external_computed_tokens with PrefillStats (#37460)
Related to `Counters can only be incremented by non-negative amounts` error with the `vllm:prompt_tokens_by_source_total` metric.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Or Ozeri <or@ozery.com>
-
noobHappylife authored
Signed-off-by: noobhappylife <aratar1991@hotmail.com>
-
김의진 authored
Signed-off-by: KimuGenie <baby11686@naver.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
-
Flora Feng authored
[Refactor][Parser] Migrate chat completion auto-tool/reasoning/plain streaming to parse_delta (#39446)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
-
Chauncey authored
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
-
chunxiaozheng authored
Signed-off-by: idellzheng <idellzheng@tencent.com>
Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>
-
- 13 Apr, 2026 23 commits
-
Giancarlo Delfin authored
Signed-off-by: Giancarlo Delfin <gdelfin@inferact.ai>
-
Netanel Haber authored
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
-
Flora Feng authored
Signed-off-by: sfeng33 <4florafeng@gmail.com>
-
Monishver authored
Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
-
Pedram Razavi authored
Signed-off-by: Pedram Razavi <pedram.razavi@gmail.com>
-
mukesh-hai authored
Signed-off-by: Mukesh Baphna <mukesh@hippocraticai.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>
-
Tyler Michael Smith authored
[Bugfix] Reject non-nvfp4 dtypes when using the flashinfer_nvlink_one_sided all2all backend (#39717)
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
-
Yuyi Ao authored
Signed-off-by: George-ao <yuyiao772@gmail.com>
Signed-off-by: Yuyi Ao <yuyiao772@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
JartX authored
Signed-off-by: JartX <sagformas@epdcenter.es>
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
-
haosdent authored
Signed-off-by: haosdent <haosdent@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Yongye Zhu authored
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
-
Santino Ramos authored
Signed-off-by: Santino Ramos <santinor@inferact.ai>
-
zhanqiuhu authored
Signed-off-by: ZhanqiuHu <zhu@redhat.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
-
Yi Liu authored
Signed-off-by: yiliu30 <yi4.liu@intel.com>
-
Ekagra Ranjan authored
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
-
Tihomir Elek authored
Signed-off-by: Tihomir Elek <tiho.elek@gmail.com>
-
zofia authored
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
-
Yufeng He authored
Signed-off-by: Yufeng He <40085740+he-yufeng@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
-
Jesus Federico authored
Signed-off-by: Jesus Federico <jefp@amazon.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
-
Flora Feng authored
Signed-off-by: sfeng33 <4florafeng@gmail.com>
-
sihao_li authored
Signed-off-by: sihao.li <sihao.li@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
-