- 14 Apr, 2026 35 commits
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
maobaolong authored
Signed-off-by: baoloongmao <baoloongmao@tencent.com>
-
rishitdholakia13 authored
Signed-off-by: rishitdholakia13 <rishit+github@cohere.com>
Signed-off-by: rishitdholakia13 <123388671+rishitdholakia13@users.noreply.github.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Aaron Pham <contact@aarnphm.xyz>
-
bnellnm authored
Signed-off-by: Bill Nell <bnell@redhat.com>
-
Jackmin801 authored
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Co-authored-by: Robert Shaw <robertgshaw2@gmail.com>
-
roikoren755 authored
Signed-off-by: Roi Koren <roik@nvidia.com>
-
zhanqiuhu authored
[CI][KVConnector][Metrics] Update multi KV connector edge case according to prefill stats changes (#39808)
Signed-off-by: Zhanqiu Hu <zhu@redhat.com>
-
danielafrimi authored
Signed-off-by: root <root@lyris0017.lyris.clusters.nvidia.com>
Signed-off-by: Daniel Afrimi <dafrimi@nvidia.com>
Co-authored-by: root <root@lyris0017.lyris.clusters.nvidia.com>
-
Albert Cheng authored
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Albert Cheng (Engrg-Hardware 1) <albecheng@login-lyris02.lyris.clusters.nvidia.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
omerpaz95 authored
Signed-off-by: omerpaz95 <omerpaz95@gmail.com>
Co-authored-by: Or Ozeri <oro@il.ibm.com>
-
Andrew Barnes authored
Signed-off-by: Bortlesboat <bortstheboat@gmail.com>
-
Alessandro Sangiorgi authored
Signed-off-by: Alessandro Sangiorgi <asangior@redhat.com>
-
Rohan Potdar authored
Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
-
Mark McLoughlin authored
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
-
Micah Williamson authored
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
-
Netanel Haber authored
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
-
Lucas Kabela authored
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
-
Hexiang Wang authored
Signed-off-by: whx-sjtu <2952154980@qq.com>
-
bnellnm authored
Signed-off-by: Bill Nell <bnell@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
-
bhargav-patel-29 authored
[Bugfix] Fix mismatch between global and local attention heads in tensor-parallel mode for param2moe model (#39707)
Signed-off-by: bhargav-patel-29 <bhargav.patel@tihiitb.org>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
Yiyang "Ian" Liu authored
Signed-off-by: Yiyang Liu <37043548+ianliuy@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
-
Matthias Gehre authored
Signed-off-by: Matthias Gehre <matthias.gehre@amd.com>
-
Thomas authored
Signed-off-by: thomasmaindron <thomasmaindron@users.noreply.github.com>
Co-authored-by: thomasmaindron <thomasmaindron@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
-
fxmarty-amd authored
[fix][MOE] Fix MOE experts `intermediate_size` dimension not being narrowed before weight loading (#39688)
Signed-off-by: Felix Marty <Felix.Marty@amd.com>
-
xiangdong authored
Signed-off-by: zengxian <xiangdong.zeng@intel.com>
-
Julien Debache authored
Signed-off-by: jdebache <jdebache@nvidia.com>
-
Shanshan Shen authored
Signed-off-by: shen-shanshan <467638484@qq.com>
Signed-off-by: Shanshan Shen <87969357+shen-shanshan@users.noreply.github.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
-
wang.yuqi authored
[Frontend] Offload blocking preprocessing & postprocessing ops to thread pool for pooling entrypoints. (#39763)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
lalit10 authored
Signed-off-by: Lalit Laxminarayan Bangad <lalitbangad@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
Mark McLoughlin authored
[Core][Metrics][BugFix] Replace num_cached_tokens/num_external_computed_tokens with PrefillStats (#37460)
Related to `Counters can only be incremented by non-negative amounts` error with the `vllm:prompt_tokens_by_source_total` metric.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Or Ozeri <or@ozery.com>
-
noobHappylife authored
Signed-off-by: noobhappylife <aratar1991@hotmail.com>
-
김의진 authored
Signed-off-by: KimuGenie <baby11686@naver.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
-
Flora Feng authored
[Refactor][Parser] Migrate chat completion auto-tool/reasoning/plain streaming to parse_delta (#39446)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
-
Chauncey authored
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
-
chunxiaozheng authored
Signed-off-by: idellzheng <idellzheng@tencent.com>
Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>
-
- 13 Apr, 2026 5 commits
-
Giancarlo Delfin authored
Signed-off-by: Giancarlo Delfin <gdelfin@inferact.ai>
-
Netanel Haber authored
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
-
Flora Feng authored
Signed-off-by: sfeng33 <4florafeng@gmail.com>
-
Monishver authored
Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
-
Pedram Razavi authored
Signed-off-by: Pedram Razavi <pedram.razavi@gmail.com>
-