- 09 Apr, 2026 1 commit
-
-
Chendi.Xue authored
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
-
- 08 Apr, 2026 7 commits
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
triangleXIV authored
[BugFix] --max-model-len=-1 causes over-limit requests to hang and starve the entire service (#39102)
Signed-off-by: triangle14 <y1019026570@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
-
Rishi Puri authored
Signed-off-by: Rishi Puri <riship@nvidia.com>
Signed-off-by: Rishi Puri <puririshi98@berkeley.edu>
Signed-off-by: sfeng33 <4florafeng@gmail.com>
Co-authored-by: Benjamin Chislett <chislett.ben@gmail.com>
Co-authored-by: Flora Feng <4florafeng@gmail.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
haosdent authored
Signed-off-by: haosdent <haosdent@gmail.com>
-
Giancarlo Delfin authored
Signed-off-by: Giancarlo Delfin <gdelfin@inferact.ai>
-
- 07 Apr, 2026 3 commits
-
-
ibifrost authored
Signed-off-by: wuchenxin <wuchenxin.wcx@alibaba-inc.com>
Signed-off-by: ibifrost <47308427+ibifrost@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
-
Ronen Schaffer authored
Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
-
Andreas Karatzas authored
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
- 06 Apr, 2026 3 commits
-
-
zhanqiuhu authored
-
Walter Beller-Morales authored
Signed-off-by: walterbm <walter.beller.morales@gmail.com>
-
Julien Denize authored
Signed-off-by: juliendenize <julien.denize@mistral.ai>
-
- 05 Apr, 2026 2 commits
-
-
Greg Pereira authored
Signed-off-by: greg pereira <grpereir@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
-
Aaron Batilo authored
Signed-off-by: Aaron Batilo <abatilo@coreweave.com>
-
- 03 Apr, 2026 2 commits
-
-
Yusuf Mohammad authored
Signed-off-by: yusuf <yusuf@deeplearningmachine.mynet>
Signed-off-by: <>
Co-authored-by: yusuf <yusuf@deeplearningmachine.mynet>
-
wliao2 authored
Signed-off-by: Liao, Wei <wei.liao@intel.com>
-
- 02 Apr, 2026 2 commits
-
-
zhanqiuhu authored
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
- 01 Apr, 2026 5 commits
-
-
Chauncey authored
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
-
yzong-rh authored
Signed-off-by: Yifan <yzong@redhat.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
HarshRathva authored
Signed-off-by: HarshRathva <harshrathvaai@gmail.com>
Co-authored-by: Or Ozeri <oro@il.ibm.com>
-
Yifan Qiao authored
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>
-
- 31 Mar, 2026 5 commits
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Matthew Bonanni authored
Signed-off-by: SandishKumarHN <sandishkumarhn@gmail.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: SandishKumarHN <sandishkumarhn@gmail.com>
-
wliao2 authored
Signed-off-by: Liao, Wei <wei.liao@intel.com>
Signed-off-by: wliao2 <wei.liao@intel.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 30 Mar, 2026 5 commits
-
-
Benjamin Chislett authored
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
-
Chendi.Xue authored
[HMA]Fix corner case when hybrid page_size can not be evenly divided issue (blk_size=64,tp=4) (#37467)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
-
Collin McCarthy authored
Signed-off-by: Collin McCarthy <cmccarthy@nvidia.com>
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
Co-authored-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
-
Nicolò Lucchesi authored
[Mamba][Bugfix] Raise on insufficient cache blocks instead of silently capping cudagraph sizes (#38270)
Signed-off-by: NickLucche <nlucches@redhat.com>
-
Andreas Karatzas authored
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
- 29 Mar, 2026 1 commit
-
-
Wentao Ye authored
[Perf] Remove redundant device copies for CPU-only pooling token IDs, 48.9% E2E throughput improvement (#38139)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
- 28 Mar, 2026 1 commit
-
-
yzong-rh authored
Signed-off-by: Yifan <yzong@redhat.com>
-
- 27 Mar, 2026 2 commits
-
-
dtc authored
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
- 26 Mar, 2026 1 commit
-
-
Giancarlo Delfin authored
Signed-off-by: Giancarlo Delfin <gdelfin@inferact.ai>
-