- 31 Mar, 2026 4 commits
-
-
Matthew Bonanni authored
Signed-off-by: SandishKumarHN <sandishkumarhn@gmail.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: SandishKumarHN <sandishkumarhn@gmail.com>
-
wliao2 authored
Signed-off-by: Liao, Wei <wei.liao@intel.com>
Signed-off-by: wliao2 <wei.liao@intel.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 30 Mar, 2026 5 commits
-
-
Benjamin Chislett authored
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
-
Chendi.Xue authored
[HMA]Fix corner case when hybrid page_size can not be evenly divided issue (blk_size=64,tp=4) (#37467)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
-
Collin McCarthy authored
Signed-off-by: Collin McCarthy <cmccarthy@nvidia.com>
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
Co-authored-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
-
Nicolò Lucchesi authored
[Mamba][Bugfix] Raise on insufficient cache blocks instead of silently capping cudagraph sizes (#38270)
Signed-off-by: NickLucche <nlucches@redhat.com>
-
Andreas Karatzas authored
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
- 29 Mar, 2026 1 commit
-
-
Wentao Ye authored
[Perf] Remove redundant device copies for CPU-only pooling token IDs, 48.9% E2E throughput improvement (#38139)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
- 28 Mar, 2026 1 commit
-
-
yzong-rh authored
Signed-off-by: Yifan <yzong@redhat.com>
-
- 27 Mar, 2026 2 commits
-
-
dtc authored
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
- 26 Mar, 2026 2 commits
-
-
Giancarlo Delfin authored
Signed-off-by: Giancarlo Delfin <gdelfin@inferact.ai>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
-
- 25 Mar, 2026 6 commits
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
Co-authored-by: Woosuk Kwon <woosuk@inferact.ai>
-
Andrii Skliar authored
Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Signed-off-by: [Andrii Skliar] <askliar@nvidia.com>
Co-authored-by: Andrii Skliar <askliar@nvidia.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Gregory Shtrasberg authored
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Co-authored-by: Micah Williamson <micah.williamson@amd.com>
-
- 24 Mar, 2026 3 commits
-
-
Sungjae Lee authored
Signed-off-by: Sungjae Lee <33976427+llsj14@users.noreply.github.com>
Signed-off-by: Sungjae Lee <sung-jae.lee@navercorp.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Ronen Schaffer authored
[KV Offload] Refactor CPU offloading: pluggable CachePolicy, remove Backend abstraction, restructure into `cpu/` package (#37874)
Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
- 23 Mar, 2026 4 commits
-
-
Ranran authored
Signed-off-by: Ranran <1012869439@qq.com>
Signed-off-by: Ranran <hzz5361@psu.edu>
Signed-off-by: ran <hzz5361@psu.edu>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
Matthew Bonanni authored
Signed-off-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>
Co-authored-by: zhrrr <43847754+izhuhaoran@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Benjamin Chislett <chislett.ben@gmail.com>
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
-
Baorun (Lauren) Mu authored
Signed-off-by: Baorun Mu <bmu@nvidia.com>
-
- 22 Mar, 2026 2 commits
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Andreas Karatzas authored
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
- 21 Mar, 2026 2 commits
-
-
Brandon Pelfrey authored
Signed-off-by: Brandon Pelfrey <bpelfrey@nvidia.com>
Signed-off-by: Brandon Pelfrey <brandonpelfrey@gmail.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
-
Francesco Fusco authored
Signed-off-by: Francesco Fusco <ffu@zurich.ibm.com>
-
- 20 Mar, 2026 5 commits
-
-
Santino Ramos authored
Signed-off-by: Santino Ramos <elsantinoramos@gmail.com>
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Flora Feng authored
Signed-off-by: sfeng33 <4florafeng@gmail.com>
-
Flora Feng authored
Signed-off-by: sfeng33 <4florafeng@gmail.com>
-
tianshu-Michael-yu authored
Signed-off-by: tianshu.yu <tianshuyu.formal@gmail.com>
-
- 19 Mar, 2026 1 commit
-
-
zhanqiuhu authored
-
- 18 Mar, 2026 2 commits
-
-
Thillai Chithambaram authored
Signed-off-by: Thillai Chithambaram <thillaichithambaram.a@gmail.com>
-
Andy Lo authored
[Bugfix] Fix KV scales inconsistency in fp8 MLA & FlashInfer kv_cache_dtype "auto" leading to gibberish (#37054)
Signed-off-by: Andy Lo <andy@mistral.ai>
-