- 10 Mar, 2026 13 commits
-
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Pleaplusone authored
Signed-off-by: ganyi <ygan@amd.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Srinivasoo7 authored
feat(kv-offload): Strategy A — StoreReusedOffloadingManager gates CPU stores on reuse frequency (#35342)
Signed-off-by: srinivas_oo7 <Sriusa4414@gmail.com>
Signed-off-by: Sriusa4414@gmail.com
Signed-off-by: Srinivasoo7 <158864704+Srinivasoo7@users.noreply.github.com>
Co-authored-by: srinivas_oo7 <sklinkedin0120@gmail.com>
Co-authored-by: Srinivasoo7 <158864704+Srinivasoo7@users.noreply.github.com>
Co-authored-by: Or Ozeri <oro@il.ibm.com>
-
SoluMilken authored
Signed-off-by: SoluMilken <ypiheyn.imm02g@g2.nctu.edu.tw>
-
Mark McLoughlin authored
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
-
Vadim Gimpelson authored
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
Wentao Ye authored
[Perf] Compute maxsim in worker side, reducing redundant copies, 2.7% E2E throughput improvement (#36159)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
-
- 09 Mar, 2026 13 commits
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
-
Andreas Karatzas authored
[ROCm][CI] Fix ROCm attention backend validation for head sizes, block sizes, and compute capability checks (#36292)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
Roberto L. Castro authored
[Attention][Perf][Kernel] Replace torch.cat with vectorized CUDA kernel MLA query concat - DeepSeek-V3.2 (#34917)
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Li, Jiang authored
Signed-off-by: jiang1.li <jiang1.li@intel.com>
-
cong-or authored
Signed-off-by: cong-or <conchubhar.gannon@gmail.com>
-
Weiguang Li authored
Signed-off-by: liweiguang <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
-
- 08 Mar, 2026 1 commit
-
-
Sage authored
-
- 07 Mar, 2026 6 commits
-
-
Wei Zhao authored
-
PatchyTIS authored
-
Matthew Bonanni authored
-
Mengtao (Martin) Yuan authored
Signed-off-by: Martin Yuan <myuan@meta.com>
Co-authored-by: Martin Yuan <myuan@meta.com>
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>
-
- 06 Mar, 2026 7 commits
-
-
Chuan (Richard) Li authored
Signed-off-by: Li <chuali@amd.com>
-
Nick Hill authored
-
Travis Johnson authored
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
-
Raphaël Rialland authored
[Bugfix] Fix `cudagraph_mode:FULL` dispatch (This does not impact `FULL_AND_PIECEWISE` (default)) (#36165)
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
-
zhanqiuhu authored
Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Zhanqiu Hu <zh338@cornell.edu>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
-
Andreas Karatzas authored
[ROCm][CI] Fix tool use test stability - disable skinny GEMM, prefix caching, eliminate batch variance (#35553)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-