- 24 Apr, 2026 1 commit
-
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
- 23 Apr, 2026 1 commit
-
-
Srreyansh Sethi authored
Signed-off-by: Srreyansh Sethi <srreyansh.sethi@gmail.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
-
- 20 Apr, 2026 1 commit
-
-
larryli2-amd authored
[ROCm][Feature] Enable AITER MLA attention backend to work with Eagle3 speculative decoding on ROCm (#39616)
Signed-off-by: larryli2-amd <larryli2@amd.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
-
- 16 Apr, 2026 1 commit
-
-
Giancarlo Delfin authored
Signed-off-by: Giancarlo Delfin <gdelfin@inferact.ai>
-
- 10 Apr, 2026 1 commit
-
-
xaguilar-amd authored
Signed-off-by: xaguilar-amd <xaguilar@amd.com>
-
- 09 Apr, 2026 1 commit
-
-
Yongye Zhu authored
-
- 08 Apr, 2026 1 commit
-
-
Roberto L. Castro authored
[Perf][Kernel] Persistent TopK scheduler: unified CUDAGraph-safe kernel with dynamic per-row dispatch - DeepSeek-V3.2 DSA decode (#37421)
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
-
- 07 Apr, 2026 2 commits
-
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Chendi.Xue authored
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
-
- 03 Apr, 2026 1 commit
-
-
wufann authored
Signed-off-by: wufann <36477220+wufann@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
- 02 Apr, 2026 1 commit
-
-
Koushik Dutta authored
Signed-off-by: Koushik Dutta <koushd@gmail.com>
Co-authored-by: root <root@ubuntu-nvidia.localdomain>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
- 01 Apr, 2026 3 commits
-
-
Elvir Crnčević authored
[Bugfix] Revert "Zero-init MLA attention output buffers to prevent NaN from CUDA graph padding" (#38359)
Signed-off-by: Elvir Crncevic <elvircrn@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 30 Mar, 2026 1 commit
-
-
SandishKumarHN authored
Signed-off-by: SandishKumarHN <sandish@fb.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 26 Mar, 2026 2 commits
-
-
haosdent authored
Signed-off-by: haosdent <haosdent@gmail.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
-
Chauncey authored
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
-
- 25 Mar, 2026 2 commits
-
-
Sathish Sanjeevi authored
Signed-off-by: Sathish Sanjeevi <sathish.krishnan.p.s@gmail.com>
-
Chauncey authored
[Revert] Remove CUDA torch fallbacks for fp8_mqa_logits/fp8_paged_mqa_logits_torch function (#37968)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
-
- 23 Mar, 2026 1 commit
-
-
Ranran authored
Signed-off-by: Ranran <1012869439@qq.com>
Signed-off-by: Ranran <hzz5361@psu.edu>
Signed-off-by: ran <hzz5361@psu.edu>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
- 20 Mar, 2026 1 commit
-
-
Kaihang Jiang authored
Signed-off-by: Kaihang Jiang <kaihangj@nvidia.com>
-
- 19 Mar, 2026 1 commit
-
-
Elvir Crnčević authored
Signed-off-by: Elvir Crncevic <elvircrn@gmail.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 18 Mar, 2026 1 commit
-
-
Andy Lo authored
[Bugfix] Fix KV scales inconsistency in fp8 MLA & FlashInfer kv_cache_dtype "auto" leading to gibberish (#37054)
Signed-off-by: Andy Lo <andy@mistral.ai>
-
- 16 Mar, 2026 3 commits
-
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
haosdent authored
Signed-off-by: haosdent <haosdent@gmail.com>
Co-authored-by: Or Ozeri <oro@il.ibm.com>
-
- 13 Mar, 2026 1 commit
-
-
Dimitrios Bariamis authored
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
-
- 12 Mar, 2026 1 commit
-
-
grimulkan authored
Signed-off-by: grimulkan <grimulkan@gmail.com>
-
- 11 Mar, 2026 3 commits
-
-
Wuxun Zhang authored
Signed-off-by: Zhang, Wuxun <wuxun.zhang@intel.com>
-
pschlan-amd authored
Signed-off-by: Patrick Schlangen <pschlan@amd.com>
-
Benjamin Chislett authored
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
-
- 10 Mar, 2026 2 commits
-
-
Pleaplusone authored
Signed-off-by: ganyi <ygan@amd.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 09 Mar, 2026 3 commits
-
-
Andreas Karatzas authored
[ROCm][CI] Fix ROCm attention backend validation for head sizes, block sizes, and compute capability checks (#36292)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
Roberto L. Castro authored
[Attention][Perf][Kernel] Replace torch.cat with vectorized CUDA kernel MLA query concat - DeepSeek-V3.2 (#34917)
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 07 Mar, 2026 1 commit
-
-
Wei Zhao authored
-
- 06 Mar, 2026 2 commits
-
-
Chuan (Richard) Li authored
Signed-off-by: Li <chuali@amd.com>
-
Rohan Potdar authored
Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
-
- 05 Mar, 2026 1 commit
-
-
Jiayi Yan authored
Signed-off-by: 1195343015 <1195343015@qq.com>
Signed-off-by: Jiayi Yan <66017932+1195343015@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-