- 09 Jan, 2026 2 commits
-
-
Shanshan Shen authored
Signed-off-by: shen-shanshan <467638484@qq.com>
-
vllmellm authored
[Bugfix][ROCm]Fix Qwen3-Next-80B-A3B-Thinking inference and optimize non-standard block size (544) support under rocm_atten (#31380)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
-
- 08 Jan, 2026 4 commits
-
-
Lucas Wilkinson authored
[Misc] Fix `Current vLLM config is not set.` warnings, assert to avoid issues in the future (#31747)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Rabi Mishra authored
Signed-off-by: rabi <ramishra@redhat.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 07 Jan, 2026 4 commits
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
BlankR authored
Signed-off-by: BlankR <hjyblanche@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
weiyu authored
Signed-off-by: Wei-Yu Lin <weiyulin@google.com>
Signed-off-by: weiyu <62784299+weiyu0824@users.noreply.github.com>
-
Jack Yang authored
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Co-authored-by: Zhuohao Yang <zy242@cornell.edu>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
- 06 Jan, 2026 1 commit
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 05 Jan, 2026 1 commit
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 02 Jan, 2026 1 commit
-
-
Andreas Karatzas authored
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
- 30 Dec, 2025 1 commit
-
-
yt0428 authored
Signed-off-by: yuantao <2422264527@qq.com>
Signed-off-by: yt0428 <51468697+yt0428@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
- 28 Dec, 2025 1 commit
-
-
Boyuan Feng authored
Signed-off-by: Boyuan Feng <boyuan@meta.com>
-
- 23 Dec, 2025 1 commit
-
-
Andreas Karatzas authored
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
- 22 Dec, 2025 1 commit
-
-
Kevin McKay authored
Signed-off-by: c0de128 <kevin.mckay@outlook.com>
-
- 20 Dec, 2025 1 commit
-
-
zejunchen-zejun authored
[Bugfix] fix the alias bug of AttentionBackendEnum when register CUSTOM attention backend to vllm (#30869)
Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
-
- 19 Dec, 2025 2 commits
-
-
Thomas Parnell authored
[Bugfix] [Kernel] Triton attention kernels: mask out V blocks that fall outside sliding window (#30887)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 18 Dec, 2025 4 commits
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Isotr0py authored
[MM Encoder]: Migrate legacy ViT `MultiHeadAttention` to new `MMEncoderAttention` interface (#30684)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Andreas Karatzas authored
[ROCm][Bugfix] Fix `fa_version` argument error in `flash_attn_maxseqlen_wrapper` for ROCm without aiter (#30909)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
Isotr0py authored
-
- 17 Dec, 2025 1 commit
-
-
Hank_ authored
Signed-off-by: Hank <hcc.mayday@gmail.com>
-
- 16 Dec, 2025 3 commits
-
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
TJian authored
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Stanislaw Wozniak <stw@zurich.ibm.com>
-
- 15 Dec, 2025 3 commits
-
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Isotr0py authored
[Platform] Refactor Platform attention backend selection to avoid breakpoint for OOT platform (#30212)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
Shanshan Shen authored
[CustomOp][MM] Extract MMEncoderAttention as CustomOp and replace the backend of QwenVisionAttention with it. (#30125)
Signed-off-by: shen-shanshan <467638484@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
-
- 13 Dec, 2025 1 commit
-
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 12 Dec, 2025 1 commit
-
-
jvlunteren authored
Signed-off-by: Jan van Lunteren <jvl@zurich.ibm.com>
Signed-off-by: jvlunteren <161835099+jvlunteren@users.noreply.github.com>
Co-authored-by: Thomas Parnell <tom.parnell@gmail.com>
Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com>
-
- 11 Dec, 2025 2 commits
-
-
Qiu authored
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 10 Dec, 2025 1 commit
-
-
Lucas Wilkinson authored
[Attention] Make seq_lens_cpu optional in CommonAttentionMetadata to enable true async spec-decode (#29624)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Benjamin Chislett <chislett.ben@gmail.com>
-
- 09 Dec, 2025 2 commits
-
-
rasmith authored
[CI/Build] Make test_mha_attn.py run on correct platform only and check for flash_attn_varlen_func in layer.py (#29145)
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
- 08 Dec, 2025 1 commit
-
-
Dazhi Jiang authored
Signed-off-by: Dazhi Jiang <dazhi_jiang@163.com>
-
- 07 Dec, 2025 1 commit
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-