- 03 Feb, 2026 1 commit
-
-
王敏 authored
-
- 28 Jan, 2026 1 commit
-
-
chenyue3 authored
-
- 21 Jan, 2026 1 commit
-
-
zhuwenwen authored
-
- 16 Jan, 2026 3 commits
-
-
zhuwenwen authored
Distinguish between PCIe and hglink custom allreduce usage; vllm: export VLLM_CUSTOM_CACHE=1; dtk: export HIP_KERNEL_EVENT_SYSTENFENCE=1; set VLLM_USE_FUSED_RMS_ROPE=1; add SUPPORT_MOE_MARLIN_W16A16 to use moe marlin on bw; support fa kvcache fp8 (todo: add VLLM_USE_QUERY_QUANT to not use q quant); update moe_align_block_size
-
zhuwenwen authored
fix _forward_encoder_attention; remove medusa; set VLLM_PCIE_USE_CUSTOM_ALLREDUCE=1
-
vllmellm authored
[Bugfix][ROCm][performance] Resolve the performance regression issue of the Qwen3-Next-80B-A3B-Thinking under rocm_atten (#32336) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com> (cherry picked from commit e27078ea)
-
- 13 Jan, 2026 1 commit
-
-
cjackal authored
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
-
- 12 Jan, 2026 3 commits
-
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Asaf Joseph Gardin authored
Signed-off-by: Josephasafg <ajgard7@gmail.com>
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 10 Jan, 2026 3 commits
-
-
Vadim Gimpelson authored
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
Signed-off-by: Vadim Gimpelson <156319763+vadiklyutiy@users.noreply.github.com>
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 09 Jan, 2026 3 commits
-
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
R3hankhan authored
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
-
vllmellm authored
[Bugfix][ROCm] Fix Qwen3-Next-80B-A3B-Thinking inference and optimize non-standard block size (544) support under rocm_atten (#31380) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
-
- 08 Jan, 2026 1 commit
-
-
Rabi Mishra authored
Signed-off-by: rabi <ramishra@redhat.com>
-
- 07 Jan, 2026 6 commits
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
vllmellm authored
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
-
weiyu authored
Signed-off-by: Wei-Yu Lin <weiyulin@google.com>
Signed-off-by: weiyu <62784299+weiyu0824@users.noreply.github.com>
-
Lucas Wilkinson authored
[Attention][3/n] Remove usage of deprecated `seq_lens_cpu` and `num_computed_tokens_cpu` CommonAttentionMetadata properties (#31850) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
vllmellm authored
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
-
Jack Yang authored
Signed-off-by: Zhuohao Yang <zy242@cornell.edu>
Co-authored-by: Zhuohao Yang <zy242@cornell.edu>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
- 06 Jan, 2026 4 commits
-
-
Lucas Wilkinson authored
[Attention][2/n] Remove usage of deprecated `seq_lens_cpu` and `num_computed_tokens_cpu` CommonAttentionMetadata properties (#31774) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
Lucas Wilkinson authored
[Attention][1/n] Remove usage of deprecated `seq_lens_cpu` and `num_computed_tokens_cpu` CommonAttentionMetadata properties (#31773) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
zhuwenwen authored
-
zhuwenwen authored
update weights_not_loaded and flash_mla_with_kvcache; update paged_mqa_logits
-
- 05 Jan, 2026 3 commits
-
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
zhuwenwen authored
-
- 02 Jan, 2026 1 commit
-
-
Kevin McKay authored
Signed-off-by: c0de128 <kevin.mckay@outlook.com>
-
- 31 Dec, 2025 1 commit
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
-
- 30 Dec, 2025 1 commit
-
-
yt0428 authored
Signed-off-by: yuantao <2422264527@qq.com>
Signed-off-by: yt0428 <51468697+yt0428@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
- 25 Dec, 2025 1 commit
-
-
zhuwenwen authored
-
- 23 Dec, 2025 4 commits
-
-
Asaf Joseph Gardin authored
-
Patrick von Platen authored
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
zhuwenwen authored
-
Pavani Majety authored
Signed-off-by: Pavani Majety <pmajety@nvidia.com>
-
- 22 Dec, 2025 2 commits
-
-
Benjamin Chislett authored
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-