- 03 Feb, 2026 1 commit
-
-
王敏 authored
-
- 02 Feb, 2026 1 commit
-
-
zhuwenwen authored
-
- 28 Jan, 2026 1 commit
-
-
chenyue3 authored
-
- 23 Jan, 2026 1 commit
-
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
(cherry picked from commit ea6102b8)
-
- 21 Jan, 2026 1 commit
-
-
zhuwenwen authored
-
- 16 Jan, 2026 4 commits
-
-
zhuwenwen authored
Distinguish between PCIe and hglink use of custom allreduce; vllm: export VLLM_CUSTOM_CACHE=1; dtk: export HIP_KERNEL_EVENT_SYSTENFENCE=1; set VLLM_USE_FUSED_RMS_ROPE=1; add SUPPORT_MOE_MARLIN_W16A16 to use moe marlin on bw; support fa kvcache fp8 (todo: add VLLM_USE_QUERY_QUANT to not use q quant); update moe_align_block_size
-
zhuwenwen authored
fix _forward_encoder_attention; remove medusa; set VLLM_PCIE_USE_CUSTOM_ALLREDUCE=1
-
Pleaplusone authored
Signed-off-by: ganyi <ygan@amd.com>
(cherry picked from commit 77c16df3)
-
vllmellm authored
[Bugfix][ROCm][performance] Resolve the performance regression issue of the Qwen3-Next-80B-A3B-Thinking under rocm_atten (#32336)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
(cherry picked from commit e27078ea)
-
- 13 Jan, 2026 3 commits
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
cjackal authored
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
- 12 Jan, 2026 12 commits
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
Roger Wang authored
Signed-off-by: Roger Wang <hey@rogerw.io>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Asaf Joseph Gardin authored
Signed-off-by: Josephasafg <ajgard7@gmail.com>
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
Hongxin Xu authored
Signed-off-by: xhx1022 <1737006628@qq.com>
Signed-off-by: Hongxin Xu <70438206+xhx1022@users.noreply.github.com>
Signed-off-by: arlenxu <arlenxu@tencent.com>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Co-authored-by: arlenxu <arlenxu@tencent.com>
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 11 Jan, 2026 3 commits
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
rongfu.leng authored
Signed-off-by: lengrongfu <lenronfu@gmail.com>
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
- 10 Jan, 2026 9 commits
-
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
Vadim Gimpelson authored
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
Signed-off-by: Vadim Gimpelson <156319763+vadiklyutiy@users.noreply.github.com>
-
Or Ozeri authored
Signed-off-by: Or Ozeri <oro@il.ibm.com>
-
jvlunteren authored
Signed-off-by: Jan van Lunteren <jvl@zurich.ibm.com>
-
Frelam authored
[Bugfix] fix encoder cache leak of waiting requests in scheduler to solve stuck in CPU scheduling (#31857)
Signed-off-by: frelam <frelam112233@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Lucas Kabela authored
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 09 Jan, 2026 4 commits
-
-
zhrrr authored
Signed-off-by: izhuhaoran <izhuhaoran@qq.com>
Signed-off-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Yifan Qiao authored
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
-