- 30 Jan, 2026 1 commit
-
-
zhuwenwen authored
-
- 23 Jan, 2026 1 commit
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
(cherry picked from commit 8ebf271b)
-
- 16 Jan, 2026 2 commits
-
-
zhuwenwen authored
Distinguish between PCIe and hglink custom allreduce usage; vllm: export VLLM_CUSTOM_CACHE=1; dtk: export HIP_KERNEL_EVENT_SYSTENFENCE=1; set VLLM_USE_FUSED_RMS_ROPE=1; add SUPPORT_MOE_MARLIN_W16A16 to use moe marlin on bw; support fa kvcache fp8 (todo: add VLLM_USE_QUERY_QUANT to not use q quant); update moe_align_block_size
-
zhuwenwen authored
fix _forward_encoder_attention; remove medusa; set VLLM_PCIE_USE_CUSTOM_ALLREDUCE=1
-
- 11 Jan, 2026 1 commit
-
-
Fadi Arafeh authored
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
-
- 09 Jan, 2026 3 commits
-
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
inkcherry authored
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
-
- 07 Jan, 2026 2 commits
-
-
Kate Cheng authored
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
Co-authored-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
-
zhuwenwen authored
-
- 05 Jan, 2026 1 commit
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
- 19 Dec, 2025 2 commits
-
-
Seiji Eicher authored
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
-
zhuwenwen authored
-
- 18 Dec, 2025 3 commits
-
-
Elizabeth Thomas authored
Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
zhuwenwen authored
-
SungMinCho authored
Signed-off-by: SungMinCho <tjdals4565@gmail.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>
-
- 17 Dec, 2025 2 commits
-
-
Zhengxu Chen authored
Signed-off-by: zhxchen17 <zhxchen17@fb.com>
-
zhuwenwen authored
Fix the w4a16 conflict in CompressedTensorsLinearMethod; feat(moe): add Marlin W16A16 fused MoE behind VLLM_USE_MARLIN_W16A16_MOE; replace the fp8_mqa_logits and fp8_paged_mqa_logits interfaces in deepgemm with mqa_logits and paged_mqa_logits from lightop
-
- 16 Dec, 2025 1 commit
-
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Stanislaw Wozniak <stw@zurich.ibm.com>
-
- 13 Dec, 2025 1 commit
-
-
zhuwenwen authored
-
- 12 Dec, 2025 1 commit
-
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
- 11 Dec, 2025 2 commits
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 10 Dec, 2025 1 commit
-
-
Jialin Ouyang authored
[Perf] Enable environment cache in EngineCore to enable the feature for UniProcExecutor as well (#29289)
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
-
- 09 Dec, 2025 3 commits
-
-
Benjamin Chislett authored
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
Signed-off-by: Benjamin Chislett <chislett.ben@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Wentao Ye authored
[Compile] Fix torch warning `TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled` (#29897)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Ming Yang authored
Signed-off-by: Ming Yang <minos.future@gmail.com>
-
- 04 Dec, 2025 1 commit
-
-
dtc authored
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Signed-off-by: dtc <dtcccc@linux.alibaba.com>
Co-authored-by: Nicolò Lucchesi <nicolo.lucchesi@gmail.com>
-
- 03 Dec, 2025 4 commits
-
-
Shengqi Chen authored
[CI] fix docker image build by specifying merge-base commit id when downloading pre-compiled wheels (#29930)
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
-
Elizabeth Thomas authored
Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Jane Xu <janeyx@meta.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Johnny Yang <johnnyyang@google.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: bruceszchen <bruceszchen@tencent.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Jane (Yuan) Xu <31798555+janeyx99@users.noreply.github.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Johnny Yang <24908445+jcyang43@users.noreply.github.com>
-
Amr Mahdi authored
Signed-off-by: Amr Mahdi <amrmahdi@meta.com>
-
zhuwenwen authored
add VLLM_USE_OPT_RESHAPE_AND_CACHE, VLLM_USE_FUSE_SILU_AND_MUL and VLLM_USE_TOPK_RENORM for qwen3-30b
-
- 02 Dec, 2025 3 commits
-
-
Andrew Xia authored
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
-
zhuwenwen authored
-
Shengqi Chen authored
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
-
- 01 Dec, 2025 4 commits
-
-
Kevin H. Luu authored
-
Shengqi Chen authored
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
-
Yifei Zhang authored
Signed-off-by: Yifei Zhang <yifei.zhang1992@outlook.com>
-
Shu Wang authored
Signed-off-by: Shu Wang <shuw@nvidia.com>
Signed-off-by: Shu Wang. <shuw@nvidia.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: root <root@umbriel-b200-017.ipp4a1.colossus.nvidia.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
-
- 29 Nov, 2025 1 commit
-
-
Jinzhen Lin authored
Signed-off-by: Jinzhen Lin <jinzhen.ljz@antgroup.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin@redhat.com>
-