- 04 Feb, 2026 1 commit
-
-
zhuwenwen authored
-
- 22 Jan, 2026 2 commits
-
-
Eldar Kurtić authored
Signed-off-by: Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com>
Signed-off-by: eldarkurtic <8884008+eldarkurtic@users.noreply.github.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 20 Jan, 2026 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 19 Jan, 2026 1 commit
-
-
Tomas Ruiz authored
Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>
-
- 18 Jan, 2026 1 commit
-
-
tjp_zju authored
Signed-off-by: tom-zju <tanjianpingzju1990@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
- 16 Jan, 2026 1 commit
-
-
zhuwenwen authored
Distinguish between PCIe and hglink custom allreduce usage; vllm: export VLLM_CUSTOM_CACHE=1; dtk: export HIP_KERNEL_EVENT_SYSTENFENCE=1; set VLLM_USE_FUSED_RMS_ROPE=1; add SUPPORT_MOE_MARLIN_W16A16 to use moe marlin on bw; support fa kvcache fp8 (todo: add VLLM_USE_QUERY_QUANT to not use q quant); update moe_align_block_size
-
- 14 Jan, 2026 1 commit
-
-
Yi Liu authored
Signed-off-by: yiliu30 <yi4.liu@intel.com>
-
- 11 Jan, 2026 1 commit
-
-
maang authored
Signed-off-by: maang <maang_h@163.com>
Signed-off-by: maang-h <55082429+maang-h@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
- 10 Jan, 2026 1 commit
-
-
Russell Bryant authored
Signed-off-by: Russell Bryant <rbryant@redhat.com>
-
- 09 Jan, 2026 1 commit
-
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
- 08 Jan, 2026 1 commit
-
-
omer-dayan authored
Signed-off-by: Omer Dayan <omdayan@nvidia.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 07 Jan, 2026 5 commits
-
-
roikoren755 authored
Signed-off-by: Roi Koren <roik@nvidia.com>
-
zhuwenwen authored
-
zhuwenwen authored
-
weiyu authored
Signed-off-by: Wei-Yu Lin <weiyulin@google.com>
Signed-off-by: weiyu <62784299+weiyu0824@users.noreply.github.com>
-
zhuwenwen authored
-
- 06 Jan, 2026 3 commits
-
-
zhuwenwen authored
-
zhuwenwen authored
update weights_not_loaded and flash_mla_with_kvcache; update paged_mqa_logits
-
maang authored
Signed-off-by: maang <maang_h@163.com>
Signed-off-by: maang <55082429+maang-h@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
- 24 Dec, 2025 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 22 Dec, 2025 1 commit
-
-
dengyunyang authored
Signed-off-by: dengyunyang <584797741@qq.com>
-
- 18 Dec, 2025 1 commit
-
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
- 09 Dec, 2025 2 commits
-
-
Tsukasa OI authored
[Model][Quantization] Restore MoE + GGUF models support (incl. Qwen3 MoE) by allowing Sideload Parameters (#30116)
Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
liangel-02 authored
Signed-off-by: Angel Li <liangel@meta.com>
-
- 08 Dec, 2025 1 commit
-
-
wang.yuqi authored
[Model][7/N] Improve all pooling task | Deprecation as_reward_model. Extract hidden states prefer using new multi-vector retrieval API (#26686)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
- 03 Dec, 2025 3 commits
-
-
Tsukasa OI authored
Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
-
zhuwenwen authored
-
zhuwenwen authored
add VLLM_USE_OPT_RESHAPE_AND_CACHE, VLLM_USE_FUSE_SILU_AND_MUL and VLLM_USE_TOPK_RENORM for qwen3-30b
-
- 02 Dec, 2025 1 commit
-
-
zhuwenwen authored
-
- 28 Nov, 2025 1 commit
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 26 Nov, 2025 1 commit
-
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 25 Nov, 2025 1 commit
-
-
Injae Ryou authored
Signed-off-by: Injae Ryou <injaeryou@gmail.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 22 Nov, 2025 2 commits
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Nandan Vallamdasu authored
Signed-off-by: nandan2003 <nandan.vallamdasu@outlook.com>
Signed-off-by: Nandan Vallamdasu <nandan.vallamdasu@outlook.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
- 21 Nov, 2025 2 commits
-
-
Julien Denize authored
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
-
Ning Xie authored
Signed-off-by: Andy Xie <andy.xning@gmail.com>
-
- 20 Nov, 2025 2 commits
-
-
zhuwenwen authored
update VLLM_USE_PD_SPLIT=0 (for dspk) and VLLM_USE_PD_SPLIT=1 (for others)
-
liangel-02 authored
Signed-off-by: Angel Li <liangel@meta.com>
-
- 19 Nov, 2025 1 commit
-
-
Jerry Zhang authored
Signed-off-by: Jerry Zhang <jerryzh168@gmail.com>
-