- 11 Feb, 2026 1 commit
-
-
laibao authored
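Port the key commit logic from 011/vllm, using it as a reference. Add the VLLM_USE_MOE_W16A16_TRITON switch, wired to lightop-based runtime capability probing with caching of the enablement result. Pre-pack w13 and w2 in W16A16 Marlin format after weight loading. Keep the monolithic execution path when W16A16 Marlin is enabled, and add a fast path for packed weights in fused_experts_impl. Leave the fallback behavior unchanged when Marlin or lightop is unavailable.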
-
- 06 Feb, 2026 1 commit
-
-
zhuwenwen authored
-
- 26 Jan, 2026 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 24 Jan, 2026 1 commit
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 21 Jan, 2026 1 commit
-
-
Robert Shaw authored
-
- 16 Jan, 2026 1 commit
-
-
zhuwenwen authored
Fix _forward_encoder_attention; remove medusa; set VLLM_PCIE_USE_CUSTOM_ALLREDUCE=1.
-
- 07 Jan, 2026 1 commit
-
-
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 authored
Signed-off-by: Hollow Man <hollowman@opensuse.org>
-
- 06 Jan, 2026 1 commit
-
-
zhuwenwen authored
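Implement the rms_mrope_fuse and rms_mrope_fuse_fake methods to optimize tensor computation. Update forward to take the new fused M-RoPE path when its conditions are met. Extend Qwen3MoeModel to support dynamic parameter dimensions so that it works with this feature.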
-
- 17 Dec, 2025 1 commit
-
-
zhuwenwen authored
Fix the w4a16 conflict in CompressedTensorsLinearMethod. feat(moe): add Marlin W16A16 fused MoE behind VLLM_USE_MARLIN_W16A16_MOE. Replace the fp8_mqa_logits and fp8_paged_mqa_logits interfaces in deepgemm with mqa_logits and paged_mqa_logits from lightop.
-
- 11 Dec, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 10 Dec, 2025 1 commit
-
-
haoyangli-amd authored
Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>
-
- 26 Nov, 2025 1 commit
-
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
- 19 Nov, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 13 Nov, 2025 2 commits
-
-
zhuwenwen authored
Set default_max_num_batched_tokens = 10240; update the layernorm in qwen3_moe; turn off lightop's moe_fused_gate.
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 10 Nov, 2025 1 commit
-
-
jiahanc authored
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
-
- 05 Nov, 2025 1 commit
-
-
Ilya Markov authored
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Signed-off-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
-
- 12 Oct, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 11 Oct, 2025 2 commits
-
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
-
Rahul Tuli authored
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
-
- 05 Oct, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 28 Sep, 2025 3 commits
-
-
Roger Wang authored
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: simon-mo <simon.mo@hey.com>
-
Tyler Michael Smith authored
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Signed-off-by: simon-mo <simon.mo@hey.com>
-
Roger Wang authored
Signed-off-by: Roger Wang <hey@rogerw.io>
-
- 27 Sep, 2025 1 commit
-
-
Tyler Michael Smith authored
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
-
- 26 Sep, 2025 1 commit
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 21 Sep, 2025 1 commit
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
-
- 17 Sep, 2025 3 commits
-
-
bnellnm authored
Signed-off-by: Bill Nell <bnell@redhat.com>
-
whx authored
Signed-off-by: whx-sjtu <2952154980@qq.com>
-
Roger Wang authored
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 13 Sep, 2025 1 commit
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
- 12 Sep, 2025 1 commit
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
- 01 Sep, 2025 1 commit
-
-
JartX authored
[BUGFIX] GPTQ quantization compatibility for Qwen3 MOE models (AutoGPTQ and AutoRound-GPTQ) (#23994)
Signed-off-by: JartX <sagformas@epdcenter.es>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 29 Aug, 2025 1 commit
-
-
Lukas Geiger authored
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
-
- 25 Aug, 2025 1 commit
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
- 20 Aug, 2025 1 commit
-
-
rongfu.leng authored
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 19 Aug, 2025 1 commit
-
-
yiz-liu authored
Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
-
- 13 Aug, 2025 1 commit
-
-
Gh0u1L5 authored
Signed-off-by: Gh0u1L5 <Gh0u1L5@outlook.com>
-
- 12 Aug, 2025 1 commit
-
-
Andy Chen authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
-
- 11 Aug, 2025 1 commit
-
-
JartX authored
-