- 10 Feb, 2026 1 commit
-
-
zhuwenwen authored
[qwen3-480b] MoE(TN) configs for nmz TP=8
[opt] Optimize DeepEP-related code
[fix] Fix the AWQ quantized-inference bug and accuracy issues for DeepSeek MoE models; fix where VLLM_USE_LIGHTOP_MOE_SUM_MUL_ADD is set for AWQ models; update_state, optimize performance, remove redundant operations; pcie: resolve the copy required in custom cudagraph mode (must be used together with dtk)
[feat] Switch default w8a8 gemm impl to blaslt. Support w8a8-fp8 GEMM backend. MoE router capture: add the router_capture toolchain and unified envs configuration
[envs] Set VLLM_CUSTOM_CACHE=1, VLLM_USE_FUSED_RMS_ROPE=1, VLLM_USE_FUSED_FILL_RMS_CAT=1, VLLM_USE_FLASH_ATTN_FP8=1, VLLM_USE_FLASH_MLA_FP8=1; update VLLM_USE_TOPK_RENORM
-
- 05 Jan, 2026 1 commit
-
-
laibao authored
-
- 03 Jun, 2025 1 commit
-
-
Simon Mo authored
Signed-off-by: simon-mo <simon.mo@hey.com>
-
- 15 May, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 13 May, 2025 1 commit
-
-
zhuwenwen authored
Support telechat2 and glm4 nn layout; remove log of request_id
-
- 24 Apr, 2025 1 commit
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 22 Apr, 2025 1 commit
-
-
zhuwenwen authored
Fix glm4.py residual bug
-
- 19 Apr, 2025 1 commit
-
-
zhuwenwen authored
-
- 17 Apr, 2025 1 commit
-
-
intervitens authored
Signed-off-by: intervitens <intervitens@tutanota.com>
-
- 10 Apr, 2025 1 commit
-
-
Yuxuan Zhang authored
Signed-off-by: lvfei.lv <lvfei.lv@alibaba-inc.com>
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Ajay Vohra <ajayvohr@amazon.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
Co-authored-by: Accelerator1996 <lvfei.lv@alibaba-inc.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: yihong <zouzou0208@gmail.com>
Co-authored-by: Lucia Fang <116399278+luccafong@users.noreply.github.com>
Co-authored-by: ajayvohra2005 <ajayvohr@amazon.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Guillaume Calmettes <gcalmettes@scaleway.com>
-
- 07 Apr, 2025 1 commit
-
-
YamPengLi authored
Signed-off-by: YamPengLi <yampayne.lyp@alibaba-inc.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-