- 10 Feb, 2026 1 commit
-
-
zhuwenwen authored
[qwen3-480b] MoE(TN) configs for nmz TP=8
[opt] Optimize deepep-related code
[fix] Fix the AWQ quantized inference bug and accuracy issues in DeepSeek MoE models; fix where VLLM_USE_LIGHTOP_MOE_SUM_MUL_ADD is set for AWQ models; update_state: improve performance and remove redundant operations; pcie: resolve the copy required in custom cudagraph mode (must be used together with dtk)
[feat] Switch default w8a8 gemm impl to blaslt. Support w8a8-fp8 GEMM backend. MoE router capture: add the router_capture toolchain and unified envs configuration
[envs] Set VLLM_CUSTOM_CACHE=1, VLLM_USE_FUSED_RMS_ROPE=1, VLLM_USE_FUSED_FILL_RMS_CAT=1, VLLM_USE_FLASH_ATTN_FP8=1, VLLM_USE_FLASH_MLA_FP8=1; update VLLM_USE_TOPK_RENORM
-
- 18 Dec, 2025 1 commit
-
-
xiabo authored
vllm: export VLLM_CUSTOM_CACHE=1; dtk: export HIP_KERNEL_EVENT_SYSTENFENCE=1
-
- 18 Nov, 2025 1 commit
-
-
wujl5 authored
-
- 18 Apr, 2025 2 commits
- 12 Apr, 2025 1 commit
-
-
Tianer Zhou authored
Signed-off-by: Tianer Zhou <ezhoureal@gmail.com>
-
- 01 Apr, 2025 1 commit
-
-
Ilya Markov authored
Signed-off-by: ilmarkov <imarkov@redhat.com>
Co-authored-by: ilmarkov <imarkov@redhat.com>
-
- 28 Jan, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 07 Nov, 2024 1 commit
-
-
Hanzhi Zhou authored
Signed-off-by: Hanzhi Zhou <hanzhi713@gmail.com>
-
- 25 Sep, 2024 1 commit
-
-
sasha0552 authored
-
- 24 Sep, 2024 1 commit
-
-
Hanzhi Zhou authored
-
- 22 May, 2024 1 commit
-
-
Michael Goin authored
-
- 22 Mar, 2024 1 commit
-
-
Hanzhi Zhou authored
-
- 29 Jan, 2024 1 commit
-
-
Hanzhi Zhou authored
-
- 27 Jan, 2024 1 commit
-
-
Hanzhi Zhou authored
-