- 15 Dec, 2025 1 commit
-
-
王敏 authored
-
- 10 Dec, 2025 1 commit
-
-
王敏 authored
-
- 08 Dec, 2025 1 commit
-
-
王敏 authored
-
- 02 Dec, 2025 1 commit
-
-
王敏 authored
-
- 13 Nov, 2025 1 commit
-
-
zhuwenwen authored
restore the default settings of disable_cascade_attn; add VLLM_USE_OPT_ZEROS to replace triton_ (torch.zeros); set default_max_num_batched_tokens = 10240; update the layernorm of qwen3_moe
-
- 07 Nov, 2025 1 commit
-
-
zhuwenwen authored
-
- 03 Nov, 2025 1 commit
-
-
zhuwenwen authored
-
- 01 Nov, 2025 1 commit
-
-
王敏 authored
-
- 29 Oct, 2025 1 commit
-
-
zhuwenwen authored
-
- 27 Oct, 2025 1 commit
-
-
王敏 authored
-
- 24 Oct, 2025 1 commit
-
-
zhuwenwen authored
support prefix cache on kme; fix the error in test_moe caused by moe align not supporting 511 and 211; multi-modal switching to torch implementation on z100l&k100
-
- 15 Oct, 2025 4 commits
- 13 Oct, 2025 2 commits
- 10 Oct, 2025 1 commit
-
-
zhuwenwen authored
-
- 30 Sep, 2025 3 commits
- 25 Sep, 2025 1 commit
-
-
zhuwenwen authored
[kernels] update moe_align_block_size and moe_sum interface
-
- 24 Sep, 2025 1 commit
-
-
zhuwenwen authored
[FIX] fix the bug where mtp and VLLM_USE_TRITON_CAT cannot be enabled together
-
- 22 Sep, 2025 1 commit
-
-
wujl5 authored
-
- 18 Sep, 2025 1 commit
-
-
zhuwenwen authored
[kernel] add VLLM_USE_DEEPSEEK_MOE_SUM_MUL_AND to use lightop's moe_sum fusion operator for deepseek
-
- 14 Sep, 2025 1 commit
-
-
wujl5 authored
-
- 10 Sep, 2025 1 commit
-
-
zhuwenwen authored
-
- 09 Sep, 2025 2 commits
- 04 Sep, 2025 1 commit
-
-
王敏 authored
2. fix the all gather hang in large-EP inference when mtp > 1
-
- 01 Sep, 2025 2 commits
- 29 Aug, 2025 1 commit
-
-
yangql authored
-
- 28 Aug, 2025 1 commit
-
-
王敏 authored
-
- 25 Aug, 2025 1 commit
-
-
王敏 authored
-
- 15 Aug, 2025 1 commit
-
-
王敏 authored
-
- 07 Aug, 2025 1 commit
-
-
王敏 authored
-
- 06 Aug, 2025 1 commit
-
-
zhuwenwen authored
This reverts merge request !169
-
- 05 Aug, 2025 1 commit
-
-
王敏 authored
-
- 04 Aug, 2025 1 commit
-
-
zhuwenwen authored
-