- 16 Jan, 2026 1 commit
-
-
zhuwenwen authored
add VLLM_USE_FUSED_CACHE_QUANT_BMM_MLA to use fused rmsnorm + contiguous + rope (for dpsk-v3) + concat_and_cache_mla + q quant, and to control bmm (todo) + cat + mla (fp8)
-
- 19 Dec, 2025 1 commit
-
-
zhuwenwen authored
-
- 17 Dec, 2025 1 commit
-
-
zhuwenwen authored
-
- 16 Dec, 2025 1 commit
-
-
zhuwenwen authored
set VLLM_USE_LIGHTOP_RMS_ROPE_CONCAT=1
-
- 07 Dec, 2025 1 commit
-
-
zhuwenwen authored
-
- 23 Nov, 2025 1 commit
-
-
zhuwenwen authored
-
- 13 Oct, 2025 1 commit
-
-
zhuwenwen authored
-
- 28 Sep, 2025 1 commit
-
-
yangql authored
-
- 21 Aug, 2025 1 commit
-
-
zhuwenwen authored
-
- 15 Aug, 2025 2 commits
- 03 Jun, 2025 1 commit
-
-
Simon Mo authored
Signed-off-by: simon-mo <simon.mo@hey.com>
-
- 10 Apr, 2025 1 commit
-
-
zhuwenwen authored
-
- 27 Feb, 2025 1 commit
-
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-