- 25 Feb, 2026 1 commit
Your Name authored
- 11 Feb, 2026 2 commits
- 09 Feb, 2026 3 commits
- 08 Feb, 2026 1 commit
王敏 authored
- 06 Feb, 2026 2 commits
- 05 Feb, 2026 1 commit
jujl1 authored
- 03 Feb, 2026 1 commit
jujl1 authored
- 28 Jan, 2026 3 commits
- 27 Jan, 2026 2 commits
- 26 Jan, 2026 1 commit
zhuwenwen authored
- 23 Jan, 2026 1 commit
zhuwenwen authored
- 17 Jan, 2026 3 commits
- 16 Jan, 2026 1 commit
zhuwenwen authored
Add VLLM_USE_FUSED_CACHE_QUANT_BMM_MLA to use fused rmsnorm + contiguous + rope (for dpsk-v3) + concat_and_cache_mla + q quant, and to control bmm (todo) + cat + mla (fp8)
- 14 Jan, 2026 2 commits
- 13 Jan, 2026 3 commits
- 12 Jan, 2026 2 commits
- 09 Jan, 2026 1 commit
jujl1 authored
- 07 Jan, 2026 1 commit
laibao authored
- When repeat_counts/CPU metadata is numpy/array-like, repeat_interleave/.to() crashes
- Convert it uniformly to a CPU torch.Tensor first, then expand it and copy it to the GPU
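
The fix described in this commit message can be sketched roughly as follows. This is a minimal illustration only, not the code from the commit: the function name `expand_metadata` and its parameters are hypothetical, and the only assumption taken from the message is that array-like inputs are normalized to CPU `torch.Tensor` objects before `repeat_interleave` and `.to()` are called.

```python
import numpy as np
import torch


def expand_metadata(values, repeat_counts, device):
    """Hypothetical helper: repeat each entry of `values` by `repeat_counts`.

    Both inputs may be lists, numpy arrays, or tensors. A numpy array has no
    .to() method and is not accepted as `repeats` by torch.repeat_interleave,
    so everything is normalized to a CPU torch.Tensor first.
    """
    # Normalize any array-like input to a CPU tensor.
    values_cpu = torch.as_tensor(values, device="cpu")
    counts_cpu = torch.as_tensor(repeat_counts, dtype=torch.long, device="cpu")

    # Expand on the CPU, then copy the result to the target device once.
    expanded = torch.repeat_interleave(values_cpu, counts_cpu, dim=0)
    return expanded.to(device)


if __name__ == "__main__":
    target = "cuda" if torch.cuda.is_available() else "cpu"
    out = expand_metadata(np.array([10, 20, 30]), [2, 0, 1], target)
    print(out.cpu().tolist())
```

Expanding on the CPU and copying once keeps the device transfer to a single call; whether the actual commit expands before or after the copy beyond what the message states is not known here.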
- 06 Jan, 2026 1 commit
jujl1 authored
- 04 Jan, 2026 2 commits
- 31 Dec, 2025 3 commits
- 29 Dec, 2025 1 commit
yangql authored
- 26 Dec, 2025 1 commit
zhuwenwen authored
- 25 Dec, 2025 1 commit
王敏 authored