- 22 Apr, 2025 2 commits
-
-
Shengyu Liu authored
-
Shengyu Liu authored
* Fix benchmark script
* Performance optimization for compute-bound cases
* Add new testcase (s_k = 16384)
* Update README.md
* Update comment
* Update README.md
* Add the deep-dive blog
* Add background color for MLA Kernel Sched.drawio.svg
* Use relative path for the schedule image
* Move flash_mla.h to kernels/params.h
-
- 01 Mar, 2025 2 commits
- 27 Feb, 2025 3 commits
- 26 Feb, 2025 4 commits
- 25 Feb, 2025 4 commits
-
-
yangsijia.614 authored
-
ljss authored
-
Jiashi Li authored
Support FP16 dtype in FlashMLA kernel
-
ljss authored
-
- 24 Feb, 2025 15 commits
-
-
Sijia Chen authored
-
Jiashi Li authored
feat: add benchmark for flash_infer vs flash_mla
-
Jiashi Li authored
Update docstring
-
zhengsize authored
-
chunyang.wen authored
-
zhengsize authored
-
Sijia Chen authored
-
Sijia Chen authored
-
Jiashi Li authored
support Windows build
-
Jiashi Li authored
minor fix test
-
lancerts authored
-
Jiashi Li authored
tests: Triton 3.2.0 removed the fast_flush parameter from do_bench
-
程元 authored
-
sazc authored
-
Jiashi Li authored
i
-