- 25 Aug, 2025 2 commits
-
-
Li Xiang authored
* get rid of cudaMalloc and cudaFree
* minor fix
---------
Co-authored-by: Jiashi Li <js.li@high-flyer.cn>
-
zhang authored
-
- 14 Aug, 2025 2 commits
- 01 Aug, 2025 1 commit
-
-
Zeyu WANG authored
* Add support for more GPU architectures
* Merge fmha and mla runner
* add varlen & non-varlen support, and add non-contiguous tensor support
* update readme
* add varlen api
---------
Co-authored-by: dianzhangc <dianzhangc@nvidia.com>
-
- 29 Apr, 2025 2 commits
- 28 Apr, 2025 1 commit
-
-
ljss authored
-
- 23 Apr, 2025 2 commits
-
-
Shengyu Liu authored
-
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 authored
Thank you for open-sourcing FlashMLA! I just read the write-up, and it is very impressive work. I found a few minor typos, and the link to the FlashAttention-3 paper is wrong (it points to the original FlashAttention paper), so I am sending this PR. Thanks again!
Signed-off-by: Hollow Man <hollowman@opensuse.org>
-
- 22 Apr, 2025 2 commits
-
-
Shengyu Liu authored
-
Shengyu Liu authored
* Fix benchmark script
* Performance optimization for compute-bound cases
* Add new testcase (s_k = 16384)
* Update README.md
* Update comment
* Update README.md
* Add the deep-dive blog
* Add background color for MLA Kernel Sched.drawio.svg
* Use relative path for the schedule image
* Move flash_mla.h to kernels/params.h
-
- 01 Mar, 2025 2 commits
- 27 Feb, 2025 3 commits
- 26 Feb, 2025 4 commits
- 25 Feb, 2025 4 commits
-
-
yangsijia.614 authored
-
ljss authored
-
Jiashi Li authored
Support FP16 dtype in FlashMLA kernel
-
ljss authored
-
- 24 Feb, 2025 15 commits
-
-
Sijia Chen authored
-
Jiashi Li authored
feat: add benchmark for flash_infer vs flash_mla
-
Jiashi Li authored
Update docstring
-
zhengsize authored
-
chunyang.wen authored
-
zhengsize authored
-
Sijia Chen authored
-
Sijia Chen authored
-
Jiashi Li authored
support Windows build
-
Jiashi Li authored
minor fix test
-
lancerts authored
-
Jiashi Li authored
tests: Triton 3.2.0 removed the fast_flush parameter from do_bench
-
程元 authored
-
sazc authored
-
Jiashi Li authored
i
-