- 27 Jan, 2026 2 commits
- 26 Jan, 2026 5 commits
- 25 Jan, 2026 6 commits
- 20 Jan, 2026 1 commit
-
-
Shengyu Liu authored
-
- 19 Jan, 2026 1 commit
-
-
Jiashi Li authored
Co-authored-by: baowending.bwd <baowending.bwd@alibaba-inc.com>
-
- 16 Jan, 2026 1 commit
-
-
Shengyu Liu authored
* Multiple updates and refactorings
* Remove dead code
-
- 30 Sep, 2025 4 commits
-
-
Jiashi Li authored
-
Jiashi Li authored
-
Jiashi Li authored
-
Shengyu Liu authored
-
- 29 Sep, 2025 6 commits
-
-
Shengyu Liu authored
-
Shengyu Liu authored
-
Simon Mo authored
Signed-off-by: simon-mo <simon.mo@hey.com>
-
Shengyu Liu authored
Add Sparse Attention Kernels on Hopper
-
Shengyu Liu authored
-
Shengyu Liu authored
-
- 24 Sep, 2025 2 commits
-
-
Shengyu Liu authored
-
Shengyu Liu authored
-
- 22 Sep, 2025 1 commit
-
-
zhang authored
-
- 27 Aug, 2025 1 commit
-
-
Zeyu WANG authored
* fix calc space bug
* use python code to allocate the buffer for backward kernel
-
- 25 Aug, 2025 2 commits
-
-
Li Xiang authored
* get rid of cudaMalloc and cudaFree
* minor fix
Co-authored-by: Jiashi Li <js.li@high-flyer.cn>
-
zhang authored
-
- 14 Aug, 2025 2 commits
- 01 Aug, 2025 1 commit
-
-
Zeyu WANG authored
* Add more GPU architectures support
* Merge fmha and mla runner
* add varlen & non varlen support, and add non-contiguous tensor support
* update readme
* add varlen api
Co-authored-by: dianzhangc <dianzhangc@nvidia.com>
-
- 29 Apr, 2025 2 commits
- 28 Apr, 2025 1 commit
-
-
ljss authored
-
- 23 Apr, 2025 2 commits
-
-
Shengyu Liu authored
-
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 authored
Thank you for open-sourcing FlashMLA! I just read the write-up, and it is amazing work. I found some very minor typos, and the link to the FlashAttention-3 paper is wrong (it points to the original FlashAttention paper), so I am sending this PR. Thanks again!
Signed-off-by: Hollow Man <hollowman@opensuse.org>
-