- 26 Mar, 2024 11 commits
- 23 Jan, 2024 1 commit
Tri Dao authored
Co-authored-by: ljss <450993438@qq.com>
- 21 Jan, 2024 3 commits
- 20 Jan, 2024 1 commit
Tri Dao authored
- 15 Jan, 2024 1 commit
Tri Dao authored
- 14 Jan, 2024 4 commits
- 13 Jan, 2024 1 commit
Tri Dao authored
- 12 Jan, 2024 1 commit
Tri Dao authored
- 22 Dec, 2023 1 commit
Tri Dao authored
- 20 Dec, 2023 1 commit
Sanghun Cho authored
* hard-code alibi in fwd
* use params.h as hun_heads
* hard-code alibi in bwd
* add alibi on/off option
* compute alibi_start, ratio outside of kernels
* fix minor merge conflict
* add test_alibi.py
* change apply_alibi() location before masking
* add alibi in splitkv kernel
* fix backward func # of returns
* add out-of-bound check in apply_alibi()
* update test_alibi.py
* update test_alibi.py for kvcache
* simplify alibi parameter interface
* fix performance issue by computing alibi outside of branch
* update test_flash_attn_varlen_func() for left padding
* implement alibi_slopes (b, nh) loading
* optimize apply_alibi() a bit
* update test cases for alibi_slopes loading
* reflect stylistic comments
* disable "seqlenq_ngroups_swapped" when using alibi
---------
Co-authored-by: monk.detective <monk.detective@kakaobrain.com>
- 20 Nov, 2023 1 commit
Tri Dao authored
- 03 Oct, 2023 1 commit
Tri Dao authored
- 26 Sep, 2023 1 commit
Tri Dao authored
Co-authored-by: Timothee Lacroix <t@mistral.ai>
- 21 Sep, 2023 1 commit
Tri Dao authored
- 16 Sep, 2023 1 commit
Tri Dao authored
- 13 Sep, 2023 1 commit
Tri Dao authored
- 12 Sep, 2023 1 commit
Tri Dao authored
- 11 Sep, 2023 1 commit
Tri Dao authored
- 04 Sep, 2023 2 commits
- 03 Sep, 2023 1 commit
Tri Dao authored
- 30 Aug, 2023 1 commit
Tri Dao authored
- 29 Aug, 2023 1 commit
Tri Dao authored
- 25 Aug, 2023 1 commit
Tri Dao authored
- 24 Aug, 2023 1 commit
BoxiangW authored
Support flash attention 2 with causal masking when KV's seq length is longer than Q's seq length. (#436)