gaoqiong / flash-attention · Commits · 35d589fa81a68b7cb806982af4fafac0f19d644d
File history: csrc/flash_attn/src/fmha_fprop_kernel_1xN.h
14 Oct, 2022 (1 commit)
Implement attention kernel that splits the batch into two · 5badfb78 · Tri Dao, authored Oct 13, 2022

10 Jul, 2022 (4 commits)
Implement for bf16 · de19de7a · Tri Dao, authored Jul 09, 2022
Refactor gemm_cl to template on either __half or __nv_bfloat16 · 6a77a6da · Tri Dao, authored Jul 08, 2022
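
To illustrate the technique this commit message names, here is a minimal, hypothetical sketch (not the repository's gemm_cl code) of templating a device helper on the element type so one code path compiles for both __half and __nv_bfloat16, with per-type conversion traits; the names elem_traits and axpy_kernel are illustrative only.

    // sketch.cu -- hypothetical illustration, not flash-attention source
    #include <cuda_runtime.h>
    #include <cuda_fp16.h>
    #include <cuda_bf16.h>

    template <typename T> struct elem_traits;

    template <> struct elem_traits<__half> {
        static __device__ float to_float(__half x)   { return __half2float(x); }
        static __device__ __half from_float(float x) { return __float2half(x); }
    };

    template <> struct elem_traits<__nv_bfloat16> {
        static __device__ float to_float(__nv_bfloat16 x)   { return __bfloat162float(x); }
        static __device__ __nv_bfloat16 from_float(float x) { return __float2bfloat16(x); }
    };

    // y[i] += a * x[i], accumulating in fp32 regardless of the storage type T.
    template <typename T>
    __global__ void axpy_kernel(int n, float a, const T* x, T* y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float acc = elem_traits<T>::to_float(y[i]) + a * elem_traits<T>::to_float(x[i]);
            y[i] = elem_traits<T>::from_float(acc);
        }
    }

    int main() {
        int n = 1024;
        __half *xh, *yh;
        __nv_bfloat16 *xb, *yb;
        cudaMalloc(&xh, n * sizeof(__half));        cudaMalloc(&yh, n * sizeof(__half));
        cudaMalloc(&xb, n * sizeof(__nv_bfloat16)); cudaMalloc(&yb, n * sizeof(__nv_bfloat16));
        // Same kernel instantiated for both precisions (buffers left
        // uninitialized; this only demonstrates compilation and launch).
        axpy_kernel<__half><<<(n + 127) / 128, 128>>>(n, 2.0f, xh, yh);
        axpy_kernel<__nv_bfloat16><<<(n + 127) / 128, 128>>>(n, 2.0f, xb, yb);
        cudaDeviceSynchronize();
        return 0;
    }
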
Refactor to template on __half, implement bf16 util functions · e518a4b3 · Tri Dao, authored Jul 08, 2022
Fix Illegal Memory Access bug in fwd when d=16 · 2dc1b205 · Tri Dao, authored Jul 09, 2022

04 Jul, 2022 (1 commit)
Implement cross attention · 6c3a8c65 · Tri Dao, authored Jun 30, 2022

30 Jun, 2022 (1 commit)
Support batch size > 64K by swapping grid.x and grid.y · f66603cb · Tri Dao, authored Jun 29, 2022
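
The reason the swap lifts the 64K limit: CUDA caps gridDim.y and gridDim.z at 65535 blocks, while gridDim.x allows up to 2^31 - 1, so indexing the large (batch) dimension with blockIdx.x instead of blockIdx.y removes the restriction. Below is a minimal launch sketch assuming a hypothetical fwd_kernel_sketch; it is not the repository's launcher, only an illustration of the technique.

    // grid_swap_sketch.cu -- hypothetical illustration, not flash-attention source
    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void fwd_kernel_sketch(int batch_x_heads, int seq_blocks) {
        int bidb = blockIdx.x;   // batch*heads index: can exceed 65535 on grid.x
        int bids = blockIdx.y;   // sequence-block index: small, fits in grid.y
        if (bidb < batch_x_heads && bids < seq_blocks) {
            // ... per-(batch, head, sequence-block) work would go here ...
        }
    }

    int main() {
        int batch_x_heads = 100000;            // > 65535: only legal on grid.x
        int seq_blocks = 16;
        dim3 grid(batch_x_heads, seq_blocks);  // swapped relative to grid(seq_blocks, batch_x_heads)
        fwd_kernel_sketch<<<grid, 128>>>(batch_x_heads, seq_blocks);
        cudaDeviceSynchronize();
        printf("launch status: %s\n", cudaGetErrorString(cudaGetLastError()));
        return 0;
    }
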
12 Jun, 2022 (3 commits)
Refactor Gmem code to store q, k, v pointers separately · 5d07483b · Tri Dao, authored Jun 12, 2022
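
A rough sketch of the refactor direction this message describes, with illustrative names only (not the repository's Gmem structs): instead of one packed qkv base pointer plus per-tensor offsets, the parameter struct carries an independent pointer and stride for each of q, k and v, so the three tensors no longer have to share a single packed buffer, which is what, for example, cross attention (commit 6c3a8c65 above) needs.

    // gmem_params_sketch.h -- hypothetical illustration, not flash-attention source
    #include <cstddef>

    struct QkvParamsPacked {         // before: q, k, v interleaved in one buffer
        void*  qkv_ptr;
        size_t qkv_stride_in_bytes;
    };

    struct QkvParamsSplit {          // after: each tensor addressed on its own
        void*  q_ptr;  size_t q_row_stride_in_bytes;
        void*  k_ptr;  size_t k_row_stride_in_bytes;
        void*  v_ptr;  size_t v_row_stride_in_bytes;
    };
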
Implement bwd for head dim 128 · d3e64409 · Tri Dao, authored Jun 11, 2022
Implement fwd for head dim 128 · 0d854692 · Tri Dao, authored Jun 05, 2022

02 Jun, 2022 (1 commit)
Use Cutlass gemm as WarpMma · 14dc326e · Tri Dao, authored Jun 02, 2022

26 May, 2022 (1 commit)
Rename, add benchmarking script · 9dbc491a · Tri Dao, authored May 26, 2022

20 May, 2022 (1 commit)
First release · 1fcbe6f0 · Tri Dao, authored May 20, 2022