gaoqiong / flash-attention
Commits at 46fd2a20b20849598b1abb438ea09520449d7eb3 for flash-attention/csrc/flash_attn/src/fmha/gmem_tile.h
24 Oct, 2022
1 commit
Support all head dims that are multiples of 8, up to 128 · 46fd2a20 · Tri Dao authored Oct 24, 2022
23 Oct, 2022
2 commits
Attempt to use atomicCAS to replace atomicAdd(bfloat16) · 9e92a1f2 · Tri Dao authored Oct 23, 2022
Split bwd on the seqlen_q dimension · a5a8806d · Tri Dao authored Oct 23, 2022
10 Jul, 2022
1 commit
Refactor to template on __half, implement bf16 util functions · e518a4b3 · Tri Dao authored Jul 08, 2022
04 Jul, 2022
1 commit
Implement cross attention · 6c3a8c65 · Tri Dao authored Jun 30, 2022
30 Jun, 2022
1 commit
Support batch size > 64K by swapping grid.x and grid.y · f66603cb · Tri Dao authored Jun 29, 2022
12 Jun, 2022
1 commit
Refactor Gmem code to store q, k, v pointers separately · 5d07483b · Tri Dao authored Jun 12, 2022
26 May, 2022
1 commit
Rename, add benchmarking script · 9dbc491a · Tri Dao authored May 26, 2022
20 May, 2022
1 commit
First release · 1fcbe6f0 · Tri Dao authored May 20, 2022