gaoqiong / flash-attention
Commits · 226a1b721dba950e5798e4f96ca46a8a13ecb452
flash-attention / flash_attn / ops
24 Dec, 2022
1 commit
Implement TensorParallel for FusedDense and FusedDenseGeluDense · 226a1b72
Tri Dao authored Dec 23, 2022
23 Dec, 2022
1 commit
Simplify FusedDense · e68ebbe8
Tri Dao authored Dec 22, 2022
19 Dec, 2022
1 commit
Implement BERT · 5fb6df0e
Tri Dao authored Dec 18, 2022
13 Dec, 2022
1 commit
[LayerNorm] Support taking subset of input or subset of output · 5db33051
Tri Dao authored Dec 12, 2022
11 Dec, 2022
1 commit
[LayerNorm] Fuse LayerScale · ae137ed1
Tri Dao authored Dec 10, 2022
09 Dec, 2022
1 commit
[LayerNorm] Support all dimensions up to 6k (if divisible by 8) · 8c6609ae
Tri Dao authored Dec 08, 2022
18 Nov, 2022
1 commit
Add __init__.py files to subdirectories for installation · ece539ab
Tri Dao authored Nov 17, 2022
14 Nov, 2022
3 commits
Add GPT and ViT models · 2e33fc8e
Tri Dao authored Nov 13, 2022
Add MLP, MHA, Block, Embedding modules · d4b320b3
Tri Dao authored Nov 13, 2022
Add fused_dense and dropout_add_layernorm CUDA extensions · fa6d1ce4
Tri Dao authored Nov 13, 2022