gaoqiong / flash-attention / training
Commit history at cadfa396b89397451ca629d0edc71486b1c39bdf
30 Dec, 2022 (3 commits)

[Docker] Set torchmetrics==0.10.3
  cadfa396 · Tri Dao · Dec 30, 2022
[Docs] Fix formatting
  43798966 · Tri Dao · Dec 30, 2022
[Docs] Mention that dropout_layer_norm supports all dims up to 6k
  3c7cbfc1 · Tri Dao · Dec 29, 2022
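For reference, the operation that the fused dropout_layer_norm CUDA kernel computes (dropout followed by layer norm over the hidden dimension) can be sketched in plain NumPy. This is only a sketch of the math, not the repo's kernel or its API; the fused implementation is what lifts the supported hidden dimension to 6k:

```python
import numpy as np

def dropout_layer_norm_ref(x, gamma, beta, p=0.1, eps=1e-5, training=True, seed=0):
    """Unfused reference: inverted dropout, then layer norm over the last dim.

    A NumPy sketch of the math only; the real kernel fuses both steps
    (and optionally a residual add) into a single pass over the data.
    """
    rng = np.random.default_rng(seed)
    if training and p > 0:
        mask = rng.random(x.shape) >= p
        x = x * mask / (1.0 - p)              # inverted dropout: rescale kept units
    mu = x.mean(axis=-1, keepdims=True)       # per-row mean over hidden dim
    var = x.var(axis=-1, keepdims=True)       # per-row variance over hidden dim
    return (x - mu) / np.sqrt(var + eps) * gamma + beta

x = np.random.randn(2, 8, 64).astype(np.float32)
gamma = np.ones(64, dtype=np.float32)         # layer norm scale
beta = np.zeros(64, dtype=np.float32)         # layer norm shift
out = dropout_layer_norm_ref(x, gamma, beta, p=0.1)
```

With gamma=1 and beta=0, each output row is normalized to roughly zero mean and unit variance regardless of which units dropout zeroed.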
29 Dec, 2022 (1 commit)

Update training Dockerfile to use flash-attn==0.2.6
  984d5204 · Tri Dao · Dec 29, 2022
27 Dec, 2022 (1 commit)

Implement Tensor Parallel for GPT model
  b4018a50 · Tri Dao · Dec 25, 2022
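Tensor parallelism splits a layer's weight matrices across devices so each device holds and multiplies only a shard. A minimal NumPy sketch of the standard column-parallel/row-parallel pattern for a transformer MLP block, simulating two "devices" with array slices (an illustration of the technique, not the repo's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d, ffn = 16, 64
x = rng.standard_normal((4, d))
W1 = rng.standard_normal((d, ffn))    # first linear: split by columns
W2 = rng.standard_normal((ffn, d))    # second linear: split by rows

# Serial reference: y = relu(x @ W1) @ W2
ref = np.maximum(x @ W1, 0) @ W2

# Two-way tensor parallel: column-shard W1, row-shard W2.
W1_shards = np.split(W1, 2, axis=1)
W2_shards = np.split(W2, 2, axis=0)

# Each "device" computes its partial output with no communication
# (the elementwise ReLU commutes with the column split)...
partials = [np.maximum(x @ a, 0) @ b for a, b in zip(W1_shards, W2_shards)]

# ...and a single all-reduce (here just a sum) recovers the full output.
y = sum(partials)
assert np.allclose(y, ref)
```

The column-then-row ordering is what makes this cheap: the nonlinearity is applied shard-locally, so the only communication per block is the final sum.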
23 Dec, 2022 (1 commit)

Add smoothing for CrossEntropyParallel, rename to CrossEntropyLoss
  dff68c2b · Tri Dao · Dec 23, 2022
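Label smoothing replaces the one-hot target with a mixture of the true class and a uniform distribution over all K classes. A NumPy sketch of the math only (the repo's CrossEntropyLoss is a fused, parallel implementation; this follows the common convention of 1 - eps on the true class and eps / K spread uniformly):

```python
import numpy as np

def smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Label-smoothed cross entropy (NumPy sketch of the math).

    Target distribution: (1 - eps) * one_hot + eps / K, so the loss is
    (1 - eps) * NLL(true class) + eps * mean over classes of -log p.
    """
    logits = logits - logits.max(axis=-1, keepdims=True)   # stable log-softmax
    logp = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    nll = -logp[np.arange(len(target)), target]            # standard CE term
    uniform = -logp.mean(axis=-1)                          # smoothing term
    eps = smoothing
    return ((1 - eps) * nll + eps * uniform).mean()

logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]])
target = np.array([0, 2])
loss = smoothed_cross_entropy(logits, target, smoothing=0.1)
```

With smoothing=0 this reduces to ordinary cross entropy; note some formulations instead put eps / (K - 1) on only the non-target classes.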
21 Dec, 2022 (1 commit)

Fix typo in config: train.gpu -> train.gpu_mem
  c2407dec · Tri Dao · Dec 21, 2022
29 Nov, 2022 (2 commits)

Update configs, add results
  4a6eaa9f · Tri Dao · Nov 29, 2022

Release training code
  0bf5e500 · Tri Dao · Nov 28, 2022