gaoqiong / flash-attention
Commits · 93383bd55bfffb0fa2c4584c4849971152397035 · flash-attention/tests/models
07 Jan, 2023
1 commit
[TP] Implement TensorParallel without sequence parallel · 93383bd5 · Tri Dao authored Jan 07, 2023
04 Jan, 2023
1 commit
[Gen] Add option to run generation with FT attention kernel · a668890f · Tri Dao authored Jan 03, 2023
01 Jan, 2023
1 commit
[GPT] Refactor function to shard state_dict for TensorParallel · ef1ba918 · Tri Dao authored Jan 01, 2023
28 Dec, 2022
1 commit
Implement generation for GPT · 63670fd8 · Tri Dao authored Dec 27, 2022
27 Dec, 2022
3 commits
Support loading GPT2 weights from Huggingface · 9d797d88 · Tri Dao authored Dec 27, 2022
Tweak CrossEntropyLoss to take process_group in init · c6ecd40a · Tri Dao authored Dec 27, 2022
Implement Tensor Parallel for GPT model · b4018a50 · Tri Dao authored Dec 25, 2022
20 Dec, 2022
1 commit
Implement last_layer_subset optimization for BERT · 13cdceb3 · Tri Dao authored Dec 19, 2022
19 Dec, 2022
1 commit
Implement BERT · 5fb6df0e · Tri Dao authored Dec 18, 2022