gaoqiong / flash-attention · Commits
"test/vscode:/vscode.git/clone" did not exist on "73d4a5f8791f8292a7f3e7f57cdeb1e764b6d2e6"
Ref: 780e8eeabb84fe3f41e8244f04521743b032ba35
Path: tests/models/test_gpt_generation.py
15 Jan, 2023 · 2 commits
[Gen] Pass qkv_stride to ft_attention kernel for batched generation · f1e01c27
Tri Dao authored Jan 15, 2023
[Gen] Make generation work with Tensor Parallel · 7c219154
Tri Dao authored Jan 15, 2023
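For context: in tensor-parallel generation each rank typically holds only a shard of the vocabulary logits, so the shards have to be gathered before sampling can pick a token. A minimal sketch of that step, assuming an initialized torch.distributed process group and a vocab-sharded layout (illustrative, not this commit's exact code):

```python
import torch
import torch.distributed as dist

def gather_vocab_parallel_logits(local_logits: torch.Tensor) -> torch.Tensor:
    """All-gather vocab-sharded logits so every rank can sample identically.

    local_logits: (batch, vocab_size // world_size), one shard per rank
    (hypothetical layout; assumes dist.init_process_group was called).
    """
    world_size = dist.get_world_size()
    if world_size == 1:
        return local_logits
    shards = [torch.empty_like(local_logits) for _ in range(world_size)]
    dist.all_gather(shards, local_logits.contiguous())
    # Reassemble the full vocabulary dimension: (batch, vocab_size).
    return torch.cat(shards, dim=-1)
```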
08 Jan, 2023 · 3 commits
[Gen] Add timing option · b4859900
Tri Dao authored Jan 07, 2023
[Gen] Adjust shape of kv_cache when using FT · 0938298e
Tri Dao authored Jan 07, 2023
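For background: incremental decoding keeps a preallocated key/value cache and writes one step at a time, and the FasterTransformer kernel expects its own cache layout (hence this commit). A sketch of the general idea under a hypothetical (batch, seqlen_max, 2, nheads, headdim) layout, not the FT-specific shape:

```python
import torch

# Hypothetical cache layout for illustration only; the FT kernel's
# actual layout differs, which is what this commit accounts for.
batch, seqlen_max, nheads, headdim = 2, 256, 12, 64
kv_cache = torch.zeros(batch, seqlen_max, 2, nheads, headdim)

def update_kv_cache(kv_cache: torch.Tensor, kv_new: torch.Tensor,
                    pos: int) -> torch.Tensor:
    """Write this step's (k, v) at position `pos`, return the valid slice.

    kv_new: (batch, 2, nheads, headdim) for a single decoding step.
    """
    kv_cache[:, pos] = kv_new
    return kv_cache[:, : pos + 1]
```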
[Gen] Implement top-k and top-p sampling · e02fd588
Tri Dao authored Jan 07, 2023
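For context, this is roughly what top-k and top-p (nucleus) filtering do to the logits before sampling: an illustrative re-implementation of the standard technique, not this commit's exact code.

```python
import torch

def sample_top_k_top_p(logits: torch.Tensor, top_k: int = 0,
                       top_p: float = 1.0,
                       temperature: float = 1.0) -> torch.Tensor:
    """Sample one token per row of logits (batch, vocab_size)."""
    logits = logits / temperature
    if top_k > 0:
        # Keep only the k highest-scoring tokens.
        kth_value = torch.topk(logits, top_k, dim=-1).values[..., -1, None]
        logits = logits.masked_fill(logits < kth_value, float("-inf"))
    if top_p < 1.0:
        sorted_logits, sorted_indices = torch.sort(logits, descending=True, dim=-1)
        cumulative_probs = torch.cumsum(torch.softmax(sorted_logits, dim=-1), dim=-1)
        # Drop tokens once cumulative probability exceeds top_p, shifting
        # the mask right so the first token over the threshold survives.
        to_remove = cumulative_probs > top_p
        to_remove[..., 1:] = to_remove[..., :-1].clone()
        to_remove[..., 0] = False
        indices_to_remove = to_remove.scatter(-1, sorted_indices, to_remove)
        logits = logits.masked_fill(indices_to_remove, float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1).squeeze(-1)
```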
07 Jan, 2023 · 1 commit
[Gen] Test generation with rotary embedding · 11be742a
Tri Dao authored Jan 07, 2023
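Rotary embedding rotates each pair of query/key dimensions by a position-dependent angle before attention, so relative position is encoded in the dot products. A reference sketch of the interleaved-pair variant (illustrative; the repo's rotary kernels are fused and optimized, and also support a non-interleaved layout):

```python
import torch

def rotary_cos_sin(seqlen: int, headdim: int, base: float = 10000.0):
    """Precompute cos/sin tables of shape (seqlen, headdim // 2)."""
    inv_freq = 1.0 / (base ** (torch.arange(0, headdim, 2).float() / headdim))
    t = torch.arange(seqlen).float()
    freqs = torch.outer(t, inv_freq)
    return freqs.cos(), freqs.sin()

def apply_rotary(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor):
    """Apply rotary embedding to q or k of shape (batch, seqlen, nheads, headdim)."""
    x1, x2 = x[..., ::2], x[..., 1::2]     # split into (even, odd) pairs
    cos = cos[None, :, None, :]            # broadcast over batch and heads
    sin = sin[None, :, None, :]
    out1 = x1 * cos - x2 * sin             # 2D rotation of each pair
    out2 = x1 * sin + x2 * cos
    return torch.stack((out1, out2), dim=-1).flatten(-2)
```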
04 Jan, 2023 · 1 commit
[Gen] Add option to run generation with FT attention kernel · a668890f
Tri Dao authored Jan 03, 2023
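The FT (FasterTransformer) decoding kernel fuses the per-step attention of a single new query against the cached keys and values into one launch. A plain-PyTorch sketch of the math it computes (reference only, ignoring the kernel's cache update and layout details):

```python
import math
import torch

def single_query_attention(q: torch.Tensor, k_cache: torch.Tensor,
                           v_cache: torch.Tensor, seqlen: int) -> torch.Tensor:
    """One decoding step: the new query attends to all cached positions.

    q:        (batch, nheads, headdim), the token being generated
    k_cache:  (batch, seqlen_max, nheads, headdim)
    v_cache:  (batch, seqlen_max, nheads, headdim)
    seqlen:   number of valid entries in the cache
    """
    k = k_cache[:, :seqlen]
    v = v_cache[:, :seqlen]
    scores = torch.einsum("bhd,bshd->bhs", q, k) / math.sqrt(q.shape[-1])
    attn = torch.softmax(scores, dim=-1)
    return torch.einsum("bhs,bshd->bhd", attn, v)
```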
28 Dec, 2022 · 1 commit
Implement generation for GPT · 63670fd8
Tri Dao authored Dec 27, 2022
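For orientation, the core of GPT generation is a decode loop that repeatedly runs the model and appends the next token. A minimal greedy sketch with a hypothetical model interface, recomputing the full prefix each step for clarity (the repo's implementation uses a kv cache and the sampling above instead):

```python
import torch

@torch.no_grad()
def generate(model, input_ids: torch.Tensor, max_new_tokens: int) -> torch.Tensor:
    """Greedy decoding. Assumes `model(input_ids)` returns logits of
    shape (batch, seqlen, vocab_size); input_ids is (batch, seqlen)."""
    for _ in range(max_new_tokens):
        logits = model(input_ids)                               # (b, s, v)
        next_token = logits[:, -1].argmax(dim=-1, keepdim=True) # (b, 1)
        input_ids = torch.cat([input_ids, next_token], dim=1)
    return input_ids
```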