gaoqiong / flash-attention · Commits · 765741c1

Commit 765741c1 authored Jun 14, 2022 by Dan Fu

More explanation

parent 2d5b2483
Showing 1 changed file with 2 additions and 1 deletion.

README.md  +2 −1
README.md @ 765741c1

@@ -77,7 +77,8 @@ As a result, FlashAttention can scale to much longer sequence lengths.
 We show speedup with head dimension 128.
 Here we show batch size 16 with 12 heads.
-Speedup is less than with the smaller head sizes, but speedup is still significant -- especially with a causal mask.
+Speedup is less than with the smaller head sizes, since we have to make the block size smaller in the tiling.
+But speedup is still significant, especially with a causal mask.
 ### RTX 3090
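The added wording points at a shared-memory constraint: with a larger head dimension, each tile of Q, K, and V occupies more on-chip memory, so the tiling must use fewer rows per block, which reduces the speedup. A minimal sketch of that trade-off, assuming a 48 KB shared-memory budget and fp16 elements (both numbers are illustrative assumptions, and `max_block_rows` is a hypothetical helper, not part of the FlashAttention code):

```python
# Illustrative sketch only (not the actual FlashAttention kernel logic):
# how a fixed shared-memory budget caps the tile size as head_dim grows.

def max_block_rows(head_dim: int,
                   smem_bytes: int = 48 * 1024,   # assumed shared-memory budget
                   bytes_per_elem: int = 2) -> int:  # fp16
    """Rough upper bound on tile rows if Q, K, and V tiles
    (each block_rows x head_dim) must fit in shared memory together."""
    elems = smem_bytes // bytes_per_elem
    return elems // (3 * head_dim)

for d in (32, 64, 128):
    print(f"head_dim={d:3d} -> block rows <= {max_block_rows(d)}")
# head_dim= 32 -> block rows <= 256
# head_dim= 64 -> block rows <= 128
# head_dim=128 -> block rows <= 64
```

Smaller tiles mean more passes over K and V per output block, which is why the measured speedup at head dimension 128 is lower than at the smaller head sizes, while still remaining significant, especially with a causal mask.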