Commit 765741c1 authored by Dan Fu

More explanation

parent 2d5b2483
@@ -77,7 +77,8 @@ As a result, FlashAttention can scale to much longer sequence lengths.
 We show speedup with head dimension 128.
 Here we show batch size 16 with 12 heads.
-Speedup is less than with the smaller head sizes, but speedup is still significant -- especially with a causal mask.
+Speedup is less than with the smaller head sizes, since we have to make the block size smaller in the tiling.
+But speedup is still significant, especially with a causal mask.
 ### RTX 3090
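As a rough illustration of the configuration described in this hunk (batch size 16, 12 heads, head dimension 128, with and without a causal mask), the sketch below times the attention call. It assumes the flash_attn_func interface from the flash-attn package and an arbitrary sequence length of 2048; it is not the script used to produce the reported speedups.

```python
# Minimal timing sketch for the setting described above:
# batch 16, 12 heads, head dimension 128, with and without a causal mask.
# Assumes the flash_attn_func interface from the flash-attn package
# (q, k, v shaped [batch, seqlen, nheads, headdim], fp16/bf16 on CUDA);
# the sequence length of 2048 is an arbitrary choice for illustration.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 16, 2048, 12, 128
q, k, v = (torch.randn(batch, seqlen, nheads, headdim,
                       device="cuda", dtype=torch.float16) for _ in range(3))

def time_attn(causal):
    # Warm up, then time with CUDA events so asynchronous kernel
    # execution is measured correctly.
    for _ in range(3):
        flash_attn_func(q, k, v, causal=causal)
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(10):
        flash_attn_func(q, k, v, causal=causal)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / 10  # milliseconds per call

print(f"non-causal: {time_attn(False):.2f} ms, causal: {time_attn(True):.2f} ms")
```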