gaoqiong / flash-attention · Commits
Commit 470010f59b00a53f371e602cfcfd9bb2919f140d
File: flash_attn/flash_attn_triton.py
03 Nov, 2022 (1 commit)
  470010f5  Fix race condition for Triton bwd for headdim 48 and 96 (Tri Dao, Nov 03, 2022)
02 Nov, 2022 (1 commit)
  aacc10fb  Fix race condition in Triton bwd for non-po2 headdims (Tri Dao, Nov 02, 2022)
01 Nov, 2022 (2 commits)
  1fb12afd  Avoid memcpy in the Triton bwd (Tri Dao, Nov 01, 2022)
  731f154d  Fix race conditions in the Triton bwd for headdim=64 (Tri Dao, Nov 01, 2022)
31 Oct, 2022 (10 commits)
  9b0bc978  Fix race condition in Triton fwd (Tri Dao, Oct 31, 2022)
  215930bc  Fix EVEN_M & EVEN_HEADDIM for headdim=40 in Triton bwd (Tri Dao, Oct 31, 2022)
  4f81aff4  Add debug_barrier for all headdims in Triton bwd (Tri Dao, Oct 31, 2022)
  bedcbd6a  Disable some autotune configs that give wrong results in Triton bwd (Tri Dao, Oct 31, 2022)
  e78d509c  [WIP] Support all head dimensions up to 128 in the Triton bwd (Tri Dao, Oct 31, 2022)
            WIP because there seem to be race conditions for head dimensions other than 16, 32, 64, and 128.
  008951f1  Support all head dimensions up to 128 in the Triton fwd (Tri Dao, Oct 30, 2022)
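The head-dimension commits above (e78d509c, 008951f1) extend the kernels beyond power-of-two head dimensions. A common Triton pattern for this (the exact kernel code is in flash_attn_triton.py; this is only an illustrative sketch in NumPy, and `next_power_of_2`/`load_padded` are hypothetical helper names, not names from the repo) is to round the head dimension up to a power-of-two block size and zero-mask the padded lanes on load, so they contribute nothing to dot products:

```python
import numpy as np

def next_power_of_2(n: int) -> int:
    # Smallest power of two >= n (Triton exposes triton.next_power_of_2 for this).
    return 1 << (n - 1).bit_length()

def load_padded(row: np.ndarray, block: int) -> np.ndarray:
    # Emulate a masked tl.load: lanes beyond the true headdim read as 0.0.
    out = np.zeros(block, dtype=row.dtype)
    out[: row.shape[0]] = row
    return out

headdim = 40                               # a non-power-of-2 headdim, as in the commits above
BLOCK = max(next_power_of_2(headdim), 16)  # rounds 40 up to 64
q = np.arange(headdim, dtype=np.float32)
k = np.ones(headdim, dtype=np.float32)
q_block, k_block = load_padded(q, BLOCK), load_padded(k, BLOCK)
# Padded lanes are zero, so the blocked dot product equals the true one.
```

The race-condition fixes in the surrounding commits suggest the masking at these boundaries is exactly where the bwd kernel was fragile.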
  b910bf14  Support arbitrary seqlens (both q & k) in Triton bwd (Tri Dao, Oct 30, 2022)
  dc554693  Support arbitrary seqlen_k in Triton bwd (Tri Dao, Oct 30, 2022)
  d11341fd  Fix Triton fwd to support seqlen not multiples of 128 (Tri Dao, Oct 30, 2022)
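Commits b910bf14, dc554693, and d11341fd handle sequence lengths that are not multiples of the 128-wide block. A standard way to handle the ragged final block is to mask out-of-range key columns to -inf before the softmax so they receive zero weight; a NumPy sketch of that boundary handling (`blockwise_softmax_scores` is a hypothetical helper for illustration, not code from the repo):

```python
import numpy as np

def blockwise_softmax_scores(scores: np.ndarray, seqlen_k: int) -> np.ndarray:
    # scores: (seqlen_q, padded_k), where padded_k is seqlen_k rounded up to
    # a multiple of the block width. Out-of-range key columns are set to -inf
    # so exp() gives them zero weight, mirroring the masked loads a Triton
    # kernel would use in its boundary block.
    col = np.arange(scores.shape[1])
    masked = np.where(col[None, :] < seqlen_k, scores, -np.inf)
    m = masked.max(axis=1, keepdims=True)       # subtract row max for stability
    p = np.exp(masked - m)
    return p / p.sum(axis=1, keepdims=True)
```

With this masking, the attention weights over the valid columns match an unpadded softmax exactly, and the padded columns carry zero weight.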
  b0c0db81  Implement FlashAttention in Triton (Tri Dao, Oct 30, 2022)