Commits · 780e8eeabb84fe3f41e8244f04521743b032ba35 · gaoqiong / flash-attention

16 Jan, 2023 1 commit
- Reorder LN in Block, support OPT · ff34123b
  Tri Dao authored Jan 15, 2023
01 Jan, 2023 1 commit
- [Bert] Fix embedding layer norm before embedding dropout · 714c1b4f
  Tri Dao authored Jan 01, 2023
27 Dec, 2022 1 commit
- Tweak CrossEntropyLoss to take process_group in init · c6ecd40a
  Tri Dao authored Dec 27, 2022
23 Dec, 2022 2 commits
- Add smoothing for CrossEntropyParallel, rename to CrossEntropyLoss · dff68c2b
  Tri Dao authored Dec 23, 2022
- Simplify FusedDense · e68ebbe8
  Tri Dao authored Dec 22, 2022
20 Dec, 2022 1 commit
- Implement last_layer_subset optimization for BERT · 13cdceb3
  Tri Dao authored Dec 19, 2022
19 Dec, 2022 1 commit
- Implement BERT · 5fb6df0e
  Tri Dao authored Dec 18, 2022