gaoqiong / flash-attention · Commits touching flash_attn/utils at ce26d3d73d07e9779c5ba6fb2ca3bc187a34c6cc
04 Jan, 2023 (1 commit)

a668890f · [Gen] Add option to run generation with FT attention kernel (Tri Dao, authored Jan 03, 2023)
28 Dec, 2022 (2 commits)

a6ec1782 · Bump to v0.2.6 (Tri Dao, authored Dec 27, 2022)
63670fd8 · Implement generation for GPT (Tri Dao, authored Dec 27, 2022)
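For context on the generation commit above: GPT-style generation is typically an autoregressive decode loop over a key/value cache, so past tokens are never re-encoded. Below is a minimal sketch in plain PyTorch, assuming a hypothetical `step_fn` that runs one model step against the cache; `greedy_decode` and `step_fn` are illustrative names, not flash-attention's actual generation API.

```python
import torch

@torch.no_grad()
def greedy_decode(step_fn, input_ids, max_new_tokens):
    # step_fn(token_ids, cache) -> (logits, cache) is a hypothetical model
    # step: it consumes only the new tokens and reads/updates the cache so
    # past keys/values are not recomputed on every iteration.
    cache = None
    ids = input_ids                      # (batch, prompt_len) on the first step
    tokens = input_ids
    for _ in range(max_new_tokens):
        logits, cache = step_fn(ids, cache)
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
        tokens = torch.cat([tokens, next_id], dim=1)
        ids = next_id                    # after the prompt, feed only the newest token
    return tokens
```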
27 Dec, 2022 (2 commits)

c6ecd40a · Tweak CrossEntropyLoss to take process_group in init (Tri Dao, authored Dec 27, 2022)
b4018a50 · Implement Tensor Parallel for GPT model (Tri Dao, authored Dec 25, 2022)
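As background on the tensor-parallel commits in this log: the standard technique shards each weight matrix across the ranks of a process group so every rank computes a slice of the output in parallel. A minimal column-parallel sketch in plain PyTorch follows; `ColumnParallelLinear` and its sharding layout are assumptions for illustration, not flash-attention's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.distributed as dist

class ColumnParallelLinear(nn.Module):
    # Hypothetical illustration: shards the output dimension of a Linear
    # layer across the ranks of `process_group`.
    def __init__(self, in_features, out_features, process_group):
        super().__init__()
        world_size = dist.get_world_size(process_group)
        assert out_features % world_size == 0
        # Each rank holds only its (out_features / world_size) slice of the weight.
        self.weight = nn.Parameter(torch.empty(out_features // world_size, in_features))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def forward(self, x):
        # Each rank computes its slice of the output features; a matching
        # row-parallel layer (or an all-gather) recombines the shards.
        return F.linear(x, self.weight)
```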
24 Dec, 2022 (1 commit)

226a1b72 · Implement TensorParallel for FusedDense and FusedDenseGeluDense (Tri Dao, authored Dec 23, 2022)
18 Nov, 2022 (1 commit)

ece539ab · Add __init__.py files to subdirectories for installation (Tri Dao, authored Nov 17, 2022)
23 Oct, 2022 (1 commit)

fb88e5e4 · Move benchmark utils, support AMP (Tri Dao, authored Oct 23, 2022)
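The AMP mentioned in the last commit is PyTorch automatic mixed precision; benchmarking under it generally amounts to wrapping the forward pass in an autocast context. A minimal sketch, illustrative only and not the repo's actual benchmark utils:

```python
import torch

model = torch.nn.Linear(1024, 1024).cuda()
x = torch.randn(8, 1024, device="cuda")

# Under autocast, matmul-heavy ops run in fp16 while numerically sensitive
# ops stay in fp32; this is what "support AMP" enables when benchmarking.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    y = model(x)
```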