Commits · e02fd588aaaf24cb9af11df0ea7533f2a017c719 · gaoqiong / flash-attention

08 Jan, 2023 1 commit
- [Gen] Implement top-k and top-p sampling · e02fd588
  Tri Dao authored Jan 07, 2023
  
04 Jan, 2023 1 commit
- [Gen] Add option to run generation with FT attention kernel · a668890f
  Tri Dao authored Jan 03, 2023
  
28 Dec, 2022 2 commits
- Bump to v0.2.6 · a6ec1782
  Tri Dao authored Dec 27, 2022
  
- Implement generation for GPT · 63670fd8
  Tri Dao authored Dec 27, 2022
  