gaoqiong / flash-attention
Commit history for tests/models/test_falcon.py at dfe29f5e2bb65ab4920cdfb45eb9d4a256401ab8
18 Sep, 2023 · 1 commit
[Gen] Don't use ft_attention, use flash_attn_with_kvcache instead · dfe29f5e
Tri Dao authored Sep 18, 2023
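For context, this commit moves the generation tests off the FasterTransformer-based ft_attention kernel and onto flash-attention's own flash_attn_with_kvcache, which updates the KV cache and computes attention in a single kernel. The following is a minimal sketch of one decoding step with that function; the tensor sizes and the cache_seqlens setup are illustrative assumptions based on the flash_attn 2.x interface, not code taken from this commit.

```python
import torch
from flash_attn import flash_attn_with_kvcache

# Hypothetical sizes for one decoding step (not from the commit):
batch, nheads, headdim, max_seqlen = 2, 8, 64, 256
device, dtype = "cuda", torch.float16

# Pre-allocated KV cache: (batch, max_seqlen, nheads, headdim)
k_cache = torch.zeros(batch, max_seqlen, nheads, headdim, device=device, dtype=dtype)
v_cache = torch.zeros(batch, max_seqlen, nheads, headdim, device=device, dtype=dtype)

# Query/key/value for the current token: seqlen is 1 during decoding
q = torch.randn(batch, 1, nheads, headdim, device=device, dtype=dtype)
k = torch.randn(batch, 1, nheads, headdim, device=device, dtype=dtype)
v = torch.randn(batch, 1, nheads, headdim, device=device, dtype=dtype)

# Number of tokens already stored in the cache for each batch element
cache_seqlens = torch.full((batch,), 10, dtype=torch.int32, device=device)

# flash_attn_with_kvcache writes k/v into the cache at position cache_seqlens
# and attends q against the updated cache, all in one fused kernel.
out = flash_attn_with_kvcache(
    q, k_cache, v_cache, k=k, v=v, cache_seqlens=cache_seqlens, causal=True
)
# out: (batch, 1, nheads, headdim)
```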
11 Sep, 2023 · 1 commit
[Gen] Fix calling update_graph_cache in tests · 8a733cbd
Tri Dao authored Sep 10, 2023
05 Sep, 2023 · 1 commit
[Gen] Refactor decoding function · 913922ca
Tri Dao authored Sep 04, 2023
04 Sep, 2023 · 1 commit
Fix test_baichuan · 798858f9
Tri Dao authored Sep 03, 2023
19 Aug, 2023 · 1 commit
Run isort and black on test files · 0e8c46ae
Tri Dao authored Aug 18, 2023
29 Jul, 2023 · 1 commit
[GPT] Implement parallel LLaMa · 184b992d
Tri Dao authored Jul 28, 2023
23 Jul, 2023 · 1 commit
[GPT] Implement Falcon · d38357dd
Tri Dao authored Jul 23, 2023