"src/turbomind/models/llama/llama_utils.h" did not exist on "cc93136e6a166566fc6f0502c67aa99a94673db3"
- 29 Aug, 2023 1 commit
Su Zhu authored
* add unpad_input_for_concatenated_sequences
* modify docstring
- 18 Aug, 2023 1 commit
Tri Dao authored
- 06 Oct, 2022 1 commit
Antoine Adam authored
According to the `setup.py` file, the only dependencies are torch and einops. However, the `bert_padding.py` file requires `numpy` solely to multiply the elements of a `torch.Size` object. This change allows FlashAttention to be used without numpy.
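A minimal sketch of the idea behind this change, assuming the original code used `numpy.prod` on a tensor's shape (the exact call site is not shown here), is that the product of a `torch.Size`'s elements can be computed with the standard library instead:

```python
import math
import torch

# Hypothetical illustration: torch.Size is a tuple subclass, so its element
# product can be taken with math.prod instead of numpy.prod.
x = torch.zeros(4, 128, 64)

total = math.prod(x.shape)   # numpy-free replacement for np.prod(x.shape)
assert total == x.numel()    # numel() returns the same element count directly
```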
- 05 Aug, 2022 1 commit
Tri Dao authored
- 02 Jun, 2022 1 commit
Tri Dao authored
- 29 May, 2022 1 commit
Tri Dao authored
- 20 May, 2022 1 commit
Tri Dao authored