"src/turbomind/models/llama/llama_utils.h" did not exist on "cc93136e6a166566fc6f0502c67aa99a94673db3"
- 29 Aug, 2023 1 commit
Su Zhu authored
* add unpad_input_for_concatenated_sequences
* modify docstring
- 18 Aug, 2023 1 commit
Tri Dao authored
- 06 Oct, 2022 1 commit
Antoine Adam authored
According to the `setup.py` file, the only dependencies are torch and einops. However, the `bert_padding.py` file requires `numpy` solely to multiply the elements of a `torch.Size` object. This change allows FlashAttention to be used without numpy.
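A minimal sketch of the idea behind this change, assuming the original code used `numpy.prod` on a tensor's shape (the exact call site is not shown here), is that the product of a `torch.Size`'s elements can be computed with the standard library instead:

```python
import math
import torch

# Hypothetical illustration: torch.Size is a tuple subclass, so its element
# product can be taken with math.prod instead of numpy.prod.
x = torch.zeros(4, 128, 64)

total = math.prod(x.shape)   # numpy-free replacement for np.prod(x.shape)
assert total == x.numel()    # numel() returns the same element count directly
```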
- 05 Aug, 2022 1 commit
Tri Dao authored
- 02 Jun, 2022 1 commit
Tri Dao authored
- 29 May, 2022 1 commit
Tri Dao authored
- 20 May, 2022 1 commit
Tri Dao authored