- 15 Mar, 2023 1 commit
Vik Paruchuri authored
- 13 Jan, 2023 1 commit
Kiarash Jamali authored
Documentation says default is 0.1, but the code has attention_dropout default at 0.0
- 19 Dec, 2022 1 commit
Tri Dao authored
- 10 Nov, 2022 1 commit
Tri Dao authored
To avoid import error if one doesn't have rotary_emb installed
- 11 Sep, 2022 1 commit
Tri Dao authored
- 06 Sep, 2022 1 commit
eric-tc-wong authored
Recasting query and key after rotary_emb()
- 09 Aug, 2022 1 commit
Tri Dao authored
- 05 Aug, 2022 1 commit
Tri Dao authored
- 04 Jul, 2022 1 commit
Tri Dao authored
- 02 Jun, 2022 1 commit
Tri Dao authored
- 29 May, 2022 1 commit
Tri Dao authored
- 26 May, 2022 1 commit
Tri Dao authored
- 20 May, 2022 1 commit
Tri Dao authored