- 22 Jul, 2024 1 commit
Phil Wang authored
* check in the two ways of approaching backwards for softcapping, both functional
* prepare the softcap switch for backwards
* temporary
* cleanup to the way Tri prefers
* calculate dtanh when copying from scores -> dtanh Tensor
* no ternary operators allowed for constexpr, so just use some hack found online
* fix maybe_dtanh, restore some files
* restore another file
* move calculate_dtanh to utils and colocate with apply_softcap
* cleanup
* maybe last cleanup
* save for another pr
* remove a stray line
* fix spacing
* fix an issue, and make test_flash_attn.py ready to test softcapping backwards
- 26 May, 2024 1 commit
Tri Dao authored
- 19 May, 2024 2 commits
Woosuk Kwon authored
Woosuk Kwon authored
- 26 Mar, 2024 10 commits
- 21 Jan, 2024 3 commits
- 14 Jan, 2024 1 commit
Tri Dao authored
- 12 Jan, 2024 1 commit
Tri Dao authored
- 16 Sep, 2023 1 commit
Tri Dao authored
- 04 Sep, 2023 1 commit
Tri Dao authored
- 01 Sep, 2023 1 commit
Sophia Wisdom authored
- 13 Aug, 2023 2 commits
- 17 Jul, 2023 1 commit
Tri Dao authored