- 02 Mar, 2024 1 commit
-
-
Tri Dao authored
-
- 21 Feb, 2024 4 commits
- 20 Feb, 2024 1 commit
-
-
Tri Dao authored
-
- 18 Feb, 2024 1 commit
-
-
Qubitium authored
Optimize compilation to (1) avoid OOM, (2) minimize swap usage, and (3) avoid thread starvation when letting ninja decide how many workers to spawn or relying on manual MAX_JOBS "guesses". The logic takes the minimum of two auto-calculated MAX_JOBS values, one based on CPU cores and one based on free memory. This should allow flash-attn to compile close to the most efficient way in any consumer or server environment. (#832)
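The "min of two metrics" logic described above can be sketched roughly as follows. This is an illustrative sketch, not the code from the commit: the function name `pick_max_jobs` and the `gb_per_job` figure (an assumed peak memory per compile job) are hypothetical, and the caller is expected to supply the core count and free memory itself.

```python
def pick_max_jobs(cpu_cores, free_memory_gb, gb_per_job=9.0):
    """Pick a parallel compile job count bounded by CPU cores and free memory.

    gb_per_job is a hypothetical estimate of peak memory per compile job;
    it is an assumption made for this sketch, not a measured value.
    """
    # Metric 1: never spawn more jobs than there are CPU cores.
    jobs_by_cpu = max(1, int(cpu_cores))
    # Metric 2: never spawn more jobs than free memory can hold without swapping.
    jobs_by_memory = int(free_memory_gb // gb_per_job)
    # Take the smaller of the two limits, but always allow at least one job.
    return max(1, min(jobs_by_cpu, jobs_by_memory))
```

For example, with 16 cores and 32 GB free the memory limit dominates and the function returns 3, while with 4 cores and 128 GB free the core count dominates and it returns 4. The result could then be exported as `MAX_JOBS` before invoking the build.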
-
- 10 Feb, 2024 4 commits
-
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
Brian Hirsh authored
-
- 08 Feb, 2024 1 commit
-
-
Grigory Sizov authored
-
- 31 Jan, 2024 3 commits
- 30 Jan, 2024 5 commits
-
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
Jeremy Reizenstein authored
For faster and smaller builds in some simple cases, provide switches to allow disabling backward, alibi, uneven k, dropout, and local attention.
Co-authored-by: Jeremy Francis Reizenstein <bottler@users.noreply.github.com>
-
Christian Kadner authored
* [CI] Build wheels for PyTorch 2.3 (dev/nightly). Resolves #790
* update TORCH_CUDA_VERSION
* revert torch 2.2 back to dev20231130
* add link to PyTorch compatibility matrix
Signed-off-by: Christian Kadner <ckadner@us.ibm.com>
-
- 27 Jan, 2024 1 commit
-
-
Avelina9X authored
* Updated docstrings of bert_padding.py: added docstrings for missing arguments in the unpad and pad methods.
* Update bert_padding.py: fixed spelling mistakes.
-
- 24 Jan, 2024 1 commit
-
-
Tri Dao authored
-
- 23 Jan, 2024 5 commits
-
-
Tao He authored
Signed-off-by: Tao He <sighingnow@gmail.com>
-
Tri Dao authored
-
Tri Dao authored
Co-authored-by: ljss <450993438@qq.com>
-
Tri Dao authored
-
Tri Dao authored
-
- 22 Jan, 2024 3 commits
- 21 Jan, 2024 10 commits
-
-
Tri Dao authored
-
Tri Dao authored
-
Grigory Sizov authored
* Add split-k, M<->H to varseq path
* skip M<->H when dropout>0, fix LSE
-
Curtis "Fjord" Hawthorne authored
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-