- 28 Mar, 2024 7 commits
- 19 Mar, 2024 1 commit
-
-
Tri Dao authored
-
- 15 Mar, 2024 3 commits
-
-
Markus Krimmel authored
-
Driss Guessous authored
-
Grigory Sizov authored
* Enable paged attention in varlen forward
* Format + fix padding
-
- 14 Mar, 2024 2 commits
-
-
Arvind Sundararajan authored
-
Chirag Jain authored
-
- 02 Mar, 2024 2 commits
- 21 Feb, 2024 4 commits
- 20 Feb, 2024 1 commit
-
-
Tri Dao authored
-
- 18 Feb, 2024 1 commit
-
-
Qubitium authored
Optimize compilation to (1) avoid OOM, (2) minimize swap usage, and (3) avoid thread starvation when letting ninja decide how many workers to spawn or when relying on manual MAX_JOBS "guesses". The logic takes the minimum of two auto-calculated MAX_JOBS values: one based on CPU cores and one based on free memory. This should allow flash-attn to compile in close to the most efficient manner in any consumer or server environment. (#832)
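The min-of-two-limits logic described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name `estimate_max_jobs`, its parameters, and the default of 9 GB of memory per compile job are assumptions chosen for the example.

```python
def estimate_max_jobs(cpu_cores, free_memory_gb, gb_per_job=9.0):
    """Pick a parallel job count bounded by both CPU cores and free memory.

    gb_per_job is a hypothetical per-job peak memory budget, not a value
    taken from the commit.
    """
    # Limit 1: never run more jobs than there are CPU cores.
    jobs_by_cores = max(1, cpu_cores)
    # Limit 2: never run more jobs than the free memory can hold.
    jobs_by_memory = int(free_memory_gb // gb_per_job)
    # Take the stricter limit, but always allow at least one job.
    return max(1, min(jobs_by_cores, jobs_by_memory))


# 16 cores but only 32 GB free at 8 GB per job: memory is the bottleneck.
jobs = estimate_max_jobs(16, 32.0, gb_per_job=8.0)  # 4
```

In a real build script the inputs would come from the machine itself (for example `os.cpu_count()` for cores), and the result would be exported as the `MAX_JOBS` environment variable before the build starts.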
-
- 10 Feb, 2024 4 commits
-
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
Brian Hirsh authored
-
- 08 Feb, 2024 1 commit
-
-
Grigory Sizov authored
-
- 31 Jan, 2024 3 commits
- 30 Jan, 2024 5 commits
-
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
Jeremy Reizenstein authored
For faster and smaller builds in some simple cases, provide switches to allow disabling:
- backward
- alibi
- uneven k
- dropout
- local attention

Co-authored-by: Jeremy Francis Reizenstein <bottler@users.noreply.github.com>
-
Christian Kadner authored
* [CI] Build wheels for PyTorch 2.3 (dev/nightly). Resolves #790
* Update TORCH_CUDA_VERSION
* Revert torch 2.2 back to dev20231130
* Add link to PyTorch compatibility matrix

Signed-off-by: Christian Kadner <ckadner@us.ibm.com>
-
- 27 Jan, 2024 1 commit
-
-
Avelina9X authored
* Updated docstrings of bert_padding.py: added docstrings for missing arguments in the unpad and pad methods.
* Update bert_padding.py: fixed spelling mistakes.
-
- 24 Jan, 2024 1 commit
-
-
Tri Dao authored
-
- 23 Jan, 2024 4 commits
-
-
Tao He authored
Signed-off-by: Tao He <sighingnow@gmail.com>
-
Tri Dao authored
-
Tri Dao authored
Co-authored-by: ljss <450993438@qq.com>
-
Tri Dao authored
-