- 13 Aug, 2023 4 commits
-
-
Tri Dao authored
Merge branch 'feature/demo-wheels' of https://github.com/piercefreeman/flash-attention into piercefreeman-feature/demo-wheels
* 'feature/demo-wheels' of https://github.com/piercefreeman/flash-attention: (25 commits)
  Install standard non-wheel package
  Remove release creation
  Build wheel on each push
  Isolate 2.0.0 & cuda12
  Clean setup.py imports
  Remove builder project
  Bump version
  Add notes to github action workflow
  Add torch dependency to final build
  Exclude cuda erroring builds
  Exclude additional disallowed matrix params
  Full version matrix
  Add CUDA 11.7
  Release is actually unsupported
  echo OS version
  Temp disable deploy
  OS version build numbers
  Restore full build matrix
  Refactor and clean of setup.py
  Strip cuda name from torch version
  ...
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
- 11 Aug, 2023 4 commits
-
-
Pierce Freeman authored
-
Pierce Freeman authored
-
Pierce Freeman authored
-
Pierce Freeman authored
-
- 10 Aug, 2023 1 commit
-
-
Tri Dao authored
-
- 01 Aug, 2023 5 commits
- 29 Jul, 2023 1 commit
-
-
Tri Dao authored
-
- 28 Jul, 2023 3 commits
-
-
Tri Dao authored
-
Tri Dao authored
-
Kirthi Shankar Sivamani authored
* Bump version to 2.0.2
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Update version in Dockerfile
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 27 Jul, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* Add RNG state to kernel launch params
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Save seed and offset for backward
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Single thread write to global mem
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* compute_dq_dk_dv_1colblock get seed and offset from launch params
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* compute_dq_dk_dv_1rowblock get seed and offset from launch params
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Change forward c++ APIs to save RNG state for backward
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Change backward c++ APIs to set RNG state for bprop launcher
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Bug fixes
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Python side API changes
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Bug fix; only save seeds instead of full offset
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Account for 3D grid size
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 26 Jul, 2023 4 commits
-
-
Tri Dao authored
-
Haodong Lyu authored
-
Tri Dao authored
-
Tri Dao authored
-
- 23 Jul, 2023 10 commits
-
-
Tri Dao authored
-
Tri Dao authored
-
Joel Lamy-Poirier authored
-
Tri Dao authored
-
Kiarash Jamali authored
-
Ian Timmis authored
* README syntax highlighting
  Adds syntax highlighting to README
* Update README.md
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
- 22 Jul, 2023 1 commit
-
-
Tri Dao authored
-
- 21 Jul, 2023 2 commits
- 20 Jul, 2023 2 commits
- 19 Jul, 2023 2 commits
-
-
Tri Dao authored
[LayerNorm] Fix typo in ln_api.cpp
-
Ikko Eltociear Ashimine authored
unintialized -> uninitialized
-