- 01 Aug, 2023 2 commits
- 29 Jul, 2023 1 commit
  - Tri Dao authored
- 28 Jul, 2023 3 commits
  - Tri Dao authored
  - Tri Dao authored
  - Kirthi Shankar Sivamani authored
    * Bump version to 2.0.2
    * Update version in Dockerfile
    Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
- 27 Jul, 2023 1 commit
  - Kirthi Shankar Sivamani authored
    * Add RNG state to kernel launch params
    * Save seed and offset for backward
    * Single thread write to global mem
    * compute_dq_dk_dv_1colblock get seed and offset from launch params
    * compute_dq_dk_dv_1rowblock get seed and offset from launch params
    * Change forward c++ APIs to save RNG state for backward
    * Change backward c++ APIs to set RNG state for bprop launcher
    * Bug fixes
    * Python side API changes
    * Bug fix; only save seeds instead of full offset
    * Account for 3D grid size
    Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
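The commit above makes dropout reproducible across forward and backward: the forward kernel records only the RNG seed/offset in its launch params, and the backward kernels re-seed from that saved state to regenerate the identical dropout mask instead of materializing it. A minimal stdlib-only Python sketch of that idea follows; the function names `dropout_forward`/`dropout_backward` are hypothetical illustrations, not the library's API, and Python's `random` stands in for the CUDA Philox generator:

```python
import random

def dropout_forward(x, p, seed):
    """Apply dropout to a list of floats, saving only the seed
    (not the mask) so backward can replay the same mask."""
    rng = random.Random(seed)
    mask = [0.0 if rng.random() < p else 1.0 / (1.0 - p) for _ in x]
    out = [xi * mi for xi, mi in zip(x, mask)]
    return out, seed  # the saved RNG state is just the seed

def dropout_backward(grad_out, p, saved_seed):
    """Regenerate the identical mask from the saved seed and
    apply it to the incoming gradient."""
    rng = random.Random(saved_seed)
    mask = [0.0 if rng.random() < p else 1.0 / (1.0 - p)
            for _ in grad_out]
    return [gi * mi for gi, mi in zip(grad_out, mask)]
```

Saving the seed rather than the full mask is the point of the change: the mask is O(seqlen²) per head, while the seed/offset pair is a few words written once to global memory.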
- 26 Jul, 2023 4 commits
  - Tri Dao authored
  - Haodong Lyu authored
  - Tri Dao authored
  - Tri Dao authored
- 23 Jul, 2023 10 commits
  - Tri Dao authored
  - Tri Dao authored
  - Joel Lamy-Poirier authored
  - Tri Dao authored
  - Kiarash Jamali authored
  - Ian Timmis authored
    * README syntax highlighting: adds syntax highlighting to the README
    * Update README.md
  - Tri Dao authored
  - Tri Dao authored
  - Tri Dao authored
  - Tri Dao authored
- 22 Jul, 2023 1 commit
  - Tri Dao authored
- 21 Jul, 2023 2 commits
- 20 Jul, 2023 2 commits
- 19 Jul, 2023 4 commits
  - Tri Dao authored
    [LayerNorm] Fix typo in ln_api.cpp
  - Ikko Eltociear Ashimine authored
    unintialized -> uninitialized
  - Tri Dao authored
    Fix compile error with `BOOL_SWITCH`
- 18 Jul, 2023 2 commits
- 17 Jul, 2023 2 commits
- 16 Jul, 2023 1 commit
  - Tri Dao authored
    Metal FlashAttention
- 15 Jul, 2023 3 commits
  - Philip Turner authored
  - Philip Turner authored
  - Philip Turner authored
- 08 Jul, 2023 2 commits
  - Tri Dao authored
    rotary: update cos/sin cache when switching from inference mode
  - Volodymyr Kyrylov authored
    This resolves RuntimeErrors after running evaluation in inference mode:
    ```
    File "/home/proger/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
      return forward_call(*args, **kwargs)
    File "/home/proger/.local/lib/python3.10/site-packages/flash_attn/modules/mha.py", line 492, in forward
      qkv = self.rotary_emb(qkv)
    File "/home/proger/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
      return forward_call(*args, **kwargs)
    File "/home/proger/.local/lib/python3.10/site-packages/flash_attn/layers/rotary.py", line 229, in forward
      return apply_rotary_emb_qkv_(
    File "/home/proger/.local/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
      return super().apply(*args, **kwargs)  # type: ignore[misc]
    RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
    ```
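The failure occurs because cos/sin tables built under `torch.inference_mode()` are inference tensors and cannot be saved for autograd; the fix rebuilds the cache when the module is next used outside inference mode. A stdlib-only Python sketch of that cache-invalidation pattern follows; the class name `RotaryCache` and the stand-in table values are hypothetical, not the library's implementation:

```python
class RotaryCache:
    """Cache that is rebuilt when the mode it was built under changes.

    Mirrors the shape of the fix: a table created in inference mode
    must be regenerated before a pass that needs gradients can use it.
    """
    def __init__(self):
        self._table = None
        self._built_in_inference = None

    def get(self, seqlen, inference_mode):
        rebuild = (
            self._table is None
            or len(self._table) < seqlen
            # key point of the fix: a table built under inference mode
            # cannot be reused once we are back in a grad-enabled pass
            or (self._built_in_inference and not inference_mode)
        )
        if rebuild:
            # stand-in values; the real cache holds cos/sin tables
            self._table = [i * 0.5 for i in range(seqlen)]
            self._built_in_inference = inference_mode
        return self._table[:seqlen]
```

An alternative workaround the error message itself suggests is to `clone()` the inference tensor; regenerating the cache keeps the cached tensors usable for subsequent inference calls as well.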