- 25 Aug, 2023 2 commits
- 24 Aug, 2023 1 commit
-
-
Aman Gupta Karmani authored
-
- 22 Aug, 2023 2 commits
- 21 Aug, 2023 1 commit
-
-
GAOXinyu authored
-
- 20 Aug, 2023 2 commits
-
-
Xuechen Li authored
* q * add comment.
-
Tri Dao authored
-
- 19 Aug, 2023 1 commit
-
-
Xuechen Li authored
* fix name.
* set inv function.
* add map back function.
* handle gqa.
* add type annotation to avoid confusion.
* fix docstr.
* test inverse remap logic.
-
- 18 Aug, 2023 4 commits
-
-
Tri Dao authored
-
Xuechen Li authored
* unequal rank.
* trim.
* enable passing in number of heads for each rank.
* simplify.
* simplify.
* cleanup.
* fix col parallel.
* fix bug with row parallel.
* fit out proj.
* refac.
* fix sharding logic.
* refac sharding.
* refac.
* support multiple of.
* make fn reuseable.
* fix bug in dimensions.
* scaffold.
* test uneven heads.
* fix test by adding barrier.
* refac.
* reuse code.
* clean up.
-
Tri Dao authored
-
Tri Dao authored
-
- 17 Aug, 2023 4 commits
- 16 Aug, 2023 1 commit
-
-
Tri Dao authored
-
- 15 Aug, 2023 1 commit
-
-
Xuechen Li authored
* prelim.
* add hf conversion fn.
* mlp.
* change name.
* fix bug.
* inverse permute.
* change comment.
* revert style changes.
* fix.
* add doc.
* revert.
* enable load safe.
* fix safe load.
* fix import.
* fix typing-related lints.
* fix ckpt loading logic.
* make single gpu work.
* test with parallel.
* ckpt format.
* enable pretrained state dict.
* remove unused imports.
* remove unused.
* mark idea related.
-
- 14 Aug, 2023 3 commits
- 13 Aug, 2023 2 commits
- 10 Aug, 2023 1 commit
-
-
Tri Dao authored
-
- 01 Aug, 2023 3 commits
- 29 Jul, 2023 1 commit
-
-
Tri Dao authored
-
- 28 Jul, 2023 3 commits
-
-
Tri Dao authored
-
Tri Dao authored
-
Kirthi Shankar Sivamani authored
* Bump version to 2.0.2
* Update version in Dockerfile
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 27 Jul, 2023 1 commit
-
-
Kirthi Shankar Sivamani authored
* Add RNG state to kernel launch params
* Save seed and offset for backward
* Single thread write to global mem
* compute_dq_dk_dv_1colblock get seed and offset from launch params
* compute_dq_dk_dv_1rowblock get seed and offset from launch params
* Change forward c++ APIs to save RNG state for backward
* Change backward c++ APIs to set RNG state for bprop launcher
* Bug fixes
* Python side API changes
* Bug fix; only save seeds instead of full offset
* Account for 3D grid size
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 26 Jul, 2023 2 commits
-
-
Tri Dao authored
-
Haodong Lyu authored
-
- 23 Jul, 2023 5 commits
-
-
Tri Dao authored
-
Tri Dao authored
-
Kiarash Jamali authored
-
Tri Dao authored
-
Tri Dao authored
-