- 16 Mar, 2023 1 commit
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
- 14 Mar, 2023 1 commit
Kirthi Shankar Sivamani authored
* Catch FP8 modulo16 error before cublas and fp8 kernels
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* annotate
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
- 07 Mar, 2023 1 commit
Kirthi Shankar Sivamani authored
* ignore self attention mask for causal type
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* further relax checks to run FA, update docs
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix pytorch softmax path
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fixes
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* minimum ampere requirement for fa
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
- 02 Mar, 2023 1 commit
Kirthi Shankar Sivamani authored
* fix qkv weight unfused path
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix non FA non interleaved case
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
- 10 Feb, 2023 1 commit
Kirthi Shankar Sivamani authored
retain grad related attrs while casting
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
- 03 Jan, 2023 1 commit
Przemyslaw Tredak authored
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
- 28 Sep, 2022 1 commit
Przemek Tredak authored
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>