- 02 Jan, 2025 1 commit
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 12 Oct, 2024 1 commit
Xin Yao authored
* Let Fused RoPE support THD with CP
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* add comment
  Signed-off-by: Xin Yao <xiny@nvidia.com>
---------
Signed-off-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xiaowei Ren <103958965+xrennvidia@users.noreply.github.com>
-
- 14 Jun, 2024 1 commit
Kirthi Shankar Sivamani authored
* Apply formatting
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Apply formatting
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 25 Jan, 2024 1 commit
Xin Yao authored
* fused apply rope
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* Apply suggestions from code review
  Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
  Signed-off-by: Xin Yao <yaox12@outlook.com>
* resolve comments
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* make rotary_percent optional
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* fix ci
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* fix test
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* add rope test to qa
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* fix linting
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* sync apex: add transpose_output_memory
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* small fix
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* sync apex: fuse sin/cos
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* sync apex: fused rope for thd format
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* fix lint
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* Fix license headers
  Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
* add support for bshd format
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* support different seq length
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* update
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* update copyright
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* remove transpose_output_memory
  Signed-off-by: Xin Yao <xiny@nvidia.com>
* Make outputs contiguous in SBHD case
  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
---------
Signed-off-by: Xin Yao <xiny@nvidia.com>
Signed-off-by: Xin Yao <yaox12@outlook.com>
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Przemyslaw Tredak <ptredak@nvidia.com>
-