Atomic gemm for TP-AR and TP-RS overlap with P2P exchanges (#732)

* Atomic gemm for TP-AR and TP-RS overlap with P2P exchanges

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* FP8 reduction for atomic TP-RS with p2p exchange

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* Fix

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

---------

Signed-off-by: Sangkug Lym <slym@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>