[JAX] THD ring attention (#1454)

* Support THD + ring attention for self attn

Signed-off-by: Reese Wang <rewang@nvidia.com>

* Consolidate reorder strategy

Signed-off-by: Reese Wang <rewang@nvidia.com>

* Fix dataclass frozen issue

Signed-off-by: Reese Wang <rewang@nvidia.com>

* Remove redundant code

Signed-off-by: Reese Wang <rewang@nvidia.com>

* Use AttnBiasType, AttnMaskType, QKVLayout in cpp_extension

Signed-off-by: Reese Wang <rewang@nvidia.com>

* Fix lint

Signed-off-by: Reese Wang <rewang@nvidia.com>

* Refine P2P helper check_supported

Signed-off-by: Reese Wang <rewang@nvidia.com>

* Add segment_ids/pos check

Signed-off-by: Reese Wang <rewang@nvidia.com>

* Fixup

Signed-off-by: Reese Wang <rewang@nvidia.com>

* Add dual chunk swap example

Signed-off-by: Reese Wang <rewang@nvidia.com>

* Align different reorder code structure

Signed-off-by: Reese Wang <rewang@nvidia.com>

---------

Signed-off-by: Reese Wang <rewang@nvidia.com>
Co-authored-by: Phuong Nguyen <phuonguyen@nvidia.com>