Hierarchical CP implementation (Ulysses + Ring) (#1209)

* change API for hierarchical CP
* move fp8 code before qkv reshape
* try to insert A2A for hierarchical CP
* make fwd work
* remove a redundant sync
* make bwd of hierarchical CP work
* fix dout a2a in bwd
* fix q_f16 with fp8
* assert hierarchical CP implementation does not support THD format
* bug fix
* assert hierarchical CP does not support attn bias
* add unit test for hierarchical CP
* fix cp_comm_type in unit test
* bug fix and code cleaning
* minor change
* an assert info change
* dout shape fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* move function definitions to the front of the first call
* fix tensor view comments
* refine CP unit test
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* typo fix
* typo fix
* save cp_size_a2a and rank_a2a in fwd
* add more explanations of cp_group in doc_string

---------

Signed-off-by: Xiaowei Ren <xren@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>