Support CUDA Graph for MoE models (#1233)

* Align RNG tracker with megatron

Signed-off-by: Robin Zhang <robinz@nvidia.com>
Co-authored-by: Yifei Song <yifeis@nvidia.com>

* Fix module_params order and warmup bug in cudagraph

Signed-off-by: Robin Zhang <robinz@nvidia.com>
Co-authored-by: Yifei Song <yifeis@nvidia.com>

* Add fp8_group argument and fix fp8 accuracy issue for cudagraph

Signed-off-by: Robin Zhang <robinz@nvidia.com>
Co-authored-by: Yifei Song <yifeis@nvidia.com>

* Add TE modules and weights filters to support MoE models

Signed-off-by: Robin Zhang <robinz@nvidia.com>
Co-authored-by: Yifei Song <yifeis@nvidia.com>

* Revert self.fp8

Signed-off-by: Robin Zhang <robinz@nvidia.com>

* Use hooks to filter module params

Signed-off-by: Robin Zhang <robinz@nvidia.com>

* Filter all TE modules in hooks

Signed-off-by: Robin Zhang <robinz@nvidia.com>
Co-authored-by: Yifei Song <yifeis@nvidia.com>

* Format code

Signed-off-by: Robin Zhang <robinz@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update graph.py

Signed-off-by: Xin Yao <yaox12@outlook.com>

* Revert CudaRNGStatesTracker

Signed-off-by: Robin Zhang <robinz@nvidia.com>

* Format Update

Signed-off-by: Yifei Song <yifeis@nvidia.com>

* Revert "Use hooks to filter module params"

This reverts commit 73a22e2e8bcf43ec84c23bc844b8d16d06626e26.

Signed-off-by: Yifei Song <yifeis@nvidia.com>

* Remove filtering module params

Signed-off-by: Robin Zhang <robinz@nvidia.com>

---------

Signed-off-by: Robin Zhang <robinz@nvidia.com>
Signed-off-by: Xin Yao <yaox12@outlook.com>
Signed-off-by: Yifei Song <yifeis@nvidia.com>
Co-authored-by: Yifei Song <yifeis@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>