Commit a8c5a3cb authored by Rick Ho

bug fix for megatron patch

parent ae10e942
r"""
Patching some of Megatron-LM's functions to create an MoE model
"""
import torch


def patch_forward_step(forward_step_func):
    r"""
    Patch model's forward_step_func to support balance loss
......
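
The body of `patch_forward_step` is elided in this diff. As a rough illustration of what "patching a forward step to support balance loss" can look like, here is a minimal sketch and not the code from this commit. It assumes the original function returns a `(loss, stats)` pair; the names `patch_forward_step_sketch`, `get_balance_loss`, and `balance_weight` are hypothetical, and plain floats stand in for tensors so the example runs without `torch`.

```python
from functools import wraps


def patch_forward_step_sketch(forward_step_func, get_balance_loss,
                              balance_weight=0.01):
    """Hypothetical wrapper that adds a weighted balance loss to the loss.

    Assumption: forward_step_func(data_iterator, model) returns a
    (loss, stats_dict) pair. get_balance_loss(model) is a made-up callback
    that returns the auxiliary load-balancing loss for the MoE layers.
    """
    @wraps(forward_step_func)
    def forward_step_with_balance_loss(data_iterator, model, *args, **kwargs):
        # Run the unpatched forward step first.
        loss, stats = forward_step_func(data_iterator, model, *args, **kwargs)
        bal_loss = get_balance_loss(model)
        # Copy the stats so the caller's dict is not mutated.
        stats = dict(stats)
        stats["balance loss"] = bal_loss
        return loss + balance_weight * bal_loss, stats

    return forward_step_with_balance_loss


# Toy usage: floats stand in for loss tensors.
def toy_forward_step(data_iterator, model):
    return 2.0, {"lm loss": 2.0}


patched = patch_forward_step_sketch(
    toy_forward_step, get_balance_loss=lambda model: 0.5, balance_weight=0.1
)
loss, stats = patched(None, None)
```

Wrapping rather than editing the original function keeps the training loop unchanged: the caller still receives a `(loss, stats)` pair, with the auxiliary term folded into the loss and reported separately in the stats.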