DistributedFusedAdam Model Parallelism Support (Megatron) (#981)
Co-authored-by: Kexin Yu <kexiny@nvidia.com>
Co-authored-by: Kexin Yu <kexinznzn@gmail.com>