Unverified Commit 00c3a254 authored by Samyam Rajbhandari, committed by GitHub

Bug fix for norm calculation in absence of model parallel group (#551)

In the absence of a model parallel group, model_parallel_allreduce should not perform any reduction. This commit fixes a bug where an all-reduce was performed across the world group when the model parallel group is None.
parent bcd56f97
@@ -1198,7 +1198,7 @@ class FP16_DeepSpeedZeroOptimizer(object):
         """ Perform all reduce within model parallel group, if any.
         """
         if self.model_parallel_group is None:
-            torch.distributed.all_reduce(tensor=tensor, op=op)
+            pass
         else:
             torch.distributed.all_reduce(tensor=tensor,
                                          op=op,
...
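For context, here is a minimal sketch of the helper as it would read after this change. The method name follows the commit message; the signature and the trailing group=self.model_parallel_group argument are assumptions, since the diff truncates the call.

```python
import torch.distributed as dist

class FP16_DeepSpeedZeroOptimizer(object):
    # ... other optimizer state and methods elided ...

    def model_parallel_allreduce(self, tensor, op=dist.ReduceOp.SUM):
        """ Perform all reduce within model parallel group, if any.

        With no model parallel group configured, the tensor must be left
        untouched; falling through to an all-reduce on the default (world)
        group was the bug this commit fixes.
        """
        if self.model_parallel_group is None:
            # No model parallelism: nothing to reduce across ranks here.
            pass
        else:
            # group=... is assumed; the diff truncates this call.
            dist.all_reduce(tensor=tensor,
                            op=op,
                            group=self.model_parallel_group)
```

In the norm-calculation path this matters because the partial norms being combined are only sharded across model parallel ranks; reducing them over the world group would double-count contributions from data parallel replicas.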