Commit b1870045 authored by Mohammad Shoeybi

Merge branch 'mpatwary/main_text_gen' into 'main'

Fixed uninitialized communication group

See merge request ADLR/megatron-lm!142
parents 10c828ed 057a1fab
@@ -65,7 +65,7 @@ def _gather(input_):
tensor_list = [torch.empty_like(input_) for _ in range(world_size)]
tensor_list[rank] = input_
-torch.distributed.all_gather(tensor_list, input_, group=group)
+torch.distributed.all_gather(tensor_list, input_, group=get_model_parallel_group())
# Note: torch.cat already creates a contiguous tensor.
output = torch.cat(tensor_list, dim=last_dim).contiguous()
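
For context, here is a minimal sketch of the gather pattern this hunk touches, with the process group passed explicitly to every collective call. It is an illustration under stated assumptions, not the Megatron-LM implementation: `gather_last_dim` and `init_single_process_group` are hypothetical names, and a single-process `gloo` group created with `torch.distributed.new_group` stands in for the group returned by `get_model_parallel_group()`.

```python
import tempfile

import torch
import torch.distributed as dist


def init_single_process_group():
    """Start a one-process gloo job and return a group handle.

    Hypothetical stand-in for get_model_parallel_group() in the diff.
    """
    init_file = tempfile.NamedTemporaryFile(delete=False)
    init_file.close()
    dist.init_process_group(
        backend="gloo",
        init_method=f"file://{init_file.name}",
        rank=0,
        world_size=1,
    )
    return dist.new_group(ranks=[0])


def gather_last_dim(input_, group):
    """All-gather `input_` across `group` and concatenate along the last dim."""
    world_size = dist.get_world_size(group=group)
    last_dim = input_.dim() - 1
    # One receive buffer per rank in the group.
    tensor_list = [torch.empty_like(input_) for _ in range(world_size)]
    # The group is passed explicitly, so the call never relies on a
    # local variable that might not have been initialized.
    dist.all_gather(tensor_list, input_, group=group)
    return torch.cat(tensor_list, dim=last_dim).contiguous()


group = init_single_process_group()
x = torch.arange(6, dtype=torch.float32).reshape(2, 3)
out = gather_last_dim(x, group)
dist.destroy_process_group()
```

With a group of size one the result matches the input; with N ranks the last dimension would grow by a factor of N.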