Commit 189fcabf authored by Peng-Jen Chen, committed by Facebook Github Bot

Fix error when training multilingual_translation task with multi-GPU

Summary:
D10052908 introduced the multilingual_translation task, but it raises an exception when training with multiple GPUs: P60202593

With Myle's help, we found that the cause is an improperly handled dummy batch data type, which leads to optimizer.backward() not being executed the same number of times across different GPUs.

Reviewed By: xianxl

Differential Revision: D12964263

fbshipit-source-id: 4991039030bf373f0c484e131acc4736487be4d8
parent 8eb232ce
@@ -59,6 +59,8 @@ class RoundRobinZipDatasets(FairseqDataset):
     def collater(self, samples):
         """Merge a list of samples to form a mini-batch."""
+        if len(samples) == 0:
+            return None
         if self.eval_key is None:
             return OrderedDict([
                 (key, dataset.collater([sample[key] for sample in samples]))
...
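To illustrate the fix, here is a minimal, hypothetical stand-in for `RoundRobinZipDatasets.collater` (not the actual fairseq class): with the added guard, an empty sample list yields `None` rather than an `OrderedDict` of empty per-language batches, so every GPU sees the same sentinel for "no data" and runs `optimizer.backward()` the same number of times.

```python
from collections import OrderedDict

class StubDataset:
    """Hypothetical stand-in for a per-language sub-dataset."""
    def collater(self, items):
        # A real dataset collater would pad and batch tensors here.
        return items

def collater(datasets, samples):
    """Merge a list of per-language sample dicts into a mini-batch."""
    # The guard added by this commit: with no samples, return None
    # instead of a dict of empty batches, so all workers agree.
    if len(samples) == 0:
        return None
    return OrderedDict([
        (key, dataset.collater([s[key] for s in samples]))
        for key, dataset in datasets.items()
    ])

datasets = OrderedDict([("en-de", StubDataset()), ("en-fr", StubDataset())])
print(collater(datasets, []))                          # None
print(collater(datasets, [{"en-de": 1, "en-fr": 2}]))  # per-language batches
```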