Commit 14282ff3 authored by Arya McCarthy, committed by Facebook Github Bot

Add fairspeq task to train ASR model with auxiliary data. (#813)

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/813

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/663

Pull Request resolved: https://github.com/fairinternal/fairspeq/pull/4

Introduce a new training task for speech models that accepts additional (auxiliary) training data.

Reviewed By: liezl200

Differential Revision: D15846661

fbshipit-source-id: 8b2cbfd56a86cf03c0b34c4a025bebdd5db7204e
parent 1c1fd730
@@ -102,6 +102,11 @@ def filter_by_size(indices, size_fn, max_positions, raise_exception=False):
                 for key in intersect_keys
             )
         else:
+            # Hacky as heck, for the specific case of multilingual training with RoundRobin.
+            if isinstance(size_fn(idx), dict) and isinstance(max_positions, tuple):
+                return all(a is None or b is None or a <= b
+                    for a, b in zip(size_fn(idx).values(), max_positions)
+                )
             # For MultiCorpusSampledDataset, will generalize it later
             if not isinstance(size_fn(idx), Iterable):
                 return all(size_fn(idx) <= b for b in max_positions)
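To illustrate what the two branches in this hunk check, here is a minimal standalone sketch. It is not the fairseq implementation: the helper name `fits`, the dataset keys `"asr"` and `"aux"`, and the limits are hypothetical, and the helper takes the already-computed size instead of calling `size_fn(idx)`. Only the two cases visible in the hunk are covered; anything else raises, because the rest of the function is not shown here.

```python
from collections.abc import Iterable


def fits(size, max_positions):
    """Hypothetical helper mirroring the two branches shown in the hunk."""
    # Dict of per-dataset sizes checked against a tuple of limits.
    # Pairing is positional: dict insertion order versus tuple order.
    # This check must come first, because a dict is itself an Iterable.
    if isinstance(size, dict) and isinstance(max_positions, tuple):
        return all(
            a is None or b is None or a <= b
            for a, b in zip(size.values(), max_positions)
        )
    # Scalar size: it must not exceed any of the limits.
    if not isinstance(size, Iterable):
        return all(size <= b for b in max_positions)
    # Remaining cases are outside the hunk, so they are left out here.
    raise NotImplementedError("case not covered by this sketch")


# Hypothetical sizes and limits, for illustration only.
too_long = fits({"asr": 120, "aux": 80}, (100, 100))
with_none = fits({"asr": 90, "aux": None}, (100, 100))
scalar_ok = fits(50, (100, 100))
scalar_too_long = fits(150, (100, 100))
```

Because `zip` pairs dict values with limits purely by position, the result depends on the dict's insertion order matching the order of `max_positions`, and `zip` silently stops at the shorter of the two. That positional coupling is presumably what the "hacky" comment in the diff refers to.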