Commit 00e47d7c authored by Peng Li, committed by Facebook Github Bot

fix data checking report bug (#403)

Summary:
The original code reported the index and size of a valid sample, instead of the invalid one, when raising the Exception, which made the error message confusing.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/403

Differential Revision: D13391431

Pulled By: myleott

fbshipit-source-id: 4642ed027c0f664424fc5a9baf4363791144feaf
parent 03ef3ab8
@@ -100,12 +100,13 @@ def filter_by_size(indices, size_fn, max_positions, raise_exception=False):
     ignored = []
     itr = collect_filtered(check_size, indices, ignored)
     for idx in itr:
         if len(ignored) > 0 and raise_exception:
             raise Exception((
                 'Size of sample #{} is invalid (={}) since max_positions={}, '
                 'skip this example with --skip-invalid-size-inputs-valid-test'
-            ).format(idx, size_fn(idx), max_positions))
+            ).format(ignored[0], size_fn(ignored[0]), max_positions))
         yield idx
     if len(ignored) > 0:
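
To illustrate why the old message was misleading, here is a minimal, self-contained sketch of the loop. It is not the fairseq implementation: `collect_filtered` below is a simplified stand-in, `filter_by_size_sketch` and its `report_fixed` flag are hypothetical names added for comparison, and `max_positions` is assumed to be a plain integer. The point is that by the time the loop body runs, `idx` is the next valid sample, while the offending sample is sitting in `ignored`.

```python
def collect_filtered(function, iterable, filtered):
    """Simplified stand-in: yield items that pass `function`, record the rest."""
    for el in iterable:
        if function(el):
            yield el
        else:
            filtered.append(el)


def filter_by_size_sketch(indices, size_fn, max_positions, report_fixed=True):
    """Hypothetical reduction of the patched loop; always raises on invalid samples."""
    def check_size(idx):
        # Assumption: max_positions is a single integer limit.
        return size_fn(idx) <= max_positions

    ignored = []
    for idx in collect_filtered(check_size, indices, ignored):
        if len(ignored) > 0:
            # Old behaviour formatted the message with `idx`, the valid sample
            # that follows the invalid one; the fix uses `ignored[0]`.
            bad = ignored[0] if report_fixed else idx
            raise Exception(
                'Size of sample #{} is invalid (={}) since max_positions={}'
                .format(bad, size_fn(bad), max_positions)
            )
        yield idx


sizes = [5, 50, 7]  # sample #1 exceeds max_positions=10


def report(report_fixed):
    """Run the sketch over `sizes` and return the exception message."""
    try:
        list(filter_by_size_sketch(range(len(sizes)), sizes.__getitem__, 10, report_fixed))
    except Exception as e:
        return str(e)


old_message = report(report_fixed=False)
new_message = report(report_fixed=True)
print(old_message)
print(new_message)
```

With the old formatting the message names sample #2 (size 7), which is within the limit; with the fix it names sample #1 (size 50), the one that actually failed the check.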