"git@developer.sourcefind.cn:OpenDAS/megatron-lm.git" did not exist on "6fd0b406740cf121ead37f31ab69a0e0914411d4"
Unverified Commit bf298439 authored by Taylor Robie, committed by GitHub

fix error when last shard is not assigned a batch (#5569)

parent 19d4eaaf
@@ -206,7 +206,9 @@ def _construct_records(
     num_workers: Number of multiprocessing workers to use for negative
       generation.
     cache_paths: Paths object with information of where to write files.
-    num_readers: The number of reader datasets in the input_fn.
+    num_readers: The number of reader datasets in the input_fn. This number is
+      approximate; fewer shards will be created if not all shards are assigned
+      batches. This can occur due to discretization in the assignment process.
     num_neg: The number of false negatives per positive example.
     num_positives: The number of positive examples. This value is used
       to pre-allocate arrays while the imap is still running. (NumPy does not
@@ -307,6 +309,10 @@ def _construct_records(
       break
     batches_by_file[current_file_id].append(current_batch_id)
+  # Drop shards which were not assigned batches
+  batches_by_file = [i for i in batches_by_file if i]
+  num_readers = len(batches_by_file)
   if is_training:
     # Empirically it is observed that placing the batch with repeated values at
     # the start rather than the end improves convergence.