ModelZoo / ResNet50_tensorflow / Commits / bf298439
Unverified commit bf298439, authored Oct 18, 2018 by Taylor Robie; committed by GitHub, Oct 18, 2018.
fix error when last shard is not assigned a batch (#5569)
parent 19d4eaaf
Changes: 1 changed file with 7 additions and 1 deletion (+7, -1).
official/recommendation/data_async_generation.py @ bf298439
@@ -206,7 +206,9 @@ def _construct_records(
     num_workers: Number of multiprocessing workers to use for negative
       generation.
     cache_paths: Paths object with information of where to write files.
-    num_readers: The number of reader datasets in the input_fn.
+    num_readers: The number of reader datasets in the input_fn. This number is
+      approximate; fewer shards will be created if not all shards are assigned
+      batches. This can occur due to discretization in the assignment process.
     num_neg: The number of false negatives per positive example.
     num_positives: The number of positive examples. This value is used
       to pre-allocate arrays while the imap is still running. (NumPy does not
@@ -307,6 +309,10 @@ def _construct_records(
       break
     batches_by_file[current_file_id].append(current_batch_id)

+  # Drop shards which were not assigned batches
+  batches_by_file = [i for i in batches_by_file if i]
+  num_readers = len(batches_by_file)
+
   if is_training:
     # Empirically it is observed that placing the batch with repeated values at
     # the start rather than the end improves convergence.
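A minimal sketch of the problem this commit fixes (the shard and batch counts below are hypothetical, and `assign_batches` is an illustrative stand-in, not the repository's actual assignment code): when batches are handed to reader shards in fixed-size chunks, ceiling division can leave the last shard(s) with no batches at all, which the added filtering step then compensates for.

```python
# Illustrative reproduction of the discretization issue: batches are assigned
# to shards in contiguous chunks of ceil(num_batches / num_readers) size, so
# trailing shards can receive zero batches.

def assign_batches(num_batches, num_readers):
    """Assign batch ids to reader shards in contiguous chunks (hypothetical)."""
    batches_by_file = [[] for _ in range(num_readers)]
    chunk = -(-num_batches // num_readers)  # ceiling division
    for current_batch_id in range(num_batches):
        batches_by_file[current_batch_id // chunk].append(current_batch_id)
    return batches_by_file

batches_by_file = assign_batches(num_batches=9, num_readers=4)
print(batches_by_file)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], []] -- last shard is empty

# The two lines added by the diff: drop shards which were not assigned
# batches, and let num_readers reflect the shards that actually exist.
batches_by_file = [i for i in batches_by_file if i]
num_readers = len(batches_by_file)
print(num_readers)  # 3
```

Without the filter, downstream code that builds one reader dataset per shard would try to read from an empty shard; shrinking `num_readers` instead is why the docstring now describes it as approximate.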