Commit 468d8bb6 authored by Jules Gagnon-Marchand, committed by Taylor Robie

Deterministic dataset order fix (#5098)

* Deterministic dataset order fix

For the file order to be deterministic, `shuffle` must be passed explicitly to `tf.data.Dataset.list_files(..., shuffle)` and set to False. By default the filenames are shuffled, so different iterator initializations yield different file orders.

* removed unnecessary shuffle of filenames

* Removed the `_FILE_SHUFFLE_BUFFER` definition
parent abc1c4a7
@@ -58,8 +58,6 @@ import tensorflow as tf
 from official.utils.misc import model_helpers
-# Use the number of training files as the shuffle buffer.
-_FILE_SHUFFLE_BUFFER = 100
 # Buffer size for reading records from a TFRecord file. Each training file is
 # 7.2 MB, so 8 MB allows an entire file to be kept in memory.
 _READ_RECORD_BUFFER = 8 * 1000 * 1000
@@ -220,11 +218,7 @@ def _read_and_batch_from_files(
   Returns:
     tf.data.Dataset object containing examples loaded from the files.
   """
-  dataset = tf.data.Dataset.list_files(file_pattern)
-  if shuffle:
-    # Shuffle filenames
-    dataset = dataset.shuffle(buffer_size=_FILE_SHUFFLE_BUFFER)
+  dataset = tf.data.Dataset.list_files(file_pattern, shuffle=shuffle)
   # Read files and interleave results. When training, the order of the examples
   # will be non-deterministic.
......
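The reasoning behind the change can be illustrated with a small pure-Python analogy. This is only a sketch and does not call TensorFlow: `list_files_sketch`, its `seed` parameter, and the file names are hypothetical stand-ins chosen for illustration. It shows why a listing step that shuffles without a fixed seed gives a different order on each pass, while routing a single `shuffle` flag through the listing step keeps evaluation passes identical.

```python
import random


def list_files_sketch(filenames, shuffle, seed=None):
    """Hypothetical stand-in for a file-listing step (not the TensorFlow API)."""
    ordered = sorted(filenames)  # stable base order
    if shuffle:
        # With seed=None, random.Random seeds itself from system entropy,
        # so each call may produce a different permutation.
        random.Random(seed).shuffle(ordered)
    return ordered


# Hypothetical file names, for illustration only.
files = [f"train-{i:05d}-of-00100" for i in range(100)]

# Evaluation-style listing: every pass sees the same order.
eval_pass_1 = list_files_sketch(files, shuffle=False)
eval_pass_2 = list_files_sketch(files, shuffle=False)

# Training-style listing: same files, order not fixed across passes.
train_pass = list_files_sketch(files, shuffle=True)

# A fixed seed makes a shuffled order repeatable.
seeded_1 = list_files_sketch(files, shuffle=True, seed=0)
seeded_2 = list_files_sketch(files, shuffle=True, seed=0)
```

In this sketch the shuffle decision lives in one place, which mirrors the diff: the separate filename shuffle and its buffer constant become unnecessary once the flag is passed to the listing call itself.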