Commit a7531875 authored by Toby Boyd

Updated readme and cleaned up formatting

parent 8008e72f
README.md

@@ -14,40 +14,18 @@ Before trying to run the model we highly encourage you to read all the README.

1. Install TensorFlow version 1.2.1 or later with GPU support.
   You can see how to do it [here](https://www.tensorflow.org/install/).

2. Generate TFRecord files.
   This downloads and extracts the Python version of the CIFAR-10 dataset into the data
   directory (if it is not already there) and writes TFRecord files for the training,
   validation and eval splits. You can see more details in `generate_cifar10_tfrecords.py`.

```shell
python generate_cifar10_tfrecords.py --data-dir=${PWD}/cifar-10-data
```

After running the command above, you should see the following new files in the data directory.

```shell
ls -R cifar-10-data
```
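If you want to sanity-check the conversion before training, the short sketch below counts the records in each generated file. It leans on the same TF 1.x `tf.python_io` API the converter itself uses; the `cifar-10-data` path is just the `--data-dir` from the command above, and the split names match the listing shown in the next hunk.

```python
import os

import tensorflow as tf

# Assumes the --data-dir used above; adjust if you wrote the files elsewhere.
data_dir = os.path.join(os.getcwd(), 'cifar-10-data')

# train/validation/eval are the modes written by generate_cifar10_tfrecords.py.
for split in ('train', 'validation', 'eval'):
  path = os.path.join(data_dir, split + '.tfrecords')
  count = sum(1 for _ in tf.python_io.tf_record_iterator(path))
  print('%s: %d records' % (split, count))
```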
@@ -59,7 +37,7 @@ train.tfrecords validation.tfrecords eval.tfrecords

Run the model on CPU only. After training, it runs the evaluation.

```
python cifar10_main.py --data-dir=${PWD}/cifar-10-data \
                       --job-dir=/tmp/cifar10 \
                       --num-gpus=0 \
                       --train-steps=1000
```
@@ -67,7 +45,7 @@ python cifar10_main.py --data-dir=${PWD}/cifar-10-data \

Run the model on 2 GPUs using CPU as parameter server. After training, it runs the evaluation.

```
python cifar10_main.py --data-dir=${PWD}/cifar-10-data \
                       --job-dir=/tmp/cifar10 \
                       --num-gpus=2 \
                       --train-steps=1000
```
@@ -78,7 +56,7 @@ It will run an experiment, which for local setting basically means it will run s

a couple of times to perform evaluation.

```
python cifar10_main.py --data-dir=${PWD}/cifar-10-data \
                       --job-dir=/tmp/cifar10 \
                       --variable-strategy GPU \
                       --num-gpus=2 \
```
@@ -98,7 +76,7 @@ You'll also need a Google Cloud Storage bucket for the data. If you followed the

```
MY_BUCKET=gs://<my-bucket-name>
gsutil cp -r ${PWD}/cifar-10-data $MY_BUCKET/
```

Then run the following command from the `tutorials/image` directory of this repository (the parent directory of this README):
@@ -111,7 +89,7 @@ gcloud ml-engine jobs submit training cifarmultigpu \

    --package-path cifar10_estimator/ \
    --module-name cifar10_estimator.cifar10_main \
    -- \
    --data-dir=$MY_BUCKET/cifar-10-data \
    --num-gpus=4 \
    --train-steps=1000
@@ -191,7 +169,7 @@ The num_workers argument is used only to update the learning rate correctly.

Make sure the model_dir is the same as the one defined in TF_CONFIG.

```shell
python cifar10_main.py --data-dir=gs://path/cifar-10-data \
                       --job-dir=gs://path/model_dir/ \
                       --num-gpus=4 \
                       --train-steps=40000 \
```
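For reference, `TF_CONFIG` is a JSON environment variable describing the cluster and the current task. The sketch below shows one plausible way to set it from Python before launching `cifar10_main.py` on one worker; the host:port strings are placeholders, and the exact keys (`cluster`, `task`, `environment`) follow the tf.contrib.learn conventions of this era rather than anything specified in this change, so check the full README for the authoritative format.

```python
import json
import os

# Placeholder cluster spec: replace the host:port strings with your own machines.
tf_config = {
    'cluster': {
        'master': ['master-host:2222'],
        'ps': ['ps-host:2222'],
        'worker': ['worker-host-0:2222', 'worker-host-1:2222'],
    },
    # This process acts as the first worker.
    'task': {'type': 'worker', 'index': 0},
    'environment': 'cloud',
}
os.environ['TF_CONFIG'] = json.dumps(tf_config)
```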
@@ -332,7 +310,7 @@ It will run evaluation a couple of times during training.

Make sure the model_dir is the same as the one defined in TF_CONFIG.

```shell
python cifar10_main.py --data-dir=gs://path/cifar-10-data \
                       --job-dir=gs://path/model_dir/ \
                       --num-gpus=4 \
                       --train-steps=40000 \
```
generate_cifar10_tfrecords.py

@@ -12,10 +12,11 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Reads CIFAR-10 data from pickled numpy arrays and writes TFRecords.

Generates tf.train.Example protos and writes them to TFRecord files from the
Python version of the CIFAR-10 dataset downloaded from
https://www.cs.toronto.edu/~kriz/cifar.html.
"""

from __future__ import absolute_import
@@ -30,14 +31,18 @@ import tarfile

from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf

CIFAR_FILENAME = 'cifar-10-python.tar.gz'
CIFAR_DOWNLOAD_URL = 'https://www.cs.toronto.edu/~kriz/' + CIFAR_FILENAME
CIFAR_LOCAL_FOLDER = 'cifar-10-batches-py'


def download_and_extract(data_dir):
  # download CIFAR-10 if not already downloaded.
  tf.contrib.learn.datasets.base.maybe_download(CIFAR_FILENAME, data_dir,
                                                CIFAR_DOWNLOAD_URL)
  tarfile.open(os.path.join(data_dir, CIFAR_FILENAME),
               'r:gz').extractall(data_dir)


def _int64_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
@@ -63,19 +68,17 @@ def read_pickle_from_file(filename):


def convert_to_tfrecord(input_files, output_file):
  """Converts a file to TFRecords."""
  print('Generating %s' % output_file)
  with tf.python_io.TFRecordWriter(output_file) as record_writer:
    for input_file in input_files:
      data_dict = read_pickle_from_file(input_file)
      data = data_dict['data']
      labels = data_dict['labels']
      num_entries_in_batch = len(labels)
      for i in range(num_entries_in_batch):
        example = tf.train.Example(features=tf.train.Features(
            feature={
                'image': _bytes_feature(data[i].tobytes()),
                'label': _int64_feature(labels[i])
            }))
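Reading these records back means undoing the encoding above: parse the `image`/`label` features, decode the raw bytes, and restore the image shape. The tutorial's own input pipeline (not part of this diff) does the equivalent; the sketch below assumes the standard channel-major 3x32x32 layout of the pickled CIFAR-10 arrays and TF 1.x ops.

```python
import tensorflow as tf

HEIGHT, WIDTH, DEPTH = 32, 32, 3


def parse_record(serialized_example):
  """Inverse of convert_to_tfrecord for a single serialized tf.train.Example."""
  features = tf.parse_single_example(
      serialized_example,
      features={
          'image': tf.FixedLenFeature([], tf.string),
          'label': tf.FixedLenFeature([], tf.int64),
      })
  image = tf.decode_raw(features['image'], tf.uint8)
  # The pickled arrays are channel-major (3 x 32 x 32); move channels last.
  image = tf.transpose(tf.reshape(image, [DEPTH, HEIGHT, WIDTH]), [1, 2, 0])
  label = tf.cast(features['label'], tf.int32)
  return image, label
```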
@@ -83,18 +86,18 @@ def convert_to_tfrecord(input_files, output_file):


def main(data_dir):
  print('Download from {} and extract.'.format(CIFAR_DOWNLOAD_URL))
  download_and_extract(data_dir)
  file_names = _get_file_names()
  input_dir = os.path.join(data_dir, CIFAR_LOCAL_FOLDER)
  for mode, files in file_names.items():
    input_files = [os.path.join(input_dir, f) for f in files]
    output_file = os.path.join(data_dir, mode + '.tfrecords')
    try:
      os.remove(output_file)
    except OSError:
      pass
    # Convert to tf.train.Example and write them to TFRecords.
    convert_to_tfrecord(input_files, output_file)
  print('Done!')
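`_get_file_names()` itself is outside the hunks shown here. Judging from how `main()` uses it and from the `train.tfrecords`/`validation.tfrecords`/`eval.tfrecords` outputs listed in the README, it presumably maps each mode to a list of pickled CIFAR-10 batch files, roughly like the hypothetical version below (an illustration, not the committed code).

```python
def _get_file_names():
  """Hypothetical mapping from output mode to pickled CIFAR-10 batch files."""
  # Assumed split: batches 1-4 for training, batch 5 held out for validation,
  # and the official test batch for eval.
  return {
      'train': ['data_batch_%d' % i for i in range(1, 5)],
      'validation': ['data_batch_5'],
      'eval': ['test_batch'],
  }
```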
@@ -105,8 +108,7 @@ if __name__ == '__main__':
      '--data-dir',
      type=str,
      default='',
      help='Directory to download and extract CIFAR-10 to.')

  args = parser.parse_args()
  main(args.data_dir)