*Before proceeding* please read the [Convolutional Neural Networks](https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial; in
particular, focus on [Training a Model Using Multiple GPU Cards](https://www.tensorflow.org/tutorials/deep_cnn/index.html#launching_and_training_the_model_on_multiple_gpu_cards). The model training method is nearly identical to that described in the
CIFAR-10 multi-GPU model training. Briefly, the model training (see the code sketch after this list):
* Places an individual model replica on each GPU.
* Splits the batch across the GPUs.
* Updates model parameters synchronously by waiting for all GPUs to finish
processing a batch of data.
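
As a rough illustration of this pattern (and not the Inception training code itself), the following TensorFlow 1.x-style sketch splits a batch across two GPUs, builds one model replica ("tower") per GPU with shared variables, and applies a single synchronous update from the averaged gradients. The two-GPU count, the batch size, and the one-layer `tower_loss` model are assumptions made only for this example.

```python
import tensorflow as tf  # TensorFlow 1.x API, as in the tutorials referenced above

NUM_GPUS = 2     # assumed number of GPUs for this sketch
BATCH_SIZE = 64  # assumed global batch size; each GPU sees BATCH_SIZE // NUM_GPUS examples

def tower_loss(images, labels):
    """Hypothetical stand-in model: one linear layer plus softmax cross-entropy."""
    weights = tf.get_variable('weights', [784, 10],
                              initializer=tf.truncated_normal_initializer(stddev=0.01))
    biases = tf.get_variable('biases', [10],
                             initializer=tf.constant_initializer(0.0))
    logits = tf.matmul(images, weights) + biases
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))

images = tf.placeholder(tf.float32, [BATCH_SIZE, 784])
labels = tf.placeholder(tf.int64, [BATCH_SIZE])

# Split the input batch across the GPUs.
image_splits = tf.split(images, NUM_GPUS)
label_splits = tf.split(labels, NUM_GPUS)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
tower_grads = []

# Place an individual model replica ("tower") on each GPU, reusing the same variables.
for i in range(NUM_GPUS):
    with tf.device('/gpu:%d' % i):
        with tf.variable_scope('model', reuse=(i > 0)):
            loss = tower_loss(image_splits[i], label_splits[i])
            tower_grads.append(optimizer.compute_gradients(loss))

# Synchronous update: wait for every tower's gradients, average them,
# then apply a single update to the shared parameters.
averaged_grads = []
for grads_and_vars in zip(*tower_grads):
    grads = [g for g, _ in grads_and_vars]
    var = grads_and_vars[0][1]
    averaged_grads.append((tf.add_n(grads) / NUM_GPUS, var))
train_op = optimizer.apply_gradients(averaged_grads)
```

Real multi-GPU training code typically also pins the shared variables to the CPU and feeds the towers from an input queue; this sketch omits those details to keep the synchronization structure visible.
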
...
...
We term each machine that maintains model parameters a `ps`, short for
*parameter server*. Multiple machines may comprise a `ps` as the model
parameters may be sharded across multiple machines.
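
As a rough illustration (not the configuration used by this model), the snippet below shows how a TensorFlow 1.x cluster might declare `ps` and `worker` jobs and shard variables across the parameter servers. The host addresses, task counts, and variable shapes are hypothetical.

```python
import tensorflow as tf  # TensorFlow 1.x API

# Hypothetical cluster: two parameter-server tasks and two worker tasks
# (localhost ports here for illustration; in practice these are separate machines).
cluster = tf.train.ClusterSpec({
    'ps': ['localhost:2222', 'localhost:2223'],
    'worker': ['localhost:2224', 'localhost:2225'],
})

# Each process runs a server for its own job and task (here: worker 0).
server = tf.train.Server(cluster, job_name='worker', task_index=0)

# replica_device_setter keeps ops on this worker while placing variables
# round-robin across the `ps` tasks, i.e. sharding the model parameters.
with tf.device(tf.train.replica_device_setter(
        cluster=cluster, worker_device='/job:worker/task:0')):
    weights = tf.get_variable('weights', [784, 10])
    biases = tf.get_variable('biases', [10])
```
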
Variables may be updated with synchronous or asynchronous gradient updates. One
may construct an [`Optimizer`](https://www.tensorflow.org/api_docs/python/train.html#optimizers) in TensorFlow
that constructs the necessary graph for either case diagrammed below from