Commit 0c74ba69 authored by Toby Boyd

link to non-deprecated imagenet preprocessing script

parent fe54563c
# ResNet in TensorFlow

Deep residual networks, or ResNets for short, provided the breakthrough idea of identity mappings in order to enable training of very deep convolutional neural networks. This folder contains an implementation of ResNet for the ImageNet dataset written in TensorFlow.

See the following papers for more background:
In code, v1 refers to the ResNet defined in [1], except that a stride of 2 is used on the 3x3 conv rather than on the first 1x1 in the bottleneck. This change results in higher and more stable accuracy with fewer epochs than the original v1 and has been shown to scale to higher batch sizes with minimal degradation in accuracy. There is no originating paper; the first mention we are aware of was in the torch version of [ResNetv1](https://github.com/facebook/fb.resnet.torch). Most popular v1 implementations are this variant, which we call ResNetv1.5.

In testing we found v1.5 requires ~12% more compute to train and has 6% lower throughput for inference compared to ResNetv1. CIFAR-10 ResNet does not use the bottleneck and is thus the same for v1 as for v1.5.
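The stride-placement difference between v1 and v1.5 can be illustrated with a small pure-Python sketch (the helper names below are hypothetical, not from this repository): both variants halve the spatial size of a downsampling bottleneck, but v1.5 runs its 3x3 convolution at the full input resolution, which is where the extra compute comes from.

```python
def conv_out_size(size, stride):
    # With SAME padding, a convolution's output size is ceil(size / stride).
    return -(-size // stride)  # ceil division

def bottleneck_spatial(size, downsample_on):
    """Trace the spatial size through a 1x1 -> 3x3 -> 1x1 bottleneck.

    downsample_on: "1x1" (original v1) or "3x3" (v1.5).
    """
    s1 = 2 if downsample_on == "1x1" else 1
    s3 = 2 if downsample_on == "3x3" else 1
    size = conv_out_size(size, s1)  # first 1x1 conv
    size = conv_out_size(size, s3)  # 3x3 conv
    size = conv_out_size(size, 1)   # last 1x1 conv
    return size

# Both variants halve the spatial size, but in v1.5 the 3x3 conv
# operates on the full 56x56 input rather than a 28x28 one.
print(bottleneck_spatial(56, "1x1"), bottleneck_spatial(56, "3x3"))  # -> 28 28
```

This makes the ~12% compute difference above intuitive: the cost of the 3x3 convolution is quadratic in its input resolution, so moving the stride onto it roughly quadruples that layer's work in downsampling blocks.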
v2 refers to [2]. The principal difference between the two versions is that v1 applies batch normalization and activation after convolution, while v2 applies
First make sure you've [added the models folder to your Python path](/official/#...).
Then download and extract the CIFAR-10 data from Alex's website, specifying the location with the `--data_dir` flag. Run the following:
```bash
python cifar10_download_and_extract.py

# Then to train the model, run the following:
python cifar10_main.py
```
Use `--data_dir` to specify the location of the CIFAR-10 data used in the previous step. There are more flag options as described in `cifar10_main.py`.
## ImageNet

### Setup
To begin, you will need to download the ImageNet dataset and convert it to TFRecord format. The following [script](https://github.com/tensorflow/tpu/blob/master/tools/datasets/imagenet_to_gcs.py) and [README](https://github.com/tensorflow/tpu/tree/master/tools/datasets#imagenet_to_gcspy) provide a few options.
Once your dataset is ready, you can begin training the model as follows:
```bash
python imagenet_main.py --data_dir=/path/to/imagenet
```
The model will begin training and will automatically evaluate itself on the validation data roughly once per epoch.
Note that there are a number of other options you can specify, including `--model_dir` to choose where to store the model and `--resnet_size` to choose the model size (options include ResNet-18 through ResNet-200). See [`resnet.py`](resnet.py) for the full list of options.
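As a rough sketch of what those sizes correspond to, the standard ResNet residual-block counts per stage are shown below. These are the canonical counts from the ResNet papers, not a copy of this repository's table, so treat the mapping (and the helper name) as illustrative; the authoritative version lives in the model code.

```python
# Standard ResNet residual-block counts per stage (illustrative; the
# authoritative mapping is defined in the model code, e.g. imagenet_main.py).
BLOCK_SIZES = {
    18: [2, 2, 2, 2],
    34: [3, 4, 6, 3],
    50: [3, 4, 6, 3],
    101: [3, 4, 23, 3],
    152: [3, 8, 36, 3],
    200: [3, 24, 36, 3],
}

def num_weight_layers(resnet_size):
    """Count weighted layers: stem conv + convs in all blocks + final dense."""
    # Sizes below 50 use the 2-conv "basic" block; 50+ use the 3-conv bottleneck.
    convs_per_block = 2 if resnet_size < 50 else 3
    return 1 + convs_per_block * sum(BLOCK_SIZES[resnet_size]) + 1

print(num_weight_layers(50))  # -> 50
```

The layer count recovering the model's name (18, 50, 152, ...) is a quick sanity check that a block-count table is consistent.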
## Compute Devices
Training is accomplished using the [DistributionStrategies API](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/distribute/README.md). The appropriate distribution strategy is chosen based on the `--num_gpus` flag. By default this flag is one if TensorFlow is compiled with CUDA, and zero otherwise.
`num_gpus`:

+ 0: Use OneDeviceStrategy and train on CPU.
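The flag-to-strategy decision can be sketched as a plain-Python dispatch (the function name is hypothetical, and the behavior for more than one GPU is an assumption about the elided list items, based on the DistributionStrategies README linked above):

```python
def choose_strategy_name(num_gpus):
    """Sketch of how --num_gpus could select a distribution strategy.

    Illustrative only: 0 -> train on CPU, 1 -> a single GPU, and the >1 case
    (assumed here) would use synchronous replication across GPUs.
    """
    if num_gpus == 0:
        return "OneDeviceStrategy(/cpu:0)"
    if num_gpus == 1:
        return "OneDeviceStrategy(/gpu:0)"
    return "MirroredStrategy(num_gpus=%d)" % num_gpus

# With the default flag value on a CUDA build (num_gpus=1), training
# stays on a single GPU.
print(choose_strategy_name(1))  # -> OneDeviceStrategy(/gpu:0)
```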