Commit 85dd5fa4 authored by Zhichao Lu, committed by pkulzc

Merged commit includes the following changes:

204489224  by Zhichao Lu:

    Modify ssd mobilenet v1 fpn config to be a bit more tolerant to OOM failure by bumping down the batch size to 64 and doubling the number of iterations to 25k. It now converges in 2.5 hours.

--
204488942  by Zhichao Lu:

    Internal change

204480631  by Zhichao Lu:

    This CL makes sure that the num_steps parameter is not updated to 0 if the
    num_steps field is not mentioned in the config.

    The default behavior for the number of training steps is to train
    indefinitely (train forever). The default value of num_steps in train.proto
    is 0 (meaning train indefinitely), but the estimator/training function
    expects num_steps to be None in order to train indefinitely.
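    A minimal sketch of the intended precedence (hypothetical helper for
    illustration only; the actual change is in create_estimator_and_inputs,
    included further down in this commit):

        def resolve_train_steps(flag_value, config_num_steps):
          # A command-line flag wins when provided; num_steps == 0 in
          # train.proto means "unset", which the Estimator API expresses as
          # None (train indefinitely).
          if flag_value is not None:
            return flag_value
          return config_num_steps if config_num_steps != 0 else None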

--
204437217  by Zhichao Lu:

    Create a Docker image to support TensorFlow Lite / Object Detection blog post.

--
204317570  by Zhichao Lu:

    Internal change

PiperOrigin-RevId: 204489224
parent 11070af9
......@@ -77,6 +77,10 @@ Extras:
Run an instance segmentation model</a><br>
* <a href='g3doc/challenge_evaluation.md'>
Run the evaluation for the Open Images Challenge 2018</a><br>
* <a href='g3doc/tpu_compatibility.md'>
TPU compatible detection pipelines</a><br>
* <a href='g3doc/running_on_mobile_tensorflowlite.md'>
Running object detection on mobile devices with TensorFlow Lite</a><br>
## Getting Help
......@@ -95,6 +99,26 @@ reporting an issue.
## Release information
### July 13, 2018
There are many new updates in this release, extending the functionality and
capability of the API:
* Moving from slim-based training to [Estimator](https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator)-based
training.
* Support for [RetinaNet](https://arxiv.org/abs/1708.02002), and a [MobileNet](https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html)
adaptation of RetinaNet.
* A novel SSD-based architecture called the [Pooling Pyramid Network](https://arxiv.org/abs/1807.03284) (PPN).
* Releasing several [TPU](https://cloud.google.com/tpu/)-compatible models.
These can be found in the `samples/configs/` directory with a comment in the
pipeline configuration files indicating TPU compatibility.
* Support for quantized training.
* Updated documentation for new binaries, Cloud training, and [Tensorflow Lite](https://www.tensorflow.org/mobile/tflite/).
<b>Thanks to contributors</b>: Sara Robinson, Aakanksha Chowdhery, Derek Chow,
Pengchong Jin, Jonathan Huang, Vivek Rathod, Zhichao Lu, Ronny Votel
### June 25, 2018
Additional evaluation tools for the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) are out.
......
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
FROM tensorflow/tensorflow:nightly-devel
# Get the tensorflow models research directory, and move it into tensorflow
# source folder to match recommendation of installation
RUN git clone --depth 1 https://github.com/tensorflow/models.git && \
mv models /tensorflow/models
# Install gcloud and gsutil commands
# https://cloud.google.com/sdk/docs/quickstart-debian-ubuntu
RUN export CLOUD_SDK_REPO="cloud-sdk-$(lsb_release -c -s)" && \
echo "deb http://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && \
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
apt-get update -y && apt-get install google-cloud-sdk -y
# Install the Tensorflow Object Detection API from here
# https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md
# Install object detection api dependencies
RUN apt-get install -y protobuf-compiler python-pil python-lxml python-tk && \
pip install Cython && \
pip install contextlib2 && \
pip install jupyter && \
pip install matplotlib
# Install pycocoapi
RUN git clone --depth 1 https://github.com/cocodataset/cocoapi.git && \
cd cocoapi/PythonAPI && \
make -j8 && \
cp -r pycocotools /tensorflow/models/research && \
cd ../../ && \
rm -rf cocoapi
# Get protoc 3.0.0, rather than the old version already in the container
RUN curl -OL "https://github.com/google/protobuf/releases/download/v3.0.0/protoc-3.0.0-linux-x86_64.zip" && \
unzip protoc-3.0.0-linux-x86_64.zip -d proto3 && \
mv proto3/bin/* /usr/local/bin && \
mv proto3/include/* /usr/local/include && \
rm -rf proto3 protoc-3.0.0-linux-x86_64.zip
# Run protoc on the object detection repo
RUN cd /tensorflow/models/research && \
protoc object_detection/protos/*.proto --python_out=.
# Set the PYTHONPATH to finish installing the API
ENV PYTHONPATH $PYTHONPATH:/tensorflow/models/research:/tensorflow/models/research/slim
# Install wget (to make life easier below) and editors (to allow people to edit
# the files inside the container)
RUN apt-get install -y wget vim emacs nano
# Grab various data files which are used throughout the demo: dataset,
# pretrained model, and pretrained TensorFlow Lite model. Install these all in
# the same directories as recommended by the blog post.
# Pets example dataset
RUN mkdir -p /tmp/pet_faces_tfrecord/ && \
cd /tmp/pet_faces_tfrecord && \
curl "http://download.tensorflow.org/models/object_detection/pet_faces_tfrecord.tar.gz" | tar xzf -
# Pretrained model
# This one doesn't need its own directory, since it comes in a folder.
RUN cd /tmp && \
curl -O "http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz" && \
tar xzf ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz && \
rm ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz
# Trained TensorFlow Lite model. This should get replaced by one generated from
# export_tflite_ssd_graph.py when that command is called.
RUN cd /tmp && \
curl -L -o tflite.zip \
https://storage.googleapis.com/download.tensorflow.org/models/tflite/frozengraphs_ssd_mobilenet_v1_0.75_quant_pets_2018_06_29.zip && \
unzip tflite.zip -d tflite && \
rm tflite.zip
# Install Android development tools
# Inspired by the following sources:
# https://github.com/bitrise-docker/android/blob/master/Dockerfile
# https://github.com/reddit/docker-android-build/blob/master/Dockerfile
# Set environment variables
ENV ANDROID_HOME /opt/android-sdk-linux
ENV ANDROID_NDK_HOME /opt/android-ndk-r14b
ENV PATH ${PATH}:${ANDROID_HOME}/tools:${ANDROID_HOME}/tools/bin:${ANDROID_HOME}/platform-tools
# Install SDK tools
RUN cd /opt && \
curl -OL https://dl.google.com/android/repository/sdk-tools-linux-4333796.zip && \
unzip sdk-tools-linux-4333796.zip -d ${ANDROID_HOME} && \
rm sdk-tools-linux-4333796.zip
# Accept licenses before installing components, no need to echo y for each component
# License is valid for all the standard components in versions installed from this file
# Non-standard components: MIPS system images, preview versions, GDK (Google Glass) and Android Google TV require separate licenses, not accepted there
RUN yes | sdkmanager --licenses
# Install platform tools, SDK platform, and other build tools
RUN yes | sdkmanager \
"tools" \
"platform-tools" \
"platforms;android-27" \
"platforms;android-23" \
"build-tools;27.0.3" \
"build-tools;23.0.3"
# Install Android NDK (r14b)
RUN cd /opt && \
curl -L -o android-ndk.zip http://dl.google.com/android/repository/android-ndk-r14b-linux-x86_64.zip && \
unzip -q android-ndk.zip && \
rm -f android-ndk.zip
# Configure the build to use the things we just downloaded
RUN cd /tensorflow && \
printf '\n\nn\ny\nn\nn\nn\ny\nn\nn\nn\nn\nn\nn\n\ny\n%s\n\n\n' ${ANDROID_HOME}|./configure
WORKDIR /tensorflow
# Dockerfile for the TPU and TensorFlow Lite Object Detection tutorial
This Docker image automates the setup involved with training
object detection models on Google Cloud and building the Android TensorFlow Lite
demo app. We recommend using this container if you decide to work through our
tutorial on ["Training and serving a real-time mobile object detector in
30 minutes with Cloud TPUs"](https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193), though of course it may be useful even if you would
like to use the Object Detection API outside the context of the tutorial.
A couple of words of warning:
1. Docker containers do not have persistent storage. This means that any changes
you make to files inside the container will not persist if you restart
the container. When running through the tutorial,
**do not close the container**.
2. To be able to deploy the [Android app](
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/examples/android/app)
(which you will build at the end of the tutorial),
you will need to kill any instances of `adb` running on the host machine. You
can accomplish this by closing all instances of Android Studio, and then
running `adb kill-server`.
You can install Docker by following the [instructions here](
https://docs.docker.com/install/).
## Running The Container
From this directory, build the Dockerfile as follows (this takes a while):
```
docker build --tag detect-tf .
```
Run the container:
```
docker run --rm -it --privileged -p 6006:6006 detect-tf
```
When running the container, you will find yourself inside the `/tensorflow`
directory, which is the path to the TensorFlow [source
tree](https://github.com/tensorflow/tensorflow).
## Text Editing
The tutorial also
requires you to occasionally edit files inside the source tree.
This Docker image comes with `vim`, `nano`, and `emacs` preinstalled for your
convenience.
## What's In This Container
This container is derived from the nightly build of TensorFlow, and contains the
sources for TensorFlow at `/tensorflow`, as well as the
[TensorFlow Models](https://github.com/tensorflow/models) repository, which is available at
`/tensorflow/models` (and contains the Object Detection API as a subdirectory
at `/tensorflow/models/research/object_detection`).
The Oxford-IIIT Pets dataset, the COCO pre-trained SSD + MobileNet (v1)
checkpoint, and an example
trained model are all available in `/tmp` in their respective folders.
This container also has the `gsutil` and `gcloud` utilities, the `bazel` build
tool, and all dependencies necessary to use the Object Detection API, and
compile and install the TensorFlow Lite Android demo app.
At various points throughout the tutorial, you may see references to the
*research directory*. This refers to the `research` folder within the
models repository, located at
`/tensorflow/models/research`.
# Tensorflow detection model zoo
We provide a collection of detection models pre-trained on the [COCO
dataset](http://mscoco.org), the [Kitti dataset](http://www.cvlibs.net/datasets/kitti/), and the
[Open Images dataset](https://github.com/openimages/dataset). These models can
dataset](http://mscoco.org), the [Kitti dataset](http://www.cvlibs.net/datasets/kitti/),
the [Open Images dataset](https://github.com/openimages/dataset) and the
[AVA v2.1 dataset](https://research.google.com/ava/). These models can
be useful for
out-of-the-box inference if you are interested in categories already in COCO
(e.g., humans, cars, etc) or in Open Images (e.g.,
......@@ -57,19 +58,26 @@ Some remarks on frozen inference graphs:
a detector (and discarding the part past that point), which negatively impacts
standard mAP metrics.
* Our frozen inference graphs are generated using the
[v1.5.0](https://github.com/tensorflow/tensorflow/tree/v1.5.0)
[v1.8.0](https://github.com/tensorflow/tensorflow/tree/v1.8.0)
release version of Tensorflow and we do not guarantee that these will work
with other versions; this being said, each frozen inference graph can be
regenerated using your current version of Tensorflow by re-running the
[exporter](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/exporting_models.md),
pointing it at the model directory as well as the config file inside of it.
pointing it at the model directory as well as the corresponding config file in
[samples/configs](https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs).
## COCO-trained models {#coco-models}
## COCO-trained models
| Model name | Speed (ms) | COCO mAP[^1] | Outputs |
| ------------ | :--------------: | :--------------: | :-------------: |
| [ssd_mobilenet_v1_coco](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2018_01_28.tar.gz) | 30 | 21 | Boxes |
| [ssd_mobilenet_v1_0.75_depth_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz) | 26 | 18 | Boxes |
| [ssd_mobilenet_v1_quantized_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_03.tar.gz) | 29 | 18 | Boxes |
| [ssd_mobilenet_v1_0.75_depth_quantized_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_0.75_depth_quantized_300x300_coco14_sync_2018_07_03.tar.gz) | 29 | 16 | Boxes |
| [ssd_mobilenet_v1_ppn_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_ppn_shared_box_predictor_300x300_coco14_sync_2018_07_03.tar.gz) | 26 | 20 | Boxes |
| [ssd_mobilenet_v1_fpn_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03.tar.gz) | 56 | 32 | Boxes |
| [ssd_resnet_50_fpn_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03.tar.gz) | 76 | 35 | Boxes |
| [ssd_mobilenet_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz) | 31 | 22 | Boxes |
| [ssdlite_mobilenet_v2_coco](http://download.tensorflow.org/models/object_detection/ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz) | 27 | 22 | Boxes |
| [ssd_inception_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2018_01_28.tar.gz) | 42 | 24 | Boxes |
......@@ -88,15 +96,15 @@ Some remarks on frozen inference graphs:
| [mask_rcnn_resnet101_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet101_atrous_coco_2018_01_28.tar.gz) | 470 | 33 | Masks |
| [mask_rcnn_resnet50_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet50_atrous_coco_2018_01_28.tar.gz) | 343 | 29 | Masks |
Note: The star (☆) at the end of a model name indicates that the model supports TPU training.
## Kitti-trained models {#kitti-models}
## Kitti-trained models
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
[faster_rcnn_resnet101_kitti](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_kitti_2018_01_28.tar.gz) | 79 | 87 | Boxes
## Open Images-trained models {#open-images-models}
## Open Images-trained models
Model name | Speed (ms) | Open Images mAP@0.5[^2] | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
......@@ -104,7 +112,7 @@ Model name
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes
## AVA v2.1 trained models {#ava-models}
## AVA v2.1 trained models
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
......@@ -112,5 +120,6 @@ Model name
[^1]: See [MSCOCO evaluation protocol](http://cocodataset.org/#detections-eval).
[^2]: This is PASCAL mAP with a slightly different way of true positives computation: see [Open Images evaluation protocol](evaluation_protocols.md#open-images).
......@@ -34,37 +34,22 @@ A local training job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
python object_detection/train.py \
--logtostderr \
--pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
--train_dir=${PATH_TO_TRAIN_DIR}
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
NUM_TRAIN_STEPS=50000
NUM_EVAL_STEPS=2000
python object_detection/model_main.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--num_train_steps=${NUM_TRAIN_STEPS} \
--num_eval_steps=${NUM_EVAL_STEPS} \
--alsologtostderr
```
where `${PATH_TO_YOUR_PIPELINE_CONFIG}` points to the pipeline config and
`${PATH_TO_TRAIN_DIR}` points to the directory in which training checkpoints
and events will be written. By default, the training job will
run indefinitely until the user kills it.
## Running the Evaluation Job
Evaluation is run as a separate job. The eval job will periodically poll the
train directory for new checkpoints and evaluate them on a test dataset. The
job can be run using the following command:
```bash
# From the tensorflow/models/research/ directory
python object_detection/eval.py \
--logtostderr \
--pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
--checkpoint_dir=${PATH_TO_TRAIN_DIR} \
--eval_dir=${PATH_TO_EVAL_DIR}
```
where `${PATH_TO_YOUR_PIPELINE_CONFIG}` points to the pipeline config,
`${PATH_TO_TRAIN_DIR}` points to the directory in which training checkpoints
were saved (same as the training job) and `${PATH_TO_EVAL_DIR}` points to the
directory in which evaluation events will be saved. As with the training job,
the eval job runs until terminated by default.
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and
`${MODEL_DIR}` points to the directory in which training checkpoints
and events will be written. Note that this binary will interleave both
training and evaluation.
## Running Tensorboard
......@@ -73,9 +58,9 @@ using the recommended directory structure, Tensorboard can be run using the
following command:
```bash
tensorboard --logdir=${PATH_TO_MODEL_DIRECTORY}
tensorboard --logdir=${MODEL_DIR}
```
where `${PATH_TO_MODEL_DIRECTORY}` points to the directory that contains the
where `${MODEL_DIR}` points to the directory that contains the
train and eval directories. Please note it may take Tensorboard a couple minutes
to populate with data.
# Running on Google Cloud Platform
# Running on Google Cloud ML Engine
The Tensorflow Object Detection API supports distributed training on Google
Cloud ML Engine. This section documents instructions on how to train and
......@@ -23,26 +23,28 @@ evaluation jobs for a few iterations
## Packaging
In order to run the Tensorflow Object Detection API on Cloud ML, it must be
packaged (along with its TF-Slim dependency). The required packages can be
created with the following command
packaged (along with its TF-Slim dependency and the
[pycocotools](https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/pycocotools)
library). The required packages can be created with the following command:
``` bash
# From tensorflow/models/research/
bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools
python setup.py sdist
(cd slim && python setup.py sdist)
```
This will create python packages in dist/object_detection-0.1.tar.gz and
slim/dist/slim-0.1.tar.gz.
This will create python packages dist/object_detection-0.1.tar.gz,
slim/dist/slim-0.1.tar.gz, and /tmp/pycocotools/pycocotools-2.0.tar.gz.
## Running a Multiworker Training Job
## Running a Multiworker (GPU) Training Job on CMLE
Google Cloud ML requires a YAML configuration file for a multiworker training
job using GPUs. A sample YAML file is given below:
```
trainingInput:
runtimeVersion: "1.2"
runtimeVersion: "1.8"
scaleTier: CUSTOM
masterType: standard_gpu
workerCount: 9
......@@ -68,22 +70,22 @@ The YAML file should be saved on the local machine (not on GCP). Once it has
been written, a user can start a training job on Cloud ML Engine using the
following command:
``` bash
```bash
# From tensorflow/models/research/
gcloud ml-engine jobs submit training object_detection_`date +%s` \
--runtime-version 1.2 \
--job-dir=gs://${TRAIN_DIR} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
--module-name object_detection.train \
gcloud ml-engine jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.8 \
--job-dir=gs://${MODEL_DIR} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_main \
--region us-central1 \
--config ${PATH_TO_LOCAL_YAML_FILE} \
-- \
--train_dir=gs://${TRAIN_DIR} \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
Where `${PATH_TO_LOCAL_YAML_FILE}` is the local path to the YAML configuration,
`gs://${TRAIN_DIR}` specifies the directory on Google Cloud Storage where the
`gs://${MODEL_DIR}` specifies the directory on Google Cloud Storage where the
training checkpoints and events will be written, and
`gs://${PIPELINE_CONFIG_PATH}` points to the pipeline configuration stored on
Google Cloud Storage.
......@@ -91,34 +93,69 @@ Google Cloud Storage.
Users can monitor the progress of their training job on the [ML Engine
Dashboard](https://console.cloud.google.com/mlengine/jobs).
Note: This sample is supported for use with 1.2 runtime version.
Note: This sample is supported for use with 1.8 runtime version.
## Running a TPU Training Job on CMLE
Launching a training job with a TPU compatible pipeline config requires using a
similar command:
```bash
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--job-dir=gs://${MODEL_DIR} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_tpu_main \
--runtime-version 1.8 \
--scale-tier BASIC_TPU \
--region us-central1 \
-- \
--tpu_zone us-central1 \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
In contrast with the GPU training command, there is no need to specify a YAML
file and we point to the *object_detection.model_tpu_main* binary instead of
*object_detection.model_main*. We must also now set `scale-tier` to be
`BASIC_TPU` and provide a `tpu_zone`. Finally, as before, `pipeline_config_path`
points to the pipeline configuration stored on Google Cloud Storage
(but it must now describe a TPU compatible model).
## Running an Evaluation Job on CMLE
## Running an Evaluation Job on Cloud
Note: You only need to run a separate evaluation job when training on TPU,
since TPU training does not interleave evaluation the way multiworker GPU
training does.
Evaluation jobs run on a single machine, so it is not necessary to write a YAML
configuration for evaluation. Run the following command to start the evaluation
job:
``` bash
gcloud ml-engine jobs submit training object_detection_eval_`date +%s` \
--runtime-version 1.2 \
--job-dir=gs://${TRAIN_DIR} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
--module-name object_detection.eval \
```bash
gcloud ml-engine jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.8 \
--job-dir=gs://${MODEL_DIR} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_main \
--region us-central1 \
--scale-tier BASIC_GPU \
-- \
--checkpoint_dir=gs://${TRAIN_DIR} \
--eval_dir=gs://${EVAL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH} \
--checkpoint_dir=gs://${MODEL_DIR}
```
Where `gs://${TRAIN_DIR}` points to the directory on Google Cloud Storage where
training checkpoints are saved (same as the training job), `gs://${EVAL_DIR}`
points to where evaluation events will be saved on Google Cloud Storage and
Where `gs://${MODEL_DIR}` points to the directory on Google Cloud Storage where
training checkpoints are saved (same as the training job), where evaluation
events will also be saved, and
`gs://${PIPELINE_CONFIG_PATH}` points to where the pipeline configuration is
stored on Google Cloud Storage.
Typically one starts an evaluation job concurrently with the training job.
Note that we do not support running evaluation on TPU, so the above command
line for launching evaluation jobs is the same whether you are training
on GPU or TPU.
## Running Tensorboard
You can run Tensorboard locally on your own machine to view progress of your
......@@ -130,3 +167,4 @@ tensorboard --logdir=gs://${YOUR_CLOUD_BUCKET}
```
Note it may take Tensorboard a few minutes to populate with results.
# Running on mobile with TensorFlow Lite
In this section, we will show you how to use [TensorFlow
Lite](https://www.tensorflow.org/mobile/tflite/) to get a smaller model and
take advantage of ops that have been optimized for mobile devices.
TensorFlow Lite is TensorFlow’s lightweight solution for mobile and embedded
devices. It enables on-device machine learning inference with low latency and a
small binary size. TensorFlow Lite uses many techniques for this such as
quantized kernels that allow smaller and faster (fixed-point math) models.
For this section, you will need to build [TensorFlow from
source](https://www.tensorflow.org/install/install_sources) to get the
TensorFlow Lite support for the SSD model. You will also need to install the
[bazel build
tool](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#bazel).
To make these commands easier to run, let’s set up some environment variables:
```shell
export CONFIG_FILE=PATH_TO_BE_CONFIGURED/pipeline.config
export CHECKPOINT_PATH=PATH_TO_BE_CONFIGURED/model.ckpt
export OUTPUT_DIR=/tmp/tflite
```
We start with a checkpoint and get a TensorFlow frozen graph with compatible ops
that we can use with TensorFlow Lite. First, you’ll need to install these
[python
libraries](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md).
Then to get the frozen graph, run the export_tflite_ssd_graph.py script from the
`models/research` directory with this command:
```shell
python object_detection/export_tflite_ssd_graph.py \
--pipeline_config_path=$CONFIG_FILE \
--trained_checkpoint_prefix=$CHECKPOINT_PATH \
--output_directory=$OUTPUT_DIR \
--add_postprocessing_op=true
```
In the /tmp/tflite directory, you should now see two files: tflite_graph.pb and
tflite_graph.pbtxt. Note that the add_postprocessing flag enables the model to
take advantage of a custom optimized detection post-processing operation which
can be thought of as a replacement for
[tf.image.non_max_suppression](https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression).
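As an optional sanity check, you can list the op types present in the exported
tflite_graph.pb with a few lines of TF 1.x Python; when the graph was exported
with `--add_postprocessing_op=true`, the custom `TFLite_Detection_PostProcess`
op should show up in the list (a minimal sketch, assuming the default output
path from above):

```python
import tensorflow as tf  # TF 1.x API

graph_def = tf.GraphDef()
with tf.gfile.GFile('/tmp/tflite/tflite_graph.pb', 'rb') as f:
  graph_def.ParseFromString(f.read())

# Distinct op types in the exported graph; look for the custom
# detection post-processing op among them.
print(sorted({node.op for node in graph_def.node}))
```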
Make sure not to confuse export_tflite_ssd_graph with export_inference_graph in
the same directory. Both scripts output frozen graphs: export_tflite_ssd_graph
will output the frozen graph that we can input to TensorFlow Lite directly and
is the one we’ll be using.
Next we’ll use TensorFlow Lite to get the optimized model by using
[TOCO](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/toco),
the TensorFlow Lite Optimizing Converter. This will convert the resulting frozen
graph (tflite_graph.pb) to the TensorFlow Lite flatbuffer format (detect.tflite)
via the following command. For a quantized model, run this from the tensorflow/
directory:
```shell
bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_values=128 \
--change_concat_input_ranges=false \
--allow_custom_ops
```
This command takes the input tensor normalized_input_image_tensor after resizing
each camera image frame to 300x300 pixels. The outputs of the quantized model
are named 'TFLite_Detection_PostProcess', 'TFLite_Detection_PostProcess:1',
'TFLite_Detection_PostProcess:2', and 'TFLite_Detection_PostProcess:3' and
represent four arrays: detection_boxes, detection_classes, detection_scores, and
num_detections. The documentation for other flags used in this command is
[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/toco/g3doc/cmdline_reference.md).
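To make the meaning of these four arrays concrete, the following small sketch
(with made-up values rather than real model output) shows how they are
typically consumed together once you read them back from the interpreter:

```python
import numpy as np

# Hypothetical outputs for a single image (batch size 1) with two candidates.
detection_boxes = np.array([[[0.10, 0.20, 0.50, 0.60],
                             [0.00, 0.00, 1.00, 1.00]]])  # [1, N, 4], normalized [ymin, xmin, ymax, xmax]
detection_classes = np.array([[0., 16.]])                 # [1, N], indices into the label map
detection_scores = np.array([[0.83, 0.21]])               # [1, N]
num_detections = np.array([2.])                           # [1]

score_threshold = 0.5
for i in range(int(num_detections[0])):
  if detection_scores[0, i] >= score_threshold:
    print('class %d scored %.2f at box %s' %
          (int(detection_classes[0, i]), detection_scores[0, i], detection_boxes[0, i]))
```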
If things ran successfully, you should now see a third file in the /tmp/tflite
directory called detect.tflite. This file contains the graph and all model
parameters and can be run via the TensorFlow Lite interpreter on the Android
device. For a floating point model, run this from the tensorflow/ directory:
```shell
bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=FLOAT \
--allow_custom_ops
```
# Running our model on Android
To run our TensorFlow Lite model on device, we will need to install the Android
NDK and SDK. The current recommended Android NDK version is 14b and can be found
on the [NDK
Archives](https://developer.android.com/ndk/downloads/older_releases.html#ndk-14b-downloads)
page. Android SDK and build tools can be [downloaded
separately](https://developer.android.com/tools/revisions/build-tools.html) or
used as part of [Android
Studio](https://developer.android.com/studio/index.html). To build the
TensorFlow Lite Android demo, build tools require API >= 23 (but it will run on
devices with API >= 21). Additional details are available on the [TensorFlow
Lite Android App
page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md).
Next we need to point the app to our new detect.tflite file and give it the
names of our new labels. Specifically, we will copy our TensorFlow Lite
flatbuffer to the app assets directory with the following command:
```shell
cp /tmp/tflite/detect.tflite \
//tensorflow/contrib/lite/examples/android/app/src/main/assets
```
You will also need to copy your new labelmap labels_list.txt to the assets
directory.
We will now edit the BUILD file to point to this new model. First, open the
BUILD file tensorflow/contrib/lite/examples/android/BUILD. Then find the assets
section, and replace the line “@tflite_mobilenet_ssd_quant//:detect.tflite”
(which by default points to a COCO pretrained model) with the path to your new
TFLite model
“//tensorflow/contrib/lite/examples/android/app/src/main/assets:detect.tflite”.
Finally, change the last line in the assets section to use the new label map as
well.
We will also need to tell our app to use the new label map. In order to do this,
open up the
tensorflow/contrib/lite/examples/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java
file in a text editor and find the definition of TF_OD_API_LABELS_FILE. Update
this path to point to your new label map file:
"file:///android_asset/labels_list.txt". Note that if your model is quantized,
the flag TF_OD_API_IS_QUANTIZED is set to true, and if your model is floating
point, the flag TF_OD_API_IS_QUANTIZED is set to false. This new section of
DetectorActivity.java should now look as follows for a quantized model:
```java
private static final boolean TF_OD_API_IS_QUANTIZED = true;
private static final String TF_OD_API_MODEL_FILE = "detect.tflite";
private static final String TF_OD_API_LABELS_FILE = "file:///android_asset/labels_list.txt";
```
Once you’ve copied the TensorFlow Lite file and edited your BUILD and
DetectorActivity.java files, you can build the demo app by running this bazel
command from the tensorflow directory:
```shell
bazel build -c opt --config=android_arm{,64} --cxxopt='--std=c++11' \
  "//tensorflow/contrib/lite/examples/android:tflite_demo"
```
Now install the demo on a
[debug-enabled](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#install)
Android phone via [Android Debug
Bridge](https://developer.android.com/studio/command-line/adb) (adb):
```shell
adb install bazel-bin/tensorflow/contrib/lite/examples/android/tflite_demo.apk
```
......@@ -93,17 +93,18 @@ python object_detection/dataset_tools/create_pet_tf_record.py \
Note: It is normal to see some warnings when running this script. You may ignore
them.
Two TFRecord files named `pet_train.record` and `pet_val.record` should be
generated in the `tensorflow/models/research/` directory.
Two 10-sharded TFRecord files named `pet_faces_train.record-*` and
`pet_faces_val.record-*` should be generated in the
`tensorflow/models/research/` directory.
Now that the data has been generated, we'll need to upload it to Google Cloud
Storage so the data can be accessed by ML Engine. Run the following command to
copy the files into your GCS bucket (substituting `${YOUR_GCS_BUCKET}`):
``` bash
```bash
# From tensorflow/models/research/
gsutil cp pet_train.record gs://${YOUR_GCS_BUCKET}/data/pet_train.record
gsutil cp pet_val.record gs://${YOUR_GCS_BUCKET}/data/pet_val.record
gsutil cp pet_faces_train.record-* gs://${YOUR_GCS_BUCKET}/data/
gsutil cp pet_faces_val.record-* gs://${YOUR_GCS_BUCKET}/data/
gsutil cp object_detection/data/pet_label_map.pbtxt gs://${YOUR_GCS_BUCKET}/data/pet_label_map.pbtxt
```
......@@ -176,8 +177,8 @@ the following:
- model.ckpt.meta
- model.ckpt.data-00000-of-00001
- pet_label_map.pbtxt
- pet_train.record
- pet_val.record
- pet_faces_train.record-*
- pet_faces_val.record-*
```
You can inspect your bucket using the [Google Cloud Storage
......@@ -193,59 +194,39 @@ Before we can start a job on Google Cloud ML Engine, we must:
To package the Tensorflow Object Detection code, run the following commands from
the `tensorflow/models/research/` directory:
``` bash
```bash
# From tensorflow/models/research/
bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools
python setup.py sdist
(cd slim && python setup.py sdist)
```
You should see two tar.gz files created at `dist/object_detection-0.1.tar.gz`
and `slim/dist/slim-0.1.tar.gz`.
This will create python packages dist/object_detection-0.1.tar.gz,
slim/dist/slim-0.1.tar.gz, and /tmp/pycocotools/pycocotools-2.0.tar.gz.
For running the training Cloud ML job, we'll configure the cluster to use 10
training jobs (1 master + 9 workers) and three parameter servers. The
configuration file can be found at `object_detection/samples/cloud/cloud.yml`.
Note: This sample is supported for use with 1.2 runtime version.
Note: This sample is supported for use with 1.8 runtime version.
To start training, execute the following command from the
To start training and evaluation, execute the following command from the
`tensorflow/models/research/` directory:
``` bash
```bash
# From tensorflow/models/research/
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%s` \
--runtime-version 1.2 \
--job-dir=gs://${YOUR_GCS_BUCKET}/train \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
--module-name object_detection.train \
gcloud ml-engine jobs submit training `whoami`_object_detection_pets_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.8 \
--job-dir=gs://${YOUR_GCS_BUCKET}/model_dir \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_main \
--region us-central1 \
--config object_detection/samples/cloud/cloud.yml \
-- \
--train_dir=gs://${YOUR_GCS_BUCKET}/train \
--pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/faster_rcnn_resnet101_pets.config
```
Once training has started, we can run an evaluation concurrently:
``` bash
# From tensorflow/models/research/
gcloud ml-engine jobs submit training `whoami`_object_detection_eval_`date +%s` \
--runtime-version 1.2 \
--job-dir=gs://${YOUR_GCS_BUCKET}/train \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
--module-name object_detection.eval \
--region us-central1 \
--scale-tier BASIC_GPU \
-- \
--checkpoint_dir=gs://${YOUR_GCS_BUCKET}/train \
--eval_dir=gs://${YOUR_GCS_BUCKET}/eval \
--model_dir=gs://${YOUR_GCS_BUCKET}/model_dir \
--pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/faster_rcnn_resnet101_pets.config
```
Note: Even though we're running an evaluation job, the `gcloud ml-engine jobs
submit training` command is correct. ML Engine does not distinguish between
training and evaluation jobs.
Users can monitor and stop training and evaluation jobs on the [ML Engine
Dashboard](https://console.cloud.google.com/mlengine/jobs).
......@@ -254,12 +235,12 @@ Dashboard](https://console.cloud.google.com/mlengine/jobs).
You can monitor progress of the training and eval jobs by running Tensorboard on
your local machine:
``` bash
```bash
# This command needs to be run once to allow your local machine to access your
# GCS bucket.
gcloud auth application-default login
tensorboard --logdir=gs://${YOUR_GCS_BUCKET}
tensorboard --logdir=gs://${YOUR_GCS_BUCKET}/model_dir
```
Once Tensorboard is running, navigate to `localhost:6006` from your favourite
......@@ -284,12 +265,12 @@ that they've converged.
## Exporting the Tensorflow Graph
After your model has been trained, you should export it to a Tensorflow
graph proto. First, you need to identify a candidate checkpoint to export. You
can search your bucket using the [Google Cloud Storage
After your model has been trained, you should export it to a Tensorflow graph
proto. First, you need to identify a candidate checkpoint to export. You can
search your bucket using the [Google Cloud Storage
Browser](https://console.cloud.google.com/storage/browser). The file should be
stored under `${YOUR_GCS_BUCKET}/train`. The checkpoint will typically consist of
three files:
stored under `${YOUR_GCS_BUCKET}/model_dir`. The checkpoint will typically
consist of three files:
* `model.ckpt-${CHECKPOINT_NUMBER}.data-00000-of-00001`
* `model.ckpt-${CHECKPOINT_NUMBER}.index`
......@@ -298,9 +279,9 @@ three files:
After you've identified a candidate checkpoint to export, run the following
command from `tensorflow/models/research/`:
``` bash
```bash
# From tensorflow/models/research/
gsutil cp gs://${YOUR_GCS_BUCKET}/train/model.ckpt-${CHECKPOINT_NUMBER}.* .
gsutil cp gs://${YOUR_GCS_BUCKET}/model_dir/model.ckpt-${CHECKPOINT_NUMBER}.* .
python object_detection/export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path object_detection/samples/configs/faster_rcnn_resnet101_pets.config \
......
# TPU compatible detection pipelines
[TOC]
The Tensorflow Object Detection API supports TPU training for some models. To
make models TPU compatible you need to make a few tweaks to the model config as
mentioned below. We also provide several sample configs that you can use as a
template.
## TPU compatibility
### Static shaped tensors
TPU training currently requires all tensors in the Tensorflow Graph to have
static shapes. However, most of the sample configs in the Object Detection API
have a few dynamically shaped tensors. Fortunately, we provide simple
alternatives in the model configuration that modify these tensors to have
static shapes:
* **Image tensors with static shape** - This can be achieved either by using a
`fixed_shape_resizer` that resizes images to a fixed spatial shape or by
setting `pad_to_max_dimension: true` in `keep_aspect_ratio_resizer` which
pads the resized images with zeros to the bottom and right. Padded image
tensors are correctly handled internally within the model.
```
image_resizer {
fixed_shape_resizer {
height: 640
width: 640
}
}
```
or
```
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 640
max_dimension: 640
pad_to_max_dimension: true
}
}
```
* **Groundtruth tensors with static shape** - Images in a typical detection
dataset have variable number of groundtruth boxes and associated classes.
Setting `max_number_of_boxes` to a large enough number in the
`train_input_reader` and `eval_input_reader` pads the groundtruth tensors
with zeros to a static shape. Padded groundtruth tensors are correctly
handled internally within the model.
```
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
max_number_of_boxes: 200
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-?????-of-0010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
max_number_of_boxes: 200
}
```
### TPU friendly ops
Although TPU supports a vast number of tensorflow ops, a few used in the
Tensorflow Object Detection API are unsupported. We list such ops below and
recommend compatible substitutes.
* **Anchor sampling** - Typically we use hard example mining in standard SSD
pipelines to balance positive and negative anchors that contribute to the
loss. Hard example mining uses non max suppression as a subroutine, and since
non max suppression is not currently supported on TPUs we cannot use hard
example mining. Fortunately, we provide an implementation of focal loss that
can be used instead of hard example mining. Remove `hard_example_miner` from
the config and substitute `weighted_sigmoid` classification loss with
`weighted_sigmoid_focal` loss.
```
loss {
classification_loss {
weighted_sigmoid_focal {
alpha: 0.25
gamma: 2.0
}
}
localization_loss {
weighted_smooth_l1 {
}
}
classification_weight: 1.0
localization_weight: 1.0
}
```
* **Target Matching** - Object detection API provides two choices for matcher
used in target assignment: `argmax_matcher` and `bipartite_matcher`.
Bipartite matcher is not currently supported on TPU; therefore, we must
modify the configs to use `argmax_matcher`. Additionally, set
`use_matmul_gather: true` for efficiency on TPU.
```
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
use_matmul_gather: true
}
}
```
### TPU training hyperparameters
Object Detection training on TPU uses synchronous SGD. On a typical cloud TPU
with 8 cores we recommend batch sizes that are 8x larger when compared to a GPU
config that uses asynchronous SGD. We also use fewer training steps (~1/100x)
due to the large batch size. This necessitates careful tuning of some other
training parameters, as listed below.
* **Batch size** - Use the largest batch size that can fit on cloud TPU.
```
train_config {
batch_size: 1024
}
```
* **Training steps** - Typically only tens of thousands.
```
train_config {
num_steps: 25000
}
```
* **Batch norm decay** - Use smaller decay constants (0.97 or 0.997) since we
take fewer training steps.
```
batch_norm {
scale: true,
decay: 0.97,
epsilon: 0.001,
}
```
* **Learning rate** - Use large learning rate with warmup. Scale learning rate
linearly with batch size. See `cosine_decay_learning_rate` or
`manual_step_learning_rate` for examples.
```
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: .04
total_steps: 25000
warmup_learning_rate: .013333
warmup_steps: 2000
}
}
```
or
```
learning_rate: {
manual_step_learning_rate {
warmup: true
initial_learning_rate: .01333
schedule {
step: 2000
learning_rate: 0.04
}
schedule {
step: 15000
learning_rate: 0.004
}
}
}
```
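Putting the batch size, step count and learning rate heuristics above
together, the arithmetic looks roughly like the following sketch (a
hypothetical helper, not part of the API); the numbers in the comment match
the ssd_mobilenet_v1_fpn config change included later in this commit:

```python
def rescale_for_batch_size(base_lr, base_steps, base_batch_size, new_batch_size):
  """Rough heuristic: scale LR linearly and step count inversely with batch size."""
  scale = float(new_batch_size) / base_batch_size
  return base_lr * scale, int(base_steps / scale)

# e.g. halving the batch size from 128 to 64:
# rescale_for_batch_size(0.08, 12500, 128, 64) -> (0.04, 25000)
```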
## Example TPU compatible configs
We provide example config files that you can use to train your own models on TPU:
* <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_300x300_coco14_sync.config'>ssd_mobilenet_v1_300x300</a> <br>
* <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_ppn_shared_box_predictor_300x300_coco14_sync.config'>ssd_mobilenet_v1_ppn_300x300</a> <br>
* <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync.config'>ssd_mobilenet_v1_fpn_640x640
(mobilenet based retinanet)</a> <br>
* <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config'>ssd_resnet50_v1_fpn_640x640
(retinanet)</a> <br>
## Supported Meta architectures
Currently, `SSDMetaArch` models are supported on TPUs. `FasterRCNNMetaArch` is
going to be supported soon.
......@@ -500,11 +500,13 @@ def create_estimator_and_inputs(run_config,
  eval_config = configs['eval_config']
  eval_input_config = configs['eval_input_config']
  if train_steps is None:
    train_steps = configs['train_config'].num_steps
  # update train_steps from config but only when non-zero value is provided
  if train_steps is None and train_config.num_steps != 0:
    train_steps = train_config.num_steps
  if eval_steps is None:
    eval_steps = configs['eval_config'].num_examples
  # update eval_steps from config but only when non-zero value is provided
  if eval_steps is None and eval_config.num_examples != 0:
    eval_steps = eval_config.num_examples
  detection_model_fn = functools.partial(
      model_builder.build, model_config=model_config)
......
......@@ -3,8 +3,7 @@
# See Lin et al, https://arxiv.org/abs/1708.02002
# Trained on COCO, initialized from Imagenet classification checkpoint
# Achieves 29.6 mAP on COCO14 minival dataset. Doubling the number of training
# steps to 25k gets 31.5 mAP
# Achieves 29.7 mAP on COCO14 minival dataset.
# This config is TPU compatible
......@@ -133,11 +132,11 @@ model {
train_config: {
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
batch_size: 128
batch_size: 64
sync_replicas: true
startup_delay_steps: 0
replicas_to_aggregate: 8
num_steps: 12500
num_steps: 25000
data_augmentation_options {
random_horizontal_flip {
}
......@@ -156,10 +155,10 @@ train_config: {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: .08
total_steps: 12500
warmup_learning_rate: .026666
warmup_steps: 1000
learning_rate_base: .04
total_steps: 25000
warmup_learning_rate: .013333
warmup_steps: 2000
}
}
momentum_optimizer_value: 0.9
......