"git@developer.sourcefind.cn:norm/vllm.git" did not exist on "380170038e05cf81953c29d7e8ed789e048b6434"
Unverified Commit 70b176a9 authored by Jonathan Huang, committed by GitHub

Merge pull request #4764 from pkulzc/master

Adding new features to extend the functionality and capability of the API
parents 11070af9 e2d46371
@@ -77,6 +77,10 @@ Extras:
Run an instance segmentation model</a><br>
* <a href='g3doc/challenge_evaluation.md'>
Run the evaluation for the Open Images Challenge 2018</a><br>
* <a href='g3doc/tpu_compatibility.md'>
TPU compatible detection pipelines</a><br>
* <a href='g3doc/running_on_mobile_tensorflowlite.md'>
Running object detection on mobile devices with TensorFlow Lite</a><br>
## Getting Help
@@ -95,6 +99,28 @@ reporting an issue.
## Release information
### July 13, 2018
There are many new updates in this release, extending the functionality and
capability of the API:
* Moving from slim-based training to [Estimator](https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator)-based
training.
* Support for [RetinaNet](https://arxiv.org/abs/1708.02002), and a [MobileNet](https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html)
adaptation of RetinaNet.
* A novel SSD-based architecture called the [Pooling Pyramid Network](https://arxiv.org/abs/1807.03284) (PPN).
* Releasing several [TPU](https://cloud.google.com/tpu/)-compatible models.
These can be found in the `samples/configs/` directory with a comment in the
pipeline configuration files indicating TPU compatibility.
* Support for quantized training.
* Updated documentation for new binaries, Cloud training, and [Tensorflow Lite](https://www.tensorflow.org/mobile/tflite/).
See also our [expanded announcement blogpost](https://ai.googleblog.com/2018/07/accelerated-training-and-inference-with.html) and accompanying tutorial at the [TensorFlow blog](https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193).
<b>Thanks to contributors</b>: Sara Robinson, Aakanksha Chowdhery, Derek Chow,
Pengchong Jin, Jonathan Huang, Vivek Rathod, Zhichao Lu, Ronny Votel
### June 25, 2018
Additional evaluation tools for the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) are out.
......
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
FROM tensorflow/tensorflow:nightly-devel
# Get the tensorflow models research directory, and move it into tensorflow
# source folder to match recommendation of installation
RUN git clone --depth 1 https://github.com/tensorflow/models.git && \
mv models /tensorflow/models
# Install gcloud and gsutil commands
# https://cloud.google.com/sdk/docs/quickstart-debian-ubuntu
RUN export CLOUD_SDK_REPO="cloud-sdk-$(lsb_release -c -s)" && \
echo "deb http://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && \
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
apt-get update -y && apt-get install google-cloud-sdk -y
# Install the Tensorflow Object Detection API from here
# https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md
# Install object detection api dependencies
RUN apt-get install -y protobuf-compiler python-pil python-lxml python-tk && \
pip install Cython && \
pip install contextlib2 && \
pip install jupyter && \
pip install matplotlib
# Install pycocoapi
RUN git clone --depth 1 https://github.com/cocodataset/cocoapi.git && \
cd cocoapi/PythonAPI && \
make -j8 && \
cp -r pycocotools /tensorflow/models/research && \
cd ../../ && \
rm -rf cocoapi
# Get protoc 3.0.0, rather than the old version already in the container
RUN curl -OL "https://github.com/google/protobuf/releases/download/v3.0.0/protoc-3.0.0-linux-x86_64.zip" && \
unzip protoc-3.0.0-linux-x86_64.zip -d proto3 && \
mv proto3/bin/* /usr/local/bin && \
mv proto3/include/* /usr/local/include && \
rm -rf proto3 protoc-3.0.0-linux-x86_64.zip
# Run protoc on the object detection repo
RUN cd /tensorflow/models/research && \
protoc object_detection/protos/*.proto --python_out=.
# Set the PYTHONPATH to finish installing the API
ENV PYTHONPATH $PYTHONPATH:/tensorflow/models/research:/tensorflow/models/research/slim
# Install wget (to make life easier below) and editors (to allow people to edit
# the files inside the container)
RUN apt-get install -y wget vim emacs nano
# Grab various data files which are used throughout the demo: dataset,
# pretrained model, and pretrained TensorFlow Lite model. Install these all in
# the same directories as recommended by the blog post.
# Pets example dataset
RUN mkdir -p /tmp/pet_faces_tfrecord/ && \
cd /tmp/pet_faces_tfrecord && \
curl "http://download.tensorflow.org/models/object_detection/pet_faces_tfrecord.tar.gz" | tar xzf -
# Pretrained model
# This one doesn't need its own directory, since it comes in a folder.
RUN cd /tmp && \
curl -O "http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz" && \
tar xzf ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz && \
rm ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz
# Trained TensorFlow Lite model. This should get replaced by one generated from
# export_tflite_ssd_graph.py when that command is called.
RUN cd /tmp && \
curl -L -o tflite.zip \
https://storage.googleapis.com/download.tensorflow.org/models/tflite/frozengraphs_ssd_mobilenet_v1_0.75_quant_pets_2018_06_29.zip && \
unzip tflite.zip -d tflite && \
rm tflite.zip
# Install Android development tools
# Inspired by the following sources:
# https://github.com/bitrise-docker/android/blob/master/Dockerfile
# https://github.com/reddit/docker-android-build/blob/master/Dockerfile
# Set environment variables
ENV ANDROID_HOME /opt/android-sdk-linux
ENV ANDROID_NDK_HOME /opt/android-ndk-r14b
ENV PATH ${PATH}:${ANDROID_HOME}/tools:${ANDROID_HOME}/tools/bin:${ANDROID_HOME}/platform-tools
# Install SDK tools
RUN cd /opt && \
curl -OL https://dl.google.com/android/repository/sdk-tools-linux-4333796.zip && \
unzip sdk-tools-linux-4333796.zip -d ${ANDROID_HOME} && \
rm sdk-tools-linux-4333796.zip
# Accept licenses before installing components, no need to echo y for each component
# License is valid for all the standard components in versions installed from this file
# Non-standard components: MIPS system images, preview versions, GDK (Google Glass) and Android Google TV require separate licenses, not accepted there
RUN yes | sdkmanager --licenses
# Install platform tools, SDK platform, and other build tools
RUN yes | sdkmanager \
"tools" \
"platform-tools" \
"platforms;android-27" \
"platforms;android-23" \
"build-tools;27.0.3" \
"build-tools;23.0.3"
# Install Android NDK (r14b)
RUN cd /opt && \
curl -L -o android-ndk.zip http://dl.google.com/android/repository/android-ndk-r14b-linux-x86_64.zip && \
unzip -q android-ndk.zip && \
rm -f android-ndk.zip
# Configure the build to use the things we just downloaded
RUN cd /tensorflow && \
printf '\n\nn\ny\nn\nn\nn\ny\nn\nn\nn\nn\nn\nn\n\ny\n%s\n\n\n' ${ANDROID_HOME}|./configure
WORKDIR /tensorflow
# Dockerfile for the TPU and TensorFlow Lite Object Detection tutorial
This Docker image automates the setup involved with training
object detection models on Google Cloud and building the Android TensorFlow Lite
demo app. We recommend using this container if you decide to work through our
tutorial on ["Training and serving a real-time mobile object detector in
30 minutes with Cloud TPUs"](https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193), though of course it may be useful even if you would
like to use the Object Detection API outside the context of the tutorial.
A couple of words of warning:
1. Docker containers do not have persistent storage. This means that any changes
you make to files inside the container will not persist if you restart
the container. When running through the tutorial,
**do not close the container**.
2. To be able to deploy the [Android app](
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/examples/android/app)
(which you will build at the end of the tutorial),
you will need to kill any instances of `adb` running on the host machine. You
can accomplish this by closing all instances of Android Studio and then
running `adb kill-server` (see the example after this list).
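For example, assuming the container is already running, you can open a second shell inside it (rather than restarting it) and reset `adb` on the host; the container ID below is a placeholder:
```
# On the host: find the running container and open another shell in it.
docker ps
docker exec -it <container_id> bash

# On the host: stop any adb server so the device can be claimed later.
adb kill-server
```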
You can install Docker by following the [instructions here](
https://docs.docker.com/install/).
## Running The Container
From this directory, build the Dockerfile as follows (this takes a while):
```
docker build --tag detect-tf .
```
Run the container:
```
docker run --rm -it --privileged -p 6006:6006 detect-tf
```
When running the container, you will find yourself inside the `/tensorflow`
directory, which is the path to the TensorFlow [source
tree](https://github.com/tensorflow/tensorflow).
## Text Editing
The tutorial also
requires you to occasionally edit files inside the source tree.
This Docker image comes with `vim`, `nano`, and `emacs` preinstalled for your
convenience.
## What's In This Container
This container is derived from the nightly build of TensorFlow, and contains the
sources for TensorFlow at `/tensorflow`, as well as the
[TensorFlow Models](https://github.com/tensorflow/models) which are available at
`/tensorflow/models` (and contain the Object Detection API as a subdirectory
at `/tensorflow/models/research/object_detection`).
The Oxford-IIIT Pets dataset, the COCO pre-trained SSD + MobileNet (v1)
checkpoint, and example
trained model are all available in `/tmp` in their respective folders.
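As a rough orientation, a listing inside the container should show these folders (the exact model folder name may differ depending on the checkpoint version baked into the image):
```
ls /tmp
# pet_faces_tfrecord  ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03  tflite
```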
This container also has the `gsutil` and `gcloud` utilities, the `bazel` build
tool, and all dependencies necessary to use the Object Detection API, and
compile and install the TensorFlow Lite Android demo app.
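Before launching Cloud jobs from inside the container, you will typically need to authenticate `gcloud` and point it at a project; the project ID and bucket name below are placeholders:
```
gcloud auth login
gcloud config set project my-gcp-project
gsutil ls gs://my-gcs-bucket
```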
At various points throughout the tutorial, you may see references to the
*research directory*. This refers to the `research` folder within the
models repository, located at
`/tensorflow/models/research`.
# Tensorflow detection model zoo

We provide a collection of detection models pre-trained on the [COCO
dataset](http://mscoco.org), the [Kitti dataset](http://www.cvlibs.net/datasets/kitti/),
the [Open Images dataset](https://github.com/openimages/dataset) and the
[AVA v2.1 dataset](https://research.google.com/ava/). These models can
be useful for out-of-the-box inference if you are interested in categories
already in COCO (e.g., humans, cars, etc) or in Open Images (e.g.,
@@ -57,19 +58,26 @@ Some remarks on frozen inference graphs:
  a detector (and discarding the part past that point), which negatively impacts
  standard mAP metrics.
* Our frozen inference graphs are generated using the
  [v1.8.0](https://github.com/tensorflow/tensorflow/tree/v1.8.0)
  release version of Tensorflow and we do not guarantee that these will work
  with other versions; this being said, each frozen inference graph can be
  regenerated using your current version of Tensorflow by re-running the
  [exporter](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/exporting_models.md),
  pointing it at the model directory as well as the corresponding config file in
  [samples/configs](https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs),
  as sketched below.
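A minimal sketch of re-running the exporter against a downloaded model; the paths are placeholders and the full flag list is documented in exporting_models.md:
```bash
# From tensorflow/models/research/
python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path object_detection/samples/configs/ssd_mobilenet_v1_coco.config \
    --trained_checkpoint_prefix /path/to/ssd_mobilenet_v1_coco_2018_01_28/model.ckpt \
    --output_directory /tmp/regenerated_ssd_mobilenet_v1_coco
```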
## COCO-trained models

| Model name | Speed (ms) | COCO mAP[^1] | Outputs |
| ------------ | :--------------: | :--------------: | :-------------: |
| [ssd_mobilenet_v1_coco](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2018_01_28.tar.gz) | 30 | 21 | Boxes |
| [ssd_mobilenet_v1_0.75_depth_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz) | 26 | 18 | Boxes |
| [ssd_mobilenet_v1_quantized_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_03.tar.gz) | 29 | 18 | Boxes |
| [ssd_mobilenet_v1_0.75_depth_quantized_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_0.75_depth_quantized_300x300_coco14_sync_2018_07_03.tar.gz) | 29 | 16 | Boxes |
| [ssd_mobilenet_v1_ppn_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_ppn_shared_box_predictor_300x300_coco14_sync_2018_07_03.tar.gz) | 26 | 20 | Boxes |
| [ssd_mobilenet_v1_fpn_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03.tar.gz) | 56 | 32 | Boxes |
| [ssd_resnet_50_fpn_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03.tar.gz) | 76 | 35 | Boxes |
| [ssd_mobilenet_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz) | 31 | 22 | Boxes |
| [ssdlite_mobilenet_v2_coco](http://download.tensorflow.org/models/object_detection/ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz) | 27 | 22 | Boxes |
| [ssd_inception_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2018_01_28.tar.gz) | 42 | 24 | Boxes |
@@ -88,15 +96,15 @@ Some remarks on frozen inference graphs:
| [mask_rcnn_resnet101_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet101_atrous_coco_2018_01_28.tar.gz) | 470 | 33 | Masks |
| [mask_rcnn_resnet50_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet50_atrous_coco_2018_01_28.tar.gz) | 343 | 29 | Masks |
Note: The asterisk (☆) at the end of model name indicates that this model supports TPU training.
## Kitti-trained models
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
[faster_rcnn_resnet101_kitti](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_kitti_2018_01_28.tar.gz) | 79 | 87 | Boxes

## Open Images-trained models

Model name | Speed (ms) | Open Images mAP@0.5[^2] | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
@@ -104,7 +112,7 @@ Model name
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes

## AVA v2.1 trained models

Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
@@ -112,5 +120,6 @@ Model name
[^1]: See [MSCOCO evaluation protocol](http://cocodataset.org/#detections-eval).
[^2]: This is PASCAL mAP with a slightly different way of true positives computation: see [Open Images evaluation protocol](evaluation_protocols.md#open-images).
@@ -34,37 +34,22 @@ A local training job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
NUM_TRAIN_STEPS=50000
NUM_EVAL_STEPS=2000
python object_detection/model_main.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--num_train_steps=${NUM_TRAIN_STEPS} \
--num_eval_steps=${NUM_EVAL_STEPS} \
--alsologtostderr
```
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and
`${MODEL_DIR}` points to the directory in which training checkpoints
and events will be written to. Note that this binary will interleave both
training and evaluation.
## Running the Evaluation Job
Evaluation is run as a separate job. The eval job will periodically poll the
train directory for new checkpoints and evaluate them on a test dataset. The
job can be run using the following command:
```bash
# From the tensorflow/models/research/ directory
python object_detection/eval.py \
--logtostderr \
--pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
--checkpoint_dir=${PATH_TO_TRAIN_DIR} \
--eval_dir=${PATH_TO_EVAL_DIR}
```
where `${PATH_TO_YOUR_PIPELINE_CONFIG}` points to the pipeline config,
`${PATH_TO_TRAIN_DIR}` points to the directory in which training checkpoints
were saved (same as the training job) and `${PATH_TO_EVAL_DIR}` points to the
directory in which evaluation events will be saved. As with the training job,
the eval job runs until terminated by default.
## Running Tensorboard
@@ -73,9 +58,9 @@ using the recommended directory structure, Tensorboard can be run using the
following command:
```bash
tensorboard --logdir=${MODEL_DIR}
```
where `${MODEL_DIR}` points to the directory that contains the
train and eval directories. Please note it may take Tensorboard a couple minutes
to populate with data.
# Running on Google Cloud ML Engine

The Tensorflow Object Detection API supports distributed training on Google
Cloud ML Engine. This section documents instructions on how to train and
@@ -23,26 +23,28 @@ evaluation jobs for a few iterations
## Packaging

In order to run the Tensorflow Object Detection API on Cloud ML, it must be
packaged (along with its TF-Slim dependency and the
[pycocotools](https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/pycocotools)
library). The required packages can be created with the following command
```bash
# From tensorflow/models/research/
bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools
python setup.py sdist
(cd slim && python setup.py sdist)
```
This will create python packages dist/object_detection-0.1.tar.gz,
slim/dist/slim-0.1.tar.gz, and /tmp/pycocotools/pycocotools-2.0.tar.gz.
## Running a Multiworker (GPU) Training Job on CMLE

Google Cloud ML requires a YAML configuration file for a multiworker training
job using GPUs. A sample YAML file is given below:
```
trainingInput:
  runtimeVersion: "1.8"
  scaleTier: CUSTOM
  masterType: standard_gpu
  workerCount: 9
@@ -68,22 +70,22 @@ The YAML file should be saved on the local machine (not on GCP). Once it has
been written, a user can start a training job on Cloud ML Engine using the
following command:
```bash
# From tensorflow/models/research/
gcloud ml-engine jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.8 \
--job-dir=gs://${MODEL_DIR} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_main \
--region us-central1 \
--config ${PATH_TO_LOCAL_YAML_FILE} \
-- \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
Where `${PATH_TO_LOCAL_YAML_FILE}` is the local path to the YAML configuration,
`gs://${MODEL_DIR}` specifies the directory on Google Cloud Storage where the
training checkpoints and events will be written to and
`gs://${PIPELINE_CONFIG_PATH}` points to the pipeline configuration stored on
Google Cloud Storage.
@@ -91,34 +93,69 @@ Google Cloud Storage.
Users can monitor the progress of their training job on the [ML Engine
Dashboard](https://console.cloud.google.com/mlengine/jobs).

Note: This sample is supported for use with 1.8 runtime version.
## Running a TPU Training Job on CMLE
Launching a training job with a TPU compatible pipeline config requires using a
similar command:
```bash
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--job-dir=gs://${MODEL_DIR} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_tpu_main \
--runtime-version 1.8 \
--scale-tier BASIC_TPU \
--region us-central1 \
-- \
--tpu_zone us-central1 \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
In contrast with the GPU training command, there is no need to specify a YAML
file and we point to the *object_detection.model_tpu_main* binary instead of
*object_detection.model_main*. We must also now set `scale-tier` to be
`BASIC_TPU` and provide a `tpu_zone`. Finally, as before, `pipeline_config_path`
points to the pipeline configuration stored on Google Cloud Storage
(but must now be a TPU compatible model).
## Running an Evaluation Job on CMLE
Note: You only need to do this when using TPU for training as it does not
interleave evaluation during training as in the case of Multiworker GPU
training.
Evaluation jobs run on a single machine, so it is not necessary to write a YAML
configuration for evaluation. Run the following command to start the evaluation
job:
```bash
gcloud ml-engine jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.8 \
--job-dir=gs://${MODEL_DIR} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_main \
--region us-central1 \
--scale-tier BASIC_GPU \
-- \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH} \
--checkpoint_dir=gs://${MODEL_DIR}
```
Where `gs://${MODEL_DIR}` points to the directory on Google Cloud Storage where
training checkpoints are saved (same as the training job), as well as
to where evaluation events will be saved on Google Cloud Storage and
`gs://${PIPELINE_CONFIG_PATH}` points to where the pipeline configuration is
stored on Google Cloud Storage.
Typically one starts an evaluation job concurrently with the training job.
Note that we do not support running evaluation on TPU, so the above command
line for launching evaluation jobs is the same whether you are training
on GPU or TPU.
## Running Tensorboard

You can run Tensorboard locally on your own machine to view progress of your
@@ -130,3 +167,4 @@ tensorboard --logdir=gs://${YOUR_CLOUD_BUCKET}
```
Note it may take Tensorboard a few minutes to populate with results.
# Running on mobile with TensorFlow Lite
In this section, we will show you how to use [TensorFlow
Lite](https://www.tensorflow.org/mobile/tflite/) to get a smaller model and
allow you to take advantage of ops that have been optimized for mobile devices.
TensorFlow Lite is TensorFlow’s lightweight solution for mobile and embedded
devices. It enables on-device machine learning inference with low latency and a
small binary size. TensorFlow Lite uses many techniques for this such as
quantized kernels that allow smaller and faster (fixed-point math) models.
For this section, you will need to build [TensorFlow from
source](https://www.tensorflow.org/install/install_sources) to get the
TensorFlow Lite support for the SSD model. You will also need to install the
[bazel build
tool](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#bazel).
To make these commands easier to run, let’s set up some environment variables:
```shell
export CONFIG_FILE=PATH_TO_BE_CONFIGURED/pipeline.config
export CHECKPOINT_PATH=PATH_TO_BE_CONFIGURED/model.ckpt
export OUTPUT_DIR=/tmp/tflite
```
We start with a checkpoint and get a TensorFlow frozen graph with compatible ops
that we can use with TensorFlow Lite. First, you’ll need to install these
[python
libraries](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md).
Then to get the frozen graph, run the export_tflite_ssd_graph.py script from the
`models/research` directory with this command:
```shell
python object_detection/export_tflite_ssd_graph.py \
--pipeline_config_path=$CONFIG_FILE \
--trained_checkpoint_prefix=$CHECKPOINT_PATH \
--output_directory=$OUTPUT_DIR \
--add_postprocessing_op=true
```
In the /tmp/tflite directory, you should now see two files: tflite_graph.pb and
tflite_graph.pbtxt. Note that the add_postprocessing flag enables the model to
take advantage of a custom optimized detection post-processing operation which
can be thought of as a replacement for
[tf.image.non_max_suppression](https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression).
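If the export succeeded, a quick listing should show both files (output shown as a comment is illustrative):
```shell
ls $OUTPUT_DIR
# tflite_graph.pb  tflite_graph.pbtxt
```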
Make sure not to confuse export_tflite_ssd_graph with export_inference_graph in
the same directory. Both scripts output frozen graphs: export_tflite_ssd_graph
will output the frozen graph that we can input to TensorFlow Lite directly and
is the one we’ll be using.
Next we’ll use TensorFlow Lite to get the optimized model by using
[TOCO](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/toco),
the TensorFlow Lite Optimizing Converter. This will convert the resulting frozen
graph (tflite_graph.pb) to the TensorFlow Lite flatbuffer format (detect.tflite)
via the following command. For a quantized model, run this from the tensorflow/
directory:
```shell
bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_values=128 \
--change_concat_input_ranges=false \
--allow_custom_ops
```
This command takes the input tensor normalized_input_image_tensor after resizing
each camera image frame to 300x300 pixels. The outputs of the quantized model
are named 'TFLite_Detection_PostProcess', 'TFLite_Detection_PostProcess:1',
'TFLite_Detection_PostProcess:2', and 'TFLite_Detection_PostProcess:3' and
represent four arrays: detection_boxes, detection_classes, detection_scores, and
num_detections. The documentation for other flags used in this command is
[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/toco/g3doc/cmdline_reference.md).
If things ran successfully, you should now see a third file in the /tmp/tflite
directory called detect.tflite. This file contains the graph and all model
parameters and can be run via the TensorFlow Lite interpreter on the Android
device. For a floating point model, run this from the tensorflow/ directory:
```shell
bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=FLOAT \
--allow_custom_ops
```
# Running our model on Android
To run our TensorFlow Lite model on device, we will need to install the Android
NDK and SDK. The current recommended Android NDK version is 14b and can be found
on the [NDK
Archives](https://developer.android.com/ndk/downloads/older_releases.html#ndk-14b-downloads)
page. Android SDK and build tools can be [downloaded
separately](https://developer.android.com/tools/revisions/build-tools.html) or
used as part of [Android
Studio](https://developer.android.com/studio/index.html). To build the
TensorFlow Lite Android demo, build tools require API >= 23 (but it will run on
devices with API >= 21). Additional details are available on the [TensorFlow
Lite Android App
page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md).
Next we need to point the app to our new detect.tflite file and give it the
names of our new labels. Specifically, we will copy our TensorFlow Lite
flatbuffer to the app assets directory with the following command:
```shell
cp /tmp/tflite/detect.tflite \
//tensorflow/contrib/lite/examples/android/app/src/main/assets
```
You will also need to copy your new labelmap labels_list.txt to the assets
directory.
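For example, run from the root of the tensorflow checkout and assuming you saved the label list next to the exported model (the source path is a placeholder):
```shell
cp /tmp/tflite/labels_list.txt \
  tensorflow/contrib/lite/examples/android/app/src/main/assets
```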
We will now edit the BUILD file to point to this new model. First, open the
BUILD file tensorflow/contrib/lite/examples/android/BUILD. Then find the assets
section, and replace the line “@tflite_mobilenet_ssd_quant//:detect.tflite”
(which by default points to a COCO pretrained model) with the path to your new
TFLite model
“//tensorflow/contrib/lite/examples/android/app/src/main/assets:detect.tflite”.
Finally, change the last line in the assets section to use the new label map as
well.
We will also need to tell our app to use the new label map. In order to do this,
open up the
tensorflow/contrib/lite/examples/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java
file in a text editor and find the definition of TF_OD_API_LABELS_FILE. Update
this path to point to your new label map file:
"file:///android_asset/labels_list.txt". Note that if your model is quantized,
the flag TF_OD_API_IS_QUANTIZED is set to true, and if your model is floating
point, the flag TF_OD_API_IS_QUANTIZED is set to false. This new section of
DetectorActivity.java should now look as follows for a quantized model:
```java
private static final boolean TF_OD_API_IS_QUANTIZED = true;
private static final String TF_OD_API_MODEL_FILE = "detect.tflite";
private static final String TF_OD_API_LABELS_FILE = "file:///android_asset/labels_list.txt";
```
Once you’ve copied the TensorFlow Lite file and edited your BUILD and
DetectorActivity.java files, you can build the demo app by running this bazel
command from the tensorflow directory:
```shell
bazel build -c opt --config=android_arm{,64} --cxxopt='--std=c++11' \
"//tensorflow/contrib/lite/examples/android:tflite_demo"
```
Now install the demo on a
[debug-enabled](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#install)
Android phone via [Android Debug
Bridge](https://developer.android.com/studio/command-line/adb) (adb):
```shell
adb install bazel-bin/tensorflow/contrib/lite/examples/android/tflite_demo.apk
```
@@ -93,17 +93,18 @@ python object_detection/dataset_tools/create_pet_tf_record.py \
Note: It is normal to see some warnings when running this script. You may ignore
them.

Two 10-sharded TFRecord files named `pet_faces_train.record-*` and
`pet_faces_val.record-*` should be generated in the
`tensorflow/models/research/` directory.
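A quick way to confirm the shards were written (shard file names are illustrative):
```bash
# From tensorflow/models/research/
ls pet_faces_train.record-* pet_faces_val.record-* | head
```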
Now that the data has been generated, we'll need to upload it to Google Cloud
Storage so the data can be accessed by ML Engine. Run the following command to
copy the files into your GCS bucket (substituting `${YOUR_GCS_BUCKET}`):
```bash
# From tensorflow/models/research/
gsutil cp pet_faces_train.record-* gs://${YOUR_GCS_BUCKET}/data/
gsutil cp pet_faces_val.record-* gs://${YOUR_GCS_BUCKET}/data/
gsutil cp object_detection/data/pet_label_map.pbtxt gs://${YOUR_GCS_BUCKET}/data/pet_label_map.pbtxt
```
@@ -176,8 +177,8 @@ the following:
- model.ckpt.meta
- model.ckpt.data-00000-of-00001
- pet_label_map.pbtxt
- pet_faces_train.record-*
- pet_faces_val.record-*
```
You can inspect your bucket using the [Google Cloud Storage
@@ -193,59 +194,39 @@ Before we can start a job on Google Cloud ML Engine, we must:
To package the Tensorflow Object Detection code, run the following commands from
the `tensorflow/models/research/` directory:
```bash
# From tensorflow/models/research/
bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools
python setup.py sdist
(cd slim && python setup.py sdist)
```
This will create python packages dist/object_detection-0.1.tar.gz,
slim/dist/slim-0.1.tar.gz, and /tmp/pycocotools/pycocotools-2.0.tar.gz.
For running the training Cloud ML job, we'll configure the cluster to use 10
training jobs (1 master + 9 workers) and three parameter servers. The
configuration file can be found at `object_detection/samples/cloud/cloud.yml`.

Note: This sample is supported for use with 1.8 runtime version.

To start training and evaluation, execute the following command from the
`tensorflow/models/research/` directory:
```bash
# From tensorflow/models/research/
gcloud ml-engine jobs submit training `whoami`_object_detection_pets_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.8 \
--job-dir=gs://${YOUR_GCS_BUCKET}/model_dir \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_main \
--region us-central1 \
--config object_detection/samples/cloud/cloud.yml \
-- \
--model_dir=gs://${YOUR_GCS_BUCKET}/model_dir \
--pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/faster_rcnn_resnet101_pets.config
```
Once training has started, we can run an evaluation concurrently:
``` bash
# From tensorflow/models/research/
gcloud ml-engine jobs submit training `whoami`_object_detection_eval_`date +%s` \
--runtime-version 1.2 \
--job-dir=gs://${YOUR_GCS_BUCKET}/train \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
--module-name object_detection.eval \
--region us-central1 \
--scale-tier BASIC_GPU \
-- \
--checkpoint_dir=gs://${YOUR_GCS_BUCKET}/train \
--eval_dir=gs://${YOUR_GCS_BUCKET}/eval \
--pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/faster_rcnn_resnet101_pets.config
```
Note: Even though we're running an evaluation job, the `gcloud ml-engine jobs
submit training` command is correct. ML Engine does not distinguish between
training and evaluation jobs.
Users can monitor and stop training and evaluation jobs on the [ML Engine
Dashboard](https://console.cloud.google.com/mlengine/jobs).
@@ -254,12 +235,12 @@ Dashboard](https://console.cloud.google.com/mlengine/jobs).
You can monitor progress of the training and eval jobs by running Tensorboard on
your local machine:
```bash
# This command needs to be run once to allow your local machine to access your
# GCS bucket.
gcloud auth application-default login

tensorboard --logdir=gs://${YOUR_GCS_BUCKET}/model_dir
```
Once Tensorboard is running, navigate to `localhost:6006` from your favourite
@@ -284,12 +265,12 @@ that they've converged.
## Exporting the Tensorflow Graph

After your model has been trained, you should export it to a Tensorflow graph
proto. First, you need to identify a candidate checkpoint to export. You can
search your bucket using the [Google Cloud Storage
Browser](https://console.cloud.google.com/storage/browser). The file should be
stored under `${YOUR_GCS_BUCKET}/model_dir`. The checkpoint will typically
consist of three files:

* `model.ckpt-${CHECKPOINT_NUMBER}.data-00000-of-00001`
* `model.ckpt-${CHECKPOINT_NUMBER}.index`
@@ -298,9 +279,9 @@ three files:
After you've identified a candidate checkpoint to export, run the following
command from `tensorflow/models/research/`:
```bash
# From tensorflow/models/research/
gsutil cp gs://${YOUR_GCS_BUCKET}/model_dir/model.ckpt-${CHECKPOINT_NUMBER}.* .
python object_detection/export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path object_detection/samples/configs/faster_rcnn_resnet101_pets.config \
......
# TPU compatible detection pipelines
[TOC]
The Tensorflow Object Detection API supports TPU training for some models. To
make models TPU compatible you need to make a few tweaks to the model config as
mentioned below. We also provide several sample configs that you can use as a
template.
## TPU compatibility
### Static shaped tensors
TPU training currently requires all tensors in the Tensorflow Graph to have
static shapes. However, most of the sample configs in Object Detection API have
a few different tensors that are dynamically shaped. Fortunately, we provide
simple alternatives in the model configuration that modify these tensors to
have static shape:
* **Image tensors with static shape** - This can be achieved either by using a
`fixed_shape_resizer` that resizes images to a fixed spatial shape or by
setting `pad_to_max_dimension: true` in `keep_aspect_ratio_resizer` which
pads the resized images with zeros to the bottom and right. Padded image
tensors are correctly handled internally within the model.
```
image_resizer {
fixed_shape_resizer {
height: 640
width: 640
}
}
```
or
```
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 640
max_dimension: 640
pad_to_max_dimension: true
}
}
```
* **Groundtruth tensors with static shape** - Images in a typical detection
dataset have variable number of groundtruth boxes and associated classes.
Setting `max_number_of_boxes` to a large enough number in the
`train_input_reader` and `eval_input_reader` pads the groundtruth tensors
with zeros to a static shape. Padded groundtruth tensors are correctly
handled internally within the model.
```
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
max_number_of_boxes: 200
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-?????-of-0010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
max_number_of_boxes: 200
}
```
### TPU friendly ops
Although TPU supports a vast number of tensorflow ops, a few used in the
Tensorflow Object Detection API are unsupported. We list such ops below and
recommend compatible substitutes.
* **Anchor sampling** - Typically we use hard example mining in standard SSD
pipelines to balance positive and negative anchors that contribute to the
loss. Hard Example mining uses non max suppression as a subroutine and since
non max suppression is not currently supported on TPUs we cannot use hard
example mining. Fortunately, we provide an implementation of focal loss that
can be used instead of hard example mining. Remove `hard_example_miner` from
the config and substitute `weighted_sigmoid` classification loss with
`weighted_sigmoid_focal` loss.
```
loss {
classification_loss {
weighted_sigmoid_focal {
alpha: 0.25
gamma: 2.0
}
}
localization_loss {
weighted_smooth_l1 {
}
}
classification_weight: 1.0
localization_weight: 1.0
}
```
* **Target Matching** - Object detection API provides two choices for matcher
used in target assignment: `argmax_matcher` and `bipartite_matcher`.
Bipartite matcher is not currently supported on TPU, therefore we must
modify the configs to use `argmax_matcher`. Additionally, set
`use_matmul_gather: true` for efficiency on TPU.
```
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
use_matmul_gather: true
}
}
```
### TPU training hyperparameters
Object Detection training on TPU uses synchronous SGD. On a typical cloud TPU
with 8 cores we recommend batch sizes that are 8x larger when compared to a GPU
config that uses asynchronous SGD. We also use fewer training steps (~ 1/100 x)
due to the large batch size. This necessitates careful tuning of some other
training parameters as listed below.
* **Batch size** - Use the largest batch size that can fit on cloud TPU.
```
train_config {
batch_size: 1024
}
```
* **Training steps** - Typically only 10s of thousands.
```
train_config {
num_steps: 25000
}
```
* **Batch norm decay** - Use smaller decay constants (0.97 or 0.997) since we
take fewer training steps.
```
batch_norm {
scale: true,
decay: 0.97,
epsilon: 0.001,
}
```
* **Learning rate** - Use large learning rate with warmup. Scale learning rate
linearly with batch size. See `cosine_decay_learning_rate` or
`manual_step_learning_rate` for examples.
```
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: .04
total_steps: 25000
warmup_learning_rate: .013333
warmup_steps: 2000
}
}
```
or
```
learning_rate: {
manual_step_learning_rate {
warmup: true
initial_learning_rate: .01333
schedule {
step: 2000
learning_rate: 0.04
}
schedule {
step: 15000
learning_rate: 0.004
}
}
}
```
## Example TPU compatible configs
We provide example config files that you can use to train your own models on TPU
* <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_300x300_coco14_sync.config'>ssd_mobilenet_v1_300x300</a> <br>
* <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_ppn_shared_box_predictor_300x300_coco14_sync.config'>ssd_mobilenet_v1_ppn_300x300</a> <br>
* <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync.config'>ssd_mobilenet_v1_fpn_640x640
(mobilenet based retinanet)</a> <br>
* <a href='https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config'>ssd_resnet50_v1_fpn_640x640
(retinanet)</a> <br>
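For instance, any of the configs above can be dropped into the TPU training command described in the Running on Google Cloud doc. This is a hedged sketch: the GCS paths are placeholders and the config is assumed to have been uploaded to your own bucket:
```bash
# From tensorflow/models/research/, with packages built as described in the
# Running on Google Cloud doc.
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--job-dir=gs://${MODEL_DIR} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_tpu_main \
--runtime-version 1.8 \
--scale-tier BASIC_TPU \
--region us-central1 \
-- \
--tpu_zone us-central1 \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${CONFIG_DIR}/ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync.config
```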
## Supported Meta architectures
Currently, `SSDMetaArch` models are supported on TPUs. `FasterRCNNMetaArch` is
going to be supported soon.
@@ -500,11 +500,13 @@ def create_estimator_and_inputs(run_config,
  eval_config = configs['eval_config']
  eval_input_config = configs['eval_input_config']

  # update train_steps from config but only when non-zero value is provided
  if train_steps is None and train_config.num_steps != 0:
    train_steps = train_config.num_steps

  # update eval_steps from config but only when non-zero value is provided
  if eval_steps is None and eval_config.num_examples != 0:
    eval_steps = eval_config.num_examples

  detection_model_fn = functools.partial(
      model_builder.build, model_config=model_config)
......
@@ -3,8 +3,7 @@
# See Lin et al, https://arxiv.org/abs/1708.02002
# Trained on COCO, initialized from Imagenet classification checkpoint
# Achieves 29.7 mAP on COCO14 minival dataset.
# This config is TPU compatible
@@ -133,11 +132,11 @@ model {
train_config: {
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  batch_size: 64
  sync_replicas: true
  startup_delay_steps: 0
  replicas_to_aggregate: 8
  num_steps: 25000
  data_augmentation_options {
    random_horizontal_flip {
    }
@@ -156,10 +155,10 @@ train_config: {
  momentum_optimizer: {
    learning_rate: {
      cosine_decay_learning_rate {
        learning_rate_base: .04
        total_steps: 25000
        warmup_learning_rate: .013333
        warmup_steps: 2000
      }
    }
    momentum_optimizer_value: 0.9
......