Commit 31ca3b97 authored by Kaushik Shivakumar

Resolve merge conflicts

parents 3e9d886d 7fcd7cba
......@@ -67,7 +67,7 @@ your own models:
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_resnet50_atrous_coco.config" target=_blank>mask_rcnn_resnet50_atrous_coco</a>
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_inception_v2_coco.config" target=_blank>mask_rcnn_inception_v2_coco</a>
For more details see the [detection model zoo](tf1_detection_zoo.md).
### Updating a Faster R-CNN config file
......
......@@ -113,10 +113,10 @@ computations on subsets of the validation and test sets.
## Inferring detections
Inference requires a trained object detection model. In this tutorial we will
use a model from the [detection model zoo](tf1_detection_zoo.md), which can
be downloaded and unpacked by running the commands below. More information about
the model, such as its architecture and how it was trained, is available in the
[model zoo page](tf1_detection_zoo.md).
```bash
# From tensorflow/models/research/oid
......
# Preparing Inputs
TensorFlow Object Detection API reads data using the TFRecord file format. Two
sample scripts (`create_pascal_tf_record.py` and `create_pet_tf_record.py`) are
provided to convert from the PASCAL VOC dataset and Oxford-IIIT Pet dataset to
TFRecords.
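For example, the Oxford-IIIT Pet conversion can be invoked roughly as follows (a
sketch only; all paths are placeholders, and `--data_dir` should point at the
directory containing the unpacked `images/` and `annotations/` folders):

```bash
# From the tensorflow/models/research/ directory (sketch; paths are placeholders).
python object_detection/dataset_tools/create_pet_tf_record.py \
    --label_map_path=object_detection/data/pet_label_map.pbtxt \
    --data_dir=/path/to/oxford_iiit_pet \
    --output_dir=/path/to/output
```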
......
# Release Notes
### July 10th, 2020
We are happy to announce that the TF OD API officially supports TF2! Our release
includes:
* New binaries for train/eval/export that are designed to run in eager mode.
* A suite of TF2 compatible (Keras-based) models; this includes migrations of
our most popular TF1.x models (e.g., SSD with MobileNet, RetinaNet, Faster
R-CNN, Mask R-CNN), as well as a few new architectures for which we will
only maintain TF2 implementations:
1. CenterNet - a simple and effective anchor-free architecture based on the
recent [Objects as Points](https://arxiv.org/abs/1904.07850) paper by
Zhou et al.
2. [EfficientDet](https://arxiv.org/abs/1911.09070) - a recent family of
SOTA models discovered with the help of Neural Architecture Search.
* COCO pre-trained weights for all of the models provided as TF2 style
object-based checkpoints.
* Access to
[Distribution Strategies](https://www.tensorflow.org/guide/distributed_training)
for distributed training --- our models are designed to be trainable using
sync multi-GPU and TPU platforms.
* Colabs demo’ing eager mode training and inference.
See our release blogpost
[here](https://blog.tensorflow.org/2020/07/tensorflow-2-meets-object-detection-api.html).
If you are an existing user of the TF OD API using TF 1.x, don’t worry, we’ve
got you covered.
**Thanks to contributors**: Akhil Chinnakotla, Allen Lavoie, Anirudh Vegesana,
Anjali Sridhar, Austin Myers, Dan Kondratyuk, David Ross, Derek Chow, Jaeyoun
Kim, Jing Li, Jonathan Huang, Jordi Pont-Tuset, Karmel Allison, Kathy Ruan,
Kaushik Shivakumar, Lu He, Mingxing Tan, Pengchong Jin, Ronny Votel, Sara Beery,
Sergi Caelles Prat, Shan Yang, Sudheendra Vijayanarasimhan, Tina Tian, Tomer
Kaftan, Vighnesh Birodkar, Vishnu Banna, Vivek Rathod, Yanhui Liang, Yiming Shi,
Yixin Shi, Yu-hui Chen, Zhichao Lu.
### June 26th, 2020
We have released SSDLite with MobileDet GPU backbone, which achieves 17% higher
mAP than MobileNetV2 SSDLite (27.5 mAP vs 23.5 mAP) on an NVIDIA Jetson Xavier
at comparable latency (3.2ms vs 3.3ms).
Along with the model definition, we are also releasing model checkpoints trained
on the COCO dataset.
<b>Thanks to contributors</b>: Yongzhe Wang, Bo Chen, Hanxiao Liu, Le An
(NVIDIA), Yu-Te Cheng (NVIDIA), Oliver Knieps (NVIDIA), and Josh Park (NVIDIA).
### June 17th, 2020
We have released [Context R-CNN](https://arxiv.org/abs/1912.03538), a model that
uses attention to incorporate contextual information from images (e.g. from
temporally nearby frames taken by a static camera) in order to improve accuracy.
Importantly, these contextual images need not be labeled.
* When applied to a challenging wildlife detection dataset
([Snapshot Serengeti](http://lila.science/datasets/snapshot-serengeti)),
Context R-CNN with context from up to a month of images outperforms a
single-frame baseline by 17.9% mAP, and outperforms S3D (a 3d convolution
based baseline) by 11.2% mAP.
* Context R-CNN leverages temporal context from the unlabeled frames of a
novel camera deployment to improve performance at that camera, boosting
model generalizability.
Read about Context R-CNN on the Google AI blog
[here](https://ai.googleblog.com/2020/06/leveraging-temporal-context-for-object.html).
We have provided code for generating data with associated context
[here](context_rcnn.md), and a sample config for a Context R-CNN model
[here](../samples/configs/context_rcnn_resnet101_snapshot_serengeti_sync.config).
Snapshot Serengeti-trained Faster R-CNN and Context R-CNN models can be found in
the
[model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md#snapshot-serengeti-camera-trap-trained-models).
A colab demonstrating Context R-CNN is provided
[here](../colab_tutorials/context_rcnn_tutorial.ipynb).
<b>Thanks to contributors</b>: Sara Beery, Jonathan Huang, Guanhang Wu, Vivek
Rathod, Ronny Votel, Zhichao Lu, David Ross, Pietro Perona, Tanya Birch, and the
Wildlife Insights AI Team.
### May 19th, 2020
We have released [MobileDets](https://arxiv.org/abs/2004.14525), a set of
high-performance models for mobile CPUs, DSPs and EdgeTPUs.
* MobileDets outperform MobileNetV3+SSDLite by 1.7 mAP at comparable mobile
CPU inference latencies. MobileDets also outperform MobileNetV2+SSDLite by
1.9 mAP on mobile CPUs, 3.7 mAP on EdgeTPUs and 3.4 mAP on DSPs while
running equally fast. MobileDets also offer up to 2x speedup over MnasFPN on
EdgeTPUs and DSPs.
For each of the three hardware platforms we have released model definition,
model checkpoints trained on the COCO14 dataset and converted TFLite models in
fp32 and/or uint8.
<b>Thanks to contributors</b>: Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin
Akin, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen,
Quoc Le, Zhichao Lu.
### May 7th, 2020
We have released a mobile model with the
[MnasFPN head](https://arxiv.org/abs/1912.01106).
* MnasFPN with MobileNet-V2 backbone is the most accurate (26.6 mAP at 183ms
on Pixel 1) mobile detection model we have released to date. With
depth-multiplier, MnasFPN with MobileNet-V2 backbone is 1.8 mAP higher than
MobileNet-V3-Large with SSDLite (23.8 mAP vs 22.0 mAP) at similar latency
(120ms) on Pixel 1.
We have released model definition, model checkpoints trained on the COCO14
dataset and a converted TFLite model.
<b>Thanks to contributors</b>: Bo Chen, Golnaz Ghiasi, Hanxiao Liu, Tsung-Yi
Lin, Dmitry Kalenichenko, Hartwig Adam, Quoc Le, Zhichao Lu, Jonathan Huang, Hao
Xu.
### Nov 13th, 2019
We have released the MobileNetEdgeTPU SSDLite model.
* SSDLite with MobileNetEdgeTPU backbone, which achieves 10% higher mAP than
MobileNetV2 SSDLite (24.3 mAP vs 22 mAP) on a Google Pixel 4 at comparable
latency (6.6ms vs 6.8ms).
Along with the model definition, we are also releasing model checkpoints trained
on the COCO dataset.
<b>Thanks to contributors</b>: Yunyang Xiong, Bo Chen, Suyog Gupta, Hanxiao Liu,
Gabriel Bender, Mingxing Tan, Berkin Akin, Zhichao Lu, Quoc Le
### Oct 15th, 2019
We have released two MobileNet V3 SSDLite models (presented in
[Searching for MobileNetV3](https://arxiv.org/abs/1905.02244)).
* SSDLite with MobileNet-V3-Large backbone, which is 27% faster than Mobilenet
V2 SSDLite (119ms vs 162ms) on a Google Pixel phone CPU at the same mAP.
* SSDLite with MobileNet-V3-Small backbone, which is 37% faster than MnasNet
SSDLite reduced with depth-multiplier (43ms vs 68ms) at the same mAP.
Along with the model definition, we are also releasing model checkpoints trained
on the COCO dataset.
<b>Thanks to contributors</b>: Bo Chen, Zhichao Lu, Vivek Rathod, Jonathan Huang
### July 1st, 2019
We have released an updated set of utils and an updated
[tutorial](challenge_evaluation.md) for all three tracks of the
[Open Images Challenge 2019](https://storage.googleapis.com/openimages/web/challenge2019.html)!
The Instance Segmentation metric for
[Open Images V5](https://storage.googleapis.com/openimages/web/index.html) and
[Challenge 2019](https://storage.googleapis.com/openimages/web/challenge2019.html)
is part of this release. Check out
[the metric description](https://storage.googleapis.com/openimages/web/evaluation.html#instance_segmentation_eval)
on the Open Images website.
<b>Thanks to contributors</b>: Alina Kuznetsova, Rodrigo Benenson
### Feb 11, 2019
We have released detection models trained on the Open Images Dataset V4 in our
detection model zoo, including
* Faster R-CNN detector with Inception Resnet V2 feature extractor
* SSD detector with MobileNet V2 feature extractor
* SSD detector with ResNet 101 FPN feature extractor (aka RetinaNet-101)
<b>Thanks to contributors</b>: Alina Kuznetsova, Yinxiao Li
### Sep 17, 2018
We have released Faster R-CNN detectors with ResNet-50 / ResNet-101 feature
extractors trained on the
[iNaturalist Species Detection Dataset](https://github.com/visipedia/inat_comp/blob/master/2017/README.md#bounding-boxes).
The models are trained on the training split of the iNaturalist data for 4M
iterations and achieve 55% and 58% mean AP@.5 over 2854 classes, respectively.
For more details please refer to this [paper](https://arxiv.org/abs/1707.06642).
<b>Thanks to contributors</b>: Chen Sun
### July 13, 2018
There are many new updates in this release, extending the functionality and
capability of the API:
* Moving from slim-based training to
[Estimator](https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator)-based
training.
* Support for [RetinaNet](https://arxiv.org/abs/1708.02002), and a
[MobileNet](https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html)
adaptation of RetinaNet.
* A novel SSD-based architecture called the
[Pooling Pyramid Network](https://arxiv.org/abs/1807.03284) (PPN).
* Releasing several [TPU](https://cloud.google.com/tpu/)-compatible models.
These can be found in the `samples/configs/` directory with a comment in the
pipeline configuration files indicating TPU compatibility.
* Support for quantized training.
* Updated documentation for new binaries, Cloud training, and
[TensorFlow Lite](https://www.tensorflow.org/mobile/tflite/).
See also our
[expanded announcement blogpost](https://ai.googleblog.com/2018/07/accelerated-training-and-inference-with.html)
and accompanying tutorial at the
[TensorFlow blog](https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193).
<b>Thanks to contributors</b>: Sara Robinson, Aakanksha Chowdhery, Derek Chow,
Pengchong Jin, Jonathan Huang, Vivek Rathod, Zhichao Lu, Ronny Votel
### June 25, 2018
Additional evaluation tools for the
[Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html)
are out. Check out our short tutorial on data preparation and running evaluation
[here](challenge_evaluation.md)!
<b>Thanks to contributors</b>: Alina Kuznetsova
### June 5, 2018
We have released the implementation of evaluation metrics for both tracks of the
[Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html)
as a part of the Object Detection API - see the
[evaluation protocols](evaluation_protocols.md) for more details. Additionally,
we have released a tool for hierarchical labels expansion for the Open Images
Challenge: check out
[oid_hierarchical_labels_expansion.py](../dataset_tools/oid_hierarchical_labels_expansion.py).
<b>Thanks to contributors</b>: Alina Kuznetsova, Vittorio Ferrari, Jasper
Uijlings
### April 30, 2018
We have released a Faster R-CNN detector with ResNet-101 feature extractor
trained on [AVA](https://research.google.com/ava/) v2.1. Compared with other
commonly used object detectors, it changes the action classification loss
function to per-class Sigmoid loss to handle boxes with multiple labels. The
model is trained on the training split of AVA v2.1 for 1.5M iterations and
achieves a mean AP of 11.25% over 60 classes on the validation split of AVA v2.1.
For more details please refer to this [paper](https://arxiv.org/abs/1705.08421).
<b>Thanks to contributors</b>: Chen Sun, David Ross
### April 2, 2018
Supercharge your mobile phones with the next generation mobile object detector!
We are adding support for MobileNet V2 with SSDLite presented in
[MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381).
This model is 35% faster than Mobilenet V1 SSD on a Google Pixel phone CPU
(200ms vs. 270ms) at the same accuracy. Along with the model definition, we are
also releasing a model checkpoint trained on the COCO dataset.
<b>Thanks to contributors</b>: Menglong Zhu, Mark Sandler, Zhichao Lu, Vivek
Rathod, Jonathan Huang
### February 9, 2018
We now support instance segmentation!! In this API update we support a number of
instance segmentation models similar to those discussed in the
[Mask R-CNN paper](https://arxiv.org/abs/1703.06870). For further details refer
to [our slides](http://presentations.cocodataset.org/Places17-GMRI.pdf) from the
2017 Coco + Places Workshop. Refer to the section on
[Running an Instance Segmentation Model](instance_segmentation.md) for
instructions on how to configure a model that predicts masks in addition to
object bounding boxes.
<b>Thanks to contributors</b>: Alireza Fathi, Zhichao Lu, Vivek Rathod, Ronny
Votel, Jonathan Huang
### November 17, 2017
As a part of the Open Images V3 release we have released:
* An implementation of the Open Images evaluation metric and the
[protocol](evaluation_protocols.md#open-images).
* Additional tools to separate inference of detection and evaluation (see
[this tutorial](oid_inference_and_evaluation.md)).
* A new detection model trained on the Open Images V2 data release (see
[Open Images model](tf1_detection_zoo.md#open-images-models)).
See more information on the
[Open Images website](https://github.com/openimages/dataset)!
<b>Thanks to contributors</b>: Stefan Popov, Alina Kuznetsova
### November 6, 2017
We have re-released faster versions of our (pre-trained) models in the
<a href='tf1_detection_zoo.md'>model zoo</a>. In addition to what was available
before, we are also adding Faster R-CNN models trained on COCO with Inception V2
and Resnet-50 feature extractors, as well as a Faster R-CNN with Resnet-101
model trained on the KITTI dataset.
<b>Thanks to contributors</b>: Jonathan Huang, Vivek Rathod, Derek Chow, Tal
Remez, Chen Sun.
### October 31, 2017
We have released a new state-of-the-art model for object detection using the
Faster-RCNN with the
[NASNet-A image featurization](https://arxiv.org/abs/1707.07012). This model
achieves mAP of 43.1% on the test-dev validation dataset for COCO, improving on
the best available model in the zoo by 6% in terms of absolute mAP.
<b>Thanks to contributors</b>: Barret Zoph, Vijay Vasudevan, Jonathon Shlens,
Quoc Le
### August 11, 2017
We have released an update to the
[Android Detect demo](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android)
which will now run models trained using the TensorFlow Object Detection API on
an Android device. By default, it currently runs a frozen SSD w/Mobilenet
detector trained on COCO, but we encourage you to try out other detection
models!
<b>Thanks to contributors</b>: Jonathan Huang, Andrew Harp
### June 15, 2017
In addition to our base TensorFlow detection model definitions, this release
includes:
* A selection of trainable detection models, including:
* Single Shot Multibox Detector (SSD) with MobileNet,
* SSD with Inception V2,
* Region-Based Fully Convolutional Networks (R-FCN) with Resnet 101,
* Faster RCNN with Resnet 101,
* Faster RCNN with Inception Resnet v2
* Frozen weights (trained on the COCO dataset) for each of the above models to
be used for out-of-the-box inference purposes.
* A [Jupyter notebook](../colab_tutorials/object_detection_tutorial.ipynb) for
performing out-of-the-box inference with one of our released models
* Convenient training and evaluation
[instructions](tf1_training_and_evaluation.md) for local runs and Google
Cloud.
<b>Thanks to contributors</b>: Jonathan Huang, Vivek Rathod, Derek Chow, Chen
Sun, Menglong Zhu, Matthew Tang, Anoop Korattikara, Alireza Fathi, Ian Fischer,
Zbigniew Wojna, Yang Song, Sergio Guadarrama, Jasper Uijlings, Viacheslav
Kovalevskyi, Kevin Murphy
# Running Locally
This page walks through the steps required to train an object detection model
on a local machine. It assumes the reader has completed the
following prerequisites:
1. The Tensorflow Object Detection API has been installed as documented in the
[installation instructions](installation.md). This includes installing library
dependencies, compiling the configuration protobufs and setting up the Python
environment.
2. A valid data set has been created. See [this page](preparing_inputs.md) for
instructions on how to generate a dataset for the PASCAL VOC challenge or the
Oxford-IIIT Pet dataset.
3. An Object Detection pipeline configuration has been written. See
[this page](configuring_jobs.md) for details on how to write a pipeline configuration.
## Recommended Directory Structure for Training and Evaluation
```
+data
  -label_map file
  -train TFRecord file
  -eval TFRecord file
+models
  + model
    -pipeline config file
    +train
    +eval
```
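One way to create this layout from scratch is sketched below (the file names are
placeholders that mirror the tree above):

```bash
# Sketch only: create the recommended layout with placeholder names.
mkdir -p data models/model/train models/model/eval
cp /path/to/label_map.pbtxt data/
cp /path/to/train.record /path/to/eval.record data/
cp /path/to/pipeline.config models/model/
```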
## Running the Training Job
A local training job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
NUM_TRAIN_STEPS=50000
SAMPLE_1_OF_N_EVAL_EXAMPLES=1
python object_detection/model_main.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--num_train_steps=${NUM_TRAIN_STEPS} \
--sample_1_of_n_eval_examples=$SAMPLE_1_OF_N_EVAL_EXAMPLES \
--alsologtostderr
```
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and
`${MODEL_DIR}` points to the directory in which training checkpoints
and events will be written. Note that this binary will interleave both
training and evaluation.
## Running Tensorboard
Progress for training and eval jobs can be inspected using Tensorboard. If
using the recommended directory structure, Tensorboard can be run using the
following command:
```bash
tensorboard --logdir=${MODEL_DIR}
```
where `${MODEL_DIR}` points to the directory that contains the
train and eval directories. Please note it may take Tensorboard a couple minutes
to populate with data.
# Quick Start: Jupyter notebook for off-the-shelf inference
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
If you'd like to hit the ground running and run detection on a few example
images right out of the box, we recommend trying out the Jupyter notebook demo.
To run the Jupyter notebook, run the following command from
......
# Running on mobile with TensorFlow Lite
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
In this section, we will show you how to use [TensorFlow
Lite](https://www.tensorflow.org/mobile/tflite/) to get a smaller model and
allow you to take advantage of ops that have been optimized for mobile devices.
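As a rough preview of the workflow (a sketch, assuming an SSD model trained with
this API; the export script and flags below are the ones shipped under
`object_detection/`, and all paths are placeholders), the first step is to
export a TFLite-friendly frozen graph, which can then be fed to the TensorFlow
Lite converter:

```bash
# Sketch: export a TFLite-compatible frozen graph for an SSD model (paths are placeholders).
python object_detection/export_tflite_ssd_graph.py \
    --pipeline_config_path=/path/to/pipeline.config \
    --trained_checkpoint_prefix=/path/to/model.ckpt-XXXX \
    --output_directory=/tmp/tflite \
    --add_postprocessing_op=true
```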
......
# Quick Start: Distributed Training on the Oxford-IIIT Pets Dataset on Google Cloud
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
This page is a walkthrough for training an object detector using the TensorFlow
Object Detection API. In this tutorial, we'll be training on the Oxford-IIIT Pets
dataset to build a system to detect various breeds of cats and dogs. The output
of the detector will look like the following:
......@@ -40,10 +42,10 @@ export YOUR_GCS_BUCKET=${YOUR_GCS_BUCKET}
It is also possible to run locally by following
[the running locally instructions](running_locally.md).
## Installing TensorFlow and the TensorFlow Object Detection API
Please run through the [installation instructions](installation.md) to install
TensorFlow and all its dependencies. Ensure the Protobuf libraries are
compiled and the library directories are added to `PYTHONPATH`.
## Getting the Oxford-IIIT Pets Dataset and Uploading it to Google Cloud Storage
......@@ -77,7 +79,7 @@ should appear as follows:
... other files and directories
```
The TensorFlow Object Detection API expects data to be in the TFRecord format,
so we'll now run the `create_pet_tf_record` script to convert from the raw
Oxford-IIIT Pet dataset into TFRecords. Run the following commands from the
`tensorflow/models/research/` directory:
......@@ -134,7 +136,7 @@ in the following step.
## Configuring the Object Detection Pipeline
In the TensorFlow Object Detection API, the model parameters, training
parameters and eval parameters are all defined by a config file. More details
can be found [here](configuring_jobs.md). For this tutorial, we will use some
predefined templates provided with the source code. In the
......@@ -188,10 +190,10 @@ browser](https://console.cloud.google.com/storage/browser).
Before we can start a job on Google Cloud ML Engine, we must:
1. Package the TensorFlow Object Detection code.
2. Write a cluster configuration for our Google Cloud ML job.
To package the TensorFlow Object Detection code, run the following commands from
the `tensorflow/models/research/` directory:
```bash
......@@ -248,7 +250,7 @@ web browser. You should see something similar to the following:
![](img/tensorboard.png)
Make sure your Tensorboard version is the same minor version as your TensorFlow (1.x).
You will also want to click on the images tab to see example detections made by
the model while it trains. After about an hour and a half of training, you can
......@@ -265,9 +267,9 @@ the training jobs are configured to go for much longer than is necessary for
convergence. To save money, we recommend killing your jobs once you've seen
that they've converged.
## Exporting the TensorFlow Graph
After your model has been trained, you should export it to a TensorFlow graph
proto. First, you need to identify a candidate checkpoint to export. You can
search your bucket using the [Google Cloud Storage
Browser](https://console.cloud.google.com/storage/browser). The file should be
......
# Object Detection API with TensorFlow 1
## Requirements
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
[![Protobuf Compiler >= 3.0](https://img.shields.io/badge/ProtoBuf%20Compiler-%3E3.0-brightgreen)](https://grpc.io/docs/protoc-installation/#install-using-a-package-manager)
## Installation
You can install the TensorFlow Object Detection API either with Python Package
Installer (pip) or Docker. For local runs we recommend using Docker and for
Google Cloud runs we recommend using pip.
Clone the TensorFlow Models repository and proceed to one of the installation
options.
```bash
git clone https://github.com/tensorflow/models.git
```
### Docker Installation
```bash
# From the root of the git repository
docker build -f research/object_detection/dockerfiles/tf1/Dockerfile -t od .
docker run -it od
```
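If you want your local datasets and configs visible inside the container, a
typical variant is to mount them as a volume; this is only a hedged usage sketch
and the mount paths below are arbitrary placeholders:

```bash
# Sketch: mount a local data directory into the container (paths are placeholders).
docker run -it -v /path/to/data:/data od
```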
### Python Package Installation
```bash
cd models/research
# Compile protos.
protoc object_detection/protos/*.proto --python_out=.
# Install TensorFlow Object Detection API.
cp object_detection/packages/tf1/setup.py .
python -m pip install .
```
```bash
# Test the installation.
python object_detection/builders/model_builder_tf1_test.py
```
## Quick Start
### Colabs
* [Jupyter notebook for off-the-shelf inference](../colab_tutorials/object_detection_tutorial.ipynb)
* [Training a pet detector](running_pets.md)
### Training and Evaluation
To train and evaluate your models either locally or on Google Cloud see
[instructions](tf1_training_and_evaluation.md).
## Model Zoo
We provide a large collection of models that are trained on several datasets in
the [Model Zoo](tf1_detection_zoo.md).
## Guides
* <a href='configuring_jobs.md'>
Configuring an object detection pipeline</a><br>
* <a href='preparing_inputs.md'>Preparing inputs</a><br>
* <a href='defining_your_own_model.md'>
Defining your own model architecture</a><br>
* <a href='using_your_own_dataset.md'>
Bringing in your own dataset</a><br>
* <a href='evaluation_protocols.md'>
Supported object detection evaluation protocols</a><br>
* <a href='tpu_compatibility.md'>
TPU compatible detection pipelines</a><br>
* <a href='tf1_training_and_evaluation.md'>
Training and evaluation guide (CPU, GPU, or TPU)</a><br>
## Extras:
* <a href='exporting_models.md'>
Exporting a trained model for inference</a><br>
* <a href='tpu_exporters.md'>
Exporting a trained model for TPU inference</a><br>
* <a href='oid_inference_and_evaluation.md'>
Inference and evaluation on the Open Images dataset</a><br>
* <a href='instance_segmentation.md'>
Run an instance segmentation model</a><br>
* <a href='challenge_evaluation.md'>
Run the evaluation for the Open Images Challenge 2018/2019</a><br>
* <a href='running_on_mobile_tensorflowlite.md'>
Running object detection on mobile devices with TensorFlow Lite</a><br>
* <a href='context_rcnn.md'>
Context R-CNN documentation for data preparation, training, and export</a><br>
# TensorFlow 1 Detection Model Zoo
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
We provide a collection of detection models pre-trained on the
[COCO dataset](http://cocodataset.org), the
......@@ -64,9 +67,9 @@ Some remarks on frozen inference graphs:
metrics.
* Our frozen inference graphs are generated using the
[v1.12.0](https://github.com/tensorflow/tensorflow/tree/v1.12.0) release
version of TensorFlow and we do not guarantee that these will work with
other versions; this being said, each frozen inference graph can be
regenerated using your current version of TensorFlow by re-running the
[exporter](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/exporting_models.md),
pointing it at the model directory as well as the corresponding config file
in
......
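For reference, re-running the exporter mentioned above typically looks like the
following sketch (all paths are placeholders; the authoritative flags are the
ones documented in `exporting_models.md`):

```bash
# Sketch: regenerate a frozen inference graph from a TF1 checkpoint (paths are placeholders).
python object_detection/export_inference_graph.py \
    --input_type=image_tensor \
    --pipeline_config_path=/path/to/pipeline.config \
    --trained_checkpoint_prefix=/path/to/model.ckpt-XXXX \
    --output_directory=/path/to/exported_model
```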
# Training and Evaluation with TensorFlow 1
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
This page walks through the steps required to train an object detection model.
It assumes the reader has completed the following prerequisites:
1. The TensorFlow Object Detection API has been installed as documented in the
[installation instructions](tf1.md#installation).
2. A valid data set has been created. See [this page](preparing_inputs.md) for
instructions on how to generate a dataset for the PASCAL VOC challenge or
the Oxford-IIIT Pet dataset.
## Recommended Directory Structure for Training and Evaluation
```bash
.
├── data/
│   ├── eval-00000-of-00001.tfrecord
│   ├── label_map.txt
│   ├── train-00000-of-00002.tfrecord
│   └── train-00001-of-00002.tfrecord
└── models/
    └── my_model_dir/
        ├── eval/                         # Created by evaluation job.
        ├── my_model.config
        └── train/                        #
            ├── model_ckpt-100-data@1     # Created by training job.
            ├── model_ckpt-100-index      #
            └── checkpoint                #
```
## Writing a model configuration
Please refer to sample [TF1 configs](../samples/configs) and
[configuring jobs](configuring_jobs.md) to create a model config.
### Model Parameter Initialization
While optional, it is highly recommended that users utilize classification or
object detection checkpoints. Training an object detector from scratch can take
days. To speed up the training process, it is recommended that users re-use the
feature extractor parameters from a pre-existing image classification or object
detection checkpoint. The `train_config` section in the config provides two
fields to specify pre-existing checkpoints:
* `fine_tune_checkpoint`: a path prefix to the pre-existing checkpoint
  (e.g. "/usr/home/username/checkpoint/model.ckpt-#####").
* `fine_tune_checkpoint_type`: with value `classification` or `detection`
depending on the type.
A list of detection checkpoints can be found [here](tf1_detection_zoo.md).
## Local
### Training
A local training job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
NUM_TRAIN_STEPS=50000
SAMPLE_1_OF_N_EVAL_EXAMPLES=1
python object_detection/model_main.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--num_train_steps=${NUM_TRAIN_STEPS} \
--sample_1_of_n_eval_examples=${SAMPLE_1_OF_N_EVAL_EXAMPLES} \
--alsologtostderr
```
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the directory in which training checkpoints and events will be
written. Note that this binary will interleave both training and evaluation.
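If you instead want to run evaluation only, against checkpoints written by a
separate training job, `model_main.py` also accepts a `--checkpoint_dir` flag; a
minimal sketch (assuming the `--run_once` flag is available in your copy of the
binary):

```bash
# From the tensorflow/models/research/ directory (sketch; evaluation-only mode).
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
CHECKPOINT_DIR=${MODEL_DIR}
python object_detection/model_main.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --checkpoint_dir=${CHECKPOINT_DIR} \
    --run_once=true \
    --alsologtostderr
```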
## Google Cloud AI Platform
The TensorFlow Object Detection API supports training on Google Cloud AI
Platform. This section documents instructions on how to train and evaluate your
model using Cloud AI Platform. The reader should complete the following
prerequisites:
1. The reader has created and configured a project on Google Cloud AI Platform.
See
[Using GPUs](https://cloud.google.com/ai-platform/training/docs/using-gpus)
and
[Using TPUs](https://cloud.google.com/ai-platform/training/docs/using-tpus)
guides.
2. The reader has a valid data set and stored it in a Google Cloud Storage
bucket. See [this page](preparing_inputs.md) for instructions on how to
generate a dataset for the PASCAL VOC challenge or the Oxford-IIIT Pet
dataset.
Additionally, it is recommended users test their job by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).
### Training with multiple workers with single GPU
Google Cloud ML requires a YAML configuration file for a multiworker training
job using GPUs. A sample YAML file is given below:
```
trainingInput:
  runtimeVersion: "1.15"
  scaleTier: CUSTOM
  masterType: standard_gpu
  workerCount: 9
......@@ -52,30 +113,32 @@ trainingInput:
  parameterServerCount: 3
  parameterServerType: standard
```
Please keep the following guidelines in mind when writing the YAML
configuration:
* A job with n workers will have n + 1 training machines (n workers + 1
master).
* The number of parameters servers used should be an odd number to prevent a
parameter server from storing only weight variables or only bias variables
(due to round robin parameter scheduling).
* The learning rate in the training config should be decreased when using a
larger number of workers. Some experimentation is required to find the
optimal learning rate.
The YAML file should be saved on the local machine (not on GCP). Once it has
been written, a user can start a training job on Cloud ML Engine using the
following command:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf1/setup.py .
gcloud ml-engine jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.15 \
--python-version 3.6 \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_main \
--region us-central1 \
--config ${PATH_TO_LOCAL_YAML_FILE} \
......@@ -90,41 +153,42 @@ training checkpoints and events will be written to and
`gs://${PIPELINE_CONFIG_PATH}` points to the pipeline configuration stored on
Google Cloud Storage.
Users can monitor the progress of their training job on the
[ML Engine Dashboard](https://console.cloud.google.com/ai-platform/jobs).
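Job status and logs can also be checked from the command line (a sketch;
`${JOB_NAME}` is whatever name you passed to `gcloud ml-engine jobs submit
training`, and the subcommands are assumed to be available in your gcloud
version):

```bash
# Sketch: check job status and stream logs for a submitted job.
gcloud ml-engine jobs describe ${JOB_NAME}
gcloud ml-engine jobs stream-logs ${JOB_NAME}
```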
## Training with TPU
Launching a training job with a TPU compatible pipeline config requires using a
similar command:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf1/setup.py .
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_tpu_main \
--runtime-version 1.15 \
--python-version 3.6 \
--scale-tier BASIC_TPU \
--region us-central1 \
-- \
--tpu_zone us-central1 \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
In contrast with the GPU training command, there is no need to specify a YAML
file, and we point to the *object_detection.model_tpu_main* binary instead of
*object_detection.model_main*. We must also now set `scale-tier` to be
`BASIC_TPU` and provide a `tpu_zone`. Finally, as before, `pipeline_config_path`
points to the pipeline configuration stored on Google Cloud Storage
(but it must now point to a TPU-compatible model).
## Evaluation with GPU
Note: You only need to do this when using TPU for training, as it does not
interleave evaluation during training, as in the case of Multiworker GPU
training.
Evaluation jobs run on a single machine, so it is not necessary to write a YAML
......@@ -132,10 +196,13 @@ configuration for evaluation. Run the following command to start the evaluation
job:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf1/setup.py .
gcloud ml-engine jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.15 \
--python-version 3.6 \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_main \
--region us-central1 \
--scale-tier BASIC_GPU \
......@@ -146,25 +213,25 @@ gcloud ml-engine jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%
```
Where `gs://${MODEL_DIR}` points to the directory on Google Cloud Storage where
training checkpoints are saved (same as the training job), as well as to where
evaluation events will be saved on Google Cloud Storage and
`gs://${PIPELINE_CONFIG_PATH}` points to where the pipeline configuration is
stored on Google Cloud Storage.
Typically one starts an evaluation job concurrently with the training job. Note
that we do not support running evaluation on TPU, so the above command line for
launching evaluation jobs is the same whether you are training on GPU or TPU.
## Running Tensorboard
Progress for training and eval jobs can be inspected using Tensorboard. If using
the recommended directory structure, Tensorboard can be run using the following
command:
```bash
tensorboard --logdir=${MODEL_DIR}
```
where `${MODEL_DIR}` points to the directory that contains the train and eval
directories. Please note it may take Tensorboard a couple minutes to populate
with data.
# Object Detection API with TensorFlow 2
## Requirements
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Protobuf Compiler >= 3.0](https://img.shields.io/badge/ProtoBuf%20Compiler-%3E3.0-brightgreen)](https://grpc.io/docs/protoc-installation/#install-using-a-package-manager)
## Installation
You can install the TensorFlow Object Detection API either with Python Package
Installer (pip) or Docker. For local runs we recommend using Docker and for
Google Cloud runs we recommend using pip.
Clone the TensorFlow Models repository and proceed to one of the installation
options.
```bash
git clone https://github.com/tensorflow/models.git
```
### Docker Installation
```bash
# From the root of the git repository
docker build -f research/object_detection/dockerfiles/tf2/Dockerfile -t od .
docker run -it od
```
### Python Package Installation
```bash
cd models/research
# Compile protos.
protoc object_detection/protos/*.proto --python_out=.
# Install TensorFlow Object Detection API.
cp object_detection/packages/tf2/setup.py .
python -m pip install .
```
```bash
# Test the installation.
python object_detection/builders/model_builder_tf2_test.py
```
## Quick Start
### Colabs
<!-- mdlint off(URL_BAD_G3DOC_PATH) -->
* Training -
[Fine-tune a pre-trained detector in eager mode on custom data](../colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb)
* Inference -
[Run inference with models from the zoo](../colab_tutorials/inference_tf2_colab.ipynb)
<!-- mdlint on -->
## Training and Evaluation
To train and evaluate your models either locally or on Google Cloud see
[instructions](tf2_training_and_evaluation.md).
## Model Zoo
We provide a large collection of models that are trained on COCO 2017 in the
[Model Zoo](tf2_detection_zoo.md).
## Guides
* <a href='configuring_jobs.md'>
Configuring an object detection pipeline</a><br>
* <a href='preparing_inputs.md'>Preparing inputs</a><br>
* <a href='defining_your_own_model.md'>
Defining your own model architecture</a><br>
* <a href='using_your_own_dataset.md'>
Bringing in your own dataset</a><br>
* <a href='evaluation_protocols.md'>
Supported object detection evaluation protocols</a><br>
* <a href='tpu_compatibility.md'>
TPU compatible detection pipelines</a><br>
* <a href='tf2_training_and_evaluation.md'>
Training and evaluation guide (CPU, GPU, or TPU)</a><br>
# TensorFlow 2 Classification Model Zoo
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
We provide a collection of classification models pre-trained on the
[ImageNet](http://www.image-net.org) dataset. These can be used to initialize
detection model parameters.
Model name |
---------- |
[EfficientNet B0](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b0.tar.gz) |
[EfficientNet B1](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b1.tar.gz) |
[EfficientNet B2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b2.tar.gz) |
[EfficientNet B3](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b3.tar.gz) |
[EfficientNet B4](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b4.tar.gz) |
[EfficientNet B5](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b5.tar.gz) |
[EfficientNet B6](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b6.tar.gz) |
[EfficientNet B7](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b7.tar.gz) |
[Resnet V1 50](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet50_v1.tar.gz) |
[Resnet V1 101](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet101_v1.tar.gz) |
[Resnet V1 152](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet152_v1.tar.gz) |
[Inception Resnet V2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/inception_resnet_v2.tar.gz) |
[MobileNet V1](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/mobilnet_v1.tar.gz) |
[MobileNet V2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/mobilnet_v2.tar.gz) |
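For example, a checkpoint can be downloaded and unpacked as follows (the URL is
taken from the table above; the destination directory is arbitrary):

```bash
# Sketch: download and unpack the EfficientNet B0 classification checkpoint.
wget http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b0.tar.gz
tar -xzf efficientnet_b0.tar.gz
```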
# TensorFlow 2 Detection Model Zoo
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
<!-- mdlint off(URL_BAD_G3DOC_PATH) -->
We provide a collection of detection models pre-trained on the
[COCO 2017 dataset](http://cocodataset.org). These models can be useful for
out-of-the-box inference if you are interested in categories already in those
datasets. You can try it in our inference
[colab](../colab_tutorials/inference_tf2_colab.ipynb)
They are also useful for initializing your models when training on novel
datasets. You can try this out on our few-shot training
[colab](../colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb).
<!-- mdlint on -->
Finally, if you would like to train these models from scratch, you can find the
model configs in this [directory](../configs/tf2) (also in the linked
`tar.gz`s).
Model name | Speed (ms) | COCO mAP | Outputs
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------: | :----------: | :-----:
[CenterNet HourGlass104 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_512x512_coco17_tpu-8.tar.gz) | 70 | 41.6 | Boxes
[CenterNet HourGlass104 Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_512x512_kpts_coco17_tpu-32.tar.gz) | 76 | 40.0/61.4 | Boxes/Keypoints
[CenterNet HourGlass104 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_1024x1024_coco17_tpu-32.tar.gz) | 197 | 43.5 | Boxes
[CenterNet HourGlass104 Keypoints 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_1024x1024_kpts_coco17_tpu-32.tar.gz) | 211 | 42.8/64.5 | Boxes/Keypoints
[CenterNet Resnet50 V1 FPN 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v1_fpn_512x512_coco17_tpu-8.tar.gz) | 27 | 31.2 | Boxes
[CenterNet Resnet50 V1 FPN Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v1_fpn_512x512_kpts_coco17_tpu-8.tar.gz) | 30 | 29.3/50.7 | Boxes/Keypoints
[CenterNet Resnet101 V1 FPN 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet101_v1_fpn_512x512_coco17_tpu-8.tar.gz) | 34 | 34.2 | Boxes
[CenterNet Resnet50 V2 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v2_512x512_coco17_tpu-8.tar.gz) | 27 | 29.5 | Boxes
[CenterNet Resnet50 V2 Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v2_512x512_kpts_coco17_tpu-8.tar.gz) | 30 | 27.6/48.2 | Boxes/Keypoints
[EfficientDet D0 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d0_coco17_tpu-32.tar.gz) | 39 | 33.6 | Boxes
[EfficientDet D1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d1_coco17_tpu-32.tar.gz) | 54 | 38.4 | Boxes
[EfficientDet D2 768x768](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d2_coco17_tpu-32.tar.gz) | 67 | 41.8 | Boxes
[EfficientDet D3 896x896](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d3_coco17_tpu-32.tar.gz) | 95 | 45.4 | Boxes
[EfficientDet D4 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d4_coco17_tpu-32.tar.gz) | 133 | 48.5 | Boxes
[EfficientDet D5 1280x1280](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d5_coco17_tpu-32.tar.gz) | 222 | 49.7 | Boxes
[EfficientDet D6 1280x1280](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d6_coco17_tpu-32.tar.gz) | 268 | 50.5 | Boxes
[EfficientDet D7 1536x1536](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d7_coco17_tpu-32.tar.gz) | 325 | 51.2 | Boxes
[SSD MobileNet v2 320x320](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz) |19 | 20.2 | Boxes
[SSD MobileNet V1 FPN 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 48 | 29.1 | Boxes
[SSD MobileNet V2 FPNLite 320x320](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz) | 22 | 22.2 | Boxes
[SSD MobileNet V2 FPNLite 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8.tar.gz) | 39 | 28.2 | Boxes
[SSD ResNet50 V1 FPN 640x640 (RetinaNet50)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 46 | 34.3 | Boxes
[SSD ResNet50 V1 FPN 1024x1024 (RetinaNet50)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 87 | 38.3 | Boxes
[SSD ResNet101 V1 FPN 640x640 (RetinaNet101)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet101_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 57 | 35.6 | Boxes
[SSD ResNet101 V1 FPN 1024x1024 (RetinaNet101)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet101_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 104 | 39.5 | Boxes
[SSD ResNet152 V1 FPN 640x640 (RetinaNet152)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet152_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 80 | 35.4 | Boxes
[SSD ResNet152 V1 FPN 1024x1024 (RetinaNet152)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet152_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 111 | 39.6 | Boxes
[Faster R-CNN ResNet50 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.tar.gz) | 53 | 29.3 | Boxes
[Faster R-CNN ResNet50 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8.tar.gz) | 65 | 31.0 | Boxes
[Faster R-CNN ResNet50 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_800x1333_coco17_gpu-8.tar.gz) | 65 | 31.6 | Boxes
[Faster R-CNN ResNet101 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_640x640_coco17_tpu-8.tar.gz) | 55 | 31.8 | Boxes
[Faster R-CNN ResNet101 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_1024x1024_coco17_tpu-8.tar.gz) | 72 | 37.1 | Boxes
[Faster R-CNN ResNet101 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_800x1333_coco17_gpu-8.tar.gz) | 77 | 36.6 | Boxes
[Faster R-CNN ResNet152 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_640x640_coco17_tpu-8.tar.gz) | 64 | 32.4 | Boxes
[Faster R-CNN ResNet152 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_1024x1024_coco17_tpu-8.tar.gz) | 85 | 37.6 | Boxes
[Faster R-CNN ResNet152 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_800x1333_coco17_gpu-8.tar.gz) | 101 | 37.4 | Boxes
[Faster R-CNN Inception ResNet V2 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8.tar.gz) | 206 | 37.7 | Boxes
[Faster R-CNN Inception ResNet V2 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_1024x1024_coco17_tpu-8.tar.gz) | 236 | 38.7 | Boxes
[Mask R-CNN Inception ResNet V2 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz) | 301 | 39.0/34.6 | Boxes/Masks
[ExtremeNet](http://download.tensorflow.org/models/object_detection/tf2/20200711/extremenet.tar.gz) | -- | -- | Boxes
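To use one of these models, download and unpack its archive (the URL below is
taken from the table above); the extracted directory typically contains a
`pipeline.config`, a `checkpoint/` directory, and a `saved_model/` directory,
though the exact layout may vary:

```bash
# Sketch: download and unpack a TF2 detection checkpoint, then inspect its contents.
wget http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d0_coco17_tpu-32.tar.gz
tar -xzf efficientdet_d0_coco17_tpu-32.tar.gz
ls efficientdet_d0_coco17_tpu-32
```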
# Training and Evaluation with TensorFlow 2
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
This page walks through the steps required to train an object detection model.
It assumes the reader has completed the following prerequisites:
1. The TensorFlow Object Detection API has been installed as documented in the
[installation instructions](tf2.md#installation).
2. A valid data set has been created. See [this page](preparing_inputs.md) for
instructions on how to generate a dataset for the PASCAL VOC challenge or
the Oxford-IIIT Pet dataset.
## Recommended Directory Structure for Training and Evaluation
```bash
.
├── data/
│   ├── eval-00000-of-00001.tfrecord
│   ├── label_map.txt
│   ├── train-00000-of-00002.tfrecord
│   └── train-00001-of-00002.tfrecord
└── models/
    └── my_model_dir/
        ├── eval/                     # Created by evaluation job.
        ├── my_model.config
        ├── model_ckpt-100-data@1     #
        ├── model_ckpt-100-index      # Created by training job.
        └── checkpoint                #
```
## Writing a model configuration
Please refer to sample [TF2 configs](../configs/tf2) and
[configuring jobs](configuring_jobs.md) to create a model config.
### Model Parameter Initialization
While optional, it is highly recommended that users utilize classification or
object detection checkpoints. Training an object detector from scratch can take
days. To speed up the training process, it is recommended that users re-use the
feature extractor parameters from a pre-existing image classification or object
detection checkpoint. The `train_config` section in the config provides two
fields to specify pre-existing checkpoints:
* `fine_tune_checkpoint`: a path prefix to the pre-existing checkpoint
  (e.g. "/usr/home/username/checkpoint/model.ckpt-#####").
* `fine_tune_checkpoint_type`: with value `classification` or `detection`
depending on the type.
A list of classification checkpoints can be found
[here](tf2_classification_zoo.md).
A list of detection checkpoints can be found [here](tf2_detection_zoo.md).
## Local
### Training
A local training job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
python object_detection/model_main_tf2.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--alsologtostderr
```
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the directory in which training checkpoints and events will be
written.
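Training progress can be monitored with Tensorboard, just as in the TF1
workflow:

```bash
tensorboard --logdir=${MODEL_DIR}
```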
### Evaluation
A local evaluation job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
CHECKPOINT_DIR=${MODEL_DIR}
python object_detection/model_main_tf2.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--checkpoint_dir=${CHECKPOINT_DIR} \
--alsologtostderr
```
where `${CHECKPOINT_DIR}` points to the directory with checkpoints produced by
the training job. Evaluation events are written to `${MODEL_DIR}/eval`.
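If you want to inspect the recorded metrics outside of TensorBoard, a rough sketch such as the following lists the summary tags and steps found in the eval event files; the glob pattern is a placeholder for your `${MODEL_DIR}/eval` directory.
```python
# Sketch: list summary tags/steps written by the evaluation job.
import glob
import tensorflow as tf

# Placeholder path; point this at your ${MODEL_DIR}/eval directory.
for event_file in sorted(
    glob.glob('models/my_model_dir/eval/events.out.tfevents.*')):
  for event in tf.compat.v1.train.summary_iterator(event_file):
    for value in event.summary.value:
      print(event.step, value.tag)
```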
## Google Cloud VM
The TensorFlow Object Detection API supports training on Google Cloud with Deep
Learning GPU VMs and TPU VMs. This section documents instructions on how to
train and evaluate your model on them. The reader should complete the following
prerequisites:
1. The reader has created and configured a GPU VM or TPU VM on Google Cloud with
TensorFlow >= 2.2.0. See
[TPU quickstart](https://cloud.google.com/tpu/docs/quickstart) and
[GPU quickstart](https://cloud.google.com/ai-platform/deep-learning-vm/docs/tensorflow_start_instance#with-one-or-more-gpus)
2. The reader has installed the TensorFlow Object Detection API as documented
in the [installation instructions](tf2.md#installation) on the VM.
3. The reader has a valid dataset stored in a Google Cloud Storage
bucket or locally on the VM. See [this page](preparing_inputs.md) for
instructions on how to generate a dataset for the PASCAL VOC challenge or
the Oxford-IIIT Pet dataset.
Additionally, it is recommended that users test their jobs by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).
### Training
Training on GPU or TPU VMs is similar to local training. It can be launched
using the following command.
```bash
# From the tensorflow/models/research/ directory
USE_TPU=true
TPU_NAME="MY_TPU_NAME"
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
# The --use_tpu and --tpu_name flags are only required for TPU training.
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --use_tpu=${USE_TPU} \
    --tpu_name=${TPU_NAME} \
    --alsologtostderr
```
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the root directory for the files produced. Training checkpoints and
events are written to `${MODEL_DIR}`. Note that the paths can be either local
paths or paths to a GCS bucket.
### Evaluation
Evaluation is only supported on GPU. Similar to local evaluation, it can be
launched using the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
CHECKPOINT_DIR=${MODEL_DIR}
python object_detection/model_main_tf2.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--checkpoint_dir=${CHECKPOINT_DIR} \
--alsologtostderr
```
where `${CHECKPOINT_DIR}` points to the directory with checkpoints produced by
the training job. Evaluation events are written to `${MODEL_DIR}/eval`. Note
that the paths can be either local paths or paths to a GCS bucket.
## Google Cloud AI Platform
The TensorFlow Object Detection API also supports training on Google Cloud
AI Platform. This section documents instructions on how to train and evaluate
your model using Cloud ML. The reader should complete the following
prerequisites:
1. The reader has created and configured a project on Google Cloud AI Platform.
See
[Using GPUs](https://cloud.google.com/ai-platform/training/docs/using-gpus)
and
[Using TPUs](https://cloud.google.com/ai-platform/training/docs/using-tpus)
guides.
2. The reader has a valid dataset stored in a Google Cloud Storage
bucket. See [this page](preparing_inputs.md) for instructions on how to
generate a dataset for the PASCAL VOC challenge or the Oxford-IIIT Pet
dataset.
Additionally, it is recommended that users test their jobs by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).
### Training with multiple GPUs
A user can start a training job on Cloud AI Platform using the following
command:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 2.1 \
--python-version 3.6 \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_main_tf2 \
--region us-central1 \
--master-machine-type n1-highcpu-16 \
--master-accelerator count=8,type=nvidia-tesla-v100 \
-- \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
where `gs://${MODEL_DIR}` specifies the directory on Google Cloud Storage where
the training checkpoints and events will be written, and
`gs://${PIPELINE_CONFIG_PATH}` points to the pipeline configuration stored on
Google Cloud Storage.
Users can monitor the progress of their training job on the
[ML Engine Dashboard](https://console.cloud.google.com/ai-platform/jobs).
### Training with TPU
Launching a training job with a TPU-compatible pipeline config requires a
similar command:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_main_tf2 \
--runtime-version 2.1 \
--python-version 3.6 \
--scale-tier BASIC_TPU \
--region us-central1 \
-- \
--use_tpu true \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
As before, `pipeline_config_path` points to the pipeline configuration stored on
Google Cloud Storage (but it must now describe a TPU-compatible model).
### Evaluating with GPU
Evaluation jobs run on a single machine. Run the following command to start the
evaluation job:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 2.1 \
--python-version 3.6 \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_main_tf2 \
--region us-central1 \
--scale-tier BASIC_GPU \
-- \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH} \
--checkpoint_dir=gs://${MODEL_DIR}
```
where `gs://${MODEL_DIR}` points to the directory on Google Cloud Storage where
training checkpoints are saved and `gs://${PIPELINE_CONFIG_PATH}` points to where
the model configuration file is stored on Google Cloud Storage. Evaluation events
are written to `gs://${MODEL_DIR}/eval`.
Typically one starts an evaluation job concurrently with the training job. Note
that we do not support running evaluation on TPU.
## Running TensorBoard
Progress for training and eval jobs can be inspected using TensorBoard. If using
the recommended directory structure, TensorBoard can be run using the following
command:
```bash
tensorboard --logdir=${MODEL_DIR}
```
where `${MODEL_DIR}` points to the directory that contains the train and eval
directories. Please note it may take TensorBoard a couple of minutes to populate
with data.
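If you prefer launching TensorBoard from a notebook or script rather than the CLI, a sketch using the `tensorboard.program` API (with a placeholder log directory) looks like this:
```python
# Sketch: launch TensorBoard programmatically and print its URL.
from tensorboard import program

tb = program.TensorBoard()
tb.configure(argv=[None, '--logdir', 'models/my_model_dir'])  # placeholder dir
url = tb.launch()
print('TensorBoard is running at', url)
```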
......@@ -2,7 +2,7 @@
[TOC]
The Tensorflow Object Detection API supports TPU training for some models. To
The TensorFlow Object Detection API supports TPU training for some models. To
make models TPU compatible you need to make a few tweaks to the model config as
mentioned below. We also provide several sample configs that you can use as a
template.
......@@ -11,7 +11,7 @@ template.
### Static shaped tensors
TPU training currently requires all tensors in the Tensorflow Graph to have
TPU training currently requires all tensors in the TensorFlow Graph to have
static shapes. However, most of the sample configs in Object Detection API have
a few different tensors that are dynamically shaped. Fortunately, we provide
simple alternatives in the model configuration that modifies these tensors to
......@@ -62,7 +62,7 @@ have static shape:
### TPU friendly ops
Although TPU supports a vast number of tensorflow ops, a few used in the
Tensorflow Object Detection API are unsupported. We list such ops below and
TensorFlow Object Detection API are unsupported. We list such ops below and
recommend compatible substitutes.
* **Anchor sampling** - Typically we use hard example mining in standard SSD
......
# Object Detection TPU Inference Exporter
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
This package contains SavedModel Exporter for TPU Inference of object detection
models.
......
......@@ -2,7 +2,7 @@
[TOC]
To use your own dataset in Tensorflow Object Detection API, you must convert it
To use your own dataset in TensorFlow Object Detection API, you must convert it
into the [TFRecord file format](https://www.tensorflow.org/api_guides/python/python_io#tfrecords_format_details).
This document outlines how to write a script to generate the TFRecord file.
......
......@@ -27,6 +27,7 @@ from object_detection.builders import model_builder
from object_detection.builders import preprocessor_builder
from object_detection.core import box_list
from object_detection.core import box_list_ops
from object_detection.core import densepose_ops
from object_detection.core import keypoint_ops
from object_detection.core import preprocessor
from object_detection.core import standard_fields as fields
......@@ -289,6 +290,13 @@ def transform_input_data(tensor_dict,
out_tensor_dict[flds_gt_kpt_vis],
keypoint_type_weight))
dp_surface_coords_fld = fields.InputDataFields.groundtruth_dp_surface_coords
if dp_surface_coords_fld in tensor_dict:
dp_surface_coords = out_tensor_dict[dp_surface_coords_fld]
realigned_dp_surface_coords = densepose_ops.change_coordinate_frame(
dp_surface_coords, im_box)
out_tensor_dict[dp_surface_coords_fld] = realigned_dp_surface_coords
if use_bfloat16:
preprocessed_resized_image = tf.cast(
preprocessed_resized_image, tf.bfloat16)
......@@ -355,7 +363,8 @@ def pad_input_data_to_static_shapes(tensor_dict,
num_classes,
spatial_image_shape=None,
max_num_context_features=None,
context_feature_length=None):
context_feature_length=None,
max_dp_points=336):
"""Pads input tensors to static shapes.
In case num_additional_channels > 0, we assume that the additional channels
......@@ -372,6 +381,11 @@ def pad_input_data_to_static_shapes(tensor_dict,
max_num_context_features (optional): The maximum number of context
features needed to compute shapes padding.
context_feature_length (optional): The length of the context feature.
max_dp_points (optional): The maximum number of DensePose sampled points per
instance. The default (336) is selected since the original DensePose paper
(https://arxiv.org/pdf/1802.00434.pdf) indicates that the maximum number
of samples per part is 14, and therefore 24 * 14 = 336 is the maximum
      samples per instance.
Returns:
A dictionary keyed by fields.InputDataFields containing padding shapes for
......@@ -476,6 +490,15 @@ def pad_input_data_to_static_shapes(tensor_dict,
padding_shape = [max_num_boxes, shape_utils.get_dim_as_int(tensor_shape[1])]
padding_shapes[fields.InputDataFields.
groundtruth_keypoint_weights] = padding_shape
if fields.InputDataFields.groundtruth_dp_num_points in tensor_dict:
padding_shapes[
fields.InputDataFields.groundtruth_dp_num_points] = [max_num_boxes]
padding_shapes[
fields.InputDataFields.groundtruth_dp_part_ids] = [
max_num_boxes, max_dp_points]
padding_shapes[
fields.InputDataFields.groundtruth_dp_surface_coords] = [
max_num_boxes, max_dp_points, 4]
# Prepare for ContextRCNN related fields.
if fields.InputDataFields.context_features in tensor_dict:
......@@ -535,6 +558,10 @@ def augment_input_data(tensor_dict, data_augmentation_options):
in tensor_dict)
include_multiclass_scores = (fields.InputDataFields.multiclass_scores in
tensor_dict)
dense_pose_fields = [fields.InputDataFields.groundtruth_dp_num_points,
fields.InputDataFields.groundtruth_dp_part_ids,
fields.InputDataFields.groundtruth_dp_surface_coords]
include_dense_pose = all(field in tensor_dict for field in dense_pose_fields)
tensor_dict = preprocessor.preprocess(
tensor_dict, data_augmentation_options,
func_arg_map=preprocessor.get_default_func_arg_map(
......@@ -543,7 +570,8 @@ def augment_input_data(tensor_dict, data_augmentation_options):
include_multiclass_scores=include_multiclass_scores,
include_instance_masks=include_instance_masks,
include_keypoints=include_keypoints,
include_keypoint_visibilities=include_keypoint_visibilities))
include_keypoint_visibilities=include_keypoint_visibilities,
include_dense_pose=include_dense_pose))
tensor_dict[fields.InputDataFields.image] = tf.squeeze(
tensor_dict[fields.InputDataFields.image], axis=0)
return tensor_dict
......@@ -572,6 +600,9 @@ def _get_labels_dict(input_dict):
fields.InputDataFields.groundtruth_difficult,
fields.InputDataFields.groundtruth_keypoint_visibilities,
fields.InputDataFields.groundtruth_keypoint_weights,
fields.InputDataFields.groundtruth_dp_num_points,
fields.InputDataFields.groundtruth_dp_part_ids,
fields.InputDataFields.groundtruth_dp_surface_coords
]
for key in optional_label_keys:
......@@ -720,6 +751,17 @@ def train_input(train_config, train_input_config,
groundtruth visibilities for each keypoint.
labels[fields.InputDataFields.groundtruth_labeled_classes] is a
[batch_size, num_classes] float32 k-hot tensor of classes.
labels[fields.InputDataFields.groundtruth_dp_num_points] is a
[batch_size, num_boxes] int32 tensor with the number of sampled
DensePose points per object.
labels[fields.InputDataFields.groundtruth_dp_part_ids] is a
[batch_size, num_boxes, max_sampled_points] int32 tensor with the
DensePose part ids (0-indexed) per object.
labels[fields.InputDataFields.groundtruth_dp_surface_coords] is a
[batch_size, num_boxes, max_sampled_points, 4] float32 tensor with the
DensePose surface coordinates. The format is (y, x, v, u), where (y, x)
are normalized image coordinates and (v, u) are normalized surface part
coordinates.
Raises:
TypeError: if the `train_config`, `train_input_config` or `model_config`
......@@ -861,6 +903,17 @@ def eval_input(eval_config, eval_input_config, model_config,
same class which heavily occlude each other.
labels[fields.InputDataFields.groundtruth_labeled_classes] is a
[num_boxes, num_classes] float32 k-hot tensor of classes.
labels[fields.InputDataFields.groundtruth_dp_num_points] is a
[batch_size, num_boxes] int32 tensor with the number of sampled
DensePose points per object.
labels[fields.InputDataFields.groundtruth_dp_part_ids] is a
[batch_size, num_boxes, max_sampled_points] int32 tensor with the
DensePose part ids (0-indexed) per object.
labels[fields.InputDataFields.groundtruth_dp_surface_coords] is a
[batch_size, num_boxes, max_sampled_points, 4] float32 tensor with the
DensePose surface coordinates. The format is (y, x, v, u), where (y, x)
are normalized image coordinates and (v, u) are normalized surface part
coordinates.
Raises:
TypeError: if the `eval_config`, `eval_input_config` or `model_config`
......@@ -1041,8 +1094,12 @@ def get_reduce_to_frame_fn(input_reader_config, is_training):
num_frames = tf.cast(
tf.shape(tensor_dict[fields.InputDataFields.source_id])[0],
dtype=tf.int32)
frame_index = tf.random.uniform((), minval=0, maxval=num_frames,
dtype=tf.int32)
if input_reader_config.frame_index == -1:
frame_index = tf.random.uniform((), minval=0, maxval=num_frames,
dtype=tf.int32)
else:
frame_index = tf.constant(input_reader_config.frame_index,
dtype=tf.int32)
out_tensor_dict = {}
for key in tensor_dict:
if key in fields.SEQUENCE_FIELDS:
......
......@@ -61,7 +61,7 @@ def _get_configs_for_model(model_name):
configs, kwargs_dict=override_dict)
def _get_configs_for_model_sequence_example(model_name):
def _get_configs_for_model_sequence_example(model_name, frame_index=-1):
"""Returns configurations for model."""
fname = os.path.join(tf.resource_loader.get_data_files_path(),
'test_data/' + model_name + '.config')
......@@ -74,7 +74,8 @@ def _get_configs_for_model_sequence_example(model_name):
override_dict = {
'train_input_path': data_path,
'eval_input_path': data_path,
'label_map_path': label_map_path
'label_map_path': label_map_path,
'frame_index': frame_index
}
return config_util.merge_external_params_with_configs(
configs, kwargs_dict=override_dict)
......@@ -312,6 +313,46 @@ class InputFnTest(test_case.TestCase, parameterized.TestCase):
tf.float32,
labels[fields.InputDataFields.groundtruth_weights].dtype)
def test_context_rcnn_resnet50_train_input_with_sequence_example_frame_index(
self, train_batch_size=8):
"""Tests the training input function for FasterRcnnResnet50."""
configs = _get_configs_for_model_sequence_example(
'context_rcnn_camera_trap', frame_index=2)
model_config = configs['model']
train_config = configs['train_config']
train_config.batch_size = train_batch_size
train_input_fn = inputs.create_train_input_fn(
train_config, configs['train_input_config'], model_config)
features, labels = _make_initializable_iterator(train_input_fn()).get_next()
self.assertAllEqual([train_batch_size, 640, 640, 3],
features[fields.InputDataFields.image].shape.as_list())
self.assertEqual(tf.float32, features[fields.InputDataFields.image].dtype)
self.assertAllEqual([train_batch_size],
features[inputs.HASH_KEY].shape.as_list())
self.assertEqual(tf.int32, features[inputs.HASH_KEY].dtype)
self.assertAllEqual(
[train_batch_size, 100, 4],
labels[fields.InputDataFields.groundtruth_boxes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_boxes].dtype)
self.assertAllEqual(
[train_batch_size, 100, model_config.faster_rcnn.num_classes],
labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[train_batch_size, 100],
labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_weights].dtype)
self.assertAllEqual(
[train_batch_size, 100, model_config.faster_rcnn.num_classes],
labels[fields.InputDataFields.groundtruth_confidences].shape.as_list())
self.assertEqual(
tf.float32,
labels[fields.InputDataFields.groundtruth_confidences].dtype)
def test_ssd_inceptionV2_train_input(self):
"""Tests the training input function for SSDInceptionV2."""
configs = _get_configs_for_model('ssd_inception_v2_pets')
......@@ -1293,6 +1334,51 @@ class DataTransformationFnTest(test_case.TestCase, parameterized.TestCase):
groundtruth_keypoint_weights,
[[1.0, 1.0], [1.0, 1.0]])
def test_groundtruth_dense_pose(self):
def graph_fn():
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np.random.rand(100, 50, 3).astype(np.float32)),
fields.InputDataFields.groundtruth_boxes:
tf.constant(np.array([[.5, .5, 1, 1], [.0, .0, .5, .5]],
np.float32)),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([1, 2], np.int32)),
fields.InputDataFields.groundtruth_dp_num_points:
tf.constant([0, 2], dtype=tf.int32),
fields.InputDataFields.groundtruth_dp_part_ids:
tf.constant([[0, 0], [4, 23]], dtype=tf.int32),
fields.InputDataFields.groundtruth_dp_surface_coords:
tf.constant([[[0., 0., 0., 0.,], [0., 0., 0., 0.,]],
[[0.1, 0.2, 0.3, 0.4,], [0.6, 0.8, 0.6, 0.7,]]],
dtype=tf.float32),
}
num_classes = 1
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_resize50_preprocess_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=num_classes)
transformed_inputs = input_transformation_fn(tensor_dict=tensor_dict)
transformed_dp_num_points = transformed_inputs[
fields.InputDataFields.groundtruth_dp_num_points]
transformed_dp_part_ids = transformed_inputs[
fields.InputDataFields.groundtruth_dp_part_ids]
transformed_dp_surface_coords = transformed_inputs[
fields.InputDataFields.groundtruth_dp_surface_coords]
return (transformed_dp_num_points, transformed_dp_part_ids,
transformed_dp_surface_coords)
dp_num_points, dp_part_ids, dp_surface_coords = self.execute_cpu(
graph_fn, [])
self.assertAllEqual(dp_num_points, [0, 2])
self.assertAllEqual(dp_part_ids, [[0, 0], [4, 23]])
self.assertAllClose(
dp_surface_coords,
[[[0., 0., 0., 0.,], [0., 0., 0., 0.,]],
[[0.1, 0.1, 0.3, 0.4,], [0.6, 0.4, 0.6, 0.7,]]])
class PadInputDataToStaticShapesFnTest(test_case.TestCase):
......@@ -1454,6 +1540,35 @@ class PadInputDataToStaticShapesFnTest(test_case.TestCase):
fields.InputDataFields.groundtruth_keypoint_visibilities]
.shape.as_list(), [3, 16])
def test_dense_pose(self):
input_tensor_dict = {
fields.InputDataFields.groundtruth_dp_num_points:
tf.constant([0, 2], dtype=tf.int32),
fields.InputDataFields.groundtruth_dp_part_ids:
tf.constant([[0, 0], [4, 23]], dtype=tf.int32),
fields.InputDataFields.groundtruth_dp_surface_coords:
tf.constant([[[0., 0., 0., 0.,], [0., 0., 0., 0.,]],
[[0.1, 0.2, 0.3, 0.4,], [0.6, 0.8, 0.6, 0.7,]]],
dtype=tf.float32),
}
padded_tensor_dict = inputs.pad_input_data_to_static_shapes(
tensor_dict=input_tensor_dict,
max_num_boxes=3,
num_classes=1,
spatial_image_shape=[128, 128],
max_dp_points=200)
self.assertAllEqual(
padded_tensor_dict[fields.InputDataFields.groundtruth_dp_num_points]
.shape.as_list(), [3])
self.assertAllEqual(
padded_tensor_dict[fields.InputDataFields.groundtruth_dp_part_ids]
.shape.as_list(), [3, 200])
self.assertAllEqual(
padded_tensor_dict[fields.InputDataFields.groundtruth_dp_surface_coords]
.shape.as_list(), [3, 200, 4])
def test_context_features(self):
context_memory_size = 8
context_feature_length = 10
......