![TensorFlow Requirement: 1.x](https://img.shields.io/badge/TensorFlow%20Requirement-1.x-brightgreen)
![TensorFlow 2 Not Supported](https://img.shields.io/badge/TensorFlow%202%20Not%20Supported-%E2%9C%95-red.svg)

# TensorFlow Object Detection API
Creating accurate machine learning models capable of localizing and identifying
multiple objects in a single image remains a core challenge in computer vision.
The TensorFlow Object Detection API is an open source framework built on top of
TensorFlow that makes it easy to construct, train and deploy object detection
models.  At Google we’ve certainly found this codebase to be useful for our
computer vision needs, and we hope that you will as well.
<p align="center">
  <img src="g3doc/img/kites_detections_output.jpg" width=676 height=450>
</p>
Contributions to the codebase are welcome and we would love to hear back from
you if you find this API useful.  Finally, if you use the TensorFlow Object
Detection API for a research publication, please consider citing:

```
"Speed/accuracy trade-offs for modern convolutional object detectors."
Huang J, Rathod V, Sun C, Zhu M, Korattikara A, Fathi A, Fischer I, Wojna Z,
Song Y, Guadarrama S, Murphy K, CVPR 2017
```
\[[link](https://arxiv.org/abs/1611.10012)\]\[[bibtex](
https://scholar.googleusercontent.com/scholar.bib?q=info:l291WsrB-hQJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAWUIIlnPZ_L9jxvPwcC49kDlELtaeIyU-&scisf=4&ct=citation&cd=-1&hl=en&scfhb=1)\]

<p align="center">
  <img src="g3doc/img/tf-od-api-logo.png" width=140 height=195>
</p>

## Maintainers

* Jonathan Huang, github: [jch1](https://github.com/jch1)
* Vivek Rathod, github: [tombstone](https://github.com/tombstone)
* Ronny Votel, github: [ronnyvotel](https://github.com/ronnyvotel)
* Derek Chow, github: [derekjchow](https://github.com/derekjchow)
* Chen Sun, github: [jesu9](https://github.com/jesu9)
* Menglong Zhu, github: [dreamdragon](https://github.com/dreamdragon)
* Alireza Fathi, github: [afathi3](https://github.com/afathi3)
* Zhichao Lu, github: [pkulzc](https://github.com/pkulzc)


## Table of contents

Setup:

  * <a href='g3doc/installation.md'>Installation</a><br>

Quick Start:

  * <a href='object_detection_tutorial.ipynb'>
      Quick Start: Jupyter notebook for off-the-shelf inference</a><br>
  * <a href="g3doc/running_pets.md">Quick Start: Training a pet detector</a><br>

Customizing a Pipeline:

  * <a href='g3doc/configuring_jobs.md'>
      Configuring an object detection pipeline</a><br>
  * <a href='g3doc/preparing_inputs.md'>Preparing inputs</a><br>

Running:

  * <a href='g3doc/running_locally.md'>Running locally</a><br>
  * <a href='g3doc/running_on_cloud.md'>Running on the cloud</a><br>

Extras:

  * <a href='g3doc/detection_model_zoo.md'>Tensorflow detection model zoo</a><br>
  * <a href='g3doc/exporting_models.md'>
      Exporting a trained model for inference</a><br>
  * <a href='g3doc/tpu_exporters.md'>
      Exporting a trained model for TPU inference</a><br>
  * <a href='g3doc/defining_your_own_model.md'>
      Defining your own model architecture</a><br>
  * <a href='g3doc/using_your_own_dataset.md'>
      Bringing in your own dataset</a><br>
  * <a href='g3doc/evaluation_protocols.md'>
      Supported object detection evaluation protocols</a><br>
  * <a href='g3doc/oid_inference_and_evaluation.md'>
      Inference and evaluation on the Open Images dataset</a><br>
  * <a href='g3doc/instance_segmentation.md'>
      Run an instance segmentation model</a><br>
  * <a href='g3doc/challenge_evaluation.md'>
      Run the evaluation for the Open Images Challenge 2018/2019</a><br>
  * <a href='g3doc/tpu_compatibility.md'>
      TPU compatible detection pipelines</a><br>
  * <a href='g3doc/running_on_mobile_tensorflowlite.md'>
      Running object detection on mobile devices with TensorFlow Lite</a><br>

## Getting Help

To get help with issues you may encounter using the TensorFlow Object Detection
API, create a new question on [Stack Overflow](https://stackoverflow.com/) with
the tags "tensorflow" and "object-detection".

Please report bugs (actually broken code, not usage questions) to the
tensorflow/models GitHub
[issue tracker](https://github.com/tensorflow/models/issues), prefixing the
issue name with "object_detection".

Please check the [FAQ](g3doc/faq.md) for frequently asked questions before
reporting an issue.


## Release information

### Nov 13th, 2019
We have released the MobileNetEdgeTPU SSDLite model.

* SSDLite with MobileNetEdgeTPU backbone, which achieves roughly 10% higher mAP
than MobileNetV2 SSDLite (24.3 mAP vs 22 mAP) on a Google Pixel 4 at comparable
latency (6.6ms vs 6.8ms).

Along with the model definition, we are also releasing model checkpoints
trained on the COCO dataset.

<b>Thanks to contributors</b>: Yunyang Xiong, Bo Chen, Suyog Gupta, Hanxiao Liu,
Gabriel Bender, Mingxing Tan, Berkin Akin, Zhichao Lu, Quoc Le

### Oct 15th, 2019
We have released two MobileNet V3 SSDLite models (presented in
[Searching for MobileNetV3](https://arxiv.org/abs/1905.02244)).

* SSDLite with MobileNet-V3-Large backbone, which is 27% faster than Mobilenet
V2 SSDLite (119ms vs 162ms) on a Google Pixel phone CPU at the same mAP.
* SSDLite with MobileNet-V3-Small backbone, which is 37% faster than MnasNet
SSDLite reduced with depth-multiplier (43ms vs 68ms) at the same mAP.
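
The "faster" percentages in the two bullets above are consistent with the relative reduction in latency against the baseline model. A minimal sketch of that arithmetic, assuming this is how the figures were derived (the helper name `latency_reduction_pct` is illustrative and not part of the API):

```python
def latency_reduction_pct(new_ms, baseline_ms):
    """Return the percentage by which latency drops relative to the baseline."""
    return 100.0 * (baseline_ms - new_ms) / baseline_ms


# MobileNet-V3-Large SSDLite vs MobileNet V2 SSDLite (119ms vs 162ms).
print(round(latency_reduction_pct(119, 162)))  # 27
# MobileNet-V3-Small SSDLite vs reduced MnasNet SSDLite (43ms vs 68ms).
print(round(latency_reduction_pct(43, 68)))  # 37
```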

Along with the model definition, we are also releasing model checkpoints
trained on the COCO dataset.

<b>Thanks to contributors</b>: Bo Chen, Zhichao Lu, Vivek Rathod, Jonathan Huang


### July 1st, 2019

We have released an updated set of utils and an updated
[tutorial](g3doc/challenge_evaluation.md) for all three tracks of the
[Open Images Challenge 2019](https://storage.googleapis.com/openimages/web/challenge2019.html)!

The Instance Segmentation metric for
[Open Images V5](https://storage.googleapis.com/openimages/web/index.html)
and [Challenge 2019](https://storage.googleapis.com/openimages/web/challenge2019.html)
is part of this release. Check out [the metric description](https://storage.googleapis.com/openimages/web/evaluation.html#instance_segmentation_eval)
on the Open Images website.

<b>Thanks to contributors</b>: Alina Kuznetsova, Rodrigo Benenson

### Feb 11, 2019

We have released detection models trained on the Open Images Dataset V4
in our detection model zoo, including

* Faster R-CNN detector with Inception Resnet V2 feature extractor
* SSD detector with MobileNet V2 feature extractor
* SSD detector with ResNet 101 FPN feature extractor (aka RetinaNet-101)

<b>Thanks to contributors</b>: Alina Kuznetsova, Yinxiao Li

### Sep 17, 2018

We have released Faster R-CNN detectors with ResNet-50 / ResNet-101 feature
extractors trained on the [iNaturalist Species Detection Dataset](https://github.com/visipedia/inat_comp/blob/master/2017/README.md#bounding-boxes).
The models are trained on the training split of the iNaturalist data for 4M
iterations; they achieve 55% and 58% mean AP@.5 over 2854 classes, respectively.
For more details please refer to this [paper](https://arxiv.org/abs/1707.06642).
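
For readers unfamiliar with the notation, AP@.5 typically means a detection counts as correct only when its intersection-over-union (IoU) with a ground-truth box reaches 0.5. The sketch below illustrates that matching test; the `(ymin, xmin, ymax, xmax)` box layout and the function names are assumptions for illustration, not the API's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (ymin, xmin, ymax, xmax)."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_match(detection, groundtruth, threshold=0.5):
    """True when the detection overlaps the ground truth enough to count."""
    return iou(detection, groundtruth) >= threshold


print(is_match((0, 0, 2, 2), (0, 0, 2, 3)))  # True, IoU is 4/6
print(is_match((0, 0, 2, 2), (1, 0, 3, 2)))  # False, IoU is 2/6
```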

<b>Thanks to contributors</b>: Chen Sun

### July 13, 2018

There are many new updates in this release, extending the functionality and
capability of the API:

* Moving from slim-based training to [Estimator](https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator)-based
training.
* Support for [RetinaNet](https://arxiv.org/abs/1708.02002), and a [MobileNet](https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html)
adaptation of RetinaNet.
* A novel SSD-based architecture called the [Pooling Pyramid Network](https://arxiv.org/abs/1807.03284) (PPN).
* Releasing several [TPU](https://cloud.google.com/tpu/)-compatible models.
These can be found in the `samples/configs/` directory with a comment in the
pipeline configuration files indicating TPU compatibility.
* Support for quantized training.
* Updated documentation for new binaries, Cloud training, and [Tensorflow Lite](https://www.tensorflow.org/mobile/tflite/).

See also our [expanded announcement blogpost](https://ai.googleblog.com/2018/07/accelerated-training-and-inference-with.html) and accompanying tutorial at the [TensorFlow blog](https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193).

<b>Thanks to contributors</b>: Sara Robinson, Aakanksha Chowdhery, Derek Chow,
Pengchong Jin, Jonathan Huang, Vivek Rathod, Zhichao Lu, Ronny Votel


### June 25, 2018

Additional evaluation tools for the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) are out.
Check out our short tutorial on data preparation and running evaluation [here](g3doc/challenge_evaluation.md)!

<b>Thanks to contributors</b>: Alina Kuznetsova

### June 5, 2018

We have released the implementation of evaluation metrics for both tracks of the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) as a part of the Object Detection API - see the [evaluation protocols](g3doc/evaluation_protocols.md) for more details.
Additionally, we have released a tool for hierarchical labels expansion for the Open Images Challenge: check out [oid_hierarchical_labels_expansion.py](dataset_tools/oid_hierarchical_labels_expansion.py).
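
To give a sense of what hierarchical label expansion means, the sketch below adds every ancestor class to a set of labels. It is a toy illustration, not the implementation in `oid_hierarchical_labels_expansion.py`: the function name, the single-parent assumption, and the small hierarchy are all hypothetical.

```python
def expand_labels(labels, parent_of):
    """Return the labels plus all of their ancestors in the class hierarchy."""
    expanded = set()
    for label in labels:
        current = label
        # Walk up the hierarchy; stop early once we reach a class whose
        # ancestors have already been added.
        while current is not None and current not in expanded:
            expanded.add(current)
            current = parent_of.get(current)
    return expanded


# Hypothetical toy hierarchy, mapping each child class to its parent.
PARENT_OF = {"Dog": "Carnivore", "Cat": "Carnivore", "Carnivore": "Animal"}

print(sorted(expand_labels({"Dog"}, PARENT_OF)))  # ['Animal', 'Carnivore', 'Dog']
```

With expansion applied, a box labeled "Dog" also counts as an instance of "Carnivore" and "Animal" during evaluation.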

<b>Thanks to contributors</b>: Alina Kuznetsova, Vittorio Ferrari, Jasper Uijlings

### April 30, 2018

We have released a Faster R-CNN detector with ResNet-101 feature extractor trained on [AVA](https://research.google.com/ava/) v2.1.
Compared with other commonly used object detectors, it changes the action classification loss function to a per-class sigmoid loss to handle boxes with multiple labels.
The model is trained on the training split of AVA v2.1 for 1.5M iterations; it achieves a mean AP of 11.25% over 60 classes on the validation split of AVA v2.1.
For more details please refer to this [paper](https://arxiv.org/abs/1705.08421).
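
As a rough illustration of why a per-class sigmoid loss suits boxes with multiple labels: each class is scored as an independent binary decision, so several targets can be 1 at once, which a softmax over classes cannot express. The pure-Python sketch below uses the standard numerically stable form of sigmoid cross-entropy; the function name is illustrative and this is not the model's actual loss code.

```python
import math


def sigmoid_cross_entropy(logits, targets):
    """Sum of independent per-class binary cross-entropy terms."""
    total = 0.0
    for x, z in zip(logits, targets):
        # Stable form of -z*log(sigmoid(x)) - (1 - z)*log(1 - sigmoid(x)).
        total += max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
    return total


# One box carrying two action labels at once (classes 0 and 1 are both positive).
confident = sigmoid_cross_entropy([10.0, 10.0, -10.0], [1, 1, 0])
uncertain = sigmoid_cross_entropy([0.0, 0.0, 0.0], [1, 1, 0])
print(confident < uncertain)  # True
```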

<b>Thanks to contributors</b>: Chen Sun, David Ross

### April 2, 2018

Supercharge your mobile phones with the next generation mobile object detector!
We are adding support for MobileNet V2 with SSDLite presented in
[MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381).
This model is 35% faster than Mobilenet V1 SSD on a Google Pixel phone CPU (200ms vs. 270ms) at the same accuracy.
Along with the model definition, we are also releasing a model checkpoint trained on the COCO dataset.

<b>Thanks to contributors</b>: Menglong Zhu, Mark Sandler, Zhichao Lu, Vivek Rathod, Jonathan Huang

### February 9, 2018

We now support instance segmentation! In this API update we support a number of instance segmentation models similar to those discussed in the [Mask R-CNN paper](https://arxiv.org/abs/1703.06870). For further details, refer to
[our slides](http://presentations.cocodataset.org/Places17-GMRI.pdf) from the 2017 COCO + Places Workshop.
Refer to the section on [Running an Instance Segmentation Model](g3doc/instance_segmentation.md) for instructions on how to configure a model
that predicts masks in addition to object bounding boxes.

<b>Thanks to contributors</b>: Alireza Fathi, Zhichao Lu, Vivek Rathod, Ronny Votel, Jonathan Huang

### November 17, 2017

As a part of the Open Images V3 release we have released:

* An implementation of the Open Images evaluation metric and the [protocol](g3doc/evaluation_protocols.md#open-images).
* Additional tools to separate inference of detection and evaluation (see [this tutorial](g3doc/oid_inference_and_evaluation.md)).
* A new detection model trained on the Open Images V2 data release (see [Open Images model](g3doc/detection_model_zoo.md#open-images-models)).

See more information on the [Open Images website](https://github.com/openimages/dataset)!

<b>Thanks to contributors</b>: Stefan Popov, Alina Kuznetsova

### November 6, 2017

We have re-released faster versions of our (pre-trained) models in the
<a href='g3doc/detection_model_zoo.md'>model zoo</a>.  In addition to what
was available before, we are also adding Faster R-CNN models trained on COCO
with Inception V2 and Resnet-50 feature extractors, as well as a Faster R-CNN
with Resnet-101 model trained on the KITTI dataset.

<b>Thanks to contributors</b>: Jonathan Huang, Vivek Rathod, Derek Chow,
Tal Remez, Chen Sun.

### October 31, 2017

We have released a new state-of-the-art model for object detection using
Faster R-CNN with
[NASNet-A image featurization](https://arxiv.org/abs/1707.07012). This
model achieves an mAP of 43.1% on the COCO test-dev dataset,
improving on the best available model in the zoo by 6 percentage points
of absolute mAP.

<b>Thanks to contributors</b>: Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc Le

### August 11, 2017

We have released an update to the [Android Detect
demo](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android)
which will now run models trained using the Tensorflow Object
Detection API on an Android device.  By default, it currently runs a
frozen SSD w/Mobilenet detector trained on COCO, but we encourage
you to try out other detection models!

<b>Thanks to contributors</b>: Jonathan Huang, Andrew Harp

### June 15, 2017

In addition to our base Tensorflow detection model definitions, this
release includes:

* A selection of trainable detection models, including:
  * Single Shot Multibox Detector (SSD) with MobileNet,
  * SSD with Inception V2,
  * Region-Based Fully Convolutional Networks (R-FCN) with Resnet 101,
  * Faster RCNN with Resnet 101,
  * Faster RCNN with Inception Resnet v2
* Frozen weights (trained on the COCO dataset) for each of the above models to
  be used for out-of-the-box inference purposes.
* A [Jupyter notebook](object_detection_tutorial.ipynb) for performing
  out-of-the-box inference with one of our released models
* Convenient [local training](g3doc/running_locally.md) scripts as well as
  distributed training and evaluation pipelines via
  [Google Cloud](g3doc/running_on_cloud.md).

<b>Thanks to contributors</b>: Jonathan Huang, Vivek Rathod, Derek Chow, Chen
Sun, Menglong Zhu, Matthew Tang, Anoop Korattikara, Alireza Fathi, Ian Fischer,
Zbigniew Wojna, Yang Song, Sergio Guadarrama, Jasper Uijlings, Viacheslav
Kovalevskyi, Kevin Murphy