Implementation of the Neural Programmer model described in [paper](https://openreview.net/pdf?id=ry2YOrcge)
Implementation of the Neural Programmer model as described in this [paper](https://openreview.net/pdf?id=ry2YOrcge).
Download and extract the data from [dropbox](https://www.dropbox.com/s/9tvtcv6lmy51zfw/data.zip?dl=0). Change the ``data_dir FLAG`` to the location of the data.
Download and extract the data from the [WikiTableQuestions](https://ppasupat.github.io/WikiTableQuestions/) site. The dataset contains
11321, 2831, and 4344 examples for training, development, and testing respectively. We use their tokenization, number and date pre-processing. Please note that the above paper used the [initial release](https://github.com/ppasupat/WikiTableQuestions/releases/tag/v0.2) for training, development and testing.
Change the `data_dir FLAG` to the location of the data.
Running object detection on mobile devices with TensorFlow Lite</a><br>
*<ahref='g3doc/context_rcnn.md'>
Context R-CNN documentation for data preparation, training, and export</a><br>
## Getting Help
...
...
@@ -98,78 +99,107 @@ tensorflow/models GitHub
[issue tracker](https://github.com/tensorflow/models/issues), prefixing the
issue name with "object_detection".
Please check [FAQ](g3doc/faq.md) for frequently asked questions before
reporting an issue.
Please check [FAQ](g3doc/faq.md) for frequently asked questions before reporting
an issue.
## Release information
### June 17th, 2020
We have released [Context R-CNN](https://arxiv.org/abs/1912.03538), a model that
uses attention to incorporate contextual information images (e.g. from
temporally nearby frames taken by a static camera) in order to improve accuracy.
Importantly, these contextual images need not be labeled.
* When applied to a challenging wildlife detection dataset ([Snapshot Serengeti](http://lila.science/datasets/snapshot-serengeti)),
Context R-CNN with context from up to a month of images outperforms a
single-frame baseline by 17.9% mAP, and outperforms S3D (a 3d convolution
based baseline) by 11.2% mAP.
* Context R-CNN leverages temporal context from the unlabeled frames of a
novel camera deployment to improve performance at that camera, boosting
model generalizeability.
Read about Context R-CNN on the Google AI blog [here](https://ai.googleblog.com/2020/06/leveraging-temporal-context-for-object.html).
We have provided code for generating data with associated context
[here](g3doc/context_rcnn.md), and a sample config for a Context R-CNN
model [here](samples/configs/context_rcnn_resnet101_snapshot_serengeti_sync.config).
Snapshot Serengeti-trained Faster R-CNN and Context R-CNN models can be found in
the [model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#snapshot-serengeti-camera-trap-trained-models).
<b>Thanks to contributors</b>: Alina Kuznetsova, Rodrigo Benenson
### Feb 11, 2019
We have released detection models trained on the Open Images Dataset V4
in our detection model zoo, including
We have released detection models trained on the Open Images Dataset V4 in our
detection model zoo, including
* Faster R-CNN detector with Inception Resnet V2 feature extractor
* SSD detector with MobileNet V2 feature extractor
...
...
@@ -198,7 +229,8 @@ in our detection model zoo, including
### Sep 17, 2018
We have released Faster R-CNN detectors with ResNet-50 / ResNet-101 feature
extractors trained on the [iNaturalist Species Detection Dataset](https://github.com/visipedia/inat_comp/blob/master/2017/README.md#bounding-boxes).
extractors trained on the
[iNaturalist Species Detection Dataset](https://github.com/visipedia/inat_comp/blob/master/2017/README.md#bounding-boxes).
The models are trained on the training split of the iNaturalist data for 4M
iterations, they achieve 55% and 58% mean AP@.5 over 2854 classes respectively.
For more details please refer to this [paper](https://arxiv.org/abs/1707.06642).
...
...
@@ -210,42 +242,59 @@ For more details please refer to this [paper](https://arxiv.org/abs/1707.06642).
There are many new updates in this release, extending the functionality and
capability of the API:
* Moving from slim-based training to [Estimator](https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator)-based
training.
* Support for [RetinaNet](https://arxiv.org/abs/1708.02002), and a [MobileNet](https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html)
adaptation of RetinaNet.
* A novel SSD-based architecture called the [Pooling Pyramid Network](https://arxiv.org/abs/1807.03284)(PPN).
See also our [expanded announcement blogpost](https://ai.googleblog.com/2018/07/accelerated-training-and-inference-with.html) and accompanying tutorial at the [TensorFlow blog](https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193).
are out. Check out our short tutorial on data preparation and running evaluation
[here](g3doc/challenge_evaluation.md)!
<b>Thanks to contributors</b>: Alina Kuznetsova
### June 5, 2018
We have released the implementation of evaluation metrics for both tracks of the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) as a part of the Object Detection API - see the [evaluation protocols](g3doc/evaluation_protocols.md) for more details.
Additionally, we have released a tool for hierarchical labels expansion for the Open Images Challenge: check out [oid_hierarchical_labels_expansion.py](dataset_tools/oid_hierarchical_labels_expansion.py).
We have released the implementation of evaluation metrics for both tracks of the
<b>Thanks to contributors</b>: Alina Kuznetsova, Vittorio Ferrari, Jasper Uijlings
<b>Thanks to contributors</b>: Alina Kuznetsova, Vittorio Ferrari, Jasper
Uijlings
### April 30, 2018
We have released a Faster R-CNN detector with ResNet-101 feature extractor trained on [AVA](https://research.google.com/ava/) v2.1.
Compared with other commonly used object detectors, it changes the action classification loss function to per-class Sigmoid loss to handle boxes with multiple labels.
The model is trained on the training split of AVA v2.1 for 1.5M iterations, it achieves mean AP of 11.25% over 60 classes on the validation split of AVA v2.1.
We have released a Faster R-CNN detector with ResNet-101 feature extractor
trained on [AVA](https://research.google.com/ava/) v2.1. Compared with other
commonly used object detectors, it changes the action classification loss
function to per-class Sigmoid loss to handle boxes with multiple labels. The
model is trained on the training split of AVA v2.1 for 1.5M iterations, it
achieves mean AP of 11.25% over 60 classes on the validation split of AVA v2.1.
For more details please refer to this [paper](https://arxiv.org/abs/1705.08421).
<b>Thanks to contributors</b>: Chen Sun, David Ross
...
...
@@ -255,70 +304,80 @@ For more details please refer to this [paper](https://arxiv.org/abs/1705.08421).
Supercharge your mobile phones with the next generation mobile object detector!
We are adding support for MobileNet V2 with SSDLite presented in
[MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381).
This model is 35% faster than Mobilenet V1 SSD on a Google Pixel phone CPU (200ms vs. 270ms) at the same accuracy.
Along with the model definition, we are also releasing a model checkpoint trained on the COCO dataset.
This model is 35% faster than Mobilenet V1 SSD on a Google Pixel phone CPU
(200ms vs. 270ms) at the same accuracy. Along with the model definition, we are
also releasing a model checkpoint trained on the COCO dataset.
<b>Thanks to contributors</b>: Menglong Zhu, Mark Sandler, Zhichao Lu, Vivek Rathod, Jonathan Huang
<b>Thanks to contributors</b>: Menglong Zhu, Mark Sandler, Zhichao Lu, Vivek
Rathod, Jonathan Huang
### February 9, 2018
We now support instance segmentation!! In this API update we support a number of instance segmentation models similar to those discussed in the [Mask R-CNN paper](https://arxiv.org/abs/1703.06870). For further details refer to
[our slides](http://presentations.cocodataset.org/Places17-GMRI.pdf) from the 2017 Coco + Places Workshop.
Refer to the section on [Running an Instance Segmentation Model](g3doc/instance_segmentation.md) for instructions on how to configure a model
that predicts masks in addition to object bounding boxes.
We now support instance segmentation!! In this API update we support a number of
instance segmentation models similar to those discussed in the
[Mask R-CNN paper](https://arxiv.org/abs/1703.06870). For further details refer
to [our slides](http://presentations.cocodataset.org/Places17-GMRI.pdf) from the
2017 Coco + Places Workshop. Refer to the section on
[Running an Instance Segmentation Model](g3doc/instance_segmentation.md) for
instructions on how to configure a model that predicts masks in addition to
object bounding boxes.
<b>Thanks to contributors</b>: Alireza Fathi, Zhichao Lu, Vivek Rathod, Ronny Votel, Jonathan Huang
<b>Thanks to contributors</b>: Alireza Fathi, Zhichao Lu, Vivek Rathod, Ronny
Votel, Jonathan Huang
### November 17, 2017
As a part of the Open Images V3 release we have released:
* An implementation of the Open Images evaluation metric and the [protocol](g3doc/evaluation_protocols.md#open-images).
* Additional tools to separate inference of detection and evaluation (see [this tutorial](g3doc/oid_inference_and_evaluation.md)).
* A new detection model trained on the Open Images V2 data release (see [Open Images model](g3doc/detection_model_zoo.md#open-images-models)).
* An implementation of the Open Images evaluation metric and the