Commit 5a2cf36f authored by Kaushik Shivakumar

Merge remote-tracking branch 'upstream/master' into newavarecords

parents 258ddfc3 a829e648
......@@ -22,7 +22,7 @@ install_requires = [
'pandas >= 0.24.2',
'numpy >= 1.16.1',
'scipy >= 1.2.2',
'tensorflow >= 2.0.0b1',
'tensorflow >= 2.2.0',
'tf_slim >= 1.1',
'tensorflow_probability >= 0.9.0',
]
......
# Contributing to the Tensorflow Object Detection API
# Contributing to the TensorFlow Object Detection API
Patches to Tensorflow Object Detection API are welcome!
Patches to TensorFlow Object Detection API are welcome!
We require contributors to fill out either the individual or corporate
Contributor License Agreement (CLA).
......@@ -9,5 +9,5 @@ Contributor License Agreement (CLA).
* If you work for a company that wants to allow you to contribute your work, then you'll need to sign a [corporate CLA](http://code.google.com/legal/corporate-cla-v1.0.html).
Please follow the
[Tensorflow contributing guidelines](https://github.com/tensorflow/tensorflow/blob/master/CONTRIBUTING.md)
[TensorFlow contributing guidelines](https://github.com/tensorflow/tensorflow/blob/master/CONTRIBUTING.md)
when submitting pull requests.
![TensorFlow Requirement: 1.15](https://img.shields.io/badge/TensorFlow%20Requirement-1.15-brightgreen)
![TensorFlow 2 Not Supported](https://img.shields.io/badge/TensorFlow%202%20Not%20Supported-%E2%9C%95-red.svg)
# Tensorflow Object Detection API
# TensorFlow Object Detection API
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
Creating accurate machine learning models capable of localizing and identifying
multiple objects in a single image remains a core challenge in computer vision.
......@@ -11,7 +11,7 @@ models. At Google we’ve certainly found this codebase to be useful for our
computer vision needs, and we hope that you will as well. <p align="center">
<img src="g3doc/img/kites_detections_output.jpg" width=676 height=450> </p>
Contributions to the codebase are welcome and we would love to hear back from
you if you find this API useful. Finally if you use the Tensorflow Object
you if you find this API useful. Finally if you use the TensorFlow Object
Detection API for a research publication, please consider citing:
```
......@@ -26,91 +26,110 @@ Song Y, Guadarrama S, Murphy K, CVPR 2017
<img src="g3doc/img/tf-od-api-logo.png" width=140 height=195>
</p>
## Maintainers
## Support for TensorFlow 2 and 1
The TensorFlow Object Detection API supports both TensorFlow 2 (TF2) and
TensorFlow 1 (TF1). A majority of the modules in the library are both TF1 and
TF2 compatible. In cases where they are not, we provide two versions.
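As a minimal sketch (the branch bodies here are placeholders, not library code), version-specific modules are gated with the `tf_version` utility, the same pattern visible in the `model_builder.py` changes further down this diff:

```python
# A minimal sketch of the TF1/TF2 gating used throughout the library.
# The branch bodies are placeholders, not actual library code.
from object_detection.utils import tf_version

if tf_version.is_tf2():
    # Keras-based (TF2) feature extractors and predictors are imported here.
    pass
elif tf_version.is_tf1():
    # Slim-based (TF1) feature extractors and predictors are imported here.
    pass
```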
Name | GitHub
-------------- | ---------------------------------------------
Jonathan Huang | [jch1](https://github.com/jch1)
Vivek Rathod | [tombstone](https://github.com/tombstone)
Ronny Votel | [ronnyvotel](https://github.com/ronnyvotel)
Derek Chow | [derekjchow](https://github.com/derekjchow)
Chen Sun | [jesu9](https://github.com/jesu9)
Menglong Zhu | [dreamdragon](https://github.com/dreamdragon)
Alireza Fathi | [afathi3](https://github.com/afathi3)
Zhichao Lu | [pkulzc](https://github.com/pkulzc)
## Table of contents
Setup:
* <a href='g3doc/installation.md'>Installation</a><br>
Quick Start:
* <a href='object_detection_tutorial.ipynb'>
Quick Start: Jupyter notebook for off-the-shelf inference</a><br>
* <a href="g3doc/running_pets.md">Quick Start: Training a pet detector</a><br>
Customizing a Pipeline:
* <a href='g3doc/configuring_jobs.md'>
Configuring an object detection pipeline</a><br>
* <a href='g3doc/preparing_inputs.md'>Preparing inputs</a><br>
Running:
* <a href='g3doc/running_locally.md'>Running locally</a><br>
* <a href='g3doc/running_on_cloud.md'>Running on the cloud</a><br>
Extras:
* <a href='g3doc/detection_model_zoo.md'>Tensorflow detection model zoo</a><br>
* <a href='g3doc/exporting_models.md'>
Exporting a trained model for inference</a><br>
* <a href='g3doc/tpu_exporters.md'>
Exporting a trained model for TPU inference</a><br>
* <a href='g3doc/defining_your_own_model.md'>
Defining your own model architecture</a><br>
* <a href='g3doc/using_your_own_dataset.md'>
Bringing in your own dataset</a><br>
* <a href='g3doc/evaluation_protocols.md'>
Supported object detection evaluation protocols</a><br>
* <a href='g3doc/oid_inference_and_evaluation.md'>
Inference and evaluation on the Open Images dataset</a><br>
* <a href='g3doc/instance_segmentation.md'>
Run an instance segmentation model</a><br>
* <a href='g3doc/challenge_evaluation.md'>
Run the evaluation for the Open Images Challenge 2018/2019</a><br>
* <a href='g3doc/tpu_compatibility.md'>
TPU compatible detection pipelines</a><br>
* <a href='g3doc/running_on_mobile_tensorflowlite.md'>
Running object detection on mobile devices with TensorFlow Lite</a><br>
* <a href='g3doc/context_rcnn.md'>
Context R-CNN documentation for data preparation, training, and export</a><br>
Although we will continue to maintain the TF1 models and provide support, we
encourage users to try the Object Detection API with TF2 for the following
reasons:
## Getting Help
* We provide new architectures supported in TF2 only and we will continue to
develop in TF2 going forward.
To get help with issues you may encounter using the Tensorflow Object Detection
API, create a new question on [StackOverflow](https://stackoverflow.com/) with
the tags "tensorflow" and "object-detection".
* The popular models we ported from TF1 to TF2 achieve the same performance.
Please report bugs (actually broken code, not usage questions) to the
tensorflow/models GitHub
[issue tracker](https://github.com/tensorflow/models/issues), prefixing the
issue name with "object_detection".
* A single training and evaluation binary now supports both GPU and TPU
distribution strategies, making it possible to train models with synchronous
SGD by default.
* Eager execution with new binaries makes debugging easy!
Finally, if you are an existing user of the Object Detection API, we have
retained the same config language you are familiar with and ensured that the
TF2 training/eval binary takes the same arguments as our TF1 binaries.
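In practice, loading a pipeline config and building a model looks the same under both versions; here is a minimal sketch (the config path is hypothetical, and the same pattern appears in the eager few-shot colab bundled with this commit):

```python
# A minimal sketch, assuming the Object Detection API is installed and a
# pipeline config file is available locally (this path is hypothetical).
from object_detection.builders import model_builder
from object_detection.utils import config_util

configs = config_util.get_configs_from_pipeline_file(
    'configs/tf2/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.config')
model_config = configs['model']

# The same proto-based config language drives both TF1 and TF2; model_builder
# selects the implementation matching the installed TensorFlow version.
detection_model = model_builder.build(model_config=model_config,
                                      is_training=True)
```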
Note: The models we provide in [TF2 Zoo](g3doc/tf2_detection_zoo.md) and
[TF1 Zoo](g3doc/tf1_detection_zoo.md) are specific to the TensorFlow major
version and are not interoperable.
Please select one of the links below for TensorFlow version-specific
documentation of the Object Detection API:
Please check [FAQ](g3doc/faq.md) for frequently asked questions before reporting
an issue.
<!-- mdlint off(WHITESPACE_LINE_LENGTH) -->
### TensorFlow 2.x
* <a href='g3doc/tf2.md'>
Object Detection API TensorFlow 2</a><br>
* <a href='g3doc/tf2_detection_zoo.md'>
TensorFlow 2 Model Zoo</a><br>
## Release information
### June 17th, 2020
### TensorFlow 1.x
* <a href='g3doc/tf1.md'>
Object Detection API TensorFlow 1</a><br>
* <a href='g3doc/tf1_detection_zoo.md'>
TensorFlow 1 Model Zoo</a><br>
<!-- mdlint on -->
## What's New
### TensorFlow 2 Support
We are happy to announce that the TF OD API officially supports TF2! Our release
includes:
* New binaries for train/eval/export that are designed to run in eager mode.
* A suite of TF2 compatible (Keras-based) models; this includes migrations of
our most popular TF1.x models (e.g., SSD with MobileNet, RetinaNet,
Faster R-CNN, Mask R-CNN), as well as a few new architectures for which we
will only maintain TF2 implementations:
1. CenterNet - a simple and effective anchor-free architecture based on
the recent [Objects as Points](https://arxiv.org/abs/1904.07850) paper by
Zhou et al.
2. [EfficientDet](https://arxiv.org/abs/1911.09070) - a recent family of
SOTA models discovered with the help of Neural Architecture Search.
* COCO pre-trained weights for all of the models, provided as TF2-style
object-based checkpoints (see the restore sketch after this list).
* Access to [Distribution Strategies](https://www.tensorflow.org/guide/distributed_training)
for distributed training --- our models are designed to be trainable using
sync multi-GPU and TPU platforms.
* Colabs demo’ing eager mode training and inference.
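As referenced in the checkpoints bullet above, here is a minimal sketch of restoring a TF2 object-based checkpoint (the checkpoint path is hypothetical, and `detection_model` is assumed to have been built with `model_builder`; the same pattern appears in the colab tutorials in this commit):

```python
import tensorflow as tf

# A minimal sketch, assuming `detection_model` was built via model_builder and
# a TF2 object-based checkpoint was downloaded locally (path is hypothetical).
ckpt = tf.train.Checkpoint(model=detection_model)
ckpt.restore('checkpoint/ckpt-0').expect_partial()

# Run a dummy image through the model once so that all variables get created.
image, shapes = detection_model.preprocess(tf.zeros([1, 640, 640, 3]))
prediction_dict = detection_model.predict(image, shapes)
_ = detection_model.postprocess(prediction_dict, shapes)
```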
See our release blogpost [here](https://blog.tensorflow.org/2020/07/tensorflow-2-meets-object-detection-api.html).
If you are an existing user of the TF OD API using TF 1.x, don’t worry, we’ve
got you covered.
**Thanks to contributors**: Akhil Chinnakotla, Allen Lavoie, Anirudh Vegesana,
Anjali Sridhar, Austin Myers, Dan Kondratyuk, David Ross, Derek Chow, Jaeyoun
Kim, Jing Li, Jonathan Huang, Jordi Pont-Tuset, Karmel Allison, Kathy Ruan,
Kaushik Shivakumar, Lu He, Mingxing Tan, Pengchong Jin, Ronny Votel, Sara Beery,
Sergi Caelles Prat, Shan Yang, Sudheendra Vijayanarasimhan, Tina Tian, Tomer
Kaftan, Vighnesh Birodkar, Vishnu Banna, Vivek Rathod, Yanhui Liang, Yiming Shi,
Yixin Shi, Yu-hui Chen, Zhichao Lu.
### MobileDet GPU
We have released SSDLite with MobileDet GPU backbone, which achieves 17%
higher mAP than MobileNetV2 SSDLite (27.5 mAP vs 23.5 mAP) on an NVIDIA Jetson
Xavier at comparable latency (3.2ms vs 3.3ms).
Along with the model definition, we are also releasing model checkpoints trained
on the COCO dataset.
<b>Thanks to contributors</b>: Yongzhe Wang, Bo Chen, Hanxiao Liu, Le An
(NVIDIA), Yu-Te Cheng (NVIDIA), Oliver Knieps (NVIDIA), and Josh Park (NVIDIA).
### Context R-CNN
We have released [Context R-CNN](https://arxiv.org/abs/1912.03538), a model that
uses attention to incorporate contextual information from images (e.g., from
temporally nearby frames taken by a static camera) in order to improve accuracy.
Importantly, these contextual images need not be labeled.
* When applied to a challenging wildlife detection dataset ([Snapshot Serengeti](http://lila.science/datasets/snapshot-serengeti)),
* When applied to a challenging wildlife detection dataset
([Snapshot Serengeti](http://lila.science/datasets/snapshot-serengeti)),
Context R-CNN with context from up to a month of images outperforms a
single-frame baseline by 17.9% mAP, and outperforms S3D (a 3d convolution
based baseline) by 11.2% mAP.
......@@ -118,282 +137,48 @@ Importantly, these contextual images need not be labeled.
novel camera deployment to improve performance at that camera, boosting
model generalizability.
Read about Context R-CNN on the Google AI blog [here](https://ai.googleblog.com/2020/06/leveraging-temporal-context-for-object.html).
Read about Context R-CNN on the Google AI blog
[here](https://ai.googleblog.com/2020/06/leveraging-temporal-context-for-object.html).
We have provided code for generating data with associated context
[here](g3doc/context_rcnn.md), and a sample config for a Context R-CNN
model [here](samples/configs/context_rcnn_resnet101_snapshot_serengeti_sync.config).
[here](g3doc/context_rcnn.md), and a sample config for a Context R-CNN model
[here](samples/configs/context_rcnn_resnet101_snapshot_serengeti_sync.config).
Snapshot Serengeti-trained Faster R-CNN and Context R-CNN models can be found in
the [model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#snapshot-serengeti-camera-trap-trained-models).
the
[model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md#snapshot-serengeti-camera-trap-trained-models).
A colab demonstrating Context R-CNN is provided
[here](colab_tutorials/context_rcnn_tutorial.ipynb).
<b>Thanks to contributors</b>: Sara Beery, Jonathan Huang, Guanhang Wu, Vivek
Rathod, Ronny Votel, Zhichao Lu, David Ross, Pietro Perona, Tanya Birch, and
the Wildlife Insights AI Team.
### May 19th, 2020
We have released [MobileDets](https://arxiv.org/abs/2004.14525), a set of
high-performance models for mobile CPUs, DSPs and EdgeTPUs.
* MobileDets outperform MobileNetV3+SSDLite by 1.7 mAP at comparable mobile
CPU inference latencies. MobileDets also outperform MobileNetV2+SSDLite by
1.9 mAP on mobile CPUs, 3.7 mAP on EdgeTPUs and 3.4 mAP on DSPs while
running equally fast. MobileDets also offer up to 2x speedup over MnasFPN on
EdgeTPUs and DSPs.
For each of the three hardware platforms, we have released the model
definition, model checkpoints trained on the COCO14 dataset, and converted
TFLite models in fp32 and/or uint8.
<b>Thanks to contributors</b>: Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin
Akin, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen,
Quoc Le, Zhichao Lu.
### May 7th, 2020
We have released a mobile model with the
[MnasFPN head](https://arxiv.org/abs/1912.01106).
* MnasFPN with MobileNet-V2 backbone is the most accurate (26.6 mAP at 183ms
on Pixel 1) mobile detection model we have released to date. With
depth-multiplier, MnasFPN with MobileNet-V2 backbone is 1.8 mAP higher than
MobileNet-V3-Large with SSDLite (23.8 mAP vs 22.0 mAP) at similar latency
(120ms) on Pixel 1.
We have released model definition, model checkpoints trained on the COCO14
dataset and a converted TFLite model.
<b>Thanks to contributors</b>: Bo Chen, Golnaz Ghiasi, Hanxiao Liu, Tsung-Yi
Lin, Dmitry Kalenichenko, Hartwig Adam, Quoc Le, Zhichao Lu, Jonathan Huang, Hao
Xu.
### Nov 13th, 2019
We have released MobileNetEdgeTPU SSDLite model.
* SSDLite with MobileNetEdgeTPU backbone, which achieves 10% higher mAP than
MobileNetV2 SSDLite (24.3 mAP vs 22 mAP) on a Google Pixel 4 at comparable
latency (6.6ms vs 6.8ms).
Along with the model definition, we are also releasing model checkpoints trained
on the COCO dataset.
<b>Thanks to contributors</b>: Yunyang Xiong, Bo Chen, Suyog Gupta, Hanxiao Liu,
Gabriel Bender, Mingxing Tan, Berkin Akin, Zhichao Lu, Quoc Le
### Oct 15th, 2019
We have released two MobileNet V3 SSDLite models (presented in
[Searching for MobileNetV3](https://arxiv.org/abs/1905.02244)).
* SSDLite with MobileNet-V3-Large backbone, which is 27% faster than Mobilenet
V2 SSDLite (119ms vs 162ms) on a Google Pixel phone CPU at the same mAP.
* SSDLite with MobileNet-V3-Small backbone, which is 37% faster than MnasNet
SSDLite reduced with depth-multiplier (43ms vs 68ms) at the same mAP.
Along with the model definition, we are also releasing model checkpoints trained
on the COCO dataset.
<b>Thanks to contributors</b>: Bo Chen, Zhichao Lu, Vivek Rathod, Jonathan Huang
### July 1st, 2019
We have released an updated set of utils and an updated
[tutorial](g3doc/challenge_evaluation.md) for all three tracks of the
[Open Images Challenge 2019](https://storage.googleapis.com/openimages/web/challenge2019.html)!
Rathod, Ronny Votel, Zhichao Lu, David Ross, Pietro Perona, Tanya Birch, and the
Wildlife Insights AI Team.
The Instance Segmentation metric for
[Open Images V5](https://storage.googleapis.com/openimages/web/index.html) and
[Challenge 2019](https://storage.googleapis.com/openimages/web/challenge2019.html)
is part of this release. Check out
[the metric description](https://storage.googleapis.com/openimages/web/evaluation.html#instance_segmentation_eval)
on the Open Images website.
## Release Notes
See [notes](g3doc/release_notes.md) for all past releases.
<b>Thanks to contributors</b>: Alina Kuznetsova, Rodrigo Benenson
### Feb 11, 2019
We have released detection models trained on the Open Images Dataset V4 in our
detection model zoo, including
* Faster R-CNN detector with Inception Resnet V2 feature extractor
* SSD detector with MobileNet V2 feature extractor
* SSD detector with ResNet 101 FPN feature extractor (aka RetinaNet-101)
<b>Thanks to contributors</b>: Alina Kuznetsova, Yinxiao Li
### Sep 17, 2018
We have released Faster R-CNN detectors with ResNet-50 / ResNet-101 feature
extractors trained on the
[iNaturalist Species Detection Dataset](https://github.com/visipedia/inat_comp/blob/master/2017/README.md#bounding-boxes).
The models are trained on the training split of the iNaturalist data for 4M
iterations; they achieve 55% and 58% mean AP@.5 over 2854 classes, respectively.
For more details please refer to this [paper](https://arxiv.org/abs/1707.06642).
<b>Thanks to contributors</b>: Chen Sun
### July 13, 2018
There are many new updates in this release, extending the functionality and
capability of the API:
* Moving from slim-based training to
[Estimator](https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator)-based
training.
* Support for [RetinaNet](https://arxiv.org/abs/1708.02002), and a
[MobileNet](https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html)
adaptation of RetinaNet.
* A novel SSD-based architecture called the
[Pooling Pyramid Network](https://arxiv.org/abs/1807.03284) (PPN).
* Releasing several [TPU](https://cloud.google.com/tpu/)-compatible models.
These can be found in the `samples/configs/` directory with a comment in the
pipeline configuration files indicating TPU compatibility.
* Support for quantized training.
* Updated documentation for new binaries, Cloud training, and
[Tensorflow Lite](https://www.tensorflow.org/mobile/tflite/).
See also our
[expanded announcement blogpost](https://ai.googleblog.com/2018/07/accelerated-training-and-inference-with.html)
and accompanying tutorial at the
[TensorFlow blog](https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193).
<b>Thanks to contributors</b>: Sara Robinson, Aakanksha Chowdhery, Derek Chow,
Pengchong Jin, Jonathan Huang, Vivek Rathod, Zhichao Lu, Ronny Votel
### June 25, 2018
Additional evaluation tools for the
[Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html)
are out. Check out our short tutorial on data preparation and running evaluation
[here](g3doc/challenge_evaluation.md)!
<b>Thanks to contributors</b>: Alina Kuznetsova
### June 5, 2018
We have released the implementation of evaluation metrics for both tracks of the
[Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html)
as a part of the Object Detection API - see the
[evaluation protocols](g3doc/evaluation_protocols.md) for more details.
Additionally, we have released a tool for hierarchical labels expansion for the
Open Images Challenge: check out
[oid_hierarchical_labels_expansion.py](dataset_tools/oid_hierarchical_labels_expansion.py).
<b>Thanks to contributors</b>: Alina Kuznetsova, Vittorio Ferrari, Jasper
Uijlings
### April 30, 2018
We have released a Faster R-CNN detector with ResNet-101 feature extractor
trained on [AVA](https://research.google.com/ava/) v2.1. Compared with other
commonly used object detectors, it changes the action classification loss
function to per-class Sigmoid loss to handle boxes with multiple labels. The
model is trained on the training split of AVA v2.1 for 1.5M iterations and
achieves a mean AP of 11.25% over 60 classes on the validation split of AVA v2.1.
For more details please refer to this [paper](https://arxiv.org/abs/1705.08421).
<b>Thanks to contributors</b>: Chen Sun, David Ross
### April 2, 2018
Supercharge your mobile phones with the next generation mobile object detector!
We are adding support for MobileNet V2 with SSDLite presented in
[MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381).
This model is 35% faster than Mobilenet V1 SSD on a Google Pixel phone CPU
(200ms vs. 270ms) at the same accuracy. Along with the model definition, we are
also releasing a model checkpoint trained on the COCO dataset.
<b>Thanks to contributors</b>: Menglong Zhu, Mark Sandler, Zhichao Lu, Vivek
Rathod, Jonathan Huang
### February 9, 2018
We now support instance segmentation!! In this API update we support a number of
instance segmentation models similar to those discussed in the
[Mask R-CNN paper](https://arxiv.org/abs/1703.06870). For further details refer
to [our slides](http://presentations.cocodataset.org/Places17-GMRI.pdf) from the
2017 Coco + Places Workshop. Refer to the section on
[Running an Instance Segmentation Model](g3doc/instance_segmentation.md) for
instructions on how to configure a model that predicts masks in addition to
object bounding boxes.
<b>Thanks to contributors</b>: Alireza Fathi, Zhichao Lu, Vivek Rathod, Ronny
Votel, Jonathan Huang
### November 17, 2017
As a part of the Open Images V3 release we have released:
* An implementation of the Open Images evaluation metric and the
[protocol](g3doc/evaluation_protocols.md#open-images).
* Additional tools to separate inference of detection and evaluation (see
[this tutorial](g3doc/oid_inference_and_evaluation.md)).
* A new detection model trained on the Open Images V2 data release (see
[Open Images model](g3doc/detection_model_zoo.md#open-images-models)).
See more information on the
[Open Images website](https://github.com/openimages/dataset)!
<b>Thanks to contributors</b>: Stefan Popov, Alina Kuznetsova
### November 6, 2017
We have re-released faster versions of our (pre-trained) models in the
<a href='g3doc/detection_model_zoo.md'>model zoo</a>. In addition to what was
available before, we are also adding Faster R-CNN models trained on COCO with
Inception V2 and Resnet-50 feature extractors, as well as a Faster R-CNN with
Resnet-101 model trained on the KITTI dataset.
<b>Thanks to contributors</b>: Jonathan Huang, Vivek Rathod, Derek Chow, Tal
Remez, Chen Sun.
### October 31, 2017
We have released a new state-of-the-art model for object detection using the
Faster-RCNN with the
[NASNet-A image featurization](https://arxiv.org/abs/1707.07012). This model
achieves mAP of 43.1% on the test-dev validation dataset for COCO, improving on
the best available model in the zoo by 6% in terms of absolute mAP.
<b>Thanks to contributors</b>: Barret Zoph, Vijay Vasudevan, Jonathon Shlens,
Quoc Le
### August 11, 2017
## Getting Help
We have released an update to the
[Android Detect demo](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android)
which will now run models trained using the Tensorflow Object Detection API on
an Android device. By default, it currently runs a frozen SSD w/Mobilenet
detector trained on COCO, but we encourage you to try out other detection
models!
To get help with issues you may encounter using the TensorFlow Object Detection
API, create a new question on [StackOverflow](https://stackoverflow.com/) with
the tags "tensorflow" and "object-detection".
<b>Thanks to contributors</b>: Jonathan Huang, Andrew Harp
Please report bugs (actually broken code, not usage questions) to the
tensorflow/models GitHub
[issue tracker](https://github.com/tensorflow/models/issues), prefixing the
issue name with "object_detection".
### June 15, 2017
Please check the [FAQ](g3doc/faq.md) for frequently asked questions before
reporting an issue.
In addition to our base Tensorflow detection model definitions, this release
includes:
## Maintainers
* A selection of trainable detection models, including:
* Single Shot Multibox Detector (SSD) with MobileNet,
* SSD with Inception V2,
* Region-Based Fully Convolutional Networks (R-FCN) with Resnet 101,
* Faster RCNN with Resnet 101,
* Faster RCNN with Inception Resnet v2
* Frozen weights (trained on the COCO dataset) for each of the above models to
be used for out-of-the-box inference purposes.
* A [Jupyter notebook](colab_tutorials/object_detection_tutorial.ipynb) for
performing out-of-the-box inference with one of our released models
* Convenient [local training](g3doc/running_locally.md) scripts as well as
distributed training and evaluation pipelines via
[Google Cloud](g3doc/running_on_cloud.md).
<b>Thanks to contributors</b>: Jonathan Huang, Vivek Rathod, Derek Chow, Chen
Sun, Menglong Zhu, Matthew Tang, Anoop Korattikara, Alireza Fathi, Ian Fischer,
Zbigniew Wojna, Yang Song, Sergio Guadarrama, Jasper Uijlings, Viacheslav
Kovalevskyi, Kevin Murphy
* Jonathan Huang ([@GitHub jch1](https://github.com/jch1))
* Vivek Rathod ([@GitHub tombstone](https://github.com/tombstone))
* Vighnesh Birodkar ([@GitHub vighneshbirodkar](https://github.com/vighneshbirodkar))
* Austin Myers ([@GitHub austin-myers](https://github.com/austin-myers))
* Zhichao Lu ([@GitHub pkulzc](https://github.com/pkulzc))
* Ronny Votel ([@GitHub ronnyvotel](https://github.com/ronnyvotel))
* Yu-hui Chen ([@GitHub yuhuichen1015](https://github.com/yuhuichen1015))
* Derek Chow ([@GitHub derekjchow](https://github.com/derekjchow))
......@@ -17,9 +17,8 @@
"""Tests for box_predictor_builder."""
import unittest
import mock
from unittest import mock # pylint: disable=g-importing-member
import tensorflow.compat.v1 as tf
from google.protobuf import text_format
from object_detection.builders import box_predictor_builder
from object_detection.builders import hyperparams_builder
......
......@@ -14,7 +14,7 @@
# ==============================================================================
"""Tests for graph_rewriter_builder."""
import unittest
import mock
from unittest import mock # pylint: disable=g-importing-member
import tensorflow.compat.v1 as tf
import tf_slim as slim
......
......@@ -16,6 +16,7 @@
"""A function to build a DetectionModel from configuration."""
import functools
import sys
from object_detection.builders import anchor_generator_builder
from object_detection.builders import box_coder_builder
from object_detection.builders import box_predictor_builder
......@@ -38,6 +39,7 @@ from object_detection.protos import losses_pb2
from object_detection.protos import model_pb2
from object_detection.utils import label_map_util
from object_detection.utils import ops
from object_detection.utils import spatial_transform_ops as spatial_ops
from object_detection.utils import tf_version
## Feature Extractors for TF
......@@ -47,6 +49,7 @@ from object_detection.utils import tf_version
# pylint: disable=g-import-not-at-top
if tf_version.is_tf2():
from object_detection.models import center_net_hourglass_feature_extractor
from object_detection.models import center_net_mobilenet_v2_feature_extractor
from object_detection.models import center_net_resnet_feature_extractor
from object_detection.models import center_net_resnet_v1_fpn_feature_extractor
from object_detection.models import faster_rcnn_inception_resnet_v2_keras_feature_extractor as frcnn_inc_res_keras
......@@ -58,6 +61,8 @@ if tf_version.is_tf2():
from object_detection.models.ssd_mobilenet_v2_fpn_keras_feature_extractor import SSDMobileNetV2FpnKerasFeatureExtractor
from object_detection.models.ssd_mobilenet_v2_keras_feature_extractor import SSDMobileNetV2KerasFeatureExtractor
from object_detection.predictors import rfcn_keras_box_predictor
if sys.version_info[0] >= 3:
from object_detection.models import ssd_efficientnet_bifpn_feature_extractor as ssd_efficientnet_bifpn
if tf_version.is_tf1():
from object_detection.models import faster_rcnn_inception_resnet_v2_feature_extractor as frcnn_inc_res
......@@ -99,6 +104,22 @@ if tf_version.is_tf2():
ssd_resnet_v1_fpn_keras.SSDResNet101V1FpnKerasFeatureExtractor,
'ssd_resnet152_v1_fpn_keras':
ssd_resnet_v1_fpn_keras.SSDResNet152V1FpnKerasFeatureExtractor,
'ssd_efficientnet-b0_bifpn_keras':
ssd_efficientnet_bifpn.SSDEfficientNetB0BiFPNKerasFeatureExtractor,
'ssd_efficientnet-b1_bifpn_keras':
ssd_efficientnet_bifpn.SSDEfficientNetB1BiFPNKerasFeatureExtractor,
'ssd_efficientnet-b2_bifpn_keras':
ssd_efficientnet_bifpn.SSDEfficientNetB2BiFPNKerasFeatureExtractor,
'ssd_efficientnet-b3_bifpn_keras':
ssd_efficientnet_bifpn.SSDEfficientNetB3BiFPNKerasFeatureExtractor,
'ssd_efficientnet-b4_bifpn_keras':
ssd_efficientnet_bifpn.SSDEfficientNetB4BiFPNKerasFeatureExtractor,
'ssd_efficientnet-b5_bifpn_keras':
ssd_efficientnet_bifpn.SSDEfficientNetB5BiFPNKerasFeatureExtractor,
'ssd_efficientnet-b6_bifpn_keras':
ssd_efficientnet_bifpn.SSDEfficientNetB6BiFPNKerasFeatureExtractor,
'ssd_efficientnet-b7_bifpn_keras':
ssd_efficientnet_bifpn.SSDEfficientNetB7BiFPNKerasFeatureExtractor,
}
FASTER_RCNN_KERAS_FEATURE_EXTRACTOR_CLASS_MAP = {
......@@ -110,22 +131,29 @@ if tf_version.is_tf2():
frcnn_resnet_keras.FasterRCNNResnet152KerasFeatureExtractor,
'faster_rcnn_inception_resnet_v2_keras':
frcnn_inc_res_keras.FasterRCNNInceptionResnetV2KerasFeatureExtractor,
'fasret_rcnn_resnet50_fpn_keras':
'faster_rcnn_resnet50_fpn_keras':
frcnn_resnet_fpn_keras.FasterRCNNResnet50FpnKerasFeatureExtractor,
'fasret_rcnn_resnet101_fpn_keras':
'faster_rcnn_resnet101_fpn_keras':
frcnn_resnet_fpn_keras.FasterRCNNResnet101FpnKerasFeatureExtractor,
'fasret_rcnn_resnet152_fpn_keras':
'faster_rcnn_resnet152_fpn_keras':
frcnn_resnet_fpn_keras.FasterRCNNResnet152FpnKerasFeatureExtractor,
}
CENTER_NET_EXTRACTOR_FUNCTION_MAP = {
'resnet_v2_50': center_net_resnet_feature_extractor.resnet_v2_50,
'resnet_v2_101': center_net_resnet_feature_extractor.resnet_v2_101,
'resnet_v1_18_fpn':
center_net_resnet_v1_fpn_feature_extractor.resnet_v1_18_fpn,
'resnet_v1_34_fpn':
center_net_resnet_v1_fpn_feature_extractor.resnet_v1_34_fpn,
'resnet_v1_50_fpn':
center_net_resnet_v1_fpn_feature_extractor.resnet_v1_50_fpn,
'resnet_v1_101_fpn':
center_net_resnet_v1_fpn_feature_extractor.resnet_v1_101_fpn,
'hourglass_104': center_net_hourglass_feature_extractor.hourglass_104,
'hourglass_104':
center_net_hourglass_feature_extractor.hourglass_104,
'mobilenet_v2':
center_net_mobilenet_v2_feature_extractor.mobilenet_v2,
}
FEATURE_EXTRACTOR_MAPS = [
......@@ -310,6 +338,14 @@ def _build_ssd_feature_extractor(feature_extractor_config,
feature_extractor_config.fpn.additional_layer_depth,
})
if feature_extractor_config.HasField('bifpn'):
kwargs.update({
'bifpn_min_level': feature_extractor_config.bifpn.min_level,
'bifpn_max_level': feature_extractor_config.bifpn.max_level,
'bifpn_num_iterations': feature_extractor_config.bifpn.num_iterations,
'bifpn_num_filters': feature_extractor_config.bifpn.num_filters,
'bifpn_combine_method': feature_extractor_config.bifpn.combine_method,
})
return feature_extractor_class(**kwargs)
......@@ -621,8 +657,9 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
second_stage_localization_loss_weight)
crop_and_resize_fn = (
ops.matmul_crop_and_resize if frcnn_config.use_matmul_crop_and_resize
else ops.native_crop_and_resize)
spatial_ops.multilevel_matmul_crop_and_resize
if frcnn_config.use_matmul_crop_and_resize
else spatial_ops.multilevel_native_crop_and_resize)
clip_anchors_to_image = (
frcnn_config.clip_anchors_to_image)
......@@ -843,6 +880,22 @@ def mask_proto_to_params(mask_config):
heatmap_bias_init=mask_config.heatmap_bias_init)
def densepose_proto_to_params(densepose_config):
"""Converts CenterNet.DensePoseEstimation proto to parameter namedtuple."""
classification_loss, localization_loss, _, _, _, _, _ = (
losses_builder.build(densepose_config.loss))
return center_net_meta_arch.DensePoseParams(
class_id=densepose_config.class_id,
classification_loss=classification_loss,
localization_loss=localization_loss,
part_loss_weight=densepose_config.part_loss_weight,
coordinate_loss_weight=densepose_config.coordinate_loss_weight,
num_parts=densepose_config.num_parts,
task_loss_weight=densepose_config.task_loss_weight,
upsample_to_input_res=densepose_config.upsample_to_input_res,
heatmap_bias_init=densepose_config.heatmap_bias_init)
def _build_center_net_model(center_net_config, is_training, add_summaries):
"""Build a CenterNet detection model.
......@@ -895,6 +948,11 @@ def _build_center_net_model(center_net_config, is_training, add_summaries):
if center_net_config.HasField('mask_estimation_task'):
mask_params = mask_proto_to_params(center_net_config.mask_estimation_task)
densepose_params = None
if center_net_config.HasField('densepose_estimation_task'):
densepose_params = densepose_proto_to_params(
center_net_config.densepose_estimation_task)
return center_net_meta_arch.CenterNetMetaArch(
is_training=is_training,
add_summaries=add_summaries,
......@@ -904,7 +962,8 @@ def _build_center_net_model(center_net_config, is_training, add_summaries):
object_center_params=object_center_params,
object_detection_params=object_detection_params,
keypoint_params_dict=keypoint_params_dict,
mask_params=mask_params)
mask_params=mask_params,
densepose_params=densepose_params)
def _build_center_net_feature_extractor(
......
......@@ -39,6 +39,9 @@ class ModelBuilderTest(test_case.TestCase, parameterized.TestCase):
def ssd_feature_extractors(self):
raise NotImplementedError
def get_override_base_feature_extractor_hyperparams(self, extractor_type):
raise NotImplementedError
def faster_rcnn_feature_extractors(self):
raise NotImplementedError
......@@ -70,7 +73,6 @@ class ModelBuilderTest(test_case.TestCase, parameterized.TestCase):
}
}
}
override_base_feature_extractor_hyperparams: true
}
box_coder {
faster_rcnn_box_coder {
......@@ -205,6 +207,8 @@ class ModelBuilderTest(test_case.TestCase, parameterized.TestCase):
for extractor_type, extractor_class in self.ssd_feature_extractors().items(
):
model_proto.ssd.feature_extractor.type = extractor_type
model_proto.ssd.feature_extractor.override_base_feature_extractor_hyperparams = (
self.get_override_base_feature_extractor_hyperparams(extractor_type))
model = model_builder.build(model_proto, is_training=True)
self.assertIsInstance(model, ssd_meta_arch.SSDMetaArch)
self.assertIsInstance(model._feature_extractor, extractor_class)
......
......@@ -38,6 +38,9 @@ class ModelBuilderTF1Test(model_builder_test.ModelBuilderTest):
def ssd_feature_extractors(self):
return model_builder.SSD_FEATURE_EXTRACTOR_CLASS_MAP
def get_override_base_feature_extractor_hyperparams(self, extractor_type):
return extractor_type in {'ssd_inception_v2', 'ssd_inception_v3'}
def faster_rcnn_feature_extractors(self):
return model_builder.FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP
......
......@@ -42,6 +42,9 @@ class ModelBuilderTF2Test(model_builder_test.ModelBuilderTest):
def ssd_feature_extractors(self):
return model_builder.SSD_KERAS_FEATURE_EXTRACTOR_CLASS_MAP
def get_override_base_feature_extractor_hyperparams(self, extractor_type):
return extractor_type in {}
def faster_rcnn_feature_extractors(self):
return model_builder.FASTER_RCNN_KERAS_FEATURE_EXTRACTOR_CLASS_MAP
......@@ -161,6 +164,28 @@ class ModelBuilderTF2Test(model_builder_test.ModelBuilderTest):
return text_format.Merge(proto_txt,
center_net_pb2.CenterNet.MaskEstimation())
def get_fake_densepose_proto(self):
proto_txt = """
task_loss_weight: 0.5
class_id: 0
loss {
classification_loss {
weighted_softmax {}
}
localization_loss {
l1_localization_loss {
}
}
}
num_parts: 24
part_loss_weight: 1.0
coordinate_loss_weight: 2.0
upsample_to_input_res: true
heatmap_bias_init: -2.0
"""
return text_format.Merge(proto_txt,
center_net_pb2.CenterNet.DensePoseEstimation())
def test_create_center_net_model(self):
"""Test building a CenterNet model from proto txt."""
proto_txt = """
......@@ -192,6 +217,8 @@ class ModelBuilderTF2Test(model_builder_test.ModelBuilderTest):
self.get_fake_label_map_file_path())
config.center_net.mask_estimation_task.CopyFrom(
self.get_fake_mask_proto())
config.center_net.densepose_estimation_task.CopyFrom(
self.get_fake_densepose_proto())
# Build the model from the configuration.
model = model_builder.build(config, is_training=True)
......@@ -248,6 +275,21 @@ class ModelBuilderTF2Test(model_builder_test.ModelBuilderTest):
self.assertAlmostEqual(
model._mask_params.heatmap_bias_init, -2.0, places=4)
# Check DensePose related parameters.
self.assertEqual(model._densepose_params.class_id, 0)
self.assertIsInstance(model._densepose_params.classification_loss,
losses.WeightedSoftmaxClassificationLoss)
self.assertIsInstance(model._densepose_params.localization_loss,
losses.L1LocalizationLoss)
self.assertAlmostEqual(model._densepose_params.part_loss_weight, 1.0)
self.assertAlmostEqual(model._densepose_params.coordinate_loss_weight, 2.0)
self.assertEqual(model._densepose_params.num_parts, 24)
self.assertAlmostEqual(model._densepose_params.task_loss_weight, 0.5)
self.assertTrue(model._densepose_params.upsample_to_input_res)
self.assertEqual(model._densepose_params.upsample_method, 'bilinear')
self.assertAlmostEqual(
model._densepose_params.heatmap_bias_init, -2.0, places=4)
# Check feature extractor parameters.
self.assertIsInstance(
model._feature_extractor,
......
......@@ -417,4 +417,12 @@ def build(preprocessor_step_config):
'num_scales': config.num_scales
}
if step_type == 'random_scale_crop_and_pad_to_square':
config = preprocessor_step_config.random_scale_crop_and_pad_to_square
return preprocessor.random_scale_crop_and_pad_to_square, {
'scale_min': config.scale_min,
'scale_max': config.scale_max,
'output_size': config.output_size,
}
raise ValueError('Unknown preprocessing step.')
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "rOvvWAVTkMR7"
},
"source": [
"# Eager Few Shot Object Detection Colab\n",
"\n",
"Welcome to the Eager Few Shot Object Detection Colab --- in this colab we demonstrate fine tuning of a (TF2 friendly) RetinaNet architecture on very few examples of a novel class after initializing from a pre-trained COCO checkpoint.\n",
"Training runs in eager mode.\n",
"\n",
"Estimated time to run through this colab (with GPU): \u003c 5 minutes."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "vPs64QA1Zdov"
},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "LBZ9VWZZFUCT"
},
"outputs": [],
"source": [
"!pip install -U --pre tensorflow==\"2.2.0\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "oi28cqGGFWnY"
},
"outputs": [],
"source": [
"import os\n",
"import pathlib\n",
"\n",
"# Clone the tensorflow models repository if it doesn't already exist\n",
"if \"models\" in pathlib.Path.cwd().parts:\n",
" while \"models\" in pathlib.Path.cwd().parts:\n",
" os.chdir('..')\n",
"elif not pathlib.Path('models').exists():\n",
" !git clone --depth 1 https://github.com/tensorflow/models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "NwdsBdGhFanc"
},
"outputs": [],
"source": [
"# Install the Object Detection API\n",
"%%bash\n",
"cd models/research/\n",
"protoc object_detection/protos/*.proto --python_out=.\n",
"cp object_detection/packages/tf2/setup.py .\n",
"python -m pip install ."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "uZcqD4NLdnf4"
},
"outputs": [],
"source": [
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"\n",
"import os\n",
"import random\n",
"import io\n",
"import imageio\n",
"import glob\n",
"import scipy.misc\n",
"import numpy as np\n",
"from six import BytesIO\n",
"from PIL import Image, ImageDraw, ImageFont\n",
"from IPython.display import display, Javascript\n",
"from IPython.display import Image as IPyImage\n",
"\n",
"import tensorflow as tf\n",
"\n",
"from object_detection.utils import label_map_util\n",
"from object_detection.utils import config_util\n",
"from object_detection.utils import visualization_utils as viz_utils\n",
"from object_detection.utils import colab_utils\n",
"from object_detection.builders import model_builder\n",
"\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "IogyryF2lFBL"
},
"source": [
"# Utilities"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "-y9R0Xllefec"
},
"outputs": [],
"source": [
"def load_image_into_numpy_array(path):\n",
" \"\"\"Load an image from file into a numpy array.\n",
"\n",
" Puts image into numpy array to feed into tensorflow graph.\n",
" Note that by convention we put it into a numpy array with shape\n",
" (height, width, channels), where channels=3 for RGB.\n",
"\n",
" Args:\n",
" path: a file path.\n",
"\n",
" Returns:\n",
" uint8 numpy array with shape (img_height, img_width, 3)\n",
" \"\"\"\n",
" img_data = tf.io.gfile.GFile(path, 'rb').read()\n",
" image = Image.open(BytesIO(img_data))\n",
" (im_width, im_height) = image.size\n",
" return np.array(image.getdata()).reshape(\n",
" (im_height, im_width, 3)).astype(np.uint8)\n",
"\n",
"def plot_detections(image_np,\n",
" boxes,\n",
" classes,\n",
" scores,\n",
" category_index,\n",
" figsize=(12, 16),\n",
" image_name=None):\n",
" \"\"\"Wrapper function to visualize detections.\n",
"\n",
" Args:\n",
" image_np: uint8 numpy array with shape (img_height, img_width, 3)\n",
" boxes: a numpy array of shape [N, 4]\n",
" classes: a numpy array of shape [N]. Note that class indices are 1-based,\n",
" and match the keys in the label map.\n",
" scores: a numpy array of shape [N] or None. If scores=None, then\n",
" this function assumes that the boxes to be plotted are groundtruth\n",
" boxes and plot all boxes as black with no classes or scores.\n",
" category_index: a dict containing category dictionaries (each holding\n",
" category index `id` and category name `name`) keyed by category indices.\n",
" figsize: size for the figure.\n",
" image_name: a name for the image file.\n",
" \"\"\"\n",
" image_np_with_annotations = image_np.copy()\n",
" viz_utils.visualize_boxes_and_labels_on_image_array(\n",
" image_np_with_annotations,\n",
" boxes,\n",
" classes,\n",
" scores,\n",
" category_index,\n",
" use_normalized_coordinates=True,\n",
" min_score_thresh=0.8)\n",
" if image_name:\n",
" plt.imsave(image_name, image_np_with_annotations)\n",
" else:\n",
" plt.imshow(image_np_with_annotations)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "sSaXL28TZfk1"
},
"source": [
"# Rubber Ducky data\n",
"\n",
"We will start with some toy (literally) data consisting of 5 images of a rubber\n",
"ducky. Note that the [coco](https://cocodataset.org/#explore) dataset contains a number of animals, but notably, it does *not* contain rubber duckies (or even ducks for that matter), so this is a novel class."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "SQy3ND7EpFQM"
},
"outputs": [],
"source": [
"# Load images and visualize\n",
"train_image_dir = 'models/research/object_detection/test_images/ducky/train/'\n",
"train_images_np = []\n",
"for i in range(1, 6):\n",
" image_path = os.path.join(train_image_dir, 'robertducky' + str(i) + '.jpg')\n",
" train_images_np.append(load_image_into_numpy_array(image_path))\n",
"\n",
"plt.rcParams['axes.grid'] = False\n",
"plt.rcParams['xtick.labelsize'] = False\n",
"plt.rcParams['ytick.labelsize'] = False\n",
"plt.rcParams['xtick.top'] = False\n",
"plt.rcParams['xtick.bottom'] = False\n",
"plt.rcParams['ytick.left'] = False\n",
"plt.rcParams['ytick.right'] = False\n",
"plt.rcParams['figure.figsize'] = [14, 7]\n",
"\n",
"for idx, train_image_np in enumerate(train_images_np):\n",
" plt.subplot(2, 3, idx+1)\n",
" plt.imshow(train_image_np)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "cbKXmQoxcUgE"
},
"source": [
"# Annotate images with bounding boxes\n",
"\n",
"In this cell you will annotate the rubber duckies --- draw a box around the rubber ducky in each image; click `next image` to go to the next image and `submit` when there are no more images.\n",
"\n",
"If you'd like to skip the manual annotation step, we totally understand. In this case, simply skip this cell and run the next cell instead, where we've prepopulated the groundtruth with pre-annotated bounding boxes.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "-nEDRoUEcUgL"
},
"outputs": [],
"source": [
"gt_boxes = []\n",
"colab_utils.annotate(train_images_np, box_storage_pointer=gt_boxes)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "wTP9AFqecUgS"
},
"source": [
"# In case you didn't want to label...\n",
"\n",
"Run this cell only if you didn't annotate anything above and\n",
"would prefer to just use our preannotated boxes. Don't forget\n",
"to uncomment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "wIAT6ZUmdHOC"
},
"outputs": [],
"source": [
"# gt_boxes = [\n",
"# np.array([[0.436, 0.591, 0.629, 0.712]], dtype=np.float32),\n",
"# np.array([[0.539, 0.583, 0.73, 0.71]], dtype=np.float32),\n",
"# np.array([[0.464, 0.414, 0.626, 0.548]], dtype=np.float32),\n",
"# np.array([[0.313, 0.308, 0.648, 0.526]], dtype=np.float32),\n",
"# np.array([[0.256, 0.444, 0.484, 0.629]], dtype=np.float32)\n",
"# ]"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Dqb_yjAo3cO_"
},
"source": [
"# Prepare data for training\n",
"\n",
"Below we add the class annotations (for simplicity, we assume a single class in this colab; though it should be straightforward to extend this to handle multiple classes). We also convert everything to the format that the training\n",
"loop below expects (e.g., everything converted to tensors, classes converted to one-hot representations, etc.)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "HWBqFVMcweF-"
},
"outputs": [],
"source": [
"\n",
"# By convention, our non-background classes start counting at 1. Given\n",
"# that we will be predicting just one class, we will therefore assign it a\n",
"# `class id` of 1.\n",
"duck_class_id = 1\n",
"num_classes = 1\n",
"\n",
"category_index = {duck_class_id: {'id': duck_class_id, 'name': 'rubber_ducky'}}\n",
"\n",
"# Convert class labels to one-hot; convert everything to tensors.\n",
"# The `label_id_offset` here shifts all classes by a certain number of indices;\n",
"# we do this here so that the model receives one-hot labels where non-background\n",
"# classes start counting at the zeroth index. This is ordinarily just handled\n",
"# automatically in our training binaries, but we need to reproduce it here.\n",
"label_id_offset = 1\n",
"train_image_tensors = []\n",
"gt_classes_one_hot_tensors = []\n",
"gt_box_tensors = []\n",
"for (train_image_np, gt_box_np) in zip(\n",
" train_images_np, gt_boxes):\n",
" train_image_tensors.append(tf.expand_dims(tf.convert_to_tensor(\n",
" train_image_np, dtype=tf.float32), axis=0))\n",
" gt_box_tensors.append(tf.convert_to_tensor(gt_box_np, dtype=tf.float32))\n",
" zero_indexed_groundtruth_classes = tf.convert_to_tensor(\n",
" np.ones(shape=[gt_box_np.shape[0]], dtype=np.int32) - label_id_offset)\n",
" gt_classes_one_hot_tensors.append(tf.one_hot(\n",
" zero_indexed_groundtruth_classes, num_classes))\n",
"print('Done prepping data.')\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "b3_Z3mJWN9KJ"
},
"source": [
"# Let's just visualize the rubber duckies as a sanity check\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "YBD6l-E4N71y"
},
"outputs": [],
"source": [
"dummy_scores = np.array([1.0], dtype=np.float32) # give boxes a score of 100%\n",
"\n",
"plt.figure(figsize=(30, 15))\n",
"for idx in range(5):\n",
" plt.subplot(2, 3, idx+1)\n",
" plot_detections(\n",
" train_images_np[idx],\n",
" gt_boxes[idx],\n",
" np.ones(shape=[gt_boxes[idx].shape[0]], dtype=np.int32),\n",
" dummy_scores, category_index)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "ghDAsqfoZvPh"
},
"source": [
"# Create model and restore weights for all but last layer\n",
"\n",
"In this cell we build a single stage detection architecture (RetinaNet) and restore all but the classification layer at the top (which will be automatically randomly initialized).\n",
"\n",
"For simplicity, we have hardcoded a number of things in this colab for the specific RetinaNet architecture at hand (including assuming that the image size will always be 640x640), however it is not difficult to generalize to other model configurations."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "9J16r3NChD-7"
},
"outputs": [],
"source": [
"# Download the checkpoint and put it into models/research/object_detection/test_data/\n",
"\n",
"!wget http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz\n",
"!tar -xf ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz\n",
"!mv ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint models/research/object_detection/test_data/"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "RyT4BUbaMeG-"
},
"outputs": [],
"source": [
"tf.keras.backend.clear_session()\n",
"\n",
"print('Building model and restoring weights for fine-tuning...', flush=True)\n",
"num_classes = 1\n",
"pipeline_config = 'models/research/object_detection/configs/tf2/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.config'\n",
"checkpoint_path = 'models/research/object_detection/test_data/checkpoint/ckpt-0'\n",
"\n",
"# Load pipeline config and build a detection model.\n",
"#\n",
"# Since we are working off of a COCO architecture which predicts 90\n",
"# class slots by default, we override the `num_classes` field here to be just\n",
"# one (for our new rubber ducky class).\n",
"configs = config_util.get_configs_from_pipeline_file(pipeline_config)\n",
"model_config = configs['model']\n",
"model_config.ssd.num_classes = num_classes\n",
"model_config.ssd.freeze_batchnorm = True\n",
"detection_model = model_builder.build(\n",
" model_config=model_config, is_training=True)\n",
"\n",
"# Set up object-based checkpoint restore --- RetinaNet has two prediction\n",
"# `heads` --- one for classification, the other for box regression. We will\n",
"# restore the box regression head but initialize the classification head\n",
"# from scratch (we show the omission below by commenting out the line that\n",
"# we would add if we wanted to restore both heads)\n",
"fake_box_predictor = tf.compat.v2.train.Checkpoint(\n",
" _base_tower_layers_for_heads=detection_model._box_predictor._base_tower_layers_for_heads,\n",
" # _prediction_heads=detection_model._box_predictor._prediction_heads,\n",
" # (i.e., the classification head that we *will not* restore)\n",
" _box_prediction_head=detection_model._box_predictor._box_prediction_head,\n",
" )\n",
"fake_model = tf.compat.v2.train.Checkpoint(\n",
" _feature_extractor=detection_model._feature_extractor,\n",
" _box_predictor=fake_box_predictor)\n",
"ckpt = tf.compat.v2.train.Checkpoint(model=fake_model)\n",
"ckpt.restore(checkpoint_path).expect_partial()\n",
"\n",
"# Run model through a dummy image so that variables are created\n",
"image, shapes = detection_model.preprocess(tf.zeros([1, 640, 640, 3]))\n",
"prediction_dict = detection_model.predict(image, shapes)\n",
"_ = detection_model.postprocess(prediction_dict, shapes)\n",
"print('Weights restored!')"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "pCkWmdoZZ0zJ"
},
"source": [
"# Eager mode custom training loop\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "nyHoF4mUrv5-"
},
"outputs": [],
"source": [
"tf.keras.backend.set_learning_phase(True)\n",
"\n",
"# These parameters can be tuned; since our training set has 5 images\n",
"# it doesn't make sense to have a much larger batch size, though we could\n",
"# fit more examples in memory if we wanted to.\n",
"batch_size = 4\n",
"learning_rate = 0.01\n",
"num_batches = 100\n",
"\n",
"# Select variables in top layers to fine-tune.\n",
"trainable_variables = detection_model.trainable_variables\n",
"to_fine_tune = []\n",
"prefixes_to_train = [\n",
" 'WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead',\n",
" 'WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead']\n",
"for var in trainable_variables:\n",
" if any([var.name.startswith(prefix) for prefix in prefixes_to_train]):\n",
" to_fine_tune.append(var)\n",
"\n",
"# Set up forward + backward pass for a single train step.\n",
"def get_model_train_step_function(model, optimizer, vars_to_fine_tune):\n",
" \"\"\"Get a tf.function for training step.\"\"\"\n",
"\n",
" # Use tf.function for a bit of speed.\n",
" # Comment out the tf.function decorator if you want the inside of the\n",
" # function to run eagerly.\n",
" @tf.function\n",
" def train_step_fn(image_tensors,\n",
" groundtruth_boxes_list,\n",
" groundtruth_classes_list):\n",
" \"\"\"A single training iteration.\n",
"\n",
" Args:\n",
" image_tensors: A list of [1, height, width, 3] Tensor of type tf.float32.\n",
" Note that the height and width can vary across images, as they are\n",
" reshaped within this function to be 640x640.\n",
" groundtruth_boxes_list: A list of Tensors of shape [N_i, 4] with type\n",
" tf.float32 representing groundtruth boxes for each image in the batch.\n",
" groundtruth_classes_list: A list of Tensors of shape [N_i, num_classes]\n",
" with type tf.float32 representing groundtruth boxes for each image in\n",
" the batch.\n",
"\n",
" Returns:\n",
" A scalar tensor representing the total loss for the input batch.\n",
" \"\"\"\n",
" shapes = tf.constant(batch_size * [[640, 640, 3]], dtype=tf.int32)\n",
" model.provide_groundtruth(\n",
" groundtruth_boxes_list=groundtruth_boxes_list,\n",
" groundtruth_classes_list=groundtruth_classes_list)\n",
" with tf.GradientTape() as tape:\n",
" preprocessed_images = tf.concat(\n",
" [detection_model.preprocess(image_tensor)[0]\n",
" for image_tensor in image_tensors], axis=0)\n",
" prediction_dict = model.predict(preprocessed_images, shapes)\n",
" losses_dict = model.loss(prediction_dict, shapes)\n",
" total_loss = losses_dict['Loss/localization_loss'] + losses_dict['Loss/classification_loss']\n",
" gradients = tape.gradient(total_loss, vars_to_fine_tune)\n",
" optimizer.apply_gradients(zip(gradients, vars_to_fine_tune))\n",
" return total_loss\n",
"\n",
" return train_step_fn\n",
"\n",
"optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=0.9)\n",
"train_step_fn = get_model_train_step_function(\n",
" detection_model, optimizer, to_fine_tune)\n",
"\n",
"print('Start fine-tuning!', flush=True)\n",
"for idx in range(num_batches):\n",
" # Grab keys for a random subset of examples\n",
" all_keys = list(range(len(train_images_np)))\n",
" random.shuffle(all_keys)\n",
" example_keys = all_keys[:batch_size]\n",
"\n",
" # Note that we do not do data augmentation in this demo. If you want a\n",
" # a fun exercise, we recommend experimenting with random horizontal flipping\n",
" # and random cropping :)\n",
" gt_boxes_list = [gt_box_tensors[key] for key in example_keys]\n",
" gt_classes_list = [gt_classes_one_hot_tensors[key] for key in example_keys]\n",
" image_tensors = [train_image_tensors[key] for key in example_keys]\n",
"\n",
" # Training step (forward pass + backwards pass)\n",
" total_loss = train_step_fn(image_tensors, gt_boxes_list, gt_classes_list)\n",
"\n",
" if idx % 10 == 0:\n",
" print('batch ' + str(idx) + ' of ' + str(num_batches)\n",
" + ', loss=' + str(total_loss.numpy()), flush=True)\n",
"\n",
"print('Done fine-tuning!')"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "WHlXL1x_Z3tc"
},
"source": [
"# Load test images and run inference with new model!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "WcE6OwrHQJya"
},
"outputs": [],
"source": [
"test_image_dir = 'models/research/object_detection/test_images/ducky/test/'\n",
"test_images_np = []\n",
"for i in range(1, 50):\n",
" image_path = os.path.join(test_image_dir, 'out' + str(i) + '.jpg')\n",
" test_images_np.append(np.expand_dims(\n",
" load_image_into_numpy_array(image_path), axis=0))\n",
"\n",
"# Again, uncomment this decorator if you want to run inference eagerly\n",
"@tf.function\n",
"def detect(input_tensor):\n",
" \"\"\"Run detection on an input image.\n",
"\n",
" Args:\n",
" input_tensor: A [1, height, width, 3] Tensor of type tf.float32.\n",
" Note that height and width can be anything since the image will be\n",
" immediately resized according to the needs of the model within this\n",
" function.\n",
"\n",
" Returns:\n",
" A dict containing 3 Tensors (`detection_boxes`, `detection_classes`,\n",
" and `detection_scores`).\n",
" \"\"\"\n",
" preprocessed_image, shapes = detection_model.preprocess(input_tensor)\n",
" prediction_dict = detection_model.predict(preprocessed_image, shapes)\n",
" return detection_model.postprocess(prediction_dict, shapes)\n",
"\n",
"# Note that the first frame will trigger tracing of the tf.function, which will\n",
"# take some time, after which inference should be fast.\n",
"\n",
"label_id_offset = 1\n",
"for i in range(len(test_images_np)):\n",
" input_tensor = tf.convert_to_tensor(test_images_np[i], dtype=tf.float32)\n",
" detections = detect(input_tensor)\n",
"\n",
" plot_detections(\n",
" test_images_np[i][0],\n",
" detections['detection_boxes'][0].numpy(),\n",
" detections['detection_classes'][0].numpy().astype(np.uint32)\n",
" + label_id_offset,\n",
" detections['detection_scores'][0].numpy(),\n",
" category_index, figsize=(15, 20), image_name=\"gif_frame_\" + ('%02d' % i) + \".jpg\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "RW1FrT2iNnpy"
},
"outputs": [],
"source": [
"imageio.plugins.freeimage.download()\n",
"\n",
"anim_file = 'duckies_test.gif'\n",
"\n",
"filenames = glob.glob('gif_frame_*.jpg')\n",
"filenames = sorted(filenames)\n",
"last = -1\n",
"images = []\n",
"for filename in filenames:\n",
" image = imageio.imread(filename)\n",
" images.append(image)\n",
"\n",
"imageio.mimsave(anim_file, images, 'GIF-FI', fps=5)\n",
"\n",
"display(IPyImage(open(anim_file, 'rb').read()))"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "interactive_eager_few_shot_od_training_colab.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "inference_from_saved_model_tf2_colab.ipynb",
"provenance": [],
"collapsed_sections": []
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "cT5cdSLPX0ui"
},
"source": [
"# Intro to Object Detection Colab\n",
"\n",
"Welcome to the object detection colab! This demo will take you through the steps of running an \"out-of-the-box\" detection model in SavedModel format on a collection of images.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "vPs64QA1Zdov"
},
"source": [
"Imports"
]
},
{
"cell_type": "code",
"metadata": {
"id": "OBzb04bdNGM8",
"colab_type": "code",
"colab": {}
},
"source": [
"!pip install -U --pre tensorflow==\"2.2.0\""
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "NgSXyvKSNHIl",
"colab_type": "code",
"colab": {}
},
"source": [
"import os\n",
"import pathlib\n",
"\n",
"# Clone the tensorflow models repository if it doesn't already exist\n",
"if \"models\" in pathlib.Path.cwd().parts:\n",
" while \"models\" in pathlib.Path.cwd().parts:\n",
" os.chdir('..')\n",
"elif not pathlib.Path('models').exists():\n",
" !git clone --depth 1 https://github.com/tensorflow/models"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "rhpPgW7TNLs6",
"colab_type": "code",
"colab": {}
},
"source": [
"# Install the Object Detection API\n",
"%%bash\n",
"cd models/research/\n",
"protoc object_detection/protos/*.proto --python_out=.\n",
"cp object_detection/packages/tf2/setup.py .\n",
"python -m pip install ."
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "yn5_uV1HLvaz",
"colab": {}
},
"source": [
"import io\n",
"import os\n",
"import scipy.misc\n",
"import numpy as np\n",
"import six\n",
"import time\n",
"\n",
"from six import BytesIO\n",
"\n",
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"from PIL import Image, ImageDraw, ImageFont\n",
"\n",
"import tensorflow as tf\n",
"from object_detection.utils import visualization_utils as viz_utils\n",
"\n",
"%matplotlib inline"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "-y9R0Xllefec",
"colab": {}
},
"source": [
"def load_image_into_numpy_array(path):\n",
" \"\"\"Load an image from file into a numpy array.\n",
"\n",
" Puts image into numpy array to feed into tensorflow graph.\n",
" Note that by convention we put it into a numpy array with shape\n",
" (height, width, channels), where channels=3 for RGB.\n",
"\n",
" Args:\n",
" path: a file path (this can be local or on colossus)\n",
"\n",
" Returns:\n",
" uint8 numpy array with shape (img_height, img_width, 3)\n",
" \"\"\"\n",
" img_data = tf.io.gfile.GFile(path, 'rb').read()\n",
" image = Image.open(BytesIO(img_data))\n",
" (im_width, im_height) = image.size\n",
" return np.array(image.getdata()).reshape(\n",
" (im_height, im_width, 3)).astype(np.uint8)\n",
"\n",
"# Load the COCO Label Map\n",
"category_index = {\n",
" 1: {'id': 1, 'name': 'person'},\n",
" 2: {'id': 2, 'name': 'bicycle'},\n",
" 3: {'id': 3, 'name': 'car'},\n",
" 4: {'id': 4, 'name': 'motorcycle'},\n",
" 5: {'id': 5, 'name': 'airplane'},\n",
" 6: {'id': 6, 'name': 'bus'},\n",
" 7: {'id': 7, 'name': 'train'},\n",
" 8: {'id': 8, 'name': 'truck'},\n",
" 9: {'id': 9, 'name': 'boat'},\n",
" 10: {'id': 10, 'name': 'traffic light'},\n",
" 11: {'id': 11, 'name': 'fire hydrant'},\n",
" 13: {'id': 13, 'name': 'stop sign'},\n",
" 14: {'id': 14, 'name': 'parking meter'},\n",
" 15: {'id': 15, 'name': 'bench'},\n",
" 16: {'id': 16, 'name': 'bird'},\n",
" 17: {'id': 17, 'name': 'cat'},\n",
" 18: {'id': 18, 'name': 'dog'},\n",
" 19: {'id': 19, 'name': 'horse'},\n",
" 20: {'id': 20, 'name': 'sheep'},\n",
" 21: {'id': 21, 'name': 'cow'},\n",
" 22: {'id': 22, 'name': 'elephant'},\n",
" 23: {'id': 23, 'name': 'bear'},\n",
" 24: {'id': 24, 'name': 'zebra'},\n",
" 25: {'id': 25, 'name': 'giraffe'},\n",
" 27: {'id': 27, 'name': 'backpack'},\n",
" 28: {'id': 28, 'name': 'umbrella'},\n",
" 31: {'id': 31, 'name': 'handbag'},\n",
" 32: {'id': 32, 'name': 'tie'},\n",
" 33: {'id': 33, 'name': 'suitcase'},\n",
" 34: {'id': 34, 'name': 'frisbee'},\n",
" 35: {'id': 35, 'name': 'skis'},\n",
" 36: {'id': 36, 'name': 'snowboard'},\n",
" 37: {'id': 37, 'name': 'sports ball'},\n",
" 38: {'id': 38, 'name': 'kite'},\n",
" 39: {'id': 39, 'name': 'baseball bat'},\n",
" 40: {'id': 40, 'name': 'baseball glove'},\n",
" 41: {'id': 41, 'name': 'skateboard'},\n",
" 42: {'id': 42, 'name': 'surfboard'},\n",
" 43: {'id': 43, 'name': 'tennis racket'},\n",
" 44: {'id': 44, 'name': 'bottle'},\n",
" 46: {'id': 46, 'name': 'wine glass'},\n",
" 47: {'id': 47, 'name': 'cup'},\n",
" 48: {'id': 48, 'name': 'fork'},\n",
" 49: {'id': 49, 'name': 'knife'},\n",
" 50: {'id': 50, 'name': 'spoon'},\n",
" 51: {'id': 51, 'name': 'bowl'},\n",
" 52: {'id': 52, 'name': 'banana'},\n",
" 53: {'id': 53, 'name': 'apple'},\n",
" 54: {'id': 54, 'name': 'sandwich'},\n",
" 55: {'id': 55, 'name': 'orange'},\n",
" 56: {'id': 56, 'name': 'broccoli'},\n",
" 57: {'id': 57, 'name': 'carrot'},\n",
" 58: {'id': 58, 'name': 'hot dog'},\n",
" 59: {'id': 59, 'name': 'pizza'},\n",
" 60: {'id': 60, 'name': 'donut'},\n",
" 61: {'id': 61, 'name': 'cake'},\n",
" 62: {'id': 62, 'name': 'chair'},\n",
" 63: {'id': 63, 'name': 'couch'},\n",
" 64: {'id': 64, 'name': 'potted plant'},\n",
" 65: {'id': 65, 'name': 'bed'},\n",
" 67: {'id': 67, 'name': 'dining table'},\n",
" 70: {'id': 70, 'name': 'toilet'},\n",
" 72: {'id': 72, 'name': 'tv'},\n",
" 73: {'id': 73, 'name': 'laptop'},\n",
" 74: {'id': 74, 'name': 'mouse'},\n",
" 75: {'id': 75, 'name': 'remote'},\n",
" 76: {'id': 76, 'name': 'keyboard'},\n",
" 77: {'id': 77, 'name': 'cell phone'},\n",
" 78: {'id': 78, 'name': 'microwave'},\n",
" 79: {'id': 79, 'name': 'oven'},\n",
" 80: {'id': 80, 'name': 'toaster'},\n",
" 81: {'id': 81, 'name': 'sink'},\n",
" 82: {'id': 82, 'name': 'refrigerator'},\n",
" 84: {'id': 84, 'name': 'book'},\n",
" 85: {'id': 85, 'name': 'clock'},\n",
" 86: {'id': 86, 'name': 'vase'},\n",
" 87: {'id': 87, 'name': 'scissors'},\n",
" 88: {'id': 88, 'name': 'teddy bear'},\n",
" 89: {'id': 89, 'name': 'hair drier'},\n",
" 90: {'id': 90, 'name': 'toothbrush'},\n",
"}"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "QwcBC2TlPSwg",
"colab_type": "code",
"colab": {}
},
"source": [
"# Download the saved model and put it into models/research/object_detection/test_data/\n",
"!wget http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d5_coco17_tpu-32.tar.gz\n",
"!tar -xf efficientdet_d5_coco17_tpu-32.tar.gz\n",
"!mv efficientdet_d5_coco17_tpu-32/ models/research/object_detection/test_data/"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "Z2p-PmKLYCVU",
"colab": {}
},
"source": [
"start_time = time.time()\n",
"tf.keras.backend.clear_session()\n",
"detect_fn = tf.saved_model.load('models/research/object_detection/test_data/efficientdet_d5_coco17_tpu-32/saved_model/')\n",
"end_time = time.time()\n",
"elapsed_time = end_time - start_time\n",
"print('Elapsed time: ' + str(elapsed_time) + 's')"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "vukkhd5-9NSL",
"colab": {}
},
"source": [
"import time\n",
"\n",
"image_dir = 'models/research/object_detection/test_images'\n",
"\n",
"elapsed = []\n",
"for i in range(2):\n",
" image_path = os.path.join(image_dir, 'image' + str(i + 1) + '.jpg')\n",
" image_np = load_image_into_numpy_array(image_path)\n",
" input_tensor = np.expand_dims(image_np, 0)\n",
" start_time = time.time()\n",
" detections = detect_fn(input_tensor)\n",
" end_time = time.time()\n",
" elapsed.append(end_time - start_time)\n",
"\n",
" plt.rcParams['figure.figsize'] = [42, 21]\n",
" label_id_offset = 1\n",
" image_np_with_detections = image_np.copy()\n",
" viz_utils.visualize_boxes_and_labels_on_image_array(\n",
" image_np_with_detections,\n",
" detections['detection_boxes'][0].numpy(),\n",
" detections['detection_classes'][0].numpy().astype(np.int32),\n",
" detections['detection_scores'][0].numpy(),\n",
" category_index,\n",
" use_normalized_coordinates=True,\n",
" max_boxes_to_draw=200,\n",
" min_score_thresh=.40,\n",
" agnostic_mode=False)\n",
" plt.subplot(2, 1, i+1)\n",
" plt.imshow(image_np_with_detections)\n",
"\n",
"mean_elapsed = sum(elapsed) / float(len(elapsed))\n",
"print('Elapsed time: ' + str(mean_elapsed) + ' second per image')"
],
"execution_count": null,
"outputs": []
}
]
}
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "rOvvWAVTkMR7"
},
"source": [
"# Intro to Object Detection Colab\n",
"\n",
"Welcome to the object detection colab! This demo will take you through the steps of running an \"out-of-the-box\" detection model on a collection of images."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "vPs64QA1Zdov"
},
"source": [
"## Imports and Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "LBZ9VWZZFUCT"
},
"outputs": [],
"source": [
"!pip install -U --pre tensorflow==\"2.2.0\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "oi28cqGGFWnY"
},
"outputs": [],
"source": [
"import os\n",
"import pathlib\n",
"\n",
"# Clone the tensorflow models repository if it doesn't already exist\n",
"if \"models\" in pathlib.Path.cwd().parts:\n",
" while \"models\" in pathlib.Path.cwd().parts:\n",
" os.chdir('..')\n",
"elif not pathlib.Path('models').exists():\n",
" !git clone --depth 1 https://github.com/tensorflow/models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "NwdsBdGhFanc"
},
"outputs": [],
"source": [
"# Install the Object Detection API\n",
"%%bash\n",
"cd models/research/\n",
"protoc object_detection/protos/*.proto --python_out=.\n",
"cp object_detection/packages/tf2/setup.py .\n",
"python -m pip install ."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "yn5_uV1HLvaz"
},
"outputs": [],
"source": [
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"\n",
"import io\n",
"import scipy.misc\n",
"import numpy as np\n",
"from six import BytesIO\n",
"from PIL import Image, ImageDraw, ImageFont\n",
"\n",
"import tensorflow as tf\n",
"\n",
"from object_detection.utils import label_map_util\n",
"from object_detection.utils import config_util\n",
"from object_detection.utils import visualization_utils as viz_utils\n",
"from object_detection.builders import model_builder\n",
"\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "IogyryF2lFBL"
},
"source": [
"## Utilities"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "-y9R0Xllefec"
},
"outputs": [],
"source": [
"def load_image_into_numpy_array(path):\n",
" \"\"\"Load an image from file into a numpy array.\n",
"\n",
" Puts image into numpy array to feed into tensorflow graph.\n",
" Note that by convention we put it into a numpy array with shape\n",
" (height, width, channels), where channels=3 for RGB.\n",
"\n",
" Args:\n",
" path: the file path to the image\n",
"\n",
" Returns:\n",
" uint8 numpy array with shape (img_height, img_width, 3)\n",
" \"\"\"\n",
" img_data = tf.io.gfile.GFile(path, 'rb').read()\n",
" image = Image.open(BytesIO(img_data))\n",
" (im_width, im_height) = image.size\n",
" return np.array(image.getdata()).reshape(\n",
" (im_height, im_width, 3)).astype(np.uint8)\n",
"\n",
"def get_keypoint_tuples(eval_config):\n",
" \"\"\"Return a tuple list of keypoint edges from the eval config.\n",
" \n",
" Args:\n",
" eval_config: an eval config containing the keypoint edges\n",
" \n",
" Returns:\n",
" a list of edge tuples, each in the format (start, end)\n",
" \"\"\"\n",
" tuple_list = []\n",
" kp_list = eval_config.keypoint_edge\n",
" for edge in kp_list:\n",
" tuple_list.append((edge.start, edge.end))\n",
" return tuple_list"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "R4YjnOjME1gy"
},
"outputs": [],
"source": [
"# @title Choose the model to use, then evaluate the cell.\n",
"MODELS = {'centernet_with_keypoints': 'centernet_hg104_512x512_kpts_coco17_tpu-32', 'centernet_without_keypoints': 'centernet_hg104_512x512_coco17_tpu-8'}\n",
"\n",
"model_display_name = 'centernet_with_keypoints' # @param ['centernet_with_keypoints', 'centernet_without_keypoints']\n",
"model_name = MODELS[model_display_name]"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "6917xnUSlp9x"
},
"source": [
"### Build a detection model and load pre-trained model weights\n",
"\n",
"This sometimes takes a little while, please be patient!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "ctPavqlyPuU_"
},
"outputs": [],
"source": [
"# Download the checkpoint and put it into models/research/object_detection/test_data/\n",
"\n",
"if model_display_name == 'centernet_with_keypoints':\n",
" !wget http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_512x512_kpts_coco17_tpu-32.tar.gz\n",
" !tar -xf centernet_hg104_512x512_kpts_coco17_tpu-32.tar.gz\n",
" !mv centernet_hg104_512x512_kpts_coco17_tpu-32/checkpoint models/research/object_detection/test_data/\n",
"else:\n",
" !wget http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_512x512_coco17_tpu-8.tar.gz\n",
" !tar -xf centernet_hg104_512x512_coco17_tpu-8.tar.gz\n",
" !mv centernet_hg104_512x512_coco17_tpu-8/checkpoint models/research/object_detection/test_data/"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "4cni4SSocvP_"
},
"outputs": [],
"source": [
"pipeline_config = os.path.join('models/research/object_detection/configs/tf2/',\n",
" model_name + '.config')\n",
"model_dir = 'models/research/object_detection/test_data/checkpoint/'\n",
"\n",
"# Load pipeline config and build a detection model\n",
"configs = config_util.get_configs_from_pipeline_file(pipeline_config)\n",
"model_config = configs['model']\n",
"detection_model = model_builder.build(\n",
" model_config=model_config, is_training=False)\n",
"\n",
"# Restore checkpoint\n",
"ckpt = tf.compat.v2.train.Checkpoint(\n",
" model=detection_model)\n",
"ckpt.restore(os.path.join(model_dir, 'ckpt-0')).expect_partial()\n",
"\n",
"def get_model_detection_function(model):\n",
" \"\"\"Get a tf.function for detection.\"\"\"\n",
"\n",
" @tf.function\n",
" def detect_fn(image):\n",
" \"\"\"Detect objects in image.\"\"\"\n",
"\n",
" image, shapes = model.preprocess(image)\n",
" prediction_dict = model.predict(image, shapes)\n",
" detections = model.postprocess(prediction_dict, shapes)\n",
"\n",
" return detections, prediction_dict, tf.reshape(shapes, [-1])\n",
"\n",
" return detect_fn\n",
"\n",
"detect_fn = get_model_detection_function(detection_model)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "NKtD0IeclbL5"
},
"source": [
"# Load label map data (for plotting).\n",
"\n",
"Label maps correspond index numbers to category names, so that when our convolution network predicts `5`, we know that this corresponds to `airplane`. Here we use internal utility functions, but anything that returns a dictionary mapping integers to appropriate string labels would be fine."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "5mucYUS6exUJ"
},
"outputs": [],
"source": [
"label_map_path = configs['eval_input_config'].label_map_path\n",
"label_map = label_map_util.load_labelmap(label_map_path)\n",
"categories = label_map_util.convert_label_map_to_categories(\n",
" label_map,\n",
" max_num_classes=label_map_util.get_max_label_map_index(label_map),\n",
" use_display_name=True)\n",
"category_index = label_map_util.create_category_index(categories)\n",
"label_map_dict = label_map_util.get_label_map_dict(label_map, use_display_name=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "RLusV1o-mAx8"
},
"source": [
"### Putting everything together!\n",
"\n",
"Run the below code which loads an image, runs it through the detection model and visualizes the detection results, including the keypoints.\n",
"\n",
"Note that this will take a long time (several minutes) the first time you run this code due to tf.function's trace-compilation --- on subsequent runs (e.g. on new images), things will be faster.\n",
"\n",
"Here are some simple things to try out if you are curious:\n",
"* Try running inference on your own images (local paths work)\n",
"* Modify some of the input images and see if detection still works. Some simple things to try out here (just uncomment the relevant portions of code) include flipping the image horizontally, or converting to grayscale (note that we still expect the input image to have 3 channels).\n",
"* Print out `detections['detection_boxes']` and try to match the box locations to the boxes in the image. Notice that coordinates are given in normalized form (i.e., in the interval [0, 1]).\n",
"* Set min_score_thresh to other values (between 0 and 1) to allow more detections in or to filter out more detections.\n",
"\n",
"Note that you can run this cell repeatedly without rerunning earlier cells.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "vr_Fux-gfaG9"
},
"outputs": [],
"source": [
"image_dir = 'models/research/object_detection/test_images/'\n",
"image_path = os.path.join(image_dir, 'image2.jpg')\n",
"image_np = load_image_into_numpy_array(image_path)\n",
"\n",
"# Things to try:\n",
"# Flip horizontally\n",
"# image_np = np.fliplr(image_np).copy()\n",
"\n",
"# Convert image to grayscale\n",
"# image_np = np.tile(\n",
"# np.mean(image_np, 2, keepdims=True), (1, 1, 3)).astype(np.uint8)\n",
"\n",
"input_tensor = tf.convert_to_tensor(\n",
" np.expand_dims(image_np, 0), dtype=tf.float32)\n",
"detections, predictions_dict, shapes = detect_fn(input_tensor)\n",
"\n",
"label_id_offset = 1\n",
"image_np_with_detections = image_np.copy()\n",
"\n",
"# Use keypoints if available in detections\n",
"keypoints, keypoint_scores = None, None\n",
"if 'detection_keypoints' in detections:\n",
" keypoints = detections['detection_keypoints'][0].numpy()\n",
" keypoint_scores = detections['detection_keypoint_scores'][0].numpy()\n",
"\n",
"viz_utils.visualize_boxes_and_labels_on_image_array(\n",
" image_np_with_detections,\n",
" detections['detection_boxes'][0].numpy(),\n",
" (detections['detection_classes'][0].numpy() + label_id_offset).astype(int),\n",
" detections['detection_scores'][0].numpy(),\n",
" category_index,\n",
" use_normalized_coordinates=True,\n",
" max_boxes_to_draw=200,\n",
" min_score_thresh=.30,\n",
" agnostic_mode=False,\n",
" keypoints=keypoints,\n",
" keypoint_scores=keypoint_scores,\n",
" keypoint_edges=get_keypoint_tuples(configs['eval_config']))\n",
"\n",
"plt.figure(figsize=(12,16))\n",
"plt.imshow(image_np_with_detections)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "lYnOxprty3TD"
},
"source": [
"## Digging into the model's intermediate predictions\n",
"\n",
"For this part we will assume that the detection model is a CenterNet model following Zhou et al (https://arxiv.org/abs/1904.07850). And more specifically, we will assume that `detection_model` is of type `meta_architectures.center_net_meta_arch.CenterNetMetaArch`.\n",
"\n",
"As one of its intermediate predictions, CenterNet produces a heatmap of box centers for each class (for example, it will produce a heatmap whose size is proportional to that of the image that lights up at the center of each, e.g., \"zebra\"). In the following, we will visualize these intermediate class center heatmap predictions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "xBgYgSGMhHVi"
},
"outputs": [],
"source": [
"if detection_model.__class__.__name__ != 'CenterNetMetaArch':\n",
" raise AssertionError('The meta-architecture for this section '\n",
" 'is assumed to be CenterNetMetaArch!')\n",
"\n",
"def get_heatmap(predictions_dict, class_name):\n",
" \"\"\"Grabs class center logits and apply inverse logit transform.\n",
"\n",
" Args:\n",
" predictions_dict: dictionary of tensors containing a `object_center`\n",
" field of shape [1, heatmap_width, heatmap_height, num_classes]\n",
" class_name: string name of category (e.g., `horse`)\n",
"\n",
" Returns:\n",
" heatmap: 2d Tensor heatmap representing heatmap of centers for a given class\n",
" (For CenterNet, this is 128x128 or 256x256) with values in [0,1]\n",
" \"\"\"\n",
" class_index = label_map_dict[class_name]\n",
" class_center_logits = predictions_dict['object_center'][0]\n",
" class_center_logits = class_center_logits[0][\n",
" :, :, class_index - label_id_offset]\n",
" heatmap = tf.exp(class_center_logits) / (tf.exp(class_center_logits) + 1)\n",
" return heatmap\n",
"\n",
"def unpad_heatmap(heatmap, image_np):\n",
" \"\"\"Reshapes/unpads heatmap appropriately.\n",
"\n",
" Reshapes/unpads heatmap appropriately to match image_np.\n",
"\n",
" Args:\n",
" heatmap: Output of `get_heatmap`, a 2d Tensor\n",
" image_np: uint8 numpy array with shape (img_height, img_width, 3). Note\n",
" that due to padding, the relationship between img_height and img_width\n",
" might not be a simple scaling.\n",
"\n",
" Returns:\n",
" resized_heatmap_unpadded: a resized heatmap (2d Tensor) that is the same\n",
" size as `image_np`\n",
" \"\"\"\n",
" heatmap = tf.tile(tf.expand_dims(heatmap, 2), [1, 1, 3]) * 255\n",
" pre_strided_size = detection_model._stride * heatmap.shape[0]\n",
" resized_heatmap = tf.image.resize(\n",
" heatmap, [pre_strided_size, pre_strided_size],\n",
" method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)\n",
" resized_heatmap_unpadded = tf.slice(resized_heatmap, begin=[0,0,0], size=shapes)\n",
" return tf.image.resize(\n",
" resized_heatmap_unpadded,\n",
" [image_np.shape[0], image_np.shape[1]],\n",
" method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)[:,:,0]\n",
"\n",
"\n",
"class_name = 'kite'\n",
"heatmap = get_heatmap(predictions_dict, class_name)\n",
"resized_heatmap_unpadded = unpad_heatmap(heatmap, image_np)\n",
"plt.figure(figsize=(12,16))\n",
"plt.imshow(image_np_with_detections)\n",
"plt.imshow(resized_heatmap_unpadded, alpha=0.7,vmin=0, vmax=160, cmap='viridis')\n",
"plt.title('Object center heatmap (class: ' + class_name + ')')\n",
"plt.show()\n",
"\n",
"class_name = 'person'\n",
"heatmap = get_heatmap(predictions_dict, class_name)\n",
"resized_heatmap_unpadded = unpad_heatmap(heatmap, image_np)\n",
"plt.figure(figsize=(12,16))\n",
"plt.imshow(image_np_with_detections)\n",
"plt.imshow(resized_heatmap_unpadded, alpha=0.7,vmin=0, vmax=160, cmap='viridis')\n",
"plt.title('Object center heatmap (class: ' + class_name + ')')\n",
"plt.show()"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "inference_tf2_colab.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
......@@ -71,7 +71,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -95,7 +95,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -118,7 +118,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -149,7 +149,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -164,7 +164,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -189,7 +189,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -224,7 +224,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -249,7 +249,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -300,7 +300,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -319,7 +319,6 @@
" model_dir = pathlib.Path(model_dir)/\"saved_model\"\n",
"\n",
" model = tf.saved_model.load(str(model_dir))\n",
" model = model.signatures['serving_default']\n",
"\n",
" return model"
]
......@@ -337,7 +336,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -362,7 +361,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -398,7 +397,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -417,12 +416,12 @@
"id": "yN1AYfAEJIGp"
},
"source": [
"Check the model's input signature, it expects a batch of 3-color images of type uint8: "
"Check the model's input signature, it expects a batch of 3-color images of type uint8:"
]
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -430,7 +429,7 @@
},
"outputs": [],
"source": [
"print(detection_model.inputs)"
"print(detection_model.signatures['serving_default'].inputs)"
]
},
{
......@@ -445,7 +444,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -453,12 +452,12 @@
},
"outputs": [],
"source": [
"detection_model.output_dtypes"
"detection_model.signatures['serving_default'].output_dtypes"
]
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -466,7 +465,7 @@
},
"outputs": [],
"source": [
"detection_model.output_shapes"
"detection_model.signatures['serving_default'].output_shapes"
]
},
{
......@@ -481,7 +480,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -497,7 +496,8 @@
" input_tensor = input_tensor[tf.newaxis,...]\n",
"\n",
" # Run inference\n",
" output_dict = model(input_tensor)\n",
" model_fn = model.signatures['serving_default']\n",
" output_dict = model_fn(input_tensor)\n",
"\n",
" # All outputs are batches tensors.\n",
" # Convert to numpy arrays, and take index [0] to remove the batch dimension.\n",
......@@ -535,7 +535,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -565,7 +565,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -589,7 +589,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -613,7 +613,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -626,7 +626,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
......@@ -637,19 +637,6 @@
"for image_path in TEST_IMAGE_PATHS:\n",
" show_inference(masking_model, image_path)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "nLlmm9JojEKm"
},
"outputs": [],
"source": [
""
]
}
],
"metadata": {
......@@ -663,6 +650,10 @@
"name": "object_detection_tutorial.ipynb",
"private_outputs": true,
"provenance": [
{
"file_id": "/piper/depot/google3/third_party/tensorflow_models/object_detection/colab_tutorials/object_detection_tutorial.ipynb",
"timestamp": 1594335690840
},
{
"file_id": "1LNYL6Zsn9Xlil2CVNOTsgDZQSBKeOjCh",
"timestamp": 1566498233247
......@@ -699,8 +690,7 @@
"file_id": "https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb",
"timestamp": 1556150293326
}
],
"version": "0.3.2"
]
},
"kernelspec": {
"display_name": "Python 3",
......
# CenterNet meta-architecture from the "Objects as Points" [2] paper with the
# hourglass[1] backbone.
# [1]: https://arxiv.org/abs/1603.06937
# [2]: https://arxiv.org/abs/1904.07850
# Trained on COCO, initialized from an ExtremeNet detection checkpoint
# Train on TPU-32 v3
#
# Achieves 44.6 mAP on COCO17 Val
model {
center_net {
num_classes: 90
feature_extractor {
type: "hourglass_104"
bgr_ordering: true
channel_means: [104.01362025, 114.03422265, 119.9165958 ]
channel_stds: [73.6027665 , 69.89082075, 70.9150767 ]
}
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 1024
max_dimension: 1024
pad_to_max_dimension: true
}
}
object_detection_task {
task_loss_weight: 1.0
offset_loss_weight: 1.0
scale_loss_weight: 0.1
localization_loss {
l1_localization_loss {
}
}
}
object_center_params {
object_center_loss_weight: 1.0
min_box_overlap_iou: 0.7
max_box_predictions: 100
classification_loss {
penalty_reduced_logistic_focal_loss {
alpha: 2.0
beta: 4.0
}
}
}
}
}
train_config: {
batch_size: 128
num_steps: 50000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
random_adjust_hue {
}
}
data_augmentation_options {
random_adjust_contrast {
}
}
data_augmentation_options {
random_adjust_saturation {
}
}
data_augmentation_options {
random_adjust_brightness {
}
}
data_augmentation_options {
random_square_crop_by_scale {
scale_min: 0.6
scale_max: 1.3
}
}
optimizer {
adam_optimizer: {
epsilon: 1e-7 # Match tf.keras.optimizers.Adam's default.
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: 1e-3
total_steps: 50000
warmup_learning_rate: 2.5e-4
warmup_steps: 5000
}
}
}
use_moving_average: false
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
fine_tune_checkpoint_version: V2
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/ckpt-1"
fine_tune_checkpoint_type: "detection"
}
train_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
}
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
batch_size: 1;
}
eval_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
shuffle: false
num_epochs: 1
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
}
}
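The config above leaves every dataset and checkpoint location as PATH_TO_BE_CONFIGURED. Below is a minimal sketch of one way to fill those fields in programmatically with the API's config utilities before handing the result to object_detection/model_main_tf2.py; all paths in the sketch are hypothetical placeholders, not values shipped with this repository.

```python
# Sketch only: every path below is a made-up placeholder.
from object_detection.utils import config_util

configs = config_util.get_configs_from_pipeline_file(
    'centernet_hourglass_1024x1024.config')  # hypothetical local copy of the config above

# Point the placeholders at real assets.
configs['train_config'].fine_tune_checkpoint = '/data/extremenet/ckpt-1'
configs['train_input_config'].label_map_path = '/data/mscoco_label_map.pbtxt'
configs['train_input_config'].tf_record_input_reader.input_path[0] = (
    '/data/coco/train2017-?????-of-00256.tfrecord')
configs['eval_input_config'].label_map_path = '/data/mscoco_label_map.pbtxt'
configs['eval_input_config'].tf_record_input_reader.input_path[0] = (
    '/data/coco/val2017-?????-of-00032.tfrecord')

# Write the rewritten pipeline.config; pass its path to model_main_tf2.py
# via --pipeline_config_path when launching training or evaluation.
pipeline_proto = config_util.create_pipeline_proto_from_configs(configs)
config_util.save_pipeline_config(pipeline_proto, '/tmp/centernet_hg104_run')
```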
# CenterNet meta-architecture from the "Objects as Points" [2] paper with the
# hourglass[1] backbone.
# [1]: https://arxiv.org/abs/1603.06937
# [2]: https://arxiv.org/abs/1904.07850
# Trained on COCO, initialized from an ExtremeNet detection checkpoint
# Train on TPU-8
#
# Achieves 41.9 mAP on COCO17 Val
model {
center_net {
num_classes: 90
feature_extractor {
type: "hourglass_104"
bgr_ordering: true
channel_means: [104.01362025, 114.03422265, 119.9165958 ]
channel_stds: [73.6027665 , 69.89082075, 70.9150767 ]
}
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 512
max_dimension: 512
pad_to_max_dimension: true
}
}
object_detection_task {
task_loss_weight: 1.0
offset_loss_weight: 1.0
scale_loss_weight: 0.1
localization_loss {
l1_localization_loss {
}
}
}
object_center_params {
object_center_loss_weight: 1.0
min_box_overlap_iou: 0.7
max_box_predictions: 100
classification_loss {
penalty_reduced_logistic_focal_loss {
alpha: 2.0
beta: 4.0
}
}
}
}
}
train_config: {
batch_size: 128
num_steps: 140000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
random_crop_image {
min_aspect_ratio: 0.5
max_aspect_ratio: 1.7
random_coef: 0.25
}
}
data_augmentation_options {
random_adjust_hue {
}
}
data_augmentation_options {
random_adjust_contrast {
}
}
data_augmentation_options {
random_adjust_saturation {
}
}
data_augmentation_options {
random_adjust_brightness {
}
}
data_augmentation_options {
random_absolute_pad_image {
max_height_padding: 200
max_width_padding: 200
pad_color: [0, 0, 0]
}
}
optimizer {
adam_optimizer: {
epsilon: 1e-7 # Match tf.keras.optimizers.Adam's default.
learning_rate: {
manual_step_learning_rate {
initial_learning_rate: 1e-3
schedule {
step: 90000
learning_rate: 1e-4
}
schedule {
step: 120000
learning_rate: 1e-5
}
}
}
}
use_moving_average: false
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
fine_tune_checkpoint_version: V2
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/ckpt-1"
fine_tune_checkpoint_type: "detection"
}
train_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
}
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
batch_size: 1;
}
eval_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
shuffle: false
num_epochs: 1
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
}
}
# CenterNet meta-architecture from the "Objects as Points" [1] paper
# with the ResNet-v2-101 backbone.
# [1]: https://arxiv.org/abs/1904.07850
# Train on TPU-8
#
# Achieves 34.18 mAP on COCO17 Val
model {
center_net {
num_classes: 90
feature_extractor {
type: "resnet_v2_101"
}
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 512
max_dimension: 512
pad_to_max_dimension: true
}
}
object_detection_task {
task_loss_weight: 1.0
offset_loss_weight: 1.0
scale_loss_weight: 0.1
localization_loss {
l1_localization_loss {
}
}
}
object_center_params {
object_center_loss_weight: 1.0
min_box_overlap_iou: 0.7
max_box_predictions: 100
classification_loss {
penalty_reduced_logistic_focal_loss {
alpha: 2.0
beta: 4.0
}
}
}
}
}
train_config: {
batch_size: 128
num_steps: 140000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
random_crop_image {
min_aspect_ratio: 0.5
max_aspect_ratio: 1.7
random_coef: 0.25
}
}
data_augmentation_options {
random_adjust_hue {
}
}
data_augmentation_options {
random_adjust_contrast {
}
}
data_augmentation_options {
random_adjust_saturation {
}
}
data_augmentation_options {
random_adjust_brightness {
}
}
data_augmentation_options {
random_absolute_pad_image {
max_height_padding: 200
max_width_padding: 200
pad_color: [0, 0, 0]
}
}
optimizer {
adam_optimizer: {
epsilon: 1e-7 # Match tf.keras.optimizers.Adam's default.
learning_rate: {
manual_step_learning_rate {
initial_learning_rate: 1e-3
schedule {
step: 90000
learning_rate: 1e-4
}
schedule {
step: 120000
learning_rate: 1e-5
}
}
}
}
use_moving_average: false
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
fine_tune_checkpoint_version: V2
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/weights-1"
fine_tune_checkpoint_type: "classification"
}
train_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
}
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
batch_size: 1;
}
eval_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
shuffle: false
num_epochs: 1
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
}
}
# Faster R-CNN with Resnet-101 (v1),
# w/high res inputs, long training schedule
# Trained on COCO, initialized from Imagenet classification checkpoint
#
# Train on TPU-8
#
# Achieves 37.1 mAP on COCO17 val
model {
faster_rcnn {
num_classes: 90
image_resizer {
fixed_shape_resizer {
width: 1024
height: 1024
}
}
feature_extractor {
type: 'faster_rcnn_resnet101_keras'
batch_norm_trainable: true
}
first_stage_anchor_generator {
grid_anchor_generator {
scales: [0.25, 0.5, 1.0, 2.0]
aspect_ratios: [0.5, 1.0, 2.0]
height_stride: 16
width_stride: 16
}
}
first_stage_box_predictor_conv_hyperparams {
op: CONV
regularizer {
l2_regularizer {
weight: 0.0
}
}
initializer {
truncated_normal_initializer {
stddev: 0.01
}
}
}
first_stage_nms_score_threshold: 0.0
first_stage_nms_iou_threshold: 0.7
first_stage_max_proposals: 300
first_stage_localization_loss_weight: 2.0
first_stage_objectness_loss_weight: 1.0
initial_crop_size: 14
maxpool_kernel_size: 2
maxpool_stride: 2
second_stage_box_predictor {
mask_rcnn_box_predictor {
use_dropout: false
dropout_keep_probability: 1.0
fc_hyperparams {
op: FC
regularizer {
l2_regularizer {
weight: 0.0
}
}
initializer {
variance_scaling_initializer {
factor: 1.0
uniform: true
mode: FAN_AVG
}
}
}
share_box_across_classes: true
}
}
second_stage_post_processing {
batch_non_max_suppression {
score_threshold: 0.0
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 300
}
score_converter: SOFTMAX
}
second_stage_localization_loss_weight: 2.0
second_stage_classification_loss_weight: 1.0
use_static_shapes: true
use_matmul_crop_and_resize: true
clip_anchors_to_image: true
use_static_balanced_label_sampler: true
use_matmul_gather_in_matcher: true
}
}
train_config: {
batch_size: 64
sync_replicas: true
startup_delay_steps: 0
replicas_to_aggregate: 8
num_steps: 100000
optimizer {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: .04
total_steps: 100000
warmup_learning_rate: .013333
warmup_steps: 2000
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
fine_tune_checkpoint_version: V2
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet101.ckpt-1"
fine_tune_checkpoint_type: "classification"
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
random_adjust_hue {
}
}
data_augmentation_options {
random_adjust_contrast {
}
}
data_augmentation_options {
random_adjust_saturation {
}
}
data_augmentation_options {
random_square_crop_by_scale {
scale_min: 0.6
scale_max: 1.3
}
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
use_bfloat16: true # works only on TPUs
}
train_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
}
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
batch_size: 1;
}
eval_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
shuffle: false
num_epochs: 1
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
}
}
# Faster R-CNN with Resnet-101 (v1) with 640x640 input resolution
# Trained on COCO, initialized from Imagenet classification checkpoint
#
# Train on TPU-8
#
# Achieves 31.8 mAP on COCO17 val
model {
faster_rcnn {
num_classes: 90
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 640
max_dimension: 640
pad_to_max_dimension: true
}
}
feature_extractor {
type: 'faster_rcnn_resnet101_keras'
batch_norm_trainable: true
}
first_stage_anchor_generator {
grid_anchor_generator {
scales: [0.25, 0.5, 1.0, 2.0]
aspect_ratios: [0.5, 1.0, 2.0]
height_stride: 16
width_stride: 16
}
}
first_stage_box_predictor_conv_hyperparams {
op: CONV
regularizer {
l2_regularizer {
weight: 0.0
}
}
initializer {
truncated_normal_initializer {
stddev: 0.01
}
}
}
first_stage_nms_score_threshold: 0.0
first_stage_nms_iou_threshold: 0.7
first_stage_max_proposals: 300
first_stage_localization_loss_weight: 2.0
first_stage_objectness_loss_weight: 1.0
initial_crop_size: 14
maxpool_kernel_size: 2
maxpool_stride: 2
second_stage_box_predictor {
mask_rcnn_box_predictor {
use_dropout: false
dropout_keep_probability: 1.0
fc_hyperparams {
op: FC
regularizer {
l2_regularizer {
weight: 0.0
}
}
initializer {
variance_scaling_initializer {
factor: 1.0
uniform: true
mode: FAN_AVG
}
}
}
share_box_across_classes: true
}
}
second_stage_post_processing {
batch_non_max_suppression {
score_threshold: 0.0
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 300
}
score_converter: SOFTMAX
}
second_stage_localization_loss_weight: 2.0
second_stage_classification_loss_weight: 1.0
use_static_shapes: true
use_matmul_crop_and_resize: true
clip_anchors_to_image: true
use_static_balanced_label_sampler: true
use_matmul_gather_in_matcher: true
}
}
train_config: {
batch_size: 64
sync_replicas: true
startup_delay_steps: 0
replicas_to_aggregate: 8
num_steps: 25000
optimizer {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: .04
total_steps: 25000
warmup_learning_rate: .013333
warmup_steps: 2000
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
fine_tune_checkpoint_version: V2
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet101.ckpt-1"
fine_tune_checkpoint_type: "classification"
data_augmentation_options {
random_horizontal_flip {
}
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
use_bfloat16: true # works only on TPUs
}
train_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
}
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
batch_size: 1;
}
eval_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
shuffle: false
num_epochs: 1
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
}
}
# Faster R-CNN with Resnet-101 (v1),
# Initialized from Imagenet classification checkpoint
#
# Train on GPU-8
#
# Achieves 36.6 mAP on COCO17 val
model {
faster_rcnn {
num_classes: 90
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 800
max_dimension: 1333
pad_to_max_dimension: true
}
}
feature_extractor {
type: 'faster_rcnn_resnet101_keras'
}
first_stage_anchor_generator {
grid_anchor_generator {
scales: [0.25, 0.5, 1.0, 2.0]
aspect_ratios: [0.5, 1.0, 2.0]
height_stride: 16
width_stride: 16
}
}
first_stage_box_predictor_conv_hyperparams {
op: CONV
regularizer {
l2_regularizer {
weight: 0.0
}
}
initializer {
truncated_normal_initializer {
stddev: 0.01
}
}
}
first_stage_nms_score_threshold: 0.0
first_stage_nms_iou_threshold: 0.7
first_stage_max_proposals: 300
first_stage_localization_loss_weight: 2.0
first_stage_objectness_loss_weight: 1.0
initial_crop_size: 14
maxpool_kernel_size: 2
maxpool_stride: 2
second_stage_box_predictor {
mask_rcnn_box_predictor {
use_dropout: false
dropout_keep_probability: 1.0
fc_hyperparams {
op: FC
regularizer {
l2_regularizer {
weight: 0.0
}
}
initializer {
variance_scaling_initializer {
factor: 1.0
uniform: true
mode: FAN_AVG
}
}
}
}
}
second_stage_post_processing {
batch_non_max_suppression {
score_threshold: 0.0
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SOFTMAX
}
second_stage_localization_loss_weight: 2.0
second_stage_classification_loss_weight: 1.0
}
}
train_config: {
batch_size: 16
num_steps: 200000
optimizer {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: 0.01
total_steps: 200000
warmup_learning_rate: 0.0
warmup_steps: 5000
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
gradient_clipping_by_norm: 10.0
fine_tune_checkpoint_version: V2
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet101.ckpt-1"
fine_tune_checkpoint_type: "classification"
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
random_adjust_hue {
}
}
data_augmentation_options {
random_adjust_contrast {
}
}
data_augmentation_options {
random_adjust_saturation {
}
}
data_augmentation_options {
random_square_crop_by_scale {
scale_min: 0.6
scale_max: 1.3
}
}
}
train_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
}
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
batch_size: 1;
}
eval_input_reader: {
label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
shuffle: false
num_epochs: 1
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
}
}