Merged commit includes the following changes: (#8830)

320622111 by rathodv:

    Internal Change.

--
PiperOrigin-RevId: 320622111

Co-authored-by: TF Object Detection Team <no-reply@google.com>

Commit 5245161c authored Jul 10, 2020 by vivek rathod; committed by GitHub Jul 10, 2020
20 changed files
--- a/research/object_detection/README.md
+++ b/research/object_detection/README.md
-![TensorFlow Requirement: 1.15](https://img.shields.io/badge/TensorFlow%20Requirement-1.15-brightgreen)
-![TensorFlow 2 Not Supported](https://img.shields.io/badge/TensorFlow%202%20Not%20Supported-%E2%9C%95-red.svg)
-
-# Tensorflow Object Detection API
+# TensorFlow Object Detection API
+[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
+[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
+[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)

 Creating accurate machine learning models capable of localizing and identifying
 multiple objects in a single image remains a core challenge in computer vision.
@@ -11,7 +11,7 @@ models. At Google we’ve certainly found this codebase to be useful for our
 computer vision needs, and we hope that you will as well. <p align="center">
 <img src="g3doc/img/kites_detections_output.jpg" width=676 height=450> </p>
 Contributions to the codebase are welcome and we would love to hear back from
-you if you find this API useful. Finally if you use the Tensorflow Object
+you if you find this API useful. Finally if you use the TensorFlow Object
 Detection API for a research publication, please consider citing:

 ```
@@ -26,91 +26,91 @@ Song Y, Guadarrama S, Murphy K, CVPR 2017
  <img src="g3doc/img/tf-od-api-logo.png" width=140 height=195>
 </p>

-## Maintainers
+## Support for TensorFlow 2 and 1
+The TensorFlow Object Detection API supports both TensorFlow 2 (TF2) and
+TensorFlow 1 (TF1). A majority of the modules in the library are both TF1 and
+TF2 compatible. In cases where they are not, we provide two versions.

-Name           | GitHub
-------------- | ---------------------------------------------
-Jonathan Huang | [jch1](https://github.com/jch1)
-Vivek Rathod   | [tombstone](https://github.com/tombstone)
-Ronny Votel    | [ronnyvotel](https://github.com/ronnyvotel)
-Derek Chow     | [derekjchow](https://github.com/derekjchow)
-Chen Sun       | [jesu9](https://github.com/jesu9)
-Menglong Zhu   | [dreamdragon](https://github.com/dreamdragon)
-Alireza Fathi  | [afathi3](https://github.com/afathi3)
-Zhichao Lu     | [pkulzc](https://github.com/pkulzc)
-
-## Table of contents
-
-Setup:
-
-*   <a href='g3doc/installation.md'>Installation</a><br>
-
-Quick Start:
-
-*   <a href='object_detection_tutorial.ipynb'>
-      Quick Start: Jupyter notebook for off-the-shelf inference</a><br>
-*   <a href="g3doc/running_pets.md">Quick Start: Training a pet detector</a><br>
-
-Customizing a Pipeline:
-
-*   <a href='g3doc/configuring_jobs.md'>
-      Configuring an object detection pipeline</a><br>
-*   <a href='g3doc/preparing_inputs.md'>Preparing inputs</a><br>
-
-Running:
-
-*   <a href='g3doc/running_locally.md'>Running locally</a><br>
-*   <a href='g3doc/running_on_cloud.md'>Running on the cloud</a><br>
-
-Extras:
-
-*   <a href='g3doc/detection_model_zoo.md'>Tensorflow detection model zoo</a><br>
-*   <a href='g3doc/exporting_models.md'>
-      Exporting a trained model for inference</a><br>
-*   <a href='g3doc/tpu_exporters.md'>
-      Exporting a trained model for TPU inference</a><br>
-*   <a href='g3doc/defining_your_own_model.md'>
-      Defining your own model architecture</a><br>
-*   <a href='g3doc/using_your_own_dataset.md'>
-      Bringing in your own dataset</a><br>
-*   <a href='g3doc/evaluation_protocols.md'>
-      Supported object detection evaluation protocols</a><br>
-*   <a href='g3doc/oid_inference_and_evaluation.md'>
-      Inference and evaluation on the Open Images dataset</a><br>
-*   <a href='g3doc/instance_segmentation.md'>
-      Run an instance segmentation model</a><br>
-*   <a href='g3doc/challenge_evaluation.md'>
-      Run the evaluation for the Open Images Challenge 2018/2019</a><br>
-*   <a href='g3doc/tpu_compatibility.md'>
-      TPU compatible detection pipelines</a><br>
-*   <a href='g3doc/running_on_mobile_tensorflowlite.md'>
-      Running object detection on mobile devices with TensorFlow Lite</a><br>
-*   <a href='g3doc/context_rcnn.md'>
-      Context R-CNN documentation for data preparation, training, and export</a><br>
+Although we will continue to maintain the TF1 models and provide support, we
+encourage users to try the Object Detection API with TF2 for the following
+reasons:

-## Getting Help
+* We provide new architectures supported in TF2 only and we will continue to
+  develop in TF2 going forward.

-To get help with issues you may encounter using the Tensorflow Object Detection
-API, create a new question on [StackOverflow](https://stackoverflow.com/) with
-the tags "tensorflow" and "object-detection".
+* The popular models we ported from TF1 to TF2 achieve the same performance.

-Please report bugs (actually broken code, not usage questions) to the
-tensorflow/models GitHub
-[issue tracker](https://github.com/tensorflow/models/issues), prefixing the
-issue name with "object_detection".
+* A single training and evaluation binary now supports both GPU and TPU
+  distribution strategies making it possible to train models with synchronous
+  SGD by default.
+
+* Eager execution with new binaries makes debugging easy!
+
+Finally, if you are an existing user of the Object Detection API, we have retained
+the same config language you are familiar with and ensured that the
+TF2 training/eval binary takes the same arguments as our TF1 binaries.
+
+Note: The models we provide in [TF2 Zoo](g3doc/tf2_detection_zoo.md) and
+[TF1 Zoo](g3doc/tf1_detection_zoo.md) are specific to the TensorFlow major
+version and are not interoperable.

-Please check [FAQ](g3doc/faq.md) for frequently asked questions before reporting
-an issue.
+Please select one of the two links below for TensorFlow version specific
+documentation of the Object Detection API:

-## Release information
-### June 17th, 2020
+<!-- mdlint off(WHITESPACE_LINE_LENGTH) -->
+
+[![Object Detection API TensorFlow 2](https://img.shields.io/badge/Object%20Detection%20API-TensorFlow%202-orange)](g3doc/tf2.md) \
+[![Object Detection API TensorFlow 1](https://img.shields.io/badge/Object%20Detection%20API-TensorFlow%201-orange)](g3doc/tf1.md)
+
+<!-- mdlint on -->
+
+## What's New
+
+### TensorFlow 2 Support
+
+We are happy to announce that the TF OD API officially supports TF2! Our release
+includes:
+
+* New binaries for train/eval/export that are designed to run in eager mode.
+* A suite of TF2 compatible (Keras-based) models; this includes migrations of
+  our most popular TF1.x models (e.g., SSD with MobileNet, RetinaNet,
+  Faster R-CNN, Mask R-CNN), as well as a few new architectures for which we
+  will only maintain TF2 implementations:
+
+    1. CenterNet - a simple and effective anchor-free architecture based on
+       the recent [Objects as Points](https://arxiv.org/abs/1904.07850) paper by
+       Zhou et al.
+    2. [EfficientDet](https://arxiv.org/abs/1911.09070) - a recent family of
+       SOTA models discovered with the help of Neural Architecture Search.
+
+* COCO pre-trained weights for all of the models provided as TF2 style
+  object-based checkpoints.
+* Access to [Distribution Strategies](https://www.tensorflow.org/guide/distributed_training)
+  for distributed training --- our models are designed to be trainable using sync
+  multi-GPU and TPU platforms.
+* Colabs demo’ing eager mode training and inference.
+
+See our release blogpost [here](https://blog.tensorflow.org/2020/07/tensorflow-2-meets-object-detection-api.html).
+If you are an existing user of the TF OD API using TF 1.x, don’t worry, we’ve
+got you covered.
+
+**Thanks to contributors**: Akhil Chinnakotla, Allen Lavoie, Anirudh Vegesana,
+Anjali Sridhar, Austin Myers, Dan Kondratyuk, David Ross, Derek Chow, Jaeyoun
+Kim, Jing Li, Jonathan Huang, Jordi Pont-Tuset, Karmel Allison, Kathy Ruan,
+Kaushik Shivakumar, Lu He, Mingxing Tan, Pengchong Jin, Ronny Votel, Sara Beery,
+Sergi Caelles Prat, Shan Yang, Sudheendra Vijayanarasimhan, Tina Tian, Tomer
+Kaftan, Vighnesh Birodkar, Vishnu Banna, Vivek Rathod, Yanhui Liang, Yiming Shi,
+Yixin Shi, Yu-hui Chen, Zhichao Lu.
+
+### Context R-CNN

 We have released [Context R-CNN](https://arxiv.org/abs/1912.03538), a model that
 uses attention to incorporate contextual information images (e.g. from
 temporally nearby frames taken by a static camera) in order to improve accuracy.
 Importantly, these contextual images need not be labeled.

-*   When applied to a challenging wildlife detection dataset ([Snapshot Serengeti](http://lila.science/datasets/snapshot-serengeti)),
+*   When applied to a challenging wildlife detection dataset
+    ([Snapshot Serengeti](http://lila.science/datasets/snapshot-serengeti)),
    Context R-CNN with context from up to a month of images outperforms a
    single-frame baseline by 17.9% mAP, and outperforms S3D (a 3d convolution
    based baseline) by 11.2% mAP.
@@ -118,282 +118,48 @@ Importantly, these contextual images need not be labeled.
    novel camera deployment to improve performance at that camera, boosting
    model generalizability.

-Read about Context R-CNN on the Google AI blog [here](https://ai.googleblog.com/2020/06/leveraging-temporal-context-for-object.html).
+Read about Context R-CNN on the Google AI blog
+[here](https://ai.googleblog.com/2020/06/leveraging-temporal-context-for-object.html).

 We have provided code for generating data with associated context
-[here](g3doc/context_rcnn.md), and a sample config for a Context R-CNN
-model [here](samples/configs/context_rcnn_resnet101_snapshot_serengeti_sync.config).
+[here](g3doc/context_rcnn.md), and a sample config for a Context R-CNN model
+[here](samples/configs/context_rcnn_resnet101_snapshot_serengeti_sync.config).

 Snapshot Serengeti-trained Faster R-CNN and Context R-CNN models can be found in
-the [model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#snapshot-serengeti-camera-trap-trained-models).
+the
+[model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md#snapshot-serengeti-camera-trap-trained-models).

 A colab demonstrating Context R-CNN is provided
 [here](colab_tutorials/context_rcnn_tutorial.ipynb).

 <b>Thanks to contributors</b>: Sara Beery, Jonathan Huang, Guanhang Wu, Vivek
-Rathod, Ronny Votel, Zhichao Lu, David Ross, Pietro Perona, Tanya Birch, and
-the Wildlife Insights AI Team.
-
-### May 19th, 2020
-
-We have released [MobileDets](https://arxiv.org/abs/2004.14525), a set of
-high-performance models for mobile CPUs, DSPs and EdgeTPUs.
-
-*   MobileDets outperform MobileNetV3+SSDLite by 1.7 mAP at comparable mobile
-    CPU inference latencies. MobileDets also outperform MobileNetV2+SSDLite by
-    1.9 mAP on mobile CPUs, 3.7 mAP on EdgeTPUs and 3.4 mAP on DSPs while
-    running equally fast. MobileDets also offer up to 2x speedup over MnasFPN on
-    EdgeTPUs and DSPs.
-
-For each of the three hardware platforms we have released model definition,
-model checkpoints trained on the COCO14 dataset and converted TFLite models in
-fp32 and/or uint8.
-
-<b>Thanks to contributors</b>: Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin
-Akin, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen,
-Quoc Le, Zhichao Lu.
-
-### May 7th, 2020
-
-We have released a mobile model with the
-[MnasFPN head](https://arxiv.org/abs/1912.01106).
-
-*   MnasFPN with MobileNet-V2 backbone is the most accurate (26.6 mAP at 183ms
-    on Pixel 1) mobile detection model we have released to date. With
-    depth-multiplier, MnasFPN with MobileNet-V2 backbone is 1.8 mAP higher than
-    MobileNet-V3-Large with SSDLite (23.8 mAP vs 22.0 mAP) at similar latency
-    (120ms) on Pixel 1.
-
-We have released model definition, model checkpoints trained on the COCO14
-dataset and a converted TFLite model.
-
-<b>Thanks to contributors</b>: Bo Chen, Golnaz Ghiasi, Hanxiao Liu, Tsung-Yi
-Lin, Dmitry Kalenichenko, Hartwig Adam, Quoc Le, Zhichao Lu, Jonathan Huang, Hao
-Xu.
-
-### Nov 13th, 2019
-
-We have released MobileNetEdgeTPU SSDLite model.
-
-*   SSDLite with MobileNetEdgeTPU backbone, which achieves 10% mAP higher than
-    MobileNetV2 SSDLite (24.3 mAP vs 22 mAP) on a Google Pixel4 at comparable
-    latency (6.6ms vs 6.8ms).
-
-Along with the model definition, we are also releasing model checkpoints trained
-on the COCO dataset.
-
-<b>Thanks to contributors</b>: Yunyang Xiong, Bo Chen, Suyog Gupta, Hanxiao Liu,
-Gabriel Bender, Mingxing Tan, Berkin Akin, Zhichao Lu, Quoc Le
-
-### Oct 15th, 2019
-
-We have released two MobileNet V3 SSDLite models (presented in
-[Searching for MobileNetV3](https://arxiv.org/abs/1905.02244)).
-
-*   SSDLite with MobileNet-V3-Large backbone, which is 27% faster than Mobilenet
-    V2 SSDLite (119ms vs 162ms) on a Google Pixel phone CPU at the same mAP.
-*   SSDLite with MobileNet-V3-Small backbone, which is 37% faster than MnasNet
-    SSDLite reduced with depth-multiplier (43ms vs 68ms) at the same mAP.
-
-Along with the model definition, we are also releasing model checkpoints trained
-on the COCO dataset.
-
-<b>Thanks to contributors</b>: Bo Chen, Zhichao Lu, Vivek Rathod, Jonathan Huang
-
-### July 1st, 2019
-
-We have released an updated set of utils and an updated
-[tutorial](g3doc/challenge_evaluation.md) for all three tracks of the
-[Open Images Challenge 2019](https://storage.googleapis.com/openimages/web/challenge2019.html)!
-
-The Instance Segmentation metric for
-[Open Images V5](https://storage.googleapis.com/openimages/web/index.html) and
-[Challenge 2019](https://storage.googleapis.com/openimages/web/challenge2019.html)
-is part of this release. Check out
-[the metric description](https://storage.googleapis.com/openimages/web/evaluation.html#instance_segmentation_eval)
-on the Open Images website.
-
-<b>Thanks to contributors</b>: Alina Kuznetsova, Rodrigo Benenson
-
-### Feb 11, 2019
-
-We have released detection models trained on the Open Images Dataset V4 in our
-detection model zoo, including
+Rathod, Ronny Votel, Zhichao Lu, David Ross, Pietro Perona, Tanya Birch, and the
+Wildlife Insights AI Team.

-*   Faster R-CNN detector with Inception Resnet V2 feature extractor
-*   SSD detector with MobileNet V2 feature extractor
-*   SSD detector with ResNet 101 FPN feature extractor (aka RetinaNet-101)
+## Release Notes
+See [notes](g3doc/release_notes.md) for all past releases.

-<b>Thanks to contributors</b>: Alina Kuznetsova, Yinxiao Li
-
-### Sep 17, 2018
-
-We have released Faster R-CNN detectors with ResNet-50 / ResNet-101 feature
-extractors trained on the
-[iNaturalist Species Detection Dataset](https://github.com/visipedia/inat_comp/blob/master/2017/README.md#bounding-boxes).
-The models are trained on the training split of the iNaturalist data for 4M
-iterations, they achieve 55% and 58% mean AP@.5 over 2854 classes respectively.
-For more details please refer to this [paper](https://arxiv.org/abs/1707.06642).
-
-<b>Thanks to contributors</b>: Chen Sun
-
-### July 13, 2018
-
-There are many new updates in this release, extending the functionality and
-capability of the API:
-
-*   Moving from slim-based training to
-    [Estimator](https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator)-based
-    training.
-*   Support for [RetinaNet](https://arxiv.org/abs/1708.02002), and a
-    [MobileNet](https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html)
-    adaptation of RetinaNet.
-*   A novel SSD-based architecture called the
-    [Pooling Pyramid Network](https://arxiv.org/abs/1807.03284) (PPN).
-*   Releasing several [TPU](https://cloud.google.com/tpu/)-compatible models.
-    These can be found in the `samples/configs/` directory with a comment in the
-    pipeline configuration files indicating TPU compatibility.
-*   Support for quantized training.
-*   Updated documentation for new binaries, Cloud training, and
-    [Tensorflow Lite](https://www.tensorflow.org/mobile/tflite/).
-
-See also our
-[expanded announcement blogpost](https://ai.googleblog.com/2018/07/accelerated-training-and-inference-with.html)
-and accompanying tutorial at the
-[TensorFlow blog](https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193).
-
-<b>Thanks to contributors</b>: Sara Robinson, Aakanksha Chowdhery, Derek Chow,
-Pengchong Jin, Jonathan Huang, Vivek Rathod, Zhichao Lu, Ronny Votel
-
-### June 25, 2018
-
-Additional evaluation tools for the
-[Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html)
-are out. Check out our short tutorial on data preparation and running evaluation
-[here](g3doc/challenge_evaluation.md)!
-
-<b>Thanks to contributors</b>: Alina Kuznetsova
-
-### June 5, 2018
-
-We have released the implementation of evaluation metrics for both tracks of the
-[Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html)
-as a part of the Object Detection API - see the
-[evaluation protocols](g3doc/evaluation_protocols.md) for more details.
-Additionally, we have released a tool for hierarchical labels expansion for the
-Open Images Challenge: check out
-[oid_hierarchical_labels_expansion.py](dataset_tools/oid_hierarchical_labels_expansion.py).
-
-<b>Thanks to contributors</b>: Alina Kuznetsova, Vittorio Ferrari, Jasper
-Uijlings
-
-### April 30, 2018
-
-We have released a Faster R-CNN detector with ResNet-101 feature extractor
-trained on [AVA](https://research.google.com/ava/) v2.1. Compared with other
-commonly used object detectors, it changes the action classification loss
-function to per-class Sigmoid loss to handle boxes with multiple labels. The
-model is trained on the training split of AVA v2.1 for 1.5M iterations, it
-achieves mean AP of 11.25% over 60 classes on the validation split of AVA v2.1.
-For more details please refer to this [paper](https://arxiv.org/abs/1705.08421).
-
-<b>Thanks to contributors</b>: Chen Sun, David Ross
-
-### April 2, 2018
-
-Supercharge your mobile phones with the next generation mobile object detector!
-We are adding support for MobileNet V2 with SSDLite presented in
-[MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381).
-This model is 35% faster than Mobilenet V1 SSD on a Google Pixel phone CPU
-(200ms vs. 270ms) at the same accuracy. Along with the model definition, we are
-also releasing a model checkpoint trained on the COCO dataset.
-
-<b>Thanks to contributors</b>: Menglong Zhu, Mark Sandler, Zhichao Lu, Vivek
-Rathod, Jonathan Huang
-
-### February 9, 2018
-
-We now support instance segmentation!! In this API update we support a number of
-instance segmentation models similar to those discussed in the
-[Mask R-CNN paper](https://arxiv.org/abs/1703.06870). For further details refer
-to [our slides](http://presentations.cocodataset.org/Places17-GMRI.pdf) from the
-2017 Coco + Places Workshop. Refer to the section on
-[Running an Instance Segmentation Model](g3doc/instance_segmentation.md) for
-instructions on how to configure a model that predicts masks in addition to
-object bounding boxes.
-
-<b>Thanks to contributors</b>: Alireza Fathi, Zhichao Lu, Vivek Rathod, Ronny
-Votel, Jonathan Huang
-
-### November 17, 2017
-
-As a part of the Open Images V3 release we have released:
-
-*   An implementation of the Open Images evaluation metric and the
-    [protocol](g3doc/evaluation_protocols.md#open-images).
-*   Additional tools to separate inference of detection and evaluation (see
-    [this tutorial](g3doc/oid_inference_and_evaluation.md)).
-*   A new detection model trained on the Open Images V2 data release (see
-    [Open Images model](g3doc/detection_model_zoo.md#open-images-models)).
-
-See more information on the
-[Open Images website](https://github.com/openimages/dataset)!
-
-<b>Thanks to contributors</b>: Stefan Popov, Alina Kuznetsova
-
-### November 6, 2017
-
-We have re-released faster versions of our (pre-trained) models in the
-<a href='g3doc/detection_model_zoo.md'>model zoo</a>. In addition to what was
-available before, we are also adding Faster R-CNN models trained on COCO with
-Inception V2 and Resnet-50 feature extractors, as well as a Faster R-CNN with
-Resnet-101 model trained on the KITTI dataset.
-
-<b>Thanks to contributors</b>: Jonathan Huang, Vivek Rathod, Derek Chow, Tal
-Remez, Chen Sun.
-
-### October 31, 2017
-
-We have released a new state-of-the-art model for object detection using the
-Faster-RCNN with the
-[NASNet-A image featurization](https://arxiv.org/abs/1707.07012). This model
-achieves mAP of 43.1% on the test-dev validation dataset for COCO, improving on
-the best available model in the zoo by 6% in terms of absolute mAP.
-
-<b>Thanks to contributors</b>: Barret Zoph, Vijay Vasudevan, Jonathon Shlens,
-Quoc Le
-
-### August 11, 2017
+## Getting Help

-We have released an update to the
-[Android Detect demo](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android)
-which will now run models trained using the Tensorflow Object Detection API on
-an Android device. By default, it currently runs a frozen SSD w/Mobilenet
-detector trained on COCO, but we encourage you to try out other detection
-models!
+To get help with issues you may encounter using the TensorFlow Object Detection
+API, create a new question on [StackOverflow](https://stackoverflow.com/) with
+the tags "tensorflow" and "object-detection".

-<b>Thanks to contributors</b>: Jonathan Huang, Andrew Harp
+Please report bugs (actually broken code, not usage questions) to the
+tensorflow/models GitHub
+[issue tracker](https://github.com/tensorflow/models/issues), prefixing the
+issue name with "object_detection".

-### June 15, 2017
+Please check the [FAQ](g3doc/faq.md) for frequently asked questions before
+reporting an issue.

-In addition to our base Tensorflow detection model definitions, this release
-includes:
+## Maintainers

-*   A selection of trainable detection models, including:
-    *   Single Shot Multibox Detector (SSD) with MobileNet,
-    *   SSD with Inception V2,
-    *   Region-Based Fully Convolutional Networks (R-FCN) with Resnet 101,
-    *   Faster RCNN with Resnet 101,
-    *   Faster RCNN with Inception Resnet v2
-*   Frozen weights (trained on the COCO dataset) for each of the above models to
-    be used for out-of-the-box inference purposes.
-*   A [Jupyter notebook](colab_tutorials/object_detection_tutorial.ipynb) for
-    performing out-of-the-box inference with one of our released models
-*   Convenient [local training](g3doc/running_locally.md) scripts as well as
-    distributed training and evaluation pipelines via
-    [Google Cloud](g3doc/running_on_cloud.md).
-
-<b>Thanks to contributors</b>: Jonathan Huang, Vivek Rathod, Derek Chow, Chen
-Sun, Menglong Zhu, Matthew Tang, Anoop Korattikara, Alireza Fathi, Ian Fischer,
-Zbigniew Wojna, Yang Song, Sergio Guadarrama, Jasper Uijlings, Viacheslav
-Kovalevskyi, Kevin Murphy
+* Jonathan Huang ([@GitHub jch1](https://github.com/jch1))
+* Vivek Rathod ([@GitHub tombstone](https://github.com/tombstone))
+* Vighnesh Birodkar ([@GitHub vighneshbirodkar](https://github.com/vighneshbirodkar))
+* Austin Myers ([@GitHub austin-myers](https://github.com/austin-myers))
+* Zhichao Lu ([@GitHub pkulzc](https://github.com/pkulzc))
+* Ronny Votel ([@GitHub ronnyvotel](https://github.com/ronnyvotel))
+* Yu-hui Chen ([@GitHub yuhuichen1015](https://github.com/yuhuichen1015))
+* Derek Chow ([@GitHub derekjchow](https://github.com/derekjchow))
--- a/research/object_detection/configs/tf2/center_net_hourglass104_1024x1024_coco17_tpu-32.config
+++ b/research/object_detection/configs/tf2/center_net_hourglass104_1024x1024_coco17_tpu-32.config
+# CenterNet meta-architecture from the "Objects as Points" [2] paper with the
+# hourglass[1] backbone.
+# [1]: https://arxiv.org/abs/1603.06937
+# [2]: https://arxiv.org/abs/1904.07850
+# Trained on COCO, initialized from Extremenet Detection checkpoint
+# Train on TPU-32 v3
+#
+# Achieves 44.6 mAP on COCO17 Val
+
+
+model {
+  center_net {
+    num_classes: 90
+    feature_extractor {
+      type: "hourglass_104"
+      bgr_ordering: true
+      channel_means: [104.01362025, 114.03422265, 119.9165958 ]
+      channel_stds: [73.6027665 , 69.89082075, 70.9150767 ]
+    }
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 1024
+        max_dimension: 1024
+        pad_to_max_dimension: true
+      }
+    }
+    object_detection_task {
+      task_loss_weight: 1.0
+      offset_loss_weight: 1.0
+      scale_loss_weight: 0.1
+      localization_loss {
+        l1_localization_loss {
+        }
+      }
+    }
+    object_center_params {
+      object_center_loss_weight: 1.0
+      min_box_overlap_iou: 0.7
+      max_box_predictions: 100
+      classification_loss {
+        penalty_reduced_logistic_focal_loss {
+          alpha: 2.0
+          beta: 4.0
+        }
+      }
+    }
+  }
+}
+
+train_config: {
+
+  batch_size: 128
+  num_steps: 50000
+
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_hue {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_contrast {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_saturation {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_brightness {
+    }
+  }
+
+  data_augmentation_options {
+    random_square_crop_by_scale {
+      scale_min: 0.6
+      scale_max: 1.3
+    }
+  }
+
+  optimizer {
+    adam_optimizer: {
+      epsilon: 1e-7  # Match tf.keras.optimizers.Adam's default.
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 1e-3
+          total_steps: 50000
+          warmup_learning_rate: 2.5e-4
+          warmup_steps: 5000
+        }
+      }
+    }
+    use_moving_average: false
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/ckpt-1"
+  fine_tune_checkpoint_type: "detection"
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1;
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
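
The `cosine_decay_learning_rate` block in the 1024x1024 config above combines a warmup phase with a cosine decay. The Python sketch below illustrates that shape using the config's values as defaults. The function name, the linear form of the warmup, and the exact decay formula are assumptions made for illustration; they are not taken from the Object Detection API source.

```python
import math


def cosine_decay_with_warmup(step,
                             learning_rate_base=1e-3,
                             total_steps=50000,
                             warmup_learning_rate=2.5e-4,
                             warmup_steps=5000):
    """Hypothetical helper: learning rate at a given training step.

    Assumes a linear ramp from warmup_learning_rate to learning_rate_base
    over warmup_steps, followed by a half-cosine decay to zero at total_steps.
    """
    if step < warmup_steps:
        # Linear warmup from the warmup rate up to the base rate.
        slope = (learning_rate_base - warmup_learning_rate) / warmup_steps
        return warmup_learning_rate + slope * step
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min((step - warmup_steps) / (total_steps - warmup_steps), 1.0)
    return 0.5 * learning_rate_base * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 2500, 5000, 27500, 50000):
        print(s, cosine_decay_with_warmup(s))
```

Under these assumptions the rate starts at 2.5e-4, reaches the base rate of 1e-3 at step 5000, and decays to zero at step 50000, which matches `num_steps` in the same config.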
--- a/research/object_detection/configs/tf2/center_net_hourglass104_512x512_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/center_net_hourglass104_512x512_coco17_tpu-8.config
+# CenterNet meta-architecture from the "Objects as Points" [2] paper with the
+# hourglass[1] backbone.
+# [1]: https://arxiv.org/abs/1603.06937
+# [2]: https://arxiv.org/abs/1904.07850
+# Trained on COCO, initialized from Extremenet Detection checkpoint
+# Train on TPU-8
+#
+# Achieves 41.9 mAP on COCO17 Val
+
+model {
+  center_net {
+    num_classes: 90
+    feature_extractor {
+      type: "hourglass_104"
+      bgr_ordering: true
+      channel_means: [104.01362025, 114.03422265, 119.9165958 ]
+      channel_stds: [73.6027665 , 69.89082075, 70.9150767 ]
+    }
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 512
+        max_dimension: 512
+        pad_to_max_dimension: true
+      }
+    }
+    object_detection_task {
+      task_loss_weight: 1.0
+      offset_loss_weight: 1.0
+      scale_loss_weight: 0.1
+      localization_loss {
+        l1_localization_loss {
+        }
+      }
+    }
+    object_center_params {
+      object_center_loss_weight: 1.0
+      min_box_overlap_iou: 0.7
+      max_box_predictions: 100
+      classification_loss {
+        penalty_reduced_logistic_focal_loss {
+          alpha: 2.0
+          beta: 4.0
+        }
+      }
+    }
+  }
+}
+
+train_config: {
+
+  batch_size: 128
+  num_steps: 140000
+
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  data_augmentation_options {
+    random_crop_image {
+      min_aspect_ratio: 0.5
+      max_aspect_ratio: 1.7
+      random_coef: 0.25
+    }
+  }
+
+
+  data_augmentation_options {
+    random_adjust_hue {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_contrast {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_saturation {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_brightness {
+    }
+  }
+
+  data_augmentation_options {
+    random_absolute_pad_image {
+      max_height_padding: 200
+      max_width_padding: 200
+      pad_color: [0, 0, 0]
+    }
+  }
+
+  optimizer {
+    adam_optimizer: {
+      epsilon: 1e-7  # Match tf.keras.optimizers.Adam's default.
+      learning_rate: {
+        manual_step_learning_rate {
+          initial_learning_rate: 1e-3
+          schedule {
+            step: 90000
+            learning_rate: 1e-4
+          }
+          schedule {
+            step: 120000
+            learning_rate: 1e-5
+          }
+        }
+      }
+    }
+    use_moving_average: false
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/ckpt-1"
+  fine_tune_checkpoint_type: "detection"
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1;
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
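
The `manual_step_learning_rate` block in the 512x512 config above describes a piecewise-constant schedule. The Python sketch below shows one reading of it, using the config's values as defaults. The function name is hypothetical, and the boundary behavior (the new rate taking effect at exactly the listed step) is an assumption rather than something the config states.

```python
def manual_step_learning_rate(step,
                              initial_learning_rate=1e-3,
                              schedule=((90000, 1e-4), (120000, 1e-5))):
    """Hypothetical helper: piecewise-constant learning rate.

    `schedule` is a sequence of (step, learning_rate) pairs in increasing
    step order. Assumes each rate applies from its step onward.
    """
    rate = initial_learning_rate
    for boundary, boundary_rate in schedule:
        if step >= boundary:
            # The latest boundary already passed determines the rate.
            rate = boundary_rate
    return rate


if __name__ == "__main__":
    for s in (0, 89999, 90000, 120000, 139999):
        print(s, manual_step_learning_rate(s))
```

With `num_steps: 140000`, this reading gives 90000 steps at 1e-3, 30000 steps at 1e-4, and the final 20000 steps at 1e-5.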
--- a/research/object_detection/configs/tf2/center_net_resnet101_v1_fpn_512x512_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/center_net_resnet101_v1_fpn_512x512_coco17_tpu-8.config
+# CenterNet meta-architecture from the "Objects as Points" [1] paper
+# with the ResNet-v1-101 FPN backbone.
+# [1]: https://arxiv.org/abs/1904.07850
+
+# Train on TPU-8
+#
+# Achieves 34.18 mAP on COCO17 Val
+
+
+model {
+  center_net {
+    num_classes: 90
+    feature_extractor {
+      type: "resnet_v2_101"
+    }
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 512
+        max_dimension: 512
+        pad_to_max_dimension: true
+      }
+    }
+    object_detection_task {
+      task_loss_weight: 1.0
+      offset_loss_weight: 1.0
+      scale_loss_weight: 0.1
+      localization_loss {
+        l1_localization_loss {
+        }
+      }
+    }
+    object_center_params {
+      object_center_loss_weight: 1.0
+      min_box_overlap_iou: 0.7
+      max_box_predictions: 100
+      classification_loss {
+        penalty_reduced_logistic_focal_loss {
+          alpha: 2.0
+          beta: 4.0
+        }
+      }
+    }
+  }
+}
+
+train_config: {
+
+  batch_size: 128
+  num_steps: 140000
+
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  data_augmentation_options {
+    random_crop_image {
+      min_aspect_ratio: 0.5
+      max_aspect_ratio: 1.7
+      random_coef: 0.25
+    }
+  }
+
+
+  data_augmentation_options {
+    random_adjust_hue {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_contrast {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_saturation {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_brightness {
+    }
+  }
+
+  data_augmentation_options {
+    random_absolute_pad_image {
+      max_height_padding: 200
+      max_width_padding: 200
+      pad_color: [0, 0, 0]
+    }
+  }
+
+  optimizer {
+    adam_optimizer: {
+      epsilon: 1e-7  # Match tf.keras.optimizers.Adam's default.
+      learning_rate: {
+        manual_step_learning_rate {
+          initial_learning_rate: 1e-3
+          schedule {
+            step: 90000
+            learning_rate: 1e-4
+          }
+          schedule {
+            step: 120000
+            learning_rate: 1e-5
+          }
+        }
+      }
+    }
+    use_moving_average: false
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/weights-1"
+  fine_tune_checkpoint_type: "classification"
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1;
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
+
--- a/research/object_detection/configs/tf2/faster_rcnn_resnet101_v1_1024x1024_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/faster_rcnn_resnet101_v1_1024x1024_coco17_tpu-8.config
+# Faster R-CNN with Resnet-101 (v1),
+# w/high res inputs, long training schedule
+# Trained on COCO, initialized from Imagenet classification checkpoint
+#
+# Train on TPU-8
+#
+# Achieves 37.1 mAP on COCO17 val
+
+model {
+  faster_rcnn {
+    num_classes: 90
+    image_resizer {
+      fixed_shape_resizer {
+        width: 1024
+        height: 1024
+      }
+    }
+    feature_extractor {
+      type: 'faster_rcnn_resnet101_keras'
+      batch_norm_trainable: true
+    }
+    first_stage_anchor_generator {
+      grid_anchor_generator {
+        scales: [0.25, 0.5, 1.0, 2.0]
+        aspect_ratios: [0.5, 1.0, 2.0]
+        height_stride: 16
+        width_stride: 16
+      }
+    }
+    first_stage_box_predictor_conv_hyperparams {
+      op: CONV
+      regularizer {
+        l2_regularizer {
+          weight: 0.0
+        }
+      }
+      initializer {
+        truncated_normal_initializer {
+          stddev: 0.01
+        }
+      }
+    }
+    first_stage_nms_score_threshold: 0.0
+    first_stage_nms_iou_threshold: 0.7
+    first_stage_max_proposals: 300
+    first_stage_localization_loss_weight: 2.0
+    first_stage_objectness_loss_weight: 1.0
+    initial_crop_size: 14
+    maxpool_kernel_size: 2
+    maxpool_stride: 2
+    second_stage_box_predictor {
+      mask_rcnn_box_predictor {
+        use_dropout: false
+        dropout_keep_probability: 1.0
+        fc_hyperparams {
+          op: FC
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            variance_scaling_initializer {
+              factor: 1.0
+              uniform: true
+              mode: FAN_AVG
+            }
+          }
+        }
+        share_box_across_classes: true
+      }
+    }
+    second_stage_post_processing {
+      batch_non_max_suppression {
+        score_threshold: 0.0
+        iou_threshold: 0.6
+        max_detections_per_class: 100
+        max_total_detections: 300
+      }
+      score_converter: SOFTMAX
+    }
+    second_stage_localization_loss_weight: 2.0
+    second_stage_classification_loss_weight: 1.0
+    use_static_shapes: true
+    use_matmul_crop_and_resize: true
+    clip_anchors_to_image: true
+    use_static_balanced_label_sampler: true
+    use_matmul_gather_in_matcher: true
+  }
+}
+
+train_config: {
+  batch_size: 64
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  num_steps: 100000
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: .04
+          total_steps: 100000
+          warmup_learning_rate: .013333
+          warmup_steps: 2000
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet101.ckpt-1"
+  fine_tune_checkpoint_type: "classification"
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_hue {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_contrast {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_saturation {
+    }
+  }
+
+  data_augmentation_options {
+    random_square_crop_by_scale {
+      scale_min: 0.6
+      scale_max: 1.3
+    }
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+  use_bfloat16: true  # works only on TPUs
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1;
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/faster_rcnn_resnet101_v1_640x640_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/faster_rcnn_resnet101_v1_640x640_coco17_tpu-8.config
+# Faster R-CNN with Resnet-101 (v1)
+# Trained on COCO, initialized from Imagenet classification checkpoint
+#
+# Train on TPU-8
+#
+# Achieves 31.8 mAP on COCO17 val
+
+model {
+  faster_rcnn {
+    num_classes: 90
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 640
+        max_dimension: 640
+        pad_to_max_dimension: true
+      }
+    }
+    feature_extractor {
+      type: 'faster_rcnn_resnet101_keras'
+      batch_norm_trainable: true
+    }
+    first_stage_anchor_generator {
+      grid_anchor_generator {
+        scales: [0.25, 0.5, 1.0, 2.0]
+        aspect_ratios: [0.5, 1.0, 2.0]
+        height_stride: 16
+        width_stride: 16
+      }
+    }
+    first_stage_box_predictor_conv_hyperparams {
+      op: CONV
+      regularizer {
+        l2_regularizer {
+          weight: 0.0
+        }
+      }
+      initializer {
+        truncated_normal_initializer {
+          stddev: 0.01
+        }
+      }
+    }
+    first_stage_nms_score_threshold: 0.0
+    first_stage_nms_iou_threshold: 0.7
+    first_stage_max_proposals: 300
+    first_stage_localization_loss_weight: 2.0
+    first_stage_objectness_loss_weight: 1.0
+    initial_crop_size: 14
+    maxpool_kernel_size: 2
+    maxpool_stride: 2
+    second_stage_box_predictor {
+      mask_rcnn_box_predictor {
+        use_dropout: false
+        dropout_keep_probability: 1.0
+        fc_hyperparams {
+          op: FC
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            variance_scaling_initializer {
+              factor: 1.0
+              uniform: true
+              mode: FAN_AVG
+            }
+          }
+        }
+        share_box_across_classes: true
+      }
+    }
+    second_stage_post_processing {
+      batch_non_max_suppression {
+        score_threshold: 0.0
+        iou_threshold: 0.6
+        max_detections_per_class: 100
+        max_total_detections: 300
+      }
+      score_converter: SOFTMAX
+    }
+    second_stage_localization_loss_weight: 2.0
+    second_stage_classification_loss_weight: 1.0
+    use_static_shapes: true
+    use_matmul_crop_and_resize: true
+    clip_anchors_to_image: true
+    use_static_balanced_label_sampler: true
+    use_matmul_gather_in_matcher: true
+  }
+}
+
+train_config: {
+  batch_size: 64
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  num_steps: 25000
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: .04
+          total_steps: 25000
+          warmup_learning_rate: .013333
+          warmup_steps: 2000
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet101.ckpt-1"
+  fine_tune_checkpoint_type: "classification"
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+  use_bfloat16: true  # works only on TPUs
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1;
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/faster_rcnn_resnet101_v1_800x1333_coco17_gpu-8.config
+++ b/research/object_detection/configs/tf2/faster_rcnn_resnet101_v1_800x1333_coco17_gpu-8.config
+# Faster R-CNN with Resnet-101 (v1),
+# Initialized from Imagenet classification checkpoint
+#
+# Train on GPU-8
+#
+# Achieves 36.6 mAP on COCO17 val
+
+model {
+  faster_rcnn {
+    num_classes: 90
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 800
+        max_dimension: 1333
+        pad_to_max_dimension: true
+      }
+    }
+    feature_extractor {
+      type: 'faster_rcnn_resnet101_keras'
+    }
+    first_stage_anchor_generator {
+      grid_anchor_generator {
+        scales: [0.25, 0.5, 1.0, 2.0]
+        aspect_ratios: [0.5, 1.0, 2.0]
+        height_stride: 16
+        width_stride: 16
+      }
+    }
+    first_stage_box_predictor_conv_hyperparams {
+      op: CONV
+      regularizer {
+        l2_regularizer {
+          weight: 0.0
+        }
+      }
+      initializer {
+        truncated_normal_initializer {
+          stddev: 0.01
+        }
+      }
+    }
+    first_stage_nms_score_threshold: 0.0
+    first_stage_nms_iou_threshold: 0.7
+    first_stage_max_proposals: 300
+    first_stage_localization_loss_weight: 2.0
+    first_stage_objectness_loss_weight: 1.0
+    initial_crop_size: 14
+    maxpool_kernel_size: 2
+    maxpool_stride: 2
+    second_stage_box_predictor {
+      mask_rcnn_box_predictor {
+        use_dropout: false
+        dropout_keep_probability: 1.0
+        fc_hyperparams {
+          op: FC
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            variance_scaling_initializer {
+              factor: 1.0
+              uniform: true
+              mode: FAN_AVG
+            }
+          }
+        }
+      }
+    }
+    second_stage_post_processing {
+      batch_non_max_suppression {
+        score_threshold: 0.0
+        iou_threshold: 0.6
+        max_detections_per_class: 100
+        max_total_detections: 100
+      }
+      score_converter: SOFTMAX
+    }
+    second_stage_localization_loss_weight: 2.0
+    second_stage_classification_loss_weight: 1.0
+  }
+}
+
+train_config: {
+  batch_size: 16
+  num_steps: 200000
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 0.01
+          total_steps: 200000
+          warmup_learning_rate: 0.0
+          warmup_steps: 5000
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  gradient_clipping_by_norm: 10.0
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet101.ckpt-1"
+  fine_tune_checkpoint_type: "classification"
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_hue {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_contrast {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_saturation {
+    }
+  }
+
+  data_augmentation_options {
+    random_square_crop_by_scale {
+      scale_min: 0.6
+      scale_max: 1.3
+    }
+  }
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1;
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/faster_rcnn_resnet152_v1_1024x1024_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/faster_rcnn_resnet152_v1_1024x1024_coco17_tpu-8.config
+# Faster R-CNN with Resnet-152 (v1)
+# w/high res inputs, long training schedule
+# Trained on COCO, initialized from Imagenet classification checkpoint
+#
+# Train on TPU-8
+#
+# Achieves 37.6 mAP on COCO17 val
+
+model {
+  faster_rcnn {
+    num_classes: 90
+    image_resizer {
+      fixed_shape_resizer {
+        width: 1024
+        height: 1024
+      }
+    }
+    feature_extractor {
+      type: 'faster_rcnn_resnet152_keras'
+      batch_norm_trainable: true
+    }
+    first_stage_anchor_generator {
+      grid_anchor_generator {
+        scales: [0.25, 0.5, 1.0, 2.0]
+        aspect_ratios: [0.5, 1.0, 2.0]
+        height_stride: 16
+        width_stride: 16
+      }
+    }
+    first_stage_box_predictor_conv_hyperparams {
+      op: CONV
+      regularizer {
+        l2_regularizer {
+          weight: 0.0
+        }
+      }
+      initializer {
+        truncated_normal_initializer {
+          stddev: 0.01
+        }
+      }
+    }
+    first_stage_nms_score_threshold: 0.0
+    first_stage_nms_iou_threshold: 0.7
+    first_stage_max_proposals: 300
+    first_stage_localization_loss_weight: 2.0
+    first_stage_objectness_loss_weight: 1.0
+    initial_crop_size: 14
+    maxpool_kernel_size: 2
+    maxpool_stride: 2
+    second_stage_box_predictor {
+      mask_rcnn_box_predictor {
+        use_dropout: false
+        dropout_keep_probability: 1.0
+        fc_hyperparams {
+          op: FC
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            variance_scaling_initializer {
+              factor: 1.0
+              uniform: true
+              mode: FAN_AVG
+            }
+          }
+        }
+        share_box_across_classes: true
+      }
+    }
+    second_stage_post_processing {
+      batch_non_max_suppression {
+        score_threshold: 0.0
+        iou_threshold: 0.6
+        max_detections_per_class: 100
+        max_total_detections: 300
+      }
+      score_converter: SOFTMAX
+    }
+    second_stage_localization_loss_weight: 2.0
+    second_stage_classification_loss_weight: 1.0
+    use_static_shapes: true
+    use_matmul_crop_and_resize: true
+    clip_anchors_to_image: true
+    use_static_balanced_label_sampler: true
+    use_matmul_gather_in_matcher: true
+  }
+}
+
+train_config: {
+  batch_size: 64
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  num_steps: 100000
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: .04
+          total_steps: 100000
+          warmup_learning_rate: .013333
+          warmup_steps: 2000
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet152.ckpt-1"
+  fine_tune_checkpoint_type: "classification"
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_hue {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_contrast {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_saturation {
+    }
+  }
+
+  data_augmentation_options {
+    random_square_crop_by_scale {
+      scale_min: 0.6
+      scale_max: 1.3
+    }
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+  use_bfloat16: true  # works only on TPUs
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1;
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/faster_rcnn_resnet152_v1_640x640_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/faster_rcnn_resnet152_v1_640x640_coco17_tpu-8.config
+# Faster R-CNN with Resnet-152 (v1)
+# Trained on COCO, initialized from Imagenet classification checkpoint
+#
+# Train on TPU-8
+#
+# Achieves 32.4 mAP on COCO17 val
+
+model {
+  faster_rcnn {
+    num_classes: 90
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 640
+        max_dimension: 640
+        pad_to_max_dimension: true
+      }
+    }
+    feature_extractor {
+      type: 'faster_rcnn_resnet152_keras'
+      batch_norm_trainable: true
+    }
+    first_stage_anchor_generator {
+      grid_anchor_generator {
+        scales: [0.25, 0.5, 1.0, 2.0]
+        aspect_ratios: [0.5, 1.0, 2.0]
+        height_stride: 16
+        width_stride: 16
+      }
+    }
+    first_stage_box_predictor_conv_hyperparams {
+      op: CONV
+      regularizer {
+        l2_regularizer {
+          weight: 0.0
+        }
+      }
+      initializer {
+        truncated_normal_initializer {
+          stddev: 0.01
+        }
+      }
+    }
+    first_stage_nms_score_threshold: 0.0
+    first_stage_nms_iou_threshold: 0.7
+    first_stage_max_proposals: 300
+    first_stage_localization_loss_weight: 2.0
+    first_stage_objectness_loss_weight: 1.0
+    initial_crop_size: 14
+    maxpool_kernel_size: 2
+    maxpool_stride: 2
+    second_stage_box_predictor {
+      mask_rcnn_box_predictor {
+        use_dropout: false
+        dropout_keep_probability: 1.0
+        fc_hyperparams {
+          op: FC
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            variance_scaling_initializer {
+              factor: 1.0
+              uniform: true
+              mode: FAN_AVG
+            }
+          }
+        }
+        share_box_across_classes: true
+      }
+    }
+    second_stage_post_processing {
+      batch_non_max_suppression {
+        score_threshold: 0.0
+        iou_threshold: 0.6
+        max_detections_per_class: 100
+        max_total_detections: 300
+      }
+      score_converter: SOFTMAX
+    }
+    second_stage_localization_loss_weight: 2.0
+    second_stage_classification_loss_weight: 1.0
+    use_static_shapes: true
+    use_matmul_crop_and_resize: true
+    clip_anchors_to_image: true
+    use_static_balanced_label_sampler: true
+    use_matmul_gather_in_matcher: true
+  }
+}
+
+train_config: {
+  batch_size: 64
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  num_steps: 25000
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: .04
+          total_steps: 25000
+          warmup_learning_rate: .013333
+          warmup_steps: 2000
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet152.ckpt-1"
+  fine_tune_checkpoint_type: "classification"
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+  use_bfloat16: true  # works only on TPUs
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1;
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/faster_rcnn_resnet152_v1_800x1333_coco17_gpu-8.config
+++ b/research/object_detection/configs/tf2/faster_rcnn_resnet152_v1_800x1333_coco17_gpu-8.config
+# Faster R-CNN with Resnet-152 (v1),
+# Initialized from Imagenet classification checkpoint
+#
+# Train on GPU-8
+#
+# Achieves 37.3 mAP on COCO17 val
+
+model {
+  faster_rcnn {
+    num_classes: 90
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 800
+        max_dimension: 1333
+        pad_to_max_dimension: true
+      }
+    }
+    feature_extractor {
+      type: 'faster_rcnn_resnet152_keras'
+    }
+    first_stage_anchor_generator {
+      grid_anchor_generator {
+        scales: [0.25, 0.5, 1.0, 2.0]
+        aspect_ratios: [0.5, 1.0, 2.0]
+        height_stride: 16
+        width_stride: 16
+      }
+    }
+    first_stage_box_predictor_conv_hyperparams {
+      op: CONV
+      regularizer {
+        l2_regularizer {
+          weight: 0.0
+        }
+      }
+      initializer {
+        truncated_normal_initializer {
+          stddev: 0.01
+        }
+      }
+    }
+    first_stage_nms_score_threshold: 0.0
+    first_stage_nms_iou_threshold: 0.7
+    first_stage_max_proposals: 300
+    first_stage_localization_loss_weight: 2.0
+    first_stage_objectness_loss_weight: 1.0
+    initial_crop_size: 14
+    maxpool_kernel_size: 2
+    maxpool_stride: 2
+    second_stage_box_predictor {
+      mask_rcnn_box_predictor {
+        use_dropout: false
+        dropout_keep_probability: 1.0
+        fc_hyperparams {
+          op: FC
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            variance_scaling_initializer {
+              factor: 1.0
+              uniform: true
+              mode: FAN_AVG
+            }
+          }
+        }
+      }
+    }
+    second_stage_post_processing {
+      batch_non_max_suppression {
+        score_threshold: 0.0
+        iou_threshold: 0.6
+        max_detections_per_class: 100
+        max_total_detections: 100
+      }
+      score_converter: SOFTMAX
+    }
+    second_stage_localization_loss_weight: 2.0
+    second_stage_classification_loss_weight: 1.0
+  }
+}
+
+train_config: {
+  batch_size: 16
+  num_steps: 200000
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 0.01
+          total_steps: 200000
+          warmup_learning_rate: 0.0
+          warmup_steps: 5000
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  gradient_clipping_by_norm: 10.0
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet152.ckpt-1"
+  fine_tune_checkpoint_type: "classification"
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_hue {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_contrast {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_saturation {
+    }
+  }
+
+  data_augmentation_options {
+    random_square_crop_by_scale {
+      scale_min: 0.6
+      scale_max: 1.3
+    }
+  }
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1;
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8.config
+# Faster R-CNN with Resnet-50 (v1),
+# w/high res inputs, long training schedule
+# Trained on COCO, initialized from Imagenet classification checkpoint
+#
+# Train on TPU-8
+#
+# Achieves 31.0 mAP on COCO17 val
+
+model {
+  faster_rcnn {
+    num_classes: 90
+    image_resizer {
+      fixed_shape_resizer {
+        width: 1024
+        height: 1024
+      }
+    }
+    feature_extractor {
+      type: 'faster_rcnn_resnet50_keras'
+      batch_norm_trainable: true
+    }
+    first_stage_anchor_generator {
+      grid_anchor_generator {
+        scales: [0.25, 0.5, 1.0, 2.0]
+        aspect_ratios: [0.5, 1.0, 2.0]
+        height_stride: 16
+        width_stride: 16
+      }
+    }
+    first_stage_box_predictor_conv_hyperparams {
+      op: CONV
+      regularizer {
+        l2_regularizer {
+          weight: 0.0
+        }
+      }
+      initializer {
+        truncated_normal_initializer {
+          stddev: 0.01
+        }
+      }
+    }
+    first_stage_nms_score_threshold: 0.0
+    first_stage_nms_iou_threshold: 0.7
+    first_stage_max_proposals: 300
+    first_stage_localization_loss_weight: 2.0
+    first_stage_objectness_loss_weight: 1.0
+    initial_crop_size: 14
+    maxpool_kernel_size: 2
+    maxpool_stride: 2
+    second_stage_box_predictor {
+      mask_rcnn_box_predictor {
+        use_dropout: false
+        dropout_keep_probability: 1.0
+        fc_hyperparams {
+          op: FC
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            variance_scaling_initializer {
+              factor: 1.0
+              uniform: true
+              mode: FAN_AVG
+            }
+          }
+        }
+        share_box_across_classes: true
+      }
+    }
+    second_stage_post_processing {
+      batch_non_max_suppression {
+        score_threshold: 0.0
+        iou_threshold: 0.6
+        max_detections_per_class: 100
+        max_total_detections: 300
+      }
+      score_converter: SOFTMAX
+    }
+    second_stage_localization_loss_weight: 2.0
+    second_stage_classification_loss_weight: 1.0
+    use_static_shapes: true
+    use_matmul_crop_and_resize: true
+    clip_anchors_to_image: true
+    use_static_balanced_label_sampler: true
+    use_matmul_gather_in_matcher: true
+  }
+}
+
+train_config: {
+  batch_size: 64
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  num_steps: 100000
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: .04
+          total_steps: 100000
+          warmup_learning_rate: .013333
+          warmup_steps: 2000
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet50.ckpt-1"
+  fine_tune_checkpoint_type: "classification"
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_hue {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_contrast {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_saturation {
+    }
+  }
+
+  data_augmentation_options {
+    random_square_crop_by_scale {
+      scale_min: 0.6
+      scale_max: 1.3
+    }
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+  use_bfloat16: true  # works only on TPUs
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1;
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.config
+# Faster R-CNN with Resnet-50 (v1) with 640x640 input resolution
+# Trained on COCO, initialized from Imagenet classification checkpoint
+#
+# Train on TPU-8
+#
+# Achieves 29.3 mAP on COCO17 Val
+
+model {
+  faster_rcnn {
+    num_classes: 90
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 640
+        max_dimension: 640
+        pad_to_max_dimension: true
+      }
+    }
+    feature_extractor {
+      type: 'faster_rcnn_resnet50_keras'
+      batch_norm_trainable: true
+    }
+    first_stage_anchor_generator {
+      grid_anchor_generator {
+        scales: [0.25, 0.5, 1.0, 2.0]
+        aspect_ratios: [0.5, 1.0, 2.0]
+        height_stride: 16
+        width_stride: 16
+      }
+    }
+    first_stage_box_predictor_conv_hyperparams {
+      op: CONV
+      regularizer {
+        l2_regularizer {
+          weight: 0.0
+        }
+      }
+      initializer {
+        truncated_normal_initializer {
+          stddev: 0.01
+        }
+      }
+    }
+    first_stage_nms_score_threshold: 0.0
+    first_stage_nms_iou_threshold: 0.7
+    first_stage_max_proposals: 300
+    first_stage_localization_loss_weight: 2.0
+    first_stage_objectness_loss_weight: 1.0
+    initial_crop_size: 14
+    maxpool_kernel_size: 2
+    maxpool_stride: 2
+    second_stage_box_predictor {
+      mask_rcnn_box_predictor {
+        use_dropout: false
+        dropout_keep_probability: 1.0
+        fc_hyperparams {
+          op: FC
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            variance_scaling_initializer {
+              factor: 1.0
+              uniform: true
+              mode: FAN_AVG
+            }
+          }
+        }
+        share_box_across_classes: true
+      }
+    }
+    second_stage_post_processing {
+      batch_non_max_suppression {
+        score_threshold: 0.0
+        iou_threshold: 0.6
+        max_detections_per_class: 100
+        max_total_detections: 300
+      }
+      score_converter: SOFTMAX
+    }
+    second_stage_localization_loss_weight: 2.0
+    second_stage_classification_loss_weight: 1.0
+    use_static_shapes: true
+    use_matmul_crop_and_resize: true
+    clip_anchors_to_image: true
+    use_static_balanced_label_sampler: true
+    use_matmul_gather_in_matcher: true
+  }
+}
+
+train_config: {
+  batch_size: 64
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  num_steps: 25000
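+  # Note: assuming batch_size is the global batch and is split evenly across
+  # the 8 TPU cores, each core processes 64 / 8 = 8 images per step.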
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: .04
+          total_steps: 25000
+          warmup_learning_rate: .013333
+          warmup_steps: 2000
+        }
+      }
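+      # Sketch of the schedule above, assuming linear warmup followed by a
+      # half-cosine decay to zero (an assumption about the implementation):
+      #   step <  2000: lr = .013333 + (.04 - .013333) * step / 2000
+      #   step >= 2000: lr = 0.5 * .04 * (1 + cos(pi * (step - 2000) / 23000))
+      # e.g. lr(0) = .013333, lr(2000) = .04, lr(13500) = .02, lr(25000) = 0.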
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet50.ckpt-1"
+  fine_tune_checkpoint_type: "classification"
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+  use_bfloat16: true  # works only on TPUs
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/faster_rcnn_resnet50_v1_800x1333_coco17_gpu-8.config
+++ b/research/object_detection/configs/tf2/faster_rcnn_resnet50_v1_800x1333_coco17_gpu-8.config
+# Faster R-CNN with Resnet-50 (v1) with 800x1333 input resolution
+# Trained on COCO, initialized from Imagenet classification checkpoint
+#
+# Train on GPU-8
+#
+# Achieves 31.4 mAP on COCO17 val
+
+model {
+  faster_rcnn {
+    num_classes: 90
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 800
+        max_dimension: 1333
+        pad_to_max_dimension: true
+      }
+    }
+    feature_extractor {
+      type: 'faster_rcnn_resnet50_keras'
+    }
+    first_stage_anchor_generator {
+      grid_anchor_generator {
+        scales: [0.25, 0.5, 1.0, 2.0]
+        aspect_ratios: [0.5, 1.0, 2.0]
+        height_stride: 16
+        width_stride: 16
+      }
+    }
+    first_stage_box_predictor_conv_hyperparams {
+      op: CONV
+      regularizer {
+        l2_regularizer {
+          weight: 0.0
+        }
+      }
+      initializer {
+        truncated_normal_initializer {
+          stddev: 0.01
+        }
+      }
+    }
+    first_stage_nms_score_threshold: 0.0
+    first_stage_nms_iou_threshold: 0.7
+    first_stage_max_proposals: 300
+    first_stage_localization_loss_weight: 2.0
+    first_stage_objectness_loss_weight: 1.0
+    initial_crop_size: 14
+    maxpool_kernel_size: 2
+    maxpool_stride: 2
+    second_stage_box_predictor {
+      mask_rcnn_box_predictor {
+        use_dropout: false
+        dropout_keep_probability: 1.0
+        fc_hyperparams {
+          op: FC
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            variance_scaling_initializer {
+              factor: 1.0
+              uniform: true
+              mode: FAN_AVG
+            }
+          }
+        }
+      }
+    }
+    second_stage_post_processing {
+      batch_non_max_suppression {
+        score_threshold: 0.0
+        iou_threshold: 0.6
+        max_detections_per_class: 100
+        max_total_detections: 100
+      }
+      score_converter: SOFTMAX
+    }
+    second_stage_localization_loss_weight: 2.0
+    second_stage_classification_loss_weight: 1.0
+  }
+}
+
+train_config: {
+  batch_size: 16
+  num_steps: 200000
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 0.01
+          total_steps: 200000
+          warmup_learning_rate: 0.0
+          warmup_steps: 5000
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  gradient_clipping_by_norm: 10.0
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/resnet50.ckpt-1"
+  fine_tune_checkpoint_type: "classification"
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_hue {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_contrast {
+    }
+  }
+
+  data_augmentation_options {
+    random_adjust_saturation {
+    }
+  }
+
+  data_augmentation_options {
+    random_square_crop_by_scale {
+      scale_min: 0.6
+      scale_max: 1.3
+    }
+  }
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.config
+++ b/research/object_detection/configs/tf2/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.config
+# Mask R-CNN with Inception Resnet v2 (no atrous)
+# Sync-trained on COCO (with 8 GPUs) with batch size 16 (1024x1024 resolution)
+# Initialized from Imagenet classification checkpoint
+#
+# Train on GPU-8
+#
+# Achieves 40.4 box mAP and 35.5 mask mAP on COCO17 val
+
+model {
+  faster_rcnn {
+    number_of_stages: 3
+    num_classes: 90
+    image_resizer {
+      fixed_shape_resizer {
+        height: 1024
+        width: 1024
+      }
+    }
+    feature_extractor {
+      type: 'faster_rcnn_inception_resnet_v2_keras'
+    }
+    first_stage_anchor_generator {
+      grid_anchor_generator {
+        scales: [0.25, 0.5, 1.0, 2.0]
+        aspect_ratios: [0.5, 1.0, 2.0]
+        height_stride: 16
+        width_stride: 16
+      }
+    }
+    first_stage_box_predictor_conv_hyperparams {
+      op: CONV
+      regularizer {
+        l2_regularizer {
+          weight: 0.0
+        }
+      }
+      initializer {
+        truncated_normal_initializer {
+          stddev: 0.01
+        }
+      }
+    }
+    first_stage_nms_score_threshold: 0.0
+    first_stage_nms_iou_threshold: 0.7
+    first_stage_max_proposals: 300
+    first_stage_localization_loss_weight: 2.0
+    first_stage_objectness_loss_weight: 1.0
+    initial_crop_size: 17
+    maxpool_kernel_size: 1
+    maxpool_stride: 1
+    second_stage_box_predictor {
+      mask_rcnn_box_predictor {
+        use_dropout: false
+        dropout_keep_probability: 1.0
+        fc_hyperparams {
+          op: FC
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            variance_scaling_initializer {
+              factor: 1.0
+              uniform: true
+              mode: FAN_AVG
+            }
+          }
+        }
+        mask_height: 33
+        mask_width: 33
+        mask_prediction_conv_depth: 0
+        mask_prediction_num_conv_layers: 4
+        conv_hyperparams {
+          op: CONV
+          regularizer {
+            l2_regularizer {
+              weight: 0.0
+            }
+          }
+          initializer {
+            truncated_normal_initializer {
+              stddev: 0.01
+            }
+          }
+        }
+        predict_instance_masks: true
+      }
+    }
+    second_stage_post_processing {
+      batch_non_max_suppression {
+        score_threshold: 0.0
+        iou_threshold: 0.6
+        max_detections_per_class: 100
+        max_total_detections: 100
+      }
+      score_converter: SOFTMAX
+    }
+    second_stage_localization_loss_weight: 2.0
+    second_stage_classification_loss_weight: 1.0
+    second_stage_mask_prediction_loss_weight: 4.0
+    resize_masks: false
+  }
+}
+
+train_config: {
+  batch_size: 16
+  num_steps: 200000
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 0.008
+          total_steps: 200000
+          warmup_learning_rate: 0.0
+          warmup_steps: 5000
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  gradient_clipping_by_norm: 10.0
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/inception_resnet_v2.ckpt-1"
+  fine_tune_checkpoint_type: "classification"
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+  load_instance_masks: true
+  mask_type: PNG_MASKS
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  metrics_set: "coco_mask_metrics"
+  eval_instance_masks: true
+  use_moving_averages: false
+  batch_size: 1
+  include_metrics_per_category: true
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+  load_instance_masks: true
+  mask_type: PNG_MASKS
+}
--- a/research/object_detection/configs/tf2/ssd_efficientdet_d0_512x512_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/ssd_efficientdet_d0_512x512_coco17_tpu-8.config
+# SSD with EfficientNet-b0 + BiFPN feature extractor,
+# shared box predictor and focal loss (a.k.a EfficientDet-d0).
+# See EfficientDet, Tan et al, https://arxiv.org/abs/1911.09070
+# See Lin et al, https://arxiv.org/abs/1708.02002
+# Trained on COCO, initialized from an EfficientNet-b0 checkpoint.
+#
+# Train on TPU-8
+
+model {
+  ssd {
+    inplace_batchnorm_update: true
+    freeze_batchnorm: false
+    num_classes: 90
+    add_background_class: false
+    box_coder {
+      faster_rcnn_box_coder {
+        y_scale: 10.0
+        x_scale: 10.0
+        height_scale: 5.0
+        width_scale: 5.0
+      }
+    }
+    matcher {
+      argmax_matcher {
+        matched_threshold: 0.5
+        unmatched_threshold: 0.5
+        ignore_thresholds: false
+        negatives_lower_than_unmatched: true
+        force_match_for_each_row: true
+        use_matmul_gather: true
+      }
+    }
+    similarity_calculator {
+      iou_similarity {
+      }
+    }
+    encode_background_as_zeros: true
+    anchor_generator {
+      multiscale_anchor_generator {
+        min_level: 3
+        max_level: 7
+        anchor_scale: 4.0
+        aspect_ratios: [1.0, 2.0, 0.5]
+        scales_per_octave: 3
+      }
+    }
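+    # Note: assuming the usual multiscale convention (stride 2^level, base
+    # anchor size anchor_scale * stride), levels 3..7 give strides 8..128 and
+    # base sizes 32..512 px; 3 scales per octave (2^0, 2^(1/3), 2^(2/3)) times
+    # 3 aspect ratios yields 9 anchors per location.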
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 512
+        max_dimension: 512
+        pad_to_max_dimension: true
+      }
+    }
+    box_predictor {
+      weight_shared_convolutional_box_predictor {
+        depth: 64
+        class_prediction_bias_init: -4.6
+        conv_hyperparams {
+          force_use_bias: true
+          activation: SWISH
+          regularizer {
+            l2_regularizer {
+              weight: 0.00004
+            }
+          }
+          initializer {
+            random_normal_initializer {
+              stddev: 0.01
+              mean: 0.0
+            }
+          }
+          batch_norm {
+            scale: true
+            decay: 0.99
+            epsilon: 0.001
+          }
+        }
+        num_layers_before_predictor: 3
+        kernel_size: 3
+        use_depthwise: true
+      }
+    }
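+    # Note: class_prediction_bias_init: -4.6 above is consistent with the
+    # focal-loss prior initialization of Lin et al. (cited in the header):
+    # with foreground prior pi = 0.01, b = -log((1 - pi) / pi) = -log(99),
+    # which is approximately -4.6.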
+    feature_extractor {
+      type: 'ssd_efficientnet-b0_bifpn_keras'
+      bifpn {
+        min_level: 3
+        max_level: 7
+        num_iterations: 3
+        num_filters: 64
+      }
+      conv_hyperparams {
+        force_use_bias: true
+        activation: SWISH
+        regularizer {
+          l2_regularizer {
+            weight: 0.00004
+          }
+        }
+        initializer {
+          truncated_normal_initializer {
+            stddev: 0.03
+            mean: 0.0
+          }
+        }
+        batch_norm {
+          scale: true
+          decay: 0.99
+          epsilon: 0.001
+        }
+      }
+    }
+    loss {
+      classification_loss {
+        weighted_sigmoid_focal {
+          alpha: 0.25
+          gamma: 1.5
+        }
+      }
+      localization_loss {
+        weighted_smooth_l1 {
+        }
+      }
+      classification_weight: 1.0
+      localization_weight: 1.0
+    }
+    normalize_loss_by_num_matches: true
+    normalize_loc_loss_by_codesize: true
+    post_processing {
+      batch_non_max_suppression {
+        score_threshold: 1e-8
+        iou_threshold: 0.5
+        max_detections_per_class: 100
+        max_total_detections: 100
+      }
+      score_converter: SIGMOID
+    }
+  }
+}
+
+train_config: {
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/ckpt-0"
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint_type: "classification"
+  batch_size: 128
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  use_bfloat16: true
+  num_steps: 300000
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+  data_augmentation_options {
+    random_scale_crop_and_pad_to_square {
+      output_size: 512
+      scale_min: 0.1
+      scale_max: 2.0
+    }
+  }
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 8e-2
+          total_steps: 300000
+          warmup_learning_rate: .001
+          warmup_steps: 2500
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/ssd_efficientdet_d1_640x640_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/ssd_efficientdet_d1_640x640_coco17_tpu-8.config
+# SSD with EfficientNet-b1 + BiFPN feature extractor,
+# shared box predictor and focal loss (a.k.a EfficientDet-d1).
+# See EfficientDet, Tan et al, https://arxiv.org/abs/1911.09070
+# See Lin et al, https://arxiv.org/abs/1708.02002
+# Trained on COCO, initialized from an EfficientNet-b1 checkpoint.
+#
+# Train on TPU-8
+
+model {
+  ssd {
+    inplace_batchnorm_update: true
+    freeze_batchnorm: false
+    num_classes: 90
+    add_background_class: false
+    box_coder {
+      faster_rcnn_box_coder {
+        y_scale: 10.0
+        x_scale: 10.0
+        height_scale: 5.0
+        width_scale: 5.0
+      }
+    }
+    matcher {
+      argmax_matcher {
+        matched_threshold: 0.5
+        unmatched_threshold: 0.5
+        ignore_thresholds: false
+        negatives_lower_than_unmatched: true
+        force_match_for_each_row: true
+        use_matmul_gather: true
+      }
+    }
+    similarity_calculator {
+      iou_similarity {
+      }
+    }
+    encode_background_as_zeros: true
+    anchor_generator {
+      multiscale_anchor_generator {
+        min_level: 3
+        max_level: 7
+        anchor_scale: 4.0
+        aspect_ratios: [1.0, 2.0, 0.5]
+        scales_per_octave: 3
+      }
+    }
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 640
+        max_dimension: 640
+        pad_to_max_dimension: true
+      }
+    }
+    box_predictor {
+      weight_shared_convolutional_box_predictor {
+        depth: 88
+        class_prediction_bias_init: -4.6
+        conv_hyperparams {
+          force_use_bias: true
+          activation: SWISH
+          regularizer {
+            l2_regularizer {
+              weight: 0.00004
+            }
+          }
+          initializer {
+            random_normal_initializer {
+              stddev: 0.01
+              mean: 0.0
+            }
+          }
+          batch_norm {
+            scale: true
+            decay: 0.99
+            epsilon: 0.001
+          }
+        }
+        num_layers_before_predictor: 3
+        kernel_size: 3
+        use_depthwise: true
+      }
+    }
+    feature_extractor {
+      type: 'ssd_efficientnet-b1_bifpn_keras'
+      bifpn {
+        min_level: 3
+        max_level: 7
+        num_iterations: 4
+        num_filters: 88
+      }
+      conv_hyperparams {
+        force_use_bias: true
+        activation: SWISH
+        regularizer {
+          l2_regularizer {
+            weight: 0.00004
+          }
+        }
+        initializer {
+          truncated_normal_initializer {
+            stddev: 0.03
+            mean: 0.0
+          }
+        }
+        batch_norm {
+          scale: true
+          decay: 0.99
+          epsilon: 0.001
+        }
+      }
+    }
+    loss {
+      classification_loss {
+        weighted_sigmoid_focal {
+          alpha: 0.25
+          gamma: 1.5
+        }
+      }
+      localization_loss {
+        weighted_smooth_l1 {
+        }
+      }
+      classification_weight: 1.0
+      localization_weight: 1.0
+    }
+    normalize_loss_by_num_matches: true
+    normalize_loc_loss_by_codesize: true
+    post_processing {
+      batch_non_max_suppression {
+        score_threshold: 1e-8
+        iou_threshold: 0.5
+        max_detections_per_class: 100
+        max_total_detections: 100
+      }
+      score_converter: SIGMOID
+    }
+  }
+}
+
+train_config: {
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/ckpt-0"
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint_type: "classification"
+  batch_size: 128
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  use_bfloat16: true
+  num_steps: 300000
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+  data_augmentation_options {
+    random_scale_crop_and_pad_to_square {
+      output_size: 640
+      scale_min: 0.1
+      scale_max: 2.0
+    }
+  }
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 8e-2
+          total_steps: 300000
+          warmup_learning_rate: .001
+          warmup_steps: 2500
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/ssd_efficientdet_d2_768x768_coco17_tpu-8.config
+++ b/research/object_detection/configs/tf2/ssd_efficientdet_d2_768x768_coco17_tpu-8.config
+# SSD with EfficientNet-b2 + BiFPN feature extractor,
+# shared box predictor and focal loss (a.k.a EfficientDet-d2).
+# See EfficientDet, Tan et al, https://arxiv.org/abs/1911.09070
+# See Lin et al, https://arxiv.org/abs/1708.02002
+# Trained on COCO, initialized from an EfficientNet-b2 checkpoint.
+#
+# Train on TPU-8
+
+model {
+  ssd {
+    inplace_batchnorm_update: true
+    freeze_batchnorm: false
+    num_classes: 90
+    add_background_class: false
+    box_coder {
+      faster_rcnn_box_coder {
+        y_scale: 10.0
+        x_scale: 10.0
+        height_scale: 5.0
+        width_scale: 5.0
+      }
+    }
+    matcher {
+      argmax_matcher {
+        matched_threshold: 0.5
+        unmatched_threshold: 0.5
+        ignore_thresholds: false
+        negatives_lower_than_unmatched: true
+        force_match_for_each_row: true
+        use_matmul_gather: true
+      }
+    }
+    similarity_calculator {
+      iou_similarity {
+      }
+    }
+    encode_background_as_zeros: true
+    anchor_generator {
+      multiscale_anchor_generator {
+        min_level: 3
+        max_level: 7
+        anchor_scale: 4.0
+        aspect_ratios: [1.0, 2.0, 0.5]
+        scales_per_octave: 3
+      }
+    }
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 768
+        max_dimension: 768
+        pad_to_max_dimension: true
+      }
+    }
+    box_predictor {
+      weight_shared_convolutional_box_predictor {
+        depth: 112
+        class_prediction_bias_init: -4.6
+        conv_hyperparams {
+          force_use_bias: true
+          activation: SWISH
+          regularizer {
+            l2_regularizer {
+              weight: 0.00004
+            }
+          }
+          initializer {
+            random_normal_initializer {
+              stddev: 0.01
+              mean: 0.0
+            }
+          }
+          batch_norm {
+            scale: true
+            decay: 0.99
+            epsilon: 0.001
+          }
+        }
+        num_layers_before_predictor: 3
+        kernel_size: 3
+        use_depthwise: true
+      }
+    }
+    feature_extractor {
+      type: 'ssd_efficientnet-b2_bifpn_keras'
+      bifpn {
+        min_level: 3
+        max_level: 7
+        num_iterations: 5
+        num_filters: 112
+      }
+      conv_hyperparams {
+        force_use_bias: true
+        activation: SWISH
+        regularizer {
+          l2_regularizer {
+            weight: 0.00004
+          }
+        }
+        initializer {
+          truncated_normal_initializer {
+            stddev: 0.03
+            mean: 0.0
+          }
+        }
+        batch_norm {
+          scale: true
+          decay: 0.99
+          epsilon: 0.001
+        }
+      }
+    }
+    loss {
+      classification_loss {
+        weighted_sigmoid_focal {
+          alpha: 0.25
+          gamma: 1.5
+        }
+      }
+      localization_loss {
+        weighted_smooth_l1 {
+        }
+      }
+      classification_weight: 1.0
+      localization_weight: 1.0
+    }
+    normalize_loss_by_num_matches: true
+    normalize_loc_loss_by_codesize: true
+    post_processing {
+      batch_non_max_suppression {
+        score_threshold: 1e-8
+        iou_threshold: 0.5
+        max_detections_per_class: 100
+        max_total_detections: 100
+      }
+      score_converter: SIGMOID
+    }
+  }
+}
+
+train_config: {
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/ckpt-0"
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint_type: "classification"
+  batch_size: 128
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  use_bfloat16: true
+  num_steps: 300000
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+  data_augmentation_options {
+    random_scale_crop_and_pad_to_square {
+      output_size: 768
+      scale_min: 0.1
+      scale_max: 2.0
+    }
+  }
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 8e-2
+          total_steps: 300000
+          warmup_learning_rate: .001
+          warmup_steps: 2500
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/ssd_efficientdet_d3_896x896_coco17_tpu-32.config
+++ b/research/object_detection/configs/tf2/ssd_efficientdet_d3_896x896_coco17_tpu-32.config
+# SSD with EfficientNet-b3 + BiFPN feature extractor,
+# shared box predictor and focal loss (a.k.a EfficientDet-d3).
+# See EfficientDet, Tan et al, https://arxiv.org/abs/1911.09070
+# See Lin et al, https://arxiv.org/abs/1708.02002
+# Trained on COCO, initialized from an EfficientNet-b3 checkpoint.
+#
+# Train on TPU-32
+
+model {
+  ssd {
+    inplace_batchnorm_update: true
+    freeze_batchnorm: false
+    num_classes: 90
+    add_background_class: false
+    box_coder {
+      faster_rcnn_box_coder {
+        y_scale: 10.0
+        x_scale: 10.0
+        height_scale: 5.0
+        width_scale: 5.0
+      }
+    }
+    matcher {
+      argmax_matcher {
+        matched_threshold: 0.5
+        unmatched_threshold: 0.5
+        ignore_thresholds: false
+        negatives_lower_than_unmatched: true
+        force_match_for_each_row: true
+        use_matmul_gather: true
+      }
+    }
+    similarity_calculator {
+      iou_similarity {
+      }
+    }
+    encode_background_as_zeros: true
+    anchor_generator {
+      multiscale_anchor_generator {
+        min_level: 3
+        max_level: 7
+        anchor_scale: 4.0
+        aspect_ratios: [1.0, 2.0, 0.5]
+        scales_per_octave: 3
+      }
+    }
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 896
+        max_dimension: 896
+        pad_to_max_dimension: true
+      }
+    }
+    box_predictor {
+      weight_shared_convolutional_box_predictor {
+        depth: 160
+        class_prediction_bias_init: -4.6
+        conv_hyperparams {
+          force_use_bias: true
+          activation: SWISH
+          regularizer {
+            l2_regularizer {
+              weight: 0.00004
+            }
+          }
+          initializer {
+            random_normal_initializer {
+              stddev: 0.01
+              mean: 0.0
+            }
+          }
+          batch_norm {
+            scale: true
+            decay: 0.99
+            epsilon: 0.001
+          }
+        }
+        num_layers_before_predictor: 4
+        kernel_size: 3
+        use_depthwise: true
+      }
+    }
+    feature_extractor {
+      type: 'ssd_efficientnet-b3_bifpn_keras'
+      bifpn {
+        min_level: 3
+        max_level: 7
+        num_iterations: 6
+        num_filters: 160
+      }
+      conv_hyperparams {
+        force_use_bias: true
+        activation: SWISH
+        regularizer {
+          l2_regularizer {
+            weight: 0.00004
+          }
+        }
+        initializer {
+          truncated_normal_initializer {
+            stddev: 0.03
+            mean: 0.0
+          }
+        }
+        batch_norm {
+          scale: true
+          decay: 0.99
+          epsilon: 0.001
+        }
+      }
+    }
+    loss {
+      classification_loss {
+        weighted_sigmoid_focal {
+          alpha: 0.25
+          gamma: 1.5
+        }
+      }
+      localization_loss {
+        weighted_smooth_l1 {
+        }
+      }
+      classification_weight: 1.0
+      localization_weight: 1.0
+    }
+    normalize_loss_by_num_matches: true
+    normalize_loc_loss_by_codesize: true
+    post_processing {
+      batch_non_max_suppression {
+        score_threshold: 1e-8
+        iou_threshold: 0.5
+        max_detections_per_class: 100
+        max_total_detections: 100
+      }
+      score_converter: SIGMOID
+    }
+  }
+}
+
+train_config: {
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/ckpt-0"
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint_type: "classification"
+  batch_size: 128
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  use_bfloat16: true
+  num_steps: 300000
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+  data_augmentation_options {
+    random_scale_crop_and_pad_to_square {
+      output_size: 896
+      scale_min: 0.1
+      scale_max: 2.0
+    }
+  }
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 8e-2
+          total_steps: 300000
+          warmup_learning_rate: .001
+          warmup_steps: 2500
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/ssd_efficientdet_d4_1024x1024_coco17_tpu-32.config
+++ b/research/object_detection/configs/tf2/ssd_efficientdet_d4_1024x1024_coco17_tpu-32.config
+# SSD with EfficientNet-b4 + BiFPN feature extractor,
+# shared box predictor and focal loss (a.k.a EfficientDet-d4).
+# See EfficientDet, Tan et al, https://arxiv.org/abs/1911.09070
+# See Lin et al, https://arxiv.org/abs/1708.02002
+# Trained on COCO, initialized from an EfficientNet-b4 checkpoint.
+#
+# Train on TPU-32
+
+model {
+  ssd {
+    inplace_batchnorm_update: true
+    freeze_batchnorm: false
+    num_classes: 90
+    add_background_class: false
+    box_coder {
+      faster_rcnn_box_coder {
+        y_scale: 10.0
+        x_scale: 10.0
+        height_scale: 5.0
+        width_scale: 5.0
+      }
+    }
+    matcher {
+      argmax_matcher {
+        matched_threshold: 0.5
+        unmatched_threshold: 0.5
+        ignore_thresholds: false
+        negatives_lower_than_unmatched: true
+        force_match_for_each_row: true
+        use_matmul_gather: true
+      }
+    }
+    similarity_calculator {
+      iou_similarity {
+      }
+    }
+    encode_background_as_zeros: true
+    anchor_generator {
+      multiscale_anchor_generator {
+        min_level: 3
+        max_level: 7
+        anchor_scale: 4.0
+        aspect_ratios: [1.0, 2.0, 0.5]
+        scales_per_octave: 3
+      }
+    }
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 1024
+        max_dimension: 1024
+        pad_to_max_dimension: true
+      }
+    }
+    box_predictor {
+      weight_shared_convolutional_box_predictor {
+        depth: 224
+        class_prediction_bias_init: -4.6
+        conv_hyperparams {
+          force_use_bias: true
+          activation: SWISH
+          regularizer {
+            l2_regularizer {
+              weight: 0.00004
+            }
+          }
+          initializer {
+            random_normal_initializer {
+              stddev: 0.01
+              mean: 0.0
+            }
+          }
+          batch_norm {
+            scale: true
+            decay: 0.99
+            epsilon: 0.001
+          }
+        }
+        num_layers_before_predictor: 4
+        kernel_size: 3
+        use_depthwise: true
+      }
+    }
+    feature_extractor {
+      type: 'ssd_efficientnet-b4_bifpn_keras'
+      bifpn {
+        min_level: 3
+        max_level: 7
+        num_iterations: 7
+        num_filters: 224
+      }
+      conv_hyperparams {
+        force_use_bias: true
+        activation: SWISH
+        regularizer {
+          l2_regularizer {
+            weight: 0.00004
+          }
+        }
+        initializer {
+          truncated_normal_initializer {
+            stddev: 0.03
+            mean: 0.0
+          }
+        }
+        batch_norm {
+          scale: true
+          decay: 0.99
+          epsilon: 0.001
+        }
+      }
+    }
+    loss {
+      classification_loss {
+        weighted_sigmoid_focal {
+          alpha: 0.25
+          gamma: 1.5
+        }
+      }
+      localization_loss {
+        weighted_smooth_l1 {
+        }
+      }
+      classification_weight: 1.0
+      localization_weight: 1.0
+    }
+    normalize_loss_by_num_matches: true
+    normalize_loc_loss_by_codesize: true
+    post_processing {
+      batch_non_max_suppression {
+        score_threshold: 1e-8
+        iou_threshold: 0.5
+        max_detections_per_class: 100
+        max_total_detections: 100
+      }
+      score_converter: SIGMOID
+    }
+  }
+}
+
+train_config: {
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/ckpt-0"
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint_type: "classification"
+  batch_size: 128
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  use_bfloat16: true
+  num_steps: 300000
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+  data_augmentation_options {
+    random_scale_crop_and_pad_to_square {
+      output_size: 1024
+      scale_min: 0.1
+      scale_max: 2.0
+    }
+  }
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 8e-2
+          total_steps: 300000
+          warmup_learning_rate: 0.001
+          warmup_steps: 2500
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}
--- a/research/object_detection/configs/tf2/ssd_efficientdet_d5_1280x1280_coco17_tpu-32.config
+++ b/research/object_detection/configs/tf2/ssd_efficientdet_d5_1280x1280_coco17_tpu-32.config
+# SSD with EfficientNet-b5 + BiFPN feature extractor,
+# shared box predictor and focal loss (a.k.a. EfficientDet-d5).
+# See EfficientDet, Tan et al., https://arxiv.org/abs/1911.09070
+# See focal loss, Lin et al., https://arxiv.org/abs/1708.02002
+# Trained on COCO, initialized from an EfficientNet-b5 checkpoint.
+#
+# Train on TPU-32
+
+model {
+  ssd {
+    inplace_batchnorm_update: true
+    freeze_batchnorm: false
+    num_classes: 90
+    add_background_class: false
+    box_coder {
+      faster_rcnn_box_coder {
+        y_scale: 10.0
+        x_scale: 10.0
+        height_scale: 5.0
+        width_scale: 5.0
+      }
+    }
+    matcher {
+      argmax_matcher {
+        matched_threshold: 0.5
+        unmatched_threshold: 0.5
+        ignore_thresholds: false
+        negatives_lower_than_unmatched: true
+        force_match_for_each_row: true
+        use_matmul_gather: true
+      }
+    }
+    similarity_calculator {
+      iou_similarity {
+      }
+    }
+    encode_background_as_zeros: true
+    anchor_generator {
+      multiscale_anchor_generator {
+        min_level: 3
+        max_level: 7
+        anchor_scale: 4.0
+        aspect_ratios: [1.0, 2.0, 0.5]
+        scales_per_octave: 3
+      }
+    }
+    image_resizer {
+      keep_aspect_ratio_resizer {
+        min_dimension: 1280
+        max_dimension: 1280
+        pad_to_max_dimension: true
+      }
+    }
+    box_predictor {
+      weight_shared_convolutional_box_predictor {
+        depth: 288
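+        # -4.6 is approximately -log((1 - 0.01) / 0.01), i.e. an initial
+        # foreground probability of about 0.01 (see Lin et al.).
+        class_prediction_bias_init: -4.6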
+        conv_hyperparams {
+          force_use_bias: true
+          activation: SWISH
+          regularizer {
+            l2_regularizer {
+              weight: 0.00004
+            }
+          }
+          initializer {
+            random_normal_initializer {
+              stddev: 0.01
+              mean: 0.0
+            }
+          }
+          batch_norm {
+            scale: true
+            decay: 0.99
+            epsilon: 0.001
+          }
+        }
+        num_layers_before_predictor: 4
+        kernel_size: 3
+        use_depthwise: true
+      }
+    }
+    feature_extractor {
+      type: 'ssd_efficientnet-b5_bifpn_keras'
+      bifpn {
+        min_level: 3
+        max_level: 7
+        num_iterations: 7
+        num_filters: 288
+      }
+      conv_hyperparams {
+        force_use_bias: true
+        activation: SWISH
+        regularizer {
+          l2_regularizer {
+            weight: 0.00004
+          }
+        }
+        initializer {
+          truncated_normal_initializer {
+            stddev: 0.03
+            mean: 0.0
+          }
+        }
+        batch_norm {
+          scale: true
+          decay: 0.99
+          epsilon: 0.001
+        }
+      }
+    }
+    loss {
+      classification_loss {
+        weighted_sigmoid_focal {
+          alpha: 0.25
+          gamma: 1.5
+        }
+      }
+      localization_loss {
+        weighted_smooth_l1 {
+        }
+      }
+      classification_weight: 1.0
+      localization_weight: 1.0
+    }
+    normalize_loss_by_num_matches: true
+    normalize_loc_loss_by_codesize: true
+    post_processing {
+      batch_non_max_suppression {
+        score_threshold: 1e-8
+        iou_threshold: 0.5
+        max_detections_per_class: 100
+        max_total_detections: 100
+      }
+      score_converter: SIGMOID
+    }
+  }
+}
+
+train_config: {
+  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/ckpt-0"
+  fine_tune_checkpoint_version: V2
+  fine_tune_checkpoint_type: "classification"
+  batch_size: 128
+  sync_replicas: true
+  startup_delay_steps: 0
+  replicas_to_aggregate: 8
+  use_bfloat16: true
+  num_steps: 300000
+  data_augmentation_options {
+    random_horizontal_flip {
+    }
+  }
+  data_augmentation_options {
+    random_scale_crop_and_pad_to_square {
+      output_size: 1280
+      scale_min: 0.1
+      scale_max: 2.0
+    }
+  }
+  optimizer {
+    momentum_optimizer: {
+      learning_rate: {
+        cosine_decay_learning_rate {
+          learning_rate_base: 8e-2
+          total_steps: 300000
+          warmup_learning_rate: 0.001
+          warmup_steps: 2500
+        }
+      }
+      momentum_optimizer_value: 0.9
+    }
+    use_moving_average: false
+  }
+  max_number_of_boxes: 100
+  unpad_groundtruth_tensors: false
+}
+
+train_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/train2017-?????-of-00256.tfrecord"
+  }
+}
+
+eval_config: {
+  metrics_set: "coco_detection_metrics"
+  use_moving_averages: false
+  batch_size: 1
+}
+
+eval_input_reader: {
+  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
+  shuffle: false
+  num_epochs: 1
+  tf_record_input_reader {
+    input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
+  }
+}