Commit 5a2cf36f authored by Kaushik Shivakumar

Merge remote-tracking branch 'upstream/master' into newavarecords

parents 258ddfc3 a829e648
# Quick Start: Jupyter notebook for off-the-shelf inference
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
If you'd like to hit the ground running and run detection on a few example
images right out of the box, we recommend trying out the Jupyter notebook demo.
To run the Jupyter notebook, run the following command from
......
# Running on mobile with TensorFlow Lite
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
In this section, we will show you how to use [TensorFlow
Lite](https://www.tensorflow.org/mobile/tflite/) to get a smaller model and
allow you to take advantage of ops that have been optimized for mobile devices.
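Before diving into conversion and on-device deployment, here is a rough, hedged sketch of what running an already-converted detection model looks like from the TFLite Python interpreter (handy as a desktop smoke test). The `detect.tflite` filename, the input size, and the output ordering are assumptions that depend on how the model was converted:

```python
import numpy as np
import tensorflow as tf  # tf.lite.Interpreter is available in TF 1.15 and TF 2.x

# Hypothetical path to a converted detection model.
interpreter = tf.lite.Interpreter(model_path="detect.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# SSD-style TFLite detection models commonly expect a [1, 300, 300, 3] image.
_, height, width, _ = input_details[0]["shape"]
image = np.zeros((1, height, width, 3), dtype=input_details[0]["dtype"])

interpreter.set_tensor(input_details[0]["index"], image)
interpreter.invoke()

# Models exported with the TFLite detection postprocessing op typically return
# boxes, classes, scores, and a detection count (order may vary by model).
for detail in output_details:
    print(detail["name"], interpreter.get_tensor(detail["index"]).shape)
```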
......
# Quick Start: Distributed Training on the Oxford-IIIT Pets Dataset on Google Cloud
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
This page is a walkthrough for training an object detector using the TensorFlow
Object Detection API. In this tutorial, we'll be training on the Oxford-IIIT Pets
dataset to build a system to detect various breeds of cats and dogs. The output
of the detector will look like the following:
...@@ -40,10 +42,10 @@ export YOUR_GCS_BUCKET=${YOUR_GCS_BUCKET}
It is also possible to run locally by following
[the running locally instructions](running_locally.md).
## Installing TensorFlow and the TensorFlow Object Detection API
Please run through the [installation instructions](installation.md) to install
TensorFlow and all its dependencies. Ensure the Protobuf libraries are
compiled and the library directories are added to `PYTHONPATH`.
## Getting the Oxford-IIIT Pets Dataset and Uploading it to Google Cloud Storage
...@@ -77,7 +79,7 @@ should appear as follows:
... other files and directories
```
The TensorFlow Object Detection API expects data to be in the TFRecord format,
so we'll now run the `create_pet_tf_record` script to convert from the raw
Oxford-IIIT Pet dataset into TFRecords. Run the following commands from the
`tensorflow/models/research/` directory:
...@@ -134,7 +136,7 @@ in the following step.
## Configuring the Object Detection Pipeline
In the TensorFlow Object Detection API, the model parameters, training
parameters and eval parameters are all defined by a config file. More details
can be found [here](configuring_jobs.md). For this tutorial, we will use some
predefined templates provided with the source code. In the
...@@ -188,10 +190,10 @@ browser](https://console.cloud.google.com/storage/browser).
Before we can start a job on Google Cloud ML Engine, we must:
1. Package the TensorFlow Object Detection code.
2. Write a cluster configuration for our Google Cloud ML job.
To package the TensorFlow Object Detection code, run the following commands from
the `tensorflow/models/research/` directory:
```bash
...@@ -248,7 +250,7 @@ web browser. You should see something similar to the following:
![](img/tensorboard.png)
Make sure your Tensorboard version matches the minor version of your TensorFlow (1.x) installation.
You will also want to click on the images tab to see example detections made by
the model while it trains. After about an hour and a half of training, you can
...@@ -265,9 +267,9 @@ the training jobs are configured to go for much longer than is necessary for
convergence. To save money, we recommend killing your jobs once you've seen
that they've converged.
## Exporting the TensorFlow Graph
After your model has been trained, you should export it to a TensorFlow graph
proto. First, you need to identify a candidate checkpoint to export. You can
search your bucket using the [Google Cloud Storage
Browser](https://console.cloud.google.com/storage/browser). The file should be
......
# Object Detection API with TensorFlow 1
## Requirements
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
[![Protobuf Compiler >= 3.0](https://img.shields.io/badge/ProtoBuf%20Compiler-%3E3.0-brightgreen)](https://grpc.io/docs/protoc-installation/#install-using-a-package-manager)
## Installation
You can install the TensorFlow Object Detection API either with Python Package
Installer (pip) or Docker. For local runs we recommend using Docker and for
Google Cloud runs we recommend using pip.
Clone the TensorFlow Models repository and proceed to one of the installation
options.
```bash
git clone https://github.com/tensorflow/models.git
```
### Docker Installation
```bash
# From the root of the git repository
docker build -f research/object_detection/dockerfiles/tf1/Dockerfile -t od .
docker run -it od
```
### Python Package Installation
```bash
cd models/research
# Compile protos.
protoc object_detection/protos/*.proto --python_out=.
# Install TensorFlow Object Detection API.
cp object_detection/packages/tf1/setup.py .
python -m pip install .
```
```bash
# Test the installation.
python object_detection/builders/model_builder_tf1_test.py
```
## Quick Start
### Colabs
* [Jupyter notebook for off-the-shelf inference](../colab_tutorials/object_detection_tutorial.ipynb)
* [Training a pet detector](running_pets.md)
### Training and Evaluation
To train and evaluate your models either locally or on Google Cloud see
[instructions](tf1_training_and_evaluation.md).
## Model Zoo
We provide a large collection of models that are trained on several datasets in
the [Model Zoo](tf1_detection_zoo.md).
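Models downloaded from the zoo ship with a frozen inference graph that can be run directly. The sketch below is a minimal, hedged example of loading one in Python; the archive name is an assumption, and the tensor names follow the convention used by exported detection graphs:

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF 1.15

# Hypothetical path inside an extracted model archive from the zoo.
PATH_TO_FROZEN_GRAPH = "ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb"

graph_def = tf.GraphDef()
with tf.io.gfile.GFile(PATH_TO_FROZEN_GRAPH, "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")

with tf.Session(graph=graph) as sess:
    # Replace with a real [1, height, width, 3] uint8 image batch.
    image = np.zeros((1, 300, 300, 3), dtype=np.uint8)
    boxes, scores, classes, num = sess.run(
        ["detection_boxes:0", "detection_scores:0",
         "detection_classes:0", "num_detections:0"],
        feed_dict={"image_tensor:0": image})
    print(boxes.shape, scores[0, :5])
```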
## Guides
* <a href='configuring_jobs.md'>
Configuring an object detection pipeline</a><br>
* <a href='preparing_inputs.md'>Preparing inputs</a><br>
* <a href='defining_your_own_model.md'>
Defining your own model architecture</a><br>
* <a href='using_your_own_dataset.md'>
Bringing in your own dataset</a><br>
* <a href='evaluation_protocols.md'>
Supported object detection evaluation protocols</a><br>
* <a href='tpu_compatibility.md'>
TPU compatible detection pipelines</a><br>
* <a href='tf1_training_and_evaluation.md'>
Training and evaluation guide (CPU, GPU, or TPU)</a><br>
## Extras:
* <a href='exporting_models.md'>
Exporting a trained model for inference</a><br>
* <a href='tpu_exporters.md'>
Exporting a trained model for TPU inference</a><br>
* <a href='oid_inference_and_evaluation.md'>
Inference and evaluation on the Open Images dataset</a><br>
* <a href='instance_segmentation.md'>
Run an instance segmentation model</a><br>
* <a href='challenge_evaluation.md'>
Run the evaluation for the Open Images Challenge 2018/2019</a><br>
* <a href='running_on_mobile_tensorflowlite.md'>
Running object detection on mobile devices with TensorFlow Lite</a><br>
* <a href='context_rcnn.md'>
Context R-CNN documentation for data preparation, training, and export</a><br>
# TensorFlow 1 Detection Model Zoo
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
We provide a collection of detection models pre-trained on the
[COCO dataset](http://cocodataset.org), the
...@@ -64,9 +67,9 @@ Some remarks on frozen inference graphs:
metrics.
* Our frozen inference graphs are generated using the
[v1.12.0](https://github.com/tensorflow/tensorflow/tree/v1.12.0) release
version of TensorFlow and we do not guarantee that these will work with
other versions; this being said, each frozen inference graph can be
regenerated using your current version of TensorFlow by re-running the
[exporter](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/exporting_models.md),
pointing it at the model directory as well as the corresponding config file
in
......
# Training and Evaluation with TensorFlow 1
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
This page walks through the steps required to train an object detection model.
It assumes the reader has completed the following prerequisites:
1. The TensorFlow Object Detection API has been installed as documented in the
[installation instructions](tf1.md#installation).
2. A valid data set has been created. See [this page](preparing_inputs.md) for
instructions on how to generate a dataset for the PASCAL VOC challenge or
the Oxford-IIIT Pet dataset.
## Recommended Directory Structure for Training and Evaluation
```bash
.
├── data/
│   ├── eval-00000-of-00001.tfrecord
│   ├── label_map.txt
│   ├── train-00000-of-00002.tfrecord
│   └── train-00001-of-00002.tfrecord
└── models/
    └── my_model_dir/
        ├── eval/                      # Created by evaluation job.
        ├── my_model.config
        └── train/                     #
            ├── model_ckpt-100-data@1  # Created by training job.
            ├── model_ckpt-100-index   #
            └── checkpoint             #
```
## Writing a model configuration
Please refer to sample [TF1 configs](../samples/configs) and
[configuring jobs](configuring_jobs.md) to create a model config.
### Model Parameter Initialization
While optional, it is highly recommended that users utilize classification or
object detection checkpoints. Training an object detector from scratch can take
days. To speed up the training process, it is recommended that users re-use the
feature extractor parameters from a pre-existing image classification or object
detection checkpoint. The `train_config` section in the config provides two
fields to specify pre-existing checkpoints:
* `fine_tune_checkpoint`: a path prefix to the pre-existing checkpoint
(ie:"/usr/home/username/checkpoint/model.ckpt-#####").
* `fine_tune_checkpoint_type`: with value `classification` or `detection`
depending on the type.
A list of detection checkpoints can be found [here](tf1_detection_zoo.md).
## Local
### Training
A local training job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
NUM_TRAIN_STEPS=50000
SAMPLE_1_OF_N_EVAL_EXAMPLES=1
python object_detection/model_main.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--num_train_steps=${NUM_TRAIN_STEPS} \
--sample_1_of_n_eval_examples=${SAMPLE_1_OF_N_EVAL_EXAMPLES} \
--alsologtostderr
```
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the directory in which training checkpoints and events will be
written. Note that this binary will interleave both training and evaluation.
## Google Cloud AI Platform
The TensorFlow Object Detection API supports training on Google Cloud AI
Platform. This section documents instructions on how to train and evaluate your
model using Cloud AI Platform. The reader should complete the following
prerequisites:
1. The reader has created and configured a project on Google Cloud AI Platform.
See the
[Using GPUs](https://cloud.google.com/ai-platform/training/docs/using-gpus)
and
[Using TPUs](https://cloud.google.com/ai-platform/training/docs/using-tpus)
guides.
2. The reader has a valid data set and stored it in a Google Cloud Storage
bucket. See [this page](preparing_inputs.md) for instructions on how to
generate a dataset for the PASCAL VOC challenge or the Oxford-IIIT Pet
dataset.
Additionally, it is recommended users test their job by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).
### Training with multiple workers with single GPU
Google Cloud ML requires a YAML configuration file for a multiworker training
job using GPUs. A sample YAML file is given below:
```
trainingInput:
  runtimeVersion: "1.15"
  scaleTier: CUSTOM
  masterType: standard_gpu
  workerCount: 9
...@@ -52,30 +113,32 @@ trainingInput:
  parameterServerCount: 3
  parameterServerType: standard
```
Please keep the following guidelines in mind when writing the YAML
configuration:
* A job with n workers will have n + 1 training machines (n workers + 1
master).
* The number of parameter servers used should be an odd number to prevent a
parameter server from storing only weight variables or only bias variables
(due to round robin parameter scheduling).
* The learning rate in the training config should be decreased when using a
larger number of workers. Some experimentation is required to find the
optimal learning rate.
The YAML file should be saved on the local machine (not on GCP). Once it has
been written, a user can start a training job on Cloud ML Engine using the
following command:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf1/setup.py .
gcloud ml-engine jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.15 \
--python-version 3.6 \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_main \
--region us-central1 \
--config ${PATH_TO_LOCAL_YAML_FILE} \
...@@ -90,41 +153,42 @@ training checkpoints and events will be written to and
`gs://${PIPELINE_CONFIG_PATH}` points to the pipeline configuration stored on
Google Cloud Storage.
Users can monitor the progress of their training job on the
[ML Engine Dashboard](https://console.cloud.google.com/ai-platform/jobs).
## Training with TPU
Launching a training job with a TPU compatible pipeline config requires using a
similar command:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf1/setup.py .
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_tpu_main \
--runtime-version 1.15 \
--python-version 3.6 \
--scale-tier BASIC_TPU \
--region us-central1 \
-- \
--tpu_zone us-central1 \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
In contrast with the GPU training command, there is no need to specify a YAML
file, and we point to the *object_detection.model_tpu_main* binary instead of
*object_detection.model_main*. We must also now set `scale-tier` to be
`BASIC_TPU` and provide a `tpu_zone`. Finally, as before, `pipeline_config_path`
points to the pipeline configuration stored on Google Cloud Storage
(but it must now be a TPU compatible model).
## Evaluation with GPU
Note: You only need to do this when using TPU for training, as TPU training
does not interleave evaluation the way multiworker GPU training does.
Evaluation jobs run on a single machine, so it is not necessary to write a YAML
...@@ -132,10 +196,13 @@ configuration for evaluation. Run the following command to start the evaluation
job:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf1/setup.py .
gcloud ml-engine jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.15 \
--python-version 3.6 \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_main \
--region us-central1 \
--scale-tier BASIC_GPU \
...@@ -146,25 +213,25 @@ gcloud ml-engine jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%
```
Where `gs://${MODEL_DIR}` points to the directory on Google Cloud Storage where
training checkpoints are saved (same as the training job), as well as to where
evaluation events will be saved on Google Cloud Storage and
`gs://${PIPELINE_CONFIG_PATH}` points to where the pipeline configuration is
stored on Google Cloud Storage.
Typically one starts an evaluation job concurrently with the training job. Note
that we do not support running evaluation on TPU, so the above command line for
launching evaluation jobs is the same whether you are training on GPU or TPU.
## Running Tensorboard
Progress for training and eval jobs can be inspected using Tensorboard. If using
the recommended directory structure, Tensorboard can be run using the following
command:
```bash
tensorboard --logdir=${MODEL_DIR}
```
where `${MODEL_DIR}` points to the directory that contains the train and eval
directories. Please note it may take Tensorboard a couple minutes to populate
with data.
# Object Detection API with TensorFlow 2
## Requirements
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Protobuf Compiler >= 3.0](https://img.shields.io/badge/ProtoBuf%20Compiler-%3E3.0-brightgreen)](https://grpc.io/docs/protoc-installation/#install-using-a-package-manager)
## Installation
You can install the TensorFlow Object Detection API either with Python Package
Installer (pip) or Docker. For local runs we recommend using Docker and for
Google Cloud runs we recommend using pip.
Clone the TensorFlow Models repository and proceed to one of the installation
options.
```bash
git clone https://github.com/tensorflow/models.git
```
### Docker Installation
```bash
# From the root of the git repository
docker build -f research/object_detection/dockerfiles/tf2/Dockerfile -t od .
docker run -it od
```
### Python Package Installation
```bash
cd models/research
# Compile protos.
protoc object_detection/protos/*.proto --python_out=.
# Install TensorFlow Object Detection API.
cp object_detection/packages/tf2/setup.py .
python -m pip install .
```
```bash
# Test the installation.
python object_detection/builders/model_builder_tf2_test.py
```
## Quick Start
### Colabs
<!-- mdlint off(URL_BAD_G3DOC_PATH) -->
* Training -
[Fine-tune a pre-trained detector in eager mode on custom data](../colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb)
* Inference -
[Run inference with models from the zoo](../colab_tutorials/inference_tf2_colab.ipynb)
<!-- mdlint on -->
## Training and Evaluation
To train and evaluate your models either locally or on Google Cloud see
[instructions](tf2_training_and_evaluation.md).
## Model Zoo
We provide a large collection of models that are trained on COCO 2017 in the
[Model Zoo](tf2_detection_zoo.md).
## Guides
* <a href='configuring_jobs.md'>
Configuring an object detection pipeline</a><br>
* <a href='preparing_inputs.md'>Preparing inputs</a><br>
* <a href='defining_your_own_model.md'>
Defining your own model architecture</a><br>
* <a href='using_your_own_dataset.md'>
Bringing in your own dataset</a><br>
* <a href='evaluation_protocols.md'>
Supported object detection evaluation protocols</a><br>
* <a href='tpu_compatibility.md'>
TPU compatible detection pipelines</a><br>
* <a href='tf2_training_and_evaluation.md'>
Training and evaluation guide (CPU, GPU, or TPU)</a><br>
# TensorFlow 2 Classification Model Zoo
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
We provide a collection of classification models pre-trained on the
[Imagenet](http://www.image-net.org). These can be used to initialize detection
model parameters.
Model name |
---------- |
[EfficientNet B0](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b0.tar.gz) |
[EfficientNet B1](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b1.tar.gz) |
[EfficientNet B2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b2.tar.gz) |
[EfficientNet B3](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b3.tar.gz) |
[EfficientNet B4](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b4.tar.gz) |
[EfficientNet B5](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b5.tar.gz) |
[EfficientNet B6](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b6.tar.gz) |
[EfficientNet B7](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b7.tar.gz) |
[Resnet V1 50](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet50_v1.tar.gz) |
[Resnet V1 101](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet101_v1.tar.gz) |
[Resnet V1 152](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet152_v1.tar.gz) |
[Inception Resnet V2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/inception_resnet_v2.tar.gz) |
[MobileNet V1](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/mobilnet_v1.tar.gz) |
[MobileNet V2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/mobilnet_v2.tar.gz) |
# TensorFlow 2 Detection Model Zoo
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
<!-- mdlint off(URL_BAD_G3DOC_PATH) -->
We provide a collection of detection models pre-trained on the
[COCO 2017 dataset](http://cocodataset.org). These models can be useful for
out-of-the-box inference if you are interested in categories already in those
datasets. You can try it in our inference
[colab](../colab_tutorials/inference_tf2_colab.ipynb).
They are also useful for initializing your models when training on novel
datasets. You can try this out on our few-shot training
[colab](../colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb).
<!-- mdlint on -->
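Each archive in the table below typically unpacks to a directory containing a `pipeline.config`, a `checkpoint/` directory, and a `saved_model/` directory. As a minimal, hedged sketch of out-of-the-box inference outside a notebook (the model directory name below is just one example from the table), the saved model can be loaded and called directly:

```python
import numpy as np
import tensorflow as tf

# Example path from an extracted archive; substitute the model you downloaded.
MODEL_DIR = "ssd_mobilenet_v2_320x320_coco17_tpu-8/saved_model"

detect_fn = tf.saved_model.load(MODEL_DIR)

# Replace with a real [1, height, width, 3] uint8 image batch.
image = np.zeros((1, 320, 320, 3), dtype=np.uint8)
detections = detect_fn(tf.constant(image))

# Exported TF2 detection models typically return these keys, among others.
print(detections["detection_boxes"].shape)    # [1, max_detections, 4]
print(detections["detection_scores"][0, :5])  # highest scores first
print(int(detections["num_detections"][0]))
```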
Finally, if you would like to train these models from scratch, you can find the
model configs in this [directory](../configs/tf2) (also in the linked
`tar.gz`s).
Model name | Speed (ms) | COCO mAP | Outputs
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------: | :----------: | :-----:
[CenterNet HourGlass104 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_512x512_coco17_tpu-8.tar.gz) | 70 | 41.6 | Boxes
[CenterNet HourGlass104 Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_512x512_kpts_coco17_tpu-32.tar.gz) | 76 | 40.0/61.4 | Boxes/Keypoints
[CenterNet HourGlass104 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_1024x1024_coco17_tpu-32.tar.gz) | 197 | 43.5 | Boxes
[CenterNet HourGlass104 Keypoints 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_1024x1024_kpts_coco17_tpu-32.tar.gz) | 211 | 42.8/64.5 | Boxes/Keypoints
[CenterNet Resnet50 V1 FPN 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v1_fpn_512x512_coco17_tpu-8.tar.gz) | 27 | 31.2 | Boxes
[CenterNet Resnet50 V1 FPN Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v1_fpn_512x512_kpts_coco17_tpu-8.tar.gz) | 30 | 29.3/50.7 | Boxes/Keypoints
[CenterNet Resnet101 V1 FPN 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet101_v1_fpn_512x512_coco17_tpu-8.tar.gz) | 34 | 34.2 | Boxes
[CenterNet Resnet50 V2 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v2_512x512_coco17_tpu-8.tar.gz) | 27 | 29.5 | Boxes
[CenterNet Resnet50 V2 Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v2_512x512_kpts_coco17_tpu-8.tar.gz) | 30 | 27.6/48.2 | Boxes/Keypoints
[EfficientDet D0 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d0_coco17_tpu-32.tar.gz) | 39 | 33.6 | Boxes
[EfficientDet D1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d1_coco17_tpu-32.tar.gz) | 54 | 38.4 | Boxes
[EfficientDet D2 768x768](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d2_coco17_tpu-32.tar.gz) | 67 | 41.8 | Boxes
[EfficientDet D3 896x896](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d3_coco17_tpu-32.tar.gz) | 95 | 45.4 | Boxes
[EfficientDet D4 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d4_coco17_tpu-32.tar.gz) | 133 | 48.5 | Boxes
[EfficientDet D5 1280x1280](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d5_coco17_tpu-32.tar.gz) | 222 | 49.7 | Boxes
[EfficientDet D6 1280x1280](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d6_coco17_tpu-32.tar.gz) | 268 | 50.5 | Boxes
[EfficientDet D7 1536x1536](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d7_coco17_tpu-32.tar.gz) | 325 | 51.2 | Boxes
[SSD MobileNet v2 320x320](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz) |19 | 20.2 | Boxes
[SSD MobileNet V1 FPN 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 48 | 29.1 | Boxes
[SSD MobileNet V2 FPNLite 320x320](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz) | 22 | 22.2 | Boxes
[SSD MobileNet V2 FPNLite 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8.tar.gz) | 39 | 28.2 | Boxes
[SSD ResNet50 V1 FPN 640x640 (RetinaNet50)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 46 | 34.3 | Boxes
[SSD ResNet50 V1 FPN 1024x1024 (RetinaNet50)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 87 | 38.3 | Boxes
[SSD ResNet101 V1 FPN 640x640 (RetinaNet101)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet101_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 57 | 35.6 | Boxes
[SSD ResNet101 V1 FPN 1024x1024 (RetinaNet101)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet101_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 104 | 39.5 | Boxes
[SSD ResNet152 V1 FPN 640x640 (RetinaNet152)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet152_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 80 | 35.4 | Boxes
[SSD ResNet152 V1 FPN 1024x1024 (RetinaNet152)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet152_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 111 | 39.6 | Boxes
[Faster R-CNN ResNet50 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.tar.gz) | 53 | 29.3 | Boxes
[Faster R-CNN ResNet50 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8.tar.gz) | 65 | 31.0 | Boxes
[Faster R-CNN ResNet50 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_800x1333_coco17_gpu-8.tar.gz) | 65 | 31.6 | Boxes
[Faster R-CNN ResNet101 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_640x640_coco17_tpu-8.tar.gz) | 55 | 31.8 | Boxes
[Faster R-CNN ResNet101 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_1024x1024_coco17_tpu-8.tar.gz) | 72 | 37.1 | Boxes
[Faster R-CNN ResNet101 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_800x1333_coco17_gpu-8.tar.gz) | 77 | 36.6 | Boxes
[Faster R-CNN ResNet152 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_640x640_coco17_tpu-8.tar.gz) | 64 | 32.4 | Boxes
[Faster R-CNN ResNet152 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_1024x1024_coco17_tpu-8.tar.gz) | 85 | 37.6 | Boxes
[Faster R-CNN ResNet152 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_800x1333_coco17_gpu-8.tar.gz) | 101 | 37.4 | Boxes
[Faster R-CNN Inception ResNet V2 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8.tar.gz) | 206 | 37.7 | Boxes
[Faster R-CNN Inception ResNet V2 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_1024x1024_coco17_tpu-8.tar.gz) | 236 | 38.7 | Boxes
[Mask R-CNN Inception ResNet V2 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz) | 301 | 39.0/34.6 | Boxes/Masks
[ExtremeNet](http://download.tensorflow.org/models/object_detection/tf2/20200711/extremenet.tar.gz) | -- | -- | Boxes
# Training and Evaluation with TensorFlow 2
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
This page walks through the steps required to train an object detection model.
It assumes the reader has completed the following prerequisites:
1. The TensorFlow Object Detection API has been installed as documented in the
[installation instructions](tf2.md#installation).
2. A valid data set has been created. See [this page](preparing_inputs.md) for
instructions on how to generate a dataset for the PASCAL VOC challenge or
the Oxford-IIIT Pet dataset.
## Recommended Directory Structure for Training and Evaluation
```bash
.
├── data/
│   ├── eval-00000-of-00001.tfrecord
│   ├── label_map.txt
│   ├── train-00000-of-00002.tfrecord
│   └── train-00001-of-00002.tfrecord
└── models/
    └── my_model_dir/
        ├── eval/                  # Created by evaluation job.
        ├── my_model.config
        ├── model_ckpt-100-data@1  #
        ├── model_ckpt-100-index   # Created by training job.
        └── checkpoint             #
```
## Writing a model configuration
Please refer to sample [TF2 configs](../configs/tf2) and
[configuring jobs](configuring_jobs.md) to create a model config.
### Model Parameter Initialization
While optional, it is highly recommended that users utilize classification or
object detection checkpoints. Training an object detector from scratch can take
days. To speed up the training process, it is recommended that users re-use the
feature extractor parameters from a pre-existing image classification or object
detection checkpoint. The `train_config` section in the config provides two
fields to specify pre-existing checkpoints:
* `fine_tune_checkpoint`: a path prefix to the pre-existing checkpoint
(ie:"/usr/home/username/checkpoint/model.ckpt-#####").
* `fine_tune_checkpoint_type`: with value `classification` or `detection`
depending on the type.
A list of classification checkpoints can be found
[here](tf2_classification_zoo.md).
A list of detection checkpoints can be found [here](tf2_detection_zoo.md).
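As an optional convenience, these fields can also be set programmatically instead of by hand-editing the config. The sketch below uses the `object_detection.utils.config_util` helpers; the checkpoint and directory paths are placeholders:

```python
from object_detection.utils import config_util

# Placeholder paths for illustration.
PIPELINE_CONFIG = "path/to/pipeline.config"
FINE_TUNE_CKPT = "path/to/extracted_zoo_model/checkpoint/ckpt-0"

configs = config_util.get_configs_from_pipeline_file(PIPELINE_CONFIG)
configs["train_config"].fine_tune_checkpoint = FINE_TUNE_CKPT
configs["train_config"].fine_tune_checkpoint_type = "detection"

# Write the updated pipeline.config into the model directory.
pipeline_proto = config_util.create_pipeline_proto_from_configs(configs)
config_util.save_pipeline_config(pipeline_proto, "path/to/model_dir")
```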
## Local
### Training
A local training job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
python object_detection/model_main_tf2.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--alsologtostderr
```
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the directory in which training checkpoints and events will be
written.
### Evaluation
A local evaluation job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
CHECKPOINT_DIR=${MODEL_DIR}
python object_detection/model_main_tf2.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--checkpoint_dir=${CHECKPOINT_DIR} \
--alsologtostderr
```
where `${CHECKPOINT_DIR}` points to the directory with checkpoints produced by
the training job. Evaluation events are written to `${MODEL_DIR}/eval`.
## Google Cloud VM
The TensorFlow Object Detection API supports training on Google Cloud with Deep
Learning GPU VMs and TPU VMs. This section documents instructions on how to
train and evaluate your model on them. The reader should complete the following
prerequisites:
1. The reader has created and configured a GPU VM or TPU VM on Google Cloud with
TensorFlow >= 2.2.0. See
[TPU quickstart](https://cloud.google.com/tpu/docs/quickstart) and
[GPU quickstart](https://cloud.google.com/ai-platform/deep-learning-vm/docs/tensorflow_start_instance#with-one-or-more-gpus).
2. The reader has installed the TensorFlow Object Detection API as documented
in the [installation instructions](tf2.md#installation) on the VM.
3. The reader has a valid data set and stored it in a Google Cloud Storage
bucket or locally on the VM. See [this page](preparing_inputs.md) for
instructions on how to generate a dataset for the PASCAL VOC challenge or
the Oxford-IIIT Pet dataset.
Additionally, it is recommended users test their job by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).
### Training
Training on GPU or TPU VMs is similar to local training. It can be launched
using the following command.
```bash
# From the tensorflow/models/research/ directory
USE_TPU=true
TPU_NAME="MY_TPU_NAME"
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
# Note: --use_tpu and --tpu_name are only required for TPU training.
python object_detection/model_main_tf2.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--use_tpu=${USE_TPU} \
--tpu_name=${TPU_NAME} \
--alsologtostderr
```
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the root directory for the files produced. Training checkpoints and
events are written to `${MODEL_DIR}`. Note that the paths can be either local
or paths to a GCS bucket.
### Evaluation
Evaluation is only supported on GPU. Similar to local evaluation it can be
launched using the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
CHECKPOINT_DIR=${MODEL_DIR}
python object_detection/model_main_tf2.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--checkpoint_dir=${CHECKPOINT_DIR} \
--alsologtostderr
```
where `${CHECKPOINT_DIR}` points to the directory with checkpoints produced by
the training job. Evaluation events are written to `${MODEL_DIR}/eval`. Note
that the paths can be either local or paths to a GCS bucket.
## Google Cloud AI Platform
The TensorFlow Object Detection API also supports training on Google
Cloud AI Platform. This section documents instructions on how to train and
evaluate your model using Cloud ML. The reader should complete the following
prerequisites:
1. The reader has created and configured a project on Google Cloud AI Platform.
See the
[Using GPUs](https://cloud.google.com/ai-platform/training/docs/using-gpus)
and
[Using TPUs](https://cloud.google.com/ai-platform/training/docs/using-tpus)
guides.
2. The reader has a valid data set and stored it in a Google Cloud Storage
bucket. See [this page](preparing_inputs.md) for instructions on how to
generate a dataset for the PASCAL VOC challenge or the Oxford-IIIT Pet
dataset.
Additionally, it is recommended users test their job by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).
### Training with multiple GPUs
A user can start a training job on Cloud AI Platform using the following
command:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 2.1 \
--python-version 3.6 \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_main_tf2 \
--region us-central1 \
--master-machine-type n1-highcpu-16 \
--master-accelerator count=8,type=nvidia-tesla-v100 \
-- \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
Where `gs://${MODEL_DIR}` specifies the directory on Google Cloud Storage where
the training checkpoints and events will be written to and
`gs://${PIPELINE_CONFIG_PATH}` points to the pipeline configuration stored on
Google Cloud Storage.
Users can monitor the progress of their training job on the
[ML Engine Dashboard](https://console.cloud.google.com/ai-platform/jobs).
### Training with TPU
Launching a training job with a TPU compatible pipeline config requires using a
similar command:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_%H_%M_%S` \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_main_tf2 \
--runtime-version 2.1 \
--python-version 3.6 \
--scale-tier BASIC_TPU \
--region us-central1 \
-- \
--use_tpu true \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
As before, `pipeline_config_path` points to the pipeline configuration stored on
Google Cloud Storage (but it must now be a TPU compatible model).
### Evaluating with GPU
Evaluation jobs run on a single machine. Run the following command to start the
evaluation job:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 2.1 \
--python-version 3.6 \
--job-dir=gs://${MODEL_DIR} \
--package-path ./object_detection \
--module-name object_detection.model_main_tf2 \
--region us-central1 \
--scale-tier BASIC_GPU \
-- \
--model_dir=gs://${MODEL_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG_PATH} \
--checkpoint_dir=gs://${MODEL_DIR}
```
where `gs://${MODEL_DIR}` points to the directory on Google Cloud Storage where
training checkpoints are saved and `gs://${PIPELINE_CONFIG_PATH}` points to where
the model configuration file is stored on Google Cloud Storage. Evaluation events
are written to `gs://${MODEL_DIR}/eval`.
Typically one starts an evaluation job concurrently with the training job. Note
that we do not support running evaluation on TPU.
## Running Tensorboard
Progress for training and eval jobs can be inspected using Tensorboard. If using
the recommended directory structure, Tensorboard can be run using the following
command:
```bash
tensorboard --logdir=${MODEL_DIR}
```
where `${MODEL_DIR}` points to the directory that contains the train and eval
directories. Please note it may take Tensorboard a couple minutes to populate
with data.
...@@ -2,7 +2,7 @@
[TOC]
The TensorFlow Object Detection API supports TPU training for some models. To
make models TPU compatible you need to make a few tweaks to the model config as
mentioned below. We also provide several sample configs that you can use as a
template.
...@@ -11,7 +11,7 @@ template.
### Static shaped tensors
TPU training currently requires all tensors in the TensorFlow Graph to have
static shapes. However, most of the sample configs in Object Detection API have
a few different tensors that are dynamically shaped. Fortunately, we provide
simple alternatives in the model configuration that modify these tensors to
...@@ -62,7 +62,7 @@ have static shape:
### TPU friendly ops
Although TPU supports a vast number of TensorFlow ops, a few used in the
TensorFlow Object Detection API are unsupported. We list such ops below and
recommend compatible substitutes.
* **Anchor sampling** - Typically we use hard example mining in standard SSD
......
# Object Detection TPU Inference Exporter
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
This package contains a SavedModel exporter for TPU inference of object detection
models.
......
...@@ -2,7 +2,7 @@
[TOC]
To use your own dataset in the TensorFlow Object Detection API, you must convert it
into the [TFRecord file format](https://www.tensorflow.org/api_guides/python/python_io#tfrecords_format_details).
This document outlines how to write a script to generate the TFRecord file.
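As a rough, hedged sketch of the kind of record such a script emits (the feature keys below follow the convention used by the API's dataset tools; the image file, class name, and box coordinates are placeholders):

```python
import tensorflow as tf

def _bytes_feature(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))

def _float_feature(values):
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))

def _int64_feature(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))

# Placeholder image and annotation for illustration.
with tf.io.gfile.GFile("image.jpg", "rb") as f:
    encoded_jpg = f.read()

example = tf.train.Example(features=tf.train.Features(feature={
    "image/height": _int64_feature([480]),
    "image/width": _int64_feature([640]),
    "image/filename": _bytes_feature([b"image.jpg"]),
    "image/source_id": _bytes_feature([b"image.jpg"]),
    "image/encoded": _bytes_feature([encoded_jpg]),
    "image/format": _bytes_feature([b"jpeg"]),
    # One entry per object; box coordinates are normalized to [0, 1].
    "image/object/bbox/xmin": _float_feature([0.1]),
    "image/object/bbox/xmax": _float_feature([0.9]),
    "image/object/bbox/ymin": _float_feature([0.2]),
    "image/object/bbox/ymax": _float_feature([0.8]),
    "image/object/class/text": _bytes_feature([b"dog"]),
    "image/object/class/label": _int64_feature([1]),
}))

with tf.io.TFRecordWriter("train.record") as writer:
    writer.write(example.SerializeToString())
```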
......
...@@ -1094,8 +1094,12 @@ def get_reduce_to_frame_fn(input_reader_config, is_training):
num_frames = tf.cast(
    tf.shape(tensor_dict[fields.InputDataFields.source_id])[0],
    dtype=tf.int32)
if input_reader_config.frame_index == -1:
  frame_index = tf.random.uniform((), minval=0, maxval=num_frames,
                                  dtype=tf.int32)
else:
  frame_index = tf.constant(input_reader_config.frame_index,
                            dtype=tf.int32)
out_tensor_dict = {}
for key in tensor_dict:
  if key in fields.SEQUENCE_FIELDS:
......
...@@ -61,7 +61,7 @@ def _get_configs_for_model(model_name):
configs, kwargs_dict=override_dict)
def _get_configs_for_model_sequence_example(model_name, frame_index=-1):
"""Returns configurations for model.""" """Returns configurations for model."""
fname = os.path.join(tf.resource_loader.get_data_files_path(), fname = os.path.join(tf.resource_loader.get_data_files_path(),
'test_data/' + model_name + '.config') 'test_data/' + model_name + '.config')
...@@ -74,7 +74,8 @@ def _get_configs_for_model_sequence_example(model_name):
override_dict = {
'train_input_path': data_path,
'eval_input_path': data_path,
'label_map_path': label_map_path,
'frame_index': frame_index
}
return config_util.merge_external_params_with_configs(
configs, kwargs_dict=override_dict)
...@@ -312,6 +313,46 @@ class InputFnTest(test_case.TestCase, parameterized.TestCase):
tf.float32,
labels[fields.InputDataFields.groundtruth_weights].dtype)
def test_context_rcnn_resnet50_train_input_with_sequence_example_frame_index(
self, train_batch_size=8):
"""Tests the training input function for FasterRcnnResnet50."""
configs = _get_configs_for_model_sequence_example(
'context_rcnn_camera_trap', frame_index=2)
model_config = configs['model']
train_config = configs['train_config']
train_config.batch_size = train_batch_size
train_input_fn = inputs.create_train_input_fn(
train_config, configs['train_input_config'], model_config)
features, labels = _make_initializable_iterator(train_input_fn()).get_next()
self.assertAllEqual([train_batch_size, 640, 640, 3],
features[fields.InputDataFields.image].shape.as_list())
self.assertEqual(tf.float32, features[fields.InputDataFields.image].dtype)
self.assertAllEqual([train_batch_size],
features[inputs.HASH_KEY].shape.as_list())
self.assertEqual(tf.int32, features[inputs.HASH_KEY].dtype)
self.assertAllEqual(
[train_batch_size, 100, 4],
labels[fields.InputDataFields.groundtruth_boxes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_boxes].dtype)
self.assertAllEqual(
[train_batch_size, 100, model_config.faster_rcnn.num_classes],
labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[train_batch_size, 100],
labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_weights].dtype)
self.assertAllEqual(
[train_batch_size, 100, model_config.faster_rcnn.num_classes],
labels[fields.InputDataFields.groundtruth_confidences].shape.as_list())
self.assertEqual(
tf.float32,
labels[fields.InputDataFields.groundtruth_confidences].dtype)
def test_ssd_inceptionV2_train_input(self): def test_ssd_inceptionV2_train_input(self):
"""Tests the training input function for SSDInceptionV2.""" """Tests the training input function for SSDInceptionV2."""
configs = _get_configs_for_model('ssd_inception_v2_pets') configs = _get_configs_for_model('ssd_inception_v2_pets')
......
...@@ -924,13 +924,16 @@ def convert_strided_predictions_to_normalized_keypoints( ...@@ -924,13 +924,16 @@ def convert_strided_predictions_to_normalized_keypoints(
def convert_strided_predictions_to_instance_masks( def convert_strided_predictions_to_instance_masks(
boxes, classes, masks, stride, mask_height, mask_width, boxes, classes, masks, true_image_shapes,
true_image_shapes, score_threshold=0.5): densepose_part_heatmap=None, densepose_surface_coords=None, stride=4,
mask_height=256, mask_width=256, score_threshold=0.5,
densepose_class_index=-1):
"""Converts predicted full-image masks into instance masks. """Converts predicted full-image masks into instance masks.
For each predicted detection box: For each predicted detection box:
* Crop and resize the predicted mask based on the detected bounding box * Crop and resize the predicted mask (and optionally DensePose coordinates)
coordinates and class prediction. Uses bilinear resampling. based on the detected bounding box coordinates and class prediction. Uses
bilinear resampling.
* Binarize the mask using the provided score threshold. * Binarize the mask using the provided score threshold.
Args: Args:
...@@ -940,57 +943,212 @@ def convert_strided_predictions_to_instance_masks( ...@@ -940,57 +943,212 @@ def convert_strided_predictions_to_instance_masks(
detected class for each box (0-indexed). detected class for each box (0-indexed).
masks: A [batch, output_height, output_width, num_classes] float32 masks: A [batch, output_height, output_width, num_classes] float32
tensor with class probabilities. tensor with class probabilities.
true_image_shapes: A tensor of shape [batch, 3] representing the true
shape of the inputs not considering padding.
densepose_part_heatmap: (Optional) A [batch, output_height, output_width,
num_parts] float32 tensor with part scores (i.e. logits).
densepose_surface_coords: (Optional) A [batch, output_height, output_width,
2 * num_parts] float32 tensor with predicted part coordinates (in
vu-format).
stride: The stride in the output space. stride: The stride in the output space.
mask_height: The desired resized height for instance masks. mask_height: The desired resized height for instance masks.
mask_width: The desired resized width for instance masks. mask_width: The desired resized width for instance masks.
true_image_shapes: A tensor of shape [batch, 3] representing the true
shape of the inputs not considering padding.
score_threshold: The threshold at which to convert predicted mask score_threshold: The threshold at which to convert predicted mask
into foreground pixels. into foreground pixels.
densepose_class_index: The class index (0-indexed) corresponding to the
class which has DensePose labels (e.g. person class).
Returns: Returns:
A [batch_size, max_detections, mask_height, mask_width] uint8 tensor with A tuple of masks and surface_coords.
predicted foreground mask for each instance. The masks take values in instance_masks: A [batch_size, max_detections, mask_height, mask_width]
{0, 1}. uint8 tensor with predicted foreground mask for each
instance. If DensePose tensors are provided, then each pixel value in the
mask encodes the 1-indexed part.
surface_coords: A [batch_size, max_detections, mask_height, mask_width, 2]
float32 tensor with (v, u) coordinates. Note that v, u coordinates are
only defined on instance masks, and the coordinates at each location of
the foreground mask correspond to coordinates on a local part coordinate
system (the specific part can be inferred from the `instance_masks`
output). If DensePose feature maps are not passed to this function, this
output will be None.
Raises:
ValueError: If one but not both of `densepose_part_heatmap` and
`densepose_surface_coords` is provided.
""" """
_, output_height, output_width, _ = ( batch_size, output_height, output_width, _ = (
shape_utils.combined_static_and_dynamic_shape(masks)) shape_utils.combined_static_and_dynamic_shape(masks))
input_height = stride * output_height input_height = stride * output_height
input_width = stride * output_width input_width = stride * output_width
true_heights, true_widths, _ = tf.unstack(true_image_shapes, axis=1)
# If necessary, create dummy DensePose tensors to simplify the map function.
densepose_present = True
if ((densepose_part_heatmap is not None) ^
(densepose_surface_coords is not None)):
raise ValueError('To use DensePose, both `densepose_part_heatmap` and '
'`densepose_surface_coords` must be provided')
if densepose_part_heatmap is None and densepose_surface_coords is None:
densepose_present = False
densepose_part_heatmap = tf.zeros(
(batch_size, output_height, output_width, 1), dtype=tf.float32)
densepose_surface_coords = tf.zeros(
(batch_size, output_height, output_width, 2), dtype=tf.float32)
crop_and_threshold_fn = functools.partial(
crop_and_threshold_masks, input_height=input_height,
input_width=input_width, mask_height=mask_height, mask_width=mask_width,
score_threshold=score_threshold,
densepose_class_index=densepose_class_index)
instance_masks, surface_coords = shape_utils.static_or_dynamic_map_fn(
crop_and_threshold_fn,
elems=[boxes, classes, masks, densepose_part_heatmap,
densepose_surface_coords, true_heights, true_widths],
dtype=[tf.uint8, tf.float32],
back_prop=False)
surface_coords = surface_coords if densepose_present else None
return instance_masks, surface_coords
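Because `convert_strided_predictions_to_instance_masks` now returns an `(instance_masks, surface_coords)` tuple and accepts the DensePose tensors as optional keyword arguments, call sites need a small update. A minimal sketch of the mask-only path, modeled on the unit test further down in this diff (the module path is assumed from that test's imports, and the shapes are illustrative):

```python
import numpy as np
import tensorflow as tf
from object_detection.meta_architectures import center_net_meta_arch as cnma

boxes = tf.constant([[[0.0, 0.0, 0.5, 0.5]]], tf.float32)    # [batch, max_det, 4]
classes = tf.constant([[0]], tf.int32)                        # [batch, max_det]
masks = tf.constant(np.random.rand(1, 4, 4, 2), tf.float32)   # [batch, h, w, classes]
true_image_shapes = tf.constant([[8, 8, 3]])

# Without DensePose inputs, the second return value is None.
instance_masks, surface_coords = (
    cnma.convert_strided_predictions_to_instance_masks(
        boxes, classes, masks, true_image_shapes,
        stride=2, mask_height=2, mask_width=2))
assert surface_coords is None
```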
def crop_and_threshold_masks(elems, input_height, input_width, mask_height=256,
mask_width=256, score_threshold=0.5,
densepose_class_index=-1):
"""Crops and thresholds masks based on detection boxes.
Args:
elems: A tuple of
boxes - float32 tensor of shape [max_detections, 4]
classes - int32 tensor of shape [max_detections] (0-indexed)
masks - float32 tensor of shape [output_height, output_width, num_classes]
part_heatmap - float32 tensor of shape [output_height, output_width,
num_parts]
surf_coords - float32 tensor of shape [output_height, output_width,
2 * num_parts]
true_height - scalar int tensor
true_width - scalar int tensor
input_height: Input height to network.
input_width: Input width to network.
mask_height: Height for resizing mask crops.
mask_width: Width for resizing mask crops.
score_threshold: The threshold at which to convert predicted mask
into foreground pixels.
densepose_class_index: scalar int tensor with the class index (0-indexed)
for DensePose.
Returns:
A tuple of
all_instances: A [max_detections, mask_height, mask_width] uint8 tensor
with a predicted foreground mask for each instance. Background is encoded
as 0, and foreground is encoded as a positive integer. Specific part
indices are encoded as 1-indexed parts (for classes that have part
information).
surface_coords: A [max_detections, mask_height, mask_width, 2]
float32 tensor with (v, u) coordinates for each part.
"""
(boxes, classes, masks, part_heatmap, surf_coords, true_height,
true_width) = elems
# Boxes are in normalized coordinates relative to true image shapes. Convert # Boxes are in normalized coordinates relative to true image shapes. Convert
# coordinates to be normalized relative to input image shapes (since masks # coordinates to be normalized relative to input image shapes (since masks
# may still have padding). # may still have padding).
# Then crop and resize each mask. boxlist = box_list.BoxList(boxes)
def crop_and_threshold_masks(args): y_scale = true_height / input_height
"""Crops masks based on detection boxes.""" x_scale = true_width / input_width
boxes, classes, masks, true_height, true_width = args boxlist = box_list_ops.scale(boxlist, y_scale, x_scale)
boxlist = box_list.BoxList(boxes) boxes = boxlist.get()
y_scale = true_height / input_height # Convert masks from [output_height, output_width, num_classes] to
x_scale = true_width / input_width # [num_classes, output_height, output_width, 1].
boxlist = box_list_ops.scale(boxlist, y_scale, x_scale) num_classes = tf.shape(masks)[-1]
boxes = boxlist.get() masks_4d = tf.transpose(masks, perm=[2, 0, 1])[:, :, :, tf.newaxis]
# Convert masks from [input_height, input_width, num_classes] to # Tile part and surface coordinate masks for all classes.
# [num_classes, input_height, input_width, 1]. part_heatmap_4d = tf.tile(part_heatmap[tf.newaxis, :, :, :],
masks_4d = tf.transpose(masks, perm=[2, 0, 1])[:, :, :, tf.newaxis] multiples=[num_classes, 1, 1, 1])
cropped_masks = tf2.image.crop_and_resize( surf_coords_4d = tf.tile(surf_coords[tf.newaxis, :, :, :],
masks_4d, multiples=[num_classes, 1, 1, 1])
boxes=boxes, feature_maps_concat = tf.concat([masks_4d, part_heatmap_4d, surf_coords_4d],
box_indices=classes, axis=-1)
crop_size=[mask_height, mask_width], # The following tensor has shape
method='bilinear') # [max_detections, mask_height, mask_width, 1 + 3 * num_parts].
masks_3d = tf.squeeze(cropped_masks, axis=3) cropped_masks = tf2.image.crop_and_resize(
masks_binarized = tf.math.greater_equal(masks_3d, score_threshold) feature_maps_concat,
return tf.cast(masks_binarized, tf.uint8) boxes=boxes,
box_indices=classes,
crop_size=[mask_height, mask_width],
method='bilinear')
# Split the cropped masks back into instance masks, part masks, and surface
# coordinates.
num_parts = tf.shape(part_heatmap)[-1]
instance_masks, part_heatmap_cropped, surface_coords_cropped = tf.split(
cropped_masks, [1, num_parts, 2 * num_parts], axis=-1)
# Threshold the instance masks. Resulting tensor has shape
# [max_detections, mask_height, mask_width, 1].
instance_masks_int = tf.cast(
tf.math.greater_equal(instance_masks, score_threshold), dtype=tf.int32)
# Produce a binary mask that is 1.0 only:
# - in the foreground region for an instance
# - in detections corresponding to the DensePose class
det_with_parts = tf.equal(classes, densepose_class_index)
det_with_parts = tf.cast(
tf.reshape(det_with_parts, [-1, 1, 1, 1]), dtype=tf.int32)
instance_masks_with_parts = tf.math.multiply(instance_masks_int,
det_with_parts)
# Similarly, produce a binary mask that holds the foreground masks only for
# instances without parts (i.e. non-DensePose classes).
det_without_parts = 1 - det_with_parts
instance_masks_without_parts = tf.math.multiply(instance_masks_int,
det_without_parts)
# Assemble a tensor that has standard instance segmentation masks for
# non-DensePose classes (with values in [0, 1]), and part segmentation masks
# for DensePose classes (with values in [0, 1, ..., num_parts]).
part_mask_int_zero_indexed = tf.math.argmax(
part_heatmap_cropped, axis=-1, output_type=tf.int32)[:, :, :, tf.newaxis]
part_mask_int_one_indexed = part_mask_int_zero_indexed + 1
all_instances = (instance_masks_without_parts +
instance_masks_with_parts * part_mask_int_one_indexed)
# Gather the surface coordinates for the parts.
surface_coords_cropped = tf.reshape(
surface_coords_cropped, [-1, mask_height, mask_width, num_parts, 2])
surface_coords = gather_surface_coords_for_parts(surface_coords_cropped,
part_mask_int_zero_indexed)
surface_coords = (
surface_coords * tf.cast(instance_masks_with_parts, tf.float32))
return [tf.squeeze(all_instances, axis=3), surface_coords]
def gather_surface_coords_for_parts(surface_coords_cropped,
highest_scoring_part):
"""Gathers the (v, u) coordinates for the highest scoring DensePose parts.
true_heights, true_widths, _ = tf.unstack(true_image_shapes, axis=1) Args:
masks_for_image = shape_utils.static_or_dynamic_map_fn( surface_coords_cropped: A [max_detections, height, width, num_parts, 2]
crop_and_threshold_masks, float32 tensor with (v, u) surface coordinates.
elems=[boxes, classes, masks, true_heights, true_widths], highest_scoring_part: A [max_detections, height, width] integer tensor with
dtype=tf.uint8, the highest scoring part (0-indexed) indices for each location.
back_prop=False)
masks = tf.stack(masks_for_image, axis=0) Returns:
return masks A [max_detections, height, width, 2] float32 tensor with the (v, u)
coordinates selected from the highest scoring parts.
"""
max_detections, height, width, num_parts, _ = (
shape_utils.combined_static_and_dynamic_shape(surface_coords_cropped))
flattened_surface_coords = tf.reshape(surface_coords_cropped, [-1, 2])
flattened_part_ids = tf.reshape(highest_scoring_part, [-1])
# Produce lookup indices that represent the locations of the highest scoring
# parts in the `flattened_surface_coords` tensor.
flattened_lookup_indices = (
num_parts * tf.range(max_detections * height * width) +
flattened_part_ids)
vu_coords_flattened = tf.gather(flattened_surface_coords,
flattened_lookup_indices, axis=0)
return tf.reshape(vu_coords_flattened, [max_detections, height, width, 2])
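The flattened gather above works because, after reshaping to `[-1, 2]`, spatial location `i` owns the `num_parts` consecutive rows starting at `i * num_parts`, so the row holding its highest-scoring part is `i * num_parts + part_id`. A toy check of that indexing (values and shapes are arbitrary):

```python
import numpy as np
import tensorflow as tf

max_detections, height, width, num_parts = 1, 2, 2, 3
coords = np.arange(max_detections * height * width * num_parts * 2,
                   dtype=np.float32).reshape(
                       max_detections, height, width, num_parts, 2)
part_ids = np.array([[[2, 0], [1, 2]]], dtype=np.int32)

flat_coords = tf.reshape(coords, [-1, 2])
flat_ids = tf.reshape(part_ids, [-1])
# Same index arithmetic as in gather_surface_coords_for_parts.
lookup = num_parts * tf.range(max_detections * height * width) + flat_ids
gathered = tf.reshape(tf.gather(flat_coords, lookup),
                      [max_detections, height, width, 2])
# gathered[0, 0, 0] == coords[0, 0, 0, 2]; gathered[0, 1, 1] == coords[0, 1, 1, 2]
```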
class ObjectDetectionParams( class ObjectDetectionParams(
...@@ -1235,6 +1393,64 @@ class MaskParams( ...@@ -1235,6 +1393,64 @@ class MaskParams(
score_threshold, heatmap_bias_init) score_threshold, heatmap_bias_init)
class DensePoseParams(
collections.namedtuple('DensePoseParams', [
'class_id', 'classification_loss', 'localization_loss',
'part_loss_weight', 'coordinate_loss_weight', 'num_parts',
'task_loss_weight', 'upsample_to_input_res', 'upsample_method',
'heatmap_bias_init'
])):
"""Namedtuple to store DensePose prediction related parameters."""
__slots__ = ()
def __new__(cls,
class_id,
classification_loss,
localization_loss,
part_loss_weight=1.0,
coordinate_loss_weight=1.0,
num_parts=24,
task_loss_weight=1.0,
upsample_to_input_res=True,
upsample_method='bilinear',
heatmap_bias_init=-2.19):
"""Constructor with default values for DensePoseParams.
Args:
class_id: the ID of the class that contains the DensePose groundtruth.
This should typically correspond to the "person" class. Note that the ID
is 0-based, meaning that class 0 corresponds to the first non-background
object class.
classification_loss: an object_detection.core.losses.Loss object to
compute the loss for the body part predictions in CenterNet.
localization_loss: an object_detection.core.losses.Loss object to compute
the loss for the surface coordinate regression in CenterNet.
part_loss_weight: The loss weight to apply to part prediction.
coordinate_loss_weight: The loss weight to apply to surface coordinate
prediction.
num_parts: The number of DensePose parts to predict.
task_loss_weight: float, the loss weight for the DensePose task.
upsample_to_input_res: Whether to upsample the DensePose feature maps to
the input resolution before applying loss. Note that the prediction
outputs are still at the standard CenterNet output stride.
upsample_method: Method for upsampling DensePose feature maps. Options are
either 'bilinear' or 'nearest'. This has no effect when
`upsample_to_input_res` is False.
heatmap_bias_init: float, the initial value of bias in the convolutional
kernel of the part prediction head. If set to None, the
bias is initialized with zeros.
Returns:
An initialized DensePoseParams namedtuple.
"""
return super(DensePoseParams,
cls).__new__(cls, class_id, classification_loss,
localization_loss, part_loss_weight,
coordinate_loss_weight, num_parts,
task_loss_weight, upsample_to_input_res,
upsample_method, heatmap_bias_init)
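For reference, constructing the namedtuple looks like the following; this mirrors the `get_fake_densepose_params()` helper in the tests later in this diff, and the specific `class_id` and loss choices are only illustrative.

```python
from object_detection.core import losses

# class_id 1 is an illustrative choice for the "person" class (0-indexed).
densepose_params = DensePoseParams(
    class_id=1,
    classification_loss=losses.WeightedSoftmaxClassificationLoss(),
    localization_loss=losses.L1LocalizationLoss(),
    part_loss_weight=1.0,
    coordinate_loss_weight=1.0,
    num_parts=24,
    task_loss_weight=1.0,
    upsample_to_input_res=True,
    upsample_method='bilinear')
```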
# The following constants are used to generate the keys of the # The following constants are used to generate the keys of the
# (prediction, loss, target assigner,...) dictionaries used in CenterNetMetaArch # (prediction, loss, target assigner,...) dictionaries used in CenterNetMetaArch
# class. # class.
...@@ -1247,6 +1463,9 @@ KEYPOINT_HEATMAP = 'keypoint/heatmap' ...@@ -1247,6 +1463,9 @@ KEYPOINT_HEATMAP = 'keypoint/heatmap'
KEYPOINT_OFFSET = 'keypoint/offset' KEYPOINT_OFFSET = 'keypoint/offset'
SEGMENTATION_TASK = 'segmentation_task' SEGMENTATION_TASK = 'segmentation_task'
SEGMENTATION_HEATMAP = 'segmentation/heatmap' SEGMENTATION_HEATMAP = 'segmentation/heatmap'
DENSEPOSE_TASK = 'densepose_task'
DENSEPOSE_HEATMAP = 'densepose/heatmap'
DENSEPOSE_REGRESSION = 'densepose/regression'
LOSS_KEY_PREFIX = 'Loss' LOSS_KEY_PREFIX = 'Loss'
...@@ -1290,7 +1509,8 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -1290,7 +1509,8 @@ class CenterNetMetaArch(model.DetectionModel):
object_center_params, object_center_params,
object_detection_params=None, object_detection_params=None,
keypoint_params_dict=None, keypoint_params_dict=None,
mask_params=None): mask_params=None,
densepose_params=None):
"""Initializes a CenterNet model. """Initializes a CenterNet model.
Args: Args:
...@@ -1318,6 +1538,10 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -1318,6 +1538,10 @@ class CenterNetMetaArch(model.DetectionModel):
mask_params: A MaskParams namedtuple. This object mask_params: A MaskParams namedtuple. This object
holds the hyper-parameters for segmentation. Please see the class holds the hyper-parameters for segmentation. Please see the class
definition for more details. definition for more details.
densepose_params: A DensePoseParams namedtuple. This object holds the
hyper-parameters for DensePose prediction. Please see the class
definition for more details. Note that if this is provided, it is
expected that `mask_params` is also provided.
""" """
assert object_detection_params or keypoint_params_dict assert object_detection_params or keypoint_params_dict
# Shorten the name for convenience and better formatting. # Shorten the name for convenience and better formatting.
...@@ -1333,6 +1557,10 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -1333,6 +1557,10 @@ class CenterNetMetaArch(model.DetectionModel):
self._od_params = object_detection_params self._od_params = object_detection_params
self._kp_params_dict = keypoint_params_dict self._kp_params_dict = keypoint_params_dict
self._mask_params = mask_params self._mask_params = mask_params
if densepose_params is not None and mask_params is None:
raise ValueError('To run DensePose prediction, `mask_params` must also '
'be supplied.')
self._densepose_params = densepose_params
# Construct the prediction head nets. # Construct the prediction head nets.
self._prediction_head_dict = self._construct_prediction_heads( self._prediction_head_dict = self._construct_prediction_heads(
...@@ -1413,8 +1641,18 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -1413,8 +1641,18 @@ class CenterNetMetaArch(model.DetectionModel):
if self._mask_params is not None: if self._mask_params is not None:
prediction_heads[SEGMENTATION_HEATMAP] = [ prediction_heads[SEGMENTATION_HEATMAP] = [
make_prediction_net(num_classes, make_prediction_net(num_classes,
bias_fill=class_prediction_bias_init) bias_fill=self._mask_params.heatmap_bias_init)
for _ in range(num_feature_outputs)]
if self._densepose_params is not None:
prediction_heads[DENSEPOSE_HEATMAP] = [
make_prediction_net( # pylint: disable=g-complex-comprehension
self._densepose_params.num_parts,
bias_fill=self._densepose_params.heatmap_bias_init)
for _ in range(num_feature_outputs)] for _ in range(num_feature_outputs)]
prediction_heads[DENSEPOSE_REGRESSION] = [
make_prediction_net(2 * self._densepose_params.num_parts)
for _ in range(num_feature_outputs)
]
return prediction_heads return prediction_heads
def _initialize_target_assigners(self, stride, min_box_overlap_iou): def _initialize_target_assigners(self, stride, min_box_overlap_iou):
...@@ -1449,6 +1687,10 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -1449,6 +1687,10 @@ class CenterNetMetaArch(model.DetectionModel):
if self._mask_params is not None: if self._mask_params is not None:
target_assigners[SEGMENTATION_TASK] = ( target_assigners[SEGMENTATION_TASK] = (
cn_assigner.CenterNetMaskTargetAssigner(stride)) cn_assigner.CenterNetMaskTargetAssigner(stride))
if self._densepose_params is not None:
dp_stride = 1 if self._densepose_params.upsample_to_input_res else stride
target_assigners[DENSEPOSE_TASK] = (
cn_assigner.CenterNetDensePoseTargetAssigner(dp_stride))
return target_assigners return target_assigners
...@@ -1860,6 +2102,113 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -1860,6 +2102,113 @@ class CenterNetMetaArch(model.DetectionModel):
float(len(segmentation_predictions)) * total_pixels_in_loss) float(len(segmentation_predictions)) * total_pixels_in_loss)
return total_loss return total_loss
def _compute_densepose_losses(self, input_height, input_width,
prediction_dict):
"""Computes the weighted DensePose losses.
Args:
input_height: An integer scalar tensor representing input image height.
input_width: An integer scalar tensor representing input image width.
prediction_dict: A dictionary holding predicted tensors output by the
"predict" function. See the "predict" function for more detailed
description.
Returns:
A dictionary of scalar float tensors representing the weighted losses for
the DensePose task:
DENSEPOSE_HEATMAP: the weighted part segmentation loss.
DENSEPOSE_REGRESSION: the weighted part surface coordinate loss.
"""
dp_heatmap_loss, dp_regression_loss = (
self._compute_densepose_part_and_coordinate_losses(
input_height=input_height,
input_width=input_width,
part_predictions=prediction_dict[DENSEPOSE_HEATMAP],
surface_coord_predictions=prediction_dict[DENSEPOSE_REGRESSION]))
loss_dict = {}
loss_dict[DENSEPOSE_HEATMAP] = (
self._densepose_params.part_loss_weight * dp_heatmap_loss)
loss_dict[DENSEPOSE_REGRESSION] = (
self._densepose_params.coordinate_loss_weight * dp_regression_loss)
return loss_dict
def _compute_densepose_part_and_coordinate_losses(
self, input_height, input_width, part_predictions,
surface_coord_predictions):
"""Computes the individual losses for the DensePose task.
Args:
input_height: An integer scalar tensor representing input image height.
input_width: An integer scalar tensor representing input image width.
part_predictions: A list of float tensors of shape [batch_size,
out_height, out_width, num_parts].
surface_coord_predictions: A list of float tensors of shape [batch_size,
out_height, out_width, 2 * num_parts].
Returns:
A tuple with two scalar loss tensors: part_prediction_loss and
surface_coord_loss.
"""
gt_dp_num_points_list = self.groundtruth_lists(
fields.BoxListFields.densepose_num_points)
gt_dp_part_ids_list = self.groundtruth_lists(
fields.BoxListFields.densepose_part_ids)
gt_dp_surface_coords_list = self.groundtruth_lists(
fields.BoxListFields.densepose_surface_coords)
gt_weights_list = self.groundtruth_lists(fields.BoxListFields.weights)
assigner = self._target_assigner_dict[DENSEPOSE_TASK]
batch_indices, batch_part_ids, batch_surface_coords, batch_weights = (
assigner.assign_part_and_coordinate_targets(
height=input_height,
width=input_width,
gt_dp_num_points_list=gt_dp_num_points_list,
gt_dp_part_ids_list=gt_dp_part_ids_list,
gt_dp_surface_coords_list=gt_dp_surface_coords_list,
gt_weights_list=gt_weights_list))
part_prediction_loss = 0
surface_coord_loss = 0
classification_loss_fn = self._densepose_params.classification_loss
localization_loss_fn = self._densepose_params.localization_loss
num_predictions = float(len(part_predictions))
num_valid_points = tf.math.count_nonzero(batch_weights)
num_valid_points = tf.cast(tf.math.maximum(num_valid_points, 1), tf.float32)
for part_pred, surface_coord_pred in zip(part_predictions,
surface_coord_predictions):
# Potentially upsample the feature maps, so that better quality (i.e.
# higher res) groundtruth can be applied.
if self._densepose_params.upsample_to_input_res:
part_pred = tf.keras.layers.UpSampling2D(
self._stride, interpolation=self._densepose_params.upsample_method)(
part_pred)
surface_coord_pred = tf.keras.layers.UpSampling2D(
self._stride, interpolation=self._densepose_params.upsample_method)(
surface_coord_pred)
# Compute the part prediction loss.
part_pred = cn_assigner.get_batch_predictions_from_indices(
part_pred, batch_indices[:, 0:3])
part_prediction_loss += classification_loss_fn(
part_pred[:, tf.newaxis, :],
batch_part_ids[:, tf.newaxis, :],
weights=batch_weights[:, tf.newaxis, tf.newaxis])
# Compute the surface coordinate loss.
batch_size, out_height, out_width, _ = _get_shape(
surface_coord_pred, 4)
surface_coord_pred = tf.reshape(
surface_coord_pred, [batch_size, out_height, out_width, -1, 2])
surface_coord_pred = cn_assigner.get_batch_predictions_from_indices(
surface_coord_pred, batch_indices)
surface_coord_loss += localization_loss_fn(
surface_coord_pred,
batch_surface_coords,
weights=batch_weights[:, tf.newaxis])
part_prediction_loss = tf.reduce_sum(part_prediction_loss) / (
num_predictions * num_valid_points)
surface_coord_loss = tf.reduce_sum(surface_coord_loss) / (
num_predictions * num_valid_points)
return part_prediction_loss, surface_coord_loss
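Both DensePose losses above are normalized by the number of prediction heads and by the number of valid (non-padded) DensePose points, with the point count clamped to at least one so that batches without DensePose annotations do not divide by zero. A compact, hypothetical restatement of that normalization:

```python
import tensorflow as tf


def normalize_densepose_loss(per_output_losses, batch_weights):
  """Sums per-head losses and averages over heads and valid points."""
  num_predictions = float(len(per_output_losses))
  # count_nonzero counts valid DensePose points; clamp to >= 1 to avoid 0/0.
  num_valid = tf.cast(
      tf.math.maximum(tf.math.count_nonzero(batch_weights), 1), tf.float32)
  return tf.add_n([tf.reduce_sum(l) for l in per_output_losses]) / (
      num_predictions * num_valid)
```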
def preprocess(self, inputs): def preprocess(self, inputs):
outputs = shape_utils.resize_images_and_return_shapes( outputs = shape_utils.resize_images_and_return_shapes(
inputs, self._image_resizer_fn) inputs, self._image_resizer_fn)
...@@ -1909,6 +2258,13 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -1909,6 +2258,13 @@ class CenterNetMetaArch(model.DetectionModel):
'segmentation/heatmap' - [optional] A list of size num_feature_outputs 'segmentation/heatmap' - [optional] A list of size num_feature_outputs
holding float tensors of size [batch_size, output_height, holding float tensors of size [batch_size, output_height,
output_width, num_classes] representing the mask logits. output_width, num_classes] representing the mask logits.
'densepose/heatmap' - [optional] A list of size num_feature_outputs
holding float tensors of size [batch_size, output_height,
output_width, num_parts] representing the mask logits for each part.
'densepose/regression' - [optional] A list of size num_feature_outputs
holding float tensors of size [batch_size, output_height,
output_width, 2 * num_parts] representing the DensePose surface
coordinate predictions.
Note the $TASK_NAME is provided by the KeypointEstimation namedtuple Note the $TASK_NAME is provided by the KeypointEstimation namedtuple
used to differentiate between different keypoint tasks. used to differentiate between different keypoint tasks.
""" """
...@@ -1938,10 +2294,16 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -1938,10 +2294,16 @@ class CenterNetMetaArch(model.DetectionModel):
scope: Optional scope name. scope: Optional scope name.
Returns: Returns:
A dictionary mapping the keys ['Loss/object_center', 'Loss/box/scale', A dictionary mapping the keys [
'Loss/box/offset', 'Loss/$TASK_NAME/keypoint/heatmap', 'Loss/object_center',
'Loss/$TASK_NAME/keypoint/offset', 'Loss/box/scale', (optional)
'Loss/$TASK_NAME/keypoint/regression', 'Loss/segmentation/heatmap'] to 'Loss/box/offset', (optional)
'Loss/$TASK_NAME/keypoint/heatmap', (optional)
'Loss/$TASK_NAME/keypoint/offset', (optional)
'Loss/$TASK_NAME/keypoint/regression', (optional)
'Loss/segmentation/heatmap', (optional)
'Loss/densepose/heatmap', (optional)
'Loss/densepose/regression'] (optional)
scalar tensors corresponding to the losses for different tasks. Note the scalar tensors corresponding to the losses for different tasks. Note the
$TASK_NAME is provided by the KeypointEstimation namedtuple used to $TASK_NAME is provided by the KeypointEstimation namedtuple used to
differentiate between different keypoint tasks. differentiate between different keypoint tasks.
...@@ -1999,6 +2361,16 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -1999,6 +2361,16 @@ class CenterNetMetaArch(model.DetectionModel):
seg_losses[key] = seg_losses[key] * self._mask_params.task_loss_weight seg_losses[key] = seg_losses[key] * self._mask_params.task_loss_weight
losses.update(seg_losses) losses.update(seg_losses)
if self._densepose_params is not None:
densepose_losses = self._compute_densepose_losses(
input_height=input_height,
input_width=input_width,
prediction_dict=prediction_dict)
for key in densepose_losses:
densepose_losses[key] = (
densepose_losses[key] * self._densepose_params.task_loss_weight)
losses.update(densepose_losses)
# Prepend the LOSS_KEY_PREFIX to the keys in the dictionary such that the # Prepend the LOSS_KEY_PREFIX to the keys in the dictionary such that the
# losses will be grouped together in Tensorboard. # losses will be grouped together in Tensorboard.
return dict([('%s/%s' % (LOSS_KEY_PREFIX, key), val) return dict([('%s/%s' % (LOSS_KEY_PREFIX, key), val)
...@@ -2033,9 +2405,14 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -2033,9 +2405,14 @@ class CenterNetMetaArch(model.DetectionModel):
invalid keypoints have their coordinates and scores set to 0.0. invalid keypoints have their coordinates and scores set to 0.0.
detection_keypoint_scores: (Optional) A float tensor of shape [batch, detection_keypoint_scores: (Optional) A float tensor of shape [batch,
max_detection, num_keypoints] with scores for each keypoint. max_detection, num_keypoints] with scores for each keypoint.
detection_masks: (Optional) An int tensor of shape [batch, detection_masks: (Optional) A uint8 tensor of shape [batch,
max_detections, mask_height, mask_width] with binarized masks for each max_detections, mask_height, mask_width] with masks for each
detection. detection. Background is specified with 0, and foreground is specified
with positive integers (1 for a standard instance segmentation mask, and
1-indexed part indices for the DensePose task).
detection_surface_coords: (Optional) A float32 tensor of shape [batch,
max_detection, mask_height, mask_width, 2] with DensePose surface
coordinates, in (v, u) format.
""" """
object_center_prob = tf.nn.sigmoid(prediction_dict[OBJECT_CENTER][-1]) object_center_prob = tf.nn.sigmoid(prediction_dict[OBJECT_CENTER][-1])
# Get x, y and channel indices corresponding to the top indices in the class # Get x, y and channel indices corresponding to the top indices in the class
...@@ -2076,14 +2453,27 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -2076,14 +2453,27 @@ class CenterNetMetaArch(model.DetectionModel):
if self._mask_params: if self._mask_params:
masks = tf.nn.sigmoid(prediction_dict[SEGMENTATION_HEATMAP][-1]) masks = tf.nn.sigmoid(prediction_dict[SEGMENTATION_HEATMAP][-1])
instance_masks = convert_strided_predictions_to_instance_masks( densepose_part_heatmap, densepose_surface_coords = None, None
boxes, classes, masks, self._stride, self._mask_params.mask_height, densepose_class_index = 0
self._mask_params.mask_width, true_image_shapes, if self._densepose_params:
self._mask_params.score_threshold) densepose_part_heatmap = prediction_dict[DENSEPOSE_HEATMAP][-1]
postprocess_dict.update({ densepose_surface_coords = prediction_dict[DENSEPOSE_REGRESSION][-1]
fields.DetectionResultFields.detection_masks: densepose_class_index = self._densepose_params.class_id
instance_masks instance_masks, surface_coords = (
}) convert_strided_predictions_to_instance_masks(
boxes, classes, masks, true_image_shapes,
densepose_part_heatmap, densepose_surface_coords,
stride=self._stride, mask_height=self._mask_params.mask_height,
mask_width=self._mask_params.mask_width,
score_threshold=self._mask_params.score_threshold,
densepose_class_index=densepose_class_index))
postprocess_dict[
fields.DetectionResultFields.detection_masks] = instance_masks
if self._densepose_params:
postprocess_dict[
fields.DetectionResultFields.detection_surface_coords] = (
surface_coords)
return postprocess_dict return postprocess_dict
def _postprocess_keypoints(self, prediction_dict, classes, y_indices, def _postprocess_keypoints(self, prediction_dict, classes, y_indices,
...@@ -2359,6 +2749,14 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -2359,6 +2749,14 @@ class CenterNetMetaArch(model.DetectionModel):
checkpoint (with compatible variable names) or to restore from a checkpoint (with compatible variable names) or to restore from a
classification checkpoint for initialization prior to training. classification checkpoint for initialization prior to training.
Valid values: `detection`, `classification`. Default 'detection'. Valid values: `detection`, `classification`. Default 'detection'.
'detection': used when loading the Hourglass model pre-trained on
another detection task.
'classification': used when loading the ResNet model pre-trained on
an image classification task. Note that only the image feature encoding
part is loaded, not the upsampling layers.
'fine_tune': used when loading the entire CenterNet feature extractor
pre-trained on other tasks. The checkpoints saved during CenterNet
model training can be directly loaded using this mode.
Returns: Returns:
A dict mapping keys to Trackable objects (tf.Module or Checkpoint). A dict mapping keys to Trackable objects (tf.Module or Checkpoint).
...@@ -2367,9 +2765,14 @@ class CenterNetMetaArch(model.DetectionModel): ...@@ -2367,9 +2765,14 @@ class CenterNetMetaArch(model.DetectionModel):
if fine_tune_checkpoint_type == 'classification': if fine_tune_checkpoint_type == 'classification':
return {'feature_extractor': self._feature_extractor.get_base_model()} return {'feature_extractor': self._feature_extractor.get_base_model()}
if fine_tune_checkpoint_type == 'detection': elif fine_tune_checkpoint_type == 'detection':
return {'feature_extractor': self._feature_extractor.get_model()} return {'feature_extractor': self._feature_extractor.get_model()}
elif fine_tune_checkpoint_type == 'fine_tune':
feature_extractor_model = tf.train.Checkpoint(
_feature_extractor=self._feature_extractor)
return {'model': feature_extractor_model}
else: else:
raise ValueError('Not supported fine tune checkpoint type - {}'.format( raise ValueError('Not supported fine tune checkpoint type - {}'.format(
fine_tune_checkpoint_type)) fine_tune_checkpoint_type))
......
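Assuming the enclosing method is the meta architecture's `restore_map` (its name is not visible in this hunk), the new `fine_tune` mode can be used roughly as follows; `model` and `checkpoint_path` are placeholders.

```python
import tensorflow as tf

# Hedged sketch: warm-starting from a checkpoint written during a previous
# CenterNet training run. `model` is a CenterNetMetaArch instance and
# `checkpoint_path` points at that earlier checkpoint.
restore_map = model.restore_map(fine_tune_checkpoint_type='fine_tune')
ckpt = tf.train.Checkpoint(**restore_map)
ckpt.restore(checkpoint_path).expect_partial()
```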
...@@ -266,7 +266,7 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase): ...@@ -266,7 +266,7 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
masks_np[0, :, :3, 1] = 1 # Class 1. masks_np[0, :, :3, 1] = 1 # Class 1.
masks = tf.constant(masks_np) masks = tf.constant(masks_np)
true_image_shapes = tf.constant([[6, 8, 3]]) true_image_shapes = tf.constant([[6, 8, 3]])
instance_masks = cnma.convert_strided_predictions_to_instance_masks( instance_masks, _ = cnma.convert_strided_predictions_to_instance_masks(
boxes, classes, masks, stride=2, mask_height=2, mask_width=2, boxes, classes, masks, stride=2, mask_height=2, mask_width=2,
true_image_shapes=true_image_shapes) true_image_shapes=true_image_shapes)
return instance_masks return instance_masks
...@@ -289,6 +289,104 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase): ...@@ -289,6 +289,104 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
]) ])
np.testing.assert_array_equal(expected_instance_masks, instance_masks) np.testing.assert_array_equal(expected_instance_masks, instance_masks)
def test_convert_strided_predictions_raises_error_with_one_tensor(self):
def graph_fn():
boxes = tf.constant(
[
[[0.5, 0.5, 1.0, 1.0],
[0.0, 0.5, 0.5, 1.0],
[0.0, 0.0, 0.0, 0.0]],
], tf.float32)
classes = tf.constant(
[
[0, 1, 0],
], tf.int32)
masks_np = np.zeros((1, 4, 4, 2), dtype=np.float32)
masks_np[0, :, 2:, 0] = 1 # Class 0.
masks_np[0, :, :3, 1] = 1 # Class 1.
masks = tf.constant(masks_np)
true_image_shapes = tf.constant([[6, 8, 3]])
densepose_part_heatmap = tf.random.uniform(
[1, 4, 4, 24])
instance_masks, _ = cnma.convert_strided_predictions_to_instance_masks(
boxes, classes, masks, true_image_shapes,
densepose_part_heatmap=densepose_part_heatmap,
densepose_surface_coords=None)
return instance_masks
with self.assertRaises(ValueError):
self.execute_cpu(graph_fn, [])
def test_crop_and_threshold_masks(self):
boxes_np = np.array(
[[0., 0., 0.5, 0.5],
[0.25, 0.25, 1.0, 1.0]], dtype=np.float32)
classes_np = np.array([0, 2], dtype=np.int32)
masks_np = np.zeros((4, 4, _NUM_CLASSES), dtype=np.float32)
masks_np[0, 0, 0] = 0.8
masks_np[1, 1, 0] = 0.6
masks_np[3, 3, 2] = 0.7
part_heatmap_np = np.zeros((4, 4, _DENSEPOSE_NUM_PARTS), dtype=np.float32)
part_heatmap_np[0, 0, 4] = 1
part_heatmap_np[0, 0, 2] = 0.6 # Lower scoring.
part_heatmap_np[1, 1, 8] = 0.2
part_heatmap_np[3, 3, 4] = 0.5
surf_coords_np = np.zeros((4, 4, 2 * _DENSEPOSE_NUM_PARTS),
dtype=np.float32)
surf_coords_np[:, :, 8:10] = 0.2, 0.9
surf_coords_np[:, :, 16:18] = 0.3, 0.5
true_height, true_width = 10, 10
input_height, input_width = 10, 10
mask_height = 4
mask_width = 4
def graph_fn():
elems = [
tf.constant(boxes_np),
tf.constant(classes_np),
tf.constant(masks_np),
tf.constant(part_heatmap_np),
tf.constant(surf_coords_np),
tf.constant(true_height, dtype=tf.int32),
tf.constant(true_width, dtype=tf.int32)
]
part_masks, surface_coords = cnma.crop_and_threshold_masks(
elems, input_height, input_width, mask_height=mask_height,
mask_width=mask_width, densepose_class_index=0)
return part_masks, surface_coords
part_masks, surface_coords = self.execute_cpu(graph_fn, [])
expected_part_masks = np.zeros((2, 4, 4), dtype=np.uint8)
expected_part_masks[0, 0, 0] = 5 # Recall parts are 1-indexed in the output.
expected_part_masks[0, 2, 2] = 9 # Recall parts are 1-indexed in the output.
expected_part_masks[1, 3, 3] = 1 # Standard instance segmentation mask.
expected_surface_coords = np.zeros((2, 4, 4, 2), dtype=np.float32)
expected_surface_coords[0, 0, 0, :] = 0.2, 0.9
expected_surface_coords[0, 2, 2, :] = 0.3, 0.5
np.testing.assert_allclose(expected_part_masks, part_masks)
np.testing.assert_allclose(expected_surface_coords, surface_coords)
def test_gather_surface_coords_for_parts(self):
surface_coords_cropped_np = np.zeros((2, 5, 5, _DENSEPOSE_NUM_PARTS, 2),
dtype=np.float32)
surface_coords_cropped_np[0, 0, 0, 5] = 0.3, 0.4
surface_coords_cropped_np[0, 1, 0, 9] = 0.5, 0.6
highest_scoring_part_np = np.zeros((2, 5, 5), dtype=np.int32)
highest_scoring_part_np[0, 0, 0] = 5
highest_scoring_part_np[0, 1, 0] = 9
def graph_fn():
surface_coords_cropped = tf.constant(surface_coords_cropped_np,
tf.float32)
highest_scoring_part = tf.constant(highest_scoring_part_np, tf.int32)
surface_coords_gathered = cnma.gather_surface_coords_for_parts(
surface_coords_cropped, highest_scoring_part)
return surface_coords_gathered
surface_coords_gathered = self.execute_cpu(graph_fn, [])
np.testing.assert_allclose([0.3, 0.4], surface_coords_gathered[0, 0, 0])
np.testing.assert_allclose([0.5, 0.6], surface_coords_gathered[0, 1, 0])
def test_top_k_feature_map_locations(self): def test_top_k_feature_map_locations(self):
feature_map_np = np.zeros((2, 3, 3, 2), dtype=np.float32) feature_map_np = np.zeros((2, 3, 3, 2), dtype=np.float32)
feature_map_np[0, 2, 0, 1] = 1.0 feature_map_np[0, 2, 0, 1] = 1.0
...@@ -535,6 +633,8 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase): ...@@ -535,6 +633,8 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
keypoint_heatmap_np[1, 0, 1, 1] = 0.9 keypoint_heatmap_np[1, 0, 1, 1] = 0.9
keypoint_heatmap_np[1, 2, 0, 1] = 0.8 keypoint_heatmap_np[1, 2, 0, 1] = 0.8
# Note that the keypoint offsets are now per keypoint (as opposed to
# keypoint agnostic, in the test test_keypoint_candidate_prediction).
keypoint_heatmap_offsets_np = np.zeros((2, 3, 3, 4), dtype=np.float32) keypoint_heatmap_offsets_np = np.zeros((2, 3, 3, 4), dtype=np.float32)
keypoint_heatmap_offsets_np[0, 0, 0] = [0.5, 0.25, 0.0, 0.0] keypoint_heatmap_offsets_np[0, 0, 0] = [0.5, 0.25, 0.0, 0.0]
keypoint_heatmap_offsets_np[0, 2, 1] = [-0.25, 0.5, 0.0, 0.0] keypoint_heatmap_offsets_np[0, 2, 1] = [-0.25, 0.5, 0.0, 0.0]
...@@ -949,6 +1049,7 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase): ...@@ -949,6 +1049,7 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
_NUM_CLASSES = 10 _NUM_CLASSES = 10
_KEYPOINT_INDICES = [0, 1, 2, 3] _KEYPOINT_INDICES = [0, 1, 2, 3]
_NUM_KEYPOINTS = len(_KEYPOINT_INDICES) _NUM_KEYPOINTS = len(_KEYPOINT_INDICES)
_DENSEPOSE_NUM_PARTS = 24
_TASK_NAME = 'human_pose' _TASK_NAME = 'human_pose'
...@@ -991,6 +1092,20 @@ def get_fake_mask_params(): ...@@ -991,6 +1092,20 @@ def get_fake_mask_params():
mask_width=4) mask_width=4)
def get_fake_densepose_params():
"""Returns the fake DensePose estimation parameter namedtuple."""
return cnma.DensePoseParams(
class_id=1,
classification_loss=losses.WeightedSoftmaxClassificationLoss(),
localization_loss=losses.L1LocalizationLoss(),
part_loss_weight=1.0,
coordinate_loss_weight=1.0,
num_parts=_DENSEPOSE_NUM_PARTS,
task_loss_weight=1.0,
upsample_to_input_res=True,
upsample_method='nearest')
def build_center_net_meta_arch(build_resnet=False): def build_center_net_meta_arch(build_resnet=False):
"""Builds the CenterNet meta architecture.""" """Builds the CenterNet meta architecture."""
if build_resnet: if build_resnet:
...@@ -1018,7 +1133,8 @@ def build_center_net_meta_arch(build_resnet=False): ...@@ -1018,7 +1133,8 @@ def build_center_net_meta_arch(build_resnet=False):
object_center_params=get_fake_center_params(), object_center_params=get_fake_center_params(),
object_detection_params=get_fake_od_params(), object_detection_params=get_fake_od_params(),
keypoint_params_dict={_TASK_NAME: get_fake_kp_params()}, keypoint_params_dict={_TASK_NAME: get_fake_kp_params()},
mask_params=get_fake_mask_params()) mask_params=get_fake_mask_params(),
densepose_params=get_fake_densepose_params())
def _logit(p): def _logit(p):
...@@ -1102,6 +1218,16 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1102,6 +1218,16 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
fake_feature_map) fake_feature_map)
self.assertEqual((4, 128, 128, _NUM_CLASSES), output.shape) self.assertEqual((4, 128, 128, _NUM_CLASSES), output.shape)
# "densepose parts" head:
output = model._prediction_head_dict[cnma.DENSEPOSE_HEATMAP][-1](
fake_feature_map)
self.assertEqual((4, 128, 128, _DENSEPOSE_NUM_PARTS), output.shape)
# "densepose surface coordinates" head:
output = model._prediction_head_dict[cnma.DENSEPOSE_REGRESSION][-1](
fake_feature_map)
self.assertEqual((4, 128, 128, 2 * _DENSEPOSE_NUM_PARTS), output.shape)
def test_initialize_target_assigners(self): def test_initialize_target_assigners(self):
model = build_center_net_meta_arch() model = build_center_net_meta_arch()
assigner_dict = model._initialize_target_assigners( assigner_dict = model._initialize_target_assigners(
...@@ -1125,6 +1251,10 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1125,6 +1251,10 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
self.assertIsInstance(assigner_dict[cnma.SEGMENTATION_TASK], self.assertIsInstance(assigner_dict[cnma.SEGMENTATION_TASK],
cn_assigner.CenterNetMaskTargetAssigner) cn_assigner.CenterNetMaskTargetAssigner)
# DensePose estimation target assigner:
self.assertIsInstance(assigner_dict[cnma.DENSEPOSE_TASK],
cn_assigner.CenterNetDensePoseTargetAssigner)
def test_predict(self): def test_predict(self):
"""Test the predict function.""" """Test the predict function."""
...@@ -1145,6 +1275,10 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1145,6 +1275,10 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
(2, 32, 32, 2)) (2, 32, 32, 2))
self.assertEqual(prediction_dict[cnma.SEGMENTATION_HEATMAP][0].shape, self.assertEqual(prediction_dict[cnma.SEGMENTATION_HEATMAP][0].shape,
(2, 32, 32, _NUM_CLASSES)) (2, 32, 32, _NUM_CLASSES))
self.assertEqual(prediction_dict[cnma.DENSEPOSE_HEATMAP][0].shape,
(2, 32, 32, _DENSEPOSE_NUM_PARTS))
self.assertEqual(prediction_dict[cnma.DENSEPOSE_REGRESSION][0].shape,
(2, 32, 32, 2 * _DENSEPOSE_NUM_PARTS))
def test_loss(self): def test_loss(self):
"""Test the loss function.""" """Test the loss function."""
...@@ -1157,7 +1291,13 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1157,7 +1291,13 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
groundtruth_keypoints_list=groundtruth_dict[ groundtruth_keypoints_list=groundtruth_dict[
fields.BoxListFields.keypoints], fields.BoxListFields.keypoints],
groundtruth_masks_list=groundtruth_dict[ groundtruth_masks_list=groundtruth_dict[
fields.BoxListFields.masks]) fields.BoxListFields.masks],
groundtruth_dp_num_points_list=groundtruth_dict[
fields.BoxListFields.densepose_num_points],
groundtruth_dp_part_ids_list=groundtruth_dict[
fields.BoxListFields.densepose_part_ids],
groundtruth_dp_surface_coords_list=groundtruth_dict[
fields.BoxListFields.densepose_surface_coords])
prediction_dict = get_fake_prediction_dict( prediction_dict = get_fake_prediction_dict(
input_height=16, input_width=32, stride=4) input_height=16, input_width=32, stride=4)
...@@ -1193,6 +1333,12 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1193,6 +1333,12 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
self.assertGreater( self.assertGreater(
0.01, loss_dict['%s/%s' % (cnma.LOSS_KEY_PREFIX, 0.01, loss_dict['%s/%s' % (cnma.LOSS_KEY_PREFIX,
cnma.SEGMENTATION_HEATMAP)]) cnma.SEGMENTATION_HEATMAP)])
self.assertGreater(
0.01, loss_dict['%s/%s' % (cnma.LOSS_KEY_PREFIX,
cnma.DENSEPOSE_HEATMAP)])
self.assertGreater(
0.01, loss_dict['%s/%s' % (cnma.LOSS_KEY_PREFIX,
cnma.DENSEPOSE_REGRESSION)])
@parameterized.parameters( @parameterized.parameters(
{'target_class_id': 1}, {'target_class_id': 1},
...@@ -1230,6 +1376,14 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1230,6 +1376,14 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
segmentation_heatmap[:, 14:18, 14:18, target_class_id] = 1.0 segmentation_heatmap[:, 14:18, 14:18, target_class_id] = 1.0
segmentation_heatmap = _logit(segmentation_heatmap) segmentation_heatmap = _logit(segmentation_heatmap)
dp_part_ind = 4
dp_part_heatmap = np.zeros((1, 32, 32, _DENSEPOSE_NUM_PARTS),
dtype=np.float32)
dp_part_heatmap[0, 14:18, 14:18, dp_part_ind] = 1.0
dp_part_heatmap = _logit(dp_part_heatmap)
dp_surf_coords = np.random.randn(1, 32, 32, 2 * _DENSEPOSE_NUM_PARTS)
class_center = tf.constant(class_center) class_center = tf.constant(class_center)
height_width = tf.constant(height_width) height_width = tf.constant(height_width)
offset = tf.constant(offset) offset = tf.constant(offset)
...@@ -1237,6 +1391,8 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1237,6 +1391,8 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
keypoint_offsets = tf.constant(keypoint_offsets, dtype=tf.float32) keypoint_offsets = tf.constant(keypoint_offsets, dtype=tf.float32)
keypoint_regression = tf.constant(keypoint_regression, dtype=tf.float32) keypoint_regression = tf.constant(keypoint_regression, dtype=tf.float32)
segmentation_heatmap = tf.constant(segmentation_heatmap, dtype=tf.float32) segmentation_heatmap = tf.constant(segmentation_heatmap, dtype=tf.float32)
dp_part_heatmap = tf.constant(dp_part_heatmap, dtype=tf.float32)
dp_surf_coords = tf.constant(dp_surf_coords, dtype=tf.float32)
prediction_dict = { prediction_dict = {
cnma.OBJECT_CENTER: [class_center], cnma.OBJECT_CENTER: [class_center],
...@@ -1249,6 +1405,8 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1249,6 +1405,8 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
cnma.get_keypoint_name(_TASK_NAME, cnma.KEYPOINT_REGRESSION): cnma.get_keypoint_name(_TASK_NAME, cnma.KEYPOINT_REGRESSION):
[keypoint_regression], [keypoint_regression],
cnma.SEGMENTATION_HEATMAP: [segmentation_heatmap], cnma.SEGMENTATION_HEATMAP: [segmentation_heatmap],
cnma.DENSEPOSE_HEATMAP: [dp_part_heatmap],
cnma.DENSEPOSE_REGRESSION: [dp_surf_coords]
} }
def graph_fn(): def graph_fn():
...@@ -1271,12 +1429,13 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1271,12 +1429,13 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
self.assertAllEqual([1, max_detection, 4, 4], self.assertAllEqual([1, max_detection, 4, 4],
detections['detection_masks'].shape) detections['detection_masks'].shape)
# There should be some section of the first mask (correspond to the only # Masks should be empty for everything but the first detection.
# detection) with non-zero mask values.
self.assertGreater(np.sum(detections['detection_masks'][0, 0, :, :] > 0), 0)
self.assertAllEqual( self.assertAllEqual(
detections['detection_masks'][0, 1:, :, :], detections['detection_masks'][0, 1:, :, :],
np.zeros_like(detections['detection_masks'][0, 1:, :, :])) np.zeros_like(detections['detection_masks'][0, 1:, :, :]))
self.assertAllEqual(
detections['detection_surface_coords'][0, 1:, :, :],
np.zeros_like(detections['detection_surface_coords'][0, 1:, :, :]))
if target_class_id == 1: if target_class_id == 1:
expected_kpts_for_obj_0 = np.array( expected_kpts_for_obj_0 = np.array(
...@@ -1287,6 +1446,12 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1287,6 +1446,12 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
expected_kpts_for_obj_0, rtol=1e-6) expected_kpts_for_obj_0, rtol=1e-6)
np.testing.assert_allclose(detections['detection_keypoint_scores'][0][0], np.testing.assert_allclose(detections['detection_keypoint_scores'][0][0],
expected_kpt_scores_for_obj_0, rtol=1e-6) expected_kpt_scores_for_obj_0, rtol=1e-6)
# First detection has DensePose parts.
self.assertSameElements(
np.unique(detections['detection_masks'][0, 0, :, :]),
set([0, dp_part_ind + 1]))
self.assertGreater(np.sum(np.abs(detections['detection_surface_coords'])),
0.0)
else: else:
# All keypoint outputs should be zeros. # All keypoint outputs should be zeros.
np.testing.assert_allclose( np.testing.assert_allclose(
...@@ -1297,6 +1462,14 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase): ...@@ -1297,6 +1462,14 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
detections['detection_keypoint_scores'][0][0], detections['detection_keypoint_scores'][0][0],
np.zeros([num_keypoints], np.float), np.zeros([num_keypoints], np.float),
rtol=1e-6) rtol=1e-6)
# Binary segmentation mask.
self.assertSameElements(
np.unique(detections['detection_masks'][0, 0, :, :]),
set([0, 1]))
# No DensePose surface coordinates.
np.testing.assert_allclose(
detections['detection_surface_coords'][0, 0, :, :],
np.zeros_like(detections['detection_surface_coords'][0, 0, :, :]))
def test_get_instance_indices(self): def test_get_instance_indices(self):
classes = tf.constant([[0, 1, 2, 0], [2, 1, 2, 2]], dtype=tf.int32) classes = tf.constant([[0, 1, 2, 0], [2, 1, 2, 2]], dtype=tf.int32)
...@@ -1353,6 +1526,17 @@ def get_fake_prediction_dict(input_height, input_width, stride): ...@@ -1353,6 +1526,17 @@ def get_fake_prediction_dict(input_height, input_width, stride):
mask_heatmap[0, 2, 4, 1] = 1.0 mask_heatmap[0, 2, 4, 1] = 1.0
mask_heatmap = _logit(mask_heatmap) mask_heatmap = _logit(mask_heatmap)
densepose_heatmap = np.zeros((2, output_height, output_width,
_DENSEPOSE_NUM_PARTS), dtype=np.float32)
densepose_heatmap[0, 2, 4, 5] = 1.0
densepose_heatmap = _logit(densepose_heatmap)
densepose_regression = np.zeros((2, output_height, output_width,
2 * _DENSEPOSE_NUM_PARTS), dtype=np.float32)
# The surface coordinate indices for part index 5 are:
# (5 * 2, 5 * 2 + 1), or (10, 11).
densepose_regression[0, 2, 4, 10:12] = 0.4, 0.7
prediction_dict = { prediction_dict = {
'preprocessed_inputs': 'preprocessed_inputs':
tf.zeros((2, input_height, input_width, 3)), tf.zeros((2, input_height, input_width, 3)),
...@@ -1383,6 +1567,14 @@ def get_fake_prediction_dict(input_height, input_width, stride): ...@@ -1383,6 +1567,14 @@ def get_fake_prediction_dict(input_height, input_width, stride):
cnma.SEGMENTATION_HEATMAP: [ cnma.SEGMENTATION_HEATMAP: [
tf.constant(mask_heatmap), tf.constant(mask_heatmap),
tf.constant(mask_heatmap) tf.constant(mask_heatmap)
],
cnma.DENSEPOSE_HEATMAP: [
tf.constant(densepose_heatmap),
tf.constant(densepose_heatmap),
],
cnma.DENSEPOSE_REGRESSION: [
tf.constant(densepose_regression),
tf.constant(densepose_regression),
] ]
} }
return prediction_dict return prediction_dict
...@@ -1427,12 +1619,30 @@ def get_fake_groundtruth_dict(input_height, input_width, stride): ...@@ -1427,12 +1619,30 @@ def get_fake_groundtruth_dict(input_height, input_width, stride):
tf.constant(mask), tf.constant(mask),
tf.zeros_like(mask), tf.zeros_like(mask),
] ]
densepose_num_points = [
tf.constant([1], dtype=tf.int32),
tf.constant([0], dtype=tf.int32),
]
densepose_part_ids = [
tf.constant([[5, 0, 0]], dtype=tf.int32),
tf.constant([[0, 0, 0]], dtype=tf.int32),
]
densepose_surface_coords_np = np.zeros((1, 3, 4), dtype=np.float32)
densepose_surface_coords_np[0, 0, :] = 0.55, 0.55, 0.4, 0.7
densepose_surface_coords = [
tf.constant(densepose_surface_coords_np),
tf.zeros_like(densepose_surface_coords_np)
]
groundtruth_dict = { groundtruth_dict = {
fields.BoxListFields.boxes: boxes, fields.BoxListFields.boxes: boxes,
fields.BoxListFields.weights: weights, fields.BoxListFields.weights: weights,
fields.BoxListFields.classes: classes, fields.BoxListFields.classes: classes,
fields.BoxListFields.keypoints: keypoints, fields.BoxListFields.keypoints: keypoints,
fields.BoxListFields.masks: masks, fields.BoxListFields.masks: masks,
fields.BoxListFields.densepose_num_points: densepose_num_points,
fields.BoxListFields.densepose_part_ids: densepose_part_ids,
fields.BoxListFields.densepose_surface_coords:
densepose_surface_coords,
fields.InputDataFields.groundtruth_labeled_classes: labeled_classes, fields.InputDataFields.groundtruth_labeled_classes: labeled_classes,
} }
return groundtruth_dict return groundtruth_dict
......
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Library functions for Context R-CNN."""
import tensorflow as tf
from object_detection.core import freezable_batch_norm
# The negative value used in padding the invalid weights.
_NEGATIVE_PADDING_VALUE = -100000
class ContextProjection(tf.keras.layers.Layer):
"""Custom layer to do batch normalization and projection."""
def __init__(self, projection_dimension, **kwargs):
self.batch_norm = freezable_batch_norm.FreezableBatchNorm(
epsilon=0.001,
center=True,
scale=True,
momentum=0.97,
trainable=True)
self.projection = tf.keras.layers.Dense(units=projection_dimension,
activation=tf.nn.relu6,
use_bias=True)
super(ContextProjection, self).__init__(**kwargs)
def build(self, input_shape):
self.batch_norm.build(input_shape)
self.projection.build(input_shape)
def call(self, input_features, is_training=False):
return self.projection(self.batch_norm(input_features, is_training))
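A small usage sketch of `ContextProjection` on its own: the layer batch-normalizes the incoming features and then projects them to the requested dimension through a ReLU-6 Dense layer. The shapes below are arbitrary.

```python
import tensorflow as tf

# Assumes ContextProjection is importable from the Context R-CNN library
# module shown in this diff; 128 and 2048 are illustrative dimensions.
proj = ContextProjection(projection_dimension=128)
context_features = tf.random.uniform([2, 10, 2048])     # [batch, context_size, depth]
projected = proj(context_features, is_training=False)   # -> [2, 10, 128]
```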
class AttentionBlock(tf.keras.layers.Layer):
"""Custom layer to perform all attention."""
def __init__(self, bottleneck_dimension, attention_temperature,
output_dimension=None, is_training=False,
name='AttentionBlock', **kwargs):
"""Constructs an attention block.
Args:
bottleneck_dimension: An int32 Tensor representing the bottleneck dimension
for intermediate projections.
attention_temperature: A float Tensor. It controls the temperature of the
softmax used for the attention weights, computed as:
weights = exp(weights / temperature) / sum(exp(weights / temperature))
output_dimension: An int32 Tensor representing the last dimension of the
output feature.
is_training: A boolean Tensor (affecting batch normalization).
name: A string describing what to name the variables in this block.
**kwargs: Additional keyword arguments.
"""
self._key_proj = ContextProjection(bottleneck_dimension)
self._val_proj = ContextProjection(bottleneck_dimension)
self._query_proj = ContextProjection(bottleneck_dimension)
self._feature_proj = None
self._attention_temperature = attention_temperature
self._bottleneck_dimension = bottleneck_dimension
self._is_training = is_training
self._output_dimension = output_dimension
if self._output_dimension:
self._feature_proj = ContextProjection(self._output_dimension)
super(AttentionBlock, self).__init__(name=name, **kwargs)
def build(self, input_shapes):
"""Finishes building the attention block.
Args:
input_shapes: the shape of the primary input box features.
"""
if not self._feature_proj:
self._output_dimension = input_shapes[-1]
self._feature_proj = ContextProjection(self._output_dimension)
def call(self, box_features, context_features, valid_context_size):
"""Handles a call by performing attention.
Args:
box_features: A float Tensor of shape [batch_size, input_size,
num_input_features].
context_features: A float Tensor of shape [batch_size, context_size,
num_context_features].
      valid_context_size: An int32 Tensor of shape [batch_size].
Returns:
A float Tensor with shape [batch_size, input_size, num_input_features]
containing output features after attention with context features.
"""
_, context_size, _ = context_features.shape
valid_mask = compute_valid_mask(valid_context_size, context_size)
# Average pools over height and width dimension so that the shape of
# box_features becomes [batch_size, max_num_proposals, channels].
box_features = tf.reduce_mean(box_features, [2, 3])
queries = project_features(
box_features, self._bottleneck_dimension, self._is_training,
self._query_proj, normalize=True)
keys = project_features(
context_features, self._bottleneck_dimension, self._is_training,
self._key_proj, normalize=True)
values = project_features(
context_features, self._bottleneck_dimension, self._is_training,
self._val_proj, normalize=True)
weights = tf.matmul(queries, keys, transpose_b=True)
weights, values = filter_weight_value(weights, values, valid_mask)
weights = tf.nn.softmax(weights / self._attention_temperature)
features = tf.matmul(weights, values)
output_features = project_features(
features, self._output_dimension, self._is_training,
self._feature_proj, normalize=False)
output_features = output_features[:, :, tf.newaxis, tf.newaxis, :]
return output_features
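# Example usage (a minimal sketch mirroring the shapes used in the unit
# tests; the constants are illustrative):
#
#   block = AttentionBlock(bottleneck_dimension=16, attention_temperature=0.2)
#   box_features = tf.ones([2, 8, 4, 4, 64], tf.float32)
#   context_features = tf.ones([2, 20, 10], tf.float32)
#   valid_context_size = tf.constant([20, 5], tf.int32)
#   output = block(box_features, context_features, valid_context_size)
#
# `output` has shape [2, 8, 1, 1, 64]: height and width are average-pooled
# away, and since output_dimension is not given it defaults to the channel
# depth of `box_features` (64).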
def filter_weight_value(weights, values, valid_mask):
"""Filters weights and values based on valid_mask.
  _NEGATIVE_PADDING_VALUE will be added to invalid elements in the weights so
  that they do not contribute to the softmax. Invalid elements in the values
  will be set to 0.
Args:
weights: A float Tensor of shape [batch_size, input_size, context_size].
values: A float Tensor of shape [batch_size, context_size,
projected_dimension].
valid_mask: A boolean Tensor of shape [batch_size, context_size]. True means
valid and False means invalid.
Returns:
weights: A float Tensor of shape [batch_size, input_size, context_size].
values: A float Tensor of shape [batch_size, context_size,
projected_dimension].
Raises:
    ValueError: If the shapes of the inputs don't match.
"""
w_batch_size, _, w_context_size = weights.shape
v_batch_size, v_context_size, _ = values.shape
m_batch_size, m_context_size = valid_mask.shape
if w_batch_size != v_batch_size or v_batch_size != m_batch_size:
    raise ValueError('Please make sure the first dimension of the input'
                     ' tensors is the same.')
if w_context_size != v_context_size:
raise ValueError('Please make sure the third dimension of weights matches'
' the second dimension of values.')
if w_context_size != m_context_size:
raise ValueError('Please make sure the third dimension of the weights'
' matches the second dimension of the valid_mask.')
valid_mask = valid_mask[..., tf.newaxis]
# Force the invalid weights to be very negative so it won't contribute to
# the softmax.
weights += tf.transpose(
tf.cast(tf.math.logical_not(valid_mask), weights.dtype) *
_NEGATIVE_PADDING_VALUE,
perm=[0, 2, 1])
# Force the invalid values to be 0.
values *= tf.cast(valid_mask, values.dtype)
return weights, values
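# Example (a small illustrative case, not taken from this module):
#
#   weights = tf.ones([1, 2, 3], tf.float32)  # [batch, input_size, context_size]
#   values = tf.ones([1, 3, 4], tf.float32)   # [batch, context_size, proj_dim]
#   valid_mask = tf.constant([[True, True, False]])
#   weights, values = filter_weight_value(weights, values, valid_mask)
#
# The last context column of `weights` is shifted by _NEGATIVE_PADDING_VALUE,
# so it contributes (almost) nothing after softmax, and the last row of
# `values` is zeroed out.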
def project_features(features, bottleneck_dimension, is_training,
layer, normalize=True):
"""Projects features to another feature space.
Args:
features: A float Tensor of shape [batch_size, features_size,
num_features].
    bottleneck_dimension: An int32 Tensor.
    is_training: A boolean Tensor (affecting batch normalization).
    layer: Contains a custom layer specific to the particular operation
      being performed (key, value, query, features).
normalize: A boolean Tensor. If true, the output features will be l2
normalized on the last dimension.
Returns:
    A float Tensor of shape [batch_size, features_size, bottleneck_dimension].
"""
shape_arr = features.shape
batch_size, _, num_features = shape_arr
features = tf.reshape(features, [-1, num_features])
projected_features = layer(features, is_training)
projected_features = tf.reshape(projected_features,
[batch_size, -1, bottleneck_dimension])
if normalize:
projected_features = tf.keras.backend.l2_normalize(projected_features,
axis=-1)
return projected_features
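# Example (a minimal sketch with illustrative shapes):
#
#   layer = ContextProjection(projection_dimension=8)
#   feats = tf.ones([2, 5, 32], tf.float32)  # [batch, features_size, num_features]
#   out = project_features(feats, 8, is_training=False, layer=layer,
#                          normalize=True)
#
# `out` has shape [2, 5, 8] and each projected feature vector is
# l2-normalized along the last dimension.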
def compute_valid_mask(num_valid_elements, num_elements):
"""Computes mask of valid entries within padded context feature.
Args:
    num_valid_elements: An int32 Tensor of shape [batch_size].
num_elements: An int32 Tensor.
Returns:
A boolean Tensor of the shape [batch_size, num_elements]. True means
valid and False means invalid.
"""
batch_size = num_valid_elements.shape[0]
element_idxs = tf.range(num_elements, dtype=tf.int32)
batch_element_idxs = tf.tile(element_idxs[tf.newaxis, ...], [batch_size, 1])
num_valid_elements = num_valid_elements[..., tf.newaxis]
valid_mask = tf.less(batch_element_idxs, num_valid_elements)
return valid_mask
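# Example: compute_valid_mask(tf.constant([1, 2], tf.int32), 3) returns
# [[True, False, False], [True, True, False]] -- only the first
# num_valid_elements[i] entries of row i are marked valid.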
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for context_rcnn_lib."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import unittest
from absl.testing import parameterized
import tensorflow.compat.v1 as tf
from object_detection.meta_architectures import context_rcnn_lib_tf2 as context_rcnn_lib
from object_detection.utils import test_case
from object_detection.utils import tf_version
_NEGATIVE_PADDING_VALUE = -100000
@unittest.skipIf(tf_version.is_tf1(), 'Skipping TF2.X only test.')
class ContextRcnnLibTest(parameterized.TestCase, test_case.TestCase):
"""Tests for the functions in context_rcnn_lib."""
def test_compute_valid_mask(self):
num_elements = tf.constant(3, tf.int32)
    num_valid_elements = tf.constant((1, 2), tf.int32)
    valid_mask = context_rcnn_lib.compute_valid_mask(num_valid_elements,
                                                     num_elements)
num_elements)
expected_valid_mask = tf.constant([[1, 0, 0], [1, 1, 0]], tf.float32)
self.assertAllEqual(valid_mask, expected_valid_mask)
def test_filter_weight_value(self):
weights = tf.ones((2, 3, 2), tf.float32) * 4
values = tf.ones((2, 2, 4), tf.float32)
valid_mask = tf.constant([[True, True], [True, False]], tf.bool)
filtered_weights, filtered_values = context_rcnn_lib.filter_weight_value(
weights, values, valid_mask)
expected_weights = tf.constant([[[4, 4], [4, 4], [4, 4]],
[[4, _NEGATIVE_PADDING_VALUE + 4],
[4, _NEGATIVE_PADDING_VALUE + 4],
[4, _NEGATIVE_PADDING_VALUE + 4]]])
expected_values = tf.constant([[[1, 1, 1, 1], [1, 1, 1, 1]],
[[1, 1, 1, 1], [0, 0, 0, 0]]])
self.assertAllEqual(filtered_weights, expected_weights)
self.assertAllEqual(filtered_values, expected_values)
# Changes the valid_mask so the results will be different.
valid_mask = tf.constant([[True, True], [False, False]], tf.bool)
filtered_weights, filtered_values = context_rcnn_lib.filter_weight_value(
weights, values, valid_mask)
expected_weights = tf.constant(
[[[4, 4], [4, 4], [4, 4]],
[[_NEGATIVE_PADDING_VALUE + 4, _NEGATIVE_PADDING_VALUE + 4],
[_NEGATIVE_PADDING_VALUE + 4, _NEGATIVE_PADDING_VALUE + 4],
[_NEGATIVE_PADDING_VALUE + 4, _NEGATIVE_PADDING_VALUE + 4]]])
expected_values = tf.constant([[[1, 1, 1, 1], [1, 1, 1, 1]],
[[0, 0, 0, 0], [0, 0, 0, 0]]])
self.assertAllEqual(filtered_weights, expected_weights)
self.assertAllEqual(filtered_values, expected_values)
@parameterized.parameters((2, True, True), (2, False, True),
(10, True, False), (10, False, False))
def test_project_features(self, projection_dimension, is_training, normalize):
features = tf.ones([2, 3, 4], tf.float32)
projected_features = context_rcnn_lib.project_features(
features,
projection_dimension,
is_training,
context_rcnn_lib.ContextProjection(projection_dimension),
normalize=normalize)
# Makes sure the shape is correct.
self.assertAllEqual(projected_features.shape, [2, 3, projection_dimension])
@parameterized.parameters(
(2, 10, 1),
(3, 10, 2),
(4, None, 3),
(5, 20, 4),
(7, None, 5),
)
def test_attention_block(self, bottleneck_dimension, output_dimension,
attention_temperature):
input_features = tf.ones([2, 8, 3, 3, 3], tf.float32)
context_features = tf.ones([2, 20, 10], tf.float32)
attention_block = context_rcnn_lib.AttentionBlock(
bottleneck_dimension,
attention_temperature,
output_dimension=output_dimension,
is_training=False)
valid_context_size = tf.random_uniform((2,),
minval=0,
maxval=10,
dtype=tf.int32)
output_features = attention_block(input_features, context_features,
valid_context_size)
# Makes sure the shape is correct.
self.assertAllEqual(output_features.shape,
[2, 8, 1, 1, (output_dimension or 3)])
if __name__ == '__main__':
tf.test.main()
...@@ -27,7 +29,9 @@ import functools
from object_detection.core import standard_fields as fields
from object_detection.meta_architectures import context_rcnn_lib
from object_detection.meta_architectures import context_rcnn_lib_tf2
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.utils import tf_version
class ContextRCNNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
...@@ -264,11 +266,17 @@ class ContextRCNNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
            return_raw_detections_during_predict),
        output_final_box_features=output_final_box_features)
    if tf_version.is_tf1():
      self._context_feature_extract_fn = functools.partial(
          context_rcnn_lib.compute_box_context_attention,
          bottleneck_dimension=attention_bottleneck_dimension,
          attention_temperature=attention_temperature,
          is_training=is_training)
    else:
      self._context_feature_extract_fn = context_rcnn_lib_tf2.AttentionBlock(
          bottleneck_dimension=attention_bottleneck_dimension,
          attention_temperature=attention_temperature,
          is_training=is_training)
  @staticmethod
  def get_side_inputs(features):
...@@ -323,8 +331,9 @@ class ContextRCNNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
    Returns:
      A float32 Tensor with shape [K, new_height, new_width, depth].
    """
    box_features = self._crop_and_resize_fn(
        [features_to_crop], proposal_boxes_normalized, None,
        [self._initial_crop_size, self._initial_crop_size])
    attention_features = self._context_feature_extract_fn(
......