Commit 3ce2f61b authored by Kaushik Shivakumar

Merge branch 'master' of https://github.com/tensorflow/models into context_tf2

parents bb16d5ca 8e9296ff
# TensorFlow 2 Classification Model Zoo
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
We provide a collection of classification models pre-trained on the
[ImageNet](http://www.image-net.org) dataset. These can be used to initialize
detection model parameters.
Model name |
---------- |
[EfficientNet B0](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b0.tar.gz) |
[EfficientNet B1](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b1.tar.gz) |
[EfficientNet B2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b2.tar.gz) |
[EfficientNet B3](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b3.tar.gz) |
[EfficientNet B4](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b4.tar.gz) |
[EfficientNet B5](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b5.tar.gz) |
[EfficientNet B6](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b6.tar.gz) |
[EfficientNet B7](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b7.tar.gz) |
[Resnet V1 50](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet50_v1.tar.gz) |
[Resnet V1 101](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet101_v1.tar.gz) |
[Resnet V1 152](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet152_v1.tar.gz) |
[Inception Resnet V2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/inception_resnet_v2.tar.gz) |
[MobileNet V1](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/mobilnet_v1.tar.gz) |
[MobileNet V2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/mobilnet_v2.tar.gz) |
# TensorFlow 2 Detection Model Zoo
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
<!-- mdlint off(URL_BAD_G3DOC_PATH) -->
We provide a collection of detection models pre-trained on the
[COCO 2017 dataset](http://cocodataset.org). These models can be useful for
out-of-the-box inference if you are interested in categories already in that
dataset. You can try this out in our inference
[colab](../colab_tutorials/inference_tf2_colab.ipynb).
They are also useful for initializing your models when training on novel
datasets. You can try this out on our few-shot training
[colab](../colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb).
<!-- mdlint on -->
Finally, if you would like to train these models from scratch, you can find the
model configs in this [directory](../configs/tf2) (also in the linked
`tar.gz`s).
Model name | Speed (ms) | COCO mAP | Outputs
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------: | :----------: | :-----:
[CenterNet HourGlass104 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_512x512_coco17_tpu-8.tar.gz) | 70 | 41.6 | Boxes
[CenterNet HourGlass104 Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_512x512_kpts_coco17_tpu-32.tar.gz) | 76 | 40.0/61.4 | Boxes/Keypoints
[CenterNet HourGlass104 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_1024x1024_coco17_tpu-32.tar.gz) | 197 | 43.5 | Boxes
[CenterNet HourGlass104 Keypoints 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_1024x1024_kpts_coco17_tpu-32.tar.gz) | 211 | 42.8/64.5 | Boxes/Keypoints
[CenterNet Resnet50 V1 FPN 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v1_fpn_512x512_coco17_tpu-8.tar.gz) | 27 | 31.2 | Boxes
[CenterNet Resnet50 V1 FPN Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v1_fpn_512x512_kpts_coco17_tpu-8.tar.gz) | 30 | 29.3/50.7 | Boxes/Keypoints
[CenterNet Resnet101 V1 FPN 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet101_v1_fpn_512x512_coco17_tpu-8.tar.gz) | 34 | 34.2 | Boxes
[CenterNet Resnet50 V2 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v2_512x512_coco17_tpu-8.tar.gz) | 27 | 29.5 | Boxes
[CenterNet Resnet50 V2 Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v2_512x512_kpts_coco17_tpu-8.tar.gz) | 30 | 27.6/48.2 | Boxes/Keypoints
[EfficientDet D0 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d0_coco17_tpu-32.tar.gz) | 39 | 33.6 | Boxes
[EfficientDet D1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d1_coco17_tpu-32.tar.gz) | 54 | 38.4 | Boxes
[EfficientDet D2 768x768](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d2_coco17_tpu-32.tar.gz) | 67 | 41.8 | Boxes
[EfficientDet D3 896x896](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d3_coco17_tpu-32.tar.gz) | 95 | 45.4 | Boxes
[EfficientDet D4 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d4_coco17_tpu-32.tar.gz) | 133 | 48.5 | Boxes
[EfficientDet D5 1280x1280](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d5_coco17_tpu-32.tar.gz) | 222 | 49.7 | Boxes
[EfficientDet D6 1280x1280](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d6_coco17_tpu-32.tar.gz) | 268 | 50.5 | Boxes
[EfficientDet D7 1536x1536](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d7_coco17_tpu-32.tar.gz) | 325 | 51.2 | Boxes
[SSD MobileNet v2 320x320](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz) | 19 | 20.2 | Boxes
[SSD MobileNet V1 FPN 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 48 | 29.1 | Boxes
[SSD MobileNet V2 FPNLite 320x320](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz) | 22 | 22.2 | Boxes
[SSD MobileNet V2 FPNLite 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8.tar.gz) | 39 | 28.2 | Boxes
[SSD ResNet50 V1 FPN 640x640 (RetinaNet50)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 46 | 34.3 | Boxes
[SSD ResNet50 V1 FPN 1024x1024 (RetinaNet50)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 87 | 38.3 | Boxes
[SSD ResNet101 V1 FPN 640x640 (RetinaNet101)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet101_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 57 | 35.6 | Boxes
[SSD ResNet101 V1 FPN 1024x1024 (RetinaNet101)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet101_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 104 | 39.5 | Boxes
[SSD ResNet152 V1 FPN 640x640 (RetinaNet152)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet152_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 80 | 35.4 | Boxes
[SSD ResNet152 V1 FPN 1024x1024 (RetinaNet152)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet152_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 111 | 39.6 | Boxes
[Faster R-CNN ResNet50 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.tar.gz) | 53 | 29.3 | Boxes
[Faster R-CNN ResNet50 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8.tar.gz) | 65 | 31.0 | Boxes
[Faster R-CNN ResNet50 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_800x1333_coco17_gpu-8.tar.gz) | 65 | 31.6 | Boxes
[Faster R-CNN ResNet101 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_640x640_coco17_tpu-8.tar.gz) | 55 | 31.8 | Boxes
[Faster R-CNN ResNet101 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_1024x1024_coco17_tpu-8.tar.gz) | 72 | 37.1 | Boxes
[Faster R-CNN ResNet101 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_800x1333_coco17_gpu-8.tar.gz) | 77 | 36.6 | Boxes
[Faster R-CNN ResNet152 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_640x640_coco17_tpu-8.tar.gz) | 64 | 32.4 | Boxes
[Faster R-CNN ResNet152 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_1024x1024_coco17_tpu-8.tar.gz) | 85 | 37.6 | Boxes
[Faster R-CNN ResNet152 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_800x1333_coco17_gpu-8.tar.gz) | 101 | 37.4 | Boxes
[Faster R-CNN Inception ResNet V2 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8.tar.gz) | 206 | 37.7 | Boxes
[Faster R-CNN Inception ResNet V2 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_1024x1024_coco17_tpu-8.tar.gz) | 236 | 38.7 | Boxes
[Mask R-CNN Inception ResNet V2 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz) | 301 | 39.0/34.6 | Boxes/Masks
[ExtremeNet](http://download.tensorflow.org/models/object_detection/tf2/20200711/extremenet.tar.gz) | -- | -- | Boxes
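As a quick, illustrative sketch of how these archives are used (the URL below is
copied verbatim from one row of the table; any other row works the same way), a
checkpoint can be downloaded and unpacked with standard shell tools:

```bash
# Download and unpack one of the detection checkpoints listed above.
wget http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz
tar -xzf ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz
# The extracted directory typically contains a pipeline.config, a checkpoint/
# directory, and a saved_model/ directory.
ls ssd_resnet50_v1_fpn_640x640_coco17_tpu-8
```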
# Training and Evaluation with TensorFlow 2
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)
This page walks through the steps required to train an object detection model.
It assumes the reader has completed the following prerequisites:
1. The TensorFlow Object Detection API has been installed as documented in the
[installation instructions](tf2.md#installation).
2. A valid data set has been created. See [this page](preparing_inputs.md) for
instructions on how to generate a dataset for the PASCAL VOC challenge or
the Oxford-IIIT Pet dataset.
## Recommended Directory Structure for Training and Evaluation
```bash
.
├── data/
│   ├── eval-00000-of-00001.tfrecord
│   ├── label_map.txt
│   ├── train-00000-of-00002.tfrecord
│   └── train-00001-of-00002.tfrecord
└── models/
    └── my_model_dir/
        ├── eval/                  # Created by evaluation job.
        ├── my_model.config
        ├── model_ckpt-100-data@1  #
        ├── model_ckpt-100-index   # Created by training job.
        └── checkpoint             #
```
## Writing a model configuration
Please refer to sample [TF2 configs](../configs/tf2) and
[configuring jobs](configuring_jobs.md) to create a model config.
### Model Parameter Initialization
While optional, it is highly recommended that users utilize classification or
object detection checkpoints. Training an object detector from scratch can take
days. To speed up the training process, it is recommended that users re-use the
feature extractor parameters from a pre-existing image classification or object
detection checkpoint. The `train_config` section in the config provides two
fields to specify pre-existing checkpoints:
* `fine_tune_checkpoint`: a path prefix to the pre-existing checkpoint
  (e.g. "/usr/home/username/checkpoint/model.ckpt-#####").
* `fine_tune_checkpoint_type`: with value `classification` or `detection`
depending on the type.
A list of classification checkpoints can be found
[here](tf2_classification_zoo.md).
A list of detection checkpoints can be found [here](tf2_detection_zoo.md).
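For illustration only, here is a minimal sketch of how these two fields appear
inside the `train_config` block of a pipeline config (the checkpoint path is the
placeholder from above, not a real file):

```
train_config {
  # (other train_config fields omitted)
  fine_tune_checkpoint: "/usr/home/username/checkpoint/model.ckpt-#####"
  fine_tune_checkpoint_type: "detection"
}
```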
## Local
### Training
A local training job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --alsologtostderr
```
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the directory in which training checkpoints and events will be
written.
### Evaluation
A local evaluation job can be run with the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
CHECKPOINT_DIR=${MODEL_DIR}
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --checkpoint_dir=${CHECKPOINT_DIR} \
    --alsologtostderr
```
where `${CHECKPOINT_DIR}` points to the directory with checkpoints produced by
the training job. Evaluation events are written to `${MODEL_DIR}/eval`.
## Google Cloud VM
The TensorFlow Object Detection API supports training on Google Cloud with Deep
Learning GPU VMs and TPU VMs. This section documents instructions on how to
train and evaluate your model on them. The reader should complete the following
prerequisites:
1. The reader has created and configured a GPU VM or TPU VM on Google Cloud with
TensorFlow >= 2.2.0. See
[TPU quickstart](https://cloud.google.com/tpu/docs/quickstart) and
[GPU quickstart](https://cloud.google.com/ai-platform/deep-learning-vm/docs/tensorflow_start_instance#with-one-or-more-gpus).
2. The reader has installed the TensorFlow Object Detection API as documented
in the [installation instructions](tf2.md#installation) on the VM.
3. The reader has a valid data set and stored it in a Google Cloud Storage
bucket or locally on the VM. See [this page](preparing_inputs.md) for
instructions on how to generate a dataset for the PASCAL VOC challenge or
the Oxford-IIIT Pet dataset.
Additionally, it is recommended that users test their jobs by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).
### Training
Training on GPU or TPU VMs is similar to local training. It can be launched
using the following command.
```bash
# From the tensorflow/models/research/ directory
USE_TPU=true
TPU_NAME="MY_TPU_NAME"
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
# Note: the --use_tpu and --tpu_name flags are only required for TPU training.
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --use_tpu=${USE_TPU} \
    --tpu_name=${TPU_NAME} \
    --alsologtostderr
```
where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the root directory for the files produced. Training checkpoints and
events are written to `${MODEL_DIR}`. Note that the paths can be either local
paths or paths to a GCS bucket.
### Evaluation
Evaluation is only supported on GPU. Similar to local evaluation, it can be
launched using the following command:
```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
CHECKPOINT_DIR=${MODEL_DIR}
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --checkpoint_dir=${CHECKPOINT_DIR} \
    --alsologtostderr
```
where `${CHECKPOINT_DIR}` points to the directory with checkpoints produced by
the training job. Evaluation events are written to `${MODEL_DIR}/eval`. Note
that the paths can be either local paths or paths to a GCS bucket.
## Google Cloud AI Platform
The TensorFlow Object Detection API also supports training on Google Cloud AI
Platform. This section documents instructions on how to train and evaluate your
model using Cloud ML. The reader should complete the following prerequisites:
1. The reader has created and configured a project on Google Cloud AI Platform.
See
[Using GPUs](https://cloud.google.com/ai-platform/training/docs/using-gpus)
and
[Using TPUs](https://cloud.google.com/ai-platform/training/docs/using-tpus)
guides.
2. The reader has a valid data set and stored it in a Google Cloud Storage
bucket. See [this page](preparing_inputs.md) for instructions on how to
generate a dataset for the PASCAL VOC challenge or the Oxford-IIIT Pet
dataset.
Additionally, it is recommended that users test their jobs by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).
### Training with multiple GPUs
A user can start a training job on Cloud AI Platform using the following
command:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version 2.1 \
    --python-version 3.6 \
    --job-dir=gs://${MODEL_DIR} \
    --package-path ./object_detection \
    --module-name object_detection.model_main_tf2 \
    --region us-central1 \
    --master-machine-type n1-highcpu-16 \
    --master-accelerator count=8,type=nvidia-tesla-v100 \
    -- \
    --model_dir=gs://${MODEL_DIR} \
    --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
where `gs://${MODEL_DIR}` specifies the directory on Google Cloud Storage where
the training checkpoints and events will be written and
`gs://${PIPELINE_CONFIG_PATH}` points to the pipeline configuration stored on
Google Cloud Storage.
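For completeness, a minimal sketch of how the environment variables referenced
above might be set before submitting the job (the bucket and file names below
are placeholders, not real paths):

```bash
# Placeholders only: substitute your own bucket, model directory and config.
# Note that the gcloud commands above prepend gs:// themselves.
MODEL_DIR="my-bucket/my_model_dir"
PIPELINE_CONFIG_PATH="my-bucket/my_model_dir/my_model.config"
```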
Users can monitor the progress of their training job on the
[ML Engine Dashboard](https://console.cloud.google.com/ai-platform/jobs).
### Training with TPU
Launching a training job with a TPU compatible pipeline config requires using a
similar command:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_%H_%M_%S` \
    --job-dir=gs://${MODEL_DIR} \
    --package-path ./object_detection \
    --module-name object_detection.model_main_tf2 \
    --runtime-version 2.1 \
    --python-version 3.6 \
    --scale-tier BASIC_TPU \
    --region us-central1 \
    -- \
    --use_tpu true \
    --model_dir=gs://${MODEL_DIR} \
    --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```
As before, `pipeline_config_path` points to the pipeline configuration stored on
Google Cloud Storage (but it must now point to a TPU-compatible config).
### Evaluating with GPU
Evaluation jobs run on a single machine. Run the following command to start the
evaluation job:
```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version 2.1 \
    --python-version 3.6 \
    --job-dir=gs://${MODEL_DIR} \
    --package-path ./object_detection \
    --module-name object_detection.model_main_tf2 \
    --region us-central1 \
    --scale-tier BASIC_GPU \
    -- \
    --model_dir=gs://${MODEL_DIR} \
    --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH} \
    --checkpoint_dir=gs://${MODEL_DIR}
```
where `gs://${MODEL_DIR}` points to the directory on Google Cloud Storage where
training checkpoints are saved and `gs://${PIPELINE_CONFIG_PATH}` points to the
model configuration file stored on Google Cloud Storage. Evaluation events are
written to `gs://${MODEL_DIR}/eval`.
Typically one starts an evaluation job concurrently with the training job. Note
that we do not support running evaluation on TPU.
## Running TensorBoard
Progress for training and eval jobs can be inspected using TensorBoard. If using
the recommended directory structure, TensorBoard can be run using the following
command:
```bash
tensorboard --logdir=${MODEL_DIR}
```
where `${MODEL_DIR}` points to the directory that contains the train and eval
directories. Please note that it may take TensorBoard a couple of minutes to
populate with data.
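If the model directory lives in a Google Cloud Storage bucket (as in the Cloud
AI Platform examples above), TensorBoard can usually be pointed at the GCS path
directly, assuming the environment is authenticated for read access to the
bucket:

```bash
# Assumes gcloud credentials with read access to the bucket.
tensorboard --logdir=gs://${MODEL_DIR}
```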
......@@ -2,7 +2,7 @@
[TOC]
The TensorFlow Object Detection API supports TPU training for some models. To
make models TPU compatible you need to make a few tweaks to the model config as
mentioned below. We also provide several sample configs that you can use as a
template.
......@@ -11,7 +11,7 @@ template.
### Static shaped tensors
TPU training currently requires all tensors in the TensorFlow Graph to have
static shapes. However, most of the sample configs in Object Detection API have
a few different tensors that are dynamically shaped. Fortunately, we provide
simple alternatives in the model configuration that modifies these tensors to
......@@ -62,7 +62,7 @@ have static shape:
### TPU friendly ops
Although TPU supports a vast number of tensorflow ops, a few used in the
TensorFlow Object Detection API are unsupported. We list such ops below and
recommend compatible substitutes.
* **Anchor sampling** - Typically we use hard example mining in standard SSD
......
# Object Detection TPU Inference Exporter
[![TensorFlow 1.15](https://img.shields.io/badge/TensorFlow-1.15-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
This package contains SavedModel Exporter for TPU Inference of object detection
models.
......
......@@ -2,7 +2,7 @@
[TOC]
To use your own dataset in TensorFlow Object Detection API, you must convert it
into the [TFRecord file format](https://www.tensorflow.org/api_guides/python/python_io#tfrecords_format_details).
This document outlines how to write a script to generate the TFRecord file.
......
......@@ -924,13 +924,16 @@ def convert_strided_predictions_to_normalized_keypoints(
def convert_strided_predictions_to_instance_masks(
boxes, classes, masks, true_image_shapes,
densepose_part_heatmap=None, densepose_surface_coords=None, stride=4,
mask_height=256, mask_width=256, score_threshold=0.5,
densepose_class_index=-1):
"""Converts predicted full-image masks into instance masks.
For each predicted detection box:
* Crop and resize the predicted mask (and optionally DensePose coordinates)
based on the detected bounding box coordinates and class prediction. Uses
bilinear resampling.
* Binarize the mask using the provided score threshold.
Args:
......@@ -940,57 +943,212 @@ def convert_strided_predictions_to_instance_masks(
detected class for each box (0-indexed).
masks: A [batch, output_height, output_width, num_classes] float32
tensor with class probabilities.
true_image_shapes: A tensor of shape [batch, 3] representing the true
shape of the inputs not considering padding.
densepose_part_heatmap: (Optional) A [batch, output_height, output_width,
num_parts] float32 tensor with part scores (i.e. logits).
densepose_surface_coords: (Optional) A [batch, output_height, output_width,
2 * num_parts] float32 tensor with predicted part coordinates (in
vu-format).
stride: The stride in the output space.
mask_height: The desired resized height for instance masks.
mask_width: The desired resized width for instance masks.
score_threshold: The threshold at which to convert predicted mask
into foreground pixels.
densepose_class_index: The class index (0-indexed) corresponding to the
class which has DensePose labels (e.g. person class).
Returns:
A tuple of masks and surface_coords.
instance_masks: A [batch_size, max_detections, mask_height, mask_width]
uint8 tensor with predicted foreground mask for each
instance. If DensePose tensors are provided, then each pixel value in the
mask encodes the 1-indexed part.
surface_coords: A [batch_size, max_detections, mask_height, mask_width, 2]
float32 tensor with (v, u) coordinates. Note that v, u coordinates are
only defined on instance masks, and the coordinates at each location of
the foreground mask correspond to coordinates on a local part coordinate
system (the specific part can be inferred from the `instance_masks`
output). If DensePose feature maps are not passed to this function, this
output will be None.
Raises:
ValueError: If one but not both of `densepose_part_heatmap` and
`densepose_surface_coords` is provided.
"""
batch_size, output_height, output_width, _ = (
shape_utils.combined_static_and_dynamic_shape(masks))
input_height = stride * output_height
input_width = stride * output_width
true_heights, true_widths, _ = tf.unstack(true_image_shapes, axis=1)
# If necessary, create dummy DensePose tensors to simplify the map function.
densepose_present = True
if ((densepose_part_heatmap is not None) ^
(densepose_surface_coords is not None)):
raise ValueError('To use DensePose, both `densepose_part_heatmap` and '
'`densepose_surface_coords` must be provided')
if densepose_part_heatmap is None and densepose_surface_coords is None:
densepose_present = False
densepose_part_heatmap = tf.zeros(
(batch_size, output_height, output_width, 1), dtype=tf.float32)
densepose_surface_coords = tf.zeros(
(batch_size, output_height, output_width, 2), dtype=tf.float32)
crop_and_threshold_fn = functools.partial(
crop_and_threshold_masks, input_height=input_height,
input_width=input_width, mask_height=mask_height, mask_width=mask_width,
score_threshold=score_threshold,
densepose_class_index=densepose_class_index)
instance_masks, surface_coords = shape_utils.static_or_dynamic_map_fn(
crop_and_threshold_fn,
elems=[boxes, classes, masks, densepose_part_heatmap,
densepose_surface_coords, true_heights, true_widths],
dtype=[tf.uint8, tf.float32],
back_prop=False)
surface_coords = surface_coords if densepose_present else None
return instance_masks, surface_coords
def crop_and_threshold_masks(elems, input_height, input_width, mask_height=256,
mask_width=256, score_threshold=0.5,
densepose_class_index=-1):
"""Crops and thresholds masks based on detection boxes.
Args:
elems: A tuple of
boxes - float32 tensor of shape [max_detections, 4]
classes - int32 tensor of shape [max_detections] (0-indexed)
masks - float32 tensor of shape [output_height, output_width, num_classes]
part_heatmap - float32 tensor of shape [output_height, output_width,
num_parts]
surf_coords - float32 tensor of shape [output_height, output_width,
2 * num_parts]
true_height - scalar int tensor
true_width - scalar int tensor
input_height: Input height to network.
input_width: Input width to network.
mask_height: Height for resizing mask crops.
mask_width: Width for resizing mask crops.
score_threshold: The threshold at which to convert predicted mask
into foreground pixels.
densepose_class_index: scalar int tensor with the class index (0-indexed)
for DensePose.
Returns:
A tuple of
all_instances: A [max_detections, mask_height, mask_width] uint8 tensor
with a predicted foreground mask for each instance. Background is encoded
as 0, and foreground is encoded as a positive integer. Specific part
indices are encoded as 1-indexed parts (for classes that have part
information).
surface_coords: A [max_detections, mask_height, mask_width, 2]
float32 tensor with (v, u) coordinates for each part.
"""
(boxes, classes, masks, part_heatmap, surf_coords, true_height,
true_width) = elems
# Boxes are in normalized coordinates relative to true image shapes. Convert
# coordinates to be normalized relative to input image shapes (since masks
# may still have padding).
# Then crop and resize each mask.
boxlist = box_list.BoxList(boxes)
y_scale = true_height / input_height
x_scale = true_width / input_width
boxlist = box_list_ops.scale(boxlist, y_scale, x_scale)
boxes = boxlist.get()
# Convert masks from [output_height, output_width, num_classes] to
# [num_classes, output_height, output_width, 1].
num_classes = tf.shape(masks)[-1]
masks_4d = tf.transpose(masks, perm=[2, 0, 1])[:, :, :, tf.newaxis]
# Tile part and surface coordinate masks for all classes.
part_heatmap_4d = tf.tile(part_heatmap[tf.newaxis, :, :, :],
multiples=[num_classes, 1, 1, 1])
surf_coords_4d = tf.tile(surf_coords[tf.newaxis, :, :, :],
multiples=[num_classes, 1, 1, 1])
feature_maps_concat = tf.concat([masks_4d, part_heatmap_4d, surf_coords_4d],
axis=-1)
# The following tensor has shape
# [max_detections, mask_height, mask_width, 1 + 3 * num_parts].
cropped_masks = tf2.image.crop_and_resize(
feature_maps_concat,
boxes=boxes,
box_indices=classes,
crop_size=[mask_height, mask_width],
method='bilinear')
# Split the cropped masks back into instance masks, part masks, and surface
# coordinates.
num_parts = tf.shape(part_heatmap)[-1]
instance_masks, part_heatmap_cropped, surface_coords_cropped = tf.split(
cropped_masks, [1, num_parts, 2 * num_parts], axis=-1)
# Threshold the instance masks. Resulting tensor has shape
# [max_detections, mask_height, mask_width, 1].
instance_masks_int = tf.cast(
tf.math.greater_equal(instance_masks, score_threshold), dtype=tf.int32)
# Produce a binary mask that is 1.0 only:
# - in the foreground region for an instance
# - in detections corresponding to the DensePose class
det_with_parts = tf.equal(classes, densepose_class_index)
det_with_parts = tf.cast(
tf.reshape(det_with_parts, [-1, 1, 1, 1]), dtype=tf.int32)
instance_masks_with_parts = tf.math.multiply(instance_masks_int,
det_with_parts)
# Similarly, produce a binary mask that holds the foreground masks only for
# instances without parts (i.e. non-DensePose classes).
det_without_parts = 1 - det_with_parts
instance_masks_without_parts = tf.math.multiply(instance_masks_int,
det_without_parts)
# Assemble a tensor that has standard instance segmentation masks for
# non-DensePose classes (with values in [0, 1]), and part segmentation masks
# for DensePose classes (with values in [0, 1, ..., num_parts]).
part_mask_int_zero_indexed = tf.math.argmax(
part_heatmap_cropped, axis=-1, output_type=tf.int32)[:, :, :, tf.newaxis]
part_mask_int_one_indexed = part_mask_int_zero_indexed + 1
all_instances = (instance_masks_without_parts +
instance_masks_with_parts * part_mask_int_one_indexed)
# Gather the surface coordinates for the parts.
surface_coords_cropped = tf.reshape(
surface_coords_cropped, [-1, mask_height, mask_width, num_parts, 2])
surface_coords = gather_surface_coords_for_parts(surface_coords_cropped,
part_mask_int_zero_indexed)
surface_coords = (
surface_coords * tf.cast(instance_masks_with_parts, tf.float32))
return [tf.squeeze(all_instances, axis=3), surface_coords]
def gather_surface_coords_for_parts(surface_coords_cropped,
highest_scoring_part):
"""Gathers the (v, u) coordinates for the highest scoring DensePose parts.
Args:
surface_coords_cropped: A [max_detections, height, width, num_parts, 2]
float32 tensor with (v, u) surface coordinates.
highest_scoring_part: A [max_detections, height, width] integer tensor with
the highest scoring part (0-indexed) indices for each location.
Returns:
A [max_detections, height, width, 2] float32 tensor with the (v, u)
coordinates selected from the highest scoring parts.
"""
max_detections, height, width, num_parts, _ = (
shape_utils.combined_static_and_dynamic_shape(surface_coords_cropped))
flattened_surface_coords = tf.reshape(surface_coords_cropped, [-1, 2])
flattened_part_ids = tf.reshape(highest_scoring_part, [-1])
# Produce lookup indices that represent the locations of the highest scoring
# parts in the `flattened_surface_coords` tensor.
flattened_lookup_indices = (
num_parts * tf.range(max_detections * height * width) +
flattened_part_ids)
vu_coords_flattened = tf.gather(flattened_surface_coords,
flattened_lookup_indices, axis=0)
return tf.reshape(vu_coords_flattened, [max_detections, height, width, 2])
class ObjectDetectionParams(
......@@ -1235,6 +1393,64 @@ class MaskParams(
score_threshold, heatmap_bias_init)
class DensePoseParams(
collections.namedtuple('DensePoseParams', [
'class_id', 'classification_loss', 'localization_loss',
'part_loss_weight', 'coordinate_loss_weight', 'num_parts',
'task_loss_weight', 'upsample_to_input_res', 'upsample_method',
'heatmap_bias_init'
])):
"""Namedtuple to store DensePose prediction related parameters."""
__slots__ = ()
def __new__(cls,
class_id,
classification_loss,
localization_loss,
part_loss_weight=1.0,
coordinate_loss_weight=1.0,
num_parts=24,
task_loss_weight=1.0,
upsample_to_input_res=True,
upsample_method='bilinear',
heatmap_bias_init=-2.19):
"""Constructor with default values for DensePoseParams.
Args:
class_id: the ID of the class that contains the DensePose groundtruth.
This should typically correspond to the "person" class. Note that the ID
is 0-based, meaning that class 0 corresponds to the first non-background
object class.
classification_loss: an object_detection.core.losses.Loss object to
compute the loss for the body part predictions in CenterNet.
localization_loss: an object_detection.core.losses.Loss object to compute
the loss for the surface coordinate regression in CenterNet.
part_loss_weight: The loss weight to apply to part prediction.
coordinate_loss_weight: The loss weight to apply to surface coordinate
prediction.
num_parts: The number of DensePose parts to predict.
task_loss_weight: float, the loss weight for the DensePose task.
upsample_to_input_res: Whether to upsample the DensePose feature maps to
the input resolution before applying loss. Note that the prediction
outputs are still at the standard CenterNet output stride.
upsample_method: Method for upsampling DensePose feature maps. Options are
either 'bilinear' or 'nearest'. This has no effect when
`upsample_to_input_res` is False.
heatmap_bias_init: float, the initial value of bias in the convolutional
kernel of the part prediction head. If set to None, the
bias is initialized with zeros.
Returns:
An initialized DensePoseParams namedtuple.
"""
return super(DensePoseParams,
cls).__new__(cls, class_id, classification_loss,
localization_loss, part_loss_weight,
coordinate_loss_weight, num_parts,
task_loss_weight, upsample_to_input_res,
upsample_method, heatmap_bias_init)
# The following constants are used to generate the keys of the
# (prediction, loss, target assigner,...) dictionaries used in CenterNetMetaArch
# class.
......@@ -1247,6 +1463,9 @@ KEYPOINT_HEATMAP = 'keypoint/heatmap'
KEYPOINT_OFFSET = 'keypoint/offset'
SEGMENTATION_TASK = 'segmentation_task'
SEGMENTATION_HEATMAP = 'segmentation/heatmap'
DENSEPOSE_TASK = 'densepose_task'
DENSEPOSE_HEATMAP = 'densepose/heatmap'
DENSEPOSE_REGRESSION = 'densepose/regression'
LOSS_KEY_PREFIX = 'Loss'
......@@ -1290,7 +1509,8 @@ class CenterNetMetaArch(model.DetectionModel):
object_center_params,
object_detection_params=None,
keypoint_params_dict=None,
mask_params=None,
densepose_params=None):
"""Initializes a CenterNet model.
Args:
......@@ -1318,6 +1538,10 @@ class CenterNetMetaArch(model.DetectionModel):
mask_params: A MaskParams namedtuple. This object
holds the hyper-parameters for segmentation. Please see the class
definition for more details.
densepose_params: A DensePoseParams namedtuple. This object holds the
hyper-parameters for DensePose prediction. Please see the class
definition for more details. Note that if this is provided, it is
expected that `mask_params` is also provided.
"""
assert object_detection_params or keypoint_params_dict
# Shorten the name for convenience and better formatting.
......@@ -1333,6 +1557,10 @@ class CenterNetMetaArch(model.DetectionModel):
self._od_params = object_detection_params
self._kp_params_dict = keypoint_params_dict
self._mask_params = mask_params
if densepose_params is not None and mask_params is None:
raise ValueError('To run DensePose prediction, `mask_params` must also '
'be supplied.')
self._densepose_params = densepose_params
# Construct the prediction head nets.
self._prediction_head_dict = self._construct_prediction_heads(
......@@ -1413,8 +1641,18 @@ class CenterNetMetaArch(model.DetectionModel):
if self._mask_params is not None:
prediction_heads[SEGMENTATION_HEATMAP] = [
make_prediction_net(num_classes,
bias_fill=self._mask_params.heatmap_bias_init)
for _ in range(num_feature_outputs)]
if self._densepose_params is not None:
prediction_heads[DENSEPOSE_HEATMAP] = [
make_prediction_net( # pylint: disable=g-complex-comprehension
self._densepose_params.num_parts,
bias_fill=self._densepose_params.heatmap_bias_init)
for _ in range(num_feature_outputs)]
prediction_heads[DENSEPOSE_REGRESSION] = [
make_prediction_net(2 * self._densepose_params.num_parts)
for _ in range(num_feature_outputs)
]
return prediction_heads
def _initialize_target_assigners(self, stride, min_box_overlap_iou):
......@@ -1449,6 +1687,10 @@ class CenterNetMetaArch(model.DetectionModel):
if self._mask_params is not None:
target_assigners[SEGMENTATION_TASK] = (
cn_assigner.CenterNetMaskTargetAssigner(stride))
if self._densepose_params is not None:
dp_stride = 1 if self._densepose_params.upsample_to_input_res else stride
target_assigners[DENSEPOSE_TASK] = (
cn_assigner.CenterNetDensePoseTargetAssigner(dp_stride))
return target_assigners
......@@ -1860,6 +2102,113 @@ class CenterNetMetaArch(model.DetectionModel):
float(len(segmentation_predictions)) * total_pixels_in_loss)
return total_loss
def _compute_densepose_losses(self, input_height, input_width,
prediction_dict):
"""Computes the weighted DensePose losses.
Args:
input_height: An integer scalar tensor representing input image height.
input_width: An integer scalar tensor representing input image width.
prediction_dict: A dictionary holding predicted tensors output by the
"predict" function. See the "predict" function for more detailed
description.
Returns:
A dictionary of scalar float tensors representing the weighted losses for
the DensePose task:
DENSEPOSE_HEATMAP: the weighted part segmentation loss.
DENSEPOSE_REGRESSION: the weighted part surface coordinate loss.
"""
dp_heatmap_loss, dp_regression_loss = (
self._compute_densepose_part_and_coordinate_losses(
input_height=input_height,
input_width=input_width,
part_predictions=prediction_dict[DENSEPOSE_HEATMAP],
surface_coord_predictions=prediction_dict[DENSEPOSE_REGRESSION]))
loss_dict = {}
loss_dict[DENSEPOSE_HEATMAP] = (
self._densepose_params.part_loss_weight * dp_heatmap_loss)
loss_dict[DENSEPOSE_REGRESSION] = (
self._densepose_params.coordinate_loss_weight * dp_regression_loss)
return loss_dict
def _compute_densepose_part_and_coordinate_losses(
self, input_height, input_width, part_predictions,
surface_coord_predictions):
"""Computes the individual losses for the DensePose task.
Args:
input_height: An integer scalar tensor representing input image height.
input_width: An integer scalar tensor representing input image width.
part_predictions: A list of float tensors of shape [batch_size,
out_height, out_width, num_parts].
surface_coord_predictions: A list of float tensors of shape [batch_size,
out_height, out_width, 2 * num_parts].
Returns:
A tuple with two scalar loss tensors: part_prediction_loss and
surface_coord_loss.
"""
gt_dp_num_points_list = self.groundtruth_lists(
fields.BoxListFields.densepose_num_points)
gt_dp_part_ids_list = self.groundtruth_lists(
fields.BoxListFields.densepose_part_ids)
gt_dp_surface_coords_list = self.groundtruth_lists(
fields.BoxListFields.densepose_surface_coords)
gt_weights_list = self.groundtruth_lists(fields.BoxListFields.weights)
assigner = self._target_assigner_dict[DENSEPOSE_TASK]
batch_indices, batch_part_ids, batch_surface_coords, batch_weights = (
assigner.assign_part_and_coordinate_targets(
height=input_height,
width=input_width,
gt_dp_num_points_list=gt_dp_num_points_list,
gt_dp_part_ids_list=gt_dp_part_ids_list,
gt_dp_surface_coords_list=gt_dp_surface_coords_list,
gt_weights_list=gt_weights_list))
part_prediction_loss = 0
surface_coord_loss = 0
classification_loss_fn = self._densepose_params.classification_loss
localization_loss_fn = self._densepose_params.localization_loss
num_predictions = float(len(part_predictions))
num_valid_points = tf.math.count_nonzero(batch_weights)
num_valid_points = tf.cast(tf.math.maximum(num_valid_points, 1), tf.float32)
for part_pred, surface_coord_pred in zip(part_predictions,
surface_coord_predictions):
# Potentially upsample the feature maps, so that better quality (i.e.
# higher res) groundtruth can be applied.
if self._densepose_params.upsample_to_input_res:
part_pred = tf.keras.layers.UpSampling2D(
self._stride, interpolation=self._densepose_params.upsample_method)(
part_pred)
surface_coord_pred = tf.keras.layers.UpSampling2D(
self._stride, interpolation=self._densepose_params.upsample_method)(
surface_coord_pred)
# Compute the part prediction loss.
part_pred = cn_assigner.get_batch_predictions_from_indices(
part_pred, batch_indices[:, 0:3])
part_prediction_loss += classification_loss_fn(
part_pred[:, tf.newaxis, :],
batch_part_ids[:, tf.newaxis, :],
weights=batch_weights[:, tf.newaxis, tf.newaxis])
# Compute the surface coordinate loss.
batch_size, out_height, out_width, _ = _get_shape(
surface_coord_pred, 4)
surface_coord_pred = tf.reshape(
surface_coord_pred, [batch_size, out_height, out_width, -1, 2])
surface_coord_pred = cn_assigner.get_batch_predictions_from_indices(
surface_coord_pred, batch_indices)
surface_coord_loss += localization_loss_fn(
surface_coord_pred,
batch_surface_coords,
weights=batch_weights[:, tf.newaxis])
part_prediction_loss = tf.reduce_sum(part_prediction_loss) / (
num_predictions * num_valid_points)
surface_coord_loss = tf.reduce_sum(surface_coord_loss) / (
num_predictions * num_valid_points)
return part_prediction_loss, surface_coord_loss
def preprocess(self, inputs):
outputs = shape_utils.resize_images_and_return_shapes(
inputs, self._image_resizer_fn)
......@@ -1909,6 +2258,13 @@ class CenterNetMetaArch(model.DetectionModel):
'segmentation/heatmap' - [optional] A list of size num_feature_outputs
holding float tensors of size [batch_size, output_height,
output_width, num_classes] representing the mask logits.
'densepose/heatmap' - [optional] A list of size num_feature_outputs
holding float tensors of size [batch_size, output_height,
output_width, num_parts] representing the mask logits for each part.
'densepose/regression' - [optional] A list of size num_feature_outputs
holding float tensors of size [batch_size, output_height,
output_width, 2 * num_parts] representing the DensePose surface
coordinate predictions.
Note the $TASK_NAME is provided by the KeypointEstimation namedtuple
used to differentiate between different keypoint tasks.
"""
......@@ -1938,10 +2294,16 @@ class CenterNetMetaArch(model.DetectionModel):
scope: Optional scope name.
Returns:
A dictionary mapping the keys [
'Loss/object_center',
'Loss/box/scale', (optional)
'Loss/box/offset', (optional)
'Loss/$TASK_NAME/keypoint/heatmap', (optional)
'Loss/$TASK_NAME/keypoint/offset', (optional)
'Loss/$TASK_NAME/keypoint/regression', (optional)
'Loss/segmentation/heatmap', (optional)
'Loss/densepose/heatmap', (optional)
'Loss/densepose/regression'] (optional)
scalar tensors corresponding to the losses for different tasks. Note the
$TASK_NAME is provided by the KeypointEstimation namedtuple used to
differentiate between different keypoint tasks.
......@@ -1999,6 +2361,16 @@ class CenterNetMetaArch(model.DetectionModel):
seg_losses[key] = seg_losses[key] * self._mask_params.task_loss_weight
losses.update(seg_losses)
if self._densepose_params is not None:
densepose_losses = self._compute_densepose_losses(
input_height=input_height,
input_width=input_width,
prediction_dict=prediction_dict)
for key in densepose_losses:
densepose_losses[key] = (
densepose_losses[key] * self._densepose_params.task_loss_weight)
losses.update(densepose_losses)
# Prepend the LOSS_KEY_PREFIX to the keys in the dictionary such that the
# losses will be grouped together in Tensorboard.
return dict([('%s/%s' % (LOSS_KEY_PREFIX, key), val)
......@@ -2033,9 +2405,14 @@ class CenterNetMetaArch(model.DetectionModel):
invalid keypoints have their coordinates and scores set to 0.0.
detection_keypoint_scores: (Optional) A float tensor of shape [batch,
max_detection, num_keypoints] with scores for each keypoint.
detection_masks: (Optional) A uint8 tensor of shape [batch,
max_detections, mask_height, mask_width] with masks for each
detection. Background is specified with 0, and foreground is specified
with positive integers (1 for standard instance segmentation mask, and
1-indexed parts for DensePose task).
detection_surface_coords: (Optional) A float32 tensor of shape [batch,
max_detection, mask_height, mask_width, 2] with DensePose surface
coordinates, in (v, u) format.
"""
object_center_prob = tf.nn.sigmoid(prediction_dict[OBJECT_CENTER][-1])
# Get x, y and channel indices corresponding to the top indices in the class
......@@ -2076,14 +2453,27 @@ class CenterNetMetaArch(model.DetectionModel):
if self._mask_params:
masks = tf.nn.sigmoid(prediction_dict[SEGMENTATION_HEATMAP][-1])
densepose_part_heatmap, densepose_surface_coords = None, None
densepose_class_index = 0
if self._densepose_params:
densepose_part_heatmap = prediction_dict[DENSEPOSE_HEATMAP][-1]
densepose_surface_coords = prediction_dict[DENSEPOSE_REGRESSION][-1]
densepose_class_index = self._densepose_params.class_id
instance_masks, surface_coords = (
convert_strided_predictions_to_instance_masks(
boxes, classes, masks, true_image_shapes,
densepose_part_heatmap, densepose_surface_coords,
stride=self._stride, mask_height=self._mask_params.mask_height,
mask_width=self._mask_params.mask_width,
score_threshold=self._mask_params.score_threshold,
densepose_class_index=densepose_class_index))
postprocess_dict[
fields.DetectionResultFields.detection_masks] = instance_masks
if self._densepose_params:
postprocess_dict[
fields.DetectionResultFields.detection_surface_coords] = (
surface_coords)
return postprocess_dict
def _postprocess_keypoints(self, prediction_dict, classes, y_indices,
......@@ -2359,6 +2749,14 @@ class CenterNetMetaArch(model.DetectionModel):
checkpoint (with compatible variable names) or to restore from a
classification checkpoint for initialization prior to training.
Valid values: `detection`, `classification`. Default 'detection'.
'detection': used when loading in the Hourglass model pre-trained on
other detection task.
'classification': used when loading in the ResNet model pre-trained on
image classification task. Note that only the image feature encoding
part is loaded but not those upsampling layers.
'fine_tune': used when loading the entire CenterNet feature extractor
pre-trained on other tasks. The checkpoints saved during CenterNet
model training can be directly loaded using this mode.
Returns:
A dict mapping keys to Trackable objects (tf.Module or Checkpoint).
......@@ -2367,9 +2765,14 @@ class CenterNetMetaArch(model.DetectionModel):
if fine_tune_checkpoint_type == 'classification':
return {'feature_extractor': self._feature_extractor.get_base_model()}
elif fine_tune_checkpoint_type == 'detection':
return {'feature_extractor': self._feature_extractor.get_model()}
elif fine_tune_checkpoint_type == 'fine_tune':
feature_extractor_model = tf.train.Checkpoint(
_feature_extractor=self._feature_extractor)
return {'model': feature_extractor_model}
else:
raise ValueError('Not supported fine tune checkpoint type - {}'.format(
fine_tune_checkpoint_type))
......
......@@ -266,7 +266,7 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
masks_np[0, :, :3, 1] = 1 # Class 1.
masks = tf.constant(masks_np)
true_image_shapes = tf.constant([[6, 8, 3]])
instance_masks, _ = cnma.convert_strided_predictions_to_instance_masks(
boxes, classes, masks, stride=2, mask_height=2, mask_width=2,
true_image_shapes=true_image_shapes)
return instance_masks
......@@ -289,6 +289,104 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
])
np.testing.assert_array_equal(expected_instance_masks, instance_masks)
def test_convert_strided_predictions_raises_error_with_one_tensor(self):
def graph_fn():
boxes = tf.constant(
[
[[0.5, 0.5, 1.0, 1.0],
[0.0, 0.5, 0.5, 1.0],
[0.0, 0.0, 0.0, 0.0]],
], tf.float32)
classes = tf.constant(
[
[0, 1, 0],
], tf.int32)
masks_np = np.zeros((1, 4, 4, 2), dtype=np.float32)
masks_np[0, :, 2:, 0] = 1 # Class 0.
masks_np[0, :, :3, 1] = 1 # Class 1.
masks = tf.constant(masks_np)
true_image_shapes = tf.constant([[6, 8, 3]])
densepose_part_heatmap = tf.random.uniform(
[1, 4, 4, 24])
instance_masks, _ = cnma.convert_strided_predictions_to_instance_masks(
boxes, classes, masks, true_image_shapes,
densepose_part_heatmap=densepose_part_heatmap,
densepose_surface_coords=None)
return instance_masks
with self.assertRaises(ValueError):
self.execute_cpu(graph_fn, [])
def test_crop_and_threshold_masks(self):
boxes_np = np.array(
[[0., 0., 0.5, 0.5],
[0.25, 0.25, 1.0, 1.0]], dtype=np.float32)
classes_np = np.array([0, 2], dtype=np.int32)
masks_np = np.zeros((4, 4, _NUM_CLASSES), dtype=np.float32)
masks_np[0, 0, 0] = 0.8
masks_np[1, 1, 0] = 0.6
masks_np[3, 3, 2] = 0.7
part_heatmap_np = np.zeros((4, 4, _DENSEPOSE_NUM_PARTS), dtype=np.float32)
part_heatmap_np[0, 0, 4] = 1
part_heatmap_np[0, 0, 2] = 0.6 # Lower scoring.
part_heatmap_np[1, 1, 8] = 0.2
part_heatmap_np[3, 3, 4] = 0.5
surf_coords_np = np.zeros((4, 4, 2 * _DENSEPOSE_NUM_PARTS),
dtype=np.float32)
surf_coords_np[:, :, 8:10] = 0.2, 0.9
surf_coords_np[:, :, 16:18] = 0.3, 0.5
true_height, true_width = 10, 10
input_height, input_width = 10, 10
mask_height = 4
mask_width = 4
def graph_fn():
elems = [
tf.constant(boxes_np),
tf.constant(classes_np),
tf.constant(masks_np),
tf.constant(part_heatmap_np),
tf.constant(surf_coords_np),
tf.constant(true_height, dtype=tf.int32),
tf.constant(true_width, dtype=tf.int32)
]
part_masks, surface_coords = cnma.crop_and_threshold_masks(
elems, input_height, input_width, mask_height=mask_height,
mask_width=mask_width, densepose_class_index=0)
return part_masks, surface_coords
part_masks, surface_coords = self.execute_cpu(graph_fn, [])
expected_part_masks = np.zeros((2, 4, 4), dtype=np.uint8)
expected_part_masks[0, 0, 0] = 5 # Recall classes are 1-indexed in output.
expected_part_masks[0, 2, 2] = 9 # Recall classes are 1-indexed in output.
expected_part_masks[1, 3, 3] = 1 # Standard instance segmentation mask.
expected_surface_coords = np.zeros((2, 4, 4, 2), dtype=np.float32)
expected_surface_coords[0, 0, 0, :] = 0.2, 0.9
expected_surface_coords[0, 2, 2, :] = 0.3, 0.5
np.testing.assert_allclose(expected_part_masks, part_masks)
np.testing.assert_allclose(expected_surface_coords, surface_coords)
def test_gather_surface_coords_for_parts(self):
surface_coords_cropped_np = np.zeros((2, 5, 5, _DENSEPOSE_NUM_PARTS, 2),
dtype=np.float32)
surface_coords_cropped_np[0, 0, 0, 5] = 0.3, 0.4
surface_coords_cropped_np[0, 1, 0, 9] = 0.5, 0.6
highest_scoring_part_np = np.zeros((2, 5, 5), dtype=np.int32)
highest_scoring_part_np[0, 0, 0] = 5
highest_scoring_part_np[0, 1, 0] = 9
def graph_fn():
surface_coords_cropped = tf.constant(surface_coords_cropped_np,
tf.float32)
highest_scoring_part = tf.constant(highest_scoring_part_np, tf.int32)
surface_coords_gathered = cnma.gather_surface_coords_for_parts(
surface_coords_cropped, highest_scoring_part)
return surface_coords_gathered
surface_coords_gathered = self.execute_cpu(graph_fn, [])
np.testing.assert_allclose([0.3, 0.4], surface_coords_gathered[0, 0, 0])
np.testing.assert_allclose([0.5, 0.6], surface_coords_gathered[0, 1, 0])
def test_top_k_feature_map_locations(self):
feature_map_np = np.zeros((2, 3, 3, 2), dtype=np.float32)
feature_map_np[0, 2, 0, 1] = 1.0
......@@ -535,6 +633,8 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
keypoint_heatmap_np[1, 0, 1, 1] = 0.9
keypoint_heatmap_np[1, 2, 0, 1] = 0.8
# Note that the keypoint offsets are now per keypoint (as opposed to
# keypoint agnostic, in the test test_keypoint_candidate_prediction).
keypoint_heatmap_offsets_np = np.zeros((2, 3, 3, 4), dtype=np.float32)
keypoint_heatmap_offsets_np[0, 0, 0] = [0.5, 0.25, 0.0, 0.0]
keypoint_heatmap_offsets_np[0, 2, 1] = [-0.25, 0.5, 0.0, 0.0]
......@@ -949,6 +1049,7 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
_NUM_CLASSES = 10
_KEYPOINT_INDICES = [0, 1, 2, 3]
_NUM_KEYPOINTS = len(_KEYPOINT_INDICES)
_DENSEPOSE_NUM_PARTS = 24
_TASK_NAME = 'human_pose'
......@@ -991,6 +1092,20 @@ def get_fake_mask_params():
mask_width=4)
def get_fake_densepose_params():
"""Returns the fake DensePose estimation parameter namedtuple."""
return cnma.DensePoseParams(
class_id=1,
classification_loss=losses.WeightedSoftmaxClassificationLoss(),
localization_loss=losses.L1LocalizationLoss(),
part_loss_weight=1.0,
coordinate_loss_weight=1.0,
num_parts=_DENSEPOSE_NUM_PARTS,
task_loss_weight=1.0,
upsample_to_input_res=True,
upsample_method='nearest')
def build_center_net_meta_arch(build_resnet=False):
"""Builds the CenterNet meta architecture."""
if build_resnet:
......@@ -1018,7 +1133,8 @@ def build_center_net_meta_arch(build_resnet=False):
object_center_params=get_fake_center_params(),
object_detection_params=get_fake_od_params(),
keypoint_params_dict={_TASK_NAME: get_fake_kp_params()},
mask_params=get_fake_mask_params(),
densepose_params=get_fake_densepose_params())
def _logit(p):
......@@ -1102,6 +1218,16 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
fake_feature_map)
self.assertEqual((4, 128, 128, _NUM_CLASSES), output.shape)
# "densepose parts" head:
output = model._prediction_head_dict[cnma.DENSEPOSE_HEATMAP][-1](
fake_feature_map)
self.assertEqual((4, 128, 128, _DENSEPOSE_NUM_PARTS), output.shape)
# "densepose surface coordinates" head:
output = model._prediction_head_dict[cnma.DENSEPOSE_REGRESSION][-1](
fake_feature_map)
self.assertEqual((4, 128, 128, 2 * _DENSEPOSE_NUM_PARTS), output.shape)
def test_initialize_target_assigners(self):
model = build_center_net_meta_arch()
assigner_dict = model._initialize_target_assigners(
......@@ -1125,6 +1251,10 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
self.assertIsInstance(assigner_dict[cnma.SEGMENTATION_TASK],
cn_assigner.CenterNetMaskTargetAssigner)
# DensePose estimation target assigner:
self.assertIsInstance(assigner_dict[cnma.DENSEPOSE_TASK],
cn_assigner.CenterNetDensePoseTargetAssigner)
def test_predict(self):
"""Test the predict function."""
......@@ -1145,6 +1275,10 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
(2, 32, 32, 2))
self.assertEqual(prediction_dict[cnma.SEGMENTATION_HEATMAP][0].shape,
(2, 32, 32, _NUM_CLASSES))
self.assertEqual(prediction_dict[cnma.DENSEPOSE_HEATMAP][0].shape,
(2, 32, 32, _DENSEPOSE_NUM_PARTS))
self.assertEqual(prediction_dict[cnma.DENSEPOSE_REGRESSION][0].shape,
(2, 32, 32, 2 * _DENSEPOSE_NUM_PARTS))
def test_loss(self):
"""Test the loss function."""
......@@ -1157,7 +1291,13 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
groundtruth_keypoints_list=groundtruth_dict[
fields.BoxListFields.keypoints],
groundtruth_masks_list=groundtruth_dict[
fields.BoxListFields.masks])
fields.BoxListFields.masks],
groundtruth_dp_num_points_list=groundtruth_dict[
fields.BoxListFields.densepose_num_points],
groundtruth_dp_part_ids_list=groundtruth_dict[
fields.BoxListFields.densepose_part_ids],
groundtruth_dp_surface_coords_list=groundtruth_dict[
fields.BoxListFields.densepose_surface_coords])
prediction_dict = get_fake_prediction_dict(
input_height=16, input_width=32, stride=4)
......@@ -1193,6 +1333,12 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
self.assertGreater(
0.01, loss_dict['%s/%s' % (cnma.LOSS_KEY_PREFIX,
cnma.SEGMENTATION_HEATMAP)])
self.assertGreater(
0.01, loss_dict['%s/%s' % (cnma.LOSS_KEY_PREFIX,
cnma.DENSEPOSE_HEATMAP)])
self.assertGreater(
0.01, loss_dict['%s/%s' % (cnma.LOSS_KEY_PREFIX,
cnma.DENSEPOSE_REGRESSION)])
@parameterized.parameters(
{'target_class_id': 1},
......@@ -1230,6 +1376,14 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
segmentation_heatmap[:, 14:18, 14:18, target_class_id] = 1.0
segmentation_heatmap = _logit(segmentation_heatmap)
dp_part_ind = 4
dp_part_heatmap = np.zeros((1, 32, 32, _DENSEPOSE_NUM_PARTS),
dtype=np.float32)
dp_part_heatmap[0, 14:18, 14:18, dp_part_ind] = 1.0
dp_part_heatmap = _logit(dp_part_heatmap)
dp_surf_coords = np.random.randn(1, 32, 32, 2 * _DENSEPOSE_NUM_PARTS)
class_center = tf.constant(class_center)
height_width = tf.constant(height_width)
offset = tf.constant(offset)
......@@ -1237,6 +1391,8 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
keypoint_offsets = tf.constant(keypoint_offsets, dtype=tf.float32)
keypoint_regression = tf.constant(keypoint_regression, dtype=tf.float32)
segmentation_heatmap = tf.constant(segmentation_heatmap, dtype=tf.float32)
dp_part_heatmap = tf.constant(dp_part_heatmap, dtype=tf.float32)
dp_surf_coords = tf.constant(dp_surf_coords, dtype=tf.float32)
prediction_dict = {
cnma.OBJECT_CENTER: [class_center],
......@@ -1249,6 +1405,8 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
cnma.get_keypoint_name(_TASK_NAME, cnma.KEYPOINT_REGRESSION):
[keypoint_regression],
cnma.SEGMENTATION_HEATMAP: [segmentation_heatmap],
cnma.DENSEPOSE_HEATMAP: [dp_part_heatmap],
cnma.DENSEPOSE_REGRESSION: [dp_surf_coords]
}
def graph_fn():
......@@ -1271,12 +1429,13 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
self.assertAllEqual([1, max_detection, 4, 4],
detections['detection_masks'].shape)
# There should be some section of the first mask (corresponding to the only
# detection) with non-zero mask values.
self.assertGreater(np.sum(detections['detection_masks'][0, 0, :, :] > 0), 0)
# Masks should be empty for everything but the first detection.
self.assertAllEqual(
detections['detection_masks'][0, 1:, :, :],
np.zeros_like(detections['detection_masks'][0, 1:, :, :]))
self.assertAllEqual(
detections['detection_surface_coords'][0, 1:, :, :],
np.zeros_like(detections['detection_surface_coords'][0, 1:, :, :]))
if target_class_id == 1:
expected_kpts_for_obj_0 = np.array(
......@@ -1287,6 +1446,12 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
expected_kpts_for_obj_0, rtol=1e-6)
np.testing.assert_allclose(detections['detection_keypoint_scores'][0][0],
expected_kpt_scores_for_obj_0, rtol=1e-6)
# First detection has DensePose parts.
self.assertSameElements(
np.unique(detections['detection_masks'][0, 0, :, :]),
set([0, dp_part_ind + 1]))
self.assertGreater(np.sum(np.abs(detections['detection_surface_coords'])),
0.0)
else:
# All keypoint outputs should be zeros.
np.testing.assert_allclose(
......@@ -1297,6 +1462,14 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
detections['detection_keypoint_scores'][0][0],
np.zeros([num_keypoints], np.float),
rtol=1e-6)
# Binary segmentation mask.
self.assertSameElements(
np.unique(detections['detection_masks'][0, 0, :, :]),
set([0, 1]))
# No DensePose surface coordinates.
np.testing.assert_allclose(
detections['detection_surface_coords'][0, 0, :, :],
np.zeros_like(detections['detection_surface_coords'][0, 0, :, :]))
def test_get_instance_indices(self):
classes = tf.constant([[0, 1, 2, 0], [2, 1, 2, 2]], dtype=tf.int32)
......@@ -1353,6 +1526,17 @@ def get_fake_prediction_dict(input_height, input_width, stride):
mask_heatmap[0, 2, 4, 1] = 1.0
mask_heatmap = _logit(mask_heatmap)
densepose_heatmap = np.zeros((2, output_height, output_width,
_DENSEPOSE_NUM_PARTS), dtype=np.float32)
densepose_heatmap[0, 2, 4, 5] = 1.0
densepose_heatmap = _logit(densepose_heatmap)
densepose_regression = np.zeros((2, output_height, output_width,
2 * _DENSEPOSE_NUM_PARTS), dtype=np.float32)
# The surface coordinate indices for part index 5 are:
# (5 * 2, 5 * 2 + 1), or (10, 11).
densepose_regression[0, 2, 4, 10:12] = 0.4, 0.7
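# (Illustrative note, not part of the original change.) Assuming the
# regression output packs a contiguous (v, u) pair per part, part index p
# maps to channels (2 * p, 2 * p + 1); e.g. part 5 -> channels 10 and 11,
# as used above.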
prediction_dict = {
'preprocessed_inputs':
tf.zeros((2, input_height, input_width, 3)),
......@@ -1383,6 +1567,14 @@ def get_fake_prediction_dict(input_height, input_width, stride):
cnma.SEGMENTATION_HEATMAP: [
tf.constant(mask_heatmap),
tf.constant(mask_heatmap)
],
cnma.DENSEPOSE_HEATMAP: [
tf.constant(densepose_heatmap),
tf.constant(densepose_heatmap),
],
cnma.DENSEPOSE_REGRESSION: [
tf.constant(densepose_regression),
tf.constant(densepose_regression),
]
}
return prediction_dict
......@@ -1427,12 +1619,30 @@ def get_fake_groundtruth_dict(input_height, input_width, stride):
tf.constant(mask),
tf.zeros_like(mask),
]
densepose_num_points = [
tf.constant([1], dtype=tf.int32),
tf.constant([0], dtype=tf.int32),
]
densepose_part_ids = [
tf.constant([[5, 0, 0]], dtype=tf.int32),
tf.constant([[0, 0, 0]], dtype=tf.int32),
]
densepose_surface_coords_np = np.zeros((1, 3, 4), dtype=np.float32)
densepose_surface_coords_np[0, 0, :] = 0.55, 0.55, 0.4, 0.7
densepose_surface_coords = [
tf.constant(densepose_surface_coords_np),
tf.zeros_like(densepose_surface_coords_np)
]
groundtruth_dict = {
fields.BoxListFields.boxes: boxes,
fields.BoxListFields.weights: weights,
fields.BoxListFields.classes: classes,
fields.BoxListFields.keypoints: keypoints,
fields.BoxListFields.masks: masks,
fields.BoxListFields.densepose_num_points: densepose_num_points,
fields.BoxListFields.densepose_part_ids: densepose_part_ids,
fields.BoxListFields.densepose_surface_coords:
densepose_surface_coords,
fields.InputDataFields.groundtruth_labeled_classes: labeled_classes,
}
return groundtruth_dict
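The fake DensePose groundtruth built above follows the format documented later in this change: each sampled point is a (y, x, v, u) tuple, with (y, x) normalized image coordinates and (v, u) part-relative surface coordinates, alongside a parallel part-id array and a valid-point count. A minimal standalone sketch of that layout (values copied from the fake data above):

```python
import numpy as np

# One instance, up to 3 sampled points, 4 values (y, x, v, u) per point.
surface_coords = np.zeros((1, 3, 4), dtype=np.float32)
surface_coords[0, 0] = [0.55, 0.55, 0.4, 0.7]     # (y, x) image, (v, u) surface
part_ids = np.array([[5, 0, 0]], dtype=np.int32)  # part id for each sampled point
num_points = np.array([1], dtype=np.int32)        # only the first point is valid
```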
......
......@@ -20,8 +20,8 @@ from __future__ import print_function
import functools
import unittest
from unittest import mock # pylint: disable=g-importing-member
from absl.testing import parameterized
import mock
import tensorflow.compat.v1 as tf
import tf_slim as slim
......
......@@ -2547,7 +2547,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
if second_stage_mask_loss is not None:
mask_loss = tf.multiply(self._second_stage_mask_loss_weight,
second_stage_mask_loss, name='mask_loss')
loss_dict[mask_loss.op.name] = mask_loss
loss_dict['Loss/BoxClassifierLoss/mask_loss'] = mask_loss
return loss_dict
def _get_mask_proposal_boxes_and_classes(
......
......@@ -432,14 +432,9 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
return eval_metric_ops
def _check_mask_type_and_value(array_name, masks):
"""Checks whether mask dtype is uint8 and the values are either 0 or 1."""
if masks.dtype != np.uint8:
raise ValueError('{} must be of type np.uint8. Found {}.'.format(
array_name, masks.dtype))
if np.any(np.logical_and(masks != 0, masks != 1)):
raise ValueError('{} elements can only be either 0 or 1.'.format(
array_name))
def convert_masks_to_binary(masks):
"""Converts masks to 0 or 1 and uint8 type."""
return (masks > 0).astype(np.uint8)
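The new `convert_masks_to_binary` helper replaces the strict dtype-and-value check: rather than raising on non-binary inputs, any positive value is now treated as foreground. A quick illustrative example of the same thresholding logic (toy values, not from the change itself):

```python
import numpy as np

masks = np.array([[0, 5], [1, 0]], dtype=np.uint8)
binary = (masks > 0).astype(np.uint8)  # same logic as convert_masks_to_binary
# binary -> [[0, 1], [1, 0]]
```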
class CocoKeypointEvaluator(CocoDetectionEvaluator):
......@@ -952,9 +947,8 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
groundtruth_instance_masks = groundtruth_dict[
standard_fields.InputDataFields.groundtruth_instance_masks]
_check_mask_type_and_value(standard_fields.InputDataFields.
groundtruth_instance_masks,
groundtruth_instance_masks)
groundtruth_instance_masks = convert_masks_to_binary(
groundtruth_instance_masks)
self._groundtruth_list.extend(
coco_tools.
ExportSingleImageGroundtruthToCoco(
......@@ -1013,9 +1007,7 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
'are incompatible: {} vs {}'.format(
groundtruth_masks_shape,
detection_masks.shape))
_check_mask_type_and_value(standard_fields.DetectionResultFields.
detection_masks,
detection_masks)
detection_masks = convert_masks_to_binary(detection_masks)
self._detection_masks_list.extend(
coco_tools.ExportSingleImageDetectionMasksToCoco(
image_id=image_id,
......
......@@ -1424,14 +1424,16 @@ class CocoMaskEvaluationTest(tf.test.TestCase):
image_id='image3',
detections_dict={
standard_fields.DetectionResultFields.detection_boxes:
np.array([[25., 25., 50., 50.]]),
np.array([[25., 25., 50., 50.]]),
standard_fields.DetectionResultFields.detection_scores:
np.array([.8]),
np.array([.8]),
standard_fields.DetectionResultFields.detection_classes:
np.array([1]),
np.array([1]),
standard_fields.DetectionResultFields.detection_masks:
np.pad(np.ones([1, 25, 25], dtype=np.uint8),
((0, 0), (10, 10), (10, 10)), mode='constant')
# The value of 5 is equivalent to 1, since masks will be
# thresholded and binarized before evaluation.
np.pad(5 * np.ones([1, 25, 25], dtype=np.uint8),
((0, 0), (10, 10), (10, 10)), mode='constant')
})
metrics = coco_evaluator.evaluate()
self.assertAlmostEqual(metrics['DetectionMasks_Precision/mAP'], 1.0)
......
......@@ -136,15 +136,15 @@ def build_groundtruth_dictionary(data, class_label_map):
dictionary = {
standard_fields.InputDataFields.groundtruth_boxes:
data_location[['YMin', 'XMin', 'YMax', 'XMax']].as_matrix(),
data_location[['YMin', 'XMin', 'YMax', 'XMax']].to_numpy(),
standard_fields.InputDataFields.groundtruth_classes:
data_location['LabelName'].map(lambda x: class_label_map[x]
).as_matrix(),
).to_numpy(),
standard_fields.InputDataFields.groundtruth_group_of:
data_location['IsGroupOf'].as_matrix().astype(int),
data_location['IsGroupOf'].to_numpy().astype(int),
standard_fields.InputDataFields.groundtruth_image_classes:
data_labels['LabelName'].map(lambda x: class_label_map[x]
).as_matrix(),
).to_numpy(),
}
if 'Mask' in data_location:
......@@ -179,9 +179,9 @@ def build_predictions_dictionary(data, class_label_map):
"""
dictionary = {
standard_fields.DetectionResultFields.detection_classes:
data['LabelName'].map(lambda x: class_label_map[x]).as_matrix(),
data['LabelName'].map(lambda x: class_label_map[x]).to_numpy(),
standard_fields.DetectionResultFields.detection_scores:
data['Score'].as_matrix()
data['Score'].to_numpy()
}
if 'Mask' in data:
......@@ -192,6 +192,6 @@ def build_predictions_dictionary(data, class_label_map):
else:
dictionary[standard_fields.DetectionResultFields.detection_boxes] = data[[
'YMin', 'XMin', 'YMax', 'XMax'
]].as_matrix()
]].to_numpy()
return dictionary
......@@ -53,16 +53,16 @@ def build_groundtruth_vrd_dictionary(data, class_label_map,
boxes = np.zeros(data_boxes.shape[0], dtype=vrd_evaluation.vrd_box_data_type)
boxes['subject'] = data_boxes[['YMin1', 'XMin1', 'YMax1',
'XMax1']].as_matrix()
boxes['object'] = data_boxes[['YMin2', 'XMin2', 'YMax2', 'XMax2']].as_matrix()
'XMax1']].to_numpy()
boxes['object'] = data_boxes[['YMin2', 'XMin2', 'YMax2', 'XMax2']].to_numpy()
labels = np.zeros(data_boxes.shape[0], dtype=vrd_evaluation.label_data_type)
labels['subject'] = data_boxes['LabelName1'].map(
lambda x: class_label_map[x]).as_matrix()
lambda x: class_label_map[x]).to_numpy()
labels['object'] = data_boxes['LabelName2'].map(
lambda x: class_label_map[x]).as_matrix()
lambda x: class_label_map[x]).to_numpy()
labels['relation'] = data_boxes['RelationshipLabel'].map(
lambda x: relationship_label_map[x]).as_matrix()
lambda x: relationship_label_map[x]).to_numpy()
return {
standard_fields.InputDataFields.groundtruth_boxes:
......@@ -71,7 +71,7 @@ def build_groundtruth_vrd_dictionary(data, class_label_map,
labels,
standard_fields.InputDataFields.groundtruth_image_classes:
data_labels['LabelName'].map(lambda x: class_label_map[x])
.as_matrix(),
.to_numpy(),
}
......@@ -104,16 +104,16 @@ def build_predictions_vrd_dictionary(data, class_label_map,
boxes = np.zeros(data_boxes.shape[0], dtype=vrd_evaluation.vrd_box_data_type)
boxes['subject'] = data_boxes[['YMin1', 'XMin1', 'YMax1',
'XMax1']].as_matrix()
boxes['object'] = data_boxes[['YMin2', 'XMin2', 'YMax2', 'XMax2']].as_matrix()
'XMax1']].to_numpy()
boxes['object'] = data_boxes[['YMin2', 'XMin2', 'YMax2', 'XMax2']].to_numpy()
labels = np.zeros(data_boxes.shape[0], dtype=vrd_evaluation.label_data_type)
labels['subject'] = data_boxes['LabelName1'].map(
lambda x: class_label_map[x]).as_matrix()
lambda x: class_label_map[x]).to_numpy()
labels['object'] = data_boxes['LabelName2'].map(
lambda x: class_label_map[x]).as_matrix()
lambda x: class_label_map[x]).to_numpy()
labels['relation'] = data_boxes['RelationshipLabel'].map(
lambda x: relationship_label_map[x]).as_matrix()
lambda x: relationship_label_map[x]).to_numpy()
return {
standard_fields.DetectionResultFields.detection_boxes:
......@@ -121,5 +121,5 @@ def build_predictions_vrd_dictionary(data, class_label_map,
standard_fields.DetectionResultFields.detection_classes:
labels,
standard_fields.DetectionResultFields.detection_scores:
data_boxes['Score'].as_matrix()
data_boxes['Score'].to_numpy()
}
......@@ -43,7 +43,6 @@ from object_detection.utils import visualization_utils as vis_utils
# pylint: disable=g-import-not-at-top
try:
from tensorflow.contrib import learn as contrib_learn
from tensorflow.contrib import tpu as contrib_tpu
except ImportError:
# TF 2.0 doesn't ship with contrib.
pass
......@@ -94,6 +93,15 @@ def _prepare_groundtruth_for_eval(detection_model, class_agnostic,
of groundtruth boxes per image.
'groundtruth_keypoints': [batch_size, num_boxes, num_keypoints, 2] float32
tensor of keypoints (if provided in groundtruth).
'groundtruth_dp_num_points_list': [batch_size, num_boxes] int32 tensor
with the number of DensePose points for each instance (if provided in
groundtruth).
'groundtruth_dp_part_ids_list': [batch_size, num_boxes,
max_sampled_points] int32 tensor with the part ids for each DensePose
sampled point (if provided in groundtruth).
'groundtruth_dp_surface_coords_list': [batch_size, num_boxes,
max_sampled_points, 4] float32 tensor containing the DensePose surface
coordinates for each sampled point (if provided in groundtruth).
'groundtruth_group_of': [batch_size, num_boxes] bool tensor indicating
group_of annotations (if provided in groundtruth).
'groundtruth_labeled_classes': [batch_size, num_classes] int64
......@@ -164,6 +172,21 @@ def _prepare_groundtruth_for_eval(detection_model, class_agnostic,
groundtruth[input_data_fields.groundtruth_labeled_classes] = tf.stack(
labeled_classes)
if detection_model.groundtruth_has_field(
fields.BoxListFields.densepose_num_points):
groundtruth[input_data_fields.groundtruth_dp_num_points] = tf.stack(
detection_model.groundtruth_lists(
fields.BoxListFields.densepose_num_points))
if detection_model.groundtruth_has_field(
fields.BoxListFields.densepose_part_ids):
groundtruth[input_data_fields.groundtruth_dp_part_ids] = tf.stack(
detection_model.groundtruth_lists(
fields.BoxListFields.densepose_part_ids))
if detection_model.groundtruth_has_field(
fields.BoxListFields.densepose_surface_coords):
groundtruth[input_data_fields.groundtruth_dp_surface_coords] = tf.stack(
detection_model.groundtruth_lists(
fields.BoxListFields.densepose_surface_coords))
groundtruth[input_data_fields.num_groundtruth_boxes] = (
tf.tile([max_number_of_boxes], multiples=[groundtruth_boxes_shape[0]]))
return groundtruth
......@@ -219,6 +242,9 @@ def unstack_batch(tensor_dict, unpad_groundtruth_tensors=True):
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_keypoints,
fields.InputDataFields.groundtruth_keypoint_visibilities,
fields.InputDataFields.groundtruth_dp_num_points,
fields.InputDataFields.groundtruth_dp_part_ids,
fields.InputDataFields.groundtruth_dp_surface_coords,
fields.InputDataFields.groundtruth_group_of,
fields.InputDataFields.groundtruth_difficult,
fields.InputDataFields.groundtruth_is_crowd,
......@@ -269,6 +295,18 @@ def provide_groundtruth(model, labels):
if fields.InputDataFields.groundtruth_keypoint_visibilities in labels:
gt_keypoint_visibilities_list = labels[
fields.InputDataFields.groundtruth_keypoint_visibilities]
gt_dp_num_points_list = None
if fields.InputDataFields.groundtruth_dp_num_points in labels:
gt_dp_num_points_list = labels[
fields.InputDataFields.groundtruth_dp_num_points]
gt_dp_part_ids_list = None
if fields.InputDataFields.groundtruth_dp_part_ids in labels:
gt_dp_part_ids_list = labels[
fields.InputDataFields.groundtruth_dp_part_ids]
gt_dp_surface_coords_list = None
if fields.InputDataFields.groundtruth_dp_surface_coords in labels:
gt_dp_surface_coords_list = labels[
fields.InputDataFields.groundtruth_dp_surface_coords]
gt_weights_list = None
if fields.InputDataFields.groundtruth_weights in labels:
gt_weights_list = labels[fields.InputDataFields.groundtruth_weights]
......@@ -297,13 +335,16 @@ def provide_groundtruth(model, labels):
groundtruth_masks_list=gt_masks_list,
groundtruth_keypoints_list=gt_keypoints_list,
groundtruth_keypoint_visibilities_list=gt_keypoint_visibilities_list,
groundtruth_dp_num_points_list=gt_dp_num_points_list,
groundtruth_dp_part_ids_list=gt_dp_part_ids_list,
groundtruth_dp_surface_coords_list=gt_dp_surface_coords_list,
groundtruth_weights_list=gt_weights_list,
groundtruth_is_crowd_list=gt_is_crowd_list,
groundtruth_group_of_list=gt_group_of_list,
groundtruth_area_list=gt_area_list)
def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
def create_model_fn(detection_model_fn, configs, hparams=None, use_tpu=False,
postprocess_on_cpu=False):
"""Creates a model function for `Estimator`.
......@@ -377,7 +418,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
side_inputs = detection_model.get_side_inputs(features)
if use_tpu and train_config.use_bfloat16:
with contrib_tpu.bfloat16_scope():
with tf.tpu.bfloat16_scope():
prediction_dict = detection_model.predict(
preprocessed_images,
features[fields.InputDataFields.true_image_shape], **side_inputs)
......@@ -392,7 +433,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
if mode in (tf.estimator.ModeKeys.EVAL, tf.estimator.ModeKeys.PREDICT):
if use_tpu and postprocess_on_cpu:
detections = contrib_tpu.outside_compilation(
detections = tf.tpu.outside_compilation(
postprocess_wrapper,
(prediction_dict,
features[fields.InputDataFields.true_image_shape]))
......@@ -468,7 +509,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
if mode == tf.estimator.ModeKeys.TRAIN:
if use_tpu:
training_optimizer = contrib_tpu.CrossShardOptimizer(training_optimizer)
training_optimizer = tf.tpu.CrossShardOptimizer(training_optimizer)
# Optionally freeze some layers by setting their gradients to be zero.
trainable_variables = None
......@@ -588,7 +629,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
# EVAL executes on CPU, so use regular non-TPU EstimatorSpec.
if use_tpu and mode != tf.estimator.ModeKeys.EVAL:
return contrib_tpu.TPUEstimatorSpec(
return tf.estimator.tpu.TPUEstimatorSpec(
mode=mode,
scaffold_fn=scaffold_fn,
predictions=detections,
......@@ -619,8 +660,8 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
def create_estimator_and_inputs(run_config,
hparams,
pipeline_config_path,
hparams=None,
pipeline_config_path=None,
config_override=None,
train_steps=None,
sample_1_of_n_eval_examples=1,
......@@ -639,7 +680,7 @@ def create_estimator_and_inputs(run_config,
Args:
run_config: A `RunConfig`.
hparams: A `HParams`.
hparams: (optional) A `HParams`.
pipeline_config_path: A path to a pipeline config file.
config_override: A pipeline_pb2.TrainEvalPipelineConfig text proto to
override the config from `pipeline_config_path`.
......@@ -762,14 +803,14 @@ def create_estimator_and_inputs(run_config,
model_config=model_config, predict_input_config=eval_input_configs[0])
# Read export_to_tpu from hparams if not passed.
if export_to_tpu is None:
if export_to_tpu is None and hparams is not None:
export_to_tpu = hparams.get('export_to_tpu', False)
tf.logging.info('create_estimator_and_inputs: use_tpu %s, export_to_tpu %s',
use_tpu, export_to_tpu)
model_fn = model_fn_creator(detection_model_fn, configs, hparams, use_tpu,
postprocess_on_cpu)
if use_tpu_estimator:
estimator = contrib_tpu.TPUEstimator(
estimator = tf.estimator.tpu.TPUEstimator(
model_fn=model_fn,
train_batch_size=train_config.batch_size,
# For each core, only batch size 1 is supported for eval.
......
......@@ -93,6 +93,12 @@ def _compute_losses_and_predictions_dicts(
instance masks for objects.
labels[fields.InputDataFields.groundtruth_keypoints] is a
float32 tensor containing keypoints for each box.
labels[fields.InputDataFields.groundtruth_dp_num_points] is an int32
tensor with the number of sampled DensePose points per object.
labels[fields.InputDataFields.groundtruth_dp_part_ids] is an int32
tensor with the DensePose part ids (0-indexed) per object.
labels[fields.InputDataFields.groundtruth_dp_surface_coords] is a
float32 tensor with the DensePose surface coordinates.
labels[fields.InputDataFields.groundtruth_group_of] is a tf.bool tensor
containing group_of annotations.
labels[fields.InputDataFields.groundtruth_labeled_classes] is a float32
......@@ -196,6 +202,17 @@ def eager_train_step(detection_model,
labels[fields.InputDataFields.groundtruth_keypoints] is a
[batch_size, num_boxes, num_keypoints, 2] float32 tensor containing
keypoints for each box.
labels[fields.InputDataFields.groundtruth_dp_num_points] is a
[batch_size, num_boxes] int32 tensor with the number of DensePose
sampled points per instance.
labels[fields.InputDataFields.groundtruth_dp_part_ids] is a
[batch_size, num_boxes, max_sampled_points] int32 tensor with the
part ids (0-indexed) for each instance.
labels[fields.InputDataFields.groundtruth_dp_surface_coords] is a
[batch_size, num_boxes, max_sampled_points, 4] float32 tensor with the
surface coordinates for each point. Each surface coordinate is of the
form (y, x, v, u) where (y, x) are normalized image locations and
(v, u) are part-relative normalized surface coordinates.
labels[fields.InputDataFields.groundtruth_labeled_classes] is a float32
k-hot tensor of classes.
unpad_groundtruth_tensors: A parameter passed to unstack_batch.
......@@ -337,11 +354,18 @@ def load_fine_tune_checkpoint(
labels)
strategy = tf.compat.v2.distribute.get_strategy()
strategy.experimental_run_v2(
_dummy_computation_fn, args=(
features,
labels,
))
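# Note (editorial): tf.distribute.Strategy.run superseded experimental_run_v2
# in newer TF 2.x releases; the hasattr check below keeps both APIs working.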
if hasattr(tf.distribute.Strategy, 'run'):
strategy.run(
_dummy_computation_fn, args=(
features,
labels,
))
else:
strategy.experimental_run_v2(
_dummy_computation_fn, args=(
features,
labels,
))
restore_from_objects_dict = model.restore_from_objects(
fine_tune_checkpoint_type=checkpoint_type)
......@@ -563,8 +587,12 @@ def train_loop(
def _sample_and_train(strategy, train_step_fn, data_iterator):
features, labels = data_iterator.next()
per_replica_losses = strategy.experimental_run_v2(
train_step_fn, args=(features, labels))
if hasattr(tf.distribute.Strategy, 'run'):
per_replica_losses = strategy.run(
train_step_fn, args=(features, labels))
else:
per_replica_losses = strategy.experimental_run_v2(
train_step_fn, args=(features, labels))
# TODO(anjalisridhar): explore if it is safe to remove the
## num_replicas scaling of the loss and switch this to a ReduceOp.Mean
return strategy.reduce(tf.distribute.ReduceOp.SUM,
......@@ -768,7 +796,16 @@ def eager_eval_loop(
name='eval_side_by_side_' + str(i),
step=global_step,
data=sbys_images,
max_outputs=1)
max_outputs=eval_config.num_visualizations)
if eval_util.has_densepose(eval_dict):
dp_image_list = vutils.draw_densepose_visualizations(
eval_dict)
dp_images = tf.concat(dp_image_list, axis=0)
tf.compat.v2.summary.image(
name='densepose_detections_' + str(i),
step=global_step,
data=dp_images,
max_outputs=eval_config.num_visualizations)
if evaluators is None:
if class_agnostic:
......
......@@ -22,7 +22,6 @@ from absl import flags
import tensorflow.compat.v1 as tf
from object_detection import model_hparams
from object_detection import model_lib
flags.DEFINE_string(
......@@ -41,10 +40,6 @@ flags.DEFINE_integer('sample_1_of_n_eval_on_train_examples', 5, 'Will sample '
'one of every n train input examples for evaluation, '
'where n is provided. This is only used if '
'`eval_training_data` is True.')
flags.DEFINE_string(
'hparams_overrides', None, 'Hyperparameter overrides, '
'represented as a string containing comma-separated '
'hparam_name=value pairs.')
flags.DEFINE_string(
'checkpoint_dir', None, 'Path to directory holding a checkpoint. If '
'`checkpoint_dir` is provided, this binary operates in eval-only mode, '
......@@ -68,7 +63,6 @@ def main(unused_argv):
train_and_eval_dict = model_lib.create_estimator_and_inputs(
run_config=config,
hparams=model_hparams.create_hparams(FLAGS.hparams_overrides),
pipeline_config_path=FLAGS.pipeline_config_path,
train_steps=FLAGS.num_train_steps,
sample_1_of_n_eval_examples=FLAGS.sample_1_of_n_eval_examples,
......
......@@ -54,6 +54,10 @@ flags.DEFINE_integer('eval_timeout', 3600, 'Number of seconds to wait for an'
'evaluation checkpoint before exiting.')
flags.DEFINE_bool('use_tpu', False, 'Whether the job is executing on a TPU.')
flags.DEFINE_string(
'tpu_name',
default=None,
help='Name of the Cloud TPU for Cluster Resolvers.')
flags.DEFINE_integer(
'num_workers', 1, 'When num_workers > 1, training uses '
'MultiWorkerMirroredStrategy. When num_workers = 1 it uses '
......@@ -79,7 +83,10 @@ def main(unused_argv):
wait_interval=300, timeout=FLAGS.eval_timeout)
else:
if FLAGS.use_tpu:
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
# The TPU is automatically inferred if tpu_name is None and we are
# running under Cloud AI Platform.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(
FLAGS.tpu_name)
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)
......
......@@ -26,18 +26,8 @@ from absl import flags
import tensorflow.compat.v1 as tf
from object_detection import model_hparams
from object_detection import model_lib
# pylint: disable=g-import-not-at-top
try:
from tensorflow.contrib import cluster_resolver as contrib_cluster_resolver
from tensorflow.contrib import tpu as contrib_tpu
except ImportError:
# TF 2.0 doesn't ship with contrib.
pass
# pylint: enable=g-import-not-at-top
tf.flags.DEFINE_bool('use_tpu', True, 'Use TPUs rather than plain CPUs')
# Cloud TPU Cluster Resolvers
......@@ -67,10 +57,6 @@ flags.DEFINE_string('mode', 'train',
flags.DEFINE_integer('train_batch_size', None, 'Batch size for training. If '
'this is not provided, batch size is read from training '
'config.')
flags.DEFINE_string(
'hparams_overrides', None, 'Comma-separated list of '
'hyperparameters to override defaults.')
flags.DEFINE_integer('num_train_steps', None, 'Number of train steps.')
flags.DEFINE_boolean('eval_training_data', False,
'If training data should be evaluated for this job.')
......@@ -99,15 +85,15 @@ def main(unused_argv):
flags.mark_flag_as_required('pipeline_config_path')
tpu_cluster_resolver = (
contrib_cluster_resolver.TPUClusterResolver(
tf.distribute.cluster_resolver.TPUClusterResolver(
tpu=[FLAGS.tpu_name], zone=FLAGS.tpu_zone, project=FLAGS.gcp_project))
tpu_grpc_url = tpu_cluster_resolver.get_master()
config = contrib_tpu.RunConfig(
config = tf.estimator.tpu.RunConfig(
master=tpu_grpc_url,
evaluation_master=tpu_grpc_url,
model_dir=FLAGS.model_dir,
tpu_config=contrib_tpu.TPUConfig(
tpu_config=tf.estimator.tpu.TPUConfig(
iterations_per_loop=FLAGS.iterations_per_loop,
num_shards=FLAGS.num_shards))
......@@ -117,7 +103,6 @@ def main(unused_argv):
train_and_eval_dict = model_lib.create_estimator_and_inputs(
run_config=config,
hparams=model_hparams.create_hparams(FLAGS.hparams_overrides),
pipeline_config_path=FLAGS.pipeline_config_path,
train_steps=FLAGS.num_train_steps,
sample_1_of_n_eval_examples=FLAGS.sample_1_of_n_eval_examples,
......
......@@ -56,7 +56,7 @@ class FasterRCNNResnetV1FpnKerasFeatureExtractor(
the resnet_v1.resnet_v1_{50,101,152} models.
resnet_v1_base_model_name: model name under which to construct resnet v1.
first_stage_features_stride: See base class.
conv_hyperparameters: a `hyperparams_builder.KerasLayerHyperparams` object
conv_hyperparams: a `hyperparams_builder.KerasLayerHyperparams` object
containing convolution hyperparameters for the layers added on top of
the base feature extractor.
batch_norm_trainable: See base class.
......@@ -143,19 +143,21 @@ class FasterRCNNResnetV1FpnKerasFeatureExtractor(
with tf.name_scope('ResnetV1FPN'):
full_resnet_v1_model = self._resnet_v1_base_model(
batchnorm_training=self._train_batch_norm,
conv_hyperparams=(self._conv_hyperparams
if self._override_base_feature_extractor_hyperparams
conv_hyperparams=(self._conv_hyperparams if
self._override_base_feature_extractor_hyperparams
else None),
classes=None,
weights=None,
include_top=False)
output_layers = _RESNET_MODEL_OUTPUT_LAYERS[self._resnet_v1_base_model_name]
output_layers = _RESNET_MODEL_OUTPUT_LAYERS[
self._resnet_v1_base_model_name]
outputs = [full_resnet_v1_model.get_layer(output_layer_name).output
for output_layer_name in output_layers]
self.classification_backbone = tf.keras.Model(
inputs=full_resnet_v1_model.inputs,
outputs=outputs)
backbone_outputs = self.classification_backbone(full_resnet_v1_model.inputs)
backbone_outputs = self.classification_backbone(
full_resnet_v1_model.inputs)
# construct FPN feature generator
self._base_fpn_max_level = min(self._fpn_max_level, 5)
......@@ -199,7 +201,7 @@ class FasterRCNNResnetV1FpnKerasFeatureExtractor(
self._conv_hyperparams.build_activation_layer(
name=layer_name))
self._coarse_feature_layers.append(layers)
feature_maps = []
for level in range(self._fpn_min_level, self._base_fpn_max_level + 1):
feature_maps.append(fpn_features['top_down_block{}'.format(level-1)])
......@@ -236,7 +238,7 @@ class FasterRCNNResnetV1FpnKerasFeatureExtractor(
"""
with tf.name_scope(name):
with tf.name_scope('ResnetV1FPN'):
# TODO: Add a batchnorm layer between two fc layers.
# TODO(yiming): Add a batchnorm layer between two fc layers.
feature_extractor_model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(units=1024, activation='relu'),
......@@ -283,12 +285,15 @@ class FasterRCNNResnet50FpnKerasFeatureExtractor(
fpn_min_level=fpn_min_level,
fpn_max_level=fpn_max_level,
additional_layer_depth=additional_layer_depth,
override_base_feature_extractor_hyperparams=override_base_feature_extractor_hyperparams)
override_base_feature_extractor_hyperparams=
override_base_feature_extractor_hyperparams
)
class FasterRCNNResnet101FpnKerasFeatureExtractor(
FasterRCNNResnetV1FpnKerasFeatureExtractor):
"""Faster RCNN with Resnet101 FPN feature extractor."""
def __init__(self,
is_training,
first_stage_features_stride=16,
......@@ -323,7 +328,8 @@ class FasterRCNNResnet101FpnKerasFeatureExtractor(
fpn_min_level=fpn_min_level,
fpn_max_level=fpn_max_level,
additional_layer_depth=additional_layer_depth,
override_base_feature_extractor_hyperparams=override_base_feature_extractor_hyperparams)
override_base_feature_extractor_hyperparams=
override_base_feature_extractor_hyperparams)
class FasterRCNNResnet152FpnKerasFeatureExtractor(
......@@ -364,4 +370,5 @@ class FasterRCNNResnet152FpnKerasFeatureExtractor(
fpn_min_level=fpn_min_level,
fpn_max_level=fpn_max_level,
additional_layer_depth=additional_layer_depth,
override_base_feature_extractor_hyperparams=override_base_feature_extractor_hyperparams)
override_base_feature_extractor_hyperparams=
override_base_feature_extractor_hyperparams)