This README documents the end-to-end process for training a landmark detection
and retrieval model using the DELF library on the
[Google Landmarks Dataset v2](https://github.com/cvdfoundation/google-landmark)
(GLDv2). This can be achieved following these steps:

...
## Install the DELF Library

The DELF Python library can be installed by running the
[`install_delf.sh`](./install_delf.sh) script using the command:

```
bash install_delf.sh
```

The script installs both the DELF library and its dependencies in the following
sequence:

* Install TensorFlow 2.2 and TensorFlow 2.2 for GPU.
* Install the [TF-Slim](https://github.com/google-research/tf-slim) library
  from source.
* Download [protoc](https://github.com/protocolbuffers/protobuf) and compile
  the DELF Protocol Buffers.
* Install the matplotlib, numpy, scikit-image, scipy and python3-tk Python
  libraries.
* Install the [TensorFlow Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection)
  from the cloned TensorFlow Model Garden repository.
* Install the DELF package.

*Please note that the current installation only works on 64-bit Linux
architectures due to the `protoc` binary downloaded by the installation script.
If you wish to install the DELF library on other architectures, please update
the [`install_delf.sh`](./install_delf.sh) script by referencing the desired
`protoc` [binary release](https://github.com/protocolbuffers/protobuf/releases).*
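
As a quick sanity check once the script finishes (a minimal sketch, assuming the
package is importable under the module name `delf`), you can try importing the
library from Python:

```
python3 -c "import delf; print('DELF imported successfully')"
```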

## Download the GLDv2 Training Data

The [GLDv2](https://github.com/cvdfoundation/google-landmark) images are grouped
in 3 datasets: TRAIN, INDEX, TEST. Images in each dataset are grouped into
`*.tar` files and individually referenced in `*.csv` files containing training
metadata and licensing information. The number of `*.tar` files per dataset is
as follows:

* TRAIN: 500 files.
* INDEX: 100 files.
* TEST: 20 files.

To download the GLDv2 images, run the
[`download_dataset.sh`](./download_dataset.sh) script like in the following
example:

```
bash download_dataset.sh 500 100 20
```

The script takes the following parameters, in order:

* The number of image files from the TRAIN dataset to download (maximum 500).
* The number of image files from the INDEX dataset to download (maximum 100).
* The number of image files from the TEST dataset to download (maximum 20).

The script downloads the GLDv2 images under the following directory structure:

* gldv2_dataset/
  * train/ - Contains raw images from the TRAIN dataset.
  * index/ - Contains raw images from the INDEX dataset.
  * test/ - Contains raw images from the TEST dataset.

Each of the three folders `gldv2_dataset/train/`, `gldv2_dataset/index/` and
`gldv2_dataset/test/` contains the following:

* The downloaded `*.tar` files.
* The corresponding MD5 checksum files, `*.txt`.
* The unpacked content of the downloaded files. (*Images are organized in
  folders and subfolders based on the first, second and third character in
  their file name; see the example after this list.*)
* The CSV files containing training and licensing metadata of the downloaded
  images.
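
For example, after unpacking, images whose file names start with `0a1` end up in
the corresponding three-level subfolder (the file name below is made up for
illustration):

```
# Lists TRAIN images whose file names start with "0a1", e.g. 0a1b2c3d4e5f6789.jpg
ls gldv2_dataset/train/0/a/1/
```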

*Please note that due to the large size of the GLDv2 dataset, the download can
take up to 12 hours and use up to 1 TB of disk space. In order to save bandwidth
and disk space, you may want to start by downloading only the TRAIN dataset, the
only one required for training, thus saving approximately 95 GB, the equivalent
of the INDEX and TEST datasets. To further save disk space, the `*.tar` files
can be deleted after downloading and unpacking them.*
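
A minimal sketch of the two space-saving suggestions above, assuming the script
accepts `0` for the INDEX and TEST counts and that the downloaded archives
follow GLDv2's `images_*.tar` naming:

```
# Download only the 500 TRAIN files, skipping INDEX and TEST.
bash download_dataset.sh 500 0 0

# After the archives have been unpacked, remove them to reclaim disk space.
rm gldv2_dataset/train/images_*.tar
```
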
## Prepare the Data for Training

Preparing the data for training consists of creating
[TFRecord](https://www.tensorflow.org/tutorials/load_data/tfrecord) files from
the raw GLDv2 images grouped into TRAIN and VALIDATION splits. The training set
produced contains only the *clean* subset of the GLDv2 dataset. The
[CVPR'20 paper](https://arxiv.org/abs/2004.01804) introducing the GLDv2 dataset
contains a detailed description of the *clean* subset.

Generating the TFRecord files containing the TRAIN and VALIDATION splits of the
*clean* GLDv2 subset can be achieved by running the
[`build_image_dataset.py`](./build_image_dataset.py) script. Assuming that the
GLDv2 images have been downloaded to the `gldv2_dataset` folder, the script can
be run as follows: