Commit 80444539 authored by Zhuoran Liu, committed by pkulzc

Add TPU SavedModel exporter and refactor OD code (#6737)

247226201  by ronnyvotel:

    Updating the visualization tools to accept unique_ids for color coding.

--
247067830  by Zhichao Lu:

    Add box_encodings_clip_range options for the convolutional box predictor (for TPU compatibility).
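    The effect of a clip range on box encodings can be sketched as follows. This is an illustrative NumPy stand-in, not the actual box predictor plumbing; the helper name and the default range are assumptions.

    ```python
    import numpy as np

    def clip_box_encodings(box_encodings, clip_min=-10.0, clip_max=10.0):
        # Hypothetical sketch: clamp every encoding coordinate into
        # [clip_min, clip_max] so downstream (e.g. TPU) ops see bounded
        # values; entries already in range pass through unchanged.
        return np.clip(box_encodings, clip_min, clip_max)

    encodings = np.array([[-50.0, 0.5, 3.0, 99.0]])
    clipped = clip_box_encodings(encodings)
    # -50.0 is clamped up to -10.0 and 99.0 down to 10.0.
    ```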

--
246888475  by Zhichao Lu:

    Remove unused _update_eval_steps function.

--
246163259  by lzc:

    Add a gather op that can handle ignore indices (which are "-1"s in this case).
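    The behavior can be sketched in NumPy as below. This is a hypothetical stand-in for the real op (which lives in the OD codebase): rows whose index is -1 are treated as "ignore" and yield zeros instead of participating in the gather.

    ```python
    import numpy as np

    def gather_with_ignored_indices(params, indices):
        # Sketch only: clamp -1 to a valid index so the gather itself never
        # goes out of range, then zero out the rows that came from ignored
        # positions.
        indices = np.asarray(indices)
        safe_indices = np.where(indices < 0, 0, indices)
        gathered = params[safe_indices]
        mask = (indices >= 0).astype(params.dtype)[:, None]
        return gathered * mask

    params = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    out = gather_with_ignored_indices(params, [2, -1, 0])
    # Row 1 of `out` is all zeros because its index was -1.
    ```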

--
246084944  by Zhichao Lu:

    Keras based implementation for SSD + MobilenetV2 + FPN.

--
245544227  by rathodv:

    Add batch_get_targets method to target assigner module to gather any groundtruth tensors based on the results of target assigner.

--
245540854  by rathodv:

    Update target assigner to return match tensor instead of a match object.

--
245434441  by Zhichao Lu:

    Add README for tpu_exporters package.

--
245381834  by lzc:

    Internal change.

--
245298983  by Zhichao Lu:

    Add conditional_shape_resizer to config_util

--
245134666  by Zhichao Lu:

    Adds ConditionalShapeResizer to the ImageResizer proto, which enables resizing only if the input image height or width is greater or smaller than a certain size. Also enables specification of the resize method in the resize_to_{max, min}_dimension methods.

--
245093975  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (faster-rcnn)

--
245072421  by Zhichao Lu:

    Adds a new image resizing method "resize_to_max_dimension" which resizes images only if a dimension is greater than the maximum desired value while maintaining aspect ratio.
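    The size arithmetic behind this rule can be sketched as below (pure shape computation, not the actual TF op; the function name here is illustrative).

    ```python
    def resize_to_max_dimension(height, width, max_dimension):
        # Sketch: resize only when the larger side exceeds max_dimension,
        # preserving the aspect ratio; otherwise return the size unchanged.
        largest_side = max(height, width)
        if largest_side <= max_dimension:
            return height, width
        scale = max_dimension / float(largest_side)
        return int(round(height * scale)), int(round(width * scale))

    # An 800x600 image is scaled down so its larger side becomes 400;
    # a 300x200 image is already within bounds and is left untouched.
    ```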

--
244946998  by lzc:

    Internal Changes.

--
244943693  by Zhichao Lu:

    Add a custom config to mobilenet v2 that makes it more detection friendly.

--
244754158  by derekjchow:

    Internal change.

--
244699875  by Zhichao Lu:

    Add check_range=False to box_list_ops.to_normalized_coordinates when training
    for instance segmentation.  This is consistent with other calls when training
    for object detection.  There could be wrongly annotated boxes in the dataset.

--
244507425  by rathodv:

    Support bfloat16 for ssd models.

--
244399982  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd)

--
244209387  by Zhichao Lu:

    Internal change.

--
243922296  by rathodv:

    Change `raw_detection_scores` to contain softmax/sigmoid scores (not logits) for `raw_detection_boxes`.

--
243883978  by Zhichao Lu:

    Add a sample fully conv config.

--
243369455  by Zhichao Lu:

    Fix regularization loss gap in Keras and Slim.

--
243292002  by lzc:

    Internal changes.

--
243097958  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd model)

--
243007177  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd model)

--
242776550  by Zhichao Lu:

    Make object detection pre-processing run on GPU.  tf.map_fn() uses
    TensorArrayV3 ops, which have no int32 GPU implementation.  Cast to int64,
    then cast back to int32.

--
242723128  by Zhichao Lu:

    Use sorted dictionaries for additional heads in non_max_suppression to ensure a deterministic tensor order.

--
242495311  by Zhichao Lu:

    Update documentation to reflect new TFLite examples repo location

--
242230527  by Zhichao Lu:

    Fix Dropout bugs for WeightSharedConvolutionalBoxPredictor.

--
242226573  by Zhichao Lu:

    Create Keras-based WeightSharedConvolutionalBoxPredictor.

--
241806074  by Zhichao Lu:

    Add inference in unit tests of TFX OD template.

--
241641498  by lzc:

    Internal change.

--
241637481  by Zhichao Lu:

    matmul_crop_and_resize(): Switch to dynamic shaping, so that not all dimensions are required to be known.

--
241429980  by Zhichao Lu:

    Internal change

--
241167237  by Zhichao Lu:

    Adds a faster_rcnn_inception_resnet_v2 Keras feature extractor, and updates the model builder to construct it.

--
241088616  by Zhichao Lu:

    Make it compatible with different dtype, e.g. float32, bfloat16, etc.

--
240897364  by lzc:

    Use image_np_expanded in object_detection_tutorial notebook.

--
240890393  by Zhichao Lu:

    Disable multicore inference for the OD template as it's not yet compatible.

--
240352168  by Zhichao Lu:

    Make SSDResnetV1FpnFeatureExtractor not protected to allow inheritance.

--
240351470  by lzc:

    Internal change.

--
239878928  by Zhichao Lu:

    Defines Keras box predictors for Faster RCNN and RFCN

--
239872103  by Zhichao Lu:

    Delete duplicated inputs in test.

--
239714273  by Zhichao Lu:

    Adding scope variable to all class heads

--
239698643  by Zhichao Lu:

    Create FPN feature extractor for object detection.

--
239696657  by Zhichao Lu:

    Internal Change.

--
239299404  by Zhichao Lu:

    Allows the faster rcnn meta-architecture to support Keras subcomponents

--
238502595  by Zhichao Lu:

    Lay the groundwork for symmetric quantization.

--
238496885  by Zhichao Lu:

    Add flexible_grid_anchor_generator

--
238138727  by lzc:

    Remove dead code.

    _USE_C_SHAPES has been forced True in TensorFlow releases since
    TensorFlow 1.9
    (https://github.com/tensorflow/tensorflow/commit/1d74a69443f741e69f9f52cb6bc2940b4d4ae3b7)

--
238123936  by rathodv:

    Add num_matched_groundtruth summary to target assigner in SSD.

--
238103345  by ronnyvotel:

    Raising error if input file pattern does not match any files.
    Also printing the number of evaluation images for coco metrics.

--
238044081  by Zhichao Lu:

    Fix docstring to state the correct dimensionality of `class_predictions_with_background`.

--
237920279  by Zhichao Lu:

    [XLA] Rework debug flags for dumping HLO.

    The following flags (usually passed via the XLA_FLAGS envvar) are removed:

      xla_dump_computations_to
      xla_dump_executions_to
      xla_dump_ir_to
      xla_dump_optimized_hlo_proto_to
      xla_dump_per_pass_hlo_proto_to
      xla_dump_unoptimized_hlo_proto_to
      xla_generate_hlo_graph
      xla_generate_hlo_text_to
      xla_hlo_dump_as_html
      xla_hlo_graph_path
      xla_log_hlo_text

    The following new flags are added:

      xla_dump_to
      xla_dump_hlo_module_re
      xla_dump_hlo_pass_re
      xla_dump_hlo_as_text
      xla_dump_hlo_as_proto
      xla_dump_hlo_as_dot
      xla_dump_hlo_as_url
      xla_dump_hlo_as_html
      xla_dump_ir
      xla_dump_hlo_snapshots

    The default is not to dump anything at all, but as soon as some dumping flag is
    specified, we enable the following defaults (most of which can be overridden).

     * dump to stdout (overridden by --xla_dump_to)
     * dump HLO modules at the very beginning and end of the optimization pipeline
     * don't dump between any HLO passes (overridden by --xla_dump_hlo_pass_re)
     * dump all HLO modules (overridden by --xla_dump_hlo_module_re)
     * dump in textual format (overridden by
       --xla_dump_hlo_as_{text,proto,dot,url,html}).

    For example, to dump optimized and unoptimized HLO text and protos to /tmp/foo,
    pass

      --xla_dump_to=/tmp/foo --xla_dump_hlo_as_text --xla_dump_hlo_as_proto

    For details on these flags' meanings, see xla.proto.

    The intent of this change is to make dumping both simpler to use and more
    powerful.

    For example:

     * Previously there was no way to dump the HLO module during the pass pipeline
       in HLO text format; the only option was --xla_dump_per_pass_hlo_proto_to,
       which dumped in proto format.

       Now this is --xla_dump_hlo_pass_re=.* --xla_dump_hlo_as_text.  (In fact, the
       second flag is not necessary in this case, as dumping as text is the
       default.)

     * Previously there was no way to dump HLO as a graph before and after
       compilation; the only option was --xla_generate_hlo_graph, which would dump
       before/after every pass.

       Now this is --xla_dump_hlo_as_{dot,url,html} (depending on what format you
       want the graph in).

     * Previously, there was no coordination between the filenames written by the
       various flags, so info about one module might be dumped with various
       filename prefixes.  Now the filenames are consistent and all dumps from a
       particular module are next to each other.

    If you only specify some of these flags, we try to figure out what you wanted.
    For example:

     * --xla_dump_to implies --xla_dump_hlo_as_text unless you specify some
       other --xla_dump_hlo_as_* flag.

     * --xla_dump_hlo_as_text or --xla_dump_ir implies dumping to stdout unless you
       specify a different --xla_dump_to directory.  You can explicitly dump to
       stdout with --xla_dump_to=-.

    As part of this change, I simplified the debugging code in the HLO passes for
    dumping HLO modules.  Previously, many tests explicitly VLOG'ed the HLO module
    before, after, and sometimes during the pass.  I removed these VLOGs.  If you
    want dumps before/during/after an HLO pass, use --xla_dump_hlo_pass_re=<pass_name>.

--
237510043  by lzc:

    Internal Change.

--
237469515  by Zhichao Lu:

    Parameterize model_builder.build in inputs.py.

--
237293511  by rathodv:

    Remove multiclass_scores from tensor_dict in transform_data_fn always.

--
237260333  by ronnyvotel:

    Updating faster_rcnn_meta_arch to define prediction dictionary fields that are batched.

--

PiperOrigin-RevId: 247226201
parent c4f34e58
@@ -71,5 +71,114 @@ class ConvolutionalKerasBoxHeadTest(test_case.TestCase):
box_encodings = box_prediction_head(image_feature)
self.assertAllEqual([64, 323, 1, 4], box_encodings.get_shape().as_list())
class MaskRCNNKerasBoxHeadTest(test_case.TestCase):
def _build_fc_hyperparams(
self, op_type=hyperparams_pb2.Hyperparams.FC):
hyperparams = hyperparams_pb2.Hyperparams()
hyperparams_text_proto = """
activation: NONE
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(hyperparams_text_proto, hyperparams)
hyperparams.op = op_type
return hyperparams_builder.KerasLayerHyperparams(hyperparams)
def test_prediction_size(self):
box_prediction_head = keras_box_head.MaskRCNNBoxHead(
is_training=False,
num_classes=20,
fc_hyperparams=self._build_fc_hyperparams(),
freeze_batchnorm=False,
use_dropout=True,
dropout_keep_prob=0.5,
box_code_size=4,
share_box_across_classes=False)
roi_pooled_features = tf.random_uniform(
[64, 7, 7, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
prediction = box_prediction_head(roi_pooled_features)
self.assertAllEqual([64, 1, 20, 4], prediction.get_shape().as_list())
class WeightSharedConvolutionalKerasBoxHead(test_case.TestCase):
def _build_conv_hyperparams(self):
conv_hyperparams = hyperparams_pb2.Hyperparams()
conv_hyperparams_text_proto = """
activation: NONE
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams)
return hyperparams_builder.KerasLayerHyperparams(conv_hyperparams)
def test_prediction_size_depthwise_false(self):
conv_hyperparams = self._build_conv_hyperparams()
box_prediction_head = keras_box_head.WeightSharedConvolutionalBoxHead(
box_code_size=4,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=1,
use_depthwise=False)
image_feature = tf.random_uniform(
[64, 17, 19, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
box_encodings = box_prediction_head(image_feature)
self.assertAllEqual([64, 323, 4], box_encodings.get_shape().as_list())
def test_prediction_size_depthwise_true(self):
conv_hyperparams = self._build_conv_hyperparams()
box_prediction_head = keras_box_head.WeightSharedConvolutionalBoxHead(
box_code_size=4,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=1,
use_depthwise=True)
image_feature = tf.random_uniform(
[64, 17, 19, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
box_encodings = box_prediction_head(image_feature)
self.assertAllEqual([64, 323, 4], box_encodings.get_shape().as_list())
def test_variable_count_depth_wise_true(self):
g = tf.Graph()
with g.as_default():
conv_hyperparams = self._build_conv_hyperparams()
box_prediction_head = keras_box_head.WeightSharedConvolutionalBoxHead(
box_code_size=4,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=1,
use_depthwise=True)
image_feature = tf.random_uniform(
[64, 17, 19, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
_ = box_prediction_head(image_feature)
variables = g.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
self.assertEqual(len(variables), 3)
def test_variable_count_depth_wise_False(self):
g = tf.Graph()
with g.as_default():
conv_hyperparams = self._build_conv_hyperparams()
box_prediction_head = keras_box_head.WeightSharedConvolutionalBoxHead(
box_code_size=4,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=1,
use_depthwise=False)
image_feature = tf.random_uniform(
[64, 17, 19, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
_ = box_prediction_head(image_feature)
variables = g.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
self.assertEqual(len(variables), 2)
if __name__ == '__main__':
tf.test.main()
@@ -134,7 +134,6 @@ class ConvolutionalClassHead(head.KerasHead):
[batch_size, num_anchors, num_class_slots] representing the class
predictions for the proposals.
"""
# Add a slot for the background class.
class_predictions_with_background = features
for layer in self._class_predictor_layers:
class_predictions_with_background = layer(
@@ -146,3 +145,197 @@ class ConvolutionalClassHead(head.KerasHead):
class_predictions_with_background,
[batch_size, -1, self._num_class_slots])
return class_predictions_with_background
class MaskRCNNClassHead(head.KerasHead):
"""Mask RCNN class prediction head.
This is a piece of Mask RCNN which is responsible for predicting
just the class scores of boxes.
Please refer to Mask RCNN paper:
https://arxiv.org/abs/1703.06870
"""
def __init__(self,
is_training,
num_class_slots,
fc_hyperparams,
freeze_batchnorm,
use_dropout,
dropout_keep_prob,
name=None):
"""Constructor.
Args:
is_training: Indicates whether the BoxPredictor is in training mode.
num_class_slots: number of class slots. Note that num_class_slots may or
may not include an implicit background category.
fc_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for fully connected dense ops.
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
use_dropout: Option to use dropout or not. Note that a single dropout
op is applied here prior to both box and class predictions, which stands
in contrast to the ConvolutionalBoxPredictor below.
dropout_keep_prob: Keep probability for dropout.
This is only used if use_dropout is True.
name: A string name scope to assign to the class head. If `None`, Keras
will auto-generate one from the class name.
"""
super(MaskRCNNClassHead, self).__init__(name=name)
self._is_training = is_training
self._freeze_batchnorm = freeze_batchnorm
self._num_class_slots = num_class_slots
self._fc_hyperparams = fc_hyperparams
self._use_dropout = use_dropout
self._dropout_keep_prob = dropout_keep_prob
self._class_predictor_layers = [tf.keras.layers.Flatten()]
if self._use_dropout:
self._class_predictor_layers.append(
tf.keras.layers.Dropout(rate=1.0 - self._dropout_keep_prob))
self._class_predictor_layers.append(
tf.keras.layers.Dense(self._num_class_slots,
name='ClassPredictor_dense'))
self._class_predictor_layers.append(
fc_hyperparams.build_batch_norm(training=(is_training and
not freeze_batchnorm),
name='ClassPredictor_batchnorm'))
def _predict(self, features):
"""Predicts the class scores for boxes.
Args:
features: A float tensor of shape [batch_size, height, width, channels]
containing features for a batch of images.
Returns:
class_predictions_with_background: A float tensor of shape
[batch_size, 1, num_class_slots] representing the class predictions for
the proposals.
"""
spatial_averaged_roi_pooled_features = tf.reduce_mean(
features, [1, 2], keep_dims=True, name='AvgPool')
net = spatial_averaged_roi_pooled_features
for layer in self._class_predictor_layers:
net = layer(net)
class_predictions_with_background = tf.reshape(
net,
[-1, 1, self._num_class_slots])
return class_predictions_with_background
class WeightSharedConvolutionalClassHead(head.KerasHead):
"""Weight shared convolutional class prediction head.
This head allows sharing the same set of parameters (weights) when called more
than once on different feature maps.
"""
def __init__(self,
num_class_slots,
num_predictions_per_location,
conv_hyperparams,
kernel_size=3,
class_prediction_bias_init=0.0,
use_dropout=False,
dropout_keep_prob=0.8,
use_depthwise=False,
score_converter_fn=tf.identity,
return_flat_predictions=True,
name=None):
"""Constructor.
Args:
num_class_slots: number of class slots. Note that num_class_slots may or
may not include an implicit background category.
num_predictions_per_location: Number of box predictions to be made per
spatial location. Int specifying number of boxes per location.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
kernel_size: Size of final convolution kernel.
class_prediction_bias_init: constant value to initialize bias of the last
conv2d layer before class prediction.
use_dropout: Whether to apply dropout to class prediction head.
dropout_keep_prob: Probability of keeping activations.
use_depthwise: Whether to use depthwise convolutions for prediction
steps. Default is False.
score_converter_fn: Callable elementwise nonlinearity (that takes tensors
as inputs and returns tensors).
return_flat_predictions: If true, returns flattened prediction tensor
of shape [batch, height * width * num_predictions_per_location,
num_class_slots]. Otherwise returns the prediction tensor before reshaping,
whose shape is [batch, height, width, num_predictions_per_location *
num_class_slots].
name: A string name scope to assign to the model. If `None`, Keras
will auto-generate one from the class name.
"""
super(WeightSharedConvolutionalClassHead, self).__init__(name=name)
self._num_class_slots = num_class_slots
self._kernel_size = kernel_size
self._class_prediction_bias_init = class_prediction_bias_init
self._use_dropout = use_dropout
self._dropout_keep_prob = dropout_keep_prob
self._use_depthwise = use_depthwise
self._score_converter_fn = score_converter_fn
self._return_flat_predictions = return_flat_predictions
self._class_predictor_layers = []
if self._use_dropout:
self._class_predictor_layers.append(
tf.keras.layers.Dropout(rate=1.0 - self._dropout_keep_prob))
if self._use_depthwise:
self._class_predictor_layers.append(
tf.keras.layers.SeparableConv2D(
num_predictions_per_location * self._num_class_slots,
[self._kernel_size, self._kernel_size],
padding='SAME',
depth_multiplier=1,
strides=1,
name='ClassPredictor',
bias_initializer=tf.constant_initializer(
self._class_prediction_bias_init),
**conv_hyperparams.params(use_bias=True)))
else:
self._class_predictor_layers.append(
tf.keras.layers.Conv2D(
num_predictions_per_location * self._num_class_slots,
[self._kernel_size, self._kernel_size],
padding='SAME',
name='ClassPredictor',
bias_initializer=tf.constant_initializer(
self._class_prediction_bias_init),
**conv_hyperparams.params(use_bias=True)))
def _predict(self, features):
"""Predicts boxes.
Args:
features: A float tensor of shape [batch_size, height, width, channels]
containing image features.
Returns:
class_predictions_with_background: A float tensor of shape
[batch_size, num_anchors, num_class_slots] representing the class
predictions for the proposals.
"""
class_predictions_with_background = features
for layer in self._class_predictor_layers:
class_predictions_with_background = layer(
class_predictions_with_background)
batch_size = features.get_shape().as_list()[0]
if batch_size is None:
batch_size = tf.shape(features)[0]
class_predictions_with_background = self._score_converter_fn(
class_predictions_with_background)
if self._return_flat_predictions:
class_predictions_with_background = tf.reshape(
class_predictions_with_background,
[batch_size, -1, self._num_class_slots])
return class_predictions_with_background
@@ -77,5 +77,115 @@ class ConvolutionalKerasClassPredictorTest(test_case.TestCase):
self.assertAllEqual([64, 323, 20],
class_predictions.get_shape().as_list())
class MaskRCNNClassHeadTest(test_case.TestCase):
def _build_fc_hyperparams(self,
op_type=hyperparams_pb2.Hyperparams.FC):
hyperparams = hyperparams_pb2.Hyperparams()
hyperparams_text_proto = """
activation: NONE
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(hyperparams_text_proto, hyperparams)
hyperparams.op = op_type
return hyperparams_builder.KerasLayerHyperparams(hyperparams)
def test_prediction_size(self):
class_prediction_head = keras_class_head.MaskRCNNClassHead(
is_training=False,
num_class_slots=20,
fc_hyperparams=self._build_fc_hyperparams(),
freeze_batchnorm=False,
use_dropout=True,
dropout_keep_prob=0.5)
roi_pooled_features = tf.random_uniform(
[64, 7, 7, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
prediction = class_prediction_head(roi_pooled_features)
self.assertAllEqual([64, 1, 20], prediction.get_shape().as_list())
class WeightSharedConvolutionalKerasClassPredictorTest(test_case.TestCase):
def _build_conv_hyperparams(self):
conv_hyperparams = hyperparams_pb2.Hyperparams()
conv_hyperparams_text_proto = """
activation: NONE
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams)
return hyperparams_builder.KerasLayerHyperparams(conv_hyperparams)
def test_prediction_size_depthwise_false(self):
conv_hyperparams = self._build_conv_hyperparams()
class_prediction_head = keras_class_head.WeightSharedConvolutionalClassHead(
num_class_slots=20,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=1,
use_depthwise=False)
image_feature = tf.random_uniform(
[64, 17, 19, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
class_predictions = class_prediction_head(image_feature)
self.assertAllEqual([64, 323, 20], class_predictions.get_shape().as_list())
def test_prediction_size_depthwise_true(self):
conv_hyperparams = self._build_conv_hyperparams()
class_prediction_head = keras_class_head.WeightSharedConvolutionalClassHead(
num_class_slots=20,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=1,
use_depthwise=True)
image_feature = tf.random_uniform(
[64, 17, 19, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
class_predictions = class_prediction_head(image_feature)
self.assertAllEqual([64, 323, 20], class_predictions.get_shape().as_list())
def test_variable_count_depth_wise_true(self):
g = tf.Graph()
with g.as_default():
conv_hyperparams = self._build_conv_hyperparams()
class_prediction_head = (
keras_class_head.WeightSharedConvolutionalClassHead(
num_class_slots=20,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=1,
use_depthwise=True))
image_feature = tf.random_uniform(
[64, 17, 19, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
_ = class_prediction_head(image_feature)
variables = g.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
self.assertEqual(len(variables), 3)
def test_variable_count_depth_wise_False(self):
g = tf.Graph()
with g.as_default():
conv_hyperparams = self._build_conv_hyperparams()
class_prediction_head = (
keras_class_head.WeightSharedConvolutionalClassHead(
num_class_slots=20,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=1,
use_depthwise=False))
image_feature = tf.random_uniform(
[64, 17, 19, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
_ = class_prediction_head(image_feature)
variables = g.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
self.assertEqual(len(variables), 2)
if __name__ == '__main__':
tf.test.main()
@@ -19,9 +19,11 @@ Contains Mask prediction head classes for different meta architectures.
All the mask prediction heads have a predict function that receives the
`features` as the first argument and returns `mask_predictions`.
"""
import math
import tensorflow as tf
from object_detection.predictors.heads import head
from object_detection.utils import ops
class ConvolutionalMaskHead(head.KerasHead):
@@ -156,3 +158,281 @@ class ConvolutionalMaskHead(head.KerasHead):
mask_predictions,
[batch_size, -1, self._num_masks, self._mask_height, self._mask_width])
return mask_predictions
class MaskRCNNMaskHead(head.KerasHead):
"""Mask RCNN mask prediction head.
This is a piece of Mask RCNN which is responsible for predicting
just the pixelwise foreground scores for regions within the boxes.
Please refer to Mask RCNN paper:
https://arxiv.org/abs/1703.06870
"""
def __init__(self,
is_training,
num_classes,
freeze_batchnorm,
conv_hyperparams,
mask_height=14,
mask_width=14,
mask_prediction_num_conv_layers=2,
mask_prediction_conv_depth=256,
masks_are_class_agnostic=False,
convolve_then_upsample=False,
name=None):
"""Constructor.
Args:
is_training: Indicates whether the Mask head is in training mode.
num_classes: number of classes. Note that num_classes *does not*
include the background category, so if groundtruth labels take values
in {0, 1, .., K-1}, num_classes=K (and not K+1, even though the
assigned classification targets can range from {0,... K}).
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
mask_height: Desired output mask height. The default value is 14.
mask_width: Desired output mask width. The default value is 14.
mask_prediction_num_conv_layers: Number of convolution layers applied to
the image_features in mask prediction branch.
mask_prediction_conv_depth: The depth for the first conv2d_transpose op
applied to the image_features in the mask prediction branch. If set
to 0, the depth of the convolution layers will be automatically chosen
based on the number of object classes and the number of channels in the
image features.
masks_are_class_agnostic: Boolean determining if the mask-head is
class-agnostic or not.
convolve_then_upsample: Whether to apply convolutions on mask features
before upsampling using nearest neighbor resizing. Otherwise, mask
features are resized to [`mask_height`, `mask_width`] using bilinear
resizing before applying convolutions.
name: A string name scope to assign to the mask head. If `None`, Keras
will auto-generate one from the class name.
"""
super(MaskRCNNMaskHead, self).__init__(name=name)
self._is_training = is_training
self._freeze_batchnorm = freeze_batchnorm
self._num_classes = num_classes
self._conv_hyperparams = conv_hyperparams
self._mask_height = mask_height
self._mask_width = mask_width
self._mask_prediction_num_conv_layers = mask_prediction_num_conv_layers
self._mask_prediction_conv_depth = mask_prediction_conv_depth
self._masks_are_class_agnostic = masks_are_class_agnostic
self._convolve_then_upsample = convolve_then_upsample
self._mask_predictor_layers = []
def build(self, input_shapes):
num_conv_channels = self._mask_prediction_conv_depth
if num_conv_channels == 0:
num_feature_channels = input_shapes.as_list()[3]
num_conv_channels = self._get_mask_predictor_conv_depth(
num_feature_channels, self._num_classes)
for i in range(self._mask_prediction_num_conv_layers - 1):
self._mask_predictor_layers.append(
tf.keras.layers.Conv2D(
num_conv_channels,
[3, 3],
padding='SAME',
name='MaskPredictor_conv2d_{}'.format(i),
**self._conv_hyperparams.params()))
self._mask_predictor_layers.append(
self._conv_hyperparams.build_batch_norm(
training=(self._is_training and not self._freeze_batchnorm),
name='MaskPredictor_batchnorm_{}'.format(i)))
self._mask_predictor_layers.append(
self._conv_hyperparams.build_activation_layer(
name='MaskPredictor_activation_{}'.format(i)))
if self._convolve_then_upsample:
# Replace Transposed Convolution with a Nearest Neighbor upsampling step
# followed by 3x3 convolution.
height_scale = self._mask_height / input_shapes[1].value
width_scale = self._mask_width / input_shapes[2].value
# pylint: disable=g-long-lambda
self._mask_predictor_layers.append(tf.keras.layers.Lambda(
lambda features: ops.nearest_neighbor_upsampling(
features, height_scale=height_scale, width_scale=width_scale)
))
# pylint: enable=g-long-lambda
self._mask_predictor_layers.append(
tf.keras.layers.Conv2D(
num_conv_channels,
[3, 3],
padding='SAME',
name='MaskPredictor_upsample_conv2d',
**self._conv_hyperparams.params()))
self._mask_predictor_layers.append(
self._conv_hyperparams.build_batch_norm(
training=(self._is_training and not self._freeze_batchnorm),
name='MaskPredictor_upsample_batchnorm'))
self._mask_predictor_layers.append(
self._conv_hyperparams.build_activation_layer(
name='MaskPredictor_upsample_activation'))
num_masks = 1 if self._masks_are_class_agnostic else self._num_classes
self._mask_predictor_layers.append(
tf.keras.layers.Conv2D(
num_masks,
[3, 3],
padding='SAME',
name='MaskPredictor_last_conv2d',
**self._conv_hyperparams.params(use_bias=True)))
self.built = True
def _get_mask_predictor_conv_depth(self,
num_feature_channels,
num_classes,
class_weight=3.0,
feature_weight=2.0):
"""Computes the depth of the mask predictor convolutions.
Computes the depth of the mask predictor convolutions given feature channels
and number of classes by performing a weighted average of the two in
log space to compute the number of convolution channels. The weights that
are used for computing the weighted average do not need to sum to 1.
Args:
num_feature_channels: An integer containing the number of feature
channels.
num_classes: An integer containing the number of classes.
class_weight: Class weight used in computing the weighted average.
feature_weight: Feature weight used in computing the weighted average.
Returns:
An integer containing the number of convolution channels used by mask
predictor.
"""
num_feature_channels_log = math.log(float(num_feature_channels), 2.0)
num_classes_log = math.log(float(num_classes), 2.0)
weighted_num_feature_channels_log = (
num_feature_channels_log * feature_weight)
weighted_num_classes_log = num_classes_log * class_weight
total_weight = feature_weight + class_weight
num_conv_channels_log = round(
(weighted_num_feature_channels_log + weighted_num_classes_log) /
total_weight)
return int(math.pow(2.0, num_conv_channels_log))
def _predict(self, features):
"""Predicts pixelwise foreground scores for regions within the boxes.
Args:
features: A float tensor of shape [batch_size, height, width, channels]
containing features for a batch of images.
Returns:
instance_masks: A float tensor of shape
[batch_size, 1, num_classes, mask_height, mask_width].
"""
if not self._convolve_then_upsample:
features = tf.image.resize_bilinear(
features, [self._mask_height, self._mask_width],
align_corners=True)
mask_predictions = features
for layer in self._mask_predictor_layers:
mask_predictions = layer(mask_predictions)
return tf.expand_dims(
tf.transpose(mask_predictions, perm=[0, 3, 1, 2]),
axis=1,
name='MaskPredictor')
class WeightSharedConvolutionalMaskHead(head.KerasHead):
"""Weight shared convolutional mask prediction head based on Keras."""
def __init__(self,
num_classes,
num_predictions_per_location,
conv_hyperparams,
kernel_size=3,
use_dropout=False,
dropout_keep_prob=0.8,
mask_height=7,
mask_width=7,
masks_are_class_agnostic=False,
name=None):
"""Constructor.
Args:
num_classes: number of classes. Note that num_classes *does not*
include the background category, so if groundtruth labels take values
in {0, 1, .., K-1}, num_classes=K (and not K+1, even though the
assigned classification targets can range from {0,... K}).
num_predictions_per_location: Number of box predictions to be made per
spatial location. Int specifying number of boxes per location.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
kernel_size: Size of final convolution kernel.
use_dropout: Whether to apply dropout to class prediction head.
dropout_keep_prob: Probability of keeping activations.
mask_height: Desired output mask height. The default value is 7.
mask_width: Desired output mask width. The default value is 7.
masks_are_class_agnostic: Boolean determining if the mask-head is
class-agnostic or not.
name: A string name scope to assign to the model. If `None`, Keras
will auto-generate one from the class name.
"""
super(WeightSharedConvolutionalMaskHead, self).__init__(name=name)
self._num_classes = num_classes
self._num_predictions_per_location = num_predictions_per_location
self._kernel_size = kernel_size
self._use_dropout = use_dropout
self._dropout_keep_prob = dropout_keep_prob
self._mask_height = mask_height
self._mask_width = mask_width
self._masks_are_class_agnostic = masks_are_class_agnostic
self._mask_predictor_layers = []
if self._masks_are_class_agnostic:
self._num_masks = 1
else:
self._num_masks = self._num_classes
num_mask_channels = self._num_masks * self._mask_height * self._mask_width
if self._use_dropout:
self._mask_predictor_layers.append(
tf.keras.layers.Dropout(rate=1.0 - self._dropout_keep_prob))
self._mask_predictor_layers.append(
tf.keras.layers.Conv2D(
num_predictions_per_location * num_mask_channels,
[self._kernel_size, self._kernel_size],
padding='SAME',
name='MaskPredictor',
**conv_hyperparams.params(use_bias=True)))
def _predict(self, features):
"""Predicts boxes.
Args:
features: A float tensor of shape [batch_size, height, width, channels]
containing image features.
Returns:
mask_predictions: A float tensor of shape
[batch_size, num_anchors, num_masks, mask_height, mask_width]
representing the mask predictions for the proposals, where num_masks
is 1 if masks_are_class_agnostic is true and num_classes otherwise.
"""
mask_predictions = features
for layer in self._mask_predictor_layers:
mask_predictions = layer(mask_predictions)
batch_size = features.get_shape().as_list()[0]
if batch_size is None:
batch_size = tf.shape(features)[0]
mask_predictions = tf.reshape(
mask_predictions,
[batch_size, -1, self._num_masks, self._mask_height, self._mask_width])
return mask_predictions
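The reshape at the end of `_predict` collapses the spatial grid and predictions-per-location into a single anchors dimension. A NumPy sketch with the sizes used in the tests below (hypothetical values):

```python
import numpy as np

# A 17x19 feature map with one prediction per location and 7x7 masks for
# 20 classes, mirroring the weight-shared mask head test.
batch, h, w = 64, 17, 19
preds_per_loc, num_masks, mask_h, mask_w = 1, 20, 7, 7

conv_out = np.zeros(
    (batch, h, w, preds_per_loc * num_masks * mask_h * mask_w),
    dtype=np.float32)

# The reshape collapses (h, w, predictions_per_location) into one anchors
# dimension: 17 * 19 * 1 = 323 anchors.
masks = conv_out.reshape(batch, -1, num_masks, mask_h, mask_w)
print(masks.shape)  # → (64, 323, 20, 7, 7)
```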
@@ -123,5 +123,107 @@ class ConvolutionalMaskPredictorTest(test_case.TestCase):
self.assertAllEqual([64, 323, 1, 7, 7],
mask_predictions.get_shape().as_list())
class MaskRCNNMaskHeadTest(test_case.TestCase):
def _build_conv_hyperparams(self,
op_type=hyperparams_pb2.Hyperparams.CONV):
hyperparams = hyperparams_pb2.Hyperparams()
hyperparams_text_proto = """
activation: NONE
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(hyperparams_text_proto, hyperparams)
hyperparams.op = op_type
return hyperparams_builder.KerasLayerHyperparams(hyperparams)
def test_prediction_size(self):
mask_prediction_head = keras_mask_head.MaskRCNNMaskHead(
is_training=True,
num_classes=20,
conv_hyperparams=self._build_conv_hyperparams(),
freeze_batchnorm=False,
mask_height=14,
mask_width=14,
mask_prediction_num_conv_layers=2,
mask_prediction_conv_depth=256,
masks_are_class_agnostic=False)
roi_pooled_features = tf.random_uniform(
[64, 7, 7, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
prediction = mask_prediction_head(roi_pooled_features)
self.assertAllEqual([64, 1, 20, 14, 14], prediction.get_shape().as_list())
def test_prediction_size_with_convolve_then_upsample(self):
mask_prediction_head = keras_mask_head.MaskRCNNMaskHead(
is_training=True,
num_classes=20,
conv_hyperparams=self._build_conv_hyperparams(),
freeze_batchnorm=False,
mask_height=28,
mask_width=28,
mask_prediction_num_conv_layers=2,
mask_prediction_conv_depth=256,
masks_are_class_agnostic=True,
convolve_then_upsample=True)
roi_pooled_features = tf.random_uniform(
[64, 14, 14, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
prediction = mask_prediction_head(roi_pooled_features)
self.assertAllEqual([64, 1, 1, 28, 28], prediction.get_shape().as_list())
class WeightSharedConvolutionalMaskPredictorTest(test_case.TestCase):
def _build_conv_hyperparams(self):
conv_hyperparams = hyperparams_pb2.Hyperparams()
conv_hyperparams_text_proto = """
activation: NONE
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams)
return hyperparams_builder.KerasLayerHyperparams(conv_hyperparams)
def test_prediction_size(self):
mask_prediction_head = (
keras_mask_head.WeightSharedConvolutionalMaskHead(
num_classes=20,
num_predictions_per_location=1,
conv_hyperparams=self._build_conv_hyperparams(),
mask_height=7,
mask_width=7))
image_feature = tf.random_uniform(
[64, 17, 19, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
mask_predictions = mask_prediction_head(image_feature)
self.assertAllEqual([64, 323, 20, 7, 7],
mask_predictions.get_shape().as_list())
def test_class_agnostic_prediction_size(self):
mask_prediction_head = (
keras_mask_head.WeightSharedConvolutionalMaskHead(
num_classes=20,
num_predictions_per_location=1,
conv_hyperparams=self._build_conv_hyperparams(),
mask_height=7,
mask_width=7,
masks_are_class_agnostic=True))
image_feature = tf.random_uniform(
[64, 17, 19, 1024], minval=-10.0, maxval=10.0, dtype=tf.float32)
mask_predictions = mask_prediction_head(image_feature)
self.assertAllEqual([64, 323, 1, 7, 7],
mask_predictions.get_shape().as_list())
if __name__ == '__main__':
tf.test.main()
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Mask R-CNN Box Predictor."""
from object_detection.core import box_predictor
BOX_ENCODINGS = box_predictor.BOX_ENCODINGS
CLASS_PREDICTIONS_WITH_BACKGROUND = (
box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND)
MASK_PREDICTIONS = box_predictor.MASK_PREDICTIONS
class MaskRCNNKerasBoxPredictor(box_predictor.KerasBoxPredictor):
"""Mask R-CNN Box Predictor.
See Mask R-CNN: He, K., Gkioxari, G., Dollar, P., & Girshick, R. (2017).
Mask R-CNN. arXiv preprint arXiv:1703.06870.
This is used for the second stage of the Mask R-CNN detector where proposals
cropped from an image are arranged along the batch dimension of the input
image_features tensor. Notice that locations are *not* shared across classes,
thus for each anchor, a separate prediction is made for each class.
In addition to predicting boxes and classes, optionally this class allows
predicting masks and/or keypoints inside detection boxes.
Currently this box predictor makes per-class predictions; that is, each
anchor makes a separate box prediction for each class.
"""
def __init__(self,
is_training,
num_classes,
freeze_batchnorm,
box_prediction_head,
class_prediction_head,
third_stage_heads,
name=None):
"""Constructor.
Args:
is_training: Indicates whether the BoxPredictor is in training mode.
num_classes: number of classes. Note that num_classes *does not*
include the background category, so if groundtruth labels take values
in {0, 1, .., K-1}, num_classes=K (and not K+1, even though the
assigned classification targets can range from {0,... K}).
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
box_prediction_head: The head that predicts the boxes in second stage.
class_prediction_head: The head that predicts the classes in second stage.
third_stage_heads: A dictionary mapping head names to mask rcnn head
classes.
name: A string name scope to assign to the model. If `None`, Keras
will auto-generate one from the class name.
"""
super(MaskRCNNKerasBoxPredictor, self).__init__(
is_training, num_classes, freeze_batchnorm=freeze_batchnorm,
inplace_batchnorm_update=False, name=name)
self._box_prediction_head = box_prediction_head
self._class_prediction_head = class_prediction_head
self._third_stage_heads = third_stage_heads
@property
def num_classes(self):
return self._num_classes
def get_second_stage_prediction_heads(self):
return BOX_ENCODINGS, CLASS_PREDICTIONS_WITH_BACKGROUND
def get_third_stage_prediction_heads(self):
return sorted(self._third_stage_heads.keys())
def _predict(self,
image_features,
prediction_stage=2):
"""Optionally computes encoded object locations, confidences, and masks.
Predicts the heads belonging to the given prediction stage.
Args:
image_features: A list of float tensors of shape
[batch_size, height_i, width_i, channels_i] containing roi pooled
features for each image. The length of the list should be 1 otherwise
a ValueError will be raised.
prediction_stage: Prediction stage. Acceptable values are 2 and 3.
Returns:
A dictionary containing the predicted tensors that are listed in
self._prediction_heads. A subset of the following keys will exist in the
dictionary:
BOX_ENCODINGS: A float tensor of shape
[batch_size, 1, num_classes, code_size] representing the
location of the objects.
CLASS_PREDICTIONS_WITH_BACKGROUND: A float tensor of shape
[batch_size, 1, num_classes + 1] representing the class
predictions for the proposals.
MASK_PREDICTIONS: A float tensor of shape
[batch_size, 1, num_classes, mask_height, mask_width] representing
the mask predictions for the proposals.
Raises:
ValueError: If num_predictions_per_location is not 1 or if
len(image_features) is not 1.
ValueError: if prediction_stage is not 2 or 3.
"""
if len(image_features) != 1:
raise ValueError('length of `image_features` must be 1. Found {}'.format(
len(image_features)))
image_feature = image_features[0]
predictions_dict = {}
if prediction_stage == 2:
predictions_dict[BOX_ENCODINGS] = self._box_prediction_head(image_feature)
predictions_dict[CLASS_PREDICTIONS_WITH_BACKGROUND] = (
self._class_prediction_head(image_feature))
elif prediction_stage == 3:
for prediction_head in self.get_third_stage_prediction_heads():
head_object = self._third_stage_heads[prediction_head]
predictions_dict[prediction_head] = head_object(image_feature)
else:
raise ValueError('prediction_stage should be either 2 or 3.')
return predictions_dict
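The stage dispatch above can be exercised without TensorFlow. This is a minimal sketch with stand-in callables instead of real Keras heads (the head names and lambdas are hypothetical):

```python
BOX_ENCODINGS = 'box_encodings'
CLASS_PREDICTIONS_WITH_BACKGROUND = 'class_predictions_with_background'

def predict(image_features, box_head, class_head, third_stage_heads,
            prediction_stage=2):
    # Mirrors the predictor contract: exactly one feature map, and the
    # returned dict's keys depend on the requested prediction stage.
    if len(image_features) != 1:
        raise ValueError('length of `image_features` must be 1.')
    feature = image_features[0]
    predictions = {}
    if prediction_stage == 2:
        predictions[BOX_ENCODINGS] = box_head(feature)
        predictions[CLASS_PREDICTIONS_WITH_BACKGROUND] = class_head(feature)
    elif prediction_stage == 3:
        for name in sorted(third_stage_heads):
            predictions[name] = third_stage_heads[name](feature)
    else:
        raise ValueError('prediction_stage should be either 2 or 3.')
    return predictions

out = predict(['feat'], box_head=lambda f: 'boxes',
              class_head=lambda f: 'classes',
              third_stage_heads={'mask_predictions': lambda f: 'masks'},
              prediction_stage=3)
print(sorted(out))  # → ['mask_predictions']
```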
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for object_detection.predictors.mask_rcnn_box_predictor."""
import numpy as np
import tensorflow as tf
from google.protobuf import text_format
from object_detection.builders import box_predictor_builder
from object_detection.builders import hyperparams_builder
from object_detection.predictors import mask_rcnn_keras_box_predictor as box_predictor
from object_detection.protos import hyperparams_pb2
from object_detection.utils import test_case
class MaskRCNNKerasBoxPredictorTest(test_case.TestCase):
def _build_hyperparams(self,
op_type=hyperparams_pb2.Hyperparams.FC):
hyperparams = hyperparams_pb2.Hyperparams()
hyperparams_text_proto = """
activation: NONE
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(hyperparams_text_proto, hyperparams)
hyperparams.op = op_type
return hyperparams_builder.KerasLayerHyperparams(hyperparams)
def test_get_boxes_with_five_classes(self):
def graph_fn(image_features):
mask_box_predictor = (
box_predictor_builder.build_mask_rcnn_keras_box_predictor(
is_training=False,
num_classes=5,
fc_hyperparams=self._build_hyperparams(),
freeze_batchnorm=False,
use_dropout=False,
dropout_keep_prob=0.5,
box_code_size=4,
))
box_predictions = mask_box_predictor(
[image_features],
prediction_stage=2)
return (box_predictions[box_predictor.BOX_ENCODINGS],
box_predictions[box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND])
image_features = np.random.rand(2, 7, 7, 3).astype(np.float32)
(box_encodings,
class_predictions_with_background) = self.execute(graph_fn,
[image_features])
self.assertAllEqual(box_encodings.shape, [2, 1, 5, 4])
self.assertAllEqual(class_predictions_with_background.shape, [2, 1, 6])
def test_get_boxes_with_five_classes_share_box_across_classes(self):
def graph_fn(image_features):
mask_box_predictor = (
box_predictor_builder.build_mask_rcnn_keras_box_predictor(
is_training=False,
num_classes=5,
fc_hyperparams=self._build_hyperparams(),
freeze_batchnorm=False,
use_dropout=False,
dropout_keep_prob=0.5,
box_code_size=4,
share_box_across_classes=True
))
box_predictions = mask_box_predictor(
[image_features],
prediction_stage=2)
return (box_predictions[box_predictor.BOX_ENCODINGS],
box_predictions[box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND])
image_features = np.random.rand(2, 7, 7, 3).astype(np.float32)
(box_encodings,
class_predictions_with_background) = self.execute(graph_fn,
[image_features])
self.assertAllEqual(box_encodings.shape, [2, 1, 1, 4])
self.assertAllEqual(class_predictions_with_background.shape, [2, 1, 6])
def test_get_instance_masks(self):
def graph_fn(image_features):
mask_box_predictor = (
box_predictor_builder.build_mask_rcnn_keras_box_predictor(
is_training=False,
num_classes=5,
fc_hyperparams=self._build_hyperparams(),
freeze_batchnorm=False,
use_dropout=False,
dropout_keep_prob=0.5,
box_code_size=4,
conv_hyperparams=self._build_hyperparams(
op_type=hyperparams_pb2.Hyperparams.CONV),
predict_instance_masks=True))
box_predictions = mask_box_predictor(
[image_features],
prediction_stage=3)
return (box_predictions[box_predictor.MASK_PREDICTIONS],)
image_features = np.random.rand(2, 7, 7, 3).astype(np.float32)
mask_predictions = self.execute(graph_fn, [image_features])
self.assertAllEqual(mask_predictions.shape, [2, 1, 5, 14, 14])
def test_do_not_return_instance_masks_without_request(self):
image_features = tf.random_uniform([2, 7, 7, 3], dtype=tf.float32)
mask_box_predictor = (
box_predictor_builder.build_mask_rcnn_keras_box_predictor(
is_training=False,
num_classes=5,
fc_hyperparams=self._build_hyperparams(),
freeze_batchnorm=False,
use_dropout=False,
dropout_keep_prob=0.5,
box_code_size=4))
box_predictions = mask_box_predictor(
[image_features],
prediction_stage=2)
self.assertEqual(len(box_predictions), 2)
self.assertTrue(box_predictor.BOX_ENCODINGS in box_predictions)
self.assertTrue(box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND
in box_predictions)
if __name__ == '__main__':
tf.test.main()
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""RFCN Box Predictor."""
import tensorflow as tf
from object_detection.core import box_predictor
from object_detection.utils import ops
BOX_ENCODINGS = box_predictor.BOX_ENCODINGS
CLASS_PREDICTIONS_WITH_BACKGROUND = (
box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND)
MASK_PREDICTIONS = box_predictor.MASK_PREDICTIONS
class RfcnKerasBoxPredictor(box_predictor.KerasBoxPredictor):
"""RFCN Box Predictor.
Applies a position sensitive ROI pooling on position sensitive feature maps to
predict classes and refined locations. See https://arxiv.org/abs/1605.06409
for details.
This is used for the second stage of the RFCN meta architecture. Notice that
locations are *not* shared across classes, thus for each anchor, a separate
prediction is made for each class.
"""
def __init__(self,
is_training,
num_classes,
conv_hyperparams,
freeze_batchnorm,
num_spatial_bins,
depth,
crop_size,
box_code_size,
name=None):
"""Constructor.
Args:
is_training: Indicates whether the BoxPredictor is in training mode.
num_classes: number of classes. Note that num_classes *does not*
include the background category, so if groundtruth labels take values
in {0, 1, .., K-1}, num_classes=K (and not K+1, even though the
assigned classification targets can range from {0,... K}).
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
num_spatial_bins: A list of two integers `[spatial_bins_y,
spatial_bins_x]`.
depth: Target depth to reduce the input feature maps to.
crop_size: A list of two integers `[crop_height, crop_width]`.
box_code_size: Size of encoding for each box.
name: A string name scope to assign to the box predictor. If `None`, Keras
will auto-generate one from the class name.
"""
super(RfcnKerasBoxPredictor, self).__init__(
is_training, num_classes, freeze_batchnorm=freeze_batchnorm,
inplace_batchnorm_update=False, name=name)
self._freeze_batchnorm = freeze_batchnorm
self._conv_hyperparams = conv_hyperparams
self._num_spatial_bins = num_spatial_bins
self._depth = depth
self._crop_size = crop_size
self._box_code_size = box_code_size
# Build the shared layers used for both heads
self._shared_conv_layers = []
self._shared_conv_layers.append(
tf.keras.layers.Conv2D(
self._depth,
[1, 1],
padding='SAME',
name='reduce_depth_conv',
**self._conv_hyperparams.params()))
self._shared_conv_layers.append(
self._conv_hyperparams.build_batch_norm(
training=(self._is_training and not self._freeze_batchnorm),
name='reduce_depth_batchnorm'))
self._shared_conv_layers.append(
self._conv_hyperparams.build_activation_layer(
name='reduce_depth_activation'))
self._box_encoder_layers = []
location_feature_map_depth = (self._num_spatial_bins[0] *
self._num_spatial_bins[1] *
self.num_classes *
self._box_code_size)
self._box_encoder_layers.append(
tf.keras.layers.Conv2D(
location_feature_map_depth,
[1, 1],
padding='SAME',
name='refined_locations_conv',
**self._conv_hyperparams.params()))
self._box_encoder_layers.append(
self._conv_hyperparams.build_batch_norm(
training=(self._is_training and not self._freeze_batchnorm),
name='refined_locations_batchnorm'))
self._class_predictor_layers = []
self._total_classes = self.num_classes + 1 # Account for background class.
class_feature_map_depth = (self._num_spatial_bins[0] *
self._num_spatial_bins[1] *
self._total_classes)
self._class_predictor_layers.append(
tf.keras.layers.Conv2D(
class_feature_map_depth,
[1, 1],
padding='SAME',
name='class_predictions_conv',
**self._conv_hyperparams.params()))
self._class_predictor_layers.append(
self._conv_hyperparams.build_batch_norm(
training=(self._is_training and not self._freeze_batchnorm),
name='class_predictions_batchnorm'))
@property
def num_classes(self):
return self._num_classes
def _predict(self, image_features, proposal_boxes):
"""Computes encoded object locations and corresponding confidences.
Args:
image_features: A list of float tensors of shape [batch_size, height_i,
width_i, channels_i] containing features for a batch of images.
proposal_boxes: A float tensor of shape [batch_size, num_proposals,
box_code_size].
Returns:
box_encodings: A list of float tensors of shape
[batch_size, num_anchors_i, q, code_size] representing the location of
the objects, where q is 1 or the number of classes. Each entry in the
list corresponds to a feature map in the input `image_features` list.
class_predictions_with_background: A list of float tensors of shape
[batch_size, num_anchors_i, num_classes + 1] representing the class
predictions for the proposals. Each entry in the list corresponds to a
feature map in the input `image_features` list.
Raises:
ValueError: if num_predictions_per_location is not 1 or if
len(image_features) is not 1.
"""
if len(image_features) != 1:
raise ValueError('length of `image_features` must be 1. Found {}'.
format(len(image_features)))
image_feature = image_features[0]
batch_size = tf.shape(proposal_boxes)[0]
num_boxes = tf.shape(proposal_boxes)[1]
net = image_feature
for layer in self._shared_conv_layers:
net = layer(net)
# Location predictions.
box_net = net
for layer in self._box_encoder_layers:
box_net = layer(box_net)
box_encodings = ops.batch_position_sensitive_crop_regions(
box_net,
boxes=proposal_boxes,
crop_size=self._crop_size,
num_spatial_bins=self._num_spatial_bins,
global_pool=True)
box_encodings = tf.squeeze(box_encodings, axis=[2, 3])
box_encodings = tf.reshape(box_encodings,
[batch_size * num_boxes, 1, self.num_classes,
self._box_code_size])
# Class predictions.
class_net = net
for layer in self._class_predictor_layers:
class_net = layer(class_net)
class_predictions_with_background = (
ops.batch_position_sensitive_crop_regions(
class_net,
boxes=proposal_boxes,
crop_size=self._crop_size,
num_spatial_bins=self._num_spatial_bins,
global_pool=True))
class_predictions_with_background = tf.squeeze(
class_predictions_with_background, axis=[2, 3])
class_predictions_with_background = tf.reshape(
class_predictions_with_background,
[batch_size * num_boxes, 1, self._total_classes])
return {BOX_ENCODINGS: [box_encodings],
CLASS_PREDICTIONS_WITH_BACKGROUND:
[class_predictions_with_background]}
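The 1x1 conv depths in the RFCN predictor follow directly from position-sensitive cropping: each spatial bin carries its own slice of the predictions. A sketch of the arithmetic, using the values from the test below (illustrative):

```python
# Position-sensitive feature map depths for RFCN: each of the
# spatial_bins_y * spatial_bins_x bins gets its own copy of the outputs.
num_spatial_bins = (3, 3)
num_classes = 2
box_code_size = 4

location_depth = (num_spatial_bins[0] * num_spatial_bins[1] *
                  num_classes * box_code_size)
class_depth = (num_spatial_bins[0] * num_spatial_bins[1] *
               (num_classes + 1))  # +1 accounts for the background class.

print(location_depth, class_depth)  # → 72 27
```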
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for object_detection.predictors.rfcn_box_predictor."""
import numpy as np
import tensorflow as tf
from google.protobuf import text_format
from object_detection.builders import hyperparams_builder
from object_detection.predictors import rfcn_keras_box_predictor as box_predictor
from object_detection.protos import hyperparams_pb2
from object_detection.utils import test_case
class RfcnKerasBoxPredictorTest(test_case.TestCase):
def _build_conv_hyperparams(self):
conv_hyperparams = hyperparams_pb2.Hyperparams()
conv_hyperparams_text_proto = """
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams)
return hyperparams_builder.KerasLayerHyperparams(conv_hyperparams)
def test_get_correct_box_encoding_and_class_prediction_shapes(self):
def graph_fn(image_features, proposal_boxes):
rfcn_box_predictor = box_predictor.RfcnKerasBoxPredictor(
is_training=False,
num_classes=2,
conv_hyperparams=self._build_conv_hyperparams(),
freeze_batchnorm=False,
num_spatial_bins=[3, 3],
depth=4,
crop_size=[12, 12],
box_code_size=4
)
box_predictions = rfcn_box_predictor(
[image_features],
proposal_boxes=proposal_boxes)
box_encodings = tf.concat(
box_predictions[box_predictor.BOX_ENCODINGS], axis=1)
class_predictions_with_background = tf.concat(
box_predictions[box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND],
axis=1)
return (box_encodings, class_predictions_with_background)
image_features = np.random.rand(4, 8, 8, 64).astype(np.float32)
proposal_boxes = np.random.rand(4, 2, 4).astype(np.float32)
(box_encodings, class_predictions_with_background) = self.execute(
graph_fn, [image_features, proposal_boxes])
self.assertAllEqual(box_encodings.shape, [8, 1, 2, 4])
self.assertAllEqual(class_predictions_with_background.shape, [8, 1, 3])
if __name__ == '__main__':
tf.test.main()
@@ -2,9 +2,10 @@ syntax = "proto2";
package object_detection.protos;
import "object_detection/protos/flexible_grid_anchor_generator.proto";
import "object_detection/protos/grid_anchor_generator.proto";
import "object_detection/protos/ssd_anchor_generator.proto";
import "object_detection/protos/multiscale_anchor_generator.proto";
import "object_detection/protos/ssd_anchor_generator.proto";
// Configuration proto for the anchor generator to use in the object detection
// pipeline. See core/anchor_generator.py for details.
@@ -13,5 +14,6 @@ message AnchorGenerator {
GridAnchorGenerator grid_anchor_generator = 1;
SsdAnchorGenerator ssd_anchor_generator = 2;
MultiscaleAnchorGenerator multiscale_anchor_generator = 3;
FlexibleGridAnchorGenerator flexible_grid_anchor_generator = 4;
}
}
@@ -15,7 +15,6 @@ message BoxPredictor {
}
}
// Configuration proto for Convolutional box predictor.
// Next id: 13
message ConvolutionalBoxPredictor {
@@ -57,6 +56,13 @@ message ConvolutionalBoxPredictor {
// Whether to use depthwise separable convolution for box predictor layers.
optional bool use_depthwise = 11 [default = false];
// If specified, apply clipping to box encodings.
message BoxEncodingsClipRange {
optional float min = 1;
optional float max = 2;
}
optional BoxEncodingsClipRange box_encodings_clip_range = 12;
}
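A sketch of how a `BoxEncodingsClipRange { min: -10 max: 10 }` setting could be applied to raw box encodings; the clip bounds and values here are illustrative, not defaults from the proto:

```python
import numpy as np

# Clip raw box encodings into the configured [min, max] range, as the
# box_encodings_clip_range option enables for TPU compatibility.
encodings = np.array([-25.0, -3.2, 0.0, 7.9, 40.0])
clipped = np.clip(encodings, -10.0, 10.0)
print(clipped.tolist())  # → [-10.0, -3.2, 0.0, 7.9, 10.0]
```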
// Configuration proto for weight shared convolutional box predictor.
@@ -118,6 +124,8 @@ message WeightSharedConvolutionalBoxPredictor {
optional BoxEncodingsClipRange box_encodings_clip_range = 17;
}
// TODO(alirezafathi): Refactor the proto file to be able to configure mask rcnn
// head easily.
// Next id: 15
syntax = "proto2";
package object_detection.protos;
message FlexibleGridAnchorGenerator {
repeated AnchorGrid anchor_grid = 1;
// Whether to produce anchors in normalized coordinates.
optional bool normalize_coordinates = 2 [default = true];
}
message AnchorGrid {
// The base sizes in pixels for each anchor in this anchor layer.
repeated float base_sizes = 1;
// The aspect ratios for each anchor in this anchor layer.
repeated float aspect_ratios = 2;
// The anchor height stride in pixels.
optional uint32 height_stride = 3;
// The anchor width stride in pixels.
optional uint32 width_stride = 4;
// The anchor height offset in pixels.
optional uint32 height_offset = 5 [default = 0];
// The anchor width offset in pixels.
optional uint32 width_offset = 6 [default = 0];
}
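The stride and offset fields above determine where anchor centers fall on a feature map. A sketch with illustrative values (grid size and strides are hypothetical):

```python
# Anchor centers from an AnchorGrid's stride/offset fields: each grid cell
# (y, x) maps to a pixel-space center at offset + index * stride.
height_stride, width_stride = 16, 16
height_offset, width_offset = 0, 0
grid_height, grid_width = 3, 3

centers = [(height_offset + y * height_stride,
            width_offset + x * width_stride)
           for y in range(grid_height) for x in range(grid_width)]
print(centers[:3])  # → [(0, 0), (0, 16), (0, 32)]
```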
@@ -20,4 +20,7 @@ message Quantization {
// Number of bits to use for quantizing activations.
// Only 8 bit is supported for now.
optional int32 activation_bits = 3 [default = 8];
// Whether to use symmetric weight quantization.
optional bool symmetric = 4 [default = false];
}
@@ -9,6 +9,7 @@ message ImageResizer {
KeepAspectRatioResizer keep_aspect_ratio_resizer = 1;
FixedShapeResizer fixed_shape_resizer = 2;
IdentityResizer identity_resizer = 3;
ConditionalShapeResizer conditional_shape_resizer = 4;
}
}
@@ -61,3 +62,31 @@ message FixedShapeResizer {
// Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
optional bool convert_to_grayscale = 4 [default = false];
}
// Configuration proto for image resizer that resizes only if input image height
// or width is greater or smaller than a certain size.
// Aspect ratio is maintained.
message ConditionalShapeResizer {
// Enumeration for the condition on which to resize an image.
enum ResizeCondition {
INVALID = 0; // Default value.
GREATER = 1; // Resizes image if a dimension is greater than specified size.
SMALLER = 2; // Resizes image if a dimension is smaller than specified size.
}
// Condition which must be true to resize the image.
optional ResizeCondition condition = 1 [default = GREATER];
// Threshold for the image size. If any image dimension is above or below this
// (as specified by condition) the image will be resized so that it meets the
// threshold.
optional int32 size_threshold = 2 [default = 300];
// Desired method when resizing image.
optional ResizeType resize_method = 3 [default = BILINEAR];
// Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
optional bool convert_to_grayscale = 4 [default = false];
}
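The resizer's semantics can be sketched as follows: resize only when the relevant dimension crosses `size_threshold`, scaling both sides to preserve aspect ratio. This is an assumed reading of the proto (GREATER checks the largest dimension, SMALLER the smallest), not the library implementation:

```python
def conditional_resize(height, width, size_threshold=300, condition='GREATER'):
    """Returns the (height, width) a ConditionalShapeResizer would produce."""
    if condition == 'GREATER':
        largest = max(height, width)
        if largest <= size_threshold:
            return height, width      # Already within bounds; no resize.
        scale = size_threshold / largest
    elif condition == 'SMALLER':
        smallest = min(height, width)
        if smallest >= size_threshold:
            return height, width
        scale = size_threshold / smallest
    else:
        raise ValueError('condition must be GREATER or SMALLER')
    return round(height * scale), round(width * scale)

print(conditional_resize(600, 400))  # → (300, 200)
print(conditional_resize(200, 100))  # → (200, 100), unchanged
```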
syntax = "proto2";
package object_detection.protos;
import "object_detection/protos/anchor_generator.proto";
@@ -6,15 +7,14 @@ import "object_detection/protos/box_coder.proto";
import "object_detection/protos/box_predictor.proto";
import "object_detection/protos/hyperparams.proto";
import "object_detection/protos/image_resizer.proto";
import "object_detection/protos/matcher.proto";
import "object_detection/protos/losses.proto";
import "object_detection/protos/matcher.proto";
import "object_detection/protos/post_processing.proto";
import "object_detection/protos/region_similarity_calculator.proto";
// Configuration for Single Shot Detection (SSD) models.
// Next id: 26
message Ssd {
// Number of classes to predict.
optional int32 num_classes = 1;
@@ -114,8 +114,8 @@ message Ssd {
// features and the number of classes.
optional int32 mask_prediction_conv_depth = 4 [default = 256];
// The number of convolutions applied to image_features in the mask
// prediction branch.
optional int32 mask_prediction_num_conv_layers = 5 [default = 2];
// Whether to apply convolutions on mask features before upsampling using
@@ -125,10 +125,10 @@ message Ssd {
optional bool convolve_then_upsample_masks = 6 [default = false];
// Mask loss weight.
optional float mask_loss_weight = 7 [default = 5.0];
// Number of boxes to be generated at training time for computing mask loss.
optional int32 mask_loss_sample_size = 8 [default = 16];
// Hyperparameters for convolution ops used in the box predictor.
optional Hyperparams conv_hyperparams = 9;
# SSD with Mobilenet v1, configured for Oxford-IIIT Pets Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.
# TPU-compatible for both training and inference
model {
ssd {
num_classes: 37
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
}
}
similarity_calculator {
iou_similarity {
}
}
anchor_generator {
ssd_anchor_generator {
num_layers: 6
min_scale: 0.2
max_scale: 0.95
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
aspect_ratios: 3.0
aspect_ratios: 0.3333
}
}
image_resizer {
fixed_shape_resizer {
height: 300
width: 300
}
}
box_predictor {
convolutional_box_predictor {
min_depth: 0
max_depth: 0
num_layers_before_predictor: 0
use_dropout: false
dropout_keep_probability: 0.8
kernel_size: 1
box_code_size: 4
apply_sigmoid_to_scores: false
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
train: true,
scale: true,
center: true,
decay: 0.9997,
epsilon: 0.001,
}
}
}
}
feature_extractor {
type: 'ssd_mobilenet_v1'
min_depth: 16
depth_multiplier: 1.0
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
train: true,
scale: true,
center: true,
decay: 0.9997,
epsilon: 0.001,
}
}
}
loss {
classification_loss {
weighted_sigmoid_focal {
alpha: 0.75
gamma: 2.0
}
}
localization_loss {
weighted_smooth_l1 {
}
}
classification_weight: 1.0
localization_weight: 1.0
}
normalize_loss_by_num_matches: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
use_static_shapes: true
}
score_converter: SIGMOID
}
}
}
train_config: {
batch_size: 24
optimizer {
rms_prop_optimizer: {
learning_rate: {
exponential_decay_learning_rate {
initial_learning_rate: 0.004
decay_steps: 800720
decay_factor: 0.95
}
}
momentum_optimizer_value: 0.9
decay: 0.9
epsilon: 1.0
}
}
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
from_detection_checkpoint: true
load_all_detection_checkpoint_vars: true
# Note: The below line limits the training process to 200K steps, which we
# empirically found to be sufficient to train the pets dataset. This
# effectively bypasses the learning rate schedule (the learning rate will
# never decay). Remove the below line to train indefinitely.
num_steps: 200000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}
max_number_of_boxes: 50
unpad_groundtruth_tensors: false
}
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/pet_faces_train.record-?????-of-00010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
}
eval_config: {
metrics_set: "coco_detection_metrics"
num_examples: 1101
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/pet_faces_val.record-?????-of-00010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
shuffle: false
num_readers: 1
}
# SSD with Mobilenet v2 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.
model {
ssd {
num_classes: 90
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
}
}
similarity_calculator {
iou_similarity {
}
}
anchor_generator {
ssd_anchor_generator {
num_layers: 6
min_scale: 0.2
max_scale: 0.95
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
aspect_ratios: 3.0
aspect_ratios: 0.3333
height_stride: 16
height_stride: 32
height_stride: 64
height_stride: 128
height_stride: 256
height_stride: 512
width_stride: 16
width_stride: 32
width_stride: 64
width_stride: 128
width_stride: 256
width_stride: 512
}
}
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 320
max_dimension: 640
}
}
box_predictor {
convolutional_box_predictor {
min_depth: 0
max_depth: 0
num_layers_before_predictor: 0
use_dropout: false
dropout_keep_probability: 0.8
kernel_size: 1
box_code_size: 4
apply_sigmoid_to_scores: false
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
train: true,
scale: true,
center: true,
decay: 0.9997,
epsilon: 0.001,
}
}
}
}
feature_extractor {
type: 'ssd_mobilenet_v2'
min_depth: 16
depth_multiplier: 1.0
use_explicit_padding: true
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
train: true,
scale: true,
center: true,
decay: 0.9997,
epsilon: 0.001,
}
}
}
loss {
classification_loss {
weighted_sigmoid {
}
}
localization_loss {
weighted_smooth_l1 {
}
}
hard_example_miner {
num_hard_examples: 3000
iou_threshold: 0.99
loss_type: CLASSIFICATION
max_negatives_per_positive: 3
min_negatives_per_image: 3
}
classification_weight: 1.0
localization_weight: 1.0
}
normalize_loss_by_num_matches: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
}
}
train_config: {
batch_size: 24
optimizer {
rms_prop_optimizer: {
learning_rate: {
exponential_decay_learning_rate {
initial_learning_rate: 0.004
decay_steps: 800720
decay_factor: 0.95
}
}
momentum_optimizer_value: 0.9
decay: 0.9
epsilon: 1.0
}
}
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
fine_tune_checkpoint_type: "detection"
# Note: The below line limits the training process to 200K steps, which we
# empirically found to be sufficient to train the COCO dataset. This
# effectively bypasses the learning rate schedule (the learning rate will
# never decay). Remove the below line to train indefinitely.
num_steps: 200000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop_fixed_aspect_ratio {
}
}
}
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
eval_config: {
num_examples: 8000
# Note: The below line limits the evaluation process to 10 evaluations.
# Remove the below line to evaluate indefinitely.
max_evals: 10
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-?????-of-00010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
shuffle: false
num_readers: 1
}
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Python binary for exporting SavedModel, tailored for TPU inference."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from object_detection.tpu_exporters import export_saved_model_tpu_lib
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('pipeline_config_file', None,
'A pipeline_pb2.TrainEvalPipelineConfig config file.')
flags.DEFINE_string(
'ckpt_path', None, 'Path to trained checkpoint, typically of the form '
'path/to/model.ckpt')
flags.DEFINE_string('export_dir', None, 'Path to export SavedModel.')
flags.DEFINE_string('input_placeholder_name', 'placeholder_tensor',
'Name of input placeholder in model\'s signature_def_map.')
flags.DEFINE_string(
'input_type', 'tf_example', 'Type of input node. Can be '
'one of [`image_tensor`, `encoded_image_string_tensor`, '
'`tf_example`]')
flags.DEFINE_boolean('use_bfloat16', False, 'If true, use tf.bfloat16 on TPU.')
def main(argv):
if len(argv) > 1:
raise tf.app.UsageError('Too many command-line arguments.')
export_saved_model_tpu_lib.export(FLAGS.pipeline_config_file, FLAGS.ckpt_path,
FLAGS.export_dir,
FLAGS.input_placeholder_name,
FLAGS.input_type, FLAGS.use_bfloat16)
if __name__ == '__main__':
tf.app.flags.mark_flag_as_required('pipeline_config_file')
tf.app.flags.mark_flag_as_required('ckpt_path')
tf.app.flags.mark_flag_as_required('export_dir')
tf.app.run()
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Python library for exporting SavedModel, tailored for TPU inference."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from google.protobuf import text_format
# pylint: disable=g-direct-tensorflow-import
from tensorflow.python.saved_model import loader
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants
# pylint: enable=g-direct-tensorflow-import
from object_detection.protos import pipeline_pb2
from object_detection.tpu_exporters import faster_rcnn
from object_detection.tpu_exporters import ssd
model_map = {
'faster_rcnn': faster_rcnn,
'ssd': ssd,
}
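# The model_map dict above dispatches each exporter call on the pipeline's
# meta-architecture name. A minimal sketch of that pattern, where
# FasterRcnnStub and SsdStub are hypothetical stand-ins for the real
# faster_rcnn and ssd exporter modules:

```python
class FasterRcnnStub(object):
  """Hypothetical stand-in for object_detection.tpu_exporters.faster_rcnn."""

  @staticmethod
  def get_prediction_tensor_shapes(pipeline_config):
    return {'meta_arch': 'faster_rcnn'}


class SsdStub(object):
  """Hypothetical stand-in for object_detection.tpu_exporters.ssd."""

  @staticmethod
  def get_prediction_tensor_shapes(pipeline_config):
    return {'meta_arch': 'ssd'}


stub_model_map = {
    'faster_rcnn': FasterRcnnStub,
    'ssd': SsdStub,
}


def shapes_for(meta_arch, pipeline_config=None):
  # A KeyError here signals an unsupported meta-architecture.
  return stub_model_map[meta_arch].get_prediction_tensor_shapes(pipeline_config)
```

# The same lookup-then-delegate shape is what export() and run_inference()
# below rely on when they call model_map[meta_arch].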
def parse_pipeline_config(pipeline_config_file):
"""Returns pipeline config and meta architecture name."""
with tf.gfile.GFile(pipeline_config_file, 'r') as config_file:
config_str = config_file.read()
pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
text_format.Merge(config_str, pipeline_config)
meta_arch = pipeline_config.model.WhichOneof('model')
return pipeline_config, meta_arch
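# parse_pipeline_config relies on text_format.Merge plus WhichOneof to
# recover the meta-architecture name. Purely as an illustration of what
# that oneof lookup yields, here is a toy string-based extraction (not
# robust to comments or nesting; the real code uses the protobuf API):

```python
import re


def meta_arch_from_config_str(config_str):
  """Toy extraction of the `model` oneof name ('ssd' or 'faster_rcnn')."""
  m = re.search(r'model\s*\{\s*(\w+)\s*\{', config_str)
  return m.group(1) if m else None
```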
def export(pipeline_config_file,
ckpt_path,
export_dir,
input_placeholder_name='placeholder_tensor',
input_type='encoded_image_string_tensor',
use_bfloat16=False):
"""Exports as SavedModel.
Args:
pipeline_config_file: Pipeline config file name.
ckpt_path: Training checkpoint path.
export_dir: Directory to export SavedModel.
input_placeholder_name: input placeholder's name in SavedModel signature.
input_type: One of
'encoded_image_string_tensor': a 1d tensor with dtype=tf.string
'image_tensor': a 4d tensor with dtype=tf.uint8
'tf_example': a 1d tensor with dtype=tf.string
use_bfloat16: If true, use tf.bfloat16 on TPU.
"""
pipeline_config, meta_arch = parse_pipeline_config(pipeline_config_file)
shapes_info = model_map[meta_arch].get_prediction_tensor_shapes(
pipeline_config)
with tf.Graph().as_default(), tf.Session() as sess:
placeholder_tensor, result_tensor_dict = model_map[meta_arch].build_graph(
pipeline_config, shapes_info, input_type, use_bfloat16)
saver = tf.train.Saver()
init_op = tf.global_variables_initializer()
sess.run(init_op)
if ckpt_path is not None:
saver.restore(sess, ckpt_path)
# Export the SavedModel.
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
tensor_info_inputs = {
input_placeholder_name:
tf.saved_model.utils.build_tensor_info(placeholder_tensor)
}
tensor_info_outputs = {
k: tf.saved_model.utils.build_tensor_info(v)
for k, v in result_tensor_dict.items()
}
detection_signature = (
tf.saved_model.signature_def_utils.build_signature_def(
inputs=tensor_info_inputs,
outputs=tensor_info_outputs,
method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))
tf.logging.info('Inputs:\n{}\nOutputs:{}\nPredict method name:{}'.format(
tensor_info_inputs, tensor_info_outputs,
tf.saved_model.signature_constants.PREDICT_METHOD_NAME))
# Graph for TPU.
builder.add_meta_graph_and_variables(
sess, [
tf.saved_model.tag_constants.SERVING,
tf.saved_model.tag_constants.TPU
],
signature_def_map={
tf.saved_model.signature_constants
.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
detection_signature,
},
strip_default_attrs=True)
# Graph for CPU, this is for passing infra validation.
builder.add_meta_graph(
[tf.saved_model.tag_constants.SERVING],
signature_def_map={
tf.saved_model.signature_constants
.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
detection_signature,
},
strip_default_attrs=True)
builder.save(as_text=False)
tf.logging.info('Model saved to {}'.format(export_dir))
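# export() above writes two meta graphs into one SavedModel: a TPU graph
# tagged [SERVING, TPU] and a CPU graph tagged [SERVING] for infra
# validation. A toy model of how a loader selects a meta graph by exact
# tag set, with plain dicts standing in for the SavedModel machinery:

```python
# Illustrative tag strings; the real values come from
# tf.saved_model.tag_constants.
SERVING, TPU = 'serve', 'tpu'

toy_meta_graphs = {
    frozenset([SERVING, TPU]): 'tpu_graph',
    frozenset([SERVING]): 'cpu_graph',
}


def load_meta_graph(tags):
  # loader.load() matches the requested tag set exactly, so asking for
  # [SERVING] does not return the [SERVING, TPU] graph.
  return toy_meta_graphs[frozenset(tags)]
```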
def run_inference(inputs,
pipeline_config_file,
ckpt_path,
input_type='encoded_image_string_tensor',
use_bfloat16=False,
repeat=1):
"""Runs inference on TPU.
Args:
inputs: Input image matching the specified `input_type`.
pipeline_config_file: Pipeline config file name.
ckpt_path: Training checkpoint path.
input_type: One of
'encoded_image_string_tensor': a 1d tensor with dtype=tf.string
'image_tensor': a 4d tensor with dtype=tf.uint8
'tf_example': a 1d tensor with dtype=tf.string
use_bfloat16: If true, use tf.bfloat16 on TPU.
repeat: Number of times to repeat running the provided input for profiling.
Returns:
A dict of resulting tensors.
"""
pipeline_config, meta_arch = parse_pipeline_config(pipeline_config_file)
shapes_info = model_map[meta_arch].get_prediction_tensor_shapes(
pipeline_config)
with tf.Graph().as_default(), tf.Session() as sess:
placeholder_tensor, result_tensor_dict = model_map[meta_arch].build_graph(
pipeline_config, shapes_info, input_type, use_bfloat16)
saver = tf.train.Saver()
init_op = tf.global_variables_initializer()
sess.run(tf.contrib.tpu.initialize_system())
sess.run(init_op)
if ckpt_path is not None:
saver.restore(sess, ckpt_path)
for _ in range(repeat):
tensor_dict_out = sess.run(
result_tensor_dict, feed_dict={placeholder_tensor: [inputs]})
sess.run(tf.contrib.tpu.shutdown_system())
return tensor_dict_out
def run_inference_from_saved_model(inputs,
saved_model_dir,
input_placeholder_name='placeholder_tensor',
repeat=1):
"""Loads saved model and run inference on TPU.
Args:
inputs: Input image matching the specified `input_type`.
saved_model_dir: Directory containing the exported SavedModel.
input_placeholder_name: input placeholder's name in SavedModel signature.
repeat: Number of times to repeat running the provided input for profiling.
Returns:
A dict of resulting tensors.
"""
with tf.Graph().as_default(), tf.Session() as sess:
meta_graph = loader.load(sess, [tag_constants.SERVING, tag_constants.TPU],
saved_model_dir)
sess.run(tf.contrib.tpu.initialize_system())
key_prediction = signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
tensor_name_input = (
meta_graph.signature_def[key_prediction].inputs[input_placeholder_name]
.name)
tensor_name_output = {
k: v.name
for k, v in (meta_graph.signature_def[key_prediction].outputs.items())
}
for _ in range(repeat):
tensor_dict_out = sess.run(
tensor_name_output, feed_dict={tensor_name_input: [inputs]})
sess.run(tf.contrib.tpu.shutdown_system())
return tensor_dict_out
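# run_inference_from_saved_model resolves graph tensor names through the
# signature def: inputs[input_placeholder_name].name for the feed and
# outputs[k].name for each fetch. A plain-dict sketch of that lookup
# (tensor names here are hypothetical, not the exporter's actual names):

```python
toy_signature_def = {
    'serving_default': {
        'inputs': {'placeholder_tensor': 'Placeholder:0'},
        'outputs': {
            'detection_boxes': 'Postprocessor/boxes:0',
            'detection_scores': 'Postprocessor/scores:0',
        },
    },
}


def resolve_tensor_names(signature_def, key, input_placeholder_name):
  """Returns (input tensor name, {output key: output tensor name})."""
  sig = signature_def[key]
  return sig['inputs'][input_placeholder_name], dict(sig['outputs'])
```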