Commit 80444539 authored by Zhuoran Liu, committed by pkulzc

Add TPU SavedModel exporter and refactor OD code (#6737)

247226201  by ronnyvotel:

    Updating the visualization tools to accept unique_ids for color coding.

--
247067830  by Zhichao Lu:

    Add box_encodings_clip_range options for the convolutional box predictor (for TPU compatibility).
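    The clip range amounts to elementwise clamping of the predicted box encodings before decoding, keeping values inside a range that low-precision TPU arithmetic handles well. A rough sketch of the idea (function name and list-based signature are illustrative, not the actual predictor API):

```python
def clip_box_encodings(encodings, clip_min, clip_max):
    """Clamp every box-encoding value into [clip_min, clip_max].

    Illustrative sketch only; the real option operates on tensors inside
    the convolutional box predictor.
    """
    return [[min(max(v, clip_min), clip_max) for v in encoding]
            for encoding in encodings]
```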

--
246888475  by Zhichao Lu:

    Remove unused _update_eval_steps function.

--
246163259  by lzc:

    Add a gather op that can handle ignore indices (which are "-1"s in this case).
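    The behavior can be sketched in plain Python: a gather where an index of -1 contributes a padding row instead of indexing into the input (names and the list-based signature are assumptions; the real op works on tensors):

```python
def gather_with_ignored_indices(params, indices, pad_value=0.0):
    """Gather rows of `params` by index; an index of -1 yields a padding row.

    Hypothetical sketch of the described op, not the actual implementation.
    """
    row_len = len(params[0]) if params else 0
    out = []
    for i in indices:
        if i == -1:
            out.append([pad_value] * row_len)  # ignored index -> padding row
        else:
            out.append(list(params[i]))
    return out
```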

--
246084944  by Zhichao Lu:

    Keras based implementation for SSD + MobilenetV2 + FPN.

--
245544227  by rathodv:

    Add batch_get_targets method to target assigner module to gather any groundtruth tensors based on the results of target assigner.

--
245540854  by rathodv:

    Update target assigner to return match tensor instead of a match object.

--
245434441  by Zhichao Lu:

    Add README for tpu_exporters package.

--
245381834  by lzc:

    Internal change.

--
245298983  by Zhichao Lu:

    Add conditional_shape_resizer to config_util

--
245134666  by Zhichao Lu:

    Adds ConditionalShapeResizer to the ImageResizer proto, which enables resizing only if the input image height or width is greater or smaller than a given size. Also enables specifying the resize method in the resize_to_{max, min}_dimension methods.

--
245093975  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (faster-rcnn)

--
245072421  by Zhichao Lu:

    Adds a new image resizing method "resize_to_max_dimension" which resizes images only if a dimension is greater than the maximum desired value while maintaining aspect ratio.
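    The dimension arithmetic behind this rule can be sketched as follows (a minimal pure-Python sketch of the resizing rule described above; the function name and scalar signature are assumptions, the real method operates on image tensors):

```python
def resize_to_max_dimension(height, width, max_dimension):
    """Return (new_height, new_width) so that the larger side equals
    max_dimension, preserving aspect ratio; unchanged if already within it."""
    largest = max(height, width)
    if largest <= max_dimension:
        return height, width  # no resize needed
    scale = max_dimension / largest
    return int(round(height * scale)), int(round(width * scale))
```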

--
244946998  by lzc:

    Internal Changes.

--
244943693  by Zhichao Lu:

    Add a custom config to mobilenet v2 that makes it more detection friendly.

--
244754158  by derekjchow:

    Internal change.

--
244699875  by Zhichao Lu:

    Add check_range=False to box_list_ops.to_normalized_coordinates when training
    for instance segmentation.  This is consistent with other calls when training
    for object detection.  There could be wrongly annotated boxes in the dataset.
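    A minimal sketch of what normalization with and without the range check looks like (list-based signature is illustrative; the real box_list_ops.to_normalized_coordinates works on BoxList tensors):

```python
def to_normalized_coordinates(boxes, height, width, check_range=True):
    """Convert [ymin, xmin, ymax, xmax] pixel boxes to [0, 1] coordinates.

    With check_range=True an out-of-image box raises; disabling the check
    tolerates mildly mis-annotated boxes at training time.
    """
    normalized = [[y0 / height, x0 / width, y1 / height, x1 / width]
                  for y0, x0, y1, x1 in boxes]
    if check_range:
        for box in normalized:
            if any(v < 0.0 or v > 1.0 for v in box):
                raise ValueError('box outside [0, 1]: %r' % box)
    return normalized
```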

--
244507425  by rathodv:

    Support bfloat16 for ssd models.

--
244399982  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd)

--
244209387  by Zhichao Lu:

    Internal change.

--
243922296  by rathodv:

    Change `raw_detection_scores` to contain softmax/sigmoid scores (not logits) for `raw_detection_boxes`.
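    For the softmax case, the logits-to-scores conversion per detection is the standard numerically stable softmax (sketch only; the actual pipeline applies it over batched tensors):

```python
import math

def softmax_scores(class_logits):
    """Numerically stable softmax over one detection's class logits."""
    peak = max(class_logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(l - peak) for l in class_logits]
    total = sum(exps)
    return [e / total for e in exps]
```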

--
243883978  by Zhichao Lu:

    Add a sample fully conv config.

--
243369455  by Zhichao Lu:

    Fix regularization loss gap in Keras and Slim.

--
243292002  by lzc:

    Internal changes.

--
243097958  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd model)

--
243007177  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd model)

--
242776550  by Zhichao Lu:

    Make object detection pre-processing run on GPU.  tf.map_fn() uses
    TensorArrayV3 ops, which have no int32 GPU implementation.  Cast to int64,
    then cast back to int32.

--
242723128  by Zhichao Lu:

    Use sorted dictionaries for additional heads in non_max_suppression to ensure a deterministic tensor order.
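    The fix boils down to iterating head names in sorted order so every run sees the same tensor ordering regardless of dict construction order (a sketch; names are illustrative):

```python
def heads_in_stable_order(additional_heads):
    """Return (head_name, value) pairs sorted by name for a deterministic
    ordering of the corresponding tensors."""
    return [(name, additional_heads[name]) for name in sorted(additional_heads)]
```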

--
242495311  by Zhichao Lu:

    Update documentation to reflect new TFLite examples repo location

--
242230527  by Zhichao Lu:

    Fix Dropout bugs for WeightSharedConvolutionalBoxPred.

--
242226573  by Zhichao Lu:

    Create Keras-based WeightSharedConvolutionalBoxPredictor.

--
241806074  by Zhichao Lu:

    Add inference in unit tests of TFX OD template.

--
241641498  by lzc:

    Internal change.

--
241637481  by Zhichao Lu:

    matmul_crop_and_resize(): Switch to dynamic shaping, so that not all dimensions are required to be known.

--
241429980  by Zhichao Lu:

    Internal change

--
241167237  by Zhichao Lu:

    Adds a faster_rcnn_inception_resnet_v2 Keras feature extractor, and updates the model builder to construct it.

--
241088616  by Zhichao Lu:

    Make it compatible with different dtype, e.g. float32, bfloat16, etc.

--
240897364  by lzc:

    Use image_np_expanded in object_detection_tutorial notebook.

--
240890393  by Zhichao Lu:

    Disable multicore inference for the OD template as it's not yet compatible.

--
240352168  by Zhichao Lu:

    Make SSDResnetV1FpnFeatureExtractor not protected to allow inheritance.

--
240351470  by lzc:

    Internal change.

--
239878928  by Zhichao Lu:

    Defines Keras box predictors for Faster RCNN and RFCN

--
239872103  by Zhichao Lu:

    Delete duplicated inputs in test.

--
239714273  by Zhichao Lu:

    Adding scope variable to all class heads

--
239698643  by Zhichao Lu:

    Create FPN feature extractor for object detection.

--
239696657  by Zhichao Lu:

    Internal Change.

--
239299404  by Zhichao Lu:

    Allows the faster rcnn meta-architecture to support Keras subcomponents

--
238502595  by Zhichao Lu:

    Lay the groundwork for symmetric quantization.
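    Symmetric quantization uses a single positive scale with the range centered on zero (no zero point), so q = round(v / scale). A general-technique sketch, not this codebase's quantization API:

```python
def symmetric_quantize(values, num_bits=8):
    """Quantize floats symmetrically around zero into num_bits signed ints.

    Returns (quantized values, scale). Sketch of the general technique.
    """
    qmax = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8 bits
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak else 1.0  # avoid dividing by zero
    quantized = [int(round(v / scale)) for v in values]
    return quantized, scale
```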

--
238496885  by Zhichao Lu:

    Add flexible_grid_anchor_generator

--
238138727  by lzc:

    Remove dead code.

    _USE_C_SHAPES has been forced True in TensorFlow releases since
    TensorFlow 1.9
    (https://github.com/tensorflow/tensorflow/commit/1d74a69443f741e69f9f52cb6bc2940b4d4ae3b7)

--
238123936  by rathodv:

    Add num_matched_groundtruth summary to target assigner in SSD.

--
238103345  by ronnyvotel:

    Raising error if input file pattern does not match any files.
    Also printing the number of evaluation images for coco metrics.

--
238044081  by Zhichao Lu:

    Fix docstring to state the correct dimensionality of `class_predictions_with_background`.

--
237920279  by Zhichao Lu:

    [XLA] Rework debug flags for dumping HLO.

    The following flags (usually passed via the XLA_FLAGS envvar) are removed:

      xla_dump_computations_to
      xla_dump_executions_to
      xla_dump_ir_to
      xla_dump_optimized_hlo_proto_to
      xla_dump_per_pass_hlo_proto_to
      xla_dump_unoptimized_hlo_proto_to
      xla_generate_hlo_graph
      xla_generate_hlo_text_to
      xla_hlo_dump_as_html
      xla_hlo_graph_path
      xla_log_hlo_text

    The following new flags are added:

      xla_dump_to
      xla_dump_hlo_module_re
      xla_dump_hlo_pass_re
      xla_dump_hlo_as_text
      xla_dump_hlo_as_proto
      xla_dump_hlo_as_dot
      xla_dump_hlo_as_url
      xla_dump_hlo_as_html
      xla_dump_ir
      xla_dump_hlo_snapshots

    The default is not to dump anything at all, but as soon as some dumping flag is
    specified, we enable the following defaults (most of which can be overridden).

     * dump to stdout (overridden by --xla_dump_to)
     * dump HLO modules at the very beginning and end of the optimization pipeline
     * don't dump between any HLO passes (overridden by --xla_dump_hlo_pass_re)
     * dump all HLO modules (overridden by --xla_dump_hlo_module_re)
     * dump in textual format (overridden by
       --xla_dump_hlo_as_{text,proto,dot,url,html}).

    For example, to dump optimized and unoptimized HLO text and protos to /tmp/foo,
    pass

      --xla_dump_to=/tmp/foo --xla_dump_hlo_as_text --xla_dump_hlo_as_proto

    For details on these flags' meanings, see xla.proto.

    The intent of this change is to make dumping both simpler to use and more
    powerful.

    For example:

     * Previously there was no way to dump the HLO module during the pass pipeline
       in HLO text format; the only option was --xla_dump_per_pass_hlo_proto_to,
       dumped in proto format.

       Now this is --xla_dump_hlo_pass_re=.* --xla_dump_hlo_as_text.  (In fact, the
       second flag is not necessary in this case, as dumping as text is the
       default.)

     * Previously there was no way to dump HLO as a graph before and after
       compilation; the only option was --xla_generate_hlo_graph, which would dump
       before/after every pass.

       Now this is --xla_dump_hlo_as_{dot,url,html} (depending on what format you
       want the graph in).

     * Previously, there was no coordination between the filenames written by the
       various flags, so info about one module might be dumped with various
       filename prefixes.  Now the filenames are consistent and all dumps from a
       particular module are next to each other.

    If you only specify some of these flags, we try to figure out what you wanted.
    For example:

     * --xla_dump_to implies --xla_dump_hlo_as_text unless you specify some
       other --xla_dump_hlo_as_* flag.

     * --xla_dump_hlo_as_text or --xla_dump_ir implies dumping to stdout unless you
       specify a different --xla_dump_to directory.  You can explicitly dump to
       stdout with --xla_dump_to=-.

    As part of this change, I simplified the debugging code in the HLO passes for
    dumping HLO modules.  Previously, many tests explicitly VLOG'ed the HLO module
    before, after, and sometimes during the pass.  I removed these VLOGs.  If you
    want dumps before/during/after an HLO pass, use --xla_dump_hlo_pass_re=<pass_name>.

--
237510043  by lzc:

    Internal Change.

--
237469515  by Zhichao Lu:

    Parameterize model_builder.build in inputs.py.

--
237293511  by rathodv:

    Remove multiclass_scores from tensor_dict in transform_data_fn always.

--
237260333  by ronnyvotel:

    Updating faster_rcnn_meta_arch to define prediction dictionary fields that are batched.

--

PiperOrigin-RevId: 247226201
parent c4f34e58
@@ -65,6 +65,8 @@ Extras:
* <a href='g3doc/detection_model_zoo.md'>Tensorflow detection model zoo</a><br>
* <a href='g3doc/exporting_models.md'>
  Exporting a trained model for inference</a><br>
* <a href='g3doc/tpu_exporters.md'>
  Exporting a trained model for TPU inference</a><br>
* <a href='g3doc/defining_your_own_model.md'>
  Defining your own model architecture</a><br>
* <a href='g3doc/using_your_own_dataset.md'>
...
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Generates grid anchors on the fly corresponding to multiple CNN layers."""
import tensorflow as tf
from object_detection.anchor_generators import grid_anchor_generator
from object_detection.core import anchor_generator
from object_detection.core import box_list_ops
class FlexibleGridAnchorGenerator(anchor_generator.AnchorGenerator):
"""Generate a grid of anchors for multiple CNN layers of different scale."""
def __init__(self, base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=True):
"""Constructs a FlexibleGridAnchorGenerator.
This generator is more flexible than the multiple_grid_anchor_generator
and multiscale_grid_anchor_generator: it can generate any of the anchors
they can generate, plus additional anchor configurations. In particular,
it allows the explicit specification of scales and aspect ratios at each
layer without assuming any relationship between scales and aspect ratios
across layers.
Args:
base_sizes: list of tuples of anchor base sizes. For example, setting
base_sizes=[(1, 2, 3), (4, 5)] means that we want 3 anchors at each
grid point on the first layer with the base sizes of 1, 2, and 3, and 2
anchors at each grid point on the second layer with the base sizes of
4 and 5.
aspect_ratios: list or tuple of aspect ratios. For example, setting
aspect_ratios=[(1.0, 2.0, 0.5), (1.0, 2.0)] means that we want 3 anchors
at each grid point on the first layer with aspect ratios of 1.0, 2.0,
and 0.5, and 2 anchors at each grid point on the second layer with
aspect ratios of 1.0 and 2.0.
anchor_strides: list of pairs of strides in pixels (in y and x directions
respectively). For example, setting anchor_strides=[(25, 25), (50, 50)]
means that we want the anchors corresponding to the first layer to be
strided by 25 pixels and those in the second layer to be strided by 50
pixels in both y and x directions.
anchor_offsets: list of pairs of offsets in pixels (in y and x directions
respectively). The offset specifies where we want the center of the
(0, 0)-th anchor to lie for each layer. For example, setting
anchor_offsets=[(10, 10), (20, 20)] means that we want the
(0, 0)-th anchor of the first layer to lie at (10, 10) in pixel space
and likewise that we want the (0, 0)-th anchor of the second layer to
lie at (20, 20) in pixel space.
normalize_coordinates: whether to produce anchors in normalized
coordinates. (defaults to True).
"""
self._base_sizes = base_sizes
self._aspect_ratios = aspect_ratios
self._anchor_strides = anchor_strides
self._anchor_offsets = anchor_offsets
self._normalize_coordinates = normalize_coordinates
def name_scope(self):
return 'FlexibleGridAnchorGenerator'
def num_anchors_per_location(self):
"""Returns the number of anchors per spatial location.
Returns:
a list of integers, one for each expected feature map to be passed to
the Generate function.
"""
return [len(size) for size in self._base_sizes]
def _generate(self, feature_map_shape_list, im_height=1, im_width=1):
"""Generates a collection of bounding boxes to be used as anchors.
Currently we require the input image shape to be statically defined. That
is, im_height and im_width should be integers rather than tensors.
Args:
feature_map_shape_list: list of pairs of convnet layer resolutions in the
format [(height_0, width_0), (height_1, width_1), ...]. For example,
setting feature_map_shape_list=[(8, 8), (7, 7)] asks for anchors that
correspond to an 8x8 layer followed by a 7x7 layer.
im_height: the height of the image to generate the grid for. If both
im_height and im_width are 1, anchors can only be generated in
absolute coordinates.
im_width: the width of the image to generate the grid for. If both
im_height and im_width are 1, anchors can only be generated in
absolute coordinates.
Returns:
boxes_list: a list of BoxLists each holding anchor boxes corresponding to
the input feature map shapes.
Raises:
ValueError: if im_height and im_width are 1, but normalized coordinates
were requested.
"""
anchor_grid_list = []
for (feat_shape, base_sizes, aspect_ratios, anchor_stride, anchor_offset
) in zip(feature_map_shape_list, self._base_sizes, self._aspect_ratios,
self._anchor_strides, self._anchor_offsets):
anchor_grid = grid_anchor_generator.tile_anchors(
feat_shape[0],
feat_shape[1],
tf.to_float(tf.convert_to_tensor(base_sizes)),
tf.to_float(tf.convert_to_tensor(aspect_ratios)),
tf.constant([1.0, 1.0]),
tf.to_float(tf.convert_to_tensor(anchor_stride)),
tf.to_float(tf.convert_to_tensor(anchor_offset)))
num_anchors = anchor_grid.num_boxes_static()
if num_anchors is None:
num_anchors = anchor_grid.num_boxes()
anchor_indices = tf.zeros([num_anchors])
anchor_grid.add_field('feature_map_index', anchor_indices)
if self._normalize_coordinates:
if im_height == 1 or im_width == 1:
raise ValueError(
'Normalized coordinates were requested upon construction of the '
'FlexibleGridAnchorGenerator, but a subsequent call to '
'generate did not supply dimension information.')
anchor_grid = box_list_ops.to_normalized_coordinates(
anchor_grid, im_height, im_width, check_range=False)
anchor_grid_list.append(anchor_grid)
return anchor_grid_list
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for anchor_generators.flexible_grid_anchor_generator."""
import numpy as np
import tensorflow as tf
from object_detection.anchor_generators import flexible_grid_anchor_generator as fg
from object_detection.utils import test_case
class FlexibleGridAnchorGeneratorTest(test_case.TestCase):
def test_construct_single_anchor(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = 64
im_width = 64
feature_map_shape_list = [(2, 2)]
exp_anchor_corners = [[-48, -48, 80, 80],
[-48, -16, 80, 112],
[-16, -48, 112, 80],
[-16, -16, 112, 112]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_unit_dimensions(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(32.0,)]
aspect_ratios = [(1.0,)]
im_height = 1
im_width = 1
feature_map_shape_list = [(2, 2)]
# Positive offsets are produced.
exp_anchor_corners = [[0, 0, 32, 32],
[0, 32, 32, 64],
[32, 0, 64, 32],
[32, 32, 64, 64]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_normalized_anchors_fails_with_unit_dimensions(self):
anchor_generator = fg.FlexibleGridAnchorGenerator(
[(32.0,)], [(1.0,)], [(32, 32),], [(16, 16),],
normalize_coordinates=True)
with self.assertRaisesRegexp(ValueError, 'Normalized coordinates'):
anchor_generator.generate(
feature_map_shape_list=[(2, 2)], im_height=1, im_width=1)
def test_construct_single_anchor_in_normalized_coordinates(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = 64
im_width = 128
feature_map_shape_list = [(2, 2)]
exp_anchor_corners = [[-48./64, -48./128, 80./64, 80./128],
[-48./64, -16./128, 80./64, 112./128],
[-16./64, -48./128, 112./64, 80./128],
[-16./64, -16./128, 112./64, 112./128]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=True)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_num_anchors_per_location(self):
anchor_strides = [(32, 32), (64, 64)]
anchor_offsets = [(16, 16), (32, 32)]
base_sizes = [(32.0, 64.0, 96.0, 32.0, 64.0, 96.0),
(64.0, 128.0, 172.0, 64.0, 128.0, 172.0)]
aspect_ratios = [(1.0, 1.0, 1.0, 2.0, 2.0, 2.0),
(1.0, 1.0, 1.0, 2.0, 2.0, 2.0)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
self.assertEqual(anchor_generator.num_anchors_per_location(), [6, 6])
def test_construct_single_anchor_dynamic_size(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(0, 0),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = tf.constant(64)
im_width = tf.constant(64)
feature_map_shape_list = [(2, 2)]
# Zero offsets are used.
exp_anchor_corners = [[-64, -64, 64, 64],
[-64, -32, 64, 96],
[-32, -64, 96, 64],
[-32, -32, 96, 96]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_with_odd_input_dimension(self):
def graph_fn():
anchor_strides = [(32, 32),]
anchor_offsets = [(0, 0),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = 65
im_width = 65
feature_map_shape_list = [(3, 3)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
return (anchor_corners,)
anchor_corners_out = self.execute(graph_fn, [])
exp_anchor_corners = [[-64, -64, 64, 64],
[-64, -32, 64, 96],
[-64, 0, 64, 128],
[-32, -64, 96, 64],
[-32, -32, 96, 96],
[-32, 0, 96, 128],
[0, -64, 128, 64],
[0, -32, 128, 96],
[0, 0, 128, 128]]
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_on_two_feature_maps(self):
def graph_fn():
anchor_strides = [(32, 32), (64, 64)]
anchor_offsets = [(16, 16), (32, 32)]
base_sizes = [(128.0,), (256.0,)]
aspect_ratios = [(1.0,), (1.0,)]
im_height = 64
im_width = 64
feature_map_shape_list = [(2, 2), (1, 1)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(feature_map_shape_list,
im_height=im_height,
im_width=im_width)
anchor_corners = [anchors.get() for anchors in anchors_list]
return anchor_corners
anchor_corners_out = np.concatenate(self.execute(graph_fn, []), axis=0)
exp_anchor_corners = [[-48, -48, 80, 80],
[-48, -16, 80, 112],
[-16, -48, 112, 80],
[-16, -16, 112, 112],
[-96, -96, 160, 160]]
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_with_two_scales_per_octave(self):
def graph_fn():
anchor_strides = [(64, 64),]
anchor_offsets = [(32, 32),]
base_sizes = [(256.0, 362.03867)]
aspect_ratios = [(1.0, 1.0)]
im_height = 64
im_width = 64
feature_map_shape_list = [(1, 1)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(feature_map_shape_list,
im_height=im_height,
im_width=im_width)
anchor_corners = [anchors.get() for anchors in anchors_list]
return anchor_corners
# There are 2 sets of anchors in this configuration. The order is:
# [[2**0.0 intermediate scale + 1.0 aspect],
#  [2**0.5 intermediate scale + 1.0 aspect]]
exp_anchor_corners = [[-96., -96., 160., 160.],
[-149.0193, -149.0193, 213.0193, 213.0193]]
anchor_corners_out = self.execute(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_with_two_scales_per_octave_and_aspect(self):
def graph_fn():
anchor_strides = [(64, 64),]
anchor_offsets = [(32, 32),]
base_sizes = [(256.0, 362.03867, 256.0, 362.03867)]
aspect_ratios = [(1.0, 1.0, 2.0, 2.0)]
im_height = 64
im_width = 64
feature_map_shape_list = [(1, 1)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(feature_map_shape_list,
im_height=im_height,
im_width=im_width)
anchor_corners = [anchors.get() for anchors in anchors_list]
return anchor_corners
# There are 4 sets of anchors in this configuration. The order is:
# [[2**0.0 intermediate scale + 1.0 aspect],
# [2**0.5 intermediate scale + 1.0 aspect],
# [2**0.0 intermediate scale + 2.0 aspect],
# [2**0.5 intermediate scale + 2.0 aspect]]
exp_anchor_corners = [[-96., -96., 160., 160.],
[-149.0193, -149.0193, 213.0193, 213.0193],
[-58.50967, -149.0193, 122.50967, 213.0193],
[-96., -224., 160., 288.]]
anchor_corners_out = self.execute(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchors_on_feature_maps_with_dynamic_shape(self):
def graph_fn(feature_map1_height, feature_map1_width, feature_map2_height,
feature_map2_width):
anchor_strides = [(32, 32), (64, 64)]
anchor_offsets = [(16, 16), (32, 32)]
base_sizes = [(128.0,), (256.0,)]
aspect_ratios = [(1.0,), (1.0,)]
im_height = 64
im_width = 64
feature_map_shape_list = [(feature_map1_height, feature_map1_width),
(feature_map2_height, feature_map2_width)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(feature_map_shape_list,
im_height=im_height,
im_width=im_width)
anchor_corners = [anchors.get() for anchors in anchors_list]
return anchor_corners
anchor_corners_out = np.concatenate(
self.execute_cpu(graph_fn, [
np.array(2, dtype=np.int32),
np.array(2, dtype=np.int32),
np.array(1, dtype=np.int32),
np.array(1, dtype=np.int32)
]),
axis=0)
exp_anchor_corners = [[-48, -48, 80, 80],
[-48, -16, 80, 112],
[-16, -48, 112, 80],
[-16, -16, 112, 112],
[-96, -96, 160, 160]]
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
if __name__ == '__main__':
tf.test.main()
@@ -15,6 +15,7 @@
"""A function to build an object detection anchor generator from config."""
from object_detection.anchor_generators import flexible_grid_anchor_generator
from object_detection.anchor_generators import grid_anchor_generator
from object_detection.anchor_generators import multiple_grid_anchor_generator
from object_detection.anchor_generators import multiscale_grid_anchor_generator
@@ -90,5 +91,19 @@ def build(anchor_generator_config):
        cfg.scales_per_octave,
        cfg.normalize_coordinates
    )
  elif anchor_generator_config.WhichOneof(
      'anchor_generator_oneof') == 'flexible_grid_anchor_generator':
    cfg = anchor_generator_config.flexible_grid_anchor_generator
    base_sizes = []
    aspect_ratios = []
    strides = []
    offsets = []
    for anchor_grid in cfg.anchor_grid:
      base_sizes.append(tuple(anchor_grid.base_sizes))
      aspect_ratios.append(tuple(anchor_grid.aspect_ratios))
      strides.append((anchor_grid.height_stride, anchor_grid.width_stride))
      offsets.append((anchor_grid.height_offset, anchor_grid.width_offset))
    return flexible_grid_anchor_generator.FlexibleGridAnchorGenerator(
        base_sizes, aspect_ratios, strides, offsets, cfg.normalize_coordinates)
  else:
    raise ValueError('Empty anchor generator.')
@@ -20,6 +20,7 @@ import math
import tensorflow as tf
from google.protobuf import text_format
from object_detection.anchor_generators import flexible_grid_anchor_generator
from object_detection.anchor_generators import grid_anchor_generator
from object_detection.anchor_generators import multiple_grid_anchor_generator
from object_detection.anchor_generators import multiscale_grid_anchor_generator
@@ -43,8 +44,8 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          grid_anchor_generator.GridAnchorGenerator)
    self.assertListEqual(anchor_generator_object._scales, [])
    self.assertListEqual(anchor_generator_object._aspect_ratios, [])
    self.assertAllEqual(anchor_generator_object._anchor_offset, [0, 0])
@@ -68,8 +69,8 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          grid_anchor_generator.GridAnchorGenerator)
    self.assert_almost_list_equal(anchor_generator_object._scales,
                                  [0.4, 2.2])
    self.assert_almost_list_equal(anchor_generator_object._aspect_ratios,
@@ -88,9 +89,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          multiple_grid_anchor_generator.
                          MultipleGridAnchorGenerator)
    for actual_scales, expected_scales in zip(
        list(anchor_generator_object._scales),
        [(0.1, 0.2, 0.2),
@@ -118,9 +119,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          multiple_grid_anchor_generator.
                          MultipleGridAnchorGenerator)
    for actual_scales, expected_scales in zip(
        list(anchor_generator_object._scales),
        [(0.1, math.sqrt(0.1 * 0.15)),
@@ -143,9 +144,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          multiple_grid_anchor_generator.
                          MultipleGridAnchorGenerator)
    for actual_aspect_ratio, expected_aspect_ratio in zip(
        list(anchor_generator_object._aspect_ratios),
        6 * [(0.5, 0.5)]):
@@ -162,9 +163,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          multiple_grid_anchor_generator.
                          MultipleGridAnchorGenerator)
    for actual_scales, expected_scales in zip(
        list(anchor_generator_object._scales),
@@ -204,9 +205,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
text_format.Merge(anchor_generator_text_proto, anchor_generator_proto) text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
anchor_generator_object = anchor_generator_builder.build( anchor_generator_object = anchor_generator_builder.build(
anchor_generator_proto) anchor_generator_proto)
self.assertTrue(isinstance(anchor_generator_object, self.assertIsInstance(anchor_generator_object,
multiple_grid_anchor_generator. multiple_grid_anchor_generator.
MultipleGridAnchorGenerator)) MultipleGridAnchorGenerator)
for actual_scales, expected_scales in zip( for actual_scales, expected_scales in zip(
list(anchor_generator_object._scales), list(anchor_generator_object._scales),
...@@ -246,9 +247,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase): ...@@ -246,9 +247,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
text_format.Merge(anchor_generator_text_proto, anchor_generator_proto) text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
anchor_generator_object = anchor_generator_builder.build( anchor_generator_object = anchor_generator_builder.build(
anchor_generator_proto) anchor_generator_proto)
self.assertTrue(isinstance(anchor_generator_object, self.assertIsInstance(anchor_generator_object,
multiscale_grid_anchor_generator. multiscale_grid_anchor_generator.
MultiscaleGridAnchorGenerator)) MultiscaleGridAnchorGenerator)
for level, anchor_grid_info in zip( for level, anchor_grid_info in zip(
range(3, 8), anchor_generator_object._anchor_grid_info): range(3, 8), anchor_generator_object._anchor_grid_info):
self.assertEqual(set(anchor_grid_info.keys()), set(['level', 'info'])) self.assertEqual(set(anchor_grid_info.keys()), set(['level', 'info']))
...@@ -273,11 +274,59 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase): ...@@ -273,11 +274,59 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
text_format.Merge(anchor_generator_text_proto, anchor_generator_proto) text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
anchor_generator_object = anchor_generator_builder.build( anchor_generator_object = anchor_generator_builder.build(
anchor_generator_proto) anchor_generator_proto)
self.assertTrue(isinstance(anchor_generator_object, self.assertIsInstance(anchor_generator_object,
multiscale_grid_anchor_generator. multiscale_grid_anchor_generator.
MultiscaleGridAnchorGenerator)) MultiscaleGridAnchorGenerator)
self.assertFalse(anchor_generator_object._normalize_coordinates) self.assertFalse(anchor_generator_object._normalize_coordinates)
def test_build_flexible_anchor_generator(self):
anchor_generator_text_proto = """
flexible_grid_anchor_generator {
anchor_grid {
base_sizes: [1.5]
aspect_ratios: [1.0]
height_stride: 16
width_stride: 20
height_offset: 8
width_offset: 9
}
anchor_grid {
base_sizes: [1.0, 2.0]
aspect_ratios: [1.0, 0.5]
height_stride: 32
width_stride: 30
height_offset: 10
width_offset: 11
}
}
"""
anchor_generator_proto = anchor_generator_pb2.AnchorGenerator()
text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
anchor_generator_object = anchor_generator_builder.build(
anchor_generator_proto)
self.assertIsInstance(anchor_generator_object,
flexible_grid_anchor_generator.
FlexibleGridAnchorGenerator)
for actual_base_sizes, expected_base_sizes in zip(
list(anchor_generator_object._base_sizes), [(1.5,), (1.0, 2.0)]):
self.assert_almost_list_equal(expected_base_sizes, actual_base_sizes)
for actual_aspect_ratios, expected_aspect_ratios in zip(
list(anchor_generator_object._aspect_ratios), [(1.0,), (1.0, 0.5)]):
self.assert_almost_list_equal(expected_aspect_ratios,
actual_aspect_ratios)
for actual_strides, expected_strides in zip(
list(anchor_generator_object._anchor_strides), [(16, 20), (32, 30)]):
self.assert_almost_list_equal(expected_strides, actual_strides)
for actual_offsets, expected_offsets in zip(
list(anchor_generator_object._anchor_offsets), [(8, 9), (10, 11)]):
self.assert_almost_list_equal(expected_offsets, actual_offsets)
self.assertTrue(anchor_generator_object._normalize_coordinates)
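The strides and offsets this test checks can be illustrated with plain arithmetic. The helper below is hypothetical (it is not the actual FlexibleGridAnchorGenerator code) and only sketches the assumed anchor-center formula, center = offset + stride * index, using the values from the first `anchor_grid` in the config above:

```python
# Hypothetical helper illustrating the assumed anchor-center arithmetic
# (offset + stride * index); not the actual FlexibleGridAnchorGenerator code.
def anchor_centers(grid_height, grid_width, stride, offset):
    """Returns (y, x) centers for one anchor grid level as a nested list."""
    return [[(offset[0] + stride[0] * i, offset[1] + stride[1] * j)
             for j in range(grid_width)]
            for i in range(grid_height)]

# Values from the first anchor_grid in the config above:
# height_stride 16, width_stride 20, height_offset 8, width_offset 9.
centers = anchor_centers(2, 2, stride=(16, 20), offset=(8, 9))
print(centers[0][0], centers[1][1])  # → (8, 9) (24, 29)
```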
if __name__ == '__main__':
tf.test.main()
...@@ -20,11 +20,14 @@ import tensorflow as tf
from object_detection.predictors import convolutional_box_predictor
from object_detection.predictors import convolutional_keras_box_predictor
from object_detection.predictors import mask_rcnn_box_predictor
from object_detection.predictors import mask_rcnn_keras_box_predictor
from object_detection.predictors import rfcn_box_predictor
from object_detection.predictors import rfcn_keras_box_predictor
from object_detection.predictors.heads import box_head
from object_detection.predictors.heads import class_head
from object_detection.predictors.heads import keras_box_head
from object_detection.predictors.heads import keras_class_head
from object_detection.predictors.heads import keras_mask_head
from object_detection.predictors.heads import mask_head
from object_detection.protos import box_predictor_pb2
...@@ -42,7 +45,8 @@ def build_convolutional_box_predictor(is_training,
apply_sigmoid_to_scores=False,
add_background_class=True,
class_prediction_bias_init=0.0,
use_depthwise=False,
box_encodings_clip_range=None):
"""Builds the ConvolutionalBoxPredictor from the arguments.
Args:
...@@ -77,6 +81,7 @@ def build_convolutional_box_predictor(is_training,
conv2d layer before class prediction.
use_depthwise: Whether to use depthwise convolutions for prediction
steps. Default is False.
box_encodings_clip_range: Min and max values for clipping the box_encodings.
Returns:
A ConvolutionalBoxPredictor class.
...@@ -85,7 +90,8 @@ def build_convolutional_box_predictor(is_training,
is_training=is_training,
box_code_size=box_code_size,
kernel_size=kernel_size,
use_depthwise=use_depthwise,
box_encodings_clip_range=box_encodings_clip_range)
class_prediction_head = class_head.ConvolutionalClassHead(
is_training=is_training,
num_class_slots=num_classes + 1 if add_background_class else num_classes,
...@@ -124,6 +130,7 @@ def build_convolutional_keras_box_predictor(is_training,
add_background_class=True,
class_prediction_bias_init=0.0,
use_depthwise=False,
box_encodings_clip_range=None,
name='BoxPredictor'):
"""Builds the Keras ConvolutionalBoxPredictor from the arguments.
...@@ -168,6 +175,7 @@ def build_convolutional_keras_box_predictor(is_training,
conv2d layer before class prediction.
use_depthwise: Whether to use depthwise convolutions for prediction
steps. Default is False.
box_encodings_clip_range: Min and max values for clipping the box_encodings.
name: A string name scope to assign to the box predictor. If `None`, Keras
will auto-generate one from the class name.
...@@ -189,6 +197,7 @@ def build_convolutional_keras_box_predictor(is_training,
freeze_batchnorm=freeze_batchnorm,
num_predictions_per_location=num_predictions_per_location,
use_depthwise=use_depthwise,
box_encodings_clip_range=box_encodings_clip_range,
name='ConvolutionalBoxHead_%d' % stack_index))
class_prediction_heads.append(
keras_class_head.ConvolutionalClassHead(
...@@ -300,6 +309,224 @@ def build_weight_shared_convolutional_box_predictor(
use_depthwise=use_depthwise)
def build_weight_shared_convolutional_keras_box_predictor(
is_training,
num_classes,
conv_hyperparams,
freeze_batchnorm,
inplace_batchnorm_update,
num_predictions_per_location_list,
depth,
num_layers_before_predictor,
box_code_size,
kernel_size=3,
add_background_class=True,
class_prediction_bias_init=0.0,
use_dropout=False,
dropout_keep_prob=0.8,
share_prediction_tower=False,
apply_batch_norm=True,
use_depthwise=False,
score_converter_fn=tf.identity,
box_encodings_clip_range=None,
name='WeightSharedConvolutionalBoxPredictor'):
"""Builds the Keras WeightSharedConvolutionalBoxPredictor from the arguments.
Args:
is_training: Indicates whether the BoxPredictor is in training mode.
num_classes: number of classes. Note that num_classes *does not*
include the background category, so if groundtruth labels take values
in {0, 1, .., K-1}, num_classes=K (and not K+1, even though the
assigned classification targets can range from {0,... K}).
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
inplace_batchnorm_update: Whether to update batch norm moving average
values inplace. When this is false train op must add a control
dependency on tf.GraphKeys.UPDATE_OPS collection in order to update
batch norm statistics.
num_predictions_per_location_list: A list of integers representing the
number of box predictions to be made per spatial location for each
feature map.
depth: depth of conv layers.
num_layers_before_predictor: Number of the additional conv layers before
the predictor.
box_code_size: Size of encoding for each box.
kernel_size: Size of final convolution kernel.
add_background_class: Whether to add an implicit background class.
class_prediction_bias_init: constant value to initialize bias of the last
conv2d layer before class prediction.
use_dropout: Whether to apply dropout to class prediction head.
dropout_keep_prob: Probability of keeping activations.
share_prediction_tower: Whether to share the multi-layer tower between box
prediction and class prediction heads.
apply_batch_norm: Whether to apply batch normalization to conv layers in
this predictor.
use_depthwise: Whether to use depthwise separable conv2d instead of conv2d.
score_converter_fn: Callable score converter to perform elementwise op on
class scores.
box_encodings_clip_range: Min and max values for clipping the box_encodings.
name: A string name scope to assign to the box predictor. If `None`, Keras
will auto-generate one from the class name.
Returns:
A Keras WeightSharedConvolutionalBoxPredictor class.
"""
if len(set(num_predictions_per_location_list)) > 1:
raise ValueError('num predictions per location must be same for all '
'feature maps, found: {}'.format(
num_predictions_per_location_list))
num_predictions_per_location = num_predictions_per_location_list[0]
box_prediction_head = keras_box_head.WeightSharedConvolutionalBoxHead(
box_code_size=box_code_size,
kernel_size=kernel_size,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=num_predictions_per_location,
use_depthwise=use_depthwise,
box_encodings_clip_range=box_encodings_clip_range,
name='WeightSharedConvolutionalBoxHead')
class_prediction_head = keras_class_head.WeightSharedConvolutionalClassHead(
num_class_slots=(
num_classes + 1 if add_background_class else num_classes),
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
kernel_size=kernel_size,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=num_predictions_per_location,
class_prediction_bias_init=class_prediction_bias_init,
use_depthwise=use_depthwise,
score_converter_fn=score_converter_fn,
name='WeightSharedConvolutionalClassHead')
other_heads = {}
return (
convolutional_keras_box_predictor.WeightSharedConvolutionalBoxPredictor(
is_training=is_training,
num_classes=num_classes,
box_prediction_head=box_prediction_head,
class_prediction_head=class_prediction_head,
other_heads=other_heads,
conv_hyperparams=conv_hyperparams,
depth=depth,
num_layers_before_predictor=num_layers_before_predictor,
freeze_batchnorm=freeze_batchnorm,
inplace_batchnorm_update=inplace_batchnorm_update,
kernel_size=kernel_size,
apply_batch_norm=apply_batch_norm,
share_prediction_tower=share_prediction_tower,
use_depthwise=use_depthwise,
name=name))
def build_mask_rcnn_keras_box_predictor(is_training,
num_classes,
fc_hyperparams,
freeze_batchnorm,
use_dropout,
dropout_keep_prob,
box_code_size,
add_background_class=True,
share_box_across_classes=False,
predict_instance_masks=False,
conv_hyperparams=None,
mask_height=14,
mask_width=14,
mask_prediction_num_conv_layers=2,
mask_prediction_conv_depth=256,
masks_are_class_agnostic=False,
convolve_then_upsample_masks=False):
"""Builds and returns a MaskRCNNKerasBoxPredictor class.
Args:
is_training: Indicates whether the BoxPredictor is in training mode.
num_classes: number of classes. Note that num_classes *does not*
include the background category, so if groundtruth labels take values
in {0, 1, .., K-1}, num_classes=K (and not K+1, even though the
assigned classification targets can range from {0,... K}).
fc_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for fully connected dense ops.
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
use_dropout: Option to use dropout or not. Note that a single dropout
op is applied here prior to both box and class predictions, which stands
in contrast to the ConvolutionalBoxPredictor below.
dropout_keep_prob: Keep probability for dropout.
This is only used if use_dropout is True.
box_code_size: Size of encoding for each box.
add_background_class: Whether to add an implicit background class.
share_box_across_classes: Whether to share boxes across classes rather
than use a different box for each class.
predict_instance_masks: If True, will add a third stage mask prediction
to the returned class.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
mask_height: Desired output mask height. The default value is 14.
mask_width: Desired output mask width. The default value is 14.
mask_prediction_num_conv_layers: Number of convolution layers applied to
the image_features in mask prediction branch.
mask_prediction_conv_depth: The depth for the first conv2d_transpose op
applied to the image_features in the mask prediction branch. If set
to 0, the depth of the convolution layers will be automatically chosen
based on the number of object classes and the number of channels in the
image features.
masks_are_class_agnostic: Boolean determining if the mask-head is
class-agnostic or not.
convolve_then_upsample_masks: Whether to apply convolutions on mask
features before upsampling using nearest neighbor resizing. Otherwise,
mask features are resized to [`mask_height`, `mask_width`] using
bilinear resizing before applying convolutions.
Returns:
A MaskRCNNKerasBoxPredictor class.
"""
box_prediction_head = keras_box_head.MaskRCNNBoxHead(
is_training=is_training,
num_classes=num_classes,
fc_hyperparams=fc_hyperparams,
freeze_batchnorm=freeze_batchnorm,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
box_code_size=box_code_size,
share_box_across_classes=share_box_across_classes)
class_prediction_head = keras_class_head.MaskRCNNClassHead(
is_training=is_training,
num_class_slots=num_classes + 1 if add_background_class else num_classes,
fc_hyperparams=fc_hyperparams,
freeze_batchnorm=freeze_batchnorm,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob)
third_stage_heads = {}
if predict_instance_masks:
third_stage_heads[
mask_rcnn_box_predictor.
MASK_PREDICTIONS] = keras_mask_head.MaskRCNNMaskHead(
is_training=is_training,
num_classes=num_classes,
conv_hyperparams=conv_hyperparams,
freeze_batchnorm=freeze_batchnorm,
mask_height=mask_height,
mask_width=mask_width,
mask_prediction_num_conv_layers=mask_prediction_num_conv_layers,
mask_prediction_conv_depth=mask_prediction_conv_depth,
masks_are_class_agnostic=masks_are_class_agnostic,
convolve_then_upsample=convolve_then_upsample_masks)
return mask_rcnn_keras_box_predictor.MaskRCNNKerasBoxPredictor(
is_training=is_training,
num_classes=num_classes,
freeze_batchnorm=freeze_batchnorm,
box_prediction_head=box_prediction_head,
class_prediction_head=class_prediction_head,
third_stage_heads=third_stage_heads)
def build_mask_rcnn_box_predictor(is_training,
num_classes,
fc_hyperparams_fn,
...@@ -457,6 +684,13 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
config_box_predictor = box_predictor_config.convolutional_box_predictor
conv_hyperparams_fn = argscope_fn(config_box_predictor.conv_hyperparams,
is_training)
# Optionally apply clipping to box encodings, when box_encodings_clip_range
# is set.
box_encodings_clip_range = None
if config_box_predictor.HasField('box_encodings_clip_range'):
box_encodings_clip_range = BoxEncodingsClipRange(
min=config_box_predictor.box_encodings_clip_range.min,
max=config_box_predictor.box_encodings_clip_range.max)
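The effect of the optional clip range can be sketched independently of TensorFlow. `clip_box_encodings` below is a hypothetical stand-in for the elementwise clip (e.g. via `tf.clip_by_value`) that the box head presumably applies when a range is configured:

```python
from collections import namedtuple

# Stand-in for the builder's BoxEncodingsClipRange namedtuple.
BoxEncodingsClipRange = namedtuple('BoxEncodingsClipRange', ['min', 'max'])

def clip_box_encodings(encodings, clip_range):
    """Elementwise clip of raw box encodings; a no-op when clip_range is None."""
    if clip_range is None:
        return encodings
    return [min(max(v, clip_range.min), clip_range.max) for v in encodings]

print(clip_box_encodings([-40.0, 0.5, 40.0],
                         BoxEncodingsClipRange(min=-10.0, max=10.0)))
# → [-10.0, 0.5, 10.0]
```

Bounding the raw encodings this way keeps extreme regression outputs from producing numerically unstable boxes, which is what makes the predictor TPU friendly.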
return build_convolutional_box_predictor(
is_training=is_training,
num_classes=num_classes,
...@@ -473,7 +707,8 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
apply_sigmoid_to_scores=config_box_predictor.apply_sigmoid_to_scores,
class_prediction_bias_init=(
config_box_predictor.class_prediction_bias_init),
use_depthwise=config_box_predictor.use_depthwise,
box_encodings_clip_range=box_encodings_clip_range)
if box_predictor_oneof == 'weight_shared_convolutional_box_predictor':
config_box_predictor = (
...@@ -488,12 +723,11 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
config_box_predictor.score_converter, is_training)
# Optionally apply clipping to box encodings, when box_encodings_clip_range
# is set.
box_encodings_clip_range = None
if config_box_predictor.HasField('box_encodings_clip_range'):
box_encodings_clip_range = BoxEncodingsClipRange(
min=config_box_predictor.box_encodings_clip_range.min,
max=config_box_predictor.box_encodings_clip_range.max)
return build_weight_shared_convolutional_box_predictor(
is_training=is_training,
num_classes=num_classes,
...@@ -514,6 +748,7 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
score_converter_fn=score_converter_fn,
box_encodings_clip_range=box_encodings_clip_range)
if box_predictor_oneof == 'mask_rcnn_box_predictor':
config_box_predictor = box_predictor_config.mask_rcnn_box_predictor
fc_hyperparams_fn = argscope_fn(config_box_predictor.fc_hyperparams,
...@@ -563,7 +798,7 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
raise ValueError('Unknown box predictor: {}'.format(box_predictor_oneof))
def build_keras(hyperparams_fn, freeze_batchnorm, inplace_batchnorm_update,
num_predictions_per_location_list, box_predictor_config,
is_training, num_classes, add_background_class=True):
"""Builds a Keras-based box predictor based on the configuration.
...@@ -573,7 +808,7 @@ def build_keras(conv_hyperparams_fn, freeze_batchnorm, inplace_batchnorm_update,
for more details.
Args:
hyperparams_fn: A function that takes a hyperparams_pb2.Hyperparams
proto and returns a `hyperparams_builder.KerasLayerHyperparams`
for Conv or FC hyperparameters.
freeze_batchnorm: Whether to freeze batch norm parameters during
...@@ -607,8 +842,16 @@ def build_keras(conv_hyperparams_fn, freeze_batchnorm, inplace_batchnorm_update,
if box_predictor_oneof == 'convolutional_box_predictor':
config_box_predictor = box_predictor_config.convolutional_box_predictor
conv_hyperparams = hyperparams_fn(
config_box_predictor.conv_hyperparams)
# Optionally apply clipping to box encodings, when box_encodings_clip_range
# is set.
box_encodings_clip_range = None
if config_box_predictor.HasField('box_encodings_clip_range'):
box_encodings_clip_range = BoxEncodingsClipRange(
min=config_box_predictor.box_encodings_clip_range.min,
max=config_box_predictor.box_encodings_clip_range.max)
return build_convolutional_keras_box_predictor(
is_training=is_training,
num_classes=num_classes,
...@@ -627,7 +870,97 @@ def build_keras(conv_hyperparams_fn, freeze_batchnorm, inplace_batchnorm_update,
max_depth=config_box_predictor.max_depth,
class_prediction_bias_init=(
config_box_predictor.class_prediction_bias_init),
use_depthwise=config_box_predictor.use_depthwise,
box_encodings_clip_range=box_encodings_clip_range)
if box_predictor_oneof == 'weight_shared_convolutional_box_predictor':
config_box_predictor = (
box_predictor_config.weight_shared_convolutional_box_predictor)
conv_hyperparams = hyperparams_fn(config_box_predictor.conv_hyperparams)
apply_batch_norm = config_box_predictor.conv_hyperparams.HasField(
'batch_norm')
# During training phase, logits are used to compute the loss. Only apply
# sigmoid at inference to make the inference graph TPU friendly. This is
# required because during TPU inference, model.postprocess is not called.
score_converter_fn = build_score_converter(
config_box_predictor.score_converter, is_training)
# Optionally apply clipping to box encodings, when box_encodings_clip_range
# is set.
box_encodings_clip_range = None
if config_box_predictor.HasField('box_encodings_clip_range'):
box_encodings_clip_range = BoxEncodingsClipRange(
min=config_box_predictor.box_encodings_clip_range.min,
max=config_box_predictor.box_encodings_clip_range.max)
return build_weight_shared_convolutional_keras_box_predictor(
is_training=is_training,
num_classes=num_classes,
conv_hyperparams=conv_hyperparams,
freeze_batchnorm=freeze_batchnorm,
inplace_batchnorm_update=inplace_batchnorm_update,
num_predictions_per_location_list=num_predictions_per_location_list,
depth=config_box_predictor.depth,
num_layers_before_predictor=(
config_box_predictor.num_layers_before_predictor),
box_code_size=config_box_predictor.box_code_size,
kernel_size=config_box_predictor.kernel_size,
add_background_class=add_background_class,
class_prediction_bias_init=(
config_box_predictor.class_prediction_bias_init),
use_dropout=config_box_predictor.use_dropout,
dropout_keep_prob=config_box_predictor.dropout_keep_probability,
share_prediction_tower=config_box_predictor.share_prediction_tower,
apply_batch_norm=apply_batch_norm,
use_depthwise=config_box_predictor.use_depthwise,
score_converter_fn=score_converter_fn,
box_encodings_clip_range=box_encodings_clip_range)
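The comment in this branch explains the score-converter choice: identity while training (the loss consumes raw logits) and sigmoid at inference, so the exported TPU graph emits probabilities without calling `model.postprocess`. The function below is an illustrative stand-in for that behavior, not the real `build_score_converter`:

```python
import math

# Illustrative stand-in for build_score_converter: identity during training,
# elementwise sigmoid at inference (the TPU-friendly export path).
def score_converter_sketch(is_training):
    if is_training:
        return lambda logit: logit  # tf.identity analogue
    return lambda logit: 1.0 / (1.0 + math.exp(-logit))  # sigmoid

print(score_converter_sketch(True)(2.0))   # → 2.0
print(score_converter_sketch(False)(0.0))  # → 0.5
```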
if box_predictor_oneof == 'mask_rcnn_box_predictor':
config_box_predictor = box_predictor_config.mask_rcnn_box_predictor
fc_hyperparams = hyperparams_fn(config_box_predictor.fc_hyperparams)
conv_hyperparams = None
if config_box_predictor.HasField('conv_hyperparams'):
conv_hyperparams = hyperparams_fn(
config_box_predictor.conv_hyperparams)
return build_mask_rcnn_keras_box_predictor(
is_training=is_training,
num_classes=num_classes,
add_background_class=add_background_class,
fc_hyperparams=fc_hyperparams,
freeze_batchnorm=freeze_batchnorm,
use_dropout=config_box_predictor.use_dropout,
dropout_keep_prob=config_box_predictor.dropout_keep_probability,
box_code_size=config_box_predictor.box_code_size,
share_box_across_classes=(
config_box_predictor.share_box_across_classes),
predict_instance_masks=config_box_predictor.predict_instance_masks,
conv_hyperparams=conv_hyperparams,
mask_height=config_box_predictor.mask_height,
mask_width=config_box_predictor.mask_width,
mask_prediction_num_conv_layers=(
config_box_predictor.mask_prediction_num_conv_layers),
mask_prediction_conv_depth=(
config_box_predictor.mask_prediction_conv_depth),
masks_are_class_agnostic=(
config_box_predictor.masks_are_class_agnostic),
convolve_then_upsample_masks=(
config_box_predictor.convolve_then_upsample_masks))
if box_predictor_oneof == 'rfcn_box_predictor':
config_box_predictor = box_predictor_config.rfcn_box_predictor
conv_hyperparams = hyperparams_fn(config_box_predictor.conv_hyperparams)
box_predictor_object = rfcn_keras_box_predictor.RfcnKerasBoxPredictor(
is_training=is_training,
num_classes=num_classes,
conv_hyperparams=conv_hyperparams,
freeze_batchnorm=freeze_batchnorm,
crop_size=[config_box_predictor.crop_height,
config_box_predictor.crop_width],
num_spatial_bins=[config_box_predictor.num_spatial_bins_height,
config_box_predictor.num_spatial_bins_width],
depth=config_box_predictor.depth,
box_code_size=config_box_predictor.box_code_size)
return box_predictor_object
raise ValueError(
'Unknown box predictor for Keras: {}'.format(box_predictor_oneof))
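The control flow of `build_keras` above is a chain of checks against the proto's oneof field, ending in a ValueError for unknown predictors. The dict-based lookup below is a minimal sketch of that dispatch pattern; the handler names mirror the branches in the source, but the helper itself is hypothetical:

```python
# Hypothetical sketch of the oneof dispatch performed by build_keras; the real
# code branches on box_predictor_config's set oneof field in sequence.
def dispatch_box_predictor(box_predictor_oneof, handlers):
    if box_predictor_oneof not in handlers:
        raise ValueError(
            'Unknown box predictor for Keras: {}'.format(box_predictor_oneof))
    return handlers[box_predictor_oneof]()

handlers = {
    'convolutional_box_predictor': lambda: 'convolutional',
    'weight_shared_convolutional_box_predictor': lambda: 'weight_shared',
    'mask_rcnn_box_predictor': lambda: 'mask_rcnn',
    'rfcn_box_predictor': lambda: 'rfcn',
}
print(dispatch_box_predictor('rfcn_box_predictor', handlers))  # → rfcn
```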
...@@ -353,6 +353,8 @@ class WeightSharedConvolutionalBoxPredictorBuilderTest(tf.test.TestCase):
self.assertEqual(box_predictor._apply_batch_norm, True)
class MaskRCNNBoxPredictorBuilderTest(tf.test.TestCase):
def test_box_predictor_builder_calls_fc_argscope_fn(self):
...
...@@ -56,9 +56,15 @@ def read_dataset(file_read_func, input_files, config):
Returns:
A tf.data.Dataset of (undecoded) tf-records based on config.
Raises:
RuntimeError: If no files are found at the supplied path(s).
"""
# Shard, shuffle, and read files.
filenames = tf.gfile.Glob(input_files)
if not filenames:
raise RuntimeError('Did not find any input files matching the glob pattern '
'{}'.format(input_files))
num_readers = config.num_readers
if num_readers > len(filenames):
num_readers = len(filenames)
...
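The guard added to `read_dataset` turns an empty glob match into an immediate `RuntimeError` instead of an opaque downstream `tf.data` failure. A minimal pure-Python sketch of the same fail-fast pattern, using the standard `glob` module in place of `tf.gfile.Glob` (function name is illustrative, not from the source):

```python
import glob


def list_input_files(file_pattern):
    """Expands a glob pattern, failing fast when nothing matches.

    Mirrors the check added to read_dataset: an empty match list would
    otherwise only surface later as a confusing dataset error.
    """
    filenames = glob.glob(file_pattern)
    if not filenames:
        raise RuntimeError(
            'Did not find any input files matching the glob pattern '
            '{}'.format(file_pattern))
    return filenames
```

Raising at construction time keeps the error close to the misconfigured path rather than deep inside the input pipeline.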
...@@ -32,11 +32,14 @@ def build(graph_rewriter_config, is_training):
# Quantize the graph by inserting quantize ops for weights and activations
if is_training:
tf.contrib.quantize.experimental_create_training_graph(
input_graph=tf.get_default_graph(),
quant_delay=graph_rewriter_config.quantization.delay)
else:
tf.contrib.quantize.experimental_create_eval_graph(
input_graph=tf.get_default_graph())
tf.contrib.layers.summarize_collection('quant_vars')
return graph_rewrite_fn
...@@ -23,7 +23,8 @@ class QuantizationBuilderTest(tf.test.TestCase):
def testQuantizationBuilderSetsUpCorrectTrainArguments(self):
with mock.patch.object(
tf.contrib.quantize,
'experimental_create_training_graph') as mock_quant_fn:
with mock.patch.object(tf.contrib.layers,
'summarize_collection') as mock_summarize_col:
graph_rewriter_proto = graph_rewriter_pb2.GraphRewriter()
...@@ -40,7 +41,7 @@ class QuantizationBuilderTest(tf.test.TestCase):
def testQuantizationBuilderSetsUpCorrectEvalArguments(self):
with mock.patch.object(tf.contrib.quantize,
'experimental_create_eval_graph') as mock_quant_fn:
with mock.patch.object(tf.contrib.layers,
'summarize_collection') as mock_summarize_col:
graph_rewriter_proto = graph_rewriter_pb2.GraphRewriter()
...
...@@ -110,6 +110,32 @@ def build(image_resizer_config):
else:
return [image, masks, tf.shape(image)]
return image_resizer_fn
elif image_resizer_oneof == 'conditional_shape_resizer':
conditional_shape_resize_config = (
image_resizer_config.conditional_shape_resizer)
method = _tf_resize_method(conditional_shape_resize_config.resize_method)
if conditional_shape_resize_config.condition == (
image_resizer_pb2.ConditionalShapeResizer.GREATER):
image_resizer_fn = functools.partial(
preprocessor.resize_to_max_dimension,
max_dimension=conditional_shape_resize_config.size_threshold,
method=method)
elif conditional_shape_resize_config.condition == (
image_resizer_pb2.ConditionalShapeResizer.SMALLER):
image_resizer_fn = functools.partial(
preprocessor.resize_to_min_dimension,
min_dimension=conditional_shape_resize_config.size_threshold,
method=method)
else:
raise ValueError(
'Invalid image resizer condition option for '
'ConditionalShapeResizer: \'%s\'.'
% conditional_shape_resize_config.condition)
if not conditional_shape_resize_config.convert_to_grayscale:
return image_resizer_fn
else:
raise ValueError(
'Invalid image resizer option: \'%s\'.' % image_resizer_oneof)
...
...@@ -147,6 +147,69 @@ class ImageResizerBuilderTest(tf.test.TestCase):
self.assertEqual(len(vals), 1)
self.assertEqual(vals[0], 1)
def test_build_conditional_shape_resizer_greater_returns_expected_shape(self):
image_resizer_text_proto = """
conditional_shape_resizer {
condition: GREATER
size_threshold: 30
}
"""
input_shape = (60, 30, 3)
expected_output_shape = (30, 15, 3)
output_shape = self._shape_of_resized_random_image_given_text_proto(
input_shape, image_resizer_text_proto)
self.assertEqual(output_shape, expected_output_shape)
def test_build_conditional_shape_resizer_same_shape_with_no_resize(self):
image_resizer_text_proto = """
conditional_shape_resizer {
condition: GREATER
size_threshold: 30
}
"""
input_shape = (15, 15, 3)
expected_output_shape = (15, 15, 3)
output_shape = self._shape_of_resized_random_image_given_text_proto(
input_shape, image_resizer_text_proto)
self.assertEqual(output_shape, expected_output_shape)
def test_build_conditional_shape_resizer_smaller_returns_expected_shape(self):
image_resizer_text_proto = """
conditional_shape_resizer {
condition: SMALLER
size_threshold: 30
}
"""
input_shape = (30, 15, 3)
expected_output_shape = (60, 30, 3)
output_shape = self._shape_of_resized_random_image_given_text_proto(
input_shape, image_resizer_text_proto)
self.assertEqual(output_shape, expected_output_shape)
def test_build_conditional_shape_resizer_grayscale(self):
image_resizer_text_proto = """
conditional_shape_resizer {
condition: GREATER
size_threshold: 30
convert_to_grayscale: true
}
"""
input_shape = (60, 30, 3)
expected_output_shape = (30, 15, 1)
output_shape = self._shape_of_resized_random_image_given_text_proto(
input_shape, image_resizer_text_proto)
self.assertEqual(output_shape, expected_output_shape)
def test_build_conditional_shape_resizer_error_on_invalid_condition(self):
invalid_image_resizer_text_proto = """
conditional_shape_resizer {
condition: INVALID
size_threshold: 30
}
"""
with self.assertRaises(ValueError):
image_resizer_builder.build(invalid_image_resizer_text_proto)
if __name__ == '__main__':
tf.test.main()
...@@ -33,6 +33,7 @@ from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.meta_architectures import rfcn_meta_arch
from object_detection.meta_architectures import ssd_meta_arch
from object_detection.models import faster_rcnn_inception_resnet_v2_feature_extractor as frcnn_inc_res
from object_detection.models import faster_rcnn_inception_resnet_v2_keras_feature_extractor as frcnn_inc_res_keras
from object_detection.models import faster_rcnn_inception_v2_feature_extractor as frcnn_inc_v2
from object_detection.models import faster_rcnn_nas_feature_extractor as frcnn_nas
from object_detection.models import faster_rcnn_pnas_feature_extractor as frcnn_pnas
...@@ -44,13 +45,16 @@ from object_detection.models.ssd_inception_v2_feature_extractor import SSDIncept
from object_detection.models.ssd_inception_v3_feature_extractor import SSDInceptionV3FeatureExtractor
from object_detection.models.ssd_mobilenet_v1_feature_extractor import SSDMobileNetV1FeatureExtractor
from object_detection.models.ssd_mobilenet_v1_fpn_feature_extractor import SSDMobileNetV1FpnFeatureExtractor
from object_detection.models.ssd_mobilenet_v1_fpn_keras_feature_extractor import SSDMobileNetV1FpnKerasFeatureExtractor
from object_detection.models.ssd_mobilenet_v1_keras_feature_extractor import SSDMobileNetV1KerasFeatureExtractor
from object_detection.models.ssd_mobilenet_v1_ppn_feature_extractor import SSDMobileNetV1PpnFeatureExtractor
from object_detection.models.ssd_mobilenet_v2_feature_extractor import SSDMobileNetV2FeatureExtractor
from object_detection.models.ssd_mobilenet_v2_fpn_feature_extractor import SSDMobileNetV2FpnFeatureExtractor
from object_detection.models.ssd_mobilenet_v2_fpn_keras_feature_extractor import SSDMobileNetV2FpnKerasFeatureExtractor
from object_detection.models.ssd_mobilenet_v2_keras_feature_extractor import SSDMobileNetV2KerasFeatureExtractor
from object_detection.models.ssd_pnasnet_feature_extractor import SSDPNASNetFeatureExtractor
from object_detection.predictors import rfcn_box_predictor
from object_detection.predictors import rfcn_keras_box_predictor
from object_detection.predictors.heads import mask_head
from object_detection.protos import model_pb2
from object_detection.utils import ops
...@@ -78,7 +82,9 @@ SSD_FEATURE_EXTRACTOR_CLASS_MAP = {
SSD_KERAS_FEATURE_EXTRACTOR_CLASS_MAP = {
'ssd_mobilenet_v1_keras': SSDMobileNetV1KerasFeatureExtractor,
'ssd_mobilenet_v1_fpn_keras': SSDMobileNetV1FpnKerasFeatureExtractor,
'ssd_mobilenet_v2_keras': SSDMobileNetV2KerasFeatureExtractor,
'ssd_mobilenet_v2_fpn_keras': SSDMobileNetV2FpnKerasFeatureExtractor,
}
# A map of names to Faster R-CNN feature extractors.
...@@ -99,6 +105,11 @@ FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP = {
frcnn_resnet_v1.FasterRCNNResnet152FeatureExtractor,
}
FASTER_RCNN_KERAS_FEATURE_EXTRACTOR_CLASS_MAP = {
'faster_rcnn_inception_resnet_v2_keras':
frcnn_inc_res_keras.FasterRCNNInceptionResnetV2KerasFeatureExtractor,
}
def build(model_config, is_training, add_summaries=True):
"""Builds a DetectionModel based on the model config.
...@@ -253,7 +264,7 @@ def _build_ssd_model(ssd_config, is_training, add_summaries):
ssd_config.anchor_generator)
if feature_extractor.is_keras_model:
ssd_box_predictor = box_predictor_builder.build_keras(
hyperparams_fn=hyperparams_builder.KerasLayerHyperparams,
freeze_batchnorm=ssd_config.freeze_batchnorm,
inplace_batchnorm_update=False,
num_predictions_per_location_list=anchor_generator
...@@ -355,7 +366,45 @@ def _build_faster_rcnn_feature_extractor(
feature_type]
return feature_extractor_class(
is_training, first_stage_features_stride,
batch_norm_trainable, reuse_weights=reuse_weights)
def _build_faster_rcnn_keras_feature_extractor(
feature_extractor_config, is_training,
inplace_batchnorm_update=False):
"""Builds a faster_rcnn_meta_arch.FasterRCNNKerasFeatureExtractor from config.
Args:
feature_extractor_config: A FasterRcnnFeatureExtractor proto config from
faster_rcnn.proto.
is_training: True if this feature extractor is being built for training.
inplace_batchnorm_update: Whether to update batch_norm inplace during
training. This is required for batch norm to work correctly on TPUs. When
this is false, user must add a control dependency on
tf.GraphKeys.UPDATE_OPS for train/loss op in order to update the batch
norm moving average parameters.
Returns:
faster_rcnn_meta_arch.FasterRCNNKerasFeatureExtractor based on config.
Raises:
ValueError: On invalid feature extractor type.
"""
if inplace_batchnorm_update:
raise ValueError('inplace batchnorm updates not supported.')
feature_type = feature_extractor_config.type
first_stage_features_stride = (
feature_extractor_config.first_stage_features_stride)
batch_norm_trainable = feature_extractor_config.batch_norm_trainable
if feature_type not in FASTER_RCNN_KERAS_FEATURE_EXTRACTOR_CLASS_MAP:
raise ValueError('Unknown Faster R-CNN feature_extractor: {}'.format(
feature_type))
feature_extractor_class = FASTER_RCNN_KERAS_FEATURE_EXTRACTOR_CLASS_MAP[
feature_type]
return feature_extractor_class(
is_training, first_stage_features_stride,
batch_norm_trainable)
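The slim and Keras feature extractor builders are selected by class-map membership, the same pattern the refactor uses for its `is_keras` check. A minimal stand-alone sketch of that dispatch (the classes and map contents here are hypothetical stand-ins, not the real extractors):

```python
class SlimExtractor(object):
    """Stand-in for a slim-based feature extractor."""


class KerasExtractor(object):
    """Stand-in for a Keras-based feature extractor."""


FEATURE_EXTRACTOR_CLASS_MAP = {'faster_rcnn_resnet101': SlimExtractor}
KERAS_FEATURE_EXTRACTOR_CLASS_MAP = {
    'faster_rcnn_inception_resnet_v2_keras': KerasExtractor}


def build_feature_extractor(feature_type):
    """Dispatches on class-map membership, mirroring the is_keras check."""
    if feature_type in KERAS_FEATURE_EXTRACTOR_CLASS_MAP:
        return KERAS_FEATURE_EXTRACTOR_CLASS_MAP[feature_type]()
    if feature_type in FEATURE_EXTRACTOR_CLASS_MAP:
        return FEATURE_EXTRACTOR_CLASS_MAP[feature_type]()
    raise ValueError('Unknown Faster R-CNN feature_extractor: {}'.format(
        feature_type))
```

Keeping two explicit maps, rather than a flag on each class, lets the config `type` string alone decide which code path (slim hyperparams vs. `KerasLayerHyperparams`) the rest of the builder takes.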
def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
...@@ -380,9 +429,17 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
num_classes = frcnn_config.num_classes
image_resizer_fn = image_resizer_builder.build(frcnn_config.image_resizer)
is_keras = (frcnn_config.feature_extractor.type in
FASTER_RCNN_KERAS_FEATURE_EXTRACTOR_CLASS_MAP)
if is_keras:
feature_extractor = _build_faster_rcnn_keras_feature_extractor(
frcnn_config.feature_extractor, is_training,
inplace_batchnorm_update=frcnn_config.inplace_batchnorm_update)
else:
feature_extractor = _build_faster_rcnn_feature_extractor(
frcnn_config.feature_extractor, is_training,
inplace_batchnorm_update=frcnn_config.inplace_batchnorm_update)
number_of_stages = frcnn_config.number_of_stages
first_stage_anchor_generator = anchor_generator_builder.build(
...@@ -393,8 +450,13 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
'proposal',
use_matmul_gather=frcnn_config.use_matmul_gather_in_matcher)
first_stage_atrous_rate = frcnn_config.first_stage_atrous_rate
if is_keras:
first_stage_box_predictor_arg_scope_fn = (
hyperparams_builder.KerasLayerHyperparams(
frcnn_config.first_stage_box_predictor_conv_hyperparams))
else:
first_stage_box_predictor_arg_scope_fn = hyperparams_builder.build(
frcnn_config.first_stage_box_predictor_conv_hyperparams, is_training)
first_stage_box_predictor_kernel_size = (
frcnn_config.first_stage_box_predictor_kernel_size)
first_stage_box_predictor_depth = frcnn_config.first_stage_box_predictor_depth
...@@ -432,11 +494,21 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
'FasterRCNN',
'detection',
use_matmul_gather=frcnn_config.use_matmul_gather_in_matcher)
if is_keras:
second_stage_box_predictor = box_predictor_builder.build_keras(
hyperparams_builder.KerasLayerHyperparams,
freeze_batchnorm=False,
inplace_batchnorm_update=False,
num_predictions_per_location_list=[1],
box_predictor_config=frcnn_config.second_stage_box_predictor,
is_training=is_training,
num_classes=num_classes)
else:
second_stage_box_predictor = box_predictor_builder.build(
hyperparams_builder.build,
frcnn_config.second_stage_box_predictor,
is_training=is_training,
num_classes=num_classes)
second_stage_batch_size = frcnn_config.second_stage_batch_size
second_stage_sampler = sampler.BalancedPositiveNegativeSampler(
positive_fraction=frcnn_config.second_stage_balance_fraction,
...@@ -507,8 +579,10 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
'resize_masks': frcnn_config.resize_masks
}
if (isinstance(second_stage_box_predictor,
rfcn_box_predictor.RfcnBoxPredictor) or
isinstance(second_stage_box_predictor,
rfcn_keras_box_predictor.RfcnKerasBoxPredictor)):
return rfcn_meta_arch.RFCNMetaArch(
second_stage_rfcn_box_predictor=second_stage_box_predictor,
**common_kwargs)
...
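The meta-architecture is chosen by the predictor's type, with the new Keras RFCN predictor routed to the same branch as the slim one. A small sketch of the dispatch with hypothetical stand-in classes; note that `isinstance` also accepts a tuple of classes, which is equivalent to the chained `or` in the builder:

```python
class RfcnBoxPredictor(object):
    """Stand-in for the slim RFCN predictor."""


class RfcnKerasBoxPredictor(object):
    """Stand-in for the new Keras RFCN predictor."""


class ConvolutionalBoxPredictor(object):
    """Stand-in for any non-RFCN predictor."""


def select_meta_arch(predictor):
    """Routes both RFCN predictor flavors to the RFCN meta-architecture."""
    if isinstance(predictor, (RfcnBoxPredictor, RfcnKerasBoxPredictor)):
        return 'rfcn'
    return 'faster_rcnn'
```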
...@@ -170,7 +170,13 @@ class Match(object):
row_indices: int32 tensor of shape [K] with row indices.
"""
return self._reshape_and_cast(
self._gather_op(tf.to_float(self._match_results),
self.matched_column_indices()))
def num_matched_rows(self):
"""Returns number (int32 scalar tensor) of matched rows."""
unique_rows, _ = tf.unique(self.matched_row_indices())
return tf.size(unique_rows)
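`num_matched_rows` counts distinct groundtruth rows rather than matched columns, so a row matched by several anchors is counted once. A plain-Python sketch of the semantics (using the Match convention that `-1` marks unmatched and `-2` ignored columns; the helper name is illustrative):

```python
def num_matched_rows(match_results):
    """Counts distinct groundtruth rows with at least one matched column.

    match_results[i] >= 0 gives the row matched to column i; negative
    values are the unmatched (-1) and ignored (-2) sentinels.
    """
    matched_rows = {m for m in match_results if m >= 0}
    return len(matched_rows)
```

For the test's `[3, 1, -1, 0, -1, 1, -2]` there are four matched columns but only three distinct rows ({0, 1, 3}).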
def _reshape_and_cast(self, t):
return tf.cast(tf.reshape(t, [-1]), tf.int32)
...@@ -199,7 +205,7 @@ class Match(object):
"""
input_tensor = tf.concat(
[tf.stack([ignored_value, unmatched_value]),
input_tensor],
axis=0)
gather_indices = tf.maximum(self.match_results + 2, 0)
gathered_tensor = self._gather_op(input_tensor, gather_indices)
...
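`gather_based_on_match` avoids branching on the `-1`/`-2` sentinels by prepending the ignored and unmatched fill values to the input and shifting every index by two, so the sentinels land on those two slots. A plain-Python sketch of the trick:

```python
def gather_based_on_match(input_values, match_results,
                          unmatched_value, ignored_value):
    """Plain-Python analogue of Match.gather_based_on_match.

    Prepends [ignored_value, unmatched_value] and shifts match results
    by +2: -2 maps to slot 0 (ignored), -1 to slot 1 (unmatched), and a
    genuine match m >= 0 to slot m + 2, i.e. the original value.
    """
    padded = [ignored_value, unmatched_value] + list(input_values)
    return [padded[max(m + 2, 0)] for m in match_results]
```

This is the same shift that `tf.maximum(self.match_results + 2, 0)` performs above, and it keeps the op a single gather, which matters for the TPU-friendly `use_matmul_gather` path.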
...@@ -27,37 +27,42 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results)
expected_column_indices = [0, 1, 3, 5]
matched_column_indices = match.matched_column_indices()
self.assertEqual(matched_column_indices.dtype, tf.int32)
with self.test_session() as sess:
matched_column_indices = sess.run(matched_column_indices)
self.assertAllEqual(matched_column_indices, expected_column_indices)
def test_get_correct_counts(self):
match_results = tf.constant([3, 1, -1, 0, -1, 1, -2])
match = matcher.Match(match_results)
exp_num_matched_columns = 4
exp_num_unmatched_columns = 2
exp_num_ignored_columns = 1
exp_num_matched_rows = 3
num_matched_columns = match.num_matched_columns()
num_unmatched_columns = match.num_unmatched_columns()
num_ignored_columns = match.num_ignored_columns()
num_matched_rows = match.num_matched_rows()
self.assertEqual(num_matched_columns.dtype, tf.int32)
self.assertEqual(num_unmatched_columns.dtype, tf.int32)
self.assertEqual(num_ignored_columns.dtype, tf.int32)
self.assertEqual(num_matched_rows.dtype, tf.int32)
with self.test_session() as sess:
(num_matched_columns_out, num_unmatched_columns_out,
num_ignored_columns_out, num_matched_rows_out) = sess.run(
[num_matched_columns, num_unmatched_columns, num_ignored_columns,
num_matched_rows])
self.assertAllEqual(num_matched_columns_out, exp_num_matched_columns)
self.assertAllEqual(num_unmatched_columns_out, exp_num_unmatched_columns)
self.assertAllEqual(num_ignored_columns_out, exp_num_ignored_columns)
self.assertAllEqual(num_matched_rows_out, exp_num_matched_rows)
def testGetCorrectUnmatchedColumnIndices(self):
match_results = tf.constant([3, 1, -1, 0, -1, 5, -2])
match = matcher.Match(match_results)
expected_column_indices = [2, 4]
unmatched_column_indices = match.unmatched_column_indices()
self.assertEqual(unmatched_column_indices.dtype, tf.int32)
with self.test_session() as sess:
unmatched_column_indices = sess.run(unmatched_column_indices)
self.assertAllEqual(unmatched_column_indices, expected_column_indices)
...@@ -67,7 +72,7 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results)
expected_row_indices = [3, 1, 0, 5]
matched_row_indices = match.matched_row_indices()
self.assertEqual(matched_row_indices.dtype, tf.int32)
with self.test_session() as sess:
matched_row_inds = sess.run(matched_row_indices)
self.assertAllEqual(matched_row_inds, expected_row_indices)
...@@ -77,7 +82,7 @@ class MatchTest(tf.test.TestCase): ...@@ -77,7 +82,7 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results) match = matcher.Match(match_results)
expected_column_indices = [6] expected_column_indices = [6]
ignored_column_indices = match.ignored_column_indices() ignored_column_indices = match.ignored_column_indices()
self.assertEquals(ignored_column_indices.dtype, tf.int32) self.assertEqual(ignored_column_indices.dtype, tf.int32)
with self.test_session() as sess: with self.test_session() as sess:
ignored_column_indices = sess.run(ignored_column_indices) ignored_column_indices = sess.run(ignored_column_indices)
self.assertAllEqual(ignored_column_indices, expected_column_indices) self.assertAllEqual(ignored_column_indices, expected_column_indices)
...@@ -87,7 +92,7 @@ class MatchTest(tf.test.TestCase): ...@@ -87,7 +92,7 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results) match = matcher.Match(match_results)
expected_column_indicator = [True, True, False, True, False, True, False] expected_column_indicator = [True, True, False, True, False, True, False]
matched_column_indicator = match.matched_column_indicator() matched_column_indicator = match.matched_column_indicator()
self.assertEquals(matched_column_indicator.dtype, tf.bool) self.assertEqual(matched_column_indicator.dtype, tf.bool)
with self.test_session() as sess: with self.test_session() as sess:
matched_column_indicator = sess.run(matched_column_indicator) matched_column_indicator = sess.run(matched_column_indicator)
self.assertAllEqual(matched_column_indicator, expected_column_indicator) self.assertAllEqual(matched_column_indicator, expected_column_indicator)
...@@ -97,7 +102,7 @@ class MatchTest(tf.test.TestCase): ...@@ -97,7 +102,7 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results) match = matcher.Match(match_results)
expected_column_indicator = [False, False, True, False, True, False, False] expected_column_indicator = [False, False, True, False, True, False, False]
unmatched_column_indicator = match.unmatched_column_indicator() unmatched_column_indicator = match.unmatched_column_indicator()
self.assertEquals(unmatched_column_indicator.dtype, tf.bool) self.assertEqual(unmatched_column_indicator.dtype, tf.bool)
with self.test_session() as sess: with self.test_session() as sess:
unmatched_column_indicator = sess.run(unmatched_column_indicator) unmatched_column_indicator = sess.run(unmatched_column_indicator)
self.assertAllEqual(unmatched_column_indicator, expected_column_indicator) self.assertAllEqual(unmatched_column_indicator, expected_column_indicator)
...@@ -107,7 +112,7 @@ class MatchTest(tf.test.TestCase): ...@@ -107,7 +112,7 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results) match = matcher.Match(match_results)
expected_column_indicator = [False, False, False, False, False, False, True] expected_column_indicator = [False, False, False, False, False, False, True]
ignored_column_indicator = match.ignored_column_indicator() ignored_column_indicator = match.ignored_column_indicator()
self.assertEquals(ignored_column_indicator.dtype, tf.bool) self.assertEqual(ignored_column_indicator.dtype, tf.bool)
with self.test_session() as sess: with self.test_session() as sess:
ignored_column_indicator = sess.run(ignored_column_indicator) ignored_column_indicator = sess.run(ignored_column_indicator)
self.assertAllEqual(ignored_column_indicator, expected_column_indicator) self.assertAllEqual(ignored_column_indicator, expected_column_indicator)
...@@ -118,7 +123,7 @@ class MatchTest(tf.test.TestCase): ...@@ -118,7 +123,7 @@ class MatchTest(tf.test.TestCase):
expected_column_indices = [2, 4, 6] expected_column_indices = [2, 4, 6]
unmatched_ignored_column_indices = (match. unmatched_ignored_column_indices = (match.
unmatched_or_ignored_column_indices()) unmatched_or_ignored_column_indices())
self.assertEqual(unmatched_ignored_column_indices.dtype, tf.int32)
with self.test_session() as sess:
unmatched_ignored_column_indices = sess.run(
unmatched_ignored_column_indices)
@@ -153,7 +158,7 @@ class MatchTest(tf.test.TestCase):
gathered_tensor = match.gather_based_on_match(input_tensor,
unmatched_value=100.,
ignored_value=200.)
self.assertEqual(gathered_tensor.dtype, tf.float32)
with self.test_session():
gathered_tensor_out = gathered_tensor.eval()
self.assertAllEqual(expected_gathered_tensor, gathered_tensor_out)
@@ -167,7 +172,7 @@ class MatchTest(tf.test.TestCase):
gathered_tensor = match.gather_based_on_match(input_tensor,
unmatched_value=tf.zeros(4),
ignored_value=tf.zeros(4))
self.assertEqual(gathered_tensor.dtype, tf.float32)
with self.test_session():
gathered_tensor_out = gathered_tensor.eval()
self.assertAllEqual(expected_gathered_tensor, gathered_tensor_out)
@@ -181,7 +186,7 @@ class MatchTest(tf.test.TestCase):
gathered_tensor = match.gather_based_on_match(input_tensor,
unmatched_value=tf.zeros(4),
ignored_value=tf.zeros(4))
self.assertEqual(gathered_tensor.dtype, tf.float32)
with self.test_session() as sess:
self.assertTrue(
all([op.name is not 'Gather' for op in sess.graph.get_operations()]))
@@ -12,9 +12,9 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Post-processing operations on detected boxes."""
import collections
import numpy as np
import tensorflow as tf
@@ -271,8 +271,8 @@ def batch_multiclass_non_max_suppression(boxes,
all images in the batch. If clip_window is None, all boxes are used to
perform non-max suppression.
change_coordinate_frame: Whether to normalize coordinates after clipping
relative to clip_window (this can only be set to True if a clip_window is
provided)
num_valid_boxes: (optional) a Tensor of type `int32`. A 1-D tensor of shape
[batch_size] representing the number of valid boxes to be considered
for each image in the batch. This parameter allows for ignoring zero
@@ -322,7 +322,17 @@ def batch_multiclass_non_max_suppression(boxes,
raise ValueError('if change_coordinate_frame is True, then a clip_window'
'must be specified.')
original_masks = masks
# Create ordered dictionary using the sorted keys from
# additional fields to ensure getting the same key value assignment
# in _single_image_nms_fn(). The dictionary is thus a sorted version of
# additional_fields.
if additional_fields is None:
ordered_additional_fields = {}
else:
ordered_additional_fields = collections.OrderedDict(
sorted(additional_fields.items(), key=lambda item: item[0]))
del additional_fields
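The block above canonicalizes `additional_fields` before its values are zipped against positional `map_fn` outputs. A minimal standalone sketch (plain Python, with made-up field names) of why the sorted `OrderedDict` yields a stable key-to-position mapping:

```python
import collections

# Hypothetical field names, for illustration only.
additional_fields = {'size': [2.0], 'keypoints': [1.0], 'embeddings': [3.0]}

# Sorting the items before building the OrderedDict fixes the iteration
# order, so zip(ordered_additional_fields, per_image_outputs) pairs each
# key with the same positional output on every run.
ordered_additional_fields = collections.OrderedDict(
    sorted(additional_fields.items(), key=lambda item: item[0]))

print(list(ordered_additional_fields))  # ['embeddings', 'keypoints', 'size']
```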
with tf.name_scope(scope, 'BatchMultiClassNonMaxSuppression'):
boxes_shape = boxes.shape
batch_size = boxes_shape[0].value
@@ -354,9 +364,6 @@ def batch_multiclass_non_max_suppression(boxes,
if clip_window.shape.ndims == 1:
clip_window = tf.tile(tf.expand_dims(clip_window, 0), [batch_size, 1])
def _single_image_nms_fn(args):
"""Runs NMS on a single image and returns padded output.
@@ -403,9 +410,11 @@ def batch_multiclass_non_max_suppression(boxes,
per_image_scores = args[1]
per_image_masks = args[2]
per_image_clip_window = args[3]
# Make sure that the order of elements passed in args is aligned with
# the iteration order of ordered_additional_fields
per_image_additional_fields = {
key: value
for key, value in zip(ordered_additional_fields, args[4:-1])
}
per_image_num_valid_boxes = args[-1]
if use_static_shapes:
@@ -459,21 +468,24 @@ def batch_multiclass_non_max_suppression(boxes,
nmsed_scores = nmsed_boxlist.get_field(fields.BoxListFields.scores)
nmsed_classes = nmsed_boxlist.get_field(fields.BoxListFields.classes)
nmsed_masks = nmsed_boxlist.get_field(fields.BoxListFields.masks)
nmsed_additional_fields = []
# Sorting is needed here to ensure that the values stored in
# nmsed_additional_fields are always kept in the same order
# across different execution runs.
for key in sorted(per_image_additional_fields.keys()):
nmsed_additional_fields.append(nmsed_boxlist.get_field(key))
return ([nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks] +
nmsed_additional_fields + [num_detections])
num_additional_fields = 0
if ordered_additional_fields:
num_additional_fields = len(ordered_additional_fields)
num_nmsed_outputs = 4 + num_additional_fields
batch_outputs = shape_utils.static_or_dynamic_map_fn(
_single_image_nms_fn,
elems=([boxes, scores, masks, clip_window] +
list(ordered_additional_fields.values()) + [num_valid_boxes]),
dtype=(num_nmsed_outputs * [tf.float32] + [tf.int32]),
parallel_iterations=parallel_iterations)
@@ -481,16 +493,23 @@ def batch_multiclass_non_max_suppression(boxes,
batch_nmsed_scores = batch_outputs[1]
batch_nmsed_classes = batch_outputs[2]
batch_nmsed_masks = batch_outputs[3]
batch_nmsed_values = batch_outputs[4:-1]
batch_nmsed_additional_fields = {}
if num_additional_fields > 0:
# Sort the keys to ensure arranging elements in same order as
# in _single_image_nms_fn.
batch_nmsed_keys = ordered_additional_fields.keys()
for i in range(len(batch_nmsed_keys)):
batch_nmsed_additional_fields[
batch_nmsed_keys[i]] = batch_nmsed_values[i]
batch_num_detections = batch_outputs[-1]
if original_masks is None:
batch_nmsed_masks = None
if not ordered_additional_fields:
batch_nmsed_additional_fields = None
return (batch_nmsed_boxes, batch_nmsed_scores, batch_nmsed_classes,
@@ -839,6 +839,9 @@ class MulticlassNonMaxSuppressionTest(test_case.TestCase):
[[0, 0], [0, 0]]]],
tf.float32)
}
additional_fields['size'] = tf.constant(
[[[[6], [8]], [[0], [2]], [[0], [0]], [[0], [0]]],
[[[13], [15]], [[8], [10]], [[10], [12]], [[0], [0]]]], tf.float32)
score_thresh = 0.1
iou_thresh = .5
max_output_size = 4
@@ -865,6 +868,10 @@ class MulticlassNonMaxSuppressionTest(test_case.TestCase):
[[8, 9], [10, 11]],
[[0, 0], [0, 0]]]])
}
exp_nms_additional_fields['size'] = np.array([[[[0], [0]], [[6], [8]],
[[0], [0]], [[0], [0]]],
[[[10], [12]], [[13], [15]],
[[8], [10]], [[0], [0]]]])
(nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks,
nmsed_additional_fields, num_detections
@@ -1071,6 +1078,11 @@ class MulticlassNonMaxSuppressionTest(test_case.TestCase):
[[0, 0], [0, 0]]]],
tf.float32)
}
additional_fields['size'] = tf.constant(
[[[[7], [9]], [[1], [3]], [[0], [0]], [[0], [0]]],
[[[14], [16]], [[9], [11]], [[11], [13]], [[0], [0]]]], tf.float32)
num_valid_boxes = tf.constant([1, 1], tf.int32)
score_thresh = 0.1
iou_thresh = .5
@@ -1099,6 +1111,11 @@ class MulticlassNonMaxSuppressionTest(test_case.TestCase):
[[0, 0], [0, 0]]]])
}
exp_nms_additional_fields['size'] = np.array([[[[7], [9]], [[0], [0]],
[[0], [0]], [[0], [0]]],
[[[14], [16]], [[0], [0]],
[[0], [0]], [[0], [0]]]])
(nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks,
nmsed_additional_fields, num_detections
) = post_processing.batch_multiclass_non_max_suppression(
@@ -2298,11 +2298,20 @@ def resize_to_range(image,
return result
def _get_image_info(image):
"""Returns the height, width and number of channels in the image."""
image_height = tf.shape(image)[0]
image_width = tf.shape(image)[1]
num_channels = tf.shape(image)[2]
return (image_height, image_width, num_channels)
# TODO(alirezafathi): Make sure the static shapes are preserved.
def resize_to_min_dimension(image, masks=None, min_dimension=600,
method=tf.image.ResizeMethod.BILINEAR):
"""Resizes image and masks given the min size maintaining the aspect ratio.
If one of the image dimensions is smaller than min_dimension, it will scale
the image such that its smallest dimension is equal to min_dimension.
Otherwise, will keep the image size as is.
@@ -2310,8 +2319,11 @@ def resize_to_min_dimension(image, masks=None, min_dimension=600):
image: a tensor of size [height, width, channels].
masks: (optional) a tensors of size [num_instances, height, width].
min_dimension: minimum image dimension.
method: (optional) interpolation method used in resizing. Defaults to
BILINEAR.
Returns:
An array containing resized_image, resized_masks, and resized_image_shape.
Note that the position of the resized_image_shape changes based on whether
masks are present.
resized_image: A tensor of size [new_height, new_width, channels].
@@ -2327,18 +2339,72 @@ def resize_to_min_dimension(image, masks=None, min_dimension=600):
raise ValueError('Image should be 3D tensor')
with tf.name_scope('ResizeGivenMinDimension', values=[image, min_dimension]):
(image_height, image_width, num_channels) = _get_image_info(image)
min_image_dimension = tf.minimum(image_height, image_width)
min_target_dimension = tf.maximum(min_image_dimension, min_dimension)
target_ratio = tf.to_float(min_target_dimension) / tf.to_float(
min_image_dimension)
target_height = tf.to_int32(tf.to_float(image_height) * target_ratio)
target_width = tf.to_int32(tf.to_float(image_width) * target_ratio)
image = tf.image.resize_images(
tf.expand_dims(image, axis=0), size=[target_height, target_width],
method=method,
align_corners=True)
result = [tf.squeeze(image, axis=0)]
if masks is not None:
masks = tf.image.resize_nearest_neighbor(
tf.expand_dims(masks, axis=3),
size=[target_height, target_width],
align_corners=True)
result.append(tf.squeeze(masks, axis=3))
result.append(tf.stack([target_height, target_width, num_channels]))
return result
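The scaling arithmetic in `resize_to_min_dimension` can be checked without TensorFlow. Below is a hedged sketch of the same shape computation in plain Python (the helper name is ours, not the library's): the image is upscaled only when its smaller side is below `min_dimension`, preserving aspect ratio, and left unchanged otherwise.

```python
def min_dimension_target_size(height, width, min_dimension=600):
    # Mirror of the tensor ops: tf.minimum / tf.maximum / ratio scaling.
    min_image_dimension = min(height, width)
    min_target_dimension = max(min_image_dimension, min_dimension)
    target_ratio = min_target_dimension / float(min_image_dimension)
    return int(height * target_ratio), int(width * target_ratio)

print(min_dimension_target_size(300, 400))  # upscaled: (600, 800)
print(min_dimension_target_size(700, 900))  # unchanged: (700, 900)
```

`resize_to_max_dimension` below is the mirror image: swap min/max and the image is only ever downscaled.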
def resize_to_max_dimension(image, masks=None, max_dimension=600,
method=tf.image.ResizeMethod.BILINEAR):
"""Resizes image and masks given the max size maintaining the aspect ratio.
If one of the image dimensions is greater than max_dimension, it will scale
the image such that its largest dimension is equal to max_dimension.
Otherwise, will keep the image size as is.
Args:
image: a tensor of size [height, width, channels].
masks: (optional) a tensors of size [num_instances, height, width].
max_dimension: maximum image dimension.
method: (optional) interpolation method used in resizing. Defaults to
BILINEAR.
Returns:
An array containing resized_image, resized_masks, and resized_image_shape.
Note that the position of the resized_image_shape changes based on whether
masks are present.
resized_image: A tensor of size [new_height, new_width, channels].
resized_masks: If masks is not None, also outputs masks. A 3D tensor of
shape [num_instances, new_height, new_width]
resized_image_shape: A 1D tensor of shape [3] containing the shape of the
resized image.
Raises:
ValueError: if the image is not a 3D tensor.
"""
if len(image.get_shape()) != 3:
raise ValueError('Image should be 3D tensor')
with tf.name_scope('ResizeGivenMaxDimension', values=[image, max_dimension]):
(image_height, image_width, num_channels) = _get_image_info(image)
max_image_dimension = tf.maximum(image_height, image_width)
max_target_dimension = tf.minimum(max_image_dimension, max_dimension)
target_ratio = tf.to_float(max_target_dimension) / tf.to_float(
max_image_dimension)
target_height = tf.to_int32(tf.to_float(image_height) * target_ratio)
target_width = tf.to_int32(tf.to_float(image_width) * target_ratio)
image = tf.image.resize_images(
tf.expand_dims(image, axis=0), size=[target_height, target_width],
method=method,
align_corners=True)
result = [tf.squeeze(image, axis=0)]
@@ -2663,6 +2663,68 @@ class PreprocessorTest(tf.test.TestCase):
out_image_shape = sess.run(out_image_shape)
self.assertAllEqual(out_image_shape, expected_shape)
def testResizeToMaxDimensionTensorShapes(self):
"""Tests both cases where image should and shouldn't be resized."""
in_image_shape_list = [[100, 50, 3], [15, 30, 3]]
in_masks_shape_list = [[15, 100, 50], [10, 15, 30]]
max_dim = 50
expected_image_shape_list = [[50, 25, 3], [15, 30, 3]]
expected_masks_shape_list = [[15, 50, 25], [10, 15, 30]]
for (in_image_shape, expected_image_shape, in_masks_shape,
expected_mask_shape) in zip(in_image_shape_list,
expected_image_shape_list,
in_masks_shape_list,
expected_masks_shape_list):
in_image = tf.placeholder(tf.float32, shape=(None, None, 3))
in_masks = tf.placeholder(tf.float32, shape=(None, None, None))
in_masks = tf.random_uniform(in_masks_shape)
out_image, out_masks, _ = preprocessor.resize_to_max_dimension(
in_image, in_masks, max_dimension=max_dim)
out_image_shape = tf.shape(out_image)
out_masks_shape = tf.shape(out_masks)
with self.test_session() as sess:
out_image_shape, out_masks_shape = sess.run(
[out_image_shape, out_masks_shape],
feed_dict={
in_image: np.random.randn(*in_image_shape),
in_masks: np.random.randn(*in_masks_shape)
})
self.assertAllEqual(out_image_shape, expected_image_shape)
self.assertAllEqual(out_masks_shape, expected_mask_shape)
def testResizeToMaxDimensionWithInstanceMasksTensorOfSizeZero(self):
"""Tests both cases where image should and shouldn't be resized."""
in_image_shape_list = [[100, 50, 3], [15, 30, 3]]
in_masks_shape_list = [[0, 100, 50], [0, 15, 30]]
max_dim = 50
expected_image_shape_list = [[50, 25, 3], [15, 30, 3]]
expected_masks_shape_list = [[0, 50, 25], [0, 15, 30]]
for (in_image_shape, expected_image_shape, in_masks_shape,
expected_mask_shape) in zip(in_image_shape_list,
expected_image_shape_list,
in_masks_shape_list,
expected_masks_shape_list):
in_image = tf.random_uniform(in_image_shape)
in_masks = tf.random_uniform(in_masks_shape)
out_image, out_masks, _ = preprocessor.resize_to_max_dimension(
in_image, in_masks, max_dimension=max_dim)
out_image_shape = tf.shape(out_image)
out_masks_shape = tf.shape(out_masks)
with self.test_session() as sess:
out_image_shape, out_masks_shape = sess.run(
[out_image_shape, out_masks_shape])
self.assertAllEqual(out_image_shape, expected_image_shape)
self.assertAllEqual(out_masks_shape, expected_mask_shape)
def testResizeToMaxDimensionRaisesErrorOn4DImage(self):
image = tf.random_uniform([1, 200, 300, 3])
with self.assertRaises(ValueError):
preprocessor.resize_to_max_dimension(image, 500)
def testResizeToMinDimensionTensorShapes(self):
in_image_shape_list = [[60, 55, 3], [15, 30, 3]]
in_masks_shape_list = [[15, 60, 55], [10, 15, 30]]
@@ -130,9 +130,13 @@ class TargetAssigner(object):
representing weights for each element in cls_targets.
reg_targets: a float32 tensor with shape [num_anchors, box_code_dimension]
reg_weights: a float32 tensor with shape [num_anchors]
match: an int32 tensor of shape [num_anchors] containing result of anchor
groundtruth matching. Each position in the tensor indicates an anchor
and holds the following meaning:
(1) if match[i] >= 0, anchor i is matched with groundtruth match[i].
(2) if match[i]=-1, anchor i is marked to be background .
(3) if match[i]=-2, anchor i is ignored since it is not background and
does not have sufficient overlap to call it a foreground.
Raises:
ValueError: if anchors or groundtruth_boxes are not of type
@@ -203,7 +207,8 @@ class TargetAssigner(object):
reg_weights = self._reset_target_shape(reg_weights, num_anchors)
cls_weights = self._reset_target_shape(cls_weights, num_anchors)
return (cls_targets, cls_weights, reg_targets, reg_weights,
match.match_results)
def _reset_target_shape(self, target, num_anchors):
"""Sets the static shape of the target.
@@ -416,12 +421,12 @@ def create_target_assigner(reference, stage=None,
negative_class_weight=negative_class_weight)
def batch_assign(target_assigner,
anchors_batch,
gt_box_batch,
gt_class_targets_batch,
unmatched_class_label=None,
gt_weights_batch=None):
"""Batched assignment of classification and regression targets.
Args:
@@ -450,10 +455,14 @@ def batch_assign_targets(target_assigner,
batch_reg_targets: a tensor with shape [batch_size, num_anchors,
box_code_dimension]
batch_reg_weights: a tensor with shape [batch_size, num_anchors],
match: an int32 tensor of shape [batch_size, num_anchors] containing result
of anchor groundtruth matching. Each position in the tensor indicates an
anchor and holds the following meaning:
(1) if match[x, i] >= 0, anchor i is matched with groundtruth match[x, i].
(2) if match[x, i]=-1, anchor i is marked to be background .
(3) if match[x, i]=-2, anchor i is ignored since it is not background and
does not have sufficient overlap to call it a foreground.
Raises:
ValueError: if input list lengths are inconsistent, i.e.,
batch_size == len(gt_box_batch) == len(gt_class_targets_batch)
@@ -491,8 +500,55 @@ def batch_assign_targets(target_assigner,
batch_cls_weights = tf.stack(cls_weights_list)
batch_reg_targets = tf.stack(reg_targets_list)
batch_reg_weights = tf.stack(reg_weights_list)
batch_match = tf.stack(match_list)
return (batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, batch_match)
# Assign an alias to avoid large refactor of existing users.
batch_assign_targets = batch_assign
def batch_get_targets(batch_match, groundtruth_tensor_list,
groundtruth_weights_list, unmatched_value,
unmatched_weight):
"""Returns targets based on anchor-groundtruth box matching results.
Args:
batch_match: An int32 tensor of shape [batch, num_anchors] containing the
result of target assignment returned by TargetAssigner.assign(..).
groundtruth_tensor_list: A list of groundtruth tensors of shape
[num_groundtruth, d_1, d_2, ..., d_k]. The tensors can be of any type.
groundtruth_weights_list: A list of weights, one per groundtruth tensor, of
shape [num_groundtruth].
unmatched_value: A tensor of shape [d_1, d_2, ..., d_k] of the same type as
groundtruth tensor containing target value for anchors that remain
unmatched.
unmatched_weight: Scalar weight to assign to anchors that remain unmatched.
Returns:
targets: A tensor of shape [batch, num_anchors, d_1, d_2, ..., d_k]
containing targets for anchors.
weights: A float tensor of shape [batch, num_anchors] containing the weights
to assign to each target.
"""
match_list = tf.unstack(batch_match)
targets_list = []
weights_list = []
for match_tensor, groundtruth_tensor, groundtruth_weight in zip(
match_list, groundtruth_tensor_list, groundtruth_weights_list):
match_object = mat.Match(match_tensor)
targets = match_object.gather_based_on_match(
groundtruth_tensor,
unmatched_value=unmatched_value,
ignored_value=unmatched_value)
targets_list.append(targets)
weights = match_object.gather_based_on_match(
groundtruth_weight,
unmatched_value=unmatched_weight,
ignored_value=tf.zeros_like(unmatched_weight))
weights_list.append(weights)
return tf.stack(targets_list), tf.stack(weights_list)
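For readers adjusting to the new match-tensor convention, here is an illustrative numpy sketch (not the library implementation) of how `gather_based_on_match` treats the three value ranges described in the docstrings above: non-negative entries index into the groundtruth, while -1 (background) and -2 (ignored) fall back to the unmatched value.

```python
import numpy as np

def gather_based_on_match_sketch(match, groundtruth, unmatched_value):
    # match[i] >= 0: gather groundtruth[match[i]];
    # match[i] == -1 (background) or -2 (ignored): use unmatched_value.
    return np.array([groundtruth[m] if m >= 0 else unmatched_value
                     for m in match])

match = np.array([1, -1, 0, -2])
groundtruth = np.array([10.0, 20.0])
print(gather_based_on_match_sketch(match, groundtruth, unmatched_value=0.0))
# [20.  0. 10.  0.]
```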
def batch_assign_confidences(target_assigner,
@@ -548,10 +604,13 @@ def batch_assign_confidences(target_assigner,
batch_reg_targets: a tensor with shape [batch_size, num_anchors,
box_code_dimension]
batch_reg_weights: a tensor with shape [batch_size, num_anchors],
match: an int32 tensor of shape [batch_size, num_anchors] containing result
of anchor groundtruth matching. Each position in the tensor indicates an
anchor and holds the following meaning:
(1) if match[x, i] >= 0, anchor i is matched with groundtruth match[x, i].
(2) if match[x, i]=-1, anchor i is marked to be background .
(3) if match[x, i]=-2, anchor i is ignored since it is not background and
does not have sufficient overlap to call it a foreground.
Raises:
ValueError: if input list lengths are inconsistent, i.e.,
@@ -634,5 +693,6 @@ def batch_assign_confidences(target_assigner,
batch_cls_weights = tf.stack(cls_weights_list)
batch_reg_targets = tf.stack(reg_targets_list)
batch_reg_weights = tf.stack(reg_weights_list)
batch_match = tf.stack(match_list)
return (batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, batch_match)