Commit 1efe98bb authored by Zhichao Lu, committed by lzc5123016

Merged commit includes the following changes:

185215255  by Zhichao Lu:

    Stop populating image/object/class/text field when generating COCO tf record.

--
185213306  by Zhichao Lu:

    Use the params batch size and not the one from train_config in input_fn

--
185209081  by Zhichao Lu:

    Handle the case when there are no ground-truth masks for an image.

--
185195531  by Zhichao Lu:

    Remove unstack and stack operations on features from third_party/object_detection/model.py.

--
185195017  by Zhichao Lu:

    Matrix multiplication based gather op implementation.
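
    A minimal sketch of the idea, assuming a 2-D `params` tensor (the helper
    name is illustrative, not necessarily what this change adds): some
    accelerators handle a dense matmul better than `tf.gather`, so the lookup
    is phrased as one_hot(indices) @ params.

```python
import tensorflow as tf

def matmul_gather(params, indices):
  # one_hot(indices) has shape [n, num_rows]; multiplying by params
  # ([num_rows, d]) selects the indexed rows, yielding [n, d].
  one_hot = tf.one_hot(indices, depth=tf.shape(params)[0],
                       dtype=params.dtype)
  return tf.matmul(one_hot, params)
```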

--
185187744  by Zhichao Lu:

    Fix eval_util minor issue.

--
185098733  by Zhichao Lu:

    Internal change

--
185076656  by Zhichao Lu:

    Increase the number of boxes for coco17.

--
185074199  by Zhichao Lu:

    Add config for SSD Resnet50 v1 with FPN.

--
185060199  by Zhichao Lu:

    Fix a bug in clear_detections.
    This method set detection_keys to an empty dictionary instead of an empty set. I've refactored it so that this method and the constructor use the same code path.
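
    A hedged sketch of the refactor pattern (class and member names are
    illustrative):

```python
class Evaluator(object):

  def __init__(self):
    self._reset_state()

  def clear(self):
    # Previously re-implemented the reset here and used {} (an empty dict)
    # where an empty set was intended; sharing one helper avoids the drift.
    self._reset_state()

  def _reset_state(self):
    self.detection_keys = set()
```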

--
185031359  by Zhichao Lu:

    Evaluate TPU-trained models continuously.

--
185016591  by Zhichao Lu:

    Use TPUEstimatorSpec for TPU

--
185013651  by Zhichao Lu:

    Add PreprocessorCache to record and duplicate augmentations.
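
    A hedged sketch of the record-and-replay idea (the real class may differ
    in detail): the first preprocessing call stores its sampled random
    parameters, and a second call replays them so that, e.g., an image and
    its mask receive identical random flips.

```python
class PreprocessorCache(object):
  """Records sampled augmentation parameters so they can be replayed."""

  def __init__(self):
    self._cache = {}

  def get(self, function_id, key):
    # Returns the previously recorded value, or None on the first call.
    return self._cache.get(function_id, {}).get(key)

  def update(self, function_id, key, value):
    self._cache.setdefault(function_id, {})[key] = value
```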

--
184921763  by Zhichao Lu:

    Minor fixes for object detection.

--
184920610  by Zhichao Lu:

    Adds a model builder test for "embedded_ssd_mobilenet_v1" feature extractor.

--
184919284  by Zhichao Lu:

    Added unit tests for TPU, with optional training / eval.

--
184915910  by Zhichao Lu:

    Update third_party g3 doc with Mask RCNN detection models.

--
184914085  by Zhichao Lu:

    Slight change to WeightSharedConvolutionalBoxPredictor implementation to make things match more closely with RetinaNet.  Specifically we now construct the box encoding and class predictor towers separately rather than having them share weights until penultimate layer.

--
184913786  by Zhichao Lu:

    Plumbs SSD Resnet V1 with FPN models into model builder.

--
184910030  by Zhichao Lu:

    Add coco metrics to evaluator.

--
184897758  by Zhichao Lu:

    Merge changes from github.

--
184888736  by Zhichao Lu:

    Ensure groundtruth_weights are always 1-D.

--
184887256  by Zhichao Lu:

    Introduce an option to add summaries in the model so it can be turned off when necessary.

--
184865559  by Zhichao Lu:

    Updating inputs so that a dictionary of tensors is returned from input_fn. Moving unbatch/unpad to model.py.
    Also removing source_id key from features dictionary, and replacing with an integer hash.
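
    A hedged sketch of such a hash in TF 1.x (key names and the bucket count
    are illustrative):

```python
import tensorflow as tf

NUM_HASH_BUCKETS = 2 ** 31 - 1  # illustrative bucket count

def replace_source_id_with_hash(features):
  features['hash'] = tf.string_to_hash_bucket_fast(
      features['source_id'], NUM_HASH_BUCKETS)
  del features['source_id']
  return features
```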

--
184859205  by Zhichao Lu:

    This CL is trying to hide those differences by making the default settings work with the public code.

--
184769779  by Zhichao Lu:

    Pass groundtruth weights into ssd meta architecture all the way to target assigner.

    This will allow training ssd models with padded groundtruth tensors.

--
184767117  by Zhichao Lu:

    * Add `params` arg to make all input fns work with TPUEstimator
    * Add --master
    * Output eval results

--
184766244  by Zhichao Lu:

    Update create_coco_tf_record to include category indices

--
184752937  by Zhichao Lu:

    Create a third_party version of TPU compatible mobilenet_v2_focal_loss coco config.

--
184750174  by Zhichao Lu:

    A few small fixes for multiscale anchor generator and a test.

--
184746581  by Zhichao Lu:

    Update jupyter notebook to show mask if provided by model.

--
184728646  by Zhichao Lu:

    Adding a few more tests to make sure decoding with/without label maps performs as expected.

--
184624154  by Zhichao Lu:

    Add an object detection binary for TPU.

--
184622118  by Zhichao Lu:

    Batch, transform, and unbatch in the tflearn interface.

--
184595064  by Zhichao Lu:

    Add support for training grayscale models.

--
184532026  by Zhichao Lu:

    Change dataset_builder.build to perform optional batching using tf.data.Dataset API
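
    A hedged sketch of what optional batching with tf.data looks like (the
    actual build signature may differ):

```python
def maybe_batch(dataset, batch_size=None):
  # When batch_size is unset the dataset yields single examples, which
  # preserves the old behavior; otherwise tf.data does the batching.
  if batch_size is not None:
    dataset = dataset.batch(batch_size)
  return dataset
```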

--
184330239  by Zhichao Lu:

    Add augment_input_data and transform_input_data helper functions to third_party/tensorflow_models/object_detection/inputs.py

--
184328681  by Zhichao Lu:

    Use an internal rgb to gray method that can be quantized.
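
    One plausible quantization-friendly formulation, sketched below (not
    necessarily the internal method): a fixed 1x1 convolution with standard
    luminance weights, which quantization tooling handles like any other conv.

```python
import tensorflow as tf

def rgb_to_grayscale(images):
  # images: [batch, height, width, 3] float32.
  # ITU-R BT.601 luma weights expressed as a 1x1 conv kernel.
  kernel = tf.reshape(
      tf.constant([0.299, 0.587, 0.114], dtype=tf.float32), [1, 1, 3, 1])
  return tf.nn.conv2d(images, kernel, strides=[1, 1, 1, 1], padding='SAME')
```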

--
184327909  by Zhichao Lu:

    Helper function to return padding shapes to use with Dataset.padded_batch.
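
    An illustrative use with Dataset.padded_batch (keys and sizes are
    assumptions; the real helper derives them from the model and input
    configs):

```python
def pad_and_batch(dataset, batch_size):
  # Variable-length groundtruth is padded to fixed shapes so examples
  # can be stacked into a batch.
  padded_shapes = {
      'image': [640, 640, 3],
      'groundtruth_boxes': [100, 4],   # num_boxes padded up to 100
      'groundtruth_classes': [100],
  }
  return dataset.padded_batch(batch_size, padded_shapes=padded_shapes)
```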

--
184326291  by Zhichao Lu:

    Added decode_func for specialized decoding.

--
184314676  by Zhichao Lu:

    Add unstack_batch method to inputs.py.

    This will enable us to convert batched tensors to lists of tensors. This is compatible with the OD API, which consumes the groundtruth batch as a list of tensors.
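
    A hedged sketch (the real method also unpads using the per-image box
    counts, which is omitted here):

```python
import tensorflow as tf

def unstack_batch(tensor_dict):
  # Requires a statically known batch dimension; each [batch, ...] tensor
  # becomes a Python list of per-example tensors.
  return {key: tf.unstack(value) for key, value in tensor_dict.items()}
```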

--
184281269  by Zhichao Lu:

    Internal test target changes.

--
184192851  by Zhichao Lu:

    Adding `Estimator` interface for object detection.

--
184187885  by Zhichao Lu:

    Add config_util functions to help with input pipeline.

    1. function to return expected shapes from the resizer config
    2. function to extract image_resizer_config from model_config.

--
184139892  by Zhichao Lu:

    Adding support for depthwise SSD (ssd-lite) and depthwise box predictions.

--
184089891  by Zhichao Lu:

    Fix third_party faster rcnn resnet101 coco config.

--
184083378  by Zhichao Lu:

    In the case when there is no object/weights field in tf.Example proto, return a default weight of 1.0 for all boxes.
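
    A hedged sketch of the fallback (tensor names are illustrative): when the
    decoded weights tensor is empty, substitute ones.

```python
import tensorflow as tf

def default_groundtruth_weights(groundtruth_boxes, groundtruth_weights):
  num_boxes = tf.shape(groundtruth_boxes)[0]
  return tf.cond(
      tf.greater(tf.size(groundtruth_weights), 0),
      lambda: groundtruth_weights,
      lambda: tf.ones([num_boxes], dtype=tf.float32))
```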

--

PiperOrigin-RevId: 185215255
parent fbc5ba06
......@@ -19,6 +19,7 @@ import os
import tempfile
import tensorflow as tf
from google.protobuf import text_format
from tensorflow.core.protobuf import saver_pb2
from tensorflow.python import pywrap_tensorflow
from tensorflow.python.client import session
from tensorflow.python.framework import graph_util
......@@ -354,16 +355,22 @@ def _export_inference_graph(input_type,
if graph_hook_fn: graph_hook_fn()
saver_kwargs = {}
if use_moving_averages:
temp_checkpoint_file = tempfile.NamedTemporaryFile()
# This check is to be compatible with both versions of SaverDef.
if os.path.isfile(trained_checkpoint_prefix):
saver_kwargs['write_version'] = saver_pb2.SaverDef.V1
temp_checkpoint_prefix = tempfile.NamedTemporaryFile().name
else:
temp_checkpoint_prefix = tempfile.mkdtemp()
replace_variable_values_with_moving_averages(
tf.get_default_graph(), trained_checkpoint_prefix,
temp_checkpoint_file.name)
checkpoint_to_use = temp_checkpoint_file.name
temp_checkpoint_prefix)
checkpoint_to_use = temp_checkpoint_prefix
else:
checkpoint_to_use = trained_checkpoint_prefix
saver = tf.train.Saver()
saver = tf.train.Saver(**saver_kwargs)
input_saver_def = saver.as_saver_def()
_write_graph_and_checkpoint(
......
......@@ -23,7 +23,7 @@ In the table below, we list each such pre-trained model including:
* detector performance on subset of the COCO validation set or Open Images test split as measured by the dataset-specific mAP measure.
Here, higher is better, and we only report bounding box mAP rounded to the
nearest integer.
* Output types (currently only `Boxes`)
* Output types (`Boxes`, and `Masks` if applicable)
You can un-tar each tar.gz file via, e.g.,:
......@@ -55,7 +55,7 @@ Some remarks on frozen inference graphs:
a detector (and discarding the part past that point), which negatively impacts
standard mAP metrics.
* Our frozen inference graphs are generated using the
[v1.4.0](https://github.com/tensorflow/tensorflow/tree/v1.4.0)
[v1.5.0](https://github.com/tensorflow/tensorflow/tree/v1.5.0)
release version of Tensorflow and we do not guarantee that these will work
with other versions; this being said, each frozen inference graph can be
regenerated using your current version of Tensorflow by re-running the
......@@ -69,16 +69,20 @@ Some remarks on frozen inference graphs:
| ------------ | :--------------: | :--------------: | :-------------: |
| [ssd_mobilenet_v1_coco](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2017_11_17.tar.gz) | 30 | 21 | Boxes |
| [ssd_inception_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2017_11_17.tar.gz) | 42 | 24 | Boxes |
| [faster_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2017_11_08.tar.gz) | 58 | 28 | Boxes |
| [faster_rcnn_resnet50_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_coco_2017_11_08.tar.gz) | 89 | 30 | Boxes |
| [faster_rcnn_resnet50_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_lowproposals_coco_2017_11_08.tar.gz) | 64 | | Boxes |
| [rfcn_resnet101_coco](http://download.tensorflow.org/models/object_detection/rfcn_resnet101_coco_2017_11_08.tar.gz) | 92 | 30 | Boxes |
| [faster_rcnn_resnet101_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2017_11_08.tar.gz) | 106 | 32 | Boxes |
| [faster_rcnn_resnet101_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_lowproposals_coco_2017_11_08.tar.gz) | 82 | | Boxes |
| [faster_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_coco_2017_11_08.tar.gz) | 620 | 37 | Boxes |
| [faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2017_11_08.tar.gz) | 241 | | Boxes |
| [faster_rcnn_nas](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_coco_2017_11_08.tar.gz) | 1833 | 43 | Boxes |
| [faster_rcnn_nas_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_lowproposals_coco_2017_11_08.tar.gz) | 540 | | Boxes |
| [faster_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2018_01_28.tar.gz) | 58 | 28 | Boxes |
| [faster_rcnn_resnet50_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_coco_2018_01_28.tar.gz) | 89 | 30 | Boxes |
| [faster_rcnn_resnet50_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_lowproposals_coco_2018_01_28.tar.gz) | 64 | | Boxes |
| [rfcn_resnet101_coco](http://download.tensorflow.org/models/object_detection/rfcn_resnet101_coco_2018_01_28.tar.gz) | 92 | 30 | Boxes |
| [faster_rcnn_resnet101_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2018_01_28.tar.gz) | 106 | 32 | Boxes |
| [faster_rcnn_resnet101_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_lowproposals_coco_2018_01_28.tar.gz) | 82 | | Boxes |
| [faster_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz) | 620 | 37 | Boxes |
| [faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2018_01_28.tar.gz) | 241 | | Boxes |
| [faster_rcnn_nas](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_coco_2018_01_28.tar.gz) | 1833 | 43 | Boxes |
| [faster_rcnn_nas_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_lowproposals_coco_2018_01_28.tar.gz) | 540 | | Boxes |
| [mask_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz) | 771 | 36 | Masks |
| [mask_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_inception_v2_coco_2018_01_28.tar.gz) | 79 | 25 | Masks |
| [mask_rcnn_resnet101_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet101_atrous_coco_2018_01_28.tar.gz) | 470 | 33 | Masks |
| [mask_rcnn_resnet50_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet50_atrous_coco_2018_01_28.tar.gz) | 343 | 29 | Masks |
......@@ -86,14 +90,14 @@ Some remarks on frozen inference graphs:
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
[faster_rcnn_resnet101_kitti](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_kitti_2017_11_08.tar.gz) | 79 | 87 | Boxes
[faster_rcnn_resnet101_kitti](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_kitti_2018_01_28.tar.gz) | 79 | 87 | Boxes
## Open Images-trained models {#open-images-models}
Model name | Speed (ms) | Open Images mAP@0.5[^2] | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
[faster_rcnn_inception_resnet_v2_atrous_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2017_11_08.tar.gz) | 727 | 37 | Boxes
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2017_11_08.tar.gz) | 347 | | Boxes
[faster_rcnn_inception_resnet_v2_atrous_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28.tar.gz) | 727 | 37 | Boxes
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes
[^1]: See [MSCOCO evaluation protocol](http://cocodataset.org/#detections-eval).
......
## Run an Instance Segmentation Model
For some applications it isn't sufficient to localize an object with a
simple bounding box. For instance, you might want to segment an object region
once it is detected. This class of problems is called **instance segmentation**.
<p align="center">
<img src="img/kites_with_segment_overlay.png" width=676 height=450>
</p>
### Materializing data for instance segmentation {#materializing-instance-seg}
Instance segmentation is an extension of object detection, where a binary mask
(i.e. object vs. background) is associated with every bounding box. This allows
for more fine-grained information about the extent of the object within the box.
To train an instance segmentation model, a groundtruth mask must be supplied for
every groundtruth bounding box. In addition to the proto fields listed in the
section titled [Using your own dataset](using_your_own_dataset.md), one must
also supply `image/object/mask`, which can either be a repeated list of
single-channel encoded PNG strings, or a single dense 3D binary tensor where
masks corresponding to each object are stacked along the first dimension. Each
is described in more detail below.
#### PNG Instance Segmentation Masks
Instance segmentation masks can be supplied as serialized PNG images.
```shell
image/object/mask = ["\x89PNG\r\n\x1A\n\x00\x00\x00\rIHDR\...", ...]
```
These masks are whole-image masks, one for each object instance. The spatial
dimensions of each mask must agree with the image. Each mask has only a single
channel, and the pixel values are either 0 (background) or 1 (object mask).
**PNG masks are the preferred parameterization since they offer considerable
space savings compared to dense numerical masks.**
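
As a hedged illustration (PIL-based; not part of the Object Detection API), one way to produce such single-channel PNG strings from a binary mask array:

```python
import io

import numpy as np
from PIL import Image

def encode_mask_pngs(masks):
  """Serializes a [num_boxes, H, W] binary mask array to PNG strings."""
  encoded = []
  for mask in masks.astype(np.uint8):
    buf = io.BytesIO()
    Image.fromarray(mask, mode='L').save(buf, format='PNG')
    encoded.append(buf.getvalue())
  return encoded
```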
#### Dense Numerical Instance Segmentation Masks
Masks can also be specified via a dense numerical tensor.
```shell
image/object/mask = [0.0, 0.0, 1.0, 1.0, 0.0, ...]
```
For an image with dimensions `H` x `W` and `num_boxes` groundtruth boxes, the
mask corresponds to a [`num_boxes`, `H`, `W`] float32 tensor, flattened into a
single vector of shape `num_boxes` * `H` * `W`. In TensorFlow, examples are read
in row-major format, so the elements are organized as:
```shell
... mask 0 row 0 ... mask 0 row 1 ... // ... mask 0 row H-1 ... mask 1 row 0 ...
```
where each row has W contiguous binary values.
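
As a hedged toy example of that layout (sizes are arbitrary):

```python
import numpy as np

num_boxes, height, width = 2, 4, 6
masks = np.zeros((num_boxes, height, width), dtype=np.float32)
masks[0, 1:3, 2:5] = 1.0  # mask 0 marks a small region as object

# Row-major flattening, matching the order the decoder reads values back.
flat = masks.reshape(-1)
assert flat.shape == (num_boxes * height * width,)
```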
To see example tf-records with mask labels, see the examples under the
[Preparing Inputs](preparing_inputs.md) section.
### Pre-existing config files
We provide four instance segmentation config files that you can use to train
your own models:
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_inception_resnet_v2_atrous_coco.config" target=_blank>mask_rcnn_inception_resnet_v2_atrous_coco</a>
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_resnet101_atrous_coco.config" target=_blank>mask_rcnn_resnet101_atrous_coco</a>
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_resnet50_atrous_coco.config" target=_blank>mask_rcnn_resnet50_atrous_coco</a>
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_inception_v2_coco.config" target=_blank>mask_rcnn_inception_v2_coco</a>
For more details see the [detection model zoo](detection_model_zoo.md).
### Updating a Faster R-CNN config file
Currently, the only supported instance segmentation model is [Mask
R-CNN](https://arxiv.org/abs/1703.06870), which requires Faster R-CNN as the
backbone object detector.
Once you have a baseline Faster R-CNN pipeline configuration, you can make the
following modifications in order to convert it into a Mask R-CNN model.
1. Within `train_input_reader` and `eval_input_reader`, set
   `load_instance_masks` to `True`. If using PNG masks, set `mask_type` to
   `PNG_MASKS`; otherwise leave it as the default `NUMERICAL_MASKS`.
1. Within the `faster_rcnn` config, use a `MaskRCNNBoxPredictor` as the
`second_stage_box_predictor`.
1. Within the `MaskRCNNBoxPredictor` message, set `predict_instance_masks` to
`True`. You must also define `conv_hyperparams`.
1. Within the `faster_rcnn` message, set `number_of_stages` to `3`.
1. Add instance segmentation metrics to the set of metrics:
`'coco_mask_metrics'`.
1. Update the `input_path`s to point at your data.
Please refer to the section on [Running the pets dataset](running_pets.md) for
additional details.
> Note: The mask prediction branch consists of a sequence of convolution layers.
> You can set the number of convolution layers and their depth as follows:
>
> 1. Within the `MaskRCNNBoxPredictor` message, set the
> `mask_prediction_conv_depth` to your value of interest. The default value
> is 256. If you set it to `0` (recommended), the depth is computed
> automatically based on the number of classes in the dataset.
> 1. Within the `MaskRCNNBoxPredictor` message, set the
> `mask_prediction_num_conv_layers` to your value of interest. The default
> value is 2.
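
Putting the steps above together, a hedged fragment of a converted pipeline config might look as follows (values are illustrative; consult the sample configs listed above for complete files):

```
model {
  faster_rcnn {
    number_of_stages: 3
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        predict_instance_masks: true
        mask_prediction_conv_depth: 0  # 0: derive depth from num classes
        conv_hyperparams {
          # regularizer and initializer settings go here
        }
      }
    }
  }
}
train_input_reader {
  load_instance_masks: true
  mask_type: PNG_MASKS
  input_path: "path/to/train.record"  # update to point at your data
}
eval_config {
  metrics_set: "coco_mask_metrics"
}
```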
......@@ -319,6 +319,9 @@ instance segmentation pipeline. Everything mentioned above about object
detection holds true for instance segmentation. Training and other details
aside, instance segmentation consists of an object detection model with an
additional head that predicts the object mask inside each predicted box.
Please refer to the section on [Running an Instance Segmentation
Model](instance_segmentation.md) for instructions on how to configure a model
that predicts masks in addition to object bounding boxes.
## What's Next
......
......@@ -103,7 +103,7 @@ FLAGS = flags.FLAGS
def create_tf_example(example):
# TODO(user): Populate the following variables from your example.
# TODO: Populate the following variables from your example.
height = None # Image height
width = None # Image width
filename = None # Filename of the image. Empty if image is not from file
......@@ -139,7 +139,7 @@ def create_tf_example(example):
def main(_):
writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
# TODO(user): Write code to read in your dataset to examples variable
# TODO: Write code to read in your dataset to examples variable
for example in examples:
tf_example = create_tf_example(example)
......@@ -155,3 +155,7 @@ if __name__ == '__main__':
Note: You may notice additional fields in some other datasets. They are
currently unused by the API and are optional.
Note: Please refer to the section on [Running an Instance Segmentation
Model](instance_segmentation.md) for instructions on how to configure a model
that predicts masks in addition to object bounding boxes.
......@@ -55,7 +55,8 @@ class ArgMaxMatcher(matcher.Matcher):
matched_threshold,
unmatched_threshold=None,
negatives_lower_than_unmatched=True,
force_match_for_each_row=False):
force_match_for_each_row=False,
use_matmul_gather=False):
"""Construct ArgMaxMatcher.
Args:
......@@ -74,11 +75,15 @@ class ArgMaxMatcher(matcher.Matcher):
at least one column (which is not guaranteed otherwise if the
matched_threshold is high). Defaults to False. See
argmax_matcher_test.testMatcherForceMatch() for an example.
use_matmul_gather: Force constructed match objects to use matrix
multiplication based gather instead of standard tf.gather.
(Default: False).
Raises:
ValueError: if unmatched_threshold is set but matched_threshold is not set
or if unmatched_threshold > matched_threshold.
"""
super(ArgMaxMatcher, self).__init__(use_matmul_gather=use_matmul_gather)
if (matched_threshold is None) and (unmatched_threshold is not None):
raise ValueError('Need to also define matched_threshold when '
'unmatched_threshold is defined')
......
......@@ -24,6 +24,17 @@ from object_detection.core import matcher
class GreedyBipartiteMatcher(matcher.Matcher):
"""Wraps a Tensorflow greedy bipartite matcher."""
def __init__(self, use_matmul_gather=False):
"""Constructs a Matcher.
Args:
use_matmul_gather: Force constructed match objects to use matrix
multiplication based gather instead of standard tf.gather.
(Default: False).
"""
super(GreedyBipartiteMatcher, self).__init__(
use_matmul_gather=use_matmul_gather)
def _match(self, similarity_matrix, num_valid_rows=-1):
"""Bipartite matches a collection rows and columns. A greedy bi-partite.
......
......@@ -251,7 +251,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
second_stage_classification_loss,
second_stage_mask_prediction_loss_weight=1.0,
hard_example_miner=None,
parallel_iterations=16):
parallel_iterations=16,
add_summaries=True):
"""FasterRCNNMetaArch Constructor.
Args:
......@@ -355,12 +356,17 @@ class FasterRCNNMetaArch(model.DetectionModel):
hard_example_miner: A losses.HardExampleMiner object (can be None).
parallel_iterations: (Optional) The number of iterations allowed to run
in parallel for calls to tf.map_fn.
add_summaries: boolean (default: True) controlling whether summary ops
should be added to tensorflow graph.
Raises:
ValueError: If `second_stage_batch_size` > `first_stage_max_proposals` at
training time.
ValueError: If first_stage_anchor_generator is not of type
grid_anchor_generator.GridAnchorGenerator.
"""
# TODO: add_summaries is currently unused. Respect that directive
# in the future.
super(FasterRCNNMetaArch, self).__init__(num_classes=num_classes)
if is_training and second_stage_batch_size > first_stage_max_proposals:
......
......@@ -75,7 +75,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
second_stage_classification_loss_weight,
second_stage_classification_loss,
hard_example_miner,
parallel_iterations=16):
parallel_iterations=16,
add_summaries=True):
"""RFCNMetaArch Constructor.
Args:
......@@ -155,11 +156,16 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
hard_example_miner: A losses.HardExampleMiner object (can be None).
parallel_iterations: (Optional) The number of iterations allowed to run
in parallel for calls to tf.map_fn.
add_summaries: boolean (default: True) controlling whether summary ops
should be added to tensorflow graph.
Raises:
ValueError: If `second_stage_batch_size` > `first_stage_max_proposals`
ValueError: If first_stage_anchor_generator is not of type
grid_anchor_generator.GridAnchorGenerator.
"""
# TODO: add_summaries is currently unused. Respect that directive
# in the future.
super(RFCNMetaArch, self).__init__(
is_training,
num_classes,
......
......@@ -44,7 +44,8 @@ class SSDFeatureExtractor(object):
conv_hyperparams,
batch_norm_trainable=True,
reuse_weights=None,
use_explicit_padding=False):
use_explicit_padding=False,
use_depthwise=False):
"""Constructor.
Args:
......@@ -61,6 +62,7 @@ class SSDFeatureExtractor(object):
reuse_weights: whether to reuse variables. Default is None.
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
"""
self._is_training = is_training
self._depth_multiplier = depth_multiplier
......@@ -70,6 +72,7 @@ class SSDFeatureExtractor(object):
self._batch_norm_trainable = batch_norm_trainable
self._reuse_weights = reuse_weights
self._use_explicit_padding = use_explicit_padding
self._use_depthwise = use_depthwise
@abstractmethod
def preprocess(self, resized_inputs):
......@@ -130,7 +133,7 @@ class SSDMetaArch(model.DetectionModel):
add_summaries=True):
"""SSDMetaArch Constructor.
TODO: group NMS parameters + score converter into
TODO(rathodv,jonathanhuang): group NMS parameters + score converter into
a class and loss parameters into a class and write config protos for
postprocessing and losses.
......@@ -330,7 +333,8 @@ class SSDMetaArch(model.DetectionModel):
feature_maps = self._feature_extractor.extract_features(
preprocessed_inputs)
feature_map_spatial_dims = self._get_feature_map_spatial_dims(feature_maps)
image_shape = tf.shape(preprocessed_inputs)
image_shape = shape_utils.combined_static_and_dynamic_shape(
preprocessed_inputs)
self._anchors = self._anchor_generator.generate(
feature_map_spatial_dims,
im_height=image_shape[1],
......@@ -472,11 +476,14 @@ class SSDMetaArch(model.DetectionModel):
keypoints = None
if self.groundtruth_has_field(fields.BoxListFields.keypoints):
keypoints = self.groundtruth_lists(fields.BoxListFields.keypoints)
weights = None
if self.groundtruth_has_field(fields.BoxListFields.weights):
weights = self.groundtruth_lists(fields.BoxListFields.weights)
(batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, match_list) = self._assign_targets(
self.groundtruth_lists(fields.BoxListFields.boxes),
self.groundtruth_lists(fields.BoxListFields.classes),
keypoints)
keypoints, weights)
if self._add_summaries:
self._summarize_input(
self.groundtruth_lists(fields.BoxListFields.boxes), match_list)
......@@ -539,7 +546,8 @@ class SSDMetaArch(model.DetectionModel):
'NegativeAnchorLossCDF')
def _assign_targets(self, groundtruth_boxes_list, groundtruth_classes_list,
groundtruth_keypoints_list=None):
groundtruth_keypoints_list=None,
groundtruth_weights_list=None):
"""Assign groundtruth targets.
Adds a background class to each one-hot encoding of groundtruth classes
......@@ -556,6 +564,8 @@ class SSDMetaArch(model.DetectionModel):
index assumed to map to the first non-background class.
groundtruth_keypoints_list: (optional) a list of 3-D tensors of shape
[num_boxes, num_keypoints, 2]
groundtruth_weights_list: A list of 1-D tf.float32 tensors of shape
[num_boxes] containing weights for groundtruth boxes.
Returns:
batch_cls_targets: a tensor with shape [batch_size, num_anchors,
......@@ -582,7 +592,7 @@ class SSDMetaArch(model.DetectionModel):
boxlist.add_field(fields.BoxListFields.keypoints, keypoints)
return target_assigner.batch_assign_targets(
self._target_assigner, self.anchors, groundtruth_boxlists,
groundtruth_classes_with_background_list)
groundtruth_classes_with_background_list, groundtruth_weights_list)
def _summarize_input(self, groundtruth_boxes_list, match_list):
"""Creates tensorflow summaries for the input boxes and anchors.
......
......@@ -24,13 +24,17 @@ from object_detection.utils import object_detection_evaluation
class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
"""Class to evaluate COCO detection metrics."""
def __init__(self, categories, all_metrics_per_category=False):
def __init__(self,
categories,
include_metrics_per_category=False,
all_metrics_per_category=False):
"""Constructor.
Args:
categories: A list of dicts, each of which has the following keys -
'id': (required) an integer id uniquely identifying this category.
'name': (required) string representing category name e.g., 'cat', 'dog'.
include_metrics_per_category: If True, include metrics for each category.
all_metrics_per_category: Whether to include all the summary metrics for
each category in per_category_ap. Be careful with setting it to true if
you have more than handful of categories, because it will pollute
......@@ -45,6 +49,7 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
self._category_id_set = set([cat['id'] for cat in self._categories])
self._annotation_id = 1
self._metrics = None
self._include_metrics_per_category = include_metrics_per_category
self._all_metrics_per_category = all_metrics_per_category
def clear(self):
......@@ -166,7 +171,8 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
'DetectionBoxes_Recall/AR@100 (large)': average recall for large objects
with 100 detections.
2. per_category_ap: category specific results with keys of the form:
2. per_category_ap: if include_metrics_per_category is True, category
specific results with keys of the form:
'Precision mAP ByCategory/category' (without the supercategory part if
no supercategories exist). For backward compatibility
'PerformanceByCategory' is included in the output regardless of
......@@ -183,6 +189,7 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
box_evaluator = coco_tools.COCOEvalWrapper(
coco_wrapped_groundtruth, coco_wrapped_detections, agnostic_mode=False)
box_metrics, box_per_category_ap = box_evaluator.ComputeMetrics(
include_metrics_per_category=self._include_metrics_per_category,
all_metrics_per_category=self._all_metrics_per_category)
box_metrics.update(box_per_category_ap)
box_metrics = {'DetectionBoxes_'+ key: value
......@@ -253,6 +260,7 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
'DetectionBoxes_Recall/AR@100 (large)',
'DetectionBoxes_Recall/AR@100 (medium)',
'DetectionBoxes_Recall/AR@100 (small)']
if self._include_metrics_per_category:
for category_dict in self._categories:
metric_names.append('DetectionBoxes_PerformanceByCategory/mAP/' +
category_dict['name'])
......@@ -289,13 +297,14 @@ def _check_mask_type_and_value(array_name, masks):
class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
"""Class to evaluate COCO detection metrics."""
def __init__(self, categories):
def __init__(self, categories, include_metrics_per_category=False):
"""Constructor.
Args:
categories: A list of dicts, each of which has the following keys -
'id': (required) an integer id uniquely identifying this category.
'name': (required) string representing category name e.g., 'cat', 'dog'.
include_metrics_per_category: If True, include metrics for each category.
"""
super(CocoMaskEvaluator, self).__init__(categories)
self._image_id_to_mask_shape_map = {}
......@@ -304,6 +313,7 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
self._detection_masks_list = []
self._category_id_set = set([cat['id'] for cat in self._categories])
self._annotation_id = 1
self._include_metrics_per_category = include_metrics_per_category
def clear(self):
"""Clears the state to prepare for a fresh evaluation."""
......@@ -438,7 +448,8 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
'Recall/AR@100 (large)': average recall for large objects with 100
detections
2. per_category_ap: category specific results with keys of the form:
2. per_category_ap: if include_metrics_per_category is True, category
specific results with keys of the form:
'Precision mAP ByCategory/category' (without the supercategory part if
no supercategories exist). For backward compatibility
'PerformanceByCategory' is included in the output regardless of
......@@ -458,7 +469,8 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
mask_evaluator = coco_tools.COCOEvalWrapper(
coco_wrapped_groundtruth, coco_wrapped_detection_masks,
agnostic_mode=False, iou_type='segm')
mask_metrics, mask_per_category_ap = mask_evaluator.ComputeMetrics()
mask_metrics, mask_per_category_ap = mask_evaluator.ComputeMetrics(
include_metrics_per_category=self._include_metrics_per_category)
mask_metrics.update(mask_per_category_ap)
mask_metrics = {'DetectionMasks_'+ key: value
for key, value in mask_metrics.iteritems()}
......
......@@ -12,13 +12,12 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for image.understanding.object_detection.metrics.coco_evaluation."""
"""Tests for tensorflow_models.object_detection.metrics.coco_evaluation."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import math
import numpy as np
import tensorflow as tf
from object_detection.core import standard_fields
......@@ -87,43 +86,6 @@ class CocoDetectionEvaluationTest(tf.test.TestCase):
metrics = coco_evaluator.evaluate()
self.assertAlmostEqual(metrics['DetectionBoxes_Precision/mAP'], 1.0)
def testReturnAllMetricsPerCategory(self):
"""Tests that mAP is calculated correctly on GT and Detections."""
category_list = [{'id': 0, 'name': 'person'}]
coco_evaluator = coco_evaluation.CocoDetectionEvaluator(
category_list, all_metrics_per_category=True)
coco_evaluator.add_single_ground_truth_image_info(
image_id='image1',
groundtruth_dict={
standard_fields.InputDataFields.groundtruth_boxes:
np.array([[100., 100., 200., 200.]]),
standard_fields.InputDataFields.groundtruth_classes: np.array([1])
})
coco_evaluator.add_single_detected_image_info(
image_id='image1',
detections_dict={
standard_fields.DetectionResultFields.detection_boxes:
np.array([[100., 100., 200., 200.]]),
standard_fields.DetectionResultFields.detection_scores:
np.array([.8]),
standard_fields.DetectionResultFields.detection_classes:
np.array([1])
})
metrics = coco_evaluator.evaluate()
expected_metrics = [
'DetectionBoxes_Recall AR@10 ByCategory/person',
'DetectionBoxes_Precision mAP (medium) ByCategory/person',
'DetectionBoxes_Precision mAP ByCategory/person',
'DetectionBoxes_Precision mAP@.50IOU ByCategory/person',
'DetectionBoxes_Precision mAP (small) ByCategory/person',
'DetectionBoxes_Precision mAP (large) ByCategory/person',
'DetectionBoxes_Recall AR@1 ByCategory/person',
'DetectionBoxes_Precision mAP@.75IOU ByCategory/person',
'DetectionBoxes_Recall AR@100 ByCategory/person',
'DetectionBoxes_Recall AR@100 (medium) ByCategory/person',
'DetectionBoxes_Recall AR@100 (large) ByCategory/person']
self.assertTrue(set(expected_metrics).issubset(set(metrics)))
def testRejectionOnDuplicateGroundtruth(self):
"""Tests that groundtruth cannot be added more than once for an image."""
categories = [{'id': 1, 'name': 'cat'},
......@@ -279,12 +241,6 @@ class CocoEvaluationPyFuncTest(tf.test.TestCase):
self.assertAlmostEqual(metrics['DetectionBoxes_Recall/AR@100 (medium)'],
-1.0)
self.assertAlmostEqual(metrics['DetectionBoxes_Recall/AR@100 (small)'], 1.0)
self.assertAlmostEqual(metrics[
'DetectionBoxes_PerformanceByCategory/mAP/dog'], 1.0)
self.assertAlmostEqual(metrics[
'DetectionBoxes_PerformanceByCategory/mAP/cat'], 1.0)
self.assertTrue(math.isnan(metrics[
'DetectionBoxes_PerformanceByCategory/mAP/person']))
self.assertFalse(coco_evaluator._groundtruth_list)
self.assertFalse(coco_evaluator._detection_boxes_list)
self.assertFalse(coco_evaluator._image_ids)
......