Commit 1efe98bb authored by Zhichao Lu, committed by lzc5123016

Merged commit includes the following changes:

185215255  by Zhichao Lu:

    Stop populating image/object/class/text field when generating COCO tf record.

--
185213306  by Zhichao Lu:

    Use the params batch size and not the one from train_config in input_fn

--
185209081  by Zhichao Lu:

    Handle the case when there are no ground-truth masks for an image.

--
185195531  by Zhichao Lu:

    Remove unstack and stack operations on features from third_party/object_detection/model.py.

--
185195017  by Zhichao Lu:

    Matrix multiplication based gather op implementation.
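    For reference, a gather expressed as a one-hot matrix multiplication can be
    sketched in plain numpy (hypothetical helper name; the real op works on TF
    tensors inside the object detection codebase):

    ```python
    import numpy as np

    def matmul_gather(params, indices):
        # Build a one-hot selector matrix and multiply: row i of the result is
        # params[indices[i]]. Matmul-based gathers can be friendlier to
        # accelerators such as TPUs than dynamic gather ops.
        one_hot = np.eye(params.shape[0])[indices]  # [num_indices, num_rows]
        return one_hot.dot(params)

    params = np.array([[1., 2.], [3., 4.], [5., 6.]])
    result = matmul_gather(params, np.array([2, 0]))  # rows 2 and 0 of params
    ```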

--
185187744  by Zhichao Lu:

    Fix eval_util minor issue.

--
185098733  by Zhichao Lu:

    Internal change

--
185076656  by Zhichao Lu:

    Increase the number of boxes for coco17.

--
185074199  by Zhichao Lu:

    Add config for SSD Resnet50 v1 with FPN.

--
185060199  by Zhichao Lu:

    Fix a bug in clear_detections.
    This method was setting detection_keys to an empty dictionary instead of an empty set. I've refactored so that this method and the constructor use the same code path.
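    The shape of that refactor can be sketched as follows (hypothetical minimal class; the real evaluator carries more state):

    ```python
    class DetectionEvaluator(object):
        def __init__(self):
            # The constructor and clear_detections() share one code path, so the
            # field cannot drift between an empty set and an empty dict again.
            self.clear_detections()

        def clear_detections(self):
            # The original bug: `{}` creates an empty dict, not an empty set.
            self.detection_keys = set()
    ```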

--
185031359  by Zhichao Lu:

    Eval TPU trained models continuously.

--
185016591  by Zhichao Lu:

    Use TPUEstimatorSpec for TPU

--
185013651  by Zhichao Lu:

    Add PreprocessorCache to record and duplicate augmentations.

--
184921763  by Zhichao Lu:

    Minor fixes for object detection.

--
184920610  by Zhichao Lu:

    Adds a model builder test for "embedded_ssd_mobilenet_v1" feature extractor.

--
184919284  by Zhichao Lu:

    Added unit tests for TPU, with optional training / eval.

--
184915910  by Zhichao Lu:

    Update third_party g3 doc with Mask RCNN detection models.

--
184914085  by Zhichao Lu:

    Slight change to WeightSharedConvolutionalBoxPredictor implementation to make things match more closely with RetinaNet.  Specifically we now construct the box encoding and class predictor towers separately rather than having them share weights until penultimate layer.

--
184913786  by Zhichao Lu:

    Plumbs SSD Resnet V1 with FPN models into model builder.

--
184910030  by Zhichao Lu:

    Add coco metrics to evaluator.

--
184897758  by Zhichao Lu:

    Merge changes from github.

--
184888736  by Zhichao Lu:

    Ensure groundtruth_weights are always 1-D.

--
184887256  by Zhichao Lu:

    Introduce an option to add summaries in the model so it can be turned off when necessary.

--
184865559  by Zhichao Lu:

    Updating inputs so that a dictionary of tensors is returned from input_fn. Moving unbatch/unpad to model.py.
    Also removing source_id key from features dictionary, and replacing with an integer hash.
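    A sketch of such a string-to-integer hash in plain Python (hypothetical helper; the TF input pipeline would use a hash-bucket op on the string tensor instead):

    ```python
    import zlib

    def hash_source_id(source_id, num_buckets=2 ** 31 - 1):
        # Deterministically map the string source_id to an integer so the
        # features dictionary contains only numeric values.
        return zlib.crc32(source_id.encode('utf-8')) % num_buckets
    ```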

--
184859205  by Zhichao Lu:

    This CL tries to hide those differences by making the default settings work with the public code.

--
184769779  by Zhichao Lu:

    Pass groundtruth weights into ssd meta architecture all the way to target assigner.

    This will allow training ssd models with padded groundtruth tensors.

--
184767117  by Zhichao Lu:

    * Add `params` arg to make all input fns work with TPUEstimator
    * Add --master
    * Output eval results

--
184766244  by Zhichao Lu:

    Update create_coco_tf_record to include category indices

--
184752937  by Zhichao Lu:

    Create a third_party version of TPU compatible mobilenet_v2_focal_loss coco config.

--
184750174  by Zhichao Lu:

    A few small fixes for multiscale anchor generator and a test.

--
184746581  by Zhichao Lu:

    Update jupyter notebook to show mask if provided by model.

--
184728646  by Zhichao Lu:

    Adding a few more tests to make sure decoding with/without label maps performs as expected.

--
184624154  by Zhichao Lu:

    Add an object detection binary for TPU.

--
184622118  by Zhichao Lu:

    Batch, transform, and unbatch in the tflearn interface.

--
184595064  by Zhichao Lu:

    Add support for training grayscale models.

--
184532026  by Zhichao Lu:

    Change dataset_builder.build to perform optional batching using tf.data.Dataset API
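    The batching behavior can be illustrated with a pure-Python sketch (the real change uses `tf.data.Dataset.batch` inside `dataset_builder.build`; this generator only mirrors the drop-remainder semantics):

    ```python
    def batched(examples, batch_size):
        # Group an iterable of examples into fixed-size batches, dropping any
        # final partial batch, similar to Dataset.batch with a static shape.
        batch = []
        for example in examples:
            batch.append(example)
            if len(batch) == batch_size:
                yield batch
                batch = []

    batches = list(batched(range(5), 2))  # partial batch [4] is dropped
    ```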

--
184330239  by Zhichao Lu:

    Add augment_input_data and transform_input_data helper functions to third_party/tensorflow_models/object_detection/inputs.py

--
184328681  by Zhichao Lu:

    Use an internal rgb to gray method that can be quantized.

--
184327909  by Zhichao Lu:

    Helper function to return padding shapes to use with Dataset.padded_batch.
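    The idea, with hypothetical feature keys (the real keys come from the API's standard fields module), is to hand `Dataset.padded_batch` static shapes so variable-length per-image tensors pad to a fixed size:

    ```python
    def padded_shapes(height, width, max_num_boxes):
        # Static shapes for Dataset.padded_batch: boxes/classes/weights vary
        # per image, so they are padded up to max_num_boxes.
        return {
            'image': [height, width, 3],
            'groundtruth_boxes': [max_num_boxes, 4],
            'groundtruth_classes': [max_num_boxes],
            'groundtruth_weights': [max_num_boxes],
        }
    ```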

--
184326291  by Zhichao Lu:

    Added decode_func for specialized decoding.

--
184314676  by Zhichao Lu:

    Add unstack_batch method to inputs.py.

    This will enable us to convert batched tensors to lists of tensors. This is compatible with OD API that consumes groundtruth batch as a list of tensors.
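    The unstacking step can be sketched in numpy (hypothetical helper name; the real `unstack_batch` operates on TF tensors and also unpads):

    ```python
    import numpy as np

    def unstack_batch(tensor_dict):
        # Split each [batch, ...] array into a list of per-example arrays,
        # matching the OD API convention of a list of groundtruth tensors.
        batch_size = next(iter(tensor_dict.values())).shape[0]
        return {key: [value[i] for i in range(batch_size)]
                for key, value in tensor_dict.items()}

    batch = {'groundtruth_boxes': np.zeros((2, 3, 4))}
    unstacked = unstack_batch(batch)  # a list of two [3, 4] arrays
    ```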

--
184281269  by Zhichao Lu:

    Internal test target changes.

--
184192851  by Zhichao Lu:

    Adding `Estimator` interface for object detection.

--
184187885  by Zhichao Lu:

    Add config_util functions to help with input pipeline.

    1. function to return expected shapes from the resizer config
    2. function to extract image_resizer_config from model_config.

--
184139892  by Zhichao Lu:

    Adding support for depthwise SSD (ssd-lite) and depthwise box predictions.

--
184089891  by Zhichao Lu:

    Fix third_party faster rcnn resnet101 coco config.

--
184083378  by Zhichao Lu:

    In the case when there is no object/weights field in tf.Example proto, return a default weight of 1.0 for all boxes.
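    A minimal sketch of that fallback in numpy (hypothetical helper; the real decoder does this while parsing the tf.Example):

    ```python
    import numpy as np

    def default_box_weights(boxes, weights=None):
        # If the tf.Example had no object/weights field, every box gets
        # weight 1.0 so downstream losses treat all boxes equally.
        if weights is None:
            weights = np.ones(boxes.shape[0], dtype=np.float32)
        return weights
    ```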

--

PiperOrigin-RevId: 185215255
parent fbc5ba06
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Common utils for tests for object detection tflearn model."""
from __future__ import absolute_import
import os
import tempfile
import tensorflow as tf
from object_detection import model
from object_detection import model_hparams
FLAGS = tf.flags.FLAGS
FASTER_RCNN_MODEL_NAME = 'faster_rcnn_resnet50_pets'
SSD_INCEPTION_MODEL_NAME = 'ssd_inception_v2_pets'
PATH_BASE = 'google3/third_party/tensorflow_models/object_detection/'
def GetPipelineConfigPath(model_name):
  """Returns path to the local pipeline config file."""
  return os.path.join(FLAGS.test_srcdir, PATH_BASE, 'samples', 'configs',
                      model_name + '.config')


def InitializeFlags(model_name_for_test):
  """Sets model_dir and pipeline_config_path flags for a test model."""
  FLAGS.model_dir = tempfile.mkdtemp()
  FLAGS.pipeline_config_path = GetPipelineConfigPath(model_name_for_test)


def BuildExperiment():
  """Builds an Experiment object for testing purposes."""
  run_config = tf.contrib.learn.RunConfig()
  hparams = model_hparams.create_hparams(
      hparams_overrides='load_pretrained=false')
  # pylint: disable=protected-access
  experiment_fn = model._build_experiment_fn(10, 10)
  # pylint: enable=protected-access
  return experiment_fn(run_config, hparams)
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Creates and runs `Estimator` for object detection model on TPUs.

This uses the TPUEstimator API to define and run a model in TRAIN/EVAL modes.
"""
# pylint: enable=line-too-long
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import functools
import os
import tensorflow as tf
from tensorflow.contrib.tpu.python.tpu import tpu_config
from tensorflow.contrib.tpu.python.tpu import tpu_estimator
from tensorflow.contrib.training.python.training import evaluation
from object_detection import inputs
from object_detection import model
from object_detection import model_hparams
from object_detection.builders import model_builder
from object_detection.utils import config_util
tf.flags.DEFINE_bool('use_tpu', True, 'Use TPUs rather than plain CPUs')
# Cloud TPU Cluster Resolvers
tf.flags.DEFINE_string(
    'gcp_project',
    default=None,
    help='Project name for the Cloud TPU-enabled project. If not specified, we '
    'will attempt to automatically detect the GCE project from metadata.')
tf.flags.DEFINE_string(
    'tpu_zone',
    default=None,
    help='GCE zone where the Cloud TPU is located. If not specified, we will '
    'attempt to automatically detect the GCE project from metadata.')
tf.flags.DEFINE_string(
    'tpu_name',
    default=None,
    help='Name of the Cloud TPU for Cluster Resolvers. You must specify either '
    'this flag or --master.')
tf.flags.DEFINE_string(
    'master', default=None,
    help='GRPC URL of the master (e.g. grpc://ip.address.of.tpu:8470). You '
    'must specify either this flag or --tpu_name.')

tf.flags.DEFINE_integer('num_shards', 8, 'Number of shards (TPU cores).')
tf.flags.DEFINE_integer('iterations_per_loop', 100,
                        'Number of iterations per TPU training loop.')
# For mode=train_and_eval, evaluation occurs after training is finished.
# Note: independently of steps_per_checkpoint, estimator will save the most
# recent checkpoint every 10 minutes by default for train_and_eval.
tf.flags.DEFINE_string('mode', 'train_and_eval',
                       'Mode to run: train, eval, train_and_eval')
tf.flags.DEFINE_integer('train_batch_size', 32 * 8, 'Batch size for training.')

# For EVAL.
tf.flags.DEFINE_integer('min_eval_interval_secs', 180,
                        'Minimum seconds between evaluations.')
tf.flags.DEFINE_integer(
    'eval_timeout_secs', None,
    'Maximum seconds between checkpoints before evaluation terminates.')
FLAGS = tf.flags.FLAGS
def create_estimator(run_config,
                     hparams,
                     pipeline_config_path,
                     train_steps=None,
                     eval_steps=None,
                     train_batch_size=None,
                     model_fn_creator=model.create_model_fn,
                     use_tpu=False,
                     num_shards=1,
                     params=None,
                     **kwargs):
  """Creates an `Estimator` object.

  Args:
    run_config: A `RunConfig`.
    hparams: A `HParams`.
    pipeline_config_path: A path to a pipeline config file.
    train_steps: Number of training steps. If None, the number of training
      steps is set from the `TrainConfig` proto.
    eval_steps: Number of evaluation steps per evaluation cycle. If None, the
      number of evaluation steps is set from the `EvalConfig` proto.
    train_batch_size: Training batch size. If None, use the batch size from
      the `TrainConfig` proto.
    model_fn_creator: A function that creates a `model_fn` for `Estimator`.
      Follows the signature:
      * Args:
        * `detection_model_fn`: Function that returns `DetectionModel`
          instance.
        * `configs`: Dictionary of pipeline config objects.
        * `hparams`: `HParams` object.
      * Returns:
        `model_fn` for `Estimator`.
    use_tpu: Boolean, whether training and evaluation should run on TPU.
    num_shards: Number of shards (TPU cores).
    params: Parameter dictionary passed from the estimator.
    **kwargs: Additional keyword arguments for configuration override.

  Returns:
    estimator: An `Estimator` object used for training and evaluation.
    train_input_fn: Input function for the training loop.
    eval_input_fn: Input function for the evaluation run.
    train_steps: Number of training steps, either from the `train_steps` arg
      or the `TrainConfig` proto.
    eval_steps: Number of evaluation steps, either from the `eval_steps` arg
      or the `EvalConfig` proto.
  """
  configs = config_util.get_configs_from_pipeline_file(pipeline_config_path)
  configs = config_util.merge_external_params_with_configs(
      configs,
      hparams,
      train_steps=train_steps,
      eval_steps=eval_steps,
      batch_size=train_batch_size,
      **kwargs)
  model_config = configs['model']
  train_config = configs['train_config']
  train_input_config = configs['train_input_config']
  eval_config = configs['eval_config']
  eval_input_config = configs['eval_input_config']

  if params is None:
    params = {}
  if train_steps is None:
    train_steps = train_config.num_steps if train_config.num_steps else None
  if eval_steps is None:
    eval_steps = eval_config.num_examples if eval_config.num_examples else None

  detection_model_fn = functools.partial(
      model_builder.build, model_config=model_config)

  # Create the input functions for TRAIN/EVAL.
  train_input_fn = inputs.create_train_input_fn(
      train_config=train_config,
      train_input_config=train_input_config,
      model_config=model_config)
  eval_input_fn = inputs.create_eval_input_fn(
      eval_config=eval_config,
      eval_input_config=eval_input_config,
      model_config=model_config)

  estimator = tpu_estimator.TPUEstimator(
      model_fn=model_fn_creator(detection_model_fn, configs, hparams, use_tpu),
      train_batch_size=train_config.batch_size,
      # For each core, only batch size 1 is supported for eval.
      eval_batch_size=num_shards * 1 if use_tpu else 1,
      use_tpu=use_tpu,
      config=run_config,
      params=params)
  return estimator, train_input_fn, eval_input_fn, train_steps, eval_steps
def main(unused_argv):
  tf.flags.mark_flag_as_required('model_dir')
  tf.flags.mark_flag_as_required('pipeline_config_path')

  if FLAGS.master is None and FLAGS.tpu_name is None:
    raise RuntimeError('You must specify either --master or --tpu_name.')
  if FLAGS.master is not None:
    if FLAGS.tpu_name is not None:
      tf.logging.warn('Both --master and --tpu_name are set. Ignoring '
                      '--tpu_name and using --master.')
    tpu_grpc_url = FLAGS.master
  else:
    tpu_cluster_resolver = (
        tf.contrib.cluster_resolver.python.training.TPUClusterResolver(
            tpu_names=[FLAGS.tpu_name],
            zone=FLAGS.tpu_zone,
            project=FLAGS.gcp_project))
    tpu_grpc_url = tpu_cluster_resolver.get_master()

  config = tpu_config.RunConfig(
      master=tpu_grpc_url,
      evaluation_master=tpu_grpc_url,
      model_dir=FLAGS.model_dir,
      tpu_config=tpu_config.TPUConfig(
          iterations_per_loop=FLAGS.iterations_per_loop,
          num_shards=FLAGS.num_shards))
  params = {}
  estimator, train_input_fn, eval_input_fn, train_steps, eval_steps = (
      create_estimator(
          config,
          model_hparams.create_hparams(),
          FLAGS.pipeline_config_path,
          train_steps=FLAGS.num_train_steps,
          eval_steps=FLAGS.num_eval_steps,
          train_batch_size=FLAGS.train_batch_size,
          use_tpu=FLAGS.use_tpu,
          num_shards=FLAGS.num_shards,
          params=params))

  if FLAGS.mode in ['train', 'train_and_eval']:
    estimator.train(input_fn=train_input_fn, max_steps=train_steps)
  if FLAGS.mode == 'train_and_eval':
    # Eval one time.
    eval_results = estimator.evaluate(input_fn=eval_input_fn, steps=eval_steps)
    tf.logging.info('Eval results: %s' % eval_results)

  # Continuously evaluating.
  if FLAGS.mode == 'eval':

    def terminate_eval():
      tf.logging.info('Terminating eval after %d seconds of no checkpoints' %
                      FLAGS.eval_timeout_secs)
      return True

    # Run evaluation when there's a new checkpoint.
    for ckpt in evaluation.checkpoints_iterator(
        FLAGS.model_dir,
        min_interval_secs=FLAGS.min_eval_interval_secs,
        timeout=FLAGS.eval_timeout_secs,
        timeout_fn=terminate_eval):
      tf.logging.info('Starting to evaluate.')
      try:
        eval_results = estimator.evaluate(
            input_fn=eval_input_fn,
            steps=eval_steps,
            checkpoint_path=ckpt)
        tf.logging.info('Eval results: %s' % eval_results)

        # Terminate eval job when final checkpoint is reached.
        current_step = int(os.path.basename(ckpt).split('-')[1])
        if current_step >= train_steps:
          tf.logging.info(
              'Evaluation finished after training step %d' % current_step)
          break
      except tf.errors.NotFoundError:
        tf.logging.info(
            'Checkpoint %s no longer exists, skipping checkpoint' % ckpt)


if __name__ == '__main__':
  tf.app.run()
@@ -119,6 +119,7 @@ py_library(
 py_test(
     name = "ssd_resnet_v1_fpn_feature_extractor_test",
+    timeout = "long",
     srcs = ["ssd_resnet_v1_fpn_feature_extractor_test.py"],
     deps = [
         ":ssd_resnet_v1_fpn_feature_extractor",
...
@@ -52,7 +52,8 @@ class EmbeddedSSDMobileNetV1FeatureExtractor(
                conv_hyperparams,
                batch_norm_trainable=True,
                reuse_weights=None,
-               use_explicit_padding=False):
+               use_explicit_padding=False,
+               use_depthwise=False):
     """MobileNetV1 Feature Extractor for Embedded-friendly SSD Models.

     Args:
@@ -69,6 +70,7 @@ class EmbeddedSSDMobileNetV1FeatureExtractor(
       reuse_weights: Whether to reuse variables. Default is None.
       use_explicit_padding: Whether to use explicit padding when extracting
         features. Default is False.
+      use_depthwise: Whether to use depthwise convolutions. Default is False.

     Raises:
       ValueError: upon invalid `pad_to_multiple` values.
@@ -80,7 +82,7 @@ class EmbeddedSSDMobileNetV1FeatureExtractor(
     super(EmbeddedSSDMobileNetV1FeatureExtractor, self).__init__(
         is_training, depth_multiplier, min_depth, pad_to_multiple,
         conv_hyperparams, batch_norm_trainable, reuse_weights,
-        use_explicit_padding)
+        use_explicit_padding, use_depthwise)

   def extract_features(self, preprocessed_inputs):
     """Extract features from preprocessed inputs.
@@ -119,6 +121,7 @@ class EmbeddedSSDMobileNetV1FeatureExtractor(
         'layer_depth': [-1, -1, 512, 256, 256],
         'conv_kernel_size': [-1, -1, 3, 3, 2],
         'use_explicit_padding': self._use_explicit_padding,
+        'use_depthwise': self._use_depthwise,
     }

     with slim.arg_scope(self._conv_hyperparams):
...
@@ -36,7 +36,8 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
                conv_hyperparams,
                batch_norm_trainable=True,
                reuse_weights=None,
-               use_explicit_padding=False):
+               use_explicit_padding=False,
+               use_depthwise=False):
     """InceptionV2 Feature Extractor for SSD Models.

     Args:
@@ -53,11 +54,12 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
       reuse_weights: Whether to reuse variables. Default is None.
       use_explicit_padding: Whether to use explicit padding when extracting
         features. Default is False.
+      use_depthwise: Whether to use depthwise convolutions. Default is False.
     """
     super(SSDInceptionV2FeatureExtractor, self).__init__(
         is_training, depth_multiplier, min_depth, pad_to_multiple,
         conv_hyperparams, batch_norm_trainable, reuse_weights,
-        use_explicit_padding)
+        use_explicit_padding, use_depthwise)

   def preprocess(self, resized_inputs):
     """SSD preprocessing.
@@ -92,6 +94,7 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
         'from_layer': ['Mixed_4c', 'Mixed_5c', '', '', '', ''],
         'layer_depth': [-1, -1, 512, 256, 256, 128],
         'use_explicit_padding': self._use_explicit_padding,
+        'use_depthwise': self._use_depthwise,
     }

     with slim.arg_scope(self._conv_hyperparams):
...
@@ -36,7 +36,8 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
                conv_hyperparams,
                batch_norm_trainable=True,
                reuse_weights=None,
-               use_explicit_padding=False):
+               use_explicit_padding=False,
+               use_depthwise=False):
     """InceptionV3 Feature Extractor for SSD Models.

     Args:
@@ -53,11 +54,12 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
       reuse_weights: Whether to reuse variables. Default is None.
       use_explicit_padding: Whether to use explicit padding when extracting
         features. Default is False.
+      use_depthwise: Whether to use depthwise convolutions. Default is False.
     """
     super(SSDInceptionV3FeatureExtractor, self).__init__(
         is_training, depth_multiplier, min_depth, pad_to_multiple,
         conv_hyperparams, batch_norm_trainable, reuse_weights,
-        use_explicit_padding)
+        use_explicit_padding, use_depthwise)

   def preprocess(self, resized_inputs):
     """SSD preprocessing.
@@ -92,6 +94,7 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
         'from_layer': ['Mixed_5d', 'Mixed_6e', 'Mixed_7c', '', '', ''],
         'layer_depth': [-1, -1, -1, 512, 256, 128],
         'use_explicit_padding': self._use_explicit_padding,
+        'use_depthwise': self._use_depthwise,
     }

     with slim.arg_scope(self._conv_hyperparams):
...
@@ -37,7 +37,8 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
                conv_hyperparams,
                batch_norm_trainable=True,
                reuse_weights=None,
-               use_explicit_padding=False):
+               use_explicit_padding=False,
+               use_depthwise=False):
     """MobileNetV1 Feature Extractor for SSD Models.

     Args:
@@ -54,11 +55,12 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
       reuse_weights: Whether to reuse variables. Default is None.
       use_explicit_padding: Whether to use explicit padding when extracting
         features. Default is False.
+      use_depthwise: Whether to use depthwise convolutions. Default is False.
     """
     super(SSDMobileNetV1FeatureExtractor, self).__init__(
         is_training, depth_multiplier, min_depth, pad_to_multiple,
         conv_hyperparams, batch_norm_trainable, reuse_weights,
-        use_explicit_padding)
+        use_explicit_padding, use_depthwise)

   def preprocess(self, resized_inputs):
     """SSD preprocessing.
@@ -94,6 +96,7 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
                        '', ''],
         'layer_depth': [-1, -1, 512, 256, 256, 128],
         'use_explicit_padding': self._use_explicit_padding,
+        'use_depthwise': self._use_depthwise,
     }

     with slim.arg_scope(self._conv_hyperparams):
...
...@@ -25,9 +25,11 @@ class _SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor): ...@@ -25,9 +25,11 @@ class _SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
conv_hyperparams, conv_hyperparams,
resnet_base_fn, resnet_base_fn,
resnet_scope_name, resnet_scope_name,
fpn_scope_name,
batch_norm_trainable=True, batch_norm_trainable=True,
reuse_weights=None, reuse_weights=None,
use_explicit_padding=False): use_explicit_padding=False,
use_depthwise=False):
"""SSD FPN feature extractor based on Resnet v1 architecture. """SSD FPN feature extractor based on Resnet v1 architecture.
Args: Args:
...@@ -39,7 +41,9 @@ class _SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor): ...@@ -39,7 +41,9 @@ class _SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
width dimensions to. width dimensions to.
conv_hyperparams: tf slim arg_scope for conv2d and separable_conv2d ops. conv_hyperparams: tf slim arg_scope for conv2d and separable_conv2d ops.
resnet_base_fn: base resnet network to use. resnet_base_fn: base resnet network to use.
resnet_scope_name: scope name to construct resnet resnet_scope_name: scope name under which to construct resnet
fpn_scope_name: scope name under which to construct the feature pyramid
network.
batch_norm_trainable: Whether to update batch norm parameters during batch_norm_trainable: Whether to update batch norm parameters during
training or not. When training with a small batch size training or not. When training with a small batch size
(e.g. 1), it is desirable to disable batch norm update and use (e.g. 1), it is desirable to disable batch norm update and use
...@@ -47,6 +51,7 @@ class _SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor): ...@@ -47,6 +51,7 @@ class _SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
reuse_weights: Whether to reuse variables. Default is None. reuse_weights: Whether to reuse variables. Default is None.
use_explicit_padding: Whether to use explicit padding when extracting use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False. UNUSED currently. features. Default is False. UNUSED currently.
use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
Raises: Raises:
ValueError: On supplying invalid arguments for unused arguments. ValueError: On supplying invalid arguments for unused arguments.
...@@ -62,6 +67,7 @@ class _SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor): ...@@ -62,6 +67,7 @@ class _SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
raise ValueError('Explicit padding is not a valid option.') raise ValueError('Explicit padding is not a valid option.')
self._resnet_base_fn = resnet_base_fn self._resnet_base_fn = resnet_base_fn
self._resnet_scope_name = resnet_scope_name self._resnet_scope_name = resnet_scope_name
self._fpn_scope_name = fpn_scope_name
def preprocess(self, resized_inputs): def preprocess(self, resized_inputs):
"""SSD preprocessing. """SSD preprocessing.
...@@ -124,6 +130,7 @@ class _SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor): ...@@ -124,6 +130,7 @@ class _SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
scope=scope) scope=scope)
image_features = self._filter_features(image_features) image_features = self._filter_features(image_features)
last_feature_map = image_features['block4'] last_feature_map = image_features['block4']
with tf.variable_scope(self._fpn_scope_name, reuse=self._reuse_weights):
with slim.arg_scope(self._conv_hyperparams): with slim.arg_scope(self._conv_hyperparams):
for i in range(5, 7): for i in range(5, 7):
last_feature_map = slim.conv2d( last_feature_map = slim.conv2d(
...@@ -154,7 +161,8 @@ class SSDResnet50V1FpnFeatureExtractor(_SSDResnetV1FpnFeatureExtractor): ...@@ -154,7 +161,8 @@ class SSDResnet50V1FpnFeatureExtractor(_SSDResnetV1FpnFeatureExtractor):
conv_hyperparams, conv_hyperparams,
batch_norm_trainable=True, batch_norm_trainable=True,
reuse_weights=None, reuse_weights=None,
use_explicit_padding=False): use_explicit_padding=False,
use_depthwise=False):
"""Resnet50 v1 FPN Feature Extractor for SSD Models. """Resnet50 v1 FPN Feature Extractor for SSD Models.
Args: Args:
...@@ -170,11 +178,12 @@ class SSDResnet50V1FpnFeatureExtractor(_SSDResnetV1FpnFeatureExtractor): ...@@ -170,11 +178,12 @@ class SSDResnet50V1FpnFeatureExtractor(_SSDResnetV1FpnFeatureExtractor):
pretrained batch norm params. pretrained batch norm params.
reuse_weights: Whether to reuse variables. Default is None. reuse_weights: Whether to reuse variables. Default is None.
use_explicit_padding: Whether to use explicit padding when extracting use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False. features. Default is False. UNUSED currently.
use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
""" """
super(SSDResnet50V1FpnFeatureExtractor, self).__init__( super(SSDResnet50V1FpnFeatureExtractor, self).__init__(
is_training, depth_multiplier, min_depth, pad_to_multiple, is_training, depth_multiplier, min_depth, pad_to_multiple,
conv_hyperparams, resnet_v1.resnet_v1_50, 'resnet_v1_50_fpn', conv_hyperparams, resnet_v1.resnet_v1_50, 'resnet_v1_50', 'fpn',
batch_norm_trainable, reuse_weights, use_explicit_padding) batch_norm_trainable, reuse_weights, use_explicit_padding)
...@@ -188,7 +197,8 @@ class SSDResnet101V1FpnFeatureExtractor(_SSDResnetV1FpnFeatureExtractor): ...@@ -188,7 +197,8 @@ class SSDResnet101V1FpnFeatureExtractor(_SSDResnetV1FpnFeatureExtractor):
conv_hyperparams, conv_hyperparams,
batch_norm_trainable=True, batch_norm_trainable=True,
reuse_weights=None, reuse_weights=None,
use_explicit_padding=False): use_explicit_padding=False,
use_depthwise=False):
"""Resnet101 v1 FPN Feature Extractor for SSD Models. """Resnet101 v1 FPN Feature Extractor for SSD Models.
Args: Args:
...@@ -204,11 +214,12 @@ class SSDResnet101V1FpnFeatureExtractor(_SSDResnetV1FpnFeatureExtractor):
        pretrained batch norm params.
      reuse_weights: Whether to reuse variables. Default is None.
      use_explicit_padding: Whether to use explicit padding when extracting
        features. Default is False. UNUSED currently.
      use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
    """
    super(SSDResnet101V1FpnFeatureExtractor, self).__init__(
        is_training, depth_multiplier, min_depth, pad_to_multiple,
        conv_hyperparams, resnet_v1.resnet_v1_101, 'resnet_v1_101', 'fpn',
        batch_norm_trainable, reuse_weights, use_explicit_padding)
...@@ -222,7 +233,8 @@ class SSDResnet152V1FpnFeatureExtractor(_SSDResnetV1FpnFeatureExtractor):
               conv_hyperparams,
               batch_norm_trainable=True,
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False):
    """Resnet152 v1 FPN Feature Extractor for SSD Models.

    Args:
...@@ -238,9 +250,10 @@ class SSDResnet152V1FpnFeatureExtractor(_SSDResnetV1FpnFeatureExtractor):
        pretrained batch norm params.
      reuse_weights: Whether to reuse variables. Default is None.
      use_explicit_padding: Whether to use explicit padding when extracting
        features. Default is False. UNUSED currently.
      use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
    """
    super(SSDResnet152V1FpnFeatureExtractor, self).__init__(
        is_training, depth_multiplier, min_depth, pad_to_multiple,
        conv_hyperparams, resnet_v1.resnet_v1_152, 'resnet_v1_152', 'fpn',
        batch_norm_trainable, reuse_weights, use_explicit_padding)
...@@ -7,7 +7,7 @@ from object_detection.models import ssd_resnet_v1_fpn_feature_extractor_testbase
class SSDResnet50V1FeatureExtractorTest(
    ssd_resnet_v1_fpn_feature_extractor_testbase.
    SSDResnetFPNFeatureExtractorTestBase):
  """SSDResnet50v1Fpn feature extractor test."""

  def _create_feature_extractor(self, depth_multiplier, pad_to_multiple):
...@@ -19,13 +19,13 @@ class SSDResnet50V1FeatureExtractorTest(
        is_training, depth_multiplier, min_depth, pad_to_multiple,
        conv_hyperparams, batch_norm_trainable)

  def _resnet_scope_name(self):
    return 'resnet_v1_50'
class SSDResnet101V1FeatureExtractorTest(
    ssd_resnet_v1_fpn_feature_extractor_testbase.
    SSDResnetFPNFeatureExtractorTestBase):
  """SSDResnet101v1Fpn feature extractor test."""

  def _create_feature_extractor(self, depth_multiplier, pad_to_multiple):
...@@ -38,13 +38,13 @@ class SSDResnet101V1FeatureExtractorTest(
        is_training, depth_multiplier, min_depth, pad_to_multiple,
        conv_hyperparams, batch_norm_trainable))

  def _resnet_scope_name(self):
    return 'resnet_v1_101'
class SSDResnet152V1FeatureExtractorTest(
    ssd_resnet_v1_fpn_feature_extractor_testbase.
    SSDResnetFPNFeatureExtractorTestBase):
  """SSDResnet152v1Fpn feature extractor test."""

  def _create_feature_extractor(self, depth_multiplier, pad_to_multiple):
...@@ -57,8 +57,8 @@ class SSDResnet152V1FeatureExtractorTest(
        is_training, depth_multiplier, min_depth, pad_to_multiple,
        conv_hyperparams, batch_norm_trainable))

  def _resnet_scope_name(self):
    return 'resnet_v1_152'
if __name__ == '__main__':
...
"""Tests for ssd resnet v1 FPN feature extractors."""
import abc
import numpy as np
import tensorflow as tf

from object_detection.models import ssd_feature_extractor_test


class SSDResnetFPNFeatureExtractorTestBase(
    ssd_feature_extractor_test.SsdFeatureExtractorTestBase):
  """Helper test class for SSD Resnet v1 FPN feature extractors."""

  @abc.abstractmethod
  def _resnet_scope_name(self):
    pass

  @abc.abstractmethod
  def _fpn_scope_name(self):
    return 'fpn'
  def test_extract_features_returns_correct_shapes_256(self):
    image_height = 256
    image_width = 256
...@@ -73,5 +78,16 @@ class SSDResnetFPNFeatureExtractorTestBase(
  def test_variables_only_created_in_scope(self):
    depth_multiplier = 1
    pad_to_multiple = 1
    g = tf.Graph()
    with g.as_default():
      feature_extractor = self._create_feature_extractor(
          depth_multiplier, pad_to_multiple)
      preprocessed_inputs = tf.placeholder(tf.float32, (4, None, None, 3))
      feature_extractor.extract_features(preprocessed_inputs)
      variables = g.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
      for variable in variables:
        self.assertTrue(
            variable.name.startswith(self._resnet_scope_name())
            or variable.name.startswith(self._fpn_scope_name()))
...@@ -35,6 +35,7 @@
    "from io import StringIO\n",
    "from matplotlib import pyplot as plt\n",
    "from PIL import Image\n",
    "from object_detection.utils import ops as utils_ops\n",
    "\n",
    "if tf.__version__ < '1.4.0':\n",
    "  raise ImportError('Please upgrade your tensorflow installation to v1.4.* or later!')\n"
...@@ -223,6 +224,59 @@
    "IMAGE_SIZE = (12, 8)"
   ]
  },
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def run_inference_for_single_image(image, graph):\n",
" with graph.as_default():\n",
" with tf.Session() as sess:\n",
" # Get handles to input and output tensors\n",
" ops = tf.get_default_graph().get_operations()\n",
" all_tensor_names = {output.name for op in ops for output in op.outputs}\n",
" tensor_dict = {}\n",
" for key in [\n",
" 'num_detections', 'detection_boxes', 'detection_scores',\n",
" 'detection_classes', 'detection_masks'\n",
" ]:\n",
" tensor_name = key + ':0'\n",
" if tensor_name in all_tensor_names:\n",
" tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(\n",
" tensor_name)\n",
" if 'detection_masks' in tensor_dict:\n",
" # The following processing is only for single image\n",
" detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])\n",
" detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])\n",
" # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.\n",
" real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)\n",
" detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])\n",
" detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])\n",
" detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(\n",
" detection_masks, detection_boxes, image.shape[0], image.shape[1])\n",
" detection_masks_reframed = tf.cast(\n",
" tf.greater(detection_masks_reframed, 0.5), tf.uint8)\n",
" # Follow the convention by adding back the batch dimension\n",
" tensor_dict['detection_masks'] = tf.expand_dims(\n",
" detection_masks_reframed, 0)\n",
" image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')\n",
"\n",
" # Run inference\n",
" output_dict = sess.run(tensor_dict,\n",
" feed_dict={image_tensor: np.expand_dims(image, 0)})\n",
"\n",
" # all outputs are float32 numpy arrays, so convert types as appropriate\n",
" output_dict['num_detections'] = int(output_dict['num_detections'][0])\n",
" output_dict['detection_classes'] = output_dict[\n",
" 'detection_classes'][0].astype(np.uint8)\n",
" output_dict['detection_boxes'] = output_dict['detection_boxes'][0]\n",
" output_dict['detection_scores'] = output_dict['detection_scores'][0]\n",
" if 'detection_masks' in output_dict:\n",
" output_dict['detection_masks'] = output_dict['detection_masks'][0]\n",
" return output_dict"
]
},
  {
   "cell_type": "code",
   "execution_count": null,
...@@ -231,39 +285,27 @@
   },
   "outputs": [],
   "source": [
    "for image_path in TEST_IMAGE_PATHS:\n",
    "  image = Image.open(image_path)\n",
    "  # the array based representation of the image will be used later in order to prepare the\n",
    "  # result image with boxes and labels on it.\n",
    "  image_np = load_image_into_numpy_array(image)\n",
    "  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]\n",
    "  image_np_expanded = np.expand_dims(image_np, axis=0)\n",
    "  # Actual detection.\n",
    "  output_dict = run_inference_for_single_image(image_np, detection_graph)\n",
    "  # Visualization of the results of a detection.\n",
    "  vis_util.visualize_boxes_and_labels_on_image_array(\n",
    "      image_np,\n",
    "      output_dict['detection_boxes'],\n",
    "      output_dict['detection_classes'],\n",
    "      output_dict['detection_scores'],\n",
    "      category_index,\n",
    "      instance_masks=output_dict.get('detection_masks'),\n",
    "      use_normalized_coordinates=True,\n",
    "      line_thickness=8)\n",
    "  plt.figure(figsize=IMAGE_SIZE)\n",
    "  plt.imshow(image_np)"
   ]
  },
  {
...@@ -275,6 +317,9 @@
  }
 ],
 "metadata": {
"colab": {
"version": "0.3.2"
},
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
...
...@@ -22,4 +22,8 @@ message ArgMaxMatcher {
  // Whether to ensure each row is matched to at least one column.
  optional bool force_match_for_each_row = 5 [default = false];
// Force constructed match objects to use matrix multiplication based gather
// instead of standard tf.gather
optional bool use_matmul_gather = 6 [default = false];
}
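The `use_matmul_gather` option refers to the matmul-based gather added in this merge (change 185195017). The core trick can be sketched in NumPy with a hypothetical helper name; expressing gather as a matrix multiply avoids dynamic gather ops, which tends to be friendlier to accelerators such as TPUs:

```python
import numpy as np

def matmul_gather_on_zeroth_axis(params, indices):
    """Gathers rows of `params` via multiplication with a one-hot matrix.

    Illustrative sketch only; the real op is implemented in TensorFlow.
    """
    params = np.asarray(params, dtype=np.float32)
    # one_hot[i, j] == 1 exactly when indices[i] == j, so the matmul
    # selects row indices[i] of params into output row i.
    one_hot = np.eye(params.shape[0], dtype=np.float32)[indices]
    flat = params.reshape(params.shape[0], -1)
    return one_hot.dot(flat).reshape((len(indices),) + params.shape[1:])
```

For example, gathering indices `[2, 0]` from `[[1, 2], [3, 4], [5, 6]]` returns the same rows as `params[indices]` would.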
...@@ -5,4 +5,7 @@ package object_detection.protos;
// Configuration proto for bipartite matcher. See
// matchers/bipartite_matcher.py for details.
message BipartiteMatcher {
// Force constructed match objects to use matrix multiplication based gather
// instead of standard tf.gather
optional bool use_matmul_gather = 6 [default = false];
}
...@@ -52,6 +52,9 @@ message ConvolutionalBoxPredictor {
  optional bool apply_sigmoid_to_scores = 9 [default = false];
  optional float class_prediction_bias_init = 10 [default = 0.0];
// Whether to use depthwise separable convolution for box predictor layers.
optional bool use_depthwise = 11 [default = false];
}
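The `use_depthwise` option trades a dense k×k convolution for a depthwise filter followed by a 1×1 pointwise convolution. A quick parameter-count comparison (an illustrative sketch, not code from this change) shows why this shrinks the predictor:

```python
def conv_param_counts(kernel_size, channels_in, channels_out):
    # Dense conv: every output channel mixes all input channels at every tap.
    dense = kernel_size * kernel_size * channels_in * channels_out
    # Depthwise separable: one k x k spatial filter per input channel,
    # then a 1x1 pointwise conv to mix channels.
    separable = (kernel_size * kernel_size * channels_in
                 + channels_in * channels_out)
    return dense, separable

# Example: a 3x3 conv from 32 to 64 channels.
dense, separable = conv_param_counts(3, 32, 64)  # 18432 vs. 2336 parameters
```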
// Configuration proto for weight shared convolutional box predictor.
...
...@@ -34,6 +34,9 @@ message KeepAspectRatioResizer {
  // [max_dimension, max_dimension]. Note that the zeros are padded to the
  // bottom and the right of the resized image.
  optional bool pad_to_max_dimension = 4 [default = false];
// Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
optional bool convert_to_grayscale = 5 [default = false];
}

// Configuration proto for image resizer that resizes to a fixed shape.
...@@ -46,4 +49,7 @@ message FixedShapeResizer {
  // Desired method when resizing image.
  optional ResizeType resize_method = 3 [default = BILINEAR];
// Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
optional bool convert_to_grayscale = 4 [default = false];
}
...@@ -19,4 +19,5 @@ message MultiscaleAnchorGenerator {
  repeated float aspect_ratios = 4;

  // Number of intermediate scales per scale octave.
optional int32 scales_per_octave = 5 [default = 2];
}
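With `scales_per_octave` intermediate scales, the multiscale anchor generator places anchors at geometrically spaced scales within each octave. A sketch of the scale computation (illustrative, assuming the RetinaNet-style convention; not the generator's actual code):

```python
def octave_scales(scales_per_octave):
    # Scale multipliers 2**(i / scales_per_octave) for i in [0, scales_per_octave),
    # so the anchor scale doubles across one full octave. The default of 2
    # yields multipliers [1.0, sqrt(2)].
    return [2.0 ** (i / float(scales_per_octave))
            for i in range(scales_per_octave)]
```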
...@@ -32,6 +32,7 @@ message PreprocessingStep {
    SSDRandomCropPadFixedAspectRatio ssd_random_crop_pad_fixed_aspect_ratio = 24;
    RandomVerticalFlip random_vertical_flip = 25;
    RandomRotation90 random_rotation90 = 26;
RGBtoGray rgb_to_gray = 27;
  }
}
...@@ -236,6 +237,11 @@ message RandomResizeMethod {
  optional float target_width = 2;
}
// Converts the RGB image to a grayscale image. This also converts the image
// depth from 3 to 1, unlike RandomRGBtoGray which does not change the image
// depth.
message RGBtoGray {}
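The conversion RGBtoGray performs can be sketched in NumPy (the real step delegates to TensorFlow's grayscale conversion; the weights here are the usual ITU-R BT.601 luminance values, shown for illustration):

```python
import numpy as np

def rgb_to_gray(image):
    # Weighted sum over the channel axis; the output keeps a trailing
    # depth-1 axis, so an HxWx3 image becomes HxWx1 (unlike RandomRGBtoGray,
    # which keeps depth 3).
    weights = np.array([0.2989, 0.5870, 0.1140], dtype=np.float32)
    return image.dot(weights)[..., np.newaxis]
```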
// Scales boxes from normalized coordinates to pixel coordinates.
message ScaleBoxesToPixelCoordinates {
}
...
...@@ -86,4 +86,8 @@ message SsdFeatureExtractor {
  // Whether to use explicit padding when extracting SSD multiresolution
  // features. Note that this does not apply to the base feature extractor.
  optional bool use_explicit_padding = 7 [default=false];
  // Whether to use depthwise separable convolutions to extract the additional
  // feature maps added by SSD.
  optional bool use_depthwise = 8 [default=false];
}
...@@ -78,4 +78,14 @@ message TrainConfig {
  // Setting this option to false is very useful while debugging the model and
  // losses.
  optional bool add_regularization_loss = 18 [default=true];
  // Maximum number of boxes used during training.
  // Set this to at least the maximum number of boxes in the input data;
  // otherwise, it may cause "Data loss: Attempted to pad to a smaller size
  // than the input element" errors.
  optional int32 max_number_of_boxes = 20 [default=50];
// Whether to remove padding along `num_boxes` dimension of the groundtruth
// tensors.
optional bool unpad_groundtruth_tensors = 21 [default=true];
}
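The padding behavior that `max_number_of_boxes` controls can be sketched as follows (a hypothetical helper, not the pipeline's actual code): groundtruth boxes are padded with zeros up to the configured maximum, which is why a value smaller than the largest per-image box count in the data triggers the padding error quoted above.

```python
import numpy as np

def pad_groundtruth_boxes(boxes, max_number_of_boxes):
    # Pad [num_boxes, 4] groundtruth up to [max_number_of_boxes, 4] with zeros.
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    if len(boxes) > max_number_of_boxes:
        # Mirrors the failure mode the config comment warns about.
        raise ValueError('Attempted to pad to a smaller size than the input')
    padded = np.zeros((max_number_of_boxes, 4), dtype=np.float32)
    padded[:len(boxes)] = boxes
    return padded
```

Setting `unpad_groundtruth_tensors` strips this zero padding back off along the `num_boxes` dimension before the loss is computed.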
...@@ -7,4 +7,5 @@ licenses(["notice"])
exports_files([
    "faster_rcnn_resnet50_pets.config",
    "ssd_inception_v2_pets.config",
    "ssd_mobilenet_v1_focal_loss_pets.config",
])