Unverified Commit fe748d4a authored by pkulzc's avatar pkulzc Committed by GitHub

Object detection changes: (#7208)

257914648  by lzc:

    Internal changes

--
257525973  by Zhichao Lu:

    Fixes bug that silently prevents checkpoints from loading when training w/ eager + functions. Also sets up scripts to run training.

--
257296614  by Zhichao Lu:

    Adding detection_features to model outputs

--
257234565  by Zhichao Lu:

    Fix wrong order of `classes_with_max_scores` in class-agnostic NMS caused by
    sorting in partitioned-NMS.

--
257232002  by ronnyvotel:

    Supporting `filter_nonoverlapping` option in np_box_list_ops.clip_to_window().
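    The option can be illustrated with a minimal NumPy sketch (a hypothetical
    standalone version of the np_box_list_ops behavior, not the API's actual
    signature): boxes are clipped to the window, and boxes left with zero area
    are optionally dropped.

    ```python
    import numpy as np

    def clip_to_window(boxes, window, filter_nonoverlapping=True):
        """Clip [ymin, xmin, ymax, xmax] boxes to a window.

        If filter_nonoverlapping is True, boxes reduced to zero area by the
        clipping are removed from the result.
        """
        wy1, wx1, wy2, wx2 = window
        clipped = np.column_stack([
            np.clip(boxes[:, 0], wy1, wy2),
            np.clip(boxes[:, 1], wx1, wx2),
            np.clip(boxes[:, 2], wy1, wy2),
            np.clip(boxes[:, 3], wx1, wx2),
        ])
        if filter_nonoverlapping:
            areas = (clipped[:, 2] - clipped[:, 0]) * (clipped[:, 3] - clipped[:, 1])
            clipped = clipped[areas > 0]
        return clipped
    ```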

--
257198282  by Zhichao Lu:

    Adding the focal loss and l1 loss from the Objects as Points paper.
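    The heatmap loss from that paper is a penalty-reduced pixelwise focal loss
    over a Gaussian ground truth heatmap. Below is a minimal NumPy sketch of
    that formula (Eq. 1 of Zhou et al., "Objects as Points"); the function name
    and array layout are illustrative, not the API's actual implementation.

    ```python
    import numpy as np

    def center_focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-12):
        """Penalty-reduced focal loss on a Gaussian heatmap.

        pred: predicted heatmap probabilities in (0, 1).
        gt: ground truth heatmap; exactly 1.0 at object centers, a Gaussian
            bump elsewhere. Loss is normalized by the number of centers.
        """
        pos = gt == 1.0
        neg = ~pos
        # Standard focal term at positive (center) pixels.
        pos_loss = -np.log(pred[pos] + eps) * (1.0 - pred[pos]) ** alpha
        # Negative pixels near a center are down-weighted by (1 - gt)^beta.
        neg_loss = (-np.log(1.0 - pred[neg] + eps) * pred[neg] ** alpha
                    * (1.0 - gt[neg]) ** beta)
        num_pos = max(pos.sum(), 1)
        return (pos_loss.sum() + neg_loss.sum()) / num_pos
    ```

    The accompanying L1 loss in the paper is a plain absolute-error regression
    on the center offsets and box sizes, evaluated only at center locations.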

--
257089535  by Zhichao Lu:

    Create Keras based ssd + resnetv1 + fpn.

--
257087407  by Zhichao Lu:

    Make object_detection/data_decoders Python3-compatible.

--
257004582  by Zhichao Lu:

    Updates _decode_raw_data_into_masks_and_boxes to the latest binary masks-to-string encoding format.

--
257002124  by Zhichao Lu:

    Make object_detection/utils Python3-compatible, except json_utils.

    The patching trick used in json_utils is not going to work in Python 3.

--
256795056  by lzc:

    Add a detection_anchor_indices field to detection outputs.

--
256477542  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
256387593  by Zhichao Lu:

    Edit class_id_function_approximations builder to skip class ids not present in label map.

--
256259039  by Zhichao Lu:

    Move NMS to TPU for FasterRCNN.

--
256071360  by rathodv:

    When multiclass_scores is empty, add one-hot encoding of groundtruth_classes as multiclass scores so that data_augmentation ops that expect the presence of multiclass_scores don't have to individually handle this case.

    Also copy input tensor_dict to out_tensor_dict first to avoid inplace modification.
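    A hedged NumPy sketch of that fallback (field names taken from the
    description above; the helper name is hypothetical):

    ```python
    import numpy as np

    def add_default_multiclass_scores(tensor_dict, num_classes):
        """If multiclass_scores is empty, substitute a one-hot encoding of
        groundtruth_classes, so downstream augmentation ops can assume the
        field is always populated."""
        # Copy first, so the input dict is never modified in place.
        out_tensor_dict = dict(tensor_dict)
        scores = np.asarray(out_tensor_dict.get('multiclass_scores', []))
        if scores.size == 0:
            classes = np.asarray(out_tensor_dict['groundtruth_classes'])
            out_tensor_dict['multiclass_scores'] = np.eye(num_classes)[classes]
        return out_tensor_dict
    ```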

--
256023645  by Zhichao Lu:

    Adds the first WIP iterations of TensorFlow v2 eager + functions style custom training & evaluation loops.

--
255980623  by Zhichao Lu:

    Adds a new data augmentation operation "remap_labels" which remaps a set of labels to a new label.
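    The core of such an op reduces to a vectorized relabeling; a minimal
    NumPy sketch (argument names assumed from the description):

    ```python
    import numpy as np

    def remap_labels(labels, original_labels, new_label):
        """Replace every label in `original_labels` with `new_label`,
        leaving all other labels untouched."""
        labels = np.asarray(labels)
        return np.where(np.isin(labels, original_labels), new_label, labels)
    ```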

--
255753259  by Zhichao Lu:

    Announcement of the released evaluation tutorial for Open Images Challenge
    2019.

--
255698776  by lzc:

    Fix rewrite_nn_resize_op function which was broken by tf forward compatibility movement.

--
255623150  by Zhichao Lu:

    Add Keras-based ResnetV1 models.

--
255504992  by Zhichao Lu:

    Fixing the typo in specifying label expansion for ground truth segmentation
    file.

--
255470768  by Zhichao Lu:

    1. Fixing Python bug with parsed arguments.
    2. Adding capability to parse relevant columns from CSV header.
    3. Fixing bug with duplicated labels expansion.

--
255462432  by Zhichao Lu:

    Adds a new data augmentation operation "drop_label_probabilistically" which drops a given label with the given probability. This supports experiments on training in the presence of label noise.
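    The sampling rule can be sketched as a boolean keep-mask over the boxes
    (a simplified standalone version; the real op also filters the
    corresponding boxes, weights, and other fields):

    ```python
    import numpy as np

    def drop_label_probabilistically(labels, dropped_label, drop_prob, rng):
        """Each box carrying `dropped_label` is independently dropped with
        probability `drop_prob`. Returns a boolean keep mask."""
        labels = np.asarray(labels)
        drop = (labels == dropped_label) & (rng.random(labels.shape) < drop_prob)
        return ~drop
    ```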

--
255441632  by rathodv:

    Fallback on groundtruth classes when multiclass_scores tensor is empty.

--
255434899  by Zhichao Lu:

    Ensuring the evaluation binary can run even with big files by synchronizing
    processing of ground truth and predictions: this way, ground truth is not
    stored but immediately used for evaluation. In the case of ground truth
    object masks, this allows running evaluations on relatively large sets.

--
255337855  by lzc:

    Internal change.

--
255308908  by Zhichao Lu:

    Add comment to clarify usage of calibration parameters proto.

--
255266371  by Zhichao Lu:

    Ensuring correct processing of the case when no groundtruth masks are
    provided for an image.

--
255236648  by Zhichao Lu:

    Refactor model_builder in faster_rcnn.py to a util_map, so that it can be overridden.

--
255093285  by Zhichao Lu:

    Updating capability to subsample data during evaluation

--
255081222  by rathodv:

    Convert groundtruth masks to float32 before they are used in the loss function.

    When using mixed precision training, masks are represented using bfloat16 tensors in the input pipeline for performance reasons. We need to convert them to float32 before using it in the loss function.

--
254788436  by Zhichao Lu:

    Add forward_compatible to non_max_suppression_with_scores to make it
    compatible with older TensorFlow versions.

--
254442362  by Zhichao Lu:

    Add num_layer field to ssd feature extractor proto.

--
253911582  by jonathanhuang:

    Plumbs Soft-NMS options (using the new tf.image.non_max_suppression_with_scores op) into the TF Object Detection API.  It adds a `soft_nms_sigma` field to the postprocessing proto file and plumbs this through to both the multiclass and class_agnostic versions of NMS. Note that there is no effect on behavior of NMS when soft_nms_sigma=0 (which it is set to by default).

    See also "Soft-NMS -- Improving Object Detection With One Line of Code" by Bodla et al (https://arxiv.org/abs/1704.04503)
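    A hedged NumPy sketch of the greedy Gaussian score-decay rule from Bodla
    et al. (an illustration of the idea, not the tf.image op's actual
    implementation; the hard-NMS IoU threshold of 0.5 below is an assumption):

    ```python
    import numpy as np

    def _iou(box, boxes):
        # IoU of one [ymin, xmin, ymax, xmax] box against many.
        inter_h = np.clip(np.minimum(box[2], boxes[:, 2]) -
                          np.maximum(box[0], boxes[:, 0]), 0, None)
        inter_w = np.clip(np.minimum(box[3], boxes[:, 3]) -
                          np.maximum(box[1], boxes[:, 1]), 0, None)
        inter = inter_h * inter_w
        area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
        return inter / (area(box) + area(boxes) - inter)

    def soft_nms(boxes, scores, soft_nms_sigma=0.5, score_thresh=0.001):
        """Greedy soft-NMS: overlapping boxes have their scores decayed by
        exp(-iou**2 / sigma) instead of being removed outright. With
        soft_nms_sigma == 0 this degrades to hard NMS at IoU 0.5."""
        boxes = np.asarray(boxes, dtype=float)
        scores = np.asarray(scores, dtype=float)
        kept = []
        while scores.size:
            i = int(scores.argmax())
            if scores[i] < score_thresh:
                break
            kept.append((boxes[i], scores[i]))
            ious = _iou(boxes[i], boxes)
            if soft_nms_sigma > 0:
                decay = np.exp(-ious ** 2 / soft_nms_sigma)
            else:
                decay = (ious < 0.5).astype(float)
            mask = np.arange(scores.size) != i
            scores = (scores * decay)[mask]
            boxes = boxes[mask]
        return kept
    ```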

--
253703949  by Zhichao Lu:

    Internal test fixes.

--
253151266  by Zhichao Lu:

    Fix the op type check for FusedBatchNorm, given that we introduced
    FusedBatchNormV3 in a previous change.

--
252718956  by Zhichao Lu:

    Customize activation function to enable relu6 instead of relu for saliency
    prediction model seastarization

--
252158593  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
252150717  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
251967048  by Zhichao Lu:

    Make GraphRewriter proto extensible.

--
251950039  by Zhichao Lu:

    Remove experimental_export_device_assignment from TPUEstimator.export_savedmodel(), so as to remove rewrite_for_inference().

    As a replacement, the export_savedmodel() V2 API supports device assignment, where the user calls tpu.rewrite in model_fn and passes in device_assignment there.

--
251890697  by rathodv:

    Updated docstring to include new output nodes.

--
251662894  by Zhichao Lu:

    Add the autoaugment augmentation option to the object detection API
    codebase. This is an available option in preprocessor.py.

    Autoaugment is intended to be used together with random flipping and
    cropping for best results.

--
251532908  by Zhichao Lu:

    Add TrainingDataType enum to track whether class-specific or agnostic data was used to fit the calibration function.

    This is useful, since classes with few observations may require a calibration function fit on all classes.

--
251511339  by Zhichao Lu:

    Add multiclass isotonic regression to the calibration builder.
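    Isotonic regression fits a nondecreasing map from raw detection score to
    empirical precision; for the multiclass builder, one such map is fit per
    class id. A minimal pool-adjacent-violators sketch for a single class
    (an illustration of the technique, not the builder's actual code):

    ```python
    import numpy as np

    def isotonic_calibration(scores, hits):
        """Fit a nondecreasing map from score to empirical precision via
        pool-adjacent-violators. `hits` are 0/1 correctness labels.
        Returns (sorted_scores, fitted_values)."""
        order = np.argsort(scores)
        x = np.asarray(scores, dtype=float)[order]
        y = np.asarray(hits, dtype=float)[order]
        # Each block holds [sum_of_values, count]; merge while the running
        # block means would otherwise decrease.
        blocks = []
        for value in y:
            blocks.append([value, 1])
            while (len(blocks) > 1 and
                   blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
                v, n = blocks.pop()
                blocks[-1][0] += v
                blocks[-1][1] += n
        fitted = np.concatenate([[s / n] * n for s, n in blocks])
        return x, fitted
    ```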

--
251317769  by pengchong:

    Internal Change.

--
250729989  by Zhichao Lu:

    Fixing bug in gt statistics count in case of mask and box annotations.

--
250729627  by Zhichao Lu:

    Label expansion for segmentation.

--
250724905  by Zhichao Lu:

    Fix use_depthwise in fpn and test it with fpnlite on ssd + mobilenet v2.

--
250670379  by Zhichao Lu:

    Internal change

--
250630364  by lzc:

    Fix detection_model_zoo footnotes

--
250560654  by Zhichao Lu:

    Fix static shape issue in matmul_crop_and_resize.

--
250534857  by Zhichao Lu:

    Edit class agnostic calibration function docstring to more accurately describe the function's outputs.

--
250533277  by Zhichao Lu:

    Edit the multiclass messages to use class ids instead of labels.

--

PiperOrigin-RevId: 257914648
parent 81123ebf
@@ -21,6 +21,7 @@ Based on PNASNet model: https://arxiv.org/abs/1712.00559
 import tensorflow as tf
 from object_detection.meta_architectures import faster_rcnn_meta_arch
+from object_detection.utils import variables_helper
 from nets.nasnet import nasnet_utils
 from nets.nasnet import pnasnet
@@ -302,7 +303,7 @@ class FasterRCNNPNASFeatureExtractor(
         the model graph.
     """
     variables_to_restore = {}
-    for variable in tf.global_variables():
+    for variable in variables_helper.get_global_variables_safely():
       if variable.op.name.startswith(
           first_stage_feature_extractor_scope):
         var_name = variable.op.name.replace(
...
@@ -44,7 +44,8 @@ class FasterRCNNResnetV1FeatureExtractor(
                first_stage_features_stride,
                batch_norm_trainable=False,
                reuse_weights=None,
-               weight_decay=0.0):
+               weight_decay=0.0,
+               activation_fn=tf.nn.relu):
     """Constructor.

     Args:
@@ -55,6 +56,7 @@ class FasterRCNNResnetV1FeatureExtractor(
       batch_norm_trainable: See base class.
       reuse_weights: See base class.
       weight_decay: See base class.
+      activation_fn: Activaton functon to use in Resnet V1 model.

     Raises:
       ValueError: If `first_stage_features_stride` is not 8 or 16.
@@ -63,9 +65,10 @@ class FasterRCNNResnetV1FeatureExtractor(
       raise ValueError('`first_stage_features_stride` must be 8 or 16.')
     self._architecture = architecture
     self._resnet_model = resnet_model
-    super(FasterRCNNResnetV1FeatureExtractor, self).__init__(
-        is_training, first_stage_features_stride, batch_norm_trainable,
-        reuse_weights, weight_decay)
+    self._activation_fn = activation_fn
+    super(FasterRCNNResnetV1FeatureExtractor,
+          self).__init__(is_training, first_stage_features_stride,
+                         batch_norm_trainable, reuse_weights, weight_decay)

   def preprocess(self, resized_inputs):
     """Faster R-CNN Resnet V1 preprocessing.
@@ -125,6 +128,7 @@ class FasterRCNNResnetV1FeatureExtractor(
         resnet_utils.resnet_arg_scope(
             batch_norm_epsilon=1e-5,
             batch_norm_scale=True,
+            activation_fn=self._activation_fn,
             weight_decay=self._weight_decay)):
       with tf.variable_scope(
           self._architecture, reuse=self._reuse_weights) as var_scope:
@@ -159,6 +163,7 @@ class FasterRCNNResnetV1FeatureExtractor(
         resnet_utils.resnet_arg_scope(
             batch_norm_epsilon=1e-5,
             batch_norm_scale=True,
+            activation_fn=self._activation_fn,
             weight_decay=self._weight_decay)):
       with slim.arg_scope([slim.batch_norm],
                           is_training=self._train_batch_norm):
@@ -182,7 +187,8 @@ class FasterRCNNResnet50FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
                first_stage_features_stride,
                batch_norm_trainable=False,
                reuse_weights=None,
-               weight_decay=0.0):
+               weight_decay=0.0,
+               activation_fn=tf.nn.relu):
     """Constructor.

     Args:
@@ -191,15 +197,16 @@ class FasterRCNNResnet50FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
       batch_norm_trainable: See base class.
       reuse_weights: See base class.
       weight_decay: See base class.
+      activation_fn: See base class.

     Raises:
       ValueError: If `first_stage_features_stride` is not 8 or 16,
         or if `architecture` is not supported.
     """
-    super(FasterRCNNResnet50FeatureExtractor, self).__init__(
-        'resnet_v1_50', resnet_v1.resnet_v1_50, is_training,
-        first_stage_features_stride, batch_norm_trainable,
-        reuse_weights, weight_decay)
+    super(FasterRCNNResnet50FeatureExtractor,
+          self).__init__('resnet_v1_50', resnet_v1.resnet_v1_50, is_training,
+                         first_stage_features_stride, batch_norm_trainable,
+                         reuse_weights, weight_decay, activation_fn)


 class FasterRCNNResnet101FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
@@ -210,7 +217,8 @@ class FasterRCNNResnet101FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
                first_stage_features_stride,
                batch_norm_trainable=False,
                reuse_weights=None,
-               weight_decay=0.0):
+               weight_decay=0.0,
+               activation_fn=tf.nn.relu):
     """Constructor.

     Args:
@@ -219,15 +227,16 @@ class FasterRCNNResnet101FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
       batch_norm_trainable: See base class.
       reuse_weights: See base class.
       weight_decay: See base class.
+      activation_fn: See base class.

     Raises:
       ValueError: If `first_stage_features_stride` is not 8 or 16,
         or if `architecture` is not supported.
     """
-    super(FasterRCNNResnet101FeatureExtractor, self).__init__(
-        'resnet_v1_101', resnet_v1.resnet_v1_101, is_training,
-        first_stage_features_stride, batch_norm_trainable,
-        reuse_weights, weight_decay)
+    super(FasterRCNNResnet101FeatureExtractor,
+          self).__init__('resnet_v1_101', resnet_v1.resnet_v1_101, is_training,
+                         first_stage_features_stride, batch_norm_trainable,
+                         reuse_weights, weight_decay, activation_fn)


 class FasterRCNNResnet152FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
@@ -238,7 +247,8 @@ class FasterRCNNResnet152FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
                first_stage_features_stride,
                batch_norm_trainable=False,
                reuse_weights=None,
-               weight_decay=0.0):
+               weight_decay=0.0,
+               activation_fn=tf.nn.relu):
     """Constructor.

     Args:
@@ -247,12 +257,13 @@ class FasterRCNNResnet152FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
       batch_norm_trainable: See base class.
       reuse_weights: See base class.
       weight_decay: See base class.
+      activation_fn: See base class.

     Raises:
       ValueError: If `first_stage_features_stride` is not 8 or 16,
         or if `architecture` is not supported.
     """
-    super(FasterRCNNResnet152FeatureExtractor, self).__init__(
-        'resnet_v1_152', resnet_v1.resnet_v1_152, is_training,
-        first_stage_features_stride, batch_norm_trainable,
-        reuse_weights, weight_decay)
+    super(FasterRCNNResnet152FeatureExtractor,
+          self).__init__('resnet_v1_152', resnet_v1.resnet_v1_152, is_training,
+                         first_stage_features_stride, batch_norm_trainable,
+                         reuse_weights, weight_decay, activation_fn)
...
@@ -25,6 +25,7 @@ class FasterRcnnResnetV1FeatureExtractorTest(tf.test.TestCase):

   def _build_feature_extractor(self,
                                first_stage_features_stride,
+                               activation_fn=tf.nn.relu,
                                architecture='resnet_v1_101'):
     feature_extractor_map = {
         'resnet_v1_50':
@@ -37,6 +38,7 @@ class FasterRcnnResnetV1FeatureExtractorTest(tf.test.TestCase):
     return feature_extractor_map[architecture](
         is_training=False,
         first_stage_features_stride=first_stage_features_stride,
+        activation_fn=activation_fn,
         batch_norm_trainable=False,
         reuse_weights=None,
         weight_decay=0.0)
@@ -132,6 +134,32 @@ class FasterRcnnResnetV1FeatureExtractorTest(tf.test.TestCase):
       features_shape_out = sess.run(features_shape)
       self.assertAllEqual(features_shape_out, [3, 7, 7, 2048])

+  def test_overwriting_activation_fn(self):
+    for architecture in ['resnet_v1_50', 'resnet_v1_101', 'resnet_v1_152']:
+      feature_extractor = self._build_feature_extractor(
+          first_stage_features_stride=16,
+          architecture=architecture,
+          activation_fn=tf.nn.relu6)
+      preprocessed_inputs = tf.random_uniform([4, 224, 224, 3],
+                                              maxval=255,
+                                              dtype=tf.float32)
+      rpn_feature_map, _ = feature_extractor.extract_proposal_features(
+          preprocessed_inputs, scope='TestStage1Scope')
+      _ = feature_extractor.extract_box_classifier_features(
+          rpn_feature_map, scope='TestStaget2Scope')
+      conv_ops = [
+          op for op in tf.get_default_graph().get_operations()
+          if op.type == 'Relu6'
+      ]
+      op_names = [op.name for op in conv_ops]
+      self.assertIsNotNone(conv_ops)
+      self.assertIn('TestStage1Scope/resnet_v1_50/resnet_v1_50/conv1/Relu6',
+                    op_names)
+      self.assertIn(
+          'TestStaget2Scope/resnet_v1_50/block4/unit_1/bottleneck_v1/conv1/Relu6',
+          op_names)


 if __name__ == '__main__':
   tf.test.main()
...
@@ -79,14 +79,19 @@ def create_conv_block(
   """
   layers = []
   if use_depthwise:
-    layers.append(tf.keras.layers.SeparableConv2D(
-        depth,
-        [kernel_size, kernel_size],
-        depth_multiplier=1,
-        padding=padding,
-        strides=stride,
-        name=layer_name + '_depthwise_conv',
-        **conv_hyperparams.params()))
+    kwargs = conv_hyperparams.params()
+    # Both the regularizer and initializer apply to the depthwise layer,
+    # so we remap the kernel_* to depthwise_* here.
+    kwargs['depthwise_regularizer'] = kwargs['kernel_regularizer']
+    kwargs['depthwise_initializer'] = kwargs['kernel_initializer']
+    layers.append(
+        tf.keras.layers.SeparableConv2D(
+            depth, [kernel_size, kernel_size],
+            depth_multiplier=1,
+            padding=padding,
+            strides=stride,
+            name=layer_name + '_depthwise_conv',
+            **kwargs))
   else:
     layers.append(tf.keras.layers.Conv2D(
         depth,
...
@@ -160,7 +160,12 @@ class _LayersOverride(object):
     """
     if self._conv_hyperparams:
       kwargs = self._conv_hyperparams.params(**kwargs)
+      # Both the regularizer and initializer apply to the depthwise layer in
+      # MobilenetV1, so we remap the kernel_* to depthwise_* here.
+      kwargs['depthwise_regularizer'] = kwargs['kernel_regularizer']
+      kwargs['depthwise_initializer'] = kwargs['kernel_initializer']
     else:
+      kwargs['depthwise_regularizer'] = self.regularizer
       kwargs['depthwise_initializer'] = self.initializer
     kwargs['padding'] = 'same'
...
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""A wrapper around the Keras Resnet V1 models for object detection."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from object_detection.core import freezable_batch_norm
def _fixed_padding(inputs, kernel_size, rate=1): # pylint: disable=invalid-name
"""Pads the input along the spatial dimensions independently of input size.
Pads the input such that if it was used in a convolution with 'VALID' padding,
the output would have the same dimensions as if the unpadded input was used
in a convolution with 'SAME' padding.
Args:
inputs: A tensor of size [batch, height_in, width_in, channels].
kernel_size: The kernel to be used in the conv2d or max_pool2d operation.
rate: An integer, rate for atrous convolution.
Returns:
output: A tensor of size [batch, height_out, width_out, channels] with the
input, either intact (if kernel_size == 1) or padded (if kernel_size > 1).
"""
kernel_size_effective = kernel_size + (kernel_size - 1) * (rate - 1)
pad_total = kernel_size_effective - 1
pad_beg = pad_total // 2
pad_end = pad_total - pad_beg
padded_inputs = tf.pad(
inputs, [[0, 0], [pad_beg, pad_end], [pad_beg, pad_end], [0, 0]])
return padded_inputs
class _LayersOverride(object):
"""Alternative Keras layers interface for the Keras Resnet V1."""
def __init__(self,
batchnorm_training,
batchnorm_scale=True,
default_batchnorm_momentum=0.997,
default_batchnorm_epsilon=1e-5,
weight_decay=0.0001,
conv_hyperparams=None,
min_depth=8,
depth_multiplier=1):
"""Alternative tf.keras.layers interface, for use by the Keras Resnet V1.
The class is used by the Keras applications kwargs injection API to
modify the Resnet V1 Keras application with changes required by
the Object Detection API.
Args:
batchnorm_training: Bool. Assigned to Batch norm layer `training` param
when constructing `freezable_batch_norm.FreezableBatchNorm` layers.
batchnorm_scale: If True, uses an explicit `gamma` multiplier to scale
the activations in the batch normalization layer.
default_batchnorm_momentum: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the momentum.
default_batchnorm_epsilon: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the epsilon.
weight_decay: The weight decay to use for regularizing the model.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops. Optionally set to `None`
to use default resnet_v1 layer builders.
min_depth: Minimum number of filters in the convolutional layers.
depth_multiplier: The depth multiplier to modify the number of filters
in the convolutional layers.
"""
self._batchnorm_training = batchnorm_training
self._batchnorm_scale = batchnorm_scale
self._default_batchnorm_momentum = default_batchnorm_momentum
self._default_batchnorm_epsilon = default_batchnorm_epsilon
self._conv_hyperparams = conv_hyperparams
self._min_depth = min_depth
self._depth_multiplier = depth_multiplier
self.regularizer = tf.keras.regularizers.l2(weight_decay)
self.initializer = tf.variance_scaling_initializer()
def _FixedPaddingLayer(self, kernel_size, rate=1):
return tf.keras.layers.Lambda(
lambda x: _fixed_padding(x, kernel_size, rate))
def Conv2D(self, filters, kernel_size, **kwargs):
"""Builds a Conv2D layer according to the current Object Detection config.
Overrides the Keras Resnet application's convolutions with ones that
follow the spec specified by the Object Detection hyperparameters.
Args:
filters: The number of filters to use for the convolution.
kernel_size: The kernel size to specify the height and width of the 2D
convolution window.
**kwargs: Keyword args specified by the Keras application for
constructing the convolution.
Returns:
A one-arg callable that will either directly apply a Keras Conv2D layer to
the input argument, or that will first pad the input then apply a Conv2D
layer.
"""
# Apply the minimum depth to the convolution layers.
filters = max(int(filters * self._depth_multiplier), self._min_depth)
if self._conv_hyperparams:
kwargs = self._conv_hyperparams.params(**kwargs)
else:
kwargs['kernel_regularizer'] = self.regularizer
kwargs['kernel_initializer'] = self.initializer
# Set use_bias as false to keep it consistent with Slim Resnet model.
kwargs['use_bias'] = False
kwargs['padding'] = 'same'
stride = kwargs.get('strides')
if stride and kernel_size and stride > 1 and kernel_size > 1:
kwargs['padding'] = 'valid'
def padded_conv(features): # pylint: disable=invalid-name
padded_features = self._FixedPaddingLayer(kernel_size)(features)
return tf.keras.layers.Conv2D(
filters, kernel_size, **kwargs)(padded_features)
return padded_conv
else:
return tf.keras.layers.Conv2D(filters, kernel_size, **kwargs)
def Activation(self, *args, **kwargs): # pylint: disable=unused-argument
"""Builds an activation layer.
Overrides the Keras application Activation layer specified by the
Object Detection configuration.
Args:
*args: Ignored,
required to match the `tf.keras.layers.Activation` interface.
**kwargs: Only the name is used,
required to match `tf.keras.layers.Activation` interface.
Returns:
An activation layer specified by the Object Detection hyperparameter
configurations.
"""
name = kwargs.get('name')
if self._conv_hyperparams:
return self._conv_hyperparams.build_activation_layer(name=name)
else:
return tf.keras.layers.Lambda(tf.nn.relu, name=name)
def BatchNormalization(self, **kwargs):
"""Builds a normalization layer.
Overrides the Keras application batch norm with the norm specified by the
Object Detection configuration.
Args:
**kwargs: Only the name is used, all other params ignored.
Required for matching `layers.BatchNormalization` calls in the Keras
application.
Returns:
A normalization layer specified by the Object Detection hyperparameter
configurations.
"""
name = kwargs.get('name')
if self._conv_hyperparams:
return self._conv_hyperparams.build_batch_norm(
training=self._batchnorm_training,
name=name)
else:
kwargs['scale'] = self._batchnorm_scale
kwargs['epsilon'] = self._default_batchnorm_epsilon
return freezable_batch_norm.FreezableBatchNorm(
training=self._batchnorm_training,
momentum=self._default_batchnorm_momentum,
**kwargs)
def Input(self, shape):
"""Builds an Input layer.
Overrides the Keras application Input layer with one that uses a
tf.placeholder_with_default instead of a tf.placeholder. This is necessary
to ensure the application works when run on a TPU.
Args:
shape: A tuple of integers representing the shape of the input, which
      includes both the spatial shape and channels, but not the batch size.
Elements of this tuple can be None; 'None' elements represent dimensions
where the shape is not known.
Returns:
An input layer for the specified shape that internally uses a
placeholder_with_default.
"""
default_size = 224
default_batch_size = 1
shape = list(shape)
default_shape = [default_size if dim is None else dim for dim in shape]
input_tensor = tf.constant(0.0, shape=[default_batch_size] + default_shape)
placeholder_with_default = tf.placeholder_with_default(
input=input_tensor, shape=[None] + shape)
return tf.keras.layers.Input(tensor=placeholder_with_default)
def MaxPooling2D(self, pool_size, **kwargs):
"""Builds a MaxPooling2D layer with default padding as 'SAME'.
This is specified by the default resnet arg_scope in slim.
Args:
pool_size: The pool size specified by the Keras application.
**kwargs: Ignored, required to match the Keras applications usage.
Returns:
A MaxPooling2D layer with default padding as 'SAME'.
"""
kwargs['padding'] = 'same'
return tf.keras.layers.MaxPooling2D(pool_size, **kwargs)
# Add alias as Keras also has it.
MaxPool2D = MaxPooling2D # pylint: disable=invalid-name
def ZeroPadding2D(self, padding, **kwargs): # pylint: disable=unused-argument
"""Replaces explicit padding in the Keras application with a no-op.
Args:
padding: The padding values for image height and width.
**kwargs: Ignored, required to match the Keras applications usage.
Returns:
A no-op identity lambda.
"""
return lambda x: x
# Forward all non-overridden methods to the keras layers
def __getattr__(self, item):
return getattr(tf.keras.layers, item)
# pylint: disable=invalid-name
def resnet_v1_50(batchnorm_training,
batchnorm_scale=True,
default_batchnorm_momentum=0.997,
default_batchnorm_epsilon=1e-5,
weight_decay=0.0001,
conv_hyperparams=None,
min_depth=8,
depth_multiplier=1,
**kwargs):
"""Instantiates the Resnet50 architecture, modified for object detection.
Args:
batchnorm_training: Bool. Assigned to Batch norm layer `training` param
when constructing `freezable_batch_norm.FreezableBatchNorm` layers.
batchnorm_scale: If True, uses an explicit `gamma` multiplier to scale
the activations in the batch normalization layer.
default_batchnorm_momentum: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the momentum.
default_batchnorm_epsilon: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the epsilon.
weight_decay: The weight decay to use for regularizing the model.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops. Optionally set to `None`
to use default resnet_v1 layer builders.
min_depth: Minimum number of filters in the convolutional layers.
depth_multiplier: The depth multiplier to modify the number of filters
in the convolutional layers.
**kwargs: Keyword arguments forwarded directly to the
      `tf.keras.applications.resnet.ResNet50` method that constructs the Keras
model.
Returns:
A Keras ResnetV1-50 model instance.
"""
layers_override = _LayersOverride(
batchnorm_training,
batchnorm_scale=batchnorm_scale,
default_batchnorm_momentum=default_batchnorm_momentum,
default_batchnorm_epsilon=default_batchnorm_epsilon,
conv_hyperparams=conv_hyperparams,
weight_decay=weight_decay,
min_depth=min_depth,
depth_multiplier=depth_multiplier)
return tf.keras.applications.resnet.ResNet50(
layers=layers_override, **kwargs)
def resnet_v1_101(batchnorm_training,
batchnorm_scale=True,
default_batchnorm_momentum=0.997,
default_batchnorm_epsilon=1e-5,
weight_decay=0.0001,
conv_hyperparams=None,
min_depth=8,
depth_multiplier=1,
**kwargs):
  """Instantiates the Resnet101 architecture, modified for object detection.
Args:
batchnorm_training: Bool. Assigned to Batch norm layer `training` param
when constructing `freezable_batch_norm.FreezableBatchNorm` layers.
batchnorm_scale: If True, uses an explicit `gamma` multiplier to scale
the activations in the batch normalization layer.
default_batchnorm_momentum: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the momentum.
default_batchnorm_epsilon: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the epsilon.
weight_decay: The weight decay to use for regularizing the model.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops. Optionally set to `None`
to use default resnet_v1 layer builders.
min_depth: Minimum number of filters in the convolutional layers.
depth_multiplier: The depth multiplier to modify the number of filters
in the convolutional layers.
**kwargs: Keyword arguments forwarded directly to the
`tf.keras.applications.Mobilenet` method that constructs the Keras
model.
Returns:
A Keras ResnetV1-101 model instance.
"""
layers_override = _LayersOverride(
batchnorm_training,
batchnorm_scale=batchnorm_scale,
default_batchnorm_momentum=default_batchnorm_momentum,
default_batchnorm_epsilon=default_batchnorm_epsilon,
conv_hyperparams=conv_hyperparams,
weight_decay=weight_decay,
min_depth=min_depth,
depth_multiplier=depth_multiplier)
return tf.keras.applications.resnet.ResNet101(
layers=layers_override, **kwargs)

def resnet_v1_152(batchnorm_training,
                  batchnorm_scale=True,
                  default_batchnorm_momentum=0.997,
                  default_batchnorm_epsilon=1e-5,
                  weight_decay=0.0001,
                  conv_hyperparams=None,
                  min_depth=8,
                  depth_multiplier=1,
                  **kwargs):
  """Instantiates the Resnet152 architecture, modified for object detection.

  Args:
    batchnorm_training: Bool. Assigned to Batch norm layer `training` param
      when constructing `freezable_batch_norm.FreezableBatchNorm` layers.
    batchnorm_scale: If True, uses an explicit `gamma` multiplier to scale
      the activations in the batch normalization layer.
    default_batchnorm_momentum: Float. When 'conv_hyperparams' is None,
      batch norm layers will be constructed using this value as the momentum.
    default_batchnorm_epsilon: Float. When 'conv_hyperparams' is None,
      batch norm layers will be constructed using this value as the epsilon.
    weight_decay: The weight decay to use for regularizing the model.
    conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
      containing hyperparameters for convolution ops. Optionally set to `None`
      to use default resnet_v1 layer builders.
    min_depth: Minimum number of filters in the convolutional layers.
    depth_multiplier: The depth multiplier to modify the number of filters
      in the convolutional layers.
    **kwargs: Keyword arguments forwarded directly to the
      `tf.keras.applications.resnet.ResNet152` method that constructs the
      Keras model.

  Returns:
    A Keras ResnetV1-152 model instance.
  """
  layers_override = _LayersOverride(
      batchnorm_training,
      batchnorm_scale=batchnorm_scale,
      default_batchnorm_momentum=default_batchnorm_momentum,
      default_batchnorm_epsilon=default_batchnorm_epsilon,
      conv_hyperparams=conv_hyperparams,
      weight_decay=weight_decay,
      min_depth=min_depth,
      depth_multiplier=depth_multiplier)
  return tf.keras.applications.resnet.ResNet152(
      layers=layers_override, **kwargs)
# pylint: enable=invalid-name
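The three builders above share one mechanism: rather than subclassing the Keras application, they pass a `_LayersOverride` object through the `layers` argument of `tf.keras.applications.resnet.ResNet*`, so the model-building code resolves layer constructors (e.g. `layers.Conv2D`) on the override object instead of on `tf.keras.layers`. The following is a minimal, framework-free sketch of that injection pattern; `ToyLayersOverride` and `build_toy_model` are illustrative names, not part of this codebase.

```python
class ToyLayersOverride(object):
  """Substitutes a custom Conv2D that bakes in a hyperparameter."""

  def __init__(self, weight_decay):
    self._weight_decay = weight_decay

  def Conv2D(self, filters, **kwargs):
    # Return a callable layer that records the injected hyperparameter.
    def layer(inputs):
      return ('conv(filters=%d, wd=%g)' % (filters, self._weight_decay),
              inputs)
    return layer


def build_toy_model(layers):
  """Stands in for a Keras application builder taking `layers=...`."""
  conv = layers.Conv2D(64)  # Resolved on the override, not tf.keras.layers.
  return conv('input')


print(build_toy_model(ToyLayersOverride(weight_decay=1e-4)))
```

The real `_LayersOverride` additionally implements `BatchNormalization`, `ZeroPadding2D`, and the other constructors ResNet asks for, returning freezable, hyperparameter-aware layers in their place.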
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for resnet_v1.py.

This test mainly focuses on comparing slim resnet v1 and Keras resnet v1 for
object detection. To verify the consistency of the two models, we compare:
1. Output shape of each layer given different inputs.
2. Number of global variables.
"""
import numpy as np
from six.moves import zip
import tensorflow as tf
from google.protobuf import text_format
from object_detection.builders import hyperparams_builder
from object_detection.models.keras_models import resnet_v1
from object_detection.protos import hyperparams_pb2
from object_detection.utils import test_case
_EXPECTED_SHAPES_224_RESNET50 = {
    'conv2_block3_out': (4, 56, 56, 256),
    'conv3_block4_out': (4, 28, 28, 512),
    'conv4_block6_out': (4, 14, 14, 1024),
    'conv5_block3_out': (4, 7, 7, 2048),
}

_EXPECTED_SHAPES_224_RESNET101 = {
    'conv2_block3_out': (4, 56, 56, 256),
    'conv3_block4_out': (4, 28, 28, 512),
    'conv4_block23_out': (4, 14, 14, 1024),
    'conv5_block3_out': (4, 7, 7, 2048),
}

_EXPECTED_SHAPES_224_RESNET152 = {
    'conv2_block3_out': (4, 56, 56, 256),
    'conv3_block8_out': (4, 28, 28, 512),
    'conv4_block36_out': (4, 14, 14, 1024),
    'conv5_block3_out': (4, 7, 7, 2048),
}

_RESNET_NAMES = ['resnet_v1_50', 'resnet_v1_101', 'resnet_v1_152']
_RESNET_MODELS = [
    resnet_v1.resnet_v1_50, resnet_v1.resnet_v1_101, resnet_v1.resnet_v1_152
]
_RESNET_SHAPES = [
    _EXPECTED_SHAPES_224_RESNET50, _EXPECTED_SHAPES_224_RESNET101,
    _EXPECTED_SHAPES_224_RESNET152
]

_NUM_CHANNELS = 3
_BATCH_SIZE = 4

class ResnetV1Test(test_case.TestCase):

  def _build_conv_hyperparams(self):
    conv_hyperparams = hyperparams_pb2.Hyperparams()
    conv_hyperparams_text_proto = """
      activation: RELU_6,
      regularizer {
        l2_regularizer {
          weight: 0.0004
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.03
          mean: 0.0
        }
      }
      batch_norm {
        scale: true,
        decay: 0.997,
        epsilon: 0.001,
      }
    """
    text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams)
    return hyperparams_builder.KerasLayerHyperparams(conv_hyperparams)

  def _create_application_with_layer_outputs(self,
                                             model_index,
                                             batchnorm_training,
                                             batchnorm_scale=True,
                                             weight_decay=0.0001,
                                             default_batchnorm_momentum=0.997,
                                             default_batchnorm_epsilon=1e-5):
    """Constructs Keras resnet_v1 that extracts layer outputs."""
    # Have to clear the Keras backend to ensure isolation in layer naming.
    tf.keras.backend.clear_session()
    layer_names = _RESNET_SHAPES[model_index].keys()
    full_model = _RESNET_MODELS[model_index](
        batchnorm_training=batchnorm_training,
        weights=None,
        batchnorm_scale=batchnorm_scale,
        weight_decay=weight_decay,
        default_batchnorm_momentum=default_batchnorm_momentum,
        default_batchnorm_epsilon=default_batchnorm_epsilon,
        include_top=False)
    layer_outputs = [
        full_model.get_layer(name=layer).output for layer in layer_names
    ]
    return tf.keras.Model(inputs=full_model.inputs, outputs=layer_outputs)

  def _check_returns_correct_shape(self,
                                   image_height,
                                   image_width,
                                   model_index,
                                   expected_feature_map_shape,
                                   batchnorm_training=True,
                                   batchnorm_scale=True,
                                   weight_decay=0.0001,
                                   default_batchnorm_momentum=0.997,
                                   default_batchnorm_epsilon=1e-5):
    model = self._create_application_with_layer_outputs(
        model_index=model_index,
        batchnorm_training=batchnorm_training,
        batchnorm_scale=batchnorm_scale,
        weight_decay=weight_decay,
        default_batchnorm_momentum=default_batchnorm_momentum,
        default_batchnorm_epsilon=default_batchnorm_epsilon)
    image_tensor = np.random.rand(_BATCH_SIZE, image_height, image_width,
                                  _NUM_CHANNELS).astype(np.float32)
    feature_maps = model(image_tensor)
    layer_names = _RESNET_SHAPES[model_index].keys()
    for feature_map, layer_name in zip(feature_maps, layer_names):
      expected_shape = _RESNET_SHAPES[model_index][layer_name]
      self.assertAllEqual(feature_map.shape, expected_shape)

  def _get_variables(self, model_index):
    tf.keras.backend.clear_session()
    model = self._create_application_with_layer_outputs(
        model_index, batchnorm_training=False)
    preprocessed_inputs = tf.placeholder(tf.float32,
                                         (4, None, None, _NUM_CHANNELS))
    model(preprocessed_inputs)
    return model.variables

  def test_returns_correct_shapes_224(self):
    image_height = 224
    image_width = 224
    for model_index, _ in enumerate(_RESNET_NAMES):
      expected_feature_map_shape = _RESNET_SHAPES[model_index]
      self._check_returns_correct_shape(image_height, image_width, model_index,
                                        expected_feature_map_shape)

  def test_hyperparam_override(self):
    for model_name in _RESNET_MODELS:
      model = model_name(
          batchnorm_training=True,
          default_batchnorm_momentum=0.2,
          default_batchnorm_epsilon=0.1,
          weights=None,
          include_top=False)
      bn_layer = model.get_layer(name='conv1_bn')
      self.assertAllClose(bn_layer.momentum, 0.2)
      self.assertAllClose(bn_layer.epsilon, 0.1)

  def test_variable_count(self):
    # The number of variables from the slim resnetv1-* models.
    variable_nums = [265, 520, 775]
    for model_index, var_num in enumerate(variable_nums):
      variables = self._get_variables(model_index)
      self.assertEqual(len(variables), var_num)


if __name__ == '__main__':
  tf.test.main()
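The expected-shape tables in this test follow directly from ResNet's cumulative strides: the `conv2`/`conv3`/`conv4`/`conv5` block outputs downsample the input by factors of 4, 8, 16, and 32 respectively, which for a 224x224 input gives the 56/28/14/7 spatial sizes above. A quick plain-Python sanity check of that arithmetic (the helper name is illustrative):

```python
def resnet_stage_spatial_sizes(input_size, strides=(4, 8, 16, 32)):
  """Spatial edge length of each ResNet stage output for a square input."""
  return [input_size // s for s in strides]

print(resnet_stage_spatial_sizes(224))  # [56, 28, 14, 7]
```

The channel counts (256/512/1024/2048) are fixed by the bottleneck block widths and do not depend on the input size.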
@@ -57,8 +57,13 @@ class SsdFeatureExtractorTestBase(test_case.TestCase):
    return sc

  @abstractmethod
  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False,
                                num_layers=6,
                                use_keras=False,
                                use_depthwise=False):
    """Constructs a new feature extractor.

    Args:
@@ -68,42 +73,64 @@ class SsdFeatureExtractorTestBase(test_case.TestCase):
      use_explicit_padding: use 'VALID' padding for convolutions, but prepad
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      num_layers: number of SSD layers.
      use_keras: if True builds a keras-based feature extractor, if False builds
        a slim-based one.
      use_depthwise: Whether to use depthwise convolutions.

    Returns:
      an ssd_meta_arch.SSDFeatureExtractor or an
      ssd_meta_arch.SSDKerasFeatureExtractor object.
    """
    pass

  def _extract_features(self,
                        image_tensor,
                        depth_multiplier,
                        pad_to_multiple,
                        use_explicit_padding=False,
                        use_depthwise=False,
                        num_layers=6,
                        use_keras=False):
    kwargs = {}
    if use_explicit_padding:
      kwargs.update({'use_explicit_padding': use_explicit_padding})
    if use_depthwise:
      kwargs.update({'use_depthwise': use_depthwise})
    if num_layers != 6:
      kwargs.update({'num_layers': num_layers})
    if use_keras:
      kwargs.update({'use_keras': use_keras})
    feature_extractor = self._create_feature_extractor(
        depth_multiplier,
        pad_to_multiple,
        **kwargs)
    if use_keras:
      feature_maps = feature_extractor(image_tensor)
    else:
      feature_maps = feature_extractor.extract_features(image_tensor)
    return feature_maps

  def check_extract_features_returns_correct_shape(self,
                                                   batch_size,
                                                   image_height,
                                                   image_width,
                                                   depth_multiplier,
                                                   pad_to_multiple,
                                                   expected_feature_map_shapes,
                                                   use_explicit_padding=False,
                                                   num_layers=6,
                                                   use_keras=False,
                                                   use_depthwise=False):

    def graph_fn(image_tensor):
      return self._extract_features(
          image_tensor,
          depth_multiplier,
          pad_to_multiple,
          use_explicit_padding=use_explicit_padding,
          num_layers=num_layers,
          use_keras=use_keras,
          use_depthwise=use_depthwise)

    image_tensor = np.random.rand(batch_size, image_height, image_width,
                                  3).astype(np.float32)
@@ -113,17 +140,29 @@ class SsdFeatureExtractorTestBase(test_case.TestCase):
      self.assertAllEqual(feature_map.shape, expected_shape)

  def check_extract_features_returns_correct_shapes_with_dynamic_inputs(
      self,
      batch_size,
      image_height,
      image_width,
      depth_multiplier,
      pad_to_multiple,
      expected_feature_map_shapes,
      use_explicit_padding=False,
      num_layers=6,
      use_keras=False,
      use_depthwise=False):

    def graph_fn(image_height, image_width):
      image_tensor = tf.random_uniform([batch_size, image_height, image_width,
                                        3], dtype=tf.float32)
      return self._extract_features(
          image_tensor,
          depth_multiplier,
          pad_to_multiple,
          use_explicit_padding=use_explicit_padding,
          num_layers=num_layers,
          use_keras=use_keras,
          use_depthwise=use_depthwise)

    feature_maps = self.execute_cpu(graph_fn, [
        np.array(image_height, dtype=np.int32),
@@ -134,13 +173,20 @@ class SsdFeatureExtractorTestBase(test_case.TestCase):
      self.assertAllEqual(feature_map.shape, expected_shape)

  def check_extract_features_raises_error_with_invalid_image_size(
      self,
      image_height,
      image_width,
      depth_multiplier,
      pad_to_multiple,
      use_keras=False,
      use_depthwise=False):
    preprocessed_inputs = tf.placeholder(tf.float32, (4, None, None, 3))
    feature_maps = self._extract_features(
        preprocessed_inputs,
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    test_preprocessed_image = np.random.rand(4, image_height, image_width, 3)
    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
@@ -148,20 +194,32 @@ class SsdFeatureExtractorTestBase(test_case.TestCase):
      sess.run(feature_maps,
               feed_dict={preprocessed_inputs: test_preprocessed_image})

  def check_feature_extractor_variables_under_scope(self,
                                                    depth_multiplier,
                                                    pad_to_multiple,
                                                    scope_name,
                                                    use_keras=False,
                                                    use_depthwise=False):
    variables = self.get_feature_extractor_variables(
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    for variable in variables:
      self.assertTrue(variable.name.startswith(scope_name))

  def get_feature_extractor_variables(self,
                                      depth_multiplier,
                                      pad_to_multiple,
                                      use_keras=False,
                                      use_depthwise=False):
    g = tf.Graph()
    with g.as_default():
      preprocessed_inputs = tf.placeholder(tf.float32, (4, None, None, 3))
      self._extract_features(
          preprocessed_inputs,
          depth_multiplier,
          pad_to_multiple,
          use_keras=use_keras,
          use_depthwise=use_depthwise)
      return g.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
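The `_extract_features` helper above only forwards an argument when it differs from its default, so subclass `_create_feature_extractor` overrides that predate the new `num_layers`/`use_depthwise` keywords keep working unchanged. A minimal sketch of that backward-compatibility pattern; `make_kwargs`, `legacy_factory`, and `new_factory` are hypothetical names for illustration:

```python
def make_kwargs(use_explicit_padding=False, num_layers=6):
  """Collects only the non-default keyword arguments."""
  kwargs = {}
  if use_explicit_padding:
    kwargs['use_explicit_padding'] = use_explicit_padding
  if num_layers != 6:
    kwargs['num_layers'] = num_layers
  return kwargs


def legacy_factory(depth_multiplier):
  """An old override that does not know about num_layers."""
  return ('legacy', depth_multiplier)


def new_factory(depth_multiplier, num_layers=6):
  """A newer override that accepts the extra keyword."""
  return ('new', depth_multiplier, num_layers)


# Defaults produce an empty kwargs dict, so the legacy signature still works:
print(legacy_factory(1.0, **make_kwargs()))  # ('legacy', 1.0)
# Non-default values are forwarded, and only new_factory can accept them:
print(new_factory(1.0, **make_kwargs(num_layers=4)))  # ('new', 1.0, 4)
```

The trade-off is that a caller who explicitly passes the default value gets the same behavior as not passing it at all, which is acceptable here because defaults and non-forwarded values are deliberately identical.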
@@ -37,6 +37,7 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False,
               num_layers=6,
               override_base_feature_extractor_hyperparams=False):
    """InceptionV2 Feature Extractor for SSD Models.
@@ -53,6 +54,7 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
      use_explicit_padding: Whether to use explicit padding when extracting
        features. Default is False.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      num_layers: Number of SSD layers.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams_fn`.
@@ -69,6 +71,7 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        reuse_weights=reuse_weights,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams)
    if not self._override_base_feature_extractor_hyperparams:
@@ -108,8 +111,9 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        33, preprocessed_inputs)
    feature_map_layout = {
        'from_layer': ['Mixed_4c', 'Mixed_5c', '', '', '', ''
                      ][:self._num_layers],
        'layer_depth': [-1, -1, 512, 256, 256, 128][:self._num_layers],
        'use_explicit_padding': self._use_explicit_padding,
        'use_depthwise': self._use_depthwise,
    }
...
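The `[:self._num_layers]` slicing above trims the `from_layer` and `layer_depth` lists in parallel, so a smaller `num_layers` simply drops the trailing extra SSD feature maps while keeping the two lists aligned. A standalone sketch of the same truncation (the helper name is illustrative):

```python
def feature_map_layout(num_layers):
  """Truncates the parallel layout lists to the first num_layers entries."""
  return {
      'from_layer': ['Mixed_4c', 'Mixed_5c', '', '', '', ''][:num_layers],
      'layer_depth': [-1, -1, 512, 256, 256, 128][:num_layers],
  }

print(feature_map_layout(4))
# {'from_layer': ['Mixed_4c', 'Mixed_5c', '', ''],
#  'layer_depth': [-1, -1, 512, 256]}
```

Because both lists are sliced with the same bound, each retained `from_layer` entry still pairs with its intended `layer_depth`, which is what the `num_layers=4` tests below rely on.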
@@ -24,7 +24,11 @@ from object_detection.models import ssd_inception_v2_feature_extractor
class SsdInceptionV2FeatureExtractorTest(
    ssd_feature_extractor_test.SsdFeatureExtractorTestBase):

  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False,
                                num_layers=6,
                                is_training=True):
    """Constructs a SsdInceptionV2FeatureExtractor.
@@ -32,6 +36,10 @@ class SsdInceptionV2FeatureExtractorTest(
      depth_multiplier: float depth multiplier for feature extractor
      pad_to_multiple: the nearest multiple to zero pad the input height and
        width dimensions to.
      use_explicit_padding: Use 'VALID' padding for convolutions, but prepad
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      num_layers: number of SSD layers.
      is_training: whether the network is in training mode.

    Returns:
@@ -39,8 +47,12 @@ class SsdInceptionV2FeatureExtractorTest(
    """
    min_depth = 32
    return ssd_inception_v2_feature_extractor.SSDInceptionV2FeatureExtractor(
        is_training,
        depth_multiplier,
        min_depth,
        pad_to_multiple,
        self.conv_hyperparams_fn,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=True)

  def test_extract_features_returns_correct_shapes_128(self):
@@ -129,6 +141,17 @@ class SsdInceptionV2FeatureExtractorTest(
    self.check_feature_extractor_variables_under_scope(
        depth_multiplier, pad_to_multiple, scope_name)

  def test_extract_features_with_fewer_layers(self):
    image_height = 128
    image_width = 128
    depth_multiplier = 1.0
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 8, 8, 576), (2, 4, 4, 1024),
                                  (2, 2, 2, 512), (2, 1, 1, 256)]
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, num_layers=4)


if __name__ == '__main__':
  tf.test.main()
@@ -37,6 +37,7 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False,
               num_layers=6,
               override_base_feature_extractor_hyperparams=False):
    """InceptionV3 Feature Extractor for SSD Models.
@@ -53,6 +54,7 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
      use_explicit_padding: Whether to use explicit padding when extracting
        features. Default is False.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      num_layers: Number of SSD layers.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams_fn`.
@@ -69,6 +71,7 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        reuse_weights=reuse_weights,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams)
@@ -109,8 +112,9 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        33, preprocessed_inputs)
    feature_map_layout = {
        'from_layer': ['Mixed_5d', 'Mixed_6e', 'Mixed_7c', '', '', ''
                      ][:self._num_layers],
        'layer_depth': [-1, -1, -1, 512, 256, 128][:self._num_layers],
        'use_explicit_padding': self._use_explicit_padding,
        'use_depthwise': self._use_depthwise,
    }
...
@@ -24,7 +24,11 @@ from object_detection.models import ssd_inception_v3_feature_extractor
class SsdInceptionV3FeatureExtractorTest(
    ssd_feature_extractor_test.SsdFeatureExtractorTestBase):

  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False,
                                num_layers=6,
                                is_training=True):
    """Constructs a SsdInceptionV3FeatureExtractor.
@@ -32,6 +36,10 @@ class SsdInceptionV3FeatureExtractorTest(
      depth_multiplier: float depth multiplier for feature extractor
      pad_to_multiple: the nearest multiple to zero pad the input height and
        width dimensions to.
      use_explicit_padding: Use 'VALID' padding for convolutions, but prepad
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      num_layers: number of SSD layers.
      is_training: whether the network is in training mode.

    Returns:
@@ -39,8 +47,12 @@ class SsdInceptionV3FeatureExtractorTest(
    """
    min_depth = 32
    return ssd_inception_v3_feature_extractor.SSDInceptionV3FeatureExtractor(
        is_training,
        depth_multiplier,
        min_depth,
        pad_to_multiple,
        self.conv_hyperparams_fn,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=True)

  def test_extract_features_returns_correct_shapes_128(self):
@@ -129,6 +141,17 @@ class SsdInceptionV3FeatureExtractorTest(
    self.check_feature_extractor_variables_under_scope(
        depth_multiplier, pad_to_multiple, scope_name)

  def test_extract_features_with_fewer_layers(self):
    image_height = 128
    image_width = 128
    depth_multiplier = 1.0
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 13, 13, 288), (2, 6, 6, 768),
                                  (2, 2, 2, 2048), (2, 1, 1, 512)]
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, num_layers=4)


if __name__ == '__main__':
  tf.test.main()
@@ -39,6 +39,7 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False,
               num_layers=6,
               override_base_feature_extractor_hyperparams=False):
    """MobileNetV1 Feature Extractor for SSD Models.
@@ -56,6 +57,7 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      num_layers: Number of SSD layers.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams_fn`.
@@ -69,6 +71,7 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        reuse_weights=reuse_weights,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams)
@@ -103,8 +106,8 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
    feature_map_layout = {
        'from_layer': ['Conv2d_11_pointwise', 'Conv2d_13_pointwise', '', '',
                       '', ''][:self._num_layers],
        'layer_depth': [-1, -1, 512, 256, 256, 128][:self._num_layers],
        'use_explicit_padding': self._use_explicit_padding,
        'use_depthwise': self._use_depthwise,
    }
...
...@@ -12,7 +12,6 @@ ...@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
# ============================================================================== # ==============================================================================
"""Tests for SSD Mobilenet V1 feature extractors. """Tests for SSD Mobilenet V1 feature extractors.
By using parameterized test decorator, this test serves for both Slim-based and By using parameterized test decorator, this test serves for both Slim-based and
...@@ -37,8 +36,12 @@ slim = tf.contrib.slim ...@@ -37,8 +36,12 @@ slim = tf.contrib.slim
class SsdMobilenetV1FeatureExtractorTest( class SsdMobilenetV1FeatureExtractorTest(
ssd_feature_extractor_test.SsdFeatureExtractorTestBase): ssd_feature_extractor_test.SsdFeatureExtractorTestBase):
  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False,
                                num_layers=6,
                                is_training=False,
                                use_keras=False):
    """Constructs a new feature extractor.
@@ -49,16 +52,18 @@ class SsdMobilenetV1FeatureExtractorTest(
      use_explicit_padding: Use 'VALID' padding for convolutions, but prepad
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      num_layers: number of SSD layers.
      is_training: whether the network is in training mode.
      use_keras: if True builds a keras-based feature extractor, if False builds
        a slim-based one.

    Returns:
      an ssd_meta_arch.SSDFeatureExtractor object.
    """
    min_depth = 32
    if use_keras:
      return (ssd_mobilenet_v1_keras_feature_extractor
              .SSDMobileNetV1KerasFeatureExtractor(
                  is_training=is_training,
                  depth_multiplier=depth_multiplier,
                  min_depth=min_depth,
@@ -68,12 +73,17 @@ class SsdMobilenetV1FeatureExtractorTest(
                  freeze_batchnorm=False,
                  inplace_batchnorm_update=False,
                  use_explicit_padding=use_explicit_padding,
                  num_layers=num_layers,
                  name='MobilenetV1'))
    else:
      return ssd_mobilenet_v1_feature_extractor.SSDMobileNetV1FeatureExtractor(
          is_training,
          depth_multiplier,
          min_depth,
          pad_to_multiple,
          self.conv_hyperparams_fn,
          use_explicit_padding=use_explicit_padding,
          num_layers=num_layers)

  def test_extract_features_returns_correct_shapes_128(self, use_keras):
    image_height = 128
@@ -84,12 +94,22 @@ class SsdMobilenetV1FeatureExtractorTest(
                                  (2, 2, 2, 512), (2, 1, 1, 256),
                                  (2, 1, 1, 256), (2, 1, 1, 128)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras)

  def test_extract_features_returns_correct_shapes_299(self, use_keras):
@@ -101,12 +121,22 @@ class SsdMobilenetV1FeatureExtractorTest(
                                  (2, 5, 5, 512), (2, 3, 3, 256),
                                  (2, 2, 2, 256), (2, 1, 1, 128)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras)

  def test_extract_features_with_dynamic_image_shape(self, use_keras):
@@ -118,12 +148,22 @@ class SsdMobilenetV1FeatureExtractorTest(
                                  (2, 2, 2, 512), (2, 1, 1, 256),
                                  (2, 1, 1, 256), (2, 1, 1, 128)]
    self.check_extract_features_returns_correct_shapes_with_dynamic_inputs(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras)

  def test_extract_features_returns_correct_shapes_enforcing_min_depth(
@@ -133,15 +173,25 @@ class SsdMobilenetV1FeatureExtractorTest(
    depth_multiplier = 0.5**12
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 19, 19, 32), (2, 10, 10, 32),
                                  (2, 5, 5, 32), (2, 3, 3, 32), (2, 2, 2, 32),
                                  (2, 1, 1, 32)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras)

  def test_extract_features_returns_correct_shapes_with_pad_to_multiple(
@@ -154,12 +204,22 @@ class SsdMobilenetV1FeatureExtractorTest(
                                  (2, 5, 5, 512), (2, 3, 3, 256),
                                  (2, 2, 2, 256), (2, 1, 1, 128)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras)

  def test_extract_features_raises_error_with_invalid_image_size(
@@ -169,7 +229,10 @@ class SsdMobilenetV1FeatureExtractorTest(
    depth_multiplier = 1.0
    pad_to_multiple = 1
    self.check_extract_features_raises_error_with_invalid_image_size(
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras)

  def test_preprocess_returns_correct_value_range(self, use_keras):
@@ -178,9 +241,8 @@ class SsdMobilenetV1FeatureExtractorTest(
    depth_multiplier = 1
    pad_to_multiple = 1
    test_image = np.random.rand(2, image_height, image_width, 3)
    feature_extractor = self._create_feature_extractor(
        depth_multiplier, pad_to_multiple, use_keras=use_keras)
    preprocessed_image = feature_extractor.preprocess(test_image)
    self.assertTrue(np.all(np.less_equal(np.abs(preprocessed_image), 1.0)))
@@ -212,8 +274,22 @@ class SsdMobilenetV1FeatureExtractorTest(
      _ = feature_extractor(preprocessed_image)
    else:
      _ = feature_extractor.extract_features(preprocessed_image)
    self.assertTrue(
        any('FusedBatchNorm' in op.type
            for op in tf.get_default_graph().get_operations()))
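The assertion here changes from an exact `op.type == 'FusedBatchNorm'` comparison to a substring check, since fused batch norm can appear under versioned op types (for example `FusedBatchNormV3` in newer TensorFlow builds), which an equality test would silently miss. A minimal sketch of the difference, with an illustrative op-type list (in the real test these types come from `tf.get_default_graph().get_operations()`):

```python
# Illustrative op types; the 'FusedBatchNormV3' entry stands in for what a
# newer TensorFlow graph may emit instead of plain 'FusedBatchNorm'.
op_types = ['Conv2D', 'FusedBatchNormV3', 'Relu']

exact = any(t == 'FusedBatchNorm' for t in op_types)      # misses V3
substring = any('FusedBatchNorm' in t for t in op_types)  # matches V3
```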

  def test_extract_features_with_fewer_layers(self, use_keras):
    image_height = 128
    image_width = 128
    depth_multiplier = 1.0
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 8, 8, 512), (2, 4, 4, 1024),
                                  (2, 2, 2, 512), (2, 1, 1, 256)]
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, use_explicit_padding=False, num_layers=4,
        use_keras=use_keras)

if __name__ == '__main__':
  tf.test.main()
@@ -220,7 +220,7 @@ class SsdMobilenetV1FpnFeatureExtractorTest(
    _ = feature_extractor.extract_features(preprocessed_image)
    self.assertTrue(
        any('FusedBatchNorm' in op.type
            for op in tf.get_default_graph().get_operations()))
...
@@ -40,6 +40,7 @@ class SSDMobileNetV1KerasFeatureExtractor(
               inplace_batchnorm_update,
               use_explicit_padding=False,
               use_depthwise=False,
               num_layers=6,
               override_base_feature_extractor_hyperparams=False,
               name=None):
    """Keras MobileNetV1 Feature Extractor for SSD Models.
@@ -65,6 +66,7 @@ class SSDMobileNetV1KerasFeatureExtractor(
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      num_layers: Number of SSD layers.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams`.
@@ -81,13 +83,14 @@ class SSDMobileNetV1KerasFeatureExtractor(
        inplace_batchnorm_update=inplace_batchnorm_update,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams,
        name=name)
    self._feature_map_layout = {
        'from_layer': ['Conv2d_11_pointwise', 'Conv2d_13_pointwise', '', '',
                       '', ''][:self._num_layers],
        'layer_depth': [-1, -1, 512, 256, 256, 128][:self._num_layers],
        'use_explicit_padding': self._use_explicit_padding,
        'use_depthwise': self._use_depthwise,
    }
...
@@ -178,7 +178,7 @@ class SsdMobilenetV1PpnFeatureExtractorTest(
        pad_to_multiple)
    preprocessed_image = feature_extractor.preprocess(image_placeholder)
    _ = feature_extractor.extract_features(preprocessed_image)
    self.assertTrue(any('FusedBatchNorm' in op.type
                        for op in tf.get_default_graph().get_operations()))

if __name__ == '__main__':
...
@@ -40,6 +40,7 @@ class SSDMobileNetV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False,
               num_layers=6,
               override_base_feature_extractor_hyperparams=False):
    """MobileNetV2 Feature Extractor for SSD Models.
@@ -59,6 +60,7 @@ class SSDMobileNetV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
      use_explicit_padding: Whether to use explicit padding when extracting
        features. Default is False.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      num_layers: Number of SSD layers.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams_fn`.
@@ -72,6 +74,7 @@ class SSDMobileNetV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        reuse_weights=reuse_weights,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams)
@@ -105,8 +108,9 @@ class SSDMobileNetV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        33, preprocessed_inputs)
    feature_map_layout = {
        'from_layer': ['layer_15/expansion_output', 'layer_19', '', '', '', ''
                      ][:self._num_layers],
        'layer_depth': [-1, -1, 512, 256, 256, 128][:self._num_layers],
        'use_depthwise': self._use_depthwise,
        'use_explicit_padding': self._use_explicit_padding,
    }
...
@@ -33,8 +33,12 @@ slim = tf.contrib.slim

class SsdMobilenetV2FeatureExtractorTest(
    ssd_feature_extractor_test.SsdFeatureExtractorTestBase):

  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False,
                                num_layers=6,
                                use_keras=False):
    """Constructs a new feature extractor.

    Args:
@@ -44,6 +48,7 @@ class SsdMobilenetV2FeatureExtractorTest(
      use_explicit_padding: use 'VALID' padding for convolutions, but prepad
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      num_layers: number of SSD layers.
      use_keras: if True builds a keras-based feature extractor, if False builds
        a slim-based one.

    Returns:
@@ -61,6 +66,7 @@ class SsdMobilenetV2FeatureExtractorTest(
                  freeze_batchnorm=False,
                  inplace_batchnorm_update=False,
                  use_explicit_padding=use_explicit_padding,
                  num_layers=num_layers,
                  name='MobilenetV2'))
    else:
      return ssd_mobilenet_v2_feature_extractor.SSDMobileNetV2FeatureExtractor(
@@ -69,7 +75,8 @@ class SsdMobilenetV2FeatureExtractorTest(
          min_depth,
          pad_to_multiple,
          self.conv_hyperparams_fn,
          use_explicit_padding=use_explicit_padding,
          num_layers=num_layers)

  def test_extract_features_returns_correct_shapes_128(self, use_keras):
    image_height = 128
@@ -199,9 +206,21 @@ class SsdMobilenetV2FeatureExtractorTest(
      _ = feature_extractor(preprocessed_image)
    else:
      _ = feature_extractor.extract_features(preprocessed_image)
    self.assertTrue(any('FusedBatchNorm' in op.type
                        for op in tf.get_default_graph().get_operations()))

  def test_extract_features_with_fewer_layers(self, use_keras):
    image_height = 128
    image_width = 128
    depth_multiplier = 1.0
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 8, 8, 576), (2, 4, 4, 1280),
                                  (2, 2, 2, 512), (2, 1, 1, 256)]
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, use_explicit_padding=False, num_layers=4,
        use_keras=use_keras)

if __name__ == '__main__':
  tf.test.main()
@@ -30,15 +30,33 @@ slim = tf.contrib.slim

@parameterized.parameters(
    {
        'use_depthwise': False,
        'use_keras': True
    },
    {
        'use_depthwise': True,
        'use_keras': True
    },
    {
        'use_depthwise': False,
        'use_keras': False
    },
    {
        'use_depthwise': True,
        'use_keras': False
    },
)
class SsdMobilenetV2FpnFeatureExtractorTest(
    ssd_feature_extractor_test.SsdFeatureExtractorTestBase):

  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                is_training=True,
                                use_explicit_padding=False,
                                use_keras=False,
                                use_depthwise=False):
    """Constructs a new feature extractor.

    Args:
@@ -51,13 +69,14 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
        were used.
      use_keras: if True builds a keras-based feature extractor, if False builds
        a slim-based one.
      use_depthwise: Whether to use depthwise convolutions.

    Returns:
      an ssd_meta_arch.SSDFeatureExtractor object.
    """
    min_depth = 32
    if use_keras:
      return (ssd_mobilenet_v2_fpn_keras_feature_extractor
              .SSDMobileNetV2FpnKerasFeatureExtractor(
                  is_training=is_training,
                  depth_multiplier=depth_multiplier,
                  min_depth=min_depth,
@@ -67,18 +86,21 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                  freeze_batchnorm=False,
                  inplace_batchnorm_update=False,
                  use_explicit_padding=use_explicit_padding,
                  use_depthwise=use_depthwise,
                  name='MobilenetV2_FPN'))
    else:
      return (ssd_mobilenet_v2_fpn_feature_extractor
              .SSDMobileNetV2FpnFeatureExtractor(
                  is_training,
                  depth_multiplier,
                  min_depth,
                  pad_to_multiple,
                  self.conv_hyperparams_fn,
                  use_depthwise=use_depthwise,
                  use_explicit_padding=use_explicit_padding))
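The expanded `@parameterized.parameters` list at the top of this test class enumerates the full `use_depthwise` × `use_keras` cross product by hand, so every test method now runs four times. A sketch of how the same four kwarg dicts could be generated (the decorator itself is from `absl.testing.parameterized`; this is plain Python, for illustration only):

```python
from itertools import product

# Each dict corresponds to one test-case invocation under
# @parameterized.parameters; the decorator passes the keys as kwargs.
cases = [
    {'use_depthwise': d, 'use_keras': k}
    for d, k in product((False, True), repeat=2)
]
```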

  def test_extract_features_returns_correct_shapes_256(self, use_keras,
                                                       use_depthwise):
    image_height = 256
    image_width = 256
    depth_multiplier = 1.0
@@ -87,15 +109,28 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                                  (2, 8, 8, 256), (2, 4, 4, 256),
                                  (2, 2, 2, 256)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_extract_features_returns_correct_shapes_384(self, use_keras,
                                                       use_depthwise):
    image_height = 320
    image_width = 320
    depth_multiplier = 1.0
@@ -104,15 +139,28 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                                  (2, 10, 10, 256), (2, 5, 5, 256),
                                  (2, 3, 3, 256)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_extract_features_with_dynamic_image_shape(self, use_keras,
                                                     use_depthwise):
    image_height = 256
    image_width = 256
    depth_multiplier = 1.0
@@ -121,16 +169,28 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                                  (2, 8, 8, 256), (2, 4, 4, 256),
                                  (2, 2, 2, 256)]
    self.check_extract_features_returns_correct_shapes_with_dynamic_inputs(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    self.check_extract_features_returns_correct_shapes_with_dynamic_inputs(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_extract_features_returns_correct_shapes_with_pad_to_multiple(
      self, use_keras, use_depthwise):
    image_height = 299
    image_width = 299
    depth_multiplier = 1.0
@@ -139,16 +199,28 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                                  (2, 10, 10, 256), (2, 5, 5, 256),
                                  (2, 3, 3, 256)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_extract_features_returns_correct_shapes_enforcing_min_depth(
      self, use_keras, use_depthwise):
    image_height = 256
    image_width = 256
    depth_multiplier = 0.5**12
@@ -157,70 +229,102 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                                  (2, 8, 8, 32), (2, 4, 4, 32),
                                  (2, 2, 2, 32)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_extract_features_raises_error_with_invalid_image_size(
      self, use_keras, use_depthwise):
    image_height = 32
    image_width = 32
    depth_multiplier = 1.0
    pad_to_multiple = 1
    self.check_extract_features_raises_error_with_invalid_image_size(
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_preprocess_returns_correct_value_range(self, use_keras,
                                                  use_depthwise):
    image_height = 256
    image_width = 256
    depth_multiplier = 1
    pad_to_multiple = 1
    test_image = np.random.rand(2, image_height, image_width, 3)
    feature_extractor = self._create_feature_extractor(
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    preprocessed_image = feature_extractor.preprocess(test_image)
    self.assertTrue(np.all(np.less_equal(np.abs(preprocessed_image), 1.0)))

  def test_variables_only_created_in_scope(self, use_keras, use_depthwise):
    depth_multiplier = 1
    pad_to_multiple = 1
    scope_name = 'MobilenetV2'
    self.check_feature_extractor_variables_under_scope(
        depth_multiplier,
        pad_to_multiple,
        scope_name,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_fused_batchnorm(self, use_keras, use_depthwise):
    image_height = 256
    image_width = 256
    depth_multiplier = 1
    pad_to_multiple = 1
    image_placeholder = tf.placeholder(tf.float32,
                                       [1, image_height, image_width, 3])
    feature_extractor = self._create_feature_extractor(
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    preprocessed_image = feature_extractor.preprocess(image_placeholder)
    if use_keras:
      _ = feature_extractor(preprocessed_image)
    else:
      _ = feature_extractor.extract_features(preprocessed_image)
    self.assertTrue(
        any('FusedBatchNorm' in op.type
            for op in tf.get_default_graph().get_operations()))

  def test_variable_count(self, use_keras, use_depthwise):
    depth_multiplier = 1
    pad_to_multiple = 1
    variables = self.get_feature_extractor_variables(
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    expected_variables_len = 274
    if use_depthwise:
      expected_variables_len = 278
    self.assertEqual(len(variables), expected_variables_len)

  def test_get_expected_feature_map_variable_names(self, use_keras,
                                                   use_depthwise):
    depth_multiplier = 1.0
    pad_to_multiple = 1
@@ -239,6 +343,25 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
        'MobilenetV2/fpn/projection_2/weights',
        'MobilenetV2/fpn/projection_3/weights',
    ])
    slim_expected_feature_maps_variables_with_depthwise = set([
        # Slim Mobilenet V2 feature maps
        'MobilenetV2/expanded_conv_4/depthwise/depthwise_weights',
        'MobilenetV2/expanded_conv_7/depthwise/depthwise_weights',
        'MobilenetV2/expanded_conv_14/depthwise/depthwise_weights',
        'MobilenetV2/Conv_1/weights',
        # FPN layers
        'MobilenetV2/fpn/bottom_up_Conv2d_20/pointwise_weights',
        'MobilenetV2/fpn/bottom_up_Conv2d_20/depthwise_weights',
        'MobilenetV2/fpn/bottom_up_Conv2d_21/pointwise_weights',
        'MobilenetV2/fpn/bottom_up_Conv2d_21/depthwise_weights',
        'MobilenetV2/fpn/smoothing_1/depthwise_weights',
        'MobilenetV2/fpn/smoothing_1/pointwise_weights',
        'MobilenetV2/fpn/smoothing_2/depthwise_weights',
        'MobilenetV2/fpn/smoothing_2/pointwise_weights',
        'MobilenetV2/fpn/projection_1/weights',
        'MobilenetV2/fpn/projection_2/weights',
        'MobilenetV2/fpn/projection_3/weights',
    ])
    keras_expected_feature_maps_variables = set([
        # Keras Mobilenet V2 feature maps
        'MobilenetV2_FPN/block_4_depthwise/depthwise_kernel',
@@ -254,17 +377,50 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
        'MobilenetV2_FPN/FeatureMaps/top_down/projection_2/kernel',
        'MobilenetV2_FPN/FeatureMaps/top_down/projection_3/kernel'
    ])
    keras_expected_feature_maps_variables_with_depthwise = set([
        # Keras Mobilenet V2 feature maps
        'MobilenetV2_FPN/block_4_depthwise/depthwise_kernel',
        'MobilenetV2_FPN/block_7_depthwise/depthwise_kernel',
        'MobilenetV2_FPN/block_14_depthwise/depthwise_kernel',
        'MobilenetV2_FPN/Conv_1/kernel',
        # FPN layers
        'MobilenetV2_FPN/bottom_up_Conv2d_20_depthwise_conv/depthwise_kernel',
        'MobilenetV2_FPN/bottom_up_Conv2d_20_depthwise_conv/pointwise_kernel',
        'MobilenetV2_FPN/bottom_up_Conv2d_21_depthwise_conv/depthwise_kernel',
        'MobilenetV2_FPN/bottom_up_Conv2d_21_depthwise_conv/pointwise_kernel',
        ('MobilenetV2_FPN/FeatureMaps/top_down/smoothing_1_depthwise_conv/'
         'depthwise_kernel'),
        ('MobilenetV2_FPN/FeatureMaps/top_down/smoothing_1_depthwise_conv/'
         'pointwise_kernel'),
        ('MobilenetV2_FPN/FeatureMaps/top_down/smoothing_2_depthwise_conv/'
         'depthwise_kernel'),
        ('MobilenetV2_FPN/FeatureMaps/top_down/smoothing_2_depthwise_conv/'
         'pointwise_kernel'),
        'MobilenetV2_FPN/FeatureMaps/top_down/projection_1/kernel',
        'MobilenetV2_FPN/FeatureMaps/top_down/projection_2/kernel',
        'MobilenetV2_FPN/FeatureMaps/top_down/projection_3/kernel'
    ])
    g = tf.Graph()
    with g.as_default():
      preprocessed_inputs = tf.placeholder(tf.float32, (4, None, None, 3))
      feature_extractor = self._create_feature_extractor(
          depth_multiplier,
          pad_to_multiple,
          use_keras=use_keras,
          use_depthwise=use_depthwise)
      if use_keras:
        _ = feature_extractor(preprocessed_inputs)
        expected_feature_maps_variables = keras_expected_feature_maps_variables
        if use_depthwise:
          expected_feature_maps_variables = (
              keras_expected_feature_maps_variables_with_depthwise)
      else:
        _ = feature_extractor.extract_features(preprocessed_inputs)
        expected_feature_maps_variables = slim_expected_feature_maps_variables
        if use_depthwise:
          expected_feature_maps_variables = (
              slim_expected_feature_maps_variables_with_depthwise)
      actual_variable_set = set([
          var.op.name for var in g.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
      ])
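The diff above threads a new `use_depthwise` flag through every test, and `test_variable_count` expects 274 variables normally versus 278 when depthwise FPN convolutions are enabled. A minimal, self-contained sketch of how those flag combinations could be exercised, using stdlib `unittest` with `subTest` in place of the parameterized decorators the real suite presumably uses; `expected_variable_count` is a hypothetical helper written only for this illustration, not part of the Object Detection API:

```python
import itertools
import unittest


def expected_variable_count(use_depthwise):
  """Expected variable total per the test above: depthwise mode adds 4."""
  return 278 if use_depthwise else 274


class ExpectedCountSketchTest(unittest.TestCase):

  def test_all_flag_combinations(self):
    # Enumerate the four (use_keras, use_depthwise) configs the suite covers.
    for use_keras, use_depthwise in itertools.product([False, True], repeat=2):
      with self.subTest(use_keras=use_keras, use_depthwise=use_depthwise):
        # Per the diff, the count depends only on use_depthwise, not on
        # whether the Keras or slim code path builds the extractor.
        self.assertEqual(expected_variable_count(use_depthwise),
                         278 if use_depthwise else 274)


if __name__ == '__main__':
  unittest.main()
```

`subTest` keeps all four configurations in one test method while still reporting each failing combination separately, which is the same coverage shape the parameterized tests in the diff provide.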