Commit 0ba83cf0 authored by pkulzc, committed by Sergio Guadarrama

Release MobileNet V3 models and SSDLite models with MobileNet V3 backbone. (#7678)

* Merged commit includes the following changes:
275131829  by Sergio Guadarrama:

    Updates mobilenet/README.md to be GitHub compatible, adds a V2+ reference to the mobilenet_v1.md file, and fixes invalid markdown.

--
274908068  by Sergio Guadarrama:

    Opensource MobilenetV3 detection models.

--
274697808  by Sergio Guadarrama:

    Fixed cases where tf.TensorShape was constructed with float dimensions

    This is a prerequisite for making TensorShape and Dimension more strict
    about the types of their arguments.

--
273577462  by Sergio Guadarrama:

    Fixing `conv_defs['defaults']` override issue.

--
272801298  by Sergio Guadarrama:

    Adds links to trained models for Mobilenet V3, and adds a minimalistic version of Mobilenet V3 to the definitions.

--
268928503  by Sergio Guadarrama:

    Mobilenet v2 with group normalization.
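
    For reference, group normalization normalizes activations within channel
    groups instead of across the batch. A minimal sketch, assuming NHWC inputs
    and omitting the learned scale/offset (names are illustrative):

    import tensorflow as tf

    def group_norm(x, groups=32, eps=1e-5):
      # x: [batch, height, width, channels]; channels must divide by groups.
      _, h, w, c = x.shape.as_list()
      x = tf.reshape(x, [-1, h, w, groups, c // groups])
      mean, var = tf.nn.moments(x, axes=[1, 2, 4], keepdims=True)
      x = (x - mean) * tf.math.rsqrt(var + eps)
      return tf.reshape(x, [-1, h, w, c])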

--
263492735  by Sergio Guadarrama:

    Internal change

--

260037126  by Sergio Guadarrama:

    Adds an option of using a custom depthwise operation in `expanded_conv`.

--
259997001  by Sergio Guadarrama:

    Explicitly mark Python binaries/tests with python_version = "PY2".

--
252697685  by Sergio Guadarrama:

    Internal change

--

251918746  by Sergio Guadarrama:

    Internal change

--

251909704  by Sergio Guadarrama:

    Mobilenet V3 backbone implementation.

--
247510236  by Sergio Guadarrama:

    Internal change

--

246196802  by Sergio Guadarrama:

    Internal change

--

246014539  by Sergio Guadarrama:

    Internal change

--

245891435  by Sergio Guadarrama:

    Internal change

--

245834925  by Sergio Guadarrama:

    n/a

--

PiperOrigin-RevId: 275131829

* Merged commit includes the following changes:
274959989  by Zhichao Lu:

    Update detection model zoo with MobilenetV3 SSD candidates.

--
274908068  by Zhichao Lu:

    Opensource MobilenetV3 detection models.

--
274695889  by richardmunoz:

    RandomPatchGaussian preprocessing step

    This step can be used during model training to apply Gaussian noise to a random image patch. Example addition to an Object Detection API pipeline config:

    train_config {
      ...
      data_augmentation_options {
        random_patch_gaussian {
          random_coef: 0.5
          min_patch_size: 1
          max_patch_size: 250
          min_gaussian_stddev: 0.0
          max_gaussian_stddev: 1.0
        }
      }
      ...
    }
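
    A rough sketch of the core operation, assuming float images and a
    hypothetical helper (the real step also samples the patch location, size,
    and stddev, and applies the random_coef gating):

    import tensorflow as tf

    def add_patch_gaussian(image, y, x, patch_size, stddev):
      # Adds Gaussian noise only inside the patch starting at (y, x).
      ys = tf.range(tf.shape(image)[0])[:, None]
      xs = tf.range(tf.shape(image)[1])[None, :]
      in_patch = ((ys >= y) & (ys < y + patch_size) &
                  (xs >= x) & (xs < x + patch_size))
      mask = tf.cast(in_patch, image.dtype)[..., None]
      noise = tf.random.normal(tf.shape(image), stddev=stddev)
      return image + noise * mask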

--
274257872  by lzc:

    Internal change.

--
274114689  by Zhichao Lu:

    Pass native_resize flag to other FPN variants.

--
274112308  by lzc:

    Internal change.

--
274090763  by richardmunoz:

    Util function for getting a patch mask on an image for use with the Object Detection API

--
274069806  by Zhichao Lu:

    Adding functions which will help compute predictions and losses for CenterNet.

--
273860828  by lzc:

    Internal change.

--
273380069  by richardmunoz:

    RandomImageDownscaleToTargetPixels preprocessing step

    This step can be used during model training to downscale an image to a randomly chosen target number of pixels. If the image does not contain more than the target number of pixels, downscaling is skipped. Example addition to an Object Detection API pipeline config:

    train_config {
      ...
      data_augmentation_options {
        random_downscale_to_target_pixels {
          random_coef: 0.5
          min_target_pixels: 300000
          max_target_pixels: 500000
        }
      }
      ...
    }
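
    The scaling factor follows from the pixel ratio; a sketch using a
    hypothetical helper (the real step also samples the target uniformly from
    the configured range and rescales annotations consistently):

    import tensorflow as tf

    def downscale_to_target_pixels(image, target_pixels):
      # target_pixels: float scalar; images at or below it are left unchanged.
      shape = tf.shape(image)
      num_pixels = tf.cast(shape[0] * shape[1], tf.float32)
      factor = tf.minimum(1.0, tf.sqrt(target_pixels / num_pixels))
      new_height = tf.cast(tf.cast(shape[0], tf.float32) * factor, tf.int32)
      new_width = tf.cast(tf.cast(shape[1], tf.float32) * factor, tf.int32)
      return tf.image.resize_images(image, [new_height, new_width])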

--
272987602  by Zhichao Lu:

    Avoid -inf when empty box list is passed.

--
272525836  by Zhichao Lu:

    Cleanup repeated resizing code in meta archs.

--
272458667  by richardmunoz:

    RandomJpegQuality preprocessing step

    This step can be used during model training to re-encode the image as a JPEG at a random quality level. Example addition to an Object Detection API pipeline config:

    train_config {
      ...
      data_augmentation_options {
        random_jpeg_quality {
          random_coef: 0.5
          min_jpeg_quality: 80
          max_jpeg_quality: 100
        }
      }
      ...
    }
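
    The core transform is a JPEG encode/decode round trip; a minimal sketch
    with a fixed Python-int quality (the real step samples the quality from
    the configured range and applies the random_coef gating):

    import tensorflow as tf

    def jpeg_degrade(image, quality=90):
      # image: uint8 tensor of shape [height, width, channels].
      encoded = tf.io.encode_jpeg(image, quality=quality)
      return tf.io.decode_jpeg(encoded)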

--
271412717  by Zhichao Lu:

    Enables TPU training with the V2 eager + tf.function Object Detection training loops.

--
270744153  by Zhichao Lu:

    Adding the offset and size target assigners for CenterNet.

--
269916081  by Zhichao Lu:

    Include basic installation in Object Detection API tutorial.
    Also:
     - Use TF2.0
     - Use saved_model

--
269376056  by Zhichao Lu:

    Fix variable loading in RetinaNet with custom loops (makes the code rely a little less on the exact name scopes that are generated).

--
269256251  by lzc:

    Add use_partitioned_nms field to config and update post_processing_builder to honor that flag when building the NMS function.

--
268865295  by Zhichao Lu:

    Adding functionality for importing and merging back internal state of the metric.

--
268640984  by Zhichao Lu:

    Fix computation of the Gaussian sigma value used to create the CenterNet heatmap target.

--
267475576  by Zhichao Lu:

    Fix for exporter trying to export non-existent exponential moving averages.

--
267286768  by Zhichao Lu:

    Update mixed-precision policy.

--
266166879  by Zhichao Lu:

    Internal change

--

265860884  by Zhichao Lu:

    Apply floor function to center coordinates when creating heatmap for CenterNet target.
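
    In the standard CenterNet formulation, the heatmap value at pixel
    (px, py) for a floored center (cx, cy) is
    exp(-((px - cx)^2 + (py - cy)^2) / (2 * sigma^2)); a NumPy sketch:

    import numpy as np

    def gaussian_heatmap(height, width, center_y, center_x, sigma):
      # Center coordinates are floored to integer pixels, per this change.
      cy, cx = np.floor(center_y), np.floor(center_x)
      ys = np.arange(height, dtype=np.float32)[:, None]
      xs = np.arange(width, dtype=np.float32)[None, :]
      return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))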

--
265702749  by Zhichao Lu:

    Internal change

--
264241949  by ronnyvotel:

    Updating Faster R-CNN 'final_anchors' to be in normalized coordinates.

--
264175192  by lzc:

    Update model_fn to only read hparams if it is not None.

--
264159328  by Zhichao Lu:

    Modify nearest neighbor upsampling to eliminate a multiply operation. For quantized models, the multiply operation gets unnecessarily quantized and reduces accuracy (simple stacking would work in place of the broadcast op which doesn't require quantization). Also removes an unnecessary reshape op.
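
    A sketch of the multiply-free variant (illustrative; not the exact
    ops.nearest_neighbor_upsampling code):

    import tensorflow as tf

    def nearest_neighbor_upsample(x, scale=2):
      # Duplicate each pixel by stacking copies along new height/width axes
      # and folding them back in; with no broadcast multiply, nothing extra
      # gets quantized in a quantization-aware model.
      shape = tf.shape(x)  # [batch, height, width, channels]
      x = tf.stack([x] * scale, axis=2)   # [b, h, scale, w, c]
      x = tf.stack([x] * scale, axis=4)   # [b, h, scale, w, scale, c]
      return tf.reshape(
          x, [shape[0], shape[1] * scale, shape[2] * scale, shape[3]])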

--
263668306  by Zhichao Lu:

    Add the option to use dynamic map_fn for batch NMS

--
263031163  by Zhichao Lu:

    Mark outside compilation for NMS as optional.

--
263024916  by Zhichao Lu:

    Add an ExperimentalModel meta arch for experimenting with new model types.

--
262655894  by Zhichao Lu:

    Add the center heatmap target assigner for CenterNet

--
262431036  by Zhichao Lu:

    Adding add_eval_dict to allow for evaluation on model_v2

--
262035351  by ronnyvotel:

    Removing any non-Tensor predictions from the third stage of Mask R-CNN.

--
261953416  by Zhichao Lu:

    Internal change.

--
261834966  by Zhichao Lu:

    Fix the NMS OOM issue on TPU by forcing NMS to run outside of TPU.

--
261775941  by Zhichao Lu:

    Make Keras InputLayer compatible with both TF 1.x and TF 2.0.

--
261775633  by Zhichao Lu:

    Visualize additional channels with ground-truth bounding boxes.

--
261768117  by lzc:

    Internal change.

--
261766773  by ronnyvotel:

    Exposing `return_raw_detections_during_predict` in Faster R-CNN Proto.

--
260975089  by ronnyvotel:

    Moving calculation of batched prediction tensor names after all tensors in prediction dictionary are created.

--
259816913  by ronnyvotel:

    Adding raw detection boxes and feature map indices to SSD

--
259791955  by Zhichao Lu:

    Added a flag to control the use of partitioned_non_max_suppression.

--
259580475  by Zhichao Lu:

    Tweak quantization-aware training re-writer to support NasFpn model architecture.

--
259579943  by rathodv:

    Add a meta target assigner proto and builders in OD API.

--
259577741  by Zhichao Lu:

    Internal change.

--
259366315  by lzc:

    Internal change.

--
259344310  by ronnyvotel:

    Updating faster rcnn so that raw_detection_boxes from predict() are in normalized coordinates.

--
259338670  by Zhichao Lu:

    Add support for use_native_resize_op to more feature extractors. Use dynamic shapes when static shapes are not available.
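
    The dynamic-shape fallback follows the usual pattern (this mirrors
    shape_utils.combined_static_and_dynamic_shape; sketch only):

    import tensorflow as tf

    def combined_static_and_dynamic_shape(tensor):
      # Prefer statically known dimensions; fall back to the runtime
      # tf.shape() value wherever a dimension is None.
      static = tensor.shape.as_list()
      dynamic = tf.shape(tensor)
      return [dim if dim is not None else dynamic[i]
              for i, dim in enumerate(static)]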

--
259083543  by ronnyvotel:

    Updating/fixing documentation.

--
259078937  by rathodv:

    Add prediction fields for tensors returned from detection_model.predict.

--
259044601  by Zhichao Lu:

    Add protocol buffer and builders for temperature scaling calibration.
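
    Temperature scaling divides logits by a single learned parameter before
    the probability mapping; a sketch assuming sigmoid scores, with `scaler`
    in the role of the proto's TemperatureScalingCalibration field:

    import tensorflow as tf

    def temperature_scale(logits, scaler):
      return tf.sigmoid(logits / scaler)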

--
259036770  by lzc:

    Internal changes.

--
259006223  by ronnyvotel:

    Adding detection anchor indices to Faster R-CNN Config. This is useful when one wishes to associate final detections and the anchors (or pre-nms boxes) from which they originated.

--
258872501  by Zhichao Lu:

    Run the training pipeline of ssd + resnet_v1_50 + fpn with a checkpoint.

--
258840686  by ronnyvotel:

    Adding standard outputs to DetectionModel.predict(). This CL only updates Faster R-CNN. Other meta architectures will be updated in future CLs.

--
258672969  by lzc:

    Internal change.

--
258649494  by lzc:

    Internal changes.

--
258630321  by ronnyvotel:

    Fixing documentation in shape_utils.flatten_dimensions().

--
258468145  by Zhichao Lu:

    Add additional output tensors parameter to Postprocess op.

--
258099219  by Zhichao Lu:

    Internal changes

--

PiperOrigin-RevId: 274959989
parent 9aed0ffb
......@@ -47,9 +47,7 @@ MODEL_BUILD_UTIL_MAP = model_lib.MODEL_BUILD_UTIL_MAP
def _compute_losses_and_predictions_dicts(
model, features, labels,
add_regularization_loss=True,
use_tpu=False,
use_bfloat16=False):
add_regularization_loss=True):
"""Computes the losses dict and predictions dict for a model on inputs.
Args:
......@@ -88,8 +86,6 @@ def _compute_losses_and_predictions_dicts(
float32 tensor containing keypoints for each box.
add_regularization_loss: Whether or not to include the model's
regularization loss in the losses dictionary.
use_tpu: Whether computation should happen on a TPU.
use_bfloat16: Whether computation on a TPU should use bfloat16.
Returns:
A tuple containing the losses dictionary (with the total loss under
......@@ -100,18 +96,10 @@ def _compute_losses_and_predictions_dicts(
model_lib.provide_groundtruth(model, labels)
preprocessed_images = features[fields.InputDataFields.image]
# TODO(kaftan): Check how we're supposed to do this mixed precision stuff
## in TF2 TPUStrategy + Keras
if use_tpu and use_bfloat16:
with tf.contrib.tpu.bfloat16_scope():
prediction_dict = model.predict(
preprocessed_images,
features[fields.InputDataFields.true_image_shape])
prediction_dict = ops.bfloat16_to_float32_nested(prediction_dict)
else:
prediction_dict = model.predict(
preprocessed_images,
features[fields.InputDataFields.true_image_shape])
prediction_dict = model.predict(
preprocessed_images,
features[fields.InputDataFields.true_image_shape])
prediction_dict = ops.bfloat16_to_float32_nested(prediction_dict)
losses_dict = model.loss(
prediction_dict, features[fields.InputDataFields.true_image_shape])
......@@ -122,6 +110,8 @@ def _compute_losses_and_predictions_dicts(
## as well.
regularization_losses = model.regularization_losses()
if regularization_losses:
regularization_losses = ops.bfloat16_to_float32_nested(
regularization_losses)
regularization_loss = tf.add_n(
regularization_losses, name='regularization_loss')
losses.append(regularization_loss)
......@@ -146,7 +136,6 @@ def eager_train_step(detection_model,
add_regularization_loss=True,
clip_gradients_value=None,
use_tpu=False,
use_bfloat16=False,
global_step=None,
num_replicas=1.0):
"""Process a single training batch.
......@@ -204,7 +193,6 @@ def eager_train_step(detection_model,
clip_gradients_value: If this is present, clip the gradients global norm
at this value using `tf.clip_by_global_norm`.
use_tpu: Whether computation should happen on a TPU.
use_bfloat16: Whether computation on a TPU should use bfloat16.
global_step: The current training step. Used for TensorBoard logging
purposes. This step is not updated by this function and must be
incremented separately.
......@@ -226,8 +214,7 @@ def eager_train_step(detection_model,
with tf.GradientTape() as tape:
losses_dict, _ = _compute_losses_and_predictions_dicts(
detection_model, features, labels, add_regularization_loss, use_tpu,
use_bfloat16)
detection_model, features, labels, add_regularization_loss)
total_loss = losses_dict['Loss/total_loss']
......@@ -236,9 +223,10 @@ def eager_train_step(detection_model,
tf.constant(num_replicas, dtype=tf.float32))
losses_dict['Loss/normalized_total_loss'] = total_loss
for loss_type in losses_dict:
tf.compat.v2.summary.scalar(
loss_type, losses_dict[loss_type], step=global_step)
if not use_tpu:
for loss_type in losses_dict:
tf.compat.v2.summary.scalar(
loss_type, losses_dict[loss_type], step=global_step)
trainable_variables = detection_model.trainable_variables
......@@ -258,7 +246,7 @@ def eager_train_step(detection_model,
def load_fine_tune_checkpoint(
model, checkpoint_path, checkpoint_type,
load_all_detection_checkpoint_vars, input_dataset,
unpad_groundtruth_tensors, use_tpu, use_bfloat16):
unpad_groundtruth_tensors):
"""Load a fine tuning classification or detection checkpoint.
To make sure the model variables are all built, this method first executes
......@@ -284,8 +272,6 @@ def load_fine_tune_checkpoint(
input_dataset: The tf.data Dataset the model is being trained on. Needed
to get the shapes for the dummy loss computation.
unpad_groundtruth_tensors: A parameter passed to unstack_batch.
use_tpu: Whether computation should happen on a TPU.
use_bfloat16: Whether computation on a TPU should use bfloat16.
"""
features, labels = iter(input_dataset).next()
......@@ -299,9 +285,7 @@ def load_fine_tune_checkpoint(
return _compute_losses_and_predictions_dicts(
model,
features,
labels,
use_tpu=use_tpu,
use_bfloat16=use_bfloat16)
labels)
strategy = tf.compat.v2.distribute.get_strategy()
strategy.experimental_run_v2(
......@@ -313,11 +297,10 @@ def load_fine_tune_checkpoint(
fine_tune_checkpoint_type=checkpoint_type,
load_all_detection_checkpoint_vars=(
load_all_detection_checkpoint_vars))
available_var_map = (
variables_helper.get_variables_available_in_checkpoint(
var_map,
checkpoint_path,
include_global_step=False))
available_var_map = variables_helper.get_variables_available_in_checkpoint(
var_map,
checkpoint_path,
include_global_step=False)
tf.train.init_from_checkpoint(checkpoint_path,
available_var_map)
......@@ -386,7 +369,6 @@ def train_loop(
train_input_config = configs['train_input_config']
unpad_groundtruth_tensors = train_config.unpad_groundtruth_tensors
use_bfloat16 = train_config.use_bfloat16
add_regularization_loss = train_config.add_regularization_loss
clip_gradients_value = None
if train_config.gradient_clipping_by_norm > 0:
......@@ -403,6 +385,9 @@ def train_loop(
'train_loop: use_tpu %s, export_to_tpu %s', use_tpu,
export_to_tpu)
if kwargs['use_bfloat16']:
tf.compat.v2.keras.mixed_precision.experimental.set_policy('mixed_bfloat16')
# Parse the checkpoint fine tuning configs
if hparams.load_pretrained:
fine_tune_checkpoint_path = train_config.fine_tune_checkpoint
......@@ -427,10 +412,8 @@ def train_loop(
pipeline_config_final = create_pipeline_proto_from_configs(configs)
config_util.save_pipeline_config(pipeline_config_final, model_dir)
# TODO(kaftan): Either make strategy a parameter of this method, or
## grab it w/ Distribution strategy's get_scope
# Build the model, optimizer, and training input
strategy = tf.compat.v2.distribute.MirroredStrategy()
strategy = tf.compat.v2.distribute.get_strategy()
with strategy.scope():
detection_model = model_builder.build(
model_config=model_config, is_training=True)
......@@ -446,7 +429,7 @@ def train_loop(
train_input.repeat())
global_step = tf.compat.v2.Variable(
0, trainable=False, dtype=tf.compat.v2.dtypes.int64)
0, trainable=False, dtype=tf.compat.v2.dtypes.int64, name='global_step')
optimizer, (learning_rate,) = optimizer_builder.build(
train_config.optimizer, global_step=global_step)
......@@ -465,8 +448,7 @@ def train_loop(
fine_tune_checkpoint_type,
load_all_detection_checkpoint_vars,
train_input,
unpad_groundtruth_tensors, use_tpu,
use_bfloat16)
unpad_groundtruth_tensors)
ckpt = tf.compat.v2.train.Checkpoint(
step=global_step, model=detection_model)
......@@ -483,7 +465,6 @@ def train_loop(
unpad_groundtruth_tensors,
optimizer,
learning_rate=learning_rate_fn(),
use_bfloat16=use_bfloat16,
add_regularization_loss=add_regularization_loss,
clip_gradients_value=clip_gradients_value,
use_tpu=use_tpu,
......@@ -512,11 +493,12 @@ def train_loop(
loss = _dist_train_step(train_input_iter)
global_step.assign_add(1)
end_time = time.time()
tf.compat.v2.summary.scalar(
'steps_per_sec', 1.0 / (end_time - start_time), step=global_step)
if not use_tpu:
tf.compat.v2.summary.scalar(
'steps_per_sec', 1.0 / (end_time - start_time), step=global_step)
# TODO(kaftan): Remove this print after it is no longer helpful for
## debugging.
tf.print('Finished step', global_step, end_time, loss)
print('Finished step', global_step, end_time, loss)
if int(global_step.value().numpy()) % checkpoint_every_n == 0:
manager.save()
......@@ -552,7 +534,6 @@ def eager_eval_loop(
train_config = configs['train_config']
eval_input_config = configs['eval_input_config']
eval_config = configs['eval_config']
use_bfloat16 = train_config.use_bfloat16
add_regularization_loss = train_config.add_regularization_loss
is_training = False
......@@ -594,8 +575,7 @@ def eager_eval_loop(
labels, unpad_groundtruth_tensors=unpad_groundtruth_tensors)
losses_dict, prediction_dict = _compute_losses_and_predictions_dicts(
detection_model, features, labels, add_regularization_loss, use_tpu,
use_bfloat16)
detection_model, features, labels, add_regularization_loss)
def postprocess_wrapper(args):
return detection_model.postprocess(args[0], args[1])
......@@ -762,6 +742,9 @@ def eval_continuously(
eval_on_train_input_config.num_epochs))
eval_on_train_input_config.num_epochs = 1
if kwargs['use_bfloat16']:
tf.compat.v2.keras.mixed_precision.experimental.set_policy('mixed_bfloat16')
detection_model = model_builder.build(
model_config=model_config, is_training=True)
......
......@@ -27,6 +27,7 @@ import collections
import functools
import tensorflow as tf
from object_detection.utils import ops
from object_detection.utils import shape_utils
slim = tf.contrib.slim
# Activation bound used for TPU v1. Activations will be clipped to
......@@ -568,7 +569,7 @@ class KerasFpnTopDownFeatureMaps(tf.keras.Model):
# TODO (b/128922690): clean-up of ops.nearest_neighbor_upsampling
if use_native_resize_op:
def resize_nearest_neighbor(image):
image_shape = image.shape.as_list()
image_shape = shape_utils.combined_static_and_dynamic_shape(image)
return tf.image.resize_nearest_neighbor(
image, [image_shape[1] * 2, image_shape[2] * 2])
top_down_net.append(tf.keras.layers.Lambda(
......@@ -704,7 +705,8 @@ def fpn_top_down_feature_maps(image_features,
for level in reversed(range(num_levels - 1)):
if use_native_resize_op:
with tf.name_scope('nearest_neighbor_upsampling'):
top_down_shape = top_down.shape.as_list()
top_down_shape = shape_utils.combined_static_and_dynamic_shape(
top_down)
top_down = tf.image.resize_nearest_neighbor(
top_down, [top_down_shape[1] * 2, top_down_shape[2] * 2])
else:
......
......@@ -242,7 +242,7 @@ class _LayersOverride(object):
placeholder_with_default = tf.placeholder_with_default(
input=input_tensor, shape=[None] + shape)
return tf.keras.layers.Input(tensor=placeholder_with_default)
return model_utils.input_layer(shape, placeholder_with_default)
# pylint: disable=unused-argument
def ReLU(self, *args, **kwargs):
......
......@@ -230,10 +230,7 @@ class _LayersOverride(object):
placeholder_with_default = tf.placeholder_with_default(
input=input_tensor, shape=[None] + shape)
if tf.executing_eagerly():
return tf.keras.layers.Input(shape=shape)
else:
return tf.keras.layers.Input(tensor=placeholder_with_default)
return model_utils.input_layer(shape, placeholder_with_default)
# pylint: disable=unused-argument
def ReLU(self, *args, **kwargs):
......
......@@ -20,6 +20,7 @@ from __future__ import division
from __future__ import print_function
import collections
import tensorflow as tf
# This is to specify the custom config of model structures. For example,
# ConvDefs(conv_name='conv_pw_12', filters=512) for Mobilenet V1 is to specify
......@@ -43,3 +44,10 @@ def get_conv_def(conv_defs, layer_name):
if layer_name == conv_def.conv_name:
return conv_def.filters
return None
def input_layer(shape, placeholder_with_default):
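# Placeholders are graph-mode only, so under eager execution the Input layer
# is built from the static shape instead of the placeholder tensor.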
if tf.executing_eagerly():
return tf.keras.layers.Input(shape=shape)
else:
return tf.keras.layers.Input(tensor=placeholder_with_default)
......@@ -22,6 +22,7 @@ from __future__ import print_function
import tensorflow as tf
from object_detection.core import freezable_batch_norm
from object_detection.models.keras_models import model_utils
def _fixed_padding(inputs, kernel_size, rate=1): # pylint: disable=invalid-name
......@@ -216,7 +217,7 @@ class _LayersOverride(object):
placeholder_with_default = tf.placeholder_with_default(
input=input_tensor, shape=[None] + shape)
return tf.keras.layers.Input(tensor=placeholder_with_default)
return model_utils.input_layer(shape, placeholder_with_default)
def MaxPooling2D(self, pool_size, **kwargs):
"""Builds a MaxPooling2D layer with default padding as 'SAME'.
......
......@@ -52,6 +52,7 @@ class SSDMobileNetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
use_native_resize_op=False,
override_base_feature_extractor_hyperparams=False):
"""SSD FPN feature extractor based on Mobilenet v1 architecture.
......@@ -79,6 +80,8 @@ class SSDMobileNetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
use_native_resize_op: Whether to use tf.image.resize_nearest_neighbor
to do upsampling in FPN. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_fn`.
......@@ -100,6 +103,7 @@ class SSDMobileNetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
self._conv_defs = None
if self._use_depthwise:
self._conv_defs = _create_modified_mobilenet_config()
self._use_native_resize_op = use_native_resize_op
def preprocess(self, resized_inputs):
"""SSD preprocessing.
......@@ -162,7 +166,8 @@ class SSDMobileNetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
[(key, image_features[key]) for key in feature_block_list],
depth=depth_fn(self._additional_layer_depth),
use_depthwise=self._use_depthwise,
use_explicit_padding=self._use_explicit_padding)
use_explicit_padding=self._use_explicit_padding,
use_native_resize_op=self._use_native_resize_op)
feature_maps = []
for level in range(self._fpn_min_level, base_fpn_max_level + 1):
feature_maps.append(fpn_features['top_down_{}'.format(
......
......@@ -49,6 +49,7 @@ class SSDMobileNetV1FpnKerasFeatureExtractor(
additional_layer_depth=256,
use_explicit_padding=False,
use_depthwise=False,
use_native_resize_op=False,
override_base_feature_extractor_hyperparams=False,
name=None):
"""SSD Keras based FPN feature extractor Mobilenet v1 architecture.
......@@ -84,6 +85,8 @@ class SSDMobileNetV1FpnKerasFeatureExtractor(
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: whether to use depthwise convolutions. Default is False.
use_native_resize_op: Whether to use tf.image.resize_nearest_neighbor
to do upsampling in FPN. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams`.
......@@ -109,6 +112,7 @@ class SSDMobileNetV1FpnKerasFeatureExtractor(
self._conv_defs = None
if self._use_depthwise:
self._conv_defs = _create_modified_mobilenet_config()
self._use_native_resize_op = use_native_resize_op
self._feature_blocks = [
'Conv2d_3_pointwise', 'Conv2d_5_pointwise', 'Conv2d_11_pointwise',
'Conv2d_13_pointwise'
......@@ -153,6 +157,7 @@ class SSDMobileNetV1FpnKerasFeatureExtractor(
depth=self._depth_fn(self._additional_layer_depth),
use_depthwise=self._use_depthwise,
use_explicit_padding=self._use_explicit_padding,
use_native_resize_op=self._use_native_resize_op,
is_training=self._is_training,
conv_hyperparams=self._conv_hyperparams,
freeze_batchnorm=self._freeze_batchnorm,
......
......@@ -53,6 +53,7 @@ class SSDMobileNetV2FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
use_native_resize_op=False,
override_base_feature_extractor_hyperparams=False):
"""SSD FPN feature extractor based on Mobilenet v2 architecture.
......@@ -79,6 +80,8 @@ class SSDMobileNetV2FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
use_native_resize_op: Whether to use tf.image.resize_nearest_neighbor
to do upsampling in FPN. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_fn`.
......@@ -100,6 +103,7 @@ class SSDMobileNetV2FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
self._conv_defs = None
if self._use_depthwise:
self._conv_defs = _create_modified_mobilenet_config()
self._use_native_resize_op = use_native_resize_op
def preprocess(self, resized_inputs):
"""SSD preprocessing.
......@@ -159,7 +163,8 @@ class SSDMobileNetV2FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
[(key, image_features[key]) for key in feature_block_list],
depth=depth_fn(self._additional_layer_depth),
use_depthwise=self._use_depthwise,
use_explicit_padding=self._use_explicit_padding)
use_explicit_padding=self._use_explicit_padding,
use_native_resize_op=self._use_native_resize_op)
feature_maps = []
for level in range(self._fpn_min_level, base_fpn_max_level + 1):
feature_maps.append(fpn_features['top_down_{}'.format(
......
......@@ -52,6 +52,7 @@ class SSDMobileNetV2FpnKerasFeatureExtractor(
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
use_native_resize_op=False,
override_base_feature_extractor_hyperparams=False,
name=None):
"""SSD Keras based FPN feature extractor Mobilenet v2 architecture.
......@@ -87,6 +88,8 @@ class SSDMobileNetV2FpnKerasFeatureExtractor(
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
use_native_resize_op: Whether to use tf.image.resize_nearest_neighbor
to do upsampling in FPN. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams`.
......@@ -112,6 +115,7 @@ class SSDMobileNetV2FpnKerasFeatureExtractor(
self._conv_defs = None
if self._use_depthwise:
self._conv_defs = _create_modified_mobilenet_config()
self._use_native_resize_op = use_native_resize_op
self._feature_blocks = ['layer_4', 'layer_7', 'layer_14', 'layer_19']
self._mobilenet_v2 = None
self._fpn_features_generator = None
......@@ -151,6 +155,7 @@ class SSDMobileNetV2FpnKerasFeatureExtractor(
depth=self._depth_fn(self._additional_layer_depth),
use_depthwise=self._use_depthwise,
use_explicit_padding=self._use_explicit_padding,
use_native_resize_op=self._use_native_resize_op,
is_training=self._is_training,
conv_hyperparams=self._conv_hyperparams,
freeze_batchnorm=self._freeze_batchnorm,
......
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""SSDFeatureExtractor for MobileNetV3 features."""
import tensorflow as tf
from object_detection.meta_architectures import ssd_meta_arch
from object_detection.models import feature_map_generators
from object_detection.utils import context_manager
from object_detection.utils import ops
from object_detection.utils import shape_utils
from nets.mobilenet import mobilenet
from nets.mobilenet import mobilenet_v3
slim = tf.contrib.slim
class _SSDMobileNetV3FeatureExtractorBase(ssd_meta_arch.SSDFeatureExtractor):
"""Base class of SSD feature extractor using MobilenetV3 features."""
def __init__(self,
conv_defs,
from_layer,
is_training,
depth_multiplier,
min_depth,
pad_to_multiple,
conv_hyperparams_fn,
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
override_base_feature_extractor_hyperparams=False):
"""MobileNetV3 Feature Extractor for SSD Models.
MobileNet v3. Details found in:
https://arxiv.org/abs/1905.02244
Args:
conv_defs: MobileNetV3 conv defs for backbone.
from_layer: A list of two layer names (strings) to connect to the 1st and
2nd inputs of the SSD head.
is_training: whether the network is in training mode.
depth_multiplier: float depth multiplier for feature extractor.
min_depth: minimum feature extractor depth.
pad_to_multiple: the nearest multiple to zero pad the input height and
width dimensions to.
conv_hyperparams_fn: A function to construct tf slim arg_scope for conv2d
and separable_conv2d ops in the layers that are added on top of the base
feature extractor.
reuse_weights: Whether to reuse variables. Default is None.
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_fn`.
"""
super(_SSDMobileNetV3FeatureExtractorBase, self).__init__(
is_training=is_training,
depth_multiplier=depth_multiplier,
min_depth=min_depth,
pad_to_multiple=pad_to_multiple,
conv_hyperparams_fn=conv_hyperparams_fn,
reuse_weights=reuse_weights,
use_explicit_padding=use_explicit_padding,
use_depthwise=use_depthwise,
override_base_feature_extractor_hyperparams=override_base_feature_extractor_hyperparams
)
self._conv_defs = conv_defs
self._from_layer = from_layer
def preprocess(self, resized_inputs):
"""SSD preprocessing.
Maps pixel values to the range [-1, 1].
Args:
resized_inputs: a [batch, height, width, channels] float tensor
representing a batch of images.
Returns:
preprocessed_inputs: a [batch, height, width, channels] float tensor
representing a batch of images.
"""
return (2.0 / 255.0) * resized_inputs - 1.0
def extract_features(self, preprocessed_inputs):
"""Extract features from preprocessed inputs.
Args:
preprocessed_inputs: a [batch, height, width, channels] float tensor
representing a batch of images.
Returns:
feature_maps: a list of tensors where the ith tensor has shape
[batch, height_i, width_i, depth_i]
Raises:
ValueError if conv_defs is not provided or from_layer does not meet the
size requirement.
"""
if not self._conv_defs:
raise ValueError('Must provide backbone conv defs.')
if len(self._from_layer) != 2:
raise ValueError('SSD input feature names are not provided.')
preprocessed_inputs = shape_utils.check_min_image_dim(
33, preprocessed_inputs)
feature_map_layout = {
'from_layer': [
self._from_layer[0], self._from_layer[1], '', '', '', ''
],
'layer_depth': [-1, -1, 512, 256, 256, 128],
'use_depthwise': self._use_depthwise,
'use_explicit_padding': self._use_explicit_padding,
}
with tf.variable_scope('MobilenetV3', reuse=self._reuse_weights) as scope:
with slim.arg_scope(
mobilenet_v3.training_scope(is_training=None, bn_decay=0.9997)), \
slim.arg_scope(
[mobilenet.depth_multiplier], min_depth=self._min_depth):
with (slim.arg_scope(self._conv_hyperparams_fn())
if self._override_base_feature_extractor_hyperparams else
context_manager.IdentityContextManager()):
# TODO(bochen): switch to v3 modules once v3 is properly refactored.
_, image_features = mobilenet_v3.mobilenet_base(
ops.pad_to_multiple(preprocessed_inputs, self._pad_to_multiple),
conv_defs=self._conv_defs,
final_endpoint=self._from_layer[1],
depth_multiplier=self._depth_multiplier,
use_explicit_padding=self._use_explicit_padding,
scope=scope)
with slim.arg_scope(self._conv_hyperparams_fn()):
feature_maps = feature_map_generators.multi_resolution_feature_maps(
feature_map_layout=feature_map_layout,
depth_multiplier=self._depth_multiplier,
min_depth=self._min_depth,
insert_1x1_conv=True,
image_features=image_features)
return feature_maps.values()
class SSDMobileNetV3LargeFeatureExtractor(_SSDMobileNetV3FeatureExtractorBase):
"""Mobilenet V3-Large feature extractor."""
def __init__(self,
is_training,
depth_multiplier,
min_depth,
pad_to_multiple,
conv_hyperparams_fn,
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
override_base_feature_extractor_hyperparams=False):
super(SSDMobileNetV3LargeFeatureExtractor, self).__init__(
conv_defs=mobilenet_v3.V3_LARGE_DETECTION,
from_layer=['layer_14/expansion_output', 'layer_17'],
is_training=is_training,
depth_multiplier=depth_multiplier,
min_depth=min_depth,
pad_to_multiple=pad_to_multiple,
conv_hyperparams_fn=conv_hyperparams_fn,
reuse_weights=reuse_weights,
use_explicit_padding=use_explicit_padding,
use_depthwise=use_depthwise,
override_base_feature_extractor_hyperparams=override_base_feature_extractor_hyperparams
)
class SSDMobileNetV3SmallFeatureExtractor(_SSDMobileNetV3FeatureExtractorBase):
"""Mobilenet V3-Small feature extractor."""
def __init__(self,
is_training,
depth_multiplier,
min_depth,
pad_to_multiple,
conv_hyperparams_fn,
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
override_base_feature_extractor_hyperparams=False):
super(SSDMobileNetV3SmallFeatureExtractor, self).__init__(
conv_defs=mobilenet_v3.V3_SMALL_DETECTION,
from_layer=['layer_10/expansion_output', 'layer_13'],
is_training=is_training,
depth_multiplier=depth_multiplier,
min_depth=min_depth,
pad_to_multiple=pad_to_multiple,
conv_hyperparams_fn=conv_hyperparams_fn,
reuse_weights=reuse_weights,
use_explicit_padding=use_explicit_padding,
use_depthwise=use_depthwise,
override_base_feature_extractor_hyperparams=override_base_feature_extractor_hyperparams
)
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for ssd_mobilenet_v3_feature_extractor."""
import tensorflow as tf
from object_detection.models import ssd_mobilenet_v3_feature_extractor
from object_detection.models import ssd_mobilenet_v3_feature_extractor_testbase
slim = tf.contrib.slim
class SsdMobilenetV3LargeFeatureExtractorTest(
ssd_mobilenet_v3_feature_extractor_testbase
._SsdMobilenetV3FeatureExtractorTestBase):
def _get_input_sizes(self):
"""Return first two input feature map sizes."""
return [672, 480]
def _create_feature_extractor(self,
depth_multiplier,
pad_to_multiple,
use_explicit_padding=False,
use_keras=False):
"""Constructs a new Mobilenet V3-Large feature extractor.
Args:
depth_multiplier: float depth multiplier for feature extractor
pad_to_multiple: the nearest multiple to zero pad the input height and
width dimensions to.
use_explicit_padding: use 'VALID' padding for convolutions, but prepad
inputs so that the output dimensions are the same as if 'SAME' padding
were used.
use_keras: if True builds a keras-based feature extractor, if False builds
a slim-based one.
Returns:
an ssd_meta_arch.SSDFeatureExtractor object.
"""
min_depth = 32
return (
ssd_mobilenet_v3_feature_extractor.SSDMobileNetV3LargeFeatureExtractor(
False,
depth_multiplier,
min_depth,
pad_to_multiple,
self.conv_hyperparams_fn,
use_explicit_padding=use_explicit_padding))
class SsdMobilenetV3SmallFeatureExtractorTest(
ssd_mobilenet_v3_feature_extractor_testbase
._SsdMobilenetV3FeatureExtractorTestBase):
def _get_input_sizes(self):
"""Return first two input feature map sizes."""
return [288, 288]
def _create_feature_extractor(self,
depth_multiplier,
pad_to_multiple,
use_explicit_padding=False,
use_keras=False):
"""Constructs a new Mobilenet V3-Small feature extractor.
Args:
depth_multiplier: float depth multiplier for feature extractor
pad_to_multiple: the nearest multiple to zero pad the input height and
width dimensions to.
use_explicit_padding: use 'VALID' padding for convolutions, but prepad
inputs so that the output dimensions are the same as if 'SAME' padding
were used.
use_keras: if True builds a keras-based feature extractor, if False builds
a slim-based one.
Returns:
an ssd_meta_arch.SSDFeatureExtractor object.
"""
min_depth = 32
return (
ssd_mobilenet_v3_feature_extractor.SSDMobileNetV3SmallFeatureExtractor(
False,
depth_multiplier,
min_depth,
pad_to_multiple,
self.conv_hyperparams_fn,
use_explicit_padding=use_explicit_padding))
if __name__ == '__main__':
tf.test.main()
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Base test class for ssd_mobilenet_v3_feature_extractor."""
import abc
import numpy as np
import tensorflow as tf
from object_detection.models import ssd_feature_extractor_test
slim = tf.contrib.slim
class _SsdMobilenetV3FeatureExtractorTestBase(
ssd_feature_extractor_test.SsdFeatureExtractorTestBase):
"""Base class for MobilenetV3 tests."""
@abc.abstractmethod
def _get_input_sizes(self):
"""Return feature map sizes for the two inputs to SSD head."""
pass
def test_extract_features_returns_correct_shapes_128(self):
image_height = 128
image_width = 128
depth_multiplier = 1.0
pad_to_multiple = 1
input_feature_sizes = self._get_input_sizes()
expected_feature_map_shape = [(2, 8, 8, input_feature_sizes[0]),
(2, 4, 4, input_feature_sizes[1]),
(2, 2, 2, 512), (2, 1, 1, 256), (2, 1, 1, 256),
(2, 1, 1, 128)]
self.check_extract_features_returns_correct_shape(
2,
image_height,
image_width,
depth_multiplier,
pad_to_multiple,
expected_feature_map_shape,
use_keras=False)
def test_extract_features_returns_correct_shapes_299(self):
image_height = 299
image_width = 299
depth_multiplier = 1.0
pad_to_multiple = 1
input_feature_sizes = self._get_input_sizes()
expected_feature_map_shape = [(2, 19, 19, input_feature_sizes[0]),
(2, 10, 10, input_feature_sizes[1]),
(2, 5, 5, 512), (2, 3, 3, 256), (2, 2, 2, 256),
(2, 1, 1, 128)]
self.check_extract_features_returns_correct_shape(
2,
image_height,
image_width,
depth_multiplier,
pad_to_multiple,
expected_feature_map_shape,
use_keras=False)
def test_extract_features_returns_correct_shapes_with_pad_to_multiple(self):
image_height = 299
image_width = 299
depth_multiplier = 1.0
pad_to_multiple = 32
input_feature_sizes = self._get_input_sizes()
expected_feature_map_shape = [(2, 20, 20, input_feature_sizes[0]),
(2, 10, 10, input_feature_sizes[1]),
(2, 5, 5, 512), (2, 3, 3, 256), (2, 2, 2, 256),
(2, 1, 1, 128)]
self.check_extract_features_returns_correct_shape(
2, image_height, image_width, depth_multiplier, pad_to_multiple,
expected_feature_map_shape)
def test_preprocess_returns_correct_value_range(self):
image_height = 128
image_width = 128
depth_multiplier = 1
pad_to_multiple = 1
test_image = np.random.rand(4, image_height, image_width, 3)
feature_extractor = self._create_feature_extractor(
depth_multiplier, pad_to_multiple, use_keras=False)
preprocessed_image = feature_extractor.preprocess(test_image)
self.assertTrue(np.all(np.less_equal(np.abs(preprocessed_image), 1.0)))
def test_has_fused_batchnorm(self):
image_height = 40
image_width = 40
depth_multiplier = 1
pad_to_multiple = 1
image_placeholder = tf.placeholder(tf.float32,
[1, image_height, image_width, 3])
feature_extractor = self._create_feature_extractor(
depth_multiplier, pad_to_multiple, use_keras=False)
preprocessed_image = feature_extractor.preprocess(image_placeholder)
_ = feature_extractor.extract_features(preprocessed_image)
self.assertTrue(any('FusedBatchNorm' in op.type
for op in tf.get_default_graph().get_operations()))
......@@ -47,6 +47,7 @@ class SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
use_native_resize_op=False,
override_base_feature_extractor_hyperparams=False):
"""SSD FPN feature extractor based on Resnet v1 architecture.
......@@ -77,6 +78,8 @@ class SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False. UNUSED currently.
use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
use_native_resize_op: Whether to use tf.image.resize_nearest_neighbor
to do upsampling in FPN. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_fn`.
......@@ -103,6 +106,7 @@ class SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
self._fpn_min_level = fpn_min_level
self._fpn_max_level = fpn_max_level
self._additional_layer_depth = additional_layer_depth
self._use_native_resize_op = use_native_resize_op
def preprocess(self, resized_inputs):
"""SSD preprocessing.
......@@ -178,7 +182,8 @@ class SSDResnetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
feature_block_list.append('block{}'.format(level - 1))
fpn_features = feature_map_generators.fpn_top_down_feature_maps(
[(key, image_features[key]) for key in feature_block_list],
depth=depth_fn(self._additional_layer_depth))
depth=depth_fn(self._additional_layer_depth),
use_native_resize_op=self._use_native_resize_op)
feature_maps = []
for level in range(self._fpn_min_level, base_fpn_max_level + 1):
feature_maps.append(
......@@ -213,6 +218,7 @@ class SSDResnet50V1FpnFeatureExtractor(SSDResnetV1FpnFeatureExtractor):
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
use_native_resize_op=False,
override_base_feature_extractor_hyperparams=False):
"""SSD Resnet50 V1 FPN feature extractor based on Resnet v1 architecture.
......@@ -232,6 +238,8 @@ class SSDResnet50V1FpnFeatureExtractor(SSDResnetV1FpnFeatureExtractor):
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False. UNUSED currently.
use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
use_native_resize_op: Whether to use tf.image.resize_nearest_neighbor
to do upsampling in FPN. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_fn`.
......@@ -251,6 +259,7 @@ class SSDResnet50V1FpnFeatureExtractor(SSDResnetV1FpnFeatureExtractor):
reuse_weights=reuse_weights,
use_explicit_padding=use_explicit_padding,
use_depthwise=use_depthwise,
use_native_resize_op=use_native_resize_op,
override_base_feature_extractor_hyperparams=
override_base_feature_extractor_hyperparams)
......@@ -270,6 +279,7 @@ class SSDResnet101V1FpnFeatureExtractor(SSDResnetV1FpnFeatureExtractor):
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
use_native_resize_op=False,
override_base_feature_extractor_hyperparams=False):
"""SSD Resnet101 V1 FPN feature extractor based on Resnet v1 architecture.
......@@ -289,6 +299,8 @@ class SSDResnet101V1FpnFeatureExtractor(SSDResnetV1FpnFeatureExtractor):
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False. UNUSED currently.
use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
use_native_resize_op: Whether to use tf.image.resize_nearest_neighbor
to do upsampling in FPN. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_fn`.
......@@ -308,6 +320,7 @@ class SSDResnet101V1FpnFeatureExtractor(SSDResnetV1FpnFeatureExtractor):
reuse_weights=reuse_weights,
use_explicit_padding=use_explicit_padding,
use_depthwise=use_depthwise,
use_native_resize_op=use_native_resize_op,
override_base_feature_extractor_hyperparams=
override_base_feature_extractor_hyperparams)
......@@ -327,6 +340,7 @@ class SSDResnet152V1FpnFeatureExtractor(SSDResnetV1FpnFeatureExtractor):
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
use_native_resize_op=False,
override_base_feature_extractor_hyperparams=False):
"""SSD Resnet152 V1 FPN feature extractor based on Resnet v1 architecture.
......@@ -346,6 +360,8 @@ class SSDResnet152V1FpnFeatureExtractor(SSDResnetV1FpnFeatureExtractor):
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False. UNUSED currently.
use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
use_native_resize_op: Whether to use tf.image.resize_nearest_neighbor
to do upsampling in FPN. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_fn`.
......@@ -365,5 +381,6 @@ class SSDResnet152V1FpnFeatureExtractor(SSDResnetV1FpnFeatureExtractor):
reuse_weights=reuse_weights,
use_explicit_padding=use_explicit_padding,
use_depthwise=use_depthwise,
use_native_resize_op=use_native_resize_op,
override_base_feature_extractor_hyperparams=
override_base_feature_extractor_hyperparams)
......@@ -52,6 +52,7 @@ class SSDResNetV1FpnKerasFeatureExtractor(
additional_layer_depth=256,
reuse_weights=None,
use_explicit_padding=None,
use_depthwise=None,
override_base_feature_extractor_hyperparams=False,
name=None):
"""SSD Keras based FPN feature extractor Resnet v1 architecture.
......@@ -90,6 +91,7 @@ class SSDResNetV1FpnKerasFeatureExtractor(
use_explicit_padding: whether to use explicit padding when extracting
features. Default is None, as it's an invalid option and not implemented
in this feature extractor.
use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams`.
......@@ -105,11 +107,14 @@ class SSDResNetV1FpnKerasFeatureExtractor(
freeze_batchnorm=freeze_batchnorm,
inplace_batchnorm_update=inplace_batchnorm_update,
use_explicit_padding=None,
use_depthwise=None,
override_base_feature_extractor_hyperparams=
override_base_feature_extractor_hyperparams,
name=name)
if self._use_explicit_padding:
raise ValueError('Explicit padding is not a valid option.')
if self._use_depthwise:
raise ValueError('Depthwise is not a valid option.')
self._fpn_min_level = fpn_min_level
self._fpn_max_level = fpn_max_level
self._additional_layer_depth = additional_layer_depth
......@@ -251,6 +256,7 @@ class SSDResNet50V1FpnKerasFeatureExtractor(
additional_layer_depth=256,
reuse_weights=None,
use_explicit_padding=None,
use_depthwise=None,
override_base_feature_extractor_hyperparams=False,
name='ResNet50V1_FPN'):
"""SSD Keras based FPN feature extractor ResnetV1-50 architecture.
......@@ -278,7 +284,8 @@ class SSDResNet50V1FpnKerasFeatureExtractor(
reuse_weights: whether to reuse variables. Default is None.
use_explicit_padding: whether to use explicit padding when extracting
features. Default is None, as it's an invalid option and not implemented
in this feature extractor
in this feature extractor.
use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams`.
......@@ -296,6 +303,7 @@ class SSDResNet50V1FpnKerasFeatureExtractor(
resnet_v1_base_model=resnet_v1.resnet_v1_50,
resnet_v1_base_model_name='resnet_v1_50',
use_explicit_padding=use_explicit_padding,
use_depthwise=use_depthwise,
override_base_feature_extractor_hyperparams=
override_base_feature_extractor_hyperparams,
name=name)
......@@ -318,6 +326,7 @@ class SSDResNet101V1FpnKerasFeatureExtractor(
additional_layer_depth=256,
reuse_weights=None,
use_explicit_padding=None,
use_depthwise=None,
override_base_feature_extractor_hyperparams=False,
name='ResNet101V1_FPN'):
"""SSD Keras based FPN feature extractor ResnetV1-101 architecture.
......@@ -345,7 +354,8 @@ class SSDResNet101V1FpnKerasFeatureExtractor(
reuse_weights: whether to reuse variables. Default is None.
use_explicit_padding: whether to use explicit padding when extracting
features. Default is None, as it's an invalid option and not implemented
in this feature extractor
in this feature extractor.
use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams`.
......@@ -363,6 +373,7 @@ class SSDResNet101V1FpnKerasFeatureExtractor(
resnet_v1_base_model=resnet_v1.resnet_v1_101,
resnet_v1_base_model_name='resnet_v1_101',
use_explicit_padding=use_explicit_padding,
use_depthwise=use_depthwise,
override_base_feature_extractor_hyperparams=
override_base_feature_extractor_hyperparams,
name=name)
......@@ -385,6 +396,7 @@ class SSDResNet152V1FpnKerasFeatureExtractor(
additional_layer_depth=256,
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=None,
override_base_feature_extractor_hyperparams=False,
name='ResNet152V1_FPN'):
"""SSD Keras based FPN feature extractor ResnetV1-152 architecture.
......@@ -412,7 +424,8 @@ class SSDResNet152V1FpnKerasFeatureExtractor(
reuse_weights: whether to reuse variables. Default is None.
use_explicit_padding: whether to use explicit padding when extracting
features. Default is None, as it's an invalid option and not implemented
in this feature extractor
in this feature extractor.
use_depthwise: Whether to use depthwise convolutions. UNUSED currently.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams`.
......@@ -430,6 +443,7 @@ class SSDResNet152V1FpnKerasFeatureExtractor(
resnet_v1_base_model=resnet_v1.resnet_v1_152,
resnet_v1_base_model_name='resnet_v1_152',
use_explicit_padding=use_explicit_padding,
use_depthwise=use_depthwise,
override_base_feature_extractor_hyperparams=
override_base_feature_extractor_hyperparams,
name=name)
......@@ -66,7 +66,7 @@ message ConvolutionalBoxPredictor {
}
// Configuration proto for weight shared convolutional box predictor.
// Next id: 18
// Next id: 19
message WeightSharedConvolutionalBoxPredictor {
// Hyperparameters for convolution ops used in the box predictor.
optional Hyperparams conv_hyperparams = 1;
......@@ -122,8 +122,8 @@ message WeightSharedConvolutionalBoxPredictor {
optional float max = 2;
}
optional BoxEncodingsClipRange box_encodings_clip_range = 17;
}
}
// TODO(alirezafathi): Refactor the proto file to be able to configure mask rcnn
......@@ -197,3 +197,4 @@ message RfcnBoxPredictor {
optional int32 crop_width = 7 [default = 12];
}
......@@ -21,6 +21,9 @@ message CalibrationConfig {
// Per-class sigmoid calibration.
ClassIdSigmoidCalibrations class_id_sigmoid_calibrations = 4;
// Temperature scaling calibration.
TemperatureScalingCalibration temperature_scaling_calibration = 5;
}
}
......@@ -50,6 +53,11 @@ message ClassIdSigmoidCalibrations {
map<int32, SigmoidParameters> class_id_sigmoid_parameters_map = 1;
}
// Message for Temperature Scaling Calibration.
message TemperatureScalingCalibration {
optional float scaler = 1;
}
// Description of data used to fit the calibration model. CLASS_SPECIFIC
// indicates that the calibration parameters are derived from detections
// pertaining to a single class. ALL_CLASSES indicates that parameters were
......
......@@ -4,80 +4,85 @@ package object_detection.protos;
// Message for configuring DetectionModel evaluation jobs (eval.py).
message EvalConfig {
optional uint32 batch_size = 25 [default=1];
optional uint32 batch_size = 25 [default = 1];
// Number of visualization images to generate.
optional uint32 num_visualizations = 1 [default=10];
optional uint32 num_visualizations = 1 [default = 10];
// Number of examples to process of evaluation.
optional uint32 num_examples = 2 [default=5000, deprecated=true];
optional uint32 num_examples = 2 [default = 5000, deprecated = true];
// How often to run evaluation.
optional uint32 eval_interval_secs = 3 [default=300];
optional uint32 eval_interval_secs = 3 [default = 300];
// Maximum number of times to run evaluation. If set to 0, will run forever.
optional uint32 max_evals = 4 [default=0, deprecated=true];
optional uint32 max_evals = 4 [default = 0, deprecated = true];
// Whether the TensorFlow graph used for evaluation should be saved to disk.
optional bool save_graph = 5 [default=false];
optional bool save_graph = 5 [default = false];
// Path to directory to store visualizations in. If empty, visualization
// images are not exported (only shown on Tensorboard).
optional string visualization_export_dir = 6 [default=""];
optional string visualization_export_dir = 6 [default = ""];
// BNS name of the TensorFlow master.
optional string eval_master = 7 [default=""];
optional string eval_master = 7 [default = ""];
// Type of metrics to use for evaluation.
repeated string metrics_set = 8;
// Path to export detections to COCO compatible JSON format.
optional string export_path = 9 [default=''];
optional string export_path = 9 [default = ''];
// Option to not read groundtruth labels and only export detections to
// COCO-compatible JSON file.
optional bool ignore_groundtruth = 10 [default=false];
optional bool ignore_groundtruth = 10 [default = false];
// Use exponential moving averages of variables for evaluation.
// TODO(rathodv): When this is false make sure the model is constructed
// without moving averages in restore_fn.
optional bool use_moving_averages = 11 [default=false];
optional bool use_moving_averages = 11 [default = false];
// Whether to evaluate instance masks.
// Note that since there is no evaluation code currently for instance
// segmenation this option is unused.
optional bool eval_instance_masks = 12 [default=false];
optional bool eval_instance_masks = 12 [default = false];
// Minimum score threshold for a detected object box to be visualized
optional float min_score_threshold = 13 [default=0.5];
optional float min_score_threshold = 13 [default = 0.5];
// Maximum number of detections to visualize
optional int32 max_num_boxes_to_visualize = 14 [default=20];
optional int32 max_num_boxes_to_visualize = 14 [default = 20];
// When drawing a single detection, each label is by default visualized as
// <label name> : <label score>. One can skip the name or/and score using the
// following fields:
optional bool skip_scores = 15 [default=false];
optional bool skip_labels = 16 [default=false];
optional bool skip_scores = 15 [default = false];
optional bool skip_labels = 16 [default = false];
// Whether to show groundtruth boxes in addition to detected boxes in
// visualizations.
optional bool visualize_groundtruth_boxes = 17 [default=false];
optional bool visualize_groundtruth_boxes = 17 [default = false];
// Box color for visualizing groundtruth boxes.
optional string groundtruth_box_visualization_color = 18 [default="black"];
optional string groundtruth_box_visualization_color = 18 [default = "black"];
// Whether to keep image identifier in filename when exported to
// visualization_export_dir.
optional bool keep_image_id_for_visualization_export = 19 [default=false];
optional bool keep_image_id_for_visualization_export = 19 [default = false];
// Whether to retain original images (i.e. not pre-processed) in the tensor
// dictionary, so that they can be displayed in Tensorboard.
optional bool retain_original_images = 23 [default=true];
optional bool retain_original_images = 23 [default = true];
// If True, additionally include per-category metrics.
optional bool include_metrics_per_category = 24 [default=false];
optional bool include_metrics_per_category = 24 [default = false];
// Recall range within which precision should be computed.
optional float recall_lower_bound = 26 [default = 0.0];
optional float recall_upper_bound = 27 [default = 1.0];
// Whether to retain additional channels (i.e. not pre-processed) in the
// tensor dictionary, so that they can be displayed in Tensorboard.
optional bool retain_original_image_additional_channels = 28
[default = false];
}
......@@ -169,11 +169,17 @@ message FasterRcnn {
// running evaluation (specifically not is_training if False).
optional bool use_static_shapes_for_eval = 37 [default = false];
// If true, uses implementation of partitioned_non_max_suppression in first
// stage.
optional bool use_partitioned_nms_in_first_stage = 38 [default = true];
// Whether to return raw detections (pre NMS).
optional bool return_raw_detections_during_predict = 39 [default = false];
// Whether to use tf.image.combined_non_max_suppression.
optional bool use_combined_nms_in_first_stage = 38 [default=false];
optional bool use_combined_nms_in_first_stage = 40 [default = false];
}
message FasterRcnnFeatureExtractor {
// Type of Faster R-CNN model (e.g., 'faster_rcnn_resnet101';
// See builders/model_builder.py for expected types).
......