"python/vscode:/vscode.git/clone" did not exist on "ef32677444aab1b4afcc4edc3fb6024e8e09b011"
Commit 05584085 authored by pkulzc, committed by Jonathan Huang

Merged commit includes the following changes: (#6315)

236813471  by lzc:

    Internal change.

--
236507310  by lzc:

    Fix preprocess.random_resize_method config type issue. The target height and width will be passed as "size" to tf.image.resize_images, which only accepts integers.
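
    A minimal sketch of the constraint (the config values here are
    hypothetical): tf.image.resize_images takes an integer "size", so
    float-valued targets must be cast first.

        import tensorflow as tf

        image = tf.random_uniform([480, 640, 3])
        target_height, target_width = 300.0, 400.0  # hypothetical float config values
        size = tf.cast(tf.stack([target_height, target_width]), tf.int32)
        resized = tf.image.resize_images(image, size)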

--
236409989  by Zhichao Lu:

    Configure export_to_tpu via a function parameter instead of HParams for TPU inference.

--
236403186  by Zhichao Lu:

    Make graph file names optional arguments.

--
236237072  by Zhichao Lu:

    Minor bugfix for keyword args.

--
236209602  by Zhichao Lu:

    Add support for PartitionedVariable to get_variables_available_in_checkpoint.

--
235828658  by Zhichao Lu:

    Automatically stop evaluation jobs when training is finished.

--
235817964  by Zhichao Lu:

    Add an optional process_metrics_fn callback to eval_util; it gets called
    with evaluation results once each evaluation is complete.
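
    A hedged sketch of such a callback; the parameter order follows the
    eval_util docstring in the diff below (checkpoint_number, metrics,
    checkpoint_file), and the metric key is an assumption for illustration.

        def log_map(checkpoint_number, metrics, checkpoint_file):
          """Example callback: logs mAP for each completed evaluation."""
          map_value = metrics.get('DetectionBoxes_Precision/mAP')  # assumed key
          print('Evaluated %s (ckpt-%d): mAP=%s'
                % (checkpoint_file, checkpoint_number, map_value))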

--
235788721  by lzc:

    Fix yml file tf runtime version.

--
235262897  by Zhichao Lu:

    Add keypoint support to the random_pad_image preprocessor method.

--
235257380  by Zhichao Lu:

    Support InputDataFields.groundtruth_confidences in retain_groundtruth(), retain_groundtruth_with_positive_classes(), filter_groundtruth_with_crowd_boxes(), filter_groundtruth_with_nan_box_coordinates(), filter_unrecognized_classes().

--
235109188  by Zhichao Lu:

    Fix bug in pad_input_data_to_static_shapes for num_additional_channels > 0; make color-specific data augmentation only touch RGB channels.

--
235045010  by Zhichao Lu:

    Don't slice class_predictions_with_background when add_background_class is false.

--
235026189  by lzc:

    Fix import in g3doc.

--
234863426  by Zhichao Lu:

    Added fixes in exporter to allow writing a checkpoint to a specified temporary directory.

--
234671886  by lzc:

    Internal Change.

--
234630803  by rathodv:

    Internal Change.

--
233985896  by Zhichao Lu:

    Add Neumann optimizer to object detection.

--
233560911  by Zhichao Lu:

    Add NAS-FPN object detection with Resnet and Mobilenet v2.

--
233513536  by Zhichao Lu:

    Export a TPU-compatible object detection model.

--
233495772  by lzc:

    Internal change.

--
233453557  by Zhichao Lu:

    Create Keras-based SSD+MobilenetV1 for object detection.

--
233220074  by lzc:

    Update release notes date.

--
233165761  by Zhichao Lu:

    Support depth_multiplier and min_depth in _SSDResnetV1FpnFeatureExtractor.

--
233160046  by lzc:

    Internal change.

--
232926599  by Zhichao Lu:

    [tf.data] Switching tf.data functions to use `defun`, providing an escape hatch to continue using the legacy `Defun`.

    There are subtle differences between the implementations of `defun` and `Defun` (such as resource handling or control flow), and it is possible that input pipelines that use control flow or resources in their functions might be affected by this change. To migrate the majority of existing pipelines to the recommended way of creating functions in the TF 2.0 world, while allowing a small number of existing pipelines to continue relying on the deprecated behavior, this CL provides an escape hatch.

    If your input pipeline is affected by this CL, it should apply the escape hatch by replacing `foo.map(...)` with `foo.map_with_legacy_function(...)`.
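
    A brief illustration of the escape hatch, assuming the TF 1.13-era
    tf.data API:

        import tensorflow as tf

        dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3])
        # Default path: `map` now traces the function with `defun` semantics.
        doubled = dataset.map(lambda x: 2 * x)
        # Escape hatch for pipelines relying on legacy `Defun` behavior:
        legacy = dataset.map_with_legacy_function(lambda x: 2 * x)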

--
232891621  by Zhichao Lu:

    Modify faster_rcnn meta architecture to normalize raw detections.

--
232875817  by Zhichao Lu:

    Make calibration a post-processing step.

    Specifically:
    - Move the calibration config from pipeline.proto --> post_processing.proto
    - Edit post_processing_builder.py to return a calibration function. If no calibration config is provided, it returns None.
    - Edit SSD and FasterRCNN meta architectures to optionally call the calibration function on detection scores after score conversion and before NMS.
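
    A schematic of the resulting ordering inside `postprocess` (the helper
    names here are illustrative, not the actual meta-architecture code):

        import tensorflow as tf

        def convert_scores(class_predictions_with_background,
                           calibration_fn=None):
          scores = tf.sigmoid(class_predictions_with_background)  # score conversion
          if calibration_fn is not None:
            scores = calibration_fn(scores)  # calibration runs before NMS
          return scores  # NMS then consumes the calibrated scores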

--
232704481  by Zhichao Lu:

    Edit calibration builder to build a function that will be used within a detection model's `postprocess` method, after score conversion and before non-maximum suppression.

    Specific Edits:
    - The returned function now accepts class_predictions_with_background as its argument instead of detection_scores and detection_classes.
    - Class-specific calibration was temporarily removed, as it requires more significant refactoring. Will be added later.

--
232615379  by Zhichao Lu:

    Internal change

--
232483345  by ronnyvotel:

    Making the use of bfloat16 restricted to TPUs.

--
232399572  by Zhichao Lu:

    Edit calibration builder and proto to support class-agnostic calibration.

    Specifically:
    - Edit calibration protos to include the path to the relevant label map if required for class-specific calibration. Previously, label maps were inferred from other parts of the pipeline proto; this change keeps all information required by the builder within the calibration proto and prevents extraneous information from being passed with class-agnostic calibration.
    - Add class-agnostic protos to the calibration config.

    Note that the proto supports sigmoid and linear interpolation parameters, but the builder currently only supports linear interpolation.

--
231613048  by Zhichao Lu:

    Add calibration builder for applying calibration transformations to the outputs of object detection models.

    Specifically:
    - Add calibration proto to support sigmoid and isotonic regression (stepwise function) calibration.
    - Add a builder to support calibration from isotonic regression outputs.
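
    A sketch of how calibration from isotonic-regression outputs (a stepwise
    fit summarized by (x, y) breakpoints) can be applied with linear
    interpolation; the breakpoint values are made up for illustration.

        import numpy as np

        breakpoints_x = np.array([0.0, 0.4, 0.7, 1.0])  # raw scores
        breakpoints_y = np.array([0.0, 0.2, 0.8, 1.0])  # calibrated scores

        def calibrate(scores):
          return np.interp(scores, breakpoints_x, breakpoints_y)

        print(calibrate(np.array([0.55, 0.9])))  # -> [0.5, ~0.93]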

--
231519786  by lzc:

    model_builder test refactor.
    - removed proto text boilerplate in each test case and let them call a create_default_proto function instead.
    - consolidated all separate ssd model creation tests into one.
    - consolidated all separate faster rcnn model creation tests into one.
    - used parameterized test for testing mask rcnn models and use_matmul_crop_and_resize
    - added a test covering all failure cases.

--
231448169  by Zhichao Lu:

    Return static shape as a constant tensor.

--
231423126  by lzc:

    Add a release note for OID v4 models.

--
231401941  by Zhichao Lu:

    Adding correct labelmap for the models trained on Open Images V4 (*oid_v4
    config suffix).

--
231320357  by Zhichao Lu:

    Add scope to Nearest Neighbor Resize op so that it stays in the same name scope as the original resize ops.

--
231257699  by Zhichao Lu:

    Switch to using preserve_aspect_ratio in tf.image.resize_images rather than using a custom implementation.
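
    The replacement in a nutshell (preserve_aspect_ratio exists in the TF
    1.12-era API; the sizes are illustrative):

        import tensorflow as tf

        image = tf.random_uniform([480, 640, 3])
        # Scales the image to fit within 600x600 while keeping its aspect ratio.
        resized = tf.image.resize_images(
            image, [600, 600], preserve_aspect_ratio=True)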

--
231247368  by rathodv:

    Internal change.

--
231004874  by lzc:

    Update documentation to use tf 1.12 for object detection API.

--
230999911  by rathodv:

    Use tf.batch_gather instead of ops.batch_gather
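
    A quick example of the replacement (tf.batch_gather is part of the TF
    1.12-era API; the values are illustrative):

        import tensorflow as tf

        params = tf.constant([[10, 11, 12], [20, 21, 22]])
        indices = tf.constant([[2, 0], [1, 1]])
        gathered = tf.batch_gather(params, indices)  # [[12, 10], [21, 21]]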

--
230999720  by huizhongc:

    Fix weight equalization test in ops_test.

--
230984728  by rathodv:

    Internal update.

--
230929019  by lzc:

    Add an option to replace preprocess operation with placeholder for ssd feature extractor.

--
230845266  by lzc:

    Require tensorflow version 1.12 for object detection API and rename keras_applications to keras_models

--
230392064  by lzc:

    Add RetinaNet 101 checkpoint trained on OID v4 to detection model zoo.

--
230014128  by derekjchow:

    This file was relocated below tensorflow/lite/g3doc/convert.

--
229941449  by lzc:

    Update SSD mobilenet v2 quantized model download path.

--
229843662  by lzc:

    Add an option to use native resize tf op in fpn top-down feature map generation.

--
229636034  by rathodv:

    Add deprecation notice to a few old parameters in train.proto

--
228959078  by derekjchow:

    Remove duplicate elif case in _check_and_convert_legacy_input_config_key

--
228749719  by rathodv:

    Minor refactoring to make exporter's `build_detection_graph` method public.

--
228573828  by rathodv:

    Modify model.postprocess to return raw detections and raw scores.

    Modify post-process methods in core/model.py and the meta architectures to export raw detections (without any non-maximum suppression) and raw multiclass score logits for those detections.

--
228420670  by Zhichao Lu:

    Add shims for custom architectures for object detection models.

--
228241692  by Zhichao Lu:

    Fix the comment on "losses_mask" in "Loss" class.

--
228223810  by Zhichao Lu:

    Support other_heads' predictions in WeightSharedConvolutionalBoxPredictor. Also remove a few unused parameters and fix a couple of comments in convolutional_box_predictor.py.

--
228200588  by Zhichao Lu:

    Add Expected Calibration Error and an evaluator that calculates the metric for object detections.
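
    A compact sketch of the metric, ECE = sum_b (n_b / N) * |accuracy(b) -
    confidence(b)| over confidence bins (the equal-width binning here is an
    assumption):

        import numpy as np

        def expected_calibration_error(confidences, correct, num_bins=10):
          bins = np.minimum((confidences * num_bins).astype(int), num_bins - 1)
          ece = 0.0
          for b in range(num_bins):
            mask = bins == b
            if mask.any():
              # Weight each bin's |accuracy - confidence| gap by its mass.
              ece += mask.mean() * abs(correct[mask].mean() -
                                       confidences[mask].mean())
          return ece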

--
228167740  by lzc:

    Add option to use bounded activations in FPN top-down feature map generation.

--
227767700  by rathodv:

    Internal.

--
226295236  by Zhichao Lu:

    Add Open Image V4 Resnet101-FPN training config to third_party

--
226254842  by Zhichao Lu:

    Fix typo in documentation.

--
225833971  by Zhichao Lu:

    Option to have no resizer in object detection model.

--
225824890  by lzc:

    Fixes Python 3 compatibility for model_lib.py.

--
225760897  by menglong:

    normalizer should be at least 1.

--
225559842  by menglong:

    Add extra logic filtering unrecognized classes.

--
225379421  by lzc:

    Add faster_rcnn_inception_resnet_v2_atrous_oid_v4 config to third_party

--
225368337  by Zhichao Lu:

    Add extra logic filtering unrecognized classes.

--
225341095  by Zhichao Lu:

    Adding Open Images V4 models to the OD API model zoo and the
    corresponding configs to the configs directory.

--
225218450  by menglong:

    Add extra logic filtering unrecognized classes.

--
225057591  by Zhichao Lu:

    Internal change.

--
224895417  by rathodv:

    Internal change.

--
224209282  by Zhichao Lu:

    Add two data augmentations to object detection: (1) Self-concat (2) Absolute pads.

--
224073762  by Zhichao Lu:

    Do not create tf.constant until _generate() is actually called in the object detector.

--

PiperOrigin-RevId: 236813471
parent a5db4420
@@ -51,6 +51,7 @@ class PreprocessorCache(object):
ADD_BLACK_PATCH = 'add_black_patch'
SELECTOR = 'selector'
SELECTOR_TUPLES = 'selector_tuples'
SELF_CONCAT_IMAGE = 'self_concat_image'
SSD_CROP_SELECTOR_ID = 'ssd_crop_selector_id'
SSD_CROP_PAD_SELECTOR_ID = 'ssd_crop_pad_selector_id'
@@ -60,7 +61,8 @@ class PreprocessorCache(object):
ADJUST_HUE, ADJUST_SATURATION, DISTORT_COLOR, STRICT_CROP_IMAGE,
CROP_IMAGE, PAD_IMAGE, CROP_TO_ASPECT_RATIO, RESIZE_METHOD,
PAD_TO_ASPECT_RATIO, BLACK_PATCHES, ADD_BLACK_PATCH, SELECTOR,
-SELECTOR_TUPLES, SSD_CROP_SELECTOR_ID, SSD_CROP_PAD_SELECTOR_ID]
+SELECTOR_TUPLES, SELF_CONCAT_IMAGE, SSD_CROP_SELECTOR_ID,
+SSD_CROP_PAD_SELECTOR_ID]

def __init__(self):
self._history = defaultdict(dict)
@@ -99,4 +101,3 @@ class PreprocessorCache(object):
if function_id not in self._VALID_FNS:
raise ValueError('Function id not recognized: %s.' % str(function_id))
self._history[function_id][key] = value
@@ -2071,6 +2071,96 @@ class PreprocessorTest(tf.test.TestCase):
self.assertTrue(np.all((boxes_[:, 3] - boxes_[:, 1]) >= (
padded_boxes_[:, 3] - padded_boxes_[:, 1])))

def testRandomPadImageWithKeypoints(self):
preprocessing_options = [(preprocessor.normalize_image, {
'original_minval': 0,
'original_maxval': 255,
'target_minval': 0,
'target_maxval': 1
})]
images = self.createTestImages()
boxes = self.createTestBoxes()
labels = self.createTestLabels()
keypoints = self.createTestKeypoints()
tensor_dict = {
fields.InputDataFields.image: images,
fields.InputDataFields.groundtruth_boxes: boxes,
fields.InputDataFields.groundtruth_classes: labels,
fields.InputDataFields.groundtruth_keypoints: keypoints,
}
tensor_dict = preprocessor.preprocess(tensor_dict, preprocessing_options)
images = tensor_dict[fields.InputDataFields.image]
preprocessing_options = [(preprocessor.random_pad_image, {})]
padded_tensor_dict = preprocessor.preprocess(tensor_dict,
preprocessing_options)
padded_images = padded_tensor_dict[fields.InputDataFields.image]
padded_boxes = padded_tensor_dict[
fields.InputDataFields.groundtruth_boxes]
padded_keypoints = padded_tensor_dict[
fields.InputDataFields.groundtruth_keypoints]
boxes_shape = tf.shape(boxes)
padded_boxes_shape = tf.shape(padded_boxes)
keypoints_shape = tf.shape(keypoints)
padded_keypoints_shape = tf.shape(padded_keypoints)
images_shape = tf.shape(images)
padded_images_shape = tf.shape(padded_images)
with self.test_session() as sess:
(boxes_shape_, padded_boxes_shape_, keypoints_shape_,
padded_keypoints_shape_, images_shape_, padded_images_shape_, boxes_,
padded_boxes_, keypoints_, padded_keypoints_) = sess.run(
[boxes_shape, padded_boxes_shape, keypoints_shape,
padded_keypoints_shape, images_shape, padded_images_shape, boxes,
padded_boxes, keypoints, padded_keypoints])
self.assertAllEqual(boxes_shape_, padded_boxes_shape_)
self.assertAllEqual(keypoints_shape_, padded_keypoints_shape_)
self.assertTrue((images_shape_[1] >= padded_images_shape_[1] * 0.5).all())
self.assertTrue((images_shape_[2] >= padded_images_shape_[2] * 0.5).all())
self.assertTrue((images_shape_[1] <= padded_images_shape_[1]).all())
self.assertTrue((images_shape_[2] <= padded_images_shape_[2]).all())
self.assertTrue(np.all((boxes_[:, 2] - boxes_[:, 0]) >= (
padded_boxes_[:, 2] - padded_boxes_[:, 0])))
self.assertTrue(np.all((boxes_[:, 3] - boxes_[:, 1]) >= (
padded_boxes_[:, 3] - padded_boxes_[:, 1])))
self.assertTrue(np.all((keypoints_[1, :, 0] - keypoints_[0, :, 0]) >= (
padded_keypoints_[1, :, 0] - padded_keypoints_[0, :, 0])))
self.assertTrue(np.all((keypoints_[1, :, 1] - keypoints_[0, :, 1]) >= (
padded_keypoints_[1, :, 1] - padded_keypoints_[0, :, 1])))
def testRandomAbsolutePadImage(self):
images = self.createTestImages()
boxes = self.createTestBoxes()
labels = self.createTestLabels()
tensor_dict = {
fields.InputDataFields.image: tf.to_float(images),
fields.InputDataFields.groundtruth_boxes: boxes,
fields.InputDataFields.groundtruth_classes: labels,
}
height_padding = 10
width_padding = 20
preprocessing_options = [(preprocessor.random_absolute_pad_image, {
'max_height_padding': height_padding,
'max_width_padding': width_padding})]
padded_tensor_dict = preprocessor.preprocess(tensor_dict,
preprocessing_options)
original_shape = tf.shape(images)
final_shape = tf.shape(padded_tensor_dict[fields.InputDataFields.image])
with self.test_session() as sess:
_, height, width, _ = sess.run(original_shape)
for _ in range(100):
output_shape = sess.run(final_shape)
self.assertTrue(output_shape[1] >= height)
self.assertTrue(output_shape[1] < height + height_padding)
self.assertTrue(output_shape[2] >= width)
self.assertTrue(output_shape[2] < width + width_padding)

def testRandomCropPadImageWithCache(self):
preprocess_options = [(preprocessor.normalize_image, {
'original_minval': 0,
@@ -2693,6 +2783,95 @@ class PreprocessorTest(tf.test.TestCase):
self.assertAllEqual([0, 1, 1, 0, 1], one_hot)

def testRandomSelfConcatImage(self):
tf.set_random_seed(24601)
images = self.createTestImages()
boxes = self.createTestBoxes()
labels = self.createTestLabels()
weights = self.createTestGroundtruthWeights()
confidences = weights
scores = self.createTestMultiClassScores()
tensor_dict = {
fields.InputDataFields.image: tf.to_float(images),
fields.InputDataFields.groundtruth_boxes: boxes,
fields.InputDataFields.groundtruth_classes: labels,
fields.InputDataFields.groundtruth_weights: weights,
fields.InputDataFields.groundtruth_confidences: confidences,
fields.InputDataFields.multiclass_scores: scores,
}
preprocessing_options = [(preprocessor.random_self_concat_image, {
'concat_vertical_probability': 0.5,
'concat_horizontal_probability': 0.5,
'seed': 24601,
})]
func_arg_map = preprocessor.get_default_func_arg_map(
True, True, True)
output_tensor_dict = preprocessor.preprocess(
tensor_dict, preprocessing_options, func_arg_map=func_arg_map)
final_shape = tf.shape(output_tensor_dict[fields.InputDataFields.image])[
1:3]
with self.test_session() as sess:
outputs = []
augment_height_only = False
augment_width_only = False
for _ in range(50):
original_boxes = sess.run(boxes)
shape, new_boxes, new_labels, new_confidences, new_scores = sess.run(
[final_shape,
output_tensor_dict[fields.InputDataFields.groundtruth_boxes],
output_tensor_dict[fields.InputDataFields.groundtruth_classes],
output_tensor_dict[fields.InputDataFields.groundtruth_confidences],
output_tensor_dict[fields.InputDataFields.multiclass_scores],
])
shape = np.array(shape)
outputs.append(shape)
if np.array_equal(shape, [8, 4]):
augment_height_only = True
self.assertEqual(
new_boxes.shape[0], 2 * boxes.shape[0])
self.assertAllClose(new_boxes[:2, :] * [2.0, 1.0, 2.0, 1.0],
original_boxes)
self.assertAllClose(
(new_boxes[2:, :] - [0.5, 0.0, 0.5, 0.0]) * [
2.0, 1.0, 2.0, 1.0],
original_boxes)
elif np.array_equal(shape, [4, 8]):
augment_width_only = True
self.assertEqual(
new_boxes.shape[0], 2 * boxes.shape[0])
self.assertAllClose(new_boxes[:2, :] * [1.0, 2.0, 1.0, 2.0],
original_boxes)
self.assertAllClose(
(new_boxes[2:, :] - [0.0, 0.5, 0.0, 0.5]) * [
1.0, 2.0, 1.0, 2.0],
original_boxes)
augmentation_factor = new_boxes.shape[0] / boxes.shape[0].value
self.assertEqual(new_labels.shape[0],
labels.shape[0].value * augmentation_factor)
self.assertEqual(new_confidences.shape[0],
confidences.shape[0].value * augmentation_factor)
self.assertEqual(new_scores.shape[0],
scores.shape[0].value * augmentation_factor)
max_height = max(x[0] for x in outputs)
max_width = max(x[1] for x in outputs)
self.assertEqual(max_height, 8)
self.assertEqual(max_width, 8)
self.assertEqual(augment_height_only, True)
self.assertEqual(augment_width_only, True)

def testSSDRandomCropWithCache(self):
preprocess_options = [
(preprocessor.normalize_image, {
...
@@ -114,6 +114,9 @@ class DetectionResultFields(object):
detection_boundaries: contains an object boundary for each detection box.
detection_keypoints: contains detection keypoints for each detection box.
num_detections: number of detections in the batch.
raw_detection_boxes: contains decoded detection boxes without Non-Max
suppression.
raw_detection_scores: contains class score logits for raw detection boxes.
""" """
source_id = 'source_id' source_id = 'source_id'
...@@ -125,6 +128,8 @@ class DetectionResultFields(object): ...@@ -125,6 +128,8 @@ class DetectionResultFields(object):
detection_boundaries = 'detection_boundaries' detection_boundaries = 'detection_boundaries'
detection_keypoints = 'detection_keypoints' detection_keypoints = 'detection_keypoints'
num_detections = 'num_detections' num_detections = 'num_detections'
raw_detection_boxes = 'raw_detection_boxes'
raw_detection_scores = 'raw_detection_scores'

class BoxListFields(object):
...
@@ -166,6 +166,10 @@ class TargetAssigner(object):
num_gt_boxes = groundtruth_boxes.num_boxes()
groundtruth_weights = tf.ones([num_gt_boxes], dtype=tf.float32)
# set scores on the gt boxes
scores = 1 - groundtruth_labels[:, 0]
groundtruth_boxes.add_field(fields.BoxListFields.scores, scores)
with tf.control_dependencies(
[unmatched_shape_assert, labels_and_box_shapes_assert]):
match_quality_matrix = self._similarity_calc.compare(groundtruth_boxes,
...
@@ -131,7 +131,8 @@ class TfExampleDecoder(data_decoder.DataDecoder):
use_display_name=False,
dct_method='',
num_keypoints=0,
-num_additional_channels=0):
+num_additional_channels=0,
+load_multiclass_scores=False):
"""Constructor sets keys_to_features and items_to_handlers.

Args:
@@ -153,6 +154,8 @@ class TfExampleDecoder(data_decoder.DataDecoder):
example, the jpeg library does not have that specific option.
num_keypoints: the number of keypoints per object.
num_additional_channels: how many additional channels to use.
load_multiclass_scores: Whether to load multiclass scores associated with
boxes.

Raises:
ValueError: If `instance_mask_type` option is not one of
@@ -205,6 +208,7 @@ class TfExampleDecoder(data_decoder.DataDecoder):
tf.VarLenFeature(tf.int64),
'image/object/weight':
tf.VarLenFeature(tf.float32),
}
# We are checking `dct_method` instead of passing it directly in order to
# ensure TF version 1.6 compatibility.
@@ -251,7 +255,13 @@ class TfExampleDecoder(data_decoder.DataDecoder):
slim_example_decoder.Tensor('image/object/group_of')),
fields.InputDataFields.groundtruth_weights: (
slim_example_decoder.Tensor('image/object/weight')),
}
if load_multiclass_scores:
self.keys_to_features[
'image/object/class/multiclass_scores'] = tf.VarLenFeature(tf.float32)
self.items_to_handlers[fields.InputDataFields.multiclass_scores] = (
slim_example_decoder.Tensor('image/object/class/multiclass_scores'))
if num_additional_channels > 0:
self.keys_to_features[
'image/additional_channels/encoded'] = tf.FixedLenFeature(
@@ -355,6 +365,9 @@ class TfExampleDecoder(data_decoder.DataDecoder):
shape [None, None, None] containing instance masks.
fields.InputDataFields.groundtruth_image_classes - 1D uint64 of shape
[None] containing classes for the boxes.
fields.InputDataFields.multiclass_scores - 1D float32 tensor of shape
[None * num_classes] containing flattened multiclass scores for
groundtruth boxes.
""" """
serialized_example = tf.reshape(tf_example_string_tensor, shape=[]) serialized_example = tf.reshape(tf_example_string_tensor, shape=[])
decoder = slim_example_decoder.TFExampleDecoder(self.keys_to_features, decoder = slim_example_decoder.TFExampleDecoder(self.keys_to_features,
......
...@@ -374,6 +374,43 @@ class TfExampleDecoderTest(tf.test.TestCase): ...@@ -374,6 +374,43 @@ class TfExampleDecoderTest(tf.test.TestCase):
self.assertAllEqual(bbox_classes, self.assertAllEqual(bbox_classes,
tensor_dict[fields.InputDataFields.groundtruth_classes]) tensor_dict[fields.InputDataFields.groundtruth_classes])
def testDecodeMultiClassScores(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
bbox_ymins = [0.0, 4.0]
bbox_xmins = [1.0, 5.0]
bbox_ymaxs = [2.0, 6.0]
bbox_xmaxs = [3.0, 7.0]
flattened_multiclass_scores = [100., 50.] + [20., 30.]
example = tf.train.Example(
features=tf.train.Features(
feature={
'image/encoded':
dataset_util.bytes_feature(encoded_jpeg),
'image/format':
dataset_util.bytes_feature('jpeg'),
'image/object/class/multiclass_scores':
dataset_util.float_list_feature(flattened_multiclass_scores),
'image/object/bbox/ymin':
dataset_util.float_list_feature(bbox_ymins),
'image/object/bbox/xmin':
dataset_util.float_list_feature(bbox_xmins),
'image/object/bbox/ymax':
dataset_util.float_list_feature(bbox_ymaxs),
'image/object/bbox/xmax':
dataset_util.float_list_feature(bbox_xmaxs),
})).SerializeToString()
example_decoder = tf_example_decoder.TfExampleDecoder(
load_multiclass_scores=True)
tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))
with self.test_session() as sess:
tensor_dict = sess.run(tensor_dict)
self.assertAllEqual(flattened_multiclass_scores,
tensor_dict[fields.InputDataFields.multiclass_scores])

def testDecodeObjectLabelNoText(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
@@ -417,6 +454,51 @@ class TfExampleDecoderTest(tf.test.TestCase):
self.assertAllEqual(bbox_classes,
tensor_dict[fields.InputDataFields.groundtruth_classes])

def testDecodeObjectLabelWithText(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
bbox_classes_text = ['cat', 'dog']
# Annotation label gets overridden by labelmap id.
annotated_bbox_classes = [3, 4]
expected_bbox_classes = [1, 2]
example = tf.train.Example(
features=tf.train.Features(
feature={
'image/encoded':
dataset_util.bytes_feature(encoded_jpeg),
'image/format':
dataset_util.bytes_feature('jpeg'),
'image/object/class/text':
dataset_util.bytes_list_feature(bbox_classes_text),
'image/object/class/label':
dataset_util.int64_list_feature(annotated_bbox_classes),
})).SerializeToString()
label_map_string = """
item {
id:1
name:'cat'
}
item {
id:2
name:'dog'
}
"""
label_map_path = os.path.join(self.get_temp_dir(), 'label_map.pbtxt')
with tf.gfile.Open(label_map_path, 'wb') as f:
f.write(label_map_string)
example_decoder = tf_example_decoder.TfExampleDecoder(
label_map_proto_file=label_map_path)
tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))
init = tf.tables_initializer()
with self.test_session() as sess:
sess.run(init)
tensor_dict = sess.run(tensor_dict)
self.assertAllEqual(expected_bbox_classes,
tensor_dict[fields.InputDataFields.groundtruth_classes])

def testDecodeObjectLabelUnrecognizedName(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
@@ -501,6 +583,50 @@ class TfExampleDecoderTest(tf.test.TestCase):
self.assertAllEqual([3, 1],
tensor_dict[fields.InputDataFields.groundtruth_classes])

def testDecodeObjectLabelUnrecognizedNameWithMappingWithDisplayName(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
bbox_classes_text = ['cat', 'cheetah']
bbox_classes_id = [5, 6]
example = tf.train.Example(
features=tf.train.Features(
feature={
'image/encoded':
dataset_util.bytes_feature(encoded_jpeg),
'image/format':
dataset_util.bytes_feature('jpeg'),
'image/object/class/text':
dataset_util.bytes_list_feature(bbox_classes_text),
'image/object/class/label':
dataset_util.int64_list_feature(bbox_classes_id),
})).SerializeToString()
label_map_string = """
item {
name:'/m/cat'
id:3
display_name:'cat'
}
item {
name:'/m/dog'
id:1
display_name:'dog'
}
"""
label_map_path = os.path.join(self.get_temp_dir(), 'label_map.pbtxt')
with tf.gfile.Open(label_map_path, 'wb') as f:
f.write(label_map_string)
example_decoder = tf_example_decoder.TfExampleDecoder(
label_map_proto_file=label_map_path)
tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))
with self.test_session() as sess:
sess.run(tf.tables_initializer())
tensor_dict = sess.run(tensor_dict)
self.assertAllEqual([3, -1],
tensor_dict[fields.InputDataFields.groundtruth_classes])

def testDecodeObjectLabelWithMappingWithName(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
...
@@ -15,6 +15,7 @@
"""Common utility functions for evaluation."""
import collections
import os
import re
import time

import numpy as np
@@ -233,7 +234,8 @@ def _run_checkpoint_once(tensor_dict,
save_graph=False,
save_graph_dir='',
losses_dict=None,
-eval_export_path=None):
+eval_export_path=None,
+process_metrics_fn=None):
"""Evaluates metrics defined in evaluators and returns summaries.

This function loads the latest checkpoint in checkpoint_dirs and evaluates
@@ -275,6 +277,12 @@ def _run_checkpoint_once(tensor_dict,
losses_dict: optional dictionary of scalar detection losses.
eval_export_path: Path for saving a json file that contains the detection
results in json format.
process_metrics_fn: a callback called with evaluation results after each
evaluation is done. It could be used e.g. to back up checkpoints with
best evaluation scores, or to call an external system to update evaluation
results in order to drive best hyper-parameter search. Parameters are:
int checkpoint_number, Dict[str, ObjectDetectionEvalMetrics] metrics,
str checkpoint_file path.

Returns:
global_step: the count of global steps.
@@ -291,6 +299,7 @@ def _run_checkpoint_once(tensor_dict,
sess.run(tf.global_variables_initializer())
sess.run(tf.local_variables_initializer())
sess.run(tf.tables_initializer())
checkpoint_file = None
if restore_fn:
restore_fn(sess)
else:
@@ -370,6 +379,15 @@ def _run_checkpoint_once(tensor_dict,
for key, value in iter(aggregate_result_losses_dict.items()):
all_evaluator_metrics['Losses/' + key] = np.mean(value)
if process_metrics_fn and checkpoint_file:
m = re.search(r'model.ckpt-(\d+)$', checkpoint_file)
if not m:
tf.logging.error('Failed to parse checkpoint number from: %s',
checkpoint_file)
else:
checkpoint_number = int(m.group(1))
process_metrics_fn(checkpoint_number, all_evaluator_metrics,
checkpoint_file)
sess.close()
return (global_step, all_evaluator_metrics)

@@ -385,11 +403,13 @@ def repeated_checkpoint_run(tensor_dict,
num_batches=1,
eval_interval_secs=120,
max_number_of_evaluations=None,
max_evaluation_global_step=None,
master='',
save_graph=False,
save_graph_dir='',
losses_dict=None,
-eval_export_path=None):
+eval_export_path=None,
+process_metrics_fn=None):
"""Periodically evaluates desired tensors using checkpoint_dirs or restore_fn.

This function repeatedly loads a checkpoint and evaluates a desired
@@ -425,6 +445,7 @@ def repeated_checkpoint_run(tensor_dict,
eval_interval_secs: the number of seconds between each evaluation run.
max_number_of_evaluations: the max number of iterations of the evaluation.
If the value is left as None the evaluation continues indefinitely.
max_evaluation_global_step: global step when evaluation stops.
master: the location of the Tensorflow session.
save_graph: whether or not the Tensorflow graph is saved as a pbtxt file.
save_graph_dir: where to save on disk the Tensorflow graph. If store_graph
@@ -432,6 +453,12 @@ def repeated_checkpoint_run(tensor_dict,
losses_dict: optional dictionary of scalar detection losses.
eval_export_path: Path for saving a json file that contains the detection
results in json format.
process_metrics_fn: a callback called with evaluation results after each
evaluation is done. It could be used e.g. to back up checkpoints with
best evaluation scores, or to call an external system to update evaluation
results in order to drive best hyper-parameter search. Parameters are:
int checkpoint_number, Dict[str, ObjectDetectionEvalMetrics] metrics,
str checkpoint_file path.

Returns:
metrics: A dictionary containing metric names and values in the latest
@@ -443,7 +470,10 @@ def repeated_checkpoint_run(tensor_dict,
"""
if max_number_of_evaluations and max_number_of_evaluations <= 0:
raise ValueError(
-'`number_of_steps` must be either None or a positive number.')
+'`max_number_of_evaluations` must be either None or a positive number.')
if max_evaluation_global_step and max_evaluation_global_step <= 0:
raise ValueError(
'`max_evaluation_global_step` must be either None or positive.')
if not checkpoint_dirs:
raise ValueError('`checkpoint_dirs` must have at least one entry.')
@@ -475,8 +505,13 @@ def repeated_checkpoint_run(tensor_dict,
save_graph,
save_graph_dir,
losses_dict=losses_dict,
-eval_export_path=eval_export_path)
+eval_export_path=eval_export_path,
+process_metrics_fn=process_metrics_fn)
write_metrics(metrics, global_step, summary_dir)
if (max_evaluation_global_step and
global_step >= max_evaluation_global_step):
tf.logging.info('Finished evaluation!')
break
number_of_evaluations += 1
if (max_number_of_evaluations and
...
@@ -39,6 +39,12 @@ and the following output nodes returned by the model.postprocess(..):
[batch, num_boxes] containing class scores for the detections.
* `detection_classes`: Outputs float32 tensors of the form
[batch, num_boxes] containing classes for the detections.
* `raw_detection_boxes`: Outputs float32 tensors of the form
[batch, raw_num_boxes, 4] containing detection boxes without
post-processing.
* `raw_detection_scores`: Outputs float32 tensors of the form
[batch, raw_num_boxes, num_classes_with_background] containing class score
logits for raw detection boxes.
* `detection_masks`: Outputs float32 tensors of the form
[batch, num_boxes, mask_height, mask_width] containing predicted instance
masks for each box if its present in the dictionary of postprocessed
...
@@ -154,7 +154,9 @@ def export_tflite_graph(pipeline_config,
max_detections,
max_classes_per_detection,
detections_per_class=100,
-use_regular_nms=False):
+use_regular_nms=False,
+binary_graph_name='tflite_graph.pb',
+txt_graph_name='tflite_graph.pbtxt'):
"""Exports a tflite compatible graph and anchors for ssd detection model.

Anchors are written to a tensor and tflite compatible graph
@@ -174,6 +176,8 @@ def export_tflite_graph(pipeline_config,
for NonMaxSuppression per class
use_regular_nms: Flag to set postprocessing op to use Regular NMS instead
of Fast NMS.
binary_graph_name: Name of the exported graph file in binary format.
txt_graph_name: Name of the exported graph file in text format.

Raises:
ValueError: if the pipeline config contains models other than ssd or uses an
@@ -304,9 +308,9 @@ def export_tflite_graph(pipeline_config,
# Return frozen without adding post-processing custom op
transformed_graph_def = frozen_graph_def
-binary_graph = os.path.join(output_dir, 'tflite_graph.pb')
+binary_graph = os.path.join(output_dir, binary_graph_name)
with tf.gfile.GFile(binary_graph, 'wb') as f:
f.write(transformed_graph_def.SerializeToString())
-txt_graph = os.path.join(output_dir, 'tflite_graph.pbtxt')
+txt_graph = os.path.join(output_dir, txt_graph_name)
with tf.gfile.GFile(txt_graph, 'w') as f:
f.write(str(transformed_graph_def))
...
@@ -19,11 +19,7 @@ import tempfile
import tensorflow as tf
from tensorflow.contrib.quantize.python import graph_matcher
from tensorflow.core.protobuf import saver_pb2
-from tensorflow.python.client import session
-from tensorflow.python.platform import gfile
-from tensorflow.python.saved_model import signature_constants
-from tensorflow.python.tools import freeze_graph
-from tensorflow.python.training import saver as saver_lib
+from tensorflow.python.tools import freeze_graph  # pylint: disable=g-direct-tensorflow-import
from object_detection.builders import graph_rewriter_builder
from object_detection.builders import model_builder
from object_detection.core import standard_fields as fields
@@ -73,7 +69,8 @@ def rewrite_nn_resize_op(is_quantized=False):
nn_resize = tf.image.resize_nearest_neighbor(
projection_op.outputs[0],
add_op.outputs[0].shape.dims[1:3],
-align_corners=False)
+align_corners=False,
+name=os.path.split(reshape_2_op.name)[0] + '/resize_nearest_neighbor')

for index, op_input in enumerate(add_op.inputs):
if op_input == reshape_2_op.outputs[0]:
@@ -207,6 +204,8 @@ def add_output_tensor_nodes(postprocessed_tensors,
label_id_offset = 1
boxes = postprocessed_tensors.get(detection_fields.detection_boxes)
scores = postprocessed_tensors.get(detection_fields.detection_scores)
raw_boxes = postprocessed_tensors.get(detection_fields.raw_detection_boxes)
raw_scores = postprocessed_tensors.get(detection_fields.raw_detection_scores)
classes = postprocessed_tensors.get(
detection_fields.detection_classes) + label_id_offset
keypoints = postprocessed_tensors.get(detection_fields.detection_keypoints)
@@ -221,6 +220,12 @@ def add_output_tensor_nodes(postprocessed_tensors,
classes, name=detection_fields.detection_classes)
outputs[detection_fields.num_detections] = tf.identity(
num_detections, name=detection_fields.num_detections)
if raw_boxes is not None:
outputs[detection_fields.raw_detection_boxes] = tf.identity(
raw_boxes, name=detection_fields.raw_detection_boxes)
if raw_scores is not None:
outputs[detection_fields.raw_detection_scores] = tf.identity(
raw_scores, name=detection_fields.raw_detection_scores)
if keypoints is not None:
outputs[detection_fields.detection_keypoints] = tf.identity(
keypoints, name=detection_fields.detection_keypoints)
@@ -252,7 +257,7 @@ def write_saved_model(saved_model_path,
outputs: A tensor dictionary containing the outputs of a DetectionModel.
"""
with tf.Graph().as_default():
-with session.Session() as sess:
+with tf.Session() as sess:
tf.import_graph_def(frozen_graph_def, name='')
@@ -268,12 +273,15 @@ def write_saved_model(saved_model_path,
tf.saved_model.signature_def_utils.build_signature_def(
inputs=tensor_info_inputs,
outputs=tensor_info_outputs,
-method_name=signature_constants.PREDICT_METHOD_NAME))
+method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME
+))

builder.add_meta_graph_and_variables(
-sess, [tf.saved_model.tag_constants.SERVING],
+sess,
+[tf.saved_model.tag_constants.SERVING],
signature_def_map={
-signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
+tf.saved_model.signature_constants
+.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
detection_signature,
},
)
@@ -289,9 +297,9 @@ def write_graph_and_checkpoint(inference_graph_def,
node.device = ''
with tf.Graph().as_default():
tf.import_graph_def(inference_graph_def, name='')
-with session.Session() as sess:
+with tf.Session() as sess:
-saver = saver_lib.Saver(saver_def=input_saver_def,
-save_relative_paths=True)
+saver = tf.train.Saver(
+saver_def=input_saver_def, save_relative_paths=True)
saver.restore(sess, trained_checkpoint_prefix)
saver.save(sess, model_path)
@@ -308,8 +316,8 @@ def _get_outputs_from_inputs(input_tensors, detection_model,
output_collection_name)

-def _build_detection_graph(input_type, detection_model, input_shape,
+def build_detection_graph(input_type, detection_model, input_shape,
output_collection_name, graph_hook_fn):
"""Build the detection graph."""
if input_type not in input_placeholder_fn_map:
raise ValueError('Unknown input type: {}'.format(input_type))
@@ -343,7 +351,8 @@ def _export_inference_graph(input_type,
input_shape=None,
output_collection_name='inference_op',
graph_hook_fn=None,
-write_inference_graph=False):
+write_inference_graph=False,
+temp_checkpoint_prefix=''):
"""Export helper."""
tf.gfile.MakeDirs(output_directory)
frozen_graph_path = os.path.join(output_directory,
@@ -351,7 +360,7 @@ def _export_inference_graph(input_type,
saved_model_path = os.path.join(output_directory, 'saved_model')
model_path = os.path.join(output_directory, 'model.ckpt')

-outputs, placeholder_tensor = _build_detection_graph(
+outputs, placeholder_tensor = build_detection_graph(
input_type=input_type,
detection_model=detection_model,
input_shape=input_shape,
@@ -361,12 +370,13 @@ def _export_inference_graph(input_type,
profile_inference_graph(tf.get_default_graph())
saver_kwargs = {}
if use_moving_averages:
-# This check is to be compatible with both version of SaverDef.
-if os.path.isfile(trained_checkpoint_prefix):
-saver_kwargs['write_version'] = saver_pb2.SaverDef.V1
-temp_checkpoint_prefix = tempfile.NamedTemporaryFile().name
-else:
-temp_checkpoint_prefix = tempfile.mkdtemp()
+if not temp_checkpoint_prefix:
+# This check is to be compatible with both version of SaverDef.
+if os.path.isfile(trained_checkpoint_prefix):
+saver_kwargs['write_version'] = saver_pb2.SaverDef.V1
+temp_checkpoint_prefix = tempfile.NamedTemporaryFile().name
+else:
+temp_checkpoint_prefix = tempfile.mkdtemp()
replace_variable_values_with_moving_averages(
tf.get_default_graph(), trained_checkpoint_prefix,
temp_checkpoint_prefix)
@@ -388,7 +398,7 @@ def _export_inference_graph(input_type,
'inference_graph.pbtxt')
for node in inference_graph_def.node:
node.device = ''
-with gfile.GFile(inference_graph_path, 'wb') as f:
+with tf.gfile.GFile(inference_graph_path, 'wb') as f:
f.write(str(inference_graph_def))

if additional_output_tensor_names is not None:
@@ -486,4 +496,3 @@ def profile_inference_graph(graph):
tf.contrib.tfprof.model_analyzer.print_model_analysis(
graph,
tfprof_options=tfprof_flops_option)
@@ -61,7 +61,14 @@ class FakeModel(model.DetectionModel):
[0.9, 0.0]], tf.float32),
'detection_classes': tf.constant([[0, 1],
[1, 0]], tf.float32),
-'num_detections': tf.constant([2, 1], tf.float32)
+'num_detections': tf.constant([2, 1], tf.float32),
'raw_detection_boxes': tf.constant([[[0.0, 0.0, 0.5, 0.5],
[0.5, 0.5, 0.8, 0.8]],
[[0.5, 0.5, 1.0, 1.0],
[0.0, 0.5, 0.0, 0.5]]],
tf.float32),
'raw_detection_scores': tf.constant([[0.7, 0.6],
[0.9, 0.5]], tf.float32),
}
if self._add_detection_keypoints:
postprocessed_tensors['detection_keypoints'] = tf.constant(
@@ -612,7 +619,7 @@ class ExportInferenceGraphTest(tf.test.TestCase):
pipeline_config.eval_config.use_moving_averages = False
detection_model = model_builder.build(pipeline_config.model,
is_training=False)
-outputs, _ = exporter._build_detection_graph(
+outputs, _ = exporter.build_detection_graph(
input_type='tf_example',
detection_model=detection_model,
input_shape=None,
@@ -760,7 +767,7 @@ class ExportInferenceGraphTest(tf.test.TestCase):
pipeline_config.eval_config.use_moving_averages = False
detection_model = model_builder.build(pipeline_config.model,
is_training=False)
-outputs, placeholder_tensor = exporter._build_detection_graph(
+outputs, placeholder_tensor = exporter.build_detection_graph(
input_type='tf_example',
detection_model=detection_model,
input_shape=None,
@@ -893,7 +900,7 @@ class ExportInferenceGraphTest(tf.test.TestCase):
pipeline_config.eval_config.use_moving_averages = False
detection_model = model_builder.build(pipeline_config.model,
is_training=False)
-exporter._build_detection_graph(
+exporter.build_detection_graph(
input_type='tf_example',
detection_model=detection_model,
input_shape=None,
@@ -917,13 +924,16 @@ class ExportInferenceGraphTest(tf.test.TestCase):
tf_example = od_graph.get_tensor_by_name('tf_example:0')
boxes = od_graph.get_tensor_by_name('detection_boxes:0')
scores = od_graph.get_tensor_by_name('detection_scores:0')
raw_boxes = od_graph.get_tensor_by_name('raw_detection_boxes:0')
raw_scores = od_graph.get_tensor_by_name('raw_detection_scores:0')
classes = od_graph.get_tensor_by_name('detection_classes:0')
keypoints = od_graph.get_tensor_by_name('detection_keypoints:0')
masks = od_graph.get_tensor_by_name('detection_masks:0')
num_detections = od_graph.get_tensor_by_name('num_detections:0')
-(boxes_np, scores_np, classes_np, keypoints_np, masks_np,
-num_detections_np) = sess.run(
-[boxes, scores, classes, keypoints, masks, num_detections],
+(boxes_np, scores_np, raw_boxes_np, raw_scores_np, classes_np,
+keypoints_np, masks_np, num_detections_np) = sess.run(
+[boxes, scores, raw_boxes, raw_scores, classes, keypoints, masks,
+num_detections],
feed_dict={tf_example: tf_example_np})
self.assertAllClose(boxes_np, [[[0.0, 0.0, 0.5, 0.5],
[0.5, 0.5, 0.8, 0.8]],
@@ -931,6 +941,12 @@ class ExportInferenceGraphTest(tf.test.TestCase):
[0.0, 0.0, 0.0, 0.0]]])
self.assertAllClose(scores_np, [[0.7, 0.6],
[0.9, 0.0]])
self.assertAllClose(raw_boxes_np, [[[0.0, 0.0, 0.5, 0.5],
[0.5, 0.5, 0.8, 0.8]],
[[0.5, 0.5, 1.0, 1.0],
[0.0, 0.5, 0.0, 0.5]]])
self.assertAllClose(raw_scores_np, [[0.7, 0.6],
[0.9, 0.5]])
self.assertAllClose(classes_np, [[1, 2], self.assertAllClose(classes_np, [[1, 2],
[2, 1]]) [2, 1]])
self.assertAllClose(keypoints_np, np.arange(48).reshape([2, 2, 6, 2])) self.assertAllClose(keypoints_np, np.arange(48).reshape([2, 2, 6, 2]))
......
@@ -57,7 +57,7 @@ Some remarks on frozen inference graphs:
a detector (and discarding the part past that point), which negatively impacts
standard mAP metrics.
* Our frozen inference graphs are generated using the
  [v1.12.0](https://github.com/tensorflow/tensorflow/tree/v1.12.0)
  release version of Tensorflow and we do not guarantee that these will work
  with other versions; this being said, each frozen inference graph can be
  regenerated using your current version of Tensorflow by re-running the
@@ -78,7 +78,7 @@ Some remarks on frozen inference graphs:
| [ssd_mobilenet_v1_fpn_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03.tar.gz) | 56 | 32 | Boxes |
| [ssd_resnet_50_fpn_coco ☆](http://download.tensorflow.org/models/object_detection/ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03.tar.gz) | 76 | 35 | Boxes |
| [ssd_mobilenet_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz) | 31 | 22 | Boxes |
| [ssd_mobilenet_v2_quantized_coco](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03.tar.gz) | 29 | 22 | Boxes |
| [ssdlite_mobilenet_v2_coco](http://download.tensorflow.org/models/object_detection/ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz) | 27 | 22 | Boxes |
| [ssd_inception_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2018_01_28.tar.gz) | 42 | 24 | Boxes |
| [faster_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2018_01_28.tar.gz) | 58 | 28 | Boxes |
@@ -110,10 +110,15 @@ Model name
Model name | Speed (ms) | Open Images mAP@0.5[^2] | Outputs
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------: | :---------------------: | :-----:
[faster_rcnn_inception_resnet_v2_atrous_oidv2](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28.tar.gz) | 727 | 37 | Boxes
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oidv2](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes
[facessd_mobilenet_v2_quantized_open_image_v4](http://download.tensorflow.org/models/object_detection/facessd_mobilenet_v2_quantized_320x320_open_image_v4.tar.gz) [^3] | 20 | 73 (faces) | Boxes

Model name | Speed (ms) | Open Images mAP@0.5[^4] | Outputs
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------: | :---------------------: | :-----:
[faster_rcnn_inception_resnet_v2_atrous_oidv4](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_v4_2018_12_12.tar.gz) | 425 | 54 | Boxes
[ssd_mobilenetv2_oidv4](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_oid_v4_2018_12_12.tar.gz) | 89 | 36 | Boxes
[ssd_resnet_101_fpn_oidv4](http://download.tensorflow.org/models/object_detection/ssd_resnet101_v1_fpn_shared_box_predictor_oid_512x512_sync_2019_01_20.tar.gz) | 237 | 38 | Boxes

## iNaturalist Species-trained models

Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
@@ -129,8 +134,10 @@ Model name
[faster_rcnn_resnet101_ava_v2.1](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_ava_v2.1_2018_04_30.tar.gz) | 93 | 11 | Boxes

[^1]: See [MSCOCO evaluation protocol](http://cocodataset.org/#detections-eval). The COCO mAP numbers here are evaluated on the COCO 14 minival set (note that our split is different from COCO 17 Val). A full list of image ids used in our split can be found [here](https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_minival_ids.txt).
[^2]: This is PASCAL mAP with a slightly different way of computing true positives: see [Open Images evaluation protocols](evaluation_protocols.md), oid_V2_detection_metrics.
[^3]: Non-face boxes are dropped during training and non-face groundtruth boxes are ignored when evaluating.
[^4]: This is the Open Images Challenge metric: see [Open Images evaluation protocols](evaluation_protocols.md), oid_challenge_detection_metrics.
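
The models above are distributed as tar.gz archives. As an illustration (not part of the original docs), one way to fetch and unpack a model in Python, using the ssd_mobilenetv2_oidv4 URL from the table above; archives typically contain a frozen_inference_graph.pb, checkpoint files and a pipeline.config:

```python
import tarfile
import urllib.request  # Python 3

MODEL_URL = ('http://download.tensorflow.org/models/object_detection/'
             'ssd_mobilenet_v2_oid_v4_2018_12_12.tar.gz')

# Download the archive into the current working directory and unpack it.
archive_path, _ = urllib.request.urlretrieve(
    MODEL_URL, 'ssd_mobilenet_v2_oid_v4_2018_12_12.tar.gz')
with tarfile.open(archive_path) as tar:
    tar.extractall('.')
```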
@@ -11,7 +11,7 @@ Tensorflow Object Detection API depends on the following libraries:
* tf Slim (which is included in the "tensorflow/models/research/" checkout)
* Jupyter notebook
* Matplotlib
* Tensorflow (>=1.12.0; see the version check sketch after this list)
* Cython
* contextlib2
* cocoapi
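
As a quick sanity check, the Tensorflow requirement above can be verified at runtime. This is an illustrative sketch, not part of the official installation steps:

```python
# Illustrative sketch: verify the Tensorflow (>=1.12.0) requirement above.
from distutils.version import LooseVersion

import tensorflow as tf

assert LooseVersion(tf.__version__) >= LooseVersion('1.12.0'), (
    'Tensorflow >= 1.12.0 is required, found %s' % tf.__version__)
```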
...
@@ -44,7 +44,7 @@ job using GPUs. A sample YAML file is given below:
```
trainingInput:
  runtimeVersion: "1.12"
  scaleTier: CUSTOM
  masterType: standard_gpu
  workerCount: 9
@@ -73,7 +73,7 @@ following command:
```bash
# From tensorflow/models/research/
gcloud ml-engine jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version 1.12 \
    --job-dir=gs://${MODEL_DIR} \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
@@ -93,7 +93,7 @@ Google Cloud Storage.
Users can monitor the progress of their training job on the [ML Engine
Dashboard](https://console.cloud.google.com/mlengine/jobs).

Note: This sample is supported for use with the 1.12 runtime version.

## Running a TPU Training Job on CMLE
@@ -105,7 +105,7 @@ gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_
    --job-dir=gs://${MODEL_DIR} \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
    --module-name object_detection.model_tpu_main \
    --runtime-version 1.12 \
    --scale-tier BASIC_TPU \
    --region us-central1 \
    -- \
@@ -133,7 +133,7 @@ job:
```bash
gcloud ml-engine jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version 1.12 \
    --job-dir=gs://${MODEL_DIR} \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
...
@@ -208,7 +208,7 @@ For running the training Cloud ML job, we'll configure the cluster to use 5
training jobs and three parameter servers. The
configuration file can be found at `object_detection/samples/cloud/cloud.yml`.

Note: The code sample below is supported for use with the 1.12 runtime version.

To start training and evaluation, execute the following command from the
`tensorflow/models/research/` directory:
@@ -216,7 +216,7 @@ To start training and evaluation, execute the following command from the
```bash
# From tensorflow/models/research/
gcloud ml-engine jobs submit training `whoami`_object_detection_pets_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version 1.12 \
    --job-dir=gs://${YOUR_GCS_BUCKET}/model_dir \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
...
@@ -76,7 +76,7 @@ def create_cat_tf_example(encoded_cat_image_data):
'image/width': dataset_util.int64_feature(width),
'image/filename': dataset_util.bytes_feature(filename),
'image/source_id': dataset_util.bytes_feature(filename),
'image/encoded': dataset_util.bytes_feature(encoded_image_data),
'image/format': dataset_util.bytes_feature(image_format),
'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
...
@@ -53,6 +53,7 @@ def transform_input_data(tensor_dict,
data_augmentation_fn=None,
merge_multiple_boxes=False,
retain_original_image=False,
use_multiclass_scores=False,
use_bfloat16=False):
"""A single function that is responsible for all input data transformations.
@@ -87,25 +88,37 @@
and classes for a given image if the boxes are exactly the same.
retain_original_image: (optional) whether to retain original image in the
output dictionary.
use_multiclass_scores: whether to use multiclass scores as
class targets instead of one-hot encoding of `groundtruth_classes`.
use_bfloat16: (optional) a bool, whether to use bfloat16 in training.

Returns:
A dictionary keyed by fields.InputDataFields containing the tensors obtained
after applying all the transformations.
"""
# Reshape flattened multiclass scores tensor into a 2D tensor of shape
# [num_boxes, num_classes].
if fields.InputDataFields.multiclass_scores in tensor_dict:
tensor_dict[fields.InputDataFields.multiclass_scores] = tf.reshape(
tensor_dict[fields.InputDataFields.multiclass_scores], [
tf.shape(tensor_dict[fields.InputDataFields.groundtruth_boxes])[0],
num_classes
])
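# For example (illustrative shapes): 6 flattened scores with 2 groundtruth
# boxes and num_classes=3 are reshaped into a [2, 3] score matrix.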
if fields.InputDataFields.groundtruth_boxes in tensor_dict:
  tensor_dict = util_ops.filter_groundtruth_with_nan_box_coordinates(
      tensor_dict)
tensor_dict = util_ops.filter_unrecognized_classes(tensor_dict)
if retain_original_image:
  tensor_dict[fields.InputDataFields.original_image] = tf.cast(
      image_resizer_fn(tensor_dict[fields.InputDataFields.image], None)[0],
      tf.uint8)
if fields.InputDataFields.image_additional_channels in tensor_dict:
channels = tensor_dict[fields.InputDataFields.image_additional_channels]
tensor_dict[fields.InputDataFields.image] = tf.concat(
[tensor_dict[fields.InputDataFields.image], channels], axis=2)
# Apply data augmentation ops.
if data_augmentation_fn is not None:
  tensor_dict = data_augmentation_fn(tensor_dict)
@@ -136,6 +149,11 @@ def transform_input_data(tensor_dict,
tensor_dict[fields.InputDataFields.groundtruth_classes] = tf.one_hot(
    zero_indexed_groundtruth_classes, num_classes)
if use_multiclass_scores:
tensor_dict[fields.InputDataFields.groundtruth_classes] = tensor_dict[
fields.InputDataFields.multiclass_scores]
tensor_dict.pop(fields.InputDataFields.multiclass_scores, None)
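# Illustrative values: with three classes, a one-hot target such as [0, 0, 1]
# is replaced by the box's multiclass scores, e.g. [0.1, 0.2, 0.7], giving
# soft class targets downstream.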
if fields.InputDataFields.groundtruth_confidences in tensor_dict:
  groundtruth_confidences = tensor_dict[
      fields.InputDataFields.groundtruth_confidences]
@@ -172,6 +190,9 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
spatial_image_shape=None):
"""Pads input tensors to static shapes.
In case num_additional_channels > 0, we assume that the additional channels
have already been concatenated to the base image.
Args:
tensor_dict: Tensor dictionary of input data
max_num_boxes: Max number of groundtruth boxes needed to compute shapes for
@@ -186,7 +207,8 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
tensors in the dataset.

Raises:
ValueError: If groundtruth classes is neither rank 1 nor rank 2, or if we
detect that additional channels have not been concatenated yet.
""" """
if not spatial_image_shape or spatial_image_shape == [-1, -1]: if not spatial_image_shape or spatial_image_shape == [-1, -1]:
...@@ -198,14 +220,27 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes, ...@@ -198,14 +220,27 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
if fields.InputDataFields.image_additional_channels in tensor_dict: if fields.InputDataFields.image_additional_channels in tensor_dict:
num_additional_channels = tensor_dict[ num_additional_channels = tensor_dict[
fields.InputDataFields.image_additional_channels].shape[2].value fields.InputDataFields.image_additional_channels].shape[2].value
num_image_channels = 3
# We assume that if num_additional_channels > 0, then it has already been
# concatenated to the base image (but not the ground truth).
num_channels = 3
if fields.InputDataFields.image in tensor_dict:
  num_channels = tensor_dict[fields.InputDataFields.image].shape[2].value
if num_additional_channels:
if num_additional_channels >= num_channels:
raise ValueError(
'Image must be already concatenated with additional channels.')
if (fields.InputDataFields.original_image in tensor_dict and
tensor_dict[fields.InputDataFields.original_image].shape[2].value ==
num_channels):
raise ValueError(
'Image must be already concatenated with additional channels.')
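# Example (illustrative): an RGB image with 2 additional channels must arrive
# here with shape [H, W, 5]; original_image, captured before concatenation,
# keeps its 3 channels.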
padding_shapes = {
# Additional channels are merged before batching.
    fields.InputDataFields.image: [
        height, width, num_channels
    ],
    fields.InputDataFields.original_image_spatial_shape: [2],
    fields.InputDataFields.image_additional_channels: [
@@ -231,16 +266,14 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
    fields.InputDataFields.groundtruth_label_types: [max_num_boxes],
    fields.InputDataFields.groundtruth_label_weights: [max_num_boxes],
    fields.InputDataFields.true_image_shape: [3],
    fields.InputDataFields.groundtruth_image_classes: [num_classes],
    fields.InputDataFields.groundtruth_image_confidences: [num_classes],
}
if fields.InputDataFields.original_image in tensor_dict:
  padding_shapes[fields.InputDataFields.original_image] = [
      height, width,
      tensor_dict[fields.InputDataFields.original_image].shape[2].value
  ]
if fields.InputDataFields.groundtruth_keypoints in tensor_dict:
  tensor_shape = (
@@ -294,11 +327,14 @@ def augment_input_data(tensor_dict, data_augmentation_options):
in tensor_dict)
include_label_confidences = (fields.InputDataFields.groundtruth_confidences
                             in tensor_dict)
include_multiclass_scores = (fields.InputDataFields.multiclass_scores in
tensor_dict)
tensor_dict = preprocessor.preprocess(
    tensor_dict, data_augmentation_options,
    func_arg_map=preprocessor.get_default_func_arg_map(
        include_label_weights=include_label_weights,
        include_label_confidences=include_label_confidences,
        include_multiclass_scores=include_multiclass_scores,
        include_instance_masks=include_instance_masks,
        include_keypoints=include_keypoints))
tensor_dict[fields.InputDataFields.image] = tf.squeeze(
@@ -472,6 +508,7 @@ def create_train_input_fn(train_config, train_input_config,
data_augmentation_fn=data_augmentation_fn,
merge_multiple_boxes=train_config.merge_multiple_label_boxes,
retain_original_image=train_config.retain_original_images,
use_multiclass_scores=train_config.use_multiclass_scores,
use_bfloat16=train_config.use_bfloat16)
tensor_dict = pad_input_data_to_static_shapes(
...
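
To recap the inputs.py changes above: transform_input_data now concatenates any additional channels onto the image, and pad_input_data_to_static_shapes expects to receive the already-merged tensor. A minimal sketch of the new contract (shapes mirror the tests that follow; this is an illustration, not code from this commit):

```python
import tensorflow as tf

from object_detection import inputs
from object_detection.core import standard_fields as fields

# A 3-channel image with 2 additional channels must be merged into a
# 5-channel tensor before static-shape padding.
image = tf.placeholder(tf.float32, [None, None, 3])
extra = tf.placeholder(tf.float32, [None, None, 2])
merged = tf.concat([image, extra], axis=2)  # shape [None, None, 5]

padded = inputs.pad_input_data_to_static_shapes(
    tensor_dict={
        fields.InputDataFields.image: merged,
        fields.InputDataFields.image_additional_channels: extra,
    },
    max_num_boxes=3,
    num_classes=3,
    spatial_image_shape=[5, 6])
# Passing the unmerged 3-channel image together with a retained
# original_image would trigger the new ValueError (see
# test_images_and_additional_channels_errors below).
```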
@@ -105,6 +105,48 @@ class InputsTest(test_case.TestCase, parameterized.TestCase):
tf.float32,
labels[fields.InputDataFields.groundtruth_confidences].dtype)
def test_faster_rcnn_resnet50_train_input_with_additional_channels(self):
"""Tests the training input function for FasterRcnnResnet50."""
configs = _get_configs_for_model('faster_rcnn_resnet50_pets')
model_config = configs['model']
configs['train_input_config'].num_additional_channels = 2
configs['train_config'].retain_original_images = True
model_config.faster_rcnn.num_classes = 37
train_input_fn = inputs.create_train_input_fn(
configs['train_config'], configs['train_input_config'], model_config)
features, labels = _make_initializable_iterator(train_input_fn()).get_next()
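# The batched image should have 3 RGB channels plus the 2 additional
# channels configured above, i.e. 5 channels in total.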
self.assertAllEqual([1, None, None, 5],
features[fields.InputDataFields.image].shape.as_list())
self.assertAllEqual(
[1, None, None, 3],
features[fields.InputDataFields.original_image].shape.as_list())
self.assertEqual(tf.float32, features[fields.InputDataFields.image].dtype)
self.assertAllEqual([1],
features[inputs.HASH_KEY].shape.as_list())
self.assertEqual(tf.int32, features[inputs.HASH_KEY].dtype)
self.assertAllEqual(
[1, 100, 4],
labels[fields.InputDataFields.groundtruth_boxes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_boxes].dtype)
self.assertAllEqual(
[1, 100, model_config.faster_rcnn.num_classes],
labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[1, 100],
labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_weights].dtype)
self.assertAllEqual(
[1, 100, model_config.faster_rcnn.num_classes],
labels[fields.InputDataFields.groundtruth_confidences].shape.as_list())
self.assertEqual(
tf.float32,
labels[fields.InputDataFields.groundtruth_confidences].dtype)
@parameterized.parameters(
    {'eval_batch_size': 1},
    {'eval_batch_size': 8}
@@ -595,6 +637,72 @@ class DataTransformationFnTest(test_case.TestCase):
transformed_inputs[fields.InputDataFields.groundtruth_confidences],
[[0, 0, 1], [1, 0, 0]])
def test_returns_correct_labels_with_unrecognized_class(self):
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np.random.rand(4, 4, 3).astype(np.float32)),
fields.InputDataFields.groundtruth_boxes:
tf.constant(
np.array([[0, 0, 1, 1], [.2, .2, 4, 4], [.5, .5, 1, 1]],
np.float32)),
fields.InputDataFields.groundtruth_area:
tf.constant(np.array([.5, .4, .3])),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([3, -1, 1], np.int32)),
fields.InputDataFields.groundtruth_keypoints:
tf.constant(
np.array([[[.1, .1]], [[.2, .2]], [[.5, .5]]],
np.float32)),
fields.InputDataFields.groundtruth_keypoint_visibilities:
tf.constant([True, False, True]),
fields.InputDataFields.groundtruth_instance_masks:
tf.constant(np.random.rand(3, 4, 4).astype(np.float32)),
fields.InputDataFields.groundtruth_is_crowd:
tf.constant([False, True, False]),
fields.InputDataFields.groundtruth_difficult:
tf.constant(np.array([0, 0, 1], np.int32))
}
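# The box at index 1 has class label -1, which is unrecognized;
# filter_unrecognized_classes should drop that row from every
# groundtruth field checked below.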
num_classes = 3
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=num_classes)
with self.test_session() as sess:
transformed_inputs = sess.run(
input_transformation_fn(tensor_dict=tensor_dict))
self.assertAllClose(
transformed_inputs[fields.InputDataFields.groundtruth_classes],
[[0, 0, 1], [1, 0, 0]])
self.assertAllEqual(
transformed_inputs[fields.InputDataFields.num_groundtruth_boxes], 2)
self.assertAllClose(
transformed_inputs[fields.InputDataFields.groundtruth_area], [.5, .3])
self.assertAllEqual(
transformed_inputs[fields.InputDataFields.groundtruth_confidences],
[[0, 0, 1], [1, 0, 0]])
self.assertAllClose(
transformed_inputs[fields.InputDataFields.groundtruth_boxes],
[[0, 0, 1, 1], [.5, .5, 1, 1]])
self.assertAllClose(
transformed_inputs[fields.InputDataFields.groundtruth_keypoints],
[[[.1, .1]], [[.5, .5]]])
self.assertAllEqual(
transformed_inputs[
fields.InputDataFields.groundtruth_keypoint_visibilities],
[True, True])
self.assertAllEqual(
transformed_inputs[
fields.InputDataFields.groundtruth_instance_masks].shape, [2, 4, 4])
self.assertAllEqual(
transformed_inputs[fields.InputDataFields.groundtruth_is_crowd],
[False, False])
self.assertAllEqual(
transformed_inputs[fields.InputDataFields.groundtruth_difficult],
[0, 1])
def test_returns_correct_merged_boxes(self):
  tensor_dict = {
      fields.InputDataFields.image:
@@ -885,7 +993,7 @@ class PadInputDataToStaticShapesFnTest(test_case.TestCase):
def test_images_and_additional_channels(self):
  input_tensor_dict = {
      fields.InputDataFields.image:
          tf.placeholder(tf.float32, [None, None, 5]),
      fields.InputDataFields.image_additional_channels:
          tf.placeholder(tf.float32, [None, None, 2]),
  }
@@ -895,6 +1003,8 @@ class PadInputDataToStaticShapesFnTest(test_case.TestCase):
      num_classes=3,
      spatial_image_shape=[5, 6])
# pad_input_data_to_static_shape assumes that image is already concatenated
# with additional channels.
self.assertAllEqual(
    padded_tensor_dict[fields.InputDataFields.image].shape.as_list(),
    [5, 6, 5])
@@ -902,6 +1012,22 @@ class PadInputDataToStaticShapesFnTest(test_case.TestCase):
    padded_tensor_dict[fields.InputDataFields.image_additional_channels]
    .shape.as_list(), [5, 6, 2])
def test_images_and_additional_channels_errors(self):
input_tensor_dict = {
fields.InputDataFields.image:
tf.placeholder(tf.float32, [None, None, 3]),
fields.InputDataFields.image_additional_channels:
tf.placeholder(tf.float32, [None, None, 2]),
fields.InputDataFields.original_image:
tf.placeholder(tf.float32, [None, None, 3]),
}
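# The image still has only 3 channels although 2 additional channels are
# present, i.e. they were never concatenated, so padding should raise.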
with self.assertRaises(ValueError):
_ = inputs.pad_input_data_to_static_shapes(
tensor_dict=input_tensor_dict,
max_num_boxes=3,
num_classes=3,
spatial_image_shape=[5, 6])
def test_gray_images(self):
  input_tensor_dict = {
      fields.InputDataFields.image:
@@ -920,10 +1046,12 @@ class PadInputDataToStaticShapesFnTest(test_case.TestCase):
def test_gray_images_and_additional_channels(self):
  input_tensor_dict = {
      fields.InputDataFields.image:
          tf.placeholder(tf.float32, [None, None, 3]),
      fields.InputDataFields.image_additional_channels:
          tf.placeholder(tf.float32, [None, None, 2]),
  }
# pad_input_data_to_static_shape assumes that image is already concatenated
# with additional channels.
padded_tensor_dict = inputs.pad_input_data_to_static_shapes(
    tensor_dict=input_tensor_dict,
    max_num_boxes=3,
...