Commit a1337e01 authored by Zhichao Lu, committed by pkulzc

Merged commit includes the following changes:

223075771  by lzc:

    Bring in external fixes.

--
222919755  by ronnyvotel:

    Bug fix in the Faster R-CNN model builder: it was previously passing `inplace_batchnorm_update` where `reuse_weights` was expected.

--
222885680  by Zhichao Lu:

    Use result_dict_for_batched_example in models_lib.
    Also fixes the visualization size when eval runs on GPU.

--
222883648  by Zhichao Lu:

    Fix _unmatched_class_label for the _add_background_class == False case in ssd_meta_arch.py.

--
222836663  by Zhichao Lu:

    Adding support for visualizing grayscale images. Without this change, the images are black-red instead of grayscale.

--
222501978  by Zhichao Lu:

    Fix a bug that caused convert_to_grayscale flag not to be respected.

--
222432846  by richardmunoz:

    Fix mapping of groundtruth_confidences from shape [num_boxes] to [num_boxes, num_classes] when the input contains the groundtruth_confidences field.

--
221725755  by richardmunoz:

    Internal change.

--
221458536  by Zhichao Lu:

    Fix saver defer build bug in object detection train codepath.

--
221391590  by Zhichao Lu:

    Add support for group normalization in the object detection API. Just adding MobileNet-v1 SSD currently. This may serve as a road map for other models that wish to support group normalization as an option.

--
221367993  by Zhichao Lu:

    Bug fixes: (1) make RandomPadImage work; (2) fix keep_checkpoint_every_n_hours.

--
221266403  by rathodv:

    Use detection boxes as proposals to compute correct mask loss in eval jobs.

--
220845934  by lzc:

    Internal change.

--
220778850  by Zhichao Lu:

    Incorporating existing metrics into the Estimator framework.
    Should restore:
    - oid_challenge_detection_metrics
    - pascal_voc_detection_metrics
    - weighted_pascal_voc_detection_metrics
    - pascal_voc_instance_segmentation_metrics
    - weighted_pascal_voc_instance_segmentation_metrics
    - oid_V2_detection_metrics

--
220370391  by alirezafathi:

    Adding precision and recall to the metrics.

--
220321268  by Zhichao Lu:

    Allow the option of setting max_examples_to_draw to zero.

--
220193337  by Zhichao Lu:

    This CL fixes a bug where the Keras convolutional box predictor was applying heads in the non-deterministic dict order. The consequence of this bug was that variables were created in non-deterministic orders. This in turn led different workers in a multi-gpu training setup to have slightly different graphs which had variables assigned to mismatched parameter servers. As a result, roughly half of all workers were unable to initialize and did no work, and training time was slowed down approximately 2x.
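    For illustration only (not code from this commit), a minimal sketch of the kind of fix described above; the function and variable names here are hypothetical. Iterating the heads dict in sorted key order makes variable creation deterministic across workers:

```python
import tensorflow as tf  # TF 1.x, matching the codebase


def apply_heads_deterministically(prediction_heads, image_feature):
  """Hypothetical sketch: apply each prediction head in sorted order.

  Iterating a plain dict was the bug: pre-3.7 Python dict order is
  arbitrary, so workers could create head variables in different orders
  and end up with mismatched parameter-server assignments.
  """
  predictions = {}
  for head_name in sorted(prediction_heads):
    predictions[head_name] = prediction_heads[head_name](image_feature)
  return predictions
```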

--
220136508  by huizhongc:

    Add weight equalization loss to SSD meta arch.

--
220125875  by pengchong:

    Rename label_scores to label_weights

--
219730108  by Zhichao Lu:

    Add description of detection_keypoints in postprocessed_tensors to docstring.

--
219577519  by pengchong:

    Support parsing the class confidences and training using them.

--
219547611  by lzc:

    Stop using static shapes in GPU eval jobs.

--
219536476  by Zhichao Lu:

    Migrate TensorFlow Lite out of tensorflow/contrib

    This change moves //tensorflow/contrib/lite to //tensorflow/lite in preparation
    for TensorFlow 2.0's deprecation of contrib/. If you refer to TF Lite build
    targets or headers, you will need to update them manually. If you use TF Lite
    from the TensorFlow python package, "tf.contrib.lite" now points to "tf.lite".
    Please update your imports as soon as possible.

    For more details, see https://groups.google.com/a/tensorflow.org/forum/#!topic/tflite/iIIXOTOFvwQ

    @angersson and @aselle are conducting this migration. Please contact them if
    you have any further questions.
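    For example, callers typically only need an import-path update (a sketch, assuming a TF release where the `tf.lite` alias is available):

```python
import tensorflow as tf

# Before: converter = tf.contrib.lite.TFLiteConverter.from_saved_model(path)
# After the migration, the same converter lives under tf.lite:
converter = tf.lite.TFLiteConverter.from_saved_model('/tmp/saved_model')
tflite_model = converter.convert()
```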

--
219190083  by Zhichao Lu:

    Add a second expected_loss_weights function that uses an alternative expectation calculation from the previous one. Integrate this op into ssd_meta_arch and the losses builder. Callers of losses_builder.build are updated to handle the additional returned element.

--
218924451  by pengchong:

    Add a new way to assign training targets using groundtruth confidences.

--
218760524  by chowdhery:

    Modify export script to add option for regular NMS in TFLite post-processing op.

--

PiperOrigin-RevId: 223075771
parent 2c680af3
@@ -23,6 +23,44 @@ message Loss {
   // If not left to default, applies random example sampling.
   optional RandomExampleSampler random_example_sampler = 6;
 
+  // Equalization loss.
+  message EqualizationLoss {
+    // Weight equalization loss strength.
+    optional float weight = 1 [default=0.0];
+
+    // When computing equalization loss, ops that start with
+    // equalization_exclude_prefixes will be ignored. Only used when
+    // equalization_weight > 0.
+    repeated string exclude_prefixes = 2;
+  }
+  optional EqualizationLoss equalization_loss = 7;
+
+  enum ExpectedLossWeights {
+    NONE = 0;
+    // Use expected_classification_loss_by_expected_sampling
+    // from third_party/tensorflow_models/object_detection/utils/ops.py
+    EXPECTED_SAMPLING = 1;
+    // Use expected_classification_loss_by_reweighting_unmatched_anchors
+    // from third_party/tensorflow_models/object_detection/utils/ops.py
+    REWEIGHTING_UNMATCHED_ANCHORS = 2;
+  }
+  // Method to compute expected loss weights with respect to balanced
+  // positive/negative sampling scheme. If NONE, use explicit sampling.
+  // TODO(birdbrain): Move under ExpectedLossWeights.
+  optional ExpectedLossWeights expected_loss_weights = 18 [default = NONE];
+
+  // Minimum number of effective negative samples.
+  // Only applies if expected_loss_weights is not NONE.
+  // TODO(birdbrain): Move under ExpectedLossWeights.
+  optional float min_num_negative_samples = 19 [default=0];
+
+  // Desired number of effective negative samples per positive sample.
+  // Only applies if expected_loss_weights is not NONE.
+  // TODO(birdbrain): Move under ExpectedLossWeights.
+  optional float desired_negative_sampling_ratio = 20 [default=3];
 }
 
 // Configuration for bounding box localization loss function.
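A hedged sketch of setting the new Loss fields from Python through the generated bindings; the field values here are illustrative, not recommendations:

```python
from google.protobuf import text_format

from object_detection.protos import losses_pb2

# Values are illustrative. EXPECTED_SAMPLING selects
# expected_classification_loss_by_expected_sampling in utils/ops.py.
loss_config = text_format.Parse(
    """
    equalization_loss {
      weight: 0.01
      exclude_prefixes: "FeatureExtractor"
    }
    expected_loss_weights: EXPECTED_SAMPLING
    min_num_negative_samples: 1.0
    desired_negative_sampling_ratio: 3.0
    """, losses_pb2.Loss())
```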
@@ -166,13 +166,13 @@ message RandomCropImage {
 message RandomPadImage {
   // Minimum dimensions for padded image. If unset, will use original image
   // dimension as a lower bound.
-  optional float min_image_height = 1;
-  optional float min_image_width = 2;
+  optional int32 min_image_height = 1;
+  optional int32 min_image_width = 2;
 
   // Maximum dimensions for padded image. If unset, will use double the original
   // image dimension as a lower bound.
-  optional float max_image_height = 3;
-  optional float max_image_width = 4;
+  optional int32 max_image_height = 3;
+  optional int32 max_image_width = 4;
 
   // Color of the padding. If unset, will pad using average color of the input
   // image.
@@ -12,7 +12,7 @@ import "object_detection/protos/post_processing.proto";
 import "object_detection/protos/region_similarity_calculator.proto";
 
 // Configuration for Single Shot Detection (SSD) models.
-// Next id: 22
+// Next id: 26
 message Ssd {
 
   // Number of classes to predict.
@@ -35,7 +35,7 @@ message Ssd {
   // Whether background targets are to be encoded as an all
   // zeros vector or a one-hot vector (where background is the 0th class).
-  optional bool encode_background_as_zeros = 12 [default=false];
+  optional bool encode_background_as_zeros = 12 [default = false];
 
   // classification weight to be associated to negative
   // anchors (default: 1.0). The weight must be in [0., 1.].
@@ -52,11 +52,11 @@ message Ssd {
   // Whether to normalize the loss by number of groundtruth boxes that match to
   // the anchors.
-  optional bool normalize_loss_by_num_matches = 10 [default=true];
+  optional bool normalize_loss_by_num_matches = 10 [default = true];
 
   // Whether to normalize the localization loss by the code size of the box
   // encodings. This is applied along with other normalization factors.
-  optional bool normalize_loc_loss_by_codesize = 14 [default=false];
+  optional bool normalize_loc_loss_by_codesize = 14 [default = false];
 
   // Loss configuration for training.
   optional Loss loss = 11;
@@ -82,29 +82,66 @@ message Ssd {
   // to update the batch norm moving average parameters.
   optional bool inplace_batchnorm_update = 15 [default = false];
 
-  // Whether to weight the regression loss by the score of the ground truth box
-  // the anchor matches to.
-  optional bool weight_regression_loss_by_score = 17 [default=false];
-
-  // Whether to compute expected loss with respect to balanced positive/negative
-  // sampling scheme. If false, use explicit sampling.
-  optional bool use_expected_classification_loss_under_sampling = 18 [default=false];
-
-  // Minimum number of effective negative samples.
-  // Only applies if use_expected_classification_loss_under_sampling is true.
-  optional float min_num_negative_samples = 19 [default=0];
-
-  // Desired number of effective negative samples per positive sample.
-  // Only applies if use_expected_classification_loss_under_sampling is true.
-  optional float desired_negative_sampling_ratio = 20 [default=3];
-
   // Whether to add an implicit background class to one-hot encodings of
-  // groundtruth labels. Set to false if using groundtruth labels with an
-  // explicit background class, using multiclass scores, or if training a single
-  // class model.
+  // groundtruth labels. Set to false if training a single
+  // class model or using an explicit background class.
   optional bool add_background_class = 21 [default = true];
+
+  // Whether to use an explicit background class. Set to true if using
+  // groundtruth labels with an explicit background class, as in multiclass
+  // scores.
+  optional bool explicit_background_class = 24 [default = false];
+
+  optional bool use_confidences_as_targets = 22 [default = false];
+
+  optional float implicit_example_weight = 23 [default = 1.0];
+
+  // Configuration proto for MaskHead.
+  // Next id: 11
+  message MaskHead {
+    // The height and the width of the predicted mask. Only used when
+    // predict_instance_masks is true.
+    optional int32 mask_height = 1 [default = 15];
+    optional int32 mask_width = 2 [default = 15];
+
+    // Whether to predict class agnostic masks. Only used when
+    // predict_instance_masks is true.
+    optional bool masks_are_class_agnostic = 3 [default = true];
+
+    // The depth for the first conv2d_transpose op applied to the
+    // image_features in the mask prediction branch. If set to 0, the value
+    // will be set automatically based on the number of channels in the image
+    // features and the number of classes.
+    optional int32 mask_prediction_conv_depth = 4 [default = 256];
+
+    // The number of convolutions applied to image_features in the mask prediction
+    // branch.
+    optional int32 mask_prediction_num_conv_layers = 5 [default = 2];
+
+    // Whether to apply convolutions on mask features before upsampling using
+    // nearest neighbor resizing.
+    // By default, mask features are resized to [`mask_height`, `mask_width`]
+    // before applying convolutions and predicting masks.
+    optional bool convolve_then_upsample_masks = 6 [default = false];
+
+    // Mask loss weight.
+    optional float mask_loss_weight = 7 [default=5.0];
+
+    // Number of boxes to be generated at training time for computing mask loss.
+    optional int32 mask_loss_sample_size = 8 [default=16];
+
+    // Hyperparameters for convolution ops used in the box predictor.
+    optional Hyperparams conv_hyperparams = 9;
+
+    // Output size (width and height are set to be the same) of the initial
+    // bilinear interpolation based cropping during ROI pooling. Only used when
+    // we have second stage prediction head enabled (e.g. mask head).
+    optional int32 initial_crop_size = 10 [default = 15];
+  }
+
+  // Configs for mask head.
+  optional MaskHead mask_head_config = 25;
 }
 
 message SsdFeatureExtractor {
   reserved 6;
@@ -113,10 +150,10 @@ message SsdFeatureExtractor {
   optional string type = 1;
 
   // The factor to alter the depth of the channels in the feature extractor.
-  optional float depth_multiplier = 2 [default=1.0];
+  optional float depth_multiplier = 2 [default = 1.0];
 
   // Minimum number of the channels in the feature extractor.
-  optional int32 min_depth = 3 [default=16];
+  optional int32 min_depth = 3 [default = 16];
 
   // Hyperparameters that affect the layers of feature extractor added on top
   // of the base feature extractor.
@@ -128,7 +165,8 @@ message SsdFeatureExtractor {
   // layers while base feature extractor uses its own default hyperparams. If
   // this value is set to true, the base feature extractor's hyperparams will be
   // overridden with the `conv_hyperparams`.
-  optional bool override_base_feature_extractor_hyperparams = 9 [default = false];
+  optional bool override_base_feature_extractor_hyperparams = 9
+      [default = false];
 
   // The nearest multiple to zero-pad the input height and width dimensions to.
   // For example, if pad_to_multiple = 2, input dimensions are zero-padded
@@ -138,11 +176,11 @@ message SsdFeatureExtractor {
   // Whether to use explicit padding when extracting SSD multiresolution
   // features. This will also apply to the base feature extractor if a MobileNet
   // architecture is used.
-  optional bool use_explicit_padding = 7 [default=false];
+  optional bool use_explicit_padding = 7 [default = false];
 
   // Whether to use depthwise separable convolutions for to extract additional
   // feature maps added by SSD.
-  optional bool use_depthwise = 8 [default=false];
+  optional bool use_depthwise = 8 [default = false];
 
   // Feature Pyramid Networks config.
   optional FeaturePyramidNetworks fpn = 10;
@@ -173,4 +211,3 @@ message FeaturePyramidNetworks {
   // channel depth for additional coarse feature layers.
   optional int32 additional_layer_depth = 3 [default = 256];
 }
@@ -20,7 +20,7 @@ message TrainConfig {
   optional bool sync_replicas = 3 [default=false];
 
   // How frequently to keep checkpoints.
-  optional uint32 keep_checkpoint_every_n_hours = 4 [default=1000];
+  optional float keep_checkpoint_every_n_hours = 4 [default=10000.0];
 
   // Optimizer used to train the DetectionModel.
   optional Optimizer optimizer = 5;
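The uint32-to-float change above mirrors `tf.train.Saver`, whose `keep_checkpoint_every_n_hours` argument is a float with default 10000.0, the same default the proto now uses. A minimal sketch of where that value ultimately lands (variable name is illustrative, not the actual train-loop code):

```python
import tensorflow as tf

global_step = tf.Variable(0, name='global_step', trainable=False)
# Saver expects a float here; the 10000.0 default effectively means
# "keep no extra checkpoints beyond max_to_keep".
saver = tf.train.Saver(
    max_to_keep=5, keep_checkpoint_every_n_hours=10000.0)
```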
@@ -33,6 +33,7 @@ import collections
 import logging
 import unicodedata
 import numpy as np
+import tensorflow as tf
 
 from object_detection.core import standard_fields
 from object_detection.utils import label_map_util
@@ -126,6 +127,7 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
                categories,
                matching_iou_threshold=0.5,
                evaluate_corlocs=False,
+               evaluate_precision_recall=False,
                metric_prefix=None,
                use_weighted_mean_ap=False,
                evaluate_masks=False,
@@ -140,6 +142,8 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
         boxes to detection boxes.
       evaluate_corlocs: (optional) boolean which determines if corloc scores
         are to be returned or not.
+      evaluate_precision_recall: (optional) boolean which determines if
+        precision and recall values are to be returned or not.
       metric_prefix: (optional) string prefix for metric name; if None, no
         prefix is used.
       use_weighted_mean_ap: (optional) boolean which determines if the mean
@@ -174,7 +178,50 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
         group_of_weight=self._group_of_weight)
     self._image_ids = set([])
     self._evaluate_corlocs = evaluate_corlocs
+    self._evaluate_precision_recall = evaluate_precision_recall
     self._metric_prefix = (metric_prefix + '_') if metric_prefix else ''
+    self._expected_keys = set([
+        standard_fields.InputDataFields.key,
+        standard_fields.InputDataFields.groundtruth_boxes,
+        standard_fields.InputDataFields.groundtruth_classes,
+        standard_fields.InputDataFields.groundtruth_difficult,
+        standard_fields.InputDataFields.groundtruth_instance_masks,
+        standard_fields.DetectionResultFields.detection_boxes,
+        standard_fields.DetectionResultFields.detection_scores,
+        standard_fields.DetectionResultFields.detection_classes,
+        standard_fields.DetectionResultFields.detection_masks
+    ])
+    self._build_metric_names()
+
+  def _build_metric_names(self):
+    """Builds a list with metric names."""
+    self._metric_names = [
+        self._metric_prefix + 'Precision/mAP@{}IOU'.format(
+            self._matching_iou_threshold)
+    ]
+    if self._evaluate_corlocs:
+      self._metric_names.append(
+          self._metric_prefix +
+          'Precision/meanCorLoc@{}IOU'.format(self._matching_iou_threshold))
+
+    category_index = label_map_util.create_category_index(self._categories)
+    for idx in range(self._num_classes):
+      if idx + self._label_id_offset in category_index:
+        category_name = category_index[idx + self._label_id_offset]['name']
+        try:
+          category_name = unicode(category_name, 'utf-8')
+        except TypeError:
+          pass
+        category_name = unicodedata.normalize('NFKD', category_name).encode(
+            'ascii', 'ignore')
+        self._metric_names.append(
+            self._metric_prefix + 'PerformanceByCategory/AP@{}IOU/{}'.format(
+                self._matching_iou_threshold, category_name))
+        if self._evaluate_corlocs:
+          self._metric_names.append(
+              self._metric_prefix + 'PerformanceByCategory/CorLoc@{}IOU/{}'
+              .format(self._matching_iou_threshold, category_name))
 
   def add_single_ground_truth_image_info(self, image_id, groundtruth_dict):
     """Adds groundtruth for a single image to be used for evaluation.
@@ -283,22 +330,19 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
       A dictionary of metrics with the following fields -
 
       1. summary_metrics:
-        'Precision/mAP@<matching_iou_threshold>IOU': mean average precision at
-        the specified IOU threshold.
+        '<prefix if not empty>_Precision/mAP@<matching_iou_threshold>IOU': mean
+        average precision at the specified IOU threshold.
 
       2. per_category_ap: category specific results with keys of the form
-        'PerformanceByCategory/mAP@<matching_iou_threshold>IOU/category'.
+        '<prefix if not empty>_PerformanceByCategory/
+        mAP@<matching_iou_threshold>IOU/category'.
     """
-    (per_class_ap, mean_ap, _, _, per_class_corloc, mean_corloc) = (
-        self._evaluation.evaluate())
-    pascal_metrics = {
-        self._metric_prefix +
-        'Precision/mAP@{}IOU'.format(self._matching_iou_threshold):
-            mean_ap
-    }
+    (per_class_ap, mean_ap, per_class_precision, per_class_recall,
+     per_class_corloc, mean_corloc) = (
+         self._evaluation.evaluate())
+    pascal_metrics = {self._metric_names[0]: mean_ap}
     if self._evaluate_corlocs:
-      pascal_metrics[self._metric_prefix + 'Precision/meanCorLoc@{}IOU'.format(
-          self._matching_iou_threshold)] = mean_corloc
+      pascal_metrics[self._metric_names[1]] = mean_corloc
     category_index = label_map_util.create_category_index(self._categories)
     for idx in range(per_class_ap.size):
       if idx + self._label_id_offset in category_index:
@@ -314,6 +358,19 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
                 self._matching_iou_threshold, category_name))
         pascal_metrics[display_name] = per_class_ap[idx]
 
+        # Optionally add precision and recall values
+        if self._evaluate_precision_recall:
+          display_name = (
+              self._metric_prefix +
+              'PerformanceByCategory/Precision@{}IOU/{}'.format(
+                  self._matching_iou_threshold, category_name))
+          pascal_metrics[display_name] = per_class_precision[idx]
+          display_name = (
+              self._metric_prefix +
+              'PerformanceByCategory/Recall@{}IOU/{}'.format(
+                  self._matching_iou_threshold, category_name))
+          pascal_metrics[display_name] = per_class_recall[idx]
+
         # Optionally add CorLoc metrics.classes
         if self._evaluate_corlocs:
           display_name = (
@@ -332,6 +389,74 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
         label_id_offset=self._label_id_offset)
     self._image_ids.clear()
 
+  def get_estimator_eval_metric_ops(self, eval_dict):
+    """Returns dict of metrics to use with `tf.estimator.EstimatorSpec`.
+
+    Note that this must only be implemented if performing evaluation with a
+    `tf.estimator.Estimator`.
+
+    Args:
+      eval_dict: A dictionary that holds tensors for evaluating an object
+        detection model, returned from
+        eval_util.result_dict_for_single_example(). It must contain
+        standard_fields.InputDataFields.key.
+
+    Returns:
+      A dictionary of metric names to tuple of value_op and update_op that can
+      be used as eval metric ops in `tf.estimator.EstimatorSpec`.
+    """
+    # remove unexpected fields
+    eval_dict_filtered = dict()
+    for key, value in eval_dict.items():
+      if key in self._expected_keys:
+        eval_dict_filtered[key] = value
+
+    eval_dict_keys = eval_dict_filtered.keys()
+
+    def update_op(image_id, *eval_dict_batched_as_list):
+      """Update operation that adds batch of images to ObjectDetectionEvaluator.
+
+      Args:
+        image_id: image id (single id or an array)
+        *eval_dict_batched_as_list: the values of the dictionary of tensors.
+      """
+      if np.isscalar(image_id):
+        single_example_dict = dict(
+            zip(eval_dict_keys, eval_dict_batched_as_list))
+        self.add_single_ground_truth_image_info(image_id, single_example_dict)
+        self.add_single_detected_image_info(image_id, single_example_dict)
+      else:
+        for unzipped_tuple in zip(*eval_dict_batched_as_list):
+          single_example_dict = dict(zip(eval_dict_keys, unzipped_tuple))
+          image_id = single_example_dict[standard_fields.InputDataFields.key]
+          self.add_single_ground_truth_image_info(image_id, single_example_dict)
+          self.add_single_detected_image_info(image_id, single_example_dict)
+
+    args = [eval_dict_filtered[standard_fields.InputDataFields.key]]
+    args.extend(eval_dict_filtered.values())
+    update_op = tf.py_func(update_op, args, [])
+
+    def first_value_func():
+      self._metrics = self.evaluate()
+      self.clear()
+      return np.float32(self._metrics[self._metric_names[0]])
+
+    def value_func_factory(metric_name):
+
+      def value_func():
+        return np.float32(self._metrics[metric_name])
+
+      return value_func
+
+    # Ensure that the metrics are only evaluated once.
+    first_value_op = tf.py_func(first_value_func, [], tf.float32)
+    eval_metric_ops = {self._metric_names[0]: (first_value_op, update_op)}
+    with tf.control_dependencies([first_value_op]):
+      for metric_name in self._metric_names[1:]:
+        eval_metric_ops[metric_name] = (tf.py_func(
+            value_func_factory(metric_name), [], np.float32), update_op)
+    return eval_metric_ops
+
 
 class PascalDetectionEvaluator(ObjectDetectionEvaluator):
   """A class to evaluate detections using PASCAL metrics."""
@@ -442,6 +567,15 @@ class OpenImagesDetectionEvaluator(ObjectDetectionEvaluator):
         evaluate_corlocs,
         metric_prefix=metric_prefix,
         group_of_weight=group_of_weight)
+    self._expected_keys = set([
+        standard_fields.InputDataFields.key,
+        standard_fields.InputDataFields.groundtruth_boxes,
+        standard_fields.InputDataFields.groundtruth_classes,
+        standard_fields.InputDataFields.groundtruth_group_of,
+        standard_fields.DetectionResultFields.detection_boxes,
+        standard_fields.DetectionResultFields.detection_scores,
+        standard_fields.DetectionResultFields.detection_classes,
+    ])
 
   def add_single_ground_truth_image_info(self, image_id, groundtruth_dict):
     """Adds groundtruth for a single image to be used for evaluation.
@@ -535,6 +669,16 @@ class OpenImagesDetectionChallengeEvaluator(OpenImagesDetectionEvaluator):
         group_of_weight=group_of_weight)
     self._evaluatable_labels = {}
+    self._expected_keys = set([
+        standard_fields.InputDataFields.key,
+        standard_fields.InputDataFields.groundtruth_boxes,
+        standard_fields.InputDataFields.groundtruth_classes,
+        standard_fields.InputDataFields.groundtruth_group_of,
+        standard_fields.InputDataFields.groundtruth_image_classes,
+        standard_fields.DetectionResultFields.detection_boxes,
+        standard_fields.DetectionResultFields.detection_scores,
+        standard_fields.DetectionResultFields.detection_classes,
+    ])
 
   def add_single_ground_truth_image_info(self, image_id, groundtruth_dict):
     """Adds groundtruth for a single image to be used for evaluation.
@@ -890,15 +1034,14 @@ class ObjectDetectionEvaluation(object):
       if self.use_weighted_mean_ap:
         all_scores = np.append(all_scores, scores)
         all_tp_fp_labels = np.append(all_tp_fp_labels, tp_fp_labels)
-      logging.info('Scores and tpfp per class label: %d', class_index)
-      logging.info(tp_fp_labels)
-      logging.info(scores)
       precision, recall = metrics.compute_precision_recall(
           scores, tp_fp_labels, self.num_gt_instances_per_class[class_index])
       self.precisions_per_class[class_index] = precision
       self.recalls_per_class[class_index] = recall
       average_precision = metrics.compute_average_precision(precision, recall)
       self.average_precision_per_class[class_index] = average_precision
-      logging.info('average_precision: %f', average_precision)
 
     self.corloc_per_class = metrics.compute_cor_loc(
         self.num_gt_imgs_per_class,
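A hedged usage sketch for the new `get_estimator_eval_metric_ops`, driving the (value_op, update_op) pairs by hand the way `tf.estimator` does during evaluation. It assumes `eval_dict` has already been built with `eval_util.result_dict_for_single_example`; the new test below shows one way to construct it:

```python
import tensorflow as tf

from object_detection.utils import object_detection_evaluation

categories = [{'id': 1, 'name': 'person'}, {'id': 2, 'name': 'dog'}]
evaluator = object_detection_evaluation.ObjectDetectionEvaluator(
    categories=categories)

# eval_dict is assumed to come from eval_util.result_dict_for_single_example.
metric_ops = evaluator.get_estimator_eval_metric_ops(eval_dict)
value_ops = {name: value_op for name, (value_op, _) in metric_ops.items()}
_, update_op = metric_ops['Precision/mAP@0.5IOU']

with tf.Session() as sess:
  sess.run(update_op)          # accumulates one batch into the evaluator
  print(sess.run(value_ops))   # evaluates all metrics exactly once
```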
@@ -15,9 +15,10 @@
 """Tests for object_detection.utils.object_detection_evaluation."""
 
+from absl.testing import parameterized
 import numpy as np
 import tensorflow as tf
 
+from object_detection import eval_util
 from object_detection.core import standard_fields
 from object_detection.utils import object_detection_evaluation
@@ -683,5 +684,141 @@ class ObjectDetectionEvaluationTest(tf.test.TestCase):
     self.assertAlmostEqual(expected_mean_corloc, mean_corloc)
 
 
+class ObjectDetectionEvaluatorTest(tf.test.TestCase, parameterized.TestCase):
+
+  def setUp(self):
+    self.categories = [{
+        'id': 1,
+        'name': 'person'
+    }, {
+        'id': 2,
+        'name': 'dog'
+    }, {
+        'id': 3,
+        'name': 'cat'
+    }]
+    self.od_eval = object_detection_evaluation.ObjectDetectionEvaluator(
+        categories=self.categories)
+
+  def _make_evaluation_dict(self,
+                            resized_groundtruth_masks=False,
+                            batch_size=1,
+                            max_gt_boxes=None,
+                            scale_to_absolute=False):
+    input_data_fields = standard_fields.InputDataFields
+    detection_fields = standard_fields.DetectionResultFields
+
+    image = tf.zeros(shape=[batch_size, 20, 20, 3], dtype=tf.uint8)
+    if batch_size == 1:
+      key = tf.constant('image1')
+    else:
+      key = tf.constant([str(i) for i in range(batch_size)])
+    detection_boxes = tf.concat([
+        tf.tile(
+            tf.constant([[[0., 0., 1., 1.]]]),
+            multiples=[batch_size - 1, 1, 1]),
+        tf.constant([[[0., 0., 0.5, 0.5]]])
+    ], axis=0)
+    detection_scores = tf.concat([
+        tf.tile(tf.constant([[0.5]]), multiples=[batch_size - 1, 1]),
+        tf.constant([[0.8]])
+    ], axis=0)
+    detection_classes = tf.tile(tf.constant([[0]]), multiples=[batch_size, 1])
+    detection_masks = tf.tile(
+        tf.ones(shape=[1, 2, 20, 20], dtype=tf.float32),
+        multiples=[batch_size, 1, 1, 1])
+    groundtruth_boxes = tf.constant([[0., 0., 1., 1.]])
+    groundtruth_classes = tf.constant([1])
+    groundtruth_instance_masks = tf.ones(shape=[1, 20, 20], dtype=tf.uint8)
+    num_detections = tf.ones([batch_size])
+    if resized_groundtruth_masks:
+      groundtruth_instance_masks = tf.ones(shape=[1, 10, 10], dtype=tf.uint8)
+    if batch_size > 1:
+      groundtruth_boxes = tf.tile(
+          tf.expand_dims(groundtruth_boxes, 0), multiples=[batch_size, 1, 1])
+      groundtruth_classes = tf.tile(
+          tf.expand_dims(groundtruth_classes, 0), multiples=[batch_size, 1])
+      groundtruth_instance_masks = tf.tile(
+          tf.expand_dims(groundtruth_instance_masks, 0),
+          multiples=[batch_size, 1, 1, 1])
+
+    detections = {
+        detection_fields.detection_boxes: detection_boxes,
+        detection_fields.detection_scores: detection_scores,
+        detection_fields.detection_classes: detection_classes,
+        detection_fields.detection_masks: detection_masks,
+        detection_fields.num_detections: num_detections
+    }
+    groundtruth = {
+        input_data_fields.groundtruth_boxes:
+            groundtruth_boxes,
+        input_data_fields.groundtruth_classes:
+            groundtruth_classes,
+        input_data_fields.groundtruth_instance_masks:
+            groundtruth_instance_masks,
+    }
+    if batch_size > 1:
+      return eval_util.result_dict_for_batched_example(
+          image,
+          key,
+          detections,
+          groundtruth,
+          scale_to_absolute=scale_to_absolute,
+          max_gt_boxes=max_gt_boxes)
+    else:
+      return eval_util.result_dict_for_single_example(
+          image,
+          key,
+          detections,
+          groundtruth,
+          scale_to_absolute=scale_to_absolute)
+
+  @parameterized.parameters({
+      'batch_size': 1,
+      'expected_map': 0,
+      'max_gt_boxes': None,
+      'scale_to_absolute': True
+  }, {
+      'batch_size': 8,
+      'expected_map': 0.765625,
+      'max_gt_boxes': [1],
+      'scale_to_absolute': True
+  }, {
+      'batch_size': 1,
+      'expected_map': 0,
+      'max_gt_boxes': None,
+      'scale_to_absolute': False
+  }, {
+      'batch_size': 8,
+      'expected_map': 0.765625,
+      'max_gt_boxes': [1],
+      'scale_to_absolute': False
+  })
+  def test_get_estimator_eval_metric_ops(self,
+                                         batch_size=1,
+                                         expected_map=1,
+                                         max_gt_boxes=None,
+                                         scale_to_absolute=False):
+    eval_dict = self._make_evaluation_dict(
+        batch_size=batch_size,
+        max_gt_boxes=max_gt_boxes,
+        scale_to_absolute=scale_to_absolute)
+    tf.logging.info('eval_dict: {}'.format(eval_dict))
+    metric_ops = self.od_eval.get_estimator_eval_metric_ops(eval_dict)
+    _, update_op = metric_ops['Precision/mAP@0.5IOU']
+
+    with self.test_session() as sess:
+      metrics = {}
+      for key, (value_op, _) in metric_ops.iteritems():
+        metrics[key] = value_op
+      sess.run(update_op)
+      metrics = sess.run(metrics)
+      self.assertAlmostEqual(expected_map, metrics['Precision/mAP@0.5IOU'])
+
+
 if __name__ == '__main__':
   tf.test.main()
@@ -14,6 +14,7 @@
 # ==============================================================================
 
 """A module for helper tensorflow ops."""
+import collections
 import math
 
 import numpy as np
 import six
@@ -1087,81 +1088,10 @@ def native_crop_and_resize(image, boxes, crop_size, scope=None):
   return tf.reshape(cropped_regions, final_shape)
 
 
-def expected_classification_loss_under_sampling(
-    batch_cls_targets, cls_losses, unmatched_cls_losses,
-    desired_negative_sampling_ratio, min_num_negative_samples):
-  """Computes classification loss by background/foreground weighting.
-
-  The weighting is such that the effective background/foreground weight ratio
-  is the desired_negative_sampling_ratio. if p_i is the foreground probability
-  of anchor a_i, L(a_i) is the anchors loss, N is the number of anchors, M
-  is the sum of foreground probabilities across anchors, and K is the desired
-  ratio between the number of negative and positive samples, then the total
-  loss L is calculated as:
-
-  beta = K*M/(N-M)
-  L = sum_{i=1}^N [p_i * L_p(a_i) + beta * (1 - p_i) * L_n(a_i)]
-  where L_p(a_i) is the loss against target assuming the anchor was matched,
-  otherwise zero, and L_n(a_i) is the loss against the background target
-  assuming the anchor was unmatched, otherwise zero.
-
-  Args:
-    batch_cls_targets: A tensor with shape [batch_size, num_anchors,
-      num_classes + 1], where 0'th index is the background class, containing
-      the class distrubution for the target assigned to a given anchor.
-    cls_losses: Float tensor of shape [batch_size, num_anchors] representing
-      anchorwise classification losses.
-    unmatched_cls_losses: loss for each anchor against the unmatched class
-      target.
-    desired_negative_sampling_ratio: The desired background/foreground weight
-      ratio.
-    min_num_negative_samples: Minimum number of effective negative samples.
-      Used only when there are no positive examples.
-
-  Returns:
-    The classification loss.
-  """
-  num_anchors = tf.cast(tf.shape(batch_cls_targets)[1], tf.float32)
-
-  # find the p_i
-  foreground_probabilities = 1 - batch_cls_targets[:, :, 0]
-  foreground_sum = tf.reduce_sum(foreground_probabilities, axis=-1)
-
-  # for each anchor, expected_j is the expected number of positive anchors
-  # given that this anchor was sampled as negative.
-  tiled_foreground_sum = tf.tile(
-      tf.reshape(foreground_sum, [-1, 1]),
-      [1, tf.cast(num_anchors, tf.int32)])
-  expected_j = tiled_foreground_sum - foreground_probabilities
-
-  k = desired_negative_sampling_ratio
-
-  # compute beta
-  expected_negatives = tf.to_float(num_anchors) - expected_j
-  desired_negatives = k * expected_j
-  desired_negatives = tf.where(
-      tf.greater(desired_negatives, expected_negatives), expected_negatives,
-      desired_negatives)
-
-  # probability that an anchor is sampled for the loss computation given that
-  # it is negative.
-  beta = desired_negatives / expected_negatives
-
-  # where the foreground sum is zero, use a minimum negative weight.
-  min_negative_weight = 1.0 * min_num_negative_samples / num_anchors
-  beta = tf.where(
-      tf.equal(tiled_foreground_sum, 0),
-      min_negative_weight * tf.ones_like(beta), beta)
-
-  foreground_weights = foreground_probabilities
-  background_weights = (1 - foreground_weights) * beta
-
-  weighted_foreground_losses = foreground_weights * cls_losses
-  weighted_background_losses = background_weights * unmatched_cls_losses
-
-  cls_losses = tf.reduce_sum(
-      weighted_foreground_losses, axis=-1) + tf.reduce_sum(
-          weighted_background_losses, axis=-1)
-
-  return cls_losses
+EqualizationLossConfig = collections.namedtuple('EqualizationLossConfig',
+                                                ['weight', 'exclude_prefixes'])
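The namedtuple that replaces the deleted function above is a lightweight config container whose fields mirror the EqualizationLoss proto. A trivial usage sketch (the prefix value is illustrative):

```python
from object_detection.utils import ops

# Fields mirror protos/losses.proto: loss strength plus op-name prefixes
# to exclude from the equalization loss.
config = ops.EqualizationLossConfig(
    weight=0.01, exclude_prefixes=['FeatureExtractor'])
```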
@@ -21,6 +21,8 @@ from object_detection.core import standard_fields as fields
 from object_detection.utils import ops
 from object_detection.utils import test_case
 
+slim = tf.contrib.slim
+
 
 class NormalizedToImageCoordinatesTest(tf.test.TestCase):
@@ -1466,189 +1468,9 @@ class OpsTestCropAndResize(test_case.TestCase):
     self.assertAllClose(crop_output, expected_output)
 
 
-class OpsTestExpectedClassificationLoss(test_case.TestCase):
-
-  def testExpectedClassificationLossUnderSamplingWithHardLabels(self):
-
-    def graph_fn(batch_cls_targets, cls_losses, unmatched_cls_losses,
-                 negative_to_positive_ratio, min_num_negative_samples):
-      return ops.expected_classification_loss_under_sampling(
-          batch_cls_targets, cls_losses, unmatched_cls_losses,
-          negative_to_positive_ratio, min_num_negative_samples)
-
-    batch_cls_targets = np.array(
-        [[[1., 0, 0], [0, 1., 0]], [[1., 0, 0], [0, 1., 0]]], dtype=np.float32)
-    cls_losses = np.array([[1, 2], [3, 4]], dtype=np.float32)
-    unmatched_cls_losses = np.array([[10, 20], [30, 40]], dtype=np.float32)
-    negative_to_positive_ratio = np.array([2], dtype=np.float32)
-    min_num_negative_samples = np.array([1], dtype=np.float32)
-
-    classification_loss = self.execute(graph_fn, [
-        batch_cls_targets, cls_losses, unmatched_cls_losses,
-        negative_to_positive_ratio, min_num_negative_samples
-    ])
-
-    # expected_foreground_sum = [1,1]
-    # expected_expected_j = [[1, 0], [1, 0]]
-    # expected_expected_negatives = [[1, 2], [1, 2]]
-    # expected_desired_negatives = [[2, 0], [2, 0]]
-    # expected_beta = [[1, 0], [1, 0]]
-    # expected_foreground_weights = [[0, 1], [0, 1]]
-    # expected_background_weights = [[1, 0], [1, 0]]
-    # expected_weighted_foreground_losses = [[0, 2], [0, 4]]
-    # expected_weighted_background_losses = [[10, 0], [30, 0]]
-    # expected_classification_loss_under_sampling = [6, 40]
-    expected_classification_loss_under_sampling = [2 + 10, 4 + 30]
-
-    self.assertAllClose(expected_classification_loss_under_sampling,
-                        classification_loss)
-
-  def testExpectedClassificationLossUnderSamplingWithHardLabelsMoreNegatives(
-      self):
-
-    def graph_fn(batch_cls_targets, cls_losses, unmatched_cls_losses,
-                 negative_to_positive_ratio, min_num_negative_samples):
-      return ops.expected_classification_loss_under_sampling(
-          batch_cls_targets, cls_losses, unmatched_cls_losses,
-          negative_to_positive_ratio, min_num_negative_samples)
-
-    batch_cls_targets = np.array(
-        [[[1., 0, 0], [0, 1., 0], [1., 0, 0], [1., 0, 0], [1., 0, 0]]],
-        dtype=np.float32)
-    cls_losses = np.array([[1, 2, 3, 4, 5]], dtype=np.float32)
-    unmatched_cls_losses = np.array([[10, 20, 30, 40, 50]], dtype=np.float32)
-    negative_to_positive_ratio = np.array([2], dtype=np.float32)
-    min_num_negative_samples = np.array([1], dtype=np.float32)
-
-    classification_loss = self.execute(graph_fn, [
-        batch_cls_targets, cls_losses, unmatched_cls_losses,
-        negative_to_positive_ratio, min_num_negative_samples
-    ])
-
-    # expected_foreground_sum = [1]
-    # expected_expected_j = [[1, 0, 1, 1, 1]]
-    # expected_expected_negatives = [[4, 5, 4, 4, 4]]
-    # expected_desired_negatives = [[2, 0, 2, 2, 2]]
-    # expected_beta = [[.5, 0, .5, .5, .5]]
-    # expected_foreground_weights = [[0, 1, 0, 0, 0]]
-    # expected_background_weights = [[.5, 0, .5, .5, .5]]
-    # expected_weighted_foreground_losses = [[0, 2, 0, 0, 0]]
-    # expected_weighted_background_losses = [[10*.5, 0, 30*.5, 40*.5, 50*.5]]
-    # expected_classification_loss_under_sampling = [5+2+15+20+25]
-    expected_classification_loss_under_sampling = [5 + 2 + 15 + 20 + 25]
-
-    self.assertAllClose(expected_classification_loss_under_sampling,
-                        classification_loss)
-
-  def testExpectedClassificationLossUnderSamplingWithAllNegative(self):
-
-    def graph_fn(batch_cls_targets, cls_losses, unmatched_cls_losses):
-      return ops.expected_classification_loss_under_sampling(
-          batch_cls_targets, cls_losses, unmatched_cls_losses,
-          negative_to_positive_ratio, min_num_negative_samples)
-
-    batch_cls_targets = np.array(
-        [[[1, 0, 0], [1, 0, 0]], [[1, 0, 0], [1, 0, 0]]], dtype=np.float32)
-    cls_losses = np.array([[1, 2], [3, 4]], dtype=np.float32)
-    unmatched_cls_losses = np.array([[10, 20], [30, 40]], dtype=np.float32)
-    negative_to_positive_ratio = np.array([2], dtype=np.float32)
-    min_num_negative_samples = np.array([1], dtype=np.float32)
-
-    classification_loss = self.execute(
-        graph_fn, [batch_cls_targets, cls_losses, unmatched_cls_losses])
-
-    # expected_foreground_sum = [0,0]
-    # expected_expected_j = [[0, 0], [0, 0]]
-    # expected_expected_negatives = [[2, 2], [2, 2]]
-    # expected_desired_negatives = [[0, 0], [0, 0]]
-    # expected_beta = [[0, 0],[0, 0]]
-    # expected_foreground_weights = [[0, 0], [0, 0]]
-    # expected_background_weights = [[.5, .5], [.5, .5]]
-    # expected_weighted_foreground_losses = [[0, 0], [0, 0]]
-    # expected_weighted_background_losses = [[5, 10], [15, 20]]
-    # expected_classification_loss_under_sampling = [15, 35]
-    expected_classification_loss_under_sampling = [
-        10 * .5 + 20 * .5, 30 * .5 + 40 * .5
-    ]
-
-    self.assertAllClose(expected_classification_loss_under_sampling,
-                        classification_loss)
-
-  def testExpectedClassificationLossUnderSamplingWithAllPositive(self):
-
-    def graph_fn(batch_cls_targets, cls_losses, unmatched_cls_losses):
-      return ops.expected_classification_loss_under_sampling(
-          batch_cls_targets, cls_losses, unmatched_cls_losses,
-          negative_to_positive_ratio, min_num_negative_samples)
-
-    batch_cls_targets = np.array(
-        [[[0, 1., 0], [0, 1., 0]], [[0, 1, 0], [0, 0, 1]]], dtype=np.float32)
-    cls_losses = np.array([[1, 2], [3, 4]], dtype=np.float32)
-    unmatched_cls_losses = np.array([[10, 20], [30, 40]], dtype=np.float32)
-    negative_to_positive_ratio = np.array([2], dtype=np.float32)
-    min_num_negative_samples = np.array([1], dtype=np.float32)
-
-    classification_loss = self.execute(
-        graph_fn, [batch_cls_targets, cls_losses, unmatched_cls_losses])
-
-    # expected_foreground_sum = [2,2]
-    # expected_expected_j = [[1, 1], [1, 1]]
-    # expected_expected_negatives = [[1, 1], [1, 1]]
-    # expected_desired_negatives = [[1, 1], [1, 1]]
-    # expected_beta = [[1, 1],[1, 1]]
-    # expected_foreground_weights = [[1, 1], [1, 1]]
-    # expected_background_weights = [[0, 0], [0, 0]]
-    # expected_weighted_foreground_losses = [[1, 2], [3, 4]]
-    # expected_weighted_background_losses = [[0, 0], [0, 0]]
-    # expected_classification_loss_under_sampling = [15, 35]
-    expected_classification_loss_under_sampling = [1 + 2, 3 + 4]
-
-    self.assertAllClose(expected_classification_loss_under_sampling,
-                        classification_loss)
-
-  def testExpectedClassificationLossUnderSamplingWithSoftLabels(self):
-
-    def graph_fn(batch_cls_targets, cls_losses, unmatched_cls_losses,
-                 negative_to_positive_ratio, min_num_negative_samples):
-      return ops.expected_classification_loss_under_sampling(
-          batch_cls_targets, cls_losses, unmatched_cls_losses,
-          negative_to_positive_ratio, min_num_negative_samples)
-
-    batch_cls_targets = np.array([[[.75, .25, 0], [0.25, .75, 0], [.75, .25, 0],
-                                   [0.25, .75, 0], [1., 0, 0]]],
-                                 dtype=np.float32)
-    cls_losses = np.array([[1, 2, 3, 4, 5]], dtype=np.float32)
-    unmatched_cls_losses = np.array([[10, 20, 30, 40, 50]], dtype=np.float32)
-    negative_to_positive_ratio = np.array([2], dtype=np.float32)
-    min_num_negative_samples = np.array([1], dtype=np.float32)
-
-    classification_loss = self.execute(graph_fn, [
-        batch_cls_targets, cls_losses, unmatched_cls_losses,
-        negative_to_positive_ratio, min_num_negative_samples
-    ])
-
-    # expected_foreground_sum = [2]
-    # expected_expected_j = [[1.75, 1.25, 1.75, 1.25, 2]]
-    # expected_expected_negatives = [[3.25, 3.75, 3.25, 3.75, 3]]
-    # expected_desired_negatives = [[3.25, 2.5, 3.25, 2.5, 3]]
-    # expected_beta = [[1, 2/3, 1, 2/3, 1]]
-    # expected_foreground_weights = [[0.25, .75, .25, .75, 0]]
-    # expected_background_weights = [[[.75, 1/6., .75, 1/6., 1]]]
-    # expected_weighted_foreground_losses = [[.25*1, .75*2, .25*3, .75*4, 0*5]]
-    # expected_weighted_background_losses = [[
-    #     .75*10, 1/6.*20, .75*30, 1/6.*40, 1*50]]
-    # expected_classification_loss_under_sampling = sum([
-    #     .25*1, .75*2, .25*3, .75*4, 0, .75*10, 1/6.*20, .75*30,
-    #     1/6.*40, 1*50])
-    expected_classification_loss_under_sampling = [
-        sum([
-            .25 * 1, .75 * 2, .25 * 3, .75 * 4, 0, .75 * 10, 1 / 6. * 20,
-            .75 * 30, 1 / 6. * 40, 1 * 50
-        ])
-    ]
-
-    self.assertAllClose(expected_classification_loss_under_sampling,
-                        classification_loss)
-
 
 if __name__ == '__main__':