Unverified Commit 9bbf8015 authored by pkulzc, committed by GitHub

Merged commit includes the following changes: (#6932)

250447559  by Zhichao Lu:

    Update the expected file format for the Instance Segmentation challenge:
    - add ImageWidth and ImageHeight fields and store their values per prediction
    - for the mask, store only the encoded image and assume its size is ImageWidth x ImageHeight

--
250402780  by rathodv:

    Fix failing Mask R-CNN TPU convergence test.

    Cast second stage prediction tensors from bfloat16 to float32 to prevent errors in the third target assignment (mask prediction): concatenating tensors with mixed bfloat16 and float32 types isn't allowed.
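    For illustration, a minimal sketch of the cast-before-concat pattern this fix applies (tensor names and shapes are hypothetical, not the model's actual outputs):

    import tensorflow as tf

    # Hypothetical second stage outputs: under bfloat16 TPU training, box
    # features come out of the prediction head as bfloat16, while other
    # inputs to target assignment are float32.
    box_features = tf.zeros([8, 300, 4], dtype=tf.bfloat16)
    class_scores = tf.zeros([8, 300, 91], dtype=tf.float32)

    # tf.concat rejects mixed dtypes, so cast the bfloat16 tensor up first.
    box_features = tf.cast(box_features, dtype=tf.float32)
    merged = tf.concat([box_features, class_scores], axis=-1)  # [8, 300, 95]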

--
250300240  by Zhichao Lu:

    Add Open Images Challenge 2019 object detection and instance segmentation
    support to the Estimator framework.

--
249944839  by rathodv:

    Modify exporter.py to add multiclass score nodes in exported inference graphs.

--
249935201  by rathodv:

    Modify postprocess methods to preserve multiclass scores after non max suppression.
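    A minimal sketch of the idea (shapes and names are hypothetical): keep the full per-class score vector for each box that survives NMS, so exported graphs can expose multiclass scores alongside the final detections.

    import tensorflow as tf

    boxes = tf.random.uniform([100, 4])
    scores = tf.random.uniform([100])                 # score used for NMS
    multiclass_scores = tf.random.uniform([100, 91])  # full per-class scores

    # Select surviving box indices, then gather the matching score vectors.
    selected = tf.image.non_max_suppression(boxes, scores, max_output_size=20)
    kept_multiclass_scores = tf.gather(multiclass_scores, selected)  # [<=20, 91]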

--
249878079  by Zhichao Lu:

    This CL slightly refactors some Object Detection helper functions for data creation, evaluation, and groundtruth providing.

    This will allow the eager+function custom loops to share code with the existing estimator training loops.

    Concretely, we make the following changes:
    1. In input creation, we separate dataset creation into top-level helpers and allow them to optionally accept a pre-constructed model directly, instead of always building a model from the config just for feature preprocessing (see the sketch after this list).

    2. In coco evaluation, we split the update_op creation into its own function, which the custom loops will call directly.

    3. In model_lib, we move groundtruth providing and data-structure munging into a helper function.

    4. For now, we put an escape hatch in `_summarize_target_assignment` when executing with TF 2.0 behavior, because the summary APIs used only work with TF 1.x.
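    A minimal sketch of change 1, with hypothetical names (`build_dataset_transform` and `model_builder` are illustrative, not the real helper signatures):

    def build_dataset_transform(model_config, model_builder, model=None):
      # Only build a throwaway model for feature preprocessing when the
      # caller (e.g. an eager+function custom loop) did not pass one in.
      if model is None:
        model = model_builder(model_config, is_training=True)

      def transform(features):
        # Shared preprocessing path for estimator and custom training loops.
        return model.preprocess(features)

      return transform  # stand-in for the full tf.data pipeline wiring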

--
249673507  by rathodv:

    Use explicit casts instead of tf.to_float and tf.to_int32 to avoid warnings.
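    For example (a trivial sketch):

    import tensorflow as tf

    x = tf.constant([1, 2, 3])
    # Deprecated, warns in TF 1.x:  y = tf.to_float(x)
    y = tf.cast(x, dtype=tf.float32)  # explicit equivalent of tf.to_float
    z = tf.cast(y, dtype=tf.int32)    # explicit equivalent of tf.to_int32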

--
249656006  by Zhichao Lu:

    Add a named "raw_keypoint_locations" node that corresponds to the "raw_box_locations" node.

--
249651674  by rathodv:

    Keep proposal boxes in float format. MatMulCropAndResize can handle the type even when the features themselves are bfloat16.

--
249568633  by rathodv:

    Support q > 1 in class agnostic NMS.
    Break post_processing_test.py into 3 separate files to avoid linter errors.

--
249535530  by rathodv:

    Update some deprecated arguments to tf ops.

--
249368223  by rathodv:

    Modify MatMulCropAndResize to use the MultiLevelRoIAlign method and move the tests to the spatial_transform_ops.py module.

    This CL establishes that CropAndResize and RoIAlign are equivalent and differ only in the sampling point grid within the boxes. CropAndResize uses a uniform size x size point grid whose corner points exactly overlap the box corners, while RoIAlign divides boxes into size x size cells and uses their centers as sampling points. In this CL, we switch MatMulCropAndResize to the MultiLevelRoIAlign implementation with the `align_corner` option, as MultiLevelRoIAlign is more memory efficient on TPU than the original MatMulCropAndResize.
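    A small numeric sketch (not the library code) of the two sampling grids along one axis of a box spanning [0, 1], with size = 3 output bins:

    y0, y1, size = 0.0, 1.0, 3

    # CropAndResize: uniform grid whose end points coincide with box corners.
    crop_points = [y0 + i * (y1 - y0) / (size - 1) for i in range(size)]
    # -> [0.0, 0.5, 1.0]

    # RoIAlign: split the box into `size` equal cells, sample cell centers.
    roi_points = [y0 + (i + 0.5) * (y1 - y0) / size for i in range(size)]
    # -> [0.1667, 0.5, 0.8333]
    print(crop_points, roi_points)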

--
249337338  by chowdhery:

    Add class-agnostic non-max-suppression in post_processing

--
249139196  by Zhichao Lu:

    Fix a positional argument bug in export_tflite_ssd_graph.

--
249120219  by Zhichao Lu:

    Add evaluator for computing precision limited to a given recall range.
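    A toy numeric sketch of the idea (not the evaluator's actual API): average precision only over operating points whose recall falls inside the requested range.

    import numpy as np

    precisions = np.array([1.0, 0.9, 0.8, 0.6, 0.5])
    recalls = np.array([0.1, 0.3, 0.5, 0.7, 0.9])

    recall_lo, recall_hi = 0.4, 0.8
    mask = (recalls >= recall_lo) & (recalls <= recall_hi)
    print(precisions[mask].mean())  # 0.7: mean precision for recall in [0.4, 0.8]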

--
249030593  by Zhichao Lu:

    Evaluation util to run segmentation and detection challenge evaluation.

--
248554358  by Zhichao Lu:

    This change contains the auxiliary changes required for TF 2.0 style training with eager + functions + distribution strategy loops, but not the loops themselves.

    It includes:
    - Updates to shape usage to support both TensorShape v1 and TensorShape v2
    - A fix to FreezableBatchNorm so it does not override the `training` arg in `call` when `None` was passed to the constructor (not an issue in the estimator loops, but it was in the custom loops)
    - Putting some constants in init_scope so they work in eager + functions
    - Making learning rate schedules return a callable in eager mode, required so they update when the global_step changes (see the sketch after this list)
    - Making DetectionModel a tf.Module so it tracks variables (e.g. ones nested in layers)
    - Removing some references to `op.name` for some losses, replacing them with explicit names
    - A small part of the change to allow the coco evaluation metrics to work in eager mode
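    A minimal sketch of the callable learning rate (a hypothetical schedule, not the library code): each optimizer step re-evaluates the callable against the current global step.

    import tensorflow as tf

    global_step = tf.Variable(0, trainable=False, dtype=tf.int64)

    def learning_rate():
      # Re-read global_step on every call, so decay tracks training progress.
      return 0.1 * tf.pow(0.95, tf.cast(global_step, tf.float32))

    # tf.keras optimizers accept a callable learning rate in eager mode.
    optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)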

--
248271226  by rathodv:

    Add MultiLevel RoIAlign op.

--
248229103  by rathodv:

    Add functions to 1) pad feature maps and 2) ravel 5-D indices.
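    A minimal sketch of the raveling step (hypothetical helper, not the real signature): 5-D indices (batch, level, y, x, channel) collapse into flat offsets via the standard row-major formula, e.g. ahead of a single tf.gather.

    def ravel_5d_index(b, l, y, x, c, dims):
      _, num_l, num_y, num_x, num_c = dims
      return (((b * num_l + l) * num_y + y) * num_x + x) * num_c + c

    print(ravel_5d_index(1, 2, 3, 4, 5, dims=[2, 4, 8, 8, 16]))  # 6597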

--
248206769  by rathodv:

    Add utilities needed to introduce RoI Align op.

--
248177733  by pengchong:

    Internal changes

--
247742582  by Zhichao Lu:

    Open Images Challenge 2019 instance segmentation metric: part 2

--
247525401  by Zhichao Lu:

    Update comments on max_classes_per_detection.

--
247520753  by rathodv:

    Add multilevel crop and resize operation that builds on top of matmul_crop_and_resize.

--
247391600  by Zhichao Lu:

    Open Images Challenge 2019 instance segmentation metric

--
247325813  by chowdhery:

    Quantized MobileNet v2 SSD FPNLite config with depth multiplier 0.75

--

PiperOrigin-RevId: 250447559
parent f42fddee
@@ -22,6 +22,20 @@ message BatchNonMaxSuppression {
// Whether to use the implementation of NMS that guarantees static shapes.
optional bool use_static_shapes = 6 [default = false];
// Whether to use class agnostic NMS.
// Class-agnostic NMS implements a class-agnostic version of non-max
// suppression where, if max_classes_per_detection=k,
// 1) we keep the top-k class scores for each detection, and
// 2) during NMS, each detection only uses its highest class score for sorting.
// Compared to regular NMS, the worst-case runtime of this version is O(N^2)
// instead of O(KN^2), where N is the number of detections and K the number of
// classes.
optional bool use_class_agnostic_nms = 7 [default = false];
// Number of classes retained per detection in class agnostic NMS.
optional int32 max_classes_per_detection = 8 [default = 1];
}
// Configuration proto for post-processing predicted boxes and
......
@@ -87,7 +87,7 @@ def get_prediction_tensor_shapes(pipeline_config):
_, input_tensors = exporter.input_placeholder_fn_map['image_tensor']()
- inputs = tf.to_float(input_tensors)
+ inputs = tf.cast(input_tensors, dtype=tf.float32)
preprocessed_inputs, true_image_shapes = detection_model.preprocess(inputs)
prediction_dict = detection_model.predict(preprocessed_inputs,
@@ -125,7 +125,7 @@ def build_graph(pipeline_config,
exporter.input_placeholder_fn_map[input_type]()
# CPU pre-processing
- inputs = tf.to_float(input_tensors)
+ inputs = tf.cast(input_tensors, dtype=tf.float32)
preprocessed_inputs, true_image_shapes = detection_model.preprocess(inputs)
# Dimshuffle: [b, h, w, c] -> [b, c, h, w]
......
@@ -57,7 +57,7 @@ def get_prediction_tensor_shapes(pipeline_config):
detection_model = model_builder.build(
pipeline_config.model, is_training=False)
_, input_tensors = exporter.input_placeholder_fn_map['image_tensor']()
- inputs = tf.to_float(input_tensors)
+ inputs = tf.cast(input_tensors, dtype=tf.float32)
preprocessed_inputs, true_image_shapes = detection_model.preprocess(inputs)
prediction_dict = detection_model.predict(preprocessed_inputs,
true_image_shapes)
@@ -138,7 +138,7 @@ def build_graph(pipeline_config,
placeholder_tensor, input_tensors = \
exporter.input_placeholder_fn_map[input_type]()
- inputs = tf.to_float(input_tensors)
+ inputs = tf.cast(input_tensors, dtype=tf.float32)
preprocessed_inputs, true_image_shapes = detection_model.preprocess(inputs)
# Dimshuffle: (b, h, w, c) -> (b, c, h, w)
......
@@ -333,5 +333,79 @@ class AssertShapeEqualTest(tf.test.TestCase):
                              tensor_b: np.zeros([5])})


class FlattenExpandDimensionTest(tf.test.TestCase):

  def test_flatten_given_dims(self):
    inputs = tf.random_uniform([5, 2, 10, 10, 3])
    actual_flattened = shape_utils.flatten_dimensions(inputs, first=1, last=3)
    expected_flattened = tf.reshape(inputs, [5, 20, 10, 3])
    with self.test_session() as sess:
      (actual_flattened_np,
       expected_flattened_np) = sess.run([actual_flattened, expected_flattened])
      self.assertAllClose(expected_flattened_np, actual_flattened_np)

  def test_raises_value_error_incorrect_dimensions(self):
    inputs = tf.random_uniform([5, 2, 10, 10, 3])
    with self.assertRaises(ValueError):
      shape_utils.flatten_dimensions(inputs, first=0, last=6)

  def test_flatten_first_two_dimensions(self):
    inputs = tf.constant(
        [
            [[1, 2], [3, 4]],
            [[5, 6], [7, 8]],
            [[9, 10], [11, 12]]
        ], dtype=tf.int32)
    flattened_tensor = shape_utils.flatten_first_n_dimensions(
        inputs, 2)
    with self.test_session() as sess:
      flattened_tensor_out = sess.run(flattened_tensor)
    expected_output = [[1, 2],
                       [3, 4],
                       [5, 6],
                       [7, 8],
                       [9, 10],
                       [11, 12]]
    self.assertAllEqual(expected_output, flattened_tensor_out)

  def test_expand_first_dimension(self):
    inputs = tf.constant(
        [
            [1, 2],
            [3, 4],
            [5, 6],
            [7, 8],
            [9, 10],
            [11, 12]
        ], dtype=tf.int32)
    dims = [3, 2]
    expanded_tensor = shape_utils.expand_first_dimension(
        inputs, dims)
    with self.test_session() as sess:
      expanded_tensor_out = sess.run(expanded_tensor)
    expected_output = [
        [[1, 2], [3, 4]],
        [[5, 6], [7, 8]],
        [[9, 10], [11, 12]]]
    self.assertAllEqual(expected_output, expanded_tensor_out)

  def test_expand_first_dimension_with_incompatible_dims(self):
    inputs_default = tf.constant(
        [
            [[1, 2]],
            [[3, 4]],
            [[5, 6]],
        ], dtype=tf.int32)
    inputs = tf.placeholder_with_default(inputs_default, [None, 1, 2])
    dims = [3, 2]
    expanded_tensor = shape_utils.expand_first_dimension(
        inputs, dims)
    with self.test_session() as sess:
      with self.assertRaises(tf.errors.InvalidArgumentError):
        sess.run(expanded_tensor)


if __name__ == '__main__':
  tf.test.main()
@@ -1005,6 +1005,10 @@ class EvalMetricOpsVisualization(object):
          lambda: tf.summary.image(summary_name, image),
          lambda: tf.constant(''))

    if tf.executing_eagerly():
      update_op = self.add_images([[images[0]]])
      image_tensors = get_images()
    else:
      update_op = tf.py_func(self.add_images, [[images[0]]], [])
      image_tensors = tf.py_func(
          get_images, [], [tf.uint8] * self._max_examples_to_draw)
......