Refactor object detection box predictors and fix some issues with model_main. (#4965)

* Merged commit includes the following changes: 206852642 by Zhichao Lu: Build the balanced_positive_negative_sampler in the model builder for FasterRCNN. Also adds an option to use the static implementation of the sampler. -- 206803260 by Zhichao Lu: Fixes a misplaced argument in resnet fpn feature extractor. -- 206682736 by Zhichao Lu: This CL modifies the SSD meta architecture to support both Slim-based and Keras-based box predictors, and begins preparation for Keras box predictor support in the other meta architectures. Concretely, this CL adds a new `KerasBoxPredictor` base class and makes the meta architectures appropriately call whichever box predictors they are using. We can switch the non-ssd meta architectures to fully support Keras box predictors once the Keras Convolutional Box Predictor CL is submitted. -- 206669634 by Zhichao Lu: Adds an alternate method for balanced positive negative sampler using static shapes. -- 206643278 by Zhichao Lu: This CL adds a Keras layer hyperparameter configuration object to the hyperparams_builder. It automatically converts from Slim layer hyperparameter configs to Keras layer hyperparameters. Namely, it: - Builds Keras initializers/regularizers instead of Slim ones - sets weights_regularizer/initializer to kernel_regularizer/initializer - converts batchnorm decay to momentum - converts Slim l2 regularizer weights to the equivalent Keras l2 weights This will be used in the conversion of object detection feature extractors & box predictors to newer Tensorflow APIs. -- 206611681 by Zhichao Lu: Internal changes. -- 206591619 by Zhichao Lu: Clip the to shape when the input tensors are larger than the expected padded static shape -- 206517644 by Zhichao Lu: Make MultiscaleGridAnchorGenerator more consistent with MultipleGridAnchorGenerator. -- 206415624 by Zhichao Lu: Make the hardcoded feature pyramid network (FPN) levels configurable for both SSD Resnet and SSD Mobilenet. -- 206398204 by Zhichao Lu: This CL modifies the SSD meta architecture to support both Slim-based and Keras-based feature extractors. This allows us to begin the conversion of object detection to newer Tensorflow APIs. -- 206213448 by Zhichao Lu: Adding a method to compute the expected classification loss by background/foreground weighting. -- 206204232 by Zhichao Lu: Adding the keypoint head to the Mask RCNN pipeline. -- 206200352 by Zhichao Lu: - Create Faster R-CNN target assigner in the model builder. This allows configuring matchers in Target assigner to use TPU compatible ops (tf.gather in this case) without any change in meta architecture. - As a +ve side effect of the refactoring, we can now re-use a single target assigner for all of second stage heads in Faster R-CNN. -- 206178206 by Zhichao Lu: Force ssd feature extractor builder to use keyword arguments so values won't be passed to wrong arguments. -- 206168297 by Zhichao Lu: Updating exporter to use freeze_graph.freeze_graph_with_def_protos rather than a homegrown version. -- 206080748 by Zhichao Lu: Merge external contributions. -- 206074460 by Zhichao Lu: Update to preprocessor to apply temperature and softmax to the multiclass scores on read. -- 205960802 by Zhichao Lu: Fixing a bug in hierarchical label expansion script. -- 205944686 by Zhichao Lu: Update exporter to support exporting quantized model. -- 205912529 by Zhichao Lu: Add a two stage matcher to allow for thresholding by one criteria and then argmaxing on the other. -- 205909017 by Zhichao Lu: Add test for grayscale image_resizer -- 205892801 by Zhichao Lu: Add flag to decide whether to apply batch norm to conv layers of weight shared box predictor. -- 205824449 by Zhichao Lu: make sure that by default mask rcnn box predictor predicts 2 stages. -- 205730139 by Zhichao Lu: Updating warning message to be more explicit about variable size mismatch. -- 205696992 by Zhichao Lu: Remove utils/ops.py's dependency on core/box_list_ops.py. This will allow re-using TPU compatible ops from utils/ops.py in core/box_list_ops.py. -- 205696867 by Zhichao Lu: Refactoring mask rcnn predictor so have each head in a separate file. This CL lets us to add new heads more easily in the future to mask rcnn. -- 205492073 by Zhichao Lu: Refactor R-FCN box predictor to be TPU compliant. - Change utils/ops.py:position_sensitive_crop_regions to operate on single image and set of boxes without `box_ind` - Add a batch version that operations on batches of images and batches of boxes. - Refactor R-FCN box predictor to use the batched version of position sensitive crop regions. -- 205453567 by Zhichao Lu: Fix bug that cannot export inference graph when write_inference_graph flag is True. -- 205316039 by Zhichao Lu: Changing input tensor name. -- 205256307 by Zhichao Lu: Fix model zoo links for quantized model. -- 205164432 by Zhichao Lu: Fixes eval error when label map contains non-ascii characters. -- 205129842 by Zhichao Lu: Adds a option to clip the anchors to the window size without filtering the overlapped boxes in Faster-RCNN -- 205094863 by Zhichao Lu: Update to label map util to allow the option of adding a background class and fill in gaps in the label map. Useful for using multiclass scores which require a complete label map with explicit background label. -- 204989032 by Zhichao Lu: Add tf.prof support to exporter. -- 204825267 by Zhichao Lu: Modify mask rcnn box predictor tests for TPU compatibility. -- 204778749 by Zhichao Lu: Remove score filtering from postprocessing.py and rely on filtering logic in tf.image.non_max_suppression -- 204775818 by Zhichao Lu: Python3 fixes for object_detection. -- 204745920 by Zhichao Lu: Object Detection Dataset visualization tool (documentation). -- 204686993 by Zhichao Lu: Internal changes. -- 204559667 by Zhichao Lu: Refactor box_predictor.py into multiple files. The abstract base class remains in the object_detection/core, The other classes have moved to a separate file each in object_detection/predictors -- 204552847 by Zhichao Lu: Update blog post link. -- 204508028 by Zhichao Lu: Bump down the batch size to 1024 to be a bit more tolerant to OOM and double the number of iterations. This job still converges to 20.5 mAP in 3 hours. -- PiperOrigin-RevId: 206852642 * Add original post-processing back.

Refactor object detection box predictors and fix some issues with model_main. (#4965)
* Merged commit includes the following changes: 206852642 by Zhichao Lu: Build the balanced_positive_negative_sampler in the model builder for FasterRCNN. Also adds an option to use the static implementation of the sampler. -- 206803260 by Zhichao Lu: Fixes a misplaced argument in resnet fpn feature extractor. -- 206682736 by Zhichao Lu: This CL modifies the SSD meta architecture to support both Slim-based and Keras-based box predictors, and begins preparation for Keras box predictor support in the other meta architectures. Concretely, this CL adds a new `KerasBoxPredictor` base class and makes the meta architectures appropriately call whichever box predictors they are using. We can switch the non-ssd meta architectures to fully support Keras box predictors once the Keras Convolutional Box Predictor CL is submitted. -- 206669634 by Zhichao Lu: Adds an alternate method for balanced positive negative sampler using static shapes. -- 206643278 by Zhichao Lu: This CL adds a Keras layer hyperparameter configuration object to the hyperparams_builder. It automatically converts from Slim layer hyperparameter configs to Keras layer hyperparameters. Namely, it: - Builds Keras initializers/regularizers instead of Slim ones - sets weights_regularizer/initializer to kernel_regularizer/initializer - converts batchnorm decay to momentum - converts Slim l2 regularizer weights to the equivalent Keras l2 weights This will be used in the conversion of object detection feature extractors & box predictors to newer Tensorflow APIs. -- 206611681 by Zhichao Lu: Internal changes. -- 206591619 by Zhichao Lu: Clip the to shape when the input tensors are larger than the expected padded static shape -- 206517644 by Zhichao Lu: Make MultiscaleGridAnchorGenerator more consistent with MultipleGridAnchorGenerator. -- 206415624 by Zhichao Lu: Make the hardcoded feature pyramid network (FPN) levels configurable for both SSD Resnet and SSD Mobilenet. -- 206398204 by Zhichao Lu: This CL modifies the SSD meta architecture to support both Slim-based and Keras-based feature extractors. This allows us to begin the conversion of object detection to newer Tensorflow APIs. -- 206213448 by Zhichao Lu: Adding a method to compute the expected classification loss by background/foreground weighting. -- 206204232 by Zhichao Lu: Adding the keypoint head to the Mask RCNN pipeline. -- 206200352 by Zhichao Lu: - Create Faster R-CNN target assigner in the model builder. This allows configuring matchers in Target assigner to use TPU compatible ops (tf.gather in this case) without any change in meta architecture. - As a +ve side effect of the refactoring, we can now re-use a single target assigner for all of second stage heads in Faster R-CNN. -- 206178206 by Zhichao Lu: Force ssd feature extractor builder to use keyword arguments so values won't be passed to wrong arguments. -- 206168297 by Zhichao Lu: Updating exporter to use freeze_graph.freeze_graph_with_def_protos rather than a homegrown version. -- 206080748 by Zhichao Lu: Merge external contributions. -- 206074460 by Zhichao Lu: Update to preprocessor to apply temperature and softmax to the multiclass scores on read. -- 205960802 by Zhichao Lu: Fixing a bug in hierarchical label expansion script. -- 205944686 by Zhichao Lu: Update exporter to support exporting quantized model. -- 205912529 by Zhichao Lu: Add a two stage matcher to allow for thresholding by one criteria and then argmaxing on the other. -- 205909017 by Zhichao Lu: Add test for grayscale image_resizer -- 205892801 by Zhichao Lu: Add flag to decide whether to apply batch norm to conv layers of weight shared box predictor. -- 205824449 by Zhichao Lu: make sure that by default mask rcnn box predictor predicts 2 stages. -- 205730139 by Zhichao Lu: Updating warning message to be more explicit about variable size mismatch. -- 205696992 by Zhichao Lu: Remove utils/ops.py's dependency on core/box_list_ops.py. This will allow re-using TPU compatible ops from utils/ops.py in core/box_list_ops.py. -- 205696867 by Zhichao Lu: Refactoring mask rcnn predictor so have each head in a separate file. This CL lets us to add new heads more easily in the future to mask rcnn. -- 205492073 by Zhichao Lu: Refactor R-FCN box predictor to be TPU compliant. - Change utils/ops.py:position_sensitive_crop_regions to operate on single image and set of boxes without `box_ind` - Add a batch version that operations on batches of images and batches of boxes. - Refactor R-FCN box predictor to use the batched version of position sensitive crop regions. -- 205453567 by Zhichao Lu: Fix bug that cannot export inference graph when write_inference_graph flag is True. -- 205316039 by Zhichao Lu: Changing input tensor name. -- 205256307 by Zhichao Lu: Fix model zoo links for quantized model. -- 205164432 by Zhichao Lu: Fixes eval error when label map contains non-ascii characters. -- 205129842 by Zhichao Lu: Adds a option to clip the anchors to the window size without filtering the overlapped boxes in Faster-RCNN -- 205094863 by Zhichao Lu: Update to label map util to allow the option of adding a background class and fill in gaps in the label map. Useful for using multiclass scores which require a complete label map with explicit background label. -- 204989032 by Zhichao Lu: Add tf.prof support to exporter. -- 204825267 by Zhichao Lu: Modify mask rcnn box predictor tests for TPU compatibility. -- 204778749 by Zhichao Lu: Remove score filtering from postprocessing.py and rely on filtering logic in tf.image.non_max_suppression -- 204775818 by Zhichao Lu: Python3 fixes for object_detection. -- 204745920 by Zhichao Lu: Object Detection Dataset visualization tool (documentation). -- 204686993 by Zhichao Lu: Internal changes. -- 204559667 by Zhichao Lu: Refactor box_predictor.py into multiple files. The abstract base class remains in the object_detection/core, The other classes have moved to a separate file each in object_detection/predictors -- 204552847 by Zhichao Lu: Update blog post link. -- 204508028 by Zhichao Lu: Bump down the batch size to 1024 to be a bit more tolerant to OOM and double the number of iterations. This job still converges to 20.5 mAP in 3 hours. -- PiperOrigin-RevId: 206852642 * Add original post-processing back.
02a9969e · pkulzc · GitHub · d135ed9c · 02a9969e · 02a9969e
Unverified Commit 02a9969e authored Aug 01, 2018 by pkulzc Committed by GitHub Aug 01, 2018
20 changed files
--- a/research/object_detection/models/ssd_mobilenet_v1_feature_extractor.py
+++ b/research/object_detection/models/ssd_mobilenet_v1_feature_extractor.py
@@ -61,8 +61,15 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        `conv_hyperparams_fn`.
    """
    super(SSDMobileNetV1FeatureExtractor, self).__init__(
-        is_training, depth_multiplier, min_depth, pad_to_multiple,
+        is_training=is_training,
-        conv_hyperparams_fn, reuse_weights, use_explicit_padding, use_depthwise,
+        depth_multiplier=depth_multiplier,
+        min_depth=min_depth,
+        pad_to_multiple=pad_to_multiple,
+        conv_hyperparams_fn=conv_hyperparams_fn,
+        reuse_weights=reuse_weights,
+        use_explicit_padding=use_explicit_padding,
+        use_depthwise=use_depthwise,
+        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams)
  def preprocess(self, resized_inputs):

--- a/research/object_detection/models/ssd_mobilenet_v1_fpn_feature_extractor.py
+++ b/research/object_detection/models/ssd_mobilenet_v1_fpn_feature_extractor.py
@@ -30,6 +30,61 @@ slim = tf.contrib.slim
 class SSDMobileNetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
  """SSD Feature Extractor using MobilenetV1 FPN features."""
+  def __init__(self,
+               is_training,
+               depth_multiplier,
+               min_depth,
+               pad_to_multiple,
+               conv_hyperparams_fn,
+               fpn_min_level=3,
+               fpn_max_level=7,
+               reuse_weights=None,
+               use_explicit_padding=False,
+               use_depthwise=False,
+               override_base_feature_extractor_hyperparams=False):
+    """SSD FPN feature extractor based on Mobilenet v1 architecture.
+    Args:
+      is_training: whether the network is in training mode.
+      depth_multiplier: float depth multiplier for feature extractor.
+      min_depth: minimum feature extractor depth.
+      pad_to_multiple: the nearest multiple to zero pad the input height and
+        width dimensions to.
+      conv_hyperparams_fn: A function to construct tf slim arg_scope for conv2d
+        and separable_conv2d ops in the layers that are added on top of the base
+        feature extractor.
+      fpn_min_level: the highest resolution feature map to use in FPN. The valid
+        values are {2, 3, 4, 5} which map to MobileNet v1 layers
+        {Conv2d_3_pointwise, Conv2d_5_pointwise, Conv2d_11_pointwise,
+        Conv2d_13_pointwise}, respectively.
+      fpn_max_level: the smallest resolution feature map to construct or use in
+        FPN. FPN constructions uses features maps starting from fpn_min_level
+        upto the fpn_max_level. In the case that there are not enough feature
+        maps in the backbone network, additional feature maps are created by
+        applying stride 2 convolutions until we get the desired number of fpn
+        levels.
+      reuse_weights: whether to reuse variables. Default is None.
+      use_explicit_padding: Whether to use explicit padding when extracting
+        features. Default is False.
+      use_depthwise: Whether to use depthwise convolutions. Default is False.
+      override_base_feature_extractor_hyperparams: Whether to override
+        hyperparameters of the base feature extractor with the one from
+        `conv_hyperparams_fn`.
+    """
+    super(SSDMobileNetV1FpnFeatureExtractor, self).__init__(
+        is_training=is_training,
+        depth_multiplier=depth_multiplier,
+        min_depth=min_depth,
+        pad_to_multiple=pad_to_multiple,
+        conv_hyperparams_fn=conv_hyperparams_fn,
+        reuse_weights=reuse_weights,
+        use_explicit_padding=use_explicit_padding,
+        use_depthwise=use_depthwise,
+        override_base_feature_extractor_hyperparams=
+        override_base_feature_extractor_hyperparams)
+    self._fpn_min_level = fpn_min_level
+    self._fpn_max_level = fpn_max_level
  def preprocess(self, resized_inputs):
    """SSD preprocessing.
@@ -78,24 +133,31 @@ class SSDMobileNetV1FpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
      depth_fn = lambda d: max(int(d * self._depth_multiplier), self._min_depth)
      with slim.arg_scope(self._conv_hyperparams_fn()):
        with tf.variable_scope('fpn', reuse=self._reuse_weights):
+          feature_blocks = [
+              'Conv2d_3_pointwise', 'Conv2d_5_pointwise', 'Conv2d_11_pointwise',
+              'Conv2d_13_pointwise'
+          ]
+          base_fpn_max_level = min(self._fpn_max_level, 5)
+          feature_block_list = []
+          for level in range(self._fpn_min_level, base_fpn_max_level + 1):
+            feature_block_list.append(feature_blocks[level - 2])
          fpn_features = feature_map_generators.fpn_top_down_feature_maps(
-              [(key, image_features[key])
+              [(key, image_features[key]) for key in feature_block_list],
-               for key in ['Conv2d_5_pointwise', 'Conv2d_11_pointwise',
-                           'Conv2d_13_pointwise']],
              depth=depth_fn(256))
-          last_feature_map = fpn_features['top_down_Conv2d_13_pointwise']
+          feature_maps = []
-          coarse_features = {}
+          for level in range(self._fpn_min_level, base_fpn_max_level + 1):
-          for i in range(14, 16):
+            feature_maps.append(fpn_features['top_down_{}'.format(
+                feature_blocks[level - 2])])
+          last_feature_map = fpn_features['top_down_{}'.format(
+              feature_blocks[base_fpn_max_level - 2])]
+          # Construct coarse features
+          for i in range(base_fpn_max_level + 1, self._fpn_max_level + 1):
            last_feature_map = slim.conv2d(
                last_feature_map,
                num_outputs=depth_fn(256),
                kernel_size=[3, 3],
                stride=2,
                padding='SAME',
-                scope='bottom_up_Conv2d_{}'.format(i))
+                scope='bottom_up_Conv2d_{}'.format(i - base_fpn_max_level + 13))
-            coarse_features['bottom_up_Conv2d_{}'.format(i)] = last_feature_map
+            feature_maps.append(last_feature_map)
-    return [fpn_features['top_down_Conv2d_5_pointwise'],
+    return feature_maps
-            fpn_features['top_down_Conv2d_11_pointwise'],
-            fpn_features['top_down_Conv2d_13_pointwise'],
-            coarse_features['bottom_up_Conv2d_14'],
-            coarse_features['bottom_up_Conv2d_15']]
--- a/research/object_detection/models/ssd_mobilenet_v2_feature_extractor.py
+++ b/research/object_detection/models/ssd_mobilenet_v2_feature_extractor.py
@@ -64,8 +64,15 @@ class SSDMobileNetV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        `conv_hyperparams_fn`.
    """
    super(SSDMobileNetV2FeatureExtractor, self).__init__(
-        is_training, depth_multiplier, min_depth, pad_to_multiple,
+        is_training=is_training,
-        conv_hyperparams_fn, reuse_weights, use_explicit_padding, use_depthwise,
+        depth_multiplier=depth_multiplier,
+        min_depth=min_depth,
+        pad_to_multiple=pad_to_multiple,
+        conv_hyperparams_fn=conv_hyperparams_fn,
+        reuse_weights=reuse_weights,
+        use_explicit_padding=use_explicit_padding,
+        use_depthwise=use_depthwise,
+        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams)
  def preprocess(self, resized_inputs):

--- a/research/object_detection/models/ssd_resnet_v1_fpn_feature_extractor.py
+++ b/research/object_detection/models/ssd_resnet_v1_fpn_feature_extractor.py
--- a/research/object_detection/object_detection_tutorial.ipynb
+++ b/research/object_detection/object_detection_tutorial.ipynb
--- a/research/object_detection/predictors/__init__.py
+++ b/research/object_detection/predictors/__init__.py
--- a/research/object_detection/predictors/convolutional_box_predictor.py
+++ b/research/object_detection/predictors/convolutional_box_predictor.py
--- a/research/object_detection/core/box_predictor_test.py
+++ b/research/object_detection/core/box_predictor_test.py
--- a/research/object_detection/predictors/mask_rcnn_box_predictor.py
+++ b/research/object_detection/predictors/mask_rcnn_box_predictor.py
--- a/research/object_detection/predictors/mask_rcnn_box_predictor_test.py
+++ b/research/object_detection/predictors/mask_rcnn_box_predictor_test.py
--- a/research/object_detection/predictors/mask_rcnn_heads/__init__.py
+++ b/research/object_detection/predictors/mask_rcnn_heads/__init__.py
--- a/research/object_detection/predictors/mask_rcnn_heads/box_head.py
+++ b/research/object_detection/predictors/mask_rcnn_heads/box_head.py
--- a/research/object_detection/predictors/mask_rcnn_heads/box_head_test.py
+++ b/research/object_detection/predictors/mask_rcnn_heads/box_head_test.py
--- a/research/object_detection/predictors/mask_rcnn_heads/class_head.py
+++ b/research/object_detection/predictors/mask_rcnn_heads/class_head.py
--- a/research/object_detection/predictors/mask_rcnn_heads/class_head_test.py
+++ b/research/object_detection/predictors/mask_rcnn_heads/class_head_test.py
--- a/research/object_detection/predictors/mask_rcnn_heads/keypoint_head.py
+++ b/research/object_detection/predictors/mask_rcnn_heads/keypoint_head.py
--- a/research/object_detection/predictors/mask_rcnn_heads/keypoint_head_test.py
+++ b/research/object_detection/predictors/mask_rcnn_heads/keypoint_head_test.py
--- a/research/object_detection/predictors/mask_rcnn_heads/mask_head.py
+++ b/research/object_detection/predictors/mask_rcnn_heads/mask_head.py
--- a/research/object_detection/predictors/mask_rcnn_heads/mask_head_test.py
+++ b/research/object_detection/predictors/mask_rcnn_heads/mask_head_test.py
--- a/research/object_detection/predictors/mask_rcnn_heads/mask_rcnn_head.py
+++ b/research/object_detection/predictors/mask_rcnn_heads/mask_rcnn_head.py