Unverified Commit 59f7e80a authored by pkulzc, committed by GitHub

Update object detection post-processing and fix the box padding/clipping issue. (#5026)

* Merged commit includes the following changes:
207771702  by Zhichao Lu:

    Refactoring evaluation utilities so that it is easier to introduce new DetectionEvaluators with eval_metric_ops.
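
    A minimal sketch of the refactored interface (the new get_evaluators()
    appears in the eval_util.py hunk at the end of this diff); the metric name
    and categories here are illustrative:

      from object_detection import eval_util
      from object_detection.protos import eval_pb2

      # Evaluators are now built straight from the eval config; unknown
      # metric names raise a ValueError inside get_evaluators().
      eval_config = eval_pb2.EvalConfig()
      eval_config.metrics_set.append('coco_detection_metrics')
      categories = [{'id': 1, 'name': 'cat'}, {'id': 2, 'name': 'dog'}]
      evaluators = eval_util.get_evaluators(eval_config, categories)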

--
207758641  by Zhichao Lu:

    Require TensorFlow 1.9+ for running the Object Detection API.
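
    A hedged sketch of the kind of guard this implies; the exact check and
    its location in the codebase are not shown in this diff:

      # Illustrative only: fail fast on TensorFlow versions older than 1.9.
      from distutils.version import LooseVersion
      import tensorflow as tf

      if LooseVersion(tf.__version__) < LooseVersion('1.9.0'):
        raise ImportError('The Object Detection API requires TensorFlow 1.9 '
                          'or newer; found %s.' % tf.__version__)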

--
207641470  by Zhichao Lu:

    Clip `num_groundtruth_boxes` in pad_input_data_to_static_shapes() to `max_num_boxes`. This prevents a scenario where tensors are sliced to an invalid range in model_lib.unstack_batch().
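
    A minimal sketch of the idea behind the fix; the helper name and the bare
    dict key are assumptions, not the exact code:

      import tensorflow as tf

      def clip_num_groundtruth_boxes(tensor_dict, max_num_boxes):
        # Keep the reported box count within the padded tensor's range so the
        # later slicing in model_lib.unstack_batch() cannot go out of bounds.
        tensor_dict['num_groundtruth_boxes'] = tf.minimum(
            tensor_dict['num_groundtruth_boxes'], max_num_boxes)
        return tensor_dict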

--
207621728  by Zhichao Lu:

    This CL adds a FreezableBatchNorm that inherits from the Keras BatchNormalization layer, but supports freezing the `training` parameter at construction time instead of having to do it in the `call` method.

    It also adds a method to the `KerasLayerHyperparams` class that will build an appropriate FreezableBatchNorm layer according to the hyperparameter configuration. If batch_norm is disabled, this method returns an identity Lambda layer.

    These will be used to simplify the conversion to Keras APIs.
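
    A short usage sketch, mirroring the new freezable_batch_norm test shown
    later in this diff:

      import tensorflow as tf
      from object_detection.core import freezable_batch_norm

      # Freeze at construction time: the layer always normalizes with its
      # moving statistics, as at inference, regardless of the learning phase.
      model = tf.keras.models.Sequential()
      model.add(freezable_batch_norm.FreezableBatchNorm(training=False,
                                                        input_shape=(10,)))
      # training=None (the default) defers to the Keras learning phase at
      # `call` time, matching the stock BatchNormalization behavior.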

--
207610524  by Zhichao Lu:

    Update anchor generators and box predictors for python3 compatibility.

--
207585122  by Zhichao Lu:

    Refactoring convolutional box predictor into separate prediction heads.

--
207549305  by Zhichao Lu:

    Pass all ones as batch weights if none are specified in the groundtruth.
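
    A minimal sketch of the fallback, assuming per-box weights keyed off the
    groundtruth box tensor (names illustrative):

      import tensorflow as tf

      def groundtruth_weights_or_ones(groundtruth_boxes,
                                      groundtruth_weights=None):
        # With no weights in the groundtruth, weight every box equally with 1.
        if groundtruth_weights is None:
          num_boxes = tf.shape(groundtruth_boxes)[0]
          groundtruth_weights = tf.ones([num_boxes], dtype=tf.float32)
        return groundtruth_weights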

--
207336575  by Zhichao Lu:

    Move the new argument 'target_assigner_instance' to the end of the list of arguments to the ssd_meta_arch constructor for backwards compatibility.

--
207327862  by Zhichao Lu:

    Enable support for float output in quantized custom op for postprocessing in SSD Mobilenet model.

--
207323154  by Zhichao Lu:

    Bug fix: change dict.iteritems() to dict.items()
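
    For reference, the incompatibility and its fix (illustrative values):

      metrics = {'mAP': 0.37, 'recall': 0.52}
      # Python 2 only -- dict.iteritems() raises AttributeError on Python 3:
      #   for name, value in metrics.iteritems(): ...
      # Works on both Python 2 and 3:
      for name, value in metrics.items():
        print(name, value)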

--
207301109  by Zhichao Lu:

    Integrating the expected_classification_loss_under_sampling op as an option in the ssd_meta_arch.
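
    The wiring added in model_builder.py (shown later in this diff); the two
    sampling values are the ones used in the updated model builder test:

      import functools
      from object_detection.utils import ops

      expected_classification_loss_under_sampling = functools.partial(
          ops.expected_classification_loss_under_sampling,
          minimum_negative_sampling=10,
          desired_negative_sampling_ratio=2)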

--
207286221  by Zhichao Lu:

    Adding an option to weight regression loss with foreground scores from the ground truth labels.
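
    Enabling the option mirrors the TargetAssigner change and its new unit
    test later in this diff:

      from object_detection.box_coders import mean_stddev_box_coder
      from object_detection.core import region_similarity_calculator
      from object_detection.core import target_assigner as targetassigner
      from object_detection.matchers import argmax_matcher

      similarity_calc = region_similarity_calculator.IouSimilarity()
      matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
                                             unmatched_threshold=0.5)
      box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
      # With the flag set, regression weights become
      # groundtruth_weights * scores instead of groundtruth_weights alone.
      assigner = targetassigner.TargetAssigner(
          similarity_calc, matcher, box_coder,
          weight_regression_loss_by_score=True)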

--
207231739  by Zhichao Lu:

    Explicitly mentioning the argument names when calling the batch target assigner.

--
207206356  by Zhichao Lu:

    Add include_trainable_variables field to train config to better handle trainable variables.

--
207135930  by Zhichao Lu:

    Internal change.

--
206862541  by Zhichao Lu:

    Do not unpad the outputs from batch_non_max_suppression before sampling.

    Since BalancedPositiveNegativeSampler takes an indicator for valid positions to sample from, we can pass the output from NMS directly into the sampler.
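
    A minimal sketch, assuming the sampler's standard
    subsample(indicator, batch_size, labels) interface; the tensors are
    illustrative:

      import tensorflow as tf
      from object_detection.core import balanced_positive_negative_sampler

      # `valid_indicator` marks the un-padded NMS outputs; padded slots are
      # simply never sampled, so no unpadding pass is needed first.
      valid_indicator = tf.constant([True, True, True, False, False])
      is_positive = tf.constant([True, False, False, False, False])

      bpn_sampler = (
          balanced_positive_negative_sampler.BalancedPositiveNegativeSampler(
              positive_fraction=0.5))
      sampled_mask = bpn_sampler.subsample(
          valid_indicator, batch_size=4, labels=is_positive)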

--

PiperOrigin-RevId: 207771702

* Remove unused doc.
parent fb6bc29b
@@ -58,7 +58,7 @@ class MultiscaleGridAnchorGenerator(anchor_generator.AnchorGenerator):
     self._normalize_coordinates = normalize_coordinates
     scales = [2**(float(scale) / scales_per_octave)
-              for scale in xrange(scales_per_octave)]
+              for scale in range(scales_per_octave)]
     aspects = list(aspect_ratios)
     for level in range(min_level, max_level + 1):
......
@@ -18,12 +18,280 @@
 from object_detection.predictors import convolutional_box_predictor
 from object_detection.predictors import mask_rcnn_box_predictor
 from object_detection.predictors import rfcn_box_predictor
-from object_detection.predictors.mask_rcnn_heads import box_head
-from object_detection.predictors.mask_rcnn_heads import class_head
-from object_detection.predictors.mask_rcnn_heads import mask_head
+from object_detection.predictors.heads import box_head
+from object_detection.predictors.heads import class_head
+from object_detection.predictors.heads import mask_head
 from object_detection.protos import box_predictor_pb2
def build_convolutional_box_predictor(
is_training,
num_classes,
conv_hyperparams_fn,
min_depth,
max_depth,
num_layers_before_predictor,
use_dropout,
dropout_keep_prob,
kernel_size,
box_code_size,
apply_sigmoid_to_scores=False,
class_prediction_bias_init=0.0,
use_depthwise=False,
predict_instance_masks=False,
mask_height=7,
mask_width=7,
masks_are_class_agnostic=False):
"""Builds the ConvolutionalBoxPredictor from the arguments.
Args:
is_training: Indicates whether the BoxPredictor is in training mode.
num_classes: Number of classes.
conv_hyperparams_fn: A function to generate tf-slim arg_scope with
hyperparameters for convolution ops.
min_depth: Minimum feature depth prior to predicting box encodings
and class predictions.
max_depth: Maximum feature depth prior to predicting box encodings
and class predictions. If max_depth is set to 0, no additional
feature map will be inserted before location and class predictions.
num_layers_before_predictor: Number of the additional conv layers before
the predictor.
use_dropout: Option to use dropout or not. Note that a single dropout
op is applied here prior to both box and class predictions, which stands
in contrast to the ConvolutionalBoxPredictor below.
dropout_keep_prob: Keep probability for dropout.
This is only used if use_dropout is True.
kernel_size: Size of final convolution kernel. If the
spatial resolution of the feature map is smaller than the kernel size,
then the kernel size is automatically set to be
min(feature_width, feature_height).
box_code_size: Size of encoding for each box.
apply_sigmoid_to_scores: if True, apply the sigmoid on the output
class_predictions.
class_prediction_bias_init: constant value to initialize bias of the last
conv2d layer before class prediction.
use_depthwise: Whether to use depthwise convolutions for prediction
steps. Default is False.
predict_instance_masks: If True, will add a third stage mask prediction
to the returned class.
mask_height: Desired output mask height. The default value is 7.
mask_width: Desired output mask width. The default value is 7.
masks_are_class_agnostic: Boolean determining if the mask-head is
class-agnostic or not.
Returns:
A ConvolutionalBoxPredictor class.
"""
box_prediction_head = box_head.ConvolutionalBoxHead(
is_training=is_training,
box_code_size=box_code_size,
kernel_size=kernel_size,
use_depthwise=use_depthwise)
class_prediction_head = class_head.ConvolutionalClassHead(
is_training=is_training,
num_classes=num_classes,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
kernel_size=kernel_size,
apply_sigmoid_to_scores=apply_sigmoid_to_scores,
class_prediction_bias_init=class_prediction_bias_init,
use_depthwise=use_depthwise)
other_heads = {}
if predict_instance_masks:
other_heads[convolutional_box_predictor.MASK_PREDICTIONS] = (
mask_head.ConvolutionalMaskHead(
is_training=is_training,
num_classes=num_classes,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
kernel_size=kernel_size,
use_depthwise=use_depthwise,
mask_height=mask_height,
mask_width=mask_width,
masks_are_class_agnostic=masks_are_class_agnostic))
return convolutional_box_predictor.ConvolutionalBoxPredictor(
is_training=is_training,
num_classes=num_classes,
box_prediction_head=box_prediction_head,
class_prediction_head=class_prediction_head,
other_heads=other_heads,
conv_hyperparams_fn=conv_hyperparams_fn,
num_layers_before_predictor=num_layers_before_predictor,
min_depth=min_depth,
max_depth=max_depth)
def build_weight_shared_convolutional_box_predictor(
is_training,
num_classes,
conv_hyperparams_fn,
depth,
num_layers_before_predictor,
box_code_size,
kernel_size=3,
class_prediction_bias_init=0.0,
use_dropout=False,
dropout_keep_prob=0.8,
share_prediction_tower=False,
apply_batch_norm=True,
predict_instance_masks=False,
mask_height=7,
mask_width=7,
masks_are_class_agnostic=False):
"""Builds and returns a WeightSharedConvolutionalBoxPredictor class.
Args:
is_training: Indicates whether the BoxPredictor is in training mode.
num_classes: number of classes. Note that num_classes *does not*
include the background category, so if groundtruth labels take values
in {0, 1, .., K-1}, num_classes=K (and not K+1, even though the
assigned classification targets can range from {0,... K}).
conv_hyperparams_fn: A function to generate tf-slim arg_scope with
hyperparameters for convolution ops.
depth: depth of conv layers.
num_layers_before_predictor: Number of the additional conv layers before
the predictor.
box_code_size: Size of encoding for each box.
kernel_size: Size of final convolution kernel.
class_prediction_bias_init: constant value to initialize bias of the last
conv2d layer before class prediction.
use_dropout: Whether to apply dropout to class prediction head.
    dropout_keep_prob: Probability of keeping activations.
share_prediction_tower: Whether to share the multi-layer tower between box
prediction and class prediction heads.
apply_batch_norm: Whether to apply batch normalization to conv layers in
this predictor.
predict_instance_masks: If True, will add a third stage mask prediction
to the returned class.
mask_height: Desired output mask height. The default value is 7.
mask_width: Desired output mask width. The default value is 7.
masks_are_class_agnostic: Boolean determining if the mask-head is
class-agnostic or not.
Returns:
A WeightSharedConvolutionalBoxPredictor class.
"""
box_prediction_head = box_head.WeightSharedConvolutionalBoxHead(
box_code_size=box_code_size,
kernel_size=kernel_size,
class_prediction_bias_init=class_prediction_bias_init)
class_prediction_head = (
class_head.WeightSharedConvolutionalClassHead(
num_classes=num_classes,
kernel_size=kernel_size,
class_prediction_bias_init=class_prediction_bias_init,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob))
other_heads = {}
if predict_instance_masks:
other_heads[convolutional_box_predictor.MASK_PREDICTIONS] = (
mask_head.WeightSharedConvolutionalMaskHead(
num_classes=num_classes,
kernel_size=kernel_size,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
mask_height=mask_height,
mask_width=mask_width,
masks_are_class_agnostic=masks_are_class_agnostic))
return convolutional_box_predictor.WeightSharedConvolutionalBoxPredictor(
is_training=is_training,
num_classes=num_classes,
box_prediction_head=box_prediction_head,
class_prediction_head=class_prediction_head,
other_heads=other_heads,
conv_hyperparams_fn=conv_hyperparams_fn,
depth=depth,
num_layers_before_predictor=num_layers_before_predictor,
kernel_size=kernel_size,
apply_batch_norm=apply_batch_norm,
share_prediction_tower=share_prediction_tower)
def build_mask_rcnn_box_predictor(is_training,
num_classes,
fc_hyperparams_fn,
use_dropout,
dropout_keep_prob,
box_code_size,
share_box_across_classes=False,
predict_instance_masks=False,
conv_hyperparams_fn=None,
mask_height=14,
mask_width=14,
mask_prediction_num_conv_layers=2,
mask_prediction_conv_depth=256,
masks_are_class_agnostic=False):
"""Builds and returns a MaskRCNNBoxPredictor class.
Args:
is_training: Indicates whether the BoxPredictor is in training mode.
num_classes: number of classes. Note that num_classes *does not*
include the background category, so if groundtruth labels take values
in {0, 1, .., K-1}, num_classes=K (and not K+1, even though the
assigned classification targets can range from {0,... K}).
fc_hyperparams_fn: A function to generate tf-slim arg_scope with
hyperparameters for fully connected ops.
use_dropout: Option to use dropout or not. Note that a single dropout
op is applied here prior to both box and class predictions, which stands
in contrast to the ConvolutionalBoxPredictor below.
dropout_keep_prob: Keep probability for dropout.
This is only used if use_dropout is True.
box_code_size: Size of encoding for each box.
share_box_across_classes: Whether to share boxes across classes rather
than use a different box for each class.
predict_instance_masks: If True, will add a third stage mask prediction
to the returned class.
conv_hyperparams_fn: A function to generate tf-slim arg_scope with
hyperparameters for convolution ops.
mask_height: Desired output mask height. The default value is 14.
mask_width: Desired output mask width. The default value is 14.
mask_prediction_num_conv_layers: Number of convolution layers applied to
the image_features in mask prediction branch.
mask_prediction_conv_depth: The depth for the first conv2d_transpose op
applied to the image_features in the mask prediction branch. If set
to 0, the depth of the convolution layers will be automatically chosen
based on the number of object classes and the number of channels in the
image features.
masks_are_class_agnostic: Boolean determining if the mask-head is
class-agnostic or not.
Returns:
A MaskRCNNBoxPredictor class.
"""
box_prediction_head = box_head.MaskRCNNBoxHead(
is_training=is_training,
num_classes=num_classes,
fc_hyperparams_fn=fc_hyperparams_fn,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
box_code_size=box_code_size,
share_box_across_classes=share_box_across_classes)
class_prediction_head = class_head.MaskRCNNClassHead(
is_training=is_training,
num_classes=num_classes,
fc_hyperparams_fn=fc_hyperparams_fn,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob)
third_stage_heads = {}
if predict_instance_masks:
third_stage_heads[
mask_rcnn_box_predictor.
MASK_PREDICTIONS] = mask_head.MaskRCNNMaskHead(
num_classes=num_classes,
conv_hyperparams_fn=conv_hyperparams_fn,
mask_height=mask_height,
mask_width=mask_width,
mask_prediction_num_conv_layers=mask_prediction_num_conv_layers,
mask_prediction_conv_depth=mask_prediction_conv_depth,
masks_are_class_agnostic=masks_are_class_agnostic)
return mask_rcnn_box_predictor.MaskRCNNBoxPredictor(
is_training=is_training,
num_classes=num_classes,
box_prediction_head=box_prediction_head,
class_prediction_head=class_prediction_head,
third_stage_heads=third_stage_heads)
 def build(argscope_fn, box_predictor_config, is_training, num_classes):
   """Builds box predictor based on the configuration.
@@ -56,25 +324,22 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes):
     config_box_predictor = box_predictor_config.convolutional_box_predictor
     conv_hyperparams_fn = argscope_fn(config_box_predictor.conv_hyperparams,
                                       is_training)
-    box_predictor_object = (
-        convolutional_box_predictor.ConvolutionalBoxPredictor(
-            is_training=is_training,
-            num_classes=num_classes,
-            conv_hyperparams_fn=conv_hyperparams_fn,
-            min_depth=config_box_predictor.min_depth,
-            max_depth=config_box_predictor.max_depth,
-            num_layers_before_predictor=(
-                config_box_predictor.num_layers_before_predictor),
-            use_dropout=config_box_predictor.use_dropout,
-            dropout_keep_prob=config_box_predictor.dropout_keep_probability,
-            kernel_size=config_box_predictor.kernel_size,
-            box_code_size=config_box_predictor.box_code_size,
-            apply_sigmoid_to_scores=config_box_predictor.
-            apply_sigmoid_to_scores,
-            class_prediction_bias_init=(
-                config_box_predictor.class_prediction_bias_init),
-            use_depthwise=config_box_predictor.use_depthwise))
-    return box_predictor_object
+    return build_convolutional_box_predictor(
+        is_training=is_training,
+        num_classes=num_classes,
+        conv_hyperparams_fn=conv_hyperparams_fn,
+        use_dropout=config_box_predictor.use_dropout,
+        dropout_keep_prob=config_box_predictor.dropout_keep_probability,
+        box_code_size=config_box_predictor.box_code_size,
+        kernel_size=config_box_predictor.kernel_size,
+        num_layers_before_predictor=(
+            config_box_predictor.num_layers_before_predictor),
+        min_depth=config_box_predictor.min_depth,
+        max_depth=config_box_predictor.max_depth,
+        apply_sigmoid_to_scores=config_box_predictor.apply_sigmoid_to_scores,
+        class_prediction_bias_init=(
+            config_box_predictor.class_prediction_bias_init),
+        use_depthwise=config_box_predictor.use_depthwise)

   if box_predictor_oneof == 'weight_shared_convolutional_box_predictor':
     config_box_predictor = (
@@ -83,23 +348,21 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes):
                                       is_training)
     apply_batch_norm = config_box_predictor.conv_hyperparams.HasField(
         'batch_norm')
-    box_predictor_object = (
-        convolutional_box_predictor.WeightSharedConvolutionalBoxPredictor(
-            is_training=is_training,
-            num_classes=num_classes,
-            conv_hyperparams_fn=conv_hyperparams_fn,
-            depth=config_box_predictor.depth,
-            num_layers_before_predictor=(
-                config_box_predictor.num_layers_before_predictor),
-            kernel_size=config_box_predictor.kernel_size,
-            box_code_size=config_box_predictor.box_code_size,
-            class_prediction_bias_init=config_box_predictor.
-            class_prediction_bias_init,
-            use_dropout=config_box_predictor.use_dropout,
-            dropout_keep_prob=config_box_predictor.dropout_keep_probability,
-            share_prediction_tower=config_box_predictor.share_prediction_tower,
-            apply_batch_norm=apply_batch_norm))
-    return box_predictor_object
+    return build_weight_shared_convolutional_box_predictor(
+        is_training=is_training,
+        num_classes=num_classes,
+        conv_hyperparams_fn=conv_hyperparams_fn,
+        depth=config_box_predictor.depth,
+        num_layers_before_predictor=(
+            config_box_predictor.num_layers_before_predictor),
+        box_code_size=config_box_predictor.box_code_size,
+        kernel_size=config_box_predictor.kernel_size,
+        class_prediction_bias_init=(
+            config_box_predictor.class_prediction_bias_init),
+        use_dropout=config_box_predictor.use_dropout,
+        dropout_keep_prob=config_box_predictor.dropout_keep_probability,
+        share_prediction_tower=config_box_predictor.share_prediction_tower,
+        apply_batch_norm=apply_batch_norm)

   if box_predictor_oneof == 'mask_rcnn_box_predictor':
     config_box_predictor = box_predictor_config.mask_rcnn_box_predictor
@@ -109,7 +372,7 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes):
     if config_box_predictor.HasField('conv_hyperparams'):
       conv_hyperparams_fn = argscope_fn(
           config_box_predictor.conv_hyperparams, is_training)
-    box_prediction_head = box_head.BoxHead(
+    return build_mask_rcnn_box_predictor(
         is_training=is_training,
         num_classes=num_classes,
         fc_hyperparams_fn=fc_hyperparams_fn,
@@ -117,34 +380,17 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes):
         dropout_keep_prob=config_box_predictor.dropout_keep_probability,
         box_code_size=config_box_predictor.box_code_size,
         share_box_across_classes=(
-            config_box_predictor.share_box_across_classes))
-    class_prediction_head = class_head.ClassHead(
-        is_training=is_training,
-        num_classes=num_classes,
-        fc_hyperparams_fn=fc_hyperparams_fn,
-        use_dropout=config_box_predictor.use_dropout,
-        dropout_keep_prob=config_box_predictor.dropout_keep_probability)
-    third_stage_heads = {}
-    if config_box_predictor.predict_instance_masks:
-      third_stage_heads[
-          mask_rcnn_box_predictor.MASK_PREDICTIONS] = mask_head.MaskHead(
-              num_classes=num_classes,
-              conv_hyperparams_fn=conv_hyperparams_fn,
-              mask_height=config_box_predictor.mask_height,
-              mask_width=config_box_predictor.mask_width,
-              mask_prediction_num_conv_layers=(
-                  config_box_predictor.mask_prediction_num_conv_layers),
-              mask_prediction_conv_depth=(
-                  config_box_predictor.mask_prediction_conv_depth),
-              masks_are_class_agnostic=(
-                  config_box_predictor.masks_are_class_agnostic))
-    box_predictor_object = mask_rcnn_box_predictor.MaskRCNNBoxPredictor(
-        is_training=is_training,
-        num_classes=num_classes,
-        box_prediction_head=box_prediction_head,
-        class_prediction_head=class_prediction_head,
-        third_stage_heads=third_stage_heads)
-    return box_predictor_object
+            config_box_predictor.share_box_across_classes),
+        predict_instance_masks=config_box_predictor.predict_instance_masks,
+        conv_hyperparams_fn=conv_hyperparams_fn,
+        mask_height=config_box_predictor.mask_height,
+        mask_width=config_box_predictor.mask_width,
+        mask_prediction_num_conv_layers=(
+            config_box_predictor.mask_prediction_num_conv_layers),
+        mask_prediction_conv_depth=(
+            config_box_predictor.mask_prediction_conv_depth),
+        masks_are_class_agnostic=(
+            config_box_predictor.masks_are_class_agnostic))

   if box_predictor_oneof == 'rfcn_box_predictor':
     config_box_predictor = box_predictor_config.rfcn_box_predictor
......
@@ -111,16 +111,17 @@ class ConvolutionalBoxPredictorBuilderTest(tf.test.TestCase):
         box_predictor_config=box_predictor_proto,
         is_training=False,
         num_classes=10)
+    class_head = box_predictor._class_prediction_head
     self.assertEqual(box_predictor._min_depth, 2)
     self.assertEqual(box_predictor._max_depth, 16)
     self.assertEqual(box_predictor._num_layers_before_predictor, 2)
-    self.assertFalse(box_predictor._use_dropout)
-    self.assertAlmostEqual(box_predictor._dropout_keep_prob, 0.4)
-    self.assertTrue(box_predictor._apply_sigmoid_to_scores)
-    self.assertAlmostEqual(box_predictor._class_prediction_bias_init, 4.0)
+    self.assertFalse(class_head._use_dropout)
+    self.assertAlmostEqual(class_head._dropout_keep_prob, 0.4)
+    self.assertTrue(class_head._apply_sigmoid_to_scores)
+    self.assertAlmostEqual(class_head._class_prediction_bias_init, 4.0)
     self.assertEqual(box_predictor.num_classes, 10)
     self.assertFalse(box_predictor._is_training)
-    self.assertTrue(box_predictor._use_depthwise)
+    self.assertTrue(class_head._use_depthwise)
   def test_construct_default_conv_box_predictor(self):
     box_predictor_text_proto = """
@@ -143,15 +144,16 @@ class ConvolutionalBoxPredictorBuilderTest(tf.test.TestCase):
         box_predictor_config=box_predictor_proto,
         is_training=True,
         num_classes=90)
+    class_head = box_predictor._class_prediction_head
     self.assertEqual(box_predictor._min_depth, 0)
     self.assertEqual(box_predictor._max_depth, 0)
     self.assertEqual(box_predictor._num_layers_before_predictor, 0)
-    self.assertTrue(box_predictor._use_dropout)
-    self.assertAlmostEqual(box_predictor._dropout_keep_prob, 0.8)
-    self.assertFalse(box_predictor._apply_sigmoid_to_scores)
+    self.assertTrue(class_head._use_dropout)
+    self.assertAlmostEqual(class_head._dropout_keep_prob, 0.8)
+    self.assertFalse(class_head._apply_sigmoid_to_scores)
     self.assertEqual(box_predictor.num_classes, 90)
     self.assertTrue(box_predictor._is_training)
-    self.assertFalse(box_predictor._use_depthwise)
+    self.assertFalse(class_head._use_depthwise)
 class WeightSharedConvolutionalBoxPredictorBuilderTest(tf.test.TestCase):
@@ -235,12 +237,13 @@ class WeightSharedConvolutionalBoxPredictorBuilderTest(tf.test.TestCase):
         box_predictor_config=box_predictor_proto,
         is_training=False,
         num_classes=10)
+    class_head = box_predictor._class_prediction_head
     self.assertEqual(box_predictor._depth, 2)
     self.assertEqual(box_predictor._num_layers_before_predictor, 2)
-    self.assertAlmostEqual(box_predictor._class_prediction_bias_init, 4.0)
+    self.assertEqual(box_predictor._apply_batch_norm, False)
+    self.assertAlmostEqual(class_head._class_prediction_bias_init, 4.0)
     self.assertEqual(box_predictor.num_classes, 10)
     self.assertFalse(box_predictor._is_training)
-    self.assertEqual(box_predictor._apply_batch_norm, False)

   def test_construct_default_conv_box_predictor(self):
     box_predictor_text_proto = """
......
@@ -16,6 +16,7 @@
 """Builder function to construct tf-slim arg_scope for convolution, fc ops."""
 import tensorflow as tf

+from object_detection.core import freezable_batch_norm
 from object_detection.protos import hyperparams_pb2
 from object_detection.utils import context_manager
@@ -93,6 +94,38 @@ class KerasLayerHyperparams(object):
     new_batch_norm_params.update(overrides)
     return new_batch_norm_params
def build_batch_norm(self, training=None, **overrides):
"""Returns a Batch Normalization layer with the appropriate hyperparams.
If the hyperparams are configured to not use batch normalization,
    this will return a Keras Lambda layer that only applies tf.identity,
without doing any normalization.
Optionally overrides values in the batch_norm hyperparam dict. Overrides
only apply to individual calls of this method, and do not affect
future calls.
Args:
training: if True, the normalization layer will normalize using the batch
statistics. If False, the normalization layer will be frozen and will
act as if it is being used for inference. If None, the layer
will look up the Keras learning phase at `call` time to decide what to
do.
**overrides: batch normalization construction args to override from the
batch_norm hyperparams dictionary.
Returns: Either a FreezableBatchNorm layer (if use_batch_norm() is True),
or a Keras Lambda layer that applies the identity (if use_batch_norm()
is False)
"""
if self.use_batch_norm():
return freezable_batch_norm.FreezableBatchNorm(
training=training,
**self.batch_norm_params(**overrides)
)
else:
return tf.keras.layers.Lambda(tf.identity)
   def params(self, **overrides):
     """Returns a dict containing the layer construction hyperparameters to use.
......
@@ -21,6 +21,7 @@ import tensorflow as tf
 from google.protobuf import text_format

 from object_detection.builders import hyperparams_builder
+from object_detection.core import freezable_batch_norm
 from object_detection.protos import hyperparams_pb2

 slim = tf.contrib.slim
@@ -282,6 +283,10 @@ class HyperparamsBuilderTest(tf.test.TestCase):
     self.assertFalse(batch_norm_params['center'])
     self.assertTrue(batch_norm_params['scale'])
batch_norm_layer = keras_config.build_batch_norm()
self.assertTrue(isinstance(batch_norm_layer,
freezable_batch_norm.FreezableBatchNorm))
   def test_return_non_default_batch_norm_params_keras_override(
       self):
     conv_hyperparams_text_proto = """
@@ -413,6 +418,11 @@
     self.assertFalse(keras_config.use_batch_norm())
     self.assertEqual(keras_config.batch_norm_params(), {})
# The batch norm builder should build an identity Lambda layer
identity_layer = keras_config.build_batch_norm()
self.assertTrue(isinstance(identity_layer,
tf.keras.layers.Lambda))
   def test_use_none_activation(self):
     conv_hyperparams_text_proto = """
       regularizer {
......
@@ -14,6 +14,7 @@
 # ==============================================================================
 """A function to build a DetectionModel from configuration."""

+import functools
 from object_detection.builders import anchor_generator_builder
 from object_detection.builders import box_coder_builder
 from object_detection.builders import box_predictor_builder
@@ -44,6 +45,8 @@ from object_detection.models.ssd_mobilenet_v1_ppn_feature_extractor import SSDMo
 from object_detection.models.ssd_mobilenet_v2_feature_extractor import SSDMobileNetV2FeatureExtractor
 from object_detection.predictors import rfcn_box_predictor
 from object_detection.protos import model_pb2
+from object_detection.utils import ops

 # A map of names to SSD feature extractors.
 SSD_FEATURE_EXTRACTOR_CLASS_MAP = {
@@ -220,6 +223,22 @@ def _build_ssd_model(ssd_config, is_training, add_summaries,
    random_example_sampler) = losses_builder.build(ssd_config.loss)
   normalize_loss_by_num_matches = ssd_config.normalize_loss_by_num_matches
   normalize_loc_loss_by_codesize = ssd_config.normalize_loc_loss_by_codesize
weight_regression_loss_by_score = (ssd_config.weight_regression_loss_by_score)
target_assigner_instance = target_assigner.TargetAssigner(
region_similarity_calculator,
matcher,
box_coder,
negative_class_weight=negative_class_weight,
weight_regression_loss_by_score=weight_regression_loss_by_score)
expected_classification_loss_under_sampling = None
if ssd_config.use_expected_classification_loss_under_sampling:
expected_classification_loss_under_sampling = functools.partial(
ops.expected_classification_loss_under_sampling,
minimum_negative_sampling=ssd_config.minimum_negative_sampling,
desired_negative_sampling_ratio=ssd_config.
desired_negative_sampling_ratio)
   return ssd_meta_arch.SSDMetaArch(
       is_training,
@@ -240,12 +259,15 @@ def _build_ssd_model(ssd_config, is_training, add_summaries,
       localization_weight,
       normalize_loss_by_num_matches,
       hard_example_miner,
+      target_assigner_instance=target_assigner_instance,
       add_summaries=add_summaries,
       normalize_loc_loss_by_codesize=normalize_loc_loss_by_codesize,
       freeze_batchnorm=ssd_config.freeze_batchnorm,
       inplace_batchnorm_update=ssd_config.inplace_batchnorm_update,
       add_background_class=add_background_class,
-      random_example_sampler=random_example_sampler)
+      random_example_sampler=random_example_sampler,
+      expected_classification_loss_under_sampling=
+      expected_classification_loss_under_sampling)


 def _build_faster_rcnn_feature_extractor(
......
@@ -144,6 +144,9 @@ class ModelBuilderTest(tf.test.TestCase):
           }
         }
       }
+      use_expected_classification_loss_under_sampling: true
+      minimum_negative_sampling: 10
+      desired_negative_sampling_ratio: 2
     }"""
     model_proto = model_pb2.DetectionModel()
     text_format.Merge(model_text_proto, model_proto)
@@ -151,6 +154,12 @@
     self.assertIsInstance(model, ssd_meta_arch.SSDMetaArch)
     self.assertIsInstance(model._feature_extractor,
                           SSDInceptionV2FeatureExtractor)
+    self.assertIsNotNone(model._expected_classification_loss_under_sampling)
+    self.assertEqual(
+        model._expected_classification_loss_under_sampling.keywords, {
+            'minimum_negative_sampling': 10,
+            'desired_negative_sampling_ratio': 2
+        })
   def test_create_ssd_inception_v3_model_from_config(self):
     model_text_proto = """
@@ -692,6 +701,7 @@
           }
         }
       }
+      weight_regression_loss_by_score: true
     }"""
     model_proto = model_pb2.DetectionModel()
     text_format.Merge(model_text_proto, model_proto)
@@ -700,6 +710,7 @@
     self.assertIsInstance(model._feature_extractor,
                           SSDMobileNetV2FeatureExtractor)
     self.assertTrue(model._normalize_loc_loss_by_codesize)
+    self.assertTrue(model._target_assigner._weight_regression_loss_by_score)

   def test_create_embedded_ssd_mobilenet_v1_model_from_config(self):
     model_text_proto = """
......
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""A freezable batch norm layer that uses Keras batch normalization."""
import tensorflow as tf
class FreezableBatchNorm(tf.keras.layers.BatchNormalization):
"""Batch normalization layer (Ioffe and Szegedy, 2014).
This is a `freezable` batch norm layer that supports setting the `training`
parameter in the __init__ method rather than having to set it either via
the Keras learning phase or via the `call` method parameter. This layer will
forward all other parameters to the default Keras `BatchNormalization`
layer.
This class is necessary because Object Detection model training sometimes
requires batch normalization layers to be `frozen` and used as if it were
evaluation time, despite still training (and potentially using dropout layers).
Like the default Keras BatchNormalization layer, this will normalize the
activations of the previous layer at each batch,
i.e. applies a transformation that maintains the mean activation
close to 0 and the activation standard deviation close to 1.
Arguments:
training: Boolean or None. If True, the batch normalization layer will
normalize the input batch using the batch mean and standard deviation,
and update the total moving mean and standard deviations. If False, the
layer will normalize using the moving average and std. dev, without
updating the learned avg and std. dev.
If None, the layer will follow the keras BatchNormalization layer
strategy of checking the Keras learning phase at `call` time to decide
what to do.
**kwargs: The keyword arguments to forward to the keras BatchNormalization
layer constructor.
Input shape:
Arbitrary. Use the keyword argument `input_shape`
(tuple of integers, does not include the samples axis)
when using this layer as the first layer in a model.
Output shape:
Same shape as input.
References:
- [Batch Normalization: Accelerating Deep Network Training by Reducing
Internal Covariate Shift](https://arxiv.org/abs/1502.03167)
"""
def __init__(self, training=None, **kwargs):
super(FreezableBatchNorm, self).__init__(**kwargs)
self._training = training
def call(self, inputs, training=None):
if training is None:
training = self._training
return super(FreezableBatchNorm, self).call(inputs, training=training)
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for object_detection.core.freezable_batch_norm."""
import numpy as np
import tensorflow as tf
from object_detection.core import freezable_batch_norm
class FreezableBatchNormTest(tf.test.TestCase):
"""Tests for FreezableBatchNorm operations."""
def _build_model(self, training=None):
model = tf.keras.models.Sequential()
norm = freezable_batch_norm.FreezableBatchNorm(training=training,
input_shape=(10,),
momentum=0.8)
model.add(norm)
return model, norm
def _train_freezable_batch_norm(self, training_mean, training_var):
model, _ = self._build_model()
model.compile(loss='mse', optimizer='sgd')
# centered on training_mean, variance training_var
train_data = np.random.normal(
loc=training_mean,
scale=training_var,
size=(1000, 10))
model.fit(train_data, train_data, epochs=4, verbose=0)
return model.weights
def test_batchnorm_freezing_training_true(self):
with self.test_session():
training_mean = 5.0
training_var = 10.0
testing_mean = -10.0
testing_var = 5.0
# Initially train the batch norm, and save the weights
trained_weights = self._train_freezable_batch_norm(training_mean,
training_var)
# Load the batch norm weights, freezing training to True.
# Apply the batch norm layer to testing data and ensure it is normalized
# according to the batch statistics.
model, norm = self._build_model(training=True)
for trained_weight, blank_weight in zip(trained_weights, model.weights):
weight_copy = blank_weight.assign(tf.keras.backend.eval(trained_weight))
tf.keras.backend.eval(weight_copy)
# centered on testing_mean, variance testing_var
test_data = np.random.normal(
loc=testing_mean,
scale=testing_var,
size=(1000, 10))
out_tensor = norm(tf.convert_to_tensor(test_data, dtype=tf.float32))
out = tf.keras.backend.eval(out_tensor)
out -= tf.keras.backend.eval(norm.beta)
out /= tf.keras.backend.eval(norm.gamma)
np.testing.assert_allclose(out.mean(), 0.0, atol=1.5e-1)
np.testing.assert_allclose(out.std(), 1.0, atol=1.5e-1)
def test_batchnorm_freezing_training_false(self):
with self.test_session():
training_mean = 5.0
training_var = 10.0
testing_mean = -10.0
testing_var = 5.0
# Initially train the batch norm, and save the weights
trained_weights = self._train_freezable_batch_norm(training_mean,
training_var)
# Load the batch norm back up, freezing training to False.
# Apply the batch norm layer to testing data and ensure it is normalized
# according to the training data's statistics.
model, norm = self._build_model(training=False)
for trained_weight, blank_weight in zip(trained_weights, model.weights):
weight_copy = blank_weight.assign(tf.keras.backend.eval(trained_weight))
tf.keras.backend.eval(weight_copy)
# centered on testing_mean, variance testing_var
test_data = np.random.normal(
loc=testing_mean,
scale=testing_var,
size=(1000, 10))
out_tensor = norm(tf.convert_to_tensor(test_data, dtype=tf.float32))
out = tf.keras.backend.eval(out_tensor)
out -= tf.keras.backend.eval(norm.beta)
out /= tf.keras.backend.eval(norm.gamma)
out *= training_var
out += (training_mean - testing_mean)
out /= testing_var
np.testing.assert_allclose(out.mean(), 0.0, atol=1.5e-1)
np.testing.assert_allclose(out.std(), 1.0, atol=1.5e-1)
if __name__ == '__main__':
tf.test.main()
@@ -47,6 +47,9 @@ def multiclass_non_max_suppression(boxes,
   Please note that this operation is performed on *all* classes, therefore any
   background classes should be removed prior to calling this function.

+  Selected boxes are guaranteed to be sorted in decreasing order by score (but
+  the sort is not guaranteed to be stable).
+
   Args:
     boxes: A [k, q, 4] float32 tensor containing k detections. `q` can be either
       number of classes or 1 depending on whether a separate box is predicted
@@ -106,15 +109,9 @@
                      'must be specified.')

   with tf.name_scope(scope, 'MultiClassNonMaxSuppression'):
-    num_boxes = tf.shape(boxes)[0]
     num_scores = tf.shape(scores)[0]
     num_classes = scores.get_shape()[1]

-    length_assert = tf.Assert(
-        tf.equal(num_boxes, num_scores),
-        ['Incorrect scores field length: actual vs expected.',
-         num_scores, num_boxes])
-
     selected_boxes_list = []
     per_class_boxes_list = tf.unstack(boxes, axis=1)
     if masks is not None:
@@ -126,9 +123,9 @@
     for class_idx, boxes_idx in zip(range(num_classes), boxes_ids):
       per_class_boxes = per_class_boxes_list[boxes_idx]
       boxlist_and_class_scores = box_list.BoxList(per_class_boxes)
-      with tf.control_dependencies([length_assert]):
-        class_scores = tf.reshape(
-            tf.slice(scores, [0, class_idx], tf.stack([num_scores, 1])), [-1])
+      class_scores = tf.reshape(
+          tf.slice(scores, [0, class_idx], tf.stack([num_scores, 1])), [-1])
       boxlist_and_class_scores.add_field(fields.BoxListFields.scores,
                                          class_scores)
       if masks is not None:
@@ -142,22 +139,17 @@
       if additional_fields is not None:
         for key, tensor in additional_fields.items():
           boxlist_and_class_scores.add_field(key, tensor)
-      boxlist_filtered = box_list_ops.filter_greater_than(
-          boxlist_and_class_scores, score_thresh)
-      if clip_window is not None:
-        boxlist_filtered = box_list_ops.clip_to_window(
-            boxlist_filtered, clip_window)
-      if change_coordinate_frame:
-        boxlist_filtered = box_list_ops.change_coordinate_frame(
-            boxlist_filtered, clip_window)
       max_selection_size = tf.minimum(max_size_per_class,
-                                      boxlist_filtered.num_boxes())
+                                      boxlist_and_class_scores.num_boxes())
       selected_indices = tf.image.non_max_suppression(
-          boxlist_filtered.get(),
-          boxlist_filtered.get_field(fields.BoxListFields.scores),
+          boxlist_and_class_scores.get(),
+          boxlist_and_class_scores.get_field(fields.BoxListFields.scores),
           max_selection_size,
-          iou_threshold=iou_thresh)
-      nms_result = box_list_ops.gather(boxlist_filtered, selected_indices)
+          iou_threshold=iou_thresh,
+          score_threshold=score_thresh)
+      nms_result = box_list_ops.gather(boxlist_and_class_scores,
+                                       selected_indices)
       nms_result.add_field(
           fields.BoxListFields.classes, (tf.zeros_like(
               nms_result.get_field(fields.BoxListFields.scores)) + class_idx))
@@ -165,6 +157,11 @@
     selected_boxes = box_list_ops.concatenate(selected_boxes_list)
     sorted_boxes = box_list_ops.sort_by_field(selected_boxes,
                                               fields.BoxListFields.scores)
+    if clip_window is not None:
+      sorted_boxes = box_list_ops.clip_to_window(sorted_boxes, clip_window)
+    if change_coordinate_frame:
+      sorted_boxes = box_list_ops.change_coordinate_frame(
+          sorted_boxes, clip_window)
     if max_total_size:
       max_total_size = tf.minimum(max_total_size,
                                   sorted_boxes.num_boxes())
......
@@ -22,24 +22,6 @@ from object_detection.core import standard_fields as fields

 class MulticlassNonMaxSuppressionTest(tf.test.TestCase):

-  def test_with_invalid_scores_size(self):
-    boxes = tf.constant([[[0, 0, 1, 1]],
-                         [[0, 0.1, 1, 1.1]],
-                         [[0, -0.1, 1, 0.9]],
-                         [[0, 10, 1, 11]],
-                         [[0, 10.1, 1, 11.1]],
-                         [[0, 100, 1, 101]]], tf.float32)
-    scores = tf.constant([[.9], [.75], [.6], [.95], [.5]])
-    iou_thresh = .5
-    score_thresh = 0.6
-    max_output_size = 3
-    nms = post_processing.multiclass_non_max_suppression(
-        boxes, scores, score_thresh, iou_thresh, max_output_size)
-    with self.test_session() as sess:
-      with self.assertRaisesWithPredicateMatch(
-          tf.errors.InvalidArgumentError, 'Incorrect scores field length'):
-        sess.run(nms.get())
-
   def test_multiclass_nms_select_with_shared_boxes(self):
     boxes = tf.constant([[[0, 0, 1, 1]],
                          [[0, 0.1, 1, 1.1]],
......
@@ -48,8 +48,12 @@ from object_detection.utils import shape_utils
 class TargetAssigner(object):
   """Target assigner to compute classification and regression targets."""

-  def __init__(self, similarity_calc, matcher, box_coder,
-               negative_class_weight=1.0):
+  def __init__(self,
+               similarity_calc,
+               matcher,
+               box_coder,
+               negative_class_weight=1.0,
+               weight_regression_loss_by_score=False):
     """Construct Object Detection Target Assigner.

     Args:
@@ -60,6 +64,8 @@ class TargetAssigner(object):
         groundtruth boxes with respect to anchors.
       negative_class_weight: classification weight to be associated to negative
         anchors (default: 1.0). The weight must be in [0., 1.].
+      weight_regression_loss_by_score: Whether to weight the regression loss by
+        ground truth box score.

     Raises:
       ValueError: if similarity_calc is not a RegionSimilarityCalculator or
@@ -75,14 +81,20 @@
     self._matcher = matcher
     self._box_coder = box_coder
     self._negative_class_weight = negative_class_weight
+    self._weight_regression_loss_by_score = weight_regression_loss_by_score

   @property
   def box_coder(self):
     return self._box_coder

   # TODO(rathodv): move labels, scores, and weights to groundtruth_boxes fields.
-  def assign(self, anchors, groundtruth_boxes, groundtruth_labels=None,
-             unmatched_class_label=None, groundtruth_weights=None, **params):
+  def assign(self,
+             anchors,
+             groundtruth_boxes,
+             groundtruth_labels=None,
+             unmatched_class_label=None,
+             groundtruth_weights=None,
+             **params):
     """Assign classification and regression targets to each anchor.

     For a given set of anchors and groundtruth detections, match anchors
@@ -172,7 +184,13 @@
     cls_targets = self._create_classification_targets(groundtruth_labels,
                                                       unmatched_class_label,
                                                       match)
-    reg_weights = self._create_regression_weights(match, groundtruth_weights)
+    if self._weight_regression_loss_by_score:
+      reg_weights = self._create_regression_weights(
+          match, groundtruth_weights * scores)
+    else:
+      reg_weights = self._create_regression_weights(match,
+                                                    groundtruth_weights)
+
     cls_weights = self._create_classification_weights(match,
                                                       groundtruth_weights)
@@ -458,9 +476,9 @@ def batch_assign_targets(target_assigner,
     gt_weights_batch = [None] * len(gt_class_targets_batch)
   for anchors, gt_boxes, gt_class_targets, gt_weights in zip(
       anchors_batch, gt_box_batch, gt_class_targets_batch, gt_weights_batch):
-    (cls_targets, cls_weights, reg_targets, reg_weights,
-     match) = target_assigner.assign(anchors, gt_boxes, gt_class_targets,
-                                     unmatched_class_label, gt_weights)
+    (cls_targets, cls_weights,
+     reg_targets, reg_weights, match) = target_assigner.assign(
+         anchors, gt_boxes, gt_class_targets, unmatched_class_label, gt_weights)
     cls_targets_list.append(cls_targets)
     cls_weights_list.append(cls_weights)
     reg_targets_list.append(reg_targets)
......
@@ -318,6 +318,50 @@ class TargetAssignerTest(test_case.TestCase):
     self.assertAllClose(cls_weights_out, exp_cls_weights)
     self.assertAllClose(reg_weights_out, exp_reg_weights)
def test_assign_multiclass_with_weight_regression_loss_by_score(self):
def graph_fn(anchor_means, groundtruth_box_corners, groundtruth_labels):
similarity_calc = region_similarity_calculator.IouSimilarity()
matcher = argmax_matcher.ArgMaxMatcher(
matched_threshold=0.5, unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
unmatched_class_label = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
target_assigner = targetassigner.TargetAssigner(
similarity_calc,
matcher,
box_coder,
weight_regression_loss_by_score=True)
anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
result = target_assigner.assign(
anchors_boxlist,
groundtruth_boxlist,
groundtruth_labels,
unmatched_class_label=unmatched_class_label)
(_, cls_weights, _, reg_weights, _) = result
return (cls_weights, reg_weights)
anchor_means = np.array(
[[0.0, 0.0, 0.5, 0.5], [0.5, 0.5, 1.0, 0.8], [0, 0.5, .5, 1.0],
[.75, 0, 1.0, .25]],
dtype=np.float32)
groundtruth_box_corners = np.array(
[[0.0, 0.0, 0.5, 0.5], [0.5, 0.5, 0.9, 0.9], [.75, 0, .95, .27]],
dtype=np.float32)
groundtruth_labels = np.array(
[[.9, .1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0],
[.5, 0, 0, .5, 0, 0, 0]],
dtype=np.float32)
exp_cls_weights = [1, 1, 1, 1] # background class gets weight of 1.
exp_reg_weights = [.1, 1, 0., .5] # background class gets weight of 0.
(cls_weights_out, reg_weights_out) = self.execute(
graph_fn, [anchor_means, groundtruth_box_corners, groundtruth_labels])
self.assertAllClose(cls_weights_out, exp_cls_weights)
self.assertAllClose(reg_weights_out, exp_reg_weights)
   def test_assign_multidimensional_class_targets(self):
     def graph_fn(anchor_means, groundtruth_box_corners, groundtruth_labels):
......
@@ -32,6 +32,18 @@ from object_detection.utils import visualization_utils as vis_utils
 slim = tf.contrib.slim
# A dictionary of metric names to classes that implement the metric. The classes
# in the dictionary must implement
# utils.object_detection_evaluation.DetectionEvaluator interface.
EVAL_METRICS_CLASS_DICT = {
'coco_detection_metrics':
coco_evaluation.CocoDetectionEvaluator,
'coco_mask_metrics':
coco_evaluation.CocoMaskEvaluator,
}
EVAL_DEFAULT_METRIC = 'coco_detection_metrics'
 def write_metrics(metrics, global_step, summary_dir):
   """Write metrics to a summary directory.
@@ -582,70 +594,90 @@ def result_dict_for_single_example(image,
   return output_dict
-def get_eval_metric_ops_for_evaluators(evaluation_metrics,
-                                       categories,
-                                       eval_dict,
-                                       include_metrics_per_category=False):
-  """Returns a dictionary of eval metric ops to use with `tf.EstimatorSpec`.
-
-  Args:
-    evaluation_metrics: List of evaluation metric names. Current options are
-      'coco_detection_metrics' and 'coco_mask_metrics'.
-    categories: A list of dicts, each of which has the following keys -
-      'id': (required) an integer id uniquely identifying this category.
-      'name': (required) string representing category name e.g., 'cat', 'dog'.
-    eval_dict: An evaluation dictionary, returned from
-      result_dict_for_single_example().
-    include_metrics_per_category: If True, additionally include per-category
-      metrics.
-
-  Returns:
-    A dictionary of metric names to tuple of value_op and update_op that can be
-    used as eval metric ops in tf.EstimatorSpec.
-
-  Raises:
-    ValueError: If any of the metrics in `evaluation_metrics` is not
-      'coco_detection_metrics' or 'coco_mask_metrics'.
-  """
-  evaluation_metrics = list(set(evaluation_metrics))
-
-  input_data_fields = fields.InputDataFields
-  detection_fields = fields.DetectionResultFields
-  eval_metric_ops = {}
-  for metric in evaluation_metrics:
-    if metric == 'coco_detection_metrics':
-      coco_evaluator = coco_evaluation.CocoDetectionEvaluator(
-          categories, include_metrics_per_category=include_metrics_per_category)
-      eval_metric_ops.update(
-          coco_evaluator.get_estimator_eval_metric_ops(
-              image_id=eval_dict[input_data_fields.key],
-              groundtruth_boxes=eval_dict[input_data_fields.groundtruth_boxes],
-              groundtruth_classes=eval_dict[
-                  input_data_fields.groundtruth_classes],
-              detection_boxes=eval_dict[detection_fields.detection_boxes],
-              detection_scores=eval_dict[detection_fields.detection_scores],
-              detection_classes=eval_dict[detection_fields.detection_classes],
-              groundtruth_is_crowd=eval_dict.get(
-                  input_data_fields.groundtruth_is_crowd)))
-    elif metric == 'coco_mask_metrics':
-      coco_mask_evaluator = coco_evaluation.CocoMaskEvaluator(
-          categories, include_metrics_per_category=include_metrics_per_category)
-      eval_metric_ops.update(
-          coco_mask_evaluator.get_estimator_eval_metric_ops(
-              image_id=eval_dict[input_data_fields.key],
-              groundtruth_boxes=eval_dict[input_data_fields.groundtruth_boxes],
-              groundtruth_classes=eval_dict[
-                  input_data_fields.groundtruth_classes],
-              groundtruth_instance_masks=eval_dict[
-                  input_data_fields.groundtruth_instance_masks],
-              detection_scores=eval_dict[detection_fields.detection_scores],
-              detection_classes=eval_dict[detection_fields.detection_classes],
-              detection_masks=eval_dict[detection_fields.detection_masks],
-              groundtruth_is_crowd=eval_dict.get(
-                  input_data_fields.groundtruth_is_crowd),))
-    else:
-      raise ValueError('The only evaluation metrics supported are '
-                       '"coco_detection_metrics" and "coco_mask_metrics". '
-                       'Found {} in the evaluation metrics'.format(metric))
-
-  return eval_metric_ops
+def get_evaluators(eval_config, categories, evaluator_options=None):
+  """Returns evaluator instances according to eval_config, valid for categories.
+
+  Args:
+    eval_config: An `eval_pb2.EvalConfig`.
+    categories: A list of dicts, each of which has the following keys -
+      'id': (required) an integer id uniquely identifying this category.
+      'name': (required) string representing category name e.g., 'cat', 'dog'.
+    evaluator_options: A dictionary of metric names (see
+      EVAL_METRICS_CLASS_DICT) to `DetectionEvaluator` initialization
+      keyword arguments. For example:
+      evaluator_options = {
+        'coco_detection_metrics': {'include_metrics_per_category': True}
+      }
+
+  Returns:
+    A list of instances of DetectionEvaluator.
+
+  Raises:
+    ValueError: if metric is not in the metric class dictionary.
+  """
+  evaluator_options = evaluator_options or {}
+  eval_metric_fn_keys = eval_config.metrics_set
+  if not eval_metric_fn_keys:
+    eval_metric_fn_keys = [EVAL_DEFAULT_METRIC]
+  evaluators_list = []
+  for eval_metric_fn_key in eval_metric_fn_keys:
+    if eval_metric_fn_key not in EVAL_METRICS_CLASS_DICT:
+      raise ValueError('Metric not found: {}'.format(eval_metric_fn_key))
+    kwargs_dict = (evaluator_options[eval_metric_fn_key] if eval_metric_fn_key
+                   in evaluator_options else {})
+    evaluators_list.append(EVAL_METRICS_CLASS_DICT[eval_metric_fn_key](
+        categories,
+        **kwargs_dict))
+  return evaluators_list
+
+
+def get_eval_metric_ops_for_evaluators(eval_config,
+                                       categories,
+                                       eval_dict):
+  """Returns eval metrics ops to use with `tf.estimator.EstimatorSpec`.
+
+  Args:
+    eval_config: An `eval_pb2.EvalConfig`.
+    categories: A list of dicts, each of which has the following keys -
+      'id': (required) an integer id uniquely identifying this category.
+      'name': (required) string representing category name e.g., 'cat', 'dog'.
+    eval_dict: An evaluation dictionary, returned from
+      result_dict_for_single_example().
+
+  Returns:
+    A dictionary of metric names to tuple of value_op and update_op that can be
+    used as eval metric ops in tf.EstimatorSpec.
+  """
+  eval_metric_ops = {}
+  evaluator_options = evaluator_options_from_eval_config(eval_config)
+  evaluators_list = get_evaluators(eval_config, categories, evaluator_options)
+  for evaluator in evaluators_list:
+    eval_metric_ops.update(evaluator.get_estimator_eval_metric_ops(
+        eval_dict))
+  return eval_metric_ops
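# Editor's note: a minimal, hedged usage sketch (not part of this change)
# showing how the refactored entry point is typically wired into an
# Estimator's EVAL spec. `loss`, `categories` and `eval_dict` are assumed to
# come from the surrounding model_fn, and `eval_pb2` is imported from
# object_detection.protos (as in the tests below), not in this module:
#
#   eval_config = eval_pb2.EvalConfig()
#   eval_config.metrics_set.extend(['coco_detection_metrics'])
#   eval_metric_ops = get_eval_metric_ops_for_evaluators(
#       eval_config, categories, eval_dict)
#   spec = tf.estimator.EstimatorSpec(
#       mode=tf.estimator.ModeKeys.EVAL,
#       loss=loss,
#       eval_metric_ops=eval_metric_ops)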
def evaluator_options_from_eval_config(eval_config):
"""Produces a dictionary of evaluation options for each eval metric.
Args:
eval_config: An `eval_pb2.EvalConfig`.
Returns:
evaluator_options: A dictionary of metric names (see
EVAL_METRICS_CLASS_DICT) to `DetectionEvaluator` initialization
keyword arguments. For example:
evaluator_options = {
'coco_detection_metrics': {'include_metrics_per_category': True}
}
"""
eval_metric_fn_keys = eval_config.metrics_set
evaluator_options = {}
for eval_metric_fn_key in eval_metric_fn_keys:
if eval_metric_fn_key in ('coco_detection_metrics', 'coco_mask_metrics'):
evaluator_options[eval_metric_fn_key] = {
'include_metrics_per_category': (
eval_config.include_metrics_per_category)
}
return evaluator_options
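# Editor's note: a short, hedged example of the round trip exercised by the
# tests below; `eval_pb2` is imported there, not in this module, and
# `categories` is assumed to be a list of category dicts:
#
#   eval_config = eval_pb2.EvalConfig()
#   eval_config.metrics_set.extend(['coco_detection_metrics'])
#   eval_config.include_metrics_per_category = True
#   options = evaluator_options_from_eval_config(eval_config)
#   # options == {'coco_detection_metrics':
#   #                 {'include_metrics_per_category': True}}
#   evaluators = get_evaluators(eval_config, categories, options)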
...@@ -23,6 +23,7 @@ import tensorflow as tf
from object_detection import eval_util
from object_detection.core import standard_fields as fields
from object_detection.protos import eval_pb2
class EvalUtilTest(tf.test.TestCase):
...@@ -64,11 +65,12 @@ class EvalUtilTest(tf.test.TestCase):
groundtruth)
  def test_get_eval_metric_ops_for_coco_detections(self):
-    evaluation_metrics = ['coco_detection_metrics']
+    eval_config = eval_pb2.EvalConfig()
+    eval_config.metrics_set.extend(['coco_detection_metrics'])
    categories = self._get_categories_list()
    eval_dict = self._make_evaluation_dict()
    metric_ops = eval_util.get_eval_metric_ops_for_evaluators(
-        evaluation_metrics, categories, eval_dict)
+        eval_config, categories, eval_dict)
    _, update_op = metric_ops['DetectionBoxes_Precision/mAP']
    with self.test_session() as sess:
...@@ -82,12 +84,13 @@ class EvalUtilTest(tf.test.TestCase):
self.assertNotIn('DetectionMasks_Precision/mAP', metrics)
  def test_get_eval_metric_ops_for_coco_detections_and_masks(self):
-    evaluation_metrics = ['coco_detection_metrics',
-                          'coco_mask_metrics']
+    eval_config = eval_pb2.EvalConfig()
+    eval_config.metrics_set.extend(
+        ['coco_detection_metrics', 'coco_mask_metrics'])
    categories = self._get_categories_list()
    eval_dict = self._make_evaluation_dict()
    metric_ops = eval_util.get_eval_metric_ops_for_evaluators(
-        evaluation_metrics, categories, eval_dict)
+        eval_config, categories, eval_dict)
    _, update_op_boxes = metric_ops['DetectionBoxes_Precision/mAP']
    _, update_op_masks = metric_ops['DetectionMasks_Precision/mAP']
...@@ -102,12 +105,13 @@ class EvalUtilTest(tf.test.TestCase):
self.assertAlmostEqual(1.0, metrics['DetectionMasks_Precision/mAP'])
  def test_get_eval_metric_ops_for_coco_detections_and_resized_masks(self):
-    evaluation_metrics = ['coco_detection_metrics',
-                          'coco_mask_metrics']
+    eval_config = eval_pb2.EvalConfig()
+    eval_config.metrics_set.extend(
+        ['coco_detection_metrics', 'coco_mask_metrics'])
    categories = self._get_categories_list()
    eval_dict = self._make_evaluation_dict(resized_groundtruth_masks=True)
    metric_ops = eval_util.get_eval_metric_ops_for_evaluators(
-        evaluation_metrics, categories, eval_dict)
+        eval_config, categories, eval_dict)
    _, update_op_boxes = metric_ops['DetectionBoxes_Precision/mAP']
    _, update_op_masks = metric_ops['DetectionMasks_Precision/mAP']
...@@ -122,13 +126,53 @@ class EvalUtilTest(tf.test.TestCase):
self.assertAlmostEqual(1.0, metrics['DetectionMasks_Precision/mAP'])
  def test_get_eval_metric_ops_raises_error_with_unsupported_metric(self):
-    evaluation_metrics = ['unsupported_metrics']
+    eval_config = eval_pb2.EvalConfig()
+    eval_config.metrics_set.extend(['unsupported_metric'])
    categories = self._get_categories_list()
    eval_dict = self._make_evaluation_dict()
    with self.assertRaises(ValueError):
      eval_util.get_eval_metric_ops_for_evaluators(
-          evaluation_metrics, categories, eval_dict)
+          eval_config, categories, eval_dict)
def test_get_eval_metric_ops_for_evaluators(self):
eval_config = eval_pb2.EvalConfig()
eval_config.metrics_set.extend(
['coco_detection_metrics', 'coco_mask_metrics'])
eval_config.include_metrics_per_category = True
evaluator_options = eval_util.evaluator_options_from_eval_config(
eval_config)
self.assertTrue(evaluator_options['coco_detection_metrics'][
'include_metrics_per_category'])
self.assertTrue(evaluator_options['coco_mask_metrics'][
'include_metrics_per_category'])
def test_get_evaluator_with_evaluator_options(self):
eval_config = eval_pb2.EvalConfig()
eval_config.metrics_set.extend(['coco_detection_metrics'])
eval_config.include_metrics_per_category = True
categories = self._get_categories_list()
evaluator_options = eval_util.evaluator_options_from_eval_config(
eval_config)
evaluator = eval_util.get_evaluators(
eval_config, categories, evaluator_options)
self.assertTrue(evaluator[0]._include_metrics_per_category)
def test_get_evaluator_with_no_evaluator_options(self):
eval_config = eval_pb2.EvalConfig()
eval_config.metrics_set.extend(['coco_detection_metrics'])
eval_config.include_metrics_per_category = True
categories = self._get_categories_list()
evaluator = eval_util.get_evaluators(
eval_config, categories, evaluator_options=None)
# Even though we are setting eval_config.include_metrics_per_category = True
# this option is never passed into the DetectionEvaluator constructor (via
# `evaluator_options`).
self.assertFalse(evaluator[0]._include_metrics_per_category)
if __name__ == '__main__':
tf.test.main()
...@@ -21,6 +21,7 @@ import tempfile
import numpy as np
import tensorflow as tf
from tensorflow.core.framework import attr_value_pb2
from tensorflow.core.framework import types_pb2
from tensorflow.core.protobuf import saver_pb2
from tensorflow.tools.graph_transforms import TransformGraph
from object_detection import exporter
...@@ -95,6 +96,12 @@ def append_postprocessing_op(frozen_graph_def, max_detections,
new_output.name = 'TFLite_Detection_PostProcess'
new_output.attr['_output_quantized'].CopyFrom(
attr_value_pb2.AttrValue(b=True))
new_output.attr['_output_types'].list.type.extend([
types_pb2.DT_FLOAT, types_pb2.DT_FLOAT, types_pb2.DT_FLOAT,
types_pb2.DT_FLOAT
])
new_output.attr['_support_output_type_float_in_quantized_op'].CopyFrom(
attr_value_pb2.AttrValue(b=True))
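# Editor's note: per this commit's description, the two attributes above are
# what enable float output from the quantized custom op: `_output_types`
# declares the four outputs of TFLite_Detection_PostProcess (boxes, classes,
# scores, num detections) as DT_FLOAT, and
# `_support_output_type_float_in_quantized_op` signals that float outputs are
# acceptable even though `_output_quantized` is True.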
new_output.attr['max_detections'].CopyFrom(
attr_value_pb2.AttrValue(i=max_detections))
new_output.attr['max_classes_per_detection'].CopyFrom(
...
...@@ -21,6 +21,7 @@ import os
import numpy as np
import six
import tensorflow as tf
from tensorflow.core.framework import types_pb2
from object_detection import export_tflite_ssd_graph_lib
from object_detection.builders import graph_rewriter_builder
from object_detection.builders import model_builder
...@@ -29,6 +30,7 @@ from object_detection.protos import graph_rewriter_pb2
from object_detection.protos import pipeline_pb2
from object_detection.protos import post_processing_pb2
if six.PY2:
import mock  # pylint: disable=g-import-not-at-top
else:
...@@ -122,7 +124,7 @@ class ExportTfliteGraphTest(tf.test.TestCase):
return box_encodings_np, class_predictions_np
  def _export_graph(self, pipeline_config, num_channels=3):
-    """Exports a tflite graph and an anchor file."""
+    """Exports a tflite graph."""
output_dir = self.get_temp_dir()
trained_checkpoint_prefix = os.path.join(output_dir, 'model.ckpt')
tflite_graph_file = os.path.join(output_dir, 'tflite_graph.pb')
...@@ -147,6 +149,34 @@ class ExportTfliteGraphTest(tf.test.TestCase):
max_classes_per_detection=1)
return tflite_graph_file
def _export_graph_with_postprocessing_op(self,
pipeline_config,
num_channels=3):
"""Exports a tflite graph with custom postprocessing op."""
output_dir = self.get_temp_dir()
trained_checkpoint_prefix = os.path.join(output_dir, 'model.ckpt')
tflite_graph_file = os.path.join(output_dir, 'tflite_graph.pb')
quantize = pipeline_config.HasField('graph_rewriter')
self._save_checkpoint_from_mock_model(
trained_checkpoint_prefix,
use_moving_averages=pipeline_config.eval_config.use_moving_averages,
quantize=quantize,
num_channels=num_channels)
with mock.patch.object(
model_builder, 'build', autospec=True) as mock_builder:
mock_builder.return_value = FakeModel()
with tf.Graph().as_default():
export_tflite_ssd_graph_lib.export_tflite_graph(
pipeline_config=pipeline_config,
trained_checkpoint_prefix=trained_checkpoint_prefix,
output_dir=output_dir,
add_postprocessing_op=True,
max_detections=10,
max_classes_per_detection=1)
return tflite_graph_file
def test_export_tflite_graph_with_moving_averages(self):
pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
pipeline_config.eval_config.use_moving_averages = True
...@@ -267,6 +297,44 @@ class ExportTfliteGraphTest(tf.test.TestCase):
self.assertAllClose(class_predictions_np,
[[[0.668188, 0.645656], [0.710949, 0.5]]])
def test_export_tflite_graph_with_postprocessing_op(self):
pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
pipeline_config.eval_config.use_moving_averages = False
pipeline_config.model.ssd.post_processing.score_converter = (
post_processing_pb2.PostProcessing.SIGMOID)
pipeline_config.model.ssd.image_resizer.fixed_shape_resizer.height = 10
pipeline_config.model.ssd.image_resizer.fixed_shape_resizer.width = 10
pipeline_config.model.ssd.num_classes = 2
pipeline_config.model.ssd.box_coder.faster_rcnn_box_coder.y_scale = 10.0
pipeline_config.model.ssd.box_coder.faster_rcnn_box_coder.x_scale = 10.0
pipeline_config.model.ssd.box_coder.faster_rcnn_box_coder.height_scale = 5.0
pipeline_config.model.ssd.box_coder.faster_rcnn_box_coder.width_scale = 5.0
tflite_graph_file = self._export_graph_with_postprocessing_op(
pipeline_config)
self.assertTrue(os.path.exists(tflite_graph_file))
graph = tf.Graph()
with graph.as_default():
graph_def = tf.GraphDef()
with tf.gfile.Open(tflite_graph_file, 'rb') as f:  # binary read, since the file holds a serialized GraphDef
graph_def.ParseFromString(f.read())
all_op_names = [node.name for node in graph_def.node]
self.assertTrue('TFLite_Detection_PostProcess' in all_op_names)
for node in graph_def.node:
if node.name == 'TFLite_Detection_PostProcess':
self.assertTrue(node.attr['_output_quantized'].b is True)
self.assertTrue(
node.attr['_support_output_type_float_in_quantized_op'].b is True)
self.assertTrue(node.attr['y_scale'].f == 10.0)
self.assertTrue(node.attr['x_scale'].f == 10.0)
self.assertTrue(node.attr['h_scale'].f == 5.0)
self.assertTrue(node.attr['w_scale'].f == 5.0)
self.assertTrue(node.attr['num_classes'].i == 2)
self.assertTrue(
all([
t == types_pb2.DT_FLOAT
for t in node.attr['_output_types'].list.type
]))
if __name__ == '__main__':
tf.test.main()
# Frequently Asked Questions
## Q: How can I ensure that all the groundtruth boxes are used during train and eval?
A: For the object detection framework to be TPU-compliant, we must pad our input
tensors to static shapes. This means that we must pad to a fixed number of
bounding boxes, configured by `InputReader.max_number_of_boxes`. It is
important to set this value to a number larger than the maximum number of
groundtruth boxes in the dataset. If an image is encountered with more
bounding boxes, the excess boxes will be clipped.
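A minimal sketch of the relevant setting. The field name comes from the
`InputReader` proto mentioned above; the value 200 is only illustrative, so
choose a number larger than your dataset's per-image maximum:

```python
from object_detection.protos import input_reader_pb2

input_config = input_reader_pb2.InputReader()
# Pad every example to a fixed number of groundtruth boxes; images with more
# boxes than this will have the excess clipped.
input_config.max_number_of_boxes = 200
```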
## Q: AttributeError: 'module' object has no attribute 'BackupHandler'
A: This BackupHandler (tf.contrib.slim.tfexample_decoder.BackupHandler) was
introduced in tensorflow 1.5.0 so running with earlier versions may cause this
...
...@@ -11,7 +11,7 @@ Tensorflow Object Detection API depends on the following libraries:
* tf Slim (which is included in the "tensorflow/models/research/" checkout)
* Jupyter notebook
* Matplotlib
-* Tensorflow
+* Tensorflow (>=1.9.0)
* Cython
* contextlib2
* cocoapi
...