Commit a1337e01 authored by Zhichao Lu, committed by pkulzc

Merged commit includes the following changes:

223075771  by lzc:

    Bring in external fixes.

--
222919755  by ronnyvotel:

    Bug fix in the Faster R-CNN model builder, which was previously using `inplace_batchnorm_update` for `reuse_weights`.

--
222885680  by Zhichao Lu:

    Use the result_dict_for_batched_example in models_lib
    Also fixes the visualization size when eval runs on GPU.

--
222883648  by Zhichao Lu:

    Fix _unmatched_class_label for the _add_background_class == False case in ssd_meta_arch.py.

--
222836663  by Zhichao Lu:

    Adding support for visualizing grayscale images. Without this change, the images are black-red instead of grayscale.
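    The black-red rendering is what one would expect if a single intensity channel were treated as the red channel of an RGB image. A minimal sketch of the usual remedy, assuming images are nested lists in height x width x channels layout; `grayscale_to_rgb` is a hypothetical helper for illustration, not the function changed in this CL:

```python
def grayscale_to_rgb(image):
    """Repeat the single channel three times so each pixel has equal R, G, B."""
    rgb_image = []
    for row in image:
        rgb_row = []
        for pixel in row:
            if len(pixel) != 1:
                raise ValueError('expected exactly one channel per pixel')
            # Copy the intensity into all three color channels.
            rgb_row.append([pixel[0]] * 3)
        rgb_image.append(rgb_row)
    return rgb_image


# A 2x2 single-channel image.
gray = [[[0], [128]], [[255], [64]]]
rgb = grayscale_to_rgb(gray)
```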

--
222501978  by Zhichao Lu:

    Fix a bug that caused convert_to_grayscale flag not to be respected.

--
222432846  by richardmunoz:

    Fix mapping of groundtruth_confidences from shape [num_boxes] to [num_boxes, num_classes] when the input contains the groundtruth_confidences field.
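    One plausible way to read the shape change is that each box's scalar confidence is placed at the index of that box's class, leaving other classes at zero. The sketch below illustrates that reading in plain Python; `expand_confidences` and its arguments are hypothetical names, and the actual input pipeline may compute this differently:

```python
def expand_confidences(confidences, class_indices, num_classes):
    """Map per-box confidences [num_boxes] to [num_boxes, num_classes]."""
    if len(confidences) != len(class_indices):
        raise ValueError('need one class index per confidence')
    expanded = []
    for confidence, class_index in zip(confidences, class_indices):
        row = [0.0] * num_classes
        # Only the box's own class carries its confidence.
        row[class_index] = confidence
        expanded.append(row)
    return expanded


per_class = expand_confidences([0.9, 0.5], class_indices=[2, 0], num_classes=3)
```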

--
221725755  by richardmunoz:

    Internal change.

--
221458536  by Zhichao Lu:

    Fix saver defer build bug in object detection train codepath.

--
221391590  by Zhichao Lu:

    Add support for group normalization in the object detection API. Currently only MobileNet-v1 SSD is supported. This may serve as a road map for other models that wish to support group normalization as an option.
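    For readers unfamiliar with the technique: group normalization splits the channels into groups and normalizes each group by its own mean and variance, independent of batch size. The sketch below is a simplified illustration on a flat list of channel values at a single spatial location; it omits the learned scale and offset, and the helper name `group_norm` here is a plain-Python stand-in, not the TensorFlow op:

```python
import math


def group_norm(channels, num_groups, epsilon=1e-5):
    """Normalize channel values within each contiguous channel group."""
    if len(channels) % num_groups != 0:
        raise ValueError('channel count must be divisible by num_groups')
    group_size = len(channels) // num_groups
    normalized = []
    for start in range(0, len(channels), group_size):
        group = channels[start:start + group_size]
        mean = sum(group) / group_size
        variance = sum((v - mean) ** 2 for v in group) / group_size
        # epsilon guards against division by zero for constant groups.
        scale = math.sqrt(variance + epsilon)
        normalized.extend((v - mean) / scale for v in group)
    return normalized


out = group_norm([1.0, 3.0, 10.0, 30.0], num_groups=2)
```

    Both groups end up close to [-1, 1] even though their raw magnitudes differ by a factor of ten, since statistics are computed per group.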

--
221367993  by Zhichao Lu:

    Bug fixes: (1) make RandomPadImage work; (2) fix keep_checkpoint_every_n_hours.

--
221266403  by rathodv:

    Use detection boxes as proposals to compute correct mask loss in eval jobs.

--
220845934  by lzc:

    Internal change.

--
220778850  by Zhichao Lu:

    Incorporating existing metrics into Estimator framework.
    Should restore:
    -oid_challenge_detection_metrics
    -pascal_voc_detection_metrics
    -weighted_pascal_voc_detection_metrics
    -pascal_voc_instance_segmentation_metrics
    -weighted_pascal_voc_instance_segmentation_metrics
    -oid_V2_detection_metrics

--
220370391  by alirezafathi:

    Adding precision and recall to the metrics.

--
220321268  by Zhichao Lu:

    Allow the option of setting max_examples_to_draw to zero.

--
220193337  by Zhichao Lu:

    This CL fixes a bug where the Keras convolutional box predictor applied heads in non-deterministic dict order, so variables were created in a non-deterministic order. As a result, different workers in a multi-GPU training setup built slightly different graphs, with variables assigned to mismatched parameter servers. Roughly half of all workers were unable to initialize and did no work, which slowed training by approximately 2x.
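    A common remedy for this class of bug is to iterate over the dict in sorted-key order so that every worker applies the heads, and therefore creates variables, in the same sequence. The following is a minimal sketch of that idea, not the code in this CL; `apply_heads_in_sorted_order` and the toy heads are hypothetical:

```python
def apply_heads_in_sorted_order(heads, features):
    """Apply each head in sorted-key order so the call order is reproducible."""
    outputs = {}
    call_order = []
    for name in sorted(heads):
        call_order.append(name)
        outputs[name] = heads[name](features)
    return outputs, call_order


# Two dicts with the same heads inserted in different orders.
heads_a = {'mask_predictions': lambda x: x + 2, 'box_encodings': lambda x: x * 2}
heads_b = {'box_encodings': lambda x: x * 2, 'mask_predictions': lambda x: x + 2}

outputs_a, order_a = apply_heads_in_sorted_order(heads_a, 10)
outputs_b, order_b = apply_heads_in_sorted_order(heads_b, 10)
```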

--
220136508  by huizhongc:

    Add weight equalization loss to SSD meta arch.

--
220125875  by pengchong:

    Rename label_scores to label_weights

--
219730108  by Zhichao Lu:

    Add description of detection_keypoints in postprocessed_tensors to docstring.

--
219577519  by pengchong:

    Support parsing the class confidences and training using them.

--
219547611  by lzc:

    Stop using static shapes in GPU eval jobs.

--
219536476  by Zhichao Lu:

    Migrate TensorFlow Lite out of tensorflow/contrib

    This change moves //tensorflow/contrib/lite to //tensorflow/lite in preparation
    for TensorFlow 2.0's deprecation of contrib/. If you refer to TF Lite build
    targets or headers, you will need to update them manually. If you use TF Lite
    from the TensorFlow python package, "tf.contrib.lite" now points to "tf.lite".
    Please update your imports as soon as possible.

    For more details, see https://groups.google.com/a/tensorflow.org/forum/#!topic/tflite/iIIXOTOFvwQ

    @angersson and @aselle are conducting this migration. Please contact them if
    you have any further questions.
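    The two renames described in this entry could be applied mechanically to Python sources and build files. A rough sketch, assuming plain-text substitution is safe for the codebase in question; `migrate_tflite_references` is a hypothetical helper, not part of TensorFlow:

```python
import re

# Patterns for the two renames named in the migration note. The trailing \b
# keeps identifiers that merely start with "lite" from being rewritten.
_REWRITES = [
    (re.compile(r'\btf\.contrib\.lite\b'), 'tf.lite'),
    (re.compile(r'//tensorflow/contrib/lite\b'), '//tensorflow/lite'),
]


def migrate_tflite_references(source_text):
    """Rewrite old contrib/lite references to their new locations."""
    for pattern, replacement in _REWRITES:
        source_text = pattern.sub(replacement, source_text)
    return source_text


python_line = migrate_tflite_references(
    'interpreter = tf.contrib.lite.Interpreter(model_path=path)')
build_line = migrate_tflite_references(
    'deps = ["//tensorflow/contrib/lite:framework"]')
```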

--
219190083  by Zhichao Lu:

    Add a second expected_loss_weights function that uses an alternative expectation calculation to the previous one. Integrate this op into ssd_meta_arch and the losses builder. Callers of losses_builder.build are updated to handle the additional returned element.
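    The caller-side impact is that losses_builder.build now returns seven elements instead of six, the last being either None or a partially applied weighting function. The sketch below mirrors the dispatch shown in the losses builder diff using stand-ins; the enum constants, the two `_by_*` functions, and the placeholder return values are hypothetical and only illustrate the shape of the change:

```python
import functools

# Stand-ins for the enum values defined on the loss config proto.
NONE, EXPECTED_SAMPLING, REWEIGHTING_UNMATCHED_ANCHORS = range(3)


def _by_expected_sampling(targets, min_num_negative_samples,
                          desired_negative_sampling_ratio):
    # Placeholder body: the real op computes per-anchor loss weights.
    return ('expected_sampling', min_num_negative_samples,
            desired_negative_sampling_ratio)


def _by_reweighting_unmatched_anchors(targets, min_num_negative_samples,
                                      desired_negative_sampling_ratio):
    return ('reweighting_unmatched_anchors', min_num_negative_samples,
            desired_negative_sampling_ratio)


def build_expected_loss_weights_fn(mode, min_num_negative_samples,
                                   desired_negative_sampling_ratio):
    """Return None or a partial with the sampling parameters bound."""
    if mode == NONE:
        return None
    ops_by_mode = {
        EXPECTED_SAMPLING: _by_expected_sampling,
        REWEIGHTING_UNMATCHED_ANCHORS: _by_reweighting_unmatched_anchors,
    }
    if mode not in ops_by_mode:
        raise ValueError('Not a valid value for expected_loss_weights.')
    return functools.partial(
        ops_by_mode[mode],
        min_num_negative_samples=min_num_negative_samples,
        desired_negative_sampling_ratio=desired_negative_sampling_ratio)


def build(mode):
    # The first six entries are placeholders for the pre-existing outputs.
    return ('classification_loss', 'localization_loss', 1.0, 1.0, None, None,
            build_expected_loss_weights_fn(mode, 0, 3))


# Callers must now unpack seven values.
(_, _, _, _, _, _, expected_loss_weights_fn) = build(EXPECTED_SAMPLING)
result = expected_loss_weights_fn(targets=[1, 0, 0])
```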

--
218924451  by pengchong:

    Add a new way to assign training targets using groundtruth confidences.

--
218760524  by chowdhery:

    Modify export script to add option for regular NMS in TFLite post-processing op.

--

PiperOrigin-RevId: 223075771
parent 2c680af3
......@@ -16,7 +16,6 @@
"""Function to build box predictor from configuration."""
import collections
from absl import logging
import tensorflow as tf
from object_detection.predictors import convolutional_box_predictor
from object_detection.predictors import convolutional_keras_box_predictor
......@@ -26,7 +25,6 @@ from object_detection.predictors.heads import box_head
from object_detection.predictors.heads import class_head
from object_detection.predictors.heads import keras_box_head
from object_detection.predictors.heads import keras_class_head
from object_detection.predictors.heads import keras_mask_head
from object_detection.predictors.heads import mask_head
from object_detection.protos import box_predictor_pb2
......@@ -44,8 +42,7 @@ def build_convolutional_box_predictor(is_training,
apply_sigmoid_to_scores=False,
add_background_class=True,
class_prediction_bias_init=0.0,
-use_depthwise=False,
-mask_head_config=None):
+use_depthwise=False,):
"""Builds the ConvolutionalBoxPredictor from the arguments.
Args:
......@@ -80,8 +77,6 @@ def build_convolutional_box_predictor(is_training,
conv2d layer before class prediction.
use_depthwise: Whether to use depthwise convolutions for prediction
steps. Default is False.
mask_head_config: An optional MaskHead object containing configs for mask
head construction.
Returns:
A ConvolutionalBoxPredictor class.
......@@ -101,21 +96,6 @@ def build_convolutional_box_predictor(is_training,
class_prediction_bias_init=class_prediction_bias_init,
use_depthwise=use_depthwise)
other_heads = {}
if mask_head_config is not None:
if not mask_head_config.masks_are_class_agnostic:
logging.warning('Note that class specific mask prediction for SSD '
'models is memory consuming.')
other_heads[convolutional_box_predictor.MASK_PREDICTIONS] = (
mask_head.ConvolutionalMaskHead(
is_training=is_training,
num_classes=num_classes,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
kernel_size=kernel_size,
use_depthwise=use_depthwise,
mask_height=mask_head_config.mask_height,
mask_width=mask_head_config.mask_width,
masks_are_class_agnostic=mask_head_config.masks_are_class_agnostic))
return convolutional_box_predictor.ConvolutionalBoxPredictor(
is_training=is_training,
num_classes=num_classes,
......@@ -144,7 +124,6 @@ def build_convolutional_keras_box_predictor(is_training,
add_background_class=True,
class_prediction_bias_init=0.0,
use_depthwise=False,
mask_head_config=None,
name='BoxPredictor'):
"""Builds the Keras ConvolutionalBoxPredictor from the arguments.
......@@ -189,8 +168,6 @@ def build_convolutional_keras_box_predictor(is_training,
conv2d layer before class prediction.
use_depthwise: Whether to use depthwise convolutions for prediction
steps. Default is False.
mask_head_config: An optional MaskHead object containing configs for mask
head construction.
name: A string name scope to assign to the box predictor. If `None`, Keras
will auto-generate one from the class name.
......@@ -199,11 +176,7 @@ def build_convolutional_keras_box_predictor(is_training,
"""
box_prediction_heads = []
class_prediction_heads = []
mask_prediction_heads = []
other_heads = {}
if mask_head_config is not None:
other_heads[convolutional_box_predictor.MASK_PREDICTIONS] = \
mask_prediction_heads
for stack_index, num_predictions_per_location in enumerate(
num_predictions_per_location_list):
......@@ -231,26 +204,6 @@ def build_convolutional_keras_box_predictor(is_training,
class_prediction_bias_init=class_prediction_bias_init,
use_depthwise=use_depthwise,
name='ConvolutionalClassHead_%d' % stack_index))
if mask_head_config is not None:
if not mask_head_config.masks_are_class_agnostic:
logging.warning('Note that class specific mask prediction for SSD '
'models is memory consuming.')
mask_prediction_heads.append(
keras_mask_head.ConvolutionalMaskHead(
is_training=is_training,
num_classes=num_classes,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
kernel_size=kernel_size,
conv_hyperparams=conv_hyperparams,
freeze_batchnorm=freeze_batchnorm,
num_predictions_per_location=num_predictions_per_location,
use_depthwise=use_depthwise,
mask_height=mask_head_config.mask_height,
mask_width=mask_head_config.mask_width,
masks_are_class_agnostic=mask_head_config.
masks_are_class_agnostic,
name='ConvolutionalMaskHead_%d' % stack_index))
return convolutional_keras_box_predictor.ConvolutionalBoxPredictor(
is_training=is_training,
......@@ -282,7 +235,6 @@ def build_weight_shared_convolutional_box_predictor(
share_prediction_tower=False,
apply_batch_norm=True,
use_depthwise=False,
mask_head_config=None,
score_converter_fn=tf.identity,
box_encodings_clip_range=None):
"""Builds and returns a WeightSharedConvolutionalBoxPredictor class.
......@@ -310,8 +262,6 @@ def build_weight_shared_convolutional_box_predictor(
apply_batch_norm: Whether to apply batch normalization to conv layers in
this predictor.
use_depthwise: Whether to use depthwise separable conv2d instead of conv2d.
mask_head_config: An optional MaskHead object containing configs for mask
head construction.
score_converter_fn: Callable score converter to perform elementwise op on
class scores.
box_encodings_clip_range: Min and max values for clipping the box_encodings.
......@@ -335,19 +285,6 @@ def build_weight_shared_convolutional_box_predictor(
use_depthwise=use_depthwise,
score_converter_fn=score_converter_fn))
other_heads = {}
if mask_head_config is not None:
if not mask_head_config.masks_are_class_agnostic:
logging.warning('Note that class specific mask prediction for SSD '
'models is memory consuming.')
other_heads[convolutional_box_predictor.MASK_PREDICTIONS] = (
mask_head.WeightSharedConvolutionalMaskHead(
num_classes=num_classes,
kernel_size=kernel_size,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
mask_height=mask_head_config.mask_height,
mask_width=mask_head_config.mask_width,
masks_are_class_agnostic=mask_head_config.masks_are_class_agnostic))
return convolutional_box_predictor.WeightSharedConvolutionalBoxPredictor(
is_training=is_training,
num_classes=num_classes,
......@@ -520,9 +457,6 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
config_box_predictor = box_predictor_config.convolutional_box_predictor
conv_hyperparams_fn = argscope_fn(config_box_predictor.conv_hyperparams,
is_training)
mask_head_config = (
config_box_predictor.mask_head
if config_box_predictor.HasField('mask_head') else None)
return build_convolutional_box_predictor(
is_training=is_training,
num_classes=num_classes,
......@@ -539,8 +473,7 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
apply_sigmoid_to_scores=config_box_predictor.apply_sigmoid_to_scores,
class_prediction_bias_init=(
config_box_predictor.class_prediction_bias_init),
-use_depthwise=config_box_predictor.use_depthwise,
-mask_head_config=mask_head_config)
+use_depthwise=config_box_predictor.use_depthwise)
if box_predictor_oneof == 'weight_shared_convolutional_box_predictor':
config_box_predictor = (
......@@ -549,9 +482,6 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
is_training)
apply_batch_norm = config_box_predictor.conv_hyperparams.HasField(
'batch_norm')
mask_head_config = (
config_box_predictor.mask_head
if config_box_predictor.HasField('mask_head') else None)
# During training phase, logits are used to compute the loss. Only apply
# sigmoid at inference to make the inference graph TPU friendly.
score_converter_fn = build_score_converter(
......@@ -581,7 +511,6 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
share_prediction_tower=config_box_predictor.share_prediction_tower,
apply_batch_norm=apply_batch_norm,
use_depthwise=config_box_predictor.use_depthwise,
mask_head_config=mask_head_config,
score_converter_fn=score_converter_fn,
box_encodings_clip_range=box_encodings_clip_range)
......@@ -680,10 +609,6 @@ def build_keras(conv_hyperparams_fn, freeze_batchnorm, inplace_batchnorm_update,
config_box_predictor = box_predictor_config.convolutional_box_predictor
conv_hyperparams = conv_hyperparams_fn(
config_box_predictor.conv_hyperparams)
mask_head_config = (
config_box_predictor.mask_head
if config_box_predictor.HasField('mask_head') else None)
return build_convolutional_keras_box_predictor(
is_training=is_training,
num_classes=num_classes,
......@@ -702,8 +627,7 @@ def build_keras(conv_hyperparams_fn, freeze_batchnorm, inplace_batchnorm_update,
max_depth=config_box_predictor.max_depth,
class_prediction_bias_init=(
config_box_predictor.class_prediction_bias_init),
-use_depthwise=config_box_predictor.use_depthwise,
-mask_head_config=mask_head_config)
+use_depthwise=config_box_predictor.use_depthwise)
raise ValueError(
'Unknown box predictor for Keras: {}'.format(box_predictor_oneof))
......@@ -21,9 +21,7 @@ import tensorflow as tf
from google.protobuf import text_format
from object_detection.builders import box_predictor_builder
from object_detection.builders import hyperparams_builder
from object_detection.predictors import convolutional_box_predictor
from object_detection.predictors import mask_rcnn_box_predictor
from object_detection.predictors.heads import mask_head
from object_detection.protos import box_predictor_pb2
from object_detection.protos import hyperparams_pb2
......@@ -161,73 +159,6 @@ class ConvolutionalBoxPredictorBuilderTest(tf.test.TestCase):
self.assertTrue(box_predictor._is_training)
self.assertFalse(class_head._use_depthwise)
def test_construct_default_conv_box_predictor_with_default_mask_head(self):
box_predictor_text_proto = """
convolutional_box_predictor {
mask_head {
}
conv_hyperparams {
regularizer {
l1_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
}"""
box_predictor_proto = box_predictor_pb2.BoxPredictor()
text_format.Merge(box_predictor_text_proto, box_predictor_proto)
box_predictor = box_predictor_builder.build(
argscope_fn=hyperparams_builder.build,
box_predictor_config=box_predictor_proto,
is_training=True,
num_classes=90)
self.assertTrue(convolutional_box_predictor.MASK_PREDICTIONS in
box_predictor._other_heads)
mask_prediction_head = (
box_predictor._other_heads[convolutional_box_predictor.MASK_PREDICTIONS]
)
self.assertEqual(mask_prediction_head._mask_height, 15)
self.assertEqual(mask_prediction_head._mask_width, 15)
self.assertTrue(mask_prediction_head._masks_are_class_agnostic)
def test_construct_default_conv_box_predictor_with_custom_mask_head(self):
box_predictor_text_proto = """
convolutional_box_predictor {
mask_head {
mask_height: 7
mask_width: 7
masks_are_class_agnostic: false
}
conv_hyperparams {
regularizer {
l1_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
}"""
box_predictor_proto = box_predictor_pb2.BoxPredictor()
text_format.Merge(box_predictor_text_proto, box_predictor_proto)
box_predictor = box_predictor_builder.build(
argscope_fn=hyperparams_builder.build,
box_predictor_config=box_predictor_proto,
is_training=True,
num_classes=90)
self.assertTrue(convolutional_box_predictor.MASK_PREDICTIONS in
box_predictor._other_heads)
mask_prediction_head = (
box_predictor._other_heads[convolutional_box_predictor.MASK_PREDICTIONS]
)
self.assertEqual(mask_prediction_head._mask_height, 7)
self.assertEqual(mask_prediction_head._mask_width, 7)
self.assertFalse(mask_prediction_head._masks_are_class_agnostic)
class WeightSharedConvolutionalBoxPredictorBuilderTest(tf.test.TestCase):
......@@ -421,79 +352,6 @@ class WeightSharedConvolutionalBoxPredictorBuilderTest(tf.test.TestCase):
self.assertTrue(box_predictor._is_training)
self.assertEqual(box_predictor._apply_batch_norm, True)
def test_construct_weight_shared_predictor_with_default_mask_head(self):
box_predictor_text_proto = """
weight_shared_convolutional_box_predictor {
mask_head {
}
conv_hyperparams {
regularizer {
l1_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
}"""
box_predictor_proto = box_predictor_pb2.BoxPredictor()
text_format.Merge(box_predictor_text_proto, box_predictor_proto)
box_predictor = box_predictor_builder.build(
argscope_fn=hyperparams_builder.build,
box_predictor_config=box_predictor_proto,
is_training=True,
num_classes=90)
self.assertTrue(convolutional_box_predictor.MASK_PREDICTIONS in
box_predictor._other_heads)
weight_shared_convolutional_mask_head = (
box_predictor._other_heads[convolutional_box_predictor.MASK_PREDICTIONS]
)
self.assertIsInstance(weight_shared_convolutional_mask_head,
mask_head.WeightSharedConvolutionalMaskHead)
self.assertEqual(weight_shared_convolutional_mask_head._mask_height, 15)
self.assertEqual(weight_shared_convolutional_mask_head._mask_width, 15)
self.assertTrue(
weight_shared_convolutional_mask_head._masks_are_class_agnostic)
def test_construct_weight_shared_predictor_with_custom_mask_head(self):
box_predictor_text_proto = """
weight_shared_convolutional_box_predictor {
mask_head {
mask_height: 7
mask_width: 7
masks_are_class_agnostic: false
}
conv_hyperparams {
regularizer {
l1_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
}"""
box_predictor_proto = box_predictor_pb2.BoxPredictor()
text_format.Merge(box_predictor_text_proto, box_predictor_proto)
box_predictor = box_predictor_builder.build(
argscope_fn=hyperparams_builder.build,
box_predictor_config=box_predictor_proto,
is_training=True,
num_classes=90)
self.assertTrue(convolutional_box_predictor.MASK_PREDICTIONS in
box_predictor._other_heads)
weight_shared_convolutional_mask_head = (
box_predictor._other_heads[convolutional_box_predictor.MASK_PREDICTIONS]
)
self.assertIsInstance(weight_shared_convolutional_mask_head,
mask_head.WeightSharedConvolutionalMaskHead)
self.assertEqual(weight_shared_convolutional_mask_head._mask_height, 7)
self.assertEqual(weight_shared_convolutional_mask_head._mask_width, 7)
self.assertFalse(
weight_shared_convolutional_mask_head._masks_are_class_agnostic)
class MaskRCNNBoxPredictorBuilderTest(tf.test.TestCase):
......
......@@ -182,8 +182,9 @@ def build(hyperparams_config, is_training):
initializer, weights regularizer, activation function, batch norm function
and batch norm parameters based on the configuration.
-Note that if the batch_norm parameteres are not specified in the config
-(i.e. left to default) then batch norm is excluded from the arg_scope.
+Note that if no normalization parameters are specified in the config,
+(i.e. left to default) then both batch norm and group norm are excluded
+from the arg_scope.
The batch norm parameters are set for updates based on `is_training` argument
and conv_hyperparams_config.batch_norm.train parameter. During training, they
......@@ -208,13 +209,14 @@ def build(hyperparams_config, is_training):
raise ValueError('hyperparams_config not of type '
'hyperparams_pb.Hyperparams.')
-batch_norm = None
+normalizer_fn = None
batch_norm_params = None
if hyperparams_config.HasField('batch_norm'):
-batch_norm = slim.batch_norm
+normalizer_fn = slim.batch_norm
batch_norm_params = _build_batch_norm_params(
hyperparams_config.batch_norm, is_training)
+if hyperparams_config.HasField('group_norm'):
+normalizer_fn = tf.contrib.layers.group_norm
affected_ops = [slim.conv2d, slim.separable_conv2d, slim.conv2d_transpose]
if hyperparams_config.HasField('op') and (
hyperparams_config.op == hyperparams_pb2.Hyperparams.FC):
......@@ -230,7 +232,7 @@ def build(hyperparams_config, is_training):
weights_initializer=_build_initializer(
hyperparams_config.initializer),
activation_fn=_build_activation_fn(hyperparams_config.activation),
-normalizer_fn=batch_norm) as sc:
+normalizer_fn=normalizer_fn) as sc:
return sc
return scope_fn
......
......@@ -15,9 +15,11 @@
"""A function to build localization and classification losses from config."""
import functools
from object_detection.core import balanced_positive_negative_sampler as sampler
from object_detection.core import losses
from object_detection.protos import losses_pb2
from object_detection.utils import ops
def build(loss_config):
......@@ -66,8 +68,28 @@ def build(loss_config):
random_example_sampler = sampler.BalancedPositiveNegativeSampler(
positive_fraction=loss_config.random_example_sampler.
positive_sample_fraction)
+if loss_config.expected_loss_weights == loss_config.NONE:
+expected_loss_weights_fn = None
+elif loss_config.expected_loss_weights == loss_config.EXPECTED_SAMPLING:
+expected_loss_weights_fn = functools.partial(
+ops.expected_classification_loss_by_expected_sampling,
+min_num_negative_samples=loss_config.min_num_negative_samples,
+desired_negative_sampling_ratio=loss_config
+.desired_negative_sampling_ratio)
+elif (loss_config.expected_loss_weights == loss_config
+.REWEIGHTING_UNMATCHED_ANCHORS):
+expected_loss_weights_fn = functools.partial(
+ops.expected_classification_loss_by_reweighting_unmatched_anchors,
+min_num_negative_samples=loss_config.min_num_negative_samples,
+desired_negative_sampling_ratio=loss_config
+.desired_negative_sampling_ratio)
+else:
+raise ValueError('Not a valid value for expected_classification_loss.')
return (classification_loss, localization_loss, classification_weight,
-localization_weight, hard_example_miner, random_example_sampler)
+localization_weight, hard_example_miner, random_example_sampler,
+expected_loss_weights_fn)
def build_hard_example_miner(config,
......
......@@ -21,6 +21,7 @@ from google.protobuf import text_format
from object_detection.builders import losses_builder
from object_detection.core import losses
from object_detection.protos import losses_pb2
from object_detection.utils import ops
class LocalizationLossBuilderTest(tf.test.TestCase):
......@@ -38,7 +39,7 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-_, localization_loss, _, _, _, _ = losses_builder.build(losses_proto)
+_, localization_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(localization_loss,
losses.WeightedL2LocalizationLoss))
......@@ -55,7 +56,7 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-_, localization_loss, _, _, _, _ = losses_builder.build(losses_proto)
+_, localization_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(localization_loss,
losses.WeightedSmoothL1LocalizationLoss))
self.assertAlmostEqual(localization_loss._delta, 1.0)
......@@ -74,7 +75,7 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-_, localization_loss, _, _, _, _ = losses_builder.build(losses_proto)
+_, localization_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(localization_loss,
losses.WeightedSmoothL1LocalizationLoss))
self.assertAlmostEqual(localization_loss._delta, 0.1)
......@@ -92,7 +93,7 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-_, localization_loss, _, _, _, _ = losses_builder.build(losses_proto)
+_, localization_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(localization_loss,
losses.WeightedIOULocalizationLoss))
......@@ -109,7 +110,7 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-_, localization_loss, _, _, _, _ = losses_builder.build(losses_proto)
+_, localization_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(localization_loss,
losses.WeightedSmoothL1LocalizationLoss))
predictions = tf.constant([[[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]])
......@@ -146,7 +147,7 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-classification_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
+classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(classification_loss,
losses.WeightedSigmoidClassificationLoss))
......@@ -163,7 +164,7 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-classification_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
+classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(classification_loss,
losses.SigmoidFocalClassificationLoss))
self.assertAlmostEqual(classification_loss._alpha, None)
......@@ -184,7 +185,7 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-classification_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
+classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(classification_loss,
losses.SigmoidFocalClassificationLoss))
self.assertAlmostEqual(classification_loss._alpha, 0.25)
......@@ -203,7 +204,7 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-classification_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
+classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss))
......@@ -220,7 +221,7 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-classification_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
+classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(
isinstance(classification_loss,
losses.WeightedSoftmaxClassificationAgainstLogitsLoss))
......@@ -239,7 +240,7 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-classification_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
+classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss))
......@@ -257,7 +258,7 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-classification_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
+classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(classification_loss,
losses.BootstrappedSigmoidClassificationLoss))
......@@ -275,7 +276,7 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-classification_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
+classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(classification_loss,
losses.WeightedSigmoidClassificationLoss))
predictions = tf.constant([[[0.0, 1.0, 0.0], [0.0, 0.5, 0.5]]])
......@@ -312,7 +313,7 @@ class HardExampleMinerBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-_, _, _, _, hard_example_miner, _ = losses_builder.build(losses_proto)
+_, _, _, _, hard_example_miner, _, _ = losses_builder.build(losses_proto)
self.assertEqual(hard_example_miner, None)
def test_build_hard_example_miner_for_classification_loss(self):
......@@ -331,7 +332,7 @@ class HardExampleMinerBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-_, _, _, _, hard_example_miner, _ = losses_builder.build(losses_proto)
+_, _, _, _, hard_example_miner, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(hard_example_miner, losses.HardExampleMiner))
self.assertEqual(hard_example_miner._loss_type, 'cls')
......@@ -351,7 +352,7 @@ class HardExampleMinerBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-_, _, _, _, hard_example_miner, _ = losses_builder.build(losses_proto)
+_, _, _, _, hard_example_miner, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(hard_example_miner, losses.HardExampleMiner))
self.assertEqual(hard_example_miner._loss_type, 'loc')
......@@ -375,7 +376,7 @@ class HardExampleMinerBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-_, _, _, _, hard_example_miner, _ = losses_builder.build(losses_proto)
+_, _, _, _, hard_example_miner, _, _ = losses_builder.build(losses_proto)
self.assertTrue(isinstance(hard_example_miner, losses.HardExampleMiner))
self.assertEqual(hard_example_miner._num_hard_examples, 32)
self.assertAlmostEqual(hard_example_miner._iou_threshold, 0.5)
......@@ -402,9 +403,9 @@ class LossBuilderTest(tf.test.TestCase):
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
-(classification_loss, localization_loss,
-classification_weight, localization_weight,
-hard_example_miner, _) = losses_builder.build(losses_proto)
+(classification_loss, localization_loss, classification_weight,
+localization_weight, hard_example_miner, _,
+_) = losses_builder.build(losses_proto)
self.assertTrue(isinstance(hard_example_miner, losses.HardExampleMiner))
self.assertTrue(isinstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss))
......@@ -413,6 +414,65 @@ class LossBuilderTest(tf.test.TestCase):
self.assertAlmostEqual(classification_weight, 0.8)
self.assertAlmostEqual(localization_weight, 0.2)
def test_build_expected_sampling(self):
losses_text_proto = """
localization_loss {
weighted_l2 {
}
}
classification_loss {
weighted_softmax {
}
}
hard_example_miner {
}
classification_weight: 0.8
localization_weight: 0.2
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
(classification_loss, localization_loss, classification_weight,
localization_weight, hard_example_miner, _,
_) = losses_builder.build(losses_proto)
self.assertTrue(isinstance(hard_example_miner, losses.HardExampleMiner))
self.assertTrue(
isinstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss))
self.assertTrue(
isinstance(localization_loss, losses.WeightedL2LocalizationLoss))
self.assertAlmostEqual(classification_weight, 0.8)
self.assertAlmostEqual(localization_weight, 0.2)
def test_build_reweighting_unmatched_anchors(self):
losses_text_proto = """
localization_loss {
weighted_l2 {
}
}
classification_loss {
weighted_softmax {
}
}
hard_example_miner {
}
classification_weight: 0.8
localization_weight: 0.2
"""
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
(classification_loss, localization_loss, classification_weight,
localization_weight, hard_example_miner, _,
_) = losses_builder.build(losses_proto)
self.assertTrue(isinstance(hard_example_miner, losses.HardExampleMiner))
self.assertTrue(
isinstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss))
self.assertTrue(
isinstance(localization_loss, losses.WeightedL2LocalizationLoss))
self.assertAlmostEqual(classification_weight, 0.8)
self.assertAlmostEqual(localization_weight, 0.2)
def test_raise_error_when_both_focal_loss_and_hard_example_miner(self):
losses_text_proto = """
localization_loss {
......
......@@ -50,6 +50,7 @@ from object_detection.models.ssd_mobilenet_v2_fpn_feature_extractor import SSDMo
from object_detection.models.ssd_mobilenet_v2_keras_feature_extractor import SSDMobileNetV2KerasFeatureExtractor
from object_detection.models.ssd_pnasnet_feature_extractor import SSDPNASNetFeatureExtractor
from object_detection.predictors import rfcn_box_predictor
from object_detection.predictors.heads import mask_head
from object_detection.protos import model_pb2
from object_detection.utils import ops
......@@ -261,28 +262,23 @@ def _build_ssd_model(ssd_config, is_training, add_summaries):
non_max_suppression_fn, score_conversion_fn = post_processing_builder.build(
ssd_config.post_processing)
(classification_loss, localization_loss, classification_weight,
localization_weight, hard_example_miner,
random_example_sampler) = losses_builder.build(ssd_config.loss)
localization_weight, hard_example_miner, random_example_sampler,
expected_loss_weights_fn) = losses_builder.build(ssd_config.loss)
normalize_loss_by_num_matches = ssd_config.normalize_loss_by_num_matches
normalize_loc_loss_by_codesize = ssd_config.normalize_loc_loss_by_codesize
weight_regression_loss_by_score = (ssd_config.weight_regression_loss_by_score)
equalization_loss_config = ops.EqualizationLossConfig(
weight=ssd_config.loss.equalization_loss.weight,
exclude_prefixes=ssd_config.loss.equalization_loss.exclude_prefixes)
target_assigner_instance = target_assigner.TargetAssigner(
region_similarity_calculator,
matcher,
box_coder,
negative_class_weight=negative_class_weight,
weight_regression_loss_by_score=weight_regression_loss_by_score)
expected_classification_loss_under_sampling = None
if ssd_config.use_expected_classification_loss_under_sampling:
expected_classification_loss_under_sampling = functools.partial(
ops.expected_classification_loss_under_sampling,
min_num_negative_samples=ssd_config.min_num_negative_samples,
desired_negative_sampling_ratio=ssd_config.
desired_negative_sampling_ratio)
negative_class_weight=negative_class_weight)
ssd_meta_arch_fn = ssd_meta_arch.SSDMetaArch
kwargs = {}
return ssd_meta_arch_fn(
is_training=is_training,
......@@ -306,9 +302,13 @@ def _build_ssd_model(ssd_config, is_training, add_summaries):
freeze_batchnorm=ssd_config.freeze_batchnorm,
inplace_batchnorm_update=ssd_config.inplace_batchnorm_update,
add_background_class=ssd_config.add_background_class,
explicit_background_class=ssd_config.explicit_background_class,
random_example_sampler=random_example_sampler,
expected_classification_loss_under_sampling=
expected_classification_loss_under_sampling)
expected_loss_weights_fn=expected_loss_weights_fn,
use_confidences_as_targets=ssd_config.use_confidences_as_targets,
implicit_example_weight=ssd_config.implicit_example_weight,
equalization_loss_config=equalization_loss_config,
**kwargs)
def _build_faster_rcnn_feature_extractor(
......@@ -374,7 +374,7 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
feature_extractor = _build_faster_rcnn_feature_extractor(
frcnn_config.feature_extractor, is_training,
frcnn_config.inplace_batchnorm_update)
inplace_batchnorm_update=frcnn_config.inplace_batchnorm_update)
number_of_stages = frcnn_config.number_of_stages
first_stage_anchor_generator = anchor_generator_builder.build(
......@@ -391,7 +391,8 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
frcnn_config.first_stage_box_predictor_kernel_size)
first_stage_box_predictor_depth = frcnn_config.first_stage_box_predictor_depth
first_stage_minibatch_size = frcnn_config.first_stage_minibatch_size
use_static_shapes = frcnn_config.use_static_shapes
use_static_shapes = frcnn_config.use_static_shapes and (
frcnn_config.use_static_shapes_for_eval or is_training)
first_stage_sampler = sampler.BalancedPositiveNegativeSampler(
positive_fraction=frcnn_config.first_stage_positive_balance_fraction,
is_static=(frcnn_config.use_static_balanced_label_sampler and
......
......@@ -150,9 +150,6 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
}
}
}
use_expected_classification_loss_under_sampling: true
min_num_negative_samples: 10
desired_negative_sampling_ratio: 2
}"""
model_proto = model_pb2.DetectionModel()
text_format.Merge(model_text_proto, model_proto)
......@@ -160,12 +157,8 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
self.assertIsInstance(model, ssd_meta_arch.SSDMetaArch)
self.assertIsInstance(model._feature_extractor,
SSDInceptionV2FeatureExtractor)
self.assertIsNotNone(model._expected_classification_loss_under_sampling)
self.assertEqual(
model._expected_classification_loss_under_sampling.keywords, {
'min_num_negative_samples': 10,
'desired_negative_sampling_ratio': 2
})
self.assertIsNone(model._expected_loss_weights_fn)
def test_create_ssd_inception_v3_model_from_config(self):
......@@ -708,7 +701,6 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
}
}
}
weight_regression_loss_by_score: true
}"""
model_proto = model_pb2.DetectionModel()
text_format.Merge(model_text_proto, model_proto)
......@@ -719,7 +711,6 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
self.assertIsInstance(model._box_predictor,
convolutional_box_predictor.ConvolutionalBoxPredictor)
self.assertTrue(model._normalize_loc_loss_by_codesize)
self.assertTrue(model._target_assigner._weight_regression_loss_by_score)
def test_create_ssd_mobilenet_v2_keras_model_from_config(self):
model_text_proto = """
......@@ -785,7 +776,6 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
}
}
}
weight_regression_loss_by_score: true
}"""
model_proto = model_pb2.DetectionModel()
text_format.Merge(model_text_proto, model_proto)
......@@ -797,7 +787,6 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
model._box_predictor,
convolutional_keras_box_predictor.ConvolutionalBoxPredictor)
self.assertTrue(model._normalize_loc_loss_by_codesize)
self.assertTrue(model._target_assigner._weight_regression_loss_by_score)
def test_create_ssd_mobilenet_v2_fpn_model_from_config(self):
model_text_proto = """
......@@ -1037,7 +1026,7 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
def test_create_faster_rcnn_resnet_v1_models_from_config(self):
model_text_proto = """
faster_rcnn {
inplace_batchnorm_update: true
inplace_batchnorm_update: false
num_classes: 3
image_resizer {
keep_aspect_ratio_resizer {
......
......@@ -189,11 +189,12 @@ def build(preprocessor_step_config):
if config.HasField('max_image_height'):
max_image_size = (config.max_image_height, config.max_image_width)
pad_color = config.pad_color
if pad_color and len(pad_color) != 3:
pad_color = config.pad_color or None
if pad_color:
if len(pad_color) == 3:
pad_color = tf.to_float([x for x in config.pad_color])
else:
raise ValueError('pad_color should have 3 elements (RGB) if set!')
if not pad_color:
pad_color = None
return (preprocessor.random_pad_image,
{
'min_image_size': min_image_size,
......
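The pad_color handling in the hunk above can be read as a small normalization step: an unset (empty) field becomes None, a three-element field becomes a float RGB triple, and any other length is rejected. The following is a minimal sketch of that behavior in plain Python, assuming the repeated proto field behaves like a sequence. The helper name `normalize_pad_color` is hypothetical and not part of the commit, and a plain list of floats stands in for the `tf.to_float` tensor.

```python
def normalize_pad_color(pad_color):
  """Hypothetical helper mirroring the pad_color checks in the hunk above.

  Args:
    pad_color: a sequence of numbers (stand-in for the repeated proto field).

  Returns:
    None if pad_color is empty, else a list of three floats (RGB).

  Raises:
    ValueError: if pad_color is set but does not have exactly 3 elements.
  """
  values = list(pad_color)
  if not values:
    # An unset repeated field is empty; treat it as "no pad color".
    return None
  if len(values) != 3:
    raise ValueError('pad_color should have 3 elements (RGB) if set!')
  # The commit converts to a float tensor; a list of floats stands in here.
  return [float(x) for x in values]
```

For example, `normalize_pad_color([])` gives None and `normalize_pad_color([0, 128, 255])` gives a float RGB triple, while a two-element input raises ValueError.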
......@@ -241,6 +241,7 @@ class DetectionModel(object):
groundtruth_masks_list=None,
groundtruth_keypoints_list=None,
groundtruth_weights_list=None,
groundtruth_confidences_list=None,
groundtruth_is_crowd_list=None,
is_annotated_list=None):
"""Provide groundtruth tensors.
......@@ -265,6 +266,9 @@ class DetectionModel(object):
missing keypoints should be encoded as NaN.
groundtruth_weights_list: A list of 1-D tf.float32 tensors of shape
[num_boxes] containing weights for groundtruth boxes.
groundtruth_confidences_list: A list of 2-D tf.float32 tensors of shape
[num_boxes, num_classes] containing class confidences for groundtruth
boxes.
groundtruth_is_crowd_list: A list of 1-D tf.bool tensors of shape
[num_boxes] containing is_crowd annotations
is_annotated_list: A list of scalar tf.bool tensors indicating whether
......@@ -276,6 +280,9 @@ class DetectionModel(object):
if groundtruth_weights_list:
self._groundtruth_lists[fields.BoxListFields.
weights] = groundtruth_weights_list
if groundtruth_confidences_list:
self._groundtruth_lists[fields.BoxListFields.
confidences] = groundtruth_confidences_list
if groundtruth_masks_list:
self._groundtruth_lists[
fields.BoxListFields.masks] = groundtruth_masks_list
......
......@@ -1734,7 +1734,7 @@ class PreprocessorTest(tf.test.TestCase):
}
preprocessor_arg_map = preprocessor.get_default_func_arg_map(
include_label_scores=True,
include_label_weights=True,
include_instance_masks=True)
preprocessing_options = [
......
......@@ -44,6 +44,8 @@ class InputDataFields(object):
groundtruth_image_confidences: image-level class confidences.
groundtruth_boxes: coordinates of the ground truth boxes in the image.
groundtruth_classes: box-level class labels.
groundtruth_confidences: box-level class confidences. The shape should be
the same as the shape of groundtruth_classes.
groundtruth_label_types: box-level label types (e.g. explicit negative).
groundtruth_is_crowd: [DEPRECATED, use groundtruth_group_of instead]
is the groundtruth a single object or a crowd.
......@@ -59,7 +61,7 @@ class InputDataFields(object):
groundtruth_instance_classes: instance mask-level class labels.
groundtruth_keypoints: ground truth keypoints.
groundtruth_keypoint_visibilities: ground truth keypoint visibilities.
groundtruth_label_scores: groundtruth label scores.
groundtruth_label_weights: groundtruth label weights.
groundtruth_weights: groundtruth weight factor for bounding boxes.
num_groundtruth_boxes: number of groundtruth boxes.
is_annotated: whether an image has been labeled or not.
......@@ -91,7 +93,7 @@ class InputDataFields(object):
groundtruth_instance_classes = 'groundtruth_instance_classes'
groundtruth_keypoints = 'groundtruth_keypoints'
groundtruth_keypoint_visibilities = 'groundtruth_keypoint_visibilities'
groundtruth_label_scores = 'groundtruth_label_scores'
groundtruth_label_weights = 'groundtruth_label_weights'
groundtruth_weights = 'groundtruth_weights'
num_groundtruth_boxes = 'num_groundtruth_boxes'
is_annotated = 'is_annotated'
......@@ -144,6 +146,7 @@ class BoxListFields(object):
classes = 'classes'
scores = 'scores'
weights = 'weights'
confidences = 'confidences'
objectness = 'objectness'
masks = 'masks'
boundaries = 'boundaries'
......
......@@ -52,8 +52,7 @@ class TargetAssigner(object):
similarity_calc,
matcher,
box_coder,
negative_class_weight=1.0,
weight_regression_loss_by_score=False):
negative_class_weight=1.0):
"""Construct Object Detection Target Assigner.
Args:
......@@ -64,8 +63,6 @@ class TargetAssigner(object):
groundtruth boxes with respect to anchors.
negative_class_weight: classification weight to be associated to negative
anchors (default: 1.0). The weight must be in [0., 1.].
weight_regression_loss_by_score: Whether to weight the regression loss by
ground truth box score.
Raises:
ValueError: if similarity_calc is not a RegionSimilarityCalculator or
......@@ -81,7 +78,6 @@ class TargetAssigner(object):
self._matcher = matcher
self._box_coder = box_coder
self._negative_class_weight = negative_class_weight
self._weight_regression_loss_by_score = weight_regression_loss_by_score
@property
def box_coder(self):
......@@ -170,11 +166,6 @@ class TargetAssigner(object):
num_gt_boxes = groundtruth_boxes.num_boxes()
groundtruth_weights = tf.ones([num_gt_boxes], dtype=tf.float32)
# set scores on the gt boxes
scores = 1 - groundtruth_labels[:, 0]
groundtruth_boxes.add_field(fields.BoxListFields.scores, scores)
with tf.control_dependencies(
[unmatched_shape_assert, labels_and_box_shapes_assert]):
match_quality_matrix = self._similarity_calc.compare(groundtruth_boxes,
......@@ -187,12 +178,7 @@ class TargetAssigner(object):
cls_targets = self._create_classification_targets(groundtruth_labels,
unmatched_class_label,
match)
if self._weight_regression_loss_by_score:
reg_weights = self._create_regression_weights(
match, groundtruth_weights * scores)
else:
reg_weights = self._create_regression_weights(match,
groundtruth_weights)
reg_weights = self._create_regression_weights(match, groundtruth_weights)
cls_weights = self._create_classification_weights(match,
groundtruth_weights)
......@@ -503,3 +489,146 @@ def batch_assign_targets(target_assigner,
batch_reg_weights = tf.stack(reg_weights_list)
return (batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, match_list)
def batch_assign_confidences(target_assigner,
anchors_batch,
gt_box_batch,
gt_class_confidences_batch,
gt_weights_batch=None,
unmatched_class_label=None,
include_background_class=True,
implicit_class_weight=1.0):
"""Batched assignment of classification and regression targets.
The differences between batch_assign_confidences and batch_assign_targets:
- 'batch_assign_targets' supports scalar (agnostic), vector (multiclass) and
tensor (high-dimensional) targets. 'batch_assign_confidences' only supports
scalar (agnostic) and vector (multiclass) targets.
- 'batch_assign_targets' assumes the input class tensor uses the binary
one/K-hot encoding. 'batch_assign_confidences' takes the class confidence
scores as the input, where 1 means positive classes, 0 means implicit
negative classes, and -1 means explicit negative classes.
- 'batch_assign_confidences' assigns the targets in a similar way to
'batch_assign_targets' except that it gives different weights to implicit
and explicit classes. This allows the user to control how strongly negative
gradients are pushed for implicit versus explicit examples during training.
Args:
target_assigner: a target assigner.
anchors_batch: BoxList representing N box anchors or list of BoxList objects
with length batch_size representing anchor sets.
gt_box_batch: a list of BoxList objects with length batch_size
representing groundtruth boxes for each image in the batch
gt_class_confidences_batch: a list of tensors with length batch_size, where
each tensor has shape [num_gt_boxes_i, classification_target_size] and
num_gt_boxes_i is the number of boxes in the ith boxlist of
gt_box_batch. Note that in this tensor, 1 means explicit positive class,
-1 means explicit negative class, and 0 means implicit negative class.
gt_weights_batch: A list of 1-D tf.float32 tensors of shape
[num_gt_boxes_i] containing weights for groundtruth boxes.
unmatched_class_label: a float32 tensor with shape [d_1, d_2, ..., d_k]
which is consistent with the classification target for each
anchor (and can be empty for scalar targets). This shape must thus be
compatible with the groundtruth labels that are passed to the "assign"
function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
include_background_class: whether or not gt_class_confidences_batch includes
the background class.
implicit_class_weight: the weight assigned to implicit examples.
Returns:
batch_cls_targets: a tensor with shape [batch_size, num_anchors,
num_classes],
batch_cls_weights: a tensor with shape [batch_size, num_anchors,
num_classes],
batch_reg_targets: a tensor with shape [batch_size, num_anchors,
box_code_dimension]
batch_reg_weights: a tensor with shape [batch_size, num_anchors],
match_list: a list of matcher.Match objects encoding the match between
anchors and groundtruth boxes for each image of the batch,
with rows of the Match objects corresponding to groundtruth boxes
and columns corresponding to anchors.
Raises:
ValueError: if input list lengths are inconsistent, i.e., unless
batch_size == len(gt_box_batch) == len(gt_class_confidences_batch)
and batch_size == len(anchors_batch) (or anchors_batch is a single
BoxList), or if any element in gt_class_confidences_batch has rank > 2.
"""
if not isinstance(anchors_batch, list):
anchors_batch = len(gt_box_batch) * [anchors_batch]
if not all(
isinstance(anchors, box_list.BoxList) for anchors in anchors_batch):
raise ValueError('anchors_batch must be a BoxList or list of BoxLists.')
if not (len(anchors_batch)
== len(gt_box_batch)
== len(gt_class_confidences_batch)):
raise ValueError('batch size incompatible with lengths of anchors_batch, '
'gt_box_batch and gt_class_confidences_batch.')
cls_targets_list = []
cls_weights_list = []
reg_targets_list = []
reg_weights_list = []
match_list = []
if gt_weights_batch is None:
gt_weights_batch = [None] * len(gt_class_confidences_batch)
for anchors, gt_boxes, gt_class_confidences, gt_weights in zip(
anchors_batch, gt_box_batch, gt_class_confidences_batch,
gt_weights_batch):
if (gt_class_confidences is not None and
len(gt_class_confidences.get_shape().as_list()) > 2):
raise ValueError('The shape of the class target is not supported. ',
gt_class_confidences.get_shape())
cls_targets, _, reg_targets, _, match = target_assigner.assign(
anchors, gt_boxes, gt_class_confidences, unmatched_class_label,
groundtruth_weights=gt_weights)
if include_background_class:
cls_targets_without_background = tf.slice(
cls_targets, [0, 1], [-1, -1])
else:
cls_targets_without_background = cls_targets
positive_mask = tf.greater(cls_targets_without_background, 0.0)
negative_mask = tf.less(cls_targets_without_background, 0.0)
explicit_example_mask = tf.logical_or(positive_mask, negative_mask)
positive_anchors = tf.reduce_any(positive_mask, axis=-1)
regression_weights = tf.to_float(positive_anchors)
regression_targets = (
reg_targets * tf.expand_dims(regression_weights, axis=-1))
regression_weights_expanded = tf.expand_dims(regression_weights, axis=-1)
cls_targets_without_background = (
cls_targets_without_background * (1 - tf.to_float(negative_mask)))
cls_weights_without_background = (
(1 - implicit_class_weight) * tf.to_float(explicit_example_mask)
+ implicit_class_weight)
if include_background_class:
cls_weights_background = (
(1 - implicit_class_weight) * regression_weights_expanded
+ implicit_class_weight)
classification_weights = tf.concat(
[cls_weights_background, cls_weights_without_background], axis=-1)
cls_targets_background = 1 - regression_weights_expanded
classification_targets = tf.concat(
[cls_targets_background, cls_targets_without_background], axis=-1)
else:
classification_targets = cls_targets_without_background
classification_weights = cls_weights_without_background
cls_targets_list.append(classification_targets)
cls_weights_list.append(classification_weights)
reg_targets_list.append(regression_targets)
reg_weights_list.append(regression_weights)
match_list.append(match)
batch_cls_targets = tf.stack(cls_targets_list)
batch_cls_weights = tf.stack(cls_weights_list)
batch_reg_targets = tf.stack(reg_targets_list)
batch_reg_weights = tf.stack(reg_weights_list)
return (batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, match_list)
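The per-image weighting step inside batch_assign_confidences can be illustrated without TensorFlow. The sketch below is a NumPy re-statement of the arithmetic in the loop body above, applied to the per-anchor class targets that `target_assigner.assign` would return. The function name `confidences_to_targets_and_weights` is hypothetical, and the regression-target masking is omitted for brevity.

```python
import numpy as np


def confidences_to_targets_and_weights(cls_targets,
                                       implicit_class_weight=1.0,
                                       include_background_class=True):
  """NumPy sketch of the target/weight arithmetic (hypothetical helper).

  Args:
    cls_targets: [num_anchors, num_classes (+1 if background)] array with
      1 for positive, 0 for implicit negative, -1 for explicit negative.
    implicit_class_weight: weight given to implicit (0) entries.
    include_background_class: whether column 0 is the background class.

  Returns:
    (classification_targets, classification_weights, regression_weights)
  """
  cls_targets = np.asarray(cls_targets, dtype=np.float32)
  without_bg = cls_targets[:, 1:] if include_background_class else cls_targets

  positive_mask = without_bg > 0.0
  negative_mask = without_bg < 0.0
  explicit_mask = positive_mask | negative_mask

  # Only anchors with at least one positive class contribute to regression.
  reg_weights = positive_mask.any(axis=-1).astype(np.float32)
  reg_weights_expanded = reg_weights[:, np.newaxis]

  # Explicit negatives (-1) become target 0; positives keep their value.
  targets = without_bg * (1.0 - negative_mask.astype(np.float32))
  # Explicit entries get weight 1, implicit entries get implicit_class_weight.
  weights = ((1.0 - implicit_class_weight) * explicit_mask.astype(np.float32)
             + implicit_class_weight)

  if include_background_class:
    bg_weights = ((1.0 - implicit_class_weight) * reg_weights_expanded
                  + implicit_class_weight)
    weights = np.concatenate([bg_weights, weights], axis=-1)
    # Background target is 1 exactly where no positive class is present.
    bg_targets = 1.0 - reg_weights_expanded
    targets = np.concatenate([bg_targets, targets], axis=-1)
  return targets, weights, reg_weights


# Per-anchor targets for an image with one positive and one explicit negative.
example_targets, example_weights, example_reg_weights = (
    confidences_to_targets_and_weights(
        [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0], [1, 0, 0, 0]],
        implicit_class_weight=0.5))
```

With implicit_class_weight=0.5, the anchor matched to the explicit negative keeps a background target of 1 and a regression weight of 0, but its explicitly negated class column receives weight 1 instead of 0.5.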
......@@ -325,54 +325,6 @@ class TargetAssignerTest(test_case.TestCase):
self.assertAllClose(cls_weights_out, exp_cls_weights)
self.assertAllClose(reg_weights_out, exp_reg_weights)
def test_assign_multiclass_with_weight_regression_loss_by_score(self):
def graph_fn(anchor_means, groundtruth_box_corners, groundtruth_labels):
similarity_calc = region_similarity_calculator.IouSimilarity()
matcher = argmax_matcher.ArgMaxMatcher(
matched_threshold=0.5, unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
unmatched_class_label = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
target_assigner = targetassigner.TargetAssigner(
similarity_calc,
matcher,
box_coder,
weight_regression_loss_by_score=True)
anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
result = target_assigner.assign(
anchors_boxlist,
groundtruth_boxlist,
groundtruth_labels,
unmatched_class_label=unmatched_class_label)
(_, cls_weights, _, reg_weights, _) = result
return (cls_weights, reg_weights)
anchor_means = np.array(
[[0.0, 0.0, 0.5, 0.5], [0.5, 0.5, 1.0, 0.8], [0, 0.5, .5, 1.0],
[.75, 0, 1.0, .25]],
dtype=np.float32)
groundtruth_box_corners = np.array(
[[0.0, 0.0, 0.5, 0.5], [0.5, 0.5, 0.9, 0.9], [.75, 0, .95, .27]],
dtype=np.float32)
groundtruth_labels = np.array(
[[.9, .1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0],
[.5, 0, 0, .5, 0, 0, 0]],
dtype=np.float32)
exp_cls_weights = [
[1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1]] # background class gets weight of 1.
exp_reg_weights = [.1, 1, 0., .5] # background class gets weight of 0.
(cls_weights_out, reg_weights_out) = self.execute(
graph_fn, [anchor_means, groundtruth_box_corners, groundtruth_labels])
self.assertAllClose(cls_weights_out, exp_cls_weights)
self.assertAllClose(reg_weights_out, exp_reg_weights)
def test_assign_multidimensional_class_targets(self):
def graph_fn(anchor_means, groundtruth_box_corners, groundtruth_labels):
......@@ -869,6 +821,321 @@ class BatchTargetAssignerTest(test_case.TestCase):
self.assertAllClose(reg_weights_out, exp_reg_weights)
class BatchTargetAssignConfidencesTest(test_case.TestCase):
def _get_target_assigner(self):
similarity_calc = region_similarity_calculator.IouSimilarity()
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
return targetassigner.TargetAssigner(similarity_calc, matcher, box_coder)
def test_batch_assign_empty_groundtruth(self):
def graph_fn(anchor_means, groundtruth_box_corners, gt_class_confidences):
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
gt_box_batch = [groundtruth_boxlist]
gt_class_confidences_batch = [gt_class_confidences]
anchors_boxlist = box_list.BoxList(anchor_means)
num_classes = 3
implicit_class_weight = 0.5
unmatched_class_label = tf.constant([1] + num_classes * [0], tf.float32)
multiclass_target_assigner = self._get_target_assigner()
(cls_targets, cls_weights, reg_targets, reg_weights,
_) = targetassigner.batch_assign_confidences(
multiclass_target_assigner,
anchors_boxlist,
gt_box_batch,
gt_class_confidences_batch,
unmatched_class_label=unmatched_class_label,
include_background_class=True,
implicit_class_weight=implicit_class_weight)
return (cls_targets, cls_weights, reg_targets, reg_weights)
groundtruth_box_corners = np.zeros((0, 4), dtype=np.float32)
anchor_means = np.array([[0, 0, .25, .25],
[0, .25, 1, 1]], dtype=np.float32)
num_classes = 3
pad = 1
gt_class_confidences = np.zeros((0, num_classes + pad), dtype=np.float32)
exp_cls_targets = [[[1, 0, 0, 0],
[1, 0, 0, 0]]]
exp_cls_weights = [[[0.5, 0.5, 0.5, 0.5],
[0.5, 0.5, 0.5, 0.5]]]
exp_reg_targets = [[[0, 0, 0, 0],
[0, 0, 0, 0]]]
exp_reg_weights = [[0, 0]]
(cls_targets_out,
cls_weights_out, reg_targets_out, reg_weights_out) = self.execute(
graph_fn,
[anchor_means, groundtruth_box_corners, gt_class_confidences])
self.assertAllClose(cls_targets_out, exp_cls_targets)
self.assertAllClose(cls_weights_out, exp_cls_weights)
self.assertAllClose(reg_targets_out, exp_reg_targets)
self.assertAllClose(reg_weights_out, exp_reg_weights)
def test_batch_assign_confidences_agnostic(self):
def graph_fn(anchor_means, groundtruth_boxlist1, groundtruth_boxlist2):
box_list1 = box_list.BoxList(groundtruth_boxlist1)
box_list2 = box_list.BoxList(groundtruth_boxlist2)
gt_box_batch = [box_list1, box_list2]
gt_class_confidences_batch = [None, None]
anchors_boxlist = box_list.BoxList(anchor_means)
agnostic_target_assigner = self._get_target_assigner()
implicit_class_weight = 0.5
(cls_targets, cls_weights, reg_targets, reg_weights,
_) = targetassigner.batch_assign_confidences(
agnostic_target_assigner,
anchors_boxlist,
gt_box_batch,
gt_class_confidences_batch,
include_background_class=False,
implicit_class_weight=implicit_class_weight)
return (cls_targets, cls_weights, reg_targets, reg_weights)
groundtruth_boxlist1 = np.array([[0., 0., 0.2, 0.2]], dtype=np.float32)
groundtruth_boxlist2 = np.array([[0, 0.25123152, 1, 1],
[0.015789, 0.0985, 0.55789, 0.3842]],
dtype=np.float32)
anchor_means = np.array([[0, 0, .25, .25],
[0, .25, 1, 1],
[0, .1, .5, .5],
[.75, .75, 1, 1]], dtype=np.float32)
exp_cls_targets = [[[1], [0], [0], [0]],
[[0], [1], [1], [0]]]
exp_cls_weights = [[[1], [0.5], [0.5], [0.5]],
[[0.5], [1], [1], [0.5]]]
exp_reg_targets = [[[0, 0, -0.5, -0.5],
[0, 0, 0, 0],
[0, 0, 0, 0,],
[0, 0, 0, 0,],],
[[0, 0, 0, 0,],
[0, 0.01231521, 0, 0],
[0.15789001, -0.01500003, 0.57889998, -1.15799987],
[0, 0, 0, 0]]]
exp_reg_weights = [[1, 0, 0, 0],
[0, 1, 1, 0]]
(cls_targets_out,
cls_weights_out, reg_targets_out, reg_weights_out) = self.execute(
graph_fn, [anchor_means, groundtruth_boxlist1, groundtruth_boxlist2])
self.assertAllClose(cls_targets_out, exp_cls_targets)
self.assertAllClose(cls_weights_out, exp_cls_weights)
self.assertAllClose(reg_targets_out, exp_reg_targets)
self.assertAllClose(reg_weights_out, exp_reg_weights)
def test_batch_assign_confidences_multiclass(self):
def graph_fn(anchor_means, groundtruth_boxlist1, groundtruth_boxlist2,
class_targets1, class_targets2):
box_list1 = box_list.BoxList(groundtruth_boxlist1)
box_list2 = box_list.BoxList(groundtruth_boxlist2)
gt_box_batch = [box_list1, box_list2]
gt_class_confidences_batch = [class_targets1, class_targets2]
anchors_boxlist = box_list.BoxList(anchor_means)
multiclass_target_assigner = self._get_target_assigner()
num_classes = 3
implicit_class_weight = 0.5
unmatched_class_label = tf.constant([1] + num_classes * [0], tf.float32)
(cls_targets, cls_weights, reg_targets, reg_weights,
_) = targetassigner.batch_assign_confidences(
multiclass_target_assigner,
anchors_boxlist,
gt_box_batch,
gt_class_confidences_batch,
unmatched_class_label=unmatched_class_label,
include_background_class=True,
implicit_class_weight=implicit_class_weight)
return (cls_targets, cls_weights, reg_targets, reg_weights)
groundtruth_boxlist1 = np.array([[0., 0., 0.2, 0.2]], dtype=np.float32)
groundtruth_boxlist2 = np.array([[0, 0.25123152, 1, 1],
[0.015789, 0.0985, 0.55789, 0.3842]],
dtype=np.float32)
class_targets1 = np.array([[0, 1, 0, 0]], dtype=np.float32)
class_targets2 = np.array([[0, 0, 0, 1],
[0, 0, -1, 0]], dtype=np.float32)
anchor_means = np.array([[0, 0, .25, .25],
[0, .25, 1, 1],
[0, .1, .5, .5],
[.75, .75, 1, 1]], dtype=np.float32)
exp_cls_targets = [[[0, 1, 0, 0],
[1, 0, 0, 0],
[1, 0, 0, 0],
[1, 0, 0, 0]],
[[1, 0, 0, 0],
[0, 0, 0, 1],
[1, 0, 0, 0],
[1, 0, 0, 0]]]
exp_cls_weights = [[[1, 1, 0.5, 0.5],
[0.5, 0.5, 0.5, 0.5],
[0.5, 0.5, 0.5, 0.5],
[0.5, 0.5, 0.5, 0.5]],
[[0.5, 0.5, 0.5, 0.5],
[1, 0.5, 0.5, 1],
[0.5, 0.5, 1, 0.5],
[0.5, 0.5, 0.5, 0.5]]]
exp_reg_targets = [[[0, 0, -0.5, -0.5],
[0, 0, 0, 0],
[0, 0, 0, 0,],
[0, 0, 0, 0,],],
[[0, 0, 0, 0,],
[0, 0.01231521, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]]]
exp_reg_weights = [[1, 0, 0, 0],
[0, 1, 0, 0]]
(cls_targets_out, cls_weights_out, reg_targets_out,
reg_weights_out) = self.execute(graph_fn, [
anchor_means, groundtruth_boxlist1, groundtruth_boxlist2,
class_targets1, class_targets2
])
self.assertAllClose(cls_targets_out, exp_cls_targets)
self.assertAllClose(cls_weights_out, exp_cls_weights)
self.assertAllClose(reg_targets_out, exp_reg_targets)
self.assertAllClose(reg_weights_out, exp_reg_weights)
def test_batch_assign_confidences_multiclass_with_padded_groundtruth(self):
def graph_fn(anchor_means, groundtruth_boxlist1, groundtruth_boxlist2,
class_targets1, class_targets2, groundtruth_weights1,
groundtruth_weights2):
box_list1 = box_list.BoxList(groundtruth_boxlist1)
box_list2 = box_list.BoxList(groundtruth_boxlist2)
gt_box_batch = [box_list1, box_list2]
gt_class_confidences_batch = [class_targets1, class_targets2]
gt_weights = [groundtruth_weights1, groundtruth_weights2]
anchors_boxlist = box_list.BoxList(anchor_means)
multiclass_target_assigner = self._get_target_assigner()
num_classes = 3
unmatched_class_label = tf.constant([1] + num_classes * [0], tf.float32)
implicit_class_weight = 0.5
(cls_targets, cls_weights, reg_targets, reg_weights,
_) = targetassigner.batch_assign_confidences(
multiclass_target_assigner,
anchors_boxlist,
gt_box_batch,
gt_class_confidences_batch,
gt_weights,
unmatched_class_label=unmatched_class_label,
include_background_class=True,
implicit_class_weight=implicit_class_weight)
return (cls_targets, cls_weights, reg_targets, reg_weights)
groundtruth_boxlist1 = np.array([[0., 0., 0.2, 0.2],
[0., 0., 0., 0.]], dtype=np.float32)
groundtruth_weights1 = np.array([1, 0], dtype=np.float32)
groundtruth_boxlist2 = np.array([[0, 0.25123152, 1, 1],
[0.015789, 0.0985, 0.55789, 0.3842],
[0, 0, 0, 0]],
dtype=np.float32)
groundtruth_weights2 = np.array([1, 1, 0], dtype=np.float32)
class_targets1 = np.array([[0, 1, 0, 0], [0, 0, 0, 0]], dtype=np.float32)
class_targets2 = np.array([[0, 0, 0, 1],
[0, 0, -1, 0],
[0, 0, 0, 0]], dtype=np.float32)
anchor_means = np.array([[0, 0, .25, .25],
[0, .25, 1, 1],
[0, .1, .5, .5],
[.75, .75, 1, 1]], dtype=np.float32)
exp_cls_targets = [[[0, 1, 0, 0],
[1, 0, 0, 0],
[1, 0, 0, 0],
[1, 0, 0, 0]],
[[1, 0, 0, 0],
[0, 0, 0, 1],
[1, 0, 0, 0],
[1, 0, 0, 0]]]
exp_cls_weights = [[[1, 1, 0.5, 0.5],
[0.5, 0.5, 0.5, 0.5],
[0.5, 0.5, 0.5, 0.5],
[0.5, 0.5, 0.5, 0.5]],
[[0.5, 0.5, 0.5, 0.5],
[1, 0.5, 0.5, 1],
[0.5, 0.5, 1, 0.5],
[0.5, 0.5, 0.5, 0.5]]]
exp_reg_targets = [[[0, 0, -0.5, -0.5],
[0, 0, 0, 0],
[0, 0, 0, 0,],
[0, 0, 0, 0,],],
[[0, 0, 0, 0,],
[0, 0.01231521, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]]]
exp_reg_weights = [[1, 0, 0, 0],
[0, 1, 0, 0]]
(cls_targets_out, cls_weights_out, reg_targets_out,
reg_weights_out) = self.execute(graph_fn, [
anchor_means, groundtruth_boxlist1, groundtruth_boxlist2,
class_targets1, class_targets2, groundtruth_weights1,
groundtruth_weights2
])
self.assertAllClose(cls_targets_out, exp_cls_targets)
self.assertAllClose(cls_weights_out, exp_cls_weights)
self.assertAllClose(reg_targets_out, exp_reg_targets)
self.assertAllClose(reg_weights_out, exp_reg_weights)
def test_batch_assign_confidences_multidimensional(self):
def graph_fn(anchor_means, groundtruth_boxlist1, groundtruth_boxlist2,
class_targets1, class_targets2):
box_list1 = box_list.BoxList(groundtruth_boxlist1)
box_list2 = box_list.BoxList(groundtruth_boxlist2)
gt_box_batch = [box_list1, box_list2]
gt_class_confidences_batch = [class_targets1, class_targets2]
anchors_boxlist = box_list.BoxList(anchor_means)
multiclass_target_assigner = self._get_target_assigner()
target_dimensions = (2, 3)
unmatched_class_label = tf.constant(np.zeros(target_dimensions),
tf.float32)
implicit_class_weight = 0.5
(cls_targets, cls_weights, reg_targets, reg_weights,
_) = targetassigner.batch_assign_confidences(
multiclass_target_assigner,
anchors_boxlist,
gt_box_batch,
gt_class_confidences_batch,
unmatched_class_label=unmatched_class_label,
include_background_class=True,
implicit_class_weight=implicit_class_weight)
return (cls_targets, cls_weights, reg_targets, reg_weights)
groundtruth_boxlist1 = np.array([[0., 0., 0.2, 0.2]], dtype=np.float32)
groundtruth_boxlist2 = np.array([[0, 0.25123152, 1, 1],
[0.015789, 0.0985, 0.55789, 0.3842]],
dtype=np.float32)
class_targets1 = np.array([[0, 1, 0, 0]], dtype=np.float32)
class_targets2 = np.array([[0, 0, 0, 1],
[0, 0, 1, 0]], dtype=np.float32)
class_targets1 = np.array([[[0, 1, 1],
[1, 1, 0]]], dtype=np.float32)
class_targets2 = np.array([[[0, 1, 1],
[1, 1, 0]],
[[0, 0, 1],
[0, 0, 1]]], dtype=np.float32)
anchor_means = np.array([[0, 0, .25, .25],
[0, .25, 1, 1],
[0, .1, .5, .5],
[.75, .75, 1, 1]], dtype=np.float32)
with self.assertRaises(ValueError):
_, _, _, _ = self.execute(graph_fn, [
anchor_means, groundtruth_boxlist1, groundtruth_boxlist2,
class_targets1, class_targets2
])
class CreateTargetAssignerTest(tf.test.TestCase):
def test_create_target_assigner(self):
......
item {
name: "background"
id: 0
display_name: "background"
}
item {
name: "/m/01g317"
id: 1
display_name: "person"
}
item {
name: "/m/0199g"
id: 2
display_name: "bicycle"
}
item {
name: "/m/0k4j"
id: 3
display_name: "car"
}
item {
name: "/m/04_sv"
id: 4
display_name: "motorcycle"
}
item {
name: "/m/05czz6l"
id: 5
display_name: "airplane"
}
item {
name: "/m/01bjv"
id: 6
display_name: "bus"
}
item {
name: "/m/07jdr"
id: 7
display_name: "train"
}
item {
name: "/m/07r04"
id: 8
display_name: "truck"
}
item {
name: "/m/019jd"
id: 9
display_name: "boat"
}
item {
name: "/m/015qff"
id: 10
display_name: "traffic light"
}
item {
name: "/m/01pns0"
id: 11
display_name: "fire hydrant"
}
item {
name: "12"
id: 12
display_name: "12"
}
item {
name: "/m/02pv19"
id: 13
display_name: "stop sign"
}
item {
name: "/m/015qbp"
id: 14
display_name: "parking meter"
}
item {
name: "/m/0cvnqh"
id: 15
display_name: "bench"
}
item {
name: "/m/015p6"
id: 16
display_name: "bird"
}
item {
name: "/m/01yrx"
id: 17
display_name: "cat"
}
item {
name: "/m/0bt9lr"
id: 18
display_name: "dog"
}
item {
name: "/m/03k3r"
id: 19
display_name: "horse"
}
item {
name: "/m/07bgp"
id: 20
display_name: "sheep"
}
item {
name: "/m/01xq0k1"
id: 21
display_name: "cow"
}
item {
name: "/m/0bwd_0j"
id: 22
display_name: "elephant"
}
item {
name: "/m/01dws"
id: 23
display_name: "bear"
}
item {
name: "/m/0898b"
id: 24
display_name: "zebra"
}
item {
name: "/m/03bk1"
id: 25
display_name: "giraffe"
}
item {
name: "26"
id: 26
display_name: "26"
}
item {
name: "/m/01940j"
id: 27
display_name: "backpack"
}
item {
name: "/m/0hnnb"
id: 28
display_name: "umbrella"
}
item {
name: "29"
id: 29
display_name: "29"
}
item {
name: "30"
id: 30
display_name: "30"
}
item {
name: "/m/080hkjn"
id: 31
display_name: "handbag"
}
item {
name: "/m/01rkbr"
id: 32
display_name: "tie"
}
item {
name: "/m/01s55n"
id: 33
display_name: "suitcase"
}
item {
name: "/m/02wmf"
id: 34
display_name: "frisbee"
}
item {
name: "/m/071p9"
id: 35
display_name: "skis"
}
item {
name: "/m/06__v"
id: 36
display_name: "snowboard"
}
item {
name: "/m/018xm"
id: 37
display_name: "sports ball"
}
item {
name: "/m/02zt3"
id: 38
display_name: "kite"
}
item {
name: "/m/03g8mr"
id: 39
display_name: "baseball bat"
}
item {
name: "/m/03grzl"
id: 40
display_name: "baseball glove"
}
item {
name: "/m/06_fw"
id: 41
display_name: "skateboard"
}
item {
name: "/m/019w40"
id: 42
display_name: "surfboard"
}
item {
name: "/m/0dv9c"
id: 43
display_name: "tennis racket"
}
item {
name: "/m/04dr76w"
id: 44
display_name: "bottle"
}
item {
name: "45"
id: 45
display_name: "45"
}
item {
name: "/m/09tvcd"
id: 46
display_name: "wine glass"
}
item {
name: "/m/08gqpm"
id: 47
display_name: "cup"
}
item {
name: "/m/0dt3t"
id: 48
display_name: "fork"
}
item {
name: "/m/04ctx"
id: 49
display_name: "knife"
}
item {
name: "/m/0cmx8"
id: 50
display_name: "spoon"
}
item {
name: "/m/04kkgm"
id: 51
display_name: "bowl"
}
item {
name: "/m/09qck"
id: 52
display_name: "banana"
}
item {
name: "/m/014j1m"
id: 53
display_name: "apple"
}
item {
name: "/m/0l515"
id: 54
display_name: "sandwich"
}
item {
name: "/m/0cyhj_"
id: 55
display_name: "orange"
}
item {
name: "/m/0hkxq"
id: 56
display_name: "broccoli"
}
item {
name: "/m/0fj52s"
id: 57
display_name: "carrot"
}
item {
name: "/m/01b9xk"
id: 58
display_name: "hot dog"
}
item {
name: "/m/0663v"
id: 59
display_name: "pizza"
}
item {
name: "/m/0jy4k"
id: 60
display_name: "donut"
}
item {
name: "/m/0fszt"
id: 61
display_name: "cake"
}
item {
name: "/m/01mzpv"
id: 62
display_name: "chair"
}
item {
name: "/m/02crq1"
id: 63
display_name: "couch"
}
item {
name: "/m/03fp41"
id: 64
display_name: "potted plant"
}
item {
name: "/m/03ssj5"
id: 65
display_name: "bed"
}
item {
name: "66"
id: 66
display_name: "66"
}
item {
name: "/m/04bcr3"
id: 67
display_name: "dining table"
}
item {
name: "68"
id: 68
display_name: "68"
}
item {
name: "69"
id: 69
display_name: "69"
}
item {
name: "/m/09g1w"
id: 70
display_name: "toilet"
}
item {
name: "71"
id: 71
display_name: "71"
}
item {
name: "/m/07c52"
id: 72
display_name: "tv"
}
item {
name: "/m/01c648"
id: 73
display_name: "laptop"
}
item {
name: "/m/020lf"
id: 74
display_name: "mouse"
}
item {
name: "/m/0qjjc"
id: 75
display_name: "remote"
}
item {
name: "/m/01m2v"
id: 76
display_name: "keyboard"
}
item {
name: "/m/050k8"
id: 77
display_name: "cell phone"
}
item {
name: "/m/0fx9l"
id: 78
display_name: "microwave"
}
item {
name: "/m/029bxz"
id: 79
display_name: "oven"
}
item {
name: "/m/01k6s3"
id: 80
display_name: "toaster"
}
item {
name: "/m/0130jx"
id: 81
display_name: "sink"
}
item {
name: "/m/040b_t"
id: 82
display_name: "refrigerator"
}
item {
name: "83"
id: 83
display_name: "83"
}
item {
name: "/m/0bt_c3"
id: 84
display_name: "book"
}
item {
name: "/m/01x3z"
id: 85
display_name: "clock"
}
item {
name: "/m/02s195"
id: 86
display_name: "vase"
}
item {
name: "/m/01lsmm"
id: 87
display_name: "scissors"
}
item {
name: "/m/0kmg4"
id: 88
display_name: "teddy bear"
}
item {
name: "/m/03wvsk"
id: 89
display_name: "hair drier"
}
item {
name: "/m/012xff"
id: 90
display_name: "toothbrush"
}
......@@ -14,7 +14,7 @@ A couple words of warning:
the container. When running through the tutorial,
**do not close the container**.
2. To be able to deploy the [Android app](
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/examples/android/app)
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/examples/android/app)
(which you will build at the end of the tutorial),
you will need to kill any instances of `adb` running on the host machine. You
can accomplish this by closing all instances of Android Studio, and then
......
......@@ -26,6 +26,7 @@ from object_detection.core import keypoint_ops
from object_detection.core import standard_fields as fields
from object_detection.metrics import coco_evaluation
from object_detection.utils import label_map_util
from object_detection.utils import object_detection_evaluation
from object_detection.utils import ops
from object_detection.utils import shape_utils
from object_detection.utils import visualization_utils as vis_utils
......@@ -40,6 +41,18 @@ EVAL_METRICS_CLASS_DICT = {
coco_evaluation.CocoDetectionEvaluator,
'coco_mask_metrics':
coco_evaluation.CocoMaskEvaluator,
'oid_challenge_detection_metrics':
object_detection_evaluation.OpenImagesDetectionChallengeEvaluator,
'pascal_voc_detection_metrics':
object_detection_evaluation.PascalDetectionEvaluator,
'weighted_pascal_voc_detection_metrics':
object_detection_evaluation.WeightedPascalDetectionEvaluator,
'pascal_voc_instance_segmentation_metrics':
object_detection_evaluation.PascalInstanceSegmentationEvaluator,
'weighted_pascal_voc_instance_segmentation_metrics':
object_detection_evaluation.WeightedPascalInstanceSegmentationEvaluator,
'oid_V2_detection_metrics':
object_detection_evaluation.OpenImagesDetectionEvaluator,
}
EVAL_DEFAULT_METRIC = 'coco_detection_metrics'
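A minimal sketch of how a name-to-class registry like `EVAL_METRICS_CLASS_DICT` can be consumed. The evaluator classes below are hypothetical stand-ins (the real ones live in `coco_evaluation` and `object_detection_evaluation`), and `build_evaluators` is an illustrative helper, not the function `eval_util` actually exposes.

```python
class CocoDetectionEvaluator(object):
  """Hypothetical stand-in for coco_evaluation.CocoDetectionEvaluator."""

  def __init__(self, categories):
    self.categories = categories


class PascalDetectionEvaluator(object):
  """Hypothetical stand-in for the Pascal VOC detection evaluator."""

  def __init__(self, categories):
    self.categories = categories


# Registry mapping metric-set names to evaluator classes (two entries only).
EVAL_METRICS_CLASS_DICT = {
    'coco_detection_metrics': CocoDetectionEvaluator,
    'pascal_voc_detection_metrics': PascalDetectionEvaluator,
}
EVAL_DEFAULT_METRIC = 'coco_detection_metrics'


def build_evaluators(metric_names, categories):
  """Instantiates one evaluator per requested metric name.

  Falls back to EVAL_DEFAULT_METRIC when no names are given and raises
  ValueError for names missing from the registry.
  """
  metric_names = list(metric_names) or [EVAL_DEFAULT_METRIC]
  evaluators = []
  for name in metric_names:
    if name not in EVAL_METRICS_CLASS_DICT:
      raise ValueError('Metric not found: {}'.format(name))
    evaluators.append(EVAL_METRICS_CLASS_DICT[name](categories))
  return evaluators


categories = [{'id': 1, 'name': 'person'}]
default_evaluators = build_evaluators([], categories)
pascal_evaluators = build_evaluators(['pascal_voc_detection_metrics'],
                                     categories)
```

Keeping the mapping in one dict means restoring a metric set, as this commit does, only requires adding an entry.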
......@@ -588,8 +601,7 @@ def result_dict_for_single_example(image,
exclude_keys = [
fields.InputDataFields.original_image,
fields.DetectionResultFields.num_detections,
fields.InputDataFields.num_groundtruth_boxes,
fields.InputDataFields.original_image_spatial_shape
fields.InputDataFields.num_groundtruth_boxes
]
output_dict = {
......@@ -611,6 +623,7 @@ def result_dict_for_batched_example(images,
class_agnostic=False,
scale_to_absolute=False,
original_image_spatial_shapes=None,
true_image_shapes=None,
max_gt_boxes=None):
"""Merges all detection and groundtruth information for a single example.
......@@ -646,6 +659,8 @@ def result_dict_for_batched_example(images,
coordinates. Default False.
original_image_spatial_shapes: A 2D int32 tensor of shape [batch_size, 2]
used to resize the image. When set to None, the image size is retained.
true_image_shapes: A 2D int32 tensor of shape [batch_size, 3]
containing the size of the unpadded original_image.
max_gt_boxes: [batch_size] tensor representing the maximum number of
groundtruth boxes to pad.
......@@ -654,6 +669,8 @@ def result_dict_for_batched_example(images,
'original_image': A [batch_size, H, W, C] uint8 image tensor.
'original_image_spatial_shape': A [batch_size, 2] tensor containing the
original image sizes.
'true_image_shape': A [batch_size, 3] tensor containing the size of
the unpadded original_image.
'key': A [batch_size] string tensor with image identifier.
'detection_boxes': [batch_size, max_detections, 4] float32 tensor of boxes,
in normalized or absolute coordinates, depending on the value of
......@@ -681,8 +698,10 @@ def result_dict_for_batched_example(images,
of groundtruth boxes per image.
Raises:
ValueError: if original_image_spatial_shape is not 1D int32 tensor of shape
ValueError: if original_image_spatial_shape is not 2D int32 tensor of shape
[2].
ValueError: if true_image_shapes is not 2D int32 tensor of shape
[3].
"""
label_id_offset = 1 # Applying label id offset (b/63711816)
......@@ -698,11 +717,25 @@ def result_dict_for_batched_example(images,
'`original_image_spatial_shape` should be a 2D tensor of shape '
'[batch_size, 2].')
if true_image_shapes is None:
true_image_shapes = tf.tile(
tf.expand_dims(tf.shape(images)[1:4], axis=0),
multiples=[tf.shape(images)[0], 1])
else:
if (len(true_image_shapes.shape) != 2
and true_image_shapes.shape[1] != 3):
raise ValueError('`true_image_shapes` should be a 2D tensor of '
'shape [batch_size, 3].')
output_dict = {
input_data_fields.original_image: images,
input_data_fields.key: keys,
input_data_fields.original_image:
images,
input_data_fields.key:
keys,
input_data_fields.original_image_spatial_shape: (
original_image_spatial_shapes)
original_image_spatial_shapes),
input_data_fields.true_image_shape:
true_image_shapes
}
detection_fields = fields.DetectionResultFields
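The default and the shape check for `true_image_shapes` can be sketched without TensorFlow by working on plain shape tuples. Both helper names are hypothetical. Note that the check in the hunk above joins its two conditions with `and`, so it raises only when both fail; the sketch uses `or`, which is an assumption about the intended behaviour rather than what the commit does.

```python
def default_true_image_shapes(images_shape):
  """Mimics the tf.tile default: one [H, W, C] row per image in the batch.

  images_shape is a (batch_size, height, width, channels) tuple.
  """
  batch_size = images_shape[0]
  per_image = list(images_shape[1:4])
  return [list(per_image) for _ in range(batch_size)]


def check_true_image_shapes(shape):
  """Raises ValueError unless shape describes a [batch_size, 3] tensor.

  Uses `or` so that a wrong rank OR a wrong last dimension is rejected.
  """
  if len(shape) != 2 or shape[1] != 3:
    raise ValueError('`true_image_shapes` should be a 2D tensor of '
                     'shape [batch_size, 3].')


shapes = default_true_image_shapes((2, 4, 6, 3))
check_true_image_shapes((len(shapes), len(shapes[0])))  # Passes silently.
```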
......
......@@ -47,7 +47,7 @@ class EvalUtilTest(test_case.TestCase, parameterized.TestCase):
if batch_size == 1:
key = tf.constant('image1')
else:
key = tf.constant([str(range(batch_size))])
key = tf.constant([str(i) for i in range(batch_size)])
detection_boxes = tf.tile(tf.constant([[[0., 0., 1., 1.]]]),
multiples=[batch_size, 1, 1])
detection_scores = tf.tile(tf.constant([[0.8]]), multiples=[batch_size, 1])
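The `key` fix in this hunk can be reproduced in plain Python. The old expression stringifies the whole `range` object, giving a single key regardless of `batch_size`; the new one yields one key per example. The variable names are illustrative only.

```python
batch_size = 3

# Old expression: str() is applied to the range itself, so the list has
# exactly one element no matter how large the batch is.
old_keys = [str(range(batch_size))]

# New expression: one string key per batch index.
new_keys = [str(i) for i in range(batch_size)]
```

With the old form, `tf.constant` would produce a tensor of shape `[1]` rather than `[batch_size]`, which does not line up with the batched detections built in the lines that follow.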
......
......@@ -107,8 +107,14 @@ flags.DEFINE_integer('max_detections', 10,
'Maximum number of detections (boxes) to show.')
flags.DEFINE_integer('max_classes_per_detection', 1,
'Number of classes to display per detection box.')
flags.DEFINE_integer(
'detections_per_class', 100,
'Number of anchors used per class in Regular Non-Max-Suppression.')
flags.DEFINE_bool('add_postprocessing_op', True,
'Add TFLite custom op for postprocessing to the graph.')
flags.DEFINE_bool(
'use_regular_nms', False,
'Flag to set postprocessing op to use Regular NMS instead of Fast NMS.')
flags.DEFINE_string(
'config_override', '', 'pipeline_pb2.TrainEvalPipelineConfig '
'text proto to override pipeline_config_path.')
......@@ -130,7 +136,7 @@ def main(argv):
export_tflite_ssd_graph_lib.export_tflite_graph(
pipeline_config, FLAGS.trained_checkpoint_prefix, FLAGS.output_directory,
FLAGS.add_postprocessing_op, FLAGS.max_detections,
FLAGS.max_classes_per_detection)
FLAGS.max_classes_per_detection, FLAGS.use_regular_nms)
if __name__ == '__main__':
......
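One thing worth checking when reading this hunk together with the new `export_tflite_graph` signature later in the commit: `detections_per_class` is declared before `use_regular_nms`, so a single extra positional argument after `max_classes_per_detection` binds to `detections_per_class`. The sketch below uses a hypothetical stub with the same parameter order to show the binding; it exports nothing.

```python
def export_tflite_graph_stub(pipeline_config,
                             trained_checkpoint_prefix,
                             output_dir,
                             add_postprocessing_op,
                             max_detections,
                             max_classes_per_detection,
                             detections_per_class=100,
                             use_regular_nms=False):
  """Hypothetical stub: returns how the two new arguments were bound."""
  return detections_per_class, use_regular_nms


# Seventh positional argument lands in detections_per_class.
positional = export_tflite_graph_stub('cfg', 'ckpt', 'out', True, 10, 1, True)

# Keyword arguments make the intent explicit.
keyword = export_tflite_graph_stub(
    'cfg', 'ckpt', 'out', True, 10, 1,
    detections_per_class=100, use_regular_nms=True)
```

Passing both new options by keyword avoids depending on their declaration order.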
......@@ -59,9 +59,15 @@ def get_const_center_size_encoded_anchors(anchors):
return encoded_anchors
def append_postprocessing_op(frozen_graph_def, max_detections,
max_classes_per_detection, nms_score_threshold,
nms_iou_threshold, num_classes, scale_values):
def append_postprocessing_op(frozen_graph_def,
max_detections,
max_classes_per_detection,
nms_score_threshold,
nms_iou_threshold,
num_classes,
scale_values,
detections_per_class=100,
use_regular_nms=False):
"""Appends postprocessing custom op.
Args:
......@@ -77,6 +83,10 @@ def append_postprocessing_op(frozen_graph_def, max_detections,
scale_values: scale values is a dict with following key-value pairs
{y_scale: 10, x_scale: 10, h_scale: 5, w_scale: 5} that are used in decode
centersize boxes
detections_per_class: In regular NonMaxSuppression, number of anchors used
for NonMaxSuppression per class
use_regular_nms: Flag to set postprocessing op to use Regular NMS instead
of Fast NMS.
Returns:
transformed_graph_def: Frozen GraphDef with postprocessing custom op
......@@ -121,6 +131,10 @@ def append_postprocessing_op(frozen_graph_def, max_detections,
attr_value_pb2.AttrValue(f=scale_values['h_scale'].pop()))
new_output.attr['w_scale'].CopyFrom(
attr_value_pb2.AttrValue(f=scale_values['w_scale'].pop()))
new_output.attr['detections_per_class'].CopyFrom(
attr_value_pb2.AttrValue(i=detections_per_class))
new_output.attr['use_regular_nms'].CopyFrom(
attr_value_pb2.AttrValue(b=use_regular_nms))
new_output.input.extend(
['raw_outputs/box_encodings', 'raw_outputs/class_predictions', 'anchors'])
......@@ -133,9 +147,14 @@ def append_postprocessing_op(frozen_graph_def, max_detections,
return transformed_graph_def
def export_tflite_graph(pipeline_config, trained_checkpoint_prefix, output_dir,
add_postprocessing_op, max_detections,
max_classes_per_detection):
def export_tflite_graph(pipeline_config,
trained_checkpoint_prefix,
output_dir,
add_postprocessing_op,
max_detections,
max_classes_per_detection,
detections_per_class=100,
use_regular_nms=False):
"""Exports a tflite compatible graph and anchors for ssd detection model.
Anchors are written to a tensor and tflite compatible graph
......@@ -151,7 +170,10 @@ def export_tflite_graph(pipeline_config, trained_checkpoint_prefix, output_dir,
TFLite_Detection_PostProcess custom op
max_detections: Maximum number of detections (boxes) to show
max_classes_per_detection: Number of classes to display per detection
detections_per_class: In regular NonMaxSuppression, number of anchors used
for NonMaxSuppression per class
use_regular_nms: Flag to set postprocessing op to use Regular NMS instead
of Fast NMS.
Raises:
ValueError: if the pipeline config contains models other than ssd or uses an
......@@ -276,7 +298,8 @@ def export_tflite_graph(pipeline_config, trained_checkpoint_prefix, output_dir,
if add_postprocessing_op:
transformed_graph_def = append_postprocessing_op(
frozen_graph_def, max_detections, max_classes_per_detection,
nms_score_threshold, nms_iou_threshold, num_classes, scale_values)
nms_score_threshold, nms_iou_threshold, num_classes, scale_values,
detections_per_class, use_regular_nms)
else:
# Return frozen without adding post-processing custom op
transformed_graph_def = frozen_graph_def
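For orientation, here is a generic per-class greedy non-max suppression in pure Python. It is not the implementation of the `TFLite_Detection_PostProcess` custom op; treating `detections_per_class` as a cap on boxes kept per class is an assumption based on the flag descriptions in this commit, and all function names are hypothetical.

```python
def iou(a, b):
  """Intersection over union of two [ymin, xmin, ymax, xmax] boxes."""
  inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
  inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
  inter = inter_h * inter_w
  union = ((a[2] - a[0]) * (a[3] - a[1]) +
           (b[2] - b[0]) * (b[3] - b[1]) - inter)
  return inter / union if union > 0 else 0.0


def nms_single_class(boxes, scores, score_threshold, iou_threshold, max_keep):
  """Greedy NMS for one class; returns kept box indices, best score first."""
  order = sorted(
      (i for i, s in enumerate(scores) if s >= score_threshold),
      key=lambda i: -scores[i])
  kept = []
  for i in order:
    if len(kept) >= max_keep:
      break
    # Keep the box only if it does not overlap a higher-scoring kept box.
    if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
      kept.append(i)
  return kept


def per_class_nms(boxes, class_scores, score_threshold, iou_threshold,
                  detections_per_class, max_detections):
  """Runs NMS per class, then keeps the top max_detections overall.

  class_scores[i][c] is the score of box i for class c.
  Returns a list of (score, box_index, class_index) tuples.
  """
  candidates = []
  for c in range(len(class_scores[0])):
    scores = [row[c] for row in class_scores]
    for i in nms_single_class(boxes, scores, score_threshold, iou_threshold,
                              detections_per_class):
      candidates.append((scores[i], i, c))
  candidates.sort(key=lambda t: -t[0])
  return candidates[:max_detections]


boxes = [[0, 0, 1, 1], [0, 0, 1, 0.9], [2, 2, 3, 3]]
class_scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.6]]
result = per_class_nms(boxes, class_scores, 0.5, 0.5,
                       detections_per_class=10, max_detections=10)
capped = per_class_nms(boxes, class_scores, 0.5, 0.5,
                       detections_per_class=1, max_detections=10)
```

In the example, box 1 overlaps box 0 with IoU 0.9 and is suppressed for class 0, while box 2 survives for both classes.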
......