Unverified Commit fe748d4a authored by pkulzc's avatar pkulzc Committed by GitHub

Object detection changes: (#7208)

257914648  by lzc:

    Internal changes

--
257525973  by Zhichao Lu:

    Fixes bug that silently prevents checkpoints from loading when training w/ eager + functions. Also sets up scripts to run training.

--
257296614  by Zhichao Lu:

    Adding detection_features to model outputs

--
257234565  by Zhichao Lu:

    Fix wrong order of `classes_with_max_scores` in class-agnostic NMS caused by
    sorting in partitioned-NMS.

--
257232002  by ronnyvotel:

    Supporting `filter_nonoverlapping` option in np_box_list_ops.clip_to_window().
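    The option can be illustrated with a minimal NumPy sketch (a hypothetical
    standalone version of the np_box_list_ops behavior, not the API's actual
    signature): boxes are clipped to the window, and boxes left with zero area
    are optionally dropped.

    ```python
    import numpy as np

    def clip_to_window(boxes, window, filter_nonoverlapping=True):
        """Clip [ymin, xmin, ymax, xmax] boxes to a window.

        If filter_nonoverlapping is True, boxes reduced to zero area by the
        clipping are removed from the result.
        """
        wy1, wx1, wy2, wx2 = window
        clipped = np.column_stack([
            np.clip(boxes[:, 0], wy1, wy2),
            np.clip(boxes[:, 1], wx1, wx2),
            np.clip(boxes[:, 2], wy1, wy2),
            np.clip(boxes[:, 3], wx1, wx2),
        ])
        if filter_nonoverlapping:
            areas = (clipped[:, 2] - clipped[:, 0]) * (clipped[:, 3] - clipped[:, 1])
            clipped = clipped[areas > 0]
        return clipped
    ```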

--
257198282  by Zhichao Lu:

    Adding the focal loss and l1 loss from the Objects as Points paper.
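    The heatmap loss from that paper is a penalty-reduced pixelwise focal loss
    over a Gaussian ground truth heatmap. Below is a minimal NumPy sketch of
    that formula (Eq. 1 of Zhou et al., "Objects as Points"); the function name
    and array layout are illustrative, not the API's actual implementation.

    ```python
    import numpy as np

    def center_focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-12):
        """Penalty-reduced focal loss on a Gaussian heatmap.

        pred: predicted heatmap probabilities in (0, 1).
        gt: ground truth heatmap; exactly 1.0 at object centers, a Gaussian
            bump elsewhere. Loss is normalized by the number of centers.
        """
        pos = gt == 1.0
        neg = ~pos
        # Standard focal term at positive (center) pixels.
        pos_loss = -np.log(pred[pos] + eps) * (1.0 - pred[pos]) ** alpha
        # Negative pixels near a center are down-weighted by (1 - gt)^beta.
        neg_loss = (-np.log(1.0 - pred[neg] + eps) * pred[neg] ** alpha
                    * (1.0 - gt[neg]) ** beta)
        num_pos = max(pos.sum(), 1)
        return (pos_loss.sum() + neg_loss.sum()) / num_pos
    ```

    The accompanying L1 loss in the paper is a plain absolute-error regression
    on the center offsets and box sizes, evaluated only at center locations.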

--
257089535  by Zhichao Lu:

    Create Keras based ssd + resnetv1 + fpn.

--
257087407  by Zhichao Lu:

    Make object_detection/data_decoders Python3-compatible.

--
257004582  by Zhichao Lu:

    Updates _decode_raw_data_into_masks_and_boxes to the latest binary masks-to-string encoding format.

--
257002124  by Zhichao Lu:

    Make object_detection/utils Python3-compatible, except json_utils.

    The patching trick used in json_utils is not going to work in Python 3.

--
256795056  by lzc:

    Add a detection_anchor_indices field to detection outputs.

--
256477542  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
256387593  by Zhichao Lu:

    Edit class_id_function_approximations builder to skip class ids not present in label map.

--
256259039  by Zhichao Lu:

    Move NMS to TPU for FasterRCNN.

--
256071360  by rathodv:

    When multiclass_scores is empty, add one-hot encoding of groundtruth_classes as multiclass scores so that data_augmentation ops that expect the presence of multiclass_scores don't have to individually handle this case.

    Also copy input tensor_dict to out_tensor_dict first to avoid inplace modification.
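    A hedged NumPy sketch of that fallback (field names taken from the
    description above; the helper name is hypothetical):

    ```python
    import numpy as np

    def add_default_multiclass_scores(tensor_dict, num_classes):
        """If multiclass_scores is empty, substitute a one-hot encoding of
        groundtruth_classes, so downstream augmentation ops can assume the
        field is always populated."""
        # Copy first, so the input dict is never modified in place.
        out_tensor_dict = dict(tensor_dict)
        scores = np.asarray(out_tensor_dict.get('multiclass_scores', []))
        if scores.size == 0:
            classes = np.asarray(out_tensor_dict['groundtruth_classes'])
            out_tensor_dict['multiclass_scores'] = np.eye(num_classes)[classes]
        return out_tensor_dict
    ```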

--
256023645  by Zhichao Lu:

    Adds the first WIP iterations of TensorFlow v2 eager + functions style custom training & evaluation loops.

--
255980623  by Zhichao Lu:

    Adds a new data augmentation operation "remap_labels" which remaps a set of labels to a new label.
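    The core of such an op reduces to a vectorized relabeling; a minimal
    NumPy sketch (argument names assumed from the description):

    ```python
    import numpy as np

    def remap_labels(labels, original_labels, new_label):
        """Replace every label in `original_labels` with `new_label`,
        leaving all other labels untouched."""
        labels = np.asarray(labels)
        return np.where(np.isin(labels, original_labels), new_label, labels)
    ```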

--
255753259  by Zhichao Lu:

    Announcement of the released evaluation tutorial for Open Images Challenge
    2019.

--
255698776  by lzc:

    Fix rewrite_nn_resize_op function which was broken by tf forward compatibility movement.

--
255623150  by Zhichao Lu:

    Add Keras-based ResnetV1 models.

--
255504992  by Zhichao Lu:

    Fixing the typo in specifying label expansion for ground truth segmentation
    file.

--
255470768  by Zhichao Lu:

    1. Fixing Python bug with parsed arguments.
    2. Adding capability to parse relevant columns from CSV header.
    3. Fixing bug with duplicated labels expansion.

--
255462432  by Zhichao Lu:

    Adds a new data augmentation operation "drop_label_probabilistically" which drops a given label with the given probability. This supports experiments on training in the presence of label noise.
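    The sampling rule can be sketched as a boolean keep-mask over the boxes
    (a simplified standalone version; the real op also filters the
    corresponding boxes, weights, and other fields):

    ```python
    import numpy as np

    def drop_label_probabilistically(labels, dropped_label, drop_prob, rng):
        """Each box carrying `dropped_label` is independently dropped with
        probability `drop_prob`. Returns a boolean keep mask."""
        labels = np.asarray(labels)
        drop = (labels == dropped_label) & (rng.random(labels.shape) < drop_prob)
        return ~drop
    ```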

--
255441632  by rathodv:

    Fallback on groundtruth classes when multiclass_scores tensor is empty.

--
255434899  by Zhichao Lu:

    Ensuring the evaluation binary can run even with big files by synchronizing
    processing of ground truth and predictions: this way, ground truth is not
    stored but immediately used for evaluation. In the case of ground truth
    object masks, this allows running evaluations on relatively large sets.

--
255337855  by lzc:

    Internal change.

--
255308908  by Zhichao Lu:

    Add comment to clarify usage of calibration parameters proto.

--
255266371  by Zhichao Lu:

    Ensuring correct processing of the case when no groundtruth masks are
    provided for an image.

--
255236648  by Zhichao Lu:

    Refactor model_builder in faster_rcnn.py to a util_map, so that it can be overridden.

--
255093285  by Zhichao Lu:

    Updating capability to subsample data during evaluation

--
255081222  by rathodv:

    Convert groundtruth masks to float32 before they are used in the loss function.

    When using mixed precision training, masks are represented using bfloat16 tensors in the input pipeline for performance reasons. We need to convert them to float32 before using it in the loss function.

--
254788436  by Zhichao Lu:

    Add forward_compatible to non_max_suppression_with_scores to make it
    compatible with older TensorFlow versions.

--
254442362  by Zhichao Lu:

    Add num_layer field to ssd feature extractor proto.

--
253911582  by jonathanhuang:

    Plumbs Soft-NMS options (using the new tf.image.non_max_suppression_with_scores op) into the TF Object Detection API.  It adds a `soft_nms_sigma` field to the postprocessing proto file and plumbs this through to both the multiclass and class_agnostic versions of NMS. Note that there is no effect on behavior of NMS when soft_nms_sigma=0 (which it is set to by default).

    See also "Soft-NMS -- Improving Object Detection With One Line of Code" by Bodla et al (https://arxiv.org/abs/1704.04503)
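    A hedged NumPy sketch of the greedy Gaussian score-decay rule from Bodla
    et al. (an illustration of the idea, not the tf.image op's actual
    implementation; the hard-NMS IoU threshold of 0.5 below is an assumption):

    ```python
    import numpy as np

    def _iou(box, boxes):
        # IoU of one [ymin, xmin, ymax, xmax] box against many.
        inter_h = np.clip(np.minimum(box[2], boxes[:, 2]) -
                          np.maximum(box[0], boxes[:, 0]), 0, None)
        inter_w = np.clip(np.minimum(box[3], boxes[:, 3]) -
                          np.maximum(box[1], boxes[:, 1]), 0, None)
        inter = inter_h * inter_w
        area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
        return inter / (area(box) + area(boxes) - inter)

    def soft_nms(boxes, scores, soft_nms_sigma=0.5, score_thresh=0.001):
        """Greedy soft-NMS: overlapping boxes have their scores decayed by
        exp(-iou**2 / sigma) instead of being removed outright. With
        soft_nms_sigma == 0 this degrades to hard NMS at IoU 0.5."""
        boxes = np.asarray(boxes, dtype=float)
        scores = np.asarray(scores, dtype=float)
        kept = []
        while scores.size:
            i = int(scores.argmax())
            if scores[i] < score_thresh:
                break
            kept.append((boxes[i], scores[i]))
            ious = _iou(boxes[i], boxes)
            if soft_nms_sigma > 0:
                decay = np.exp(-ious ** 2 / soft_nms_sigma)
            else:
                decay = (ious < 0.5).astype(float)
            mask = np.arange(scores.size) != i
            scores = (scores * decay)[mask]
            boxes = boxes[mask]
        return kept
    ```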

--
253703949  by Zhichao Lu:

    Internal test fixes.

--
253151266  by Zhichao Lu:

    Fix the op type check for FusedBatchNorm, given that we introduced
    FusedBatchNormV3 in a previous change.

--
252718956  by Zhichao Lu:

    Customize activation function to enable relu6 instead of relu for saliency
    prediction model seastarization

--
252158593  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
252150717  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
251967048  by Zhichao Lu:

    Make GraphRewriter proto extensible.

--
251950039  by Zhichao Lu:

    Remove experimental_export_device_assignment from TPUEstimator.export_savedmodel(), so as to remove rewrite_for_inference().

    As a replacement, the export_savedmodel() V2 API supports device assignment, where the user calls tpu.rewrite in model_fn and passes in device_assignment there.

--
251890697  by rathodv:

    Updated docstring to include new output nodes.

--
251662894  by Zhichao Lu:

    Add the autoaugment augmentation option to the object detection API
    codebase. This is an available option in preprocessor.py.

    Autoaugment is intended to be used together with random flipping and
    cropping for best results.

--
251532908  by Zhichao Lu:

    Add TrainingDataType enum to track whether class-specific or agnostic data was used to fit the calibration function.

    This is useful, since classes with few observations may require a calibration function fit on all classes.

--
251511339  by Zhichao Lu:

    Add multiclass isotonic regression to the calibration builder.
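    Isotonic regression fits a nondecreasing map from raw detection score to
    empirical precision; for the multiclass builder, one such map is fit per
    class id. A minimal pool-adjacent-violators sketch for a single class
    (an illustration of the technique, not the builder's actual code):

    ```python
    import numpy as np

    def isotonic_calibration(scores, hits):
        """Fit a nondecreasing map from score to empirical precision via
        pool-adjacent-violators. `hits` are 0/1 correctness labels.
        Returns (sorted_scores, fitted_values)."""
        order = np.argsort(scores)
        x = np.asarray(scores, dtype=float)[order]
        y = np.asarray(hits, dtype=float)[order]
        # Each block holds [sum_of_values, count]; merge while the running
        # block means would otherwise decrease.
        blocks = []
        for value in y:
            blocks.append([value, 1])
            while (len(blocks) > 1 and
                   blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
                v, n = blocks.pop()
                blocks[-1][0] += v
                blocks[-1][1] += n
        fitted = np.concatenate([[s / n] * n for s, n in blocks])
        return x, fitted
    ```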

--
251317769  by pengchong:

    Internal Change.

--
250729989  by Zhichao Lu:

    Fixing bug in gt statistics count in case of mask and box annotations.

--
250729627  by Zhichao Lu:

    Label expansion for segmentation.

--
250724905  by Zhichao Lu:

    Fix use_depthwise in fpn and test it with fpnlite on ssd + mobilenet v2.

--
250670379  by Zhichao Lu:

    Internal change

--
250630364  by lzc:

    Fix detection_model_zoo footnotes

--
250560654  by Zhichao Lu:

    Fix static shape issue in matmul_crop_and_resize.

--
250534857  by Zhichao Lu:

    Edit class agnostic calibration function docstring to more accurately describe the function's outputs.

--
250533277  by Zhichao Lu:

    Edit the multiclass messages to use class ids instead of labels.

--

PiperOrigin-RevId: 257914648
parent 81123ebf
@@ -21,6 +21,7 @@ Based on PNASNet model: https://arxiv.org/abs/1712.00559
 import tensorflow as tf
 from object_detection.meta_architectures import faster_rcnn_meta_arch
+from object_detection.utils import variables_helper
 from nets.nasnet import nasnet_utils
 from nets.nasnet import pnasnet
@@ -302,7 +303,7 @@ class FasterRCNNPNASFeatureExtractor(
         the model graph.
     """
     variables_to_restore = {}
-    for variable in tf.global_variables():
+    for variable in variables_helper.get_global_variables_safely():
       if variable.op.name.startswith(
           first_stage_feature_extractor_scope):
         var_name = variable.op.name.replace(
...
@@ -44,7 +44,8 @@ class FasterRCNNResnetV1FeatureExtractor(
                first_stage_features_stride,
                batch_norm_trainable=False,
                reuse_weights=None,
-               weight_decay=0.0):
+               weight_decay=0.0,
+               activation_fn=tf.nn.relu):
     """Constructor.

     Args:
@@ -55,6 +56,7 @@ class FasterRCNNResnetV1FeatureExtractor(
       batch_norm_trainable: See base class.
       reuse_weights: See base class.
       weight_decay: See base class.
+      activation_fn: Activaton functon to use in Resnet V1 model.

     Raises:
       ValueError: If `first_stage_features_stride` is not 8 or 16.
@@ -63,9 +65,10 @@ class FasterRCNNResnetV1FeatureExtractor(
       raise ValueError('`first_stage_features_stride` must be 8 or 16.')
     self._architecture = architecture
     self._resnet_model = resnet_model
-    super(FasterRCNNResnetV1FeatureExtractor, self).__init__(
-        is_training, first_stage_features_stride, batch_norm_trainable,
-        reuse_weights, weight_decay)
+    self._activation_fn = activation_fn
+    super(FasterRCNNResnetV1FeatureExtractor,
+          self).__init__(is_training, first_stage_features_stride,
+                         batch_norm_trainable, reuse_weights, weight_decay)

   def preprocess(self, resized_inputs):
     """Faster R-CNN Resnet V1 preprocessing.
@@ -125,6 +128,7 @@ class FasterRCNNResnetV1FeatureExtractor(
         resnet_utils.resnet_arg_scope(
             batch_norm_epsilon=1e-5,
             batch_norm_scale=True,
+            activation_fn=self._activation_fn,
             weight_decay=self._weight_decay)):
       with tf.variable_scope(
           self._architecture, reuse=self._reuse_weights) as var_scope:
@@ -159,6 +163,7 @@ class FasterRCNNResnetV1FeatureExtractor(
         resnet_utils.resnet_arg_scope(
             batch_norm_epsilon=1e-5,
             batch_norm_scale=True,
+            activation_fn=self._activation_fn,
             weight_decay=self._weight_decay)):
       with slim.arg_scope([slim.batch_norm],
                           is_training=self._train_batch_norm):
@@ -182,7 +187,8 @@ class FasterRCNNResnet50FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
                first_stage_features_stride,
                batch_norm_trainable=False,
                reuse_weights=None,
-               weight_decay=0.0):
+               weight_decay=0.0,
+               activation_fn=tf.nn.relu):
     """Constructor.

     Args:
@@ -191,15 +197,16 @@ class FasterRCNNResnet50FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
       batch_norm_trainable: See base class.
       reuse_weights: See base class.
       weight_decay: See base class.
+      activation_fn: See base class.

     Raises:
       ValueError: If `first_stage_features_stride` is not 8 or 16,
         or if `architecture` is not supported.
     """
-    super(FasterRCNNResnet50FeatureExtractor, self).__init__(
-        'resnet_v1_50', resnet_v1.resnet_v1_50, is_training,
-        first_stage_features_stride, batch_norm_trainable,
-        reuse_weights, weight_decay)
+    super(FasterRCNNResnet50FeatureExtractor,
+          self).__init__('resnet_v1_50', resnet_v1.resnet_v1_50, is_training,
+                         first_stage_features_stride, batch_norm_trainable,
+                         reuse_weights, weight_decay, activation_fn)


 class FasterRCNNResnet101FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
@@ -210,7 +217,8 @@ class FasterRCNNResnet101FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
                first_stage_features_stride,
                batch_norm_trainable=False,
                reuse_weights=None,
-               weight_decay=0.0):
+               weight_decay=0.0,
+               activation_fn=tf.nn.relu):
     """Constructor.

     Args:
@@ -219,15 +227,16 @@ class FasterRCNNResnet101FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
       batch_norm_trainable: See base class.
       reuse_weights: See base class.
       weight_decay: See base class.
+      activation_fn: See base class.

     Raises:
       ValueError: If `first_stage_features_stride` is not 8 or 16,
         or if `architecture` is not supported.
     """
-    super(FasterRCNNResnet101FeatureExtractor, self).__init__(
-        'resnet_v1_101', resnet_v1.resnet_v1_101, is_training,
-        first_stage_features_stride, batch_norm_trainable,
-        reuse_weights, weight_decay)
+    super(FasterRCNNResnet101FeatureExtractor,
+          self).__init__('resnet_v1_101', resnet_v1.resnet_v1_101, is_training,
+                         first_stage_features_stride, batch_norm_trainable,
+                         reuse_weights, weight_decay, activation_fn)


 class FasterRCNNResnet152FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
@@ -238,7 +247,8 @@ class FasterRCNNResnet152FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
                first_stage_features_stride,
                batch_norm_trainable=False,
                reuse_weights=None,
-               weight_decay=0.0):
+               weight_decay=0.0,
+               activation_fn=tf.nn.relu):
     """Constructor.

     Args:
@@ -247,12 +257,13 @@ class FasterRCNNResnet152FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
       batch_norm_trainable: See base class.
       reuse_weights: See base class.
       weight_decay: See base class.
+      activation_fn: See base class.

     Raises:
       ValueError: If `first_stage_features_stride` is not 8 or 16,
         or if `architecture` is not supported.
     """
-    super(FasterRCNNResnet152FeatureExtractor, self).__init__(
-        'resnet_v1_152', resnet_v1.resnet_v1_152, is_training,
-        first_stage_features_stride, batch_norm_trainable,
-        reuse_weights, weight_decay)
+    super(FasterRCNNResnet152FeatureExtractor,
+          self).__init__('resnet_v1_152', resnet_v1.resnet_v1_152, is_training,
+                         first_stage_features_stride, batch_norm_trainable,
+                         reuse_weights, weight_decay, activation_fn)
...
@@ -25,6 +25,7 @@ class FasterRcnnResnetV1FeatureExtractorTest(tf.test.TestCase):

   def _build_feature_extractor(self,
                                first_stage_features_stride,
+                               activation_fn=tf.nn.relu,
                                architecture='resnet_v1_101'):
     feature_extractor_map = {
         'resnet_v1_50':
@@ -37,6 +38,7 @@ class FasterRcnnResnetV1FeatureExtractorTest(tf.test.TestCase):
     return feature_extractor_map[architecture](
         is_training=False,
         first_stage_features_stride=first_stage_features_stride,
+        activation_fn=activation_fn,
         batch_norm_trainable=False,
         reuse_weights=None,
         weight_decay=0.0)
@@ -132,6 +134,32 @@ class FasterRcnnResnetV1FeatureExtractorTest(tf.test.TestCase):
       features_shape_out = sess.run(features_shape)
       self.assertAllEqual(features_shape_out, [3, 7, 7, 2048])

+  def test_overwriting_activation_fn(self):
+    for architecture in ['resnet_v1_50', 'resnet_v1_101', 'resnet_v1_152']:
+      feature_extractor = self._build_feature_extractor(
+          first_stage_features_stride=16,
+          architecture=architecture,
+          activation_fn=tf.nn.relu6)
+      preprocessed_inputs = tf.random_uniform([4, 224, 224, 3],
+                                              maxval=255,
+                                              dtype=tf.float32)
+      rpn_feature_map, _ = feature_extractor.extract_proposal_features(
+          preprocessed_inputs, scope='TestStage1Scope')
+      _ = feature_extractor.extract_box_classifier_features(
+          rpn_feature_map, scope='TestStaget2Scope')
+      conv_ops = [
+          op for op in tf.get_default_graph().get_operations()
+          if op.type == 'Relu6'
+      ]
+      op_names = [op.name for op in conv_ops]
+      self.assertIsNotNone(conv_ops)
+      self.assertIn('TestStage1Scope/resnet_v1_50/resnet_v1_50/conv1/Relu6',
+                    op_names)
+      self.assertIn(
+          'TestStaget2Scope/resnet_v1_50/block4/unit_1/bottleneck_v1/conv1/Relu6',
+          op_names)


 if __name__ == '__main__':
   tf.test.main()
...
@@ -79,14 +79,19 @@ def create_conv_block(
   """
   layers = []
   if use_depthwise:
-    layers.append(tf.keras.layers.SeparableConv2D(
-        depth,
-        [kernel_size, kernel_size],
-        depth_multiplier=1,
-        padding=padding,
-        strides=stride,
-        name=layer_name + '_depthwise_conv',
-        **conv_hyperparams.params()))
+    kwargs = conv_hyperparams.params()
+    # Both the regularizer and initializer apply to the depthwise layer,
+    # so we remap the kernel_* to depthwise_* here.
+    kwargs['depthwise_regularizer'] = kwargs['kernel_regularizer']
+    kwargs['depthwise_initializer'] = kwargs['kernel_initializer']
+    layers.append(
+        tf.keras.layers.SeparableConv2D(
+            depth, [kernel_size, kernel_size],
+            depth_multiplier=1,
+            padding=padding,
+            strides=stride,
+            name=layer_name + '_depthwise_conv',
+            **kwargs))
   else:
     layers.append(tf.keras.layers.Conv2D(
         depth,
...
@@ -160,7 +160,12 @@ class _LayersOverride(object):
     """
     if self._conv_hyperparams:
       kwargs = self._conv_hyperparams.params(**kwargs)
+      # Both the regularizer and initializer apply to the depthwise layer in
+      # MobilenetV1, so we remap the kernel_* to depthwise_* here.
+      kwargs['depthwise_regularizer'] = kwargs['kernel_regularizer']
+      kwargs['depthwise_initializer'] = kwargs['kernel_initializer']
     else:
+      kwargs['depthwise_regularizer'] = self.regularizer
       kwargs['depthwise_initializer'] = self.initializer
     kwargs['padding'] = 'same'
...
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""A wrapper around the Keras Resnet V1 models for object detection."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from object_detection.core import freezable_batch_norm
def _fixed_padding(inputs, kernel_size, rate=1): # pylint: disable=invalid-name
"""Pads the input along the spatial dimensions independently of input size.
Pads the input such that if it was used in a convolution with 'VALID' padding,
the output would have the same dimensions as if the unpadded input was used
in a convolution with 'SAME' padding.
Args:
inputs: A tensor of size [batch, height_in, width_in, channels].
kernel_size: The kernel to be used in the conv2d or max_pool2d operation.
rate: An integer, rate for atrous convolution.
Returns:
output: A tensor of size [batch, height_out, width_out, channels] with the
input, either intact (if kernel_size == 1) or padded (if kernel_size > 1).
"""
kernel_size_effective = kernel_size + (kernel_size - 1) * (rate - 1)
pad_total = kernel_size_effective - 1
pad_beg = pad_total // 2
pad_end = pad_total - pad_beg
padded_inputs = tf.pad(
inputs, [[0, 0], [pad_beg, pad_end], [pad_beg, pad_end], [0, 0]])
return padded_inputs
class _LayersOverride(object):
"""Alternative Keras layers interface for the Keras Resnet V1."""
def __init__(self,
batchnorm_training,
batchnorm_scale=True,
default_batchnorm_momentum=0.997,
default_batchnorm_epsilon=1e-5,
weight_decay=0.0001,
conv_hyperparams=None,
min_depth=8,
depth_multiplier=1):
"""Alternative tf.keras.layers interface, for use by the Keras Resnet V1.
The class is used by the Keras applications kwargs injection API to
modify the Resnet V1 Keras application with changes required by
the Object Detection API.
Args:
batchnorm_training: Bool. Assigned to Batch norm layer `training` param
when constructing `freezable_batch_norm.FreezableBatchNorm` layers.
batchnorm_scale: If True, uses an explicit `gamma` multiplier to scale
the activations in the batch normalization layer.
default_batchnorm_momentum: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the momentum.
default_batchnorm_epsilon: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the epsilon.
weight_decay: The weight decay to use for regularizing the model.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops. Optionally set to `None`
to use default resnet_v1 layer builders.
min_depth: Minimum number of filters in the convolutional layers.
depth_multiplier: The depth multiplier to modify the number of filters
in the convolutional layers.
"""
self._batchnorm_training = batchnorm_training
self._batchnorm_scale = batchnorm_scale
self._default_batchnorm_momentum = default_batchnorm_momentum
self._default_batchnorm_epsilon = default_batchnorm_epsilon
self._conv_hyperparams = conv_hyperparams
self._min_depth = min_depth
self._depth_multiplier = depth_multiplier
self.regularizer = tf.keras.regularizers.l2(weight_decay)
self.initializer = tf.variance_scaling_initializer()
def _FixedPaddingLayer(self, kernel_size, rate=1):
return tf.keras.layers.Lambda(
lambda x: _fixed_padding(x, kernel_size, rate))
def Conv2D(self, filters, kernel_size, **kwargs):
"""Builds a Conv2D layer according to the current Object Detection config.
Overrides the Keras Resnet application's convolutions with ones that
follow the spec specified by the Object Detection hyperparameters.
Args:
filters: The number of filters to use for the convolution.
kernel_size: The kernel size to specify the height and width of the 2D
convolution window.
**kwargs: Keyword args specified by the Keras application for
constructing the convolution.
Returns:
A one-arg callable that will either directly apply a Keras Conv2D layer to
the input argument, or that will first pad the input then apply a Conv2D
layer.
"""
# Apply the minimum depth to the convolution layers.
filters = max(int(filters * self._depth_multiplier), self._min_depth)
if self._conv_hyperparams:
kwargs = self._conv_hyperparams.params(**kwargs)
else:
kwargs['kernel_regularizer'] = self.regularizer
kwargs['kernel_initializer'] = self.initializer
# Set use_bias as false to keep it consistent with Slim Resnet model.
kwargs['use_bias'] = False
kwargs['padding'] = 'same'
stride = kwargs.get('strides')
if stride and kernel_size and stride > 1 and kernel_size > 1:
kwargs['padding'] = 'valid'
def padded_conv(features): # pylint: disable=invalid-name
padded_features = self._FixedPaddingLayer(kernel_size)(features)
return tf.keras.layers.Conv2D(
filters, kernel_size, **kwargs)(padded_features)
return padded_conv
else:
return tf.keras.layers.Conv2D(filters, kernel_size, **kwargs)
def Activation(self, *args, **kwargs): # pylint: disable=unused-argument
"""Builds an activation layer.
Overrides the Keras application Activation layer specified by the
Object Detection configuration.
Args:
*args: Ignored,
required to match the `tf.keras.layers.Activation` interface.
**kwargs: Only the name is used,
required to match `tf.keras.layers.Activation` interface.
Returns:
An activation layer specified by the Object Detection hyperparameter
configurations.
"""
name = kwargs.get('name')
if self._conv_hyperparams:
return self._conv_hyperparams.build_activation_layer(name=name)
else:
return tf.keras.layers.Lambda(tf.nn.relu, name=name)
def BatchNormalization(self, **kwargs):
"""Builds a normalization layer.
Overrides the Keras application batch norm with the norm specified by the
Object Detection configuration.
Args:
**kwargs: Only the name is used, all other params ignored.
Required for matching `layers.BatchNormalization` calls in the Keras
application.
Returns:
A normalization layer specified by the Object Detection hyperparameter
configurations.
"""
name = kwargs.get('name')
if self._conv_hyperparams:
return self._conv_hyperparams.build_batch_norm(
training=self._batchnorm_training,
name=name)
else:
kwargs['scale'] = self._batchnorm_scale
kwargs['epsilon'] = self._default_batchnorm_epsilon
return freezable_batch_norm.FreezableBatchNorm(
training=self._batchnorm_training,
momentum=self._default_batchnorm_momentum,
**kwargs)
def Input(self, shape):
"""Builds an Input layer.
Overrides the Keras application Input layer with one that uses a
tf.placeholder_with_default instead of a tf.placeholder. This is necessary
to ensure the application works when run on a TPU.
Args:
shape: A tuple of integers representing the shape of the input, which
      includes both the spatial shape and channels, but not the batch size.
Elements of this tuple can be None; 'None' elements represent dimensions
where the shape is not known.
Returns:
An input layer for the specified shape that internally uses a
placeholder_with_default.
"""
default_size = 224
default_batch_size = 1
shape = list(shape)
default_shape = [default_size if dim is None else dim for dim in shape]
input_tensor = tf.constant(0.0, shape=[default_batch_size] + default_shape)
placeholder_with_default = tf.placeholder_with_default(
input=input_tensor, shape=[None] + shape)
return tf.keras.layers.Input(tensor=placeholder_with_default)
def MaxPooling2D(self, pool_size, **kwargs):
"""Builds a MaxPooling2D layer with default padding as 'SAME'.
This is specified by the default resnet arg_scope in slim.
Args:
pool_size: The pool size specified by the Keras application.
**kwargs: Ignored, required to match the Keras applications usage.
Returns:
A MaxPooling2D layer with default padding as 'SAME'.
"""
kwargs['padding'] = 'same'
return tf.keras.layers.MaxPooling2D(pool_size, **kwargs)
# Add alias as Keras also has it.
MaxPool2D = MaxPooling2D # pylint: disable=invalid-name
def ZeroPadding2D(self, padding, **kwargs): # pylint: disable=unused-argument
"""Replaces explicit padding in the Keras application with a no-op.
Args:
padding: The padding values for image height and width.
**kwargs: Ignored, required to match the Keras applications usage.
Returns:
A no-op identity lambda.
"""
return lambda x: x
# Forward all non-overridden methods to the keras layers
def __getattr__(self, item):
return getattr(tf.keras.layers, item)
# pylint: disable=invalid-name
def resnet_v1_50(batchnorm_training,
batchnorm_scale=True,
default_batchnorm_momentum=0.997,
default_batchnorm_epsilon=1e-5,
weight_decay=0.0001,
conv_hyperparams=None,
min_depth=8,
depth_multiplier=1,
**kwargs):
"""Instantiates the Resnet50 architecture, modified for object detection.
Args:
batchnorm_training: Bool. Assigned to Batch norm layer `training` param
when constructing `freezable_batch_norm.FreezableBatchNorm` layers.
batchnorm_scale: If True, uses an explicit `gamma` multiplier to scale
the activations in the batch normalization layer.
default_batchnorm_momentum: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the momentum.
default_batchnorm_epsilon: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the epsilon.
weight_decay: The weight decay to use for regularizing the model.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops. Optionally set to `None`
to use default resnet_v1 layer builders.
min_depth: Minimum number of filters in the convolutional layers.
depth_multiplier: The depth multiplier to modify the number of filters
in the convolutional layers.
**kwargs: Keyword arguments forwarded directly to the
      `tf.keras.applications.resnet.ResNet50` method that constructs the Keras
model.
Returns:
A Keras ResnetV1-50 model instance.
"""
layers_override = _LayersOverride(
batchnorm_training,
batchnorm_scale=batchnorm_scale,
default_batchnorm_momentum=default_batchnorm_momentum,
default_batchnorm_epsilon=default_batchnorm_epsilon,
conv_hyperparams=conv_hyperparams,
weight_decay=weight_decay,
min_depth=min_depth,
depth_multiplier=depth_multiplier)
return tf.keras.applications.resnet.ResNet50(
layers=layers_override, **kwargs)
def resnet_v1_101(batchnorm_training,
batchnorm_scale=True,
default_batchnorm_momentum=0.997,
default_batchnorm_epsilon=1e-5,
weight_decay=0.0001,
conv_hyperparams=None,
min_depth=8,
depth_multiplier=1,
**kwargs):
  """Instantiates the Resnet101 architecture, modified for object detection.
Args:
batchnorm_training: Bool. Assigned to Batch norm layer `training` param
when constructing `freezable_batch_norm.FreezableBatchNorm` layers.
batchnorm_scale: If True, uses an explicit `gamma` multiplier to scale
the activations in the batch normalization layer.
default_batchnorm_momentum: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the momentum.
default_batchnorm_epsilon: Float. When 'conv_hyperparams' is None,
batch norm layers will be constructed using this value as the epsilon.
weight_decay: The weight decay to use for regularizing the model.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops. Optionally set to `None`
to use default resnet_v1 layer builders.
min_depth: Minimum number of filters in the convolutional layers.
depth_multiplier: The depth multiplier to modify the number of filters
in the convolutional layers.
**kwargs: Keyword arguments forwarded directly to the
`tf.keras.applications.Mobilenet` method that constructs the Keras
model.
Returns:
A Keras ResnetV1-101 model instance.
"""
layers_override = _LayersOverride(
batchnorm_training,
batchnorm_scale=batchnorm_scale,
default_batchnorm_momentum=default_batchnorm_momentum,
default_batchnorm_epsilon=default_batchnorm_epsilon,
conv_hyperparams=conv_hyperparams,
weight_decay=weight_decay,
min_depth=min_depth,
depth_multiplier=depth_multiplier)
return tf.keras.applications.resnet.ResNet101(
layers=layers_override, **kwargs)

def resnet_v1_152(batchnorm_training,
                  batchnorm_scale=True,
                  default_batchnorm_momentum=0.997,
                  default_batchnorm_epsilon=1e-5,
                  weight_decay=0.0001,
                  conv_hyperparams=None,
                  min_depth=8,
                  depth_multiplier=1,
                  **kwargs):
  """Instantiates the Resnet152 architecture, modified for object detection.

  Args:
    batchnorm_training: Bool. Assigned to Batch norm layer `training` param
      when constructing `freezable_batch_norm.FreezableBatchNorm` layers.
    batchnorm_scale: If True, uses an explicit `gamma` multiplier to scale
      the activations in the batch normalization layer.
    default_batchnorm_momentum: Float. When 'conv_hyperparams' is None,
      batch norm layers will be constructed using this value as the momentum.
    default_batchnorm_epsilon: Float. When 'conv_hyperparams' is None,
      batch norm layers will be constructed using this value as the epsilon.
    weight_decay: The weight decay to use for regularizing the model.
    conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
      containing hyperparameters for convolution ops. Optionally set to `None`
      to use default resnet_v1 layer builders.
    min_depth: Minimum number of filters in the convolutional layers.
    depth_multiplier: The depth multiplier to modify the number of filters
      in the convolutional layers.
    **kwargs: Keyword arguments forwarded directly to the
      `tf.keras.applications.resnet.ResNet152` method that constructs the
      Keras model.

  Returns:
    A Keras ResnetV1-152 model instance.
  """
  layers_override = _LayersOverride(
      batchnorm_training,
      batchnorm_scale=batchnorm_scale,
      default_batchnorm_momentum=default_batchnorm_momentum,
      default_batchnorm_epsilon=default_batchnorm_epsilon,
      conv_hyperparams=conv_hyperparams,
      weight_decay=weight_decay,
      min_depth=min_depth,
      depth_multiplier=depth_multiplier)
  return tf.keras.applications.resnet.ResNet152(
      layers=layers_override, **kwargs)
# pylint: enable=invalid-name
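The three builders above share one mechanism: rather than subclassing the Keras application, they pass a `_LayersOverride` object through the `layers` argument of `tf.keras.applications.resnet.ResNet*`, so the model-building code resolves layer constructors (e.g. `layers.Conv2D`) on the override object instead of on `tf.keras.layers`. The following is a minimal, framework-free sketch of that injection pattern; `ToyLayersOverride` and `build_toy_model` are illustrative names, not part of this codebase.

```python
class ToyLayersOverride(object):
  """Substitutes a custom Conv2D that bakes in a hyperparameter."""

  def __init__(self, weight_decay):
    self._weight_decay = weight_decay

  def Conv2D(self, filters, **kwargs):
    # Return a callable layer that records the injected hyperparameter.
    def layer(inputs):
      return ('conv(filters=%d, wd=%g)' % (filters, self._weight_decay),
              inputs)
    return layer


def build_toy_model(layers):
  """Stands in for a Keras application builder taking `layers=...`."""
  conv = layers.Conv2D(64)  # Resolved on the override, not tf.keras.layers.
  return conv('input')


print(build_toy_model(ToyLayersOverride(weight_decay=1e-4)))
```

The real `_LayersOverride` additionally implements `BatchNormalization`, `ZeroPadding2D`, and the other constructors ResNet asks for, returning freezable, hyperparameter-aware layers in their place.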
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for resnet_v1.py.

This test mainly focuses on comparing slim resnet v1 and Keras resnet v1 for
object detection. To verify the consistency of the two models, we compare:
1. Output shape of each layer given different inputs.
2. Number of global variables.
"""
import numpy as np
from six.moves import zip
import tensorflow as tf
from google.protobuf import text_format
from object_detection.builders import hyperparams_builder
from object_detection.models.keras_models import resnet_v1
from object_detection.protos import hyperparams_pb2
from object_detection.utils import test_case
_EXPECTED_SHAPES_224_RESNET50 = {
    'conv2_block3_out': (4, 56, 56, 256),
    'conv3_block4_out': (4, 28, 28, 512),
    'conv4_block6_out': (4, 14, 14, 1024),
    'conv5_block3_out': (4, 7, 7, 2048),
}

_EXPECTED_SHAPES_224_RESNET101 = {
    'conv2_block3_out': (4, 56, 56, 256),
    'conv3_block4_out': (4, 28, 28, 512),
    'conv4_block23_out': (4, 14, 14, 1024),
    'conv5_block3_out': (4, 7, 7, 2048),
}

_EXPECTED_SHAPES_224_RESNET152 = {
    'conv2_block3_out': (4, 56, 56, 256),
    'conv3_block8_out': (4, 28, 28, 512),
    'conv4_block36_out': (4, 14, 14, 1024),
    'conv5_block3_out': (4, 7, 7, 2048),
}

_RESNET_NAMES = ['resnet_v1_50', 'resnet_v1_101', 'resnet_v1_152']
_RESNET_MODELS = [
    resnet_v1.resnet_v1_50, resnet_v1.resnet_v1_101, resnet_v1.resnet_v1_152
]
_RESNET_SHAPES = [
    _EXPECTED_SHAPES_224_RESNET50, _EXPECTED_SHAPES_224_RESNET101,
    _EXPECTED_SHAPES_224_RESNET152
]

_NUM_CHANNELS = 3
_BATCH_SIZE = 4

class ResnetV1Test(test_case.TestCase):

  def _build_conv_hyperparams(self):
    conv_hyperparams = hyperparams_pb2.Hyperparams()
    conv_hyperparams_text_proto = """
      activation: RELU_6,
      regularizer {
        l2_regularizer {
          weight: 0.0004
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.03
          mean: 0.0
        }
      }
      batch_norm {
        scale: true,
        decay: 0.997,
        epsilon: 0.001,
      }
    """
    text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams)
    return hyperparams_builder.KerasLayerHyperparams(conv_hyperparams)

  def _create_application_with_layer_outputs(self,
                                             model_index,
                                             batchnorm_training,
                                             batchnorm_scale=True,
                                             weight_decay=0.0001,
                                             default_batchnorm_momentum=0.997,
                                             default_batchnorm_epsilon=1e-5):
    """Constructs Keras resnet_v1 that extracts layer outputs."""
    # Have to clear the Keras backend to ensure isolation in layer naming.
    tf.keras.backend.clear_session()
    layer_names = _RESNET_SHAPES[model_index].keys()
    full_model = _RESNET_MODELS[model_index](
        batchnorm_training=batchnorm_training,
        weights=None,
        batchnorm_scale=batchnorm_scale,
        weight_decay=weight_decay,
        default_batchnorm_momentum=default_batchnorm_momentum,
        default_batchnorm_epsilon=default_batchnorm_epsilon,
        include_top=False)
    layer_outputs = [
        full_model.get_layer(name=layer).output for layer in layer_names
    ]
    return tf.keras.Model(inputs=full_model.inputs, outputs=layer_outputs)

  def _check_returns_correct_shape(self,
                                   image_height,
                                   image_width,
                                   model_index,
                                   expected_feature_map_shape,
                                   batchnorm_training=True,
                                   batchnorm_scale=True,
                                   weight_decay=0.0001,
                                   default_batchnorm_momentum=0.997,
                                   default_batchnorm_epsilon=1e-5):
    model = self._create_application_with_layer_outputs(
        model_index=model_index,
        batchnorm_training=batchnorm_training,
        batchnorm_scale=batchnorm_scale,
        weight_decay=weight_decay,
        default_batchnorm_momentum=default_batchnorm_momentum,
        default_batchnorm_epsilon=default_batchnorm_epsilon)
    image_tensor = np.random.rand(_BATCH_SIZE, image_height, image_width,
                                  _NUM_CHANNELS).astype(np.float32)
    feature_maps = model(image_tensor)
    layer_names = _RESNET_SHAPES[model_index].keys()
    for feature_map, layer_name in zip(feature_maps, layer_names):
      expected_shape = _RESNET_SHAPES[model_index][layer_name]
      self.assertAllEqual(feature_map.shape, expected_shape)

  def _get_variables(self, model_index):
    tf.keras.backend.clear_session()
    model = self._create_application_with_layer_outputs(
        model_index, batchnorm_training=False)
    preprocessed_inputs = tf.placeholder(tf.float32,
                                         (4, None, None, _NUM_CHANNELS))
    model(preprocessed_inputs)
    return model.variables

  def test_returns_correct_shapes_224(self):
    image_height = 224
    image_width = 224
    for model_index, _ in enumerate(_RESNET_NAMES):
      expected_feature_map_shape = _RESNET_SHAPES[model_index]
      self._check_returns_correct_shape(image_height, image_width, model_index,
                                        expected_feature_map_shape)

  def test_hyperparam_override(self):
    for model_name in _RESNET_MODELS:
      model = model_name(
          batchnorm_training=True,
          default_batchnorm_momentum=0.2,
          default_batchnorm_epsilon=0.1,
          weights=None,
          include_top=False)
      bn_layer = model.get_layer(name='conv1_bn')
      self.assertAllClose(bn_layer.momentum, 0.2)
      self.assertAllClose(bn_layer.epsilon, 0.1)

  def test_variable_count(self):
    # The number of variables from the slim resnetv1-* models.
    variable_nums = [265, 520, 775]
    for model_index, var_num in enumerate(variable_nums):
      variables = self._get_variables(model_index)
      self.assertEqual(len(variables), var_num)


if __name__ == '__main__':
  tf.test.main()
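The expected-shape tables in this test follow directly from ResNet's cumulative strides: the `conv2`/`conv3`/`conv4`/`conv5` block outputs downsample the input by factors of 4, 8, 16, and 32 respectively, which for a 224x224 input gives the 56/28/14/7 spatial sizes above. A quick plain-Python sanity check of that arithmetic (the helper name is illustrative):

```python
def resnet_stage_spatial_sizes(input_size, strides=(4, 8, 16, 32)):
  """Spatial edge length of each ResNet stage output for a square input."""
  return [input_size // s for s in strides]

print(resnet_stage_spatial_sizes(224))  # [56, 28, 14, 7]
```

The channel counts (256/512/1024/2048) are fixed by the bottleneck block widths and do not depend on the input size.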
@@ -57,8 +57,13 @@ class SsdFeatureExtractorTestBase(test_case.TestCase):
    return sc

  @abstractmethod
  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False,
                                num_layers=6,
                                use_keras=False,
                                use_depthwise=False):
    """Constructs a new feature extractor.

    Args:
@@ -68,42 +73,64 @@ class SsdFeatureExtractorTestBase(test_case.TestCase):
      use_explicit_padding: use 'VALID' padding for convolutions, but prepad
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      num_layers: number of SSD layers.
      use_keras: if True builds a keras-based feature extractor, if False builds
        a slim-based one.
      use_depthwise: Whether to use depthwise convolutions.

    Returns:
      an ssd_meta_arch.SSDFeatureExtractor or an
      ssd_meta_arch.SSDKerasFeatureExtractor object.
    """
    pass

  def _extract_features(self,
                        image_tensor,
                        depth_multiplier,
                        pad_to_multiple,
                        use_explicit_padding=False,
                        use_depthwise=False,
                        num_layers=6,
                        use_keras=False):
    kwargs = {}
    if use_explicit_padding:
      kwargs.update({'use_explicit_padding': use_explicit_padding})
    if use_depthwise:
      kwargs.update({'use_depthwise': use_depthwise})
    if num_layers != 6:
      kwargs.update({'num_layers': num_layers})
    if use_keras:
      kwargs.update({'use_keras': use_keras})
    feature_extractor = self._create_feature_extractor(
        depth_multiplier,
        pad_to_multiple,
        **kwargs)
    if use_keras:
      feature_maps = feature_extractor(image_tensor)
    else:
      feature_maps = feature_extractor.extract_features(image_tensor)
    return feature_maps

  def check_extract_features_returns_correct_shape(self,
                                                   batch_size,
                                                   image_height,
                                                   image_width,
                                                   depth_multiplier,
                                                   pad_to_multiple,
                                                   expected_feature_map_shapes,
                                                   use_explicit_padding=False,
                                                   num_layers=6,
                                                   use_keras=False,
                                                   use_depthwise=False):

    def graph_fn(image_tensor):
      return self._extract_features(
          image_tensor,
          depth_multiplier,
          pad_to_multiple,
          use_explicit_padding=use_explicit_padding,
          num_layers=num_layers,
          use_keras=use_keras,
          use_depthwise=use_depthwise)

    image_tensor = np.random.rand(batch_size, image_height, image_width,
                                  3).astype(np.float32)
@@ -113,17 +140,29 @@ class SsdFeatureExtractorTestBase(test_case.TestCase):
      self.assertAllEqual(feature_map.shape, expected_shape)

  def check_extract_features_returns_correct_shapes_with_dynamic_inputs(
      self,
      batch_size,
      image_height,
      image_width,
      depth_multiplier,
      pad_to_multiple,
      expected_feature_map_shapes,
      use_explicit_padding=False,
      num_layers=6,
      use_keras=False,
      use_depthwise=False):

    def graph_fn(image_height, image_width):
      image_tensor = tf.random_uniform([batch_size, image_height, image_width,
                                        3], dtype=tf.float32)
      return self._extract_features(
          image_tensor,
          depth_multiplier,
          pad_to_multiple,
          use_explicit_padding=use_explicit_padding,
          num_layers=num_layers,
          use_keras=use_keras,
          use_depthwise=use_depthwise)

    feature_maps = self.execute_cpu(graph_fn, [
        np.array(image_height, dtype=np.int32),
@@ -134,13 +173,20 @@ class SsdFeatureExtractorTestBase(test_case.TestCase):
      self.assertAllEqual(feature_map.shape, expected_shape)

  def check_extract_features_raises_error_with_invalid_image_size(
      self,
      image_height,
      image_width,
      depth_multiplier,
      pad_to_multiple,
      use_keras=False,
      use_depthwise=False):
    preprocessed_inputs = tf.placeholder(tf.float32, (4, None, None, 3))
    feature_maps = self._extract_features(
        preprocessed_inputs,
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    test_preprocessed_image = np.random.rand(4, image_height, image_width, 3)
    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
@@ -148,20 +194,32 @@ class SsdFeatureExtractorTestBase(test_case.TestCase):
      sess.run(feature_maps,
               feed_dict={preprocessed_inputs: test_preprocessed_image})

  def check_feature_extractor_variables_under_scope(self,
                                                    depth_multiplier,
                                                    pad_to_multiple,
                                                    scope_name,
                                                    use_keras=False,
                                                    use_depthwise=False):
    variables = self.get_feature_extractor_variables(
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    for variable in variables:
      self.assertTrue(variable.name.startswith(scope_name))

  def get_feature_extractor_variables(self,
                                      depth_multiplier,
                                      pad_to_multiple,
                                      use_keras=False,
                                      use_depthwise=False):
    g = tf.Graph()
    with g.as_default():
      preprocessed_inputs = tf.placeholder(tf.float32, (4, None, None, 3))
      self._extract_features(
          preprocessed_inputs,
          depth_multiplier,
          pad_to_multiple,
          use_keras=use_keras,
          use_depthwise=use_depthwise)
      return g.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
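The `_extract_features` helper above only forwards an argument when it differs from its default, so subclass `_create_feature_extractor` overrides that predate the new `num_layers`/`use_depthwise` keywords keep working unchanged. A minimal sketch of that backward-compatibility pattern; `make_kwargs`, `legacy_factory`, and `new_factory` are hypothetical names for illustration:

```python
def make_kwargs(use_explicit_padding=False, num_layers=6):
  """Collects only the non-default keyword arguments."""
  kwargs = {}
  if use_explicit_padding:
    kwargs['use_explicit_padding'] = use_explicit_padding
  if num_layers != 6:
    kwargs['num_layers'] = num_layers
  return kwargs


def legacy_factory(depth_multiplier):
  """An old override that does not know about num_layers."""
  return ('legacy', depth_multiplier)


def new_factory(depth_multiplier, num_layers=6):
  """A newer override that accepts the extra keyword."""
  return ('new', depth_multiplier, num_layers)


# Defaults produce an empty kwargs dict, so the legacy signature still works:
print(legacy_factory(1.0, **make_kwargs()))  # ('legacy', 1.0)
# Non-default values are forwarded, and only new_factory can accept them:
print(new_factory(1.0, **make_kwargs(num_layers=4)))  # ('new', 1.0, 4)
```

The trade-off is that a caller who explicitly passes the default value gets the same behavior as not passing it at all, which is acceptable here because defaults and non-forwarded values are deliberately identical.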
@@ -37,6 +37,7 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False,
               num_layers=6,
               override_base_feature_extractor_hyperparams=False):
    """InceptionV2 Feature Extractor for SSD Models.
@@ -53,6 +54,7 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
      use_explicit_padding: Whether to use explicit padding when extracting
        features. Default is False.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      num_layers: Number of SSD layers.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams_fn`.
@@ -69,6 +71,7 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        reuse_weights=reuse_weights,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams)
    if not self._override_base_feature_extractor_hyperparams:
@@ -108,8 +111,9 @@ class SSDInceptionV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        33, preprocessed_inputs)
    feature_map_layout = {
        'from_layer': ['Mixed_4c', 'Mixed_5c', '', '', '', ''
                      ][:self._num_layers],
        'layer_depth': [-1, -1, 512, 256, 256, 128][:self._num_layers],
        'use_explicit_padding': self._use_explicit_padding,
        'use_depthwise': self._use_depthwise,
    }
...
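The `[:self._num_layers]` slicing above trims the `from_layer` and `layer_depth` lists in parallel, so a smaller `num_layers` simply drops the trailing extra SSD feature maps while keeping the two lists aligned. A standalone sketch of the same truncation (the helper name is illustrative):

```python
def feature_map_layout(num_layers):
  """Truncates the parallel layout lists to the first num_layers entries."""
  return {
      'from_layer': ['Mixed_4c', 'Mixed_5c', '', '', '', ''][:num_layers],
      'layer_depth': [-1, -1, 512, 256, 256, 128][:num_layers],
  }

print(feature_map_layout(4))
# {'from_layer': ['Mixed_4c', 'Mixed_5c', '', ''],
#  'layer_depth': [-1, -1, 512, 256]}
```

Because both lists are sliced with the same bound, each retained `from_layer` entry still pairs with its intended `layer_depth`, which is what the `num_layers=4` tests below rely on.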
@@ -24,7 +24,11 @@ from object_detection.models import ssd_inception_v2_feature_extractor
class SsdInceptionV2FeatureExtractorTest(
    ssd_feature_extractor_test.SsdFeatureExtractorTestBase):

  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False,
                                num_layers=6,
                                is_training=True):
    """Constructs a SsdInceptionV2FeatureExtractor.
@@ -32,6 +36,10 @@ class SsdInceptionV2FeatureExtractorTest(
      depth_multiplier: float depth multiplier for feature extractor
      pad_to_multiple: the nearest multiple to zero pad the input height and
        width dimensions to.
      use_explicit_padding: Use 'VALID' padding for convolutions, but prepad
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      num_layers: number of SSD layers.
      is_training: whether the network is in training mode.

    Returns:
@@ -39,8 +47,12 @@ class SsdInceptionV2FeatureExtractorTest(
    """
    min_depth = 32
    return ssd_inception_v2_feature_extractor.SSDInceptionV2FeatureExtractor(
        is_training,
        depth_multiplier,
        min_depth,
        pad_to_multiple,
        self.conv_hyperparams_fn,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=True)

  def test_extract_features_returns_correct_shapes_128(self):
@@ -129,6 +141,17 @@ class SsdInceptionV2FeatureExtractorTest(
    self.check_feature_extractor_variables_under_scope(
        depth_multiplier, pad_to_multiple, scope_name)

  def test_extract_features_with_fewer_layers(self):
    image_height = 128
    image_width = 128
    depth_multiplier = 1.0
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 8, 8, 576), (2, 4, 4, 1024),
                                  (2, 2, 2, 512), (2, 1, 1, 256)]
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, num_layers=4)


if __name__ == '__main__':
  tf.test.main()
@@ -37,6 +37,7 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False,
               num_layers=6,
               override_base_feature_extractor_hyperparams=False):
    """InceptionV3 Feature Extractor for SSD Models.
@@ -53,6 +54,7 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
      use_explicit_padding: Whether to use explicit padding when extracting
        features. Default is False.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      num_layers: Number of SSD layers.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams_fn`.
@@ -69,6 +71,7 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        reuse_weights=reuse_weights,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams)
@@ -109,8 +112,9 @@ class SSDInceptionV3FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        33, preprocessed_inputs)
    feature_map_layout = {
        'from_layer': ['Mixed_5d', 'Mixed_6e', 'Mixed_7c', '', '', ''
                      ][:self._num_layers],
        'layer_depth': [-1, -1, -1, 512, 256, 128][:self._num_layers],
        'use_explicit_padding': self._use_explicit_padding,
        'use_depthwise': self._use_depthwise,
    }
...
@@ -24,7 +24,11 @@ from object_detection.models import ssd_inception_v3_feature_extractor
class SsdInceptionV3FeatureExtractorTest(
    ssd_feature_extractor_test.SsdFeatureExtractorTestBase):

  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False,
                                num_layers=6,
                                is_training=True):
    """Constructs a SsdInceptionV3FeatureExtractor.
@@ -32,6 +36,10 @@ class SsdInceptionV3FeatureExtractorTest(
      depth_multiplier: float depth multiplier for feature extractor
      pad_to_multiple: the nearest multiple to zero pad the input height and
        width dimensions to.
      use_explicit_padding: Use 'VALID' padding for convolutions, but prepad
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      num_layers: number of SSD layers.
      is_training: whether the network is in training mode.

    Returns:
@@ -39,8 +47,12 @@ class SsdInceptionV3FeatureExtractorTest(
    """
    min_depth = 32
    return ssd_inception_v3_feature_extractor.SSDInceptionV3FeatureExtractor(
        is_training,
        depth_multiplier,
        min_depth,
        pad_to_multiple,
        self.conv_hyperparams_fn,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=True)

  def test_extract_features_returns_correct_shapes_128(self):
@@ -129,6 +141,17 @@ class SsdInceptionV3FeatureExtractorTest(
    self.check_feature_extractor_variables_under_scope(
        depth_multiplier, pad_to_multiple, scope_name)

  def test_extract_features_with_fewer_layers(self):
    image_height = 128
    image_width = 128
    depth_multiplier = 1.0
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 13, 13, 288), (2, 6, 6, 768),
                                  (2, 2, 2, 2048), (2, 1, 1, 512)]
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, num_layers=4)


if __name__ == '__main__':
  tf.test.main()
@@ -39,6 +39,7 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False,
               num_layers=6,
               override_base_feature_extractor_hyperparams=False):
    """MobileNetV1 Feature Extractor for SSD Models.
@@ -56,6 +57,7 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      num_layers: Number of SSD layers.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams_fn`.
@@ -69,6 +71,7 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        reuse_weights=reuse_weights,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams)
@@ -103,8 +106,8 @@ class SSDMobileNetV1FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
    feature_map_layout = {
        'from_layer': ['Conv2d_11_pointwise', 'Conv2d_13_pointwise', '', '',
                       '', ''][:self._num_layers],
        'layer_depth': [-1, -1, 512, 256, 256, 128][:self._num_layers],
        'use_explicit_padding': self._use_explicit_padding,
        'use_depthwise': self._use_depthwise,
    }
...
...@@ -12,7 +12,6 @@ ...@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
# ============================================================================== # ==============================================================================
"""Tests for SSD Mobilenet V1 feature extractors. """Tests for SSD Mobilenet V1 feature extractors.
By using parameterized test decorator, this test serves for both Slim-based and By using parameterized test decorator, this test serves for both Slim-based and
...@@ -37,8 +36,12 @@ slim = tf.contrib.slim ...@@ -37,8 +36,12 @@ slim = tf.contrib.slim
class SsdMobilenetV1FeatureExtractorTest( class SsdMobilenetV1FeatureExtractorTest(
ssd_feature_extractor_test.SsdFeatureExtractorTestBase): ssd_feature_extractor_test.SsdFeatureExtractorTestBase):
  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False,
                                num_layers=6,
                                is_training=False,
                                use_keras=False):
    """Constructs a new feature extractor.
@@ -49,16 +52,18 @@ class SsdMobilenetV1FeatureExtractorTest(
      use_explicit_padding: Use 'VALID' padding for convolutions, but prepad
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      num_layers: number of SSD layers.
      is_training: whether the network is in training mode.
      use_keras: if True builds a keras-based feature extractor, if False builds
        a slim-based one.

    Returns:
      an ssd_meta_arch.SSDFeatureExtractor object.
    """
    min_depth = 32
    if use_keras:
      return (ssd_mobilenet_v1_keras_feature_extractor
              .SSDMobileNetV1KerasFeatureExtractor(
                  is_training=is_training,
                  depth_multiplier=depth_multiplier,
                  min_depth=min_depth,
@@ -68,12 +73,17 @@ class SsdMobilenetV1FeatureExtractorTest(
                  freeze_batchnorm=False,
                  inplace_batchnorm_update=False,
                  use_explicit_padding=use_explicit_padding,
                  num_layers=num_layers,
                  name='MobilenetV1'))
    else:
      return ssd_mobilenet_v1_feature_extractor.SSDMobileNetV1FeatureExtractor(
          is_training,
          depth_multiplier,
          min_depth,
          pad_to_multiple,
          self.conv_hyperparams_fn,
          use_explicit_padding=use_explicit_padding,
          num_layers=num_layers)

  def test_extract_features_returns_correct_shapes_128(self, use_keras):
    image_height = 128
@@ -84,12 +94,22 @@ class SsdMobilenetV1FeatureExtractorTest(
                                  (2, 2, 2, 512), (2, 1, 1, 256),
                                  (2, 1, 1, 256), (2, 1, 1, 128)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras)

  def test_extract_features_returns_correct_shapes_299(self, use_keras):
@@ -101,12 +121,22 @@ class SsdMobilenetV1FeatureExtractorTest(
                                  (2, 5, 5, 512), (2, 3, 3, 256),
                                  (2, 2, 2, 256), (2, 1, 1, 128)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras)

  def test_extract_features_with_dynamic_image_shape(self, use_keras):
@@ -118,12 +148,22 @@ class SsdMobilenetV1FeatureExtractorTest(
                                  (2, 2, 2, 512), (2, 1, 1, 256),
                                  (2, 1, 1, 256), (2, 1, 1, 128)]
    self.check_extract_features_returns_correct_shapes_with_dynamic_inputs(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras)

  def test_extract_features_returns_correct_shapes_enforcing_min_depth(
@@ -133,15 +173,25 @@ class SsdMobilenetV1FeatureExtractorTest(
    depth_multiplier = 0.5**12
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 19, 19, 32), (2, 10, 10, 32),
                                  (2, 5, 5, 32), (2, 3, 3, 32), (2, 2, 2, 32),
                                  (2, 1, 1, 32)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras)

  def test_extract_features_returns_correct_shapes_with_pad_to_multiple(
@@ -154,12 +204,22 @@ class SsdMobilenetV1FeatureExtractorTest(
                                  (2, 5, 5, 512), (2, 3, 3, 256),
                                  (2, 2, 2, 256), (2, 1, 1, 128)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras)

  def test_extract_features_raises_error_with_invalid_image_size(
@@ -169,7 +229,10 @@ class SsdMobilenetV1FeatureExtractorTest(
    depth_multiplier = 1.0
    pad_to_multiple = 1
    self.check_extract_features_raises_error_with_invalid_image_size(
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras)

  def test_preprocess_returns_correct_value_range(self, use_keras):
@@ -178,9 +241,8 @@ class SsdMobilenetV1FeatureExtractorTest(
    depth_multiplier = 1
    pad_to_multiple = 1
    test_image = np.random.rand(2, image_height, image_width, 3)
    feature_extractor = self._create_feature_extractor(
        depth_multiplier, pad_to_multiple, use_keras=use_keras)
    preprocessed_image = feature_extractor.preprocess(test_image)
    self.assertTrue(np.all(np.less_equal(np.abs(preprocessed_image), 1.0)))
@@ -212,8 +274,22 @@ class SsdMobilenetV1FeatureExtractorTest(
      _ = feature_extractor(preprocessed_image)
    else:
      _ = feature_extractor.extract_features(preprocessed_image)
    self.assertTrue(
        any('FusedBatchNorm' in op.type
            for op in tf.get_default_graph().get_operations()))
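The assertion here changes from an exact `op.type == 'FusedBatchNorm'` comparison to a substring check, since fused batch norm can appear under versioned op types (for example `FusedBatchNormV3` in newer TensorFlow builds), which an equality test would silently miss. A minimal sketch of the difference, with an illustrative op-type list (in the real test these types come from `tf.get_default_graph().get_operations()`):

```python
# Illustrative op types; the 'FusedBatchNormV3' entry stands in for what a
# newer TensorFlow graph may emit instead of plain 'FusedBatchNorm'.
op_types = ['Conv2D', 'FusedBatchNormV3', 'Relu']

exact = any(t == 'FusedBatchNorm' for t in op_types)      # misses V3
substring = any('FusedBatchNorm' in t for t in op_types)  # matches V3
```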

  def test_extract_features_with_fewer_layers(self, use_keras):
    image_height = 128
    image_width = 128
    depth_multiplier = 1.0
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 8, 8, 512), (2, 4, 4, 1024),
                                  (2, 2, 2, 512), (2, 1, 1, 256)]
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, use_explicit_padding=False, num_layers=4,
        use_keras=use_keras)

if __name__ == '__main__':
  tf.test.main()
@@ -220,7 +220,7 @@ class SsdMobilenetV1FpnFeatureExtractorTest(
    _ = feature_extractor.extract_features(preprocessed_image)
    self.assertTrue(
        any('FusedBatchNorm' in op.type
            for op in tf.get_default_graph().get_operations()))
...
@@ -40,6 +40,7 @@ class SSDMobileNetV1KerasFeatureExtractor(
               inplace_batchnorm_update,
               use_explicit_padding=False,
               use_depthwise=False,
               num_layers=6,
               override_base_feature_extractor_hyperparams=False,
               name=None):
    """Keras MobileNetV1 Feature Extractor for SSD Models.
@@ -65,6 +66,7 @@ class SSDMobileNetV1KerasFeatureExtractor(
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      num_layers: Number of SSD layers.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams`.
@@ -81,13 +83,14 @@ class SSDMobileNetV1KerasFeatureExtractor(
        inplace_batchnorm_update=inplace_batchnorm_update,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams,
        name=name)
    self._feature_map_layout = {
        'from_layer': ['Conv2d_11_pointwise', 'Conv2d_13_pointwise', '', '',
                       '', ''][:self._num_layers],
        'layer_depth': [-1, -1, 512, 256, 256, 128][:self._num_layers],
        'use_explicit_padding': self._use_explicit_padding,
        'use_depthwise': self._use_depthwise,
    }
...
@@ -178,7 +178,7 @@ class SsdMobilenetV1PpnFeatureExtractorTest(
        pad_to_multiple)
    preprocessed_image = feature_extractor.preprocess(image_placeholder)
    _ = feature_extractor.extract_features(preprocessed_image)
    self.assertTrue(any('FusedBatchNorm' in op.type
                        for op in tf.get_default_graph().get_operations()))

if __name__ == '__main__':
...
@@ -40,6 +40,7 @@ class SSDMobileNetV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False,
               num_layers=6,
               override_base_feature_extractor_hyperparams=False):
    """MobileNetV2 Feature Extractor for SSD Models.
@@ -59,6 +60,7 @@ class SSDMobileNetV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
      use_explicit_padding: Whether to use explicit padding when extracting
        features. Default is False.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      num_layers: Number of SSD layers.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams_fn`.
@@ -72,6 +74,7 @@ class SSDMobileNetV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        reuse_weights=reuse_weights,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        num_layers=num_layers,
        override_base_feature_extractor_hyperparams=
        override_base_feature_extractor_hyperparams)
@@ -105,8 +108,9 @@ class SSDMobileNetV2FeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
        33, preprocessed_inputs)
    feature_map_layout = {
        'from_layer': ['layer_15/expansion_output', 'layer_19', '', '', '', ''
                      ][:self._num_layers],
        'layer_depth': [-1, -1, 512, 256, 256, 128][:self._num_layers],
        'use_depthwise': self._use_depthwise,
        'use_explicit_padding': self._use_explicit_padding,
    }
...
@@ -33,8 +33,12 @@ slim = tf.contrib.slim

class SsdMobilenetV2FeatureExtractorTest(
    ssd_feature_extractor_test.SsdFeatureExtractorTestBase):

  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False,
                                num_layers=6,
                                use_keras=False):
    """Constructs a new feature extractor.

    Args:
@@ -44,6 +48,7 @@ class SsdMobilenetV2FeatureExtractorTest(
      use_explicit_padding: use 'VALID' padding for convolutions, but prepad
        inputs so that the output dimensions are the same as if 'SAME' padding
        were used.
      num_layers: number of SSD layers.
      use_keras: if True builds a keras-based feature extractor, if False builds
        a slim-based one.

    Returns:
@@ -61,6 +66,7 @@ class SsdMobilenetV2FeatureExtractorTest(
                  freeze_batchnorm=False,
                  inplace_batchnorm_update=False,
                  use_explicit_padding=use_explicit_padding,
                  num_layers=num_layers,
                  name='MobilenetV2'))
    else:
      return ssd_mobilenet_v2_feature_extractor.SSDMobileNetV2FeatureExtractor(
@@ -69,7 +75,8 @@ class SsdMobilenetV2FeatureExtractorTest(
          min_depth,
          pad_to_multiple,
          self.conv_hyperparams_fn,
          use_explicit_padding=use_explicit_padding,
          num_layers=num_layers)

  def test_extract_features_returns_correct_shapes_128(self, use_keras):
    image_height = 128
@@ -199,9 +206,21 @@ class SsdMobilenetV2FeatureExtractorTest(
      _ = feature_extractor(preprocessed_image)
    else:
      _ = feature_extractor.extract_features(preprocessed_image)
    self.assertTrue(any('FusedBatchNorm' in op.type
                        for op in tf.get_default_graph().get_operations()))

  def test_extract_features_with_fewer_layers(self, use_keras):
    image_height = 128
    image_width = 128
    depth_multiplier = 1.0
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 8, 8, 576), (2, 4, 4, 1280),
                                  (2, 2, 2, 512), (2, 1, 1, 256)]
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, use_explicit_padding=False, num_layers=4,
        use_keras=use_keras)

if __name__ == '__main__':
  tf.test.main()
@@ -30,15 +30,33 @@ slim = tf.contrib.slim

@parameterized.parameters(
    {
        'use_depthwise': False,
        'use_keras': True
    },
    {
        'use_depthwise': True,
        'use_keras': True
    },
    {
        'use_depthwise': False,
        'use_keras': False
    },
    {
        'use_depthwise': True,
        'use_keras': False
    },
)
class SsdMobilenetV2FpnFeatureExtractorTest(
    ssd_feature_extractor_test.SsdFeatureExtractorTestBase):

  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                is_training=True,
                                use_explicit_padding=False,
                                use_keras=False,
                                use_depthwise=False):
    """Constructs a new feature extractor.

    Args:
@@ -51,13 +69,14 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
        were used.
      use_keras: if True builds a keras-based feature extractor, if False builds
        a slim-based one.
      use_depthwise: Whether to use depthwise convolutions.

    Returns:
      an ssd_meta_arch.SSDFeatureExtractor object.
    """
    min_depth = 32
    if use_keras:
      return (ssd_mobilenet_v2_fpn_keras_feature_extractor
              .SSDMobileNetV2FpnKerasFeatureExtractor(
                  is_training=is_training,
                  depth_multiplier=depth_multiplier,
                  min_depth=min_depth,
@@ -67,18 +86,21 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                  freeze_batchnorm=False,
                  inplace_batchnorm_update=False,
                  use_explicit_padding=use_explicit_padding,
                  use_depthwise=use_depthwise,
                  name='MobilenetV2_FPN'))
    else:
      return (ssd_mobilenet_v2_fpn_feature_extractor
              .SSDMobileNetV2FpnFeatureExtractor(
                  is_training,
                  depth_multiplier,
                  min_depth,
                  pad_to_multiple,
                  self.conv_hyperparams_fn,
                  use_depthwise=use_depthwise,
                  use_explicit_padding=use_explicit_padding))
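The expanded `@parameterized.parameters` list at the top of this test class enumerates the full `use_depthwise` × `use_keras` cross product by hand, so every test method now runs four times. A sketch of how the same four kwarg dicts could be generated (the decorator itself is from `absl.testing.parameterized`; this is plain Python, for illustration only):

```python
from itertools import product

# Each dict corresponds to one test-case invocation under
# @parameterized.parameters; the decorator passes the keys as kwargs.
cases = [
    {'use_depthwise': d, 'use_keras': k}
    for d, k in product((False, True), repeat=2)
]
```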

  def test_extract_features_returns_correct_shapes_256(self, use_keras,
                                                       use_depthwise):
    image_height = 256
    image_width = 256
    depth_multiplier = 1.0
@@ -87,15 +109,28 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                                  (2, 8, 8, 256), (2, 4, 4, 256),
                                  (2, 2, 2, 256)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_extract_features_returns_correct_shapes_384(self, use_keras,
                                                       use_depthwise):
    image_height = 320
    image_width = 320
    depth_multiplier = 1.0
@@ -104,15 +139,28 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                                  (2, 10, 10, 256), (2, 5, 5, 256),
                                  (2, 3, 3, 256)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_extract_features_with_dynamic_image_shape(self, use_keras,
                                                     use_depthwise):
    image_height = 256
    image_width = 256
    depth_multiplier = 1.0
@@ -121,16 +169,28 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                                  (2, 8, 8, 256), (2, 4, 4, 256),
                                  (2, 2, 2, 256)]
    self.check_extract_features_returns_correct_shapes_with_dynamic_inputs(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    self.check_extract_features_returns_correct_shapes_with_dynamic_inputs(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_extract_features_returns_correct_shapes_with_pad_to_multiple(
      self, use_keras, use_depthwise):
    image_height = 299
    image_width = 299
    depth_multiplier = 1.0
@@ -139,16 +199,28 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                                  (2, 10, 10, 256), (2, 5, 5, 256),
                                  (2, 3, 3, 256)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_extract_features_returns_correct_shapes_enforcing_min_depth(
      self, use_keras, use_depthwise):
    image_height = 256
    image_width = 256
    depth_multiplier = 0.5**12
@@ -157,70 +229,102 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
                                  (2, 8, 8, 32), (2, 4, 4, 32),
                                  (2, 2, 2, 32)]
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=False,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    self.check_extract_features_returns_correct_shape(
        2,
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        expected_feature_map_shape,
        use_explicit_padding=True,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_extract_features_raises_error_with_invalid_image_size(
      self, use_keras, use_depthwise):
    image_height = 32
    image_width = 32
    depth_multiplier = 1.0
    pad_to_multiple = 1
    self.check_extract_features_raises_error_with_invalid_image_size(
        image_height,
        image_width,
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_preprocess_returns_correct_value_range(self, use_keras,
                                                  use_depthwise):
    image_height = 256
    image_width = 256
    depth_multiplier = 1
    pad_to_multiple = 1
    test_image = np.random.rand(2, image_height, image_width, 3)
    feature_extractor = self._create_feature_extractor(
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    preprocessed_image = feature_extractor.preprocess(test_image)
    self.assertTrue(np.all(np.less_equal(np.abs(preprocessed_image), 1.0)))

  def test_variables_only_created_in_scope(self, use_keras, use_depthwise):
    depth_multiplier = 1
    pad_to_multiple = 1
    scope_name = 'MobilenetV2'
    self.check_feature_extractor_variables_under_scope(
        depth_multiplier,
        pad_to_multiple,
        scope_name,
        use_keras=use_keras,
        use_depthwise=use_depthwise)

  def test_fused_batchnorm(self, use_keras, use_depthwise):
    image_height = 256
    image_width = 256
    depth_multiplier = 1
    pad_to_multiple = 1
    image_placeholder = tf.placeholder(tf.float32,
                                       [1, image_height, image_width, 3])
    feature_extractor = self._create_feature_extractor(
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    preprocessed_image = feature_extractor.preprocess(image_placeholder)
    if use_keras:
      _ = feature_extractor(preprocessed_image)
    else:
      _ = feature_extractor.extract_features(preprocessed_image)
    self.assertTrue(
        any('FusedBatchNorm' in op.type
            for op in tf.get_default_graph().get_operations()))

  def test_variable_count(self, use_keras, use_depthwise):
    depth_multiplier = 1
    pad_to_multiple = 1
    variables = self.get_feature_extractor_variables(
        depth_multiplier,
        pad_to_multiple,
        use_keras=use_keras,
        use_depthwise=use_depthwise)
    expected_variables_len = 274
    if use_depthwise:
      expected_variables_len = 278
    self.assertEqual(len(variables), expected_variables_len)

  def test_get_expected_feature_map_variable_names(self, use_keras,
                                                   use_depthwise):
    depth_multiplier = 1.0
    pad_to_multiple = 1
@@ -239,6 +343,25 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
        'MobilenetV2/fpn/projection_2/weights',
        'MobilenetV2/fpn/projection_3/weights',
    ])
    slim_expected_feature_maps_variables_with_depthwise = set([
        # Slim Mobilenet V2 feature maps
        'MobilenetV2/expanded_conv_4/depthwise/depthwise_weights',
        'MobilenetV2/expanded_conv_7/depthwise/depthwise_weights',
        'MobilenetV2/expanded_conv_14/depthwise/depthwise_weights',
        'MobilenetV2/Conv_1/weights',
        # FPN layers
        'MobilenetV2/fpn/bottom_up_Conv2d_20/pointwise_weights',
        'MobilenetV2/fpn/bottom_up_Conv2d_20/depthwise_weights',
        'MobilenetV2/fpn/bottom_up_Conv2d_21/pointwise_weights',
        'MobilenetV2/fpn/bottom_up_Conv2d_21/depthwise_weights',
        'MobilenetV2/fpn/smoothing_1/depthwise_weights',
        'MobilenetV2/fpn/smoothing_1/pointwise_weights',
        'MobilenetV2/fpn/smoothing_2/depthwise_weights',
        'MobilenetV2/fpn/smoothing_2/pointwise_weights',
        'MobilenetV2/fpn/projection_1/weights',
        'MobilenetV2/fpn/projection_2/weights',
        'MobilenetV2/fpn/projection_3/weights',
    ])
    keras_expected_feature_maps_variables = set([
        # Keras Mobilenet V2 feature maps
        'MobilenetV2_FPN/block_4_depthwise/depthwise_kernel',
@@ -254,17 +377,50 @@ class SsdMobilenetV2FpnFeatureExtractorTest(
        'MobilenetV2_FPN/FeatureMaps/top_down/projection_2/kernel',
        'MobilenetV2_FPN/FeatureMaps/top_down/projection_3/kernel'
    ])
    keras_expected_feature_maps_variables_with_depthwise = set([
        # Keras Mobilenet V2 feature maps
        'MobilenetV2_FPN/block_4_depthwise/depthwise_kernel',
        'MobilenetV2_FPN/block_7_depthwise/depthwise_kernel',
        'MobilenetV2_FPN/block_14_depthwise/depthwise_kernel',
        'MobilenetV2_FPN/Conv_1/kernel',
        # FPN layers
        'MobilenetV2_FPN/bottom_up_Conv2d_20_depthwise_conv/depthwise_kernel',
        'MobilenetV2_FPN/bottom_up_Conv2d_20_depthwise_conv/pointwise_kernel',
        'MobilenetV2_FPN/bottom_up_Conv2d_21_depthwise_conv/depthwise_kernel',
        'MobilenetV2_FPN/bottom_up_Conv2d_21_depthwise_conv/pointwise_kernel',
        ('MobilenetV2_FPN/FeatureMaps/top_down/smoothing_1_depthwise_conv/'
         'depthwise_kernel'),
        ('MobilenetV2_FPN/FeatureMaps/top_down/smoothing_1_depthwise_conv/'
         'pointwise_kernel'),
        ('MobilenetV2_FPN/FeatureMaps/top_down/smoothing_2_depthwise_conv/'
         'depthwise_kernel'),
        ('MobilenetV2_FPN/FeatureMaps/top_down/smoothing_2_depthwise_conv/'
         'pointwise_kernel'),
        'MobilenetV2_FPN/FeatureMaps/top_down/projection_1/kernel',
        'MobilenetV2_FPN/FeatureMaps/top_down/projection_2/kernel',
        'MobilenetV2_FPN/FeatureMaps/top_down/projection_3/kernel'
    ])
    g = tf.Graph()
    with g.as_default():
      preprocessed_inputs = tf.placeholder(tf.float32, (4, None, None, 3))
      feature_extractor = self._create_feature_extractor(
          depth_multiplier,
          pad_to_multiple,
          use_keras=use_keras,
          use_depthwise=use_depthwise)
      if use_keras:
        _ = feature_extractor(preprocessed_inputs)
        expected_feature_maps_variables = keras_expected_feature_maps_variables
        if use_depthwise:
          expected_feature_maps_variables = (
              keras_expected_feature_maps_variables_with_depthwise)
      else:
        _ = feature_extractor.extract_features(preprocessed_inputs)
        expected_feature_maps_variables = slim_expected_feature_maps_variables
        if use_depthwise:
          expected_feature_maps_variables = (
              slim_expected_feature_maps_variables_with_depthwise)
      actual_variable_set = set([
          var.op.name for var in g.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
      ])
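The diff above threads a new `use_depthwise` flag through every test, and `test_variable_count` expects 274 variables normally versus 278 when depthwise FPN convolutions are enabled. A minimal, self-contained sketch of how those flag combinations could be exercised, using stdlib `unittest` with `subTest` in place of the parameterized decorators the real suite presumably uses; `expected_variable_count` is a hypothetical helper written only for this illustration, not part of the Object Detection API:

```python
import itertools
import unittest


def expected_variable_count(use_depthwise):
  """Expected variable total per the test above: depthwise mode adds 4."""
  return 278 if use_depthwise else 274


class ExpectedCountSketchTest(unittest.TestCase):

  def test_all_flag_combinations(self):
    # Enumerate the four (use_keras, use_depthwise) configs the suite covers.
    for use_keras, use_depthwise in itertools.product([False, True], repeat=2):
      with self.subTest(use_keras=use_keras, use_depthwise=use_depthwise):
        # Per the diff, the count depends only on use_depthwise, not on
        # whether the Keras or slim code path builds the extractor.
        self.assertEqual(expected_variable_count(use_depthwise),
                         278 if use_depthwise else 274)


if __name__ == '__main__':
  unittest.main()
```

`subTest` keeps all four configurations in one test method while still reporting each failing combination separately, which is the same coverage shape the parameterized tests in the diff provide.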