Commit 05584085 authored by pkulzc's avatar pkulzc Committed by Jonathan Huang

Merged commit includes the following changes: (#6315)

236813471  by lzc:

    Internal change.

--
236507310  by lzc:

    Fix preprocess.random_resize_method config type issue. The target height and width will be passed as "size" to tf.image.resize_images, which only accepts integers.

--
236409989  by Zhichao Lu:

    Config export_to_tpu from function parameter instead of HParams for TPU inference.

--
236403186  by Zhichao Lu:

    Make graph file names optional arguments.

--
236237072  by Zhichao Lu:

    Minor bugfix for keyword args.

--
236209602  by Zhichao Lu:

    Add support for PartitionedVariable to get_variables_available_in_checkpoint.

--
235828658  by Zhichao Lu:

    Automatically stop evaluation jobs when training is finished.

--
235817964  by Zhichao Lu:

    Add an optional process_metrics_fn callback to eval_util; it is called
    with the evaluation results once each evaluation completes.

--
235788721  by lzc:

    Fix yml file tf runtime version.

--
235262897  by Zhichao Lu:

    Add keypoint support to the random_pad_image preprocessor method.

--
235257380  by Zhichao Lu:

    Support InputDataFields.groundtruth_confidences in retain_groundtruth(), retain_groundtruth_with_positive_classes(), filter_groundtruth_with_crowd_boxes(), filter_groundtruth_with_nan_box_coordinates(), filter_unrecognized_classes().

--
235109188  by Zhichao Lu:

    Fix bug in pad_input_data_to_static_shapes for num_additional_channels > 0; make color-specific data augmentation only touch RGB channels.

--
235045010  by Zhichao Lu:

    Don't slice class_predictions_with_background when add_background_class is false.

--
235026189  by lzc:

    Fix import in g3doc.

--
234863426  by Zhichao Lu:

    Added fixes in exporter to allow writing a checkpoint to a specified temporary directory.

--
234671886  by lzc:

    Internal Change.

--
234630803  by rathodv:

    Internal Change.

--
233985896  by Zhichao Lu:

    Add Neumann optimizer to object detection.

--
233560911  by Zhichao Lu:

    Add NAS-FPN object detection with Resnet and Mobilenet v2.

--
233513536  by Zhichao Lu:

    Export TPU-compatible object detection model.

--
233495772  by lzc:

    Internal change.

--
233453557  by Zhichao Lu:

    Create Keras-based SSD+MobilenetV1 for object detection.

--
233220074  by lzc:

    Update release notes date.

--
233165761  by Zhichao Lu:

    Support depth_multiplier and min_depth in _SSDResnetV1FpnFeatureExtractor.

--
233160046  by lzc:

    Internal change.

--
232926599  by Zhichao Lu:

    [tf.data] Switching tf.data functions to use `defun`, providing an escape hatch to continue using the legacy `Defun`.

    There are subtle differences between the implementation of `defun` and `Defun` (such as resources handling or control flow) and it is possible that input pipelines that use control flow or resources in their functions might be affected by this change. To migrate majority of existing pipelines to the recommended way of creating functions in TF 2.0 world, while allowing (a small number of) existing pipelines to continue relying on the deprecated behavior, this CL provides an escape hatch.

    If your input pipeline is affected by this CL, it should apply the escape hatch by replacing `foo.map(...)` with `foo.map_with_legacy_function(...)`.
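    The escape-hatch dispatch can be sketched in plain Python; `FakeDataset` and `apply_map` below are illustrative stand-ins (not tf.data API), showing the `hasattr`-based fallback this change uses so pipelines work on TF versions with and without `map_with_legacy_function`:

    ```python
    # Minimal sketch, assuming a Dataset-like object. `FakeDataset` stands in
    # for tf.data.Dataset purely for illustration.

    class FakeDataset(object):
        """Stand-in exposing both map variants."""

        def __init__(self, items):
            self.items = items

        def map(self, fn, num_parallel_calls=None):
            return FakeDataset([fn(x) for x in self.items])

        def map_with_legacy_function(self, fn, num_parallel_calls=None):
            # In real tf.data this routes through the deprecated `Defun` tracing.
            return FakeDataset([fn(x) for x in self.items])


    def apply_map(dataset, fn, num_parallel_calls=None):
        """Prefers the legacy escape hatch when available, else plain map."""
        if hasattr(dataset, 'map_with_legacy_function'):
            data_map_fn = dataset.map_with_legacy_function
        else:
            data_map_fn = dataset.map
        return data_map_fn(fn, num_parallel_calls=num_parallel_calls)


    result = apply_map(FakeDataset([1, 2, 3]), lambda x: x * 2)
    print(result.items)  # [2, 4, 6]
    ```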

--
232891621  by Zhichao Lu:

    Modify faster_rcnn meta architecture to normalize raw detections.

--
232875817  by Zhichao Lu:

    Make calibration a post-processing step.

    Specifically:
    - Move the calibration config from pipeline.proto --> post_processing.proto
    - Edit post_processing_builder.py to return a calibration function. If no calibration config is provided, it returns None.
    - Edit SSD and FasterRCNN meta architectures to optionally call the calibration function on detection scores after score conversion and before NMS.
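    The ordering described above can be sketched as follows; this is illustrative pseudocode, not the actual meta-architecture code, and all function names here are hypothetical:

    ```python
    # Sketch of where the optional calibration function slots into
    # postprocessing: after score conversion, before non-max suppression.

    def postprocess(logits, score_conversion_fn, calibration_fn, nms_fn):
        scores = score_conversion_fn(logits)   # e.g. sigmoid/softmax
        if calibration_fn is not None:         # built from post_processing.proto
            scores = calibration_fn(scores)
        return nms_fn(scores)


    # Toy run: identity conversion, a halving "calibration", top-1 "NMS".
    result = postprocess(
        [0.2, 0.8, 0.6],
        score_conversion_fn=lambda s: s,
        calibration_fn=lambda s: [x / 2.0 for x in s],
        nms_fn=max)
    print(result)  # 0.4
    ```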

--
232704481  by Zhichao Lu:

    Edit calibration builder to build a function that will be used within a detection model's `postprocess` method, after score conversion and before non-maxima suppression.

    Specific Edits:
    - The returned function now accepts class_predictions_with_background as its argument instead of detection_scores and detection_classes.
    - Class-specific calibration was temporarily removed, as it requires more significant refactoring. Will be added later.

--
232615379  by Zhichao Lu:

    Internal change

--
232483345  by ronnyvotel:

    Making the use of bfloat16 restricted to TPUs.

--
232399572  by Zhichao Lu:

    Edit calibration builder and proto to support class-agnostic calibration.

    Specifically:
    - Edit calibration protos to include path to relevant label map if required for class-specific calibration. Previously, label maps were inferred from other parts of the pipeline proto; this allows all information required by the builder stay within the calibration proto and remove extraneous information from being passed with class-agnostic calibration.
    - Add class-agnostic protos to the calibration config.

    Note that the proto supports sigmoid and linear interpolation parameters, but the builder currently only supports linear interpolation.

--
231613048  by Zhichao Lu:

    Add calibration builder for applying calibration transformations from output of object detection models.

    Specifically:
    - Add calibration proto to support sigmoid and isotonic regression (stepwise function) calibration.
    - Add a builder to support calibration from isotonic regression outputs.

--
231519786  by lzc:

    model_builder test refactor.
    - removed proto text boilerplate in each test case and let them call a create_default_proto function instead.
    - consolidated all separate ssd model creation tests into one.
    - consolidated all separate faster rcnn model creation tests into one.
    - used parameterized test for testing mask rcnn models and use_matmul_crop_and_resize
    - added all failures test.

--
231448169  by Zhichao Lu:

    Return static shape as a constant tensor.

--
231423126  by lzc:

    Add a release note for OID v4 models.

--
231401941  by Zhichao Lu:

    Adding correct labelmap for the models trained on Open Images V4 (*oid_v4
    config suffix).

--
231320357  by Zhichao Lu:

    Add scope to Nearest Neighbor Resize op so that it stays in the same name scope as the original resize ops.

--
231257699  by Zhichao Lu:

    Switch to using preserve_aspect_ratio in tf.image.resize_images rather than using a custom implementation.

--
231247368  by rathodv:

    Internal change.

--
231004874  by lzc:

    Update documentations to use tf 1.12 for object detection API.

--
230999911  by rathodv:

    Use tf.batch_gather instead of ops.batch_gather

--
230999720  by huizhongc:

    Fix weight equalization test in ops_test.

--
230984728  by rathodv:

    Internal update.

--
230929019  by lzc:

    Add an option to replace preprocess operation with placeholder for ssd feature extractor.

--
230845266  by lzc:

    Require tensorflow version 1.12 for object detection API and rename keras_applications to keras_models

--
230392064  by lzc:

    Add RetinaNet 101 checkpoint trained on OID v4 to detection model zoo.

--
230014128  by derekjchow:

    This file was re-located below the tensorflow/lite/g3doc/convert

--
229941449  by lzc:

    Update SSD mobilenet v2 quantized model download path.

--
229843662  by lzc:

    Add an option to use native resize tf op in fpn top-down feature map generation.

--
229636034  by rathodv:

    Add deprecation notice to a few old parameters in train.proto

--
228959078  by derekjchow:

    Remove duplicate elif case in _check_and_convert_legacy_input_config_key

--
228749719  by rathodv:

    Minor refactoring to make exporter's `build_detection_graph` method public.

--
228573828  by rathodv:

    Modify model.postprocess to return raw detections and raw scores.

    Modify post-process methods in core/model.py and the meta architectures to export raw detections (without any non-max suppression) and raw multiclass score logits for those detections.

--
228420670  by Zhichao Lu:

    Add shims for custom architectures for object detection models.

--
228241692  by Zhichao Lu:

    Fix the comment on "losses_mask" in "Loss" class.

--
228223810  by Zhichao Lu:

    Support other_heads' predictions in WeightSharedConvolutionalBoxPredictor. Also remove a few unused parameters and fix a couple of comments in convolutional_box_predictor.py.

--
228200588  by Zhichao Lu:

    Add Expected Calibration Error and an evaluator that calculates the metric for object detections.

--
228167740  by lzc:

    Add option to use bounded activations in FPN top-down feature map generation.

--
227767700  by rathodv:

    Internal.

--
226295236  by Zhichao Lu:

    Add Open Image V4 Resnet101-FPN training config to third_party

--
226254842  by Zhichao Lu:

    Fix typo in documentation.

--
225833971  by Zhichao Lu:

    Option to have no resizer in object detection model.

--
225824890  by lzc:

    Fixes Python 3 compatibility for model_lib.py.

--
225760897  by menglong:

    normalizer should be at least 1.

--
225559842  by menglong:

    Add extra logic filtering unrecognized classes.

--
225379421  by lzc:

    Add faster_rcnn_inception_resnet_v2_atrous_oid_v4 config to third_party

--
225368337  by Zhichao Lu:

    Add extra logic filtering unrecognized classes.

--
225341095  by Zhichao Lu:

    Adding Open Images V4 models to the OD API model zoo, along with the
    corresponding configs.

--
225218450  by menglong:

    Add extra logic filtering unrecognized classes.

--
225057591  by Zhichao Lu:

    Internal change.

--
224895417  by rathodv:

    Internal change.

--
224209282  by Zhichao Lu:

    Add two data augmentations to object detection: (1) Self-concat (2) Absolute pads.

--
224073762  by Zhichao Lu:

    Do not create tf.constant until _generate() is actually called in the object detector.

--

PiperOrigin-RevId: 236813471
parent a5db4420
@@ -99,6 +99,17 @@ reporting an issue.

 ## Release information

+### Feb 11, 2019
+
+We have released detection models trained on the [Open Images Dataset V4](https://storage.googleapis.com/openimages/web/challenge.html)
+in our detection model zoo, including
+
+* Faster R-CNN detector with Inception Resnet V2 feature extractor
+* SSD detector with MobileNet V2 feature extractor
+* SSD detector with ResNet 101 FPN feature extractor (aka RetinaNet-101)
+
+<b>Thanks to contributors</b>: Alina Kuznetsova, Yinxiao Li
+
 ### Sep 17, 2018
 We have released Faster R-CNN detectors with ResNet-50 / ResNet-101 feature
......
@@ -56,13 +56,10 @@ class GridAnchorGenerator(anchor_generator.AnchorGenerator):
     # Handle argument defaults
     if base_anchor_size is None:
       base_anchor_size = [256, 256]
-    base_anchor_size = tf.to_float(tf.convert_to_tensor(base_anchor_size))
     if anchor_stride is None:
       anchor_stride = [16, 16]
-    anchor_stride = tf.to_float(tf.convert_to_tensor(anchor_stride))
     if anchor_offset is None:
       anchor_offset = [0, 0]
-    anchor_offset = tf.to_float(tf.convert_to_tensor(anchor_offset))
     self._scales = scales
     self._aspect_ratios = aspect_ratios
@@ -108,6 +105,13 @@ class GridAnchorGenerator(anchor_generator.AnchorGenerator):
     if not all([isinstance(list_item, tuple) and len(list_item) == 2
                 for list_item in feature_map_shape_list]):
       raise ValueError('feature_map_shape_list must be a list of pairs.')
+    self._base_anchor_size = tf.to_float(tf.convert_to_tensor(
+        self._base_anchor_size))
+    self._anchor_stride = tf.to_float(tf.convert_to_tensor(
+        self._anchor_stride))
+    self._anchor_offset = tf.to_float(tf.convert_to_tensor(
+        self._anchor_offset))
     grid_height, grid_width = feature_map_shape_list[0]
     scales_grid, aspect_ratios_grid = ops.meshgrid(self._scales,
                                                    self._aspect_ratios)
......
@@ -60,7 +60,7 @@ class MultipleGridAnchorGenerator(anchor_generator.AnchorGenerator):
         outside list having the same number of entries as feature_map_shape_list
         (which is passed in at generation time).
       base_anchor_size: base anchor size as [height, width]
-        (length-2 float tensor, default=[1.0, 1.0]).
+        (length-2 float numpy or Tensor, default=[1.0, 1.0]).
         The height and width values are normalized to the
         minimum dimension of the input height and width, so that
         when the base anchor height equals the base anchor
@@ -95,7 +95,7 @@ class MultipleGridAnchorGenerator(anchor_generator.AnchorGenerator):
       raise ValueError('box_specs_list is expected to be a '
                        'list of lists of pairs')
     if base_anchor_size is None:
-      base_anchor_size = tf.constant([256, 256], dtype=tf.float32)
+      base_anchor_size = [256, 256]
     self._base_anchor_size = base_anchor_size
     self._anchor_strides = anchor_strides
     self._anchor_offsets = anchor_offsets
@@ -211,10 +211,18 @@ class MultipleGridAnchorGenerator(anchor_generator.AnchorGenerator):
     min_im_shape = tf.minimum(im_height, im_width)
     scale_height = min_im_shape / im_height
     scale_width = min_im_shape / im_width
-    base_anchor_size = [
-        scale_height * self._base_anchor_size[0],
-        scale_width * self._base_anchor_size[1]
-    ]
+    if not tf.contrib.framework.is_tensor(self._base_anchor_size):
+      base_anchor_size = [
+          scale_height * tf.constant(self._base_anchor_size[0],
+                                     dtype=tf.float32),
+          scale_width * tf.constant(self._base_anchor_size[1],
+                                    dtype=tf.float32)
+      ]
+    else:
+      base_anchor_size = [
+          scale_height * self._base_anchor_size[0],
+          scale_width * self._base_anchor_size[1]
+      ]
     for feature_map_index, (grid_size, scales, aspect_ratios, stride,
                             offset) in enumerate(
                                 zip(feature_map_shape_list, self._scales,
@@ -304,7 +312,6 @@ def create_ssd_anchors(num_layers=6,
   """
   if base_anchor_size is None:
     base_anchor_size = [1.0, 1.0]
-  base_anchor_size = tf.constant(base_anchor_size, dtype=tf.float32)
   box_specs_list = []
   if scales is None or not scales:
     scales = [min_scale + (max_scale - min_scale) * i / (num_layers - 1)
......
@@ -47,14 +47,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
                     grid_anchor_generator.GridAnchorGenerator))
     self.assertListEqual(anchor_generator_object._scales, [])
     self.assertListEqual(anchor_generator_object._aspect_ratios, [])
-    with self.test_session() as sess:
-      base_anchor_size, anchor_offset, anchor_stride = sess.run(
-          [anchor_generator_object._base_anchor_size,
-           anchor_generator_object._anchor_offset,
-           anchor_generator_object._anchor_stride])
-      self.assertAllEqual(anchor_offset, [0, 0])
-      self.assertAllEqual(anchor_stride, [16, 16])
-      self.assertAllEqual(base_anchor_size, [256, 256])
+    self.assertAllEqual(anchor_generator_object._anchor_offset, [0, 0])
+    self.assertAllEqual(anchor_generator_object._anchor_stride, [16, 16])
+    self.assertAllEqual(anchor_generator_object._base_anchor_size, [256, 256])

   def test_build_grid_anchor_generator_with_non_default_parameters(self):
     anchor_generator_text_proto = """
@@ -79,14 +74,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
                                   [0.4, 2.2])
     self.assert_almost_list_equal(anchor_generator_object._aspect_ratios,
                                   [0.3, 4.5])
-    with self.test_session() as sess:
-      base_anchor_size, anchor_offset, anchor_stride = sess.run(
-          [anchor_generator_object._base_anchor_size,
-           anchor_generator_object._anchor_offset,
-           anchor_generator_object._anchor_stride])
-      self.assertAllEqual(anchor_offset, [30, 40])
-      self.assertAllEqual(anchor_stride, [10, 20])
-      self.assertAllEqual(base_anchor_size, [128, 512])
+    self.assertAllEqual(anchor_generator_object._anchor_offset, [30, 40])
+    self.assertAllEqual(anchor_generator_object._anchor_stride, [10, 20])
+    self.assertAllEqual(anchor_generator_object._base_anchor_size, [128, 512])

   def test_build_ssd_anchor_generator_with_defaults(self):
     anchor_generator_text_proto = """
@@ -114,10 +104,7 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
         list(anchor_generator_object._aspect_ratios),
         [(1.0, 2.0, 0.5)] + 5 * [(1.0, 1.0)]):
       self.assert_almost_list_equal(expected_aspect_ratio, actual_aspect_ratio)
-    with self.test_session() as sess:
-      base_anchor_size = sess.run(anchor_generator_object._base_anchor_size)
-    self.assertAllClose(base_anchor_size, [1.0, 1.0])
+    self.assertAllClose(anchor_generator_object._base_anchor_size, [1.0, 1.0])

   def test_build_ssd_anchor_generator_with_custom_scales(self):
     anchor_generator_text_proto = """
@@ -194,9 +181,7 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
         6 * [(1.0, 1.0)]):
       self.assert_almost_list_equal(expected_aspect_ratio, actual_aspect_ratio)
-    with self.test_session() as sess:
-      base_anchor_size = sess.run(anchor_generator_object._base_anchor_size)
-    self.assertAllClose(base_anchor_size, [1.0, 1.0])
+    self.assertAllClose(anchor_generator_object._base_anchor_size, [1.0, 1.0])

   def test_build_ssd_anchor_generator_with_non_default_parameters(self):
     anchor_generator_text_proto = """
@@ -241,9 +226,7 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
         list(anchor_generator_object._anchor_offsets), [(8, 0), (16, 10)]):
       self.assert_almost_list_equal(expected_offsets, actual_offsets)
-    with self.test_session() as sess:
-      base_anchor_size = sess.run(anchor_generator_object._base_anchor_size)
-    self.assertAllClose(base_anchor_size, [1.0, 1.0])
+    self.assertAllClose(anchor_generator_object._base_anchor_size, [1.0, 1.0])

   def test_raise_value_error_on_empty_anchor_genertor(self):
     anchor_generator_text_proto = """
......
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tensorflow ops to calibrate class predictions and background class."""
import tensorflow as tf
from object_detection.utils import shape_utils
def _find_interval_containing_new_value(x, new_value):
"""Find the index of x (ascending-ordered) after which new_value occurs."""
new_value_shape = shape_utils.combined_static_and_dynamic_shape(new_value)[0]
x_shape = shape_utils.combined_static_and_dynamic_shape(x)[0]
compare = tf.cast(tf.reshape(new_value, shape=(new_value_shape, 1)) >=
tf.reshape(x, shape=(1, x_shape)),
dtype=tf.int32)
diff = compare[:, 1:] - compare[:, :-1]
interval_idx = tf.argmin(diff, axis=1)
return interval_idx
def _tf_linear_interp1d(x_to_interpolate, fn_x, fn_y):
"""Tensorflow implementation of 1d linear interpolation.
Args:
x_to_interpolate: tf.float32 Tensor of shape (num_examples,) over which 1d
linear interpolation is performed.
fn_x: Monotonically-increasing, non-repeating tf.float32 Tensor of shape
(length,) used as the domain to approximate a function.
fn_y: tf.float32 Tensor of shape (length,) used as the range to approximate
a function.
Returns:
tf.float32 Tensor of shape (num_examples,)
"""
x_pad = tf.concat([fn_x[:1] - 1, fn_x, fn_x[-1:] + 1], axis=0)
y_pad = tf.concat([fn_y[:1], fn_y, fn_y[-1:]], axis=0)
interval_idx = _find_interval_containing_new_value(x_pad, x_to_interpolate)
# Interpolate
alpha = (
(x_to_interpolate - tf.gather(x_pad, interval_idx)) /
(tf.gather(x_pad, interval_idx + 1) - tf.gather(x_pad, interval_idx)))
interpolation = ((1 - alpha) * tf.gather(y_pad, interval_idx) +
alpha * tf.gather(y_pad, interval_idx + 1))
return interpolation
def _function_approximation_proto_to_tf_tensors(x_y_pairs_message):
"""Extracts (x,y) pairs from a XYPairs message.
Args:
x_y_pairs_message: calibration_pb2.XYPairs proto.
Returns:
tf_x: tf.float32 tensor of shape (number_xy_pairs,) for function domain.
tf_y: tf.float32 tensor of shape (number_xy_pairs,) for function range.
"""
tf_x = tf.convert_to_tensor([x_y_pair.x
for x_y_pair
in x_y_pairs_message.x_y_pair],
dtype=tf.float32)
tf_y = tf.convert_to_tensor([x_y_pair.y
for x_y_pair
in x_y_pairs_message.x_y_pair],
dtype=tf.float32)
return tf_x, tf_y
def build(calibration_config):
"""Returns a function that calibrates Tensorflow model scores.
All returned functions are expected to apply positive monotonic
transformations to inputs (i.e. score ordering is strictly preserved or
adjacent scores are mapped to the same score, but an input of lower value
should never exceed an input of higher value after transformation). For
class-agnostic calibration, positive monotonicity should hold across all
scores. In class-specific cases, positive monotonicity should hold within each
class.
Args:
calibration_config: calibration_pb2.CalibrationConfig proto.
Returns:
Function that accepts class_predictions_with_background and calibrates
the output based on calibration_config's parameters.
Raises:
ValueError: No calibration builder defined for "Oneof" in
calibration_config.
"""
# Linear Interpolation (usually used as a result of calibration via
# isotonic regression).
if calibration_config.WhichOneof('calibrator') == 'function_approximation':
def calibration_fn(class_predictions_with_background):
"""Calibrate predictions via 1-d linear interpolation.
Prediction scores are linearly interpolated based on class-agnostic
function approximations. Note that the 0-indexed background class may
also be transformed.
Args:
class_predictions_with_background: tf.float32 tensor of shape
[batch_size, num_anchors, num_classes + 1] containing scores on the
interval [0,1]. This is usually produced by a sigmoid or softmax layer
and the result of calling the `predict` method of a detection model.
Returns:
tf.float32 tensor of shape [batch_size, num_anchors, num_classes] if
background class is not present (else shape is
[batch_size, num_anchors, num_classes + 1]) on the interval [0, 1].
"""
# Flattening Tensors and then reshaping at the end.
flat_class_predictions_with_background = tf.reshape(
class_predictions_with_background, shape=[-1])
fn_x, fn_y = _function_approximation_proto_to_tf_tensors(
calibration_config.function_approximation.x_y_pairs)
updated_scores = _tf_linear_interp1d(
flat_class_predictions_with_background, fn_x, fn_y)
# Un-flatten the scores
original_detections_shape = shape_utils.combined_static_and_dynamic_shape(
class_predictions_with_background)
calibrated_class_predictions_with_background = tf.reshape(
updated_scores,
shape=original_detections_shape,
name='calibrate_scores')
return calibrated_class_predictions_with_background
# TODO(zbeaver): Add sigmoid calibration and per-class isotonic regression.
else:
raise ValueError('No calibration builder defined for "Oneof" in '
'calibration_config.')
return calibration_fn
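The padding trick in `_tf_linear_interp1d` above (extending the domain by one unit on each side while repeating the endpoint range values) makes out-of-range inputs clamp to the endpoint values instead of extrapolating. A pure-Python sketch of the same logic, with `linear_interp1d` being an illustrative name (it uses `bisect` rather than the TF argmin-over-differences trick, but implements the same piecewise-linear function):

```python
import bisect

def linear_interp1d(x_to_interpolate, fn_x, fn_y):
    """Piecewise-linear interpolation with endpoint clamping, as in the TF op."""
    # Pad the domain by 1 on each side and repeat the endpoint range values,
    # so inputs outside [fn_x[0], fn_x[-1]] map to the endpoint values.
    x_pad = [fn_x[0] - 1] + list(fn_x) + [fn_x[-1] + 1]
    y_pad = [fn_y[0]] + list(fn_y) + [fn_y[-1]]
    out = []
    for x in x_to_interpolate:
        # Index of the interval [x_pad[i], x_pad[i + 1]) containing x.
        i = max(bisect.bisect_right(x_pad, x) - 1, 0)
        i = min(i, len(x_pad) - 2)
        alpha = (x - x_pad[i]) / (x_pad[i + 1] - x_pad[i])
        out.append((1 - alpha) * y_pad[i] + alpha * y_pad[i + 1])
    return out

# Same fn_x/fn_y values as test_tf_linear_interp1d_interpolate below;
# -0.5 and 1.5 fall outside the domain and clamp to 0.6 and 1.0.
print(linear_interp1d([-0.5, 0.25, 1.5], [0.0, 0.5, 1.0], [0.6, 0.7, 1.0]))
# approximately [0.6, 0.65, 1.0]
```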
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for calibration_builder."""
import numpy as np
from scipy import interpolate
import tensorflow as tf
from object_detection.builders import calibration_builder
from object_detection.protos import calibration_pb2
class CalibrationBuilderTest(tf.test.TestCase):
def test_tf_linear_interp1d_map(self):
"""Tests TF linear interpolation mapping to a single number."""
with self.test_session() as sess:
tf_x = tf.constant([0., 0.5, 1.])
tf_y = tf.constant([0.5, 0.5, 0.5])
new_x = tf.constant([0., 0.25, 0.5, 0.75, 1.])
tf_map_outputs = calibration_builder._tf_linear_interp1d(
new_x, tf_x, tf_y)
tf_map_outputs_np = sess.run([tf_map_outputs])
self.assertAllClose(tf_map_outputs_np, [[0.5, 0.5, 0.5, 0.5, 0.5]])
def test_tf_linear_interp1d_interpolate(self):
"""Tests TF 1d linear interpolation not mapping to a single number."""
with self.test_session() as sess:
tf_x = tf.constant([0., 0.5, 1.])
tf_y = tf.constant([0.6, 0.7, 1.0])
new_x = tf.constant([0., 0.25, 0.5, 0.75, 1.])
tf_interpolate_outputs = calibration_builder._tf_linear_interp1d(
new_x, tf_x, tf_y)
tf_interpolate_outputs_np = sess.run([tf_interpolate_outputs])
self.assertAllClose(tf_interpolate_outputs_np, [[0.6, 0.65, 0.7, 0.85, 1.]])
@staticmethod
def _get_scipy_interp1d(new_x, x, y):
"""Helper performing 1d linear interpolation using SciPy."""
interpolation1d_fn = interpolate.interp1d(x, y)
return interpolation1d_fn(new_x)
def _get_tf_interp1d(self, new_x, x, y):
"""Helper performing 1d linear interpolation using Tensorflow."""
with self.test_session() as sess:
tf_interp_outputs = calibration_builder._tf_linear_interp1d(
tf.convert_to_tensor(new_x, dtype=tf.float32),
tf.convert_to_tensor(x, dtype=tf.float32),
tf.convert_to_tensor(y, dtype=tf.float32))
np_tf_interp_outputs = sess.run(tf_interp_outputs)
return np_tf_interp_outputs
def test_tf_linear_interp1d_against_scipy_map(self):
"""Tests parity of TF linear interpolation with SciPy for simple mapping."""
length = 10
np_x = np.linspace(0, 1, length)
# Mapping all numbers to 0.5
np_y_map = np.repeat(0.5, length)
# Scipy and TF interpolations
test_data_np = np.linspace(0, 1, length * 10)
scipy_map_outputs = self._get_scipy_interp1d(test_data_np, np_x, np_y_map)
np_tf_map_outputs = self._get_tf_interp1d(test_data_np, np_x, np_y_map)
self.assertAllClose(scipy_map_outputs, np_tf_map_outputs)
def test_tf_linear_interp1d_against_scipy_interpolate(self):
"""Tests parity of TF linear interpolation with SciPy."""
length = 10
np_x = np.linspace(0, 1, length)
# Requires interpolation over 0.5 to 1 domain
np_y_interp = np.linspace(0.5, 1, length)
# Scipy interpolation for comparison
test_data_np = np.linspace(0, 1, length * 10)
scipy_interp_outputs = self._get_scipy_interp1d(test_data_np, np_x,
np_y_interp)
np_tf_interp_outputs = self._get_tf_interp1d(test_data_np, np_x,
np_y_interp)
self.assertAllClose(scipy_interp_outputs, np_tf_interp_outputs)
@staticmethod
def _add_function_approximation_to_calibration_proto(calibration_proto,
x_array,
y_array,
class_label):
"""Adds a function approximation to calibration proto for a class label."""
# Per-class calibration.
if class_label:
label_function_approximation = (calibration_proto
.label_function_approximations
.label_xy_pairs_map[class_label])
# Class-agnostic calibration.
else:
label_function_approximation = (calibration_proto
.function_approximation
.x_y_pairs)
for x, y in zip(x_array, y_array):
x_y_pair_message = label_function_approximation.x_y_pair.add()
x_y_pair_message.x = x
x_y_pair_message.y = y
def test_class_agnostic_function_approximation(self):
"""Ensures that calibration appropriate values, regardless of class."""
# Generate fake calibration proto. For this interpolation, any input on
# [0.0, 0.5] should be divided by 2 and any input on (0.5, 1.0] should have
# 0.25 subtracted from it.
class_agnostic_x = np.asarray([0.0, 0.5, 1.0])
class_agnostic_y = np.asarray([0.0, 0.25, 0.75])
calibration_config = calibration_pb2.CalibrationConfig()
self._add_function_approximation_to_calibration_proto(calibration_config,
class_agnostic_x,
class_agnostic_y,
class_label=None)
od_graph = tf.Graph()
with self.test_session(graph=od_graph) as sess:
calibration_fn = calibration_builder.build(calibration_config)
# batch_size = 2, num_classes = 2, num_anchors = 2.
class_predictions_with_background = tf.constant(
[[[0.1, 0.2, 0.3],
[0.4, 0.5, 0.0]],
[[0.6, 0.7, 0.8],
[0.9, 1.0, 1.0]]], dtype=tf.float32)
# The same calibration transformation should apply to all scores,
# regardless of class.
calibrated_scores = calibration_fn(class_predictions_with_background)
calibrated_scores_np = sess.run(calibrated_scores)
self.assertAllClose(calibrated_scores_np, [[[0.05, 0.1, 0.15],
[0.2, 0.25, 0.0]],
[[0.35, 0.45, 0.55],
[0.65, 0.75, 0.75]]])
if __name__ == '__main__':
tf.test.main()
@@ -117,6 +117,7 @@ def build(input_reader_config, batch_size=None, transform_input_data_fn=None):
       label_map_proto_file = input_reader_config.label_map_path
     decoder = tf_example_decoder.TfExampleDecoder(
         load_instance_masks=input_reader_config.load_instance_masks,
+        load_multiclass_scores=input_reader_config.load_multiclass_scores,
         instance_mask_type=input_reader_config.mask_type,
         label_map_proto_file=label_map_proto_file,
         use_display_name=input_reader_config.use_display_name,
@@ -140,9 +141,12 @@ def build(input_reader_config, batch_size=None, transform_input_data_fn=None):
       num_parallel_calls = batch_size * input_reader_config.num_parallel_batches
     else:
       num_parallel_calls = input_reader_config.num_parallel_map_calls
-    dataset = dataset.map(
-        process_fn,
-        num_parallel_calls=num_parallel_calls)
+    # TODO(b/123952794): Migrate to V2 function.
+    if hasattr(dataset, 'map_with_legacy_function'):
+      data_map_fn = dataset.map_with_legacy_function
+    else:
+      data_map_fn = dataset.map
+    dataset = data_map_fn(process_fn, num_parallel_calls=num_parallel_calls)
     if batch_size:
       dataset = dataset.apply(
           tf.contrib.data.batch_and_drop_remainder(batch_size))
......
@@ -102,6 +102,14 @@ def build(image_resizer_config):
        method=method)
    if not fixed_shape_resizer_config.convert_to_grayscale:
      return image_resizer_fn
  elif image_resizer_oneof == 'identity_resizer':

    def image_resizer_fn(image, masks=None, **kwargs):
      del kwargs
      if masks is None:
        return [image, tf.shape(image)]
      else:
        return [image, masks, tf.shape(image)]

    return image_resizer_fn
  else:
    raise ValueError(
        'Invalid image resizer option: \'%s\'.' % image_resizer_oneof)
...
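The new `identity_resizer` branch returns the image (and masks, if given) untouched, plus its runtime shape. A NumPy sketch of that contract, with `np.shape` standing in for `tf.shape`:

```python
import numpy as np

def identity_resizer(image, masks=None):
  """Mirrors the identity_resizer contract: pass inputs through, append shape."""
  if masks is None:
    return [image, np.shape(image)]
  return [image, masks, np.shape(image)]

image = np.zeros((10, 20, 3))
outputs = identity_resizer(image)
# outputs[0] is the unchanged image; outputs[1] is its shape, (10, 20, 3).
```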
@@ -104,6 +104,17 @@ class ImageResizerBuilderTest(tf.test.TestCase):
        input_shape, image_resizer_text_proto)
    self.assertEqual(output_shape, expected_output_shape)

  def test_identity_resizer_returns_expected_shape(self):
    image_resizer_text_proto = """
      identity_resizer {
      }
    """
    input_shape = (10, 20, 3)
    expected_output_shape = (10, 20, 3)
    output_shape = self._shape_of_resized_random_image_given_text_proto(
        input_shape, image_resizer_text_proto)
    self.assertEqual(output_shape, expected_output_shape)

  def test_raises_error_on_invalid_input(self):
    invalid_input = 'invalid_input'
    with self.assertRaises(ValueError):
...
@@ -44,6 +44,7 @@ from object_detection.models.ssd_inception_v2_feature_extractor import SSDInceptionV2FeatureExtractor
from object_detection.models.ssd_inception_v3_feature_extractor import SSDInceptionV3FeatureExtractor
from object_detection.models.ssd_mobilenet_v1_feature_extractor import SSDMobileNetV1FeatureExtractor
from object_detection.models.ssd_mobilenet_v1_fpn_feature_extractor import SSDMobileNetV1FpnFeatureExtractor
from object_detection.models.ssd_mobilenet_v1_keras_feature_extractor import SSDMobileNetV1KerasFeatureExtractor
from object_detection.models.ssd_mobilenet_v1_ppn_feature_extractor import SSDMobileNetV1PpnFeatureExtractor
from object_detection.models.ssd_mobilenet_v2_feature_extractor import SSDMobileNetV2FeatureExtractor
from object_detection.models.ssd_mobilenet_v2_fpn_feature_extractor import SSDMobileNetV2FpnFeatureExtractor
@@ -76,6 +77,7 @@ SSD_FEATURE_EXTRACTOR_CLASS_MAP = {
}

SSD_KERAS_FEATURE_EXTRACTOR_CLASS_MAP = {
    'ssd_mobilenet_v1_keras': SSDMobileNetV1KerasFeatureExtractor,
    'ssd_mobilenet_v2_keras': SSDMobileNetV2KerasFeatureExtractor
}
@@ -187,6 +189,12 @@ def _build_ssd_feature_extractor(feature_extractor_config,
      override_base_feature_extractor_hyperparams
  }

  if feature_extractor_config.HasField('replace_preprocessor_with_placeholder'):
    kwargs.update({
        'replace_preprocessor_with_placeholder':
            feature_extractor_config.replace_preprocessor_with_placeholder
    })

  if is_keras_extractor:
    kwargs.update({
        'conv_hyperparams': conv_hyperparams,
...
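Keeping slim and Keras extractors in separate maps means lookups (as in the refactored tests) merge the two registries first. A toy sketch of that merge-then-lookup pattern (map contents here are placeholder strings, not the real extractor classes):

```python
# Placeholder registries; the real maps hold feature-extractor classes.
SSD_FEATURE_EXTRACTOR_CLASS_MAP = {'ssd_mobilenet_v2': 'SlimExtractor'}
SSD_KERAS_FEATURE_EXTRACTOR_CLASS_MAP = {'ssd_mobilenet_v2_keras': 'KerasExtractor'}


def lookup_extractor(feature_type):
  # Merge both registries so either naming convention resolves.
  merged = {}
  merged.update(SSD_FEATURE_EXTRACTOR_CLASS_MAP)
  merged.update(SSD_KERAS_FEATURE_EXTRACTOR_CLASS_MAP)
  if feature_type not in merged:
    raise ValueError('Unknown ssd feature_extractor: %s' % feature_type)
  return merged[feature_type]


extractor = lookup_extractor('ssd_mobilenet_v2_keras')  # → 'KerasExtractor'
```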
@@ -24,69 +24,29 @@ from object_detection.builders import model_builder
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.meta_architectures import rfcn_meta_arch
from object_detection.meta_architectures import ssd_meta_arch
from object_detection.models import ssd_resnet_v1_fpn_feature_extractor as ssd_resnet_v1_fpn
from object_detection.protos import hyperparams_pb2
from object_detection.protos import losses_pb2
from object_detection.protos import model_pb2


class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):

  def create_model(self, model_config, is_training=True):
    """Builds a DetectionModel based on the model config.

    Args:
      model_config: A model.proto object containing the config for the desired
        DetectionModel.
      is_training: True if this model is being built for training purposes.

    Returns:
      DetectionModel based on the config.
    """
    return model_builder.build(model_config, is_training=is_training)

  def create_default_ssd_model_proto(self):
    """Creates a DetectionModel proto with ssd model fields populated."""
    model_text_proto = """
      ssd {
        feature_extractor {
@@ -153,56 +113,46 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
      }"""
    model_proto = model_pb2.DetectionModel()
    text_format.Merge(model_text_proto, model_proto)
    return model_proto

  def create_default_faster_rcnn_model_proto(self):
    """Creates a DetectionModel proto with FasterRCNN model fields populated."""
    model_text_proto = """
      faster_rcnn {
        inplace_batchnorm_update: false
        num_classes: 3
        image_resizer {
          keep_aspect_ratio_resizer {
            min_dimension: 600
            max_dimension: 1024
          }
        }
        feature_extractor {
          type: 'faster_rcnn_resnet101'
        }
        first_stage_anchor_generator {
          grid_anchor_generator {
            scales: [0.25, 0.5, 1.0, 2.0]
            aspect_ratios: [0.5, 1.0, 2.0]
            height_stride: 16
            width_stride: 16
          }
        }
        first_stage_box_predictor_conv_hyperparams {
          regularizer {
            l2_regularizer {
            }
          }
          initializer {
            truncated_normal_initializer {
            }
          }
        }
        initial_crop_size: 14
        maxpool_kernel_size: 2
        maxpool_stride: 2
        second_stage_box_predictor {
          mask_rcnn_box_predictor {
            conv_hyperparams {
              regularizer {
                l2_regularizer {
@@ -213,1357 +163,169 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
                }
              }
            }
            fc_hyperparams {
              op: FC
              regularizer {
                l2_regularizer {
                }
              }
              initializer {
                truncated_normal_initializer {
                }
              }
            }
          }
        }
        second_stage_post_processing {
          batch_non_max_suppression {
            score_threshold: 0.01
            iou_threshold: 0.6
            max_detections_per_class: 100
            max_total_detections: 300
          }
          score_converter: SOFTMAX
        }
      }"""
    model_proto = model_pb2.DetectionModel()
    text_format.Merge(model_text_proto, model_proto)
    return model_proto

  def test_create_ssd_models_from_config(self):
    model_proto = self.create_default_ssd_model_proto()
    ssd_feature_extractor_map = {}
    ssd_feature_extractor_map.update(
        model_builder.SSD_FEATURE_EXTRACTOR_CLASS_MAP)
    ssd_feature_extractor_map.update(
        model_builder.SSD_KERAS_FEATURE_EXTRACTOR_CLASS_MAP)

    for extractor_type, extractor_class in ssd_feature_extractor_map.items():
      model_proto.ssd.feature_extractor.type = extractor_type
      model = model_builder.build(model_proto, is_training=True)
      self.assertIsInstance(model, ssd_meta_arch.SSDMetaArch)
      self.assertIsInstance(model._feature_extractor, extractor_class)

  def test_create_ssd_fpn_model_from_config(self):
    model_proto = self.create_default_ssd_model_proto()
    model_proto.ssd.feature_extractor.type = 'ssd_resnet101_v1_fpn'
    model_proto.ssd.feature_extractor.fpn.min_level = 3
    model_proto.ssd.feature_extractor.fpn.max_level = 7
    model = model_builder.build(model_proto, is_training=True)
    self.assertIsInstance(model._feature_extractor,
                          ssd_resnet_v1_fpn.SSDResnet101V1FpnFeatureExtractor)
    self.assertEqual(model._feature_extractor._fpn_min_level, 3)
    self.assertEqual(model._feature_extractor._fpn_max_level, 7)

  @parameterized.named_parameters(
      {
          'testcase_name': 'mask_rcnn_with_matmul',
          'use_matmul_crop_and_resize': False,
          'enable_mask_prediction': True
      },
      {
          'testcase_name': 'mask_rcnn_without_matmul',
          'use_matmul_crop_and_resize': True,
          'enable_mask_prediction': True
      },
      {
          'testcase_name': 'faster_rcnn_with_matmul',
          'use_matmul_crop_and_resize': False,
          'enable_mask_prediction': False
      },
      {
          'testcase_name': 'faster_rcnn_without_matmul',
          'use_matmul_crop_and_resize': True,
          'enable_mask_prediction': False
      },
  )
  def test_create_faster_rcnn_models_from_config(
      self, use_matmul_crop_and_resize, enable_mask_prediction):
    model_proto = self.create_default_faster_rcnn_model_proto()
    faster_rcnn_config = model_proto.faster_rcnn
    faster_rcnn_config.use_matmul_crop_and_resize = use_matmul_crop_and_resize
    if enable_mask_prediction:
      faster_rcnn_config.second_stage_mask_prediction_loss_weight = 3.0
      mask_predictor_config = (
          faster_rcnn_config.second_stage_box_predictor.mask_rcnn_box_predictor)
      mask_predictor_config.predict_instance_masks = True

    for extractor_type, extractor_class in (
        model_builder.FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP.items()):
      faster_rcnn_config.feature_extractor.type = extractor_type
      model = model_builder.build(model_proto, is_training=True)
      self.assertIsInstance(model, faster_rcnn_meta_arch.FasterRCNNMetaArch)
      self.assertIsInstance(model._feature_extractor, extractor_class)
      if enable_mask_prediction:
        self.assertAlmostEqual(model._second_stage_mask_loss_weight, 3.0)

  def test_create_faster_rcnn_model_from_config_with_example_miner(self):
    model_proto = self.create_default_faster_rcnn_model_proto()
    model_proto.faster_rcnn.hard_example_miner.num_hard_examples = 64
    model = model_builder.build(model_proto, is_training=True)
    self.assertIsNotNone(model._hard_example_miner)

  def test_create_rfcn_model_from_config(self):
    model_proto = self.create_default_faster_rcnn_model_proto()
    rfcn_predictor_config = (
        model_proto.faster_rcnn.second_stage_box_predictor.rfcn_box_predictor)
    rfcn_predictor_config.conv_hyperparams.op = hyperparams_pb2.Hyperparams.CONV
    for extractor_type, extractor_class in (
        model_builder.FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP.items()):
      model_proto.faster_rcnn.feature_extractor.type = extractor_type
      model = model_builder.build(model_proto, is_training=True)
      self.assertIsInstance(model, rfcn_meta_arch.RFCNMetaArch)
      self.assertIsInstance(model._feature_extractor, extractor_class)

  def test_invalid_model_config_proto(self):
    model_proto = ''
    with self.assertRaisesRegexp(
        ValueError, 'model_config not of type model_pb2.DetectionModel.'):
      model_builder.build(model_proto, is_training=True)
self.assertIsInstance(
model._feature_extractor,
frcnn_nas.FasterRCNNNASFeatureExtractor)
def test_create_faster_rcnn_pnas_model_from_config(self):
model_text_proto = """
faster_rcnn {
num_classes: 3
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 600
max_dimension: 1024
}
}
feature_extractor {
type: 'faster_rcnn_pnas'
}
first_stage_anchor_generator {
grid_anchor_generator {
scales: [0.25, 0.5, 1.0, 2.0]
aspect_ratios: [0.5, 1.0, 2.0]
height_stride: 16
width_stride: 16
}
}
first_stage_box_predictor_conv_hyperparams {
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
initial_crop_size: 17
maxpool_kernel_size: 1
maxpool_stride: 1
second_stage_box_predictor {
mask_rcnn_box_predictor {
fc_hyperparams {
op: FC
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
}
}
second_stage_post_processing {
batch_non_max_suppression {
score_threshold: 0.01
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 300
}
score_converter: SOFTMAX
}
}"""
model_proto = model_pb2.DetectionModel()
text_format.Merge(model_text_proto, model_proto)
model = model_builder.build(model_proto, is_training=True)
self.assertIsInstance(model, faster_rcnn_meta_arch.FasterRCNNMetaArch)
self.assertIsInstance(
model._feature_extractor,
frcnn_pnas.FasterRCNNPNASFeatureExtractor)
def test_create_faster_rcnn_inception_resnet_v2_model_from_config(self):
model_text_proto = """
faster_rcnn {
num_classes: 3
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 600
max_dimension: 1024
}
}
feature_extractor {
type: 'faster_rcnn_inception_resnet_v2'
}
first_stage_anchor_generator {
grid_anchor_generator {
scales: [0.25, 0.5, 1.0, 2.0]
aspect_ratios: [0.5, 1.0, 2.0]
height_stride: 16
width_stride: 16
}
}
first_stage_box_predictor_conv_hyperparams {
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
initial_crop_size: 17
maxpool_kernel_size: 1
maxpool_stride: 1
second_stage_box_predictor {
mask_rcnn_box_predictor {
fc_hyperparams {
op: FC
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
}
}
second_stage_post_processing {
batch_non_max_suppression {
score_threshold: 0.01
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 300
}
score_converter: SOFTMAX
}
}"""
model_proto = model_pb2.DetectionModel()
text_format.Merge(model_text_proto, model_proto)
model = model_builder.build(model_proto, is_training=True)
self.assertIsInstance(model, faster_rcnn_meta_arch.FasterRCNNMetaArch)
self.assertIsInstance(
model._feature_extractor,
frcnn_inc_res.FasterRCNNInceptionResnetV2FeatureExtractor)
def test_create_faster_rcnn_inception_v2_model_from_config(self):
model_text_proto = """
faster_rcnn {
num_classes: 3
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 600
max_dimension: 1024
}
}
feature_extractor {
type: 'faster_rcnn_inception_v2'
}
first_stage_anchor_generator {
grid_anchor_generator {
scales: [0.25, 0.5, 1.0, 2.0]
aspect_ratios: [0.5, 1.0, 2.0]
height_stride: 16
width_stride: 16
}
}
first_stage_box_predictor_conv_hyperparams {
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
initial_crop_size: 14
maxpool_kernel_size: 2
maxpool_stride: 2
second_stage_box_predictor {
mask_rcnn_box_predictor {
fc_hyperparams {
op: FC
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
}
}
second_stage_post_processing {
batch_non_max_suppression {
score_threshold: 0.01
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 300
}
score_converter: SOFTMAX
}
}"""
model_proto = model_pb2.DetectionModel()
text_format.Merge(model_text_proto, model_proto)
model = model_builder.build(model_proto, is_training=True)
self.assertIsInstance(model, faster_rcnn_meta_arch.FasterRCNNMetaArch)
self.assertIsInstance(model._feature_extractor,
frcnn_inc_v2.FasterRCNNInceptionV2FeatureExtractor)
def test_create_faster_rcnn_model_from_config_with_example_miner(self):
model_text_proto = """
faster_rcnn {
num_classes: 3
feature_extractor {
type: 'faster_rcnn_inception_resnet_v2'
}
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 600
max_dimension: 1024
}
}
first_stage_anchor_generator {
grid_anchor_generator {
scales: [0.25, 0.5, 1.0, 2.0]
aspect_ratios: [0.5, 1.0, 2.0]
height_stride: 16
width_stride: 16
}
}
first_stage_box_predictor_conv_hyperparams {
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
second_stage_box_predictor {
mask_rcnn_box_predictor {
fc_hyperparams {
op: FC
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
}
}
hard_example_miner {
num_hard_examples: 10
iou_threshold: 0.99
}
}"""
model_proto = model_pb2.DetectionModel()
text_format.Merge(model_text_proto, model_proto)
model = model_builder.build(model_proto, is_training=True)
self.assertIsNotNone(model._hard_example_miner)
  def test_create_rfcn_resnet_v1_model_from_config(self):
model_text_proto = """
faster_rcnn {
num_classes: 3
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 600
max_dimension: 1024
}
}
feature_extractor {
type: 'faster_rcnn_resnet101'
}
first_stage_anchor_generator {
grid_anchor_generator {
scales: [0.25, 0.5, 1.0, 2.0]
aspect_ratios: [0.5, 1.0, 2.0]
height_stride: 16
width_stride: 16
}
}
first_stage_box_predictor_conv_hyperparams {
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
initial_crop_size: 14
maxpool_kernel_size: 2
maxpool_stride: 2
second_stage_box_predictor {
rfcn_box_predictor {
conv_hyperparams {
op: CONV
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
}
}
}
second_stage_post_processing {
batch_non_max_suppression {
score_threshold: 0.01
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 300
}
score_converter: SOFTMAX
}
}"""
    model_proto = model_pb2.DetectionModel()
    text_format.Merge(model_text_proto, model_proto)
    for extractor_type, extractor_class in FRCNN_RESNET_FEAT_MAPS.items():
      model_proto.faster_rcnn.feature_extractor.type = extractor_type
      model = model_builder.build(model_proto, is_training=True)
      self.assertIsInstance(model, rfcn_meta_arch.RFCNMetaArch)
      self.assertIsInstance(model._feature_extractor, extractor_class)

  def test_unknown_meta_architecture(self):
    model_proto = model_pb2.DetectionModel()
    with self.assertRaisesRegexp(ValueError, 'Unknown meta architecture'):
      model_builder.build(model_proto, is_training=True)

  def test_unknown_ssd_feature_extractor(self):
    model_proto = self.create_default_ssd_model_proto()
    model_proto.ssd.feature_extractor.type = 'unknown_feature_extractor'
with self.assertRaisesRegexp(ValueError, 'Unknown ssd feature_extractor'):
model_builder.build(model_proto, is_training=True)
def test_unknown_faster_rcnn_feature_extractor(self):
model_proto = self.create_default_faster_rcnn_model_proto()
model_proto.faster_rcnn.feature_extractor.type = 'unknown_feature_extractor'
with self.assertRaisesRegexp(ValueError,
'Unknown Faster R-CNN feature_extractor'):
model_builder.build(model_proto, is_training=True)
def test_invalid_first_stage_nms_iou_threshold(self):
model_proto = self.create_default_faster_rcnn_model_proto()
model_proto.faster_rcnn.first_stage_nms_iou_threshold = 1.1
with self.assertRaisesRegexp(ValueError,
r'iou_threshold not in \[0, 1\.0\]'):
model_builder.build(model_proto, is_training=True)
model_proto.faster_rcnn.first_stage_nms_iou_threshold = -0.1
with self.assertRaisesRegexp(ValueError,
r'iou_threshold not in \[0, 1\.0\]'):
model_builder.build(model_proto, is_training=True)
def test_invalid_second_stage_batch_size(self):
model_proto = self.create_default_faster_rcnn_model_proto()
model_proto.faster_rcnn.first_stage_max_proposals = 1
model_proto.faster_rcnn.second_stage_batch_size = 2
with self.assertRaisesRegexp(
ValueError, 'second_stage_batch_size should be no greater '
'than first_stage_max_proposals.'):
model_builder.build(model_proto, is_training=True)
def test_invalid_faster_rcnn_batchnorm_update(self):
model_proto = self.create_default_faster_rcnn_model_proto()
model_proto.faster_rcnn.inplace_batchnorm_update = True
with self.assertRaisesRegexp(ValueError,
'inplace batchnorm updates not supported'):
model_builder.build(model_proto, is_training=True)
if __name__ == '__main__':
...
@@ -16,6 +16,8 @@
"""Functions to build DetectionModel training optimizers."""
import tensorflow as tf
from object_detection.utils import learning_schedules
@@ -59,6 +61,7 @@ def build(optimizer_config):
    summary_vars.append(learning_rate)
    optimizer = tf.train.AdamOptimizer(learning_rate)
  if optimizer is None:
    raise ValueError('Optimizer %s not supported.' % optimizer_type)
...
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -17,6 +17,7 @@
import functools
import tensorflow as tf
from object_detection.builders import calibration_builder
from object_detection.core import post_processing
from object_detection.protos import post_processing_pb2
@@ -24,8 +25,8 @@ from object_detection.protos import post_processing_pb2
def build(post_processing_config):
  """Builds callables for post-processing operations.

  Builds callables for non-max suppression, score conversion, and (optionally)
  calibration based on the configuration.

  Non-max suppression callable takes `boxes`, `scores`, and optionally
  `clip_window`, `parallel_iterations`, `masks`, and `scope` as inputs. It returns
@@ -35,8 +36,10 @@ def build(post_processing_config):
  Score converter callable should be called with `input` tensor. The callable
  returns the output from one of 3 tf operations based on the configuration -
  tf.identity, tf.sigmoid or tf.nn.softmax. If a calibration config is provided,
  score_converter also applies calibration transformations, as defined in
  calibration_builder.py. See tensorflow documentation for argument and return
  value descriptions.
  Args:
    post_processing_config: post_processing.proto object containing the
@@ -57,6 +60,10 @@ def build(post_processing_config):
  score_converter_fn = _build_score_converter(
      post_processing_config.score_converter,
      post_processing_config.logit_scale)
if post_processing_config.HasField('calibration_config'):
score_converter_fn = _build_calibrated_score_converter(
score_converter_fn,
post_processing_config.calibration_config)
  return non_max_suppressor_fn, score_converter_fn
@@ -122,3 +129,32 @@ def _build_score_converter(score_converter_config, logit_scale):
  if score_converter_config == post_processing_pb2.PostProcessing.SOFTMAX:
    return _score_converter_fn_with_logit_scale(tf.nn.softmax, logit_scale)
  raise ValueError('Unknown score converter.')
def _build_calibrated_score_converter(score_converter_fn, calibration_config):
"""Wraps a score_converter_fn, adding a calibration step.
Builds a score converter function with a calibration transformation according
to calibration_builder.py. Calibration applies positive monotonic
transformations to inputs (i.e. score ordering is strictly preserved or
adjacent scores are mapped to the same score). When calibration is
class-agnostic, the highest-scoring class remains unchanged, unless two
adjacent scores are mapped to the same value and one class is arbitrarily
selected to break the tie. In per-class calibration, it's possible (though
rare in practice) that the highest-scoring class will change, since positive
monotonicity is only required to hold within each class.
Args:
score_converter_fn: callable that takes logit scores as input.
calibration_config: post_processing_pb2.PostProcessing.calibration_config.
Returns:
Callable calibrated score converter op.
"""
calibration_fn = calibration_builder.build(calibration_config)
def calibrated_score_converter_fn(logits):
converted_logits = score_converter_fn(logits)
return calibration_fn(converted_logits)
calibrated_score_converter_fn.__name__ = (
'calibrate_with_%s' % calibration_config.WhichOneof('calibrator'))
return calibrated_score_converter_fn
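The monotonic-calibration idea described in the docstring above can be illustrated with a pure-Python sketch: a piecewise-linear "function approximation" calibrator (like the `x_y_pairs` config used in the tests below) wrapped around an existing score converter. The helper names here are hypothetical, not the library's `calibration_builder` API.

```python
def make_piecewise_linear_calibrator(x_y_pairs):
    """Returns a monotonic map defined by linear interpolation over x_y_pairs.

    x_y_pairs is a sorted list of (x, y) tuples; scores outside the x-range
    clamp to the endpoint y values. If the y values are non-decreasing, score
    ordering is preserved (ties allowed), as the docstring above describes.
    """
    xs = [x for x, _ in x_y_pairs]
    ys = [y for _, y in x_y_pairs]

    def calibrate(score):
        if score <= xs[0]:
            return ys[0]
        if score >= xs[-1]:
            return ys[-1]
        for (x0, y0), (x1, y1) in zip(x_y_pairs, x_y_pairs[1:]):
            if x0 <= score <= x1:
                t = (score - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)

    return calibrate


def calibrated_score_converter(score_converter_fn, calibrate):
    # Mirrors the wrapping pattern above: convert first, then calibrate.
    def fn(logits):
        return [calibrate(s) for s in score_converter_fn(logits)]
    return fn
```

With the pairs (0.0, 0.5) and (1.0, 0.5) every score maps to 0.5, which is the behavior the calibration unit test below checks.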
@@ -47,7 +47,8 @@ class PostProcessingBuilderTest(tf.test.TestCase):
    """
    post_processing_config = post_processing_pb2.PostProcessing()
    text_format.Merge(post_processing_text_proto, post_processing_config)
    _, score_converter = post_processing_builder.build(
        post_processing_config)
    self.assertEqual(score_converter.__name__, 'identity_with_logit_scale')
    inputs = tf.constant([1, 1], tf.float32)
@@ -102,6 +103,36 @@ class PostProcessingBuilderTest(tf.test.TestCase):
    _, score_converter = post_processing_builder.build(post_processing_config)
    self.assertEqual(score_converter.__name__, 'softmax_with_logit_scale')
def test_build_calibrator_with_nonempty_config(self):
"""Test that identity function used when no calibration_config specified."""
# Calibration config maps all scores to 0.5.
post_processing_text_proto = """
score_converter: SOFTMAX
calibration_config {
function_approximation {
x_y_pairs {
x_y_pair {
x: 0.0
y: 0.5
}
x_y_pair {
x: 1.0
y: 0.5
}}}}"""
post_processing_config = post_processing_pb2.PostProcessing()
text_format.Merge(post_processing_text_proto, post_processing_config)
_, calibrated_score_conversion_fn = post_processing_builder.build(
post_processing_config)
self.assertEqual(calibrated_score_conversion_fn.__name__,
'calibrate_with_function_approximation')
input_scores = tf.constant([1, 1], tf.float32)
outputs = calibrated_score_conversion_fn(input_scores)
with self.test_session() as sess:
calibrated_scores = sess.run(outputs)
expected_calibrated_scores = sess.run(tf.constant([0.5, 0.5], tf.float32))
self.assertAllClose(calibrated_scores, expected_calibrated_scores)
if __name__ == '__main__':
  tf.test.main()
@@ -191,10 +191,10 @@ def build(preprocessor_step_config):
    pad_color = config.pad_color or None
    if pad_color:
      if len(pad_color) != 3:
        tf.logging.warn('pad_color should have 3 elements (RGB) if set!')
      pad_color = tf.to_float([x for x in config.pad_color])
    return (preprocessor.random_pad_image,
            {
                'min_image_size': min_image_size,
@@ -202,6 +202,25 @@ def build(preprocessor_step_config):
                'pad_color': pad_color,
            })
if step_type == 'random_absolute_pad_image':
config = preprocessor_step_config.random_absolute_pad_image
max_height_padding = config.max_height_padding or 1
max_width_padding = config.max_width_padding or 1
pad_color = config.pad_color or None
if pad_color:
if len(pad_color) != 3:
tf.logging.warn('pad_color should have 3 elements (RGB) if set!')
pad_color = tf.to_float([x for x in config.pad_color])
return (preprocessor.random_absolute_pad_image,
{
'max_height_padding': max_height_padding,
'max_width_padding': max_width_padding,
'pad_color': pad_color,
})
  if step_type == 'random_crop_pad_image':
    config = preprocessor_step_config.random_crop_pad_image
    min_padded_size_ratio = config.min_padded_size_ratio
@@ -210,9 +229,13 @@ def build(preprocessor_step_config):
    max_padded_size_ratio = config.max_padded_size_ratio
    if max_padded_size_ratio and len(max_padded_size_ratio) != 2:
      raise ValueError('max_padded_size_ratio should have 2 elements if set!')
    pad_color = config.pad_color or None
    if pad_color:
      if len(pad_color) != 3:
        tf.logging.warn('pad_color should have 3 elements (RGB) if set!')
      pad_color = tf.to_float([x for x in config.pad_color])
    kwargs = {
        'min_object_covered': config.min_object_covered,
        'aspect_ratio_range': (config.min_aspect_ratio,
@@ -221,13 +244,12 @@ def build(preprocessor_step_config):
        'overlap_thresh': config.overlap_thresh,
        'clip_boxes': config.clip_boxes,
        'random_coef': config.random_coef,
        'pad_color': pad_color,
    }
    if min_padded_size_ratio:
      kwargs['min_padded_size_ratio'] = tuple(min_padded_size_ratio)
    if max_padded_size_ratio:
      kwargs['max_padded_size_ratio'] = tuple(max_padded_size_ratio)
    return (preprocessor.random_crop_pad_image, kwargs)
  if step_type == 'random_resize_method':
@@ -247,6 +269,13 @@ def build(preprocessor_step_config):
        'method': method
    })
if step_type == 'random_self_concat_image':
config = preprocessor_step_config.random_self_concat_image
return (preprocessor.random_self_concat_image, {
'concat_vertical_probability': config.concat_vertical_probability,
'concat_horizontal_probability': config.concat_horizontal_probability
})
  if step_type == 'ssd_random_crop':
    config = preprocessor_step_config.ssd_random_crop
    if config.operations:
...
@@ -254,6 +254,23 @@ class PreprocessorBuilderTest(tf.test.TestCase):
        'pad_color': None,
    })
def test_build_random_absolute_pad_image(self):
preprocessor_text_proto = """
random_absolute_pad_image {
max_height_padding: 50
max_width_padding: 100
}
"""
preprocessor_proto = preprocessor_pb2.PreprocessingStep()
text_format.Merge(preprocessor_text_proto, preprocessor_proto)
function, args = preprocessor_builder.build(preprocessor_proto)
self.assertEqual(function, preprocessor.random_absolute_pad_image)
self.assertEqual(args, {
'max_height_padding': 50,
'max_width_padding': 100,
'pad_color': None,
})
  def test_build_random_crop_pad_image(self):
    preprocessor_text_proto = """
    random_crop_pad_image {
@@ -278,6 +295,7 @@ class PreprocessorBuilderTest(tf.test.TestCase):
        'overlap_thresh': 0.5,
        'clip_boxes': False,
        'random_coef': 0.125,
'pad_color': None,
    })

  def test_build_random_crop_pad_image_with_optional_parameters(self):
@@ -295,9 +313,6 @@ class PreprocessorBuilderTest(tf.test.TestCase):
      min_padded_size_ratio: 0.75
      max_padded_size_ratio: 0.5
      max_padded_size_ratio: 0.75
    }
    """
    preprocessor_proto = preprocessor_pb2.PreprocessingStep()
@@ -313,7 +328,7 @@ class PreprocessorBuilderTest(tf.test.TestCase):
        'random_coef': 0.125,
        'min_padded_size_ratio': (0.5, 0.75),
        'max_padded_size_ratio': (0.5, 0.75),
        'pad_color': None,
    })

  def test_build_random_crop_to_aspect_ratio(self):
@@ -409,6 +424,20 @@ class PreprocessorBuilderTest(tf.test.TestCase):
    self.assertEqual(function, preprocessor.subtract_channel_mean)
    self.assertEqual(args, {'means': [1.0, 2.0, 3.0]})
def test_random_self_concat_image(self):
preprocessor_text_proto = """
random_self_concat_image {
concat_vertical_probability: 0.5
concat_horizontal_probability: 0.25
}
"""
preprocessor_proto = preprocessor_pb2.PreprocessingStep()
text_format.Merge(preprocessor_text_proto, preprocessor_proto)
function, args = preprocessor_builder.build(preprocessor_proto)
self.assertEqual(function, preprocessor.random_self_concat_image)
self.assertEqual(args, {'concat_vertical_probability': 0.5,
'concat_horizontal_probability': 0.25})
  def test_build_ssd_random_crop(self):
    preprocessor_text_proto = """
    ssd_random_crop {
...
@@ -53,7 +53,7 @@ def build(region_similarity_calculator_config):
    return region_similarity_calculator.NegSqDistSimilarity()
  if similarity_calculator == 'thresholded_iou_similarity':
    return region_similarity_calculator.ThresholdedIouSimilarity(
        region_similarity_calculator_config.thresholded_iou_similarity
        .iou_threshold)
  raise ValueError('Unknown region similarity calculator.')
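The `iou_threshold` field renamed in the hunk above refers to intersection-over-union between boxes. As a hedged illustration (not the library's `ThresholdedIouSimilarity` op), plain IoU over `[ymin, xmin, ymax, xmax]` boxes can be computed like this, with a thresholded variant that assigns a low constant when the IoU falls below the threshold:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [ymin, xmin, ymax, xmax] boxes."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def thresholded_iou(box_a, box_b, iou_threshold=0.5, unmatched_value=-1.0):
    # Sketch of a thresholded similarity: keep IoU above the threshold,
    # otherwise return a fixed low value (unmatched_value is an assumption).
    score = iou(box_a, box_b)
    return score if score >= iou_threshold else unmatched_value
```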
@@ -61,8 +61,8 @@ class Loss(object):
        shouldn't be factored into the loss.
      losses_mask: A [batch] boolean tensor that indicates whether losses should
        be applied to individual images in the batch. For elements that
        are False, corresponding prediction, target, and weight tensors will not
        contribute to loss computation. If None, no filtering will take place
        prior to loss computation.
      scope: Op scope name. Defaults to 'Loss' if None.
      **params: Additional keyword arguments for specific implementations of
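The `losses_mask` semantics described above — False entries in a `[batch]` boolean mask simply do not contribute to the loss — can be sketched with a pure-Python stand-in for the tensor op (the helper name is hypothetical):

```python
def masked_loss_sum(per_image_losses, losses_mask=None):
    """Sums per-image losses, skipping entries whose mask value is False.

    If losses_mask is None, no filtering takes place, matching the
    docstring above.
    """
    if losses_mask is None:
        return sum(per_image_losses)
    return sum(l for l, keep in zip(per_image_losses, losses_mask) if keep)
```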
...
@@ -54,15 +54,14 @@ By default, DetectionModels produce bounding box detections; However, we support
a handful of auxiliary annotations associated with each bounding box, namely,
instance masks and keypoints.
"""
import abc

from object_detection.core import standard_fields as fields


class DetectionModel(object):
  """Abstract base class for detection models."""
  __metaclass__ = abc.ABCMeta
  def __init__(self, num_classes):
    """Constructor.
@@ -112,7 +111,7 @@ class DetectionModel(object):
    """
    return field in self._groundtruth_lists

  @abc.abstractmethod
  def preprocess(self, inputs):
    """Input preprocessing.
@@ -155,7 +154,7 @@ class DetectionModel(object):
    """
    pass

  @abc.abstractmethod
  def predict(self, preprocessed_inputs, true_image_shapes):
    """Predict prediction tensors from inputs tensor.
@@ -175,10 +174,14 @@ class DetectionModel(object):
    """
    pass

  @abc.abstractmethod
  def postprocess(self, prediction_dict, true_image_shapes, **params):
    """Convert predicted output tensors to final detections.
This stage typically performs a few things such as
* Non-Max Suppression to remove overlapping detection boxes.
* Score conversion and background class removal.
    Outputs adhere to the following conventions:
    * Classes are integers in [0, num_classes); background classes are removed
      and the first non-background class is mapped to 0. If the model produces
@@ -212,10 +215,20 @@ class DetectionModel(object):
        (optional)
      keypoints: [batch, max_detections, num_keypoints, 2] (optional)
      num_detections: [batch]
In addition to the above fields this stage also outputs the following
raw tensors:
raw_detection_boxes: [batch, total_detections, 4] tensor containing
all detection boxes from `prediction_dict` in the format
[ymin, xmin, ymax, xmax] and normalized co-ordinates.
raw_detection_scores: [batch, total_detections,
num_classes_with_background] tensor of class score logits for
raw detection boxes.
""" """
pass pass
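The postprocess docstring above names non-max suppression as the main step of this stage. A minimal greedy NMS sketch in pure Python (not the TF op, and not this method's implementation) makes the mechanism concrete: pick the highest-scoring box, then drop any remaining box whose IoU with an already-kept box exceeds the threshold.

```python
def nms(boxes, scores, iou_thresh=0.5, max_output=100):
    """Greedy NMS over [ymin, xmin, ymax, xmax] boxes; returns kept indices."""
    def iou(a, b):
        ymin, xmin = max(a[0], b[0]), max(a[1], b[1])
        ymax, xmax = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
        union = ((a[2] - a[0]) * (a[3] - a[1]) +
                 (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union else 0.0

    # Visit boxes in descending score order; keep a box only if it does not
    # overlap any already-kept box above the IoU threshold.
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
        if len(keep) == max_output:
            break
    return keep
```

On box data shaped like the NMS test fixtures below, two heavily overlapping boxes collapse to the higher-scoring one.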
  @abc.abstractmethod
  def loss(self, prediction_dict, true_image_shapes):
    """Compute scalar loss tensors with respect to provided groundtruth.
@@ -296,7 +309,7 @@ class DetectionModel(object):
    self._groundtruth_lists[
        fields.InputDataFields.is_annotated] = is_annotated_list

  @abc.abstractmethod
  def regularization_losses(self):
    """Returns a list of regularization losses for this model.
...@@ -308,7 +321,7 @@ class DetectionModel(object):
""" """
pass pass
@abstractmethod @abc.abstractmethod
def restore_map(self, fine_tune_checkpoint_type='detection'): def restore_map(self, fine_tune_checkpoint_type='detection'):
"""Returns a map of variables to load from a foreign checkpoint. """Returns a map of variables to load from a foreign checkpoint.
...@@ -332,7 +345,7 @@ class DetectionModel(object):
""" """
pass pass
@abstractmethod @abc.abstractmethod
def updates(self): def updates(self):
"""Returns a list of update operators for this model. """Returns a list of update operators for this model.
......
...@@ -57,7 +57,53 @@ class MulticlassNonMaxSuppressionTest(test_case.TestCase):
self.assertAllClose(nms_scores_output, exp_nms_scores)
self.assertAllClose(nms_classes_output, exp_nms_classes)
def test_multiclass_nms_select_with_shared_boxes_pad_to_max_output_size(self):
boxes = np.array([[[0, 0, 1, 1]],
[[0, 0.1, 1, 1.1]],
[[0, -0.1, 1, 0.9]],
[[0, 10, 1, 11]],
[[0, 10.1, 1, 11.1]],
[[0, 100, 1, 101]],
[[0, 1000, 1, 1002]],
[[0, 1000, 1, 1002.1]]], np.float32)
scores = np.array([[.9, 0.01], [.75, 0.05],
[.6, 0.01], [.95, 0],
[.5, 0.01], [.3, 0.01],
[.01, .85], [.01, .5]], np.float32)
score_thresh = 0.1
iou_thresh = .5
max_size_per_class = 4
max_output_size = 5
exp_nms_corners = [[0, 10, 1, 11],
[0, 0, 1, 1],
[0, 1000, 1, 1002],
[0, 100, 1, 101]]
exp_nms_scores = [.95, .9, .85, .3]
exp_nms_classes = [0, 0, 1, 0]
def graph_fn(boxes, scores):
nms, num_valid_nms_boxes = post_processing.multiclass_non_max_suppression(
boxes,
scores,
score_thresh,
iou_thresh,
max_size_per_class,
max_total_size=max_output_size,
pad_to_max_output_size=True)
return [nms.get(), nms.get_field(fields.BoxListFields.scores),
nms.get_field(fields.BoxListFields.classes), num_valid_nms_boxes]
[nms_corners_output, nms_scores_output, nms_classes_output,
num_valid_nms_boxes] = self.execute(graph_fn, [boxes, scores])
self.assertEqual(num_valid_nms_boxes, 4)
self.assertAllClose(nms_corners_output[0:num_valid_nms_boxes],
exp_nms_corners)
self.assertAllClose(nms_scores_output[0:num_valid_nms_boxes],
exp_nms_scores)
self.assertAllClose(nms_classes_output[0:num_valid_nms_boxes],
exp_nms_classes)
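As a hand check on the expected values in this test: the two highest-scoring class-0 boxes that share an image region overlap heavily, which is why only one of them survives suppression at `iou_thresh = 0.5`. The arithmetic below is a worked example, not part of the test file.

```python
# Boxes are [ymin, xmin, ymax, xmax]; both have area 1.0.
a = (0.0, 0.0, 1.0, 1.0)    # score .9  -> kept
b = (0.0, 0.1, 1.0, 1.1)    # score .75 -> suppressed by a
# Intersection: y in [0, 1] (height 1.0), x in [0.1, 1.0] (width 0.9).
inter = 1.0 * 0.9
union = 1.0 + 1.0 - inter    # 1.1
iou = inter / union          # ~0.818, above iou_thresh of 0.5
print(round(iou, 3))         # -> 0.818
```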
def test_multiclass_nms_select_with_shared_boxes_given_keypoints(self):
boxes = tf.constant([[[0, 0, 1, 1]],
...