Commit 80444539 authored by Zhuoran Liu, committed by pkulzc

Add TPU SavedModel exporter and refactor OD code (#6737)

247226201  by ronnyvotel:

    Updating the visualization tools to accept unique_ids for color coding.

--
247067830  by Zhichao Lu:

    Add box_encodings_clip_range options for the convolutional box predictor (for TPU compatibility).
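    The clip range amounts to elementwise clamping of the predicted box encodings before decoding, keeping values inside a range that low-precision TPU arithmetic handles well. A rough sketch of the idea (function name and list-based signature are illustrative, not the actual predictor API):

```python
def clip_box_encodings(encodings, clip_min, clip_max):
    """Clamp every box-encoding value into [clip_min, clip_max].

    Illustrative sketch only; the real option operates on tensors inside
    the convolutional box predictor.
    """
    return [[min(max(v, clip_min), clip_max) for v in encoding]
            for encoding in encodings]
```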

--
246888475  by Zhichao Lu:

    Remove unused _update_eval_steps function.

--
246163259  by lzc:

    Add a gather op that can handle ignore indices (which are "-1"s in this case).
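    The behavior can be sketched in plain Python: a gather where an index of -1 contributes a padding row instead of indexing into the input (names and the list-based signature are assumptions; the real op works on tensors):

```python
def gather_with_ignored_indices(params, indices, pad_value=0.0):
    """Gather rows of `params` by index; an index of -1 yields a padding row.

    Hypothetical sketch of the described op, not the actual implementation.
    """
    row_len = len(params[0]) if params else 0
    out = []
    for i in indices:
        if i == -1:
            out.append([pad_value] * row_len)  # ignored index -> padding row
        else:
            out.append(list(params[i]))
    return out
```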

--
246084944  by Zhichao Lu:

    Keras based implementation for SSD + MobilenetV2 + FPN.

--
245544227  by rathodv:

    Add batch_get_targets method to target assigner module to gather any groundtruth tensors based on the results of target assigner.

--
245540854  by rathodv:

    Update target assigner to return match tensor instead of a match object.

--
245434441  by Zhichao Lu:

    Add README for tpu_exporters package.

--
245381834  by lzc:

    Internal change.

--
245298983  by Zhichao Lu:

    Add conditional_shape_resizer to config_util

--
245134666  by Zhichao Lu:

    Adds ConditionalShapeResizer to the ImageResizer proto, which enables resizing only if the input image height or width is greater or smaller than a given size. Also enables specifying the resize method in the resize_to_{max, min}_dimension methods.

--
245093975  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (faster-rcnn)

--
245072421  by Zhichao Lu:

    Adds a new image resizing method "resize_to_max_dimension" which resizes images only if a dimension is greater than the maximum desired value while maintaining aspect ratio.
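    The dimension arithmetic behind this rule can be sketched as follows (a minimal pure-Python sketch of the resizing rule described above; the function name and scalar signature are assumptions, the real method operates on image tensors):

```python
def resize_to_max_dimension(height, width, max_dimension):
    """Return (new_height, new_width) so that the larger side equals
    max_dimension, preserving aspect ratio; unchanged if already within it."""
    largest = max(height, width)
    if largest <= max_dimension:
        return height, width  # no resize needed
    scale = max_dimension / largest
    return int(round(height * scale)), int(round(width * scale))
```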

--
244946998  by lzc:

    Internal Changes.

--
244943693  by Zhichao Lu:

    Add a custom config to mobilenet v2 that makes it more detection friendly.

--
244754158  by derekjchow:

    Internal change.

--
244699875  by Zhichao Lu:

    Add check_range=False to box_list_ops.to_normalized_coordinates when training
    for instance segmentation.  This is consistent with other calls when training
    for object detection.  There could be wrongly annotated boxes in the dataset.
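    A minimal sketch of what normalization with and without the range check looks like (list-based signature is illustrative; the real box_list_ops.to_normalized_coordinates works on BoxList tensors):

```python
def to_normalized_coordinates(boxes, height, width, check_range=True):
    """Convert [ymin, xmin, ymax, xmax] pixel boxes to [0, 1] coordinates.

    With check_range=True an out-of-image box raises; disabling the check
    tolerates mildly mis-annotated boxes at training time.
    """
    normalized = [[y0 / height, x0 / width, y1 / height, x1 / width]
                  for y0, x0, y1, x1 in boxes]
    if check_range:
        for box in normalized:
            if any(v < 0.0 or v > 1.0 for v in box):
                raise ValueError('box outside [0, 1]: %r' % box)
    return normalized
```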

--
244507425  by rathodv:

    Support bfloat16 for ssd models.

--
244399982  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd)

--
244209387  by Zhichao Lu:

    Internal change.

--
243922296  by rathodv:

    Change `raw_detection_scores` to contain softmax/sigmoid scores (not logits) for `raw_detection_boxes`.
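    For the softmax case, the logits-to-scores conversion per detection is the standard numerically stable softmax (sketch only; the actual pipeline applies it over batched tensors):

```python
import math

def softmax_scores(class_logits):
    """Numerically stable softmax over one detection's class logits."""
    peak = max(class_logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(l - peak) for l in class_logits]
    total = sum(exps)
    return [e / total for e in exps]
```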

--
243883978  by Zhichao Lu:

    Add a sample fully conv config.

--
243369455  by Zhichao Lu:

    Fix regularization loss gap in Keras and Slim.

--
243292002  by lzc:

    Internal changes.

--
243097958  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd model)

--
243007177  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd model)

--
242776550  by Zhichao Lu:

    Make object detection pre-processing run on GPU.  tf.map_fn() uses
    TensorArrayV3 ops, which have no int32 GPU implementation.  Cast to int64,
    then cast back to int32.

--
242723128  by Zhichao Lu:

    Use sorted dictionaries for additional heads in non_max_suppression to ensure a deterministic tensor order.
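    The fix boils down to iterating head names in sorted order so every run sees the same tensor ordering regardless of dict construction order (a sketch; names are illustrative):

```python
def heads_in_stable_order(additional_heads):
    """Return (head_name, value) pairs sorted by name for a deterministic
    ordering of the corresponding tensors."""
    return [(name, additional_heads[name]) for name in sorted(additional_heads)]
```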

--
242495311  by Zhichao Lu:

    Update documentation to reflect new TFLite examples repo location

--
242230527  by Zhichao Lu:

    Fix Dropout bugs for WeightSharedConvolutionalBoxPred.

--
242226573  by Zhichao Lu:

    Create Keras-based WeightSharedConvolutionalBoxPredictor.

--
241806074  by Zhichao Lu:

    Add inference in unit tests of TFX OD template.

--
241641498  by lzc:

    Internal change.

--
241637481  by Zhichao Lu:

    matmul_crop_and_resize(): Switch to dynamic shaping, so that not all dimensions are required to be known.

--
241429980  by Zhichao Lu:

    Internal change

--
241167237  by Zhichao Lu:

    Adds a faster_rcnn_inception_resnet_v2 Keras feature extractor, and updates the model builder to construct it.

--
241088616  by Zhichao Lu:

    Make it compatible with different dtype, e.g. float32, bfloat16, etc.

--
240897364  by lzc:

    Use image_np_expanded in object_detection_tutorial notebook.

--
240890393  by Zhichao Lu:

    Disable multicore inference for the OD template as it's not yet compatible.

--
240352168  by Zhichao Lu:

    Make SSDResnetV1FpnFeatureExtractor not protected to allow inheritance.

--
240351470  by lzc:

    Internal change.

--
239878928  by Zhichao Lu:

    Defines Keras box predictors for Faster RCNN and RFCN

--
239872103  by Zhichao Lu:

    Delete duplicated inputs in test.

--
239714273  by Zhichao Lu:

    Adding scope variable to all class heads

--
239698643  by Zhichao Lu:

    Create FPN feature extractor for object detection.

--
239696657  by Zhichao Lu:

    Internal Change.

--
239299404  by Zhichao Lu:

    Allows the faster rcnn meta-architecture to support Keras subcomponents

--
238502595  by Zhichao Lu:

    Lay the groundwork for symmetric quantization.
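    Symmetric quantization uses a single positive scale with the range centered on zero (no zero point), so q = round(v / scale). A general-technique sketch, not this codebase's quantization API:

```python
def symmetric_quantize(values, num_bits=8):
    """Quantize floats symmetrically around zero into num_bits signed ints.

    Returns (quantized values, scale). Sketch of the general technique.
    """
    qmax = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8 bits
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak else 1.0  # avoid dividing by zero
    quantized = [int(round(v / scale)) for v in values]
    return quantized, scale
```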

--
238496885  by Zhichao Lu:

    Add flexible_grid_anchor_generator

--
238138727  by lzc:

    Remove dead code.

    _USE_C_SHAPES has been forced True in TensorFlow releases since
    TensorFlow 1.9
    (https://github.com/tensorflow/tensorflow/commit/1d74a69443f741e69f9f52cb6bc2940b4d4ae3b7)

--
238123936  by rathodv:

    Add num_matched_groundtruth summary to target assigner in SSD.

--
238103345  by ronnyvotel:

    Raising error if input file pattern does not match any files.
    Also printing the number of evaluation images for coco metrics.

--
238044081  by Zhichao Lu:

    Fix docstring to state the correct dimensionality of `class_predictions_with_background`.

--
237920279  by Zhichao Lu:

    [XLA] Rework debug flags for dumping HLO.

    The following flags (usually passed via the XLA_FLAGS envvar) are removed:

      xla_dump_computations_to
      xla_dump_executions_to
      xla_dump_ir_to
      xla_dump_optimized_hlo_proto_to
      xla_dump_per_pass_hlo_proto_to
      xla_dump_unoptimized_hlo_proto_to
      xla_generate_hlo_graph
      xla_generate_hlo_text_to
      xla_hlo_dump_as_html
      xla_hlo_graph_path
      xla_log_hlo_text

    The following new flags are added:

      xla_dump_to
      xla_dump_hlo_module_re
      xla_dump_hlo_pass_re
      xla_dump_hlo_as_text
      xla_dump_hlo_as_proto
      xla_dump_hlo_as_dot
      xla_dump_hlo_as_url
      xla_dump_hlo_as_html
      xla_dump_ir
      xla_dump_hlo_snapshots

    The default is not to dump anything at all, but as soon as some dumping flag is
    specified, we enable the following defaults (most of which can be overridden).

     * dump to stdout (overridden by --xla_dump_to)
     * dump HLO modules at the very beginning and end of the optimization pipeline
     * don't dump between any HLO passes (overridden by --xla_dump_hlo_pass_re)
     * dump all HLO modules (overridden by --xla_dump_hlo_module_re)
     * dump in textual format (overridden by
       --xla_dump_hlo_as_{text,proto,dot,url,html}).

    For example, to dump optimized and unoptimized HLO text and protos to /tmp/foo,
    pass

      --xla_dump_to=/tmp/foo --xla_dump_hlo_as_text --xla_dump_hlo_as_proto

    For details on these flags' meanings, see xla.proto.

    The intent of this change is to make dumping both simpler to use and more
    powerful.

    For example:

     * Previously there was no way to dump the HLO module during the pass pipeline
       in HLO text format; the only option was --xla_dump_per_pass_hlo_proto_to,
       dumped in proto format.

       Now this is --xla_dump_hlo_pass_re=.* --xla_dump_hlo_as_text.  (In fact, the
       second flag is not necessary in this case, as dumping as text is the
       default.)

     * Previously there was no way to dump HLO as a graph before and after
       compilation; the only option was --xla_generate_hlo_graph, which would dump
       before/after every pass.

       Now this is --xla_dump_hlo_as_{dot,url,html} (depending on what format you
       want the graph in).

     * Previously, there was no coordination between the filenames written by the
       various flags, so info about one module might be dumped with various
       filename prefixes.  Now the filenames are consistent and all dumps from a
       particular module are next to each other.

    If you only specify some of these flags, we try to figure out what you wanted.
    For example:

     * --xla_dump_to implies --xla_dump_hlo_as_text unless you specify some
       other --xla_dump_hlo_as_* flag.

     * --xla_dump_hlo_as_text or --xla_dump_ir implies dumping to stdout unless you
       specify a different --xla_dump_to directory.  You can explicitly dump to
       stdout with --xla_dump_to=-.

    As part of this change, I simplified the debugging code in the HLO passes for
    dumping HLO modules.  Previously, many tests explicitly VLOG'ed the HLO module
    before, after, and sometimes during the pass.  I removed these VLOGs.  If you
    want dumps before/during/after an HLO pass, use --xla_dump_hlo_pass_re=<pass_name>.

--
237510043  by lzc:

    Internal Change.

--
237469515  by Zhichao Lu:

    Parameterize model_builder.build in inputs.py.

--
237293511  by rathodv:

    Remove multiclass_scores from tensor_dict in transform_data_fn always.

--
237260333  by ronnyvotel:

    Updating faster_rcnn_meta_arch to define prediction dictionary fields that are batched.

--

PiperOrigin-RevId: 247226201
parent c4f34e58
@@ -65,6 +65,8 @@ Extras:
* <a href='g3doc/detection_model_zoo.md'>Tensorflow detection model zoo</a><br>
* <a href='g3doc/exporting_models.md'>
  Exporting a trained model for inference</a><br>
* <a href='g3doc/tpu_exporters.md'>
  Exporting a trained model for TPU inference</a><br>
* <a href='g3doc/defining_your_own_model.md'>
  Defining your own model architecture</a><br>
* <a href='g3doc/using_your_own_dataset.md'>
...
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Generates grid anchors on the fly corresponding to multiple CNN layers."""
import tensorflow as tf
from object_detection.anchor_generators import grid_anchor_generator
from object_detection.core import anchor_generator
from object_detection.core import box_list_ops
class FlexibleGridAnchorGenerator(anchor_generator.AnchorGenerator):
"""Generate a grid of anchors for multiple CNN layers of different scale."""
def __init__(self, base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=True):
"""Constructs a FlexibleGridAnchorGenerator.
This generator is more flexible than the multiple_grid_anchor_generator
and multiscale_grid_anchor_generator: it can generate any of the anchors
they can generate, plus additional anchor configurations. In particular,
it allows the explicit specification of scales and aspect ratios at each
layer without assuming any relationship between scales and aspect ratios
across layers.
Args:
base_sizes: list of tuples of anchor base sizes. For example, setting
base_sizes=[(1, 2, 3), (4, 5)] means that we want 3 anchors at each
grid point on the first layer with the base sizes of 1, 2, and 3, and 2
anchors at each grid point on the second layer with the base sizes of
4 and 5.
aspect_ratios: list or tuple of aspect ratios. For example, setting
aspect_ratios=[(1.0, 2.0, 0.5), (1.0, 2.0)] means that we want 3 anchors
at each grid point on the first layer with aspect ratios of 1.0, 2.0,
and 0.5, and 2 anchors at each grid point on the second layer with
aspect ratios of 1.0 and 2.0.
anchor_strides: list of pairs of strides in pixels (in y and x directions
respectively). For example, setting anchor_strides=[(25, 25), (50, 50)]
means that we want the anchors corresponding to the first layer to be
strided by 25 pixels and those in the second layer to be strided by 50
pixels in both y and x directions.
anchor_offsets: list of pairs of offsets in pixels (in y and x directions
respectively). The offset specifies where we want the center of the
(0, 0)-th anchor to lie for each layer. For example, setting
anchor_offsets=[(10, 10), (20, 20)] means that we want the
(0, 0)-th anchor of the first layer to lie at (10, 10) in pixel space
and likewise that we want the (0, 0)-th anchor of the second layer to
lie at (20, 20) in pixel space.
normalize_coordinates: whether to produce anchors in normalized
coordinates. (defaults to True).
"""
self._base_sizes = base_sizes
self._aspect_ratios = aspect_ratios
self._anchor_strides = anchor_strides
self._anchor_offsets = anchor_offsets
self._normalize_coordinates = normalize_coordinates
def name_scope(self):
return 'FlexibleGridAnchorGenerator'
def num_anchors_per_location(self):
"""Returns the number of anchors per spatial location.
Returns:
a list of integers, one for each expected feature map to be passed to
the Generate function.
"""
return [len(size) for size in self._base_sizes]
def _generate(self, feature_map_shape_list, im_height=1, im_width=1):
"""Generates a collection of bounding boxes to be used as anchors.
Currently we require the input image shape to be statically defined. That
is, im_height and im_width should be integers rather than tensors.
Args:
feature_map_shape_list: list of pairs of convnet layer resolutions in the
format [(height_0, width_0), (height_1, width_1), ...]. For example,
setting feature_map_shape_list=[(8, 8), (7, 7)] asks for anchors that
correspond to an 8x8 layer followed by a 7x7 layer.
im_height: the height of the image to generate the grid for. If both
im_height and im_width are 1, anchors can only be generated in
absolute coordinates.
im_width: the width of the image to generate the grid for. If both
im_height and im_width are 1, anchors can only be generated in
absolute coordinates.
Returns:
boxes_list: a list of BoxLists each holding anchor boxes corresponding to
the input feature map shapes.
Raises:
ValueError: if im_height and im_width are 1, but normalized coordinates
were requested.
"""
anchor_grid_list = []
for (feat_shape, base_sizes, aspect_ratios, anchor_stride, anchor_offset
) in zip(feature_map_shape_list, self._base_sizes, self._aspect_ratios,
self._anchor_strides, self._anchor_offsets):
anchor_grid = grid_anchor_generator.tile_anchors(
feat_shape[0],
feat_shape[1],
tf.to_float(tf.convert_to_tensor(base_sizes)),
tf.to_float(tf.convert_to_tensor(aspect_ratios)),
tf.constant([1.0, 1.0]),
tf.to_float(tf.convert_to_tensor(anchor_stride)),
tf.to_float(tf.convert_to_tensor(anchor_offset)))
num_anchors = anchor_grid.num_boxes_static()
if num_anchors is None:
num_anchors = anchor_grid.num_boxes()
anchor_indices = tf.zeros([num_anchors])
anchor_grid.add_field('feature_map_index', anchor_indices)
if self._normalize_coordinates:
if im_height == 1 or im_width == 1:
raise ValueError(
'Normalized coordinates were requested upon construction of the '
'FlexibleGridAnchorGenerator, but a subsequent call to '
'generate did not supply dimension information.')
anchor_grid = box_list_ops.to_normalized_coordinates(
anchor_grid, im_height, im_width, check_range=False)
anchor_grid_list.append(anchor_grid)
return anchor_grid_list
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for anchor_generators.flexible_grid_anchor_generator."""
import numpy as np
import tensorflow as tf
from object_detection.anchor_generators import flexible_grid_anchor_generator as fg
from object_detection.utils import test_case
class FlexibleGridAnchorGeneratorTest(test_case.TestCase):
def test_construct_single_anchor(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = 64
im_width = 64
feature_map_shape_list = [(2, 2)]
exp_anchor_corners = [[-48, -48, 80, 80],
[-48, -16, 80, 112],
[-16, -48, 112, 80],
[-16, -16, 112, 112]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_unit_dimensions(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(32.0,)]
aspect_ratios = [(1.0,)]
im_height = 1
im_width = 1
feature_map_shape_list = [(2, 2)]
# Positive offsets are produced.
exp_anchor_corners = [[0, 0, 32, 32],
[0, 32, 32, 64],
[32, 0, 64, 32],
[32, 32, 64, 64]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_normalized_anchors_fails_with_unit_dimensions(self):
anchor_generator = fg.FlexibleGridAnchorGenerator(
[(32.0,)], [(1.0,)], [(32, 32),], [(16, 16),],
normalize_coordinates=True)
with self.assertRaisesRegexp(ValueError, 'Normalized coordinates'):
anchor_generator.generate(
feature_map_shape_list=[(2, 2)], im_height=1, im_width=1)
def test_construct_single_anchor_in_normalized_coordinates(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = 64
im_width = 128
feature_map_shape_list = [(2, 2)]
exp_anchor_corners = [[-48./64, -48./128, 80./64, 80./128],
[-48./64, -16./128, 80./64, 112./128],
[-16./64, -48./128, 112./64, 80./128],
[-16./64, -16./128, 112./64, 112./128]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=True)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_num_anchors_per_location(self):
anchor_strides = [(32, 32), (64, 64)]
anchor_offsets = [(16, 16), (32, 32)]
base_sizes = [(32.0, 64.0, 96.0, 32.0, 64.0, 96.0),
(64.0, 128.0, 172.0, 64.0, 128.0, 172.0)]
aspect_ratios = [(1.0, 1.0, 1.0, 2.0, 2.0, 2.0),
(1.0, 1.0, 1.0, 2.0, 2.0, 2.0)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
self.assertEqual(anchor_generator.num_anchors_per_location(), [6, 6])
def test_construct_single_anchor_dynamic_size(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(0, 0),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = tf.constant(64)
im_width = tf.constant(64)
feature_map_shape_list = [(2, 2)]
# Zero offsets are used.
exp_anchor_corners = [[-64, -64, 64, 64],
[-64, -32, 64, 96],
[-32, -64, 96, 64],
[-32, -32, 96, 96]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_with_odd_input_dimension(self):
def graph_fn():
anchor_strides = [(32, 32),]
anchor_offsets = [(0, 0),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = 65
im_width = 65
feature_map_shape_list = [(3, 3)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
return (anchor_corners,)
anchor_corners_out = self.execute(graph_fn, [])
exp_anchor_corners = [[-64, -64, 64, 64],
[-64, -32, 64, 96],
[-64, 0, 64, 128],
[-32, -64, 96, 64],
[-32, -32, 96, 96],
[-32, 0, 96, 128],
[0, -64, 128, 64],
[0, -32, 128, 96],
[0, 0, 128, 128]]
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_on_two_feature_maps(self):
def graph_fn():
anchor_strides = [(32, 32), (64, 64)]
anchor_offsets = [(16, 16), (32, 32)]
base_sizes = [(128.0,), (256.0,)]
aspect_ratios = [(1.0,), (1.0,)]
im_height = 64
im_width = 64
feature_map_shape_list = [(2, 2), (1, 1)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(feature_map_shape_list,
im_height=im_height,
im_width=im_width)
anchor_corners = [anchors.get() for anchors in anchors_list]
return anchor_corners
anchor_corners_out = np.concatenate(self.execute(graph_fn, []), axis=0)
exp_anchor_corners = [[-48, -48, 80, 80],
[-48, -16, 80, 112],
[-16, -48, 112, 80],
[-16, -16, 112, 112],
[-96, -96, 160, 160]]
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_with_two_scales_per_octave(self):
def graph_fn():
anchor_strides = [(64, 64),]
anchor_offsets = [(32, 32),]
base_sizes = [(256.0, 362.03867)]
aspect_ratios = [(1.0, 1.0)]
im_height = 64
im_width = 64
feature_map_shape_list = [(1, 1)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(feature_map_shape_list,
im_height=im_height,
im_width=im_width)
anchor_corners = [anchors.get() for anchors in anchors_list]
return anchor_corners
# There are 2 sets of anchors in this configuration. The order is:
# [[2**0.0 intermediate scale + 1.0 aspect],
#  [2**0.5 intermediate scale + 1.0 aspect]]
exp_anchor_corners = [[-96., -96., 160., 160.],
[-149.0193, -149.0193, 213.0193, 213.0193]]
anchor_corners_out = self.execute(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_with_two_scales_per_octave_and_aspect(self):
def graph_fn():
anchor_strides = [(64, 64),]
anchor_offsets = [(32, 32),]
base_sizes = [(256.0, 362.03867, 256.0, 362.03867)]
aspect_ratios = [(1.0, 1.0, 2.0, 2.0)]
im_height = 64
im_width = 64
feature_map_shape_list = [(1, 1)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(feature_map_shape_list,
im_height=im_height,
im_width=im_width)
anchor_corners = [anchors.get() for anchors in anchors_list]
return anchor_corners
# There are 4 sets of anchors in this configuration. The order is:
# [[2**0.0 intermediate scale + 1.0 aspect],
# [2**0.5 intermediate scale + 1.0 aspect],
# [2**0.0 intermediate scale + 2.0 aspect],
# [2**0.5 intermediate scale + 2.0 aspect]]
exp_anchor_corners = [[-96., -96., 160., 160.],
[-149.0193, -149.0193, 213.0193, 213.0193],
[-58.50967, -149.0193, 122.50967, 213.0193],
[-96., -224., 160., 288.]]
anchor_corners_out = self.execute(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchors_on_feature_maps_with_dynamic_shape(self):
def graph_fn(feature_map1_height, feature_map1_width, feature_map2_height,
feature_map2_width):
anchor_strides = [(32, 32), (64, 64)]
anchor_offsets = [(16, 16), (32, 32)]
base_sizes = [(128.0,), (256.0,)]
aspect_ratios = [(1.0,), (1.0,)]
im_height = 64
im_width = 64
feature_map_shape_list = [(feature_map1_height, feature_map1_width),
(feature_map2_height, feature_map2_width)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(feature_map_shape_list,
im_height=im_height,
im_width=im_width)
anchor_corners = [anchors.get() for anchors in anchors_list]
return anchor_corners
anchor_corners_out = np.concatenate(
self.execute_cpu(graph_fn, [
np.array(2, dtype=np.int32),
np.array(2, dtype=np.int32),
np.array(1, dtype=np.int32),
np.array(1, dtype=np.int32)
]),
axis=0)
exp_anchor_corners = [[-48, -48, 80, 80],
[-48, -16, 80, 112],
[-16, -48, 112, 80],
[-16, -16, 112, 112],
[-96, -96, 160, 160]]
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
if __name__ == '__main__':
tf.test.main()
@@ -15,6 +15,7 @@
"""A function to build an object detection anchor generator from config."""
from object_detection.anchor_generators import flexible_grid_anchor_generator
from object_detection.anchor_generators import grid_anchor_generator
from object_detection.anchor_generators import multiple_grid_anchor_generator
from object_detection.anchor_generators import multiscale_grid_anchor_generator
@@ -90,5 +91,19 @@ def build(anchor_generator_config):
        cfg.scales_per_octave,
        cfg.normalize_coordinates
    )
  elif anchor_generator_config.WhichOneof(
      'anchor_generator_oneof') == 'flexible_grid_anchor_generator':
    cfg = anchor_generator_config.flexible_grid_anchor_generator
    base_sizes = []
    aspect_ratios = []
    strides = []
    offsets = []
    for anchor_grid in cfg.anchor_grid:
      base_sizes.append(tuple(anchor_grid.base_sizes))
      aspect_ratios.append(tuple(anchor_grid.aspect_ratios))
      strides.append((anchor_grid.height_stride, anchor_grid.width_stride))
      offsets.append((anchor_grid.height_offset, anchor_grid.width_offset))
    return flexible_grid_anchor_generator.FlexibleGridAnchorGenerator(
        base_sizes, aspect_ratios, strides, offsets, cfg.normalize_coordinates)
  else:
    raise ValueError('Empty anchor generator.')
@@ -20,6 +20,7 @@ import math
import tensorflow as tf
from google.protobuf import text_format
from object_detection.anchor_generators import flexible_grid_anchor_generator
from object_detection.anchor_generators import grid_anchor_generator
from object_detection.anchor_generators import multiple_grid_anchor_generator
from object_detection.anchor_generators import multiscale_grid_anchor_generator
@@ -43,8 +44,8 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          grid_anchor_generator.GridAnchorGenerator)
    self.assertListEqual(anchor_generator_object._scales, [])
    self.assertListEqual(anchor_generator_object._aspect_ratios, [])
    self.assertAllEqual(anchor_generator_object._anchor_offset, [0, 0])
@@ -68,8 +69,8 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          grid_anchor_generator.GridAnchorGenerator)
    self.assert_almost_list_equal(anchor_generator_object._scales,
                                  [0.4, 2.2])
    self.assert_almost_list_equal(anchor_generator_object._aspect_ratios,
@@ -88,9 +89,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          multiple_grid_anchor_generator.
                          MultipleGridAnchorGenerator)
    for actual_scales, expected_scales in zip(
        list(anchor_generator_object._scales),
        [(0.1, 0.2, 0.2),
@@ -118,9 +119,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          multiple_grid_anchor_generator.
                          MultipleGridAnchorGenerator)
    for actual_scales, expected_scales in zip(
        list(anchor_generator_object._scales),
        [(0.1, math.sqrt(0.1 * 0.15)),
@@ -143,9 +144,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          multiple_grid_anchor_generator.
                          MultipleGridAnchorGenerator)
    for actual_aspect_ratio, expected_aspect_ratio in zip(
        list(anchor_generator_object._aspect_ratios),
        6 * [(0.5, 0.5)]):
@@ -162,9 +163,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
    text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
    anchor_generator_object = anchor_generator_builder.build(
        anchor_generator_proto)
    self.assertIsInstance(anchor_generator_object,
                          multiple_grid_anchor_generator.
                          MultipleGridAnchorGenerator)
    for actual_scales, expected_scales in zip(
        list(anchor_generator_object._scales),
@@ -204,9 +205,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
text_format.Merge(anchor_generator_text_proto, anchor_generator_proto) text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
anchor_generator_object = anchor_generator_builder.build( anchor_generator_object = anchor_generator_builder.build(
anchor_generator_proto) anchor_generator_proto)
self.assertTrue(isinstance(anchor_generator_object, self.assertIsInstance(anchor_generator_object,
multiple_grid_anchor_generator. multiple_grid_anchor_generator.
MultipleGridAnchorGenerator)) MultipleGridAnchorGenerator)
for actual_scales, expected_scales in zip( for actual_scales, expected_scales in zip(
list(anchor_generator_object._scales), list(anchor_generator_object._scales),
...@@ -246,9 +247,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase): ...@@ -246,9 +247,9 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
text_format.Merge(anchor_generator_text_proto, anchor_generator_proto) text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
anchor_generator_object = anchor_generator_builder.build( anchor_generator_object = anchor_generator_builder.build(
anchor_generator_proto) anchor_generator_proto)
self.assertTrue(isinstance(anchor_generator_object, self.assertIsInstance(anchor_generator_object,
multiscale_grid_anchor_generator. multiscale_grid_anchor_generator.
MultiscaleGridAnchorGenerator)) MultiscaleGridAnchorGenerator)
for level, anchor_grid_info in zip( for level, anchor_grid_info in zip(
range(3, 8), anchor_generator_object._anchor_grid_info): range(3, 8), anchor_generator_object._anchor_grid_info):
self.assertEqual(set(anchor_grid_info.keys()), set(['level', 'info'])) self.assertEqual(set(anchor_grid_info.keys()), set(['level', 'info']))
...@@ -273,11 +274,59 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase): ...@@ -273,11 +274,59 @@ class AnchorGeneratorBuilderTest(tf.test.TestCase):
text_format.Merge(anchor_generator_text_proto, anchor_generator_proto) text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
anchor_generator_object = anchor_generator_builder.build( anchor_generator_object = anchor_generator_builder.build(
anchor_generator_proto) anchor_generator_proto)
self.assertTrue(isinstance(anchor_generator_object, self.assertIsInstance(anchor_generator_object,
multiscale_grid_anchor_generator. multiscale_grid_anchor_generator.
MultiscaleGridAnchorGenerator)) MultiscaleGridAnchorGenerator)
self.assertFalse(anchor_generator_object._normalize_coordinates) self.assertFalse(anchor_generator_object._normalize_coordinates)
def test_build_flexible_anchor_generator(self):
anchor_generator_text_proto = """
flexible_grid_anchor_generator {
anchor_grid {
base_sizes: [1.5]
aspect_ratios: [1.0]
height_stride: 16
width_stride: 20
height_offset: 8
width_offset: 9
}
anchor_grid {
base_sizes: [1.0, 2.0]
aspect_ratios: [1.0, 0.5]
height_stride: 32
width_stride: 30
height_offset: 10
width_offset: 11
}
}
"""
anchor_generator_proto = anchor_generator_pb2.AnchorGenerator()
text_format.Merge(anchor_generator_text_proto, anchor_generator_proto)
anchor_generator_object = anchor_generator_builder.build(
anchor_generator_proto)
self.assertIsInstance(anchor_generator_object,
flexible_grid_anchor_generator.
FlexibleGridAnchorGenerator)
for actual_base_sizes, expected_base_sizes in zip(
list(anchor_generator_object._base_sizes), [(1.5,), (1.0, 2.0)]):
self.assert_almost_list_equal(expected_base_sizes, actual_base_sizes)
for actual_aspect_ratios, expected_aspect_ratios in zip(
list(anchor_generator_object._aspect_ratios), [(1.0,), (1.0, 0.5)]):
self.assert_almost_list_equal(expected_aspect_ratios,
actual_aspect_ratios)
for actual_strides, expected_strides in zip(
list(anchor_generator_object._anchor_strides), [(16, 20), (32, 30)]):
self.assert_almost_list_equal(expected_strides, actual_strides)
for actual_offsets, expected_offsets in zip(
list(anchor_generator_object._anchor_offsets), [(8, 9), (10, 11)]):
self.assert_almost_list_equal(expected_offsets, actual_offsets)
self.assertTrue(anchor_generator_object._normalize_coordinates)
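The strides and offsets this test checks can be illustrated with plain arithmetic. The helper below is hypothetical (it is not the actual FlexibleGridAnchorGenerator code) and only sketches the assumed anchor-center formula, center = offset + stride * index, using the values from the first `anchor_grid` in the config above:

```python
# Hypothetical helper illustrating the assumed anchor-center arithmetic
# (offset + stride * index); not the actual FlexibleGridAnchorGenerator code.
def anchor_centers(grid_height, grid_width, stride, offset):
    """Returns (y, x) centers for one anchor grid level as a nested list."""
    return [[(offset[0] + stride[0] * i, offset[1] + stride[1] * j)
             for j in range(grid_width)]
            for i in range(grid_height)]

# Values from the first anchor_grid in the config above:
# height_stride 16, width_stride 20, height_offset 8, width_offset 9.
centers = anchor_centers(2, 2, stride=(16, 20), offset=(8, 9))
print(centers[0][0], centers[1][1])  # → (8, 9) (24, 29)
```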
if __name__ == '__main__':
tf.test.main()
...@@ -20,11 +20,14 @@ import tensorflow as tf
from object_detection.predictors import convolutional_box_predictor
from object_detection.predictors import convolutional_keras_box_predictor
from object_detection.predictors import mask_rcnn_box_predictor
from object_detection.predictors import mask_rcnn_keras_box_predictor
from object_detection.predictors import rfcn_box_predictor
from object_detection.predictors import rfcn_keras_box_predictor
from object_detection.predictors.heads import box_head
from object_detection.predictors.heads import class_head
from object_detection.predictors.heads import keras_box_head
from object_detection.predictors.heads import keras_class_head
from object_detection.predictors.heads import keras_mask_head
from object_detection.predictors.heads import mask_head
from object_detection.protos import box_predictor_pb2
...@@ -42,7 +45,8 @@ def build_convolutional_box_predictor(is_training,
apply_sigmoid_to_scores=False,
add_background_class=True,
class_prediction_bias_init=0.0,
use_depthwise=False,
box_encodings_clip_range=None):
"""Builds the ConvolutionalBoxPredictor from the arguments.
Args:
...@@ -77,6 +81,7 @@ def build_convolutional_box_predictor(is_training,
conv2d layer before class prediction.
use_depthwise: Whether to use depthwise convolutions for prediction
steps. Default is False.
box_encodings_clip_range: Min and max values for clipping the box_encodings.
Returns:
A ConvolutionalBoxPredictor class.
...@@ -85,7 +90,8 @@ def build_convolutional_box_predictor(is_training,
is_training=is_training,
box_code_size=box_code_size,
kernel_size=kernel_size,
use_depthwise=use_depthwise,
box_encodings_clip_range=box_encodings_clip_range)
class_prediction_head = class_head.ConvolutionalClassHead(
is_training=is_training,
num_class_slots=num_classes + 1 if add_background_class else num_classes,
...@@ -124,6 +130,7 @@ def build_convolutional_keras_box_predictor(is_training,
add_background_class=True,
class_prediction_bias_init=0.0,
use_depthwise=False,
box_encodings_clip_range=None,
name='BoxPredictor'):
"""Builds the Keras ConvolutionalBoxPredictor from the arguments.
...@@ -168,6 +175,7 @@ def build_convolutional_keras_box_predictor(is_training,
conv2d layer before class prediction.
use_depthwise: Whether to use depthwise convolutions for prediction
steps. Default is False.
box_encodings_clip_range: Min and max values for clipping the box_encodings.
name: A string name scope to assign to the box predictor. If `None`, Keras
will auto-generate one from the class name.
...@@ -189,6 +197,7 @@ def build_convolutional_keras_box_predictor(is_training,
freeze_batchnorm=freeze_batchnorm,
num_predictions_per_location=num_predictions_per_location,
use_depthwise=use_depthwise,
box_encodings_clip_range=box_encodings_clip_range,
name='ConvolutionalBoxHead_%d' % stack_index))
class_prediction_heads.append(
keras_class_head.ConvolutionalClassHead(
...@@ -300,6 +309,224 @@ def build_weight_shared_convolutional_box_predictor(
use_depthwise=use_depthwise)
def build_weight_shared_convolutional_keras_box_predictor(
is_training,
num_classes,
conv_hyperparams,
freeze_batchnorm,
inplace_batchnorm_update,
num_predictions_per_location_list,
depth,
num_layers_before_predictor,
box_code_size,
kernel_size=3,
add_background_class=True,
class_prediction_bias_init=0.0,
use_dropout=False,
dropout_keep_prob=0.8,
share_prediction_tower=False,
apply_batch_norm=True,
use_depthwise=False,
score_converter_fn=tf.identity,
box_encodings_clip_range=None,
name='WeightSharedConvolutionalBoxPredictor'):
"""Builds the Keras WeightSharedConvolutionalBoxPredictor from the arguments.
Args:
is_training: Indicates whether the BoxPredictor is in training mode.
num_classes: number of classes. Note that num_classes *does not*
include the background category, so if groundtruth labels take values
in {0, 1, .., K-1}, num_classes=K (and not K+1, even though the
assigned classification targets can range from {0,... K}).
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
inplace_batchnorm_update: Whether to update batch norm moving average
values inplace. When this is false train op must add a control
dependency on tf.GraphKeys.UPDATE_OPS collection in order to update
batch norm statistics.
num_predictions_per_location_list: A list of integers representing the
number of box predictions to be made per spatial location for each
feature map.
depth: depth of conv layers.
num_layers_before_predictor: Number of the additional conv layers before
the predictor.
box_code_size: Size of encoding for each box.
kernel_size: Size of final convolution kernel.
add_background_class: Whether to add an implicit background class.
class_prediction_bias_init: constant value to initialize bias of the last
conv2d layer before class prediction.
use_dropout: Whether to apply dropout to class prediction head.
dropout_keep_prob: Probability of keeping activations.
share_prediction_tower: Whether to share the multi-layer tower between box
prediction and class prediction heads.
apply_batch_norm: Whether to apply batch normalization to conv layers in
this predictor.
use_depthwise: Whether to use depthwise separable conv2d instead of conv2d.
score_converter_fn: Callable score converter to perform elementwise op on
class scores.
box_encodings_clip_range: Min and max values for clipping the box_encodings.
name: A string name scope to assign to the box predictor. If `None`, Keras
will auto-generate one from the class name.
Returns:
A Keras WeightSharedConvolutionalBoxPredictor class.
"""
if len(set(num_predictions_per_location_list)) > 1:
raise ValueError('num predictions per location must be same for all '
'feature maps, found: {}'.format(
num_predictions_per_location_list))
num_predictions_per_location = num_predictions_per_location_list[0]
box_prediction_head = keras_box_head.WeightSharedConvolutionalBoxHead(
box_code_size=box_code_size,
kernel_size=kernel_size,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=num_predictions_per_location,
use_depthwise=use_depthwise,
box_encodings_clip_range=box_encodings_clip_range,
name='WeightSharedConvolutionalBoxHead')
class_prediction_head = keras_class_head.WeightSharedConvolutionalClassHead(
num_class_slots=(
num_classes + 1 if add_background_class else num_classes),
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
kernel_size=kernel_size,
conv_hyperparams=conv_hyperparams,
num_predictions_per_location=num_predictions_per_location,
class_prediction_bias_init=class_prediction_bias_init,
use_depthwise=use_depthwise,
score_converter_fn=score_converter_fn,
name='WeightSharedConvolutionalClassHead')
other_heads = {}
return (
convolutional_keras_box_predictor.WeightSharedConvolutionalBoxPredictor(
is_training=is_training,
num_classes=num_classes,
box_prediction_head=box_prediction_head,
class_prediction_head=class_prediction_head,
other_heads=other_heads,
conv_hyperparams=conv_hyperparams,
depth=depth,
num_layers_before_predictor=num_layers_before_predictor,
freeze_batchnorm=freeze_batchnorm,
inplace_batchnorm_update=inplace_batchnorm_update,
kernel_size=kernel_size,
apply_batch_norm=apply_batch_norm,
share_prediction_tower=share_prediction_tower,
use_depthwise=use_depthwise,
name=name))
def build_mask_rcnn_keras_box_predictor(is_training,
num_classes,
fc_hyperparams,
freeze_batchnorm,
use_dropout,
dropout_keep_prob,
box_code_size,
add_background_class=True,
share_box_across_classes=False,
predict_instance_masks=False,
conv_hyperparams=None,
mask_height=14,
mask_width=14,
mask_prediction_num_conv_layers=2,
mask_prediction_conv_depth=256,
masks_are_class_agnostic=False,
convolve_then_upsample_masks=False):
"""Builds and returns a MaskRCNNKerasBoxPredictor class.
Args:
is_training: Indicates whether the BoxPredictor is in training mode.
num_classes: number of classes. Note that num_classes *does not*
include the background category, so if groundtruth labels take values
in {0, 1, .., K-1}, num_classes=K (and not K+1, even though the
assigned classification targets can range from {0,... K}).
fc_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for fully connected dense ops.
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
use_dropout: Option to use dropout or not. Note that a single dropout
op is applied here prior to both box and class predictions, which stands
in contrast to the ConvolutionalBoxPredictor below.
dropout_keep_prob: Keep probability for dropout.
This is only used if use_dropout is True.
box_code_size: Size of encoding for each box.
add_background_class: Whether to add an implicit background class.
share_box_across_classes: Whether to share boxes across classes rather
than use a different box for each class.
predict_instance_masks: If True, will add a third stage mask prediction
to the returned class.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
mask_height: Desired output mask height. The default value is 14.
mask_width: Desired output mask width. The default value is 14.
mask_prediction_num_conv_layers: Number of convolution layers applied to
the image_features in mask prediction branch.
mask_prediction_conv_depth: The depth for the first conv2d_transpose op
applied to the image_features in the mask prediction branch. If set
to 0, the depth of the convolution layers will be automatically chosen
based on the number of object classes and the number of channels in the
image features.
masks_are_class_agnostic: Boolean determining if the mask-head is
class-agnostic or not.
convolve_then_upsample_masks: Whether to apply convolutions on mask
features before upsampling using nearest neighbor resizing. Otherwise,
mask features are resized to [`mask_height`, `mask_width`] using
bilinear resizing before applying convolutions.
Returns:
A MaskRCNNKerasBoxPredictor class.
"""
box_prediction_head = keras_box_head.MaskRCNNBoxHead(
is_training=is_training,
num_classes=num_classes,
fc_hyperparams=fc_hyperparams,
freeze_batchnorm=freeze_batchnorm,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob,
box_code_size=box_code_size,
share_box_across_classes=share_box_across_classes)
class_prediction_head = keras_class_head.MaskRCNNClassHead(
is_training=is_training,
num_class_slots=num_classes + 1 if add_background_class else num_classes,
fc_hyperparams=fc_hyperparams,
freeze_batchnorm=freeze_batchnorm,
use_dropout=use_dropout,
dropout_keep_prob=dropout_keep_prob)
third_stage_heads = {}
if predict_instance_masks:
third_stage_heads[
mask_rcnn_box_predictor.
MASK_PREDICTIONS] = keras_mask_head.MaskRCNNMaskHead(
is_training=is_training,
num_classes=num_classes,
conv_hyperparams=conv_hyperparams,
freeze_batchnorm=freeze_batchnorm,
mask_height=mask_height,
mask_width=mask_width,
mask_prediction_num_conv_layers=mask_prediction_num_conv_layers,
mask_prediction_conv_depth=mask_prediction_conv_depth,
masks_are_class_agnostic=masks_are_class_agnostic,
convolve_then_upsample=convolve_then_upsample_masks)
return mask_rcnn_keras_box_predictor.MaskRCNNKerasBoxPredictor(
is_training=is_training,
num_classes=num_classes,
freeze_batchnorm=freeze_batchnorm,
box_prediction_head=box_prediction_head,
class_prediction_head=class_prediction_head,
third_stage_heads=third_stage_heads)
def build_mask_rcnn_box_predictor(is_training,
num_classes,
fc_hyperparams_fn,
...@@ -457,6 +684,13 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
config_box_predictor = box_predictor_config.convolutional_box_predictor
conv_hyperparams_fn = argscope_fn(config_box_predictor.conv_hyperparams,
is_training)
# Optionally apply clipping to box encodings, when box_encodings_clip_range
# is set.
box_encodings_clip_range = None
if config_box_predictor.HasField('box_encodings_clip_range'):
box_encodings_clip_range = BoxEncodingsClipRange(
min=config_box_predictor.box_encodings_clip_range.min,
max=config_box_predictor.box_encodings_clip_range.max)
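The effect of the optional clip range can be sketched independently of TensorFlow. `clip_box_encodings` below is a hypothetical stand-in for the elementwise clip (e.g. via `tf.clip_by_value`) that the box head presumably applies when a range is configured:

```python
from collections import namedtuple

# Stand-in for the builder's BoxEncodingsClipRange namedtuple.
BoxEncodingsClipRange = namedtuple('BoxEncodingsClipRange', ['min', 'max'])

def clip_box_encodings(encodings, clip_range):
    """Elementwise clip of raw box encodings; a no-op when clip_range is None."""
    if clip_range is None:
        return encodings
    return [min(max(v, clip_range.min), clip_range.max) for v in encodings]

print(clip_box_encodings([-40.0, 0.5, 40.0],
                         BoxEncodingsClipRange(min=-10.0, max=10.0)))
# → [-10.0, 0.5, 10.0]
```

Bounding the raw encodings this way keeps extreme regression outputs from producing numerically unstable boxes, which is what makes the predictor TPU friendly.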
return build_convolutional_box_predictor(
is_training=is_training,
num_classes=num_classes,
...@@ -473,7 +707,8 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
apply_sigmoid_to_scores=config_box_predictor.apply_sigmoid_to_scores,
class_prediction_bias_init=(
config_box_predictor.class_prediction_bias_init),
use_depthwise=config_box_predictor.use_depthwise,
box_encodings_clip_range=box_encodings_clip_range)
if box_predictor_oneof == 'weight_shared_convolutional_box_predictor':
config_box_predictor = (
...@@ -488,12 +723,11 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
config_box_predictor.score_converter, is_training)
# Optionally apply clipping to box encodings, when box_encodings_clip_range
# is set.
box_encodings_clip_range = None
if config_box_predictor.HasField('box_encodings_clip_range'):
box_encodings_clip_range = BoxEncodingsClipRange(
min=config_box_predictor.box_encodings_clip_range.min,
max=config_box_predictor.box_encodings_clip_range.max)
return build_weight_shared_convolutional_box_predictor(
is_training=is_training,
num_classes=num_classes,
...@@ -514,6 +748,7 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
score_converter_fn=score_converter_fn,
box_encodings_clip_range=box_encodings_clip_range)
if box_predictor_oneof == 'mask_rcnn_box_predictor':
config_box_predictor = box_predictor_config.mask_rcnn_box_predictor
fc_hyperparams_fn = argscope_fn(config_box_predictor.fc_hyperparams,
...@@ -563,7 +798,7 @@ def build(argscope_fn, box_predictor_config, is_training, num_classes,
raise ValueError('Unknown box predictor: {}'.format(box_predictor_oneof))
def build_keras(hyperparams_fn, freeze_batchnorm, inplace_batchnorm_update,
num_predictions_per_location_list, box_predictor_config,
is_training, num_classes, add_background_class=True):
"""Builds a Keras-based box predictor based on the configuration.
...@@ -573,7 +808,7 @@ def build_keras(conv_hyperparams_fn, freeze_batchnorm, inplace_batchnorm_update,
for more details.
Args:
hyperparams_fn: A function that takes a hyperparams_pb2.Hyperparams
proto and returns a `hyperparams_builder.KerasLayerHyperparams`
for Conv or FC hyperparameters.
freeze_batchnorm: Whether to freeze batch norm parameters during
...@@ -607,8 +842,16 @@ def build_keras(conv_hyperparams_fn, freeze_batchnorm, inplace_batchnorm_update,
if box_predictor_oneof == 'convolutional_box_predictor':
config_box_predictor = box_predictor_config.convolutional_box_predictor
conv_hyperparams = hyperparams_fn(
config_box_predictor.conv_hyperparams)
# Optionally apply clipping to box encodings, when box_encodings_clip_range
# is set.
box_encodings_clip_range = None
if config_box_predictor.HasField('box_encodings_clip_range'):
box_encodings_clip_range = BoxEncodingsClipRange(
min=config_box_predictor.box_encodings_clip_range.min,
max=config_box_predictor.box_encodings_clip_range.max)
return build_convolutional_keras_box_predictor(
is_training=is_training,
num_classes=num_classes,
...@@ -627,7 +870,97 @@ def build_keras(conv_hyperparams_fn, freeze_batchnorm, inplace_batchnorm_update,
max_depth=config_box_predictor.max_depth,
class_prediction_bias_init=(
config_box_predictor.class_prediction_bias_init),
use_depthwise=config_box_predictor.use_depthwise,
box_encodings_clip_range=box_encodings_clip_range)
if box_predictor_oneof == 'weight_shared_convolutional_box_predictor':
config_box_predictor = (
box_predictor_config.weight_shared_convolutional_box_predictor)
conv_hyperparams = hyperparams_fn(config_box_predictor.conv_hyperparams)
apply_batch_norm = config_box_predictor.conv_hyperparams.HasField(
'batch_norm')
# During training phase, logits are used to compute the loss. Only apply
# sigmoid at inference to make the inference graph TPU friendly. This is
# required because during TPU inference, model.postprocess is not called.
score_converter_fn = build_score_converter(
config_box_predictor.score_converter, is_training)
# Optionally apply clipping to box encodings, when box_encodings_clip_range
# is set.
box_encodings_clip_range = None
if config_box_predictor.HasField('box_encodings_clip_range'):
box_encodings_clip_range = BoxEncodingsClipRange(
min=config_box_predictor.box_encodings_clip_range.min,
max=config_box_predictor.box_encodings_clip_range.max)
return build_weight_shared_convolutional_keras_box_predictor(
is_training=is_training,
num_classes=num_classes,
conv_hyperparams=conv_hyperparams,
freeze_batchnorm=freeze_batchnorm,
inplace_batchnorm_update=inplace_batchnorm_update,
num_predictions_per_location_list=num_predictions_per_location_list,
depth=config_box_predictor.depth,
num_layers_before_predictor=(
config_box_predictor.num_layers_before_predictor),
box_code_size=config_box_predictor.box_code_size,
kernel_size=config_box_predictor.kernel_size,
add_background_class=add_background_class,
class_prediction_bias_init=(
config_box_predictor.class_prediction_bias_init),
use_dropout=config_box_predictor.use_dropout,
dropout_keep_prob=config_box_predictor.dropout_keep_probability,
share_prediction_tower=config_box_predictor.share_prediction_tower,
apply_batch_norm=apply_batch_norm,
use_depthwise=config_box_predictor.use_depthwise,
score_converter_fn=score_converter_fn,
box_encodings_clip_range=box_encodings_clip_range)
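The comment in this branch explains the score-converter choice: identity while training (the loss consumes raw logits) and sigmoid at inference, so the exported TPU graph emits probabilities without calling `model.postprocess`. The function below is an illustrative stand-in for that behavior, not the real `build_score_converter`:

```python
import math

# Illustrative stand-in for build_score_converter: identity during training,
# elementwise sigmoid at inference (the TPU-friendly export path).
def score_converter_sketch(is_training):
    if is_training:
        return lambda logit: logit  # tf.identity analogue
    return lambda logit: 1.0 / (1.0 + math.exp(-logit))  # sigmoid

print(score_converter_sketch(True)(2.0))   # → 2.0
print(score_converter_sketch(False)(0.0))  # → 0.5
```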
if box_predictor_oneof == 'mask_rcnn_box_predictor':
config_box_predictor = box_predictor_config.mask_rcnn_box_predictor
fc_hyperparams = hyperparams_fn(config_box_predictor.fc_hyperparams)
conv_hyperparams = None
if config_box_predictor.HasField('conv_hyperparams'):
conv_hyperparams = hyperparams_fn(
config_box_predictor.conv_hyperparams)
return build_mask_rcnn_keras_box_predictor(
is_training=is_training,
num_classes=num_classes,
add_background_class=add_background_class,
fc_hyperparams=fc_hyperparams,
freeze_batchnorm=freeze_batchnorm,
use_dropout=config_box_predictor.use_dropout,
dropout_keep_prob=config_box_predictor.dropout_keep_probability,
box_code_size=config_box_predictor.box_code_size,
share_box_across_classes=(
config_box_predictor.share_box_across_classes),
predict_instance_masks=config_box_predictor.predict_instance_masks,
conv_hyperparams=conv_hyperparams,
mask_height=config_box_predictor.mask_height,
mask_width=config_box_predictor.mask_width,
mask_prediction_num_conv_layers=(
config_box_predictor.mask_prediction_num_conv_layers),
mask_prediction_conv_depth=(
config_box_predictor.mask_prediction_conv_depth),
masks_are_class_agnostic=(
config_box_predictor.masks_are_class_agnostic),
convolve_then_upsample_masks=(
config_box_predictor.convolve_then_upsample_masks))
if box_predictor_oneof == 'rfcn_box_predictor':
config_box_predictor = box_predictor_config.rfcn_box_predictor
conv_hyperparams = hyperparams_fn(config_box_predictor.conv_hyperparams)
box_predictor_object = rfcn_keras_box_predictor.RfcnKerasBoxPredictor(
is_training=is_training,
num_classes=num_classes,
conv_hyperparams=conv_hyperparams,
freeze_batchnorm=freeze_batchnorm,
crop_size=[config_box_predictor.crop_height,
config_box_predictor.crop_width],
num_spatial_bins=[config_box_predictor.num_spatial_bins_height,
config_box_predictor.num_spatial_bins_width],
depth=config_box_predictor.depth,
box_code_size=config_box_predictor.box_code_size)
return box_predictor_object
raise ValueError(
'Unknown box predictor for Keras: {}'.format(box_predictor_oneof))
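The control flow of `build_keras` above is a chain of checks against the proto's oneof field, ending in a ValueError for unknown predictors. The dict-based lookup below is a minimal sketch of that dispatch pattern; the handler names mirror the branches in the source, but the helper itself is hypothetical:

```python
# Hypothetical sketch of the oneof dispatch performed by build_keras; the real
# code branches on box_predictor_config's set oneof field in sequence.
def dispatch_box_predictor(box_predictor_oneof, handlers):
    if box_predictor_oneof not in handlers:
        raise ValueError(
            'Unknown box predictor for Keras: {}'.format(box_predictor_oneof))
    return handlers[box_predictor_oneof]()

handlers = {
    'convolutional_box_predictor': lambda: 'convolutional',
    'weight_shared_convolutional_box_predictor': lambda: 'weight_shared',
    'mask_rcnn_box_predictor': lambda: 'mask_rcnn',
    'rfcn_box_predictor': lambda: 'rfcn',
}
print(dispatch_box_predictor('rfcn_box_predictor', handlers))  # → rfcn
```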
...@@ -353,6 +353,8 @@ class WeightSharedConvolutionalBoxPredictorBuilderTest(tf.test.TestCase):
self.assertEqual(box_predictor._apply_batch_norm, True)
class MaskRCNNBoxPredictorBuilderTest(tf.test.TestCase):
def test_box_predictor_builder_calls_fc_argscope_fn(self):
...
...@@ -56,9 +56,15 @@ def read_dataset(file_read_func, input_files, config):
Returns:
A tf.data.Dataset of (undecoded) tf-records based on config.
Raises:
RuntimeError: If no files are found at the supplied path(s).
"""
# Shard, shuffle, and read files.
filenames = tf.gfile.Glob(input_files)
if not filenames:
raise RuntimeError('Did not find any input files matching the glob pattern '
'{}'.format(input_files))
num_readers = config.num_readers
if num_readers > len(filenames):
num_readers = len(filenames)
...
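The guard added to `read_dataset` turns an empty glob match into an immediate `RuntimeError` instead of an opaque downstream `tf.data` failure. A minimal pure-Python sketch of the same fail-fast pattern, using the standard `glob` module in place of `tf.gfile.Glob` (function name is illustrative, not from the source):

```python
import glob


def list_input_files(file_pattern):
    """Expands a glob pattern, failing fast when nothing matches.

    Mirrors the check added to read_dataset: an empty match list would
    otherwise only surface later as a confusing dataset error.
    """
    filenames = glob.glob(file_pattern)
    if not filenames:
        raise RuntimeError(
            'Did not find any input files matching the glob pattern '
            '{}'.format(file_pattern))
    return filenames
```

Raising at construction time keeps the error close to the misconfigured path rather than deep inside the input pipeline.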
...@@ -32,11 +32,14 @@ def build(graph_rewriter_config, is_training):
# Quantize the graph by inserting quantize ops for weights and activations
if is_training:
tf.contrib.quantize.experimental_create_training_graph(
input_graph=tf.get_default_graph(),
quant_delay=graph_rewriter_config.quantization.delay)
else:
tf.contrib.quantize.experimental_create_eval_graph(
input_graph=tf.get_default_graph())
tf.contrib.layers.summarize_collection('quant_vars')
return graph_rewrite_fn
...@@ -23,7 +23,8 @@ class QuantizationBuilderTest(tf.test.TestCase):
def testQuantizationBuilderSetsUpCorrectTrainArguments(self):
with mock.patch.object(
tf.contrib.quantize,
'experimental_create_training_graph') as mock_quant_fn:
with mock.patch.object(tf.contrib.layers,
'summarize_collection') as mock_summarize_col:
graph_rewriter_proto = graph_rewriter_pb2.GraphRewriter()
...@@ -40,7 +41,7 @@ class QuantizationBuilderTest(tf.test.TestCase):
def testQuantizationBuilderSetsUpCorrectEvalArguments(self):
with mock.patch.object(tf.contrib.quantize,
'experimental_create_eval_graph') as mock_quant_fn:
with mock.patch.object(tf.contrib.layers,
'summarize_collection') as mock_summarize_col:
graph_rewriter_proto = graph_rewriter_pb2.GraphRewriter()
...
...@@ -110,6 +110,32 @@ def build(image_resizer_config):
else:
return [image, masks, tf.shape(image)]
return image_resizer_fn
elif image_resizer_oneof == 'conditional_shape_resizer':
conditional_shape_resize_config = (
image_resizer_config.conditional_shape_resizer)
method = _tf_resize_method(conditional_shape_resize_config.resize_method)
if conditional_shape_resize_config.condition == (
image_resizer_pb2.ConditionalShapeResizer.GREATER):
image_resizer_fn = functools.partial(
preprocessor.resize_to_max_dimension,
max_dimension=conditional_shape_resize_config.size_threshold,
method=method)
elif conditional_shape_resize_config.condition == (
image_resizer_pb2.ConditionalShapeResizer.SMALLER):
image_resizer_fn = functools.partial(
preprocessor.resize_to_min_dimension,
min_dimension=conditional_shape_resize_config.size_threshold,
method=method)
else:
raise ValueError(
'Invalid image resizer condition option for '
'ConditionalShapeResizer: \'%s\'.'
% conditional_shape_resize_config.condition)
if not conditional_shape_resize_config.convert_to_grayscale:
return image_resizer_fn
else:
raise ValueError(
'Invalid image resizer option: \'%s\'.' % image_resizer_oneof)
...
...@@ -147,6 +147,69 @@ class ImageResizerBuilderTest(tf.test.TestCase):
self.assertEqual(len(vals), 1)
self.assertEqual(vals[0], 1)
def test_build_conditional_shape_resizer_greater_returns_expected_shape(self):
image_resizer_text_proto = """
conditional_shape_resizer {
condition: GREATER
size_threshold: 30
}
"""
input_shape = (60, 30, 3)
expected_output_shape = (30, 15, 3)
output_shape = self._shape_of_resized_random_image_given_text_proto(
input_shape, image_resizer_text_proto)
self.assertEqual(output_shape, expected_output_shape)
def test_build_conditional_shape_resizer_same_shape_with_no_resize(self):
image_resizer_text_proto = """
conditional_shape_resizer {
condition: GREATER
size_threshold: 30
}
"""
input_shape = (15, 15, 3)
expected_output_shape = (15, 15, 3)
output_shape = self._shape_of_resized_random_image_given_text_proto(
input_shape, image_resizer_text_proto)
self.assertEqual(output_shape, expected_output_shape)
def test_build_conditional_shape_resizer_smaller_returns_expected_shape(self):
image_resizer_text_proto = """
conditional_shape_resizer {
condition: SMALLER
size_threshold: 30
}
"""
input_shape = (30, 15, 3)
expected_output_shape = (60, 30, 3)
output_shape = self._shape_of_resized_random_image_given_text_proto(
input_shape, image_resizer_text_proto)
self.assertEqual(output_shape, expected_output_shape)
def test_build_conditional_shape_resizer_grayscale(self):
image_resizer_text_proto = """
conditional_shape_resizer {
condition: GREATER
size_threshold: 30
convert_to_grayscale: true
}
"""
input_shape = (60, 30, 3)
expected_output_shape = (30, 15, 1)
output_shape = self._shape_of_resized_random_image_given_text_proto(
input_shape, image_resizer_text_proto)
self.assertEqual(output_shape, expected_output_shape)
def test_build_conditional_shape_resizer_error_on_invalid_condition(self):
invalid_image_resizer_text_proto = """
conditional_shape_resizer {
condition: INVALID
size_threshold: 30
}
"""
with self.assertRaises(ValueError):
image_resizer_builder.build(invalid_image_resizer_text_proto)
if __name__ == '__main__':
tf.test.main()
...@@ -33,6 +33,7 @@ from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.meta_architectures import rfcn_meta_arch
from object_detection.meta_architectures import ssd_meta_arch
from object_detection.models import faster_rcnn_inception_resnet_v2_feature_extractor as frcnn_inc_res
from object_detection.models import faster_rcnn_inception_resnet_v2_keras_feature_extractor as frcnn_inc_res_keras
from object_detection.models import faster_rcnn_inception_v2_feature_extractor as frcnn_inc_v2
from object_detection.models import faster_rcnn_nas_feature_extractor as frcnn_nas
from object_detection.models import faster_rcnn_pnas_feature_extractor as frcnn_pnas
...@@ -44,13 +45,16 @@ from object_detection.models.ssd_inception_v2_feature_extractor import SSDIncept
from object_detection.models.ssd_inception_v3_feature_extractor import SSDInceptionV3FeatureExtractor
from object_detection.models.ssd_mobilenet_v1_feature_extractor import SSDMobileNetV1FeatureExtractor
from object_detection.models.ssd_mobilenet_v1_fpn_feature_extractor import SSDMobileNetV1FpnFeatureExtractor
from object_detection.models.ssd_mobilenet_v1_fpn_keras_feature_extractor import SSDMobileNetV1FpnKerasFeatureExtractor
from object_detection.models.ssd_mobilenet_v1_keras_feature_extractor import SSDMobileNetV1KerasFeatureExtractor
from object_detection.models.ssd_mobilenet_v1_ppn_feature_extractor import SSDMobileNetV1PpnFeatureExtractor
from object_detection.models.ssd_mobilenet_v2_feature_extractor import SSDMobileNetV2FeatureExtractor
from object_detection.models.ssd_mobilenet_v2_fpn_feature_extractor import SSDMobileNetV2FpnFeatureExtractor
from object_detection.models.ssd_mobilenet_v2_fpn_keras_feature_extractor import SSDMobileNetV2FpnKerasFeatureExtractor
from object_detection.models.ssd_mobilenet_v2_keras_feature_extractor import SSDMobileNetV2KerasFeatureExtractor
from object_detection.models.ssd_pnasnet_feature_extractor import SSDPNASNetFeatureExtractor
from object_detection.predictors import rfcn_box_predictor
from object_detection.predictors import rfcn_keras_box_predictor
from object_detection.predictors.heads import mask_head
from object_detection.protos import model_pb2
from object_detection.utils import ops
...@@ -78,7 +82,9 @@ SSD_FEATURE_EXTRACTOR_CLASS_MAP = {
SSD_KERAS_FEATURE_EXTRACTOR_CLASS_MAP = {
'ssd_mobilenet_v1_keras': SSDMobileNetV1KerasFeatureExtractor,
'ssd_mobilenet_v1_fpn_keras': SSDMobileNetV1FpnKerasFeatureExtractor,
'ssd_mobilenet_v2_keras': SSDMobileNetV2KerasFeatureExtractor,
'ssd_mobilenet_v2_fpn_keras': SSDMobileNetV2FpnKerasFeatureExtractor,
}
# A map of names to Faster R-CNN feature extractors.
...@@ -99,6 +105,11 @@ FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP = {
frcnn_resnet_v1.FasterRCNNResnet152FeatureExtractor,
}
FASTER_RCNN_KERAS_FEATURE_EXTRACTOR_CLASS_MAP = {
'faster_rcnn_inception_resnet_v2_keras':
frcnn_inc_res_keras.FasterRCNNInceptionResnetV2KerasFeatureExtractor,
}
def build(model_config, is_training, add_summaries=True):
"""Builds a DetectionModel based on the model config.
...@@ -253,7 +264,7 @@ def _build_ssd_model(ssd_config, is_training, add_summaries):
ssd_config.anchor_generator)
if feature_extractor.is_keras_model:
ssd_box_predictor = box_predictor_builder.build_keras(
hyperparams_fn=hyperparams_builder.KerasLayerHyperparams,
freeze_batchnorm=ssd_config.freeze_batchnorm,
inplace_batchnorm_update=False,
num_predictions_per_location_list=anchor_generator
...@@ -355,7 +366,45 @@ def _build_faster_rcnn_feature_extractor(
feature_type]
return feature_extractor_class(
is_training, first_stage_features_stride,
batch_norm_trainable, reuse_weights=reuse_weights)
def _build_faster_rcnn_keras_feature_extractor(
feature_extractor_config, is_training,
inplace_batchnorm_update=False):
"""Builds a faster_rcnn_meta_arch.FasterRCNNKerasFeatureExtractor from config.
Args:
feature_extractor_config: A FasterRcnnFeatureExtractor proto config from
faster_rcnn.proto.
is_training: True if this feature extractor is being built for training.
inplace_batchnorm_update: Whether to update batch_norm inplace during
training. This is required for batch norm to work correctly on TPUs. When
this is false, user must add a control dependency on
tf.GraphKeys.UPDATE_OPS for train/loss op in order to update the batch
norm moving average parameters.
Returns:
faster_rcnn_meta_arch.FasterRCNNKerasFeatureExtractor based on config.
Raises:
ValueError: On invalid feature extractor type.
"""
if inplace_batchnorm_update:
raise ValueError('inplace batchnorm updates not supported.')
feature_type = feature_extractor_config.type
first_stage_features_stride = (
feature_extractor_config.first_stage_features_stride)
batch_norm_trainable = feature_extractor_config.batch_norm_trainable
if feature_type not in FASTER_RCNN_KERAS_FEATURE_EXTRACTOR_CLASS_MAP:
raise ValueError('Unknown Faster R-CNN feature_extractor: {}'.format(
feature_type))
feature_extractor_class = FASTER_RCNN_KERAS_FEATURE_EXTRACTOR_CLASS_MAP[
feature_type]
return feature_extractor_class(
is_training, first_stage_features_stride,
batch_norm_trainable)
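The slim and Keras feature extractor builders are selected by class-map membership, the same pattern the refactor uses for its `is_keras` check. A minimal stand-alone sketch of that dispatch (the classes and map contents here are hypothetical stand-ins, not the real extractors):

```python
class SlimExtractor(object):
    """Stand-in for a slim-based feature extractor."""


class KerasExtractor(object):
    """Stand-in for a Keras-based feature extractor."""


FEATURE_EXTRACTOR_CLASS_MAP = {'faster_rcnn_resnet101': SlimExtractor}
KERAS_FEATURE_EXTRACTOR_CLASS_MAP = {
    'faster_rcnn_inception_resnet_v2_keras': KerasExtractor}


def build_feature_extractor(feature_type):
    """Dispatches on class-map membership, mirroring the is_keras check."""
    if feature_type in KERAS_FEATURE_EXTRACTOR_CLASS_MAP:
        return KERAS_FEATURE_EXTRACTOR_CLASS_MAP[feature_type]()
    if feature_type in FEATURE_EXTRACTOR_CLASS_MAP:
        return FEATURE_EXTRACTOR_CLASS_MAP[feature_type]()
    raise ValueError('Unknown Faster R-CNN feature_extractor: {}'.format(
        feature_type))
```

Keeping two explicit maps, rather than a flag on each class, lets the config `type` string alone decide which code path (slim hyperparams vs. `KerasLayerHyperparams`) the rest of the builder takes.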
def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
...@@ -380,9 +429,17 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
num_classes = frcnn_config.num_classes
image_resizer_fn = image_resizer_builder.build(frcnn_config.image_resizer)
is_keras = (frcnn_config.feature_extractor.type in
FASTER_RCNN_KERAS_FEATURE_EXTRACTOR_CLASS_MAP)
if is_keras:
feature_extractor = _build_faster_rcnn_keras_feature_extractor(
frcnn_config.feature_extractor, is_training,
inplace_batchnorm_update=frcnn_config.inplace_batchnorm_update)
else:
feature_extractor = _build_faster_rcnn_feature_extractor(
frcnn_config.feature_extractor, is_training,
inplace_batchnorm_update=frcnn_config.inplace_batchnorm_update)
number_of_stages = frcnn_config.number_of_stages
first_stage_anchor_generator = anchor_generator_builder.build(
...@@ -393,8 +450,13 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
'proposal',
use_matmul_gather=frcnn_config.use_matmul_gather_in_matcher)
first_stage_atrous_rate = frcnn_config.first_stage_atrous_rate
if is_keras:
first_stage_box_predictor_arg_scope_fn = (
hyperparams_builder.KerasLayerHyperparams(
frcnn_config.first_stage_box_predictor_conv_hyperparams))
else:
first_stage_box_predictor_arg_scope_fn = hyperparams_builder.build(
frcnn_config.first_stage_box_predictor_conv_hyperparams, is_training)
first_stage_box_predictor_kernel_size = (
frcnn_config.first_stage_box_predictor_kernel_size)
first_stage_box_predictor_depth = frcnn_config.first_stage_box_predictor_depth
...@@ -432,11 +494,21 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
'FasterRCNN',
'detection',
use_matmul_gather=frcnn_config.use_matmul_gather_in_matcher)
if is_keras:
second_stage_box_predictor = box_predictor_builder.build_keras(
hyperparams_builder.KerasLayerHyperparams,
freeze_batchnorm=False,
inplace_batchnorm_update=False,
num_predictions_per_location_list=[1],
box_predictor_config=frcnn_config.second_stage_box_predictor,
is_training=is_training,
num_classes=num_classes)
else:
second_stage_box_predictor = box_predictor_builder.build(
hyperparams_builder.build,
frcnn_config.second_stage_box_predictor,
is_training=is_training,
num_classes=num_classes)
second_stage_batch_size = frcnn_config.second_stage_batch_size
second_stage_sampler = sampler.BalancedPositiveNegativeSampler(
positive_fraction=frcnn_config.second_stage_balance_fraction,
...@@ -507,8 +579,10 @@ def _build_faster_rcnn_model(frcnn_config, is_training, add_summaries):
'resize_masks': frcnn_config.resize_masks
}
if (isinstance(second_stage_box_predictor,
rfcn_box_predictor.RfcnBoxPredictor) or
isinstance(second_stage_box_predictor,
rfcn_keras_box_predictor.RfcnKerasBoxPredictor)):
return rfcn_meta_arch.RFCNMetaArch(
second_stage_rfcn_box_predictor=second_stage_box_predictor,
**common_kwargs)
...
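The meta-architecture is chosen by the predictor's type, with the new Keras RFCN predictor routed to the same branch as the slim one. A small sketch of the dispatch with hypothetical stand-in classes; note that `isinstance` also accepts a tuple of classes, which is equivalent to the chained `or` in the builder:

```python
class RfcnBoxPredictor(object):
    """Stand-in for the slim RFCN predictor."""


class RfcnKerasBoxPredictor(object):
    """Stand-in for the new Keras RFCN predictor."""


class ConvolutionalBoxPredictor(object):
    """Stand-in for any non-RFCN predictor."""


def select_meta_arch(predictor):
    """Routes both RFCN predictor flavors to the RFCN meta-architecture."""
    if isinstance(predictor, (RfcnBoxPredictor, RfcnKerasBoxPredictor)):
        return 'rfcn'
    return 'faster_rcnn'
```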
...@@ -170,7 +170,13 @@ class Match(object):
row_indices: int32 tensor of shape [K] with row indices.
"""
return self._reshape_and_cast(
self._gather_op(tf.to_float(self._match_results),
self.matched_column_indices()))
def num_matched_rows(self):
"""Returns number (int32 scalar tensor) of matched rows."""
unique_rows, _ = tf.unique(self.matched_row_indices())
return tf.size(unique_rows)
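`num_matched_rows` counts distinct groundtruth rows rather than matched columns, so a row matched by several anchors is counted once. A plain-Python sketch of the semantics (using the Match convention that `-1` marks unmatched and `-2` ignored columns; the helper name is illustrative):

```python
def num_matched_rows(match_results):
    """Counts distinct groundtruth rows with at least one matched column.

    match_results[i] >= 0 gives the row matched to column i; negative
    values are the unmatched (-1) and ignored (-2) sentinels.
    """
    matched_rows = {m for m in match_results if m >= 0}
    return len(matched_rows)
```

For the test's `[3, 1, -1, 0, -1, 1, -2]` there are four matched columns but only three distinct rows ({0, 1, 3}).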
def _reshape_and_cast(self, t):
return tf.cast(tf.reshape(t, [-1]), tf.int32)
...@@ -199,7 +205,7 @@ class Match(object):
"""
input_tensor = tf.concat(
[tf.stack([ignored_value, unmatched_value]),
input_tensor],
axis=0)
gather_indices = tf.maximum(self.match_results + 2, 0)
gathered_tensor = self._gather_op(input_tensor, gather_indices)
...
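`gather_based_on_match` avoids branching on the `-1`/`-2` sentinels by prepending the ignored and unmatched fill values to the input and shifting every index by two, so the sentinels land on those two slots. A plain-Python sketch of the trick:

```python
def gather_based_on_match(input_values, match_results,
                          unmatched_value, ignored_value):
    """Plain-Python analogue of Match.gather_based_on_match.

    Prepends [ignored_value, unmatched_value] and shifts match results
    by +2: -2 maps to slot 0 (ignored), -1 to slot 1 (unmatched), and a
    genuine match m >= 0 to slot m + 2, i.e. the original value.
    """
    padded = [ignored_value, unmatched_value] + list(input_values)
    return [padded[max(m + 2, 0)] for m in match_results]
```

This is the same shift that `tf.maximum(self.match_results + 2, 0)` performs above, and it keeps the op a single gather, which matters for the TPU-friendly `use_matmul_gather` path.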
...@@ -27,37 +27,42 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results)
expected_column_indices = [0, 1, 3, 5]
matched_column_indices = match.matched_column_indices()
self.assertEqual(matched_column_indices.dtype, tf.int32)
with self.test_session() as sess:
matched_column_indices = sess.run(matched_column_indices)
self.assertAllEqual(matched_column_indices, expected_column_indices)
def test_get_correct_counts(self):
match_results = tf.constant([3, 1, -1, 0, -1, 1, -2])
match = matcher.Match(match_results)
exp_num_matched_columns = 4
exp_num_unmatched_columns = 2
exp_num_ignored_columns = 1
exp_num_matched_rows = 3
num_matched_columns = match.num_matched_columns()
num_unmatched_columns = match.num_unmatched_columns()
num_ignored_columns = match.num_ignored_columns()
num_matched_rows = match.num_matched_rows()
self.assertEqual(num_matched_columns.dtype, tf.int32)
self.assertEqual(num_unmatched_columns.dtype, tf.int32)
self.assertEqual(num_ignored_columns.dtype, tf.int32)
self.assertEqual(num_matched_rows.dtype, tf.int32)
with self.test_session() as sess:
(num_matched_columns_out, num_unmatched_columns_out,
num_ignored_columns_out, num_matched_rows_out) = sess.run(
[num_matched_columns, num_unmatched_columns, num_ignored_columns,
num_matched_rows])
self.assertAllEqual(num_matched_columns_out, exp_num_matched_columns)
self.assertAllEqual(num_unmatched_columns_out, exp_num_unmatched_columns)
self.assertAllEqual(num_ignored_columns_out, exp_num_ignored_columns)
self.assertAllEqual(num_matched_rows_out, exp_num_matched_rows)
def testGetCorrectUnmatchedColumnIndices(self):
match_results = tf.constant([3, 1, -1, 0, -1, 5, -2])
match = matcher.Match(match_results)
expected_column_indices = [2, 4]
unmatched_column_indices = match.unmatched_column_indices()
self.assertEqual(unmatched_column_indices.dtype, tf.int32)
with self.test_session() as sess:
unmatched_column_indices = sess.run(unmatched_column_indices)
self.assertAllEqual(unmatched_column_indices, expected_column_indices)
...@@ -67,7 +72,7 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results)
expected_row_indices = [3, 1, 0, 5]
matched_row_indices = match.matched_row_indices()
self.assertEqual(matched_row_indices.dtype, tf.int32)
with self.test_session() as sess:
matched_row_inds = sess.run(matched_row_indices)
self.assertAllEqual(matched_row_inds, expected_row_indices)
...@@ -77,7 +82,7 @@ class MatchTest(tf.test.TestCase): ...@@ -77,7 +82,7 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results) match = matcher.Match(match_results)
expected_column_indices = [6] expected_column_indices = [6]
ignored_column_indices = match.ignored_column_indices() ignored_column_indices = match.ignored_column_indices()
self.assertEquals(ignored_column_indices.dtype, tf.int32) self.assertEqual(ignored_column_indices.dtype, tf.int32)
with self.test_session() as sess: with self.test_session() as sess:
ignored_column_indices = sess.run(ignored_column_indices) ignored_column_indices = sess.run(ignored_column_indices)
self.assertAllEqual(ignored_column_indices, expected_column_indices) self.assertAllEqual(ignored_column_indices, expected_column_indices)
...@@ -87,7 +92,7 @@ class MatchTest(tf.test.TestCase): ...@@ -87,7 +92,7 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results) match = matcher.Match(match_results)
expected_column_indicator = [True, True, False, True, False, True, False] expected_column_indicator = [True, True, False, True, False, True, False]
matched_column_indicator = match.matched_column_indicator() matched_column_indicator = match.matched_column_indicator()
self.assertEquals(matched_column_indicator.dtype, tf.bool) self.assertEqual(matched_column_indicator.dtype, tf.bool)
with self.test_session() as sess: with self.test_session() as sess:
matched_column_indicator = sess.run(matched_column_indicator) matched_column_indicator = sess.run(matched_column_indicator)
self.assertAllEqual(matched_column_indicator, expected_column_indicator) self.assertAllEqual(matched_column_indicator, expected_column_indicator)
...@@ -97,7 +102,7 @@ class MatchTest(tf.test.TestCase): ...@@ -97,7 +102,7 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results) match = matcher.Match(match_results)
expected_column_indicator = [False, False, True, False, True, False, False] expected_column_indicator = [False, False, True, False, True, False, False]
unmatched_column_indicator = match.unmatched_column_indicator() unmatched_column_indicator = match.unmatched_column_indicator()
self.assertEquals(unmatched_column_indicator.dtype, tf.bool) self.assertEqual(unmatched_column_indicator.dtype, tf.bool)
with self.test_session() as sess: with self.test_session() as sess:
unmatched_column_indicator = sess.run(unmatched_column_indicator) unmatched_column_indicator = sess.run(unmatched_column_indicator)
self.assertAllEqual(unmatched_column_indicator, expected_column_indicator) self.assertAllEqual(unmatched_column_indicator, expected_column_indicator)
...@@ -107,7 +112,7 @@ class MatchTest(tf.test.TestCase): ...@@ -107,7 +112,7 @@ class MatchTest(tf.test.TestCase):
match = matcher.Match(match_results) match = matcher.Match(match_results)
expected_column_indicator = [False, False, False, False, False, False, True] expected_column_indicator = [False, False, False, False, False, False, True]
ignored_column_indicator = match.ignored_column_indicator() ignored_column_indicator = match.ignored_column_indicator()
self.assertEquals(ignored_column_indicator.dtype, tf.bool) self.assertEqual(ignored_column_indicator.dtype, tf.bool)
with self.test_session() as sess: with self.test_session() as sess:
ignored_column_indicator = sess.run(ignored_column_indicator) ignored_column_indicator = sess.run(ignored_column_indicator)
self.assertAllEqual(ignored_column_indicator, expected_column_indicator) self.assertAllEqual(ignored_column_indicator, expected_column_indicator)
...@@ -118,7 +123,7 @@ class MatchTest(tf.test.TestCase): ...@@ -118,7 +123,7 @@ class MatchTest(tf.test.TestCase):
expected_column_indices = [2, 4, 6] expected_column_indices = [2, 4, 6]
unmatched_ignored_column_indices = (match. unmatched_ignored_column_indices = (match.
unmatched_or_ignored_column_indices()) unmatched_or_ignored_column_indices())
self.assertEqual(unmatched_ignored_column_indices.dtype, tf.int32)
with self.test_session() as sess:
unmatched_ignored_column_indices = sess.run(
unmatched_ignored_column_indices)
@@ -153,7 +158,7 @@ class MatchTest(tf.test.TestCase):
gathered_tensor = match.gather_based_on_match(input_tensor,
unmatched_value=100.,
ignored_value=200.)
self.assertEqual(gathered_tensor.dtype, tf.float32)
with self.test_session():
gathered_tensor_out = gathered_tensor.eval()
self.assertAllEqual(expected_gathered_tensor, gathered_tensor_out)
@@ -167,7 +172,7 @@ class MatchTest(tf.test.TestCase):
gathered_tensor = match.gather_based_on_match(input_tensor,
unmatched_value=tf.zeros(4),
ignored_value=tf.zeros(4))
self.assertEqual(gathered_tensor.dtype, tf.float32)
with self.test_session():
gathered_tensor_out = gathered_tensor.eval()
self.assertAllEqual(expected_gathered_tensor, gathered_tensor_out)
@@ -181,7 +186,7 @@ class MatchTest(tf.test.TestCase):
gathered_tensor = match.gather_based_on_match(input_tensor,
unmatched_value=tf.zeros(4),
ignored_value=tf.zeros(4))
self.assertEqual(gathered_tensor.dtype, tf.float32)
with self.test_session() as sess:
self.assertTrue(
all([op.name is not 'Gather' for op in sess.graph.get_operations()]))
@@ -12,9 +12,9 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Post-processing operations on detected boxes."""
import collections
import numpy as np
import tensorflow as tf
@@ -271,8 +271,8 @@ def batch_multiclass_non_max_suppression(boxes,
all images in the batch. If clip_window is None, all boxes are used to
perform non-max suppression.
change_coordinate_frame: Whether to normalize coordinates after clipping
relative to clip_window (this can only be set to True if a clip_window is
provided)
num_valid_boxes: (optional) a Tensor of type `int32`. A 1-D tensor of shape
[batch_size] representing the number of valid boxes to be considered
for each image in the batch. This parameter allows for ignoring zero
@@ -322,7 +322,17 @@ def batch_multiclass_non_max_suppression(boxes,
raise ValueError('if change_coordinate_frame is True, then a clip_window'
'must be specified.')
original_masks = masks
# Create ordered dictionary using the sorted keys from
# additional fields to ensure getting the same key value assignment
# in _single_image_nms_fn(). The dictionary is thus a sorted version of
# additional_fields.
if additional_fields is None:
ordered_additional_fields = {}
else:
ordered_additional_fields = collections.OrderedDict(
sorted(additional_fields.items(), key=lambda item: item[0]))
del additional_fields
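The block above canonicalizes `additional_fields` before its values are zipped against positional `map_fn` outputs. A minimal standalone sketch (plain Python, with made-up field names) of why the sorted `OrderedDict` yields a stable key-to-position mapping:

```python
import collections

# Hypothetical field names, for illustration only.
additional_fields = {'size': [2.0], 'keypoints': [1.0], 'embeddings': [3.0]}

# Sorting the items before building the OrderedDict fixes the iteration
# order, so zip(ordered_additional_fields, per_image_outputs) pairs each
# key with the same positional output on every run.
ordered_additional_fields = collections.OrderedDict(
    sorted(additional_fields.items(), key=lambda item: item[0]))

print(list(ordered_additional_fields))  # ['embeddings', 'keypoints', 'size']
```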
with tf.name_scope(scope, 'BatchMultiClassNonMaxSuppression'):
boxes_shape = boxes.shape
batch_size = boxes_shape[0].value
@@ -354,9 +364,6 @@ def batch_multiclass_non_max_suppression(boxes,
if clip_window.shape.ndims == 1:
clip_window = tf.tile(tf.expand_dims(clip_window, 0), [batch_size, 1])
def _single_image_nms_fn(args):
"""Runs NMS on a single image and returns padded output.
@@ -403,9 +410,11 @@ def batch_multiclass_non_max_suppression(boxes,
per_image_scores = args[1]
per_image_masks = args[2]
per_image_clip_window = args[3]
# Make sure that the order of elements passed in args is aligned with
# the iteration order of ordered_additional_fields
per_image_additional_fields = {
key: value
for key, value in zip(ordered_additional_fields, args[4:-1])
}
per_image_num_valid_boxes = args[-1]
if use_static_shapes:
@@ -459,21 +468,24 @@ def batch_multiclass_non_max_suppression(boxes,
nmsed_scores = nmsed_boxlist.get_field(fields.BoxListFields.scores)
nmsed_classes = nmsed_boxlist.get_field(fields.BoxListFields.classes)
nmsed_masks = nmsed_boxlist.get_field(fields.BoxListFields.masks)
nmsed_additional_fields = []
# Sorting is needed here to ensure that the values stored in
# nmsed_additional_fields are always kept in the same order
# across different execution runs.
for key in sorted(per_image_additional_fields.keys()):
nmsed_additional_fields.append(nmsed_boxlist.get_field(key))
return ([nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks] +
nmsed_additional_fields + [num_detections])
num_additional_fields = 0
if ordered_additional_fields:
num_additional_fields = len(ordered_additional_fields)
num_nmsed_outputs = 4 + num_additional_fields
batch_outputs = shape_utils.static_or_dynamic_map_fn(
_single_image_nms_fn,
elems=([boxes, scores, masks, clip_window] +
list(ordered_additional_fields.values()) + [num_valid_boxes]),
dtype=(num_nmsed_outputs * [tf.float32] + [tf.int32]),
parallel_iterations=parallel_iterations)
@@ -481,16 +493,23 @@ def batch_multiclass_non_max_suppression(boxes,
batch_nmsed_scores = batch_outputs[1]
batch_nmsed_classes = batch_outputs[2]
batch_nmsed_masks = batch_outputs[3]
batch_nmsed_values = batch_outputs[4:-1]
batch_nmsed_additional_fields = {}
if num_additional_fields > 0:
# Sort the keys to ensure arranging elements in same order as
# in _single_image_nms_fn.
batch_nmsed_keys = ordered_additional_fields.keys()
for i in range(len(batch_nmsed_keys)):
batch_nmsed_additional_fields[
batch_nmsed_keys[i]] = batch_nmsed_values[i]
batch_num_detections = batch_outputs[-1]
if original_masks is None:
batch_nmsed_masks = None
if not ordered_additional_fields:
batch_nmsed_additional_fields = None
return (batch_nmsed_boxes, batch_nmsed_scores, batch_nmsed_classes,
@@ -839,6 +839,9 @@ class MulticlassNonMaxSuppressionTest(test_case.TestCase):
[[0, 0], [0, 0]]]],
tf.float32)
}
additional_fields['size'] = tf.constant(
[[[[6], [8]], [[0], [2]], [[0], [0]], [[0], [0]]],
[[[13], [15]], [[8], [10]], [[10], [12]], [[0], [0]]]], tf.float32)
score_thresh = 0.1
iou_thresh = .5
max_output_size = 4
@@ -865,6 +868,10 @@ class MulticlassNonMaxSuppressionTest(test_case.TestCase):
[[8, 9], [10, 11]],
[[0, 0], [0, 0]]]])
}
exp_nms_additional_fields['size'] = np.array([[[[0], [0]], [[6], [8]],
[[0], [0]], [[0], [0]]],
[[[10], [12]], [[13], [15]],
[[8], [10]], [[0], [0]]]])
(nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks,
nmsed_additional_fields, num_detections
@@ -1071,6 +1078,11 @@ class MulticlassNonMaxSuppressionTest(test_case.TestCase):
[[0, 0], [0, 0]]]],
tf.float32)
}
additional_fields['size'] = tf.constant(
[[[[7], [9]], [[1], [3]], [[0], [0]], [[0], [0]]],
[[[14], [16]], [[9], [11]], [[11], [13]], [[0], [0]]]], tf.float32)
num_valid_boxes = tf.constant([1, 1], tf.int32)
score_thresh = 0.1
iou_thresh = .5
@@ -1099,6 +1111,11 @@ class MulticlassNonMaxSuppressionTest(test_case.TestCase):
[[0, 0], [0, 0]]]])
}
exp_nms_additional_fields['size'] = np.array([[[[7], [9]], [[0], [0]],
[[0], [0]], [[0], [0]]],
[[[14], [16]], [[0], [0]],
[[0], [0]], [[0], [0]]]])
(nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks,
nmsed_additional_fields, num_detections
) = post_processing.batch_multiclass_non_max_suppression(
@@ -2298,11 +2298,20 @@ def resize_to_range(image,
return result
def _get_image_info(image):
"""Returns the height, width and number of channels in the image."""
image_height = tf.shape(image)[0]
image_width = tf.shape(image)[1]
num_channels = tf.shape(image)[2]
return (image_height, image_width, num_channels)
# TODO(alirezafathi): Make sure the static shapes are preserved.
def resize_to_min_dimension(image, masks=None, min_dimension=600,
method=tf.image.ResizeMethod.BILINEAR):
"""Resizes image and masks given the min size maintaining the aspect ratio.
If one of the image dimensions is smaller than min_dimension, it will scale
the image such that its smallest dimension is equal to min_dimension.
Otherwise, will keep the image size as is.
@@ -2310,8 +2319,11 @@ def resize_to_min_dimension(image, masks=None, min_dimension=600):
image: a tensor of size [height, width, channels].
masks: (optional) a tensors of size [num_instances, height, width].
min_dimension: minimum image dimension.
method: (optional) interpolation method used in resizing. Defaults to
BILINEAR.
Returns:
An array containing resized_image, resized_masks, and resized_image_shape.
Note that the position of the resized_image_shape changes based on whether
masks are present.
resized_image: A tensor of size [new_height, new_width, channels].
@@ -2327,18 +2339,72 @@ def resize_to_min_dimension(image, masks=None, min_dimension=600):
raise ValueError('Image should be 3D tensor')
with tf.name_scope('ResizeGivenMinDimension', values=[image, min_dimension]):
(image_height, image_width, num_channels) = _get_image_info(image)
min_image_dimension = tf.minimum(image_height, image_width)
min_target_dimension = tf.maximum(min_image_dimension, min_dimension)
target_ratio = tf.to_float(min_target_dimension) / tf.to_float(
min_image_dimension)
target_height = tf.to_int32(tf.to_float(image_height) * target_ratio)
target_width = tf.to_int32(tf.to_float(image_width) * target_ratio)
image = tf.image.resize_images(
tf.expand_dims(image, axis=0), size=[target_height, target_width],
method=method,
align_corners=True)
result = [tf.squeeze(image, axis=0)]
if masks is not None:
masks = tf.image.resize_nearest_neighbor(
tf.expand_dims(masks, axis=3),
size=[target_height, target_width],
align_corners=True)
result.append(tf.squeeze(masks, axis=3))
result.append(tf.stack([target_height, target_width, num_channels]))
return result
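The scaling arithmetic in `resize_to_min_dimension` can be checked without TensorFlow. Below is a hedged sketch of the same shape computation in plain Python (the helper name is ours, not the library's): the image is upscaled only when its smaller side is below `min_dimension`, preserving aspect ratio, and left unchanged otherwise.

```python
def min_dimension_target_size(height, width, min_dimension=600):
    # Mirror of the tensor ops: tf.minimum / tf.maximum / ratio scaling.
    min_image_dimension = min(height, width)
    min_target_dimension = max(min_image_dimension, min_dimension)
    target_ratio = min_target_dimension / float(min_image_dimension)
    return int(height * target_ratio), int(width * target_ratio)

print(min_dimension_target_size(300, 400))  # upscaled: (600, 800)
print(min_dimension_target_size(700, 900))  # unchanged: (700, 900)
```

`resize_to_max_dimension` below is the mirror image: swap min/max and the image is only ever downscaled.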
def resize_to_max_dimension(image, masks=None, max_dimension=600,
method=tf.image.ResizeMethod.BILINEAR):
"""Resizes image and masks given the max size maintaining the aspect ratio.
If one of the image dimensions is greater than max_dimension, it will scale
the image such that its largest dimension is equal to max_dimension.
Otherwise, will keep the image size as is.
Args:
image: a tensor of size [height, width, channels].
masks: (optional) a tensors of size [num_instances, height, width].
max_dimension: maximum image dimension.
method: (optional) interpolation method used in resizing. Defaults to
BILINEAR.
Returns:
An array containing resized_image, resized_masks, and resized_image_shape.
Note that the position of the resized_image_shape changes based on whether
masks are present.
resized_image: A tensor of size [new_height, new_width, channels].
resized_masks: If masks is not None, also outputs masks. A 3D tensor of
shape [num_instances, new_height, new_width]
resized_image_shape: A 1D tensor of shape [3] containing the shape of the
resized image.
Raises:
ValueError: if the image is not a 3D tensor.
"""
if len(image.get_shape()) != 3:
raise ValueError('Image should be 3D tensor')
with tf.name_scope('ResizeGivenMaxDimension', values=[image, max_dimension]):
(image_height, image_width, num_channels) = _get_image_info(image)
max_image_dimension = tf.maximum(image_height, image_width)
max_target_dimension = tf.minimum(max_image_dimension, max_dimension)
target_ratio = tf.to_float(max_target_dimension) / tf.to_float(
max_image_dimension)
target_height = tf.to_int32(tf.to_float(image_height) * target_ratio)
target_width = tf.to_int32(tf.to_float(image_width) * target_ratio)
image = tf.image.resize_images(
tf.expand_dims(image, axis=0), size=[target_height, target_width],
method=method,
align_corners=True)
result = [tf.squeeze(image, axis=0)]
@@ -2663,6 +2663,68 @@ class PreprocessorTest(tf.test.TestCase):
out_image_shape = sess.run(out_image_shape)
self.assertAllEqual(out_image_shape, expected_shape)
def testResizeToMaxDimensionTensorShapes(self):
"""Tests both cases where image should and shouldn't be resized."""
in_image_shape_list = [[100, 50, 3], [15, 30, 3]]
in_masks_shape_list = [[15, 100, 50], [10, 15, 30]]
max_dim = 50
expected_image_shape_list = [[50, 25, 3], [15, 30, 3]]
expected_masks_shape_list = [[15, 50, 25], [10, 15, 30]]
for (in_image_shape, expected_image_shape, in_masks_shape,
expected_mask_shape) in zip(in_image_shape_list,
expected_image_shape_list,
in_masks_shape_list,
expected_masks_shape_list):
in_image = tf.placeholder(tf.float32, shape=(None, None, 3))
in_masks = tf.placeholder(tf.float32, shape=(None, None, None))
in_masks = tf.random_uniform(in_masks_shape)
out_image, out_masks, _ = preprocessor.resize_to_max_dimension(
in_image, in_masks, max_dimension=max_dim)
out_image_shape = tf.shape(out_image)
out_masks_shape = tf.shape(out_masks)
with self.test_session() as sess:
out_image_shape, out_masks_shape = sess.run(
[out_image_shape, out_masks_shape],
feed_dict={
in_image: np.random.randn(*in_image_shape),
in_masks: np.random.randn(*in_masks_shape)
})
self.assertAllEqual(out_image_shape, expected_image_shape)
self.assertAllEqual(out_masks_shape, expected_mask_shape)
def testResizeToMaxDimensionWithInstanceMasksTensorOfSizeZero(self):
"""Tests both cases where image should and shouldn't be resized."""
in_image_shape_list = [[100, 50, 3], [15, 30, 3]]
in_masks_shape_list = [[0, 100, 50], [0, 15, 30]]
max_dim = 50
expected_image_shape_list = [[50, 25, 3], [15, 30, 3]]
expected_masks_shape_list = [[0, 50, 25], [0, 15, 30]]
for (in_image_shape, expected_image_shape, in_masks_shape,
expected_mask_shape) in zip(in_image_shape_list,
expected_image_shape_list,
in_masks_shape_list,
expected_masks_shape_list):
in_image = tf.random_uniform(in_image_shape)
in_masks = tf.random_uniform(in_masks_shape)
out_image, out_masks, _ = preprocessor.resize_to_max_dimension(
in_image, in_masks, max_dimension=max_dim)
out_image_shape = tf.shape(out_image)
out_masks_shape = tf.shape(out_masks)
with self.test_session() as sess:
out_image_shape, out_masks_shape = sess.run(
[out_image_shape, out_masks_shape])
self.assertAllEqual(out_image_shape, expected_image_shape)
self.assertAllEqual(out_masks_shape, expected_mask_shape)
def testResizeToMaxDimensionRaisesErrorOn4DImage(self):
image = tf.random_uniform([1, 200, 300, 3])
with self.assertRaises(ValueError):
preprocessor.resize_to_max_dimension(image, 500)
def testResizeToMinDimensionTensorShapes(self):
in_image_shape_list = [[60, 55, 3], [15, 30, 3]]
in_masks_shape_list = [[15, 60, 55], [10, 15, 30]]
@@ -130,9 +130,13 @@ class TargetAssigner(object):
representing weights for each element in cls_targets.
reg_targets: a float32 tensor with shape [num_anchors, box_code_dimension]
reg_weights: a float32 tensor with shape [num_anchors]
match: an int32 tensor of shape [num_anchors] containing result of anchor
groundtruth matching. Each position in the tensor indicates an anchor
and holds the following meaning:
(1) if match[i] >= 0, anchor i is matched with groundtruth match[i].
(2) if match[i]=-1, anchor i is marked to be background .
(3) if match[i]=-2, anchor i is ignored since it is not background and
does not have sufficient overlap to call it a foreground.
Raises:
ValueError: if anchors or groundtruth_boxes are not of type
@@ -203,7 +207,8 @@ class TargetAssigner(object):
reg_weights = self._reset_target_shape(reg_weights, num_anchors)
cls_weights = self._reset_target_shape(cls_weights, num_anchors)
return (cls_targets, cls_weights, reg_targets, reg_weights,
match.match_results)
def _reset_target_shape(self, target, num_anchors):
"""Sets the static shape of the target.
@@ -416,12 +421,12 @@ def create_target_assigner(reference, stage=None,
negative_class_weight=negative_class_weight)
def batch_assign(target_assigner,
anchors_batch,
gt_box_batch,
gt_class_targets_batch,
unmatched_class_label=None,
gt_weights_batch=None):
"""Batched assignment of classification and regression targets.
Args:
@@ -450,10 +455,14 @@ def batch_assign_targets(target_assigner,
batch_reg_targets: a tensor with shape [batch_size, num_anchors,
box_code_dimension]
batch_reg_weights: a tensor with shape [batch_size, num_anchors],
match: an int32 tensor of shape [batch_size, num_anchors] containing result
of anchor groundtruth matching. Each position in the tensor indicates an
anchor and holds the following meaning:
(1) if match[x, i] >= 0, anchor i is matched with groundtruth match[x, i].
(2) if match[x, i]=-1, anchor i is marked to be background .
(3) if match[x, i]=-2, anchor i is ignored since it is not background and
does not have sufficient overlap to call it a foreground.
Raises:
ValueError: if input list lengths are inconsistent, i.e.,
batch_size == len(gt_box_batch) == len(gt_class_targets_batch)
@@ -491,8 +500,55 @@ def batch_assign_targets(target_assigner,
batch_cls_weights = tf.stack(cls_weights_list)
batch_reg_targets = tf.stack(reg_targets_list)
batch_reg_weights = tf.stack(reg_weights_list)
batch_match = tf.stack(match_list)
return (batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, batch_match)
# Assign an alias to avoid large refactor of existing users.
batch_assign_targets = batch_assign
def batch_get_targets(batch_match, groundtruth_tensor_list,
groundtruth_weights_list, unmatched_value,
unmatched_weight):
"""Returns targets based on anchor-groundtruth box matching results.
Args:
batch_match: An int32 tensor of shape [batch, num_anchors] containing the
result of target assignment returned by TargetAssigner.assign(..).
groundtruth_tensor_list: A list of groundtruth tensors of shape
[num_groundtruth, d_1, d_2, ..., d_k]. The tensors can be of any type.
groundtruth_weights_list: A list of weights, one per groundtruth tensor, of
shape [num_groundtruth].
unmatched_value: A tensor of shape [d_1, d_2, ..., d_k] of the same type as
groundtruth tensor containing target value for anchors that remain
unmatched.
unmatched_weight: Scalar weight to assign to anchors that remain unmatched.
Returns:
targets: A tensor of shape [batch, num_anchors, d_1, d_2, ..., d_k]
containing targets for anchors.
weights: A float tensor of shape [batch, num_anchors] containing the weights
to assign to each target.
"""
match_list = tf.unstack(batch_match)
targets_list = []
weights_list = []
for match_tensor, groundtruth_tensor, groundtruth_weight in zip(
match_list, groundtruth_tensor_list, groundtruth_weights_list):
match_object = mat.Match(match_tensor)
targets = match_object.gather_based_on_match(
groundtruth_tensor,
unmatched_value=unmatched_value,
ignored_value=unmatched_value)
targets_list.append(targets)
weights = match_object.gather_based_on_match(
groundtruth_weight,
unmatched_value=unmatched_weight,
ignored_value=tf.zeros_like(unmatched_weight))
weights_list.append(weights)
return tf.stack(targets_list), tf.stack(weights_list)
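For readers adjusting to the new match-tensor convention, here is an illustrative numpy sketch (not the library implementation) of how `gather_based_on_match` treats the three value ranges described in the docstrings above: non-negative entries index into the groundtruth, while -1 (background) and -2 (ignored) fall back to the unmatched value.

```python
import numpy as np

def gather_based_on_match_sketch(match, groundtruth, unmatched_value):
    # match[i] >= 0: gather groundtruth[match[i]];
    # match[i] == -1 (background) or -2 (ignored): use unmatched_value.
    return np.array([groundtruth[m] if m >= 0 else unmatched_value
                     for m in match])

match = np.array([1, -1, 0, -2])
groundtruth = np.array([10.0, 20.0])
print(gather_based_on_match_sketch(match, groundtruth, unmatched_value=0.0))
# [20.  0. 10.  0.]
```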
def batch_assign_confidences(target_assigner,
@@ -548,10 +604,13 @@ def batch_assign_confidences(target_assigner,
batch_reg_targets: a tensor with shape [batch_size, num_anchors,
box_code_dimension]
batch_reg_weights: a tensor with shape [batch_size, num_anchors],
match: an int32 tensor of shape [batch_size, num_anchors] containing result
of anchor groundtruth matching. Each position in the tensor indicates an
anchor and holds the following meaning:
(1) if match[x, i] >= 0, anchor i is matched with groundtruth match[x, i].
(2) if match[x, i]=-1, anchor i is marked to be background .
(3) if match[x, i]=-2, anchor i is ignored since it is not background and
does not have sufficient overlap to call it a foreground.
Raises:
ValueError: if input list lengths are inconsistent, i.e.,
@@ -634,5 +693,6 @@ def batch_assign_confidences(target_assigner,
batch_cls_weights = tf.stack(cls_weights_list)
batch_reg_targets = tf.stack(reg_targets_list)
batch_reg_weights = tf.stack(reg_weights_list)
batch_match = tf.stack(match_list)
return (batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, batch_match)