Unverified Commit 02a9969e authored by pkulzc, committed by GitHub

Refactor object detection box predictors and fix some issues with model_main. (#4965)

* Merged commit includes the following changes:
206852642  by Zhichao Lu:

    Build the balanced_positive_negative_sampler in the model builder for FasterRCNN. Also adds an option to use the static implementation of the sampler.

--
206803260  by Zhichao Lu:

    Fixes a misplaced argument in resnet fpn feature extractor.

--
206682736  by Zhichao Lu:

    This CL modifies the SSD meta architecture to support both Slim-based and Keras-based box predictors, and begins preparation for Keras box predictor support in the other meta architectures.

    Concretely, this CL adds a new `KerasBoxPredictor` base class and makes the meta architectures appropriately call whichever box predictors they are using.

    We can switch the non-ssd meta architectures to fully support Keras box predictors once the Keras Convolutional Box Predictor CL is submitted.

--
206669634  by Zhichao Lu:

    Adds an alternate method for the balanced positive/negative sampler that uses static shapes.
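The static-shape idea can be sketched in NumPy: rank randomized scores (boosted for the desired class) and take a fixed top-k, so the output size never depends on how many positives actually exist. `static_balanced_sample` is a hypothetical helper for illustration, not the actual sampler API:

```python
import numpy as np

def static_balanced_sample(labels, is_valid, sample_size,
                           positive_fraction=0.5, rng=None):
    """Samples a fixed number of indices with a target positive fraction.

    The returned index array always has static length `sample_size`, which is
    what makes this pattern TPU-friendly: shapes don't depend on how many
    positives or negatives exist. (Illustrative sketch only.)
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    num_pos = int(positive_fraction * sample_size)
    # Random scores break ties; the +2.0 boost ranks valid positives (or valid
    # negatives) ahead of everything else, since raw scores lie in [0, 1).
    pos_scores = rng.random(labels.shape) + 2.0 * (labels & is_valid)
    neg_scores = rng.random(labels.shape) + 2.0 * (~labels & is_valid)
    pos_idx = np.argsort(-pos_scores)[:num_pos]
    neg_idx = np.argsort(-neg_scores)[:sample_size - num_pos]
    return np.concatenate([pos_idx, neg_idx])
```

Unlike a dynamic sampler, this always emits `sample_size` indices even when fewer real positives exist (the boost simply runs out), so callers must weight out the overflow, which is the usual trade-off of static-shape sampling.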

--
206643278  by Zhichao Lu:

    This CL adds a Keras layer hyperparameter configuration object to the hyperparams_builder.

    It automatically converts from Slim layer hyperparameter configs to Keras layer hyperparameters. Namely, it:
    - Builds Keras initializers/regularizers instead of Slim ones
    - sets weights_regularizer/initializer to kernel_regularizer/initializer
    - converts batchnorm decay to momentum
    - converts Slim l2 regularizer weights to the equivalent Keras l2 weights

    This will be used in the conversion of object detection feature extractors & box predictors to newer Tensorflow APIs.
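The mapping above can be sketched as a plain dictionary transform. The names are illustrative, not the actual hyperparams_builder API; the factor of 2 on the l2 weight follows from Slim's l2_regularizer computing `scale * sum(w**2) / 2` while Keras l2 computes `l * sum(w**2)`:

```python
def slim_to_keras_hyperparams(slim_config):
    """Sketch of the Slim -> Keras hyperparameter mapping (illustrative names,
    not the real hyperparams_builder interface)."""
    return {
        # weights_regularizer -> kernel_regularizer; the equivalent Keras l2
        # weight is the Slim scale divided by 2 (see lead-in for why).
        'kernel_regularizer_l2': slim_config['weights_l2_scale'] / 2.0,
        # weights_initializer -> kernel_initializer (carried over unchanged).
        'kernel_initializer': slim_config['weights_initializer'],
        # Slim batch_norm 'decay' becomes Keras BatchNormalization 'momentum'.
        'batch_norm_momentum': slim_config['batch_norm_decay'],
    }
```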

--
206611681  by Zhichao Lu:

    Internal changes.

--
206591619  by Zhichao Lu:

    Clip input tensors to the expected padded static shape when they are larger than that shape.

--
206517644  by Zhichao Lu:

    Make MultiscaleGridAnchorGenerator more consistent with MultipleGridAnchorGenerator.

--
206415624  by Zhichao Lu:

    Make the hardcoded feature pyramid network (FPN) levels configurable for both SSD
    Resnet and SSD Mobilenet.

--
206398204  by Zhichao Lu:

    This CL modifies the SSD meta architecture to support both Slim-based and Keras-based feature extractors.

    This allows us to begin the conversion of object detection to newer Tensorflow APIs.

--
206213448  by Zhichao Lu:

    Adding a method to compute the expected classification loss by background/foreground weighting.
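The weighting reads as a per-anchor expectation: each anchor's loss is its foreground probability times the loss it would incur as foreground, plus the complement times its background loss. A minimal sketch with illustrative names (the commit does not spell out the exact formulation):

```python
def expected_classification_loss(fg_prob, loss_if_foreground,
                                 loss_if_background):
    """Expected loss under foreground/background weighting (sketch only).

    fg_prob is the anchor's probability of being foreground; the result is
    the probability-weighted mix of the two conditional losses.
    """
    return fg_prob * loss_if_foreground + (1.0 - fg_prob) * loss_if_background
```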

--
206204232  by Zhichao Lu:

    Adding the keypoint head to the Mask RCNN pipeline.

--
206200352  by Zhichao Lu:

    - Create Faster R-CNN target assigner in the model builder. This allows configuring matchers in Target assigner to use TPU compatible ops (tf.gather in this case) without any change in meta architecture.
    - As a positive side effect of the refactoring, we can now re-use a single target assigner for all of the second-stage heads in Faster R-CNN.
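The TPU-compatible gather mentioned above replaces a dynamic row lookup with a dense matmul of a one-hot indicator against the table, which computes the same rows. A NumPy sketch of the trick (the real matcher uses TensorFlow ops):

```python
import numpy as np

def matmul_gather(params, indices):
    """Gather rows of `params` via one-hot matmul instead of a dynamic gather.

    Dynamic gathers can be slow or unsupported on TPUs; multiplying a one-hot
    indicator matrix against `params` yields exactly params[indices].
    """
    one_hot = np.eye(params.shape[0])[indices]  # [num_indices, num_rows]
    return one_hot @ params                     # equals params[indices]
```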

--
206178206  by Zhichao Lu:

    Force the SSD feature extractor builder to use keyword arguments so that values aren't passed to the wrong arguments.

--
206168297  by Zhichao Lu:

    Updating exporter to use freeze_graph.freeze_graph_with_def_protos rather than a homegrown version.

--
206080748  by Zhichao Lu:

    Merge external contributions.

--
206074460  by Zhichao Lu:

    Update the preprocessor to apply temperature scaling and a softmax to the multiclass scores on read.
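Temperature-scaled softmax divides the scores by the temperature before normalizing; higher temperatures flatten the distribution and temperature 1.0 is a plain softmax. A minimal NumPy sketch (the preprocessor itself operates on TF tensors):

```python
import numpy as np

def temperature_softmax(multiclass_scores, temperature):
    """Applies temperature scaling followed by softmax along the last axis."""
    logits = multiclass_scores / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)
```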

--
205960802  by Zhichao Lu:

    Fixing a bug in hierarchical label expansion script.

--
205944686  by Zhichao Lu:

    Update exporter to support exporting quantized model.

--
205912529  by Zhichao Lu:

    Add a two-stage matcher to allow thresholding by one criterion and then argmaxing on the other.
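The two stages can be sketched in NumPy: first mask out candidates that fail the threshold on one criterion (say, IoU), then argmax the survivors on the other (say, scores). `threshold_then_argmax` is a hypothetical helper illustrating the idea, not the actual matcher API:

```python
import numpy as np

def threshold_then_argmax(similarity, scores, threshold):
    """Per column (anchor): index of the best-scoring row whose similarity
    clears the threshold, or -1 if no row does. (Illustrative sketch.)"""
    eligible = similarity >= threshold                      # stage 1: threshold
    masked = np.where(eligible, scores[:, None], -np.inf)   # broadcast scores
    matches = masked.argmax(axis=0)                         # stage 2: argmax
    matches[~eligible.any(axis=0)] = -1                     # unmatched columns
    return matches
```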

--
205909017  by Zhichao Lu:

    Add test for grayscale image_resizer

--
205892801  by Zhichao Lu:

    Add flag to decide whether to apply batch norm to conv layers of weight shared box predictor.

--
205824449  by Zhichao Lu:

    Make sure that, by default, the Mask R-CNN box predictor predicts 2 stages.

--
205730139  by Zhichao Lu:

    Updating warning message to be more explicit about variable size mismatch.

--
205696992  by Zhichao Lu:

    Remove utils/ops.py's dependency on core/box_list_ops.py. This will allow re-using TPU compatible ops from utils/ops.py in core/box_list_ops.py.

--
205696867  by Zhichao Lu:

    Refactor the Mask R-CNN predictor so that each head is in a separate file.
    This CL lets us add new heads to Mask R-CNN more easily in the future.

--
205492073  by Zhichao Lu:

    Refactor R-FCN box predictor to be TPU compliant.

    - Change utils/ops.py:position_sensitive_crop_regions to operate on a single image and a set of boxes without `box_ind`.
    - Add a batch version that operates on batches of images and batches of boxes.
    - Refactor R-FCN box predictor to use the batched version of position sensitive crop regions.

--
205453567  by Zhichao Lu:

    Fix a bug that prevented exporting the inference graph when the write_inference_graph flag is True.

--
205316039  by Zhichao Lu:

    Changing input tensor name.

--
205256307  by Zhichao Lu:

    Fix model zoo links for quantized model.

--
205164432  by Zhichao Lu:

    Fixes eval error when label map contains non-ascii characters.

--
205129842  by Zhichao Lu:

    Adds an option to clip the anchors to the window size without filtering the overlapping boxes in Faster R-CNN.

--
205094863  by Zhichao Lu:

    Update label map util to allow optionally adding a background class and filling in gaps in the label map. This is useful when using multiclass scores, which require a complete label map with an explicit background label.
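Filling gaps amounts to walking the id range and inserting placeholder entries plus a background class at id 0. `fill_label_map_gaps` is a hypothetical helper sketching the behavior; the real logic lives in label_map_util:

```python
def fill_label_map_gaps(label_map, max_id, background_name='background'):
    """Returns an {id: name} map covering 0..max_id with no gaps.

    Missing ids get placeholder names, and id 0 becomes an explicit background
    class. (Illustrative sketch, not the actual label_map_util API.)
    """
    filled = {0: background_name}
    for class_id in range(1, max_id + 1):
        filled[class_id] = label_map.get(class_id, 'class_%d' % class_id)
    return filled
```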

--
204989032  by Zhichao Lu:

    Add tf.prof support to exporter.

--
204825267  by Zhichao Lu:

    Modify mask rcnn box predictor tests for TPU compatibility.

--
204778749  by Zhichao Lu:

    Remove score filtering from postprocessing.py and rely on filtering logic in tf.image.non_max_suppression

--
204775818  by Zhichao Lu:

    Python3 fixes for object_detection.

--
204745920  by Zhichao Lu:

    Object Detection Dataset visualization tool (documentation).

--
204686993  by Zhichao Lu:

    Internal changes.

--
204559667  by Zhichao Lu:

    Refactor box_predictor.py into multiple files.
    The abstract base class remains in object_detection/core; the other classes each move to a separate file in object_detection/predictors.

--
204552847  by Zhichao Lu:

    Update blog post link.

--
204508028  by Zhichao Lu:

    Bump down the batch size to 1024 to be a bit more tolerant of OOMs, and double the number of iterations. This job still converges to 20.5 mAP in 3 hours.

--

PiperOrigin-RevId: 206852642

* Add original post-processing back.
parent d135ed9c
@@ -18,6 +18,7 @@ import tensorflow as tf
 from object_detection.core import box_list
 from object_detection.core import region_similarity_calculator
+from object_detection.core import standard_fields as fields


 class RegionSimilarityCalculatorTest(tf.test.TestCase):
@@ -70,6 +71,25 @@ class RegionSimilarityCalculatorTest(tf.test.TestCase):
       self.assertAllClose(iou_output_1, exp_output_1)
       self.assertAllClose(iou_output_2, exp_output_2)
+
+  def test_get_correct_pairwise_similarity_based_on_thresholded_iou(self):
+    corners1 = tf.constant([[4.0, 3.0, 7.0, 5.0], [5.0, 6.0, 10.0, 7.0]])
+    corners2 = tf.constant([[3.0, 4.0, 6.0, 8.0], [14.0, 14.0, 15.0, 15.0],
+                            [0.0, 0.0, 20.0, 20.0]])
+    scores = tf.constant([.3, .6])
+    iou_threshold = .013
+    exp_output = tf.constant([[0.3, 0., 0.3], [0.6, 0., 0.]])
+    boxes1 = box_list.BoxList(corners1)
+    boxes1.add_field(fields.BoxListFields.scores, scores)
+    boxes2 = box_list.BoxList(corners2)
+    iou_similarity_calculator = (
+        region_similarity_calculator.ThresholdedIouSimilarity(
+            iou_threshold=iou_threshold))
+    iou_similarity = iou_similarity_calculator.compare(boxes1, boxes2)
+    with self.test_session() as sess:
+      iou_output = sess.run(iou_similarity)
+    self.assertAllClose(iou_output, exp_output)


 if __name__ == '__main__':
   tf.test.main()
@@ -49,7 +49,7 @@ class TargetAssigner(object):
   """Target assigner to compute classification and regression targets."""

   def __init__(self, similarity_calc, matcher, box_coder,
-               negative_class_weight=1.0, unmatched_cls_target=None):
+               negative_class_weight=1.0):
     """Construct Object Detection Target Assigner.

     Args:
@@ -60,12 +60,6 @@ class TargetAssigner(object):
         groundtruth boxes with respect to anchors.
       negative_class_weight: classification weight to be associated to negative
         anchors (default: 1.0). The weight must be in [0., 1.].
-      unmatched_cls_target: a float32 tensor with shape [d_1, d_2, ..., d_k]
-        which is consistent with the classification target for each
-        anchor (and can be empty for scalar targets). This shape must thus be
-        compatible with the groundtruth labels that are passed to the "assign"
-        function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
-        If set to None, unmatched_cls_target is set to be [0] for each anchor.

     Raises:
       ValueError: if similarity_calc is not a RegionSimilarityCalculator or
@@ -81,17 +75,14 @@ class TargetAssigner(object):
     self._matcher = matcher
     self._box_coder = box_coder
     self._negative_class_weight = negative_class_weight
-    if unmatched_cls_target is None:
-      self._unmatched_cls_target = tf.constant([0], tf.float32)
-    else:
-      self._unmatched_cls_target = unmatched_cls_target

   @property
   def box_coder(self):
     return self._box_coder

+  # TODO(rathodv): move labels, scores, and weights to groundtruth_boxes fields.
   def assign(self, anchors, groundtruth_boxes, groundtruth_labels=None,
-             groundtruth_weights=None, **params):
+             unmatched_class_label=None, groundtruth_weights=None, **params):
     """Assign classification and regression targets to each anchor.

     For a given set of anchors and groundtruth detections, match anchors
@@ -110,6 +101,12 @@ class TargetAssigner(object):
         [d_1, ... d_k] can be empty (corresponding to scalar inputs). When set
         to None, groundtruth_labels assumes a binary problem where all
         ground_truth boxes get a positive label (of 1).
+      unmatched_class_label: a float32 tensor with shape [d_1, d_2, ..., d_k]
+        which is consistent with the classification target for each
+        anchor (and can be empty for scalar targets). This shape must thus be
+        compatible with the groundtruth labels that are passed to the "assign"
+        function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
+        If set to None, unmatched_cls_target is set to be [0] for each anchor.
       groundtruth_weights: a float tensor of shape [M] indicating the weight to
         assign to all anchors match to a particular groundtruth box. The weights
         must be in [0., 1.]. If None, all weights are set to 1.
@@ -136,14 +133,17 @@ class TargetAssigner(object):
     if not isinstance(groundtruth_boxes, box_list.BoxList):
       raise ValueError('groundtruth_boxes must be an BoxList')

+    if unmatched_class_label is None:
+      unmatched_class_label = tf.constant([0], tf.float32)
+
     if groundtruth_labels is None:
       groundtruth_labels = tf.ones(tf.expand_dims(groundtruth_boxes.num_boxes(),
                                                   0))
       groundtruth_labels = tf.expand_dims(groundtruth_labels, -1)
     unmatched_shape_assert = shape_utils.assert_shape_equal(
         shape_utils.combined_static_and_dynamic_shape(groundtruth_labels)[1:],
-        shape_utils.combined_static_and_dynamic_shape(
-            self._unmatched_cls_target))
+        shape_utils.combined_static_and_dynamic_shape(unmatched_class_label))
     labels_and_box_shapes_assert = shape_utils.assert_shape_equal(
         shape_utils.combined_static_and_dynamic_shape(
             groundtruth_labels)[:1],
@@ -155,6 +155,12 @@ class TargetAssigner(object):
       if not num_gt_boxes:
         num_gt_boxes = groundtruth_boxes.num_boxes()
       groundtruth_weights = tf.ones([num_gt_boxes], dtype=tf.float32)
+
+    # set scores on the gt boxes
+    scores = 1 - groundtruth_labels[:, 0]
+    groundtruth_boxes.add_field(fields.BoxListFields.scores, scores)
+
     with tf.control_dependencies(
         [unmatched_shape_assert, labels_and_box_shapes_assert]):
       match_quality_matrix = self._similarity_calc.compare(groundtruth_boxes,
@@ -164,6 +170,7 @@ class TargetAssigner(object):
                                                  groundtruth_boxes,
                                                  match)
       cls_targets = self._create_classification_targets(groundtruth_labels,
+                                                        unmatched_class_label,
                                                         match)
       reg_weights = self._create_regression_weights(match, groundtruth_weights)
       cls_weights = self._create_classification_weights(match,
@@ -245,7 +252,8 @@ class TargetAssigner(object):
     """
     return tf.constant([self._box_coder.code_size*[0]], tf.float32)

-  def _create_classification_targets(self, groundtruth_labels, match):
+  def _create_classification_targets(self, groundtruth_labels,
+                                     unmatched_class_label, match):
     """Create classification targets for each anchor.

     Assign a classification target of for each anchor to the matching
@@ -256,6 +264,11 @@ class TargetAssigner(object):
       groundtruth_labels: a tensor of shape [num_gt_boxes, d_1, ... d_k]
         with labels for each of the ground_truth boxes. The subshape
         [d_1, ... d_k] can be empty (corresponding to scalar labels).
+      unmatched_class_label: a float32 tensor with shape [d_1, d_2, ..., d_k]
+        which is consistent with the classification target for each
+        anchor (and can be empty for scalar targets). This shape must thus be
+        compatible with the groundtruth labels that are passed to the "assign"
+        function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
       match: a matcher.Match object that provides a matching between anchors
         and groundtruth boxes.
@@ -266,8 +279,8 @@ class TargetAssigner(object):
     """
     return match.gather_based_on_match(
         groundtruth_labels,
-        unmatched_value=self._unmatched_cls_target,
-        ignored_value=self._unmatched_cls_target)
+        unmatched_value=unmatched_class_label,
+        ignored_value=unmatched_class_label)

   def _create_regression_weights(self, match, groundtruth_weights):
     """Set regression weight for each anchor.
@@ -327,8 +340,7 @@ class TargetAssigner(object):
 # TODO(rathodv): This method pulls in all the implementation dependencies into
 # core. Therefore its best to have this factory method outside of core.
 def create_target_assigner(reference, stage=None,
-                           negative_class_weight=1.0,
-                           unmatched_cls_target=None):
+                           negative_class_weight=1.0, use_matmul_gather=False):
   """Factory function for creating standard target assigners.

   Args:
@@ -336,12 +348,8 @@ def create_target_assigner(reference, stage=None,
     stage: string denoting stage: {proposal, detection}.
     negative_class_weight: classification weight to be associated to negative
       anchors (default: 1.0)
-    unmatched_cls_target: a float32 tensor with shape [d_1, d_2, ..., d_k]
-      which is consistent with the classification target for each
-      anchor (and can be empty for scalar targets). This shape must thus be
-      compatible with the groundtruth labels that are passed to the Assign
-      function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
-      If set to None, unmatched_cls_target is set to be 0 for each anchor.
+    use_matmul_gather: whether to use matrix multiplication based gather which
+      are better suited for TPUs.

   Returns:
     TargetAssigner: desired target assigner.
@@ -358,7 +366,8 @@ def create_target_assigner(reference, stage=None,
     similarity_calc = sim_calc.IouSimilarity()
     matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.7,
                                            unmatched_threshold=0.3,
-                                           force_match_for_each_row=True)
+                                           force_match_for_each_row=True,
+                                           use_matmul_gather=use_matmul_gather)
     box_coder = faster_rcnn_box_coder.FasterRcnnBoxCoder(
         scale_factors=[10.0, 10.0, 5.0, 5.0])
@@ -366,7 +375,8 @@ def create_target_assigner(reference, stage=None,
     similarity_calc = sim_calc.IouSimilarity()
     # Uses all proposals with IOU < 0.5 as candidate negatives.
     matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
-                                           negatives_lower_than_unmatched=True)
+                                           negatives_lower_than_unmatched=True,
+                                           use_matmul_gather=use_matmul_gather)
     box_coder = faster_rcnn_box_coder.FasterRcnnBoxCoder(
         scale_factors=[10.0, 10.0, 5.0, 5.0])
@@ -375,21 +385,22 @@ def create_target_assigner(reference, stage=None,
     matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
                                            unmatched_threshold=0.1,
                                            force_match_for_each_row=False,
-                                           negatives_lower_than_unmatched=False)
+                                           negatives_lower_than_unmatched=False,
+                                           use_matmul_gather=use_matmul_gather)
     box_coder = faster_rcnn_box_coder.FasterRcnnBoxCoder()
   else:
     raise ValueError('No valid combination of reference and stage.')

   return TargetAssigner(similarity_calc, matcher, box_coder,
-                        negative_class_weight=negative_class_weight,
-                        unmatched_cls_target=unmatched_cls_target)
+                        negative_class_weight=negative_class_weight)


 def batch_assign_targets(target_assigner,
                          anchors_batch,
                          gt_box_batch,
                          gt_class_targets_batch,
+                         unmatched_class_label=None,
                          gt_weights_batch=None):
   """Batched assignment of classification and regression targets.

@@ -403,6 +414,11 @@ def batch_assign_targets(target_assigner,
       each tensor has shape [num_gt_boxes_i, classification_target_size] and
       num_gt_boxes_i is the number of boxes in the ith boxlist of
       gt_box_batch.
+    unmatched_class_label: a float32 tensor with shape [d_1, d_2, ..., d_k]
+      which is consistent with the classification target for each
+      anchor (and can be empty for scalar targets). This shape must thus be
+      compatible with the groundtruth labels that are passed to the "assign"
+      function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
     gt_weights_batch: A list of 1-D tf.float32 tensors of shape
       [num_boxes] containing weights for groundtruth boxes.
@@ -442,9 +458,9 @@ def batch_assign_targets(target_assigner,
     gt_weights_batch = [None] * len(gt_class_targets_batch)
   for anchors, gt_boxes, gt_class_targets, gt_weights in zip(
       anchors_batch, gt_box_batch, gt_class_targets_batch, gt_weights_batch):
-    (cls_targets, cls_weights, reg_targets,
-     reg_weights, match) = target_assigner.assign(
-         anchors, gt_boxes, gt_class_targets, gt_weights)
+    (cls_targets, cls_weights, reg_targets, reg_weights,
+     match) = target_assigner.assign(anchors, gt_boxes, gt_class_targets,
+                                     unmatched_class_label, gt_weights)
     cls_targets_list.append(cls_targets)
     cls_weights_list.append(cls_weights)
     reg_targets_list.append(reg_targets)
...
...@@ -37,10 +37,11 @@ class TargetAssignerTest(test_case.TestCase): ...@@ -37,10 +37,11 @@ class TargetAssignerTest(test_case.TestCase):
unmatched_threshold=0.5) unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1) box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
target_assigner = targetassigner.TargetAssigner( target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder, unmatched_cls_target=None) similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means) anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners) groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist) result = target_assigner.assign(
anchors_boxlist, groundtruth_boxlist, unmatched_class_label=None)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result (cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights) return (cls_targets, cls_weights, reg_targets, reg_weights)
...@@ -81,10 +82,11 @@ class TargetAssignerTest(test_case.TestCase): ...@@ -81,10 +82,11 @@ class TargetAssignerTest(test_case.TestCase):
unmatched_threshold=0.3) unmatched_threshold=0.3)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1) box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
target_assigner = targetassigner.TargetAssigner( target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder, unmatched_cls_target=None) similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means) anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners) groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist) result = target_assigner.assign(
anchors_boxlist, groundtruth_boxlist, unmatched_class_label=None)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result (cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights) return (cls_targets, cls_weights, reg_targets, reg_weights)
...@@ -120,12 +122,13 @@ class TargetAssignerTest(test_case.TestCase): ...@@ -120,12 +122,13 @@ class TargetAssignerTest(test_case.TestCase):
box_coder = keypoint_box_coder.KeypointBoxCoder( box_coder = keypoint_box_coder.KeypointBoxCoder(
num_keypoints=6, scale_factors=[10.0, 10.0, 5.0, 5.0]) num_keypoints=6, scale_factors=[10.0, 10.0, 5.0, 5.0])
target_assigner = targetassigner.TargetAssigner( target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder, unmatched_cls_target=None) similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means) anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners) groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
groundtruth_boxlist.add_field(fields.BoxListFields.keypoints, groundtruth_boxlist.add_field(fields.BoxListFields.keypoints,
groundtruth_keypoints) groundtruth_keypoints)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist) result = target_assigner.assign(
anchors_boxlist, groundtruth_boxlist, unmatched_class_label=None)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result (cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights) return (cls_targets, cls_weights, reg_targets, reg_weights)
...@@ -174,12 +177,13 @@ class TargetAssignerTest(test_case.TestCase): ...@@ -174,12 +177,13 @@ class TargetAssignerTest(test_case.TestCase):
box_coder = keypoint_box_coder.KeypointBoxCoder( box_coder = keypoint_box_coder.KeypointBoxCoder(
num_keypoints=6, scale_factors=[10.0, 10.0, 5.0, 5.0]) num_keypoints=6, scale_factors=[10.0, 10.0, 5.0, 5.0])
target_assigner = targetassigner.TargetAssigner( target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder, unmatched_cls_target=None) similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means) anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners) groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
groundtruth_boxlist.add_field(fields.BoxListFields.keypoints, groundtruth_boxlist.add_field(fields.BoxListFields.keypoints,
groundtruth_keypoints) groundtruth_keypoints)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist) result = target_assigner.assign(
anchors_boxlist, groundtruth_boxlist, unmatched_class_label=None)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result (cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights) return (cls_targets, cls_weights, reg_targets, reg_weights)
...@@ -221,15 +225,17 @@ class TargetAssignerTest(test_case.TestCase): ...@@ -221,15 +225,17 @@ class TargetAssignerTest(test_case.TestCase):
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5, matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
unmatched_threshold=0.5) unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1) box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
unmatched_cls_target = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32) unmatched_class_label = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
target_assigner = targetassigner.TargetAssigner( target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder, similarity_calc, matcher, box_coder)
unmatched_cls_target=unmatched_cls_target)
anchors_boxlist = box_list.BoxList(anchor_means) anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners) groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist, result = target_assigner.assign(
groundtruth_labels) anchors_boxlist,
groundtruth_boxlist,
groundtruth_labels,
unmatched_class_label=unmatched_class_label)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result (cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights) return (cls_targets, cls_weights, reg_targets, reg_weights)
@@ -275,16 +281,18 @@ class TargetAssignerTest(test_case.TestCase):
     matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
                                            unmatched_threshold=0.5)
     box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
-    unmatched_cls_target = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
+    unmatched_class_label = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
     target_assigner = targetassigner.TargetAssigner(
-        similarity_calc, matcher, box_coder,
-        unmatched_cls_target=unmatched_cls_target)
+        similarity_calc, matcher, box_coder)
     anchors_boxlist = box_list.BoxList(anchor_means)
     groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
-    result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist,
-                                    groundtruth_labels,
-                                    groundtruth_weights)
+    result = target_assigner.assign(
+        anchors_boxlist,
+        groundtruth_boxlist,
+        groundtruth_labels,
+        unmatched_class_label=unmatched_class_label,
+        groundtruth_weights=groundtruth_weights)
     (_, cls_weights, _, reg_weights, _) = result
     return (cls_weights, reg_weights)
@@ -318,15 +326,17 @@ class TargetAssignerTest(test_case.TestCase):
                                            unmatched_threshold=0.5)
     box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
-    unmatched_cls_target = tf.constant([[0, 0], [0, 0]], tf.float32)
+    unmatched_class_label = tf.constant([[0, 0], [0, 0]], tf.float32)
     target_assigner = targetassigner.TargetAssigner(
-        similarity_calc, matcher, box_coder,
-        unmatched_cls_target=unmatched_cls_target)
+        similarity_calc, matcher, box_coder)
     anchors_boxlist = box_list.BoxList(anchor_means)
     groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
-    result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist,
-                                    groundtruth_labels)
+    result = target_assigner.assign(
+        anchors_boxlist,
+        groundtruth_boxlist,
+        groundtruth_labels,
+        unmatched_class_label=unmatched_class_label)
     (cls_targets, cls_weights, reg_targets, reg_weights, _) = result
     return (cls_targets, cls_weights, reg_targets, reg_weights)
@@ -371,14 +381,16 @@ class TargetAssignerTest(test_case.TestCase):
     matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
                                            unmatched_threshold=0.5)
     box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
-    unmatched_cls_target = tf.constant([0, 0, 0], tf.float32)
+    unmatched_class_label = tf.constant([0, 0, 0], tf.float32)
     anchors_boxlist = box_list.BoxList(anchor_means)
     groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
     target_assigner = targetassigner.TargetAssigner(
-        similarity_calc, matcher, box_coder,
-        unmatched_cls_target=unmatched_cls_target)
-    result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist,
-                                    groundtruth_labels)
+        similarity_calc, matcher, box_coder)
+    result = target_assigner.assign(
+        anchors_boxlist,
+        groundtruth_boxlist,
+        groundtruth_labels,
+        unmatched_class_label=unmatched_class_label)
     (cls_targets, cls_weights, reg_targets, reg_weights, _) = result
     return (cls_targets, cls_weights, reg_targets, reg_weights)
@@ -415,10 +427,9 @@ class TargetAssignerTest(test_case.TestCase):
     similarity_calc = region_similarity_calculator.NegSqDistSimilarity()
     matcher = bipartite_matcher.GreedyBipartiteMatcher()
     box_coder = mean_stddev_box_coder.MeanStddevBoxCoder()
-    unmatched_cls_target = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
+    unmatched_class_label = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
     target_assigner = targetassigner.TargetAssigner(
-        similarity_calc, matcher, box_coder,
-        unmatched_cls_target=unmatched_cls_target)
+        similarity_calc, matcher, box_coder)
     prior_means = tf.constant([[0.0, 0.0, 0.5, 0.5],
                                [0.5, 0.5, 1.0, 0.8],
@@ -436,17 +447,20 @@ class TargetAssignerTest(test_case.TestCase):
                                       [0, 0, 0, 0, 0, 1, 0],
                                       [0, 0, 0, 1, 0, 0, 0]], tf.float32)
     with self.assertRaisesRegexp(ValueError, 'Unequal shapes'):
-      target_assigner.assign(priors, boxes, groundtruth_labels,
-                             num_valid_rows=3)
+      target_assigner.assign(
+          priors,
+          boxes,
+          groundtruth_labels,
+          unmatched_class_label=unmatched_class_label,
+          num_valid_rows=3)
   def test_raises_error_on_invalid_groundtruth_labels(self):
     similarity_calc = region_similarity_calculator.NegSqDistSimilarity()
     matcher = bipartite_matcher.GreedyBipartiteMatcher()
     box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=1.0)
-    unmatched_cls_target = tf.constant([[0, 0], [0, 0], [0, 0]], tf.float32)
+    unmatched_class_label = tf.constant([[0, 0], [0, 0], [0, 0]], tf.float32)
     target_assigner = targetassigner.TargetAssigner(
-        similarity_calc, matcher, box_coder,
-        unmatched_cls_target=unmatched_cls_target)
+        similarity_calc, matcher, box_coder)
     prior_means = tf.constant([[0.0, 0.0, 0.5, 0.5]])
     priors = box_list.BoxList(prior_means)
@@ -458,41 +472,22 @@ class TargetAssignerTest(test_case.TestCase):
     groundtruth_labels = tf.constant([[[0, 1], [1, 0]]], tf.float32)
     with self.assertRaises(ValueError):
-      target_assigner.assign(priors, boxes, groundtruth_labels,
-                             num_valid_rows=3)
+      target_assigner.assign(
+          priors,
+          boxes,
+          groundtruth_labels,
+          unmatched_class_label=unmatched_class_label,
+          num_valid_rows=3)
 class BatchTargetAssignerTest(test_case.TestCase):

-  def _get_agnostic_target_assigner(self):
+  def _get_target_assigner(self):
     similarity_calc = region_similarity_calculator.IouSimilarity()
     matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
                                            unmatched_threshold=0.5)
     box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
-    return targetassigner.TargetAssigner(
-        similarity_calc, matcher, box_coder,
-        unmatched_cls_target=None)
-
-  def _get_multi_class_target_assigner(self, num_classes):
-    similarity_calc = region_similarity_calculator.IouSimilarity()
-    matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
-                                           unmatched_threshold=0.5)
-    box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
-    unmatched_cls_target = tf.constant([1] + num_classes * [0], tf.float32)
-    return targetassigner.TargetAssigner(
-        similarity_calc, matcher, box_coder,
-        unmatched_cls_target=unmatched_cls_target)
-
-  def _get_multi_dimensional_target_assigner(self, target_dimensions):
-    similarity_calc = region_similarity_calculator.IouSimilarity()
-    matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
-                                           unmatched_threshold=0.5)
-    box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
-    unmatched_cls_target = tf.constant(np.zeros(target_dimensions),
-                                       tf.float32)
-    return targetassigner.TargetAssigner(
-        similarity_calc, matcher, box_coder,
-        unmatched_cls_target=unmatched_cls_target)
+    return targetassigner.TargetAssigner(similarity_calc, matcher, box_coder)

   def test_batch_assign_targets(self):
@@ -502,7 +497,7 @@ class BatchTargetAssignerTest(test_case.TestCase):
       gt_box_batch = [box_list1, box_list2]
       gt_class_targets = [None, None]
       anchors_boxlist = box_list.BoxList(anchor_means)
-      agnostic_target_assigner = self._get_agnostic_target_assigner()
+      agnostic_target_assigner = self._get_target_assigner()
       (cls_targets, cls_weights, reg_targets, reg_weights,
        _) = targetassigner.batch_assign_targets(
            agnostic_target_assigner, anchors_boxlist, gt_box_batch,
@@ -550,12 +545,13 @@ class BatchTargetAssignerTest(test_case.TestCase):
       gt_box_batch = [box_list1, box_list2]
       gt_class_targets = [class_targets1, class_targets2]
       anchors_boxlist = box_list.BoxList(anchor_means)
-      multiclass_target_assigner = self._get_multi_class_target_assigner(
-          num_classes=3)
+      multiclass_target_assigner = self._get_target_assigner()
+      num_classes = 3
+      unmatched_class_label = tf.constant([1] + num_classes * [0], tf.float32)
       (cls_targets, cls_weights, reg_targets, reg_weights,
        _) = targetassigner.batch_assign_targets(
            multiclass_target_assigner, anchors_boxlist, gt_box_batch,
-           gt_class_targets)
+           gt_class_targets, unmatched_class_label)
       return (cls_targets, cls_weights, reg_targets, reg_weights)

     groundtruth_boxlist1 = np.array([[0., 0., 0.2, 0.2]], dtype=np.float32)
@@ -613,12 +609,13 @@ class BatchTargetAssignerTest(test_case.TestCase):
       gt_class_targets = [class_targets1, class_targets2]
       gt_weights = [groundtruth_weights1, groundtruth_weights2]
       anchors_boxlist = box_list.BoxList(anchor_means)
-      multiclass_target_assigner = self._get_multi_class_target_assigner(
-          num_classes=3)
+      multiclass_target_assigner = self._get_target_assigner()
+      num_classes = 3
+      unmatched_class_label = tf.constant([1] + num_classes * [0], tf.float32)
       (cls_targets, cls_weights, reg_targets, reg_weights,
        _) = targetassigner.batch_assign_targets(
            multiclass_target_assigner, anchors_boxlist, gt_box_batch,
-           gt_class_targets, gt_weights)
+           gt_class_targets, unmatched_class_label, gt_weights)
       return (cls_targets, cls_weights, reg_targets, reg_weights)

     groundtruth_boxlist1 = np.array([[0., 0., 0.2, 0.2],
@@ -680,12 +677,14 @@ class BatchTargetAssignerTest(test_case.TestCase):
       gt_box_batch = [box_list1, box_list2]
       gt_class_targets = [class_targets1, class_targets2]
       anchors_boxlist = box_list.BoxList(anchor_means)
-      multiclass_target_assigner = self._get_multi_dimensional_target_assigner(
-          target_dimensions=(2, 3))
+      multiclass_target_assigner = self._get_target_assigner()
+      target_dimensions = (2, 3)
+      unmatched_class_label = tf.constant(np.zeros(target_dimensions),
+                                          tf.float32)
       (cls_targets, cls_weights, reg_targets, reg_weights,
        _) = targetassigner.batch_assign_targets(
            multiclass_target_assigner, anchors_boxlist, gt_box_batch,
-           gt_class_targets)
+           gt_class_targets, unmatched_class_label)
       return (cls_targets, cls_weights, reg_targets, reg_weights)

     groundtruth_boxlist1 = np.array([[0., 0., 0.2, 0.2]], dtype=np.float32)
@@ -754,13 +753,13 @@ class BatchTargetAssignerTest(test_case.TestCase):
       gt_class_targets_batch = [gt_class_targets]
       anchors_boxlist = box_list.BoxList(anchor_means)
-      multiclass_target_assigner = self._get_multi_class_target_assigner(
-          num_classes=3)
+      multiclass_target_assigner = self._get_target_assigner()
+      num_classes = 3
+      unmatched_class_label = tf.constant([1] + num_classes * [0], tf.float32)
       (cls_targets, cls_weights, reg_targets, reg_weights,
        _) = targetassigner.batch_assign_targets(
            multiclass_target_assigner, anchors_boxlist,
-           gt_box_batch, gt_class_targets_batch)
+           gt_box_batch, gt_class_targets_batch, unmatched_class_label)
       return (cls_targets, cls_weights, reg_targets, reg_weights)

     groundtruth_box_corners = np.zeros((0, 4), dtype=np.float32)
......
@@ -162,12 +162,12 @@ def main(parsed_args):
     for line in source:
       if not header:
         header = line
+        target.writelines(header)
         continue
       if labels_file:
         expanded_lines = expansion_generator.expand_labels_from_csv(line)
       else:
         expanded_lines = expansion_generator.expand_boxes_from_csv(line)
-      expanded_lines = [header] + expanded_lines
       target.writelines(expanded_lines)
......
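The hunk above fixes a header-duplication bug: the old loop prepended the CSV header to every batch of expanded rows, while the new code writes it exactly once, when it is first read. A minimal sketch of the corrected control flow (file objects replaced by lists; `expand_fn` is a hypothetical stand-in for the expansion generator):

```python
def expand_csv(source_lines, expand_fn):
    """Expand each data row, emitting the header exactly once."""
    out = []
    header = None
    for line in source_lines:
        if header is None:
            header = line
            out.append(header)  # write the header once, up front
            continue
        out.extend(expand_fn(line))  # expanded rows only, no repeated header
    return out


rows = expand_csv(['id,label', 'x,cat', 'y,dog'],
                  expand_fn=lambda line: [line, line + ',expanded'])
```

With the old ordering, `'id,label'` would have appeared before each expanded batch instead of once at the top.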
@@ -140,10 +140,10 @@ def main(_):
     ]
   else:
     input_shape = None
-  exporter.export_inference_graph(FLAGS.input_type, pipeline_config,
-                                  FLAGS.trained_checkpoint_prefix,
-                                  FLAGS.output_directory, input_shape,
-                                  FLAGS.write_inference_graph)
+  exporter.export_inference_graph(
+      FLAGS.input_type, pipeline_config, FLAGS.trained_checkpoint_prefix,
+      FLAGS.output_directory, input_shape=input_shape,
+      write_inference_graph=FLAGS.write_inference_graph)

 if __name__ == '__main__':
......
@@ -258,6 +258,7 @@ def export_tflite_graph(pipeline_config, trained_checkpoint_prefix, output_dir,
       restore_op_name='save/restore_all',
       filename_tensor_name='save/Const:0',
       clear_devices=True,
+      output_graph='',
       initializer_nodes='')

   # Add new operation to do post processing in a custom op (TF Lite only)
......
@@ -14,17 +14,16 @@
 # ==============================================================================

 """Functions to export object detection inference graph."""
-import logging
 import os
 import tempfile
 import tensorflow as tf
 from tensorflow.core.protobuf import saver_pb2
-from tensorflow.python import pywrap_tensorflow
 from tensorflow.python.client import session
-from tensorflow.python.framework import graph_util
 from tensorflow.python.platform import gfile
 from tensorflow.python.saved_model import signature_constants
+from tensorflow.python.tools import freeze_graph
 from tensorflow.python.training import saver as saver_lib
+from object_detection.builders import graph_rewriter_builder
 from object_detection.builders import model_builder
 from object_detection.core import standard_fields as fields
 from object_detection.data_decoders import tf_example_decoder
@@ -32,70 +31,7 @@ from object_detection.utils import config_util

 slim = tf.contrib.slim
-# TODO(derekjchow): Replace with freeze_graph.freeze_graph_with_def_protos when
-# newer version of Tensorflow becomes more common.
-def freeze_graph_with_def_protos(
-    input_graph_def,
-    input_saver_def,
-    input_checkpoint,
-    output_node_names,
-    restore_op_name,
-    filename_tensor_name,
-    clear_devices,
-    initializer_nodes,
-    variable_names_blacklist=''):
-  """Converts all variables in a graph and checkpoint into constants."""
-  del restore_op_name, filename_tensor_name  # Unused by updated loading code.
-
-  # 'input_checkpoint' may be a prefix if we're using Saver V2 format
-  if not saver_lib.checkpoint_exists(input_checkpoint):
-    raise ValueError(
-        'Input checkpoint "' + input_checkpoint + '" does not exist!')
-
-  if not output_node_names:
-    raise ValueError(
-        'You must supply the name of a node to --output_node_names.')
-
-  # Remove all the explicit device specifications for this node. This helps to
-  # make the graph more portable.
-  if clear_devices:
-    for node in input_graph_def.node:
-      node.device = ''
-
-  with tf.Graph().as_default():
-    tf.import_graph_def(input_graph_def, name='')
-    config = tf.ConfigProto(graph_options=tf.GraphOptions())
-    with session.Session(config=config) as sess:
-      if input_saver_def:
-        saver = saver_lib.Saver(saver_def=input_saver_def)
-        saver.restore(sess, input_checkpoint)
-      else:
-        var_list = {}
-        reader = pywrap_tensorflow.NewCheckpointReader(input_checkpoint)
-        var_to_shape_map = reader.get_variable_to_shape_map()
-        for key in var_to_shape_map:
-          try:
-            tensor = sess.graph.get_tensor_by_name(key + ':0')
-          except KeyError:
-            # This tensor doesn't exist in the graph (for example it's
-            # 'global_step' or a similar housekeeping element) so skip it.
-            continue
-          var_list[key] = tensor
-        saver = saver_lib.Saver(var_list=var_list)
-        saver.restore(sess, input_checkpoint)
-
-      if initializer_nodes:
-        sess.run(initializer_nodes)
-
-      variable_names_blacklist = (variable_names_blacklist.split(',') if
-                                  variable_names_blacklist else None)
-      output_graph_def = graph_util.convert_variables_to_constants(
-          sess,
-          input_graph_def,
-          output_node_names.split(','),
-          variable_names_blacklist=variable_names_blacklist)
-
-  return output_graph_def
+freeze_graph_with_def_protos = freeze_graph.freeze_graph_with_def_protos
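With a sufficiently new TensorFlow, the vendored copy above becomes dead code, so the commit deletes it and keeps the old public name as an alias for the library implementation, preserving existing imports of `exporter.freeze_graph_with_def_protos`. The re-export pattern, sketched with a stand-in module (the namespace and lambda are hypothetical, not the real TF API):

```python
import types

# Stand-in for tensorflow.python.tools.freeze_graph, which now ships the
# canonical freeze implementation.
freeze_graph = types.SimpleNamespace(
    freeze_graph_with_def_protos=lambda graph_def, ckpt: ('frozen', ckpt))

# Re-export under the module-level name downstream callers already use,
# so `from exporter import freeze_graph_with_def_protos` keeps working.
freeze_graph_with_def_protos = freeze_graph.freeze_graph_with_def_protos

result = freeze_graph_with_def_protos('graph', 'model.ckpt')
```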
 def replace_variable_values_with_moving_averages(graph,
@@ -247,18 +183,6 @@ def _add_output_tensor_nodes(postprocessed_tensors,
   return outputs

-def write_frozen_graph(frozen_graph_path, frozen_graph_def):
-  """Writes frozen graph to disk.
-
-  Args:
-    frozen_graph_path: Path to write inference graph.
-    frozen_graph_def: tf.GraphDef holding frozen graph.
-  """
-  with gfile.GFile(frozen_graph_path, 'wb') as f:
-    f.write(frozen_graph_def.SerializeToString())
-  logging.info('%d ops in the final graph.', len(frozen_graph_def.node))
 def write_saved_model(saved_model_path,
                       frozen_graph_def,
                       inputs,
@@ -384,6 +308,7 @@ def _export_inference_graph(input_type,
       output_collection_name=output_collection_name,
       graph_hook_fn=graph_hook_fn)

+  profile_inference_graph(tf.get_default_graph())
   saver_kwargs = {}
   if use_moving_averages:
     # This check is to be compatible with both version of SaverDef.
@@ -421,16 +346,17 @@ def _export_inference_graph(input_type,
   else:
     output_node_names = ','.join(outputs.keys())

-  frozen_graph_def = freeze_graph_with_def_protos(
+  frozen_graph_def = freeze_graph.freeze_graph_with_def_protos(
       input_graph_def=tf.get_default_graph().as_graph_def(),
       input_saver_def=input_saver_def,
       input_checkpoint=checkpoint_to_use,
       output_node_names=output_node_names,
       restore_op_name='save/restore_all',
       filename_tensor_name='save/Const:0',
+      output_graph=frozen_graph_path,
       clear_devices=True,
       initializer_nodes='')
-  write_frozen_graph(frozen_graph_path, frozen_graph_def)
   write_saved_model(saved_model_path, frozen_graph_def,
                     placeholder_tensor, outputs)
@@ -461,6 +387,11 @@ def export_inference_graph(input_type,
   """
   detection_model = model_builder.build(pipeline_config.model,
                                         is_training=False)
+  graph_rewriter_fn = None
+  if pipeline_config.HasField('graph_rewriter'):
+    graph_rewriter_config = pipeline_config.graph_rewriter
+    graph_rewriter_fn = graph_rewriter_builder.build(graph_rewriter_config,
+                                                     is_training=False)
   _export_inference_graph(
       input_type,
       detection_model,
@@ -470,7 +401,39 @@ def export_inference_graph(input_type,
       additional_output_tensor_names,
       input_shape,
       output_collection_name,
-      graph_hook_fn=None,
+      graph_hook_fn=graph_rewriter_fn,
       write_inference_graph=write_inference_graph)
   pipeline_config.eval_config.use_moving_averages = False
   config_util.save_pipeline_config(pipeline_config, output_directory)
+
+def profile_inference_graph(graph):
+  """Profiles the inference graph.
+
+  Prints model parameters and computation FLOPs given an inference graph.
+  BatchNorms are excluded from the parameter count due to the fact that
+  BatchNorms are usually folded. BatchNorm, Initializer, Regularizer
+  and BiasAdd are not considered in FLOP count.
+
+  Args:
+    graph: the inference graph.
+  """
+  tfprof_vars_option = (
+      tf.contrib.tfprof.model_analyzer.TRAINABLE_VARS_PARAMS_STAT_OPTIONS)
+  tfprof_flops_option = tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS
+
+  # Batchnorm is usually folded during inference.
+  tfprof_vars_option['trim_name_regexes'] = ['.*BatchNorm.*']
+  # Initializer and Regularizer are only used in training.
+  tfprof_flops_option['trim_name_regexes'] = [
+      '.*BatchNorm.*', '.*Initializer.*', '.*Regularizer.*', '.*BiasAdd.*'
+  ]
+
+  tf.contrib.tfprof.model_analyzer.print_model_analysis(
+      graph, tfprof_options=tfprof_vars_option)
+  tf.contrib.tfprof.model_analyzer.print_model_analysis(
+      graph, tfprof_options=tfprof_flops_option)
@@ -20,8 +20,10 @@ import six
 import tensorflow as tf
 from google.protobuf import text_format
 from object_detection import exporter
+from object_detection.builders import graph_rewriter_builder
 from object_detection.builders import model_builder
 from object_detection.core import model
+from object_detection.protos import graph_rewriter_pb2
 from object_detection.protos import pipeline_pb2

 if six.PY2:
@@ -75,8 +77,10 @@ class FakeModel(model.DetectionModel):

 class ExportInferenceGraphTest(tf.test.TestCase):

-  def _save_checkpoint_from_mock_model(self, checkpoint_path,
-                                       use_moving_averages):
+  def _save_checkpoint_from_mock_model(self,
+                                       checkpoint_path,
+                                       use_moving_averages,
+                                       enable_quantization=False):
     g = tf.Graph()
     with g.as_default():
       mock_model = FakeModel()
@@ -86,20 +90,28 @@ class ExportInferenceGraphTest(tf.test.TestCase):
       mock_model.postprocess(predictions, true_image_shapes)
       if use_moving_averages:
         tf.train.ExponentialMovingAverage(0.0).apply()
-      slim.get_or_create_global_step()
+      tf.train.get_or_create_global_step()
+      if enable_quantization:
+        graph_rewriter_config = graph_rewriter_pb2.GraphRewriter()
+        graph_rewriter_config.quantization.delay = 500000
+        graph_rewriter_fn = graph_rewriter_builder.build(
+            graph_rewriter_config, is_training=False)
+        graph_rewriter_fn()
       saver = tf.train.Saver()
       init = tf.global_variables_initializer()
       with self.test_session() as sess:
         sess.run(init)
         saver.save(sess, checkpoint_path)

-  def _load_inference_graph(self, inference_graph_path):
+  def _load_inference_graph(self, inference_graph_path, is_binary=True):
     od_graph = tf.Graph()
     with od_graph.as_default():
       od_graph_def = tf.GraphDef()
       with tf.gfile.GFile(inference_graph_path) as fid:
-        serialized_graph = fid.read()
-        od_graph_def.ParseFromString(serialized_graph)
+        if is_binary:
+          od_graph_def.ParseFromString(fid.read())
+        else:
+          text_format.Parse(fid.read(), od_graph_def)
       tf.import_graph_def(od_graph_def, name='')
     return od_graph
@@ -284,6 +296,42 @@ class ExportInferenceGraphTest(tf.test.TestCase):
         [var_name for var_name, _ in tf.train.list_variables(output_directory)])
     self.assertTrue(expected_variables.issubset(actual_variables))

+  def test_export_model_with_quantization_nodes(self):
+    tmp_dir = self.get_temp_dir()
+    trained_checkpoint_prefix = os.path.join(tmp_dir, 'model.ckpt')
+    self._save_checkpoint_from_mock_model(
+        trained_checkpoint_prefix,
+        use_moving_averages=False,
+        enable_quantization=True)
+    output_directory = os.path.join(tmp_dir, 'output')
+    inference_graph_path = os.path.join(output_directory,
+                                        'inference_graph.pbtxt')
+    with mock.patch.object(
+        model_builder, 'build', autospec=True) as mock_builder:
+      mock_builder.return_value = FakeModel()
+      pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
+      text_format.Merge(
+          """graph_rewriter {
+               quantization {
+                 delay: 50000
+                 activation_bits: 8
+                 weight_bits: 8
+               }
+             }""", pipeline_config)
+      exporter.export_inference_graph(
+          input_type='image_tensor',
+          pipeline_config=pipeline_config,
+          trained_checkpoint_prefix=trained_checkpoint_prefix,
+          output_directory=output_directory,
+          write_inference_graph=True)
+    self._load_inference_graph(inference_graph_path, is_binary=False)
+    has_quant_nodes = False
+    for v in tf.global_variables():
+      if v.op.name.endswith('act_quant/min'):
+        has_quant_nodes = True
+        break
+    self.assertTrue(has_quant_nodes)
+
   def test_export_model_with_all_output_nodes(self):
     tmp_dir = self.get_temp_dir()
     trained_checkpoint_prefix = os.path.join(tmp_dir, 'model.ckpt')
@@ -564,16 +612,16 @@ class ExportInferenceGraphTest(tf.test.TestCase):
       output_node_names = ','.join(outputs.keys())
       saver = tf.train.Saver()
       input_saver_def = saver.as_saver_def()
-      frozen_graph_def = exporter.freeze_graph_with_def_protos(
+      exporter.freeze_graph_with_def_protos(
           input_graph_def=tf.get_default_graph().as_graph_def(),
           input_saver_def=input_saver_def,
          input_checkpoint=trained_checkpoint_prefix,
           output_node_names=output_node_names,
           restore_op_name='save/restore_all',
           filename_tensor_name='save/Const:0',
+          output_graph=inference_graph_path,
           clear_devices=True,
           initializer_nodes='')
-      exporter.write_frozen_graph(inference_graph_path, frozen_graph_def)
     inference_graph = self._load_inference_graph(inference_graph_path)
     tf_example_np = np.expand_dims(self._create_tf_example(
@@ -719,6 +767,7 @@ class ExportInferenceGraphTest(tf.test.TestCase):
           output_node_names=output_node_names,
           restore_op_name='save/restore_all',
           filename_tensor_name='save/Const:0',
+          output_graph='',
           clear_devices=True,
           initializer_nodes='')
     exporter.write_saved_model(
......
@@ -48,6 +48,7 @@ pip install --user jupyter
 pip install --user matplotlib
 ```

+<!-- common_typos_disable -->
 **Note**: sometimes "sudo apt-get install protobuf-compiler" will install
 Protobuf 3+ versions for you and some users have issues when using 3.5.
 If that is your case, try the [manual](#Manual-protobuf-compiler-installation-and-usage) installation.
@@ -88,6 +89,7 @@ protoc object_detection/protos/*.proto --python_out=.

 ## Manual protobuf-compiler installation and usage

 Download and install the 3.0 release of protoc, then unzip the file.
 ```bash
 # From tensorflow/models/research/
 wget -O protobuf.zip https://github.com/google/protobuf/releases/download/v3.0.0/protoc-3.0.0-linux-x86_64.zip
......
@@ -219,20 +219,8 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
  padded_tensor_dict = {}
  for tensor_name in tensor_dict:
-    expected_shape = padding_shapes[tensor_name]
-    current_shape = shape_utils.combined_static_and_dynamic_shape(
-        tensor_dict[tensor_name])
-    trailing_paddings = [
-        expected_shape_dim - current_shape_dim if expected_shape_dim else 0
-        for expected_shape_dim, current_shape_dim in zip(
-            expected_shape, current_shape)
-    ]
-    paddings = tf.stack([tf.zeros(len(trailing_paddings), dtype=tf.int32),
-                         trailing_paddings],
-                        axis=1)
-    padded_tensor_dict[tensor_name] = tf.pad(
-        tensor_dict[tensor_name], paddings=paddings)
-    padded_tensor_dict[tensor_name].set_shape(expected_shape)
+    padded_tensor_dict[tensor_name] = shape_utils.pad_or_clip_nd(
+        tensor_dict[tensor_name], padding_shapes[tensor_name])
  return padded_tensor_dict
@@ -529,7 +517,7 @@ def create_predict_input_fn(model_config, predict_input_config):
      `ServingInputReceiver`.
  """
  del params
-  example = tf.placeholder(dtype=tf.string, shape=[], name='input_feature')
+  example = tf.placeholder(dtype=tf.string, shape=[], name='tf_example')
  num_classes = config_util.get_number_of_classes(model_config)
  model = model_builder.build(model_config, is_training=False)
...
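The `shape_utils.pad_or_clip_nd` call that replaces the pad-only logic above first clips any dimension that exceeds its target static size, then zero-pads whatever deficit remains. A minimal numpy sketch of that behavior (the function name mirrors the diff; this implementation is illustrative, not the library's):

```python
import numpy as np

def pad_or_clip_nd(tensor, output_shape):
    # Clip any dimension that exceeds its target size...
    clipped = tensor[tuple(slice(0, dim) for dim in output_shape)]
    # ...then zero-pad the remaining deficit so the result always has
    # exactly `output_shape`.
    paddings = [(0, target - actual)
                for target, actual in zip(output_shape, clipped.shape)]
    return np.pad(clipped, paddings, mode='constant')

# Mirrors test_clip_boxes_and_classes below: 5 boxes are clipped down to 3,
# while 2 class rows are padded up to 3.
print(pad_or_clip_nd(np.random.rand(5, 4), [3, 4]).shape)  # (3, 4)
print(pad_or_clip_nd(np.random.rand(2, 3), [3, 3]).shape)  # (3, 3)
```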
@@ -657,6 +657,42 @@ class PadInputDataToStaticShapesFnTest(tf.test.TestCase):
        padded_tensor_dict[fields.InputDataFields.groundtruth_classes]
        .shape.as_list(), [3, 3])

+  def test_clip_boxes_and_classes(self):
+    input_tensor_dict = {
+        fields.InputDataFields.groundtruth_boxes:
+            tf.placeholder(tf.float32, [None, 4]),
+        fields.InputDataFields.groundtruth_classes:
+            tf.placeholder(tf.int32, [None, 3]),
+    }
+    padded_tensor_dict = inputs.pad_input_data_to_static_shapes(
+        tensor_dict=input_tensor_dict,
+        max_num_boxes=3,
+        num_classes=3,
+        spatial_image_shape=[5, 6])
+    self.assertAllEqual(
+        padded_tensor_dict[fields.InputDataFields.groundtruth_boxes]
+        .shape.as_list(), [3, 4])
+    self.assertAllEqual(
+        padded_tensor_dict[fields.InputDataFields.groundtruth_classes]
+        .shape.as_list(), [3, 3])
+    with self.test_session() as sess:
+      out_tensor_dict = sess.run(
+          padded_tensor_dict,
+          feed_dict={
+              input_tensor_dict[fields.InputDataFields.groundtruth_boxes]:
+                  np.random.rand(5, 4),
+              input_tensor_dict[fields.InputDataFields.groundtruth_classes]:
+                  np.random.rand(2, 3),
+          })
+    self.assertAllEqual(
+        out_tensor_dict[fields.InputDataFields.groundtruth_boxes].shape,
+        [3, 4])
+    self.assertAllEqual(
+        out_tensor_dict[fields.InputDataFields.groundtruth_classes].shape,
+        [3, 3])

  def test_do_not_pad_dynamic_images(self):
    input_tensor_dict = {
        fields.InputDataFields.image:
...
@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Faster R-CNN meta-architecture definition.
General tensorflow implementation of Faster R-CNN detection models.
@@ -98,7 +97,6 @@ from functools import partial
import tensorflow as tf
from object_detection.anchor_generators import grid_anchor_generator
-from object_detection.core import balanced_positive_negative_sampler as sampler
from object_detection.core import box_list
from object_detection.core import box_list_ops
from object_detection.core import box_predictor
@@ -107,6 +105,7 @@ from object_detection.core import model
from object_detection.core import post_processing
from object_detection.core import standard_fields as fields
from object_detection.core import target_assigner
+from object_detection.predictors import convolutional_box_predictor
from object_detection.utils import ops
from object_detection.utils import shape_utils
@@ -228,12 +227,13 @@ class FasterRCNNMetaArch(model.DetectionModel):
               feature_extractor,
               number_of_stages,
               first_stage_anchor_generator,
+              first_stage_target_assigner,
               first_stage_atrous_rate,
               first_stage_box_predictor_arg_scope_fn,
               first_stage_box_predictor_kernel_size,
               first_stage_box_predictor_depth,
               first_stage_minibatch_size,
-              first_stage_positive_balance_fraction,
+              first_stage_sampler,
               first_stage_nms_score_threshold,
               first_stage_nms_iou_threshold,
               first_stage_max_proposals,
@@ -242,9 +242,10 @@ class FasterRCNNMetaArch(model.DetectionModel):
               initial_crop_size,
               maxpool_kernel_size,
               maxpool_stride,
+              second_stage_target_assigner,
               second_stage_mask_rcnn_box_predictor,
               second_stage_batch_size,
-              second_stage_balance_fraction,
+              second_stage_sampler,
               second_stage_non_max_suppression_fn,
               second_stage_score_conversion_fn,
               second_stage_localization_loss_weight,
@@ -254,7 +255,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
               hard_example_miner=None,
               parallel_iterations=16,
               add_summaries=True,
-              use_matmul_crop_and_resize=False):
+              use_matmul_crop_and_resize=False,
+              clip_anchors_to_image=False):
    """FasterRCNNMetaArch Constructor.
    Args:
@@ -285,6 +287,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
      first_stage_anchor_generator: An anchor_generator.AnchorGenerator object
        (note that currently we only support
        grid_anchor_generator.GridAnchorGenerator objects)
+     first_stage_target_assigner: Target assigner to use for first stage of
+       Faster R-CNN (RPN).
      first_stage_atrous_rate: A single integer indicating the atrous rate for
        the single convolution op which is applied to the `rpn_features_to_crop`
        tensor to obtain a tensor to be used for box prediction. Some feature
@@ -304,8 +308,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
        "batch size" refers to the number of anchors selected as contributing
        to the loss function for any given image within the image batch and is
        only called "batch_size" due to terminology from the Faster R-CNN paper.
-     first_stage_positive_balance_fraction: Fraction of positive examples
-       per image for the RPN. The recommended value for Faster RCNN is 0.5.
+     first_stage_sampler: Sampler to use for first stage loss (RPN loss).
      first_stage_nms_score_threshold: Score threshold for non max suppression
        for the Region Proposal Network (RPN). This value is expected to be in
        [0, 1] as it is applied directly after a softmax transformation. The
@@ -325,6 +328,10 @@ class FasterRCNNMetaArch(model.DetectionModel):
        max pool op on the cropped feature map during ROI pooling.
      maxpool_stride: A single integer indicating the stride of the max pool
        op on the cropped feature map during ROI pooling.
+     second_stage_target_assigner: Target assigner to use for second stage of
+       Faster R-CNN. If the model is configured with multiple prediction heads,
+       this target assigner is used to generate targets for all heads (with the
+       correct `unmatched_class_label`).
      second_stage_mask_rcnn_box_predictor: Mask R-CNN box predictor to use for
        the second stage.
      second_stage_batch_size: The batch size used for computing the
@@ -332,9 +339,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
        "batch size" refers to the number of proposals selected as contributing
        to the loss function for any given image within the image batch and is
        only called "batch_size" due to terminology from the Faster R-CNN paper.
-     second_stage_balance_fraction: Fraction of positive examples to use
-       per image for the box classifier. The recommended value for Faster RCNN
-       is 0.25.
+     second_stage_sampler: Sampler to use for second stage loss (box
+       classifier loss).
      second_stage_non_max_suppression_fn: batch_multiclass_non_max_suppression
        callable that takes `boxes`, `scores`, optional `clip_window` and
        optional (kwarg) `mask` inputs (with all other inputs already set)
@@ -364,6 +370,9 @@ class FasterRCNNMetaArch(model.DetectionModel):
      use_matmul_crop_and_resize: Force the use of matrix multiplication based
        crop and resize instead of standard tf.image.crop_and_resize while
        computing second stage input feature maps.
+     clip_anchors_to_image: Normally, anchors generated for a given image size
+       are pruned during training if they lie outside the image window. This
+       option clips the anchors to be within the image instead of pruning.
    Raises:
      ValueError: If `second_stage_batch_size` > `first_stage_max_proposals` at
@@ -388,13 +397,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
    self._feature_extractor = feature_extractor
    self._number_of_stages = number_of_stages
-    # The first class is reserved as background.
-    unmatched_cls_target = tf.constant(
-        [1] + self._num_classes * [0], dtype=tf.float32)
-    self._proposal_target_assigner = target_assigner.create_target_assigner(
-        'FasterRCNN', 'proposal')
-    self._detector_target_assigner = target_assigner.create_target_assigner(
-        'FasterRCNN', 'detection', unmatched_cls_target=unmatched_cls_target)
+    self._proposal_target_assigner = first_stage_target_assigner
+    self._detector_target_assigner = second_stage_target_assigner
    # Both proposal and detector target assigners use the same box coder
    self._box_coder = self._proposal_target_assigner.box_coder
@@ -407,14 +411,19 @@ class FasterRCNNMetaArch(model.DetectionModel):
        first_stage_box_predictor_kernel_size)
    self._first_stage_box_predictor_depth = first_stage_box_predictor_depth
    self._first_stage_minibatch_size = first_stage_minibatch_size
-    self._first_stage_sampler = sampler.BalancedPositiveNegativeSampler(
-        positive_fraction=first_stage_positive_balance_fraction)
-    self._first_stage_box_predictor = box_predictor.ConvolutionalBoxPredictor(
-        self._is_training, num_classes=1,
-        conv_hyperparams_fn=self._first_stage_box_predictor_arg_scope_fn,
-        min_depth=0, max_depth=0, num_layers_before_predictor=0,
-        use_dropout=False, dropout_keep_prob=1.0, kernel_size=1,
-        box_code_size=self._box_coder.code_size)
+    self._first_stage_sampler = first_stage_sampler
+    self._first_stage_box_predictor = (
+        convolutional_box_predictor.ConvolutionalBoxPredictor(
+            self._is_training,
+            num_classes=1,
+            conv_hyperparams_fn=self._first_stage_box_predictor_arg_scope_fn,
+            min_depth=0,
+            max_depth=0,
+            num_layers_before_predictor=0,
+            use_dropout=False,
+            dropout_keep_prob=1.0,
+            kernel_size=1,
+            box_code_size=self._box_coder.code_size))
    self._first_stage_nms_score_threshold = first_stage_nms_score_threshold
    self._first_stage_nms_iou_threshold = first_stage_nms_iou_threshold
@@ -435,8 +444,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
    self._mask_rcnn_box_predictor = second_stage_mask_rcnn_box_predictor
    self._second_stage_batch_size = second_stage_batch_size
-    self._second_stage_sampler = sampler.BalancedPositiveNegativeSampler(
-        positive_fraction=second_stage_balance_fraction)
+    self._second_stage_sampler = second_stage_sampler
    self._second_stage_nms_fn = second_stage_non_max_suppression_fn
    self._second_stage_score_conversion_fn = second_stage_score_conversion_fn
@@ -454,6 +462,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
    self._hard_example_miner = hard_example_miner
    self._parallel_iterations = parallel_iterations
+    self.clip_anchors_to_image = clip_anchors_to_image
    if self._number_of_stages <= 0 or self._number_of_stages > 3:
      raise ValueError('Number of stages should be a value in {1, 2, 3}.')
@@ -639,10 +649,14 @@ class FasterRCNNMetaArch(model.DetectionModel):
    # the image window at training time and clipping at inference time.
    clip_window = tf.to_float(tf.stack([0, 0, image_shape[1], image_shape[2]]))
    if self._is_training:
-      (rpn_box_encodings, rpn_objectness_predictions_with_background,
-       anchors_boxlist) = self._remove_invalid_anchors_and_predictions(
-           rpn_box_encodings, rpn_objectness_predictions_with_background,
-           anchors_boxlist, clip_window)
+      if self.clip_anchors_to_image:
+        anchors_boxlist = box_list_ops.clip_to_window(
+            anchors_boxlist, clip_window, filter_nonoverlapping=False)
+      else:
+        (rpn_box_encodings, rpn_objectness_predictions_with_background,
+         anchors_boxlist) = self._remove_invalid_anchors_and_predictions(
+             rpn_box_encodings, rpn_objectness_predictions_with_background,
+             anchors_boxlist, clip_window)
    else:
      anchors_boxlist = box_list_ops.clip_to_window(
          anchors_boxlist, clip_window)
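The `clip_anchors_to_image` branch above chooses between clipping anchors to the image window and pruning anchors that fall outside it. A hedged numpy illustration of the difference, with boxes as `[ymin, xmin, ymax, xmax]` (these helpers are simplified stand-ins for `box_list_ops.clip_to_window` and the pruning path, not the library code):

```python
import numpy as np

def clip_to_window(boxes, window):
    # Clamp every coordinate into the window; no anchors are dropped, so
    # downstream predictions stay aligned with the full anchor set.
    ymin, xmin, ymax, xmax = window
    lo = np.array([ymin, xmin, ymin, xmin])
    hi = np.array([ymax, xmax, ymax, xmax])
    return np.clip(boxes, lo, hi)

def prune_outside_window(boxes, window):
    # Drop anchors not fully inside the window; the caller must then also
    # filter the corresponding predictions, which is what
    # _remove_invalid_anchors_and_predictions does in the diff above.
    ymin, xmin, ymax, xmax = window
    keep = ((boxes[:, 0] >= ymin) & (boxes[:, 1] >= xmin) &
            (boxes[:, 2] <= ymax) & (boxes[:, 3] <= xmax))
    return boxes[keep]

anchors = np.array([[-5., -5., 10., 10.], [2., 2., 12., 12.]])
window = (0., 0., 15., 15.)
print(clip_to_window(anchors, window))        # both anchors kept, clamped
print(prune_outside_window(anchors, window))  # first anchor dropped
```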
@@ -761,11 +775,16 @@ class FasterRCNNMetaArch(model.DetectionModel):
            flattened_proposal_feature_maps,
            scope=self.second_stage_feature_extractor_scope))
-    box_predictions = self._mask_rcnn_box_predictor.predict(
-        [box_classifier_features],
-        num_predictions_per_location=[1],
-        scope=self.second_stage_box_predictor_scope,
-        predict_boxes_and_classes=True)
+    if self._mask_rcnn_box_predictor.is_keras_model:
+      box_predictions = self._mask_rcnn_box_predictor(
+          [box_classifier_features],
+          prediction_stage=2)
+    else:
+      box_predictions = self._mask_rcnn_box_predictor.predict(
+          [box_classifier_features],
+          num_predictions_per_location=[1],
+          scope=self.second_stage_box_predictor_scope,
+          prediction_stage=2)
    refined_box_encodings = tf.squeeze(
        box_predictions[box_predictor.BOX_ENCODINGS],
@@ -834,12 +853,16 @@ class FasterRCNNMetaArch(model.DetectionModel):
    if self._is_training:
      curr_box_classifier_features = prediction_dict['box_classifier_features']
      detection_classes = prediction_dict['class_predictions_with_background']
-      mask_predictions = self._mask_rcnn_box_predictor.predict(
-          [curr_box_classifier_features],
-          num_predictions_per_location=[1],
-          scope=self.second_stage_box_predictor_scope,
-          predict_boxes_and_classes=False,
-          predict_auxiliary_outputs=True)
+      if self._mask_rcnn_box_predictor.is_keras_model:
+        mask_predictions = self._mask_rcnn_box_predictor(
+            [curr_box_classifier_features],
+            prediction_stage=3)
+      else:
+        mask_predictions = self._mask_rcnn_box_predictor.predict(
+            [curr_box_classifier_features],
+            num_predictions_per_location=[1],
+            scope=self.second_stage_box_predictor_scope,
+            prediction_stage=3)
      prediction_dict['mask_predictions'] = tf.squeeze(mask_predictions[
          box_predictor.MASK_PREDICTIONS], axis=1)
    else:
@@ -865,12 +888,16 @@ class FasterRCNNMetaArch(model.DetectionModel):
              flattened_detected_feature_maps,
              scope=self.second_stage_feature_extractor_scope))
-      mask_predictions = self._mask_rcnn_box_predictor.predict(
-          [curr_box_classifier_features],
-          num_predictions_per_location=[1],
-          scope=self.second_stage_box_predictor_scope,
-          predict_boxes_and_classes=False,
-          predict_auxiliary_outputs=True)
+      if self._mask_rcnn_box_predictor.is_keras_model:
+        mask_predictions = self._mask_rcnn_box_predictor(
+            [curr_box_classifier_features],
+            prediction_stage=3)
+      else:
+        mask_predictions = self._mask_rcnn_box_predictor.predict(
+            [curr_box_classifier_features],
+            num_predictions_per_location=[1],
+            scope=self.second_stage_box_predictor_scope,
+            prediction_stage=3)
      detection_masks = tf.squeeze(mask_predictions[
          box_predictor.MASK_PREDICTIONS], axis=1)
@@ -976,10 +1003,14 @@ class FasterRCNNMetaArch(model.DetectionModel):
    if len(num_anchors_per_location) != 1:
      raise RuntimeError('anchor_generator is expected to generate anchors '
                         'corresponding to a single feature map.')
-    box_predictions = self._first_stage_box_predictor.predict(
-        [rpn_box_predictor_features],
-        num_anchors_per_location,
-        scope=self.first_stage_box_predictor_scope)
+    if self._first_stage_box_predictor.is_keras_model:
+      box_predictions = self._first_stage_box_predictor(
+          [rpn_box_predictor_features])
+    else:
+      box_predictions = self._first_stage_box_predictor.predict(
+          [rpn_box_predictor_features],
+          num_anchors_per_location,
+          scope=self.first_stage_box_predictor_scope)
    box_encodings = tf.concat(
        box_predictions[box_predictor.BOX_ENCODINGS], axis=1)
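The repeated `is_keras_model` checks in the hunks above all follow one pattern: Keras-based predictors are invoked as plain callables, while Slim-based ones go through `.predict()` with an explicit variable scope. A toy sketch of that dispatch (both stand-in classes and the doubling "prediction" are invented for illustration; only the branching mirrors the diff):

```python
class SlimStylePredictor(object):
    """Stand-in for a Slim box predictor: scoped .predict() API."""
    is_keras_model = False

    def predict(self, image_features, scope=None):
        # Placeholder computation standing in for real box prediction.
        return {'box_encodings': [2 * f for f in image_features]}


class KerasStylePredictor(object):
    """Stand-in for a KerasBoxPredictor: used as a plain callable."""
    is_keras_model = True

    def __call__(self, image_features):
        return {'box_encodings': [2 * f for f in image_features]}


def run_box_predictor(predictor, image_features):
    # Mirrors the meta-architecture: branch on is_keras_model so both
    # predictor families can be plugged in interchangeably.
    if predictor.is_keras_model:
        return predictor(image_features)
    return predictor.predict(image_features, scope='FirstStageBoxPredictor')


print(run_box_predictor(SlimStylePredictor(), [1, 2]))   # {'box_encodings': [2, 4]}
print(run_box_predictor(KerasStylePredictor(), [1, 2]))  # {'box_encodings': [2, 4]}
```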
@@ -1393,8 +1424,11 @@ class FasterRCNNMetaArch(model.DetectionModel):
      a BoxList contained sampled proposals.
    """
    (cls_targets, cls_weights, _, _, _) = self._detector_target_assigner.assign(
-        proposal_boxlist, groundtruth_boxlist,
-        groundtruth_classes_with_background)
+        proposal_boxlist,
+        groundtruth_boxlist,
+        groundtruth_classes_with_background,
+        unmatched_class_label=tf.constant(
+            [1] + self._num_classes * [0], dtype=tf.float32))
    # Selects all boxes as candidates if none of them is selected according
    # to cls_weights. This could happen as boxes within certain IOU ranges
    # are ignored. If triggered, the selected boxes will still be ignored
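The `unmatched_class_label` now passed at assign time reserves index 0 for background: a proposal that matches no groundtruth box receives a one-hot background target. A small pure-Python sketch of how that constant is shaped, mirroring `[1] + self._num_classes * [0]` from the diff:

```python
def unmatched_class_label(num_classes):
    # Index 0 is the background class: an unmatched proposal is assigned
    # probability 1 for background and 0 for each of the real classes,
    # giving a vector of length num_classes + 1.
    return [1] + num_classes * [0]

print(unmatched_class_label(3))  # [1, 0, 0, 0]
```

Moving this vector from the constructor to the `assign` call is what lets one detector target assigner be reused with a different unmatched label for mask targets later in the diff.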
@@ -1672,7 +1706,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
         batch_reg_weights, _) = target_assigner.batch_assign_targets(
             self._proposal_target_assigner, box_list.BoxList(anchors),
             groundtruth_boxlists,
-            len(groundtruth_boxlists) * [None], groundtruth_weights_list)
+            len(groundtruth_boxlists) * [None],
+            gt_weights_batch=groundtruth_weights_list)
    batch_cls_targets = tf.squeeze(batch_cls_targets, axis=2)
    def _minibatch_subsample_fn(inputs):
@@ -1792,9 +1827,13 @@ class FasterRCNNMetaArch(model.DetectionModel):
    (batch_cls_targets_with_background, batch_cls_weights, batch_reg_targets,
     batch_reg_weights, _) = target_assigner.batch_assign_targets(
-        self._detector_target_assigner, proposal_boxlists,
-        groundtruth_boxlists, groundtruth_classes_with_background_list,
-        groundtruth_weights_list)
+        self._detector_target_assigner,
+        proposal_boxlists,
+        groundtruth_boxlists,
+        groundtruth_classes_with_background_list,
+        unmatched_class_label=tf.constant(
+            [1] + self._num_classes * [0], dtype=tf.float32),
+        gt_weights_batch=groundtruth_weights_list)
    class_predictions_with_background = tf.reshape(
        class_predictions_with_background,
@@ -1866,18 +1905,12 @@ class FasterRCNNMetaArch(model.DetectionModel):
      raise ValueError('Groundtruth instance masks not provided. '
                       'Please configure input reader.')
-    # Create a new target assigner that matches the proposals to groundtruth
-    # and returns the mask targets.
-    # TODO(rathodv): Move `unmatched_cls_target` from constructor to assign
-    # function. This will enable reuse of a single target assigner for both
-    # class targets and mask targets.
-    mask_target_assigner = target_assigner.create_target_assigner(
-        'FasterRCNN', 'detection',
-        unmatched_cls_target=tf.zeros(image_shape[1:3], dtype=tf.float32))
-    (batch_mask_targets, _, _,
-     batch_mask_target_weights, _) = target_assigner.batch_assign_targets(
-         mask_target_assigner, proposal_boxlists, groundtruth_boxlists,
-         groundtruth_masks_list, groundtruth_weights_list)
+    unmatched_mask_label = tf.zeros(image_shape[1:3], dtype=tf.float32)
+    (batch_mask_targets, _, _, batch_mask_target_weights,
+     _) = target_assigner.batch_assign_targets(
+         self._detector_target_assigner, proposal_boxlists,
+         groundtruth_boxlists, groundtruth_masks_list, unmatched_mask_label,
+         groundtruth_weights_list)
    # Pad the prediction_masks with to add zeros for background class to be
    # consistent with class predictions.
...
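The `BalancedPositiveNegativeSampler` instances now injected through the constructor subsample each image's minibatch to a fixed positive fraction (the removed docstrings recommended 0.5 for the RPN and 0.25 for the box classifier). A simplified numpy sketch of the sampling idea (illustrative only; the real sampler also offers a static-shape variant, per the PR description):

```python
import numpy as np

def balanced_sample(labels, batch_size, positive_fraction):
    """Pick up to batch_size indices with at most positive_fraction positives.

    labels: 1-D array of 0/1 anchor labels (1 = positive example).
    """
    positives = np.flatnonzero(labels)
    negatives = np.flatnonzero(labels == 0)
    # Cap positives at the requested fraction of the minibatch...
    num_pos = min(len(positives), int(positive_fraction * batch_size))
    # ...and fill the remainder of the minibatch with negatives.
    num_neg = batch_size - num_pos
    rng = np.random.default_rng(0)  # fixed seed for reproducibility here
    chosen = np.concatenate([
        rng.choice(positives, size=num_pos, replace=False),
        rng.choice(negatives, size=min(num_neg, len(negatives)),
                   replace=False),
    ])
    return np.sort(chosen)

labels = np.array([1, 0, 0, 1, 0, 0, 0, 1])
idx = balanced_sample(labels, batch_size=4, positive_fraction=0.5)
print(len(idx))  # 4
```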
@@ -21,7 +21,9 @@ from object_detection.anchor_generators import grid_anchor_generator
from object_detection.builders import box_predictor_builder
from object_detection.builders import hyperparams_builder
from object_detection.builders import post_processing_builder
+from object_detection.core import balanced_positive_negative_sampler as sampler
from object_detection.core import losses
+from object_detection.core import target_assigner
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.protos import box_predictor_pb2
from object_detection.protos import hyperparams_pb2
@@ -153,7 +155,9 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
                   predict_masks=False,
                   pad_to_max_dimension=None,
                   masks_are_class_agnostic=False,
-                  use_matmul_crop_and_resize=False):
+                  use_matmul_crop_and_resize=False,
+                  clip_anchors_to_image=False,
+                  use_matmul_gather_in_matcher=False):
    def image_resizer_fn(image, masks=None):
      """Fake image resizer function."""
@@ -186,6 +190,10 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
        first_stage_anchor_scales,
        first_stage_anchor_aspect_ratios,
        anchor_stride=first_stage_anchor_strides)
+    first_stage_target_assigner = target_assigner.create_target_assigner(
+        'FasterRCNN',
+        'proposal',
+        use_matmul_gather=use_matmul_gather_in_matcher)
    fake_feature_extractor = FakeFasterRCNNFeatureExtractor()
@@ -211,7 +219,8 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
    first_stage_atrous_rate = 1
    first_stage_box_predictor_depth = 512
    first_stage_minibatch_size = 3
-    first_stage_positive_balance_fraction = .5
+    first_stage_sampler = sampler.BalancedPositiveNegativeSampler(
+        positive_fraction=0.5, is_static=False)
    first_stage_nms_score_threshold = -1.0
    first_stage_nms_iou_threshold = 1.0
@@ -230,9 +239,14 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
    """
    post_processing_config = post_processing_pb2.PostProcessing()
    text_format.Merge(post_processing_text_proto, post_processing_config)
+    second_stage_target_assigner = target_assigner.create_target_assigner(
+        'FasterRCNN', 'detection',
+        use_matmul_gather=use_matmul_gather_in_matcher)
    second_stage_non_max_suppression_fn, _ = post_processing_builder.build(
        post_processing_config)
-    second_stage_balance_fraction = 1.0
+    second_stage_sampler = sampler.BalancedPositiveNegativeSampler(
+        positive_fraction=1.0, is_static=False)
    second_stage_score_conversion_fn = tf.identity
    second_stage_localization_loss_weight = 1.0
@@ -261,6 +275,7 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
'feature_extractor': fake_feature_extractor,
'number_of_stages': number_of_stages,
'first_stage_anchor_generator': first_stage_anchor_generator,
'first_stage_target_assigner': first_stage_target_assigner,
'first_stage_atrous_rate': first_stage_atrous_rate,
'first_stage_box_predictor_arg_scope_fn':
    first_stage_box_predictor_arg_scope_fn,
@@ -268,8 +283,7 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
    first_stage_box_predictor_kernel_size,
'first_stage_box_predictor_depth': first_stage_box_predictor_depth,
'first_stage_minibatch_size': first_stage_minibatch_size,
'first_stage_sampler': first_stage_sampler,
'first_stage_nms_score_threshold': first_stage_nms_score_threshold,
'first_stage_nms_iou_threshold': first_stage_nms_iou_threshold,
'first_stage_max_proposals': first_stage_max_proposals,
@@ -277,8 +291,9 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
    first_stage_localization_loss_weight,
'first_stage_objectness_loss_weight':
    first_stage_objectness_loss_weight,
'second_stage_target_assigner': second_stage_target_assigner,
'second_stage_batch_size': second_stage_batch_size,
'second_stage_sampler': second_stage_sampler,
'second_stage_non_max_suppression_fn':
    second_stage_non_max_suppression_fn,
'second_stage_score_conversion_fn': second_stage_score_conversion_fn,
@@ -289,7 +304,8 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
'second_stage_classification_loss':
    second_stage_classification_loss,
'hard_example_miner': hard_example_miner,
'use_matmul_crop_and_resize': use_matmul_crop_and_resize,
'clip_anchors_to_image': clip_anchors_to_image
}
return self._get_model(
@@ -469,7 +485,8 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
self.assertAllEqual(tensor_dict_out[key].shape, expected_shapes[key])

def _test_predict_gives_correct_shapes_in_train_mode_both_stages(
    self, use_matmul_crop_and_resize=False,
    clip_anchors_to_image=False):
  test_graph = tf.Graph()
  with test_graph.as_default():
    model = self._build_model(
@@ -477,7 +494,8 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
        number_of_stages=2,
        second_stage_batch_size=7,
        predict_masks=False,
        use_matmul_crop_and_resize=use_matmul_crop_and_resize,
        clip_anchors_to_image=clip_anchors_to_image)
  batch_size = 2
  image_size = 10
@@ -547,6 +565,10 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
self._test_predict_gives_correct_shapes_in_train_mode_both_stages(
    use_matmul_crop_and_resize=True)
def test_predict_gives_correct_shapes_in_train_mode_clip_anchors(self):
self._test_predict_gives_correct_shapes_in_train_mode_both_stages(
clip_anchors_to_image=True)
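The `clip_anchors_to_image` option clips each generated anchor to the image window instead of discarding anchors that fall outside it, so the anchor count stays static. The difference can be sketched with plain `(ymin, xmin, ymax, xmax)` tuples in a unit window (an illustrative sketch, not the actual box_list_ops implementation):

```python
def clip_to_window(box, window=(0.0, 0.0, 1.0, 1.0)):
    """Clip a (ymin, xmin, ymax, xmax) box to the window."""
    wy0, wx0, wy1, wx1 = window
    y0, x0, y1, x1 = box
    return (max(y0, wy0), max(x0, wx0), min(y1, wy1), min(x1, wx1))

def prune_outside_window(boxes, window=(0.0, 0.0, 1.0, 1.0)):
    """Drop boxes not fully inside the window (dynamic output size)."""
    wy0, wx0, wy1, wx1 = window
    return [b for b in boxes
            if b[0] >= wy0 and b[1] >= wx0 and b[2] <= wy1 and b[3] <= wx1]

anchors = [(-0.1, 0.2, 0.5, 0.8), (0.1, 0.1, 0.9, 0.9)]
clipped = [clip_to_window(a) for a in anchors]  # static count: always 2
pruned = prune_outside_window(anchors)          # dynamic count: here 1
```

Clipping keeps the output shape independent of the image, which is what makes static-shape training (e.g. on TPU) possible.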
def _test_postprocess_first_stage_only_inference_mode(
    self, pad_to_max_dimension=None):
  model = self._build_model(
...
@@ -55,20 +55,22 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
feature_extractor,
number_of_stages,
first_stage_anchor_generator,
first_stage_target_assigner,
first_stage_atrous_rate,
first_stage_box_predictor_arg_scope_fn,
first_stage_box_predictor_kernel_size,
first_stage_box_predictor_depth,
first_stage_minibatch_size,
first_stage_sampler,
first_stage_nms_score_threshold,
first_stage_nms_iou_threshold,
first_stage_max_proposals,
first_stage_localization_loss_weight,
first_stage_objectness_loss_weight,
second_stage_target_assigner,
second_stage_rfcn_box_predictor,
second_stage_batch_size,
second_stage_sampler,
second_stage_non_max_suppression_fn,
second_stage_score_conversion_fn,
second_stage_localization_loss_weight,
@@ -77,7 +79,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
hard_example_miner,
parallel_iterations=16,
add_summaries=True,
use_matmul_crop_and_resize=False,
clip_anchors_to_image=False):
"""RFCNMetaArch Constructor.

Args:
@@ -97,6 +100,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
first_stage_anchor_generator: An anchor_generator.AnchorGenerator object
  (note that currently we only support
  grid_anchor_generator.GridAnchorGenerator objects)
first_stage_target_assigner: Target assigner to use for first stage of
R-FCN (RPN).
first_stage_atrous_rate: A single integer indicating the atrous rate for
  the single convolution op which is applied to the `rpn_features_to_crop`
  tensor to obtain a tensor to be used for box prediction. Some feature
@@ -116,8 +121,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
  "batch size" refers to the number of anchors selected as contributing
  to the loss function for any given image within the image batch and is
  only called "batch_size" due to terminology from the Faster R-CNN paper.
first_stage_sampler: The sampler for the boxes used to calculate the RPN
  loss after the first stage.
first_stage_nms_score_threshold: Score threshold for non max suppression
  for the Region Proposal Network (RPN). This value is expected to be in
  [0, 1] as it is applied directly after a softmax transformation. The
@@ -130,6 +135,10 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
  Region Proposal Network (RPN).
first_stage_localization_loss_weight: A float
first_stage_objectness_loss_weight: A float
second_stage_target_assigner: Target assigner to use for second stage of
R-FCN. If the model is configured with multiple prediction heads, this
target assigner is used to generate targets for all heads (with the
correct `unmatched_class_label`).
second_stage_rfcn_box_predictor: RFCN box predictor to use for
  second stage.
second_stage_batch_size: The batch size used for computing the
@@ -137,9 +146,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
  "batch size" refers to the number of proposals selected as contributing
  to the loss function for any given image within the image batch and is
  only called "batch_size" due to terminology from the Faster R-CNN paper.
second_stage_sampler: The sampler for the boxes used by the second stage
  box classifier.
second_stage_non_max_suppression_fn: batch_multiclass_non_max_suppression
  callable that takes `boxes`, `scores`, optional `clip_window` and
  optional (kwarg) `mask` inputs (with all other inputs already set)
@@ -163,6 +171,9 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
use_matmul_crop_and_resize: Force the use of matrix multiplication based
  crop and resize instead of standard tf.image.crop_and_resize while
  computing second stage input feature maps.
clip_anchors_to_image: Whether the generated anchors are clipped to the
  window size instead of filtering out non-overlapping anchors. This
  yields a static number of anchors. This argument is unused.
Raises:
  ValueError: If `second_stage_batch_size` > `first_stage_max_proposals`
@@ -178,12 +189,13 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
feature_extractor,
number_of_stages,
first_stage_anchor_generator,
first_stage_target_assigner,
first_stage_atrous_rate,
first_stage_box_predictor_arg_scope_fn,
first_stage_box_predictor_kernel_size,
first_stage_box_predictor_depth,
first_stage_minibatch_size,
first_stage_sampler,
first_stage_nms_score_threshold,
first_stage_nms_iou_threshold,
first_stage_max_proposals,
@@ -192,9 +204,10 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
None,  # initial_crop_size is not used in R-FCN
None,  # maxpool_kernel_size is not used in R-FCN
None,  # maxpool_stride is not used in R-FCN
second_stage_target_assigner,
None,  # fully_connected_box_predictor is not used in R-FCN.
second_stage_batch_size,
second_stage_sampler,
second_stage_non_max_suppression_fn,
second_stage_score_conversion_fn,
second_stage_localization_loss_weight,
@@ -274,11 +287,16 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
rpn_features,
scope=self.second_stage_feature_extractor_scope))

if self._rfcn_box_predictor.is_keras_model:
  box_predictions = self._rfcn_box_predictor(
      [box_classifier_features],
      proposal_boxes=proposal_boxes_normalized)
else:
  box_predictions = self._rfcn_box_predictor.predict(
      [box_classifier_features],
      num_predictions_per_location=[1],
      scope=self.second_stage_box_predictor_scope,
      proposal_boxes=proposal_boxes_normalized)
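The meta architectures now branch on an `is_keras_model` property: Keras box predictors are invoked directly (via `keras.Model.__call__`), while Slim predictors go through `.predict(...)` with an explicit scope. The dispatch pattern can be sketched with minimal hypothetical classes (not the real predictors):

```python
class SlimStylePredictor:
    is_keras_model = False

    def predict(self, features, scope=None):
        # Slim predictors require an explicit name scope argument.
        return {'box_encodings': [f * 2 for f in features]}

class KerasStylePredictor:
    is_keras_model = True

    def __call__(self, features):
        # Keras models scope their variables implicitly via their own name.
        return {'box_encodings': [f * 2 for f in features]}

def run_predictor(predictor, features):
    """Dispatch to whichever calling convention the predictor supports."""
    if predictor.is_keras_model:
        return predictor(features)
    return predictor.predict(features, scope='BoxPredictor')
```

Exposing the flag as a property (rather than `isinstance` checks) lets both Slim and Keras implementations share one code path in the meta architecture.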
refined_box_encodings = tf.squeeze(
    tf.concat(box_predictions[box_predictor.BOX_ENCODINGS], axis=1), axis=1)
class_predictions_with_background = tf.squeeze(
...
@@ -35,7 +35,7 @@ slim = tf.contrib.slim
class SSDFeatureExtractor(object):
"""SSD Slim Feature Extractor definition."""

def __init__(self,
             is_training,
@@ -77,6 +77,10 @@ class SSDFeatureExtractor(object):
self._override_base_feature_extractor_hyperparams = (
    override_base_feature_extractor_hyperparams)
@property
def is_keras_model(self):
return False
@abstractmethod
def preprocess(self, resized_inputs):
"""Preprocesses images for feature extraction (minus image resizing).
@@ -113,6 +117,105 @@ class SSDFeatureExtractor(object):
raise NotImplementedError
class SSDKerasFeatureExtractor(tf.keras.Model):
"""SSD Feature Extractor definition."""
def __init__(self,
is_training,
depth_multiplier,
min_depth,
pad_to_multiple,
conv_hyperparams_config,
freeze_batchnorm,
inplace_batchnorm_update,
use_explicit_padding=False,
use_depthwise=False,
override_base_feature_extractor_hyperparams=False):
"""Constructor.
Args:
is_training: whether the network is in training mode.
depth_multiplier: float depth multiplier for feature extractor.
min_depth: minimum feature extractor depth.
pad_to_multiple: the nearest multiple to zero pad the input height and
width dimensions to.
conv_hyperparams_config: A hyperparams.proto object containing
convolution hyperparameters for the layers added on top of the
base feature extractor.
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
inplace_batchnorm_update: Whether to update batch norm moving average
values inplace. When this is false, the train op must add a control
dependency on the tf.GraphKeys.UPDATE_OPS collection in order to update
batch norm statistics.
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_config`.
"""
super(SSDKerasFeatureExtractor, self).__init__()
self._is_training = is_training
self._depth_multiplier = depth_multiplier
self._min_depth = min_depth
self._pad_to_multiple = pad_to_multiple
self._conv_hyperparams_config = conv_hyperparams_config
self._freeze_batchnorm = freeze_batchnorm
self._inplace_batchnorm_update = inplace_batchnorm_update
self._use_explicit_padding = use_explicit_padding
self._use_depthwise = use_depthwise
self._override_base_feature_extractor_hyperparams = (
override_base_feature_extractor_hyperparams)
@property
def is_keras_model(self):
return True
@abstractmethod
def preprocess(self, resized_inputs):
"""Preprocesses images for feature extraction (minus image resizing).
Args:
resized_inputs: a [batch, height, width, channels] float tensor
representing a batch of images.
Returns:
preprocessed_inputs: a [batch, height, width, channels] float tensor
representing a batch of images.
true_image_shapes: int32 tensor of shape [batch, 3] where each row is
of the form [height, width, channels] indicating the shapes
of true images in the resized images, as resized images can be padded
with zeros.
"""
raise NotImplementedError
@abstractmethod
def _extract_features(self, preprocessed_inputs):
"""Extracts features from preprocessed inputs.
This function is responsible for extracting feature maps from preprocessed
images.
Args:
preprocessed_inputs: a [batch, height, width, channels] float tensor
representing a batch of images.
Returns:
feature_maps: a list of tensors where the ith tensor has shape
[batch, height_i, width_i, depth_i]
"""
raise NotImplementedError
# This overrides the keras.Model `call` method with the _extract_features
# method.
def call(self, inputs, **kwargs):
return self._extract_features(inputs)
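Overriding `call` to delegate to `_extract_features` means a Keras feature extractor instance can be invoked like a function (`feature_maps = extractor(inputs)`), while subclasses only implement `_extract_features`. The template-method pattern at work can be sketched without TensorFlow (class names are illustrative):

```python
class BaseExtractor:
    """Mimics keras.Model's call-delegation used by SSDKerasFeatureExtractor."""

    def __call__(self, inputs, **kwargs):
        # Mirrors keras.Model invoking `call`, which here forwards to the
        # subclass-provided _extract_features.
        return self._extract_features(inputs)

    def _extract_features(self, inputs):
        raise NotImplementedError

class DoublingExtractor(BaseExtractor):
    def _extract_features(self, inputs):
        # Returns a list of "feature maps"; here just doubled values.
        return [[2 * v for v in inputs]]
```

This is why the SSD meta architecture can call `self._feature_extractor(preprocessed_inputs)` directly for Keras extractors, with no explicit `extract_features` call or variable scope.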
class SSDMetaArch(model.DetectionModel):
"""SSD Meta-architecture definition."""
@@ -211,10 +314,6 @@ class SSDMetaArch(model.DetectionModel):
self._freeze_batchnorm = freeze_batchnorm
self._inplace_batchnorm_update = inplace_batchnorm_update

self._anchor_generator = anchor_generator
self._box_predictor = box_predictor
@@ -224,21 +323,30 @@ class SSDMetaArch(model.DetectionModel):
self._region_similarity_calculator = region_similarity_calculator
self._add_background_class = add_background_class
# Needed for fine-tuning from classification checkpoints whose
# variables do not have the feature extractor scope.
if self._feature_extractor.is_keras_model:
# Keras feature extractors will have a name they implicitly use to scope.
# So, all contained variables are prefixed by this name.
# To load from classification checkpoints, need to filter out this name.
self._extract_features_scope = feature_extractor.name
else:
# Slim feature extractors get an explicit naming scope
self._extract_features_scope = 'FeatureExtractor'
# TODO(jonathanhuang): handle agnostic mode
# weights
self._unmatched_class_label = tf.constant([1] + self.num_classes * [0],
                                          tf.float32)
if encode_background_as_zeros:
  self._unmatched_class_label = tf.constant((self.num_classes + 1) * [0],
                                            tf.float32)

self._target_assigner = target_assigner.TargetAssigner(
    self._region_similarity_calculator,
    self._matcher,
    self._box_coder,
    negative_class_weight=negative_class_weight)
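The unmatched (background) class label is now stored on the model and passed to the target assigner at assignment time instead of being fixed at construction. The two encodings reduce to a simple vector choice, sketched here as plain Python (the helper name is illustrative):

```python
def unmatched_class_label(num_classes, encode_background_as_zeros):
    """Target vector assigned to anchors that match no groundtruth box.

    With an explicit background class, the vector is one-hot on index 0
    ([1, 0, ..., 0], length num_classes + 1). With background encoded as
    zeros, it is the all-zero vector of the same length.
    """
    if encode_background_as_zeros:
        return [0.0] * (num_classes + 1)
    return [1.0] + [0.0] * num_classes
```

Passing this per call (rather than baking it into the TargetAssigner) lets models with multiple prediction heads reuse one assigner with different unmatched labels.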
self._classification_loss = classification_loss
self._localization_loss = localization_loss
@@ -383,41 +491,53 @@ class SSDMetaArch(model.DetectionModel):
"""
batchnorm_updates_collections = (None if self._inplace_batchnorm_update
                                 else tf.GraphKeys.UPDATE_OPS)
if self._feature_extractor.is_keras_model:
  feature_maps = self._feature_extractor(preprocessed_inputs)
else:
  with slim.arg_scope([slim.batch_norm],
                      is_training=(self._is_training and
                                   not self._freeze_batchnorm),
                      updates_collections=batchnorm_updates_collections):
    with tf.variable_scope(None, self._extract_features_scope,
                           [preprocessed_inputs]):
      feature_maps = self._feature_extractor.extract_features(
          preprocessed_inputs)

feature_map_spatial_dims = self._get_feature_map_spatial_dims(
    feature_maps)
image_shape = shape_utils.combined_static_and_dynamic_shape(
    preprocessed_inputs)
self._anchors = box_list_ops.concatenate(
    self._anchor_generator.generate(
        feature_map_spatial_dims,
        im_height=image_shape[1],
        im_width=image_shape[2]))
if self._box_predictor.is_keras_model:
  prediction_dict = self._box_predictor(feature_maps)
else:
  with slim.arg_scope([slim.batch_norm],
                      is_training=(self._is_training and
                                   not self._freeze_batchnorm),
                      updates_collections=batchnorm_updates_collections):
    prediction_dict = self._box_predictor.predict(
        feature_maps, self._anchor_generator.num_anchors_per_location())

box_encodings = tf.concat(prediction_dict['box_encodings'], axis=1)
if box_encodings.shape.ndims == 4 and box_encodings.shape[2] == 1:
  box_encodings = tf.squeeze(box_encodings, axis=2)
class_predictions_with_background = tf.concat(
    prediction_dict['class_predictions_with_background'], axis=1)
predictions_dict = {
    'preprocessed_inputs': preprocessed_inputs,
    'box_encodings': box_encodings,
    'class_predictions_with_background':
        class_predictions_with_background,
    'feature_maps': feature_maps,
    'anchors': self._anchors.get()
}
self._batched_prediction_tensor_names = [x for x in predictions_dict
                                         if x != 'anchors']
return predictions_dict
def _get_feature_map_spatial_dims(self, feature_maps):
"""Return list of spatial dimensions for each feature map in a list.
@@ -710,7 +830,8 @@ class SSDMetaArch(model.DetectionModel):
boxlist.add_field(fields.BoxListFields.keypoints, keypoints)
return target_assigner.batch_assign_targets(
    self._target_assigner, self.anchors, groundtruth_boxlists,
    groundtruth_classes_with_background_list, self._unmatched_class_label,
    groundtruth_weights_list)
def _summarize_target_assignment(self, groundtruth_boxes_list, match_list):
"""Creates tensorflow summaries for the input boxes and anchors.
@@ -872,3 +993,4 @@ class SSDMetaArch(model.DetectionModel):
variables_to_restore[var_name] = variable
return variables_to_restore
...
@@ -15,6 +15,8 @@
"""Tests for object_detection.meta_architectures.ssd_meta_arch."""
import functools
from absl.testing import parameterized
import numpy as np
import tensorflow as tf
@@ -29,6 +31,7 @@ from object_detection.utils import test_case
from object_detection.utils import test_utils
slim = tf.contrib.slim
keras = tf.keras.layers
class FakeSSDFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
@@ -51,6 +54,30 @@ class FakeSSDFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
return [features]
class FakeSSDKerasFeatureExtractor(ssd_meta_arch.SSDKerasFeatureExtractor):
def __init__(self):
with tf.name_scope('mock_model'):
super(FakeSSDKerasFeatureExtractor, self).__init__(
is_training=True,
depth_multiplier=0,
min_depth=0,
pad_to_multiple=1,
conv_hyperparams_config=None,
freeze_batchnorm=False,
inplace_batchnorm_update=False,
)
self._conv = keras.Conv2D(filters=32, kernel_size=1, name='layer1')
def preprocess(self, resized_inputs):
return tf.identity(resized_inputs)
def _extract_features(self, preprocessed_inputs, **kwargs):
with tf.name_scope('mock_model'):
return [self._conv(preprocessed_inputs)]
class MockAnchorGenerator2x2(anchor_generator.AnchorGenerator):
"""Sets up a simple 2x2 anchor grid on the unit square."""
@@ -79,20 +106,32 @@ def _get_value_for_matching_key(dictionary, suffix):
raise ValueError('key not found {}'.format(suffix))

@parameterized.parameters(
    {'use_keras': False},
    {'use_keras': True},
)
class SsdMetaArchTest(test_case.TestCase, parameterized.TestCase):
def _create_model(self,
                  apply_hard_mining=True,
                  normalize_loc_loss_by_codesize=False,
                  add_background_class=True,
                  random_example_sampling=False,
                  use_keras=False):
  is_training = False
  num_classes = 1
  mock_anchor_generator = MockAnchorGenerator2x2()
  if use_keras:
    mock_box_predictor = test_utils.MockKerasBoxPredictor(
        is_training, num_classes)
  else:
    mock_box_predictor = test_utils.MockBoxPredictor(
        is_training, num_classes)
  mock_box_coder = test_utils.MockBoxCoder()
  if use_keras:
    fake_feature_extractor = FakeSSDKerasFeatureExtractor()
  else:
    fake_feature_extractor = FakeSSDFeatureExtractor()
  mock_matcher = test_utils.MockMatcher()
  region_similarity_calculator = sim_calc.IouSimilarity()
  encode_background_as_zeros = False
@@ -152,25 +191,26 @@ class SsdMetaArchTest(test_case.TestCase):
      random_example_sampler=random_example_sampler)
  return model, num_classes, mock_anchor_generator.num_anchors(), code_size
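`absl.testing.parameterized` expands each test method once per parameter dict, so every test body receives `use_keras` and exercises both the Slim and Keras code paths. The same expansion can be sketched with a plain `unittest` class decorator (a simplified illustration, not absl's actual implementation):

```python
import unittest

def parameters(*param_dicts):
    """Class decorator: clone each test_* method once per parameter dict."""
    def decorate(cls):
        for name, method in list(vars(cls).items()):
            if not name.startswith('test'):
                continue
            delattr(cls, name)
            for i, params in enumerate(param_dicts):
                # Bind method and params via defaults to avoid late binding.
                def make(m=method, p=params):
                    return lambda self: m(self, **p)
                setattr(cls, '%s_%d' % (name, i), make())
        return cls
    return decorate

@parameters({'use_keras': False}, {'use_keras': True})
class DemoTest(unittest.TestCase):
    def test_flag(self, use_keras):
        self.assertIn(use_keras, (False, True))
```

Each original method disappears and is replaced by one generated method per parameter set, which is why the test names in the real suite gain a suffix per variant.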
  def test_preprocess_preserves_shapes_with_dynamic_input_image(
      self, use_keras):
    image_shapes = [(3, None, None, 3),
                    (None, 10, 10, 3),
                    (None, None, None, 3)]
    model, _, _, _ = self._create_model(use_keras=use_keras)
    for image_shape in image_shapes:
      image_placeholder = tf.placeholder(tf.float32, shape=image_shape)
      preprocessed_inputs, _ = model.preprocess(image_placeholder)
      self.assertAllEqual(preprocessed_inputs.shape.as_list(), image_shape)

  def test_preprocess_preserves_shape_with_static_input_image(self, use_keras):
    def graph_fn(input_image):
      model, _, _, _ = self._create_model(use_keras=use_keras)
      return model.preprocess(input_image)
    input_image = np.random.rand(2, 3, 3, 3).astype(np.float32)
    preprocessed_inputs, _ = self.execute(graph_fn, [input_image])
    self.assertAllEqual(preprocessed_inputs.shape, [2, 3, 3, 3])

  def test_predict_result_shapes_on_image_with_dynamic_shape(self, use_keras):
    batch_size = 3
    image_size = 2
    input_shapes = [(None, image_size, image_size, 3),
@@ -180,16 +220,17 @@ class SsdMetaArchTest(test_case.TestCase):
    for input_shape in input_shapes:
      tf_graph = tf.Graph()
      with tf_graph.as_default():
        model, num_classes, num_anchors, code_size = self._create_model(
            use_keras=use_keras)
        preprocessed_input_placeholder = tf.placeholder(tf.float32,
                                                        shape=input_shape)
        prediction_dict = model.predict(
            preprocessed_input_placeholder, true_image_shapes=None)

        self.assertIn('box_encodings', prediction_dict)
        self.assertIn('class_predictions_with_background', prediction_dict)
        self.assertIn('feature_maps', prediction_dict)
        self.assertIn('anchors', prediction_dict)
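The switch from `assertTrue('key' in d)` to `assertIn('key', d)` is behavior-preserving but yields a far more useful failure message, since `assertIn` reports both the missing key and the container being searched. A small illustration (the probe class here is just scaffolding to get at the assertion methods):

```python
import unittest

class _Probe(unittest.TestCase):
  def runTest(self):
    pass

case = _Probe()
prediction_dict = {'box_encodings': [0.1, 0.2]}
case.assertIn('box_encodings', prediction_dict)  # passes

try:
  case.assertIn('anchors', prediction_dict)
except AssertionError as err:
  # assertTrue would only have said "False is not true"; assertIn
  # names the missing key and shows the dictionary it searched.
  message = str(err)
```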
        init_op = tf.global_variables_initializer()
      with self.test_session(graph=tf_graph) as sess:
@@ -210,10 +251,11 @@ class SsdMetaArchTest(test_case.TestCase):
          prediction_out['class_predictions_with_background'].shape,
          expected_class_predictions_with_background_shape_out)

  def test_predict_result_shapes_on_image_with_static_shape(self, use_keras):
    with tf.Graph().as_default():
      _, num_classes, num_anchors, code_size = self._create_model(
          use_keras=use_keras)

    def graph_fn(input_image):
      model, _, _, _ = self._create_model()
@@ -235,7 +277,7 @@ class SsdMetaArchTest(test_case.TestCase):
    self.assertAllEqual(class_predictions.shape,
                        expected_class_predictions_shape)
  def test_postprocess_results_are_correct(self, use_keras):
    batch_size = 2
    image_size = 2
    input_shapes = [(batch_size, image_size, image_size, 3),
@@ -266,17 +308,17 @@ class SsdMetaArchTest(test_case.TestCase):
    for input_shape in input_shapes:
      tf_graph = tf.Graph()
      with tf_graph.as_default():
        model, _, _, _ = self._create_model(use_keras=use_keras)
        input_placeholder = tf.placeholder(tf.float32, shape=input_shape)
        preprocessed_inputs, true_image_shapes = model.preprocess(
            input_placeholder)
        prediction_dict = model.predict(preprocessed_inputs,
                                        true_image_shapes)
        detections = model.postprocess(prediction_dict, true_image_shapes)
        self.assertIn('detection_boxes', detections)
        self.assertIn('detection_scores', detections)
        self.assertIn('detection_classes', detections)
        self.assertIn('num_detections', detections)
        init_op = tf.global_variables_initializer()
      with self.test_session(graph=tf_graph) as sess:
        sess.run(init_op)
@@ -295,10 +337,10 @@ class SsdMetaArchTest(test_case.TestCase):
      self.assertAllClose(detections_out['num_detections'],
                          expected_num_detections)
  def test_loss_results_are_correct(self, use_keras):
    with tf.Graph().as_default():
      _, num_classes, num_anchors, _ = self._create_model(use_keras=use_keras)

    def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
                 groundtruth_classes1, groundtruth_classes2):
      groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
@@ -331,16 +373,18 @@ class SsdMetaArchTest(test_case.TestCase):
    self.assertAllClose(localization_loss, expected_localization_loss)
    self.assertAllClose(classification_loss, expected_classification_loss)

  def test_loss_results_are_correct_with_normalize_by_codesize_true(
      self, use_keras):
    with tf.Graph().as_default():
      _, _, _, _ = self._create_model(use_keras=use_keras)

    def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
                 groundtruth_classes1, groundtruth_classes2):
      groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
      groundtruth_classes_list = [groundtruth_classes1, groundtruth_classes2]
      model, _, _, _ = self._create_model(apply_hard_mining=False,
                                          normalize_loc_loss_by_codesize=True,
                                          use_keras=use_keras)
      model.provide_groundtruth(groundtruth_boxes_list,
                                groundtruth_classes_list)
      prediction_dict = model.predict(preprocessed_tensor,
@@ -362,10 +406,10 @@ class SsdMetaArchTest(test_case.TestCase):
                                groundtruth_classes2])
    self.assertAllClose(localization_loss, expected_localization_loss)
  def test_loss_results_are_correct_with_hard_example_mining(self, use_keras):
    with tf.Graph().as_default():
      _, num_classes, num_anchors, _ = self._create_model(use_keras=use_keras)

    def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
                 groundtruth_classes1, groundtruth_classes2):
      groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
@@ -397,18 +441,20 @@ class SsdMetaArchTest(test_case.TestCase):
    self.assertAllClose(localization_loss, expected_localization_loss)
    self.assertAllClose(classification_loss, expected_classification_loss)

  def test_loss_results_are_correct_without_add_background_class(
      self, use_keras):
    with tf.Graph().as_default():
      _, num_classes, num_anchors, _ = self._create_model(
          add_background_class=False, use_keras=use_keras)

    def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
                 groundtruth_classes1, groundtruth_classes2):
      groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
      groundtruth_classes_list = [groundtruth_classes1, groundtruth_classes2]
      model, _, _, _ = self._create_model(
          apply_hard_mining=False, add_background_class=False,
          use_keras=use_keras)
      model.provide_groundtruth(groundtruth_boxes_list,
                                groundtruth_classes_list)
      prediction_dict = model.predict(
@@ -434,8 +480,8 @@ class SsdMetaArchTest(test_case.TestCase):
    self.assertAllClose(localization_loss, expected_localization_loss)
    self.assertAllClose(classification_loss, expected_classification_loss)
  def test_restore_map_for_detection_ckpt(self, use_keras):
    model, _, _, _ = self._create_model(use_keras=use_keras)
    model.predict(tf.constant(np.array([[[[0, 0], [1, 1]], [[1, 0], [0, 1]]]],
                                       dtype=np.float32)),
                  true_image_shapes=None)
@@ -454,14 +500,22 @@ class SsdMetaArchTest(test_case.TestCase):
      for var in sess.run(tf.report_uninitialized_variables()):
        self.assertNotIn('FeatureExtractor', var)

  def test_restore_map_for_classification_ckpt(self, use_keras):
    # Define mock tensorflow classification graph and save variables.
    test_graph_classification = tf.Graph()
    with test_graph_classification.as_default():
      image = tf.placeholder(dtype=tf.float32, shape=[1, 20, 20, 3])
      if use_keras:
        with tf.name_scope('mock_model'):
          layer_one = keras.Conv2D(32, kernel_size=1, name='layer1')
          net = layer_one(image)
          layer_two = keras.Conv2D(3, kernel_size=1, name='layer2')
          layer_two(net)
      else:
        with tf.variable_scope('mock_model'):
          net = slim.conv2d(image, num_outputs=32, kernel_size=1,
                            scope='layer1')
          slim.conv2d(net, num_outputs=3, kernel_size=1, scope='layer2')

      init_op = tf.global_variables_initializer()
      saver = tf.train.Saver()
@@ -474,7 +528,7 @@ class SsdMetaArchTest(test_case.TestCase):
    # classification checkpoint.
    test_graph_detection = tf.Graph()
    with test_graph_detection.as_default():
      model, _, _, _ = self._create_model(use_keras=use_keras)
      inputs_shape = [2, 2, 2, 3]
      inputs = tf.to_float(tf.random_uniform(
          inputs_shape, minval=0, maxval=255, dtype=tf.int32))
@@ -491,10 +545,10 @@ class SsdMetaArchTest(test_case.TestCase):
      for var in sess.run(tf.report_uninitialized_variables()):
        self.assertNotIn('FeatureExtractor', var)

  def test_load_all_det_checkpoint_vars(self, use_keras):
    test_graph_detection = tf.Graph()
    with test_graph_detection.as_default():
      model, _, _, _ = self._create_model(use_keras=use_keras)
      inputs_shape = [2, 2, 2, 3]
      inputs = tf.to_float(
          tf.random_uniform(inputs_shape, minval=0, maxval=255, dtype=tf.int32))
@@ -508,18 +562,22 @@ class SsdMetaArchTest(test_case.TestCase):
    self.assertIsInstance(var_map, dict)
    self.assertIn('another_variable', var_map)
  def test_loss_results_are_correct_with_random_example_sampling(
      self,
      use_keras):
    with tf.Graph().as_default():
      _, num_classes, num_anchors, _ = self._create_model(
          random_example_sampling=True,
          use_keras=use_keras)
    print(num_classes, num_anchors)

    def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
                 groundtruth_classes1, groundtruth_classes2):
      groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
      groundtruth_classes_list = [groundtruth_classes1, groundtruth_classes2]
      model, _, _, _ = self._create_model(random_example_sampling=True,
                                          use_keras=use_keras)
      model.provide_groundtruth(groundtruth_boxes_list,
                                groundtruth_classes_list)
      prediction_dict = model.predict(
...
@@ -202,6 +202,10 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
    params = params or {}
    total_loss, train_op, detections, export_outputs = None, None, None, None
    is_training = mode == tf.estimator.ModeKeys.TRAIN

    # Make sure to set the Keras learning phase. True during training,
    # False for inference.
    tf.keras.backend.set_learning_phase(is_training)
    detection_model = detection_model_fn(is_training=is_training,
                                         add_summaries=(not use_tpu))
    scaffold_fn = None
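The `set_learning_phase` call matters because Keras layers consult a process-wide flag: dropout and batch normalization switch between training and inference behavior even when the call site never passes `training=...`. A toy, TensorFlow-free sketch of that pattern (all names here are illustrative stand-ins, not the real Keras backend):

```python
# Toy stand-in for the Keras backend learning phase; not real TF code.
_learning_phase = False

def set_learning_phase(value):
  """Mimics the role of tf.keras.backend.set_learning_phase: a global flag."""
  global _learning_phase
  _learning_phase = bool(value)

def dropout(inputs, rate=0.5):
  """Toy dropout: identity at inference, inverted-rate scaling in training.

  (Real dropout also randomly zeroes units; that part is omitted so the
  sketch stays deterministic.)
  """
  if not _learning_phase:
    return list(inputs)
  return [v / (1.0 - rate) for v in inputs]

set_learning_phase(True)    # what model_fn does when mode == TRAIN
train_out = dropout([1.0, 2.0])
set_learning_phase(False)   # inference / eval
eval_out = dropout([1.0, 2.0])
```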
@@ -279,7 +283,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
    if mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL):
      losses_dict = detection_model.loss(
          prediction_dict, features[fields.InputDataFields.true_image_shape])
      losses = [loss_tensor for loss_tensor in losses_dict.values()]
      if train_config.add_regularization_loss:
        regularization_losses = tf.get_collection(
            tf.GraphKeys.REGULARIZATION_LOSSES)
...
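The `itervalues()` → `values()` change above is a Python 3 compatibility fix: `dict.itervalues()` no longer exists in Python 3, while `values()` works under both versions (a list in Python 2, a view in Python 3, either of which the comprehension can iterate). Illustrated with hypothetical loss values:

```python
# Hypothetical loss entries, just to show the iteration pattern.
losses_dict = {
    'Loss/localization_loss': 0.7,
    'Loss/classification_loss': 1.3,
}

# Python-3-safe: dict.values() exists on both major versions, whereas
# losses_dict.itervalues() raises AttributeError on Python 3.
losses = [loss_tensor for loss_tensor in losses_dict.values()]
total_loss = sum(losses)
```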
@@ -221,8 +221,8 @@ def fpn_top_down_feature_maps(image_features, depth, scope=None):
            depth, [3, 3],
            scope='smoothing_%d' % (level + 1)))
        output_feature_map_keys.append('top_down_%s' % image_features[level][0])
  return collections.OrderedDict(reversed(
      list(zip(output_feature_map_keys, output_feature_maps_list))))


def pooling_pyramid_feature_maps(base_feature_map_depth, num_layers,
@@ -288,4 +288,3 @@ def pooling_pyramid_feature_maps(base_feature_map_depth, num_layers,
    feature_maps.append(feature_map)
  return collections.OrderedDict(
      [(x, y) for (x, y) in zip(feature_map_keys, feature_maps)])
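The `fpn_top_down_feature_maps` change is also a Python 3 fix: in Python 3, `zip()` returns a one-shot iterator, and `reversed()` requires a sequence, so `reversed(zip(...))` raises `TypeError`. Materializing with `list()` first restores the Python 2 behavior. A sketch with illustrative FPN-style keys (the real code pairs feature-map names with tensors):

```python
import collections

keys = ['top_down_block2', 'top_down_block3', 'top_down_block4']
values = [2, 3, 4]

# Python 2 allowed reversed(zip(...)) because zip returned a list;
# Python 3 needs the explicit list() before reversing.
ordered = collections.OrderedDict(reversed(list(zip(keys, values))))
```

The result keeps the coarsest-to-finest ordering the FPN code wants while remaining valid on both Python versions.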