Unverified Commit 02a9969e authored by pkulzc, committed by GitHub

Refactor object detection box predictors and fix some issues with model_main. (#4965)

* Merged commit includes the following changes:
206852642  by Zhichao Lu:

    Builds the balanced_positive_negative_sampler in the model builder for FasterRCNN. Also adds an option to use the static implementation of the sampler.

--
206803260  by Zhichao Lu:

    Fixes a misplaced argument in resnet fpn feature extractor.

--
206682736  by Zhichao Lu:

    This CL modifies the SSD meta architecture to support both Slim-based and Keras-based box predictors, and begins preparation for Keras box predictor support in the other meta architectures.

    Concretely, this CL adds a new `KerasBoxPredictor` base class and makes the meta architectures appropriately call whichever box predictors they are using.

    We can switch the non-ssd meta architectures to fully support Keras box predictors once the Keras Convolutional Box Predictor CL is submitted.
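    A minimal, illustrative sketch of the dual-path dispatch this introduces (the same pattern appears in the faster_rcnn_meta_arch diff further down); the stub predictor classes here are hypothetical, not the library's real ones:

```python
class SlimStylePredictor(object):
  """Stub standing in for a Slim-based box predictor."""
  is_keras_model = False

  def predict(self, image_features, num_predictions_per_location, scope=None):
    return {'box_encodings': image_features}


class KerasStylePredictor(object):
  """Stub standing in for a KerasBoxPredictor subclass."""
  is_keras_model = True

  def __call__(self, image_features):
    return {'box_encodings': image_features}


def run_box_predictor(predictor, image_features):
  if predictor.is_keras_model:
    # Keras layers are called directly; they manage their own variables,
    # so no num_predictions_per_location/scope plumbing is needed here.
    return predictor(image_features)
  return predictor.predict(image_features,
                           num_predictions_per_location=[1],
                           scope='BoxPredictor')
```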

--
206669634  by Zhichao Lu:

    Adds an alternate method for the balanced positive/negative sampler that uses static shapes.

--
206643278  by Zhichao Lu:

    This CL adds a Keras layer hyperparameter configuration object to the hyperparams_builder.

    It automatically converts from Slim layer hyperparameter configs to Keras layer hyperparameters. Namely, it:
    - Builds Keras initializers/regularizers instead of Slim ones
    - Sets weights_regularizer/initializer to kernel_regularizer/initializer
    - Converts batchnorm decay to momentum
    - Converts Slim l2 regularizer weights to the equivalent Keras l2 weights

    This will be used in the conversion of object detection feature extractors & box predictors to newer Tensorflow APIs.
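    A hedged sketch of the conversion rules listed above (names and values here are illustrative, not the builder's actual code):

```python
import tensorflow as tf


def slim_to_keras_hyperparams(weights_l2_scale, batch_norm_decay):
  """Illustrative Slim -> Keras hyperparameter mapping."""
  return {
      # weights_regularizer -> kernel_regularizer. Slim's l2_regularizer(s)
      # computes s * sum(w**2) / 2, while Keras l2(l) computes l * sum(w**2),
      # so the equivalent Keras weight is half the Slim weight.
      'kernel_regularizer': tf.keras.regularizers.l2(0.5 * weights_l2_scale),
      # weights_initializer -> kernel_initializer (a Keras initializer).
      'kernel_initializer': tf.keras.initializers.TruncatedNormal(stddev=0.03),
      # Slim batch_norm 'decay' is the EMA factor Keras calls 'momentum'.
      'batch_norm_momentum': batch_norm_decay,
  }
```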

--
206611681  by Zhichao Lu:

    Internal changes.

--
206591619  by Zhichao Lu:

    Clips input tensors to the expected shape when they are larger than the expected padded static shape.
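    The change routes padding through a clip-then-pad helper (shape_utils.pad_or_clip_nd in the inputs.py diff below). A minimal 1-D sketch of that behavior, for illustration only:

```python
import tensorflow as tf


def pad_or_clip_1d(tensor, target_len):
  # First clip any excess, then pad up to the static target length.
  clipped = tensor[:target_len]
  pad_amount = target_len - tf.shape(clipped)[0]
  padded = tf.pad(clipped, [[0, pad_amount]])
  padded.set_shape([target_len])  # shape is now fully static
  return padded
```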

--
206517644  by Zhichao Lu:

    Make MultiscaleGridAnchorGenerator more consistent with MultipleGridAnchorGenerator.

--
206415624  by Zhichao Lu:

    Make the hardcoded feature pyramid network (FPN) levels configurable for both SSD
    Resnet and SSD Mobilenet.

--
206398204  by Zhichao Lu:

    This CL modifies the SSD meta architecture to support both Slim-based and Keras-based feature extractors.

    This allows us to begin the conversion of object detection to newer Tensorflow APIs.

--
206213448  by Zhichao Lu:

    Adding a method to compute the expected classification loss by background/foreground weighting.

--
206204232  by Zhichao Lu:

    Adding the keypoint head to the Mask RCNN pipeline.

--
206200352  by Zhichao Lu:

    - Create the Faster R-CNN target assigner in the model builder. This allows configuring matchers in the target assigner to use TPU-compatible ops (tf.gather in this case) without any change in the meta architecture.
    - As a positive side effect of the refactoring, we can now re-use a single target assigner for all of the second-stage heads in Faster R-CNN.
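    For reference, a TPU-friendly gather can be expressed as a one-hot matmul; a minimal sketch of that idea (the library's utils/ops.py has a function along these lines, but treat this snippet as illustrative):

```python
import tensorflow as tf


def matmul_gather_on_zeroth_axis(params, indices):
  """Gather rows of a 2-D `params` via one-hot matmul instead of tf.gather."""
  one_hot = tf.one_hot(indices, depth=tf.shape(params)[0], dtype=params.dtype)
  return tf.matmul(one_hot, params)
```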

--
206178206  by Zhichao Lu:

    Force the SSD feature extractor builder to use keyword arguments so values won't be passed to the wrong arguments.

--
206168297  by Zhichao Lu:

    Updating exporter to use freeze_graph.freeze_graph_with_def_protos rather than a homegrown version.

--
206080748  by Zhichao Lu:

    Merge external contributions.

--
206074460  by Zhichao Lu:

    Updates the preprocessor to apply a temperature and a softmax to the multiclass scores on read.
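    In effect (a hedged sketch, not necessarily the exact preprocessor code):

```python
import tensorflow as tf


def temperature_softmax(multiclass_scores, temperature=1.0):
  # Higher temperature flattens the distribution; lower sharpens it.
  return tf.nn.softmax(multiclass_scores / temperature)
```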

--
205960802  by Zhichao Lu:

    Fixing a bug in the hierarchical label expansion script.

--
205944686  by Zhichao Lu:

    Update the exporter to support exporting quantized models.

--
205912529  by Zhichao Lu:

    Add a two-stage matcher to allow thresholding by one criterion and then argmaxing on the other.
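    The ThresholdedIouSimilarity test in the diff below exercises this; a minimal sketch of the threshold-then-score idea (illustrative only):

```python
import tensorflow as tf


def thresholded_iou_similarity(iou, row_scores, iou_threshold):
  """iou: [N, M] pairwise IoU; row_scores: [N] scores for the first box set."""
  above = tf.cast(iou > iou_threshold, iou.dtype)  # 1 where IoU clears
  return tf.expand_dims(row_scores, -1) * above    # broadcast to [N, M]
```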

--
205909017  by Zhichao Lu:

    Add test for grayscale image_resizer

--
205892801  by Zhichao Lu:

    Add a flag to decide whether to apply batch norm to the conv layers of the weight-shared box predictor.

--
205824449  by Zhichao Lu:

    Make sure that, by default, the Mask R-CNN box predictor predicts 2 stages.

--
205730139  by Zhichao Lu:

    Updating warning message to be more explicit about variable size mismatch.

--
205696992  by Zhichao Lu:

    Remove utils/ops.py's dependency on core/box_list_ops.py. This will allow re-using TPU compatible ops from utils/ops.py in core/box_list_ops.py.

--
205696867  by Zhichao Lu:

    Refactoring the Mask R-CNN predictor to put each head in a separate file.
    This CL lets us add new heads to Mask R-CNN more easily in the future.

--
205492073  by Zhichao Lu:

    Refactor R-FCN box predictor to be TPU compliant.

    - Change utils/ops.py:position_sensitive_crop_regions to operate on a single image and set of boxes without `box_ind`.
    - Add a batched version that operates on batches of images and batches of boxes.
    - Refactor the R-FCN box predictor to use the batched version of position-sensitive crop regions.

--
205453567  by Zhichao Lu:

    Fix a bug where the inference graph could not be exported when the write_inference_graph flag is True.

--
205316039  by Zhichao Lu:

    Changing input tensor name.

--
205256307  by Zhichao Lu:

    Fix model zoo links for quantized model.

--
205164432  by Zhichao Lu:

    Fixes an eval error when the label map contains non-ASCII characters.

--
205129842  by Zhichao Lu:

    Adds an option to clip the anchors to the window size without filtering the overlapping boxes in Faster R-CNN.

--
205094863  by Zhichao Lu:

    Update label map util to allow optionally adding a background class and filling in gaps in the label map. Useful for using multiclass scores, which require a complete label map with an explicit background label.
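    A hedged sketch of what "complete" means here (the helper below is hypothetical, not the library's API): id 0 becomes an explicit background entry and every id up to the maximum gets a row:

```python
def fill_label_map_gaps(label_map_dict, background_name='background'):
  """Hypothetical helper: name -> id dict in, gap-free dict out."""
  complete = {background_name: 0}  # explicit background class at id 0
  complete.update(label_map_dict)
  used_ids = set(complete.values())
  for missing_id in range(max(used_ids) + 1):
    if missing_id not in used_ids:
      complete['class_%d' % missing_id] = missing_id  # placeholder entry
  return complete
```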

--
204989032  by Zhichao Lu:

    Add tf.prof support to exporter.

--
204825267  by Zhichao Lu:

    Modify mask rcnn box predictor tests for TPU compatibility.

--
204778749  by Zhichao Lu:

    Remove score filtering from postprocessing.py and rely on filtering logic in tf.image.non_max_suppression
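    A short sketch of the replacement path: newer TF releases let NMS do the score filtering itself via its score_threshold argument:

```python
import tensorflow as tf

boxes = tf.constant([[0., 0., 1., 1.], [0., 0., .9, .9]])
scores = tf.constant([0.9, 0.01])
# score_threshold drops the 0.01 box before suppression runs.
selected_indices = tf.image.non_max_suppression(
    boxes, scores, max_output_size=10,
    iou_threshold=0.6, score_threshold=0.05)
```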

--
204775818  by Zhichao Lu:

    Python3 fixes for object_detection.

--
204745920  by Zhichao Lu:

    Object Detection Dataset visualization tool (documentation).

--
204686993  by Zhichao Lu:

    Internal changes.

--
204559667  by Zhichao Lu:

    Refactor box_predictor.py into multiple files.
    The abstract base class remains in object_detection/core; the other classes have each moved to a separate file in object_detection/predictors.

--
204552847  by Zhichao Lu:

    Update blog post link.

--
204508028  by Zhichao Lu:

    Bump the batch size down to 1024 to be a bit more tolerant of OOM, and double the number of iterations. This job still converges to 20.5 mAP in 3 hours.

--

PiperOrigin-RevId: 206852642

* Add original post-processing back.
parent d135ed9c
......@@ -18,6 +18,7 @@ import tensorflow as tf
from object_detection.core import box_list
from object_detection.core import region_similarity_calculator
from object_detection.core import standard_fields as fields
class RegionSimilarityCalculatorTest(tf.test.TestCase):
......@@ -70,6 +71,25 @@ class RegionSimilarityCalculatorTest(tf.test.TestCase):
self.assertAllClose(iou_output_1, exp_output_1)
self.assertAllClose(iou_output_2, exp_output_2)
def test_get_correct_pairwise_similarity_based_on_thresholded_iou(self):
corners1 = tf.constant([[4.0, 3.0, 7.0, 5.0], [5.0, 6.0, 10.0, 7.0]])
corners2 = tf.constant([[3.0, 4.0, 6.0, 8.0], [14.0, 14.0, 15.0, 15.0],
[0.0, 0.0, 20.0, 20.0]])
scores = tf.constant([.3, .6])
iou_threshold = .013
exp_output = tf.constant([[0.3, 0., 0.3], [0.6, 0., 0.]])
boxes1 = box_list.BoxList(corners1)
boxes1.add_field(fields.BoxListFields.scores, scores)
boxes2 = box_list.BoxList(corners2)
iou_similarity_calculator = (
region_similarity_calculator.ThresholdedIouSimilarity(
iou_threshold=iou_threshold))
iou_similarity = iou_similarity_calculator.compare(boxes1, boxes2)
with self.test_session() as sess:
iou_output = sess.run(iou_similarity)
self.assertAllClose(iou_output, exp_output)
if __name__ == '__main__':
tf.test.main()
......@@ -49,7 +49,7 @@ class TargetAssigner(object):
"""Target assigner to compute classification and regression targets."""
def __init__(self, similarity_calc, matcher, box_coder,
negative_class_weight=1.0, unmatched_cls_target=None):
negative_class_weight=1.0):
"""Construct Object Detection Target Assigner.
Args:
......@@ -60,12 +60,6 @@ class TargetAssigner(object):
groundtruth boxes with respect to anchors.
negative_class_weight: classification weight to be associated to negative
anchors (default: 1.0). The weight must be in [0., 1.].
unmatched_cls_target: a float32 tensor with shape [d_1, d_2, ..., d_k]
which is consistent with the classification target for each
anchor (and can be empty for scalar targets). This shape must thus be
compatible with the groundtruth labels that are passed to the "assign"
function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
If set to None, unmatched_cls_target is set to be [0] for each anchor.
Raises:
ValueError: if similarity_calc is not a RegionSimilarityCalculator or
......@@ -81,17 +75,14 @@ class TargetAssigner(object):
self._matcher = matcher
self._box_coder = box_coder
self._negative_class_weight = negative_class_weight
if unmatched_cls_target is None:
self._unmatched_cls_target = tf.constant([0], tf.float32)
else:
self._unmatched_cls_target = unmatched_cls_target
@property
def box_coder(self):
return self._box_coder
# TODO(rathodv): move labels, scores, and weights to groundtruth_boxes fields.
def assign(self, anchors, groundtruth_boxes, groundtruth_labels=None,
groundtruth_weights=None, **params):
unmatched_class_label=None, groundtruth_weights=None, **params):
"""Assign classification and regression targets to each anchor.
For a given set of anchors and groundtruth detections, match anchors
......@@ -110,6 +101,12 @@ class TargetAssigner(object):
[d_1, ... d_k] can be empty (corresponding to scalar inputs). When set
to None, groundtruth_labels assumes a binary problem where all
ground_truth boxes get a positive label (of 1).
unmatched_class_label: a float32 tensor with shape [d_1, d_2, ..., d_k]
which is consistent with the classification target for each
anchor (and can be empty for scalar targets). This shape must thus be
compatible with the groundtruth labels that are passed to the "assign"
function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
If set to None, unmatched_cls_target is set to be [0] for each anchor.
groundtruth_weights: a float tensor of shape [M] indicating the weight to
assign to all anchors match to a particular groundtruth box. The weights
must be in [0., 1.]. If None, all weights are set to 1.
......@@ -136,14 +133,17 @@ class TargetAssigner(object):
if not isinstance(groundtruth_boxes, box_list.BoxList):
      raise ValueError('groundtruth_boxes must be a BoxList')
if unmatched_class_label is None:
unmatched_class_label = tf.constant([0], tf.float32)
if groundtruth_labels is None:
groundtruth_labels = tf.ones(tf.expand_dims(groundtruth_boxes.num_boxes(),
0))
groundtruth_labels = tf.expand_dims(groundtruth_labels, -1)
unmatched_shape_assert = shape_utils.assert_shape_equal(
shape_utils.combined_static_and_dynamic_shape(groundtruth_labels)[1:],
shape_utils.combined_static_and_dynamic_shape(
self._unmatched_cls_target))
shape_utils.combined_static_and_dynamic_shape(unmatched_class_label))
labels_and_box_shapes_assert = shape_utils.assert_shape_equal(
shape_utils.combined_static_and_dynamic_shape(
groundtruth_labels)[:1],
......@@ -155,6 +155,12 @@ class TargetAssigner(object):
if not num_gt_boxes:
num_gt_boxes = groundtruth_boxes.num_boxes()
groundtruth_weights = tf.ones([num_gt_boxes], dtype=tf.float32)
# set scores on the gt boxes
scores = 1 - groundtruth_labels[:, 0]
groundtruth_boxes.add_field(fields.BoxListFields.scores, scores)
with tf.control_dependencies(
[unmatched_shape_assert, labels_and_box_shapes_assert]):
match_quality_matrix = self._similarity_calc.compare(groundtruth_boxes,
......@@ -164,6 +170,7 @@ class TargetAssigner(object):
groundtruth_boxes,
match)
cls_targets = self._create_classification_targets(groundtruth_labels,
unmatched_class_label,
match)
reg_weights = self._create_regression_weights(match, groundtruth_weights)
cls_weights = self._create_classification_weights(match,
......@@ -245,7 +252,8 @@ class TargetAssigner(object):
"""
return tf.constant([self._box_coder.code_size*[0]], tf.float32)
def _create_classification_targets(self, groundtruth_labels, match):
def _create_classification_targets(self, groundtruth_labels,
unmatched_class_label, match):
"""Create classification targets for each anchor.
    Assign a classification target for each anchor to the matching
......@@ -256,6 +264,11 @@ class TargetAssigner(object):
groundtruth_labels: a tensor of shape [num_gt_boxes, d_1, ... d_k]
with labels for each of the ground_truth boxes. The subshape
[d_1, ... d_k] can be empty (corresponding to scalar labels).
unmatched_class_label: a float32 tensor with shape [d_1, d_2, ..., d_k]
which is consistent with the classification target for each
anchor (and can be empty for scalar targets). This shape must thus be
compatible with the groundtruth labels that are passed to the "assign"
function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
match: a matcher.Match object that provides a matching between anchors
and groundtruth boxes.
......@@ -266,8 +279,8 @@ class TargetAssigner(object):
"""
return match.gather_based_on_match(
groundtruth_labels,
unmatched_value=self._unmatched_cls_target,
ignored_value=self._unmatched_cls_target)
unmatched_value=unmatched_class_label,
ignored_value=unmatched_class_label)
def _create_regression_weights(self, match, groundtruth_weights):
"""Set regression weight for each anchor.
......@@ -327,8 +340,7 @@ class TargetAssigner(object):
# TODO(rathodv): This method pulls in all the implementation dependencies into
# core. Therefore its best to have this factory method outside of core.
def create_target_assigner(reference, stage=None,
negative_class_weight=1.0,
unmatched_cls_target=None):
negative_class_weight=1.0, use_matmul_gather=False):
"""Factory function for creating standard target assigners.
Args:
......@@ -336,12 +348,8 @@ def create_target_assigner(reference, stage=None,
stage: string denoting stage: {proposal, detection}.
negative_class_weight: classification weight to be associated to negative
anchors (default: 1.0)
unmatched_cls_target: a float32 tensor with shape [d_1, d_2, ..., d_k]
which is consistent with the classification target for each
anchor (and can be empty for scalar targets). This shape must thus be
compatible with the groundtruth labels that are passed to the Assign
function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
If set to None, unmatched_cls_target is set to be 0 for each anchor.
    use_matmul_gather: whether to use a matrix multiplication based gather,
      which is better suited for TPUs.
Returns:
TargetAssigner: desired target assigner.
......@@ -358,7 +366,8 @@ def create_target_assigner(reference, stage=None,
similarity_calc = sim_calc.IouSimilarity()
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.7,
unmatched_threshold=0.3,
force_match_for_each_row=True)
force_match_for_each_row=True,
use_matmul_gather=use_matmul_gather)
box_coder = faster_rcnn_box_coder.FasterRcnnBoxCoder(
scale_factors=[10.0, 10.0, 5.0, 5.0])
......@@ -366,7 +375,8 @@ def create_target_assigner(reference, stage=None,
similarity_calc = sim_calc.IouSimilarity()
# Uses all proposals with IOU < 0.5 as candidate negatives.
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
negatives_lower_than_unmatched=True)
negatives_lower_than_unmatched=True,
use_matmul_gather=use_matmul_gather)
box_coder = faster_rcnn_box_coder.FasterRcnnBoxCoder(
scale_factors=[10.0, 10.0, 5.0, 5.0])
......@@ -375,21 +385,22 @@ def create_target_assigner(reference, stage=None,
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
unmatched_threshold=0.1,
force_match_for_each_row=False,
negatives_lower_than_unmatched=False)
negatives_lower_than_unmatched=False,
use_matmul_gather=use_matmul_gather)
box_coder = faster_rcnn_box_coder.FasterRcnnBoxCoder()
else:
raise ValueError('No valid combination of reference and stage.')
return TargetAssigner(similarity_calc, matcher, box_coder,
negative_class_weight=negative_class_weight,
unmatched_cls_target=unmatched_cls_target)
negative_class_weight=negative_class_weight)
def batch_assign_targets(target_assigner,
anchors_batch,
gt_box_batch,
gt_class_targets_batch,
unmatched_class_label=None,
gt_weights_batch=None):
"""Batched assignment of classification and regression targets.
......@@ -403,6 +414,11 @@ def batch_assign_targets(target_assigner,
each tensor has shape [num_gt_boxes_i, classification_target_size] and
num_gt_boxes_i is the number of boxes in the ith boxlist of
gt_box_batch.
unmatched_class_label: a float32 tensor with shape [d_1, d_2, ..., d_k]
which is consistent with the classification target for each
anchor (and can be empty for scalar targets). This shape must thus be
compatible with the groundtruth labels that are passed to the "assign"
function (which have shape [num_gt_boxes, d_1, d_2, ..., d_k]).
gt_weights_batch: A list of 1-D tf.float32 tensors of shape
[num_boxes] containing weights for groundtruth boxes.
......@@ -442,9 +458,9 @@ def batch_assign_targets(target_assigner,
gt_weights_batch = [None] * len(gt_class_targets_batch)
for anchors, gt_boxes, gt_class_targets, gt_weights in zip(
anchors_batch, gt_box_batch, gt_class_targets_batch, gt_weights_batch):
(cls_targets, cls_weights, reg_targets,
reg_weights, match) = target_assigner.assign(
anchors, gt_boxes, gt_class_targets, gt_weights)
(cls_targets, cls_weights, reg_targets, reg_weights,
match) = target_assigner.assign(anchors, gt_boxes, gt_class_targets,
unmatched_class_label, gt_weights)
cls_targets_list.append(cls_targets)
cls_weights_list.append(cls_weights)
reg_targets_list.append(reg_targets)
......
......@@ -37,10 +37,11 @@ class TargetAssignerTest(test_case.TestCase):
unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder, unmatched_cls_target=None)
similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist)
result = target_assigner.assign(
anchors_boxlist, groundtruth_boxlist, unmatched_class_label=None)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights)
......@@ -81,10 +82,11 @@ class TargetAssignerTest(test_case.TestCase):
unmatched_threshold=0.3)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder, unmatched_cls_target=None)
similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist)
result = target_assigner.assign(
anchors_boxlist, groundtruth_boxlist, unmatched_class_label=None)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights)
......@@ -120,12 +122,13 @@ class TargetAssignerTest(test_case.TestCase):
box_coder = keypoint_box_coder.KeypointBoxCoder(
num_keypoints=6, scale_factors=[10.0, 10.0, 5.0, 5.0])
target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder, unmatched_cls_target=None)
similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
groundtruth_boxlist.add_field(fields.BoxListFields.keypoints,
groundtruth_keypoints)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist)
result = target_assigner.assign(
anchors_boxlist, groundtruth_boxlist, unmatched_class_label=None)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights)
......@@ -174,12 +177,13 @@ class TargetAssignerTest(test_case.TestCase):
box_coder = keypoint_box_coder.KeypointBoxCoder(
num_keypoints=6, scale_factors=[10.0, 10.0, 5.0, 5.0])
target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder, unmatched_cls_target=None)
similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
groundtruth_boxlist.add_field(fields.BoxListFields.keypoints,
groundtruth_keypoints)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist)
result = target_assigner.assign(
anchors_boxlist, groundtruth_boxlist, unmatched_class_label=None)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights)
......@@ -221,15 +225,17 @@ class TargetAssignerTest(test_case.TestCase):
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
unmatched_cls_target = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
unmatched_class_label = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder,
unmatched_cls_target=unmatched_cls_target)
similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist,
groundtruth_labels)
result = target_assigner.assign(
anchors_boxlist,
groundtruth_boxlist,
groundtruth_labels,
unmatched_class_label=unmatched_class_label)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights)
......@@ -275,16 +281,18 @@ class TargetAssignerTest(test_case.TestCase):
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
unmatched_cls_target = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
unmatched_class_label = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder,
unmatched_cls_target=unmatched_cls_target)
similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist,
groundtruth_labels,
groundtruth_weights)
result = target_assigner.assign(
anchors_boxlist,
groundtruth_boxlist,
groundtruth_labels,
unmatched_class_label=unmatched_class_label,
groundtruth_weights=groundtruth_weights)
(_, cls_weights, _, reg_weights, _) = result
return (cls_weights, reg_weights)
......@@ -318,15 +326,17 @@ class TargetAssignerTest(test_case.TestCase):
unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
unmatched_cls_target = tf.constant([[0, 0], [0, 0]], tf.float32)
unmatched_class_label = tf.constant([[0, 0], [0, 0]], tf.float32)
target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder,
unmatched_cls_target=unmatched_cls_target)
similarity_calc, matcher, box_coder)
anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist,
groundtruth_labels)
result = target_assigner.assign(
anchors_boxlist,
groundtruth_boxlist,
groundtruth_labels,
unmatched_class_label=unmatched_class_label)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights)
......@@ -371,14 +381,16 @@ class TargetAssignerTest(test_case.TestCase):
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
unmatched_cls_target = tf.constant([0, 0, 0], tf.float32)
unmatched_class_label = tf.constant([0, 0, 0], tf.float32)
anchors_boxlist = box_list.BoxList(anchor_means)
groundtruth_boxlist = box_list.BoxList(groundtruth_box_corners)
target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder,
unmatched_cls_target=unmatched_cls_target)
result = target_assigner.assign(anchors_boxlist, groundtruth_boxlist,
groundtruth_labels)
similarity_calc, matcher, box_coder)
result = target_assigner.assign(
anchors_boxlist,
groundtruth_boxlist,
groundtruth_labels,
unmatched_class_label=unmatched_class_label)
(cls_targets, cls_weights, reg_targets, reg_weights, _) = result
return (cls_targets, cls_weights, reg_targets, reg_weights)
......@@ -415,10 +427,9 @@ class TargetAssignerTest(test_case.TestCase):
similarity_calc = region_similarity_calculator.NegSqDistSimilarity()
matcher = bipartite_matcher.GreedyBipartiteMatcher()
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder()
unmatched_cls_target = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
unmatched_class_label = tf.constant([1, 0, 0, 0, 0, 0, 0], tf.float32)
target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder,
unmatched_cls_target=unmatched_cls_target)
similarity_calc, matcher, box_coder)
prior_means = tf.constant([[0.0, 0.0, 0.5, 0.5],
[0.5, 0.5, 1.0, 0.8],
......@@ -436,17 +447,20 @@ class TargetAssignerTest(test_case.TestCase):
[0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 1, 0, 0, 0]], tf.float32)
with self.assertRaisesRegexp(ValueError, 'Unequal shapes'):
target_assigner.assign(priors, boxes, groundtruth_labels,
num_valid_rows=3)
target_assigner.assign(
priors,
boxes,
groundtruth_labels,
unmatched_class_label=unmatched_class_label,
num_valid_rows=3)
def test_raises_error_on_invalid_groundtruth_labels(self):
similarity_calc = region_similarity_calculator.NegSqDistSimilarity()
matcher = bipartite_matcher.GreedyBipartiteMatcher()
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=1.0)
unmatched_cls_target = tf.constant([[0, 0], [0, 0], [0, 0]], tf.float32)
unmatched_class_label = tf.constant([[0, 0], [0, 0], [0, 0]], tf.float32)
target_assigner = targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder,
unmatched_cls_target=unmatched_cls_target)
similarity_calc, matcher, box_coder)
prior_means = tf.constant([[0.0, 0.0, 0.5, 0.5]])
priors = box_list.BoxList(prior_means)
......@@ -458,41 +472,22 @@ class TargetAssignerTest(test_case.TestCase):
groundtruth_labels = tf.constant([[[0, 1], [1, 0]]], tf.float32)
with self.assertRaises(ValueError):
target_assigner.assign(priors, boxes, groundtruth_labels,
num_valid_rows=3)
target_assigner.assign(
priors,
boxes,
groundtruth_labels,
unmatched_class_label=unmatched_class_label,
num_valid_rows=3)
class BatchTargetAssignerTest(test_case.TestCase):
def _get_agnostic_target_assigner(self):
def _get_target_assigner(self):
similarity_calc = region_similarity_calculator.IouSimilarity()
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
return targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder,
unmatched_cls_target=None)
def _get_multi_class_target_assigner(self, num_classes):
similarity_calc = region_similarity_calculator.IouSimilarity()
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
unmatched_cls_target = tf.constant([1] + num_classes * [0], tf.float32)
return targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder,
unmatched_cls_target=unmatched_cls_target)
def _get_multi_dimensional_target_assigner(self, target_dimensions):
similarity_calc = region_similarity_calculator.IouSimilarity()
matcher = argmax_matcher.ArgMaxMatcher(matched_threshold=0.5,
unmatched_threshold=0.5)
box_coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
unmatched_cls_target = tf.constant(np.zeros(target_dimensions),
tf.float32)
return targetassigner.TargetAssigner(
similarity_calc, matcher, box_coder,
unmatched_cls_target=unmatched_cls_target)
return targetassigner.TargetAssigner(similarity_calc, matcher, box_coder)
def test_batch_assign_targets(self):
......@@ -502,7 +497,7 @@ class BatchTargetAssignerTest(test_case.TestCase):
gt_box_batch = [box_list1, box_list2]
gt_class_targets = [None, None]
anchors_boxlist = box_list.BoxList(anchor_means)
agnostic_target_assigner = self._get_agnostic_target_assigner()
agnostic_target_assigner = self._get_target_assigner()
(cls_targets, cls_weights, reg_targets, reg_weights,
_) = targetassigner.batch_assign_targets(
agnostic_target_assigner, anchors_boxlist, gt_box_batch,
......@@ -550,12 +545,13 @@ class BatchTargetAssignerTest(test_case.TestCase):
gt_box_batch = [box_list1, box_list2]
gt_class_targets = [class_targets1, class_targets2]
anchors_boxlist = box_list.BoxList(anchor_means)
multiclass_target_assigner = self._get_multi_class_target_assigner(
num_classes=3)
multiclass_target_assigner = self._get_target_assigner()
num_classes = 3
unmatched_class_label = tf.constant([1] + num_classes * [0], tf.float32)
(cls_targets, cls_weights, reg_targets, reg_weights,
_) = targetassigner.batch_assign_targets(
multiclass_target_assigner, anchors_boxlist, gt_box_batch,
gt_class_targets)
gt_class_targets, unmatched_class_label)
return (cls_targets, cls_weights, reg_targets, reg_weights)
groundtruth_boxlist1 = np.array([[0., 0., 0.2, 0.2]], dtype=np.float32)
......@@ -613,12 +609,13 @@ class BatchTargetAssignerTest(test_case.TestCase):
gt_class_targets = [class_targets1, class_targets2]
gt_weights = [groundtruth_weights1, groundtruth_weights2]
anchors_boxlist = box_list.BoxList(anchor_means)
multiclass_target_assigner = self._get_multi_class_target_assigner(
num_classes=3)
multiclass_target_assigner = self._get_target_assigner()
num_classes = 3
unmatched_class_label = tf.constant([1] + num_classes * [0], tf.float32)
(cls_targets, cls_weights, reg_targets, reg_weights,
_) = targetassigner.batch_assign_targets(
multiclass_target_assigner, anchors_boxlist, gt_box_batch,
gt_class_targets, gt_weights)
gt_class_targets, unmatched_class_label, gt_weights)
return (cls_targets, cls_weights, reg_targets, reg_weights)
groundtruth_boxlist1 = np.array([[0., 0., 0.2, 0.2],
......@@ -680,12 +677,14 @@ class BatchTargetAssignerTest(test_case.TestCase):
gt_box_batch = [box_list1, box_list2]
gt_class_targets = [class_targets1, class_targets2]
anchors_boxlist = box_list.BoxList(anchor_means)
multiclass_target_assigner = self._get_multi_dimensional_target_assigner(
target_dimensions=(2, 3))
multiclass_target_assigner = self._get_target_assigner()
target_dimensions = (2, 3)
unmatched_class_label = tf.constant(np.zeros(target_dimensions),
tf.float32)
(cls_targets, cls_weights, reg_targets, reg_weights,
_) = targetassigner.batch_assign_targets(
multiclass_target_assigner, anchors_boxlist, gt_box_batch,
gt_class_targets)
gt_class_targets, unmatched_class_label)
return (cls_targets, cls_weights, reg_targets, reg_weights)
groundtruth_boxlist1 = np.array([[0., 0., 0.2, 0.2]], dtype=np.float32)
......@@ -754,13 +753,13 @@ class BatchTargetAssignerTest(test_case.TestCase):
gt_class_targets_batch = [gt_class_targets]
anchors_boxlist = box_list.BoxList(anchor_means)
multiclass_target_assigner = self._get_multi_class_target_assigner(
num_classes=3)
multiclass_target_assigner = self._get_target_assigner()
num_classes = 3
unmatched_class_label = tf.constant([1] + num_classes * [0], tf.float32)
(cls_targets, cls_weights, reg_targets, reg_weights,
_) = targetassigner.batch_assign_targets(
multiclass_target_assigner, anchors_boxlist,
gt_box_batch, gt_class_targets_batch)
gt_box_batch, gt_class_targets_batch, unmatched_class_label)
return (cls_targets, cls_weights, reg_targets, reg_weights)
groundtruth_box_corners = np.zeros((0, 4), dtype=np.float32)
......
......@@ -162,12 +162,12 @@ def main(parsed_args):
for line in source:
if not header:
header = line
target.writelines(header)
continue
if labels_file:
expanded_lines = expansion_generator.expand_labels_from_csv(line)
else:
expanded_lines = expansion_generator.expand_boxes_from_csv(line)
expanded_lines = [header] + expanded_lines
target.writelines(expanded_lines)
......
......@@ -140,10 +140,10 @@ def main(_):
]
else:
input_shape = None
exporter.export_inference_graph(FLAGS.input_type, pipeline_config,
FLAGS.trained_checkpoint_prefix,
FLAGS.output_directory, input_shape,
FLAGS.write_inference_graph)
exporter.export_inference_graph(
FLAGS.input_type, pipeline_config, FLAGS.trained_checkpoint_prefix,
FLAGS.output_directory, input_shape=input_shape,
write_inference_graph=FLAGS.write_inference_graph)
if __name__ == '__main__':
......
......@@ -258,6 +258,7 @@ def export_tflite_graph(pipeline_config, trained_checkpoint_prefix, output_dir,
restore_op_name='save/restore_all',
filename_tensor_name='save/Const:0',
clear_devices=True,
output_graph='',
initializer_nodes='')
# Add new operation to do post processing in a custom op (TF Lite only)
......
......@@ -14,17 +14,16 @@
# ==============================================================================
"""Functions to export object detection inference graph."""
import logging
import os
import tempfile
import tensorflow as tf
from tensorflow.core.protobuf import saver_pb2
from tensorflow.python import pywrap_tensorflow
from tensorflow.python.client import session
from tensorflow.python.framework import graph_util
from tensorflow.python.platform import gfile
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.tools import freeze_graph
from tensorflow.python.training import saver as saver_lib
from object_detection.builders import graph_rewriter_builder
from object_detection.builders import model_builder
from object_detection.core import standard_fields as fields
from object_detection.data_decoders import tf_example_decoder
......@@ -32,70 +31,7 @@ from object_detection.utils import config_util
slim = tf.contrib.slim
# TODO(derekjchow): Replace with freeze_graph.freeze_graph_with_def_protos when
# newer version of Tensorflow becomes more common.
def freeze_graph_with_def_protos(
input_graph_def,
input_saver_def,
input_checkpoint,
output_node_names,
restore_op_name,
filename_tensor_name,
clear_devices,
initializer_nodes,
variable_names_blacklist=''):
"""Converts all variables in a graph and checkpoint into constants."""
del restore_op_name, filename_tensor_name # Unused by updated loading code.
# 'input_checkpoint' may be a prefix if we're using Saver V2 format
if not saver_lib.checkpoint_exists(input_checkpoint):
raise ValueError(
'Input checkpoint "' + input_checkpoint + '" does not exist!')
if not output_node_names:
raise ValueError(
'You must supply the name of a node to --output_node_names.')
# Remove all the explicit device specifications for this node. This helps to
# make the graph more portable.
if clear_devices:
for node in input_graph_def.node:
node.device = ''
with tf.Graph().as_default():
tf.import_graph_def(input_graph_def, name='')
config = tf.ConfigProto(graph_options=tf.GraphOptions())
with session.Session(config=config) as sess:
if input_saver_def:
saver = saver_lib.Saver(saver_def=input_saver_def)
saver.restore(sess, input_checkpoint)
else:
var_list = {}
reader = pywrap_tensorflow.NewCheckpointReader(input_checkpoint)
var_to_shape_map = reader.get_variable_to_shape_map()
for key in var_to_shape_map:
try:
tensor = sess.graph.get_tensor_by_name(key + ':0')
except KeyError:
# This tensor doesn't exist in the graph (for example it's
# 'global_step' or a similar housekeeping element) so skip it.
continue
var_list[key] = tensor
saver = saver_lib.Saver(var_list=var_list)
saver.restore(sess, input_checkpoint)
if initializer_nodes:
sess.run(initializer_nodes)
variable_names_blacklist = (variable_names_blacklist.split(',') if
variable_names_blacklist else None)
output_graph_def = graph_util.convert_variables_to_constants(
sess,
input_graph_def,
output_node_names.split(','),
variable_names_blacklist=variable_names_blacklist)
return output_graph_def
freeze_graph_with_def_protos = freeze_graph.freeze_graph_with_def_protos
def replace_variable_values_with_moving_averages(graph,
......@@ -247,18 +183,6 @@ def _add_output_tensor_nodes(postprocessed_tensors,
return outputs
def write_frozen_graph(frozen_graph_path, frozen_graph_def):
"""Writes frozen graph to disk.
Args:
frozen_graph_path: Path to write inference graph.
frozen_graph_def: tf.GraphDef holding frozen graph.
"""
with gfile.GFile(frozen_graph_path, 'wb') as f:
f.write(frozen_graph_def.SerializeToString())
logging.info('%d ops in the final graph.', len(frozen_graph_def.node))
def write_saved_model(saved_model_path,
frozen_graph_def,
inputs,
......@@ -384,6 +308,7 @@ def _export_inference_graph(input_type,
output_collection_name=output_collection_name,
graph_hook_fn=graph_hook_fn)
profile_inference_graph(tf.get_default_graph())
saver_kwargs = {}
if use_moving_averages:
# This check is to be compatible with both version of SaverDef.
......@@ -421,16 +346,17 @@ def _export_inference_graph(input_type,
else:
output_node_names = ','.join(outputs.keys())
frozen_graph_def = freeze_graph_with_def_protos(
frozen_graph_def = freeze_graph.freeze_graph_with_def_protos(
input_graph_def=tf.get_default_graph().as_graph_def(),
input_saver_def=input_saver_def,
input_checkpoint=checkpoint_to_use,
output_node_names=output_node_names,
restore_op_name='save/restore_all',
filename_tensor_name='save/Const:0',
output_graph=frozen_graph_path,
clear_devices=True,
initializer_nodes='')
write_frozen_graph(frozen_graph_path, frozen_graph_def)
write_saved_model(saved_model_path, frozen_graph_def,
placeholder_tensor, outputs)
......@@ -461,6 +387,11 @@ def export_inference_graph(input_type,
"""
detection_model = model_builder.build(pipeline_config.model,
is_training=False)
graph_rewriter_fn = None
if pipeline_config.HasField('graph_rewriter'):
graph_rewriter_config = pipeline_config.graph_rewriter
graph_rewriter_fn = graph_rewriter_builder.build(graph_rewriter_config,
is_training=False)
_export_inference_graph(
input_type,
detection_model,
......@@ -470,7 +401,39 @@ def export_inference_graph(input_type,
additional_output_tensor_names,
input_shape,
output_collection_name,
graph_hook_fn=None,
graph_hook_fn=graph_rewriter_fn,
write_inference_graph=write_inference_graph)
pipeline_config.eval_config.use_moving_averages = False
config_util.save_pipeline_config(pipeline_config, output_directory)
def profile_inference_graph(graph):
"""Profiles the inference graph.
Prints model parameters and computation FLOPs given an inference graph.
BatchNorms are excluded from the parameter count because they are usually
folded. BatchNorm, Initializer, Regularizer and BiasAdd are not considered
in the FLOP count.
Args:
graph: the inference graph.
"""
tfprof_vars_option = (
tf.contrib.tfprof.model_analyzer.TRAINABLE_VARS_PARAMS_STAT_OPTIONS)
tfprof_flops_option = tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS
# Batchnorm is usually folded during inference.
tfprof_vars_option['trim_name_regexes'] = ['.*BatchNorm.*']
# Initializer and Regularizer are only used in training.
tfprof_flops_option['trim_name_regexes'] = [
'.*BatchNorm.*', '.*Initializer.*', '.*Regularizer.*', '.*BiasAdd.*'
]
tf.contrib.tfprof.model_analyzer.print_model_analysis(
graph,
tfprof_options=tfprof_vars_option)
tf.contrib.tfprof.model_analyzer.print_model_analysis(
graph,
tfprof_options=tfprof_flops_option)
......@@ -20,8 +20,10 @@ import six
import tensorflow as tf
from google.protobuf import text_format
from object_detection import exporter
from object_detection.builders import graph_rewriter_builder
from object_detection.builders import model_builder
from object_detection.core import model
from object_detection.protos import graph_rewriter_pb2
from object_detection.protos import pipeline_pb2
if six.PY2:
......@@ -75,8 +77,10 @@ class FakeModel(model.DetectionModel):
class ExportInferenceGraphTest(tf.test.TestCase):
def _save_checkpoint_from_mock_model(self, checkpoint_path,
use_moving_averages):
def _save_checkpoint_from_mock_model(self,
checkpoint_path,
use_moving_averages,
enable_quantization=False):
g = tf.Graph()
with g.as_default():
mock_model = FakeModel()
......@@ -86,20 +90,28 @@ class ExportInferenceGraphTest(tf.test.TestCase):
mock_model.postprocess(predictions, true_image_shapes)
if use_moving_averages:
tf.train.ExponentialMovingAverage(0.0).apply()
slim.get_or_create_global_step()
tf.train.get_or_create_global_step()
if enable_quantization:
graph_rewriter_config = graph_rewriter_pb2.GraphRewriter()
graph_rewriter_config.quantization.delay = 500000
graph_rewriter_fn = graph_rewriter_builder.build(
graph_rewriter_config, is_training=False)
graph_rewriter_fn()
saver = tf.train.Saver()
init = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init)
saver.save(sess, checkpoint_path)
def _load_inference_graph(self, inference_graph_path):
def _load_inference_graph(self, inference_graph_path, is_binary=True):
od_graph = tf.Graph()
with od_graph.as_default():
od_graph_def = tf.GraphDef()
with tf.gfile.GFile(inference_graph_path) as fid:
serialized_graph = fid.read()
od_graph_def.ParseFromString(serialized_graph)
if is_binary:
od_graph_def.ParseFromString(fid.read())
else:
text_format.Parse(fid.read(), od_graph_def)
tf.import_graph_def(od_graph_def, name='')
return od_graph
......@@ -284,6 +296,42 @@ class ExportInferenceGraphTest(tf.test.TestCase):
[var_name for var_name, _ in tf.train.list_variables(output_directory)])
self.assertTrue(expected_variables.issubset(actual_variables))
def test_export_model_with_quantization_nodes(self):
tmp_dir = self.get_temp_dir()
trained_checkpoint_prefix = os.path.join(tmp_dir, 'model.ckpt')
self._save_checkpoint_from_mock_model(
trained_checkpoint_prefix,
use_moving_averages=False,
enable_quantization=True)
output_directory = os.path.join(tmp_dir, 'output')
inference_graph_path = os.path.join(output_directory,
'inference_graph.pbtxt')
with mock.patch.object(
model_builder, 'build', autospec=True) as mock_builder:
mock_builder.return_value = FakeModel()
pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
text_format.Merge(
"""graph_rewriter {
quantization {
delay: 50000
activation_bits: 8
weight_bits: 8
}
}""", pipeline_config)
exporter.export_inference_graph(
input_type='image_tensor',
pipeline_config=pipeline_config,
trained_checkpoint_prefix=trained_checkpoint_prefix,
output_directory=output_directory,
write_inference_graph=True)
self._load_inference_graph(inference_graph_path, is_binary=False)
has_quant_nodes = False
for v in tf.global_variables():
if v.op.name.endswith('act_quant/min'):
has_quant_nodes = True
break
self.assertTrue(has_quant_nodes)
def test_export_model_with_all_output_nodes(self):
tmp_dir = self.get_temp_dir()
trained_checkpoint_prefix = os.path.join(tmp_dir, 'model.ckpt')
......@@ -564,16 +612,16 @@ class ExportInferenceGraphTest(tf.test.TestCase):
output_node_names = ','.join(outputs.keys())
saver = tf.train.Saver()
input_saver_def = saver.as_saver_def()
frozen_graph_def = exporter.freeze_graph_with_def_protos(
exporter.freeze_graph_with_def_protos(
input_graph_def=tf.get_default_graph().as_graph_def(),
input_saver_def=input_saver_def,
input_checkpoint=trained_checkpoint_prefix,
output_node_names=output_node_names,
restore_op_name='save/restore_all',
filename_tensor_name='save/Const:0',
output_graph=inference_graph_path,
clear_devices=True,
initializer_nodes='')
exporter.write_frozen_graph(inference_graph_path, frozen_graph_def)
inference_graph = self._load_inference_graph(inference_graph_path)
tf_example_np = np.expand_dims(self._create_tf_example(
......@@ -719,6 +767,7 @@ class ExportInferenceGraphTest(tf.test.TestCase):
output_node_names=output_node_names,
restore_op_name='save/restore_all',
filename_tensor_name='save/Const:0',
output_graph='',
clear_devices=True,
initializer_nodes='')
exporter.write_saved_model(
......
......@@ -48,6 +48,7 @@ pip install --user jupyter
pip install --user matplotlib
```
<!-- common_typos_disable -->
**Note**: sometimes `sudo apt-get install protobuf-compiler` installs a
Protobuf 3+ version for you, and some users have issues with 3.5.
If that is your case, try the [manual](#Manual-protobuf-compiler-installation-and-usage) installation.
......@@ -88,6 +89,7 @@ protoc object_detection/protos/*.proto --python_out=.
## Manual protobuf-compiler installation and usage
Download and install the 3.0 release of protoc, then unzip the file.
```bash
# From tensorflow/models/research/
wget -O protobuf.zip https://github.com/google/protobuf/releases/download/v3.0.0/protoc-3.0.0-linux-x86_64.zip
......
......@@ -219,20 +219,8 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
padded_tensor_dict = {}
for tensor_name in tensor_dict:
expected_shape = padding_shapes[tensor_name]
current_shape = shape_utils.combined_static_and_dynamic_shape(
tensor_dict[tensor_name])
trailing_paddings = [
expected_shape_dim - current_shape_dim if expected_shape_dim else 0
for expected_shape_dim, current_shape_dim in zip(
expected_shape, current_shape)
]
paddings = tf.stack([tf.zeros(len(trailing_paddings), dtype=tf.int32),
trailing_paddings],
axis=1)
padded_tensor_dict[tensor_name] = tf.pad(
tensor_dict[tensor_name], paddings=paddings)
padded_tensor_dict[tensor_name].set_shape(expected_shape)
padded_tensor_dict[tensor_name] = shape_utils.pad_or_clip_nd(
tensor_dict[tensor_name], padding_shapes[tensor_name])
return padded_tensor_dict
......@@ -529,7 +517,7 @@ def create_predict_input_fn(model_config, predict_input_config):
`ServingInputReceiver`.
"""
del params
example = tf.placeholder(dtype=tf.string, shape=[], name='input_feature')
example = tf.placeholder(dtype=tf.string, shape=[], name='tf_example')
num_classes = config_util.get_number_of_classes(model_config)
model = model_builder.build(model_config, is_training=False)
......
......@@ -657,6 +657,42 @@ class PadInputDataToStaticShapesFnTest(tf.test.TestCase):
padded_tensor_dict[fields.InputDataFields.groundtruth_classes]
.shape.as_list(), [3, 3])
def test_clip_boxes_and_classes(self):
input_tensor_dict = {
fields.InputDataFields.groundtruth_boxes:
tf.placeholder(tf.float32, [None, 4]),
fields.InputDataFields.groundtruth_classes:
tf.placeholder(tf.int32, [None, 3]),
}
padded_tensor_dict = inputs.pad_input_data_to_static_shapes(
tensor_dict=input_tensor_dict,
max_num_boxes=3,
num_classes=3,
spatial_image_shape=[5, 6])
self.assertAllEqual(
padded_tensor_dict[fields.InputDataFields.groundtruth_boxes]
.shape.as_list(), [3, 4])
self.assertAllEqual(
padded_tensor_dict[fields.InputDataFields.groundtruth_classes]
.shape.as_list(), [3, 3])
with self.test_session() as sess:
out_tensor_dict = sess.run(
padded_tensor_dict,
feed_dict={
input_tensor_dict[fields.InputDataFields.groundtruth_boxes]:
np.random.rand(5, 4),
input_tensor_dict[fields.InputDataFields.groundtruth_classes]:
np.random.rand(2, 3),
})
self.assertAllEqual(
out_tensor_dict[fields.InputDataFields.groundtruth_boxes].shape, [3, 4])
self.assertAllEqual(
out_tensor_dict[fields.InputDataFields.groundtruth_classes].shape,
[3, 3])
def test_do_not_pad_dynamic_images(self):
input_tensor_dict = {
fields.InputDataFields.image:
......
......@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Faster R-CNN meta-architecture definition.
General tensorflow implementation of Faster R-CNN detection models.
......@@ -98,7 +97,6 @@ from functools import partial
import tensorflow as tf
from object_detection.anchor_generators import grid_anchor_generator
from object_detection.core import balanced_positive_negative_sampler as sampler
from object_detection.core import box_list
from object_detection.core import box_list_ops
from object_detection.core import box_predictor
......@@ -107,6 +105,7 @@ from object_detection.core import model
from object_detection.core import post_processing
from object_detection.core import standard_fields as fields
from object_detection.core import target_assigner
from object_detection.predictors import convolutional_box_predictor
from object_detection.utils import ops
from object_detection.utils import shape_utils
......@@ -228,12 +227,13 @@ class FasterRCNNMetaArch(model.DetectionModel):
feature_extractor,
number_of_stages,
first_stage_anchor_generator,
first_stage_target_assigner,
first_stage_atrous_rate,
first_stage_box_predictor_arg_scope_fn,
first_stage_box_predictor_kernel_size,
first_stage_box_predictor_depth,
first_stage_minibatch_size,
first_stage_positive_balance_fraction,
first_stage_sampler,
first_stage_nms_score_threshold,
first_stage_nms_iou_threshold,
first_stage_max_proposals,
......@@ -242,9 +242,10 @@ class FasterRCNNMetaArch(model.DetectionModel):
initial_crop_size,
maxpool_kernel_size,
maxpool_stride,
second_stage_target_assigner,
second_stage_mask_rcnn_box_predictor,
second_stage_batch_size,
second_stage_balance_fraction,
second_stage_sampler,
second_stage_non_max_suppression_fn,
second_stage_score_conversion_fn,
second_stage_localization_loss_weight,
......@@ -254,7 +255,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
hard_example_miner=None,
parallel_iterations=16,
add_summaries=True,
use_matmul_crop_and_resize=False):
use_matmul_crop_and_resize=False,
clip_anchors_to_image=False):
"""FasterRCNNMetaArch Constructor.
Args:
......@@ -285,6 +287,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
first_stage_anchor_generator: An anchor_generator.AnchorGenerator object
(note that currently we only support
grid_anchor_generator.GridAnchorGenerator objects)
first_stage_target_assigner: Target assigner to use for first stage of
Faster R-CNN (RPN).
first_stage_atrous_rate: A single integer indicating the atrous rate for
the single convolution op which is applied to the `rpn_features_to_crop`
tensor to obtain a tensor to be used for box prediction. Some feature
......@@ -304,8 +308,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
"batch size" refers to the number of anchors selected as contributing
to the loss function for any given image within the image batch and is
only called "batch_size" due to terminology from the Faster R-CNN paper.
first_stage_positive_balance_fraction: Fraction of positive examples
per image for the RPN. The recommended value for Faster RCNN is 0.5.
first_stage_sampler: Sampler to use for first stage loss (RPN loss).
first_stage_nms_score_threshold: Score threshold for non max suppression
for the Region Proposal Network (RPN). This value is expected to be in
[0, 1] as it is applied directly after a softmax transformation. The
......@@ -325,6 +328,10 @@ class FasterRCNNMetaArch(model.DetectionModel):
max pool op on the cropped feature map during ROI pooling.
maxpool_stride: A single integer indicating the stride of the max pool
op on the cropped feature map during ROI pooling.
second_stage_target_assigner: Target assigner to use for second stage of
Faster R-CNN. If the model is configured with multiple prediction heads,
this target assigner is used to generate targets for all heads (with the
correct `unmatched_class_label`).
second_stage_mask_rcnn_box_predictor: Mask R-CNN box predictor to use for
the second stage.
second_stage_batch_size: The batch size used for computing the
......@@ -332,9 +339,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
"batch size" refers to the number of proposals selected as contributing
to the loss function for any given image within the image batch and is
only called "batch_size" due to terminology from the Faster R-CNN paper.
second_stage_balance_fraction: Fraction of positive examples to use
per image for the box classifier. The recommended value for Faster RCNN
is 0.25.
second_stage_sampler: Sampler to use for second stage loss (box
classifier loss).
second_stage_non_max_suppression_fn: batch_multiclass_non_max_suppression
callable that takes `boxes`, `scores`, optional `clip_window` and
optional (kwarg) `mask` inputs (with all other inputs already set)
......@@ -364,6 +370,9 @@ class FasterRCNNMetaArch(model.DetectionModel):
use_matmul_crop_and_resize: Force the use of matrix multiplication based
crop and resize instead of standard tf.image.crop_and_resize while
computing second stage input feature maps.
clip_anchors_to_image: Normally, anchors generated for a given image size
are pruned during training if they lie outside the image window. This
option clips the anchors to be within the image instead of pruning.
Raises:
ValueError: If `second_stage_batch_size` > `first_stage_max_proposals` at
......@@ -388,13 +397,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
self._feature_extractor = feature_extractor
self._number_of_stages = number_of_stages
# The first class is reserved as background.
unmatched_cls_target = tf.constant(
[1] + self._num_classes * [0], dtype=tf.float32)
self._proposal_target_assigner = target_assigner.create_target_assigner(
'FasterRCNN', 'proposal')
self._detector_target_assigner = target_assigner.create_target_assigner(
'FasterRCNN', 'detection', unmatched_cls_target=unmatched_cls_target)
self._proposal_target_assigner = first_stage_target_assigner
self._detector_target_assigner = second_stage_target_assigner
# Both proposal and detector target assigners use the same box coder
self._box_coder = self._proposal_target_assigner.box_coder
......@@ -407,14 +411,19 @@ class FasterRCNNMetaArch(model.DetectionModel):
first_stage_box_predictor_kernel_size)
self._first_stage_box_predictor_depth = first_stage_box_predictor_depth
self._first_stage_minibatch_size = first_stage_minibatch_size
self._first_stage_sampler = sampler.BalancedPositiveNegativeSampler(
positive_fraction=first_stage_positive_balance_fraction)
self._first_stage_box_predictor = box_predictor.ConvolutionalBoxPredictor(
self._is_training, num_classes=1,
conv_hyperparams_fn=self._first_stage_box_predictor_arg_scope_fn,
min_depth=0, max_depth=0, num_layers_before_predictor=0,
use_dropout=False, dropout_keep_prob=1.0, kernel_size=1,
box_code_size=self._box_coder.code_size)
self._first_stage_sampler = first_stage_sampler
self._first_stage_box_predictor = (
convolutional_box_predictor.ConvolutionalBoxPredictor(
self._is_training,
num_classes=1,
conv_hyperparams_fn=self._first_stage_box_predictor_arg_scope_fn,
min_depth=0,
max_depth=0,
num_layers_before_predictor=0,
use_dropout=False,
dropout_keep_prob=1.0,
kernel_size=1,
box_code_size=self._box_coder.code_size))
self._first_stage_nms_score_threshold = first_stage_nms_score_threshold
self._first_stage_nms_iou_threshold = first_stage_nms_iou_threshold
......@@ -435,8 +444,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
self._mask_rcnn_box_predictor = second_stage_mask_rcnn_box_predictor
self._second_stage_batch_size = second_stage_batch_size
self._second_stage_sampler = sampler.BalancedPositiveNegativeSampler(
positive_fraction=second_stage_balance_fraction)
self._second_stage_sampler = second_stage_sampler
self._second_stage_nms_fn = second_stage_non_max_suppression_fn
self._second_stage_score_conversion_fn = second_stage_score_conversion_fn
......@@ -454,6 +462,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
self._hard_example_miner = hard_example_miner
self._parallel_iterations = parallel_iterations
self.clip_anchors_to_image = clip_anchors_to_image
if self._number_of_stages <= 0 or self._number_of_stages > 3:
raise ValueError('Number of stages should be a value in {1, 2, 3}.')
......@@ -639,10 +649,14 @@ class FasterRCNNMetaArch(model.DetectionModel):
# the image window at training time and clipping at inference time.
clip_window = tf.to_float(tf.stack([0, 0, image_shape[1], image_shape[2]]))
if self._is_training:
(rpn_box_encodings, rpn_objectness_predictions_with_background,
anchors_boxlist) = self._remove_invalid_anchors_and_predictions(
rpn_box_encodings, rpn_objectness_predictions_with_background,
anchors_boxlist, clip_window)
if self.clip_anchors_to_image:
anchors_boxlist = box_list_ops.clip_to_window(
anchors_boxlist, clip_window, filter_nonoverlapping=False)
else:
(rpn_box_encodings, rpn_objectness_predictions_with_background,
anchors_boxlist) = self._remove_invalid_anchors_and_predictions(
rpn_box_encodings, rpn_objectness_predictions_with_background,
anchors_boxlist, clip_window)
else:
anchors_boxlist = box_list_ops.clip_to_window(
anchors_boxlist, clip_window)
......@@ -761,11 +775,16 @@ class FasterRCNNMetaArch(model.DetectionModel):
flattened_proposal_feature_maps,
scope=self.second_stage_feature_extractor_scope))
box_predictions = self._mask_rcnn_box_predictor.predict(
[box_classifier_features],
num_predictions_per_location=[1],
scope=self.second_stage_box_predictor_scope,
predict_boxes_and_classes=True)
if self._mask_rcnn_box_predictor.is_keras_model:
box_predictions = self._mask_rcnn_box_predictor(
[box_classifier_features],
prediction_stage=2)
else:
box_predictions = self._mask_rcnn_box_predictor.predict(
[box_classifier_features],
num_predictions_per_location=[1],
scope=self.second_stage_box_predictor_scope,
prediction_stage=2)
refined_box_encodings = tf.squeeze(
box_predictions[box_predictor.BOX_ENCODINGS],
......@@ -834,12 +853,16 @@ class FasterRCNNMetaArch(model.DetectionModel):
if self._is_training:
curr_box_classifier_features = prediction_dict['box_classifier_features']
detection_classes = prediction_dict['class_predictions_with_background']
mask_predictions = self._mask_rcnn_box_predictor.predict(
[curr_box_classifier_features],
num_predictions_per_location=[1],
scope=self.second_stage_box_predictor_scope,
predict_boxes_and_classes=False,
predict_auxiliary_outputs=True)
if self._mask_rcnn_box_predictor.is_keras_model:
mask_predictions = self._mask_rcnn_box_predictor(
[curr_box_classifier_features],
prediction_stage=3)
else:
mask_predictions = self._mask_rcnn_box_predictor.predict(
[curr_box_classifier_features],
num_predictions_per_location=[1],
scope=self.second_stage_box_predictor_scope,
prediction_stage=3)
prediction_dict['mask_predictions'] = tf.squeeze(mask_predictions[
box_predictor.MASK_PREDICTIONS], axis=1)
else:
......@@ -865,12 +888,16 @@ class FasterRCNNMetaArch(model.DetectionModel):
flattened_detected_feature_maps,
scope=self.second_stage_feature_extractor_scope))
mask_predictions = self._mask_rcnn_box_predictor.predict(
[curr_box_classifier_features],
num_predictions_per_location=[1],
scope=self.second_stage_box_predictor_scope,
predict_boxes_and_classes=False,
predict_auxiliary_outputs=True)
if self._mask_rcnn_box_predictor.is_keras_model:
mask_predictions = self._mask_rcnn_box_predictor(
[curr_box_classifier_features],
prediction_stage=3)
else:
mask_predictions = self._mask_rcnn_box_predictor.predict(
[curr_box_classifier_features],
num_predictions_per_location=[1],
scope=self.second_stage_box_predictor_scope,
prediction_stage=3)
detection_masks = tf.squeeze(mask_predictions[
box_predictor.MASK_PREDICTIONS], axis=1)
......@@ -976,10 +1003,14 @@ class FasterRCNNMetaArch(model.DetectionModel):
if len(num_anchors_per_location) != 1:
raise RuntimeError('anchor_generator is expected to generate anchors '
'corresponding to a single feature map.')
box_predictions = self._first_stage_box_predictor.predict(
[rpn_box_predictor_features],
num_anchors_per_location,
scope=self.first_stage_box_predictor_scope)
if self._first_stage_box_predictor.is_keras_model:
box_predictions = self._first_stage_box_predictor(
[rpn_box_predictor_features])
else:
box_predictions = self._first_stage_box_predictor.predict(
[rpn_box_predictor_features],
num_anchors_per_location,
scope=self.first_stage_box_predictor_scope)
box_encodings = tf.concat(
box_predictions[box_predictor.BOX_ENCODINGS], axis=1)
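This Keras-versus-Slim dispatch recurs for every predictor touched by this change; a condensed sketch of the idiom (the helper name is illustrative only, not part of the change):

def _run_box_predictor(predictor, features, num_predictions_per_location,
                       scope):
  # Keras predictors are plain callables; their variables are scoped by the
  # layer/model name rather than an explicit variable scope.
  if predictor.is_keras_model:
    return predictor(features)
  # Slim predictors keep the explicit predict() call and variable scope.
  return predictor.predict(
      features, num_predictions_per_location, scope=scope)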
......@@ -1393,8 +1424,11 @@ class FasterRCNNMetaArch(model.DetectionModel):
a BoxList contained sampled proposals.
"""
(cls_targets, cls_weights, _, _, _) = self._detector_target_assigner.assign(
proposal_boxlist, groundtruth_boxlist,
groundtruth_classes_with_background)
proposal_boxlist,
groundtruth_boxlist,
groundtruth_classes_with_background,
unmatched_class_label=tf.constant(
[1] + self._num_classes * [0], dtype=tf.float32))
# Selects all boxes as candidates if none of them is selected according
# to cls_weights. This could happen as boxes within certain IOU ranges
# are ignored. If triggered, the selected boxes will still be ignored
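With `unmatched_cls_target` gone from the constructor, one detector assigner now serves both the class and mask heads; a sketch combining the two call sites from this diff (names follow the surrounding code):

# Class targets: one-hot background label for unmatched proposals.
background_label = tf.constant([1] + num_classes * [0], dtype=tf.float32)
(cls_targets, cls_weights, _, _, _) = detector_target_assigner.assign(
    proposal_boxlist, groundtruth_boxlist,
    groundtruth_classes_with_background,
    unmatched_class_label=background_label)
# Mask targets: same assigner, all-zeros unmatched mask of image size.
unmatched_mask_label = tf.zeros(image_shape[1:3], dtype=tf.float32)
(mask_targets, _, _, mask_weights, _) = target_assigner.batch_assign_targets(
    detector_target_assigner, proposal_boxlists, groundtruth_boxlists,
    groundtruth_masks_list, unmatched_mask_label, groundtruth_weights_list)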
......@@ -1672,7 +1706,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
batch_reg_weights, _) = target_assigner.batch_assign_targets(
self._proposal_target_assigner, box_list.BoxList(anchors),
groundtruth_boxlists,
len(groundtruth_boxlists) * [None], groundtruth_weights_list)
len(groundtruth_boxlists) * [None],
gt_weights_batch=groundtruth_weights_list)
batch_cls_targets = tf.squeeze(batch_cls_targets, axis=2)
def _minibatch_subsample_fn(inputs):
......@@ -1792,9 +1827,13 @@ class FasterRCNNMetaArch(model.DetectionModel):
(batch_cls_targets_with_background, batch_cls_weights, batch_reg_targets,
batch_reg_weights, _) = target_assigner.batch_assign_targets(
self._detector_target_assigner, proposal_boxlists,
groundtruth_boxlists, groundtruth_classes_with_background_list,
groundtruth_weights_list)
self._detector_target_assigner,
proposal_boxlists,
groundtruth_boxlists,
groundtruth_classes_with_background_list,
unmatched_class_label=tf.constant(
[1] + self._num_classes * [0], dtype=tf.float32),
gt_weights_batch=groundtruth_weights_list)
class_predictions_with_background = tf.reshape(
class_predictions_with_background,
......@@ -1866,18 +1905,12 @@ class FasterRCNNMetaArch(model.DetectionModel):
raise ValueError('Groundtruth instance masks not provided. '
'Please configure input reader.')
# Create a new target assigner that matches the proposals to groundtruth
# and returns the mask targets.
# TODO(rathodv): Move `unmatched_cls_target` from constructor to assign
# function. This will enable reuse of a single target assigner for both
# class targets and mask targets.
mask_target_assigner = target_assigner.create_target_assigner(
'FasterRCNN', 'detection',
unmatched_cls_target=tf.zeros(image_shape[1:3], dtype=tf.float32))
(batch_mask_targets, _, _,
batch_mask_target_weights, _) = target_assigner.batch_assign_targets(
mask_target_assigner, proposal_boxlists, groundtruth_boxlists,
groundtruth_masks_list, groundtruth_weights_list)
unmatched_mask_label = tf.zeros(image_shape[1:3], dtype=tf.float32)
(batch_mask_targets, _, _, batch_mask_target_weights,
_) = target_assigner.batch_assign_targets(
self._detector_target_assigner, proposal_boxlists,
groundtruth_boxlists, groundtruth_masks_list, unmatched_mask_label,
groundtruth_weights_list)
# Pad the prediction_masks with zeros for the background class, to be
# consistent with class predictions.
......
......@@ -21,7 +21,9 @@ from object_detection.anchor_generators import grid_anchor_generator
from object_detection.builders import box_predictor_builder
from object_detection.builders import hyperparams_builder
from object_detection.builders import post_processing_builder
from object_detection.core import balanced_positive_negative_sampler as sampler
from object_detection.core import losses
from object_detection.core import target_assigner
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.protos import box_predictor_pb2
from object_detection.protos import hyperparams_pb2
......@@ -153,7 +155,9 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
predict_masks=False,
pad_to_max_dimension=None,
masks_are_class_agnostic=False,
use_matmul_crop_and_resize=False):
use_matmul_crop_and_resize=False,
clip_anchors_to_image=False,
use_matmul_gather_in_matcher=False):
def image_resizer_fn(image, masks=None):
"""Fake image resizer function."""
......@@ -186,6 +190,10 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
first_stage_anchor_scales,
first_stage_anchor_aspect_ratios,
anchor_stride=first_stage_anchor_strides)
first_stage_target_assigner = target_assigner.create_target_assigner(
'FasterRCNN',
'proposal',
use_matmul_gather=use_matmul_gather_in_matcher)
fake_feature_extractor = FakeFasterRCNNFeatureExtractor()
......@@ -211,7 +219,8 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
first_stage_atrous_rate = 1
first_stage_box_predictor_depth = 512
first_stage_minibatch_size = 3
first_stage_positive_balance_fraction = .5
first_stage_sampler = sampler.BalancedPositiveNegativeSampler(
positive_fraction=0.5, is_static=False)
first_stage_nms_score_threshold = -1.0
first_stage_nms_iou_threshold = 1.0
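For context, a hedged sketch of how the sampler constructed above is consumed (assuming the `subsample` interface of the core `BalancedPositiveNegativeSampler`, which takes a validity indicator, a batch size, and positive labels):

rpn_sampler = sampler.BalancedPositiveNegativeSampler(
    positive_fraction=0.5, is_static=False)
# indicator and labels are boolean [num_anchors] tensors; the result is a
# boolean mask selecting a minibatch with ~50% positives where possible.
sampled_minibatch = rpn_sampler.subsample(
    indicator, first_stage_minibatch_size, labels)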
......@@ -230,9 +239,14 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
"""
post_processing_config = post_processing_pb2.PostProcessing()
text_format.Merge(post_processing_text_proto, post_processing_config)
second_stage_target_assigner = target_assigner.create_target_assigner(
'FasterRCNN', 'detection',
use_matmul_gather=use_matmul_gather_in_matcher)
second_stage_non_max_suppression_fn, _ = post_processing_builder.build(
post_processing_config)
second_stage_balance_fraction = 1.0
second_stage_sampler = sampler.BalancedPositiveNegativeSampler(
positive_fraction=1.0, is_static=False)
second_stage_score_conversion_fn = tf.identity
second_stage_localization_loss_weight = 1.0
......@@ -261,6 +275,7 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
'feature_extractor': fake_feature_extractor,
'number_of_stages': number_of_stages,
'first_stage_anchor_generator': first_stage_anchor_generator,
'first_stage_target_assigner': first_stage_target_assigner,
'first_stage_atrous_rate': first_stage_atrous_rate,
'first_stage_box_predictor_arg_scope_fn':
first_stage_box_predictor_arg_scope_fn,
......@@ -268,8 +283,7 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
first_stage_box_predictor_kernel_size,
'first_stage_box_predictor_depth': first_stage_box_predictor_depth,
'first_stage_minibatch_size': first_stage_minibatch_size,
'first_stage_positive_balance_fraction':
first_stage_positive_balance_fraction,
'first_stage_sampler': first_stage_sampler,
'first_stage_nms_score_threshold': first_stage_nms_score_threshold,
'first_stage_nms_iou_threshold': first_stage_nms_iou_threshold,
'first_stage_max_proposals': first_stage_max_proposals,
......@@ -277,8 +291,9 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
first_stage_localization_loss_weight,
'first_stage_objectness_loss_weight':
first_stage_objectness_loss_weight,
'second_stage_target_assigner': second_stage_target_assigner,
'second_stage_batch_size': second_stage_batch_size,
'second_stage_balance_fraction': second_stage_balance_fraction,
'second_stage_sampler': second_stage_sampler,
'second_stage_non_max_suppression_fn':
second_stage_non_max_suppression_fn,
'second_stage_score_conversion_fn': second_stage_score_conversion_fn,
......@@ -289,7 +304,8 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
'second_stage_classification_loss':
second_stage_classification_loss,
'hard_example_miner': hard_example_miner,
'use_matmul_crop_and_resize': use_matmul_crop_and_resize
'use_matmul_crop_and_resize': use_matmul_crop_and_resize,
'clip_anchors_to_image': clip_anchors_to_image
}
return self._get_model(
......@@ -469,7 +485,8 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
self.assertAllEqual(tensor_dict_out[key].shape, expected_shapes[key])
def _test_predict_gives_correct_shapes_in_train_mode_both_stages(
self, use_matmul_crop_and_resize=False):
self, use_matmul_crop_and_resize=False,
clip_anchors_to_image=False):
test_graph = tf.Graph()
with test_graph.as_default():
model = self._build_model(
......@@ -477,7 +494,8 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
number_of_stages=2,
second_stage_batch_size=7,
predict_masks=False,
use_matmul_crop_and_resize=use_matmul_crop_and_resize)
use_matmul_crop_and_resize=use_matmul_crop_and_resize,
clip_anchors_to_image=clip_anchors_to_image)
batch_size = 2
image_size = 10
......@@ -547,6 +565,10 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
self._test_predict_gives_correct_shapes_in_train_mode_both_stages(
use_matmul_crop_and_resize=True)
def test_predict_gives_correct_shapes_in_train_mode_clip_anchors(self):
self._test_predict_gives_correct_shapes_in_train_mode_both_stages(
clip_anchors_to_image=True)
def _test_postprocess_first_stage_only_inference_mode(
self, pad_to_max_dimension=None):
model = self._build_model(
......
......@@ -55,20 +55,22 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
feature_extractor,
number_of_stages,
first_stage_anchor_generator,
first_stage_target_assigner,
first_stage_atrous_rate,
first_stage_box_predictor_arg_scope_fn,
first_stage_box_predictor_kernel_size,
first_stage_box_predictor_depth,
first_stage_minibatch_size,
first_stage_positive_balance_fraction,
first_stage_sampler,
first_stage_nms_score_threshold,
first_stage_nms_iou_threshold,
first_stage_max_proposals,
first_stage_localization_loss_weight,
first_stage_objectness_loss_weight,
second_stage_target_assigner,
second_stage_rfcn_box_predictor,
second_stage_batch_size,
second_stage_balance_fraction,
second_stage_sampler,
second_stage_non_max_suppression_fn,
second_stage_score_conversion_fn,
second_stage_localization_loss_weight,
......@@ -77,7 +79,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
hard_example_miner,
parallel_iterations=16,
add_summaries=True,
use_matmul_crop_and_resize=False):
use_matmul_crop_and_resize=False,
clip_anchors_to_image=False):
"""RFCNMetaArch Constructor.
Args:
......@@ -97,6 +100,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
first_stage_anchor_generator: An anchor_generator.AnchorGenerator object
(note that currently we only support
grid_anchor_generator.GridAnchorGenerator objects)
first_stage_target_assigner: Target assigner to use for first stage of
R-FCN (RPN).
first_stage_atrous_rate: A single integer indicating the atrous rate for
the single convolution op which is applied to the `rpn_features_to_crop`
tensor to obtain a tensor to be used for box prediction. Some feature
......@@ -116,8 +121,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
"batch size" refers to the number of anchors selected as contributing
to the loss function for any given image within the image batch and is
only called "batch_size" due to terminology from the Faster R-CNN paper.
first_stage_positive_balance_fraction: Fraction of positive examples
per image for the RPN. The recommended value for Faster RCNN is 0.5.
first_stage_sampler: The sampler used to select the subset of boxes from
which the first stage (RPN) loss is computed.
first_stage_nms_score_threshold: Score threshold for non max suppression
for the Region Proposal Network (RPN). This value is expected to be in
[0, 1] as it is applied directly after a softmax transformation. The
......@@ -130,6 +135,10 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
Region Proposal Network (RPN).
first_stage_localization_loss_weight: A float
first_stage_objectness_loss_weight: A float
second_stage_target_assigner: Target assigner to use for second stage of
R-FCN. If the model is configured with multiple prediction heads, this
target assigner is used to generate targets for all heads (with the
correct `unmatched_class_label`).
second_stage_rfcn_box_predictor: RFCN box predictor to use for
second stage.
second_stage_batch_size: The batch size used for computing the
......@@ -137,9 +146,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
"batch size" refers to the number of proposals selected as contributing
to the loss function for any given image within the image batch and is
only called "batch_size" due to terminology from the Faster R-CNN paper.
second_stage_balance_fraction: Fraction of positive examples to use
per image for the box classifier. The recommended value for Faster RCNN
is 0.25.
second_stage_sampler: The sampler for the boxes used by the second stage
box classifier.
second_stage_non_max_suppression_fn: batch_multiclass_non_max_suppression
callable that takes `boxes`, `scores`, optional `clip_window` and
optional (kwarg) `mask` inputs (with all other inputs already set)
......@@ -163,6 +171,9 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
use_matmul_crop_and_resize: Force the use of matrix multiplication based
crop and resize instead of standard tf.image.crop_and_resize while
computing second stage input feature maps.
clip_anchors_to_image: If True, the generated anchors are clipped to the
image window without filtering out non-overlapping anchors, which
yields a static number of anchors. This argument is unused.
Raises:
ValueError: If `second_stage_batch_size` > `first_stage_max_proposals`
......@@ -178,12 +189,13 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
feature_extractor,
number_of_stages,
first_stage_anchor_generator,
first_stage_target_assigner,
first_stage_atrous_rate,
first_stage_box_predictor_arg_scope_fn,
first_stage_box_predictor_kernel_size,
first_stage_box_predictor_depth,
first_stage_minibatch_size,
first_stage_positive_balance_fraction,
first_stage_sampler,
first_stage_nms_score_threshold,
first_stage_nms_iou_threshold,
first_stage_max_proposals,
......@@ -192,9 +204,10 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
None, # initial_crop_size is not used in R-FCN
None, # maxpool_kernel_size is not used in R-FCN
None, # maxpool_stride is not used in R-FCN
second_stage_target_assigner,
None, # fully_connected_box_predictor is not used in R-FCN.
second_stage_batch_size,
second_stage_balance_fraction,
second_stage_sampler,
second_stage_non_max_suppression_fn,
second_stage_score_conversion_fn,
second_stage_localization_loss_weight,
......@@ -274,11 +287,16 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
rpn_features,
scope=self.second_stage_feature_extractor_scope))
box_predictions = self._rfcn_box_predictor.predict(
[box_classifier_features],
num_predictions_per_location=[1],
scope=self.second_stage_box_predictor_scope,
proposal_boxes=proposal_boxes_normalized)
if self._rfcn_box_predictor.is_keras_model:
box_predictions = self._rfcn_box_predictor(
[box_classifier_features],
proposal_boxes=proposal_boxes_normalized)
else:
box_predictions = self._rfcn_box_predictor.predict(
[box_classifier_features],
num_predictions_per_location=[1],
scope=self.second_stage_box_predictor_scope,
proposal_boxes=proposal_boxes_normalized)
refined_box_encodings = tf.squeeze(
tf.concat(box_predictions[box_predictor.BOX_ENCODINGS], axis=1), axis=1)
class_predictions_with_background = tf.squeeze(
......
......@@ -35,7 +35,7 @@ slim = tf.contrib.slim
class SSDFeatureExtractor(object):
"""SSD Feature Extractor definition."""
"""SSD Slim Feature Extractor definition."""
def __init__(self,
is_training,
......@@ -77,6 +77,10 @@ class SSDFeatureExtractor(object):
self._override_base_feature_extractor_hyperparams = (
override_base_feature_extractor_hyperparams)
@property
def is_keras_model(self):
return False
@abstractmethod
def preprocess(self, resized_inputs):
"""Preprocesses images for feature extraction (minus image resizing).
......@@ -113,6 +117,105 @@ class SSDFeatureExtractor(object):
raise NotImplementedError
class SSDKerasFeatureExtractor(tf.keras.Model):
"""SSD Feature Extractor definition."""
def __init__(self,
is_training,
depth_multiplier,
min_depth,
pad_to_multiple,
conv_hyperparams_config,
freeze_batchnorm,
inplace_batchnorm_update,
use_explicit_padding=False,
use_depthwise=False,
override_base_feature_extractor_hyperparams=False):
"""Constructor.
Args:
is_training: whether the network is in training mode.
depth_multiplier: float depth multiplier for feature extractor.
min_depth: minimum feature extractor depth.
pad_to_multiple: the nearest multiple to zero pad the input height and
width dimensions to.
conv_hyperparams_config: A hyperparams.proto object containing
convolution hyperparameters for the layers added on top of the
base feature extractor.
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
inplace_batchnorm_update: Whether to update batch norm moving average
values inplace. When this is false, the train op must add a control
dependency on the tf.GraphKeys.UPDATE_OPS collection in order to update
batch norm statistics.
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_config`.
"""
super(SSDKerasFeatureExtractor, self).__init__()
self._is_training = is_training
self._depth_multiplier = depth_multiplier
self._min_depth = min_depth
self._pad_to_multiple = pad_to_multiple
self._conv_hyperparams_config = conv_hyperparams_config
self._freeze_batchnorm = freeze_batchnorm
self._inplace_batchnorm_update = inplace_batchnorm_update
self._use_explicit_padding = use_explicit_padding
self._use_depthwise = use_depthwise
self._override_base_feature_extractor_hyperparams = (
override_base_feature_extractor_hyperparams)
@property
def is_keras_model(self):
return True
@abstractmethod
def preprocess(self, resized_inputs):
"""Preprocesses images for feature extraction (minus image resizing).
Args:
resized_inputs: a [batch, height, width, channels] float tensor
representing a batch of images.
Returns:
preprocessed_inputs: a [batch, height, width, channels] float tensor
representing a batch of images.
true_image_shapes: int32 tensor of shape [batch, 3] where each row is
of the form [height, width, channels] indicating the shapes
of true images in the resized images, as resized images can be padded
with zeros.
"""
raise NotImplementedError
@abstractmethod
def _extract_features(self, preprocessed_inputs):
"""Extracts features from preprocessed inputs.
This function is responsible for extracting feature maps from preprocessed
images.
Args:
preprocessed_inputs: a [batch, height, width, channels] float tensor
representing a batch of images.
Returns:
feature_maps: a list of tensors where the ith tensor has shape
[batch, height_i, width_i, depth_i]
"""
raise NotImplementedError
# This overrides the keras.Model `call` method with the _extract_features
# method.
def call(self, inputs, **kwargs):
return self._extract_features(inputs)
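A hypothetical minimal subclass illustrating the contract (class name, layer choice, and preprocessing are illustrative only):

class ToyKerasFeatureExtractor(SSDKerasFeatureExtractor):
  """Toy extractor returning a single convolutional feature map."""

  def __init__(self, **kwargs):
    super(ToyKerasFeatureExtractor, self).__init__(**kwargs)
    self._conv = tf.keras.layers.Conv2D(32, kernel_size=3, padding='same')

  def preprocess(self, resized_inputs):
    return (2.0 / 255.0) * resized_inputs - 1.0

  def _extract_features(self, preprocessed_inputs):
    return [self._conv(preprocessed_inputs)]

extractor = ToyKerasFeatureExtractor(
    is_training=True, depth_multiplier=1.0, min_depth=16, pad_to_multiple=1,
    conv_hyperparams_config=None, freeze_batchnorm=False,
    inplace_batchnorm_update=False)
feature_maps = extractor(tf.zeros([4, 64, 64, 3]))  # call() routes to _extract_features()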
class SSDMetaArch(model.DetectionModel):
"""SSD Meta-architecture definition."""
......@@ -211,10 +314,6 @@ class SSDMetaArch(model.DetectionModel):
self._freeze_batchnorm = freeze_batchnorm
self._inplace_batchnorm_update = inplace_batchnorm_update
# Needed for fine-tuning from classification checkpoints whose
# variables do not have the feature extractor scope.
self._extract_features_scope = 'FeatureExtractor'
self._anchor_generator = anchor_generator
self._box_predictor = box_predictor
......@@ -224,21 +323,30 @@ class SSDMetaArch(model.DetectionModel):
self._region_similarity_calculator = region_similarity_calculator
self._add_background_class = add_background_class
# Needed for fine-tuning from classification checkpoints whose
# variables do not have the feature extractor scope.
if self._feature_extractor.is_keras_model:
# Keras feature extractors will have a name they implicitly use to scope.
# So, all contained variables are prefixed by this name.
# To load from classification checkpoints, need to filter out this name.
self._extract_features_scope = feature_extractor.name
else:
# Slim feature extractors get an explicit naming scope
self._extract_features_scope = 'FeatureExtractor'
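A hedged illustration of the naming behavior the Keras branch relies on (assuming TF 1.x semantics, where Keras layers pick up the active name scope when their variables are built, as the fake extractor in the tests below also assumes):

with tf.name_scope('MyFeatureExtractor'):
  conv = tf.keras.layers.Conv2D(8, kernel_size=1, name='layer1')
  conv(tf.zeros([1, 4, 4, 3]))  # builds variables inside the name scope
# conv.variables[0].name is 'MyFeatureExtractor/layer1/kernel:0', so the
# model name is the prefix to match when restoring classification weights.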
# TODO(jonathanhuang): handle agnostic mode
# weights
unmatched_cls_target = None
unmatched_cls_target = tf.constant([1] + self.num_classes * [0],
tf.float32)
self._unmatched_class_label = tf.constant([1] + self.num_classes * [0],
tf.float32)
if encode_background_as_zeros:
unmatched_cls_target = tf.constant((self.num_classes + 1) * [0],
tf.float32)
self._unmatched_class_label = tf.constant((self.num_classes + 1) * [0],
tf.float32)
self._target_assigner = target_assigner.TargetAssigner(
self._region_similarity_calculator,
self._matcher,
self._box_coder,
negative_class_weight=negative_class_weight,
unmatched_cls_target=unmatched_cls_target)
negative_class_weight=negative_class_weight)
self._classification_loss = classification_loss
self._localization_loss = localization_loss
......@@ -383,41 +491,53 @@ class SSDMetaArch(model.DetectionModel):
"""
batchnorm_updates_collections = (None if self._inplace_batchnorm_update
else tf.GraphKeys.UPDATE_OPS)
with slim.arg_scope([slim.batch_norm],
is_training=(self._is_training and
not self._freeze_batchnorm),
updates_collections=batchnorm_updates_collections):
with tf.variable_scope(None, self._extract_features_scope,
[preprocessed_inputs]):
feature_maps = self._feature_extractor.extract_features(
preprocessed_inputs)
feature_map_spatial_dims = self._get_feature_map_spatial_dims(
feature_maps)
image_shape = shape_utils.combined_static_and_dynamic_shape(
preprocessed_inputs)
self._anchors = box_list_ops.concatenate(
self._anchor_generator.generate(
feature_map_spatial_dims,
im_height=image_shape[1],
im_width=image_shape[2]))
prediction_dict = self._box_predictor.predict(
feature_maps, self._anchor_generator.num_anchors_per_location())
box_encodings = tf.concat(prediction_dict['box_encodings'], axis=1)
if box_encodings.shape.ndims == 4 and box_encodings.shape[2] == 1:
box_encodings = tf.squeeze(box_encodings, axis=2)
class_predictions_with_background = tf.concat(
prediction_dict['class_predictions_with_background'], axis=1)
predictions_dict = {
'preprocessed_inputs': preprocessed_inputs,
'box_encodings': box_encodings,
'class_predictions_with_background':
class_predictions_with_background,
'feature_maps': feature_maps,
'anchors': self._anchors.get()
}
self._batched_prediction_tensor_names = [x for x in predictions_dict
if x != 'anchors']
return predictions_dict
if self._feature_extractor.is_keras_model:
feature_maps = self._feature_extractor(preprocessed_inputs)
else:
with slim.arg_scope([slim.batch_norm],
is_training=(self._is_training and
not self._freeze_batchnorm),
updates_collections=batchnorm_updates_collections):
with tf.variable_scope(None, self._extract_features_scope,
[preprocessed_inputs]):
feature_maps = self._feature_extractor.extract_features(
preprocessed_inputs)
feature_map_spatial_dims = self._get_feature_map_spatial_dims(
feature_maps)
image_shape = shape_utils.combined_static_and_dynamic_shape(
preprocessed_inputs)
self._anchors = box_list_ops.concatenate(
self._anchor_generator.generate(
feature_map_spatial_dims,
im_height=image_shape[1],
im_width=image_shape[2]))
if self._box_predictor.is_keras_model:
prediction_dict = self._box_predictor(feature_maps)
else:
with slim.arg_scope([slim.batch_norm],
is_training=(self._is_training and
not self._freeze_batchnorm),
updates_collections=batchnorm_updates_collections):
prediction_dict = self._box_predictor.predict(
feature_maps, self._anchor_generator.num_anchors_per_location())
box_encodings = tf.concat(prediction_dict['box_encodings'], axis=1)
if box_encodings.shape.ndims == 4 and box_encodings.shape[2] == 1:
box_encodings = tf.squeeze(box_encodings, axis=2)
class_predictions_with_background = tf.concat(
prediction_dict['class_predictions_with_background'], axis=1)
predictions_dict = {
'preprocessed_inputs': preprocessed_inputs,
'box_encodings': box_encodings,
'class_predictions_with_background':
class_predictions_with_background,
'feature_maps': feature_maps,
'anchors': self._anchors.get()
}
self._batched_prediction_tensor_names = [x for x in predictions_dict
if x != 'anchors']
return predictions_dict
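As the constructor docstring notes, when `inplace_batchnorm_update` is False the moving-average updates land in `tf.GraphKeys.UPDATE_OPS` and the train op must depend on them; a minimal sketch of that contract (`optimizer` and `total_loss` are assumed to exist):

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
  # Batch norm statistics are refreshed before each training step.
  train_op = optimizer.minimize(total_loss)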
def _get_feature_map_spatial_dims(self, feature_maps):
"""Return list of spatial dimensions for each feature map in a list.
......@@ -710,7 +830,8 @@ class SSDMetaArch(model.DetectionModel):
boxlist.add_field(fields.BoxListFields.keypoints, keypoints)
return target_assigner.batch_assign_targets(
self._target_assigner, self.anchors, groundtruth_boxlists,
groundtruth_classes_with_background_list, groundtruth_weights_list)
groundtruth_classes_with_background_list, self._unmatched_class_label,
groundtruth_weights_list)
def _summarize_target_assignment(self, groundtruth_boxes_list, match_list):
"""Creates tensorflow summaries for the input boxes and anchors.
......@@ -872,3 +993,4 @@ class SSDMetaArch(model.DetectionModel):
variables_to_restore[var_name] = variable
return variables_to_restore
......@@ -15,6 +15,8 @@
"""Tests for object_detection.meta_architectures.ssd_meta_arch."""
import functools
from absl.testing import parameterized
import numpy as np
import tensorflow as tf
......@@ -29,6 +31,7 @@ from object_detection.utils import test_case
from object_detection.utils import test_utils
slim = tf.contrib.slim
keras = tf.keras.layers
class FakeSSDFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
......@@ -51,6 +54,30 @@ class FakeSSDFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
return [features]
class FakeSSDKerasFeatureExtractor(ssd_meta_arch.SSDKerasFeatureExtractor):
def __init__(self):
with tf.name_scope('mock_model'):
super(FakeSSDKerasFeatureExtractor, self).__init__(
is_training=True,
depth_multiplier=0,
min_depth=0,
pad_to_multiple=1,
conv_hyperparams_config=None,
freeze_batchnorm=False,
inplace_batchnorm_update=False,
)
self._conv = keras.Conv2D(filters=32, kernel_size=1, name='layer1')
def preprocess(self, resized_inputs):
return tf.identity(resized_inputs)
def _extract_features(self, preprocessed_inputs, **kwargs):
with tf.name_scope('mock_model'):
return [self._conv(preprocessed_inputs)]
class MockAnchorGenerator2x2(anchor_generator.AnchorGenerator):
"""Sets up a simple 2x2 anchor grid on the unit square."""
......@@ -79,20 +106,32 @@ def _get_value_for_matching_key(dictionary, suffix):
raise ValueError('key not found {}'.format(suffix))
class SsdMetaArchTest(test_case.TestCase):
@parameterized.parameters(
{'use_keras': False},
{'use_keras': True},
)
class SsdMetaArchTest(test_case.TestCase, parameterized.TestCase):
def _create_model(self,
apply_hard_mining=True,
normalize_loc_loss_by_codesize=False,
add_background_class=True,
random_example_sampling=False):
random_example_sampling=False,
use_keras=False):
is_training = False
num_classes = 1
mock_anchor_generator = MockAnchorGenerator2x2()
mock_box_predictor = test_utils.MockBoxPredictor(
is_training, num_classes)
if use_keras:
mock_box_predictor = test_utils.MockKerasBoxPredictor(
is_training, num_classes)
else:
mock_box_predictor = test_utils.MockBoxPredictor(
is_training, num_classes)
mock_box_coder = test_utils.MockBoxCoder()
fake_feature_extractor = FakeSSDFeatureExtractor()
if use_keras:
fake_feature_extractor = FakeSSDKerasFeatureExtractor()
else:
fake_feature_extractor = FakeSSDFeatureExtractor()
mock_matcher = test_utils.MockMatcher()
region_similarity_calculator = sim_calc.IouSimilarity()
encode_background_as_zeros = False
......@@ -152,25 +191,26 @@ class SsdMetaArchTest(test_case.TestCase):
random_example_sampler=random_example_sampler)
return model, num_classes, mock_anchor_generator.num_anchors(), code_size
def test_preprocess_preserves_shapes_with_dynamic_input_image(self):
def test_preprocess_preserves_shapes_with_dynamic_input_image(
self, use_keras):
image_shapes = [(3, None, None, 3),
(None, 10, 10, 3),
(None, None, None, 3)]
model, _, _, _ = self._create_model()
model, _, _, _ = self._create_model(use_keras=use_keras)
for image_shape in image_shapes:
image_placeholder = tf.placeholder(tf.float32, shape=image_shape)
preprocessed_inputs, _ = model.preprocess(image_placeholder)
self.assertAllEqual(preprocessed_inputs.shape.as_list(), image_shape)
def test_preprocess_preserves_shape_with_static_input_image(self):
def test_preprocess_preserves_shape_with_static_input_image(self, use_keras):
def graph_fn(input_image):
model, _, _, _ = self._create_model()
model, _, _, _ = self._create_model(use_keras=use_keras)
return model.preprocess(input_image)
input_image = np.random.rand(2, 3, 3, 3).astype(np.float32)
preprocessed_inputs, _ = self.execute(graph_fn, [input_image])
self.assertAllEqual(preprocessed_inputs.shape, [2, 3, 3, 3])
def test_predict_result_shapes_on_image_with_dynamic_shape(self):
def test_predict_result_shapes_on_image_with_dynamic_shape(self, use_keras):
batch_size = 3
image_size = 2
input_shapes = [(None, image_size, image_size, 3),
......@@ -180,16 +220,17 @@ class SsdMetaArchTest(test_case.TestCase):
for input_shape in input_shapes:
tf_graph = tf.Graph()
with tf_graph.as_default():
model, num_classes, num_anchors, code_size = self._create_model()
model, num_classes, num_anchors, code_size = self._create_model(
use_keras=use_keras)
preprocessed_input_placeholder = tf.placeholder(tf.float32,
shape=input_shape)
prediction_dict = model.predict(
preprocessed_input_placeholder, true_image_shapes=None)
self.assertTrue('box_encodings' in prediction_dict)
self.assertTrue('class_predictions_with_background' in prediction_dict)
self.assertTrue('feature_maps' in prediction_dict)
self.assertTrue('anchors' in prediction_dict)
self.assertIn('box_encodings', prediction_dict)
self.assertIn('class_predictions_with_background', prediction_dict)
self.assertIn('feature_maps', prediction_dict)
self.assertIn('anchors', prediction_dict)
init_op = tf.global_variables_initializer()
with self.test_session(graph=tf_graph) as sess:
......@@ -210,10 +251,11 @@ class SsdMetaArchTest(test_case.TestCase):
prediction_out['class_predictions_with_background'].shape,
expected_class_predictions_with_background_shape_out)
def test_predict_result_shapes_on_image_with_static_shape(self):
def test_predict_result_shapes_on_image_with_static_shape(self, use_keras):
with tf.Graph().as_default():
_, num_classes, num_anchors, code_size = self._create_model()
_, num_classes, num_anchors, code_size = self._create_model(
use_keras=use_keras)
def graph_fn(input_image):
model, _, _, _ = self._create_model()
......@@ -235,7 +277,7 @@ class SsdMetaArchTest(test_case.TestCase):
self.assertAllEqual(class_predictions.shape,
expected_class_predictions_shape)
def test_postprocess_results_are_correct(self):
def test_postprocess_results_are_correct(self, use_keras):
batch_size = 2
image_size = 2
input_shapes = [(batch_size, image_size, image_size, 3),
......@@ -266,17 +308,17 @@ class SsdMetaArchTest(test_case.TestCase):
for input_shape in input_shapes:
tf_graph = tf.Graph()
with tf_graph.as_default():
model, _, _, _ = self._create_model()
model, _, _, _ = self._create_model(use_keras=use_keras)
input_placeholder = tf.placeholder(tf.float32, shape=input_shape)
preprocessed_inputs, true_image_shapes = model.preprocess(
input_placeholder)
prediction_dict = model.predict(preprocessed_inputs,
true_image_shapes)
detections = model.postprocess(prediction_dict, true_image_shapes)
self.assertTrue('detection_boxes' in detections)
self.assertTrue('detection_scores' in detections)
self.assertTrue('detection_classes' in detections)
self.assertTrue('num_detections' in detections)
self.assertIn('detection_boxes', detections)
self.assertIn('detection_scores', detections)
self.assertIn('detection_classes', detections)
self.assertIn('num_detections', detections)
init_op = tf.global_variables_initializer()
with self.test_session(graph=tf_graph) as sess:
sess.run(init_op)
......@@ -295,10 +337,10 @@ class SsdMetaArchTest(test_case.TestCase):
self.assertAllClose(detections_out['num_detections'],
expected_num_detections)
def test_loss_results_are_correct(self):
def test_loss_results_are_correct(self, use_keras):
with tf.Graph().as_default():
_, num_classes, num_anchors, _ = self._create_model()
_, num_classes, num_anchors, _ = self._create_model(use_keras=use_keras)
def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
groundtruth_classes1, groundtruth_classes2):
groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
......@@ -331,16 +373,18 @@ class SsdMetaArchTest(test_case.TestCase):
self.assertAllClose(localization_loss, expected_localization_loss)
self.assertAllClose(classification_loss, expected_classification_loss)
def test_loss_results_are_correct_with_normalize_by_codesize_true(self):
def test_loss_results_are_correct_with_normalize_by_codesize_true(
self, use_keras):
with tf.Graph().as_default():
_, _, _, _ = self._create_model()
_, _, _, _ = self._create_model(use_keras=use_keras)
def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
groundtruth_classes1, groundtruth_classes2):
groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
groundtruth_classes_list = [groundtruth_classes1, groundtruth_classes2]
model, _, _, _ = self._create_model(apply_hard_mining=False,
normalize_loc_loss_by_codesize=True)
normalize_loc_loss_by_codesize=True,
use_keras=use_keras)
model.provide_groundtruth(groundtruth_boxes_list,
groundtruth_classes_list)
prediction_dict = model.predict(preprocessed_tensor,
......@@ -362,10 +406,10 @@ class SsdMetaArchTest(test_case.TestCase):
groundtruth_classes2])
self.assertAllClose(localization_loss, expected_localization_loss)
def test_loss_results_are_correct_with_hard_example_mining(self):
def test_loss_results_are_correct_with_hard_example_mining(self, use_keras):
with tf.Graph().as_default():
_, num_classes, num_anchors, _ = self._create_model()
_, num_classes, num_anchors, _ = self._create_model(use_keras=use_keras)
def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
groundtruth_classes1, groundtruth_classes2):
groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
......@@ -397,18 +441,20 @@ class SsdMetaArchTest(test_case.TestCase):
self.assertAllClose(localization_loss, expected_localization_loss)
self.assertAllClose(classification_loss, expected_classification_loss)
def test_loss_results_are_correct_without_add_background_class(self):
def test_loss_results_are_correct_without_add_background_class(
self, use_keras):
with tf.Graph().as_default():
_, num_classes, num_anchors, _ = self._create_model(
add_background_class=False)
add_background_class=False, use_keras=use_keras)
def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
groundtruth_classes1, groundtruth_classes2):
groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
groundtruth_classes_list = [groundtruth_classes1, groundtruth_classes2]
model, _, _, _ = self._create_model(
apply_hard_mining=False, add_background_class=False)
apply_hard_mining=False, add_background_class=False,
use_keras=use_keras)
model.provide_groundtruth(groundtruth_boxes_list,
groundtruth_classes_list)
prediction_dict = model.predict(
......@@ -434,8 +480,8 @@ class SsdMetaArchTest(test_case.TestCase):
self.assertAllClose(localization_loss, expected_localization_loss)
self.assertAllClose(classification_loss, expected_classification_loss)
def test_restore_map_for_detection_ckpt(self):
model, _, _, _ = self._create_model()
def test_restore_map_for_detection_ckpt(self, use_keras):
model, _, _, _ = self._create_model(use_keras=use_keras)
model.predict(tf.constant(np.array([[[[0, 0], [1, 1]], [[1, 0], [0, 1]]]],
dtype=np.float32)),
true_image_shapes=None)
......@@ -454,14 +500,22 @@ class SsdMetaArchTest(test_case.TestCase):
for var in sess.run(tf.report_uninitialized_variables()):
self.assertNotIn('FeatureExtractor', var)
def test_restore_map_for_classification_ckpt(self):
def test_restore_map_for_classification_ckpt(self, use_keras):
# Define mock tensorflow classification graph and save variables.
test_graph_classification = tf.Graph()
with test_graph_classification.as_default():
image = tf.placeholder(dtype=tf.float32, shape=[1, 20, 20, 3])
with tf.variable_scope('mock_model'):
net = slim.conv2d(image, num_outputs=32, kernel_size=1, scope='layer1')
slim.conv2d(net, num_outputs=3, kernel_size=1, scope='layer2')
if use_keras:
with tf.name_scope('mock_model'):
layer_one = keras.Conv2D(32, kernel_size=1, name='layer1')
net = layer_one(image)
layer_two = keras.Conv2D(3, kernel_size=1, name='layer2')
layer_two(net)
else:
with tf.variable_scope('mock_model'):
net = slim.conv2d(image, num_outputs=32, kernel_size=1,
scope='layer1')
slim.conv2d(net, num_outputs=3, kernel_size=1, scope='layer2')
init_op = tf.global_variables_initializer()
saver = tf.train.Saver()
......@@ -474,7 +528,7 @@ class SsdMetaArchTest(test_case.TestCase):
# classification checkpoint.
test_graph_detection = tf.Graph()
with test_graph_detection.as_default():
model, _, _, _ = self._create_model()
model, _, _, _ = self._create_model(use_keras=use_keras)
inputs_shape = [2, 2, 2, 3]
inputs = tf.to_float(tf.random_uniform(
inputs_shape, minval=0, maxval=255, dtype=tf.int32))
......@@ -491,10 +545,10 @@ class SsdMetaArchTest(test_case.TestCase):
for var in sess.run(tf.report_uninitialized_variables()):
self.assertNotIn('FeatureExtractor', var)
def test_load_all_det_checkpoint_vars(self):
def test_load_all_det_checkpoint_vars(self, use_keras):
test_graph_detection = tf.Graph()
with test_graph_detection.as_default():
model, _, _, _ = self._create_model()
model, _, _, _ = self._create_model(use_keras=use_keras)
inputs_shape = [2, 2, 2, 3]
inputs = tf.to_float(
tf.random_uniform(inputs_shape, minval=0, maxval=255, dtype=tf.int32))
......@@ -508,18 +562,22 @@ class SsdMetaArchTest(test_case.TestCase):
self.assertIsInstance(var_map, dict)
self.assertIn('another_variable', var_map)
def test_loss_results_are_correct_with_random_example_sampling(self):
def test_loss_results_are_correct_with_random_example_sampling(
self,
use_keras):
with tf.Graph().as_default():
_, num_classes, num_anchors, _ = self._create_model(
random_example_sampling=True)
random_example_sampling=True,
use_keras=use_keras)
print(num_classes, num_anchors)
def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
groundtruth_classes1, groundtruth_classes2):
groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
groundtruth_classes_list = [groundtruth_classes1, groundtruth_classes2]
model, _, _, _ = self._create_model(random_example_sampling=True)
model, _, _, _ = self._create_model(random_example_sampling=True,
use_keras=use_keras)
model.provide_groundtruth(groundtruth_boxes_list,
groundtruth_classes_list)
prediction_dict = model.predict(
......
......@@ -202,6 +202,10 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
params = params or {}
total_loss, train_op, detections, export_outputs = None, None, None, None
is_training = mode == tf.estimator.ModeKeys.TRAIN
# Make sure to set the Keras learning phase. True during training,
# False for inference.
tf.keras.backend.set_learning_phase(is_training)
detection_model = detection_model_fn(is_training=is_training,
add_summaries=(not use_tpu))
scaffold_fn = None
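A brief sketch of what the learning phase controls: Keras layers that are not given an explicit `training` argument (e.g. dropout, batch norm) consult this global flag.

tf.keras.backend.set_learning_phase(True)    # training behavior
dropout = tf.keras.layers.Dropout(rate=0.5)
train_out = dropout(tf.ones([1, 4]))   # inputs randomly zeroed and rescaled
tf.keras.backend.set_learning_phase(False)   # inference: dropout passes through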
......@@ -279,7 +283,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
if mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL):
losses_dict = detection_model.loss(
prediction_dict, features[fields.InputDataFields.true_image_shape])
losses = [loss_tensor for loss_tensor in losses_dict.itervalues()]
losses = [loss_tensor for loss_tensor in losses_dict.values()]
if train_config.add_regularization_loss:
regularization_losses = tf.get_collection(
tf.GraphKeys.REGULARIZATION_LOSSES)
......
......@@ -221,8 +221,8 @@ def fpn_top_down_feature_maps(image_features, depth, scope=None):
depth, [3, 3],
scope='smoothing_%d' % (level + 1)))
output_feature_map_keys.append('top_down_%s' % image_features[level][0])
return collections.OrderedDict(
reversed(zip(output_feature_map_keys, output_feature_maps_list)))
return collections.OrderedDict(reversed(
list(zip(output_feature_map_keys, output_feature_maps_list))))
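This is a Python 3 compatibility fix: `zip` now returns an iterator and `reversed` needs a sequence, so the pairs must be materialized first. A standalone illustration:

import collections

keys = ['top_down_block1', 'top_down_block2']
maps = ['map1', 'map2']
# Python 3: reversed(zip(...)) raises TypeError; list() materializes the pairs.
ordered = collections.OrderedDict(reversed(list(zip(keys, maps))))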
def pooling_pyramid_feature_maps(base_feature_map_depth, num_layers,
......@@ -288,4 +288,3 @@ def pooling_pyramid_feature_maps(base_feature_map_depth, num_layers,
feature_maps.append(feature_map)
return collections.OrderedDict(
[(x, y) for (x, y) in zip(feature_map_keys, feature_maps)])