Commit 05584085 authored by pkulzc, committed by Jonathan Huang

Merged commit includes the following changes: (#6315)

236813471  by lzc:

    Internal change.

--
236507310  by lzc:

    Fix preprocess.random_resize_method config type issue. The target height and width will be passed as "size" to tf.image.resize_images, which only accepts integers.
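
    For reference, a minimal sketch of the constraint (TF 1.x; the tensor
    values here are illustrative):

        import tensorflow as tf

        image = tf.zeros([480, 640, 3])
        # `size` must contain integers; passing floats raises an error.
        resized = tf.image.resize_images(image, size=[300, 300])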

--
236409989  by Zhichao Lu:

    Config export_to_tpu from function parameter instead of HParams for TPU inference.

--
236403186  by Zhichao Lu:

    Make graph file names optional arguments.

--
236237072  by Zhichao Lu:

    Minor bugfix for keyword args.

--
236209602  by Zhichao Lu:

    Add support for PartitionedVariable to get_variables_available_in_checkpoint.
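
    A minimal usage sketch (assuming the helper keeps its existing signature
    in object_detection/utils/variables_helper.py; the checkpoint path is
    hypothetical):

        import tensorflow as tf
        from object_detection.utils import variables_helper

        variables = tf.global_variables()
        # Keeps only variables present in the checkpoint (now including
        # PartitionedVariable slices), so the Saver won't fail on mismatches.
        available = variables_helper.get_variables_available_in_checkpoint(
            variables, '/path/to/model.ckpt', include_global_step=False)
        restorer = tf.train.Saver(available)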

--
235828658  by Zhichao Lu:

    Automatically stop evaluation jobs when training is finished.

--
235817964  by Zhichao Lu:

    Add an optional process_metrics_fn callback to eval_util; it gets called
    with evaluation results once each evaluation is complete.
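
    A sketch of such a callback (the exact signature eval_util passes is an
    assumption here; treat it as illustrative):

        import tensorflow as tf

        def log_metrics(metrics):
          # `metrics` maps metric names to values for one evaluation run,
          # e.g. {'DetectionBoxes_Precision/mAP': 0.31, ...}.
          for name, value in sorted(metrics.items()):
            tf.logging.info('Eval metric %s = %f', name, value)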

--
235788721  by lzc:

    Fix yml file tf runtime version.

--
235262897  by Zhichao Lu:

    Add keypoint support to the random_pad_image preprocessor method.

--
235257380  by Zhichao Lu:

    Support InputDataFields.groundtruth_confidences in retain_groundtruth(), retain_groundtruth_with_positive_classes(), filter_groundtruth_with_crowd_boxes(), filter_groundtruth_with_nan_box_coordinates(), filter_unrecognized_classes().
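
    A minimal sketch of the tensor-dict interface these utilities operate on
    (assuming the signatures in object_detection/utils/ops.py):

        import tensorflow as tf
        from object_detection.core import standard_fields as fields
        from object_detection.utils import ops

        tensor_dict = {
            fields.InputDataFields.groundtruth_boxes:
                tf.constant([[0., 0., .5, .5], [.1, .1, .9, .9]]),
            fields.InputDataFields.groundtruth_classes:
                tf.constant([1, 2]),
            # This field is now carried through the filtering functions too.
            fields.InputDataFields.groundtruth_confidences:
                tf.constant([0.9, 0.4]),
        }
        filtered = ops.retain_groundtruth(
            tensor_dict, valid_indices=tf.constant([0]))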

--
235109188  by Zhichao Lu:

    Fix bug in pad_input_data_to_static_shapes for num_additional_channels > 0; make color-specific data augmentation only touch RGB channels.

--
235045010  by Zhichao Lu:

    Don't slice class_predictions_with_background when add_background_class is false.

--
235026189  by lzc:

    Fix import in g3doc.

--
234863426  by Zhichao Lu:

    Added fixes in exporter to allow writing a checkpoint to a specified temporary directory.

--
234671886  by lzc:

    Internal Change.

--
234630803  by rathodv:

    Internal Change.

--
233985896  by Zhichao Lu:

    Add Neumann optimizer to object detection.

--
233560911  by Zhichao Lu:

    Add NAS-FPN object detection with Resnet and Mobilenet v2.

--
233513536  by Zhichao Lu:

    Export TPU-compatible object detection model.

--
233495772  by lzc:

    Internal change.

--
233453557  by Zhichao Lu:

    Create Keras-based SSD+MobilenetV1 for object detection.

--
233220074  by lzc:

    Update release notes date.

--
233165761  by Zhichao Lu:

    Support depth_multiplier and min_depth in _SSDResnetV1FpnFeatureExtractor.

--
233160046  by lzc:

    Internal change.

--
232926599  by Zhichao Lu:

    [tf.data] Switching tf.data functions to use `defun`, providing an escape hatch to continue using the legacy `Defun`.

    There are subtle differences between the implementations of `defun` and `Defun` (such as resource handling or control flow) and it is possible that input pipelines that use control flow or resources in their functions might be affected by this change. To migrate the majority of existing pipelines to the recommended way of creating functions in the TF 2.0 world, while allowing (a small number of) existing pipelines to continue relying on the deprecated behavior, this CL provides an escape hatch.

    If your input pipeline is affected by this CL, it should apply the escape hatch by replacing `foo.map(...)` with `foo.map_with_legacy_function(...)`.
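
    For example (a sketch; `map_with_legacy_function` is available on
    tf.data.Dataset in the TF releases this change targets):

        import tensorflow as tf

        dataset = tf.data.Dataset.range(10)

        # Default path: the map function is now traced with `defun`.
        doubled = dataset.map(lambda x: x * 2)

        # Escape hatch: keep the legacy `Defun` tracing semantics.
        doubled_legacy = dataset.map_with_legacy_function(lambda x: x * 2)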

--
232891621  by Zhichao Lu:

    Modify faster_rcnn meta architecture to normalize raw detections.

--
232875817  by Zhichao Lu:

    Make calibration a post-processing step.

    Specifically:
    - Move the calibration config from pipeline.proto --> post_processing.proto
    - Edit post_processing_builder.py to return a calibration function. If no calibration config is provided, it returns None.
    - Edit SSD and FasterRCNN meta architectures to optionally call the calibration function on detection scores after score conversion and before NMS.
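
    For illustration, a calibration block of the kind the updated tests use
    can now sit inside post_processing (the values here are made up):

        post_processing {
          score_converter: SIGMOID
          calibration_config {
            function_approximation {
              x_y_pairs {
                x_y_pair { x: 0.0 y: 0.0 }
                x_y_pair { x: 1.0 y: 0.8 }
              }
            }
          }
        }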

--
232704481  by Zhichao Lu:

    Edit calibration builder to build a function that will be used within a detection model's `postprocess` method, after score conversion and before non-maximum suppression.

    Specific Edits:
    - The returned function now accepts class_predictions_with_background as its argument instead of detection_scores and detection_classes.
    - Class-specific calibration was temporarily removed, as it requires more significant refactoring. Will be added later.

--
232615379  by Zhichao Lu:

    Internal change

--
232483345  by ronnyvotel:

    Making the use of bfloat16 restricted to TPUs.

--
232399572  by Zhichao Lu:

    Edit calibration builder and proto to support class-agnostic calibration.

    Specifically:
    - Edit calibration protos to include the path to the relevant label map if required for class-specific calibration. Previously, label maps were inferred from other parts of the pipeline proto; this change lets all information required by the builder stay within the calibration proto and keeps extraneous information from being passed with class-agnostic calibration.
    - Add class-agnostic protos to the calibration config.

    Note that the proto supports sigmoid and linear interpolation parameters, but the builder currently only supports linear interpolation.

--
231613048  by Zhichao Lu:

    Add calibration builder for applying calibration transformations to the output of object detection models.

    Specifically:
    - Add calibration proto to support sigmoid and isotonic regression (stepwise function) calibration.
    - Add a builder to support calibration from isotonic regression outputs.

--
231519786  by lzc:

    model_builder test refactor.
    - removed proto text boilerplate in each test case and let them call a create_default_proto function instead.
    - consolidated all separate ssd model creation tests into one.
    - consolidated all separate faster rcnn model creation tests into one.
    - used a parameterized test for testing mask rcnn models and use_matmul_crop_and_resize.
    - added tests for all failure cases.

--
231448169  by Zhichao Lu:

    Return static shape as a constant tensor.

--
231423126  by lzc:

    Add a release note for OID v4 models.

--
231401941  by Zhichao Lu:

    Adding correct labelmap for the models trained on Open Images V4 (*oid_v4
    config suffix).

--
231320357  by Zhichao Lu:

    Add scope to Nearest Neighbor Resize op so that it stays in the same name scope as the original resize ops.

--
231257699  by Zhichao Lu:

    Switch to using preserve_aspect_ratio in tf.image.resize_images rather than using a custom implementation.
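
    A sketch of the built-in behavior now relied on (TF 1.x):

        import tensorflow as tf

        image = tf.zeros([480, 640, 3])
        # Scales the image to fit within [320, 320] while keeping its
        # original aspect ratio, instead of a custom two-step resize.
        resized = tf.image.resize_images(
            image, size=[320, 320], preserve_aspect_ratio=True)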

--
231247368  by rathodv:

    Internal change.

--
231004874  by lzc:

    Update documentation to use tf 1.12 for the object detection API.

--
230999911  by rathodv:

    Use tf.batch_gather instead of ops.batch_gather
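
    tf.batch_gather gathers along the second axis independently for each
    batch element, e.g.:

        import tensorflow as tf

        params = tf.constant([[10., 20., 30.],
                              [40., 50., 60.]])
        indices = tf.constant([[2, 0],
                               [1, 1]])
        # Per-row gather: evaluates to [[30., 10.], [50., 50.]].
        result = tf.batch_gather(params, indices)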

--
230999720  by huizhongc:

    Fix weight equalization test in ops_test.

--
230984728  by rathodv:

    Internal update.

--
230929019  by lzc:

    Add an option to replace preprocess operation with placeholder for ssd feature extractor.

--
230845266  by lzc:

    Require tensorflow version 1.12 for the object detection API and rename keras_applications to keras_models.

--
230392064  by lzc:

    Add RetinaNet 101 checkpoint trained on OID v4 to detection model zoo.

--
230014128  by derekjchow:

    This file was relocated under tensorflow/lite/g3doc/convert.

--
229941449  by lzc:

    Update SSD mobilenet v2 quantized model download path.

--
229843662  by lzc:

    Add an option to use native resize tf op in fpn top-down feature map generation.

--
229636034  by rathodv:

    Add deprecation notice to a few old parameters in train.proto

--
228959078  by derekjchow:

    Remove duplicate elif case in _check_and_convert_legacy_input_config_key

--
228749719  by rathodv:

    Minor refactoring to make exporter's `build_detection_graph` method public.

--
228573828  by rathodv:

    Modify model.postprocess to return raw detections and raw scores.

    Modify post-process methods in core/model.py and the meta architectures to export raw detections (without any non-max suppression) and raw multiclass score logits for those detections.

--
228420670  by Zhichao Lu:

    Add shims for custom architectures for object detection models.

--
228241692  by Zhichao Lu:

    Fix the comment on "losses_mask" in "Loss" class.

--
228223810  by Zhichao Lu:

    Support other_heads' predictions in WeightSharedConvolutionalBoxPredictor. Also remove a few unused parameters and fix a couple of comments in convolutional_box_predictor.py.

--
228200588  by Zhichao Lu:

    Add Expected Calibration Error and an evaluator that calculates the metric for object detections.
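
    For reference, the standard binned form of Expected Calibration Error
    (the binning details in this implementation may differ):

        ECE = \sum_{b=1}^{B} \frac{n_b}{N} \left| \mathrm{acc}(b) - \mathrm{conf}(b) \right|

    where the N detections are partitioned into B confidence bins, n_b is
    the number of detections in bin b, acc(b) is the fraction of them that
    are correct, and conf(b) is their mean predicted confidence.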

--
228167740  by lzc:

    Add option to use bounded activations in FPN top-down feature map generation.

--
227767700  by rathodv:

    Internal.

--
226295236  by Zhichao Lu:

    Add Open Images V4 Resnet101-FPN training config to third_party.

--
226254842  by Zhichao Lu:

    Fix typo in documentation.

--
225833971  by Zhichao Lu:

    Add an option to have no resizer in the object detection model.

--
225824890  by lzc:

    Fix Python 3 compatibility for model_lib.py.

--
225760897  by menglong:

    normalizer should be at least 1.

--
225559842  by menglong:

    Add extra logic filtering unrecognized classes.

--
225379421  by lzc:

    Add faster_rcnn_inception_resnet_v2_atrous_oid_v4 config to third_party

--
225368337  by Zhichao Lu:

    Add extra logic filtering unrecognized classes.

--
225341095  by Zhichao Lu:

    Adding Open Images V4 models to the OD API model zoo and the
    corresponding configs.

--
225218450  by menglong:

    Add extra logic filtering unrecognized classes.

--
225057591  by Zhichao Lu:

    Internal change.

--
224895417  by rathodv:

    Internal change.

--
224209282  by Zhichao Lu:

    Add two data augmentations to object detection: (1) Self-concat (2) Absolute pads.

--
224073762  by Zhichao Lu:

    Do not create tf.constant until _generate() is actually called in the object detector.

--

PiperOrigin-RevId: 236813471
parent a5db4420
@@ -92,8 +92,8 @@ configured in the meta architecture:
 non-max suppression and normalize them. In this case, the `postprocess` method
 skips both `_postprocess_rpn` and `_postprocess_box_classifier`.
 """
-from abc import abstractmethod
-from functools import partial
+import abc
+import functools
 import tensorflow as tf
 from object_detection.anchor_generators import grid_anchor_generator
@@ -138,7 +138,7 @@ class FasterRCNNFeatureExtractor(object):
     self._reuse_weights = reuse_weights
     self._weight_decay = weight_decay
-  @abstractmethod
+  @abc.abstractmethod
   def preprocess(self, resized_inputs):
     """Feature-extractor specific preprocessing (minus image resizing)."""
     pass
@@ -162,7 +162,7 @@ class FasterRCNNFeatureExtractor(object):
     with tf.variable_scope(scope, values=[preprocessed_inputs]):
       return self._extract_proposal_features(preprocessed_inputs, scope)
-  @abstractmethod
+  @abc.abstractmethod
   def _extract_proposal_features(self, preprocessed_inputs, scope):
     """Extracts first stage RPN features, to be overridden."""
     pass
@@ -185,7 +185,7 @@ class FasterRCNNFeatureExtractor(object):
         scope, values=[proposal_feature_maps], reuse=tf.AUTO_REUSE):
       return self._extract_box_classifier_features(proposal_feature_maps, scope)
-  @abstractmethod
+  @abc.abstractmethod
   def _extract_box_classifier_features(self, proposal_feature_maps, scope):
     """Extracts second stage box classifier features, to be overridden."""
     pass
@@ -770,7 +770,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
         representing the features for each proposal.
     """
     image_shape_2d = self._image_batch_shape_2d(image_shape)
-    proposal_boxes_normalized, _, num_proposals = self._postprocess_rpn(
+    proposal_boxes_normalized, _, num_proposals, _, _ = self._postprocess_rpn(
        rpn_box_encodings, rpn_objectness_predictions_with_background,
        anchors, image_shape_2d, true_image_shapes)
@@ -1080,7 +1080,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
        anchors_boxlist, clip_window)
    def _batch_gather_kept_indices(predictions_tensor):
      return shape_utils.static_or_dynamic_map_fn(
-          partial(tf.gather, indices=keep_indices),
+          functools.partial(tf.gather, indices=keep_indices),
          elems=predictions_tensor,
          dtype=tf.float32,
          parallel_iterations=self._parallel_iterations,
@@ -1148,17 +1148,22 @@ class FasterRCNNMetaArch(model.DetectionModel):
    with tf.name_scope('FirstStagePostprocessor'):
      if self._number_of_stages == 1:
-        proposal_boxes, proposal_scores, num_proposals = self._postprocess_rpn(
+        (proposal_boxes, proposal_scores, num_proposals, raw_proposal_boxes,
+         raw_proposal_scores) = self._postprocess_rpn(
            prediction_dict['rpn_box_encodings'],
            prediction_dict['rpn_objectness_predictions_with_background'],
-            prediction_dict['anchors'],
-            true_image_shapes,
-            true_image_shapes)
+            prediction_dict['anchors'], true_image_shapes, true_image_shapes)
        return {
-            fields.DetectionResultFields.detection_boxes: proposal_boxes,
-            fields.DetectionResultFields.detection_scores: proposal_scores,
+            fields.DetectionResultFields.detection_boxes:
+                proposal_boxes,
+            fields.DetectionResultFields.detection_scores:
+                proposal_scores,
            fields.DetectionResultFields.num_detections:
                tf.to_float(num_proposals),
+            fields.DetectionResultFields.raw_detection_boxes:
+                raw_proposal_boxes,
+            fields.DetectionResultFields.raw_detection_scores:
+                raw_proposal_scores
        }
    # TODO(jrru): Remove mask_predictions from _post_process_box_classifier.
@@ -1266,6 +1271,11 @@ class FasterRCNNMetaArch(model.DetectionModel):
      num_proposals: A Tensor of type `int32`. A 1-D tensor of shape [batch]
        representing the number of proposals predicted for each image in
        the batch.
+      raw_detection_boxes: [batch, total_detections, 4] tensor with decoded
+        proposal boxes before Non-Max Suppression.
+      raw_detection_score: [batch, total_detections,
+        num_classes_with_background] tensor of class score logits for
+        raw proposal boxes.
    """
    rpn_box_encodings_batch = tf.expand_dims(rpn_box_encodings_batch, axis=2)
    rpn_encodings_shape = shape_utils.combined_static_and_dynamic_shape(
@@ -1274,13 +1284,13 @@ class FasterRCNNMetaArch(model.DetectionModel):
        tf.expand_dims(anchors, 0), [rpn_encodings_shape[0], 1, 1])
    proposal_boxes = self._batch_decode_boxes(rpn_box_encodings_batch,
                                              tiled_anchor_boxes)
-    proposal_boxes = tf.squeeze(proposal_boxes, axis=2)
+    raw_proposal_boxes = tf.squeeze(proposal_boxes, axis=2)
    rpn_objectness_softmax_without_background = tf.nn.softmax(
        rpn_objectness_predictions_with_background_batch)[:, :, 1]
    clip_window = self._compute_clip_window(image_shapes)
    (proposal_boxes, proposal_scores, _, _, _,
     num_proposals) = self._first_stage_nms_fn(
-         tf.expand_dims(proposal_boxes, axis=2),
+         tf.expand_dims(raw_proposal_boxes, axis=2),
         tf.expand_dims(rpn_objectness_softmax_without_background, axis=2),
         clip_window=clip_window)
    if self._is_training:
@@ -1304,7 +1314,13 @@ class FasterRCNNMetaArch(model.DetectionModel):
      return normalized_boxes_per_image
    normalized_proposal_boxes = shape_utils.static_or_dynamic_map_fn(
        normalize_boxes, elems=[proposal_boxes, image_shapes], dtype=tf.float32)
-    return normalized_proposal_boxes, proposal_scores, num_proposals
+    raw_normalized_proposal_boxes = shape_utils.static_or_dynamic_map_fn(
+        normalize_boxes,
+        elems=[raw_proposal_boxes, image_shapes],
+        dtype=tf.float32)
+    return (normalized_proposal_boxes, proposal_scores, num_proposals,
+            raw_normalized_proposal_boxes,
+            rpn_objectness_predictions_with_background_batch)
  def _sample_box_classifier_batch(
      self,
@@ -1576,6 +1592,11 @@ class FasterRCNNMetaArch(model.DetectionModel):
          (optional) [batch, max_detections, mask_height, mask_width]. Note
          that a pixel-wise sigmoid score converter is applied to the detection
          masks.
+        `raw_detection_boxes`: [batch, total_detections, 4] tensor with decoded
+          detection boxes before Non-Max Suppression.
+        `raw_detection_score`: [batch, total_detections,
+          num_classes_with_background] tensor of multi-class score logits for
+          raw detection boxes.
    """
    refined_box_encodings_batch = tf.reshape(
        refined_box_encodings,
@@ -1589,11 +1610,11 @@ class FasterRCNNMetaArch(model.DetectionModel):
    )
    refined_decoded_boxes_batch = self._batch_decode_boxes(
        refined_box_encodings_batch, proposal_boxes)
-    class_predictions_with_background_batch = (
+    class_predictions_with_background_batch_normalized = (
        self._second_stage_score_conversion_fn(
            class_predictions_with_background_batch))
    class_predictions_batch = tf.reshape(
-        tf.slice(class_predictions_with_background_batch,
+        tf.slice(class_predictions_with_background_batch_normalized,
                 [0, 0, 1], [-1, -1, -1]),
        [-1, self.max_num_proposals, self.num_classes])
    clip_window = self._compute_clip_window(image_shapes)
@@ -1614,11 +1635,51 @@ class FasterRCNNMetaArch(model.DetectionModel):
        change_coordinate_frame=True,
        num_valid_boxes=num_proposals,
        masks=mask_predictions_batch)
+    if refined_decoded_boxes_batch.shape[2] > 1:
+      class_ids = tf.expand_dims(
+          tf.argmax(class_predictions_with_background_batch[:, :, 1:], axis=2,
+                    output_type=tf.int32),
+          axis=-1)
+      raw_detection_boxes = tf.squeeze(
+          tf.batch_gather(refined_decoded_boxes_batch, class_ids), axis=2)
+    else:
+      raw_detection_boxes = tf.squeeze(refined_decoded_boxes_batch, axis=2)
+    def normalize_and_clip_boxes(args):
+      """Normalize and clip boxes."""
+      boxes_per_image = args[0]
+      image_shape = args[1]
+      normalized_boxes_per_image = box_list_ops.to_normalized_coordinates(
+          box_list.BoxList(boxes_per_image),
+          image_shape[0],
+          image_shape[1],
+          check_range=False).get()
+      normalized_boxes_per_image = box_list_ops.clip_to_window(
+          box_list.BoxList(normalized_boxes_per_image),
+          tf.constant([0.0, 0.0, 1.0, 1.0], tf.float32),
+          filter_nonoverlapping=False).get()
+      return normalized_boxes_per_image
+    raw_normalized_detection_boxes = shape_utils.static_or_dynamic_map_fn(
+        normalize_and_clip_boxes,
+        elems=[raw_detection_boxes, image_shapes],
+        dtype=tf.float32)
    detections = {
-        fields.DetectionResultFields.detection_boxes: nmsed_boxes,
-        fields.DetectionResultFields.detection_scores: nmsed_scores,
-        fields.DetectionResultFields.detection_classes: nmsed_classes,
-        fields.DetectionResultFields.num_detections: tf.to_float(num_detections)
+        fields.DetectionResultFields.detection_boxes:
+            nmsed_boxes,
+        fields.DetectionResultFields.detection_scores:
+            nmsed_scores,
+        fields.DetectionResultFields.detection_classes:
+            nmsed_classes,
+        fields.DetectionResultFields.num_detections:
+            tf.to_float(num_detections),
+        fields.DetectionResultFields.raw_detection_boxes:
+            raw_normalized_detection_boxes,
+        fields.DetectionResultFields.raw_detection_scores:
+            class_predictions_with_background_batch
    }
    if nmsed_masks is not None:
      detections[fields.DetectionResultFields.detection_masks] = nmsed_masks
@@ -1769,7 +1830,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
            back_prop=True))
    # Normalize by number of examples in sampled minibatch
-    normalizer = tf.reduce_sum(batch_sampled_indices, axis=1)
+    normalizer = tf.maximum(
+        tf.reduce_sum(batch_sampled_indices, axis=1), 1.0)
    batch_one_hot_targets = tf.one_hot(
        tf.to_int32(batch_cls_targets), depth=2)
    sampled_reg_indices = tf.multiply(batch_sampled_indices,
......
@@ -85,6 +85,68 @@ class FasterRCNNMetaArchTest(
    self.assertTrue(np.amax(detections_out['detection_masks'] <= 1.0))
    self.assertTrue(np.amin(detections_out['detection_masks'] >= 0.0))
+  def test_postprocess_second_stage_only_inference_mode_with_calibration(self):
+    model = self._build_model(
+        is_training=False, number_of_stages=2, second_stage_batch_size=6,
+        calibration_mapping_value=0.5)
+    batch_size = 2
+    total_num_padded_proposals = batch_size * model.max_num_proposals
+    proposal_boxes = tf.constant(
+        [[[1, 1, 2, 3],
+          [0, 0, 1, 1],
+          [.5, .5, .6, .6],
+          4*[0], 4*[0], 4*[0], 4*[0], 4*[0]],
+         [[2, 3, 6, 8],
+          [1, 2, 5, 3],
+          4*[0], 4*[0], 4*[0], 4*[0], 4*[0], 4*[0]]], dtype=tf.float32)
+    num_proposals = tf.constant([3, 2], dtype=tf.int32)
+    refined_box_encodings = tf.zeros(
+        [total_num_padded_proposals, model.num_classes, 4], dtype=tf.float32)
+    class_predictions_with_background = tf.ones(
+        [total_num_padded_proposals, model.num_classes+1], dtype=tf.float32)
+    image_shape = tf.constant([batch_size, 36, 48, 3], dtype=tf.int32)
+    mask_height = 2
+    mask_width = 2
+    mask_predictions = 30. * tf.ones(
+        [total_num_padded_proposals, model.num_classes,
+         mask_height, mask_width], dtype=tf.float32)
+    exp_detection_masks = np.array([[[[1, 1], [1, 1]],
+                                     [[1, 1], [1, 1]],
+                                     [[1, 1], [1, 1]],
+                                     [[1, 1], [1, 1]],
+                                     [[1, 1], [1, 1]]],
+                                    [[[1, 1], [1, 1]],
+                                     [[1, 1], [1, 1]],
+                                     [[1, 1], [1, 1]],
+                                     [[1, 1], [1, 1]],
+                                     [[0, 0], [0, 0]]]])
+    _, true_image_shapes = model.preprocess(tf.zeros(image_shape))
+    detections = model.postprocess({
+        'refined_box_encodings': refined_box_encodings,
+        'class_predictions_with_background': class_predictions_with_background,
+        'num_proposals': num_proposals,
+        'proposal_boxes': proposal_boxes,
+        'image_shape': image_shape,
+        'mask_predictions': mask_predictions
+    }, true_image_shapes)
+    with self.test_session() as sess:
+      detections_out = sess.run(detections)
+      self.assertAllEqual(detections_out['detection_boxes'].shape, [2, 5, 4])
+      # All scores map to 0.5, except for the final one, which is pruned.
+      self.assertAllClose(detections_out['detection_scores'],
+                          [[0.5, 0.5, 0.5, 0.5, 0.5],
+                           [0.5, 0.5, 0.5, 0.5, 0.0]])
+      self.assertAllClose(detections_out['detection_classes'],
+                          [[0, 0, 0, 1, 1], [0, 0, 1, 1, 0]])
+      self.assertAllClose(detections_out['num_detections'], [5, 4])
+      self.assertAllClose(detections_out['detection_masks'],
+                          exp_detection_masks)
+      self.assertTrue(np.amax(detections_out['detection_masks'] <= 1.0))
+      self.assertTrue(np.amin(detections_out['detection_masks'] >= 0.0))
  def test_postprocess_second_stage_only_inference_mode_with_shared_boxes(self):
    model = self._build_model(
        is_training=False, number_of_stages=2, second_stage_batch_size=6)
@@ -190,6 +252,7 @@ class FasterRCNNMetaArchTest(
        set([
            'detection_boxes', 'detection_scores', 'detection_classes',
            'detection_masks', 'num_detections', 'mask_predictions',
+            'raw_detection_boxes', 'raw_detection_scores'
        ])))
    for key in expected_shapes:
      self.assertAllEqual(tensor_dict_out[key].shape, expected_shapes[key])
@@ -276,7 +339,7 @@ class FasterRCNNMetaArchTest(
      self.assertAllEqual(tensor_dict_out[key].shape, expected_shapes[key])
      anchors_shape_out = tensor_dict_out['anchors'].shape
-      self.assertEqual(2, len(anchors_shape_out))
+      self.assertLen(anchors_shape_out, 2)
      self.assertEqual(4, anchors_shape_out[1])
      num_anchors_out = anchors_shape_out[0]
      self.assertAllEqual(tensor_dict_out['rpn_box_encodings'].shape,
......
@@ -165,7 +165,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
                   use_matmul_crop_and_resize=False,
                   clip_anchors_to_image=False,
                   use_matmul_gather_in_matcher=False,
-                   use_static_shapes=False):
+                   use_static_shapes=False,
+                   calibration_mapping_value=None):
    def image_resizer_fn(image, masks=None):
      """Fake image resizer function."""
@@ -244,7 +245,9 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
    first_stage_localization_loss_weight = 1.0
    first_stage_objectness_loss_weight = 1.0
+    post_processing_config = post_processing_pb2.PostProcessing()
    post_processing_text_proto = """
+      score_converter: IDENTITY
      batch_non_max_suppression {
        score_threshold: -20.0
        iou_threshold: 1.0
@@ -253,18 +256,31 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
        use_static_shapes: """ +'{}'.format(use_static_shapes) + """
      }
    """
-    post_processing_config = post_processing_pb2.PostProcessing()
+    if calibration_mapping_value:
+      calibration_text_proto = """
+      calibration_config {
+        function_approximation {
+          x_y_pairs {
+            x_y_pair {
+              x: 0.0
+              y: %f
+            }
+            x_y_pair {
+              x: 1.0
+              y: %f
+            }}}}""" % (calibration_mapping_value, calibration_mapping_value)
+      post_processing_text_proto = (post_processing_text_proto
+                                    + ' ' + calibration_text_proto)
    text_format.Merge(post_processing_text_proto, post_processing_config)
+    second_stage_non_max_suppression_fn, second_stage_score_conversion_fn = (
+        post_processing_builder.build(post_processing_config))
    second_stage_target_assigner = target_assigner.create_target_assigner(
        'FasterRCNN', 'detection',
        use_matmul_gather=use_matmul_gather_in_matcher)
-    second_stage_non_max_suppression_fn, _ = post_processing_builder.build(
-        post_processing_config)
    second_stage_sampler = sampler.BalancedPositiveNegativeSampler(
        positive_fraction=1.0, is_static=use_static_shapes)
-    second_stage_score_conversion_fn = tf.identity
    second_stage_localization_loss_weight = 1.0
    second_stage_classification_loss_weight = 1.0
    if softmax_second_stage_classification_loss:
@@ -336,6 +352,10 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
            predict_masks=predict_masks,
            masks_are_class_agnostic=masks_are_class_agnostic), **common_kwargs)
+  @parameterized.parameters(
+      {'use_static_shapes': False},
+      {'use_static_shapes': True}
+  )
  def test_predict_gives_correct_shapes_in_inference_mode_first_stage_only(
      self, use_static_shapes=False):
    batch_size = 2
@@ -457,6 +477,10 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
        prediction_out['rpn_objectness_predictions_with_background'].shape,
        (batch_size, num_anchors_out, 2))
+  @parameterized.parameters(
+      {'use_static_shapes': False},
+      {'use_static_shapes': True}
+  )
  def test_predict_correct_shapes_in_inference_mode_two_stages(
      self, use_static_shapes=False):
@@ -578,6 +602,10 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
    for key in expected_shapes:
      self.assertAllEqual(tensor_dict_out[key].shape, expected_shapes[key])
+  @parameterized.parameters(
+      {'use_static_shapes': False},
+      {'use_static_shapes': True}
+  )
  def test_predict_gives_correct_shapes_in_train_mode_both_stages(
      self,
      use_static_shapes=False):
@@ -670,6 +698,12 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
    self.assertAllEqual(results[8].shape,
                        expected_shapes['rpn_box_predictor_features'])
+  @parameterized.parameters(
+      {'use_static_shapes': False, 'pad_to_max_dimension': None},
+      {'use_static_shapes': True, 'pad_to_max_dimension': None},
+      {'use_static_shapes': False, 'pad_to_max_dimension': 56},
+      {'use_static_shapes': True, 'pad_to_max_dimension': 56}
+  )
  def test_postprocess_first_stage_only_inference_mode(
      self, use_static_shapes=False, pad_to_max_dimension=None):
    batch_size = 2
@@ -696,9 +730,9 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
              rpn_objectness_predictions_with_background,
          'rpn_features_to_crop': rpn_features_to_crop,
          'anchors': anchors}, true_image_shapes)
-      return (proposals['num_detections'],
-              proposals['detection_boxes'],
-              proposals['detection_scores'])
+      return (proposals['num_detections'], proposals['detection_boxes'],
+              proposals['detection_scores'], proposals['raw_detection_boxes'],
+              proposals['raw_detection_scores'])
    anchors = np.array(
        [[0, 0, 16, 16],
@@ -741,6 +775,12 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
    expected_proposal_scores = [[1, 1, 0, 0, 0, 0, 0, 0],
                                [1, 1, 0, 0, 0, 0, 0, 0]]
    expected_num_proposals = [4, 4]
+    expected_raw_proposal_boxes = [[[0., 0., 0.5, 0.5], [0., 0.5, 0.5, 1.],
+                                    [0.5, 0., 1., 0.5], [0.5, 0.5, 1., 1.]],
+                                   [[0., 0., 0.5, 0.5], [0., 0.5, 0.5, 1.],
+                                    [0.5, 0., 1., 0.5], [0.5, 0.5, 1., 1.]]]
+    expected_raw_scores = [[[-10., 13.], [10., -10.], [10., -11.], [-10., 12.]],
+                           [[10., -10.], [-10., 13.], [-10., 12.], [10., -11.]]]
    self.assertAllClose(results[0], expected_num_proposals)
    for indx, num_proposals in enumerate(expected_num_proposals):
@@ -748,6 +788,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
                          expected_proposal_boxes[indx][0:num_proposals])
      self.assertAllClose(results[2][indx][0:num_proposals],
                          expected_proposal_scores[indx][0:num_proposals])
+    self.assertAllClose(results[3], expected_raw_proposal_boxes)
+    self.assertAllClose(results[4], expected_raw_scores)
  def _test_postprocess_first_stage_only_train_mode(self,
                                                    pad_to_max_dimension=None):
@@ -801,9 +843,17 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
    expected_proposal_scores = [[1, 1],
                                [1, 1]]
    expected_num_proposals = [2, 2]
-    expected_output_keys = set(['detection_boxes', 'detection_scores',
-                                'num_detections'])
+    expected_raw_proposal_boxes = [[[0., 0., 0.5, 0.5], [0., 0.5, 0.5, 1.],
+                                    [0.5, 0., 1., 0.5], [0.5, 0.5, 1., 1.]],
+                                   [[0., 0., 0.5, 0.5], [0., 0.5, 0.5, 1.],
+                                    [0.5, 0., 1., 0.5], [0.5, 0.5, 1., 1.]]]
+    expected_raw_scores = [[[-10., 13.], [-10., 12.], [-10., 11.], [-10., 10.]],
+                           [[-10., 13.], [-10., 12.], [-10., 11.], [-10., 10.]]]
+    expected_output_keys = set([
+        'detection_boxes', 'detection_scores', 'num_detections',
+        'raw_detection_boxes', 'raw_detection_scores'
+    ])
    self.assertEqual(set(proposals.keys()), expected_output_keys)
    with self.test_session() as sess:
@@ -817,6 +867,10 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
                          expected_proposal_scores)
      self.assertAllEqual(proposals_out['num_detections'],
                          expected_num_proposals)
+      self.assertAllClose(proposals_out['raw_detection_boxes'],
+                          expected_raw_proposal_boxes)
+      self.assertAllClose(proposals_out['raw_detection_scores'],
+                          expected_raw_scores)
  def test_postprocess_first_stage_only_train_mode(self):
    self._test_postprocess_first_stage_only_train_mode()
@@ -824,6 +878,12 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
  def test_postprocess_first_stage_only_train_mode_padded_image(self):
    self._test_postprocess_first_stage_only_train_mode(pad_to_max_dimension=56)
+  @parameterized.parameters(
+      {'use_static_shapes': False, 'pad_to_max_dimension': None},
+      {'use_static_shapes': True, 'pad_to_max_dimension': None},
+      {'use_static_shapes': False, 'pad_to_max_dimension': 56},
+      {'use_static_shapes': True, 'pad_to_max_dimension': 56}
+  )
  def test_postprocess_second_stage_only_inference_mode(
      self, use_static_shapes=False, pad_to_max_dimension=None):
    batch_size = 2
@@ -854,10 +914,10 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
          'num_proposals': num_proposals,
          'proposal_boxes': proposal_boxes,
      }, true_image_shapes)
-      return (detections['num_detections'],
-              detections['detection_boxes'],
-              detections['detection_scores'],
-              detections['detection_classes'])
+      return (detections['num_detections'], detections['detection_boxes'],
+              detections['detection_scores'], detections['detection_classes'],
+              detections['raw_detection_boxes'],
+              detections['raw_detection_scores'])
    proposal_boxes = np.array(
        [[[1, 1, 2, 3],
@@ -867,6 +927,7 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
         [[2, 3, 6, 8],
          [1, 2, 5, 3],
          4*[0], 4*[0], 4*[0], 4*[0], 4*[0], 4*[0]]], dtype=np.float32)
+
    num_proposals = np.array([3, 2], dtype=np.int32)
    refined_box_encodings = np.zeros(
        [total_num_padded_proposals, num_classes, 4], dtype=np.float32)
@@ -887,6 +948,15 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
    expected_num_detections = [5, 4]
    expected_detection_classes = [[0, 0, 0, 1, 1], [0, 0, 1, 1, 0]]
    expected_detection_scores = [[1, 1, 1, 1, 1], [1, 1, 1, 1, 0]]
+    h = float(image_shape[1])
+    w = float(image_shape[2])
+    expected_raw_detection_boxes = np.array(
+        [[[1 / h, 1 / w, 2 / h, 3 / w], [0, 0, 1 / h, 1 / w],
+          [.5 / h, .5 / w, .6 / h, .6 / w], 4 * [0], 4 * [0], 4 * [0], 4 * [0],
+          4 * [0]],
+         [[2 / h, 3 / w, 6 / h, 8 / w], [1 / h, 2 / w, 5 / h, 3 / w], 4 * [0],
+          4 * [0], 4 * [0], 4 * [0], 4 * [0], 4 * [0]]],
+        dtype=np.float32)
    self.assertAllClose(results[0], expected_num_detections)
@@ -896,6 +966,9 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
      self.assertAllClose(results[3][indx][0:num_proposals],
                          expected_detection_classes[indx][0:num_proposals])
+    self.assertAllClose(results[4], expected_raw_detection_boxes)
+    self.assertAllClose(results[5],
+                        class_predictions_with_background.reshape([-1, 8, 3]))
    if not use_static_shapes:
      self.assertAllEqual(results[1].shape, [2, 5, 4])
@@ -1268,6 +1341,13 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
        'Loss/BoxClassifierLoss/classification_loss'], 0)
    self.assertAllClose(loss_dict_out['Loss/BoxClassifierLoss/mask_loss'], 0)
+  @parameterized.parameters(
+      {'use_static_shapes': False, 'shared_boxes': False},
+      {'use_static_shapes': False, 'shared_boxes': True},
+      {'use_static_shapes': True, 'shared_boxes': False},
+      {'use_static_shapes': True, 'shared_boxes': True},
+  )
  def test_loss_full_zero_padded_proposals_nonzero_loss_with_two_images(
      self, use_static_shapes=False, shared_boxes=False):
    batch_size = 2
......
@@ -288,7 +288,7 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
    """
    image_shape_2d = tf.tile(tf.expand_dims(image_shape[1:], 0),
                             [image_shape[0], 1])
-    proposal_boxes_normalized, _, num_proposals = self._postprocess_rpn(
+    proposal_boxes_normalized, _, num_proposals, _, _ = self._postprocess_rpn(
        rpn_box_encodings, rpn_objectness_predictions_with_background,
        anchors, image_shape_2d, true_image_shapes)
......
@@ -17,8 +17,7 @@
 General tensorflow implementation of convolutional Multibox/SSD detection
 models.
 """
-from abc import abstractmethod
+import abc
 import tensorflow as tf
 from object_detection.core import box_list
@@ -80,7 +79,7 @@ class SSDFeatureExtractor(object):
  def is_keras_model(self):
    return False
-  @abstractmethod
+  @abc.abstractmethod
  def preprocess(self, resized_inputs):
    """Preprocesses images for feature extraction (minus image resizing).
@@ -98,7 +97,7 @@ class SSDFeatureExtractor(object):
    """
    pass
-  @abstractmethod
+  @abc.abstractmethod
  def extract_features(self, preprocessed_inputs):
    """Extracts features from preprocessed inputs.
@@ -196,7 +195,7 @@ class SSDKerasFeatureExtractor(tf.keras.Model):
  def is_keras_model(self):
    return True
-  @abstractmethod
+  @abc.abstractmethod
  def preprocess(self, resized_inputs):
    """Preprocesses images for feature extraction (minus image resizing).
@@ -214,7 +213,7 @@ class SSDKerasFeatureExtractor(tf.keras.Model):
    """
    raise NotImplementedError
-  @abstractmethod
+  @abc.abstractmethod
  def _extract_features(self, preprocessed_inputs):
    """Extracts features from preprocessed inputs.
@@ -552,8 +551,10 @@ class SSDMetaArch(model.DetectionModel):
        5) anchors: 2-D float tensor of shape [num_anchors, 4] containing
          the generated anchors in normalized coordinates.
    """
-    batchnorm_updates_collections = (None if self._inplace_batchnorm_update
-                                     else tf.GraphKeys.UPDATE_OPS)
+    if self._inplace_batchnorm_update:
+      batchnorm_updates_collections = None
+    else:
+      batchnorm_updates_collections = tf.GraphKeys.UPDATE_OPS
    if self._feature_extractor.is_keras_model:
      feature_maps = self._feature_extractor(preprocessed_inputs)
    else:
@@ -648,14 +649,22 @@ class SSDMetaArch(model.DetectionModel):
    Returns:
      detections: a dictionary containing the following fields
-        detection_boxes: [batch, max_detections, 4]
-        detection_scores: [batch, max_detections]
-        detection_classes: [batch, max_detections]
+        detection_boxes: [batch, max_detections, 4] tensor with post-processed
+          detection boxes.
+        detection_scores: [batch, max_detections] tensor with scalar scores for
+          post-processed detection boxes.
+        detection_classes: [batch, max_detections] tensor with classes for
+          post-processed detection classes.
        detection_keypoints: [batch, max_detections, num_keypoints, 2] (if
          encoded in the prediction_dict 'box_encodings')
        detection_masks: [batch_size, max_detections, mask_height, mask_width]
          (optional)
        num_detections: [batch]
+        raw_detection_boxes: [batch, total_detections, 4] tensor with decoded
+          detection boxes before Non-Max Suppression.
+        raw_detection_score: [batch, total_detections,
+          num_classes_with_background] tensor of multi-class score logits for
+          raw detection boxes.
    Raises:
      ValueError: if prediction_dict does not contain `box_encodings` or
        `class_predictions_with_background` fields.
@@ -700,11 +709,18 @@ class SSDMetaArch(model.DetectionModel):
          additional_fields=additional_fields,
          masks=prediction_dict.get('mask_predictions'))
      detection_dict = {
-          fields.DetectionResultFields.detection_boxes: nmsed_boxes,
-          fields.DetectionResultFields.detection_scores: nmsed_scores,
-          fields.DetectionResultFields.detection_classes: nmsed_classes,
+          fields.DetectionResultFields.detection_boxes:
+              nmsed_boxes,
+          fields.DetectionResultFields.detection_scores:
+              nmsed_scores,
+          fields.DetectionResultFields.detection_classes:
+              nmsed_classes,
          fields.DetectionResultFields.num_detections:
-              tf.to_float(num_detections)
+              tf.to_float(num_detections),
+          fields.DetectionResultFields.raw_detection_boxes:
+              tf.squeeze(detection_boxes, axis=2),
+          fields.DetectionResultFields.raw_detection_scores:
+              class_predictions
      }
      if (nmsed_additional_fields is not None and
          fields.BoxListFields.keypoints in nmsed_additional_fields):
@@ -1049,9 +1065,9 @@ class SSDMetaArch(model.DetectionModel):
      mined_cls_loss: a float scalar with sum of classification losses from
        selected hard examples.
    """
-    class_predictions = tf.slice(
-        prediction_dict['class_predictions_with_background'], [0, 0,
-                                                               1], [-1, -1, -1])
+    class_predictions = prediction_dict['class_predictions_with_background']
+    if self._add_background_class:
+      class_predictions = tf.slice(class_predictions, [0, 0, 1], [-1, -1, -1])
    decoded_boxes, _ = self._batch_decode(prediction_dict['box_encodings'])
    decoded_box_tensors_list = tf.unstack(decoded_boxes)
......
@@ -16,7 +16,9 @@
 import functools
 import tensorflow as tf
+from google.protobuf import text_format
+from object_detection.builders import post_processing_builder
 from object_detection.core import anchor_generator
 from object_detection.core import balanced_positive_negative_sampler as sampler
 from object_detection.core import box_list
@@ -25,6 +27,7 @@ from object_detection.core import post_processing
 from object_detection.core import region_similarity_calculator as sim_calc
 from object_detection.core import target_assigner
 from object_detection.meta_architectures import ssd_meta_arch
+from object_detection.protos import calibration_pb2
 from object_detection.protos import model_pb2
 from object_detection.utils import ops
 from object_detection.utils import test_case
@@ -125,7 +128,8 @@ class SSDMetaArchTestBase(test_case.TestCase):
                   use_keras=False,
                   predict_mask=False,
                   use_static_shapes=False,
-                   nms_max_size_per_class=5):
+                   nms_max_size_per_class=5,
+                   calibration_mapping_value=None):
    is_training = False
    num_classes = 1
    mock_anchor_generator = MockAnchorGenerator2x2()
@@ -156,6 +160,24 @@ class SSDMetaArchTestBase(test_case.TestCase):
        max_size_per_class=nms_max_size_per_class,
        max_total_size=nms_max_size_per_class,
        use_static_shapes=use_static_shapes)
+    score_conversion_fn = tf.identity
+    calibration_config = calibration_pb2.CalibrationConfig()
+    if calibration_mapping_value:
+      calibration_text_proto = """
+      function_approximation {
+        x_y_pairs {
+          x_y_pair {
+            x: 0.0
+            y: %f
+          }
+          x_y_pair {
+            x: 1.0
+            y: %f
+          }}}""" % (calibration_mapping_value, calibration_mapping_value)
+      text_format.Merge(calibration_text_proto, calibration_config)
+      score_conversion_fn = (
+          post_processing_builder._build_calibrated_score_converter(  # pylint: disable=protected-access
+              tf.identity, calibration_config))
    classification_loss_weight = 1.0
    localization_loss_weight = 1.0
    negative_class_weight = 1.0
@@ -201,7 +223,7 @@ class SSDMetaArchTestBase(test_case.TestCase):
        encode_background_as_zeros=encode_background_as_zeros,
        image_resizer_fn=image_resizer_fn,
        non_max_suppression_fn=non_max_suppression_fn,
-        score_conversion_fn=tf.identity,
+        score_conversion_fn=score_conversion_fn,
        classification_loss=classification_loss,
        localization_loss=localization_loss,
        classification_loss_weight=classification_loss_weight,
......