Unverified Commit 9bbf8015 authored by pkulzc's avatar pkulzc Committed by GitHub

Merged commit includes the following changes: (#6932)

250447559  by Zhichao Lu:

    Update the expected file format for the Instance Segmentation challenge:
    - add ImageWidth and ImageHeight fields and store their values per prediction
    - for masks, store only the encoded image and assume its size is ImageWidth x ImageHeight (see the sketch below)
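
    For illustration, a minimal sketch of producing such an entry, assuming
    pycocotools and a binary numpy mask; the `prediction` dict and its values
    are illustrative stand-ins for the CSV columns:

        import numpy as np
        from pycocotools import mask as coco_mask

        binary_mask = np.zeros((480, 640), dtype=np.uint8)  # ImageHeight x ImageWidth
        binary_mask[100:200, 150:300] = 1
        # COCO RLE encoding; only the 'counts' string is stored per prediction.
        rle = coco_mask.encode(np.asfortranarray(binary_mask))
        prediction = {'ImageWidth': 640, 'ImageHeight': 480,
                      'Mask': rle['counts']}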

--
250402780  by rathodv:

    Fix failing Mask R-CNN TPU convergence test.

    Cast second stage prediction tensors from bfloat16 to float32 to prevent errors in the third target assignment (mask prediction): concatenating tensors with different types (bfloat16 and float32) isn't allowed.
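
    A minimal TF 1.x sketch of the failure mode and the fix; shapes and names
    are illustrative, not the ones from the meta architecture:

        import tensorflow as tf

        box_features = tf.zeros([8, 7, 7, 256], dtype=tf.bfloat16)
        extra_features = tf.zeros([8, 7, 7, 4], dtype=tf.float32)
        # tf.concat requires matching dtypes, so concatenating the two tensors
        # directly would raise an error. Casting first makes it valid:
        box_features = tf.cast(box_features, dtype=tf.float32)
        merged = tf.concat([box_features, extra_features], axis=3)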

--
250300240  by Zhichao Lu:

    Add Open Images Challenge 2019 object detection and instance segmentation
    support to the Estimator framework.

--
249944839  by rathodv:

    Modify exporter.py to add multiclass score nodes in exported inference graphs.

--
249935201  by rathodv:

    Modify postprocess methods to preserve multiclass scores after non max suppression.

--
249878079  by Zhichao Lu:

    This CL slightly refactors some Object Detection helper functions for data creation, evaluation, and groundtruth provision.

    This will allow the eager+function custom loops to share code with the existing estimator training loops.

    Concretely we make the following changes:
    1. In input creation we separate dataset creation into top-level helpers, and allow it to optionally accept a pre-constructed model directly instead of always creating a model from the config just for feature preprocessing.

    2. In coco evaluation we split the update_op creation into its own function, which the custom loops will call directly.

    3. In model_lib we move groundtruth provision and data structure munging into a helper function.

    4. For now we put an escape hatch in `_summarize_target_assignment` when executing with tf v2.0 behavior, because the summary APIs used only work with tf 1.x.

--
249673507  by rathodv:

    Use explicit casts instead of tf.to_float and tf.to_int32 to avoid warnings.

--
249656006  by Zhichao Lu:

    Add a named "raw_keypoint_locations" node that corresponds to the "raw_box_locations" node.

--
249651674  by rathodv:

    Keep proposal boxes in float format. MatMulCropAndResize can handle the type even when the features themselves are bfloat16.

--
249568633  by rathodv:

    Support q > 1 in class agnostic NMS.
    Break post_processing_test.py into 3 separate files to avoid linter errors.

--
249535530  by rathodv:

    Update some deprecated arguments to tf ops.

--
249368223  by rathodv:

    Modify MatMulCropAndResize to use MultiLevelRoIAlign method and move the tests to spatial_transform_ops.py module.

    This CL establishes that CropAndResize and RoIAlign are equivalent and differ only in the sampling point grid within the boxes. CropAndResize uses a uniform size x size point grid whose corner points exactly overlap the box corners, while RoIAlign divides boxes into size x size cells and uses their centers as sampling points. In this CL, we switch MatMulCropAndResize to use the MultiLevelRoIAlign implementation with the `align_corner` option, as the MultiLevelRoIAlign implementation is more memory efficient on TPU than the original MatMulCropAndResize.
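
    To make the distinction concrete, a small numpy sketch (an illustration,
    not code from this CL) of the two 1-D sampling grids over an interval
    [a, b] with grid size > 1:

        import numpy as np

        def crop_and_resize_points(a, b, size):
          # Uniform grid whose end points coincide with the box corners
          # (the `align_corner` behaviour).
          return a + (b - a) * np.arange(size) / (size - 1.0)

        def roi_align_points(a, b, size):
          # Divide the box into `size` cells and sample at cell centers.
          cell = (b - a) / size
          return a + cell * (np.arange(size) + 0.5)

        print(crop_and_resize_points(0.0, 1.0, 4))  # [0. 0.333 0.667 1.]
        print(roi_align_points(0.0, 1.0, 4))        # [0.125 0.375 0.625 0.875]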

--
249337338  by chowdhery:

    Add class-agnostic non-max-suppression in post_processing
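
    A rough sketch of the idea in an unbatched setting (the implementation in
    post_processing is batched and more general): reduce the per-class scores
    to one score per box and suppress across all classes jointly.

        import tensorflow as tf

        def class_agnostic_nms(boxes, multiclass_scores, max_output_size=100,
                               iou_threshold=0.6):
          # Keep each box's best class and run a single NMS over all boxes.
          scores = tf.reduce_max(multiclass_scores, axis=-1)
          classes = tf.argmax(multiclass_scores, axis=-1)
          keep = tf.image.non_max_suppression(
              boxes, scores, max_output_size, iou_threshold=iou_threshold)
          return (tf.gather(boxes, keep), tf.gather(scores, keep),
                  tf.gather(classes, keep))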

--
249139196  by Zhichao Lu:

    Fix positional argument bug in export_tflite_ssd_graph

--
249120219  by Zhichao Lu:

    Add evaluator for computing precision limited to a given recall range.
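
    Conceptually, this restricts the precision/recall curve to a recall
    window before averaging; a hedged numpy sketch under that reading (the
    function name and arguments are illustrative):

        import numpy as np

        def mean_precision_in_recall_range(precisions, recalls,
                                           recall_lower_bound=0.0,
                                           recall_upper_bound=1.0):
          # precisions/recalls trace the PR curve at descending score
          # thresholds; keep only points whose recall lies in the window.
          precisions = np.asarray(precisions)
          recalls = np.asarray(recalls)
          in_range = ((recalls >= recall_lower_bound) &
                      (recalls <= recall_upper_bound))
          if not np.any(in_range):
            return 0.0
          return float(np.mean(precisions[in_range]))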

--
249030593  by Zhichao Lu:

    Evaluation util to run segmentation and detection challenge evaluation.

--
248554358  by Zhichao Lu:

    This change contains the auxiliary changes required for TF 2.0 style training with eager + functions + distribution strategy loops, but not the loops themselves.

    It includes:
    - Updates to shape usage to support both TensorShape v1 and TensorShape v2
    - A fix to FreezableBatchNorm to not override the `training` arg in call when `None` was passed to the constructor (not an issue in the estimator loops, but it was in the custom loops)
    - Puts some constants in init_scope so they work in eager + functions
    - Makes learning rate schedules return a callable in eager mode (required so they update when the global_step changes; see the sketch after this list)
    - Makes DetectionModel a tf.Module so it tracks variables (e.g. ones nested in layers)
    - Removes some references to `op.name` for some losses and replaces them with explicit names
    - A small part of the change to allow the coco evaluation metrics to work in eager mode
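
    For the learning rate point above, a minimal TF 2 style sketch of the
    callable pattern (the schedule and constants are illustrative, not the
    ones used in this codebase):

        import tensorflow as tf

        global_step = tf.Variable(0, dtype=tf.int64)

        def learning_rate():
          # Re-evaluated on every call, so it tracks the current global_step.
          return 0.001 * 0.95 ** (tf.cast(global_step, tf.float32) / 1000.0)

        optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)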

--
248271226  by rathodv:

    Add MultiLevel RoIAlign op.

--
248229103  by rathodv:

    Add functions to 1) pad feature maps and 2) ravel 5-D indices.
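
    For the second helper, raveling turns multi-dimensional gather indices
    into flat ones; a small numpy sketch of the idea (shapes are
    illustrative):

        import numpy as np

        shape = (2, 3, 4, 5, 6)  # e.g. [batch, level, box, y, x]
        indices = np.array([[0, 1, 2, 3, 4],
                            [1, 2, 3, 4, 5]])
        flat = np.ravel_multi_index(indices.T, shape)
        # Equivalent to indexing into the flattened array:
        x = np.arange(np.prod(shape)).reshape(shape)
        assert all(x.flat[f] == x[tuple(i)] for f, i in zip(flat, indices))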

--
248206769  by rathodv:

    Add utilities needed to introduce RoI Align op.

--
248177733  by pengchong:

    Internal changes

--
247742582  by Zhichao Lu:

    Open Images Challenge 2019 instance segmentation metric: part 2

--
247525401  by Zhichao Lu:

    Update comments on max_class_per_detection.

--
247520753  by rathodv:

    Add multilevel crop and resize operation that builds on top of matmul_crop_and_resize.

--
247391600  by Zhichao Lu:

    Open Images Challenge 2019 instance segmentation metric

--
247325813  by chowdhery:

    Quantized MobileNet v2 SSD FPNLite config with depth multiplier 0.75

--

PiperOrigin-RevId: 250447559
parent f42fddee
......@@ -271,7 +271,8 @@ class FasterRCNNMetaArchTest(
set(tensor_dict_out.keys()),
set(expected_shapes.keys()).union(
set([
'detection_boxes', 'detection_scores', 'detection_classes',
'detection_boxes', 'detection_scores',
'detection_multiclass_scores', 'detection_classes',
'detection_masks', 'num_detections', 'mask_predictions',
'raw_detection_boxes', 'raw_detection_scores'
])))
......
......@@ -967,7 +967,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
[[0, 0, .5, .5], [.5, .5, 1, 1]], [[0, .5, .5, 1], [.5, 0, 1, .5]]]
expected_proposal_scores = [[1, 1],
[1, 1]]
expected_num_proposals = [2, 2]
expected_proposal_multiclass_scores = [[[0., 1.], [0., 1.]],
[[0., 1.], [0., 1.]]]
expected_raw_proposal_boxes = [[[0., 0., 0.5, 0.5], [0., 0.5, 0.5, 1.],
[0.5, 0., 1., 0.5], [0.5, 0.5, 1., 1.]],
[[0., 0., 0.5, 0.5], [0., 0.5, 0.5, 1.],
......@@ -975,31 +976,45 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
expected_raw_scores = [[[0., 1.], [0., 1.], [0., 1.], [0., 1.]],
[[0., 1.], [0., 1.], [0., 1.], [0., 1.]]]
expected_output_keys = set([
'detection_boxes', 'detection_scores', 'num_detections',
'raw_detection_boxes', 'raw_detection_scores'
'detection_boxes', 'detection_scores', 'detection_multiclass_scores',
'num_detections', 'raw_detection_boxes', 'raw_detection_scores'
])
self.assertEqual(set(proposals.keys()), expected_output_keys)
with self.test_session() as sess:
proposals_out = sess.run(proposals)
for image_idx in range(batch_size):
num_detections = int(proposals_out['num_detections'][image_idx])
boxes = proposals_out['detection_boxes'][
image_idx][:num_detections, :].tolist()
scores = proposals_out['detection_scores'][
image_idx][:num_detections].tolist()
multiclass_scores = proposals_out['detection_multiclass_scores'][
image_idx][:num_detections, :].tolist()
expected_boxes = expected_proposal_boxes[image_idx]
expected_scores = expected_proposal_scores[image_idx]
expected_multiclass_scores = expected_proposal_multiclass_scores[
image_idx]
self.assertTrue(
test_utils.first_rows_close_as_set(
proposals_out['detection_boxes'][image_idx].tolist(),
expected_proposal_boxes[image_idx]))
self.assertAllClose(proposals_out['detection_scores'],
expected_proposal_scores)
self.assertAllEqual(proposals_out['num_detections'],
expected_num_proposals)
test_utils.first_rows_close_as_set(boxes, expected_boxes))
self.assertTrue(
test_utils.first_rows_close_as_set(scores, expected_scores))
self.assertTrue(
test_utils.first_rows_close_as_set(multiclass_scores,
expected_multiclass_scores))
self.assertAllClose(proposals_out['raw_detection_boxes'],
expected_raw_proposal_boxes)
self.assertAllClose(proposals_out['raw_detection_scores'],
expected_raw_scores)
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
@parameterized.named_parameters({
'testcase_name': 'keras',
'use_keras': True
}, {
'testcase_name': 'slim',
'use_keras': False
})
def test_postprocess_first_stage_only_train_mode(self, use_keras=False):
self._test_postprocess_first_stage_only_train_mode(use_keras=use_keras)
......@@ -1066,7 +1081,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
return (detections['num_detections'], detections['detection_boxes'],
detections['detection_scores'], detections['detection_classes'],
detections['raw_detection_boxes'],
detections['raw_detection_scores'])
detections['raw_detection_scores'],
detections['detection_multiclass_scores'])
proposal_boxes = np.array(
[[[1, 1, 2, 3],
......@@ -1097,6 +1113,17 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
expected_num_detections = [5, 4]
expected_detection_classes = [[0, 0, 0, 1, 1], [0, 0, 1, 1, 0]]
expected_detection_scores = [[1, 1, 1, 1, 1], [1, 1, 1, 1, 0]]
expected_multiclass_scores = [[[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1]],
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[0, 0, 0]]]
h = float(image_shape[1])
w = float(image_shape[2])
expected_raw_detection_boxes = np.array(
......@@ -1114,6 +1141,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
expected_detection_scores[indx][0:num_proposals])
self.assertAllClose(results[3][indx][0:num_proposals],
expected_detection_classes[indx][0:num_proposals])
self.assertAllClose(results[6][indx][0:num_proposals],
expected_multiclass_scores[indx][0:num_proposals])
self.assertAllClose(results[4], expected_raw_detection_boxes)
self.assertAllClose(results[5],
......@@ -1895,8 +1924,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
number_of_stages=2, second_stage_batch_size=6)
inputs_shape = (2, 20, 20, 3)
inputs = tf.to_float(tf.random_uniform(
inputs_shape, minval=0, maxval=255, dtype=tf.int32))
inputs = tf.cast(tf.random_uniform(
inputs_shape, minval=0, maxval=255, dtype=tf.int32), dtype=tf.float32)
preprocessed_inputs, true_image_shapes = model.preprocess(inputs)
prediction_dict = model.predict(preprocessed_inputs, true_image_shapes)
model.postprocess(prediction_dict, true_image_shapes)
......@@ -1921,8 +1950,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
is_training=False, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6)
inputs_shape = (2, 20, 20, 3)
inputs = tf.to_float(tf.random_uniform(
inputs_shape, minval=0, maxval=255, dtype=tf.int32))
inputs = tf.cast(tf.random_uniform(
inputs_shape, minval=0, maxval=255, dtype=tf.int32), dtype=tf.float32)
preprocessed_inputs, true_image_shapes = model.preprocess(inputs)
prediction_dict = model.predict(preprocessed_inputs, true_image_shapes)
model.postprocess(prediction_dict, true_image_shapes)
......@@ -1942,8 +1971,9 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
second_stage_batch_size=6, num_classes=42)
inputs_shape2 = (2, 20, 20, 3)
inputs2 = tf.to_float(tf.random_uniform(
inputs_shape2, minval=0, maxval=255, dtype=tf.int32))
inputs2 = tf.cast(tf.random_uniform(
inputs_shape2, minval=0, maxval=255, dtype=tf.int32),
dtype=tf.float32)
preprocessed_inputs2, true_image_shapes = model2.preprocess(inputs2)
prediction_dict2 = model2.predict(preprocessed_inputs2, true_image_shapes)
model2.postprocess(prediction_dict2, true_image_shapes)
......@@ -1974,8 +2004,9 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
num_classes=42)
inputs_shape = (2, 20, 20, 3)
inputs = tf.to_float(
tf.random_uniform(inputs_shape, minval=0, maxval=255, dtype=tf.int32))
inputs = tf.cast(
tf.random_uniform(inputs_shape, minval=0, maxval=255, dtype=tf.int32),
dtype=tf.float32)
preprocessed_inputs, true_image_shapes = model.preprocess(inputs)
prediction_dict = model.predict(preprocessed_inputs, true_image_shapes)
model.postprocess(prediction_dict, true_image_shapes)
......
......@@ -297,9 +297,10 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
"""
image_shape_2d = tf.tile(tf.expand_dims(image_shape[1:], 0),
[image_shape[0], 1])
proposal_boxes_normalized, _, num_proposals, _, _ = self._postprocess_rpn(
rpn_box_encodings, rpn_objectness_predictions_with_background,
anchors, image_shape_2d, true_image_shapes)
(proposal_boxes_normalized, _, _, num_proposals, _,
_) = self._postprocess_rpn(rpn_box_encodings,
rpn_objectness_predictions_with_background,
anchors, image_shape_2d, true_image_shapes)
box_classifier_features = (
self._extract_box_classifier_features(rpn_features))
......
......@@ -509,9 +509,9 @@ class SSDMetaArch(model.DetectionModel):
resized_inputs_shape = shape_utils.combined_static_and_dynamic_shape(
preprocessed_images)
true_heights, true_widths, _ = tf.unstack(
tf.to_float(true_image_shapes), axis=1)
padded_height = tf.to_float(resized_inputs_shape[1])
padded_width = tf.to_float(resized_inputs_shape[2])
tf.cast(true_image_shapes, dtype=tf.float32), axis=1)
padded_height = tf.cast(resized_inputs_shape[1], dtype=tf.float32)
padded_width = tf.cast(resized_inputs_shape[2], dtype=tf.float32)
return tf.stack(
[
tf.zeros_like(true_heights),
......@@ -654,6 +654,9 @@ class SSDMetaArch(model.DetectionModel):
detection boxes.
detection_scores: [batch, max_detections] tensor with scalar scores for
post-processed detection boxes.
detection_multiclass_scores: [batch, max_detections,
num_classes_with_background] tensor with class score distribution for
post-processed detection boxes including background class if any.
detection_classes: [batch, max_detections] tensor with classes for
post-processed detection classes.
detection_keypoints: [batch, max_detections, num_keypoints, 2] (if
......@@ -703,10 +706,13 @@ class SSDMetaArch(model.DetectionModel):
feature_map_list.append(tf.reshape(feature_map, [batch_size, -1]))
box_features = tf.concat(feature_map_list, 1)
box_features = tf.identity(box_features, 'raw_box_features')
additional_fields = {
'multiclass_scores': detection_scores_with_background
}
if detection_keypoints is not None:
additional_fields = {
fields.BoxListFields.keypoints: detection_keypoints}
detection_keypoints = tf.identity(
detection_keypoints, 'raw_keypoint_locations')
additional_fields[fields.BoxListFields.keypoints] = detection_keypoints
(nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks,
nmsed_additional_fields, num_detections) = self._non_max_suppression_fn(
detection_boxes,
......@@ -722,8 +728,10 @@ class SSDMetaArch(model.DetectionModel):
nmsed_scores,
fields.DetectionResultFields.detection_classes:
nmsed_classes,
fields.DetectionResultFields.detection_multiclass_scores:
nmsed_additional_fields['multiclass_scores'],
fields.DetectionResultFields.num_detections:
tf.to_float(num_detections),
tf.cast(num_detections, dtype=tf.float32),
fields.DetectionResultFields.raw_detection_boxes:
tf.squeeze(detection_boxes, axis=2),
fields.DetectionResultFields.raw_detection_scores:
......@@ -786,13 +794,13 @@ class SSDMetaArch(model.DetectionModel):
if self._random_example_sampler:
batch_cls_per_anchor_weights = tf.reduce_mean(
batch_cls_weights, axis=-1)
batch_sampled_indicator = tf.to_float(
batch_sampled_indicator = tf.cast(
shape_utils.static_or_dynamic_map_fn(
self._minibatch_subsample_fn,
[batch_cls_targets, batch_cls_per_anchor_weights],
dtype=tf.bool,
parallel_iterations=self._parallel_iterations,
back_prop=True))
back_prop=True), dtype=tf.float32)
batch_reg_weights = tf.multiply(batch_sampled_indicator,
batch_reg_weights)
batch_cls_weights = tf.multiply(
......@@ -868,7 +876,8 @@ class SSDMetaArch(model.DetectionModel):
# Optionally normalize by number of positive matches
normalizer = tf.constant(1.0, dtype=tf.float32)
if self._normalize_loss_by_num_matches:
normalizer = tf.maximum(tf.to_float(tf.reduce_sum(batch_reg_weights)),
normalizer = tf.maximum(tf.cast(tf.reduce_sum(batch_reg_weights),
dtype=tf.float32),
1.0)
localization_loss_normalizer = normalizer
......@@ -883,8 +892,8 @@ class SSDMetaArch(model.DetectionModel):
name='classification_loss')
loss_dict = {
str(localization_loss.op.name): localization_loss,
str(classification_loss.op.name): classification_loss
'Loss/localization_loss': localization_loss,
'Loss/classification_loss': classification_loss
}
......@@ -1025,17 +1034,35 @@ class SSDMetaArch(model.DetectionModel):
with rows of the Match objects corresponding to groundtruth boxes
and columns corresponding to anchors.
"""
avg_num_gt_boxes = tf.reduce_mean(tf.to_float(tf.stack(
[tf.shape(x)[0] for x in groundtruth_boxes_list])))
avg_num_matched_gt_boxes = tf.reduce_mean(tf.to_float(tf.stack(
[match.num_matched_rows() for match in match_list])))
avg_pos_anchors = tf.reduce_mean(tf.to_float(tf.stack(
[match.num_matched_columns() for match in match_list])))
avg_neg_anchors = tf.reduce_mean(tf.to_float(tf.stack(
[match.num_unmatched_columns() for match in match_list])))
avg_ignored_anchors = tf.reduce_mean(tf.to_float(tf.stack(
[match.num_ignored_columns() for match in match_list])))
avg_num_gt_boxes = tf.reduce_mean(
tf.cast(
tf.stack([tf.shape(x)[0] for x in groundtruth_boxes_list]),
dtype=tf.float32))
avg_num_matched_gt_boxes = tf.reduce_mean(
tf.cast(
tf.stack([match.num_matched_rows() for match in match_list]),
dtype=tf.float32))
avg_pos_anchors = tf.reduce_mean(
tf.cast(
tf.stack([match.num_matched_columns() for match in match_list]),
dtype=tf.float32))
avg_neg_anchors = tf.reduce_mean(
tf.cast(
tf.stack([match.num_unmatched_columns() for match in match_list]),
dtype=tf.float32))
avg_ignored_anchors = tf.reduce_mean(
tf.cast(
tf.stack([match.num_ignored_columns() for match in match_list]),
dtype=tf.float32))
# TODO(rathodv): Add a test for these summaries.
try:
# TODO(kaftan): Integrate these summaries into the v2 style loops
with tf.compat.v2.init_scope():
if tf.compat.v2.executing_eagerly():
return
except AttributeError:
pass
tf.summary.scalar('AvgNumGroundtruthBoxesPerImage',
avg_num_gt_boxes,
family='TargetAssignment')
......
......@@ -176,6 +176,9 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
]
] # padding
expected_scores = [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
expected_multiclass_scores = [[[0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
[[0, 0], [0, 0], [0, 0], [0, 0], [0, 0]]]
expected_classes = [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
expected_num_detections = np.array([3, 3])
......@@ -198,6 +201,7 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
detections = model.postprocess(prediction_dict, true_image_shapes)
self.assertIn('detection_boxes', detections)
self.assertIn('detection_scores', detections)
self.assertIn('detection_multiclass_scores', detections)
self.assertIn('detection_classes', detections)
self.assertIn('num_detections', detections)
self.assertIn('raw_detection_boxes', detections)
......@@ -217,6 +221,8 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
expected_boxes[image_idx]))
self.assertAllClose(detections_out['detection_scores'], expected_scores)
self.assertAllClose(detections_out['detection_classes'], expected_classes)
self.assertAllClose(detections_out['detection_multiclass_scores'],
expected_multiclass_scores)
self.assertAllClose(detections_out['num_detections'],
expected_num_detections)
self.assertAllEqual(detections_out['raw_detection_boxes'],
......@@ -235,7 +241,8 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
true_image_shapes)
detections = model.postprocess(prediction_dict, true_image_shapes)
return (detections['detection_boxes'], detections['detection_scores'],
detections['detection_classes'], detections['num_detections'])
detections['detection_classes'], detections['num_detections'],
detections['detection_multiclass_scores'])
batch_size = 2
image_size = 2
......@@ -257,11 +264,14 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
]
] # padding
expected_scores = [[0, 0, 0, 0], [0, 0, 0, 0]]
expected_multiclass_scores = [[[0, 0], [0, 0], [0, 0], [0, 0]],
[[0, 0], [0, 0], [0, 0], [0, 0]]]
expected_classes = [[0, 0, 0, 0], [0, 0, 0, 0]]
expected_num_detections = np.array([3, 3])
(detection_boxes, detection_scores, detection_classes,
num_detections) = self.execute(graph_fn, [input_image])
num_detections, detection_multiclass_scores) = self.execute(graph_fn,
[input_image])
for image_idx in range(batch_size):
self.assertTrue(test_utils.first_rows_close_as_set(
detection_boxes[image_idx][
......@@ -270,6 +280,11 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
self.assertAllClose(
detection_scores[image_idx][0:expected_num_detections[image_idx]],
expected_scores[image_idx][0:expected_num_detections[image_idx]])
self.assertAllClose(
detection_multiclass_scores[image_idx]
[0:expected_num_detections[image_idx]],
expected_multiclass_scores[image_idx]
[0:expected_num_detections[image_idx]])
self.assertAllClose(
detection_classes[image_idx][0:expected_num_detections[image_idx]],
expected_classes[image_idx][0:expected_num_detections[image_idx]])
......@@ -600,8 +615,8 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
with test_graph_detection.as_default():
model, _, _, _ = self._create_model(use_keras=use_keras)
inputs_shape = [2, 2, 2, 3]
inputs = tf.to_float(tf.random_uniform(
inputs_shape, minval=0, maxval=255, dtype=tf.int32))
inputs = tf.cast(tf.random_uniform(
inputs_shape, minval=0, maxval=255, dtype=tf.int32), dtype=tf.float32)
preprocessed_inputs, true_image_shapes = model.preprocess(inputs)
prediction_dict = model.predict(preprocessed_inputs, true_image_shapes)
model.postprocess(prediction_dict, true_image_shapes)
......@@ -620,8 +635,9 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
with test_graph_detection.as_default():
model, _, _, _ = self._create_model(use_keras=use_keras)
inputs_shape = [2, 2, 2, 3]
inputs = tf.to_float(
tf.random_uniform(inputs_shape, minval=0, maxval=255, dtype=tf.int32))
inputs = tf.cast(
tf.random_uniform(inputs_shape, minval=0, maxval=255, dtype=tf.int32),
dtype=tf.float32)
preprocessed_inputs, true_image_shapes = model.preprocess(inputs)
prediction_dict = model.predict(preprocessed_inputs, true_image_shapes)
model.postprocess(prediction_dict, true_image_shapes)
......
......@@ -98,13 +98,16 @@ def expected_calibration_error(y_true, y_pred, nbins=20):
with tf.control_dependencies([bin_ids]):
update_bin_counts_op = tf.assign_add(
bin_counts, tf.to_float(tf.bincount(bin_ids, minlength=nbins)))
bin_counts, tf.cast(tf.bincount(bin_ids, minlength=nbins),
dtype=tf.float32))
update_bin_true_sum_op = tf.assign_add(
bin_true_sum,
tf.to_float(tf.bincount(bin_ids, weights=y_true, minlength=nbins)))
tf.cast(tf.bincount(bin_ids, weights=y_true, minlength=nbins),
dtype=tf.float32))
update_bin_preds_sum_op = tf.assign_add(
bin_preds_sum,
tf.to_float(tf.bincount(bin_ids, weights=y_pred, minlength=nbins)))
tf.cast(tf.bincount(bin_ids, weights=y_pred, minlength=nbins),
dtype=tf.float32))
ece_update_op = _ece_from_bins(
update_bin_counts_op,
......
......@@ -216,29 +216,23 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
for key, value in iter(box_metrics.items())}
return box_metrics
def get_estimator_eval_metric_ops(self, eval_dict):
"""Returns a dictionary of eval metric ops.
def add_eval_dict(self, eval_dict):
"""Observes an evaluation result dict for a single example.
Note that once value_op is called, the detections and groundtruth added via
update_op are cleared.
    When executing eagerly, once all examples have been observed by this
    method you can use `.evaluate()` to get the final metrics.
This function can take in groundtruth and detections for a batch of images,
or for a single image. For the latter case, the batch dimension for input
tensors need not be present.
When using `tf.estimator.Estimator` for evaluation this function is used by
`get_estimator_eval_metric_ops()` to construct the metric update op.
Args:
eval_dict: A dictionary that holds tensors for evaluating object detection
performance. For single-image evaluation, this dictionary may be
produced from eval_util.result_dict_for_single_example(). If multi-image
evaluation, `eval_dict` should contain the fields
'num_groundtruth_boxes_per_image' and 'num_det_boxes_per_image' to
properly unpad the tensors from the batch.
eval_dict: A dictionary that holds tensors for evaluating an object
detection model, returned from
eval_util.result_dict_for_single_example().
Returns:
a dictionary of metric names to tuple of value_op and update_op that can
be used as eval metric ops in tf.estimator.EstimatorSpec. Note that all
update ops must be run together and similarly all value ops must be run
together to guarantee correct behaviour.
None when executing eagerly, or an update_op that can be used to update
the eval metrics in `tf.estimator.EstimatorSpec`.
"""
def update_op(
image_id_batched,
......@@ -328,16 +322,42 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
if is_annotated is None:
is_annotated = tf.ones_like(image_id, dtype=tf.bool)
update_op = tf.py_func(update_op, [image_id,
groundtruth_boxes,
groundtruth_classes,
groundtruth_is_crowd,
num_gt_boxes_per_image,
detection_boxes,
detection_scores,
detection_classes,
num_det_boxes_per_image,
is_annotated], [])
return tf.py_func(update_op, [image_id,
groundtruth_boxes,
groundtruth_classes,
groundtruth_is_crowd,
num_gt_boxes_per_image,
detection_boxes,
detection_scores,
detection_classes,
num_det_boxes_per_image,
is_annotated], [])
def get_estimator_eval_metric_ops(self, eval_dict):
"""Returns a dictionary of eval metric ops.
Note that once value_op is called, the detections and groundtruth added via
update_op are cleared.
This function can take in groundtruth and detections for a batch of images,
or for a single image. For the latter case, the batch dimension for input
tensors need not be present.
Args:
eval_dict: A dictionary that holds tensors for evaluating object detection
performance. For single-image evaluation, this dictionary may be
produced from eval_util.result_dict_for_single_example(). If multi-image
evaluation, `eval_dict` should contain the fields
'num_groundtruth_boxes_per_image' and 'num_det_boxes_per_image' to
properly unpad the tensors from the batch.
Returns:
a dictionary of metric names to tuple of value_op and update_op that can
be used as eval metric ops in tf.estimator.EstimatorSpec. Note that all
update ops must be run together and similarly all value ops must be run
together to guarantee correct behaviour.
"""
update_op = self.add_eval_dict(eval_dict)
metric_names = ['DetectionBoxes_Precision/mAP',
'DetectionBoxes_Precision/mAP@.50IOU',
'DetectionBoxes_Precision/mAP@.75IOU',
......
......@@ -14,6 +14,8 @@
# ==============================================================================
r"""Runs evaluation using OpenImages groundtruth and predictions.
Uses Open Images Challenge 2018 and 2019 metrics.
Example usage:
python models/research/object_detection/metrics/oid_od_challenge_evaluation.py \
--input_annotations_boxes=/path/to/input/annotations-human-bbox.csv \
......@@ -21,27 +23,50 @@ python models/research/object_detection/metrics/oid_od_challenge_evaluation.py \
--input_class_labelmap=/path/to/input/class_labelmap.pbtxt \
--input_predictions=/path/to/input/predictions.csv \
--output_metrics=/path/to/output/metric.csv \
--input_annotations_segm=[/path/to/input/annotations-human-mask.csv] \
If the optional flag input_annotations_segm is given, a Mask column is also expected in the predictions CSV.
CSVs with bounding box annotations and image label (including the image URLs)
CSVs with bounding box annotations, instance segmentations and image labels
can be downloaded from the Open Images Challenge website:
https://storage.googleapis.com/openimages/web/challenge.html
The format of the input CSVs and the metrics themselves are described on the
challenge website.
challenge website as well.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
from absl import app
from absl import flags
import pandas as pd
from google.protobuf import text_format
from object_detection.metrics import io_utils
from object_detection.metrics import oid_od_challenge_evaluation_utils as utils
from object_detection.metrics import oid_challenge_evaluation_utils as utils
from object_detection.protos import string_int_label_map_pb2
from object_detection.utils import object_detection_evaluation
flags.DEFINE_string('input_annotations_boxes', None,
'File with groundtruth boxes annotations.')
flags.DEFINE_string('input_annotations_labels', None,
'File with groundtruth labels annotations.')
flags.DEFINE_string(
'input_predictions', None,
"""File with detection predictions; NOTE: no postprocessing is applied in the evaluation script."""
)
flags.DEFINE_string('input_class_labelmap', None,
'Open Images Challenge labelmap.')
flags.DEFINE_string('output_metrics', None, 'Output file with csv metrics.')
flags.DEFINE_string(
'input_annotations_segm', None,
'File with groundtruth instance segmentation annotations [OPTIONAL].')
FLAGS = flags.FLAGS
def _load_labelmap(labelmap_path):
"""Loads labelmap from the labelmap path.
......@@ -66,26 +91,43 @@ def _load_labelmap(labelmap_path):
return labelmap_dict, categories
def main(parsed_args):
all_box_annotations = pd.read_csv(parsed_args.input_annotations_boxes)
all_label_annotations = pd.read_csv(parsed_args.input_annotations_labels)
def main(unused_argv):
flags.mark_flag_as_required('input_annotations_boxes')
flags.mark_flag_as_required('input_annotations_labels')
flags.mark_flag_as_required('input_predictions')
flags.mark_flag_as_required('input_class_labelmap')
flags.mark_flag_as_required('output_metrics')
all_location_annotations = pd.read_csv(FLAGS.input_annotations_boxes)
all_label_annotations = pd.read_csv(FLAGS.input_annotations_labels)
all_label_annotations.rename(
columns={'Confidence': 'ConfidenceImageLabel'}, inplace=True)
all_annotations = pd.concat([all_box_annotations, all_label_annotations])
class_label_map, categories = _load_labelmap(parsed_args.input_class_labelmap)
is_instance_segmentation_eval = False
if FLAGS.input_annotations_segm:
is_instance_segmentation_eval = True
all_segm_annotations = pd.read_csv(FLAGS.input_annotations_segm)
    # Note: this step is fragile, as it requires the floating point numbers in
    # both csvs to be exactly the same;
    # it will be replaced by a more stable solution: merge on LabelName and
    # ImageID and filter down by IoU.
all_location_annotations = utils.merge_boxes_and_masks(
all_location_annotations, all_segm_annotations)
all_annotations = pd.concat([all_location_annotations, all_label_annotations])
class_label_map, categories = _load_labelmap(FLAGS.input_class_labelmap)
challenge_evaluator = (
object_detection_evaluation.OpenImagesDetectionChallengeEvaluator(
categories))
object_detection_evaluation.OpenImagesChallengeEvaluator(
categories, evaluate_masks=is_instance_segmentation_eval))
for _, groundtruth in enumerate(all_annotations.groupby('ImageID')):
image_id, image_groundtruth = groundtruth
groundtruth_dictionary = utils.build_groundtruth_boxes_dictionary(
groundtruth_dictionary = utils.build_groundtruth_dictionary(
image_groundtruth, class_label_map)
challenge_evaluator.add_single_ground_truth_image_info(
image_id, groundtruth_dictionary)
all_predictions = pd.read_csv(parsed_args.input_predictions)
all_predictions = pd.read_csv(FLAGS.input_predictions)
for _, prediction_data in enumerate(all_predictions.groupby('ImageID')):
image_id, image_predictions = prediction_data
prediction_dictionary = utils.build_predictions_dictionary(
......@@ -95,34 +137,9 @@ def main(parsed_args):
metrics = challenge_evaluator.evaluate()
with open(parsed_args.output_metrics, 'w') as fid:
with open(FLAGS.output_metrics, 'w') as fid:
io_utils.write_csv(fid, metrics)
if __name__ == '__main__':
parser = argparse.ArgumentParser(
description='Evaluate Open Images Object Detection Challenge predictions.'
)
parser.add_argument(
'--input_annotations_boxes',
required=True,
help='File with groundtruth boxes annotations.')
parser.add_argument(
'--input_annotations_labels',
required=True,
help='File with groundtruth labels annotations')
parser.add_argument(
'--input_predictions',
required=True,
help="""File with detection predictions; NOTE: no postprocessing is
applied in the evaluation script.""")
parser.add_argument(
'--input_class_labelmap',
required=True,
help='Open Images Challenge labelmap.')
parser.add_argument(
'--output_metrics', required=True, help='Output file with csv metrics')
args = parser.parse_args()
main(args)
app.run(main)
......@@ -12,17 +12,92 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Converts data from CSV to the OpenImagesDetectionChallengeEvaluator format.
"""
r"""Converts data from CSV to the OpenImagesDetectionChallengeEvaluator format."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import pandas as pd
from pycocotools import mask
from object_detection.core import standard_fields
def build_groundtruth_boxes_dictionary(data, class_label_map):
def _to_normalized_box(mask_np):
"""Decodes binary segmentation masks into np.arrays and boxes.
Args:
mask_np: np.ndarray of size NxWxH.
Returns:
a np.ndarray of the size Nx4, each row containing normalized coordinates
[YMin, XMin, YMax, XMax] of a box computed of axis parallel enclosing box of
a mask.
"""
coord1, coord2 = np.nonzero(mask_np)
if coord1.size > 0:
ymin = float(min(coord1)) / mask_np.shape[0]
ymax = float(max(coord1) + 1) / mask_np.shape[0]
xmin = float(min(coord2)) / mask_np.shape[1]
xmax = float((max(coord2) + 1)) / mask_np.shape[1]
return np.array([ymin, xmin, ymax, xmax])
else:
return np.array([0.0, 0.0, 0.0, 0.0])
def _decode_raw_data_into_masks_and_boxes(segments, image_widths,
image_heights):
"""Decods binary segmentation masks into np.arrays and boxes.
Args:
segments: pandas Series object containing either None entries or strings
with COCO-encoded binary masks. All masks are expected to be the same size.
image_widths: pandas Series of mask widths.
image_heights: pandas Series of mask heights.
Returns:
a np.ndarray of the size NxWxH, where W and H is determined from the encoded
masks; for the None values, zero arrays of size WxH are created. if input
contains only None values, W=1, H=1.
"""
segment_masks = []
segment_boxes = []
ind = segments.first_valid_index()
if ind is not None:
    size = [int(image_heights[ind]), int(image_widths[ind])]
else:
# It does not matter which size we pick since no masks will ever be
# evaluated.
size = [1, 1]
for segment, im_width, im_height in zip(segments, image_widths,
image_heights):
if pd.isnull(segment):
segment_masks.append(np.zeros([1, size[0], size[1]], dtype=np.uint8))
segment_boxes.append(np.expand_dims(np.array([0.0, 0.0, 0.0, 0.0]), 0))
else:
encoding_dict = {'size': [im_height, im_width], 'counts': segment}
mask_tensor = mask.decode(encoding_dict)
segment_masks.append(np.expand_dims(mask_tensor, 0))
segment_boxes.append(np.expand_dims(_to_normalized_box(mask_tensor), 0))
return np.concatenate(
segment_masks, axis=0), np.concatenate(
segment_boxes, axis=0)
def merge_boxes_and_masks(box_data, mask_data):
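  """Merges bounding box and mask annotation DataFrames.
  Joins on the box coordinate columns, so a box row picks up the Mask,
  ImageWidth and ImageHeight columns of the mask row with identical
  coordinates.
  Args:
    box_data: pandas DataFrame with bounding box annotations.
    mask_data: pandas DataFrame with instance mask annotations.
  Returns:
    a pandas DataFrame with merged box and mask annotations.
  """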
return pd.merge(
box_data,
mask_data,
how='outer',
on=['LabelName', 'ImageID', 'XMin', 'XMax', 'YMin', 'YMax', 'IsGroupOf'])
def build_groundtruth_dictionary(data, class_label_map):
"""Builds a groundtruth dictionary from groundtruth data in CSV file.
Args:
......@@ -44,21 +119,31 @@ def build_groundtruth_boxes_dictionary(data, class_label_map):
M numpy boolean array denoting whether a groundtruth box contains a
group of instances.
"""
data_boxes = data[data.ConfidenceImageLabel.isnull()]
data_labels = data[data.XMin.isnull()]
data_location = data[data.XMin.notnull()]
data_labels = data[data.ConfidenceImageLabel.notnull()]
return {
dictionary = {
standard_fields.InputDataFields.groundtruth_boxes:
data_boxes[['YMin', 'XMin', 'YMax', 'XMax']].as_matrix(),
data_location[['YMin', 'XMin', 'YMax', 'XMax']].as_matrix(),
standard_fields.InputDataFields.groundtruth_classes:
data_boxes['LabelName'].map(lambda x: class_label_map[x]).as_matrix(),
data_location['LabelName'].map(lambda x: class_label_map[x]
).as_matrix(),
standard_fields.InputDataFields.groundtruth_group_of:
data_boxes['IsGroupOf'].as_matrix().astype(int),
data_location['IsGroupOf'].as_matrix().astype(int),
standard_fields.InputDataFields.groundtruth_image_classes:
data_labels['LabelName'].map(lambda x: class_label_map[x])
.as_matrix(),
data_labels['LabelName'].map(lambda x: class_label_map[x]
).as_matrix(),
}
if 'Mask' in data_location:
segments, _ = _decode_raw_data_into_masks_and_boxes(
data_location['Mask'], data_location['ImageWidth'],
data_location['ImageHeight'])
dictionary[
standard_fields.InputDataFields.groundtruth_instance_masks] = segments
return dictionary
def build_predictions_dictionary(data, class_label_map):
"""Builds a predictions dictionary from predictions data in CSV file.
......@@ -80,11 +165,21 @@ def build_predictions_dictionary(data, class_label_map):
the boxes.
"""
return {
standard_fields.DetectionResultFields.detection_boxes:
data[['YMin', 'XMin', 'YMax', 'XMax']].as_matrix(),
dictionary = {
standard_fields.DetectionResultFields.detection_classes:
data['LabelName'].map(lambda x: class_label_map[x]).as_matrix(),
standard_fields.DetectionResultFields.detection_scores:
data['Score'].as_matrix()
}
if 'Mask' in data:
segments, boxes = _decode_raw_data_into_masks_and_boxes(
data['Mask'], data['ImageWidth'], data['ImageHeight'])
dictionary[standard_fields.DetectionResultFields.detection_masks] = segments
dictionary[standard_fields.DetectionResultFields.detection_boxes] = boxes
else:
dictionary[standard_fields.DetectionResultFields.detection_boxes] = data[[
'YMin', 'XMin', 'YMax', 'XMax'
]].as_matrix()
return dictionary
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for oid_od_challenge_evaluation_util."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import pandas as pd
from pycocotools import mask
import tensorflow as tf
from object_detection.core import standard_fields
from object_detection.metrics import oid_challenge_evaluation_utils as utils
class OidUtilTest(tf.test.TestCase):
def testMaskToNormalizedBox(self):
mask_np = np.array([[0, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]])
box = utils._to_normalized_box(mask_np)
self.assertAllEqual(np.array([0.25, 0.25, 0.75, 0.5]), box)
mask_np = np.array([[0, 0, 0, 0], [0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 1, 1]])
box = utils._to_normalized_box(mask_np)
self.assertAllEqual(np.array([0.25, 0.25, 1.0, 1.0]), box)
mask_np = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
box = utils._to_normalized_box(mask_np)
self.assertAllEqual(np.array([0.0, 0.0, 0.0, 0.0]), box)
def testDecodeToTensors(self):
mask1 = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]], dtype=np.uint8)
mask2 = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], dtype=np.uint8)
encoding1 = mask.encode(np.asfortranarray(mask1))
encoding2 = mask.encode(np.asfortranarray(mask2))
vals = pd.Series([encoding1['counts'], encoding2['counts']])
image_widths = pd.Series([mask1.shape[1], mask2.shape[1]])
image_heights = pd.Series([mask1.shape[0], mask2.shape[0]])
segm, bbox = utils._decode_raw_data_into_masks_and_boxes(
vals, image_widths, image_heights)
expected_segm = np.concatenate(
[np.expand_dims(mask1, 0),
np.expand_dims(mask2, 0)], axis=0)
expected_bbox = np.array([[0.0, 0.5, 2.0 / 3.0, 1.0], [0, 0, 0, 0]])
self.assertAllEqual(expected_segm, segm)
self.assertAllEqual(expected_bbox, bbox)
class OidChallengeEvaluationUtilTest(tf.test.TestCase):
def testBuildGroundtruthDictionaryBoxes(self):
np_data = pd.DataFrame(
[['fe58ec1b06db2bb7', '/m/04bcr3', 0.0, 0.3, 0.5, 0.6, 1, None],
['fe58ec1b06db2bb7', '/m/02gy9n', 0.1, 0.2, 0.3, 0.4, 0, None],
['fe58ec1b06db2bb7', '/m/04bcr3', None, None, None, None, None, 1],
['fe58ec1b06db2bb7', '/m/083vt', None, None, None, None, None, 0],
['fe58ec1b06db2bb7', '/m/02gy9n', None, None, None, None, None, 1]],
columns=[
'ImageID', 'LabelName', 'XMin', 'XMax', 'YMin', 'YMax', 'IsGroupOf',
'ConfidenceImageLabel'
])
class_label_map = {'/m/04bcr3': 1, '/m/083vt': 2, '/m/02gy9n': 3}
groundtruth_dictionary = utils.build_groundtruth_dictionary(
np_data, class_label_map)
self.assertIn(standard_fields.InputDataFields.groundtruth_boxes,
groundtruth_dictionary)
self.assertIn(standard_fields.InputDataFields.groundtruth_classes,
groundtruth_dictionary)
self.assertIn(standard_fields.InputDataFields.groundtruth_group_of,
groundtruth_dictionary)
self.assertIn(standard_fields.InputDataFields.groundtruth_image_classes,
groundtruth_dictionary)
self.assertAllEqual(
np.array([1, 3]), groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_classes])
self.assertAllEqual(
np.array([1, 0]), groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_group_of])
expected_boxes_data = np.array([[0.5, 0.0, 0.6, 0.3], [0.3, 0.1, 0.4, 0.2]])
self.assertNDArrayNear(
expected_boxes_data, groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_boxes], 1e-5)
self.assertAllEqual(
np.array([1, 2, 3]), groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_image_classes])
def testBuildPredictionDictionaryBoxes(self):
np_data = pd.DataFrame(
[['fe58ec1b06db2bb7', '/m/04bcr3', 0.0, 0.3, 0.5, 0.6, 0.1],
['fe58ec1b06db2bb7', '/m/02gy9n', 0.1, 0.2, 0.3, 0.4, 0.2],
['fe58ec1b06db2bb7', '/m/04bcr3', 0.0, 0.1, 0.2, 0.3, 0.3]],
columns=[
'ImageID', 'LabelName', 'XMin', 'XMax', 'YMin', 'YMax', 'Score'
])
class_label_map = {'/m/04bcr3': 1, '/m/083vt': 2, '/m/02gy9n': 3}
prediction_dictionary = utils.build_predictions_dictionary(
np_data, class_label_map)
self.assertIn(standard_fields.DetectionResultFields.detection_boxes,
prediction_dictionary)
self.assertIn(standard_fields.DetectionResultFields.detection_classes,
prediction_dictionary)
self.assertIn(standard_fields.DetectionResultFields.detection_scores,
prediction_dictionary)
self.assertAllEqual(
np.array([1, 3, 1]), prediction_dictionary[
standard_fields.DetectionResultFields.detection_classes])
expected_boxes_data = np.array([[0.5, 0.0, 0.6, 0.3], [0.3, 0.1, 0.4, 0.2],
[0.2, 0.0, 0.3, 0.1]])
self.assertNDArrayNear(
expected_boxes_data, prediction_dictionary[
standard_fields.DetectionResultFields.detection_boxes], 1e-5)
self.assertNDArrayNear(
np.array([0.1, 0.2, 0.3]), prediction_dictionary[
standard_fields.DetectionResultFields.detection_scores], 1e-5)
def testBuildGroundtruthDictionaryMasks(self):
mask1 = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
dtype=np.uint8)
mask2 = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
dtype=np.uint8)
encoding1 = mask.encode(np.asfortranarray(mask1))
encoding2 = mask.encode(np.asfortranarray(mask2))
np_data = pd.DataFrame(
[[
'fe58ec1b06db2bb7', mask1.shape[1], mask1.shape[0], '/m/04bcr3',
0.0, 0.3, 0.5, 0.6, 0, None, encoding1['counts']
],
[
'fe58ec1b06db2bb7', None, None, '/m/02gy9n', 0.1, 0.2, 0.3, 0.4, 1,
None, None
],
[
'fe58ec1b06db2bb7', mask2.shape[1], mask2.shape[0], '/m/02gy9n',
0.5, 0.6, 0.8, 0.9, 0, None, encoding2['counts']
],
[
'fe58ec1b06db2bb7', None, None, '/m/04bcr3', None, None, None,
None, None, 1, None
],
[
'fe58ec1b06db2bb7', None, None, '/m/083vt', None, None, None, None,
None, 0, None
],
[
'fe58ec1b06db2bb7', None, None, '/m/02gy9n', None, None, None,
None, None, 1, None
]],
columns=[
'ImageID', 'ImageWidth', 'ImageHeight', 'LabelName', 'XMin', 'XMax',
'YMin', 'YMax', 'IsGroupOf', 'ConfidenceImageLabel', 'Mask'
])
class_label_map = {'/m/04bcr3': 1, '/m/083vt': 2, '/m/02gy9n': 3}
groundtruth_dictionary = utils.build_groundtruth_dictionary(
np_data, class_label_map)
self.assertIn(standard_fields.InputDataFields.groundtruth_boxes,
groundtruth_dictionary)
self.assertIn(standard_fields.InputDataFields.groundtruth_classes,
groundtruth_dictionary)
self.assertIn(standard_fields.InputDataFields.groundtruth_group_of,
groundtruth_dictionary)
self.assertIn(standard_fields.InputDataFields.groundtruth_image_classes,
groundtruth_dictionary)
self.assertIn(standard_fields.InputDataFields.groundtruth_instance_masks,
groundtruth_dictionary)
self.assertAllEqual(
np.array([1, 3, 3]), groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_classes])
self.assertAllEqual(
np.array([0, 1, 0]), groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_group_of])
expected_boxes_data = np.array([[0.5, 0.0, 0.6, 0.3], [0.3, 0.1, 0.4, 0.2],
[0.8, 0.5, 0.9, 0.6]])
self.assertNDArrayNear(
expected_boxes_data, groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_boxes], 1e-5)
self.assertAllEqual(
np.array([1, 2, 3]), groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_image_classes])
expected_segm = np.concatenate([
np.expand_dims(mask1, 0),
np.zeros((1, 4, 4), dtype=np.uint8),
np.expand_dims(mask2, 0)
],
axis=0)
self.assertAllEqual(
expected_segm, groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_instance_masks])
def testBuildPredictionDictionaryMasks(self):
mask1 = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
dtype=np.uint8)
mask2 = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
dtype=np.uint8)
encoding1 = mask.encode(np.asfortranarray(mask1))
encoding2 = mask.encode(np.asfortranarray(mask2))
np_data = pd.DataFrame(
[[
'fe58ec1b06db2bb7', mask1.shape[1], mask1.shape[0], '/m/04bcr3',
encoding1['counts'], 0.8
],
[
'fe58ec1b06db2bb7', mask2.shape[1], mask2.shape[0], '/m/02gy9n',
encoding2['counts'], 0.6
]],
columns=[
'ImageID', 'ImageWidth', 'ImageHeight', 'LabelName', 'Mask', 'Score'
])
class_label_map = {'/m/04bcr3': 1, '/m/02gy9n': 3}
prediction_dictionary = utils.build_predictions_dictionary(
np_data, class_label_map)
self.assertIn(standard_fields.DetectionResultFields.detection_boxes,
prediction_dictionary)
self.assertIn(standard_fields.DetectionResultFields.detection_classes,
prediction_dictionary)
self.assertIn(standard_fields.DetectionResultFields.detection_scores,
prediction_dictionary)
self.assertIn(standard_fields.DetectionResultFields.detection_masks,
prediction_dictionary)
self.assertAllEqual(
np.array([1, 3]), prediction_dictionary[
standard_fields.DetectionResultFields.detection_classes])
expected_boxes_data = np.array([[0.0, 0.5, 0.5, 1.0], [0, 0, 0, 0]])
self.assertNDArrayNear(
expected_boxes_data, prediction_dictionary[
standard_fields.DetectionResultFields.detection_boxes], 1e-5)
self.assertNDArrayNear(
np.array([0.8, 0.6]), prediction_dictionary[
standard_fields.DetectionResultFields.detection_scores], 1e-5)
expected_segm = np.concatenate(
[np.expand_dims(mask1, 0),
np.expand_dims(mask2, 0)], axis=0)
self.assertAllEqual(
expected_segm, prediction_dictionary[
standard_fields.DetectionResultFields.detection_masks])
if __name__ == '__main__':
tf.test.main()
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for oid_od_challenge_evaluation_util."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import pandas as pd
import tensorflow as tf
from object_detection.core import standard_fields
from object_detection.metrics import oid_od_challenge_evaluation_utils as utils
class OidOdChallengeEvaluationUtilTest(tf.test.TestCase):
def testBuildGroundtruthDictionary(self):
np_data = pd.DataFrame(
[['fe58ec1b06db2bb7', '/m/04bcr3', 0.0, 0.3, 0.5, 0.6, 1, None], [
'fe58ec1b06db2bb7', '/m/02gy9n', 0.1, 0.2, 0.3, 0.4, 0, None
], ['fe58ec1b06db2bb7', '/m/04bcr3', None, None, None, None, None, 1], [
'fe58ec1b06db2bb7', '/m/083vt', None, None, None, None, None, 0
], ['fe58ec1b06db2bb7', '/m/02gy9n', None, None, None, None, None, 1]],
columns=[
'ImageID', 'LabelName', 'XMin', 'XMax', 'YMin', 'YMax', 'IsGroupOf',
'ConfidenceImageLabel'
])
class_label_map = {'/m/04bcr3': 1, '/m/083vt': 2, '/m/02gy9n': 3}
groundtruth_dictionary = utils.build_groundtruth_boxes_dictionary(
np_data, class_label_map)
self.assertTrue(standard_fields.InputDataFields.groundtruth_boxes in
groundtruth_dictionary)
self.assertTrue(standard_fields.InputDataFields.groundtruth_classes in
groundtruth_dictionary)
self.assertTrue(standard_fields.InputDataFields.groundtruth_group_of in
groundtruth_dictionary)
self.assertTrue(standard_fields.InputDataFields.groundtruth_image_classes in
groundtruth_dictionary)
self.assertAllEqual(
np.array([1, 3]), groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_classes])
self.assertAllEqual(
np.array([1, 0]), groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_group_of])
expected_boxes_data = np.array([[0.5, 0.0, 0.6, 0.3], [0.3, 0.1, 0.4, 0.2]])
self.assertNDArrayNear(
expected_boxes_data, groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_boxes], 1e-5)
self.assertAllEqual(
np.array([1, 2, 3]), groundtruth_dictionary[
standard_fields.InputDataFields.groundtruth_image_classes])
def testBuildPredictionDictionary(self):
np_data = pd.DataFrame(
[['fe58ec1b06db2bb7', '/m/04bcr3', 0.0, 0.3, 0.5, 0.6, 0.1], [
'fe58ec1b06db2bb7', '/m/02gy9n', 0.1, 0.2, 0.3, 0.4, 0.2
], ['fe58ec1b06db2bb7', '/m/04bcr3', 0.0, 0.1, 0.2, 0.3, 0.3]],
columns=[
'ImageID', 'LabelName', 'XMin', 'XMax', 'YMin', 'YMax', 'Score'
])
class_label_map = {'/m/04bcr3': 1, '/m/083vt': 2, '/m/02gy9n': 3}
prediction_dictionary = utils.build_predictions_dictionary(
np_data, class_label_map)
self.assertTrue(standard_fields.DetectionResultFields.detection_boxes in
prediction_dictionary)
self.assertTrue(standard_fields.DetectionResultFields.detection_classes in
prediction_dictionary)
self.assertTrue(standard_fields.DetectionResultFields.detection_scores in
prediction_dictionary)
self.assertAllEqual(
np.array([1, 3, 1]), prediction_dictionary[
standard_fields.DetectionResultFields.detection_classes])
expected_boxes_data = np.array([[0.5, 0.0, 0.6, 0.3], [0.3, 0.1, 0.4, 0.2],
[0.2, 0.0, 0.3, 0.1]])
self.assertNDArrayNear(
expected_boxes_data, prediction_dictionary[
standard_fields.DetectionResultFields.detection_boxes], 1e-5)
self.assertNDArrayNear(
np.array([0.1, 0.2, 0.3]), prediction_dictionary[
standard_fields.DetectionResultFields.detection_scores], 1e-5)
if __name__ == '__main__':
tf.test.main()
......@@ -17,7 +17,7 @@ r"""Runs evaluation using OpenImages groundtruth and predictions.
Example usage:
python \
models/research/object_detection/metrics/oid_vrd_challenge_evaluation.py \
--input_annotations_boxes=/path/to/input/annotations-human-bbox.csv \
--input_annotations_vrd=/path/to/input/annotations-human-bbox.csv \
--input_annotations_labels=/path/to/input/annotations-label.csv \
--input_class_labelmap=/path/to/input/class_labelmap.pbtxt \
--input_relationship_labelmap=/path/to/input/relationship_labelmap.pbtxt \
......@@ -126,7 +126,7 @@ if __name__ == '__main__':
description=
'Evaluate Open Images Visual Relationship Detection predictions.')
parser.add_argument(
'--input_annotations_boxes',
'--input_annotations_vrd',
required=True,
help='File with groundtruth vrd annotations.')
parser.add_argument(
......