Unverified commit 9bbf8015 authored by pkulzc, committed by GitHub

Merged commit includes the following changes (#6932):

250447559  by Zhichao Lu:

    Update the expected file format for the Instance Segmentation challenge:
    - add ImageWidth and ImageHeight fields and store the values per prediction
    - store each mask only as an encoded image and assume its size is ImageWidth x ImageHeight (a hypothetical row is sketched below)
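
    A hypothetical row under the new layout (a minimal sketch; every column
    name except ImageWidth and ImageHeight is an assumption, not taken from
    this commit):

        prediction = {
            'ImageID': '0001eeaf4aed83f9',  # hypothetical image id
            'ImageWidth': 1024,             # now stored per prediction
            'ImageHeight': 768,             # now stored per prediction
            'LabelName': '/m/01g317',       # hypothetical label
            'Score': 0.87,
            # Only the encoded mask image is stored; its size is assumed
            # to be ImageWidth x ImageHeight.
            'Mask': '<base64-encoded mask image>',
        }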

--
250402780  by rathodv:

    Fix failing Mask R-CNN TPU convergence test.

    Cast second stage prediction tensors from bfloat16 to float32 to prevent errors in the third target assignment (mask prediction): concatenating tensors of different types (bfloat16 and float32) isn't allowed.
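
    A minimal sketch of the failure mode and the fix (tensor names and shapes
    are illustrative):

        import tensorflow as tf

        box_features = tf.zeros([8, 4], dtype=tf.bfloat16)  # second stage output
        class_features = tf.zeros([8, 2], dtype=tf.float32)

        # tf.concat([box_features, class_features], axis=1) would fail:
        # concat requires a single dtype. Casting to float32 first avoids it.
        box_features = tf.cast(box_features, dtype=tf.float32)
        merged = tf.concat([box_features, class_features], axis=1)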

--
250300240  by Zhichao Lu:

    Add Open Images Challenge 2019 object detection and instance segmentation
    support to the Estimator framework.

--
249944839  by rathodv:

    Modify exporter.py to add multiclass score nodes in exported inference graphs.
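
    Based on the exporter tests in this diff, the new output is addressable by
    node name; a minimal runnable sketch (the graph contents here are
    illustrative):

        import tensorflow as tf

        graph = tf.Graph()
        with graph.as_default():
          # The exporter wraps the output in tf.identity under this exact name.
          tf.identity(tf.zeros([1, 10, 3]), name='detection_multiclass_scores')
        scores = graph.get_tensor_by_name('detection_multiclass_scores:0')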

--
249935201  by rathodv:

    Modify postprocess methods to preserve multiclass scores after non max suppression.
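
    The underlying idea, as a hedged sketch (illustrative shapes and names,
    not the library's actual postprocess signature): gather the full per-class
    score distributions using the indices kept by NMS:

        import tensorflow as tf

        boxes = tf.random.uniform([100, 4])
        scores = tf.random.uniform([100])                # per-box max class score
        multiclass_scores = tf.random.uniform([100, 3])  # includes background

        kept = tf.image.non_max_suppression(boxes, scores, max_output_size=20)
        kept_multiclass_scores = tf.gather(multiclass_scores, kept)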

--
249878079  by Zhichao Lu:

    This CL slightly refactors some Object Detection helper functions for data creation, evaluation, and groundtruth providing.

    This will allow the eager+function custom loops to share code with the existing estimator training loops.

    Concretely we make the following changes:
    1. In input creation, we separate dataset creation into top-level helpers and allow them to optionally accept a pre-constructed model directly, instead of always creating a model from the config just for feature preprocessing.

    2. In COCO evaluation, we split the update_op creation into its own function, which the custom loops will call directly.

    3. In model_lib, we move ground-truth providing and data-structure munging into a helper function.

    4. For now, we put an escape hatch in `_summarize_target_assignment` when executing with TF 2.0 behavior, because the summary APIs it uses only work with TF 1.x (a sketch follows this list).
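
    A minimal sketch of the escape hatch in (4), assuming a
    tf.executing_eagerly() guard; the real check may differ:

        import tensorflow as tf

        def summarize_target_assignment(groundtruth_boxes_list):
          if tf.executing_eagerly():
            # The TF 1.x summary API below does not work under v2 behavior.
            return
          for i, boxes in enumerate(groundtruth_boxes_list):
            tf.summary.scalar('num_groundtruth_boxes_%d' % i,
                              tf.cast(tf.shape(boxes)[0], tf.float32))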

--
249673507  by rathodv:

    Use explicit casts instead of tf.to_float and tf.to_int32 to avoid warnings.
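
    The mechanical pattern applied throughout the diff below, in short:

        import tensorflow as tf

        x = tf.constant([1, 2, 3])
        y = tf.cast(x, dtype=tf.float32)  # instead of tf.to_float(x)
        z = tf.cast(y, dtype=tf.int32)    # instead of tf.to_int32(y)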

--
249656006  by Zhichao Lu:

    Add named "raw_keypoint_locations" node that corresponds with the "raw_box_locations" node.

--
249651674  by rathodv:

    Keep proposal boxes in float format. MatMulCropAndResize can handle that type even when the features themselves are bfloat16.

--
249568633  by rathodv:

    Support q > 1 in class agnostic NMS.
    Break post_processing_test.py into 3 separate files to avoid linter errors.

--
249535530  by rathodv:

    Update some deprecated arguments to tf ops.

--
249368223  by rathodv:

    Modify MatMulCropAndResize to use the MultiLevelRoIAlign method and move the tests to the spatial_transform_ops.py module.

    This CL establishes that CropAndResize and RoIAlign are equivalent and differ only in the sampling point grid within the boxes. CropAndResize uses a uniform size x size point grid whose corner points exactly overlap the box corners, while RoIAlign divides each box into size x size cells and uses their centers as sampling points (the difference is sketched below). In this CL, we switch MatMulCropAndResize to the MultiLevelRoIAlign implementation with the `align_corner` option, as it is more memory-efficient on TPU than the original MatMulCropAndResize.
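
    A worked one-dimensional sketch of the two grids, for a box spanning
    [y0, y1] with `size` sampling points; the formulas are our reading of the
    description above, not lifted from the library:

        import numpy as np

        y0, y1, size = 0.0, 1.0, 4

        # CropAndResize: uniform grid whose end points coincide with the box
        # corners (the `align_corner` behavior).
        crop_and_resize = y0 + (y1 - y0) * np.arange(size) / (size - 1)

        # RoIAlign: split the box into `size` cells, sample each cell center.
        roi_align = y0 + (y1 - y0) * (np.arange(size) + 0.5) / size

        print(crop_and_resize)  # [0.         0.33333333 0.66666667 1.        ]
        print(roi_align)        # [0.125 0.375 0.625 0.875]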

--
249337338  by chowdhery:

    Add class-agnostic non-max suppression in post_processing.
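
    A hedged sketch of the idea (the post_processing entry point has a richer
    signature): boxes compete across classes via their best class score, and
    suppression ignores class labels:

        import tensorflow as tf

        boxes = tf.random.uniform([100, 4])
        class_scores = tf.random.uniform([100, 3])

        best_scores = tf.reduce_max(class_scores, axis=-1)
        best_classes = tf.argmax(class_scores, axis=-1)
        kept = tf.image.non_max_suppression(
            boxes, best_scores, max_output_size=20, iou_threshold=0.6)
        nmsed_boxes = tf.gather(boxes, kept)
        nmsed_classes = tf.gather(best_classes, kept)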

--
249139196  by Zhichao Lu:

    Fix a positional argument bug in export_tflite_ssd_graph.

--
249120219  by Zhichao Lu:

    Add evaluator for computing precision limited to a given recall range.
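
    The metric's intent, sketched on a precomputed precision/recall curve
    (our reading; the evaluator's exact interpolation may differ):

        import numpy as np

        precision = np.array([1.0, 0.9, 0.8, 0.6, 0.5])
        recall = np.array([0.1, 0.3, 0.5, 0.7, 0.9])

        lower, upper = 0.2, 0.6
        in_range = (recall >= lower) & (recall <= upper)
        # Precision averaged over only the recall band [0.2, 0.6] -> 0.85.
        print(precision[in_range].mean())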

--
249030593  by Zhichao Lu:

    Add an evaluation util to run the segmentation and detection challenge evaluation.

--
248554358  by Zhichao Lu:

    This change contains the auxiliary changes required for TF 2.0-style training with eager + functions + distribution strategy loops, but not the loops themselves.

    It includes:
    - Updates shape usage to support both TensorShape v1 and TensorShape v2
    - A fix to FreezableBatchNorm so it does not override the `training` arg in `call` when `None` was passed to the constructor (not an issue in the estimator loops, but it was in the custom loops)
    - Puts some constants in init_scope so they work in eager + functions
    - Makes learning rate schedules return a callable in eager mode, required so they update when the global_step changes (see the sketch after this list)
    - Makes DetectionModel a tf.Module so it tracks variables (e.g. ones nested in layers)
    - Removes some references to `op.name` in some losses and replaces them with explicit names
    - A small part of the change needed to allow the COCO evaluation metrics to work in eager mode
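
    A minimal sketch of the learning-rate-as-callable pattern from the list
    above, assuming TF 2.x behavior (names are illustrative):

        import tensorflow as tf

        global_step = tf.Variable(0, trainable=False, dtype=tf.int64)

        def learning_rate_fn():
          # Re-evaluated on every call, so it tracks the current global_step.
          return 0.1 * tf.pow(0.95, tf.cast(global_step, tf.float32))

        optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate_fn)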

--
248271226  by rathodv:

    Add MultiLevel RoIAlign op.

--
248229103  by rathodv:

    Add functions to 1) pad feature maps and 2) ravel 5-D indices.
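
    The raveling arithmetic, as a sketch (the helper's actual name and
    signature are not shown in this summary):

        import tensorflow as tf

        def ravel_indices(indices, dims):
          # indices: [N, 5] coordinates into a tensor of shape `dims`;
          # returns the [N] flat offsets into the raveled tensor.
          multipliers = tf.math.cumprod(dims, exclusive=True, reverse=True)
          return tf.reduce_sum(indices * multipliers, axis=-1)

        dims = tf.constant([2, 3, 4, 5, 6])
        idx = tf.constant([[1, 2, 3, 4, 5]])
        # 1*360 + 2*120 + 3*30 + 4*6 + 5*1 = 719, the last flat offset.
        flat = ravel_indices(idx, dims)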

--
248206769  by rathodv:

    Add utilities needed to introduce RoI Align op.

--
248177733  by pengchong:

    Internal changes

--
247742582  by Zhichao Lu:

    Open Images Challenge 2019 instance segmentation metric: part 2

--
247525401  by Zhichao Lu:

    Update comments on max_classes_per_detection.

--
247520753  by rathodv:

    Add a multilevel crop and resize operation that builds on top of matmul_crop_and_resize.
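
    Multilevel here usually means routing each box to a feature-pyramid level
    by scale before crop-and-resize; a sketch of the standard FPN heuristic
    (whether this op uses exactly this rule is an assumption):

        import tensorflow as tf

        def assign_boxes_to_levels(boxes, min_level=2, max_level=5,
                                   canonical_size=224.0):
          # boxes: [N, 4] in absolute [ymin, xmin, ymax, xmax] pixels (assumed).
          scales = tf.sqrt((boxes[:, 2] - boxes[:, 0]) *
                           (boxes[:, 3] - boxes[:, 1]))
          levels = tf.floor(
              4.0 + tf.math.log(scales / canonical_size) / tf.math.log(2.0))
          return tf.clip_by_value(tf.cast(levels, tf.int32),
                                  min_level, max_level)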

--
247391600  by Zhichao Lu:

    Open Images Challenge 2019 instance segmentation metric

--
247325813  by chowdhery:

    Quantized MobileNet v2 SSD FPNLite config with depth multiplier 0.75

--

PiperOrigin-RevId: 250447559
parent f42fddee
@@ -55,12 +55,24 @@ a handful of auxiliary annotations associated with each bounding box, namely,
instance masks and keypoints.
"""
import abc
import tensorflow as tf
from object_detection.core import standard_fields as fields
class DetectionModel(object):
"""Abstract base class for detection models."""
# If using a new enough version of TensorFlow, detection models should be a
# tf module or keras model for tracking.
try:
_BaseClass = tf.Module
except AttributeError:
_BaseClass = object
class DetectionModel(_BaseClass):
"""Abstract base class for detection models.
Extends tf.Module to guarantee variable tracking.
"""
__metaclass__ = abc.ABCMeta
def __init__(self, num_classes):
@@ -55,7 +55,7 @@ def prefetch(tensor_dict, capacity):
enqueue_op = prefetch_queue.enqueue(tensor_dict)
tf.train.queue_runner.add_queue_runner(tf.train.queue_runner.QueueRunner(
prefetch_queue, [enqueue_op]))
tf.summary.scalar('queue/%s/fraction_of_%d_full' % (prefetch_queue.name,
capacity),
tf.to_float(prefetch_queue.size()) * (1. / capacity))
tf.summary.scalar(
'queue/%s/fraction_of_%d_full' % (prefetch_queue.name, capacity),
tf.cast(prefetch_queue.size(), dtype=tf.float32) * (1. / capacity))
return prefetch_queue
@@ -261,7 +261,7 @@ def normalize_image(image, original_minval, original_maxval, target_minval,
original_maxval = float(original_maxval)
target_minval = float(target_minval)
target_maxval = float(target_maxval)
image = tf.to_float(image)
image = tf.cast(image, dtype=tf.float32)
image = tf.subtract(image, original_minval)
image = tf.multiply(image, (target_maxval - target_minval) /
(original_maxval - original_minval))
@@ -810,10 +810,12 @@ def random_image_scale(image,
generator_func, preprocessor_cache.PreprocessorCache.IMAGE_SCALE,
preprocess_vars_cache)
image_newysize = tf.to_int32(
tf.multiply(tf.to_float(image_height), size_coef))
image_newxsize = tf.to_int32(
tf.multiply(tf.to_float(image_width), size_coef))
image_newysize = tf.cast(
tf.multiply(tf.cast(image_height, dtype=tf.float32), size_coef),
dtype=tf.int32)
image_newxsize = tf.cast(
tf.multiply(tf.cast(image_width, dtype=tf.float32), size_coef),
dtype=tf.int32)
image = tf.image.resize_images(
image, [image_newysize, image_newxsize], align_corners=True)
result.append(image)
@@ -1237,7 +1239,7 @@ def _strict_random_crop_image(image,
new_image.set_shape([None, None, image.get_shape()[2]])
# [1, 4]
im_box_rank2 = tf.squeeze(im_box, squeeze_dims=[0])
im_box_rank2 = tf.squeeze(im_box, axis=[0])
# [4]
im_box_rank1 = tf.squeeze(im_box)
@@ -1555,13 +1557,15 @@ def random_pad_image(image,
new_image += image_color_padded
# setting boxes
new_window = tf.to_float(
new_window = tf.cast(
tf.stack([
-offset_height, -offset_width, target_height - offset_height,
target_width - offset_width
]))
new_window /= tf.to_float(
tf.stack([image_height, image_width, image_height, image_width]))
]),
dtype=tf.float32)
new_window /= tf.cast(
tf.stack([image_height, image_width, image_height, image_width]),
dtype=tf.float32)
boxlist = box_list.BoxList(boxes)
new_boxlist = box_list_ops.change_coordinate_frame(boxlist, new_window)
new_boxes = new_boxlist.get()
@@ -1616,8 +1620,8 @@ def random_absolute_pad_image(image,
form.
"""
min_image_size = tf.shape(image)[:2]
max_image_size = min_image_size + tf.to_int32(
[max_height_padding, max_width_padding])
max_image_size = min_image_size + tf.cast(
[max_height_padding, max_width_padding], dtype=tf.int32)
return random_pad_image(image, boxes, min_image_size=min_image_size,
max_image_size=max_image_size, pad_color=pad_color,
seed=seed,
@@ -1723,12 +1727,14 @@ def random_crop_pad_image(image,
cropped_image, cropped_boxes, cropped_labels = result[:3]
min_image_size = tf.to_int32(
tf.to_float(tf.stack([image_height, image_width])) *
min_padded_size_ratio)
max_image_size = tf.to_int32(
tf.to_float(tf.stack([image_height, image_width])) *
max_padded_size_ratio)
min_image_size = tf.cast(
tf.cast(tf.stack([image_height, image_width]), dtype=tf.float32) *
min_padded_size_ratio,
dtype=tf.int32)
max_image_size = tf.cast(
tf.cast(tf.stack([image_height, image_width]), dtype=tf.float32) *
max_padded_size_ratio,
dtype=tf.int32)
padded_image, padded_boxes = random_pad_image(
cropped_image,
@@ -1840,16 +1846,23 @@ def random_crop_to_aspect_ratio(image,
image_shape = tf.shape(image)
orig_height = image_shape[0]
orig_width = image_shape[1]
orig_aspect_ratio = tf.to_float(orig_width) / tf.to_float(orig_height)
orig_aspect_ratio = tf.cast(
orig_width, dtype=tf.float32) / tf.cast(
orig_height, dtype=tf.float32)
new_aspect_ratio = tf.constant(aspect_ratio, dtype=tf.float32)
def target_height_fn():
return tf.to_int32(tf.round(tf.to_float(orig_width) / new_aspect_ratio))
return tf.cast(
tf.round(tf.cast(orig_width, dtype=tf.float32) / new_aspect_ratio),
dtype=tf.int32)
target_height = tf.cond(orig_aspect_ratio >= new_aspect_ratio,
lambda: orig_height, target_height_fn)
def target_width_fn():
return tf.to_int32(tf.round(tf.to_float(orig_height) * new_aspect_ratio))
return tf.cast(
tf.round(tf.cast(orig_height, dtype=tf.float32) * new_aspect_ratio),
dtype=tf.int32)
target_width = tf.cond(orig_aspect_ratio <= new_aspect_ratio,
lambda: orig_width, target_width_fn)
@@ -1870,10 +1883,14 @@ def random_crop_to_aspect_ratio(image,
image, offset_height, offset_width, target_height, target_width)
im_box = tf.stack([
tf.to_float(offset_height) / tf.to_float(orig_height),
tf.to_float(offset_width) / tf.to_float(orig_width),
tf.to_float(offset_height + target_height) / tf.to_float(orig_height),
tf.to_float(offset_width + target_width) / tf.to_float(orig_width)
tf.cast(offset_height, dtype=tf.float32) /
tf.cast(orig_height, dtype=tf.float32),
tf.cast(offset_width, dtype=tf.float32) /
tf.cast(orig_width, dtype=tf.float32),
tf.cast(offset_height + target_height, dtype=tf.float32) /
tf.cast(orig_height, dtype=tf.float32),
tf.cast(offset_width + target_width, dtype=tf.float32) /
tf.cast(orig_width, dtype=tf.float32)
])
boxlist = box_list.BoxList(boxes)
@@ -1996,8 +2013,8 @@ def random_pad_to_aspect_ratio(image,
with tf.name_scope('RandomPadToAspectRatio', values=[image]):
image_shape = tf.shape(image)
image_height = tf.to_float(image_shape[0])
image_width = tf.to_float(image_shape[1])
image_height = tf.cast(image_shape[0], dtype=tf.float32)
image_width = tf.cast(image_shape[1], dtype=tf.float32)
image_aspect_ratio = image_width / image_height
new_aspect_ratio = tf.constant(aspect_ratio, dtype=tf.float32)
target_height = tf.cond(
@@ -2034,7 +2051,8 @@ def random_pad_to_aspect_ratio(image,
target_width = tf.round(scale * target_width)
new_image = tf.image.pad_to_bounding_box(
image, 0, 0, tf.to_int32(target_height), tf.to_int32(target_width))
image, 0, 0, tf.cast(target_height, dtype=tf.int32),
tf.cast(target_width, dtype=tf.int32))
im_box = tf.stack([
0.0,
@@ -2050,9 +2068,9 @@ def random_pad_to_aspect_ratio(image,
if masks is not None:
new_masks = tf.expand_dims(masks, -1)
new_masks = tf.image.pad_to_bounding_box(new_masks, 0, 0,
tf.to_int32(target_height),
tf.to_int32(target_width))
new_masks = tf.image.pad_to_bounding_box(
new_masks, 0, 0, tf.cast(target_height, dtype=tf.int32),
tf.cast(target_width, dtype=tf.int32))
new_masks = tf.squeeze(new_masks, [-1])
result.append(new_masks)
@@ -2106,10 +2124,12 @@ def random_black_patches(image,
image_shape = tf.shape(image)
image_height = image_shape[0]
image_width = image_shape[1]
box_size = tf.to_int32(
box_size = tf.cast(
tf.multiply(
tf.minimum(tf.to_float(image_height), tf.to_float(image_width)),
size_to_image_ratio))
tf.minimum(
tf.cast(image_height, dtype=tf.float32),
tf.cast(image_width, dtype=tf.float32)), size_to_image_ratio),
dtype=tf.int32)
generator_func = functools.partial(tf.random_uniform, [], minval=0.0,
maxval=(1.0 - size_to_image_ratio),
@@ -2123,8 +2143,12 @@ def random_black_patches(image,
preprocessor_cache.PreprocessorCache.ADD_BLACK_PATCH,
preprocess_vars_cache, key=str(idx) + 'x')
y_min = tf.to_int32(normalized_y_min * tf.to_float(image_height))
x_min = tf.to_int32(normalized_x_min * tf.to_float(image_width))
y_min = tf.cast(
normalized_y_min * tf.cast(image_height, dtype=tf.float32),
dtype=tf.int32)
x_min = tf.cast(
normalized_x_min * tf.cast(image_width, dtype=tf.float32),
dtype=tf.int32)
black_box = tf.ones([box_size, box_size, 3], dtype=tf.float32)
mask = 1.0 - tf.image.pad_to_bounding_box(black_box, y_min, x_min,
image_height, image_width)
@@ -2156,7 +2180,7 @@ def image_to_float(image):
image: image in tf.float32 format.
"""
with tf.name_scope('ImageToFloat', values=[image]):
image = tf.to_float(image)
image = tf.cast(image, dtype=tf.float32)
return image
@@ -2342,10 +2366,12 @@ def resize_to_min_dimension(image, masks=None, min_dimension=600,
(image_height, image_width, num_channels) = _get_image_info(image)
min_image_dimension = tf.minimum(image_height, image_width)
min_target_dimension = tf.maximum(min_image_dimension, min_dimension)
target_ratio = tf.to_float(min_target_dimension) / tf.to_float(
min_image_dimension)
target_height = tf.to_int32(tf.to_float(image_height) * target_ratio)
target_width = tf.to_int32(tf.to_float(image_width) * target_ratio)
target_ratio = tf.cast(min_target_dimension, dtype=tf.float32) / tf.cast(
min_image_dimension, dtype=tf.float32)
target_height = tf.cast(
tf.cast(image_height, dtype=tf.float32) * target_ratio, dtype=tf.int32)
target_width = tf.cast(
tf.cast(image_width, dtype=tf.float32) * target_ratio, dtype=tf.int32)
image = tf.image.resize_images(
tf.expand_dims(image, axis=0), size=[target_height, target_width],
method=method,
@@ -2398,10 +2424,12 @@ def resize_to_max_dimension(image, masks=None, max_dimension=600,
(image_height, image_width, num_channels) = _get_image_info(image)
max_image_dimension = tf.maximum(image_height, image_width)
max_target_dimension = tf.minimum(max_image_dimension, max_dimension)
target_ratio = tf.to_float(max_target_dimension) / tf.to_float(
max_image_dimension)
target_height = tf.to_int32(tf.to_float(image_height) * target_ratio)
target_width = tf.to_int32(tf.to_float(image_width) * target_ratio)
target_ratio = tf.cast(max_target_dimension, dtype=tf.float32) / tf.cast(
max_image_dimension, dtype=tf.float32)
target_height = tf.cast(
tf.cast(image_height, dtype=tf.float32) * target_ratio, dtype=tf.int32)
target_width = tf.cast(
tf.cast(image_width, dtype=tf.float32) * target_ratio, dtype=tf.int32)
image = tf.image.resize_images(
tf.expand_dims(image, axis=0), size=[target_height, target_width],
method=method,
@@ -2639,11 +2667,11 @@ def random_self_concat_image(
if axis == 0:
# Concat vertically, so need to reduce the y coordinates.
old_scaling = tf.to_float([0.5, 1.0, 0.5, 1.0])
new_translation = tf.to_float([0.5, 0.0, 0.5, 0.0])
old_scaling = tf.constant([0.5, 1.0, 0.5, 1.0])
new_translation = tf.constant([0.5, 0.0, 0.5, 0.0])
elif axis == 1:
old_scaling = tf.to_float([1.0, 0.5, 1.0, 0.5])
new_translation = tf.to_float([0.0, 0.5, 0.0, 0.5])
old_scaling = tf.constant([1.0, 0.5, 1.0, 0.5])
new_translation = tf.constant([0.0, 0.5, 0.0, 0.5])
old_boxes = old_scaling * boxes
new_boxes = old_boxes + new_translation
@@ -795,8 +795,8 @@ class PreprocessorTest(tf.test.TestCase):
images = self.createTestImages()
tensor_dict = {fields.InputDataFields.image: images}
tensor_dict = preprocessor.preprocess(tensor_dict, preprocessing_options)
images_min = tf.to_float(images) * 0.9 / 255.0
images_max = tf.to_float(images) * 1.1 / 255.0
images_min = tf.cast(images, dtype=tf.float32) * 0.9 / 255.0
images_max = tf.cast(images, dtype=tf.float32) * 1.1 / 255.0
images = tensor_dict[fields.InputDataFields.image]
values_greater = tf.greater_equal(images, images_min)
values_less = tf.less_equal(images, images_max)
@@ -858,20 +858,26 @@ class PreprocessorTest(tf.test.TestCase):
value=images_gray, num_or_size_splits=3, axis=3)
images_r, images_g, images_b = tf.split(
value=images_original, num_or_size_splits=3, axis=3)
images_r_diff1 = tf.squared_difference(tf.to_float(images_r),
tf.to_float(images_gray_r))
images_r_diff2 = tf.squared_difference(tf.to_float(images_gray_r),
tf.to_float(images_gray_g))
images_r_diff1 = tf.squared_difference(
tf.cast(images_r, dtype=tf.float32),
tf.cast(images_gray_r, dtype=tf.float32))
images_r_diff2 = tf.squared_difference(
tf.cast(images_gray_r, dtype=tf.float32),
tf.cast(images_gray_g, dtype=tf.float32))
images_r_diff = tf.multiply(images_r_diff1, images_r_diff2)
images_g_diff1 = tf.squared_difference(tf.to_float(images_g),
tf.to_float(images_gray_g))
images_g_diff2 = tf.squared_difference(tf.to_float(images_gray_g),
tf.to_float(images_gray_b))
images_g_diff1 = tf.squared_difference(
tf.cast(images_g, dtype=tf.float32),
tf.cast(images_gray_g, dtype=tf.float32))
images_g_diff2 = tf.squared_difference(
tf.cast(images_gray_g, dtype=tf.float32),
tf.cast(images_gray_b, dtype=tf.float32))
images_g_diff = tf.multiply(images_g_diff1, images_g_diff2)
images_b_diff1 = tf.squared_difference(tf.to_float(images_b),
tf.to_float(images_gray_b))
images_b_diff2 = tf.squared_difference(tf.to_float(images_gray_b),
tf.to_float(images_gray_r))
images_b_diff1 = tf.squared_difference(
tf.cast(images_b, dtype=tf.float32),
tf.cast(images_gray_b, dtype=tf.float32))
images_b_diff2 = tf.squared_difference(
tf.cast(images_gray_b, dtype=tf.float32),
tf.cast(images_gray_r, dtype=tf.float32))
images_b_diff = tf.multiply(images_b_diff1, images_b_diff2)
image_zero1 = tf.constant(0, dtype=tf.float32, shape=[1, 4, 4, 1])
with self.test_session() as sess:
@@ -2135,7 +2141,7 @@ class PreprocessorTest(tf.test.TestCase):
boxes = self.createTestBoxes()
labels = self.createTestLabels()
tensor_dict = {
fields.InputDataFields.image: tf.to_float(images),
fields.InputDataFields.image: tf.cast(images, dtype=tf.float32),
fields.InputDataFields.groundtruth_boxes: boxes,
fields.InputDataFields.groundtruth_classes: labels,
}
@@ -2856,7 +2862,7 @@ class PreprocessorTest(tf.test.TestCase):
scores = self.createTestMultiClassScores()
tensor_dict = {
fields.InputDataFields.image: tf.to_float(images),
fields.InputDataFields.image: tf.cast(images, dtype=tf.float32),
fields.InputDataFields.groundtruth_boxes: boxes,
fields.InputDataFields.groundtruth_classes: labels,
fields.InputDataFields.groundtruth_weights: weights,
@@ -109,6 +109,8 @@ class DetectionResultFields(object):
key: unique key corresponding to image.
detection_boxes: coordinates of the detection boxes in the image.
detection_scores: detection scores for the detection boxes in the image.
detection_multiclass_scores: class score distribution (including background)
for the detection boxes in the image.
detection_classes: detection-level class labels.
detection_masks: contains a segmentation mask for each detection box.
detection_boundaries: contains an object boundary for each detection box.
@@ -123,6 +125,7 @@
key = 'key'
detection_boxes = 'detection_boxes'
detection_scores = 'detection_scores'
detection_multiclass_scores = 'detection_multiclass_scores'
detection_classes = 'detection_classes'
detection_masks = 'detection_masks'
detection_boundaries = 'detection_boundaries'
@@ -660,16 +660,16 @@ def batch_assign_confidences(target_assigner,
explicit_example_mask = tf.logical_or(positive_mask, negative_mask)
positive_anchors = tf.reduce_any(positive_mask, axis=-1)
regression_weights = tf.to_float(positive_anchors)
regression_weights = tf.cast(positive_anchors, dtype=tf.float32)
regression_targets = (
reg_targets * tf.expand_dims(regression_weights, axis=-1))
regression_weights_expanded = tf.expand_dims(regression_weights, axis=-1)
cls_targets_without_background = (
cls_targets_without_background * (1 - tf.to_float(negative_mask)))
cls_weights_without_background = (
(1 - implicit_class_weight) * tf.to_float(explicit_example_mask)
+ implicit_class_weight)
cls_targets_without_background *
(1 - tf.cast(negative_mask, dtype=tf.float32)))
cls_weights_without_background = ((1 - implicit_class_weight) * tf.cast(
explicit_example_mask, dtype=tf.float32) + implicit_class_weight)
if include_background_class:
cls_weights_background = (
@@ -59,8 +59,15 @@ class _ClassTensorHandler(slim_example_decoder.Tensor):
label_map_proto_file, use_display_name=False)
# We use a default_value of -1, but we expect all labels to be contained
# in the label map.
name_to_id_table = tf.contrib.lookup.HashTable(
initializer=tf.contrib.lookup.KeyValueTensorInitializer(
try:
# Dynamically try to load the tf v2 lookup, falling back to contrib
lookup = tf.compat.v2.lookup
hash_table_class = tf.compat.v2.lookup.StaticHashTable
except AttributeError:
lookup = tf.contrib.lookup
hash_table_class = tf.contrib.lookup.HashTable
name_to_id_table = hash_table_class(
initializer=lookup.KeyValueTensorInitializer(
keys=tf.constant(list(name_to_id.keys())),
values=tf.constant(list(name_to_id.values()), dtype=tf.int64)),
default_value=-1)
@@ -68,8 +75,8 @@ class _ClassTensorHandler(slim_example_decoder.Tensor):
label_map_proto_file, use_display_name=True)
# We use a default_value of -1, but we expect all labels to be contained
# in the label map.
display_name_to_id_table = tf.contrib.lookup.HashTable(
initializer=tf.contrib.lookup.KeyValueTensorInitializer(
display_name_to_id_table = hash_table_class(
initializer=lookup.KeyValueTensorInitializer(
keys=tf.constant(list(display_name_to_id.keys())),
values=tf.constant(
list(display_name_to_id.values()), dtype=tf.int64)),
@@ -444,7 +451,8 @@ class TfExampleDecoder(data_decoder.DataDecoder):
masks = keys_to_tensors['image/object/mask']
if isinstance(masks, tf.SparseTensor):
masks = tf.sparse_tensor_to_dense(masks)
masks = tf.reshape(tf.to_float(tf.greater(masks, 0.0)), to_shape)
masks = tf.reshape(
tf.cast(tf.greater(masks, 0.0), dtype=tf.float32), to_shape)
return tf.cast(masks, tf.float32)
def _decode_png_instance_masks(self, keys_to_tensors):
@@ -465,7 +473,7 @@ class TfExampleDecoder(data_decoder.DataDecoder):
image = tf.squeeze(
tf.image.decode_image(image_buffer, channels=1), axis=2)
image.set_shape([None, None])
image = tf.to_float(tf.greater(image, 0))
image = tf.cast(tf.greater(image, 0), dtype=tf.float32)
return image
png_masks = keys_to_tensors['image/object/mask']
@@ -476,4 +484,4 @@ class TfExampleDecoder(data_decoder.DataDecoder):
return tf.cond(
tf.greater(tf.size(png_masks), 0),
lambda: tf.map_fn(decode_png_mask, png_masks, dtype=tf.float32),
lambda: tf.zeros(tf.to_int32(tf.stack([0, height, width]))))
lambda: tf.zeros(tf.cast(tf.stack([0, height, width]), dtype=tf.int32)))
@@ -44,10 +44,15 @@ EVAL_METRICS_CLASS_DICT = {
coco_evaluation.CocoMaskEvaluator,
'oid_challenge_detection_metrics':
object_detection_evaluation.OpenImagesDetectionChallengeEvaluator,
'oid_challenge_segmentation_metrics':
object_detection_evaluation
.OpenImagesInstanceSegmentationChallengeEvaluator,
'pascal_voc_detection_metrics':
object_detection_evaluation.PascalDetectionEvaluator,
'weighted_pascal_voc_detection_metrics':
object_detection_evaluation.WeightedPascalDetectionEvaluator,
'precision_at_recall_detection_metrics':
object_detection_evaluation.PrecisionAtRecallDetectionEvaluator,
'pascal_voc_instance_segmentation_metrics':
object_detection_evaluation.PascalInstanceSegmentationEvaluator,
'weighted_pascal_voc_instance_segmentation_metrics':
@@ -776,7 +781,8 @@ def result_dict_for_batched_example(images,
detection_fields = fields.DetectionResultFields
detection_boxes = detections[detection_fields.detection_boxes]
detection_scores = detections[detection_fields.detection_scores]
num_detections = tf.to_int32(detections[detection_fields.num_detections])
num_detections = tf.cast(detections[detection_fields.num_detections],
dtype=tf.int32)
if class_agnostic:
detection_classes = tf.ones_like(detection_scores, dtype=tf.int64)
@@ -939,4 +945,9 @@ def evaluator_options_from_eval_config(eval_config):
'include_metrics_per_category': (
eval_config.include_metrics_per_category)
}
elif eval_metric_fn_key == 'precision_at_recall_detection_metrics':
evaluator_options[eval_metric_fn_key] = {
'recall_lower_bound': (eval_config.recall_lower_bound),
'recall_upper_bound': (eval_config.recall_upper_bound)
}
return evaluator_options
@@ -31,9 +31,9 @@ from object_detection.utils import test_case
class EvalUtilTest(test_case.TestCase, parameterized.TestCase):
def _get_categories_list(self):
return [{'id': 0, 'name': 'person'},
{'id': 1, 'name': 'dog'},
{'id': 2, 'name': 'cat'}]
return [{'id': 1, 'name': 'person'},
{'id': 2, 'name': 'dog'},
{'id': 3, 'name': 'cat'}]
def _make_evaluation_dict(self,
resized_groundtruth_masks=False,
@@ -192,43 +192,66 @@ class EvalUtilTest(test_case.TestCase, parameterized.TestCase):
def test_get_eval_metric_ops_for_evaluators(self):
eval_config = eval_pb2.EvalConfig()
eval_config.metrics_set.extend(
['coco_detection_metrics', 'coco_mask_metrics'])
eval_config.metrics_set.extend([
'coco_detection_metrics', 'coco_mask_metrics',
'precision_at_recall_detection_metrics'
])
eval_config.include_metrics_per_category = True
eval_config.recall_lower_bound = 0.2
eval_config.recall_upper_bound = 0.6
evaluator_options = eval_util.evaluator_options_from_eval_config(
eval_config)
self.assertTrue(evaluator_options['coco_detection_metrics'][
'include_metrics_per_category'])
self.assertTrue(evaluator_options['coco_mask_metrics'][
'include_metrics_per_category'])
self.assertTrue(evaluator_options['coco_detection_metrics']
['include_metrics_per_category'])
self.assertTrue(
evaluator_options['coco_mask_metrics']['include_metrics_per_category'])
self.assertAlmostEqual(
evaluator_options['precision_at_recall_detection_metrics']
['recall_lower_bound'], eval_config.recall_lower_bound)
self.assertAlmostEqual(
evaluator_options['precision_at_recall_detection_metrics']
['recall_upper_bound'], eval_config.recall_upper_bound)
def test_get_evaluator_with_evaluator_options(self):
eval_config = eval_pb2.EvalConfig()
eval_config.metrics_set.extend(['coco_detection_metrics'])
eval_config.metrics_set.extend(
['coco_detection_metrics', 'precision_at_recall_detection_metrics'])
eval_config.include_metrics_per_category = True
eval_config.recall_lower_bound = 0.2
eval_config.recall_upper_bound = 0.6
categories = self._get_categories_list()
evaluator_options = eval_util.evaluator_options_from_eval_config(
eval_config)
evaluator = eval_util.get_evaluators(
eval_config, categories, evaluator_options)
evaluator = eval_util.get_evaluators(eval_config, categories,
evaluator_options)
self.assertTrue(evaluator[0]._include_metrics_per_category)
self.assertAlmostEqual(evaluator[1]._recall_lower_bound,
eval_config.recall_lower_bound)
self.assertAlmostEqual(evaluator[1]._recall_upper_bound,
eval_config.recall_upper_bound)
def test_get_evaluator_with_no_evaluator_options(self):
eval_config = eval_pb2.EvalConfig()
eval_config.metrics_set.extend(['coco_detection_metrics'])
eval_config.metrics_set.extend(
['coco_detection_metrics', 'precision_at_recall_detection_metrics'])
eval_config.include_metrics_per_category = True
eval_config.recall_lower_bound = 0.2
eval_config.recall_upper_bound = 0.6
categories = self._get_categories_list()
evaluator = eval_util.get_evaluators(
eval_config, categories, evaluator_options=None)
# Even though we are setting eval_config.include_metrics_per_category = True
# this option is never passed into the DetectionEvaluator constructor (via
# `evaluator_options`).
# and bounds on recall, these options are never passed into the
# DetectionEvaluator constructor (via `evaluator_options`).
self.assertFalse(evaluator[0]._include_metrics_per_category)
self.assertAlmostEqual(evaluator[1]._recall_lower_bound, 0.0)
self.assertAlmostEqual(evaluator[1]._recall_upper_bound, 1.0)
if __name__ == '__main__':
tf.test.main()
@@ -106,7 +106,7 @@ flags.DEFINE_string('trained_checkpoint_prefix', None, 'Checkpoint prefix.')
flags.DEFINE_integer('max_detections', 10,
'Maximum number of detections (boxes) to show.')
flags.DEFINE_integer('max_classes_per_detection', 1,
'Number of classes to display per detection box.')
'Maximum number of classes to output per detection box.')
flags.DEFINE_integer(
'detections_per_class', 100,
'Number of anchors used per class in Regular Non-Max-Suppression.')
@@ -136,7 +136,7 @@ def main(argv):
export_tflite_ssd_graph_lib.export_tflite_graph(
pipeline_config, FLAGS.trained_checkpoint_prefix, FLAGS.output_directory,
FLAGS.add_postprocessing_op, FLAGS.max_detections,
FLAGS.max_classes_per_detection, FLAGS.use_regular_nms)
FLAGS.max_classes_per_detection, use_regular_nms=FLAGS.use_regular_nms)
if __name__ == '__main__':
@@ -176,6 +176,9 @@ def add_output_tensor_nodes(postprocessed_tensors,
containing detected boxes.
* detection_scores: float32 tensor of shape [batch_size, num_boxes]
containing scores for the detected boxes.
* detection_multiclass_scores: (Optional) float32 tensor of shape
[batch_size, num_boxes, num_classes_with_background] containing the class
score distribution for detected boxes, including background if any.
* detection_classes: float32 tensor of shape [batch_size, num_boxes]
containing class predictions for the detected boxes.
* detection_keypoints: (Optional) float32 tensor of shape
@@ -189,6 +192,8 @@
postprocessed_tensors: a dictionary containing the following fields
'detection_boxes': [batch, max_detections, 4]
'detection_scores': [batch, max_detections]
'detection_multiclass_scores': [batch, max_detections,
num_classes_with_background]
'detection_classes': [batch, max_detections]
'detection_masks': [batch, max_detections, mask_height, mask_width]
(optional).
@@ -204,6 +209,8 @@
label_id_offset = 1
boxes = postprocessed_tensors.get(detection_fields.detection_boxes)
scores = postprocessed_tensors.get(detection_fields.detection_scores)
multiclass_scores = postprocessed_tensors.get(
detection_fields.detection_multiclass_scores)
raw_boxes = postprocessed_tensors.get(detection_fields.raw_detection_boxes)
raw_scores = postprocessed_tensors.get(detection_fields.raw_detection_scores)
classes = postprocessed_tensors.get(
@@ -216,6 +223,9 @@
boxes, name=detection_fields.detection_boxes)
outputs[detection_fields.detection_scores] = tf.identity(
scores, name=detection_fields.detection_scores)
if multiclass_scores is not None:
outputs[detection_fields.detection_multiclass_scores] = tf.identity(
multiclass_scores, name=detection_fields.detection_multiclass_scores)
outputs[detection_fields.detection_classes] = tf.identity(
classes, name=detection_fields.detection_classes)
outputs[detection_fields.num_detections] = tf.identity(
@@ -306,7 +316,7 @@ def write_graph_and_checkpoint(inference_graph_def,
def _get_outputs_from_inputs(input_tensors, detection_model,
output_collection_name):
inputs = tf.to_float(input_tensors)
inputs = tf.cast(input_tensors, dtype=tf.float32)
preprocessed_inputs, true_image_shapes = detection_model.preprocess(inputs)
output_tensors = detection_model.predict(
preprocessed_inputs, true_image_shapes)
@@ -59,6 +59,9 @@ class FakeModel(model.DetectionModel):
[0.0, 0.0, 0.0, 0.0]]], tf.float32),
'detection_scores': tf.constant([[0.7, 0.6],
[0.9, 0.0]], tf.float32),
'detection_multiclass_scores': tf.constant([[[0.3, 0.7], [0.4, 0.6]],
[[0.1, 0.9], [0.0, 0.0]]],
tf.float32),
'detection_classes': tf.constant([[0, 1],
[1, 0]], tf.float32),
'num_detections': tf.constant([2, 1], tf.float32),
@@ -371,6 +374,7 @@ class ExportInferenceGraphTest(tf.test.TestCase):
inference_graph.get_tensor_by_name('image_tensor:0')
inference_graph.get_tensor_by_name('detection_boxes:0')
inference_graph.get_tensor_by_name('detection_scores:0')
inference_graph.get_tensor_by_name('detection_multiclass_scores:0')
inference_graph.get_tensor_by_name('detection_classes:0')
inference_graph.get_tensor_by_name('detection_keypoints:0')
inference_graph.get_tensor_by_name('detection_masks:0')
@@ -398,6 +402,7 @@ class ExportInferenceGraphTest(tf.test.TestCase):
inference_graph.get_tensor_by_name('image_tensor:0')
inference_graph.get_tensor_by_name('detection_boxes:0')
inference_graph.get_tensor_by_name('detection_scores:0')
inference_graph.get_tensor_by_name('detection_multiclass_scores:0')
inference_graph.get_tensor_by_name('detection_classes:0')
inference_graph.get_tensor_by_name('num_detections:0')
with self.assertRaises(KeyError):
@@ -491,15 +496,20 @@ class ExportInferenceGraphTest(tf.test.TestCase):
'encoded_image_string_tensor:0')
boxes = inference_graph.get_tensor_by_name('detection_boxes:0')
scores = inference_graph.get_tensor_by_name('detection_scores:0')
multiclass_scores = inference_graph.get_tensor_by_name(
'detection_multiclass_scores:0')
classes = inference_graph.get_tensor_by_name('detection_classes:0')
keypoints = inference_graph.get_tensor_by_name('detection_keypoints:0')
masks = inference_graph.get_tensor_by_name('detection_masks:0')
num_detections = inference_graph.get_tensor_by_name('num_detections:0')
for image_str in [jpg_image_str, png_image_str]:
image_str_batch_np = np.hstack([image_str]* 2)
(boxes_np, scores_np, classes_np, keypoints_np, masks_np,
num_detections_np) = sess.run(
[boxes, scores, classes, keypoints, masks, num_detections],
(boxes_np, scores_np, multiclass_scores_np, classes_np, keypoints_np,
masks_np, num_detections_np) = sess.run(
[
boxes, scores, multiclass_scores, classes, keypoints, masks,
num_detections
],
feed_dict={image_str_tensor: image_str_batch_np})
self.assertAllClose(boxes_np, [[[0.0, 0.0, 0.5, 0.5],
[0.5, 0.5, 0.8, 0.8]],
@@ -507,6 +517,8 @@
[0.0, 0.0, 0.0, 0.0]]])
self.assertAllClose(scores_np, [[0.7, 0.6],
[0.9, 0.0]])
self.assertAllClose(multiclass_scores_np, [[[0.3, 0.7], [0.4, 0.6]],
[[0.1, 0.9], [0.0, 0.0]]])
self.assertAllClose(classes_np, [[1, 2],
[2, 1]])
self.assertAllClose(keypoints_np, np.arange(48).reshape([2, 2, 6, 2]))
@@ -53,6 +53,9 @@ EVAL_METRICS_CLASS_DICT = {
# DEPRECATED: please use oid_challenge_detection_metrics instead
'oid_challenge_object_detection_metrics':
object_detection_evaluation.OpenImagesDetectionChallengeEvaluator,
'oid_challenge_segmentation_metrics':
object_detection_evaluation
.OpenImagesInstanceSegmentationChallengeEvaluator,
}
EVAL_DEFAULT_METRIC = 'pascal_voc_detection_metrics'
@@ -80,7 +83,7 @@ def _extract_predictions_and_losses(model,
input_dict = prefetch_queue.dequeue()
original_image = tf.expand_dims(input_dict[fields.InputDataFields.image], 0)
preprocessed_image, true_image_shapes = model.preprocess(
tf.to_float(original_image))
tf.cast(original_image, dtype=tf.float32))
prediction_dict = model.predict(preprocessed_image, true_image_shapes)
detections = model.postprocess(prediction_dict, true_image_shapes)
@@ -62,7 +62,7 @@ def create_input_queue(batch_size_per_clone, create_tensor_dict_fn,
tensor_dict[fields.InputDataFields.image], 0)
images = tensor_dict[fields.InputDataFields.image]
float_images = tf.to_float(images)
float_images = tf.cast(images, dtype=tf.float32)
tensor_dict[fields.InputDataFields.image] = float_images
include_instance_masks = (fields.InputDataFields.groundtruth_instance_masks
@@ -184,7 +184,7 @@ class ArgMaxMatcher(matcher.Matcher):
return matches
if similarity_matrix.shape.is_fully_defined():
if similarity_matrix.shape[0].value == 0:
if shape_utils.get_dim_as_int(similarity_matrix.shape[0]) == 0:
return _match_when_rows_are_empty()
else:
return _match_when_rows_are_non_empty()
@@ -62,7 +62,7 @@ class GreedyBipartiteMatcher(matcher.Matcher):
# Convert similarity matrix to distance matrix as tf.image.bipartite tries
# to find minimum distance matches.
distance_matrix = -1 * similarity_matrix
num_valid_rows = tf.reduce_sum(tf.to_float(valid_rows))
num_valid_rows = tf.reduce_sum(tf.cast(valid_rows, dtype=tf.float32))
_, match_results = image_ops.bipartite_match(
distance_matrix, num_valid_rows=num_valid_rows)
match_results = tf.reshape(match_results, [-1])