Unverified Commit 99256cf4 authored by pkulzc, committed by GitHub

Release iNaturalist Species-trained models, refactor of evaluation, box predictor for object detection. (#5289)

* Merged commit includes the following changes:
212389173  by Zhichao Lu:

    1. Replace tf.boolean_mask with tf.where
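
    For context, a minimal sketch (not from this commit) of why this swap matters for static shapes: tf.boolean_mask drops entries, so its output shape depends on runtime mask values, while tf.where keeps the original shape by substituting a sentinel value.

        import tensorflow as tf

        scores = tf.constant([0.9, 0.2, 0.7])
        keep = scores > 0.5

        # tf.boolean_mask drops entries, so the output shape is unknown
        # at graph-construction time.
        dynamic = tf.boolean_mask(scores, keep)  # shape: [?]

        # tf.where keeps the static [3] shape by writing a sentinel into
        # the filtered-out slots.
        static = tf.where(keep, scores, -tf.ones_like(scores))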

--
212282646  by Zhichao Lu:

    1. Fix a typo in model_builder.py and add a test to cover it.

--
212142989  by Zhichao Lu:

    Only resize masks in meta architecture if it has not already been resized in the input pipeline.

--
212136935  by Zhichao Lu:

    Choose matmul or native crop_and_resize in the model builder instead of faster r-cnn meta architecture.

--
211907984  by Zhichao Lu:

    Make eval input reader repeated field and update config util to handle this field.
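
    A hypothetical pipeline.config excerpt showing the effect (paths illustrative): with the field now repeated, several eval datasets can be listed side by side.

        eval_input_reader {
          label_map_path: "label_map.pbtxt"
          tf_record_input_reader { input_path: "eval_set_a.record" }
        }
        eval_input_reader {
          label_map_path: "label_map.pbtxt"
          tf_record_input_reader { input_path: "eval_set_b.record" }
        }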

--
211858098  by Zhichao Lu:

    Change the implementation of merge_boxes_with_multiple_labels.

--
211843915  by Zhichao Lu:

    Add Mobilenet v2 + FPN support.

--
211655076  by Zhichao Lu:

    Bug fix for generic keys in config overrides

    In generic configuration overrides, we had a duplicate entry for train_input_config and we were missing the eval_input_config and eval_config.

    This change also introduces testing for all config overrides.

--
211157501  by Zhichao Lu:

    Make the locally-modified conv defs a copy.

    So that it doesn't modify MobileNet conv defs globally for other code that
    transitively imports this package.

--
211112813  by Zhichao Lu:

    Refactoring visualization tools for Estimator's eval_metric_ops. This will make it easier for future models to take advantage of a single interface and mechanics.

--
211109571  by Zhichao Lu:

    A test decorator.

--
210747685  by Zhichao Lu:

    For FPN, when use_depthwise is set to true, use slightly modified mobilenet v1 config.

--
210723882  by Zhichao Lu:

    Integrating the losses mask into the meta architectures. When providing groundtruth, one can optionally specify annotation information (i.e. which images are labeled vs. unlabeled). For any image that is unlabeled, there is no loss accumulation.
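
    A minimal sketch of the idea (names assumed, not the meta-architecture code): an is_annotated indicator zeroes out the per-image loss of unlabeled images so they contribute nothing to the total.

        import tensorflow as tf

        per_image_loss = tf.constant([1.3, 0.7, 2.1])  # loss for 3 images
        is_annotated = tf.constant([1.0, 0.0, 1.0])    # image 1 is unlabeled
        total_loss = tf.reduce_sum(per_image_loss * is_annotated)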

--
210673675  by Zhichao Lu:

    Internal change.

--
210546590  by Zhichao Lu:

    Internal change.

--
210529752  by Zhichao Lu:

    Support batched inputs with ops.matmul_crop_and_resize.

    With this change the new inputs are images of shape [batch, heigh, width, depth] and boxes of shape [batch, num_boxes, 4]. The output tensor is of the shape [batch, num_boxes, crop_height, crop_width, depth].
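
    A shape-level usage sketch of the batched signature (crop size illustrative):

        import tensorflow as tf
        from object_detection.utils import ops

        images = tf.zeros([2, 40, 40, 3])  # [batch, height, width, depth]
        boxes = tf.zeros([2, 300, 4])      # [batch, num_boxes, 4]
        crops = ops.matmul_crop_and_resize(images, boxes, crop_size=[17, 17])
        # crops: [2, 300, 17, 17, 3] == [batch, num_boxes, crop_h, crop_w, depth]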

--
210485912  by Zhichao Lu:

    Fix TensorFlow version check in object_detection_tutorial.ipynb
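
    A sketch of the kind of check involved (exact threshold version assumed):

        import tensorflow as tf
        from distutils.version import StrictVersion

        if StrictVersion(tf.__version__) < StrictVersion('1.9.0'):
          raise ImportError('Please upgrade your TensorFlow installation '
                            'to v1.9.* or later!')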

--
210484076  by Zhichao Lu:

    Reduce TPU memory required for single image matmul_crop_and_resize.

    Using tf.einsum eliminates intermediate tensors, tiling and expansion. for an image of size [40, 40, 1024] and boxes of shape [300, 4] HBM memory usage goes down from 3.52G to 1.67G.
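
    A toy illustration of the technique (shapes illustrative, not the library code): separable bilinear interpolation written as two einsum contractions, which never materializes the tiled intermediate that tf.tile plus tf.matmul would create.

        import tensorflow as tf

        num_boxes, crop_h, crop_w = 300, 17, 17
        img_h, img_w, channels = 40, 40, 1024
        kernel_y = tf.ones([num_boxes, crop_h, img_h])  # per-box row weights
        kernel_x = tf.ones([num_boxes, crop_w, img_w])  # per-box column weights
        image = tf.ones([img_h, img_w, channels])

        # Interpolate rows, then columns; no [num_boxes, img_h, img_w, channels]
        # intermediate tensor is ever created.
        rows = tf.einsum('nyh,hwc->nywc', kernel_y, image)
        crops = tf.einsum('nxw,nywc->nyxc', kernel_x, rows)  # [300, 17, 17, 1024]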

--
210468361  by Zhichao Lu:

    Remove PositiveAnchorLossCDF/NegativeAnchorLossCDF to resolve the "Main thread is not in main loop" error in local training.

--
210100253  by Zhichao Lu:

    Pooling pyramid feature maps: add option to replace max pool with convolution layers.

--
209995842  by Zhichao Lu:

    Fix a bug which prevents variable sharing in Faster RCNN.

--
209965526  by Zhichao Lu:

    Add support for enabling export_to_tpu through the estimator.

--
209946440  by Zhichao Lu:

    Replace deprecated tf.train.Supervisor with tf.train.MonitoredSession. MonitoredSession also takes away the hassle of starting queue runners.

--
209888003  by Zhichao Lu:

    Implement function to handle data where source_id is not set.

    If the field source_id is found to be the empty string for any image during runtime, it will be replaced with a random string. This avoids hash collisions on datasets where many examples do not have source_id set. Those hash collisions have unintended side effects and may lead to bugs in the detection pipeline.
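
    A hedged sketch of the replacement logic (helper name assumed):

        import tensorflow as tf

        def replace_empty_string_with_random_number(string_tensor):
          """Returns string_tensor, or a random string if it is empty."""
          random_string = tf.as_string(
              tf.random_uniform([], maxval=2**31 - 1, dtype=tf.int64))
          return tf.cond(
              tf.equal(string_tensor, ''),
              true_fn=lambda: random_string,
              false_fn=lambda: string_tensor)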

--
209842134  by Zhichao Lu:

    Converting loss mask into multiplier, rather than using it as a boolean mask (which changes tensor shape). This is necessary, since other utilities (e.g. hard example miner) require a loss matrix with the same dimensions as the original prediction tensor.
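
    Sketch of the difference (shapes illustrative): masking by multiplication keeps the [batch, num_anchors] loss shape that the hard example miner expects, whereas tf.boolean_mask would collapse the batch dimension.

        import tensorflow as tf

        loss_matrix = tf.ones([4, 100])  # [batch, num_anchors]
        losses_mask = tf.constant([True, True, False, True])
        multiplier = tf.expand_dims(tf.cast(losses_mask, tf.float32), 1)
        masked_loss = loss_matrix * multiplier  # still [4, 100]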

--
209768066  by Zhichao Lu:

    Adding ability to remove loss computation from specific images in a batch, via an optional boolean mask.

--
209722556  by Zhichao Lu:

    Remove dead code.

    (_USE_C_API was flipped to True by default in TensorFlow 1.8)

--
209701861  by Zhichao Lu:

    This CL cleans up some tf.Example creation snippets by reusing the convenient tf.train.Feature building functions in dataset_util.

--
209697893  by Zhichao Lu:

    Do not overwrite num_epoch for the eval input; overwriting it leads to errors in some cases.

--
209694652  by Zhichao Lu:

    Sample boxes by jittering around the currently given boxes.

--
209550300  by Zhichao Lu:

    The `create_category_index_from_labelmap()` function now accepts a `use_display_name` parameter.
    Also added a `create_categories_from_labelmap()` function for convenience.
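
    Usage sketch (label map path illustrative):

        from object_detection.utils import label_map_util

        category_index = label_map_util.create_category_index_from_labelmap(
            'data/mscoco_label_map.pbtxt', use_display_name=True)
        categories = label_map_util.create_categories_from_labelmap(
            'data/mscoco_label_map.pbtxt')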

--
209490273  by Zhichao Lu:

    Check result_dict type before accessing image_id via key.

--
209442529  by Zhichao Lu:

    Introducing the capability to sample examples for evaluation. This makes it easy to specify one full epoch of evaluation, or a subset (e.g. sample 1 of every N examples).
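
    The effect is the same as sharding a tf.data pipeline; an illustrative sketch, not the actual input-pipeline code:

        import tensorflow as tf

        n = 5  # keep 1 of every 5 examples
        dataset = tf.data.Dataset.range(100)
        dataset = dataset.shard(n, 0)  # keeps examples 0, 5, 10, ...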

--
208941150  by Zhichao Lu:

    Adding the capability to export the results in JSON format.

--
208888798  by Zhichao Lu:

    Fixes wrong dictionary key for num_det_boxes_per_image.

--
208873549  by Zhichao Lu:

    Reduce the number of HLO ops created by matmul_crop_and_resize.

    Do not unroll along the channels dimension. Instead, transpose the input image dimensions, apply tf.matmul and transpose back.

    The number of HLO instructions for 1024 channels drops from 12,368 to 110.

--
208844315  by Zhichao Lu:

    Add an option to use tf.image.non_max_suppression_padded in SSD post-processing.

--
208731380  by Zhichao Lu:

    Add field in box_predictor config to enable mask prediction and update builders accordingly.

--
208699405  by Zhichao Lu:

    This CL creates a Keras-based multi-resolution feature map extractor.

--
208557208  by Zhichao Lu:

    Add TPU tests for Faster R-CNN Meta arch.

    * Tests that two_stage_predict and total_loss tests run successfully on TPU.
    * Small mods to multiclass_non_max_suppression to preserve static shapes.

--
208499278  by Zhichao Lu:

    This CL makes sure the Keras convolutional box predictor & head layers apply activation layers *after* normalization (as opposed to before).

--
208391694  by Zhichao Lu:

    Updating visualization tool to produce multiple evaluation images.

--
208275961  by Zhichao Lu:

    This CL adds a Keras version of the Convolutional Box Predictor, as well as more general infrastructure for making Keras Prediction heads & Keras box predictors.

--
208275585  by Zhichao Lu:

    This CL enables the Keras layer hyperparameter object to build a dedicated activation layer, and to disable activation by default in the op layer construction kwargs.

    This is necessary because in most cases the normalization layer must be applied before the activation layer. So, in Keras models we must set the convolution activation in a dedicated layer after normalization is applied, rather than setting it in the convolution layer construction args.

--
208263792  by Zhichao Lu:

    Add a new SSD mask meta arch that can predict masks for SSD models.
    Changes include:
     - overwriting the loss function to add mask loss computation.
     - updating ssd_meta_arch to handle masks, when predicted, in predict and postprocess.

--
208000218  by Zhichao Lu:

    Make FasterRCNN choose static shape operations only in training mode.

--
207997797  by Zhichao Lu:

    Add static boolean_mask op to box_list_ops.py and use that in faster_rcnn_meta_arch.py to support use_static_shapes option.

--
207993460  by Zhichao Lu:

    Include FGVC detection models in model zoo.

--
207971213  by Zhichao Lu:

    Remove the restriction that the tf.nn.top_k op run on the CPU.

--
207961187  by Zhichao Lu:

    Build the first stage NMS function in the model builder and pass it to FasterRCNN meta arch.

--
207960608  by Zhichao Lu:

    Internal Change.

--
207927015  by Zhichao Lu:

    Add an option to use the TPU-compatible NMS op (cl/206673787) in the batch_multiclass_non_max_suppression function. When pad_to_max_output_size is set to true, the output NMSed boxes are padded to length max_size_per_class.

    This can be used in the first-stage Region Proposal Network of the FasterRCNN model by setting the first_stage_nms_pad_to_max_proposals field to true in the config proto.

--
207809668  by Zhichao Lu:

    Add option to use depthwise separable conv instead of conv2d in FPN and WeightSharedBoxPredictor. More specifically, there are two related configs (sketched after this list):
    - SsdFeatureExtractor.use_depthwise
    - WeightSharedConvolutionalBoxPredictor.use_depthwise
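
    A hypothetical pipeline.config excerpt enabling both options (excerpted from inside the ssd block; exact nesting per the ssd and box_predictor protos):

        feature_extractor {
          type: "ssd_mobilenet_v1_fpn"
          use_depthwise: true
        }
        box_predictor {
          weight_shared_convolutional_box_predictor {
            use_depthwise: true
          }
        }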

--
207808651  by Zhichao Lu:

    Fix the static balanced positive negative sampler's TPU tests

--
207798658  by Zhichao Lu:

    Fixes a post-refactoring bug where the pre-prediction convolution layers in the convolutional box predictor are ignored.

--
207796470  by Zhichao Lu:

    Make slim endpoints visible in FasterRCNNMetaArch.

--
207787053  by Zhichao Lu:

    Refactor ssd_meta_arch so that the target assigner instance is passed into the SSDMetaArch constructor rather than constructed inside.

--

PiperOrigin-RevId: 212389173

* Fix detection model zoo typo.

* Modify tf example decoder to handle label maps with either `display_name` or `name` fields seamlessly.

Currently, the tf example decoder uses only the `name` field to look up ids for the class text field present in the data. This change uses both the `display_name` and `name` fields in the label map to fetch ids for class text.

PiperOrigin-RevId: 212672223

* Modify create_coco_tf_record tool to write out class text instead of class labels.

PiperOrigin-RevId: 212679112

* Fix detection model zoo typo.

PiperOrigin-RevId: 212715692

* Adding the following two optional flags to WeightSharedConvolutionalBoxHead:
1) In the box head, apply clipping to box encodings.
2) In the class head, apply sigmoid to class predictions at inference time.

PiperOrigin-RevId: 212723242

* Support class confidences in merge boxes with multiple labels.

PiperOrigin-RevId: 212884998

* Creates multiple eval specs for object detection.

PiperOrigin-RevId: 212894556

* Set batch_norm on last layer in Mask Head to None.

PiperOrigin-RevId: 213030087

* Enable bfloat16 training for object detection models.

PiperOrigin-RevId: 213053547

* Skip padding op when unnecessary.

PiperOrigin-RevId: 213065869

* Modify `Matchers` to use groundtruth weights before performing matching.

The groundtruth weights tensor indicates padding in the groundtruth box tensor. `TargetAssigner` handles it by creating appropriate classification and regression target weights based on the groundtruth box each anchor matches to. However, options such as `force_match_all_rows` in `ArgmaxMatcher` force certain anchors to match groundtruth boxes that are just padding, reducing the number of anchors that could otherwise match real groundtruth boxes.

For single-stage models like SSD the effect of this is negligible, as there are two orders of magnitude more anchors than padded groundtruth boxes. But for Faster R-CNN and Mask R-CNN, where there are only 300 anchors in the second stage, a significant number of them match groundtruth paddings, reducing the number of anchors regressing to real groundtruth boxes and degrading performance severely.

Therefore, this change introduces an additional boolean argument `valid_rows` to `Matcher.match` methods, and the implementations now ignore such padded groundtruth boxes during matching.
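
A minimal sketch of the new contract (names from the description above): padded groundtruth rows carry weight 0 and are excluded from matching via `valid_rows`.

    import tensorflow as tf

    groundtruth_weights = tf.constant([1.0, 1.0, 0.0])  # third box is padding
    valid_rows = tf.greater(groundtruth_weights, 0)
    # match = matcher.match(match_quality_matrix, valid_rows=valid_rows)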

PiperOrigin-RevId: 213345395

* Add release note for iNaturalist Species trained models.

PiperOrigin-RevId: 213347179

* Fix a bug where the gt_is_crowd_list variable was uninitialized.

PiperOrigin-RevId: 213364858

* ...text exposed to open source public git repo...

PiperOrigin-RevId: 213554260
parent 256b8ae6
object_detection/core/post_processing.py
@@ -33,6 +33,7 @@ def multiclass_non_max_suppression(boxes,
                                    change_coordinate_frame=False,
                                    masks=None,
                                    boundaries=None,
+                                   pad_to_max_output_size=False,
                                    additional_fields=None,
                                    scope=None):
   """Multi-class version of non maximum suppression.
@@ -55,7 +56,8 @@ def multiclass_non_max_suppression(boxes,
       number of classes or 1 depending on whether a separate box is predicted
       per class.
     scores: A [k, num_classes] float32 tensor containing the scores for each of
-      the k detections.
+      the k detections. The scores have to be non-negative when
+      pad_to_max_output_size is True.
     score_thresh: scalar threshold for score (low scoring boxes are removed).
     iou_thresh: scalar threshold for IOU (new boxes that have high IOU overlap
       with previously selected boxes are removed).
@@ -74,6 +76,8 @@ def multiclass_non_max_suppression(boxes,
     boundaries: (optional) a [k, q, boundary_height, boundary_width] float32
       tensor containing box boundaries. `q` can be either number of classes or 1
       depending on whether a separate boundary is predicted per class.
+    pad_to_max_output_size: If true, the output nmsed boxes are padded to be of
+      length `max_size_per_class`. Defaults to false.
     additional_fields: (optional) If not None, a dictionary that maps keys to
       tensors whose first dimensions are all of size `k`. After non-maximum
       suppression, all tensors corresponding to the selected boxes will be
@@ -81,9 +85,12 @@ def multiclass_non_max_suppression(boxes,
     scope: name scope.

   Returns:
-    a BoxList holding M boxes with a rank-1 scores field representing
+    A tuple of sorted_boxes and num_valid_nms_boxes. The sorted_boxes is a
+    BoxList holds M boxes with a rank-1 scores field representing
     corresponding scores for each box with scores sorted in decreasing order
-    and a rank-1 classes field representing a class label for each box.
+    and a rank-1 classes field representing a class label for each box. The
+    num_valid_nms_boxes is a 0-D integer tensor representing the number of
+    valid elements in `BoxList`, with the valid elements appearing first.

   Raises:
     ValueError: if iou_thresh is not in [0, 1] or if input boxlist does not have
@@ -113,6 +120,7 @@ def multiclass_non_max_suppression(boxes,
     num_classes = scores.get_shape()[1]

     selected_boxes_list = []
+    num_valid_nms_boxes_cumulative = tf.constant(0)
     per_class_boxes_list = tf.unstack(boxes, axis=1)
     if masks is not None:
       per_class_masks_list = tf.unstack(masks, axis=1)
@@ -140,16 +148,40 @@ def multiclass_non_max_suppression(boxes,
         for key, tensor in additional_fields.items():
           boxlist_and_class_scores.add_field(key, tensor)

-      max_selection_size = tf.minimum(max_size_per_class,
-                                      boxlist_and_class_scores.num_boxes())
-      selected_indices = tf.image.non_max_suppression(
-          boxlist_and_class_scores.get(),
-          boxlist_and_class_scores.get_field(fields.BoxListFields.scores),
-          max_selection_size,
-          iou_threshold=iou_thresh,
-          score_threshold=score_thresh)
+      if pad_to_max_output_size:
+        max_selection_size = max_size_per_class
+        selected_indices, num_valid_nms_boxes = (
+            tf.image.non_max_suppression_padded(
+                boxlist_and_class_scores.get(),
+                boxlist_and_class_scores.get_field(fields.BoxListFields.scores),
+                max_selection_size,
+                iou_threshold=iou_thresh,
+                score_threshold=score_thresh,
+                pad_to_max_output_size=True))
+      else:
+        max_selection_size = tf.minimum(max_size_per_class,
+                                        boxlist_and_class_scores.num_boxes())
+        selected_indices = tf.image.non_max_suppression(
+            boxlist_and_class_scores.get(),
+            boxlist_and_class_scores.get_field(fields.BoxListFields.scores),
+            max_selection_size,
+            iou_threshold=iou_thresh,
+            score_threshold=score_thresh)
+        num_valid_nms_boxes = tf.shape(selected_indices)[0]
+        selected_indices = tf.concat(
+            [selected_indices,
+             tf.zeros(max_selection_size-num_valid_nms_boxes, tf.int32)], 0)
       nms_result = box_list_ops.gather(boxlist_and_class_scores,
                                        selected_indices)
+      # Make the scores -1 for invalid boxes.
+      valid_nms_boxes_indx = tf.less(
+          tf.range(max_selection_size), num_valid_nms_boxes)
+      nms_scores = nms_result.get_field(fields.BoxListFields.scores)
+      nms_result.add_field(fields.BoxListFields.scores,
+                           tf.where(valid_nms_boxes_indx,
+                                    nms_scores, -1*tf.ones(max_selection_size)))
+      num_valid_nms_boxes_cumulative += num_valid_nms_boxes
       nms_result.add_field(
           fields.BoxListFields.classes, (tf.zeros_like(
               nms_result.get_field(fields.BoxListFields.scores)) + class_idx))
@@ -158,16 +190,43 @@ def multiclass_non_max_suppression(boxes,
     sorted_boxes = box_list_ops.sort_by_field(selected_boxes,
                                               fields.BoxListFields.scores)
     if clip_window is not None:
-      sorted_boxes = box_list_ops.clip_to_window(sorted_boxes, clip_window)
+      # When pad_to_max_output_size is False, it prunes the boxes with zero
+      # area.
+      sorted_boxes = box_list_ops.clip_to_window(
+          sorted_boxes,
+          clip_window,
+          filter_nonoverlapping=not pad_to_max_output_size)
+      # Set the scores of boxes with zero area to -1 to keep the default
+      # behaviour of pruning out zero area boxes.
+      sorted_boxes_size = tf.shape(sorted_boxes.get())[0]
+      non_zero_box_area = tf.cast(box_list_ops.area(sorted_boxes), tf.bool)
+      sorted_boxes_scores = tf.where(
+          non_zero_box_area,
+          sorted_boxes.get_field(fields.BoxListFields.scores),
+          -1*tf.ones(sorted_boxes_size))
+      sorted_boxes.add_field(fields.BoxListFields.scores, sorted_boxes_scores)
+      num_valid_nms_boxes_cumulative = tf.reduce_sum(
+          tf.cast(tf.greater_equal(sorted_boxes_scores, 0), tf.int32))
+      sorted_boxes = box_list_ops.sort_by_field(sorted_boxes,
+                                                fields.BoxListFields.scores)
       if change_coordinate_frame:
         sorted_boxes = box_list_ops.change_coordinate_frame(
             sorted_boxes, clip_window)

     if max_total_size:
       max_total_size = tf.minimum(max_total_size,
                                   sorted_boxes.num_boxes())
       sorted_boxes = box_list_ops.gather(sorted_boxes,
                                          tf.range(max_total_size))
-    return sorted_boxes
+      num_valid_nms_boxes_cumulative = tf.where(
+          max_total_size > num_valid_nms_boxes_cumulative,
+          num_valid_nms_boxes_cumulative, max_total_size)
+    # Select only the valid boxes if pad_to_max_output_size is False.
+    if not pad_to_max_output_size:
+      sorted_boxes = box_list_ops.gather(
+          sorted_boxes, tf.range(num_valid_nms_boxes_cumulative))
+    return sorted_boxes, num_valid_nms_boxes_cumulative


 def batch_multiclass_non_max_suppression(boxes,
@@ -182,6 +241,7 @@ def batch_multiclass_non_max_suppression(boxes,
                                          masks=None,
                                          additional_fields=None,
                                          scope=None,
+                                         use_static_shapes=False,
                                          parallel_iterations=32):
   """Multi-class version of non maximum suppression that operates on a batch.
@@ -195,7 +255,8 @@ def batch_multiclass_non_max_suppression(boxes,
       otherwise, if `q` is equal to number of classes, class-specific boxes
       are used.
     scores: A [batch_size, num_anchors, num_classes] float32 tensor containing
-      the scores for each of the `num_anchors` detections.
+      the scores for each of the `num_anchors` detections. The scores have to be
+      non-negative when use_static_shapes is set True.
     score_thresh: scalar threshold for score (low scoring boxes are removed).
     iou_thresh: scalar threshold for IOU (new boxes that have high IOU overlap
       with previously selected boxes are removed).
@@ -221,6 +282,9 @@ def batch_multiclass_non_max_suppression(boxes,
     additional_fields: (optional) If not None, a dictionary that maps keys to
       tensors whose dimensions are [batch_size, num_anchors, ...].
     scope: tf scope name.
+    use_static_shapes: If true, the output nmsed boxes are padded to be of
+      length `max_size_per_class` and it doesn't clip boxes to max_total_size.
+      Defaults to false.
     parallel_iterations: (optional) number of batch items to process in
       parallel.
@@ -276,7 +340,7 @@ def batch_multiclass_non_max_suppression(boxes,
     # If masks aren't provided, create dummy masks so we can only have one copy
     # of _single_image_nms_fn and discard the dummy masks after map_fn.
     if masks is None:
-      masks_shape = tf.stack([batch_size, num_anchors, 1, 0, 0])
+      masks_shape = tf.stack([batch_size, num_anchors, q, 1, 1])
       masks = tf.zeros(masks_shape)

     if clip_window is None:
@@ -365,7 +429,7 @@ def batch_multiclass_non_max_suppression(boxes,
                   tf.stack([per_image_num_valid_boxes] +
                            (additional_field_dim - 1) * [-1])),
               [-1] + [dim.value for dim in additional_field_shape[1:]])
-      nmsed_boxlist = multiclass_non_max_suppression(
+      nmsed_boxlist, num_valid_nms_boxes = multiclass_non_max_suppression(
           per_image_boxes,
           per_image_scores,
           score_thresh,
@@ -375,16 +439,19 @@ def batch_multiclass_non_max_suppression(boxes,
           clip_window=per_image_clip_window,
           change_coordinate_frame=change_coordinate_frame,
           masks=per_image_masks,
+          pad_to_max_output_size=use_static_shapes,
           additional_fields=per_image_additional_fields)
-      padded_boxlist = box_list_ops.pad_or_clip_box_list(nmsed_boxlist,
-                                                         max_total_size)
-      num_detections = nmsed_boxlist.num_boxes()
-      nmsed_boxes = padded_boxlist.get()
-      nmsed_scores = padded_boxlist.get_field(fields.BoxListFields.scores)
-      nmsed_classes = padded_boxlist.get_field(fields.BoxListFields.classes)
-      nmsed_masks = padded_boxlist.get_field(fields.BoxListFields.masks)
+
+      if not use_static_shapes:
+        nmsed_boxlist = box_list_ops.pad_or_clip_box_list(
+            nmsed_boxlist, max_total_size)
+      num_detections = num_valid_nms_boxes
+      nmsed_boxes = nmsed_boxlist.get()
+      nmsed_scores = nmsed_boxlist.get_field(fields.BoxListFields.scores)
+      nmsed_classes = nmsed_boxlist.get_field(fields.BoxListFields.classes)
+      nmsed_masks = nmsed_boxlist.get_field(fields.BoxListFields.masks)
       nmsed_additional_fields = [
-          padded_boxlist.get_field(key) for key in per_image_additional_fields
+          nmsed_boxlist.get_field(key) for key in per_image_additional_fields
       ]
       return ([nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks] +
               nmsed_additional_fields + [num_detections])
object_detection/core/post_processing_test.py
@@ -18,9 +18,10 @@ import numpy as np
 import tensorflow as tf
 from object_detection.core import post_processing
 from object_detection.core import standard_fields as fields
+from object_detection.utils import test_case


-class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
+class MulticlassNonMaxSuppressionTest(test_case.TestCase):

   def test_multiclass_nms_select_with_shared_boxes(self):
     boxes = tf.constant([[[0, 0, 1, 1]],
@@ -46,7 +47,7 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
     exp_nms_scores = [.95, .9, .85, .3]
     exp_nms_classes = [0, 0, 1, 0]

-    nms = post_processing.multiclass_non_max_suppression(
+    nms, _ = post_processing.multiclass_non_max_suppression(
         boxes, scores, score_thresh, iou_thresh, max_output_size)
     with self.test_session() as sess:
       nms_corners_output, nms_scores_output, nms_classes_output = sess.run(
@@ -56,6 +57,57 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
     self.assertAllClose(nms_scores_output, exp_nms_scores)
     self.assertAllClose(nms_classes_output, exp_nms_classes)

+  # TODO(bhattad): Remove conditional after CMLE moves to TF 1.9
+  # BEGIN GOOGLE-INTERNAL
+  def test_multiclass_nms_select_with_shared_boxes_pad_to_max_output_size(self):
+    boxes = np.array([[[0, 0, 1, 1]],
+                      [[0, 0.1, 1, 1.1]],
+                      [[0, -0.1, 1, 0.9]],
+                      [[0, 10, 1, 11]],
+                      [[0, 10.1, 1, 11.1]],
+                      [[0, 100, 1, 101]],
+                      [[0, 1000, 1, 1002]],
+                      [[0, 1000, 1, 1002.1]]], np.float32)
+    scores = np.array([[.9, 0.01], [.75, 0.05],
+                       [.6, 0.01], [.95, 0],
+                       [.5, 0.01], [.3, 0.01],
+                       [.01, .85], [.01, .5]], np.float32)
+    score_thresh = 0.1
+    iou_thresh = .5
+    max_size_per_class = 4
+    max_output_size = 5
+
+    exp_nms_corners = [[0, 10, 1, 11],
+                       [0, 0, 1, 1],
+                       [0, 1000, 1, 1002],
+                       [0, 100, 1, 101]]
+    exp_nms_scores = [.95, .9, .85, .3]
+    exp_nms_classes = [0, 0, 1, 0]
+
+    def graph_fn(boxes, scores):
+      nms, num_valid_nms_boxes = post_processing.multiclass_non_max_suppression(
+          boxes,
+          scores,
+          score_thresh,
+          iou_thresh,
+          max_size_per_class,
+          max_total_size=max_output_size,
+          pad_to_max_output_size=True)
+      return [nms.get(), nms.get_field(fields.BoxListFields.scores),
+              nms.get_field(fields.BoxListFields.classes), num_valid_nms_boxes]
+
+    [nms_corners_output, nms_scores_output, nms_classes_output,
+     num_valid_nms_boxes] = self.execute(graph_fn, [boxes, scores])
+
+    self.assertEqual(num_valid_nms_boxes, 4)
+    self.assertAllClose(nms_corners_output[0:num_valid_nms_boxes],
+                        exp_nms_corners)
+    self.assertAllClose(nms_scores_output[0:num_valid_nms_boxes],
+                        exp_nms_scores)
+    self.assertAllClose(nms_classes_output[0:num_valid_nms_boxes],
+                        exp_nms_classes)
+  # END GOOGLE-INTERNAL
+
   def test_multiclass_nms_select_with_shared_boxes_given_keypoints(self):
     boxes = tf.constant([[[0, 0, 1, 1]],
                          [[0, 0.1, 1, 1.1]],
@@ -87,10 +139,13 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
         tf.reshape(tf.constant([3, 0, 6, 5], dtype=tf.float32), [4, 1, 1]),
         [1, num_keypoints, 2])

-    nms = post_processing.multiclass_non_max_suppression(
-        boxes, scores, score_thresh, iou_thresh, max_output_size,
-        additional_fields={
-            fields.BoxListFields.keypoints: keypoints})
+    nms, _ = post_processing.multiclass_non_max_suppression(
+        boxes,
+        scores,
+        score_thresh,
+        iou_thresh,
+        max_output_size,
+        additional_fields={fields.BoxListFields.keypoints: keypoints})

     with self.test_session() as sess:
       (nms_corners_output,
@@ -145,10 +200,15 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
     exp_nms_keypoint_heatmaps = np.ones(
         (4, heatmap_height, heatmap_width, num_keypoints), dtype=np.float32)

-    nms = post_processing.multiclass_non_max_suppression(
-        boxes, scores, score_thresh, iou_thresh, max_output_size,
+    nms, _ = post_processing.multiclass_non_max_suppression(
+        boxes,
+        scores,
+        score_thresh,
+        iou_thresh,
+        max_output_size,
         additional_fields={
-            fields.BoxListFields.keypoint_heatmaps: keypoint_heatmaps})
+            fields.BoxListFields.keypoint_heatmaps: keypoint_heatmaps
+        })

     with self.test_session() as sess:
       (nms_corners_output,
@@ -208,8 +268,12 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
     exp_nms_scores = [.95, .9, .85, .3]
     exp_nms_classes = [0, 0, 1, 0]

-    nms = post_processing.multiclass_non_max_suppression(
-        boxes, scores, score_thresh, iou_thresh, max_output_size,
+    nms, _ = post_processing.multiclass_non_max_suppression(
+        boxes,
+        scores,
+        score_thresh,
+        iou_thresh,
+        max_output_size,
         additional_fields={coarse_boxes_key: coarse_boxes})

     with self.test_session() as sess:
@@ -260,11 +324,8 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
         tf.reshape(tf.constant([3, 0, 6, 5], dtype=tf.float32), [4, 1, 1]),
         [1, mask_height, mask_width])

-    nms = post_processing.multiclass_non_max_suppression(boxes, scores,
-                                                         score_thresh,
-                                                         iou_thresh,
-                                                         max_output_size,
-                                                         masks=masks)
+    nms, _ = post_processing.multiclass_non_max_suppression(
+        boxes, scores, score_thresh, iou_thresh, max_output_size, masks=masks)

     with self.test_session() as sess:
       (nms_corners_output,
        nms_scores_output,
@@ -293,8 +354,12 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
     exp_nms_scores = [.9]
     exp_nms_classes = [0]

-    nms = post_processing.multiclass_non_max_suppression(
-        boxes, scores, score_thresh, iou_thresh, max_output_size,
+    nms, _ = post_processing.multiclass_non_max_suppression(
+        boxes,
+        scores,
+        score_thresh,
+        iou_thresh,
+        max_output_size,
         clip_window=clip_window)
     with self.test_session() as sess:
       nms_corners_output, nms_scores_output, nms_classes_output = sess.run(
@@ -317,9 +382,14 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
     exp_nms_scores = [.9]
     exp_nms_classes = [0]

-    nms = post_processing.multiclass_non_max_suppression(
-        boxes, scores, score_thresh, iou_thresh, max_output_size,
-        clip_window=clip_window, change_coordinate_frame=True)
+    nms, _ = post_processing.multiclass_non_max_suppression(
+        boxes,
+        scores,
+        score_thresh,
+        iou_thresh,
+        max_output_size,
+        clip_window=clip_window,
+        change_coordinate_frame=True)
     with self.test_session() as sess:
       nms_corners_output, nms_scores_output, nms_classes_output = sess.run(
           [nms.get(), nms.get_field(fields.BoxListFields.scores),
@@ -351,7 +421,7 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
     exp_nms_scores = [.95, .9, .85]
     exp_nms_classes = [0, 0, 1]

-    nms = post_processing.multiclass_non_max_suppression(
+    nms, _ = post_processing.multiclass_non_max_suppression(
         boxes, scores, score_thresh, iou_thresh, max_size_per_class)
     with self.test_session() as sess:
       nms_corners_output, nms_scores_output, nms_classes_output = sess.run(
@@ -384,7 +454,7 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
     exp_nms_scores = [.95, .9]
     exp_nms_classes = [0, 0]

-    nms = post_processing.multiclass_non_max_suppression(
+    nms, _ = post_processing.multiclass_non_max_suppression(
         boxes, scores, score_thresh, iou_thresh, max_size_per_class,
         max_total_size)
     with self.test_session() as sess:
@@ -412,7 +482,7 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
     exp_nms = [[0, 10, 1, 11],
                [0, 0, 1, 1],
                [0, 100, 1, 101]]
-    nms = post_processing.multiclass_non_max_suppression(
+    nms, _ = post_processing.multiclass_non_max_suppression(
         boxes, scores, score_thresh, iou_thresh, max_output_size)
     with self.test_session() as sess:
       nms_output = sess.run(nms.get())
@@ -443,7 +513,7 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
     exp_nms_scores = [.95, .9, .85, .3]
     exp_nms_classes = [0, 0, 1, 0]

-    nms = post_processing.multiclass_non_max_suppression(
+    nms, _ = post_processing.multiclass_non_max_suppression(
         boxes, scores, score_thresh, iou_thresh, max_output_size)
     with self.test_session() as sess:
       nms_corners_output, nms_scores_output, nms_classes_output = sess.run(
@@ -1055,6 +1125,62 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
                           exp_nms_additional_fields[key])
     self.assertAllClose(num_detections, [1, 1])

+  # TODO(bhattad): Remove conditional after CMLE moves to TF 1.9
+  # BEGIN GOOGLE-INTERNAL
+  def test_batch_multiclass_nms_with_use_static_shapes(self):
+    boxes = np.array([[[[0, 0, 1, 1], [0, 0, 4, 5]],
+                       [[0, 0.1, 1, 1.1], [0, 0.1, 2, 1.1]],
+                       [[0, -0.1, 1, 0.9], [0, -0.1, 1, 0.9]],
+                       [[0, 10, 1, 11], [0, 10, 1, 11]]],
+                      [[[0, 10.1, 1, 11.1], [0, 10.1, 1, 11.1]],
+                       [[0, 100, 1, 101], [0, 100, 1, 101]],
+                       [[0, 1000, 1, 1002], [0, 999, 2, 1004]],
+                       [[0, 1000, 1, 1002.1], [0, 999, 2, 1002.7]]]],
+                     np.float32)
+    scores = np.array([[[.9, 0.01], [.75, 0.05],
+                        [.6, 0.01], [.95, 0]],
+                       [[.5, 0.01], [.3, 0.01],
+                        [.01, .85], [.01, .5]]],
+                      np.float32)
+    clip_window = np.array([[0., 0., 5., 5.],
+                            [0., 0., 200., 200.]],
+                           np.float32)
+    score_thresh = 0.1
+    iou_thresh = .5
+    max_output_size = 4
+
+    exp_nms_corners = np.array([[[0, 0, 1, 1],
+                                 [0, 0, 0, 0],
+                                 [0, 0, 0, 0],
+                                 [0, 0, 0, 0]],
+                                [[0, 10.1, 1, 11.1],
+                                 [0, 100, 1, 101],
+                                 [0, 0, 0, 0],
+                                 [0, 0, 0, 0]]])
+    exp_nms_scores = np.array([[.9, 0., 0., 0.],
+                               [.5, .3, 0, 0]])
+    exp_nms_classes = np.array([[0, 0, 0, 0],
+                                [0, 0, 0, 0]])
+
+    def graph_fn(boxes, scores, clip_window):
+      (nmsed_boxes, nmsed_scores, nmsed_classes, _, _, num_detections
+      ) = post_processing.batch_multiclass_non_max_suppression(
+          boxes, scores, score_thresh, iou_thresh,
+          max_size_per_class=max_output_size, clip_window=clip_window,
+          use_static_shapes=True)
+      return nmsed_boxes, nmsed_scores, nmsed_classes, num_detections
+
+    (nmsed_boxes, nmsed_scores, nmsed_classes,
+     num_detections) = self.execute(graph_fn, [boxes, scores, clip_window])
+
+    for i in range(len(num_detections)):
+      self.assertAllClose(nmsed_boxes[i, 0:num_detections[i]],
+                          exp_nms_corners[i, 0:num_detections[i]])
+      self.assertAllClose(nmsed_scores[i, 0:num_detections[i]],
+                          exp_nms_scores[i, 0:num_detections[i]])
+      self.assertAllClose(nmsed_classes[i, 0:num_detections[i]],
+                          exp_nms_classes[i, 0:num_detections[i]])
+    self.assertAllClose(num_detections, [1, 2])
+  # END GOOGLE-INTERNAL
+

 if __name__ == '__main__':
   tf.test.main()
object_detection/core/preprocessor.py
@@ -810,7 +810,7 @@ def random_image_scale(image,
     image = tf.image.resize_images(
         image, [image_newysize, image_newxsize], align_corners=True)
     result.append(image)
-    if masks:
+    if masks is not None:
       masks = tf.image.resize_nearest_neighbor(
           masks, [image_newysize, image_newxsize], align_corners=True)
       result.append(masks)
@@ -2969,7 +2969,8 @@ def get_default_func_arg_map(include_label_scores=False,
   """
   groundtruth_label_scores = None
   if include_label_scores:
-    groundtruth_label_scores = (fields.InputDataFields.groundtruth_label_scores)
+    groundtruth_label_scores = (
+        fields.InputDataFields.groundtruth_confidences)

   multiclass_scores = None
   if include_multiclass_scores:
object_detection/core/preprocessor_cache.py
@@ -67,7 +67,7 @@ class PreprocessorCache(object):

   def clear(self):
     """Resets cache."""
-    self._history = {}
+    self._history = defaultdict(dict)

   def get(self, function_id, key):
     """Gets stored value given a function id and key.
object_detection/core/preprocessor_test.py
@@ -1615,7 +1615,7 @@ class PreprocessorTest(tf.test.TestCase):
     tensor_dict = {
         fields.InputDataFields.groundtruth_boxes: boxes,
         fields.InputDataFields.groundtruth_classes: labels,
-        fields.InputDataFields.groundtruth_label_scores: label_scores
+        fields.InputDataFields.groundtruth_confidences: label_scores
     }

     preprocessing_options = [
@@ -1630,7 +1630,7 @@ class PreprocessorTest(tf.test.TestCase):
     retained_labels = retained_tensor_dict[
         fields.InputDataFields.groundtruth_classes]
     retained_label_scores = retained_tensor_dict[
-        fields.InputDataFields.groundtruth_label_scores]
+        fields.InputDataFields.groundtruth_confidences]

     with self.test_session() as sess:
       (retained_boxes_, retained_labels_,
@@ -1655,7 +1655,7 @@ class PreprocessorTest(tf.test.TestCase):
     tensor_dict = {
         fields.InputDataFields.groundtruth_boxes: boxes,
         fields.InputDataFields.groundtruth_classes: labels,
-        fields.InputDataFields.groundtruth_label_scores: label_scores,
+        fields.InputDataFields.groundtruth_confidences: label_scores,
         fields.InputDataFields.groundtruth_instance_masks: masks
     }

@@ -1687,7 +1687,7 @@ class PreprocessorTest(tf.test.TestCase):
     tensor_dict = {
         fields.InputDataFields.groundtruth_boxes: boxes,
         fields.InputDataFields.groundtruth_classes: labels,
-        fields.InputDataFields.groundtruth_label_scores: label_scores,
+        fields.InputDataFields.groundtruth_confidences: label_scores,
         fields.InputDataFields.groundtruth_keypoints: keypoints
     }

@@ -2784,7 +2784,7 @@ class PreprocessorTest(tf.test.TestCase):
     }
     if include_label_scores:
       label_scores = self.createTestLabelScores()
-      tensor_dict[fields.InputDataFields.groundtruth_label_scores] = (
+      tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
           label_scores)
     if include_multiclass_scores:
       multiclass_scores = self.createTestMultiClassScores()
object_detection/core/standard_fields.py
@@ -40,8 +40,10 @@ class InputDataFields(object):
     source_id: source of the original image.
     filename: original filename of the dataset (without common path).
     groundtruth_image_classes: image-level class labels.
+    groundtruth_image_confidences: image-level class confidences.
     groundtruth_boxes: coordinates of the ground truth boxes in the image.
     groundtruth_classes: box-level class labels.
+    groundtruth_confidences: box-level class confidences.
     groundtruth_label_types: box-level label types (e.g. explicit negative).
     groundtruth_is_crowd: [DEPRECATED, use groundtruth_group_of instead]
       is the groundtruth a single object or a crowd.
@@ -60,6 +62,7 @@ class InputDataFields(object):
     groundtruth_label_scores: groundtruth label scores.
     groundtruth_weights: groundtruth weight factor for bounding boxes.
     num_groundtruth_boxes: number of groundtruth boxes.
+    is_annotated: whether an image has been labeled or not.
     true_image_shapes: true shapes of images in the resized images, as resized
       images can be padded with zeros.
     multiclass_scores: the label score per class for each box.
@@ -71,8 +74,10 @@ class InputDataFields(object):
   source_id = 'source_id'
   filename = 'filename'
   groundtruth_image_classes = 'groundtruth_image_classes'
+  groundtruth_image_confidences = 'groundtruth_image_confidences'
   groundtruth_boxes = 'groundtruth_boxes'
   groundtruth_classes = 'groundtruth_classes'
+  groundtruth_confidences = 'groundtruth_confidences'
   groundtruth_label_types = 'groundtruth_label_types'
   groundtruth_is_crowd = 'groundtruth_is_crowd'
   groundtruth_area = 'groundtruth_area'
@@ -88,6 +93,7 @@ class InputDataFields(object):
   groundtruth_label_scores = 'groundtruth_label_scores'
   groundtruth_weights = 'groundtruth_weights'
   num_groundtruth_boxes = 'num_groundtruth_boxes'
+  is_annotated = 'is_annotated'
   true_image_shape = 'true_image_shape'
   multiclass_scores = 'multiclass_scores'
object_detection/core/target_assigner.py
@@ -93,8 +93,7 @@ class TargetAssigner(object):
              groundtruth_boxes,
              groundtruth_labels=None,
              unmatched_class_label=None,
-             groundtruth_weights=None,
-             **params):
+             groundtruth_weights=None):
     """Assign classification and regression targets to each anchor.

     For a given set of anchors and groundtruth detections, match anchors
@@ -121,9 +120,11 @@ class TargetAssigner(object):
         If set to None, unmatched_cls_target is set to be [0] for each anchor.
       groundtruth_weights: a float tensor of shape [M] indicating the weight to
         assign to all anchors match to a particular groundtruth box. The weights
-        must be in [0., 1.]. If None, all weights are set to 1.
-      **params: Additional keyword arguments for specific implementations of
-        the Matcher.
+        must be in [0., 1.]. If None, all weights are set to 1. Generally no
+        groundtruth boxes with zero weight match to any anchors as matchers are
+        aware of groundtruth weights. Additionally, `cls_weights` and
+        `reg_weights` are calculated using groundtruth weights as an added
+        safety.

     Returns:
       cls_targets: a float32 tensor with shape [num_anchors, d_1, d_2 ... d_k],
@@ -177,7 +178,8 @@ class TargetAssigner(object):
         [unmatched_shape_assert, labels_and_box_shapes_assert]):
       match_quality_matrix = self._similarity_calc.compare(groundtruth_boxes,
                                                            anchors)
-      match = self._matcher.match(match_quality_matrix, **params)
+      match = self._matcher.match(match_quality_matrix,
+                                  valid_rows=tf.greater(groundtruth_weights, 0))
       reg_targets = self._create_regression_targets(anchors,
                                                     groundtruth_boxes,
                                                     match)

object_detection/core/target_assigner_test.py
@@ -495,8 +495,7 @@ class TargetAssignerTest(test_case.TestCase):
           priors,
           boxes,
           groundtruth_labels,
-          unmatched_class_label=unmatched_class_label,
-          num_valid_rows=3)
+          unmatched_class_label=unmatched_class_label)

   def test_raises_error_on_invalid_groundtruth_labels(self):
     similarity_calc = region_similarity_calculator.NegSqDistSimilarity()
@@ -520,8 +519,7 @@ class TargetAssignerTest(test_case.TestCase):
           priors,
           boxes,
           groundtruth_labels,
-          unmatched_class_label=unmatched_class_label,
-          num_valid_rows=3)
+          unmatched_class_label=unmatched_class_label)


 class BatchTargetAssignerTest(test_case.TestCase):
[One source diff in this commit was too large to display.]
...@@ -19,9 +19,6 @@ protos for object detection. ...@@ -19,9 +19,6 @@ protos for object detection.
""" """
import tensorflow as tf import tensorflow as tf
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import control_flow_ops
from tensorflow.python.ops import math_ops
from object_detection.core import data_decoder from object_detection.core import data_decoder
from object_detection.core import standard_fields as fields from object_detection.core import standard_fields as fields
from object_detection.protos import input_reader_pb2 from object_detection.protos import input_reader_pb2
...@@ -30,14 +27,12 @@ from object_detection.utils import label_map_util ...@@ -30,14 +27,12 @@ from object_detection.utils import label_map_util
slim_example_decoder = tf.contrib.slim.tfexample_decoder slim_example_decoder = tf.contrib.slim.tfexample_decoder
# TODO(lzc): keep LookupTensor and BackupHandler in sync with class _ClassTensorHandler(slim_example_decoder.Tensor):
# tf.contrib.slim.tfexample_decoder version. """An ItemHandler to fetch class ids from class text."""
class LookupTensor(slim_example_decoder.Tensor):
"""An ItemHandler that returns a parsed Tensor, the result of a lookup."""
def __init__(self, def __init__(self,
tensor_key, tensor_key,
table, label_map_proto_file,
shape_keys=None, shape_keys=None,
shape=None, shape=None,
default_value=''): default_value=''):
...@@ -47,7 +42,8 @@ class LookupTensor(slim_example_decoder.Tensor): ...@@ -47,7 +42,8 @@ class LookupTensor(slim_example_decoder.Tensor):
Args: Args:
tensor_key: the name of the `TFExample` feature to read the tensor from. tensor_key: the name of the `TFExample` feature to read the tensor from.
table: A tf.lookup table. label_map_proto_file: File path to a text format LabelMapProto message
mapping class text to id.
shape_keys: Optional name or list of names of the TF-Example feature in shape_keys: Optional name or list of names of the TF-Example feature in
which the tensor shape is stored. If a list, then each corresponds to which the tensor shape is stored. If a list, then each corresponds to
one dimension of the shape. one dimension of the shape.
...@@ -59,16 +55,39 @@ class LookupTensor(slim_example_decoder.Tensor): ...@@ -59,16 +55,39 @@ class LookupTensor(slim_example_decoder.Tensor):
Raises: Raises:
ValueError: if both `shape_keys` and `shape` are specified. ValueError: if both `shape_keys` and `shape` are specified.
""" """
self._table = table name_to_id = label_map_util.get_label_map_dict(
super(LookupTensor, self).__init__(tensor_key, shape_keys, shape, label_map_proto_file, use_display_name=False)
default_value) # We use a default_value of -1, but we expect all labels to be contained
# in the label map.
name_to_id_table = tf.contrib.lookup.HashTable(
initializer=tf.contrib.lookup.KeyValueTensorInitializer(
keys=tf.constant(list(name_to_id.keys())),
values=tf.constant(list(name_to_id.values()), dtype=tf.int64)),
default_value=-1)
display_name_to_id = label_map_util.get_label_map_dict(
label_map_proto_file, use_display_name=True)
# We use a default_value of -1, but we expect all labels to be contained
# in the label map.
display_name_to_id_table = tf.contrib.lookup.HashTable(
initializer=tf.contrib.lookup.KeyValueTensorInitializer(
keys=tf.constant(list(display_name_to_id.keys())),
values=tf.constant(
list(display_name_to_id.values()), dtype=tf.int64)),
default_value=-1)
self._name_to_id_table = name_to_id_table
self._display_name_to_id_table = display_name_to_id_table
super(_ClassTensorHandler, self).__init__(tensor_key, shape_keys, shape,
default_value)
def tensors_to_item(self, keys_to_tensors): def tensors_to_item(self, keys_to_tensors):
unmapped_tensor = super(LookupTensor, self).tensors_to_item(keys_to_tensors) unmapped_tensor = super(_ClassTensorHandler,
return self._table.lookup(unmapped_tensor) self).tensors_to_item(keys_to_tensors)
return tf.maximum(self._name_to_id_table.lookup(unmapped_tensor),
self._display_name_to_id_table.lookup(unmapped_tensor))
class BackupHandler(slim_example_decoder.ItemHandler): class _BackupHandler(slim_example_decoder.ItemHandler):
"""An ItemHandler that tries two ItemHandlers in order.""" """An ItemHandler that tries two ItemHandlers in order."""
def __init__(self, handler, backup): def __init__(self, handler, backup):
...@@ -92,12 +111,12 @@ class BackupHandler(slim_example_decoder.ItemHandler): ...@@ -92,12 +111,12 @@ class BackupHandler(slim_example_decoder.ItemHandler):
'Backup handler is of type %s instead of ItemHandler' % type(backup)) 'Backup handler is of type %s instead of ItemHandler' % type(backup))
self._handler = handler self._handler = handler
self._backup = backup self._backup = backup
super(BackupHandler, self).__init__(handler.keys + backup.keys) super(_BackupHandler, self).__init__(handler.keys + backup.keys)
def tensors_to_item(self, keys_to_tensors): def tensors_to_item(self, keys_to_tensors):
item = self._handler.tensors_to_item(keys_to_tensors) item = self._handler.tensors_to_item(keys_to_tensors)
return control_flow_ops.cond( return tf.cond(
pred=math_ops.equal(math_ops.reduce_prod(array_ops.shape(item)), 0), pred=tf.equal(tf.reduce_prod(tf.shape(item)), 0),
true_fn=lambda: self._backup.tensors_to_item(keys_to_tensors), true_fn=lambda: self._backup.tensors_to_item(keys_to_tensors),
false_fn=lambda: item) false_fn=lambda: item)
@@ -140,6 +159,9 @@ class TfExampleDecoder(data_decoder.DataDecoder):
        input_reader_pb2.DEFAULT, input_reader_pb2.NUMERICAL, or
        input_reader_pb2.PNG_MASKS.
    """
    # TODO(rathodv): delete unused `use_display_name` argument once we change
    # other decoders to handle label maps similarly.
    del use_display_name
    self.keys_to_features = {
        'image/encoded':
            tf.FixedLenFeature((), tf.string, default_value=''),
@@ -267,27 +289,18 @@ class TfExampleDecoder(data_decoder.DataDecoder):
    else:
      raise ValueError('Did not recognize the `instance_mask_type` option.')
    if label_map_proto_file:
-      label_map = label_map_util.get_label_map_dict(label_map_proto_file,
-                                                    use_display_name)
-      # We use a default_value of -1, but we expect all labels to be contained
-      # in the label map.
-      table = tf.contrib.lookup.HashTable(
-          initializer=tf.contrib.lookup.KeyValueTensorInitializer(
-              keys=tf.constant(list(label_map.keys())),
-              values=tf.constant(list(label_map.values()), dtype=tf.int64)),
-          default_value=-1)
      # If the label_map_proto is provided, try to use it in conjunction with
      # the class text, and fall back to a materialized ID.
-      # TODO(lzc): note that here we are using BackupHandler defined in this
-      # file(which is branching slim_example_decoder.BackupHandler). Need to
-      # switch back to slim_example_decoder.BackupHandler once tf 1.5 becomes
-      # more popular.
      label_handler = _BackupHandler(
          _ClassTensorHandler(
              'image/object/class/text', label_map_proto_file,
              default_value=''),
          slim_example_decoder.Tensor('image/object/class/label'))
      image_label_handler = _BackupHandler(
          _ClassTensorHandler(
              fields.TfExampleFields.image_class_text,
              label_map_proto_file,
              default_value=''),
          slim_example_decoder.Tensor(fields.TfExampleFields.image_class_label))
    else:
      label_handler = slim_example_decoder.Tensor('image/object/class/label')
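Taken together, a hedged usage sketch (the label map path and the `serialized_example` tensor are assumptions, not from this commit): constructing the decoder with a label map makes class text the primary source of ids, with materialized labels as the fallback.

decoder = tf_example_decoder.TfExampleDecoder(
    label_map_proto_file='/path/to/label_map.pbtxt')  # hypothetical path
tensor_dict = decoder.decode(tf.convert_to_tensor(serialized_example))
# groundtruth_classes holds ids resolved from image/object/class/text via the
# label map (name or display_name), falling back to image/object/class/label.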
@@ -12,24 +12,17 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for object_detection.data_decoders.tf_example_decoder."""

import os
import numpy as np
import tensorflow as tf

-from tensorflow.core.example import example_pb2
-from tensorflow.core.example import feature_pb2
-from tensorflow.python.framework import constant_op
-from tensorflow.python.framework import dtypes
from tensorflow.python.framework import test_util
-from tensorflow.python.ops import array_ops
-from tensorflow.python.ops import lookup_ops
-from tensorflow.python.ops import parsing_ops
from object_detection.core import standard_fields as fields
from object_detection.data_decoders import tf_example_decoder
from object_detection.protos import input_reader_pb2
from object_detection.utils import dataset_util

slim_example_decoder = tf.contrib.slim.tfexample_decoder
@@ -56,25 +49,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
      raise ValueError('Invalid encoding type.')
    return image_decoded
-  def _Int64Feature(self, value):
-    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

-  def _FloatFeature(self, value):
-    return tf.train.Feature(float_list=tf.train.FloatList(value=value))

-  def _BytesFeature(self, value):
-    if isinstance(value, list):
-      return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))
-    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

-  def _Int64FeatureFromList(self, ndarray):
-    return feature_pb2.Feature(
-        int64_list=feature_pb2.Int64List(value=ndarray.flatten().tolist()))

-  def _BytesFeatureFromList(self, ndarray):
-    values = ndarray.flatten().tolist()
-    return feature_pb2.Feature(bytes_list=feature_pb2.BytesList(value=values))
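The deleted helpers are superseded by the shared utilities in `object_detection/utils/dataset_util`; those presumably reduce to thin wrappers of this shape (editorial sketch, not the verbatim module):

def int64_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def int64_list_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

def bytes_feature(value):
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def bytes_list_feature(value):
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))

def float_list_feature(value):
  return tf.train.Feature(float_list=tf.train.FloatList(value=value))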
  def testDecodeAdditionalChannels(self):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
@@ -88,14 +62,14 @@ class TfExampleDecoderTest(tf.test.TestCase):
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/additional_channels/encoded':
                    dataset_util.bytes_list_feature(
                        [encoded_additional_channel] * 2),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/source_id':
                    dataset_util.bytes_feature('image_id'),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder(
@@ -108,104 +82,24 @@ class TfExampleDecoderTest(tf.test.TestCase):
        np.concatenate([decoded_additional_channel] * 2, axis=2),
        tensor_dict[fields.InputDataFields.image_additional_channels])
-  def testDecodeExampleWithBranchedBackupHandler(self):
-    example1 = example_pb2.Example(
-        features=feature_pb2.Features(
-            feature={
-                'image/object/class/text':
-                    self._BytesFeatureFromList(
-                        np.array(['cat', 'dog', 'guinea pig'])),
-                'image/object/class/label':
-                    self._Int64FeatureFromList(np.array([42, 10, 900]))
-            }))
-    example2 = example_pb2.Example(
-        features=feature_pb2.Features(
-            feature={
-                'image/object/class/text':
-                    self._BytesFeatureFromList(
-                        np.array(['cat', 'dog', 'guinea pig'])),
-            }))
-    example3 = example_pb2.Example(
-        features=feature_pb2.Features(
-            feature={
-                'image/object/class/label':
-                    self._Int64FeatureFromList(np.array([42, 10, 901]))
-            }))
-    # 'dog' -> 0, 'guinea pig' -> 1, 'cat' -> 2
-    table = lookup_ops.index_table_from_tensor(
-        constant_op.constant(['dog', 'guinea pig', 'cat']))
-    keys_to_features = {
-        'image/object/class/text': parsing_ops.VarLenFeature(dtypes.string),
-        'image/object/class/label': parsing_ops.VarLenFeature(dtypes.int64),
-    }
-    backup_handler = tf_example_decoder.BackupHandler(
-        handler=slim_example_decoder.Tensor('image/object/class/label'),
-        backup=tf_example_decoder.LookupTensor('image/object/class/text',
-                                               table))
-    items_to_handlers = {
-        'labels': backup_handler,
-    }
-    decoder = slim_example_decoder.TFExampleDecoder(keys_to_features,
-                                                    items_to_handlers)
-    obtained_class_ids_each_example = []
-    with self.test_session() as sess:
-      sess.run(lookup_ops.tables_initializer())
-      for example in [example1, example2, example3]:
-        serialized_example = array_ops.reshape(
-            example.SerializeToString(), shape=[])
-        obtained_class_ids_each_example.append(
-            decoder.decode(serialized_example)[0].eval())
-    self.assertAllClose([42, 10, 900], obtained_class_ids_each_example[0])
-    self.assertAllClose([2, 0, 1], obtained_class_ids_each_example[1])
-    self.assertAllClose([42, 10, 901], obtained_class_ids_each_example[2])

-  def testDecodeExampleWithBranchedLookup(self):
-    example = example_pb2.Example(features=feature_pb2.Features(feature={
-        'image/object/class/text': self._BytesFeatureFromList(
-            np.array(['cat', 'dog', 'guinea pig'])),
-    }))
-    serialized_example = example.SerializeToString()
-    # 'dog' -> 0, 'guinea pig' -> 1, 'cat' -> 2
-    table = lookup_ops.index_table_from_tensor(
-        constant_op.constant(['dog', 'guinea pig', 'cat']))
-    with self.test_session() as sess:
-      sess.run(lookup_ops.tables_initializer())
-      serialized_example = array_ops.reshape(serialized_example, shape=[])
-      keys_to_features = {
-          'image/object/class/text': parsing_ops.VarLenFeature(dtypes.string),
-      }
-      items_to_handlers = {
-          'labels':
-              tf_example_decoder.LookupTensor('image/object/class/text', table),
-      }
-      decoder = slim_example_decoder.TFExampleDecoder(keys_to_features,
-                                                      items_to_handlers)
-      obtained_class_ids = decoder.decode(serialized_example)[0].eval()
-    self.assertAllClose([2, 0, 1], obtained_class_ids)
  def testDecodeJpegImage(self):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
    decoded_jpeg = self._DecodeImage(encoded_jpeg)
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded': dataset_util.bytes_feature(encoded_jpeg),
                'image/format': dataset_util.bytes_feature('jpeg'),
                'image/source_id': dataset_util.bytes_feature('image_id'),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual(
        (tensor_dict[fields.InputDataFields.image].get_shape().as_list()),
        [None, None, 3])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)
@@ -215,11 +109,13 @@ class TfExampleDecoderTest(tf.test.TestCase):
  def testDecodeImageKeyAndFilename(self):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded': dataset_util.bytes_feature(encoded_jpeg),
                'image/key/sha256': dataset_util.bytes_feature('abc'),
                'image/filename': dataset_util.bytes_feature('filename')
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))
@@ -234,17 +130,20 @@ class TfExampleDecoderTest(tf.test.TestCase):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_png = self._EncodeImage(image_tensor, encoding_type='png')
    decoded_png = self._DecodeImage(encoded_png, encoding_type='png')
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded': dataset_util.bytes_feature(encoded_png),
                'image/format': dataset_util.bytes_feature('png'),
                'image/source_id': dataset_util.bytes_feature('image_id')
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual(
        (tensor_dict[fields.InputDataFields.image].get_shape().as_list()),
        [None, None, 3])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)
@@ -265,9 +164,12 @@ class TfExampleDecoderTest(tf.test.TestCase):
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/mask':
                    dataset_util.bytes_list_feature(encoded_masks)
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder(
@@ -288,11 +190,16 @@ class TfExampleDecoderTest(tf.test.TestCase):
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/mask':
                    dataset_util.bytes_list_feature(encoded_masks),
                'image/height':
                    dataset_util.int64_feature(10),
                'image/width':
                    dataset_util.int64_feature(10),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder(
@@ -312,25 +219,33 @@ class TfExampleDecoderTest(tf.test.TestCase):
    bbox_xmins = [1.0, 5.0]
    bbox_ymaxs = [2.0, 6.0]
    bbox_xmaxs = [3.0, 7.0]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/bbox/ymin':
                    dataset_util.float_list_feature(bbox_ymins),
                'image/object/bbox/xmin':
                    dataset_util.float_list_feature(bbox_xmins),
                'image/object/bbox/ymax':
                    dataset_util.float_list_feature(bbox_ymaxs),
                'image/object/bbox/xmax':
                    dataset_util.float_list_feature(bbox_xmaxs),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual((tensor_dict[fields.InputDataFields.groundtruth_boxes]
                         .get_shape().as_list()), [None, 4])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)

    expected_boxes = np.vstack([bbox_ymins, bbox_xmins, bbox_ymaxs,
                                bbox_xmaxs]).transpose()
    self.assertAllEqual(expected_boxes,
                        tensor_dict[fields.InputDataFields.groundtruth_boxes])
    self.assertAllEqual(
@@ -346,30 +261,40 @@ class TfExampleDecoderTest(tf.test.TestCase):
    bbox_xmaxs = [3.0, 7.0]
    keypoint_ys = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    keypoint_xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/bbox/ymin':
                    dataset_util.float_list_feature(bbox_ymins),
                'image/object/bbox/xmin':
                    dataset_util.float_list_feature(bbox_xmins),
                'image/object/bbox/ymax':
                    dataset_util.float_list_feature(bbox_ymaxs),
                'image/object/bbox/xmax':
                    dataset_util.float_list_feature(bbox_xmaxs),
                'image/object/keypoint/y':
                    dataset_util.float_list_feature(keypoint_ys),
                'image/object/keypoint/x':
                    dataset_util.float_list_feature(keypoint_xs),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder(num_keypoints=3)
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual((tensor_dict[fields.InputDataFields.groundtruth_boxes]
                         .get_shape().as_list()), [None, 4])
    self.assertAllEqual(
        (tensor_dict[fields.InputDataFields.groundtruth_keypoints].get_shape()
         .as_list()), [2, 3, 2])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)

    expected_boxes = np.vstack([bbox_ymins, bbox_xmins, bbox_ymaxs,
                                bbox_xmaxs]).transpose()
    self.assertAllEqual(expected_boxes,
                        tensor_dict[fields.InputDataFields.groundtruth_boxes])
    self.assertAllEqual(
@@ -377,9 +302,9 @@ class TfExampleDecoderTest(tf.test.TestCase):
    expected_keypoints = (
        np.vstack([keypoint_ys, keypoint_xs]).transpose().reshape((2, 3, 2)))
    self.assertAllEqual(
        expected_keypoints,
        tensor_dict[fields.InputDataFields.groundtruth_keypoints])

  def testDecodeDefaultGroundtruthWeights(self):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
@@ -388,20 +313,28 @@ class TfExampleDecoderTest(tf.test.TestCase):
    bbox_xmins = [1.0, 5.0]
    bbox_ymaxs = [2.0, 6.0]
    bbox_xmaxs = [3.0, 7.0]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/bbox/ymin':
                    dataset_util.float_list_feature(bbox_ymins),
                'image/object/bbox/xmin':
                    dataset_util.float_list_feature(bbox_xmins),
                'image/object/bbox/ymax':
                    dataset_util.float_list_feature(bbox_ymaxs),
                'image/object/bbox/xmax':
                    dataset_util.float_list_feature(bbox_xmaxs),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual((tensor_dict[fields.InputDataFields.groundtruth_boxes]
                         .get_shape().as_list()), [None, 4])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)
@@ -414,18 +347,22 @@ class TfExampleDecoderTest(tf.test.TestCase):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
    bbox_classes = [0, 1]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/class/label':
                    dataset_util.int64_list_feature(bbox_classes),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual((tensor_dict[fields.InputDataFields.groundtruth_classes]
                         .get_shape().as_list()), [2])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)
@@ -437,11 +374,16 @@ class TfExampleDecoderTest(tf.test.TestCase):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
    bbox_classes = [1, 2]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/class/label':
                    dataset_util.int64_list_feature(bbox_classes),
            })).SerializeToString()

    label_map_string = """
      item {
        id:1
@@ -460,9 +402,8 @@ class TfExampleDecoderTest(tf.test.TestCase):
        label_map_proto_file=label_map_path)
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual((tensor_dict[fields.InputDataFields.groundtruth_classes]
                         .get_shape().as_list()), [None])

    init = tf.tables_initializer()
    with self.test_session() as sess:
@@ -480,11 +421,11 @@ class TfExampleDecoderTest(tf.test.TestCase):
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/class/text':
                    dataset_util.bytes_list_feature(bbox_classes_text),
            })).SerializeToString()

    label_map_string = """
@@ -514,7 +455,7 @@ class TfExampleDecoderTest(tf.test.TestCase):
    self.assertAllEqual([2, -1],
                        tensor_dict[fields.InputDataFields.groundtruth_classes])

  def testDecodeObjectLabelWithMappingWithDisplayName(self):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
    bbox_classes_text = ['cat', 'dog']
@@ -522,11 +463,53 @@ class TfExampleDecoderTest(tf.test.TestCase):
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/class/text':
                    dataset_util.bytes_list_feature(bbox_classes_text),
})).SerializeToString()
label_map_string = """
item {
id:3
display_name:'cat'
}
item {
id:1
display_name:'dog'
}
"""
label_map_path = os.path.join(self.get_temp_dir(), 'label_map.pbtxt')
with tf.gfile.Open(label_map_path, 'wb') as f:
f.write(label_map_string)
example_decoder = tf_example_decoder.TfExampleDecoder(
label_map_proto_file=label_map_path)
tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))
self.assertAllEqual((tensor_dict[fields.InputDataFields.groundtruth_classes]
.get_shape().as_list()), [None])
with self.test_session() as sess:
sess.run(tf.tables_initializer())
tensor_dict = sess.run(tensor_dict)
self.assertAllEqual([3, 1],
tensor_dict[fields.InputDataFields.groundtruth_classes])
def testDecodeObjectLabelWithMappingWithName(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
bbox_classes_text = ['cat', 'dog']
example = tf.train.Example(
features=tf.train.Features(
feature={
'image/encoded':
dataset_util.bytes_feature(encoded_jpeg),
'image/format':
dataset_util.bytes_feature('jpeg'),
'image/object/class/text':
dataset_util.bytes_list_feature(bbox_classes_text),
            })).SerializeToString()

    label_map_string = """
@@ -561,17 +544,22 @@ class TfExampleDecoderTest(tf.test.TestCase):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
    object_area = [100., 174.]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/area':
                    dataset_util.float_list_feature(object_area),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual((tensor_dict[fields.InputDataFields.groundtruth_area]
                         .get_shape().as_list()), [2])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)
@@ -583,67 +571,81 @@ class TfExampleDecoderTest(tf.test.TestCase):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
    object_is_crowd = [0, 1]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/is_crowd':
                    dataset_util.int64_list_feature(object_is_crowd),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual(
        (tensor_dict[fields.InputDataFields.groundtruth_is_crowd].get_shape()
         .as_list()), [2])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)

      self.assertAllEqual(
          [bool(item) for item in object_is_crowd],
          tensor_dict[fields.InputDataFields.groundtruth_is_crowd])

  @test_util.enable_c_shapes
  def testDecodeObjectDifficult(self):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
    object_difficult = [0, 1]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/difficult':
                    dataset_util.int64_list_feature(object_difficult),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual(
        (tensor_dict[fields.InputDataFields.groundtruth_difficult].get_shape()
         .as_list()), [2])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)

      self.assertAllEqual(
          [bool(item) for item in object_difficult],
          tensor_dict[fields.InputDataFields.groundtruth_difficult])

  @test_util.enable_c_shapes
  def testDecodeObjectGroupOf(self):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
    object_group_of = [0, 1]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/group_of':
                    dataset_util.int64_list_feature(object_group_of),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual(
        (tensor_dict[fields.InputDataFields.groundtruth_group_of].get_shape()
         .as_list()), [2])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)
@@ -655,25 +657,27 @@ class TfExampleDecoderTest(tf.test.TestCase):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)
    object_weights = [0.75, 1.0]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/object/weight':
                    dataset_util.float_list_feature(object_weights),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual((tensor_dict[fields.InputDataFields.groundtruth_weights]
                         .get_shape().as_list()), [None])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)
      self.assertAllEqual(object_weights,
                          tensor_dict[fields.InputDataFields.groundtruth_weights])

  @test_util.enable_c_shapes
  def testDecodeInstanceSegmentation(self):
@@ -682,15 +686,13 @@ class TfExampleDecoderTest(tf.test.TestCase):
    image_width = 3

    # Randomly generate image.
    image_tensor = np.random.randint(
        256, size=(image_height, image_width, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)

    # Randomly generate instance segmentation masks.
    instance_masks = (
        np.random.randint(2, size=(num_instances, image_height,
                                   image_width)).astype(np.float32))
    instance_masks_flattened = np.reshape(instance_masks, [-1])
@@ -698,25 +700,32 @@ class TfExampleDecoderTest(tf.test.TestCase):
    object_classes = np.random.randint(
        100, size=(num_instances)).astype(np.int64)

    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/height':
                    dataset_util.int64_feature(image_height),
                'image/width':
                    dataset_util.int64_feature(image_width),
                'image/object/mask':
                    dataset_util.float_list_feature(instance_masks_flattened),
                'image/object/class/label':
                    dataset_util.int64_list_feature(object_classes)
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder(
        load_instance_masks=True)
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertAllEqual(
        (tensor_dict[fields.InputDataFields.groundtruth_instance_masks]
         .get_shape().as_list()), [4, 5, 3])
    self.assertAllEqual((tensor_dict[fields.InputDataFields.groundtruth_classes]
                         .get_shape().as_list()), [4])

    with self.test_session() as sess:
      tensor_dict = sess.run(tensor_dict)
@@ -724,24 +733,21 @@ class TfExampleDecoderTest(tf.test.TestCase):
    self.assertAllEqual(
        instance_masks.astype(np.float32),
        tensor_dict[fields.InputDataFields.groundtruth_instance_masks])
    self.assertAllEqual(object_classes,
                        tensor_dict[fields.InputDataFields.groundtruth_classes])

  def testInstancesNotAvailableByDefault(self):
    num_instances = 4
    image_height = 5
    image_width = 3

    # Randomly generate image.
    image_tensor = np.random.randint(
        256, size=(image_height, image_width, 3)).astype(np.uint8)
    encoded_jpeg = self._EncodeImage(image_tensor)

    # Randomly generate instance segmentation masks.
    instance_masks = (
        np.random.randint(2, size=(num_instances, image_height,
                                   image_width)).astype(np.float32))
    instance_masks_flattened = np.reshape(instance_masks, [-1])
@@ -749,18 +755,26 @@ class TfExampleDecoderTest(tf.test.TestCase):
    object_classes = np.random.randint(
        100, size=(num_instances)).astype(np.int64)

    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/height':
                    dataset_util.int64_feature(image_height),
                'image/width':
                    dataset_util.int64_feature(image_width),
                'image/object/mask':
                    dataset_util.float_list_feature(instance_masks_flattened),
                'image/object/class/label':
                    dataset_util.int64_list_feature(object_classes)
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))

    self.assertTrue(
        fields.InputDataFields.groundtruth_instance_masks not in tensor_dict)

  def testDecodeImageLabels(self):
    image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
@@ -768,9 +782,9 @@ class TfExampleDecoderTest(tf.test.TestCase):
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded': dataset_util.bytes_feature(encoded_jpeg),
                'image/format': dataset_util.bytes_feature('jpeg'),
                'image/class/label': dataset_util.int64_list_feature([1, 2]),
            })).SerializeToString()

    example_decoder = tf_example_decoder.TfExampleDecoder()
    tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))
@@ -784,9 +798,12 @@ class TfExampleDecoderTest(tf.test.TestCase):
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                'image/encoded':
                    dataset_util.bytes_feature(encoded_jpeg),
                'image/format':
                    dataset_util.bytes_feature('jpeg'),
                'image/class/text':
                    dataset_util.bytes_list_feature(['dog', 'cat']),
            })).SerializeToString()

    label_map_string = """
      item {
@@ -177,8 +177,8 @@ def create_tf_example(image,
                dataset_util.float_list_feature(ymin),
            'image/object/bbox/ymax':
                dataset_util.float_list_feature(ymax),
-           'image/object/class/label':
-               dataset_util.int64_list_feature(category_ids),
+           'image/object/class/text':
+               dataset_util.bytes_list_feature(category_names),
            'image/object/is_crowd':
                dataset_util.int64_list_feature(is_crowd),
            'image/object/area':
@@ -106,6 +106,9 @@ class CreateCocoTFRecordTest(tf.test.TestCase):
    self._assertProtoEqual(
        example.features.feature['image/object/bbox/ymax'].float_list.value,
        [0.75])
    self._assertProtoEqual(
        example.features.feature['image/object/class/text'].bytes_list.value,
        ['cat'])
  def test_create_tf_example_with_instance_masks(self):
    image_file_name = 'tmp_image.jpg'
@@ -169,6 +172,9 @@ class CreateCocoTFRecordTest(tf.test.TestCase):
    self._assertProtoEqual(
        example.features.feature['image/object/bbox/ymax'].float_list.value,
        [1])
    self._assertProtoEqual(
        example.features.feature['image/object/class/text'].bytes_list.value,
        ['dog'])

    encoded_mask_pngs = [
        io.BytesIO(encoded_masks) for encoded_masks in example.features.feature[
            'image/object/mask'].bytes_list.value
@@ -14,7 +14,6 @@
# ==============================================================================
"""Common utility functions for evaluation."""

import collections
-import logging
import os
import time
@@ -53,15 +52,15 @@ def write_metrics(metrics, global_step, summary_dir):
    global_step: Global step at which the metrics are computed.
    summary_dir: Directory to write tensorflow summaries to.
  """
  tf.logging.info('Writing metrics to tf summary.')
  summary_writer = tf.summary.FileWriterCache.get(summary_dir)
  for key in sorted(metrics):
    summary = tf.Summary(value=[
        tf.Summary.Value(tag=key, simple_value=metrics[key]),
    ])
    summary_writer.add_summary(summary, global_step)
    tf.logging.info('%s: %f', key, metrics[key])
  tf.logging.info('Metrics written to tf summary.')
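A hedged usage sketch of `write_metrics` (metric names, values, and the directory are hypothetical):

write_metrics(
    metrics={'DetectionBoxes_Precision/mAP': 0.312, 'Loss/total_loss': 1.73},
    global_step=1000,
    summary_dir='/tmp/eval_summaries')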
# TODO(rathodv): Add tests.
@@ -141,7 +140,7 @@ def visualize_detection_results(result_dict,
  if show_groundtruth and input_fields.groundtruth_boxes not in result_dict:
    raise ValueError('If show_groundtruth is enabled, result_dict must contain '
                     'groundtruth_boxes.')
  tf.logging.info('Creating detection visualizations.')
  category_index = label_map_util.create_category_index(categories)
  image = np.squeeze(result_dict[input_fields.original_image], axis=0)
@@ -205,7 +204,8 @@ def visualize_detection_results(result_dict,
  summary_writer = tf.summary.FileWriterCache.get(summary_dir)
  summary_writer.add_summary(summary, global_step)
  tf.logging.info('Detection visualizations written to summary with tag %s.',
                  tag)

def _run_checkpoint_once(tensor_dict,
@@ -218,7 +218,8 @@ def _run_checkpoint_once(tensor_dict,
                         master='',
                         save_graph=False,
                         save_graph_dir='',
                         losses_dict=None,
                         eval_export_path=None):
  """Evaluates metrics defined in evaluators and returns summaries.

  This function loads the latest checkpoint in checkpoint_dirs and evaluates
@@ -258,6 +259,8 @@ def _run_checkpoint_once(tensor_dict,
    save_graph_dir: where to store the Tensorflow graph on disk. If save_graph
      is True this must be non-empty.
    losses_dict: optional dictionary of scalar detection losses.
    eval_export_path: Path for saving a json file that contains the detection
      results in json format.

  Returns:
    global_step: the count of global steps.
@@ -292,7 +295,8 @@ def _run_checkpoint_once(tensor_dict,
    try:
      for batch in range(int(num_batches)):
        if (batch + 1) % 100 == 0:
          tf.logging.info('Running eval ops batch %d/%d', batch + 1,
                          num_batches)
        if not batch_processor:
          try:
            if not losses_dict:
@@ -301,7 +305,7 @@ def _run_checkpoint_once(tensor_dict,
                losses_dict])
            counters['success'] += 1
          except tf.errors.InvalidArgumentError:
            tf.logging.info('Skipping image')
            counters['skipped'] += 1
            result_dict = {}
        else:
@@ -316,18 +320,31 @@ def _run_checkpoint_once(tensor_dict,
        # decoders to return correct image_id.
        # TODO(akuznetsa): result_dict contains batches of images, while
        # add_single_ground_truth_image_info expects a single image. Fix
        if (isinstance(result_dict, dict) and
            result_dict[fields.InputDataFields.key]):
          image_id = result_dict[fields.InputDataFields.key]
        else:
          image_id = batch
        evaluator.add_single_ground_truth_image_info(
            image_id=image_id, groundtruth_dict=result_dict)
        evaluator.add_single_detected_image_info(
            image_id=image_id, detections_dict=result_dict)
      tf.logging.info('Running eval batches done.')
    except tf.errors.OutOfRangeError:
      tf.logging.info('Done evaluating -- epoch limit reached')
    finally:
      # When done, ask the threads to stop.
      tf.logging.info('# success: %d', counters['success'])
      tf.logging.info('# skipped: %d', counters['skipped'])

    all_evaluator_metrics = {}
    if eval_export_path and eval_export_path is not None:
      for evaluator in evaluators:
        if (isinstance(evaluator, coco_evaluation.CocoDetectionEvaluator) or
            isinstance(evaluator, coco_evaluation.CocoMaskEvaluator)):
          tf.logging.info('Started dumping to json file.')
          evaluator.dump_detections_to_json_file(
              json_output_path=eval_export_path)
          tf.logging.info('Finished dumping to json file.')
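    # Editorial note (not part of the commit): the dump above is expected to
    # follow the COCO results format, so the exported file can be consumed
    # with plain json. Sketch, path hypothetical:
    #   import json
    #   with open('/tmp/eval/detections.json') as f:
    #     detections = json.load(f)
    #   # each entry resembles:
    #   # {"image_id": ..., "category_id": ..., "bbox": [x, y, w, h],
    #   #  "score": ...}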
    for evaluator in evaluators:
      metrics = evaluator.evaluate()
      evaluator.clear()
@@ -356,7 +373,8 @@ def repeated_checkpoint_run(tensor_dict,
                            master='',
                            save_graph=False,
                            save_graph_dir='',
                            losses_dict=None,
                            eval_export_path=None):
  """Periodically evaluates desired tensors using checkpoint_dirs or restore_fn.

  This function repeatedly loads a checkpoint and evaluates a desired
@@ -397,6 +415,8 @@ def repeated_checkpoint_run(tensor_dict,
    save_graph_dir: where to save on disk the Tensorflow graph. If store_graph
      is True this must be non-empty.
    losses_dict: optional dictionary of scalar detection losses.
    eval_export_path: Path for saving a json file that contains the detection
      results in json format.

  Returns:
    metrics: A dictionary containing metric names and values in the latest
@@ -417,31 +437,36 @@ def repeated_checkpoint_run(tensor_dict,
  number_of_evaluations = 0
  while True:
    start = time.time()
    tf.logging.info('Starting evaluation at ' + time.strftime(
        '%Y-%m-%d-%H:%M:%S', time.gmtime()))
    model_path = tf.train.latest_checkpoint(checkpoint_dirs[0])
    if not model_path:
      tf.logging.info('No model found in %s. Will try again in %d seconds',
                      checkpoint_dirs[0], eval_interval_secs)
    elif model_path == last_evaluated_model_path:
      tf.logging.info('Found already evaluated checkpoint. Will try again in '
                      '%d seconds', eval_interval_secs)
    else:
      last_evaluated_model_path = model_path
      global_step, metrics = _run_checkpoint_once(
          tensor_dict,
          evaluators,
          batch_processor,
          checkpoint_dirs,
          variables_to_restore,
          restore_fn,
          num_batches,
          master,
          save_graph,
          save_graph_dir,
          losses_dict=losses_dict,
          eval_export_path=eval_export_path)
      write_metrics(metrics, global_step, summary_dir)
    number_of_evaluations += 1
    if (max_number_of_evaluations and
        number_of_evaluations >= max_number_of_evaluations):
      tf.logging.info('Finished evaluation!')
      break
    time_to_next_eval = start + eval_interval_secs - time.time()
    if time_to_next_eval > 0:
@@ -680,4 +705,3 @@ def evaluator_options_from_eval_config(eval_config):
          eval_config.include_metrics_per_category)
  }
  return evaluator_options
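For orientation, a minimal sketch of how the new `eval_export_path` argument threads through a call; keyword names follow the signature shown in this diff, while `tensor_dict`, `evaluators`, the paths and the batch count are placeholders supplied by the caller's eval setup:

```python
# Sketch only, not part of this change; placeholder values throughout.
from object_detection import eval_util

metrics = eval_util.repeated_checkpoint_run(
    tensor_dict=tensor_dict,          # eval tensors built by the caller
    summary_dir='/tmp/eval',
    evaluators=evaluators,            # e.g. a CocoDetectionEvaluator
    checkpoint_dirs=['/tmp/train'],
    num_batches=100,
    eval_interval_secs=120,
    losses_dict=None,
    eval_export_path='/tmp/eval/detections.json')  # new: dump COCO json here
```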
@@ -2,13 +2,12 @@
We provide a collection of detection models pre-trained on the [COCO
dataset](http://mscoco.org), the [Kitti dataset](http://www.cvlibs.net/datasets/kitti/),
the [Open Images dataset](https://github.com/openimages/dataset), the
[AVA v2.1 dataset](https://research.google.com/ava/) and the
[iNaturalist Species Detection Dataset](https://github.com/visipedia/inat_comp/blob/master/2017/README.md#bounding-boxes).
These models can be useful for out-of-the-box inference if you are interested in
categories already in those datasets. They are also useful for initializing your
models when training on novel datasets.

In the table below, we list each such pre-trained model including:
@@ -113,6 +112,13 @@ Model name
[faster_rcnn_inception_resnet_v2_atrous_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28.tar.gz) | 727 | 37 | Boxes
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes

## iNaturalist Species-trained models
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
[faster_rcnn_resnet101_fgvc](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_fgvc_2018_07_19.tar.gz) | 395 | 58 | Boxes
[faster_rcnn_resnet50_fgvc](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_fgvc_2018_07_19.tar.gz) | 366 | 55 | Boxes
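As a quick sanity check, the tarballs above can be fetched and unpacked with the Python 3 standard library alone; a minimal sketch using the faster_rcnn_resnet101_fgvc URL from the table (the usual model-zoo archive layout, a frozen graph plus checkpoints, is assumed rather than guaranteed):

```python
# Download and unpack one of the iNaturalist-trained models listed above.
import tarfile
import urllib.request

MODEL = 'faster_rcnn_resnet101_fgvc_2018_07_19'
URL = ('http://download.tensorflow.org/models/object_detection/%s.tar.gz'
       % MODEL)

urllib.request.urlretrieve(URL, MODEL + '.tar.gz')
with tarfile.open(MODEL + '.tar.gz') as tar:
    tar.extractall()  # typically frozen_inference_graph.pb plus checkpoints
```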
## AVA v2.1 trained models
@@ -37,12 +37,12 @@ A local training job can be run with the following command:
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
NUM_TRAIN_STEPS=50000
SAMPLE_1_OF_N_EVAL_EXAMPLES=1
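# 1 evaluates every eval example; a larger N keeps 1 of every N examples.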
python object_detection/model_main.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --num_train_steps=${NUM_TRAIN_STEPS} \
    --sample_1_of_n_eval_examples=$SAMPLE_1_OF_N_EVAL_EXAMPLES \
    --alsologtostderr
```
@@ -216,7 +216,7 @@ To start training and evaluation, execute the following command from the
```bash
# From tensorflow/models/research/
gcloud ml-engine jobs submit training `whoami`_object_detection_pets_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version 1.8 \
    --job-dir=gs://${YOUR_GCS_BUCKET}/model_dir \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
@@ -52,7 +52,8 @@ def transform_input_data(tensor_dict,
                         num_classes,
                         data_augmentation_fn=None,
                         merge_multiple_boxes=False,
                         retain_original_image=False,
                         use_bfloat16=False):
  """A single function that is responsible for all input data transformations.

  Data transformation functions are applied in the following order.
@@ -86,6 +87,7 @@ def transform_input_data(tensor_dict,
      and classes for a given image if the boxes are exactly the same.
    retain_original_image: (optional) whether to retain original image in the
      output dictionary.
    use_bfloat16: (optional) whether to cast the preprocessed image to bfloat16
      for training.

  Returns:
    A dictionary keyed by fields.InputDataFields containing the tensors obtained
@@ -111,6 +113,9 @@ def transform_input_data(tensor_dict,
  image = tensor_dict[fields.InputDataFields.image]
  preprocessed_resized_image, true_image_shape = model_preprocess_fn(
      tf.expand_dims(tf.to_float(image), axis=0))
  if use_bfloat16:
    preprocessed_resized_image = tf.cast(
        preprocessed_resized_image, tf.bfloat16)
  tensor_dict[fields.InputDataFields.image] = tf.squeeze(
      preprocessed_resized_image, axis=0)
  tensor_dict[fields.InputDataFields.true_image_shape] = tf.squeeze(
@@ -128,13 +133,33 @@ def transform_input_data(tensor_dict,
  tensor_dict[fields.InputDataFields.groundtruth_classes] = tf.one_hot(
      zero_indexed_groundtruth_classes, num_classes)

  if fields.InputDataFields.groundtruth_confidences in tensor_dict:
    groundtruth_confidences = tensor_dict[
        fields.InputDataFields.groundtruth_confidences]
    tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
        tf.sparse_to_dense(
            zero_indexed_groundtruth_classes,
            [num_classes],
            groundtruth_confidences,
            validate_indices=False))
  else:
    groundtruth_confidences = tf.ones_like(
        zero_indexed_groundtruth_classes, dtype=tf.float32)
    tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
        tensor_dict[fields.InputDataFields.groundtruth_classes])
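  # Illustration with assumed values: for zero_indexed_groundtruth_classes =
  # [0, 2], num_classes = 3 and groundtruth_confidences = [0.9, 0.4],
  # tf.sparse_to_dense scatters the per-box scores into [0.9, 0.0, 0.4].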
  if merge_multiple_boxes:
    merged_boxes, merged_classes, merged_confidences, _ = (
        util_ops.merge_boxes_with_multiple_labels(
            tensor_dict[fields.InputDataFields.groundtruth_boxes],
            zero_indexed_groundtruth_classes,
            groundtruth_confidences,
            num_classes))
    merged_classes = tf.cast(merged_classes, tf.float32)
    tensor_dict[fields.InputDataFields.groundtruth_boxes] = merged_boxes
    tensor_dict[fields.InputDataFields.groundtruth_classes] = merged_classes
    tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
        merged_confidences)

  return tensor_dict
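To make the merge semantics concrete, a small standalone illustration (plain NumPy, not the actual util_ops.merge_boxes_with_multiple_labels, which operates on tensors and also returns merged box indices): two identical boxes with different labels collapse into one box whose class vector is multi-hot and whose confidences line up with it:

```python
import numpy as np

boxes = np.array([[0.1, 0.1, 0.5, 0.5],
                  [0.1, 0.1, 0.5, 0.5]])    # the same box, labeled twice
classes = np.array([0, 2])                  # zero-indexed label per box
confidences = np.array([1.0, 0.8])          # per-box label confidence
num_classes = 3

merged_boxes = boxes[:1]                    # one row per unique box
merged_classes = np.zeros(num_classes)
merged_confidences = np.zeros(num_classes)
merged_classes[classes] = 1.0               # -> [1., 0., 1.]
merged_confidences[classes] = confidences   # -> [1., 0., 0.8]
```

This matches the expectation in the updated test further down, where boxes merged across labels produce a [[1, 0, 1]] class row.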
@@ -183,6 +208,8 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
      fields.InputDataFields.groundtruth_difficult: [max_num_boxes],
      fields.InputDataFields.groundtruth_boxes: [max_num_boxes, 4],
      fields.InputDataFields.groundtruth_classes: [max_num_boxes, num_classes],
      fields.InputDataFields.groundtruth_confidences: [
          max_num_boxes, num_classes],
      fields.InputDataFields.groundtruth_instance_masks: [
          max_num_boxes, height, width
      ],
@@ -198,6 +225,7 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
          max_num_boxes, num_classes + 1 if num_classes is not None else None
      ],
      fields.InputDataFields.groundtruth_image_classes: [num_classes],
      fields.InputDataFields.groundtruth_image_confidences: [num_classes],
  }

  if fields.InputDataFields.original_image in tensor_dict:
@@ -252,9 +280,12 @@ def augment_input_data(tensor_dict, data_augmentation_options):
                          in tensor_dict)
  include_keypoints = (fields.InputDataFields.groundtruth_keypoints
                       in tensor_dict)
  include_label_scores = (fields.InputDataFields.groundtruth_confidences in
                          tensor_dict)
  tensor_dict = preprocessor.preprocess(
      tensor_dict, data_augmentation_options,
      func_arg_map=preprocessor.get_default_func_arg_map(
          include_label_scores=include_label_scores,
          include_instance_masks=include_instance_masks,
          include_keypoints=include_keypoints))
  tensor_dict[fields.InputDataFields.image] = tf.squeeze(
@@ -275,6 +306,7 @@ def _get_labels_dict(input_dict):
    labels_dict[key] = input_dict[key]

  optional_label_keys = [
      fields.InputDataFields.groundtruth_confidences,
      fields.InputDataFields.groundtruth_keypoints,
      fields.InputDataFields.groundtruth_instance_masks,
      fields.InputDataFields.groundtruth_area,
@@ -291,10 +323,42 @@ def _get_labels_dict(input_dict):
  return labels_dict


def _replace_empty_string_with_random_number(string_tensor):
  """Returns string unchanged if non-empty, and a random string tensor otherwise.

  The random string is an integer between 0 and 2**63 - 1, cast as a string.

  Args:
    string_tensor: A tf.tensor of dtype string.

  Returns:
    out_string: A tf.tensor of dtype string. If string_tensor contains the empty
      string, out_string will contain a random integer cast to a string.
      Otherwise string_tensor is returned unchanged.
  """
  empty_string = tf.constant('', dtype=tf.string, name='EmptyString')

  random_source_id = tf.as_string(
      tf.random_uniform(shape=[], maxval=2**63 - 1, dtype=tf.int64))

  out_string = tf.cond(
      tf.equal(string_tensor, empty_string),
      true_fn=lambda: random_source_id,
      false_fn=lambda: string_tensor)

  return out_string
def _get_features_dict(input_dict):
  """Extracts features dict from input dict."""
  source_id = _replace_empty_string_with_random_number(
      input_dict[fields.InputDataFields.source_id])

  hash_from_source_id = tf.string_to_hash_bucket_fast(source_id, HASH_BINS)
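  # Without the replacement above, every example with an empty source_id
  # would land in the same hash bucket.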
  features = {
      fields.InputDataFields.image:
          input_dict[fields.InputDataFields.image],
@@ -392,7 +456,8 @@ def create_train_input_fn(train_config, train_input_config,
        num_classes=config_util.get_number_of_classes(model_config),
        data_augmentation_fn=data_augmentation_fn,
        merge_multiple_boxes=train_config.merge_multiple_label_boxes,
        retain_original_image=train_config.retain_original_images,
        use_bfloat16=train_config.use_bfloat16)
    tensor_dict = pad_input_data_to_static_shapes(
        tensor_dict=transform_data_fn(tensor_dict),
@@ -41,11 +41,13 @@ def _get_configs_for_model(model_name):
  data_path = os.path.join(tf.resource_loader.get_data_files_path(),
                           'test_data/pets_examples.record')
  configs = config_util.get_configs_from_pipeline_file(fname)
  override_dict = {
      'train_input_path': data_path,
      'eval_input_path': data_path,
      'label_map_path': label_map_path
  }
  return config_util.merge_external_params_with_configs(
      configs, kwargs_dict=override_dict)


def _make_initializable_iterator(dataset):

@@ -89,6 +91,12 @@ class InputsTest(tf.test.TestCase):
        labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
    self.assertEqual(tf.float32,
                     labels[fields.InputDataFields.groundtruth_classes].dtype)
    self.assertAllEqual(
        [1, 100, model_config.faster_rcnn.num_classes],
        labels[fields.InputDataFields.groundtruth_confidences].shape.as_list())
    self.assertEqual(
        tf.float32,
        labels[fields.InputDataFields.groundtruth_confidences].dtype)
    self.assertAllEqual(
        [1, 100],
        labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
@@ -101,7 +109,7 @@ class InputsTest(tf.test.TestCase):
    model_config = configs['model']
    model_config.faster_rcnn.num_classes = 37
    eval_input_fn = inputs.create_eval_input_fn(
        configs['eval_config'], configs['eval_input_configs'][0], model_config)
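    # The eval input reader is now a repeated config field, so tests read the
    # first entry of configs['eval_input_configs'].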
    features, labels = _make_initializable_iterator(eval_input_fn()).get_next()
    self.assertAllEqual([1, None, None, 3],
                        features[fields.InputDataFields.image].shape.as_list())
@@ -123,6 +131,12 @@ class InputsTest(tf.test.TestCase):
        labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
    self.assertEqual(tf.float32,
                     labels[fields.InputDataFields.groundtruth_classes].dtype)
    self.assertAllEqual(
        [1, 100, model_config.faster_rcnn.num_classes],
        labels[fields.InputDataFields.groundtruth_confidences].shape.as_list())
    self.assertEqual(
        tf.float32,
        labels[fields.InputDataFields.groundtruth_confidences].dtype)
    self.assertAllEqual(
        [1, 100],
        labels[fields.InputDataFields.groundtruth_area].shape.as_list())
@@ -170,6 +184,13 @@ class InputsTest(tf.test.TestCase):
        labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
    self.assertEqual(tf.float32,
                     labels[fields.InputDataFields.groundtruth_classes].dtype)
    self.assertAllEqual(
        [batch_size, 100, model_config.ssd.num_classes],
        labels[
            fields.InputDataFields.groundtruth_confidences].shape.as_list())
    self.assertEqual(
        tf.float32,
        labels[fields.InputDataFields.groundtruth_confidences].dtype)
    self.assertAllEqual(
        [batch_size, 100],
        labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
@@ -182,7 +203,7 @@ class InputsTest(tf.test.TestCase):
    model_config = configs['model']
    model_config.ssd.num_classes = 37
    eval_input_fn = inputs.create_eval_input_fn(
        configs['eval_config'], configs['eval_input_configs'][0], model_config)
    features, labels = _make_initializable_iterator(eval_input_fn()).get_next()
    self.assertAllEqual([1, 300, 300, 3],
                        features[fields.InputDataFields.image].shape.as_list())
@@ -204,6 +225,13 @@ class InputsTest(tf.test.TestCase):
        labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
    self.assertEqual(tf.float32,
                     labels[fields.InputDataFields.groundtruth_classes].dtype)
    self.assertAllEqual(
        [1, 100, model_config.ssd.num_classes],
        labels[
            fields.InputDataFields.groundtruth_confidences].shape.as_list())
    self.assertEqual(
        tf.float32,
        labels[fields.InputDataFields.groundtruth_confidences].dtype)
    self.assertAllEqual(
        [1, 100],
        labels[fields.InputDataFields.groundtruth_area].shape.as_list())
@@ -225,7 +253,7 @@ class InputsTest(tf.test.TestCase):
    configs = _get_configs_for_model('ssd_inception_v2_pets')
    predict_input_fn = inputs.create_predict_input_fn(
        model_config=configs['model'],
        predict_input_config=configs['eval_input_configs'][0])
    serving_input_receiver = predict_input_fn()

    image = serving_input_receiver.features[fields.InputDataFields.image]
@@ -238,10 +266,10 @@ class InputsTest(tf.test.TestCase):
  def test_predict_input_with_additional_channels(self):
    """Tests the predict input function with additional channels."""
    configs = _get_configs_for_model('ssd_inception_v2_pets')
    configs['eval_input_configs'][0].num_additional_channels = 2
    predict_input_fn = inputs.create_predict_input_fn(
        model_config=configs['model'],
        predict_input_config=configs['eval_input_configs'][0])
    serving_input_receiver = predict_input_fn()

    image = serving_input_receiver.features[fields.InputDataFields.image]
@@ -291,7 +319,7 @@ class InputsTest(tf.test.TestCase):
    configs['model'].ssd.num_classes = 37
    eval_input_fn = inputs.create_eval_input_fn(
        eval_config=configs['train_config'],  # Expecting `EvalConfig`.
        eval_input_config=configs['eval_input_configs'][0],
        model_config=configs['model'])
    with self.assertRaises(TypeError):
      eval_input_fn()
@@ -313,11 +341,43 @@ class InputsTest(tf.test.TestCase):
    configs['model'].ssd.num_classes = 37
    eval_input_fn = inputs.create_eval_input_fn(
        eval_config=configs['eval_config'],
        eval_input_config=configs['eval_input_configs'][0],
        model_config=configs['eval_config'])  # Expecting `DetectionModel`.
    with self.assertRaises(TypeError):
      eval_input_fn()
  def test_output_equal_in_replace_empty_string_with_random_number(self):
    string_placeholder = tf.placeholder(tf.string, shape=[])
    replaced_string = inputs._replace_empty_string_with_random_number(
        string_placeholder)

    test_string = 'hello world'
    feed_dict = {string_placeholder: test_string}

    with self.test_session() as sess:
      out_string = sess.run(replaced_string, feed_dict=feed_dict)

    self.assertEqual(test_string, out_string)

  def test_output_is_integer_in_replace_empty_string_with_random_number(self):
    string_placeholder = tf.placeholder(tf.string, shape=[])
    replaced_string = inputs._replace_empty_string_with_random_number(
        string_placeholder)

    empty_string = ''
    feed_dict = {string_placeholder: empty_string}
    tf.set_random_seed(0)

    with self.test_session() as sess:
      out_string = sess.run(replaced_string, feed_dict=feed_dict)

    # Test whether out_string is a string which represents an integer.
    int(out_string)  # throws an error if out_string is not castable to int.
    self.assertEqual(out_string, '2798129067578209328')


class DataAugmentationFnTest(tf.test.TestCase):

@@ -352,6 +412,50 @@ class DataAugmentationFnTest(tf.test.TestCase):
        [[10, 10, 20, 20]]
    )
  def test_apply_image_and_box_augmentation_with_scores(self):
    data_augmentation_options = [
        (preprocessor.resize_image, {
            'new_height': 20,
            'new_width': 20,
            'method': tf.image.ResizeMethod.NEAREST_NEIGHBOR
        }),
        (preprocessor.scale_boxes_to_pixel_coordinates, {}),
    ]
    data_augmentation_fn = functools.partial(
        inputs.augment_input_data,
        data_augmentation_options=data_augmentation_options)
    tensor_dict = {
        fields.InputDataFields.image:
            tf.constant(np.random.rand(10, 10, 3).astype(np.float32)),
        fields.InputDataFields.groundtruth_boxes:
            tf.constant(np.array([[.5, .5, 1., 1.]], np.float32)),
        fields.InputDataFields.groundtruth_classes:
            tf.constant(np.array([1.0], np.float32)),
        fields.InputDataFields.groundtruth_confidences:
            tf.constant(np.array([0.8], np.float32)),
    }
    augmented_tensor_dict = data_augmentation_fn(tensor_dict=tensor_dict)
    with self.test_session() as sess:
      augmented_tensor_dict_out = sess.run(augmented_tensor_dict)

    self.assertAllEqual(
        augmented_tensor_dict_out[fields.InputDataFields.image].shape,
        [20, 20, 3]
    )
    self.assertAllClose(
        augmented_tensor_dict_out[fields.InputDataFields.groundtruth_boxes],
        [[10, 10, 20, 20]]
    )
    self.assertAllClose(
        augmented_tensor_dict_out[fields.InputDataFields.groundtruth_classes],
        [1.0]
    )
    self.assertAllClose(
        augmented_tensor_dict_out[
            fields.InputDataFields.groundtruth_confidences],
        [0.8]
    )
  def test_include_masks_in_data_augmentation(self):
    data_augmentation_options = [
        (preprocessor.resize_image, {

@@ -476,6 +580,9 @@ class DataTransformationFnTest(tf.test.TestCase):
    self.assertAllClose(
        transformed_inputs[fields.InputDataFields.groundtruth_classes],
        [[0, 0, 1], [1, 0, 0]])
    self.assertAllClose(
        transformed_inputs[fields.InputDataFields.groundtruth_confidences],
        [[0, 0, 1], [1, 0, 0]])
  def test_returns_correct_merged_boxes(self):
    tensor_dict = {

@@ -504,6 +611,9 @@ class DataTransformationFnTest(tf.test.TestCase):
    self.assertAllClose(
        transformed_inputs[fields.InputDataFields.groundtruth_classes],
        [[1, 0, 1]])
    self.assertAllClose(
        transformed_inputs[fields.InputDataFields.groundtruth_confidences],
        [[1, 0, 1]])
  def test_returns_resized_masks(self):
    tensor_dict = {

@@ -514,6 +624,7 @@ class DataTransformationFnTest(tf.test.TestCase):
        fields.InputDataFields.groundtruth_classes:
            tf.constant(np.array([3, 1], np.int32))
    }

    def fake_image_resizer_fn(image, masks=None):
      resized_image = tf.image.resize_images(image, [8, 8])
      results = [resized_image]
@@ -550,6 +661,7 @@ class DataTransformationFnTest(tf.test.TestCase):
        fields.InputDataFields.groundtruth_classes:
            tf.constant(np.array([3, 1], np.int32))
    }

    def fake_model_preprocessor_fn(image):
      return (image / 255., tf.expand_dims(tf.shape(image)[1:], axis=0))
@@ -577,6 +689,7 @@ class DataTransformationFnTest(tf.test.TestCase):
        fields.InputDataFields.groundtruth_classes:
            tf.constant(np.array([3, 1], np.int32))
    }

    def add_one_data_augmentation_fn(tensor_dict):
      return {key: value + 1 for key, value in tensor_dict.items()}
@@ -605,8 +718,10 @@ class DataTransformationFnTest(tf.test.TestCase):
        fields.InputDataFields.groundtruth_classes:
            tf.constant(np.array([3, 1], np.int32))
    }

    def mul_two_model_preprocessor_fn(image):
      return (image * 2, tf.expand_dims(tf.shape(image)[1:], axis=0))

    def add_five_to_image_data_augmentation_fn(tensor_dict):
      tensor_dict[fields.InputDataFields.image] += 5
      return tensor_dict
@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Evaluation executable for detection models.

This executable is used to evaluate DetectionModels. There are two ways of
@@ -54,29 +53,30 @@ from object_detection.legacy import evaluator
from object_detection.utils import config_util
from object_detection.utils import label_map_util

tf.logging.set_verbosity(tf.logging.INFO)

flags = tf.app.flags
flags.DEFINE_boolean('eval_training_data', False,
                     'If training data should be evaluated for this job.')
flags.DEFINE_string(
    'checkpoint_dir', '',
    'Directory containing checkpoints to evaluate, typically '
    'set to `train_dir` used in the training job.')
flags.DEFINE_string('eval_dir', '', 'Directory to write eval summaries to.')
flags.DEFINE_string(
    'pipeline_config_path', '',
    'Path to a pipeline_pb2.TrainEvalPipelineConfig config '
    'file. If provided, other configs are ignored')
flags.DEFINE_string('eval_config_path', '',
                    'Path to an eval_pb2.EvalConfig config file.')
flags.DEFINE_string('input_config_path', '',
                    'Path to an input_reader_pb2.InputReader config file.')
flags.DEFINE_string('model_config_path', '',
                    'Path to a model_pb2.DetectionModel config file.')
flags.DEFINE_boolean(
    'run_once', False, 'Option to only run a single pass of '
    'evaluation. Overrides the `max_evals` parameter in the '
    'provided config.')

FLAGS = flags.FLAGS
@@ -88,9 +88,10 @@ def main(unused_argv):
  if FLAGS.pipeline_config_path:
    configs = config_util.get_configs_from_pipeline_file(
        FLAGS.pipeline_config_path)
    tf.gfile.Copy(
        FLAGS.pipeline_config_path,
        os.path.join(FLAGS.eval_dir, 'pipeline.config'),
        overwrite=True)
  else:
    configs = config_util.get_configs_from_multiple_files(
        model_config_path=FLAGS.model_config_path,
@@ -99,9 +100,7 @@ def main(unused_argv):
    for name, config in [('model.config', FLAGS.model_config_path),
                         ('eval.config', FLAGS.eval_config_path),
                         ('input.config', FLAGS.input_config_path)]:
      tf.gfile.Copy(config, os.path.join(FLAGS.eval_dir, name), overwrite=True)

  model_config = configs['model']
  eval_config = configs['eval_config']
@@ -110,9 +109,7 @@ def main(unused_argv):
    input_config = configs['train_input_config']

  model_fn = functools.partial(
      model_builder.build, model_config=model_config, is_training=False)

  def get_next(config):
    return dataset_builder.make_initializable_iterator(
@@ -120,10 +117,8 @@ def main(unused_argv):
  create_input_dict_fn = functools.partial(get_next, input_config)

  categories = label_map_util.create_categories_from_labelmap(
      input_config.label_map_path)

  if FLAGS.run_once:
    eval_config.max_evals = 1
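For readers tracking the label_map_util change above, the new helper folds the old three-step pattern into a single call; a sketch of the equivalence, with both sides taken from this diff:

```python
# Old pattern (removed above):
label_map = label_map_util.load_labelmap(input_config.label_map_path)
max_num_classes = max([item.id for item in label_map.item])
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes)

# New equivalent one-liner:
categories = label_map_util.create_categories_from_labelmap(
    input_config.label_map_path)
```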