Unverified commit fe748d4a authored by pkulzc, committed by GitHub

Object detection changes: (#7208)

257914648  by lzc:

    Internal changes

--
257525973  by Zhichao Lu:

    Fixes bug that silently prevents checkpoints from loading when training w/ eager + functions. Also sets up scripts to run training.

--
257296614  by Zhichao Lu:

    Adding detection_features to model outputs

--
257234565  by Zhichao Lu:

    Fix wrong order of `classes_with_max_scores` in class-agnostic NMS, caused
    by sorting in partitioned NMS.

--
257232002  by ronnyvotel:

    Supporting `filter_nonoverlapping` option in np_box_list_ops.clip_to_window().

--
257198282  by Zhichao Lu:

    Adding the focal loss and l1 loss from the Objects as Points paper.
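    For reference, a minimal NumPy sketch of the two losses as defined in the
    paper (arXiv:1904.07850), not the API's actual implementation; the
    function names are illustrative, and alpha=2, beta=4 follow the paper:

        import numpy as np

        def penalty_reduced_focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-12):
          # Pixelwise focal loss on a predicted center heatmap in (0, 1).
          pos = gt == 1.0
          pos_loss = np.sum(((1 - pred[pos]) ** alpha) * np.log(pred[pos] + eps))
          neg_loss = np.sum(((1 - gt[~pos]) ** beta) * (pred[~pos] ** alpha) *
                            np.log(1 - pred[~pos] + eps))
          num_pos = max(pos.sum(), 1)  # normalize by the number of keypoints
          return -(pos_loss + neg_loss) / num_pos

        def l1_localization_loss(pred, target):
          # L1 loss on offsets/sizes, evaluated only at keypoint locations.
          return np.mean(np.abs(pred - target))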

--
257089535  by Zhichao Lu:

    Create Keras based ssd + resnetv1 + fpn.

--
257087407  by Zhichao Lu:

    Make object_detection/data_decoders Python3-compatible.

--
257004582  by Zhichao Lu:

    Updates _decode_raw_data_into_masks_and_boxes to the latest binary masks-to-string encoding format.

--
257002124  by Zhichao Lu:

    Make object_detection/utils Python3-compatible, except json_utils.

    The patching trick used in json_utils is not going to work in Python 3.

--
256795056  by lzc:

    Add a detection_anchor_indices field to detection outputs.

--
256477542  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
256387593  by Zhichao Lu:

    Edit class_id_function_approximations builder to skip class ids not present in label map.

--
256259039  by Zhichao Lu:

    Move NMS to TPU for FasterRCNN.

--
256071360  by rathodv:

    When multiclass_scores is empty, add one-hot encoding of groundtruth_classes as multiclass scores so that data_augmentation ops that expect the presence of multiclass_scores don't have to individually handle this case.

    Also copy the input tensor_dict to out_tensor_dict first to avoid in-place modification.
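    A rough TF1-style sketch of that fallback (hypothetical helper name; the
    real logic lives in the API's input pipeline):

        import tensorflow as tf

        def fill_multiclass_scores(groundtruth_classes, multiclass_scores,
                                   num_classes):
          # Fall back to one-hot groundtruth classes when scores are empty.
          return tf.cond(
              tf.size(multiclass_scores) > 0,
              lambda: tf.reshape(multiclass_scores, [-1, num_classes]),
              lambda: tf.one_hot(
                  groundtruth_classes, depth=num_classes, dtype=tf.float32))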

--
256023645  by Zhichao Lu:

    Adds the first WIP iterations of TensorFlow v2 eager + functions style custom training & evaluation loops.

--
255980623  by Zhichao Lu:

    Adds a new data augmentation operation "remap_labels" which remaps a set of labels to a new label.

--
255753259  by Zhichao Lu:

    Announcement of the released evaluation tutorial for Open Images Challenge
    2019.

--
255698776  by lzc:

    Fix the rewrite_nn_resize_op function, which was broken by the TF forward compatibility change.

--
255623150  by Zhichao Lu:

    Add Keras-based ResnetV1 models.

--
255504992  by Zhichao Lu:

    Fixing the typo in specifying label expansion for ground truth segmentation
    file.

--
255470768  by Zhichao Lu:

    1. Fixing Python bug with parsed arguments.
    2. Adding capability to parse relevant columns from CSV header.
    3. Fixing bug with duplicated labels expansion.

--
255462432  by Zhichao Lu:

    Adds a new data augmentation operation "drop_label_probabilistically" which drops a given label with the given probability. This supports experiments on training in the presence of label noise.

--
255441632  by rathodv:

    Fallback on groundtruth classes when multiclass_scores tensor is empty.

--
255434899  by Zhichao Lu:

    Ensures the evaluation binary can run even with big files by synchronizing
    the processing of ground truth and predictions: this way, ground truth is
    not stored but immediately used for evaluation. In the case of groundtruth
    object masks, this allows evaluations to run on relatively large sets.

--
255337855  by lzc:

    Internal change.

--
255308908  by Zhichao Lu:

    Add comment to clarify usage of calibration parameters proto.

--
255266371  by Zhichao Lu:

    Ensures correct processing of the case when no groundtruth masks are
    provided for an image.

--
255236648  by Zhichao Lu:

    Refactor model_builder in faster_rcnn.py to a util_map, so that it can be overridden.

--
255093285  by Zhichao Lu:

    Updates the capability to subsample data during evaluation.

--
255081222  by rathodv:

    Convert groundtruth masks to type float32 before they are used in the loss function.

    When using mixed precision training, masks are represented using bfloat16 tensors in the input pipeline for performance reasons. We need to convert them to float32 before using them in the loss function.
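    In effect (a one-line sketch of the cast, not the exact code path):

        groundtruth_masks = tf.cast(groundtruth_masks, tf.float32)  # bfloat16 -> float32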

--
254788436  by Zhichao Lu:

    Adds a forward-compatibility guard around non_max_suppression_with_scores
    so that it remains compatible with older TensorFlow versions.
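    The guard presumably follows TensorFlow's usual forward-compatibility
    pattern, sketched below (the cutoff date is illustrative):

        import tensorflow as tf
        from tensorflow.python.compat import compat

        def nms(boxes, scores, max_output_size, iou_threshold=0.5):
          if compat.forward_compatible(2019, 6, 6):
            # Newer TF: the op also returns the (soft-)suppressed scores.
            indices, _ = tf.image.non_max_suppression_with_scores(
                boxes, scores, max_output_size, iou_threshold=iou_threshold)
          else:
            # Older TF: fall back to plain hard NMS.
            indices = tf.image.non_max_suppression(
                boxes, scores, max_output_size, iou_threshold=iou_threshold)
          return indices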

--
254442362  by Zhichao Lu:

    Add num_layer field to ssd feature extractor proto.

--
253911582  by jonathanhuang:

    Plumbs Soft-NMS options (using the new tf.image.non_max_suppression_with_scores op) into the TF Object Detection API.  It adds a `soft_nms_sigma` field to the postprocessing proto file and plumbs this through to both the multiclass and class_agnostic versions of NMS. Note that there is no effect on behavior of NMS when soft_nms_sigma=0 (which it is set to by default).

    See also "Soft-NMS -- Improving Object Detection With One Line of Code" by Bodla et al (https://arxiv.org/abs/1704.04503)
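    For reference, a minimal standalone use of the underlying op (the API
    itself plumbs soft_nms_sigma in from the postprocessing proto):

        import tensorflow as tf

        boxes = tf.constant([[0., 0., 1., 1.], [0., 0.05, 1., 1.05]])
        scores = tf.constant([0.9, 0.8])
        # With soft_nms_sigma > 0, overlapping boxes have their scores
        # decayed instead of being discarded outright.
        indices, soft_scores = tf.image.non_max_suppression_with_scores(
            boxes, scores, max_output_size=2, iou_threshold=1.0,
            soft_nms_sigma=0.5)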

--
253703949  by Zhichao Lu:

    Internal test fixes.

--
253151266  by Zhichao Lu:

    Fix the op type check for FusedBatchNorm, given that we introduced
    FusedBatchNormV3 in a previous change.

--
252718956  by Zhichao Lu:

    Customizes the activation function to enable relu6 instead of relu for the
    saliency prediction model seastarization.

--
252158593  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
252150717  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
251967048  by Zhichao Lu:

    Make GraphRewriter proto extensible.

--
251950039  by Zhichao Lu:

    Remove experimental_export_device_assignment from TPUEstimator.export_savedmodel(), so as to remove rewrite_for_inference().

    As a replacement, the export_savedmodel() V2 API supports device_assignment: the user calls tpu.rewrite in model_fn and passes in the device_assignment there.

--
251890697  by rathodv:

    Updated docstring to include new output nodes.

--
251662894  by Zhichao Lu:

    Add autoaugment augmentation option to the object detection API codebase.
    This is an available option in preprocessor.py.

    Autoaugment is intended to be used along with random flipping and cropping
    for best results.

--
251532908  by Zhichao Lu:

    Add TrainingDataType enum to track whether class-specific or agnostic data was used to fit the calibration function.

    This is useful, since classes with few observations may require a calibration function fit on all classes.

--
251511339  by Zhichao Lu:

    Add multiclass isotonic regression to the calibration builder.
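    Conceptually, per-class isotonic regression looks like the sketch below
    (scikit-learn is used purely for illustration; the builder itself emits
    TF graph ops driven by the calibration proto):

        from sklearn.isotonic import IsotonicRegression

        def fit_multiclass_isotonic(scores_by_class, labels_by_class):
          # Fit one monotonic score -> probability mapping per class id.
          calibrators = {}
          for class_id, scores in scores_by_class.items():
            calibrator = IsotonicRegression(out_of_bounds='clip')
            calibrator.fit(scores, labels_by_class[class_id])
            calibrators[class_id] = calibrator
          return calibrators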

--
251317769  by pengchong:

    Internal Change.

--
250729989  by Zhichao Lu:

    Fixes a bug in the groundtruth statistics count in the case of mask and box annotations.

--
250729627  by Zhichao Lu:

    Label expansion for segmentation.

--
250724905  by Zhichao Lu:

    Fix use_depthwise in fpn and test it with fpnlite on ssd + mobilenet v2.

--
250670379  by Zhichao Lu:

    Internal change

--
250630364  by lzc:

    Fix detection_model_zoo footnotes

--
250560654  by Zhichao Lu:

    Fix static shape issue in matmul_crop_and_resize.

--
250534857  by Zhichao Lu:

    Edit class agnostic calibration function docstring to more accurately describe the function's outputs.

--
250533277  by Zhichao Lu:

    Edit the multiclass messages to use class ids instead of labels.

--

PiperOrigin-RevId: 257914648
parent 81123ebf
@@ -15,6 +15,10 @@
 """Tests for object_detection.utils.np_box_mask_list_test."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import numpy as np
 import tensorflow as tf
...
@@ -19,6 +19,11 @@ Example box operations that are supported:
 * Areas: compute bounding box areas
 * IOU: pairwise intersection-over-union scores
 """
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import numpy as np
...
@@ -15,6 +15,10 @@
 """Tests for object_detection.np_box_ops."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import numpy as np
 import tensorflow as tf
...
@@ -19,6 +19,11 @@ Example mask operations that are supported:
 * Areas: compute mask areas
 * IOU: pairwise intersection-over-union scores
 """
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import numpy as np

 EPSILON = 1e-7
...
@@ -15,6 +15,10 @@
 """Tests for object_detection.np_mask_ops."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import numpy as np
 import tensorflow as tf
...
@@ -27,12 +27,19 @@ It supports the following operations:
 Note: This module operates on numpy boxes and box lists.
 """
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 from abc import ABCMeta
 from abc import abstractmethod
 import collections
 import logging
 import unicodedata
 import numpy as np
+import six
+from six.moves import range
+from six.moves import zip
 import tensorflow as tf

 from object_detection.core import standard_fields
@@ -41,7 +48,7 @@ from object_detection.utils import metrics
 from object_detection.utils import per_image_evaluation


-class DetectionEvaluator(object):
+class DetectionEvaluator(six.with_metaclass(ABCMeta, object)):
   """Interface for object detection evalution classes.

   Example usage of the Evaluator:
@@ -58,7 +65,6 @@ class DetectionEvaluator(object):
     metrics_dict = evaluator.evaluate()
   """
-  __metaclass__ = ABCMeta

   def __init__(self, categories):
     """Constructor.
@@ -96,8 +102,8 @@ class DetectionEvaluator(object):
     Args:
       image_id: A unique string/integer identifier for the image.
-      groundtruth_dict: A dictionary of groundtruth numpy arrays required
-        for evaluations.
+      groundtruth_dict: A dictionary of groundtruth numpy arrays required for
+        evaluations.
     """
     pass
@@ -107,8 +113,8 @@ class DetectionEvaluator(object):
     Args:
       image_id: A unique string/integer identifier for the image.
-      detections_dict: A dictionary of detection numpy arrays required
-        for evaluation.
+      detections_dict: A dictionary of detection numpy arrays required for
+        evaluation.
     """
     pass
@@ -164,8 +170,8 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
         boxes to detection boxes.
       recall_lower_bound: lower bound of recall operating area.
       recall_upper_bound: upper bound of recall operating area.
-      evaluate_corlocs: (optional) boolean which determines if corloc scores
-        are to be returned or not.
+      evaluate_corlocs: (optional) boolean which determines if corloc scores are
+        to be returned or not.
       evaluate_precision_recall: (optional) boolean which determines if
         precision and recall values are to be returned or not.
       metric_prefix: (optional) string prefix for metric name; if None, no
@@ -173,8 +179,8 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
       use_weighted_mean_ap: (optional) boolean which determines if the mean
         average precision is computed directly from the scores and tp_fp_labels
         of all classes.
-      evaluate_masks: If False, evaluation will be performed based on boxes.
-        If True, mask evaluation will be performed instead.
+      evaluate_masks: If False, evaluation will be performed based on boxes. If
+        True, mask evaluation will be performed instead.
       group_of_weight: Weight of group-of boxes.If set to 0, detections of the
         correct class within a group-of box are ignored. If weight is > 0, then
         if at least one detection falls within a group-of box with
@@ -245,18 +251,20 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
       if idx + self._label_id_offset in category_index:
         category_name = category_index[idx + self._label_id_offset]['name']
         try:
-          category_name = unicode(category_name, 'utf-8')
+          category_name = six.text_type(category_name, 'utf-8')
         except TypeError:
           pass
-        category_name = unicodedata.normalize('NFKD', category_name).encode(
-            'ascii', 'ignore')
+        category_name = unicodedata.normalize('NFKD', category_name)
+        if six.PY2:
+          category_name = category_name.encode('ascii', 'ignore')
         self._metric_names.append(
             self._metric_prefix + 'PerformanceByCategory/AP@{}IOU/{}'.format(
                 self._matching_iou_threshold, category_name))
         if self._evaluate_corlocs:
           self._metric_names.append(
-              self._metric_prefix + 'PerformanceByCategory/CorLoc@{}IOU/{}'
-              .format(self._matching_iou_threshold, category_name))
+              self._metric_prefix +
+              'PerformanceByCategory/CorLoc@{}IOU/{}'.format(
+                  self._matching_iou_threshold, category_name))

   def add_single_ground_truth_image_info(self, image_id, groundtruth_dict):
     """Adds groundtruth for a single image to be used for evaluation.
@@ -270,10 +278,10 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
         standard_fields.InputDataFields.groundtruth_classes: integer numpy array
           of shape [num_boxes] containing 1-indexed groundtruth classes for the
           boxes.
-        standard_fields.InputDataFields.groundtruth_difficult: Optional length
-          M numpy boolean array denoting whether a ground truth box is a
-          difficult instance or not. This field is optional to support the case
-          that no boxes are difficult.
+        standard_fields.InputDataFields.groundtruth_difficult: Optional length M
+          numpy boolean array denoting whether a ground truth box is a difficult
+          instance or not. This field is optional to support the case that no
+          boxes are difficult.
         standard_fields.InputDataFields.groundtruth_instance_masks: Optional
           numpy array of shape [num_boxes, height, width] with values in {0, 1}.
@@ -290,8 +298,8 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
     # If the key is not present in the groundtruth_dict or the array is empty
     # (unless there are no annotations for the groundtruth on this image)
     # use values from the dictionary or insert None otherwise.
-    if (standard_fields.InputDataFields.groundtruth_difficult in
-        groundtruth_dict.keys() and
+    if (standard_fields.InputDataFields.groundtruth_difficult in six.viewkeys(
+        groundtruth_dict) and
         (groundtruth_dict[standard_fields.InputDataFields.groundtruth_difficult]
          .size or not groundtruth_classes.size)):
       groundtruth_difficult = groundtruth_dict[
@@ -299,7 +307,7 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
     else:
       groundtruth_difficult = None
       if not len(self._image_ids) % 1000:
-        logging.warn(
+        logging.warning(
            'image %s does not have groundtruth difficult flag specified',
            image_id)
     groundtruth_masks = None
@@ -332,9 +340,9 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
         standard_fields.DetectionResultFields.detection_classes: integer numpy
           array of shape [num_boxes] containing 1-indexed detection classes for
           the boxes.
-        standard_fields.DetectionResultFields.detection_masks: uint8 numpy
-          array of shape [num_boxes, height, width] containing `num_boxes` masks
-          of values ranging between 0 and 1.
+        standard_fields.DetectionResultFields.detection_masks: uint8 numpy array
+          of shape [num_boxes, height, width] containing `num_boxes` masks of
+          values ranging between 0 and 1.

     Raises:
       ValueError: If detection masks are not in detections dictionary.
@@ -383,11 +391,12 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
       if idx + self._label_id_offset in category_index:
         category_name = category_index[idx + self._label_id_offset]['name']
         try:
-          category_name = unicode(category_name, 'utf-8')
+          category_name = six.text_type(category_name, 'utf-8')
         except TypeError:
           pass
-        category_name = unicodedata.normalize(
-            'NFKD', category_name).encode('ascii', 'ignore')
+        category_name = unicodedata.normalize('NFKD', category_name)
+        if six.PY2:
+          category_name = category_name.encode('ascii', 'ignore')
         display_name = (
             self._metric_prefix + 'PerformanceByCategory/AP@{}IOU/{}'.format(
                 self._matching_iou_threshold, category_name))
@@ -409,8 +418,9 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
         # Optionally add CorLoc metrics.classes
         if self._evaluate_corlocs:
           display_name = (
-              self._metric_prefix + 'PerformanceByCategory/CorLoc@{}IOU/{}'
-              .format(self._matching_iou_threshold, category_name))
+              self._metric_prefix +
+              'PerformanceByCategory/CorLoc@{}IOU/{}'.format(
+                  self._matching_iou_threshold, category_name))
           pascal_metrics[display_name] = per_class_corloc[idx]

     return pascal_metrics
@@ -446,7 +456,7 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
       if key in self._expected_keys:
         eval_dict_filtered[key] = value

-    eval_dict_keys = eval_dict_filtered.keys()
+    eval_dict_keys = list(eval_dict_filtered.keys())

     def update_op(image_id, *eval_dict_batched_as_list):
       """Update operation that adds batch of images to ObjectDetectionEvaluator.
@@ -468,7 +478,7 @@ class ObjectDetectionEvaluator(DetectionEvaluator):
         self.add_single_detected_image_info(image_id, single_example_dict)

     args = [eval_dict_filtered[standard_fields.InputDataFields.key]]
-    args.extend(eval_dict_filtered.values())
+    args.extend(six.itervalues(eval_dict_filtered))
     update_op = tf.py_func(update_op, args, [])

     def first_value_func():
@@ -651,8 +661,8 @@ class OpenImagesDetectionEvaluator(ObjectDetectionEvaluator):
         standard_fields.InputDataFields.groundtruth_classes: integer numpy array
           of shape [num_boxes] containing 1-indexed groundtruth classes for the
           boxes.
-        standard_fields.InputDataFields.groundtruth_group_of: Optional length
-          M numpy boolean array denoting whether a groundtruth box contains a
+        standard_fields.InputDataFields.groundtruth_group_of: Optional length M
+          numpy boolean array denoting whether a groundtruth box contains a
           group of instances.

     Raises:
@@ -667,8 +677,8 @@ class OpenImagesDetectionEvaluator(ObjectDetectionEvaluator):
     # If the key is not present in the groundtruth_dict or the array is empty
     # (unless there are no annotations for the groundtruth on this image)
     # use values from the dictionary or insert None otherwise.
-    if (standard_fields.InputDataFields.groundtruth_group_of in
-        groundtruth_dict.keys() and
+    if (standard_fields.InputDataFields.groundtruth_group_of in six.viewkeys(
+        groundtruth_dict) and
         (groundtruth_dict[standard_fields.InputDataFields.groundtruth_group_of]
          .size or not groundtruth_classes.size)):
       groundtruth_group_of = groundtruth_dict[
@@ -676,7 +686,7 @@ class OpenImagesDetectionEvaluator(ObjectDetectionEvaluator):
     else:
       groundtruth_group_of = None
       if not len(self._image_ids) % 1000:
-        logging.warn(
+        logging.warning(
            'image %s does not have groundtruth group_of flag specified',
            image_id)
     if self._evaluate_masks:
@@ -741,18 +751,18 @@ class OpenImagesChallengeEvaluator(OpenImagesDetectionEvaluator):
       matching_iou_threshold: IOU threshold to use for matching groundtruth
         boxes to detection boxes.
       evaluate_corlocs: if True, additionally evaluates and returns CorLoc.
-      group_of_weight: weight of a group-of box. If set to 0, detections of the
-        correct class within a group-of box are ignored. If weight is > 0
-        (default for Open Images Detection Challenge), then if at least one
-        detection falls within a group-of box with matching_iou_threshold,
-        weight group_of_weight is added to true positives. Consequently, if no
-        detection falls within a group-of box, weight group_of_weight is added
-        to false negatives.
+      group_of_weight: Weight of group-of boxes. If set to 0, detections of the
+        correct class within a group-of box are ignored. If weight is > 0, then
+        if at least one detection falls within a group-of box with
+        matching_iou_threshold, weight group_of_weight is added to true
+        positives. Consequently, if no detection falls within a group-of box,
+        weight group_of_weight is added to false negatives.
     """
     if not evaluate_masks:
       metrics_prefix = 'OpenImagesDetectionChallenge'
     else:
       metrics_prefix = 'OpenImagesInstanceSegmentationChallenge'
     super(OpenImagesChallengeEvaluator, self).__init__(
         categories,
         matching_iou_threshold,
@@ -779,8 +789,8 @@ class OpenImagesChallengeEvaluator(OpenImagesDetectionEvaluator):
           boxes.
         standard_fields.InputDataFields.groundtruth_image_classes: integer 1D
           numpy array containing all classes for which labels are verified.
-        standard_fields.InputDataFields.groundtruth_group_of: Optional length
-          M numpy boolean array denoting whether a groundtruth box contains a
+        standard_fields.InputDataFields.groundtruth_group_of: Optional length M
+          numpy boolean array denoting whether a groundtruth box contains a
           group of instances.

     Raises:
@@ -864,8 +874,7 @@ class OpenImagesDetectionChallengeEvaluator(OpenImagesChallengeEvaluator):
   def __init__(self,
                categories,
                matching_iou_threshold=0.5,
-               evaluate_corlocs=False,
-               group_of_weight=1.0):
+               evaluate_corlocs=False):
     """Constructor.

     Args:
@@ -875,13 +884,6 @@ class OpenImagesDetectionChallengeEvaluator(OpenImagesChallengeEvaluator):
       matching_iou_threshold: IOU threshold to use for matching groundtruth
         boxes to detection boxes.
       evaluate_corlocs: if True, additionally evaluates and returns CorLoc.
-      group_of_weight: weight of a group-of box. If set to 0, detections of the
-        correct class within a group-of box are ignored. If weight is > 0
-        (default for Open Images Detection Challenge), then if at least one
-        detection falls within a group-of box with matching_iou_threshold,
-        weight group_of_weight is added to true positives. Consequently, if no
-        detection falls within a group-of box, weight group_of_weight is added
-        to false negatives.
     """
     super(OpenImagesDetectionChallengeEvaluator, self).__init__(
         categories=categories,
@@ -898,8 +900,7 @@ class OpenImagesInstanceSegmentationChallengeEvaluator(
   def __init__(self,
                categories,
                matching_iou_threshold=0.5,
-               evaluate_corlocs=False,
-               group_of_weight=1.0):
+               evaluate_corlocs=False):
     """Constructor.

     Args:
@@ -909,20 +910,13 @@ class OpenImagesInstanceSegmentationChallengeEvaluator(
       matching_iou_threshold: IOU threshold to use for matching groundtruth
         boxes to detection boxes.
       evaluate_corlocs: if True, additionally evaluates and returns CorLoc.
-      group_of_weight: weight of a group-of box. If set to 0, detections of the
-        correct class within a group-of box are ignored. If weight is > 0
-        (default for Open Images Detection Challenge), then if at least one
-        detection falls within a group-of box with matching_iou_threshold,
-        weight group_of_weight is added to true positives. Consequently, if no
-        detection falls within a group-of box, weight group_of_weight is added
-        to false negatives.
     """
     super(OpenImagesInstanceSegmentationChallengeEvaluator, self).__init__(
         categories=categories,
         evaluate_masks=True,
         matching_iou_threshold=matching_iou_threshold,
         evaluate_corlocs=False,
-        group_of_weight=1.0)
+        group_of_weight=0.0)


 class ObjectDetectionEvaluation(object):
@@ -943,8 +937,8 @@ class ObjectDetectionEvaluation(object):
     Args:
       num_groundtruth_classes: Number of ground-truth classes.
-      matching_iou_threshold: IOU threshold used for matching detected boxes
-        to ground-truth boxes.
+      matching_iou_threshold: IOU threshold used for matching detected boxes to
+        ground-truth boxes.
       nms_iou_threshold: IOU threshold used for non-maximum suppression.
       nms_max_output_boxes: Maximum number of boxes returned by non-maximum
         suppression.
@@ -960,8 +954,8 @@ class ObjectDetectionEvaluation(object):
         matching_iou_threshold, weight group_of_weight is added to true
         positives. Consequently, if no detection falls within a group-of box,
         weight group_of_weight is added to false negatives.
-      per_image_eval_class: The class that contains functions for computing
-        per image metrics.
+      per_image_eval_class: The class that contains functions for computing per
+        image metrics.

     Raises:
       ValueError: if num_groundtruth_classes is smaller than 1.
@@ -1019,23 +1013,23 @@ class ObjectDetectionEvaluation(object):
     Args:
       image_key: A unique string/integer identifier for the image.
-      groundtruth_boxes: float32 numpy array of shape [num_boxes, 4]
-        containing `num_boxes` groundtruth boxes of the format
-        [ymin, xmin, ymax, xmax] in absolute image coordinates.
+      groundtruth_boxes: float32 numpy array of shape [num_boxes, 4] containing
+        `num_boxes` groundtruth boxes of the format [ymin, xmin, ymax, xmax] in
+        absolute image coordinates.
       groundtruth_class_labels: integer numpy array of shape [num_boxes]
         containing 0-indexed groundtruth classes for the boxes.
       groundtruth_is_difficult_list: A length M numpy boolean array denoting
         whether a ground truth box is a difficult instance or not. To support
         the case that no boxes are difficult, it is by default set as None.
-      groundtruth_is_group_of_list: A length M numpy boolean array denoting
-        whether a ground truth box is a group-of box or not. To support
-        the case that no boxes are groups-of, it is by default set as None.
-      groundtruth_masks: uint8 numpy array of shape
-        [num_boxes, height, width] containing `num_boxes` groundtruth masks.
-        The mask values range from 0 to 1.
+      groundtruth_is_group_of_list: A length M numpy boolean array denoting
+        whether a ground truth box is a group-of box or not. To support the case
+        that no boxes are groups-of, it is by default set as None.
+      groundtruth_masks: uint8 numpy array of shape [num_boxes, height, width]
+        containing `num_boxes` groundtruth masks. The mask values range from 0
+        to 1.
     """
     if image_key in self.groundtruth_boxes:
-      logging.warn(
+      logging.warning(
          'image %s has already been added to the ground truth database.',
          image_key)
       return
@@ -1051,31 +1045,42 @@ class ObjectDetectionEvaluation(object):
     if groundtruth_is_group_of_list is None:
       num_boxes = groundtruth_boxes.shape[0]
       groundtruth_is_group_of_list = np.zeros(num_boxes, dtype=bool)
+    if groundtruth_masks is None:
+      num_boxes = groundtruth_boxes.shape[0]
+      mask_presence_indicator = np.zeros(num_boxes, dtype=bool)
+    else:
+      mask_presence_indicator = (np.sum(groundtruth_masks,
+                                        axis=(1, 2)) == 0).astype(dtype=bool)

     self.groundtruth_is_group_of_list[
         image_key] = groundtruth_is_group_of_list.astype(dtype=bool)

     self._update_ground_truth_statistics(
         groundtruth_class_labels,
-        groundtruth_is_difficult_list.astype(dtype=bool),
+        groundtruth_is_difficult_list.astype(dtype=bool)
+        | mask_presence_indicator,  # ignore boxes without masks
         groundtruth_is_group_of_list.astype(dtype=bool))

-  def add_single_detected_image_info(self, image_key, detected_boxes,
-                                     detected_scores, detected_class_labels,
+  def add_single_detected_image_info(self,
+                                     image_key,
+                                     detected_boxes,
+                                     detected_scores,
+                                     detected_class_labels,
                                      detected_masks=None):
     """Adds detections for a single image to be used for evaluation.

     Args:
       image_key: A unique string/integer identifier for the image.
-      detected_boxes: float32 numpy array of shape [num_boxes, 4]
-        containing `num_boxes` detection boxes of the format
-        [ymin, xmin, ymax, xmax] in absolute image coordinates.
+      detected_boxes: float32 numpy array of shape [num_boxes, 4] containing
+        `num_boxes` detection boxes of the format [ymin, xmin, ymax, xmax] in
+        absolute image coordinates.
       detected_scores: float32 numpy array of shape [num_boxes] containing
         detection scores for the boxes.
       detected_class_labels: integer numpy array of shape [num_boxes] containing
         0-indexed detection classes for the boxes.
-      detected_masks: np.uint8 numpy array of shape [num_boxes, height, width]
-        containing `num_boxes` detection masks with values ranging
-        between 0 and 1.
+      detected_masks: np.uint8 numpy array of shape [num_boxes, height, width]
+        containing `num_boxes` detection masks with values ranging between 0 and
+        1.

     Raises:
       ValueError: if the number of boxes, scores and class labels differ in
@@ -1083,13 +1088,14 @@ class ObjectDetectionEvaluation(object):
     """
     if (len(detected_boxes) != len(detected_scores) or
         len(detected_boxes) != len(detected_class_labels)):
-      raise ValueError('detected_boxes, detected_scores and '
-                       'detected_class_labels should all have same lengths. Got'
-                       '[%d, %d, %d]' % len(detected_boxes),
-                       len(detected_scores), len(detected_class_labels))
+      raise ValueError(
+          'detected_boxes, detected_scores and '
+          'detected_class_labels should all have same lengths. Got'
+          '[%d, %d, %d]' % len(detected_boxes), len(detected_scores),
+          len(detected_class_labels))

     if image_key in self.detection_keys:
-      logging.warn(
+      logging.warning(
          'image %s has already been added to the detection result database',
          image_key)
       return
@@ -1100,8 +1106,7 @@ class ObjectDetectionEvaluation(object):
       groundtruth_class_labels = self.groundtruth_class_labels[image_key]
       # Masks are popped instead of look up. The reason is that we do not want
       # to keep all masks in memory which can cause memory overflow.
-      groundtruth_masks = self.groundtruth_masks.pop(
-          image_key)
+      groundtruth_masks = self.groundtruth_masks.pop(image_key)
       groundtruth_is_difficult_list = self.groundtruth_is_difficult_list[
           image_key]
       groundtruth_is_group_of_list = self.groundtruth_is_group_of_list[
@@ -1145,19 +1150,21 @@ class ObjectDetectionEvaluation(object):
       statitistics.

     Args:
-      groundtruth_class_labels: An integer numpy array of length M,
-        representing M class labels of object instances in ground truth
+      groundtruth_class_labels: An integer numpy array of length M, representing
+        M class labels of object instances in ground truth
       groundtruth_is_difficult_list: A boolean numpy array of length M denoting
         whether a ground truth box is a difficult instance or not
       groundtruth_is_group_of_list: A boolean numpy array of length M denoting
         whether a ground truth box is a group-of box or not
     """
     for class_index in range(self.num_class):
       num_gt_instances = np.sum(groundtruth_class_labels[
           ~groundtruth_is_difficult_list
           & ~groundtruth_is_group_of_list] == class_index)
       num_groupof_gt_instances = self.group_of_weight * np.sum(
-          groundtruth_class_labels[groundtruth_is_group_of_list] == class_index)
+          groundtruth_class_labels[groundtruth_is_group_of_list
+                                   & ~groundtruth_is_difficult_list] ==
+          class_index)
       self.num_gt_instances_per_class[
           class_index] += num_gt_instances + num_groupof_gt_instances
       if np.any(groundtruth_class_labels == class_index):
@@ -1178,7 +1185,7 @@ class ObjectDetectionEvaluation(object):
       mean_corloc: Mean CorLoc score for each class, float scalar
     """
     if (self.num_gt_instances_per_class == 0).any():
-      logging.warn(
+      logging.warning(
          'The following classes have no ground truth examples: %s',
          np.squeeze(np.argwhere(self.num_gt_instances_per_class == 0)) +
          self.label_id_offset)
@@ -1233,6 +1240,7 @@ class ObjectDetectionEvaluation(object):
     else:
       mean_ap = np.nanmean(self.average_precision_per_class)
       mean_corloc = np.nanmean(self.corloc_per_class)
-    return ObjectDetectionEvalMetrics(
-        self.average_precision_per_class, mean_ap, self.precisions_per_class,
-        self.recalls_per_class, self.corloc_per_class, mean_corloc)
+    return ObjectDetectionEvalMetrics(self.average_precision_per_class, mean_ap,
+                                      self.precisions_per_class,
+                                      self.recalls_per_class,
+                                      self.corloc_per_class, mean_corloc)
...
@@ -15,8 +15,13 @@
 """Tests for object_detection.utils.object_detection_evaluation."""

+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 from absl.testing import parameterized
 import numpy as np
+import six
+from six.moves import range
 import tensorflow as tf

 from object_detection import eval_util
 from object_detection.core import standard_fields
@@ -310,17 +315,14 @@ class OpenImagesChallengeEvaluatorTest(tf.test.TestCase):
     expected_metric_name = 'OpenImagesInstanceSegmentationChallenge'
     self.assertAlmostEqual(
-        metrics[
-            expected_metric_name + '_PerformanceByCategory/AP@0.5IOU/dog'],
-        0.5)
+        metrics[expected_metric_name + '_PerformanceByCategory/AP@0.5IOU/dog'],
+        1.0)
     self.assertAlmostEqual(
         metrics[
             expected_metric_name + '_PerformanceByCategory/AP@0.5IOU/cat'],
         0)
     self.assertAlmostEqual(
-        metrics[
-            expected_metric_name + '_Precision/mAP@0.5IOU'],
-        0.25)
+        metrics[expected_metric_name + '_Precision/mAP@0.5IOU'], 0.5)
     oivchallenge_evaluator.clear()
     self.assertFalse(oivchallenge_evaluator._image_ids)
@@ -925,7 +927,7 @@ class ObjectDetectionEvaluationTest(tf.test.TestCase):
     ]
     expected_average_precision_per_class = np.array([1. / 6., 0, 0],
                                                     dtype=float)
-    expected_corloc_per_class = np.array([0, np.divide(0, 0), 0], dtype=float)
+    expected_corloc_per_class = np.array([0, 0, 0], dtype=float)
     expected_mean_ap = 1. / 18
     expected_mean_corloc = 0.0
     for i in range(self.od_eval.num_class):
@@ -1069,7 +1071,7 @@ class ObjectDetectionEvaluatorTest(tf.test.TestCase, parameterized.TestCase):
     with self.test_session() as sess:
       metrics = {}
-      for key, (value_op, _) in metric_ops.iteritems():
+      for key, (value_op, _) in six.iteritems(metric_ops):
         metrics[key] = value_op
       sess.run(update_op)
       metrics = sess.run(metrics)
...
@@ -14,10 +14,16 @@
 # ==============================================================================

 """A module for helper tensorflow ops."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import collections
 import math

 import six
+from six.moves import range
+from six.moves import zip
 import tensorflow as tf

 from object_detection.core import standard_fields as fields
...
@@ -14,7 +14,13 @@
 # ==============================================================================

 """Tests for object_detection.utils.ops."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import numpy as np
+import six
+from six.moves import range
 import tensorflow as tf

 from object_detection.core import standard_fields as fields
@@ -436,7 +442,7 @@ class GroundtruthFilterTest(tf.test.TestCase):
         fields.InputDataFields.groundtruth_is_crowd: [False],
         fields.InputDataFields.groundtruth_area: [32],
         fields.InputDataFields.groundtruth_difficult: [True],
-        fields.InputDataFields.groundtruth_label_types: ['APPROPRIATE'],
+        fields.InputDataFields.groundtruth_label_types: [six.b('APPROPRIATE')],
         fields.InputDataFields.groundtruth_confidences: [0.99],
     }
     with self.test_session() as sess:
@@ -610,7 +616,7 @@ class RetainGroundTruthWithPositiveClasses(tf.test.TestCase):
         fields.InputDataFields.groundtruth_is_crowd: [False],
         fields.InputDataFields.groundtruth_area: [32],
         fields.InputDataFields.groundtruth_difficult: [True],
-        fields.InputDataFields.groundtruth_label_types: ['APPROPRIATE'],
+        fields.InputDataFields.groundtruth_label_types: [six.b('APPROPRIATE')],
         fields.InputDataFields.groundtruth_confidences: [0.99],
     }
     with self.test_session() as sess:
@@ -819,8 +825,8 @@ class OpsTestPositionSensitiveCropRegions(tf.test.TestCase):
     image_shape = [3, 2, 6]

     # First channel is 1's, second channel is 2's, etc.
-    image = tf.constant(range(1, 3 * 2 + 1) * 6, dtype=tf.float32,
-                        shape=image_shape)
+    image = tf.constant(
+        list(range(1, 3 * 2 + 1)) * 6, dtype=tf.float32, shape=image_shape)
     boxes = tf.random_uniform((2, 4))

     # The result for both boxes should be [[1, 2], [3, 4], [5, 6]]
@@ -841,8 +847,8 @@ class OpsTestPositionSensitiveCropRegions(tf.test.TestCase):
     image_shape = [3, 3, 4]
     crop_size = [2, 2]

-    image = tf.constant(range(1, 3 * 3 + 1), dtype=tf.float32,
-                        shape=[3, 3, 1])
+    image = tf.constant(
+        list(range(1, 3 * 3 + 1)), dtype=tf.float32, shape=[3, 3, 1])
     tiled_image = tf.tile(image, [1, 1, image_shape[2]])
     boxes = tf.random_uniform((3, 4))
     box_ind = tf.constant([0, 0, 0], dtype=tf.int32)
@@ -908,8 +914,8 @@ class OpsTestPositionSensitiveCropRegions(tf.test.TestCase):
     num_boxes = 2

     # First channel is 1's, second channel is 2's, etc.
-    image = tf.constant(range(1, 3 * 2 + 1) * 6, dtype=tf.float32,
-                        shape=image_shape)
+    image = tf.constant(
+        list(range(1, 3 * 2 + 1)) * 6, dtype=tf.float32, shape=image_shape)
     boxes = tf.random_uniform((num_boxes, 4))

     expected_output = []
@@ -945,8 +951,8 @@ class OpsTestPositionSensitiveCropRegions(tf.test.TestCase):
     num_boxes = 2

     # First channel is 1's, second channel is 2's, etc.
-    image = tf.constant(range(1, 3 * 2 + 1) * 6, dtype=tf.float32,
-                        shape=image_shape)
+    image = tf.constant(
+        list(range(1, 3 * 2 + 1)) * 6, dtype=tf.float32, shape=image_shape)
     boxes = tf.random_uniform((num_boxes, 4))

     expected_output = []
@@ -1031,8 +1037,8 @@ class OpsTestBatchPositionSensitiveCropRegions(tf.test.TestCase):
     image_shape = [2, 2, 2, 4]
     crop_size = [2, 2]

-    images = tf.constant(range(1, 2 * 2 * 4 + 1) * 2, dtype=tf.float32,
-                         shape=image_shape)
+    images = tf.constant(
+        list(range(1, 2 * 2 * 4 + 1)) * 2, dtype=tf.float32, shape=image_shape)

     # First box contains whole image, and second box contains only first row.
     boxes = tf.constant(np.array([[[0., 0., 1., 1.]],
...
@@ -20,7 +20,12 @@ detection is supported by default.
 Based on the settings, per image evaluation is either performed on boxes or
 on object masks.
 """
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import numpy as np
+from six.moves import range

 from object_detection.utils import np_box_list
 from object_detection.utils import np_box_list_ops
...
@@ -15,7 +15,12 @@
 """Tests for object_detection.utils.per_image_evaluation."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import numpy as np
+from six.moves import range
 import tensorflow as tf

 from object_detection.utils import per_image_evaluation
...
@@ -19,7 +19,12 @@ a predefined IOU ratio. Multi-class detection is supported by default.
 Based on the settings, per image evaluation is performed either on phrase
 detection subtask or on relation detection subtask.
 """
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import numpy as np
+from six.moves import range

 from object_detection.utils import np_box_list
 from object_detection.utils import np_box_list_ops
...
@@ -13,6 +13,11 @@
 # limitations under the License.
 # ==============================================================================
 """Tests for object_detection.utils.per_image_vrd_evaluation."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import numpy as np
 import tensorflow as tf
...
@@ -15,6 +15,11 @@
 """Utils used to manipulate tensor shapes."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from six.moves import zip
+
 import tensorflow as tf

 from object_detection.utils import static_shape
...
@@ -15,6 +15,10 @@
 """Tests for object_detection.utils.shape_utils."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import numpy as np
 import tensorflow as tf
...
@@ -13,6 +13,11 @@
 # limitations under the License.
 # ==============================================================================
 """Spatial transformation ops like RoIAlign, CropAndResize."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import tensorflow as tf
@@ -32,7 +37,7 @@ def _coordinate_vector_1d(start, end, size, align_endpoints):
   """
   start = tf.expand_dims(start, -1)
   end = tf.expand_dims(end, -1)
-  length = tf.cast(end - start, dtype=tf.float32)
+  length = end - start
   if align_endpoints:
     relative_grid_spacing = tf.linspace(0.0, 1.0, size)
     offset = 0 if size > 1 else length / 2
@@ -40,6 +45,7 @@ def _coordinate_vector_1d(start, end, size, align_endpoints):
     relative_grid_spacing = tf.linspace(0.0, 1.0, size + 1)[:-1]
     offset = length / (2 * size)
   relative_grid_spacing = tf.reshape(relative_grid_spacing, [1, 1, size])
+  relative_grid_spacing = tf.cast(relative_grid_spacing, dtype=start.dtype)
   absolute_grid = start + offset + relative_grid_spacing * length
   return absolute_grid
@@ -170,12 +176,10 @@ def ravel_indices(feature_grid_y, feature_grid_x, num_levels, height, width,
     indices: A 1D int32 tensor containing feature point indices in a flattened
       feature grid.
   """
-  assert feature_grid_y.shape[0] == feature_grid_x.shape[0]
-  assert feature_grid_y.shape[1] == feature_grid_x.shape[1]
-  num_boxes = feature_grid_y.shape[1].value
-  batch_size = feature_grid_y.shape[0].value
-  size_y = feature_grid_y.shape[2]
-  size_x = feature_grid_x.shape[2]
+  num_boxes = tf.shape(feature_grid_y)[1]
+  batch_size = tf.shape(feature_grid_y)[0]
+  size_y = tf.shape(feature_grid_y)[2]
+  size_x = tf.shape(feature_grid_x)[2]
   height_dim_offset = width
   level_dim_offset = height * height_dim_offset
   batch_dim_offset = num_levels * level_dim_offset
@@ -213,17 +217,18 @@ def pad_to_max_size(features):
     true_feature_shapes: A 2D int32 tensor of shape [num_levels, 2] containing
       height and width of the feature maps before padding.
   """
-  heights = [feature.shape[1].value for feature in features]
-  widths = [feature.shape[2].value for feature in features]
-  max_height = max(heights)
-  max_width = max(widths)
+  heights = [tf.shape(feature)[1] for feature in features]
+  widths = [tf.shape(feature)[2] for feature in features]
+  max_height = tf.reduce_max(heights)
+  max_width = tf.reduce_max(widths)
   features_all = [
       tf.image.pad_to_bounding_box(feature, 0, 0, max_height,
                                    max_width) for feature in features
   ]
   features_all = tf.stack(features_all, axis=1)
-  true_feature_shapes = tf.stack([feature.shape[1:3] for feature in features])
+  true_feature_shapes = tf.stack([tf.shape(feature)[1:3]
+                                  for feature in features])
   return features_all, true_feature_shapes
@@ -247,7 +252,7 @@ def _gather_valid_indices(tensor, indices, padding_value=0.0):
   padded_tensor = tf.concat(
       [
          padding_value *
-          tf.ones([1, tensor.shape[-1].value], dtype=tensor.dtype), tensor
+          tf.ones([1, tf.shape(tensor)[-1]], dtype=tensor.dtype), tensor
       ],
       axis=0,
   )
@@ -307,9 +312,12 @@ def multilevel_roi_align(features, boxes, box_levels, output_size,
   """
   with tf.name_scope(scope, 'MultiLevelRoIAlign'):
     features, true_feature_shapes = pad_to_max_size(features)
-    (batch_size, num_levels, max_feature_height, max_feature_width,
-     num_filters) = features.get_shape().as_list()
-    _, num_boxes, _ = boxes.get_shape().as_list()
+    batch_size = tf.shape(features)[0]
+    num_levels = features.get_shape().as_list()[1]
+    max_feature_height = tf.shape(features)[2]
+    max_feature_width = tf.shape(features)[3]
+    num_filters = features.get_shape().as_list()[4]
+    num_boxes = tf.shape(boxes)[1]

     # Convert boxes to absolute co-ordinates.
     true_feature_shapes = tf.cast(true_feature_shapes, dtype=boxes.dtype)
@@ -463,7 +471,7 @@ def matmul_crop_and_resize(image, boxes, crop_size, extrapolation_value=0.0,
     A 5-D tensor of shape `[batch, num_boxes, crop_height, crop_width, depth]`
   """
   with tf.name_scope(scope, 'MatMulCropAndResize'):
-    box_levels = tf.zeros(boxes.shape.as_list()[:2], dtype=tf.int32)
+    box_levels = tf.zeros(tf.shape(boxes)[:2], dtype=tf.int32)
     return multilevel_roi_align([image],
                                 boxes,
                                 box_levels,
...
@@ -19,6 +19,7 @@ from __future__ import division
 from __future__ import print_function

 import numpy as np
+from six.moves import range
 import tensorflow as tf

 from object_detection.utils import spatial_transform_ops as spatial_ops
...
@@ -18,6 +18,10 @@
 The rank 4 tensor_shape must be of the form [batch_size, height, width, depth].
 """
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+

 def get_dim_as_int(dim):
   """Utility to get v1 or v2 TensorShape dim as an int.
...
@@ -15,6 +15,10 @@
 """Tests for object_detection.utils.static_shape."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import tensorflow as tf

 from object_detection.utils import static_shape
...
@@ -14,7 +14,11 @@
 # ==============================================================================
 """A convenience wrapper around tf.test.TestCase to enable TPU tests."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
 import os
+from six.moves import zip
 import tensorflow as tf

 from tensorflow.contrib import tpu
...