Commit 1efe98bb authored by Zhichao Lu, committed by lzc5123016

Merged commit includes the following changes:

185215255  by Zhichao Lu:

    Stop populating image/object/class/text field when generating COCO tf record.

--
185213306  by Zhichao Lu:

    Use the params batch size and not the one from train_config in input_fn

--
185209081  by Zhichao Lu:

    Handle the case when there are no ground-truth masks for an image.

--
185195531  by Zhichao Lu:

    Remove unstack and stack operations on features from third_party/object_detection/model.py.

--
185195017  by Zhichao Lu:

    Matrix multiplication based gather op implementation.
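
    A minimal sketch of the idea, assuming a 2-D `params` tensor (the helper
    name is illustrative, not necessarily what this change adds): some
    accelerators handle a dense matmul better than `tf.gather`, so the lookup
    is phrased as one_hot(indices) @ params.

```python
import tensorflow as tf

def matmul_gather(params, indices):
  # one_hot(indices) has shape [n, num_rows]; multiplying by params
  # ([num_rows, d]) selects the indexed rows, yielding [n, d].
  one_hot = tf.one_hot(indices, depth=tf.shape(params)[0],
                       dtype=params.dtype)
  return tf.matmul(one_hot, params)
```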

--
185187744  by Zhichao Lu:

    Fix eval_util minor issue.

--
185098733  by Zhichao Lu:

    Internal change

--
185076656  by Zhichao Lu:

    Increase the number of boxes for coco17.

--
185074199  by Zhichao Lu:

    Add config for SSD Resnet50 v1 with FPN.

--
185060199  by Zhichao Lu:

    Fix a bug in clear_detections.
    This method set detection_keys to an empty dictionary instead of an empty set. I've refactored it so that this method and the constructor use the same code path.
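
    A hedged sketch of the refactor pattern (class and member names are
    illustrative):

```python
class Evaluator(object):

  def __init__(self):
    self._reset_state()

  def clear(self):
    # Previously re-implemented the reset here and used {} (an empty dict)
    # where an empty set was intended; sharing one helper avoids the drift.
    self._reset_state()

  def _reset_state(self):
    self.detection_keys = set()
```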

--
185031359  by Zhichao Lu:

    Evaluate TPU-trained models continuously.

--
185016591  by Zhichao Lu:

    Use TPUEstimatorSpec for TPU

--
185013651  by Zhichao Lu:

    Add PreprocessorCache to record and duplicate augmentations.
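
    A hedged sketch of the record-and-replay idea (the real class may differ
    in detail): the first preprocessing call stores its sampled random
    parameters, and a second call replays them so that, e.g., an image and
    its mask receive identical random flips.

```python
class PreprocessorCache(object):
  """Records sampled augmentation parameters so they can be replayed."""

  def __init__(self):
    self._cache = {}

  def get(self, function_id, key):
    # Returns the previously recorded value, or None on the first call.
    return self._cache.get(function_id, {}).get(key)

  def update(self, function_id, key, value):
    self._cache.setdefault(function_id, {})[key] = value
```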

--
184921763  by Zhichao Lu:

    Minor fixes for object detection.

--
184920610  by Zhichao Lu:

    Adds a model builder test for "embedded_ssd_mobilenet_v1" feature extractor.

--
184919284  by Zhichao Lu:

    Added unit tests for TPU, with optional training / eval.

--
184915910  by Zhichao Lu:

    Update third_party g3 doc with Mask RCNN detection models.

--
184914085  by Zhichao Lu:

    Slight change to WeightSharedConvolutionalBoxPredictor implementation to make things match more closely with RetinaNet.  Specifically we now construct the box encoding and class predictor towers separately rather than having them share weights until penultimate layer.

--
184913786  by Zhichao Lu:

    Plumbs SSD Resnet V1 with FPN models into model builder.

--
184910030  by Zhichao Lu:

    Add coco metrics to evaluator.

--
184897758  by Zhichao Lu:

    Merge changes from github.

--
184888736  by Zhichao Lu:

    Ensure groundtruth_weights are always 1-D.

--
184887256  by Zhichao Lu:

    Introduce an option to add summaries in the model so it can be turned off when necessary.

--
184865559  by Zhichao Lu:

    Updating inputs so that a dictionary of tensors is returned from input_fn. Moving unbatch/unpad to model.py.
    Also removing source_id key from features dictionary, and replacing with an integer hash.
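
    A hedged sketch of such a hash in TF 1.x (key names and the bucket count
    are illustrative):

```python
import tensorflow as tf

NUM_HASH_BUCKETS = 2 ** 31 - 1  # illustrative bucket count

def replace_source_id_with_hash(features):
  features['hash'] = tf.string_to_hash_bucket_fast(
      features['source_id'], NUM_HASH_BUCKETS)
  del features['source_id']
  return features
```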

--
184859205  by Zhichao Lu:

    This CL is trying to hide those differences by making the default settings work with the public code.

--
184769779  by Zhichao Lu:

    Pass groundtruth weights into ssd meta architecture all the way to target assigner.

    This will allow training ssd models with padded groundtruth tensors.

--
184767117  by Zhichao Lu:

    * Add `params` arg to make all input fns work with TPUEstimator
    * Add --master
    * Output eval results

--
184766244  by Zhichao Lu:

    Update create_coco_tf_record to include category indices

--
184752937  by Zhichao Lu:

    Create a third_party version of TPU compatible mobilenet_v2_focal_loss coco config.

--
184750174  by Zhichao Lu:

    A few small fixes for multiscale anchor generator and a test.

--
184746581  by Zhichao Lu:

    Update jupyter notebook to show mask if provided by model.

--
184728646  by Zhichao Lu:

    Adding a few more tests to make sure decoding with/without label maps performs as expected.

--
184624154  by Zhichao Lu:

    Add an object detection binary for TPU.

--
184622118  by Zhichao Lu:

    Batch, transform, and unbatch in the tflearn interface.

--
184595064  by Zhichao Lu:

    Add support for training grayscale models.

--
184532026  by Zhichao Lu:

    Change dataset_builder.build to perform optional batching using tf.data.Dataset API
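
    A hedged sketch of what optional batching with tf.data looks like (the
    actual build signature may differ):

```python
def maybe_batch(dataset, batch_size=None):
  # When batch_size is unset the dataset yields single examples, which
  # preserves the old behavior; otherwise tf.data does the batching.
  if batch_size is not None:
    dataset = dataset.batch(batch_size)
  return dataset
```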

--
184330239  by Zhichao Lu:

    Add augment_input_data and transform_input_data helper functions to third_party/tensorflow_models/object_detection/inputs.py

--
184328681  by Zhichao Lu:

    Use an internal rgb to gray method that can be quantized.
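
    One plausible quantization-friendly formulation, sketched below (not
    necessarily the internal method): a fixed 1x1 convolution with standard
    luminance weights, which quantization tooling handles like any other conv.

```python
import tensorflow as tf

def rgb_to_grayscale(images):
  # images: [batch, height, width, 3] float32.
  # ITU-R BT.601 luma weights expressed as a 1x1 conv kernel.
  kernel = tf.reshape(
      tf.constant([0.299, 0.587, 0.114], dtype=tf.float32), [1, 1, 3, 1])
  return tf.nn.conv2d(images, kernel, strides=[1, 1, 1, 1], padding='SAME')
```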

--
184327909  by Zhichao Lu:

    Helper function to return padding shapes to use with Dataset.padded_batch.
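
    An illustrative use with Dataset.padded_batch (keys and sizes are
    assumptions; the real helper derives them from the model and input
    configs):

```python
def pad_and_batch(dataset, batch_size):
  # Variable-length groundtruth is padded to fixed shapes so examples
  # can be stacked into a batch.
  padded_shapes = {
      'image': [640, 640, 3],
      'groundtruth_boxes': [100, 4],   # num_boxes padded up to 100
      'groundtruth_classes': [100],
  }
  return dataset.padded_batch(batch_size, padded_shapes=padded_shapes)
```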

--
184326291  by Zhichao Lu:

    Added decode_func for specialized decoding.

--
184314676  by Zhichao Lu:

    Add unstack_batch method to inputs.py.

    This will enable us to convert batched tensors to lists of tensors. This is compatible with the OD API, which consumes the groundtruth batch as a list of tensors.
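
    A hedged sketch (the real method also unpads using the per-image box
    counts, which is omitted here):

```python
import tensorflow as tf

def unstack_batch(tensor_dict):
  # Requires a statically known batch dimension; each [batch, ...] tensor
  # becomes a Python list of per-example tensors.
  return {key: tf.unstack(value) for key, value in tensor_dict.items()}
```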

--
184281269  by Zhichao Lu:

    Internal test target changes.

--
184192851  by Zhichao Lu:

    Adding `Estimator` interface for object detection.

--
184187885  by Zhichao Lu:

    Add config_util functions to help with input pipeline.

    1. function to return expected shapes from the resizer config
    2. function to extract image_resizer_config from model_config.

--
184139892  by Zhichao Lu:

    Adding support for depthwise SSD (ssd-lite) and depthwise box predictions.

--
184089891  by Zhichao Lu:

    Fix third_party faster rcnn resnet101 coco config.

--
184083378  by Zhichao Lu:

    In the case when there is no object/weights field in tf.Example proto, return a default weight of 1.0 for all boxes.
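
    A hedged sketch of the fallback (tensor names are illustrative): when the
    decoded weights tensor is empty, substitute ones.

```python
import tensorflow as tf

def default_groundtruth_weights(groundtruth_boxes, groundtruth_weights):
  num_boxes = tf.shape(groundtruth_boxes)[0]
  return tf.cond(
      tf.greater(tf.size(groundtruth_weights), 0),
      lambda: groundtruth_weights,
      lambda: tf.ones([num_boxes], dtype=tf.float32))
```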

--

PiperOrigin-RevId: 185215255
parent fbc5ba06
......@@ -19,6 +19,7 @@ import os
import tempfile
import tensorflow as tf
from google.protobuf import text_format
from tensorflow.core.protobuf import saver_pb2
from tensorflow.python import pywrap_tensorflow
from tensorflow.python.client import session
from tensorflow.python.framework import graph_util
......@@ -354,16 +355,22 @@ def _export_inference_graph(input_type,
if graph_hook_fn: graph_hook_fn()
saver_kwargs = {}
if use_moving_averages:
temp_checkpoint_file = tempfile.NamedTemporaryFile()
# This check is to be compatible with both versions of SaverDef.
if os.path.isfile(trained_checkpoint_prefix):
saver_kwargs['write_version'] = saver_pb2.SaverDef.V1
temp_checkpoint_prefix = tempfile.NamedTemporaryFile().name
else:
temp_checkpoint_prefix = tempfile.mkdtemp()
replace_variable_values_with_moving_averages(
tf.get_default_graph(), trained_checkpoint_prefix,
temp_checkpoint_file.name)
checkpoint_to_use = temp_checkpoint_file.name
temp_checkpoint_prefix)
checkpoint_to_use = temp_checkpoint_prefix
else:
checkpoint_to_use = trained_checkpoint_prefix
saver = tf.train.Saver()
saver = tf.train.Saver(**saver_kwargs)
input_saver_def = saver.as_saver_def()
_write_graph_and_checkpoint(
......
......@@ -23,7 +23,7 @@ In the table below, we list each such pre-trained model including:
* detector performance on subset of the COCO validation set or Open Images test split as measured by the dataset-specific mAP measure.
Here, higher is better, and we only report bounding box mAP rounded to the
nearest integer.
* Output types (currently only `Boxes`)
* Output types (`Boxes`, and `Masks` if applicable)
You can un-tar each tar.gz file via, e.g.,:
......@@ -55,7 +55,7 @@ Some remarks on frozen inference graphs:
a detector (and discarding the part past that point), which negatively impacts
standard mAP metrics.
* Our frozen inference graphs are generated using the
[v1.4.0](https://github.com/tensorflow/tensorflow/tree/v1.4.0)
[v1.5.0](https://github.com/tensorflow/tensorflow/tree/v1.5.0)
release version of Tensorflow and we do not guarantee that these will work
with other versions; this being said, each frozen inference graph can be
regenerated using your current version of Tensorflow by re-running the
......@@ -69,16 +69,20 @@ Some remarks on frozen inference graphs:
| ------------ | :--------------: | :--------------: | :-------------: |
| [ssd_mobilenet_v1_coco](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2017_11_17.tar.gz) | 30 | 21 | Boxes |
| [ssd_inception_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2017_11_17.tar.gz) | 42 | 24 | Boxes |
| [faster_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2017_11_08.tar.gz) | 58 | 28 | Boxes |
| [faster_rcnn_resnet50_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_coco_2017_11_08.tar.gz) | 89 | 30 | Boxes |
| [faster_rcnn_resnet50_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_lowproposals_coco_2017_11_08.tar.gz) | 64 | | Boxes |
| [rfcn_resnet101_coco](http://download.tensorflow.org/models/object_detection/rfcn_resnet101_coco_2017_11_08.tar.gz) | 92 | 30 | Boxes |
| [faster_rcnn_resnet101_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2017_11_08.tar.gz) | 106 | 32 | Boxes |
| [faster_rcnn_resnet101_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_lowproposals_coco_2017_11_08.tar.gz) | 82 | | Boxes |
| [faster_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_coco_2017_11_08.tar.gz) | 620 | 37 | Boxes |
| [faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2017_11_08.tar.gz) | 241 | | Boxes |
| [faster_rcnn_nas](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_coco_2017_11_08.tar.gz) | 1833 | 43 | Boxes |
| [faster_rcnn_nas_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_lowproposals_coco_2017_11_08.tar.gz) | 540 | | Boxes |
| [faster_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2018_01_28.tar.gz) | 58 | 28 | Boxes |
| [faster_rcnn_resnet50_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_coco_2018_01_28.tar.gz) | 89 | 30 | Boxes |
| [faster_rcnn_resnet50_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_lowproposals_coco_2018_01_28.tar.gz) | 64 | | Boxes |
| [rfcn_resnet101_coco](http://download.tensorflow.org/models/object_detection/rfcn_resnet101_coco_2018_01_28.tar.gz) | 92 | 30 | Boxes |
| [faster_rcnn_resnet101_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2018_01_28.tar.gz) | 106 | 32 | Boxes |
| [faster_rcnn_resnet101_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_lowproposals_coco_2018_01_28.tar.gz) | 82 | | Boxes |
| [faster_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz) | 620 | 37 | Boxes |
| [faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2018_01_28.tar.gz) | 241 | | Boxes |
| [faster_rcnn_nas](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_coco_2018_01_28.tar.gz) | 1833 | 43 | Boxes |
| [faster_rcnn_nas_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_lowproposals_coco_2018_01_28.tar.gz) | 540 | | Boxes |
| [mask_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz) | 771 | 36 | Masks |
| [mask_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_inception_v2_coco_2018_01_28.tar.gz) | 79 | 25 | Masks |
| [mask_rcnn_resnet101_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet101_atrous_coco_2018_01_28.tar.gz) | 470 | 33 | Masks |
| [mask_rcnn_resnet50_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet50_atrous_coco_2018_01_28.tar.gz) | 343 | 29 | Masks |
......@@ -86,14 +90,14 @@ Some remarks on frozen inference graphs:
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
[faster_rcnn_resnet101_kitti](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_kitti_2017_11_08.tar.gz) | 79 | 87 | Boxes
[faster_rcnn_resnet101_kitti](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_kitti_2018_01_28.tar.gz) | 79 | 87 | Boxes
## Open Images-trained models {#open-images-models}
Model name | Speed (ms) | Open Images mAP@0.5[^2] | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
[faster_rcnn_inception_resnet_v2_atrous_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2017_11_08.tar.gz) | 727 | 37 | Boxes
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2017_11_08.tar.gz) | 347 | | Boxes
[faster_rcnn_inception_resnet_v2_atrous_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28.tar.gz) | 727 | 37 | Boxes
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes
[^1]: See [MSCOCO evaluation protocol](http://cocodataset.org/#detections-eval).
......
## Run an Instance Segmentation Model
For some applications it isn't sufficient to localize an object with a
simple bounding box. For instance, you might want to segment an object region
once it is detected. This class of problems is called **instance segmentation**.
<p align="center">
<img src="img/kites_with_segment_overlay.png" width=676 height=450>
</p>
### Materializing data for instance segmentation {#materializing-instance-seg}
Instance segmentation is an extension of object detection, where a binary mask
(i.e. object vs. background) is associated with every bounding box. This allows
for more fine-grained information about the extent of the object within the box.
To train an instance segmentation model, a groundtruth mask must be supplied for
every groundtruth bounding box. In addition to the proto fields listed in the
section titled [Using your own dataset](using_your_own_dataset.md), one must
also supply `image/object/mask`, which can either be a repeated list of
single-channel encoded PNG strings, or a single dense 3D binary tensor where
masks corresponding to each object are stacked along the first dimension. Each
is described in more detail below.
#### PNG Instance Segmentation Masks
Instance segmentation masks can be supplied as serialized PNG images.
```shell
image/object/mask = ["\x89PNG\r\n\x1A\n\x00\x00\x00\rIHDR\...", ...]
```
These masks are whole-image masks, one for each object instance. The spatial
dimensions of each mask must agree with the image. Each mask has only a single
channel, and the pixel values are either 0 (background) or 1 (object mask).
**PNG masks are the preferred parameterization since they offer considerable
space savings compared to dense numerical masks.**
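
As a hedged illustration (PIL-based; not part of the Object Detection API), one way to produce such single-channel PNG strings from a binary mask array:

```python
import io

import numpy as np
from PIL import Image

def encode_mask_pngs(masks):
  """Serializes a [num_boxes, H, W] binary mask array to PNG strings."""
  encoded = []
  for mask in masks.astype(np.uint8):
    buf = io.BytesIO()
    Image.fromarray(mask, mode='L').save(buf, format='PNG')
    encoded.append(buf.getvalue())
  return encoded
```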
#### Dense Numerical Instance Segmentation Masks
Masks can also be specified via a dense numerical tensor.
```shell
image/object/mask = [0.0, 0.0, 1.0, 1.0, 0.0, ...]
```
For an image with dimensions `H` x `W` and `num_boxes` groundtruth boxes, the
mask corresponds to a [`num_boxes`, `H`, `W`] float32 tensor, flattened into a
single vector of shape `num_boxes` * `H` * `W`. In TensorFlow, examples are read
in row-major format, so the elements are organized as:
```shell
... mask 0 row 0 ... mask 0 row 1 ... // ... mask 0 row H-1 ... mask 1 row 0 ...
```
where each row has W contiguous binary values.
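
As a hedged toy example of that layout (sizes are arbitrary):

```python
import numpy as np

num_boxes, height, width = 2, 4, 6
masks = np.zeros((num_boxes, height, width), dtype=np.float32)
masks[0, 1:3, 2:5] = 1.0  # mask 0 marks a small region as object

# Row-major flattening, matching the order the decoder reads values back.
flat = masks.reshape(-1)
assert flat.shape == (num_boxes * height * width,)
```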
To see example tf-records with mask labels, see the examples under the
[Preparing Inputs](preparing_inputs.md) section.
### Pre-existing config files
We provide four instance segmentation config files that you can use to train
your own models:
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_inception_resnet_v2_atrous_coco.config" target=_blank>mask_rcnn_inception_resnet_v2_atrous_coco</a>
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_resnet101_atrous_coco.config" target=_blank>mask_rcnn_resnet101_atrous_coco</a>
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_resnet50_atrous_coco.config" target=_blank>mask_rcnn_resnet50_atrous_coco</a>
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_inception_v2_coco.config" target=_blank>mask_rcnn_inception_v2_coco</a>
For more details see the [detection model zoo](detection_model_zoo.md).
### Updating a Faster R-CNN config file
Currently, the only supported instance segmentation model is [Mask
R-CNN](https://arxiv.org/abs/1703.06870), which requires Faster R-CNN as the
backbone object detector.
Once you have a baseline Faster R-CNN pipeline configuration, you can make the
following modifications in order to convert it into a Mask R-CNN model.
1. Within `train_input_reader` and `eval_input_reader`, set
   `load_instance_masks` to `True`. If using PNG masks, set `mask_type` to
   `PNG_MASKS`; otherwise leave it as the default `NUMERICAL_MASKS`.
1. Within the `faster_rcnn` config, use a `MaskRCNNBoxPredictor` as the
`second_stage_box_predictor`.
1. Within the `MaskRCNNBoxPredictor` message, set `predict_instance_masks` to
`True`. You must also define `conv_hyperparams`.
1. Within the `faster_rcnn` message, set `number_of_stages` to `3`.
1. Add instance segmentation metrics to the set of metrics:
`'coco_mask_metrics'`.
1. Update the `input_path`s to point at your data.
Please refer to the section on [Running the pets dataset](running_pets.md) for
additional details.
> Note: The mask prediction branch consists of a sequence of convolution layers.
> You can set the number of convolution layers and their depth as follows:
>
> 1. Within the `MaskRCNNBoxPredictor` message, set the
> `mask_prediction_conv_depth` to your value of interest. The default value
> is 256. If you set it to `0` (recommended), the depth is computed
> automatically based on the number of classes in the dataset.
> 1. Within the `MaskRCNNBoxPredictor` message, set the
> `mask_prediction_num_conv_layers` to your value of interest. The default
> value is 2.
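
Putting the steps above together, a hedged fragment of a converted pipeline config might look as follows (values are illustrative; consult the sample configs listed above for complete files):

```
model {
  faster_rcnn {
    number_of_stages: 3
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        predict_instance_masks: true
        mask_prediction_conv_depth: 0  # 0: derive depth from num classes
        conv_hyperparams {
          # regularizer and initializer settings go here
        }
      }
    }
  }
}
train_input_reader {
  load_instance_masks: true
  mask_type: PNG_MASKS
  input_path: "path/to/train.record"  # update to point at your data
}
eval_config {
  metrics_set: "coco_mask_metrics"
}
```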
......@@ -319,6 +319,9 @@ instance segmentation pipeline. Everything mentioned above about object
detection holds true for instance segmentation. Training and other details
aside, instance segmentation consists of an object detection model with an
additional head that predicts the object mask inside each predicted box.
Please refer to the section on [Running an Instance Segmentation
Model](instance_segmentation.md) for instructions on how to configure a model
that predicts masks in addition to object bounding boxes.
## What's Next
......
......@@ -103,7 +103,7 @@ FLAGS = flags.FLAGS
def create_tf_example(example):
# TODO(user): Populate the following variables from your example.
# TODO: Populate the following variables from your example.
height = None # Image height
width = None # Image width
filename = None # Filename of the image. Empty if image is not from file
......@@ -139,7 +139,7 @@ def create_tf_example(example):
def main(_):
writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
# TODO(user): Write code to read in your dataset to examples variable
# TODO: Write code to read in your dataset to examples variable
for example in examples:
tf_example = create_tf_example(example)
......@@ -155,3 +155,7 @@ if __name__ == '__main__':
Note: You may notice additional fields in some other datasets. They are
currently unused by the API and are optional.
Note: Please refer to the section on [Running an Instance Segmentation
Model](instance_segmentation.md) for instructions on how to configure a model
that predicts masks in addition to object bounding boxes.
......@@ -55,7 +55,8 @@ class ArgMaxMatcher(matcher.Matcher):
matched_threshold,
unmatched_threshold=None,
negatives_lower_than_unmatched=True,
force_match_for_each_row=False):
force_match_for_each_row=False,
use_matmul_gather=False):
"""Construct ArgMaxMatcher.
Args:
......@@ -74,11 +75,15 @@ class ArgMaxMatcher(matcher.Matcher):
at least one column (which is not guaranteed otherwise if the
matched_threshold is high). Defaults to False. See
argmax_matcher_test.testMatcherForceMatch() for an example.
use_matmul_gather: Force constructed match objects to use matrix
multiplication based gather instead of standard tf.gather.
(Default: False).
Raises:
ValueError: if unmatched_threshold is set but matched_threshold is not set
or if unmatched_threshold > matched_threshold.
"""
super(ArgMaxMatcher, self).__init__(use_matmul_gather=use_matmul_gather)
if (matched_threshold is None) and (unmatched_threshold is not None):
raise ValueError('Need to also define matched_threshold when '
'unmatched_threshold is defined')
......
......@@ -24,6 +24,17 @@ from object_detection.core import matcher
class GreedyBipartiteMatcher(matcher.Matcher):
"""Wraps a Tensorflow greedy bipartite matcher."""
def __init__(self, use_matmul_gather=False):
"""Constructs a Matcher.
Args:
use_matmul_gather: Force constructed match objects to use matrix
multiplication based gather instead of standard tf.gather.
(Default: False).
"""
super(GreedyBipartiteMatcher, self).__init__(
use_matmul_gather=use_matmul_gather)
def _match(self, similarity_matrix, num_valid_rows=-1):
"""Bipartite matches a collection rows and columns. A greedy bi-partite.
......
......@@ -251,7 +251,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
second_stage_classification_loss,
second_stage_mask_prediction_loss_weight=1.0,
hard_example_miner=None,
parallel_iterations=16):
parallel_iterations=16,
add_summaries=True):
"""FasterRCNNMetaArch Constructor.
Args:
......@@ -355,12 +356,17 @@ class FasterRCNNMetaArch(model.DetectionModel):
hard_example_miner: A losses.HardExampleMiner object (can be None).
parallel_iterations: (Optional) The number of iterations allowed to run
in parallel for calls to tf.map_fn.
add_summaries: boolean (default: True) controlling whether summary ops
should be added to tensorflow graph.
Raises:
ValueError: If `second_stage_batch_size` > `first_stage_max_proposals` at
training time.
ValueError: If first_stage_anchor_generator is not of type
grid_anchor_generator.GridAnchorGenerator.
"""
# TODO: add_summaries is currently unused. Respect that directive
# in the future.
super(FasterRCNNMetaArch, self).__init__(num_classes=num_classes)
if is_training and second_stage_batch_size > first_stage_max_proposals:
......
......@@ -75,7 +75,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
second_stage_classification_loss_weight,
second_stage_classification_loss,
hard_example_miner,
parallel_iterations=16):
parallel_iterations=16,
add_summaries=True):
"""RFCNMetaArch Constructor.
Args:
......@@ -155,11 +156,16 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
hard_example_miner: A losses.HardExampleMiner object (can be None).
parallel_iterations: (Optional) The number of iterations allowed to run
in parallel for calls to tf.map_fn.
add_summaries: boolean (default: True) controlling whether summary ops
should be added to tensorflow graph.
Raises:
ValueError: If `second_stage_batch_size` > `first_stage_max_proposals`
ValueError: If first_stage_anchor_generator is not of type
grid_anchor_generator.GridAnchorGenerator.
"""
# TODO: add_summaries is currently unused. Respect that directive
# in the future.
super(RFCNMetaArch, self).__init__(
is_training,
num_classes,
......
......@@ -44,7 +44,8 @@ class SSDFeatureExtractor(object):
conv_hyperparams,
batch_norm_trainable=True,
reuse_weights=None,
use_explicit_padding=False):
use_explicit_padding=False,
use_depthwise=False):
"""Constructor.
Args:
......@@ -61,6 +62,7 @@ class SSDFeatureExtractor(object):
reuse_weights: whether to reuse variables. Default is None.
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
"""
self._is_training = is_training
self._depth_multiplier = depth_multiplier
......@@ -70,6 +72,7 @@ class SSDFeatureExtractor(object):
self._batch_norm_trainable = batch_norm_trainable
self._reuse_weights = reuse_weights
self._use_explicit_padding = use_explicit_padding
self._use_depthwise = use_depthwise
@abstractmethod
def preprocess(self, resized_inputs):
......@@ -130,7 +133,7 @@ class SSDMetaArch(model.DetectionModel):
add_summaries=True):
"""SSDMetaArch Constructor.
TODO: group NMS parameters + score converter into
TODO(rathodv,jonathanhuang): group NMS parameters + score converter into
a class and loss parameters into a class and write config protos for
postprocessing and losses.
......@@ -330,7 +333,8 @@ class SSDMetaArch(model.DetectionModel):
feature_maps = self._feature_extractor.extract_features(
preprocessed_inputs)
feature_map_spatial_dims = self._get_feature_map_spatial_dims(feature_maps)
image_shape = tf.shape(preprocessed_inputs)
image_shape = shape_utils.combined_static_and_dynamic_shape(
preprocessed_inputs)
self._anchors = self._anchor_generator.generate(
feature_map_spatial_dims,
im_height=image_shape[1],
......@@ -472,11 +476,14 @@ class SSDMetaArch(model.DetectionModel):
keypoints = None
if self.groundtruth_has_field(fields.BoxListFields.keypoints):
keypoints = self.groundtruth_lists(fields.BoxListFields.keypoints)
weights = None
if self.groundtruth_has_field(fields.BoxListFields.weights):
weights = self.groundtruth_lists(fields.BoxListFields.weights)
(batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, match_list) = self._assign_targets(
self.groundtruth_lists(fields.BoxListFields.boxes),
self.groundtruth_lists(fields.BoxListFields.classes),
keypoints)
keypoints, weights)
if self._add_summaries:
self._summarize_input(
self.groundtruth_lists(fields.BoxListFields.boxes), match_list)
......@@ -539,7 +546,8 @@ class SSDMetaArch(model.DetectionModel):
'NegativeAnchorLossCDF')
def _assign_targets(self, groundtruth_boxes_list, groundtruth_classes_list,
groundtruth_keypoints_list=None):
groundtruth_keypoints_list=None,
groundtruth_weights_list=None):
"""Assign groundtruth targets.
Adds a background class to each one-hot encoding of groundtruth classes
......@@ -556,6 +564,8 @@ class SSDMetaArch(model.DetectionModel):
index assumed to map to the first non-background class.
groundtruth_keypoints_list: (optional) a list of 3-D tensors of shape
[num_boxes, num_keypoints, 2]
groundtruth_weights_list: A list of 1-D tf.float32 tensors of shape
[num_boxes] containing weights for groundtruth boxes.
Returns:
batch_cls_targets: a tensor with shape [batch_size, num_anchors,
......@@ -582,7 +592,7 @@ class SSDMetaArch(model.DetectionModel):
boxlist.add_field(fields.BoxListFields.keypoints, keypoints)
return target_assigner.batch_assign_targets(
self._target_assigner, self.anchors, groundtruth_boxlists,
groundtruth_classes_with_background_list)
groundtruth_classes_with_background_list, groundtruth_weights_list)
def _summarize_input(self, groundtruth_boxes_list, match_list):
"""Creates tensorflow summaries for the input boxes and anchors.
......
......@@ -24,13 +24,17 @@ from object_detection.utils import object_detection_evaluation
class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
"""Class to evaluate COCO detection metrics."""
def __init__(self, categories, all_metrics_per_category=False):
def __init__(self,
categories,
include_metrics_per_category=False,
all_metrics_per_category=False):
"""Constructor.
Args:
categories: A list of dicts, each of which has the following keys -
'id': (required) an integer id uniquely identifying this category.
'name': (required) string representing category name e.g., 'cat', 'dog'.
include_metrics_per_category: If True, include metrics for each category.
all_metrics_per_category: Whether to include all the summary metrics for
each category in per_category_ap. Be careful with setting it to true if
you have more than handful of categories, because it will pollute
......@@ -45,6 +49,7 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
self._category_id_set = set([cat['id'] for cat in self._categories])
self._annotation_id = 1
self._metrics = None
self._include_metrics_per_category = include_metrics_per_category
self._all_metrics_per_category = all_metrics_per_category
def clear(self):
......@@ -166,7 +171,8 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
'DetectionBoxes_Recall/AR@100 (large)': average recall for large objects
with 100 detections.
2. per_category_ap: category specific results with keys of the form:
2. per_category_ap: if include_metrics_per_category is True, category
specific results with keys of the form:
'Precision mAP ByCategory/category' (without the supercategory part if
no supercategories exist). For backward compatibility
'PerformanceByCategory' is included in the output regardless of
......@@ -183,6 +189,7 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
box_evaluator = coco_tools.COCOEvalWrapper(
coco_wrapped_groundtruth, coco_wrapped_detections, agnostic_mode=False)
box_metrics, box_per_category_ap = box_evaluator.ComputeMetrics(
include_metrics_per_category=self._include_metrics_per_category,
all_metrics_per_category=self._all_metrics_per_category)
box_metrics.update(box_per_category_ap)
box_metrics = {'DetectionBoxes_'+ key: value
......@@ -253,6 +260,7 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
'DetectionBoxes_Recall/AR@100 (large)',
'DetectionBoxes_Recall/AR@100 (medium)',
'DetectionBoxes_Recall/AR@100 (small)']
if self._include_metrics_per_category:
for category_dict in self._categories:
metric_names.append('DetectionBoxes_PerformanceByCategory/mAP/' +
category_dict['name'])
......@@ -289,13 +297,14 @@ def _check_mask_type_and_value(array_name, masks):
class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
"""Class to evaluate COCO detection metrics."""
def __init__(self, categories):
def __init__(self, categories, include_metrics_per_category=False):
"""Constructor.
Args:
categories: A list of dicts, each of which has the following keys -
'id': (required) an integer id uniquely identifying this category.
'name': (required) string representing category name e.g., 'cat', 'dog'.
include_metrics_per_category: If True, include metrics for each category.
"""
super(CocoMaskEvaluator, self).__init__(categories)
self._image_id_to_mask_shape_map = {}
......@@ -304,6 +313,7 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
self._detection_masks_list = []
self._category_id_set = set([cat['id'] for cat in self._categories])
self._annotation_id = 1
self._include_metrics_per_category = include_metrics_per_category
def clear(self):
"""Clears the state to prepare for a fresh evaluation."""
......@@ -438,7 +448,8 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
'Recall/AR@100 (large)': average recall for large objects with 100
detections
2. per_category_ap: category specific results with keys of the form:
2. per_category_ap: if include_metrics_per_category is True, category
specific results with keys of the form:
'Precision mAP ByCategory/category' (without the supercategory part if
no supercategories exist). For backward compatibility
'PerformanceByCategory' is included in the output regardless of
......@@ -458,7 +469,8 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
mask_evaluator = coco_tools.COCOEvalWrapper(
coco_wrapped_groundtruth, coco_wrapped_detection_masks,
agnostic_mode=False, iou_type='segm')
mask_metrics, mask_per_category_ap = mask_evaluator.ComputeMetrics()
mask_metrics, mask_per_category_ap = mask_evaluator.ComputeMetrics(
include_metrics_per_category=self._include_metrics_per_category)
mask_metrics.update(mask_per_category_ap)
mask_metrics = {'DetectionMasks_'+ key: value
for key, value in mask_metrics.iteritems()}
......
......@@ -12,13 +12,12 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for image.understanding.object_detection.metrics.coco_evaluation."""
"""Tests for tensorflow_models.object_detection.metrics.coco_evaluation."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import math
import numpy as np
import tensorflow as tf
from object_detection.core import standard_fields
......@@ -87,43 +86,6 @@ class CocoDetectionEvaluationTest(tf.test.TestCase):
metrics = coco_evaluator.evaluate()
self.assertAlmostEqual(metrics['DetectionBoxes_Precision/mAP'], 1.0)
def testReturnAllMetricsPerCategory(self):
"""Tests that mAP is calculated correctly on GT and Detections."""
category_list = [{'id': 0, 'name': 'person'}]
coco_evaluator = coco_evaluation.CocoDetectionEvaluator(
category_list, all_metrics_per_category=True)
coco_evaluator.add_single_ground_truth_image_info(
image_id='image1',
groundtruth_dict={
standard_fields.InputDataFields.groundtruth_boxes:
np.array([[100., 100., 200., 200.]]),
standard_fields.InputDataFields.groundtruth_classes: np.array([1])
})
coco_evaluator.add_single_detected_image_info(
image_id='image1',
detections_dict={
standard_fields.DetectionResultFields.detection_boxes:
np.array([[100., 100., 200., 200.]]),
standard_fields.DetectionResultFields.detection_scores:
np.array([.8]),
standard_fields.DetectionResultFields.detection_classes:
np.array([1])
})
metrics = coco_evaluator.evaluate()
expected_metrics = [
'DetectionBoxes_Recall AR@10 ByCategory/person',
'DetectionBoxes_Precision mAP (medium) ByCategory/person',
'DetectionBoxes_Precision mAP ByCategory/person',
'DetectionBoxes_Precision mAP@.50IOU ByCategory/person',
'DetectionBoxes_Precision mAP (small) ByCategory/person',
'DetectionBoxes_Precision mAP (large) ByCategory/person',
'DetectionBoxes_Recall AR@1 ByCategory/person',
'DetectionBoxes_Precision mAP@.75IOU ByCategory/person',
'DetectionBoxes_Recall AR@100 ByCategory/person',
'DetectionBoxes_Recall AR@100 (medium) ByCategory/person',
'DetectionBoxes_Recall AR@100 (large) ByCategory/person']
self.assertTrue(set(expected_metrics).issubset(set(metrics)))
def testRejectionOnDuplicateGroundtruth(self):
"""Tests that groundtruth cannot be added more than once for an image."""
categories = [{'id': 1, 'name': 'cat'},
......@@ -279,12 +241,6 @@ class CocoEvaluationPyFuncTest(tf.test.TestCase):
self.assertAlmostEqual(metrics['DetectionBoxes_Recall/AR@100 (medium)'],
-1.0)
self.assertAlmostEqual(metrics['DetectionBoxes_Recall/AR@100 (small)'], 1.0)
self.assertAlmostEqual(metrics[
'DetectionBoxes_PerformanceByCategory/mAP/dog'], 1.0)
self.assertAlmostEqual(metrics[
'DetectionBoxes_PerformanceByCategory/mAP/cat'], 1.0)
self.assertTrue(math.isnan(metrics[
'DetectionBoxes_PerformanceByCategory/mAP/person']))
self.assertFalse(coco_evaluator._groundtruth_list)
self.assertFalse(coco_evaluator._detection_boxes_list)
self.assertFalse(coco_evaluator._image_ids)
......