Commit 1efe98bb authored by Zhichao Lu, committed by lzc5123016

Merged commit includes the following changes:

185215255  by Zhichao Lu:

    Stop populating image/object/class/text field when generating COCO tf record.

--
185213306  by Zhichao Lu:

    Use the params batch size and not the one from train_config in input_fn

--
185209081  by Zhichao Lu:

    Handle the case when there are no ground-truth masks for an image.

--
185195531  by Zhichao Lu:

    Remove unstack and stack operations on features from third_party/object_detection/model.py.

--
185195017  by Zhichao Lu:

    Matrix multiplication based gather op implementation.
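    A minimal sketch of the general technique the title refers to (one-hot/matmul gather, not necessarily this commit's exact code); such formulations tend to be friendlier to accelerators like TPUs than a native gather:

    ```python
    import tensorflow as tf

    def matmul_gather(params, indices):
      """Gathers rows of a 2-D `params` tensor by matrix multiplication."""
      # One-hot encode the indices against the number of rows, then multiply.
      one_hot = tf.one_hot(indices, depth=tf.shape(params)[0], dtype=params.dtype)
      return tf.matmul(one_hot, params)  # Shape: [num_indices, num_cols].
    ```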

--
185187744  by Zhichao Lu:

    Fix a minor issue in eval_util.

--
185098733  by Zhichao Lu:

    Internal change

--
185076656  by Zhichao Lu:

    Increase the number of boxes for coco17.

--
185074199  by Zhichao Lu:

    Add config for SSD Resnet50 v1 with FPN.

--
185060199  by Zhichao Lu:

    Fix a bug in clear_detections.
    This method set detection_keys to an empty dictionary instead of an empty set. I've refactored so that this method and the constructor use the same code path.

--
185031359  by Zhichao Lu:

    Eval TPU trained models continuously.
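    One way to do this with stock TensorFlow 1.x (a sketch assuming an already-built `Estimator` and eval `input_fn`, not this change's actual code):

    ```python
    import tensorflow as tf

    def eval_continuously(estimator, eval_input_fn, model_dir):
      """Evaluates every new checkpoint that appears under `model_dir`."""
      for checkpoint_path in tf.contrib.training.checkpoints_iterator(model_dir):
        metrics = estimator.evaluate(
            input_fn=eval_input_fn, checkpoint_path=checkpoint_path)
        tf.logging.info('Eval results: %s', metrics)
    ```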

--
185016591  by Zhichao Lu:

    Use TPUEstimatorSpec for TPU

--
185013651  by Zhichao Lu:

    Add PreprocessorCache to record and duplicate augmentations.

--
184921763  by Zhichao Lu:

    Minor fixes for object detection.

--
184920610  by Zhichao Lu:

    Adds a model builder test for "embedded_ssd_mobilenet_v1" feature extractor.

--
184919284  by Zhichao Lu:

    Added unit tests for TPU, with optional training / eval.

--
184915910  by Zhichao Lu:

    Update third_party g3 doc with Mask RCNN detection models.

--
184914085  by Zhichao Lu:

    Slight change to the WeightSharedConvolutionalBoxPredictor implementation to match RetinaNet more closely. Specifically, we now construct the box encoding and class predictor towers separately rather than having them share weights until the penultimate layer.
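    Roughly, the change amounts to the following (an illustrative sketch, not the actual box predictor code):

    ```python
    import tensorflow as tf

    def _conv_tower(features, name, depth=256, num_layers=4):
      """A small stack of 3x3 convolutions, one stack per prediction head."""
      net = features
      for i in range(num_layers):
        net = tf.layers.conv2d(net, depth, 3, padding='same',
                               activation=tf.nn.relu, name='%s_%d' % (name, i))
      return net

    def predict(features, num_anchors, num_classes):
      # Separate towers per head; previously the layers up to the penultimate
      # one were shared between the box and class heads.
      box_net = _conv_tower(features, 'BoxEncodingTower')
      class_net = _conv_tower(features, 'ClassPredictionTower')
      box_encodings = tf.layers.conv2d(
          box_net, num_anchors * 4, 3, padding='same', name='BoxPredictor')
      class_logits = tf.layers.conv2d(
          class_net, num_anchors * num_classes, 3, padding='same',
          name='ClassPredictor')
      return box_encodings, class_logits
    ```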

--
184913786  by Zhichao Lu:

    Plumbs SSD Resnet V1 with FPN models into model builder.

--
184910030  by Zhichao Lu:

    Add coco metrics to evaluator.

--
184897758  by Zhichao Lu:

    Merge changes from github.

--
184888736  by Zhichao Lu:

    Ensure groundtruth_weights are always 1-D.

--
184887256  by Zhichao Lu:

    Introduce an option controlling whether the model adds summaries, so summaries can be turned off when necessary.

--
184865559  by Zhichao Lu:

    Updating inputs so that a dictionary of tensors is returned from input_fn. Moving unbatch/unpad to model.py.
    Also removing the source_id key from the features dictionary and replacing it with an integer hash.
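    The replacement hash is visible in the inputs.py diff further down; in isolation it looks like this:

    ```python
    import tensorflow as tf

    HASH_BINS = 1 << 31  # Same constant as in the inputs.py diff below.
    source_id = tf.constant(['image_0042'])  # Illustrative source_id value.
    hashed_id = tf.cast(
        tf.string_to_hash_bucket_fast(source_id, HASH_BINS), tf.int32)
    ```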

--
184859205  by Zhichao Lu:

    This CL hides those differences by making the default settings work with the public code.

--
184769779  by Zhichao Lu:

    Pass groundtruth weights into ssd meta architecture all the way to target assigner.

    This will allow training ssd models with padded groundtruth tensors.
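    An illustrative example of the padding/weights interplay (not this change's code): with groundtruth padded to a fixed box count, a weights vector can zero out the padded rows so the target assigner ignores them.

    ```python
    import tensorflow as tf

    max_num_boxes, num_real_boxes = 8, 3
    groundtruth_weights = tf.concat(
        [tf.ones([num_real_boxes]),                    # real boxes
         tf.zeros([max_num_boxes - num_real_boxes])],  # padded boxes
        axis=0)  # -> [1, 1, 1, 0, 0, 0, 0, 0]
    ```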

--
184767117  by Zhichao Lu:

    * Add `params` arg to make all input fns work with TPUEstimator
    * Add --master
    * Output eval results

--
184766244  by Zhichao Lu:

    Update create_coco_tf_record to include category indices

--
184752937  by Zhichao Lu:

    Create a third_party version of TPU compatible mobilenet_v2_focal_loss coco config.

--
184750174  by Zhichao Lu:

    A few small fixes for multiscale anchor generator and a test.

--
184746581  by Zhichao Lu:

    Update the Jupyter notebook to show masks if the model provides them.

--
184728646  by Zhichao Lu:

    Adding a few more tests to make sure decoding with/without label maps performs as expected.

--
184624154  by Zhichao Lu:

    Add an object detection binary for TPU.

--
184622118  by Zhichao Lu:

    Batch, transform, and unbatch in the tflearn interface.

--
184595064  by Zhichao Lu:

    Add support for training grayscale models.

--
184532026  by Zhichao Lu:

    Change dataset_builder.build to perform optional batching using the tf.data.Dataset API.

--
184330239  by Zhichao Lu:

    Add augment_input_data and transform_input_data helper functions to third_party/tensorflow_models/object_detection/inputs.py

--
184328681  by Zhichao Lu:

    Use an internal RGB-to-grayscale method that can be quantized.

--
184327909  by Zhichao Lu:

    Add a helper function that returns padding shapes for use with Dataset.padded_batch.
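    A hedged sketch of the idea (the key names and helper are illustrative, not the library's exact API): derive a static shape per key, then hand the result to `Dataset.padded_batch`.

    ```python
    def get_padding_shapes(height, width, max_num_boxes, num_classes):
      """Returns per-key shapes for use with tf.data.Dataset.padded_batch."""
      return {
          'image': [height, width, 3],
          'groundtruth_boxes': [max_num_boxes, 4],
          'groundtruth_classes': [max_num_boxes, num_classes],
      }

    # Illustrative usage with a tf.data.Dataset of feature dictionaries:
    # dataset = dataset.padded_batch(
    #     24, padded_shapes=get_padding_shapes(300, 300, 50, 37))
    ```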

--
184326291  by Zhichao Lu:

    Added decode_func for specialized decoding.

--
184314676  by Zhichao Lu:

    Add unstack_batch method to inputs.py.

    This will enable us to convert batched tensors to lists of tensors. This is compatible with OD API that consumes groundtruth batch as a list of tensors.

--
184281269  by Zhichao Lu:

    Internal test target changes.

--
184192851  by Zhichao Lu:

    Adding `Estimator` interface for object detection.

--
184187885  by Zhichao Lu:

    Add config_util functions to help with input pipeline.

    1. function to return expected shapes from the resizer config
    2. function to extract image_resizer_config from model_config.

--
184139892  by Zhichao Lu:

    Adding support for depthwise SSD (ssd-lite) and depthwise box predictions.

--
184089891  by Zhichao Lu:

    Fix third_party faster rcnn resnet101 coco config.

--
184083378  by Zhichao Lu:

    When there is no object/weights field in the tf.Example proto, return a default weight of 1.0 for all boxes.
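    Sketched in isolation (illustrative, not the decoder's actual code):

    ```python
    import tensorflow as tf

    def groundtruth_weights_or_default(boxes, weights=None):
      """Returns per-box weights, defaulting to 1.0 when none were decoded."""
      if weights is not None:
        return weights
      return tf.ones([tf.shape(boxes)[0]], dtype=tf.float32)
    ```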

--

PiperOrigin-RevId: 185215255
parent fbc5ba06
@@ -19,6 +19,7 @@ import os
 import tempfile
 import tensorflow as tf
 from google.protobuf import text_format
+from tensorflow.core.protobuf import saver_pb2
 from tensorflow.python import pywrap_tensorflow
 from tensorflow.python.client import session
 from tensorflow.python.framework import graph_util
@@ -354,16 +355,22 @@ def _export_inference_graph(input_type,
   if graph_hook_fn: graph_hook_fn()

+  saver_kwargs = {}
   if use_moving_averages:
-    temp_checkpoint_file = tempfile.NamedTemporaryFile()
+    # This check is to be compatible with both version of SaverDef.
+    if os.path.isfile(trained_checkpoint_prefix):
+      saver_kwargs['write_version'] = saver_pb2.SaverDef.V1
+      temp_checkpoint_prefix = tempfile.NamedTemporaryFile().name
+    else:
+      temp_checkpoint_prefix = tempfile.mkdtemp()
     replace_variable_values_with_moving_averages(
         tf.get_default_graph(), trained_checkpoint_prefix,
-        temp_checkpoint_file.name)
-    checkpoint_to_use = temp_checkpoint_file.name
+        temp_checkpoint_prefix)
+    checkpoint_to_use = temp_checkpoint_prefix
   else:
     checkpoint_to_use = trained_checkpoint_prefix

-  saver = tf.train.Saver()
+  saver = tf.train.Saver(**saver_kwargs)
   input_saver_def = saver.as_saver_def()

   _write_graph_and_checkpoint(
...
@@ -23,7 +23,7 @@ In the table below, we list each such pre-trained model including:
 * detector performance on subset of the COCO validation set or Open Images test split as measured by the dataset-specific mAP measure.
   Here, higher is better, and we only report bounding box mAP rounded to the
   nearest integer.
-* Output types (currently only `Boxes`)
+* Output types (`Boxes`, and `Masks` if applicable)

 You can un-tar each tar.gz file via, e.g.,:
@@ -55,7 +55,7 @@ Some remarks on frozen inference graphs:
   a detector (and discarding the part past that point), which negatively impacts
   standard mAP metrics.
 * Our frozen inference graphs are generated using the
-  [v1.4.0](https://github.com/tensorflow/tensorflow/tree/v1.4.0)
+  [v1.5.0](https://github.com/tensorflow/tensorflow/tree/v1.5.0)
   release version of Tensorflow and we do not guarantee that these will work
   with other versions; this being said, each frozen inference graph can be
   regenerated using your current version of Tensorflow by re-running the
@@ -69,16 +69,20 @@ Some remarks on frozen inference graphs:
 | ------------ | :--------------: | :--------------: | :-------------: |
 | [ssd_mobilenet_v1_coco](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2017_11_17.tar.gz) | 30 | 21 | Boxes |
 | [ssd_inception_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2017_11_17.tar.gz) | 42 | 24 | Boxes |
-| [faster_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2017_11_08.tar.gz) | 58 | 28 | Boxes |
-| [faster_rcnn_resnet50_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_coco_2017_11_08.tar.gz) | 89 | 30 | Boxes |
-| [faster_rcnn_resnet50_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_lowproposals_coco_2017_11_08.tar.gz) | 64 |  | Boxes |
-| [rfcn_resnet101_coco](http://download.tensorflow.org/models/object_detection/rfcn_resnet101_coco_2017_11_08.tar.gz) | 92 | 30 | Boxes |
-| [faster_rcnn_resnet101_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2017_11_08.tar.gz) | 106 | 32 | Boxes |
-| [faster_rcnn_resnet101_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_lowproposals_coco_2017_11_08.tar.gz) | 82 |  | Boxes |
-| [faster_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_coco_2017_11_08.tar.gz) | 620 | 37 | Boxes |
-| [faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2017_11_08.tar.gz) | 241 |  | Boxes |
-| [faster_rcnn_nas](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_coco_2017_11_08.tar.gz) | 1833 | 43 | Boxes |
-| [faster_rcnn_nas_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_lowproposals_coco_2017_11_08.tar.gz) | 540 |  | Boxes |
+| [faster_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2018_01_28.tar.gz) | 58 | 28 | Boxes |
+| [faster_rcnn_resnet50_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_coco_2018_01_28.tar.gz) | 89 | 30 | Boxes |
+| [faster_rcnn_resnet50_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_lowproposals_coco_2018_01_28.tar.gz) | 64 |  | Boxes |
+| [rfcn_resnet101_coco](http://download.tensorflow.org/models/object_detection/rfcn_resnet101_coco_2018_01_28.tar.gz) | 92 | 30 | Boxes |
+| [faster_rcnn_resnet101_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2018_01_28.tar.gz) | 106 | 32 | Boxes |
+| [faster_rcnn_resnet101_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_lowproposals_coco_2018_01_28.tar.gz) | 82 |  | Boxes |
+| [faster_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz) | 620 | 37 | Boxes |
+| [faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2018_01_28.tar.gz) | 241 |  | Boxes |
+| [faster_rcnn_nas](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_coco_2018_01_28.tar.gz) | 1833 | 43 | Boxes |
+| [faster_rcnn_nas_lowproposals_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_nas_lowproposals_coco_2018_01_28.tar.gz) | 540 |  | Boxes |
+| [mask_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz) | 771 | 36 | Masks |
+| [mask_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_inception_v2_coco_2018_01_28.tar.gz) | 79 | 25 | Masks |
+| [mask_rcnn_resnet101_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet101_atrous_coco_2018_01_28.tar.gz) | 470 | 33 | Masks |
+| [mask_rcnn_resnet50_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet50_atrous_coco_2018_01_28.tar.gz) | 343 | 29 | Masks |
@@ -86,14 +90,14 @@ Some remarks on frozen inference graphs:

 Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
-[faster_rcnn_resnet101_kitti](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_kitti_2017_11_08.tar.gz) | 79 | 87 | Boxes
+[faster_rcnn_resnet101_kitti](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_kitti_2018_01_28.tar.gz) | 79 | 87 | Boxes

 ## Open Images-trained models {#open-images-models}

 Model name | Speed (ms) | Open Images mAP@0.5[^2] | Outputs
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
-[faster_rcnn_inception_resnet_v2_atrous_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2017_11_08.tar.gz) | 727 | 37 | Boxes
+[faster_rcnn_inception_resnet_v2_atrous_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28.tar.gz) | 727 | 37 | Boxes
-[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2017_11_08.tar.gz) | 347 |  | Boxes
+[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 |  | Boxes

 [^1]: See [MSCOCO evaluation protocol](http://cocodataset.org/#detections-eval).
...
## Run an Instance Segmentation Model
For some applications, localizing an object with a simple bounding box is not
enough. For instance, you might want to segment the object region once it is
detected. This class of problems is called **instance segmentation**.
<p align="center">
<img src="img/kites_with_segment_overlay.png" width=676 height=450>
</p>
### Materializing data for instance segmentation {#materializing-instance-seg}
Instance segmentation is an extension of object detection, where a binary mask
(i.e. object vs. background) is associated with every bounding box. This allows
for more fine-grained information about the extent of the object within the box.
To train an instance segmentation model, a groundtruth mask must be supplied for
every groundtruth bounding box. In addition to the proto fields listed in the
section titled [Using your own dataset](using_your_own_dataset.md), one must
also supply `image/object/mask`, which can either be a repeated list of
single-channel encoded PNG strings, or a single dense 3D binary tensor where
masks corresponding to each object are stacked along the first dimension. Each
is described in more detail below.
#### PNG Instance Segmentation Masks
Instance segmentation masks can be supplied as serialized PNG images.
```shell
image/object/mask = ["\x89PNG\r\n\x1A\n\x00\x00\x00\rIHDR\...", ...]
```
These masks are whole-image masks, one for each object instance. The spatial
dimensions of each mask must agree with the image. Each mask has only a single
channel, and the pixel values are either 0 (background) or 1 (object mask).
**PNG masks are the preferred parameterization since they offer considerable
space savings compared to dense numerical masks.**
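For concreteness, the PNG strings above can be produced from binary masks along these lines (a sketch assuming PIL is available; the `masks` array is an illustrative placeholder):
```python
import io

import numpy as np
import PIL.Image

def encode_mask_png(mask):
  """Serializes a single [H, W] uint8 binary mask as a PNG string."""
  output = io.BytesIO()
  PIL.Image.fromarray(mask, mode='L').save(output, format='PNG')
  return output.getvalue()

masks = np.zeros([2, 480, 640], dtype=np.uint8)  # Illustrative placeholder.
png_masks = [encode_mask_png(m) for m in masks]  # Value for image/object/mask.
```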
#### Dense Numerical Instance Segmentation Masks
Masks can also be specified via a dense numerical tensor.
```shell
image/object/mask = [0.0, 0.0, 1.0, 1.0, 0.0, ...]
```
For an image with dimensions `H` x `W` and `num_boxes` groundtruth boxes, the
mask corresponds to a [`num_boxes`, `H`, `W`] float32 tensor, flattened into a
single vector of shape `num_boxes` * `H` * `W`. In TensorFlow, examples are read
in row-major format, so the elements are organized as:
```shell
... mask 0 row 0 ... mask 0 row 1 ... // ... mask 0 row H-1 ... mask 1 row 0 ...
```
where each row has W contiguous binary values.
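For example, flattening an illustrative `[num_boxes, H, W]` array with NumPy (whose default reshape order is row-major, matching the layout above):
```python
import numpy as np

masks = np.zeros([3, 480, 640], dtype=np.float32)  # Illustrative placeholder.
flat_masks = masks.reshape(-1).tolist()  # num_boxes * H * W float values.
```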
For example tf-records with mask labels, see the examples under the
[Preparing Inputs](preparing_inputs.md) section.
### Pre-existing config files
We provide four instance segmentation config files that you can use to train
your own models:
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_inception_resnet_v2_atrous_coco.config" target=_blank>mask_rcnn_inception_resnet_v2_atrous_coco</a>
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_resnet101_atrous_coco.config" target=_blank>mask_rcnn_resnet101_atrous_coco</a>
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_resnet50_atrous_coco.config" target=_blank>mask_rcnn_resnet50_atrous_coco</a>
1. <a href="https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_inception_v2_coco.config" target=_blank>mask_rcnn_inception_v2_coco</a>
For more details see the [detection model zoo](detection_model_zoo.md).
### Updating a Faster R-CNN config file
Currently, the only supported instance segmentation model is [Mask
R-CNN](https://arxiv.org/abs/1703.06870), which requires Faster R-CNN as the
backbone object detector.
Once you have a baseline Faster R-CNN pipeline configuration, you can make the
following modifications in order to convert it into a Mask R-CNN model.
1. Within `train_input_reader` and `eval_input_reader`, set
`load_instance_masks` to `True`. If using PNG masks, set `mask_type` to
   `PNG_MASKS`; otherwise you can leave it as the default `NUMERICAL_MASKS`.
1. Within the `faster_rcnn` config, use a `MaskRCNNBoxPredictor` as the
`second_stage_box_predictor`.
1. Within the `MaskRCNNBoxPredictor` message, set `predict_instance_masks` to
`True`. You must also define `conv_hyperparams`.
1. Within the `faster_rcnn` message, set `number_of_stages` to `3`.
1. Add instance segmentation metrics to the set of metrics:
`'coco_mask_metrics'`.
1. Update the `input_path`s to point at your data.
Please refer to the section on [Running the pets dataset](running_pets.md) for
additional details; a sketch of applying these modifications programmatically
appears after the note below.
> Note: The mask prediction branch consists of a sequence of convolution layers.
> You can set the number of convolution layers and their depth as follows:
>
> 1. Within the `MaskRCNNBoxPredictor` message, set the
> `mask_prediction_conv_depth` to your value of interest. The default value
> is 256. If you set it to `0` (recommended), the depth is computed
> automatically based on the number of classes in the dataset.
> 1. Within the `MaskRCNNBoxPredictor` message, set the
> `mask_prediction_num_conv_layers` to your value of interest. The default
> value is 2.
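As referenced above, here is a hedged sketch of making these modifications programmatically instead of hand-editing the config file. The proto field names follow the steps above but should be verified against your version of the protos (in particular, `eval_input_reader` may be singular or repeated depending on the release); the config path is illustrative.
```python
from google.protobuf import text_format
from object_detection.protos import input_reader_pb2
from object_detection.protos import pipeline_pb2

pipeline = pipeline_pb2.TrainEvalPipelineConfig()
with open('faster_rcnn_baseline.config') as f:  # Illustrative path.
  text_format.Merge(f.read(), pipeline)

# Step 1: load instance masks (PNG-encoded here) in both input readers.
for reader in (pipeline.train_input_reader, pipeline.eval_input_reader):
  reader.load_instance_masks = True
  reader.mask_type = input_reader_pb2.PNG_MASKS

# Steps 2-3: use a MaskRCNNBoxPredictor that predicts instance masks.
# (Remember that `conv_hyperparams` must also be defined on this predictor.)
predictor = pipeline.model.faster_rcnn.second_stage_box_predictor
predictor.mask_rcnn_box_predictor.predict_instance_masks = True

# Step 4: run the third (mask) stage.
pipeline.model.faster_rcnn.number_of_stages = 3

# Step 5: add instance segmentation metrics.
pipeline.eval_config.metrics_set.append('coco_mask_metrics')
```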
@@ -319,6 +319,9 @@ instance segmentation pipeline. Everything above that was mentioned about object
 detection holds true for instance segmentation. Instance segmentation consists
 of an object detection model with an additional head that predicts the object
 mask inside each predicted box once we remove the training and other details.
+Please refer to the section on [Running an Instance Segmentation
+Model](instance_segmentation.md) for instructions on how to configure a model
+that predicts masks in addition to object bounding boxes.

 ## What's Next
...
@@ -103,7 +103,7 @@ FLAGS = flags.FLAGS

 def create_tf_example(example):
-  # TODO(user): Populate the following variables from your example.
+  # TODO: Populate the following variables from your example.
   height = None # Image height
   width = None # Image width
   filename = None # Filename of the image. Empty if image is not from file
@@ -139,7 +139,7 @@ def create_tf_example(example):

 def main(_):
   writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
-  # TODO(user): Write code to read in your dataset to examples variable
+  # TODO: Write code to read in your dataset to examples variable
   for example in examples:
     tf_example = create_tf_example(example)
@@ -155,3 +155,7 @@ if __name__ == '__main__':

 Note: You may notice additional fields in some other datasets. They are
 currently unused by the API and are optional.
+
+Note: Please refer to the section on [Running an Instance Segmentation
+Model](instance_segmentation.md) for instructions on how to configure a model
+that predicts masks in addition to object bounding boxes.
@@ -21,56 +21,183 @@ from __future__ import print_function
 import functools
 import tensorflow as tf

-from object_detection import trainer
 from object_detection.builders import dataset_builder
+from object_detection.builders import image_resizer_builder
+from object_detection.builders import model_builder
 from object_detection.builders import preprocessor_builder
-from object_detection.core import prefetcher
+from object_detection.core import preprocessor
 from object_detection.core import standard_fields as fields
 from object_detection.data_decoders import tf_example_decoder
 from object_detection.protos import eval_pb2
 from object_detection.protos import input_reader_pb2
+from object_detection.protos import model_pb2
 from object_detection.protos import train_pb2
+from object_detection.utils import config_util
 from object_detection.utils import dataset_util
 from object_detection.utils import ops as util_ops

-FEATURES_IMAGE = 'images'
-FEATURES_KEY = 'key'
+HASH_KEY = 'hash'
+HASH_BINS = 1 << 31
 SERVING_FED_EXAMPLE_KEY = 'serialized_example'


-def create_train_input_fn(num_classes, train_config, train_input_config):
+def transform_input_data(tensor_dict,
+                         model_preprocess_fn,
+                         image_resizer_fn,
+                         num_classes,
+                         data_augmentation_fn=None,
+                         merge_multiple_boxes=False,
+                         retain_original_image=False):
+  """A single function that is responsible for all input data transformations.
+
+  Data transformation functions are applied in the following order.
+  1. data_augmentation_fn (optional): applied on tensor_dict.
+  2. model_preprocess_fn: applied only on image tensor in tensor_dict.
+  3. image_resizer_fn: applied only on instance mask tensor in tensor_dict.
+  4. one_hot_encoding: applied to classes tensor in tensor_dict.
+  5. merge_multiple_boxes (optional): when groundtruth boxes are exactly the
+     same they can be merged into a single box with an associated k-hot class
+     label.
+
+  Args:
+    tensor_dict: dictionary containing input tensors keyed by
+      fields.InputDataFields.
+    model_preprocess_fn: model's preprocess function to apply on image tensor.
+      This function must take in a 4-D float tensor and return a 4-D preprocess
+      float tensor and a tensor containing the true image shape.
+    image_resizer_fn: image resizer function to apply on groundtruth instance
+      masks. This function must take a 4-D float tensor of image and a 4-D
+      tensor of instances masks and return resized version of these along with
+      the true shapes.
+    num_classes: number of max classes to one-hot (or k-hot) encode the class
+      labels.
+    data_augmentation_fn: (optional) data augmentation function to apply on
+      input `tensor_dict`.
+    merge_multiple_boxes: (optional) whether to merge multiple groundtruth boxes
+      and classes for a given image if the boxes are exactly the same.
+    retain_original_image: (optional) whether to retain original image in the
+      output dictionary.
+
+  Returns:
+    A dictionary keyed by fields.InputDataFields containing the tensors obtained
+    after applying all the transformations.
+  """
+  if retain_original_image:
+    tensor_dict[fields.InputDataFields.
+                original_image] = tensor_dict[fields.InputDataFields.image]
+
+  # Apply data augmentation ops.
+  if data_augmentation_fn is not None:
+    tensor_dict = data_augmentation_fn(tensor_dict)
+
+  # Apply model preprocessing ops and resize instance masks.
+  image = tf.expand_dims(
+      tf.to_float(tensor_dict[fields.InputDataFields.image]), axis=0)
+  preprocessed_resized_image, true_image_shape = model_preprocess_fn(image)
+  tensor_dict[fields.InputDataFields.image] = tf.squeeze(
+      preprocessed_resized_image, axis=0)
+  tensor_dict[fields.InputDataFields.true_image_shape] = tf.squeeze(
+      true_image_shape, axis=0)
+  if fields.InputDataFields.groundtruth_instance_masks in tensor_dict:
+    masks = tensor_dict[fields.InputDataFields.groundtruth_instance_masks]
+    _, resized_masks, _ = image_resizer_fn(image, masks)
+    tensor_dict[fields.InputDataFields.
+                groundtruth_instance_masks] = resized_masks
+
+  # Transform groundtruth classes to one hot encodings.
+  label_offset = 1
+  zero_indexed_groundtruth_classes = tensor_dict[
+      fields.InputDataFields.groundtruth_classes] - label_offset
+  tensor_dict[fields.InputDataFields.groundtruth_classes] = tf.one_hot(
+      zero_indexed_groundtruth_classes, num_classes)
+
+  if merge_multiple_boxes:
+    merged_boxes, merged_classes, _ = util_ops.merge_boxes_with_multiple_labels(
+        tensor_dict[fields.InputDataFields.groundtruth_boxes],
+        zero_indexed_groundtruth_classes, num_classes)
+    tensor_dict[fields.InputDataFields.groundtruth_boxes] = merged_boxes
+    tensor_dict[fields.InputDataFields.groundtruth_classes] = merged_classes
+
+  return tensor_dict
+
+
+def augment_input_data(tensor_dict, data_augmentation_options):
+  """Applies data augmentation ops to input tensors.
+
+  Args:
+    tensor_dict: A dictionary of input tensors keyed by fields.InputDataFields.
+    data_augmentation_options: A list of tuples, where each tuple contains a
+      function and a dictionary that contains arguments and their values.
+      Usually, this is the output of core/preprocessor.build.
+
+  Returns:
+    A dictionary of tensors obtained by applying data augmentation ops to the
+    input tensor dictionary.
+  """
+  tensor_dict[fields.InputDataFields.image] = tf.expand_dims(
+      tf.to_float(tensor_dict[fields.InputDataFields.image]), 0)
+
+  include_instance_masks = (fields.InputDataFields.groundtruth_instance_masks
+                            in tensor_dict)
+  include_keypoints = (fields.InputDataFields.groundtruth_keypoints
+                       in tensor_dict)
+  tensor_dict = preprocessor.preprocess(
+      tensor_dict, data_augmentation_options,
+      func_arg_map=preprocessor.get_default_func_arg_map(
+          include_instance_masks=include_instance_masks,
+          include_keypoints=include_keypoints))
+  tensor_dict[fields.InputDataFields.image] = tf.squeeze(
+      tensor_dict[fields.InputDataFields.image], axis=0)
+  return tensor_dict
+
+
+def create_train_input_fn(train_config, train_input_config,
+                          model_config):
   """Creates a train `input` function for `Estimator`.

   Args:
-    num_classes: Number of classes, which does not include a background
-      category.
     train_config: A train_pb2.TrainConfig.
     train_input_config: An input_reader_pb2.InputReader.
+    model_config: A model_pb2.DetectionModel.

   Returns:
     `input_fn` for `Estimator` in TRAIN mode.
   """

-  def _train_input_fn():
+  def _train_input_fn(params=None):
     """Returns `features` and `labels` tensor dictionaries for training.

+    Args:
+      params: Parameter dictionary passed from the estimator.
+
     Returns:
       features: Dictionary of feature tensors.
-        features['images'] is a list of N [1, H, W, C] float32 tensors,
-          where N is the number of images in a batch.
-        features['key'] is a list of N string tensors, each representing a
-          unique identifier for the image.
+        features[fields.InputDataFields.image] is a [batch_size, H, W, C]
+          float32 tensor with preprocessed images.
+        features[HASH_KEY] is a [batch_size] int32 tensor representing unique
+          identifiers for the images.
+        features[fields.InputDataFields.true_image_shape] is a [batch_size, 3]
+          int32 tensor representing the true image shapes, as preprocessed
+          images could be padded.
       labels: Dictionary of groundtruth tensors.
-        labels['locations_list'] is a list of N [num_boxes, 4] float32 tensors
-          containing the corners of the groundtruth boxes.
-        labels['classes_list'] is a list of N [num_boxes, num_classes] float32
-          padded one-hot tensors of classes.
-        labels['masks_list'] is a list of N [num_boxes, H, W] float32 tensors
-          containing only binary values, which represent instance masks for
-          objects if present in the dataset. Else returns None.
-        labels[fields.InputDataFields.groundtruth_weights] is a list of N
-          [num_boxes] float32 tensors containing groundtruth weights for the
-          boxes.
+        labels[fields.InputDataFields.num_groundtruth_boxes] is a [batch_size]
+          int32 tensor indicating the number of groundtruth boxes.
+        labels[fields.InputDataFields.groundtruth_boxes] is a
+          [batch_size, num_boxes, 4] float32 tensor containing the corners of
+          the groundtruth boxes.
+        labels[fields.InputDataFields.groundtruth_classes] is a
+          [batch_size, num_boxes, num_classes] float32 one-hot tensor of
+          classes.
+        labels[fields.InputDataFields.groundtruth_weights] is a
+          [batch_size, num_boxes] float32 tensor containing groundtruth weights
+          for the boxes.
+        -- Optional --
+        labels[fields.InputDataFields.groundtruth_instance_masks] is a
+          [batch_size, num_boxes, H, W] float32 tensor containing only binary
+          values, which represent instance masks for objects.
+        labels[fields.InputDataFields.groundtruth_keypoints] is a
+          [batch_size, num_boxes, num_keypoints, 2] float32 tensor containing
+          keypoints for each box.

     Raises:
       TypeError: if the `train_config` or `train_input_config` are not of the
@@ -82,164 +209,226 @@ def create_train_input_fn(num_classes, train_config, train_input_config):
     if not isinstance(train_input_config, input_reader_pb2.InputReader):
       raise TypeError('The `train_input_config` must be a '
                       'input_reader_pb2.InputReader.')
+    if not isinstance(model_config, model_pb2.DetectionModel):
+      raise TypeError('The `model_config` must be a '
+                      'model_pb2.DetectionModel.')

-    def get_next(config):
-      return dataset_util.make_initializable_iterator(
-          dataset_builder.build(config)).get_next()
-
-    create_tensor_dict_fn = functools.partial(get_next, train_input_config)
-
     data_augmentation_options = [
         preprocessor_builder.build(step)
         for step in train_config.data_augmentation_options
     ]
+    data_augmentation_fn = functools.partial(
+        augment_input_data, data_augmentation_options=data_augmentation_options)

-    input_queue = trainer.create_input_queue(
-        batch_size_per_clone=train_config.batch_size,
-        create_tensor_dict_fn=create_tensor_dict_fn,
-        batch_queue_capacity=train_config.batch_queue_capacity,
-        num_batch_queue_threads=train_config.num_batch_queue_threads,
-        prefetch_queue_capacity=train_config.prefetch_queue_capacity,
-        data_augmentation_options=data_augmentation_options)
-
-    (images_tuple, image_keys, locations_tuple, classes_tuple, masks_tuple,
-     keypoints_tuple, weights_tuple) = (trainer.get_inputs(
-         input_queue=input_queue, num_classes=num_classes))
+    model = model_builder.build(model_config, is_training=True)
+    image_resizer_config = config_util.get_image_resizer_config(model_config)
+    image_resizer_fn = image_resizer_builder.build(image_resizer_config)
+
+    transform_data_fn = functools.partial(
+        transform_input_data, model_preprocess_fn=model.preprocess,
+        image_resizer_fn=image_resizer_fn,
+        num_classes=config_util.get_number_of_classes(model_config),
+        data_augmentation_fn=data_augmentation_fn)
+    dataset = dataset_builder.build(
+        train_input_config,
+        transform_input_data_fn=transform_data_fn,
+        batch_size=params['batch_size'] if params else train_config.batch_size,
+        max_num_boxes=train_config.max_number_of_boxes,
+        num_classes=config_util.get_number_of_classes(model_config),
+        spatial_image_shape=config_util.get_spatial_image_size(
+            image_resizer_config))
+    tensor_dict = dataset_util.make_initializable_iterator(dataset).get_next()

+    hash_from_source_id = tf.string_to_hash_bucket_fast(
+        tensor_dict[fields.InputDataFields.source_id], HASH_BINS)
     features = {
-        FEATURES_IMAGE: list(images_tuple),
-        FEATURES_KEY: list(image_keys)
+        fields.InputDataFields.image: tensor_dict[fields.InputDataFields.image],
+        HASH_KEY: tf.cast(hash_from_source_id, tf.int32),
+        fields.InputDataFields.true_image_shape: tensor_dict[
+            fields.InputDataFields.true_image_shape]
     }

     labels = {
-        'locations_list': list(locations_tuple),
-        'classes_list': list(classes_tuple)
+        fields.InputDataFields.num_groundtruth_boxes: tensor_dict[
+            fields.InputDataFields.num_groundtruth_boxes],
+        fields.InputDataFields.groundtruth_boxes: tensor_dict[
+            fields.InputDataFields.groundtruth_boxes],
+        fields.InputDataFields.groundtruth_classes: tensor_dict[
+            fields.InputDataFields.groundtruth_classes],
+        fields.InputDataFields.groundtruth_weights: tensor_dict[
+            fields.InputDataFields.groundtruth_weights]
     }
-
-    # Make sure that there are no tuple elements with None.
-    if all(masks is not None for masks in masks_tuple):
-      labels['masks_list'] = list(masks_tuple)
-    if all(keypoints is not None for keypoints in keypoints_tuple):
-      labels['keypoints_list'] = list(keypoints_tuple)
-    if all((elem is not None for elem in weights_tuple)):
-      labels[fields.InputDataFields.groundtruth_weights] = list(weights_tuple)
+    if fields.InputDataFields.groundtruth_keypoints in tensor_dict:
+      labels[fields.InputDataFields.groundtruth_keypoints] = tensor_dict[
+          fields.InputDataFields.groundtruth_keypoints]
+    if fields.InputDataFields.groundtruth_instance_masks in tensor_dict:
+      labels[fields.InputDataFields.groundtruth_instance_masks] = tensor_dict[
+          fields.InputDataFields.groundtruth_instance_masks]

     return features, labels

   return _train_input_fn


-def create_eval_input_fn(num_classes, eval_config, eval_input_config):
+def create_eval_input_fn(eval_config, eval_input_config, model_config):
   """Creates an eval `input` function for `Estimator`.

   Args:
-    num_classes: Number of classes, which does not include a background
-      category.
     eval_config: An eval_pb2.EvalConfig.
     eval_input_config: An input_reader_pb2.InputReader.
+    model_config: A model_pb2.DetectionModel.

   Returns:
     `input_fn` for `Estimator` in EVAL mode.
   """

-  def _eval_input_fn():
+  def _eval_input_fn(params=None):
     """Returns `features` and `labels` tensor dictionaries for evaluation.

+    Args:
+      params: Parameter dictionary passed from the estimator.
+
     Returns:
       features: Dictionary of feature tensors.
-        features['images'] is a [1, H, W, C] float32 tensor.
-        features['key'] is a string tensor representing a unique identifier for
-          the image.
+        features[fields.InputDataFields.image] is a [1, H, W, C] float32 tensor
+          with preprocessed images.
+        features[HASH_KEY] is a [1] int32 tensor representing unique
+          identifiers for the images.
+        features[fields.InputDataFields.true_image_shape] is a [1, 3]
+          int32 tensor representing the true image shapes, as preprocessed
+          images could be padded.
+        features[fields.InputDataFields.original_image] is a [1, H', W', C]
+          float32 tensor with the original image.
       labels: Dictionary of groundtruth tensors.
-        labels['locations_list'] is a list of 1 [num_boxes, 4] float32 tensors
-          containing the corners of the groundtruth boxes.
-        labels['classes_list'] is a list of 1 [num_boxes, num_classes] float32
-          padded one-hot tensors of classes.
-        labels['masks_list'] is an (optional) list of 1 [num_boxes, H, W]
-          float32 tensors containing only binary values, which represent
-          instance masks for objects if present in the dataset. Else returns
-          None.
-        labels['image_id_list'] is a list of 1 string tensors containing the
-          original image id.
-        labels['area_list'] is a list of 1 [num_boxes] float32 tensors
-          containing object mask area in pixels squared.
-        labels['is_crowd_list'] is a list of 1 [num_boxes] bool tensors
-          indicating if the boxes enclose a crowd.
-        labels['difficult_list'] is a list of 1 [num_boxes] bool tensors
-          indicating if the boxes represent `difficult` instances.
+        labels[fields.InputDataFields.groundtruth_boxes] is a [1, num_boxes, 4]
+          float32 tensor containing the corners of the groundtruth boxes.
+        labels[fields.InputDataFields.groundtruth_classes] is a
+          [num_boxes, num_classes] float32 one-hot tensor of classes.
+        labels[fields.InputDataFields.groundtruth_area] is a [1, num_boxes]
+          float32 tensor containing object areas.
+        labels[fields.InputDataFields.groundtruth_is_crowd] is a [1, num_boxes]
+          bool tensor indicating if the boxes enclose a crowd.
+        labels[fields.InputDataFields.groundtruth_difficult] is a [1, num_boxes]
+          int32 tensor indicating if the boxes represent difficult instances.
+        -- Optional --
+        labels[fields.InputDataFields.groundtruth_instance_masks] is a
+          [1, num_boxes, H, W] float32 tensor containing only binary values,
+          which represent instance masks for objects.

     Raises:
       TypeError: if the `eval_config` or `eval_input_config` are not of the
         correct type.
     """
+    del params
     if not isinstance(eval_config, eval_pb2.EvalConfig):
       raise TypeError('For eval mode, the `eval_config` must be a '
-                      'eval_pb2.EvalConfig.')
+                      'train_pb2.EvalConfig.')
     if not isinstance(eval_input_config, input_reader_pb2.InputReader):
       raise TypeError('The `eval_input_config` must be a '
                       'input_reader_pb2.InputReader.')
+    if not isinstance(model_config, model_pb2.DetectionModel):
+      raise TypeError('The `model_config` must be a '
+                      'model_pb2.DetectionModel.')

+    num_classes = config_util.get_number_of_classes(model_config)
+    model = model_builder.build(model_config, is_training=False)
+    image_resizer_config = config_util.get_image_resizer_config(model_config)
+    image_resizer_fn = image_resizer_builder.build(image_resizer_config)
+
+    transform_data_fn = functools.partial(
+        transform_input_data, model_preprocess_fn=model.preprocess,
+        image_resizer_fn=image_resizer_fn,
+        num_classes=num_classes,
+        data_augmentation_fn=None,
+        retain_original_image=True)
+    dataset = dataset_builder.build(eval_input_config,
+                                    transform_input_data_fn=transform_data_fn)
+    input_dict = dataset_util.make_initializable_iterator(dataset).get_next()

-    input_dict = dataset_util.make_initializable_iterator(
-        dataset_builder.build(eval_input_config)).get_next()
-    prefetch_queue = prefetcher.prefetch(input_dict, capacity=500)
-    input_dict = prefetch_queue.dequeue()
-    original_image = tf.to_float(
-        tf.expand_dims(input_dict[fields.InputDataFields.image], 0))
-    features = {}
-    features[FEATURES_IMAGE] = original_image
-    features[FEATURES_KEY] = input_dict[fields.InputDataFields.source_id]
-
-    labels = {}
-    labels['locations_list'] = [
-        input_dict[fields.InputDataFields.groundtruth_boxes]
-    ]
-    classes_gt = tf.cast(input_dict[fields.InputDataFields.groundtruth_classes],
-                         tf.int32)
-    classes_gt -= 1  # Remove the label id offset.
-    labels['classes_list'] = [
-        util_ops.padded_one_hot_encoding(
-            indices=classes_gt, depth=num_classes, left_pad=0)
-    ]
-    labels['image_id_list'] = [input_dict[fields.InputDataFields.source_id]]
-    labels['area_list'] = [input_dict[fields.InputDataFields.groundtruth_area]]
-    labels['is_crowd_list'] = [
-        input_dict[fields.InputDataFields.groundtruth_is_crowd]
-    ]
-    labels['difficult_list'] = [
-        input_dict[fields.InputDataFields.groundtruth_difficult]
-    ]
+    hash_from_source_id = tf.string_to_hash_bucket_fast(
+        input_dict[fields.InputDataFields.source_id], HASH_BINS)
+    features = {
+        fields.InputDataFields.image:
+            input_dict[fields.InputDataFields.image],
+        fields.InputDataFields.original_image:
+            input_dict[fields.InputDataFields.original_image],
+        HASH_KEY: tf.cast(hash_from_source_id, tf.int32),
+        fields.InputDataFields.true_image_shape:
+            input_dict[fields.InputDataFields.true_image_shape]
+    }
+
+    labels = {
+        fields.InputDataFields.groundtruth_boxes:
+            input_dict[fields.InputDataFields.groundtruth_boxes],
+        fields.InputDataFields.groundtruth_classes:
+            input_dict[fields.InputDataFields.groundtruth_classes],
+        fields.InputDataFields.groundtruth_area:
+            input_dict[fields.InputDataFields.groundtruth_area],
+        fields.InputDataFields.groundtruth_is_crowd:
+            input_dict[fields.InputDataFields.groundtruth_is_crowd],
+        fields.InputDataFields.groundtruth_difficult:
+            tf.cast(input_dict[fields.InputDataFields.groundtruth_difficult],
                    tf.int32)
+    }

     if fields.InputDataFields.groundtruth_instance_masks in input_dict:
-      labels['masks_list'] = [
-          input_dict[fields.InputDataFields.groundtruth_instance_masks]
-      ]
+      labels[fields.InputDataFields.groundtruth_instance_masks] = input_dict[
+          fields.InputDataFields.groundtruth_instance_masks]
+
+    # Add a batch dimension to the tensors.
+    features = {
+        key: tf.expand_dims(features[key], axis=0)
+        for key, feature in features.items()
+    }
+    labels = {
+        key: tf.expand_dims(labels[key], axis=0)
+        for key, label in labels.items()
+    }

     return features, labels

   return _eval_input_fn


-def create_predict_input_fn():
+def create_predict_input_fn(model_config):
   """Creates a predict `input` function for `Estimator`.

+  Args:
+    model_config: A model_pb2.DetectionModel.
+
   Returns:
     `input_fn` for `Estimator` in PREDICT mode.
   """

-  def _predict_input_fn():
+  def _predict_input_fn(params=None):
     """Decodes serialized tf.Examples and returns `ServingInputReceiver`.

+    Args:
+      params: Parameter dictionary passed from the estimator.
+
     Returns:
       `ServingInputReceiver`.
     """
+    del params
     example = tf.placeholder(dtype=tf.string, shape=[], name='input_feature')
-    decoder = tf_example_decoder.TfExampleDecoder(load_instance_masks=False)
-
-    input_dict = decoder.decode(example)
+    num_classes = config_util.get_number_of_classes(model_config)
+    model = model_builder.build(model_config, is_training=False)
+    image_resizer_config = config_util.get_image_resizer_config(model_config)
+    image_resizer_fn = image_resizer_builder.build(image_resizer_config)
+
+    transform_fn = functools.partial(
+        transform_input_data, model_preprocess_fn=model.preprocess,
+        image_resizer_fn=image_resizer_fn,
+        num_classes=num_classes,
+        data_augmentation_fn=None)
+
+    decoder = tf_example_decoder.TfExampleDecoder(load_instance_masks=False)
+    input_dict = transform_fn(decoder.decode(example))
     images = tf.to_float(input_dict[fields.InputDataFields.image])
     images = tf.expand_dims(images, axis=0)

     return tf.estimator.export.ServingInputReceiver(
-        features={FEATURES_IMAGE: images},
+        features={fields.InputDataFields.image: images},
         receiver_tensors={SERVING_FED_EXAMPLE_KEY: example})

   return _predict_input_fn
...@@ -18,11 +18,14 @@ from __future__ import absolute_import ...@@ -18,11 +18,14 @@ from __future__ import absolute_import
from __future__ import division from __future__ import division
from __future__ import print_function from __future__ import print_function
import functools
import os import os
import numpy as np
import tensorflow as tf import tensorflow as tf
from object_detection import inputs from object_detection import inputs
from object_detection.core import preprocessor
from object_detection.core import standard_fields as fields from object_detection.core import standard_fields as fields
from object_detection.utils import config_util from object_detection.utils import config_util
...@@ -52,148 +55,516 @@ def _get_configs_for_model(model_name): ...@@ -52,148 +55,516 @@ def _get_configs_for_model(model_name):
class InputsTest(tf.test.TestCase): class InputsTest(tf.test.TestCase):
def _assert_training_inputs(self, features, labels, num_classes, batch_size):
self.assertEqual(batch_size, len(features['images']))
self.assertEqual(batch_size, len(features['key']))
self.assertEqual(batch_size, len(labels['locations_list']))
self.assertEqual(batch_size, len(labels['classes_list']))
for i in range(batch_size):
image = features['images'][i]
key = features['key'][i]
locations_list = labels['locations_list'][i]
classes_list = labels['classes_list'][i]
weights_list = labels[fields.InputDataFields.groundtruth_weights][i]
self.assertEqual([1, None, None, 3], image.shape.as_list())
self.assertEqual(tf.float32, image.dtype)
self.assertEqual(tf.string, key.dtype)
self.assertEqual([None, 4], locations_list.shape.as_list())
self.assertEqual(tf.float32, locations_list.dtype)
self.assertEqual([None, num_classes], classes_list.shape.as_list())
self.assertEqual(tf.float32, classes_list.dtype)
self.assertEqual([None], weights_list.shape.as_list())
self.assertEqual(tf.float32, weights_list.dtype)
def _assert_eval_inputs(self, features, labels, num_classes):
self.assertEqual(1, len(labels['locations_list']))
self.assertEqual(1, len(labels['classes_list']))
self.assertEqual(1, len(labels['image_id_list']))
self.assertEqual(1, len(labels['area_list']))
self.assertEqual(1, len(labels['is_crowd_list']))
self.assertEqual(1, len(labels['difficult_list']))
image = features['images']
key = features['key']
locations_list = labels['locations_list'][0]
classes_list = labels['classes_list'][0]
image_id_list = labels['image_id_list'][0]
area_list = labels['area_list'][0]
is_crowd_list = labels['is_crowd_list'][0]
difficult_list = labels['difficult_list'][0]
self.assertEqual([1, None, None, 3], image.shape.as_list())
self.assertEqual(tf.float32, image.dtype)
self.assertEqual(tf.string, key.dtype)
self.assertEqual([None, 4], locations_list.shape.as_list())
self.assertEqual(tf.float32, locations_list.dtype)
self.assertEqual([None, num_classes], classes_list.shape.as_list())
self.assertEqual(tf.float32, classes_list.dtype)
self.assertEqual(tf.string, image_id_list.dtype)
self.assertEqual(tf.float32, area_list.dtype)
self.assertEqual(tf.bool, is_crowd_list.dtype)
self.assertEqual(tf.int64, difficult_list.dtype)
def test_faster_rcnn_resnet50_train_input(self): def test_faster_rcnn_resnet50_train_input(self):
"""Tests the training input function for FasterRcnnResnet50.""" """Tests the training input function for FasterRcnnResnet50."""
configs = _get_configs_for_model('faster_rcnn_resnet50_pets') configs = _get_configs_for_model('faster_rcnn_resnet50_pets')
classes = 37 configs['train_config'].unpad_groundtruth_tensors = True
batch_size = configs['train_config'].batch_size model_config = configs['model']
model_config.faster_rcnn.num_classes = 37
train_input_fn = inputs.create_train_input_fn( train_input_fn = inputs.create_train_input_fn(
classes, configs['train_config'], configs['train_input_config']) configs['train_config'], configs['train_input_config'], model_config)
features, labels = train_input_fn() features, labels = train_input_fn()
self._assert_training_inputs(features, labels, classes, batch_size)
self.assertAllEqual([None, None, 3],
features[fields.InputDataFields.image].shape.as_list())
self.assertEqual(tf.float32, features[fields.InputDataFields.image].dtype)
self.assertAllEqual([],
features[inputs.HASH_KEY].shape.as_list())
self.assertEqual(tf.int32, features[inputs.HASH_KEY].dtype)
self.assertAllEqual(
[None, 4],
labels[fields.InputDataFields.groundtruth_boxes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_boxes].dtype)
self.assertAllEqual(
[None, model_config.faster_rcnn.num_classes],
labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[None],
labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_weights].dtype)
def test_faster_rcnn_resnet50_eval_input(self): def test_faster_rcnn_resnet50_eval_input(self):
"""Tests the eval input function for FasterRcnnResnet50.""" """Tests the eval input function for FasterRcnnResnet50."""
configs = _get_configs_for_model('faster_rcnn_resnet50_pets') configs = _get_configs_for_model('faster_rcnn_resnet50_pets')
classes = 37 model_config = configs['model']
eval_input_fn = inputs.create_eval_input_fn(classes, configs['eval_config'], model_config.faster_rcnn.num_classes = 37
configs['eval_input_config']) eval_input_fn = inputs.create_eval_input_fn(
configs['eval_config'], configs['eval_input_config'], model_config)
features, labels = eval_input_fn() features, labels = eval_input_fn()
self._assert_eval_inputs(features, labels, classes)
self.assertAllEqual([1, None, None, 3],
features[fields.InputDataFields.image].shape.as_list())
self.assertEqual(tf.float32, features[fields.InputDataFields.image].dtype)
self.assertAllEqual(
[1, None, None, 3],
features[fields.InputDataFields.original_image].shape.as_list())
self.assertEqual(tf.uint8,
features[fields.InputDataFields.original_image].dtype)
self.assertAllEqual([1], features[inputs.HASH_KEY].shape.as_list())
self.assertEqual(tf.int32, features[inputs.HASH_KEY].dtype)
self.assertAllEqual(
[1, None, 4],
labels[fields.InputDataFields.groundtruth_boxes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_boxes].dtype)
self.assertAllEqual(
[1, None, model_config.faster_rcnn.num_classes],
labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[1, None],
labels[fields.InputDataFields.groundtruth_area].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_area].dtype)
self.assertAllEqual(
[1, None],
labels[fields.InputDataFields.groundtruth_is_crowd].shape.as_list())
self.assertEqual(
tf.bool, labels[fields.InputDataFields.groundtruth_is_crowd].dtype)
self.assertAllEqual(
[1, None],
labels[fields.InputDataFields.groundtruth_difficult].shape.as_list())
self.assertEqual(
tf.int32, labels[fields.InputDataFields.groundtruth_difficult].dtype)
def test_ssd_inceptionV2_train_input(self): def test_ssd_inceptionV2_train_input(self):
"""Tests the training input function for SSDInceptionV2.""" """Tests the training input function for SSDInceptionV2."""
configs = _get_configs_for_model('ssd_inception_v2_pets') configs = _get_configs_for_model('ssd_inception_v2_pets')
classes = 37 model_config = configs['model']
model_config.ssd.num_classes = 37
batch_size = configs['train_config'].batch_size batch_size = configs['train_config'].batch_size
train_input_fn = inputs.create_train_input_fn( train_input_fn = inputs.create_train_input_fn(
classes, configs['train_config'], configs['train_input_config']) configs['train_config'], configs['train_input_config'], model_config)
features, labels = train_input_fn() features, labels = train_input_fn()
self._assert_training_inputs(features, labels, classes, batch_size)
self.assertAllEqual([batch_size, 300, 300, 3],
features[fields.InputDataFields.image].shape.as_list())
self.assertEqual(tf.float32, features[fields.InputDataFields.image].dtype)
self.assertAllEqual([batch_size],
features[inputs.HASH_KEY].shape.as_list())
self.assertEqual(tf.int32, features[inputs.HASH_KEY].dtype)
self.assertAllEqual(
[batch_size],
labels[fields.InputDataFields.num_groundtruth_boxes].shape.as_list())
self.assertEqual(tf.int32,
labels[fields.InputDataFields.num_groundtruth_boxes].dtype)
self.assertAllEqual(
[batch_size, 50, 4],
labels[fields.InputDataFields.groundtruth_boxes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_boxes].dtype)
self.assertAllEqual(
[batch_size, 50, model_config.ssd.num_classes],
labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[batch_size, 50],
labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_weights].dtype)
  def test_ssd_inceptionV2_eval_input(self):
    """Tests the eval input function for SSDInceptionV2."""
    configs = _get_configs_for_model('ssd_inception_v2_pets')
    model_config = configs['model']
    model_config.ssd.num_classes = 37
    eval_input_fn = inputs.create_eval_input_fn(
        configs['eval_config'], configs['eval_input_config'], model_config)
    features, labels = eval_input_fn()
self.assertAllEqual([1, 300, 300, 3],
features[fields.InputDataFields.image].shape.as_list())
self.assertEqual(tf.float32, features[fields.InputDataFields.image].dtype)
self.assertAllEqual(
[1, None, None, 3],
features[fields.InputDataFields.original_image].shape.as_list())
self.assertEqual(tf.uint8,
features[fields.InputDataFields.original_image].dtype)
self.assertAllEqual([1], features[inputs.HASH_KEY].shape.as_list())
self.assertEqual(tf.int32, features[inputs.HASH_KEY].dtype)
self.assertAllEqual(
[1, None, 4],
labels[fields.InputDataFields.groundtruth_boxes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_boxes].dtype)
self.assertAllEqual(
[1, None, model_config.ssd.num_classes],
labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[1, None],
labels[fields.InputDataFields.groundtruth_area].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_area].dtype)
self.assertAllEqual(
[1, None],
labels[fields.InputDataFields.groundtruth_is_crowd].shape.as_list())
self.assertEqual(
tf.bool, labels[fields.InputDataFields.groundtruth_is_crowd].dtype)
self.assertAllEqual(
[1, None],
labels[fields.InputDataFields.groundtruth_difficult].shape.as_list())
self.assertEqual(
tf.int32, labels[fields.InputDataFields.groundtruth_difficult].dtype)
  def test_predict_input(self):
    """Tests the predict input function."""
    configs = _get_configs_for_model('ssd_inception_v2_pets')
    predict_input_fn = inputs.create_predict_input_fn(
        model_config=configs['model'])
    serving_input_receiver = predict_input_fn()
    image = serving_input_receiver.features[fields.InputDataFields.image]
    receiver_tensors = serving_input_receiver.receiver_tensors[
        inputs.SERVING_FED_EXAMPLE_KEY]
    self.assertEqual([1, 300, 300, 3], image.shape.as_list())
    self.assertEqual(tf.float32, image.dtype)
    self.assertEqual(tf.string, receiver_tensors.dtype)
  def test_error_with_bad_train_config(self):
    """Tests that a TypeError is raised with improper train config."""
    configs = _get_configs_for_model('ssd_inception_v2_pets')
    configs['model'].ssd.num_classes = 37
    train_input_fn = inputs.create_train_input_fn(
        train_config=configs['eval_config'],  # Expecting `TrainConfig`.
        train_input_config=configs['train_input_config'],
        model_config=configs['model'])
    with self.assertRaises(TypeError):
      train_input_fn()
  def test_error_with_bad_train_input_config(self):
    """Tests that a TypeError is raised with improper train input config."""
    configs = _get_configs_for_model('ssd_inception_v2_pets')
    configs['model'].ssd.num_classes = 37
    train_input_fn = inputs.create_train_input_fn(
        train_config=configs['train_config'],
        train_input_config=configs['model'],  # Expecting `InputReader`.
        model_config=configs['model'])
    with self.assertRaises(TypeError):
      train_input_fn()

  def test_error_with_bad_train_model_config(self):
    """Tests that a TypeError is raised with improper train model config."""
    configs = _get_configs_for_model('ssd_inception_v2_pets')
    configs['model'].ssd.num_classes = 37
    train_input_fn = inputs.create_train_input_fn(
        train_config=configs['train_config'],
        train_input_config=configs['train_input_config'],
        model_config=configs['train_config'])  # Expecting `DetectionModel`.
    with self.assertRaises(TypeError):
      train_input_fn()
  def test_error_with_bad_eval_config(self):
    """Tests that a TypeError is raised with improper eval config."""
    configs = _get_configs_for_model('ssd_inception_v2_pets')
    configs['model'].ssd.num_classes = 37
    eval_input_fn = inputs.create_eval_input_fn(
        eval_config=configs['train_config'],  # Expecting `EvalConfig`.
        eval_input_config=configs['eval_input_config'],
        model_config=configs['model'])
    with self.assertRaises(TypeError):
      eval_input_fn()

  def test_error_with_bad_eval_input_config(self):
    """Tests that a TypeError is raised with improper eval input config."""
    configs = _get_configs_for_model('ssd_inception_v2_pets')
    configs['model'].ssd.num_classes = 37
    eval_input_fn = inputs.create_eval_input_fn(
        eval_config=configs['eval_config'],
        eval_input_config=configs['model'],  # Expecting `InputReader`.
        model_config=configs['model'])
    with self.assertRaises(TypeError):
      eval_input_fn()
def test_error_with_bad_eval_model_config(self):
"""Tests that a TypeError is raised with improper eval model config."""
configs = _get_configs_for_model('ssd_inception_v2_pets')
configs['model'].ssd.num_classes = 37
eval_input_fn = inputs.create_eval_input_fn(
eval_config=configs['eval_config'],
eval_input_config=configs['eval_input_config'],
model_config=configs['eval_config']) # Expecting `DetectionModel`.
with self.assertRaises(TypeError):
eval_input_fn()
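# The input-fn tests above expect features[inputs.HASH_KEY] to be an int32
# hash that replaces the string source_id in the features dictionary. A
# minimal sketch of such a hashing step (the bucket size is illustrative, not
# necessarily the constant inputs.py uses):
def _hash_source_id_sketch(source_id):
  # Map an arbitrary string id tensor to a stable int32 bucket.
  return tf.cast(
      tf.string_to_hash_bucket_fast(source_id, 2 ** 31 - 1), tf.int32)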
class DataAugmentationFnTest(tf.test.TestCase):
def test_apply_image_and_box_augmentation(self):
data_augmentation_options = [
(preprocessor.resize_image, {
'new_height': 20,
'new_width': 20,
'method': tf.image.ResizeMethod.NEAREST_NEIGHBOR
}),
(preprocessor.scale_boxes_to_pixel_coordinates, {}),
]
data_augmentation_fn = functools.partial(
inputs.augment_input_data,
data_augmentation_options=data_augmentation_options)
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np.random.rand(10, 10, 3).astype(np.float32)),
fields.InputDataFields.groundtruth_boxes:
tf.constant(np.array([[.5, .5, 1., 1.]], np.float32))
}
augmented_tensor_dict = data_augmentation_fn(tensor_dict=tensor_dict)
with self.test_session() as sess:
augmented_tensor_dict_out = sess.run(augmented_tensor_dict)
self.assertAllEqual(
augmented_tensor_dict_out[fields.InputDataFields.image].shape,
[20, 20, 3]
)
self.assertAllClose(
augmented_tensor_dict_out[fields.InputDataFields.groundtruth_boxes],
[[10, 10, 20, 20]]
)
def test_include_masks_in_data_augmentation(self):
data_augmentation_options = [
(preprocessor.resize_image, {
'new_height': 20,
'new_width': 20,
'method': tf.image.ResizeMethod.NEAREST_NEIGHBOR
})
]
data_augmentation_fn = functools.partial(
inputs.augment_input_data,
data_augmentation_options=data_augmentation_options)
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np.random.rand(10, 10, 3).astype(np.float32)),
fields.InputDataFields.groundtruth_instance_masks:
tf.constant(np.zeros([2, 10, 10], np.uint8))
}
augmented_tensor_dict = data_augmentation_fn(tensor_dict=tensor_dict)
with self.test_session() as sess:
augmented_tensor_dict_out = sess.run(augmented_tensor_dict)
self.assertAllEqual(
augmented_tensor_dict_out[fields.InputDataFields.image].shape,
[20, 20, 3])
self.assertAllEqual(augmented_tensor_dict_out[
fields.InputDataFields.groundtruth_instance_masks].shape, [2, 20, 20])
def test_include_keypoints_in_data_augmentation(self):
data_augmentation_options = [
(preprocessor.resize_image, {
'new_height': 20,
'new_width': 20,
'method': tf.image.ResizeMethod.NEAREST_NEIGHBOR
}),
(preprocessor.scale_boxes_to_pixel_coordinates, {}),
]
data_augmentation_fn = functools.partial(
inputs.augment_input_data,
data_augmentation_options=data_augmentation_options)
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np.random.rand(10, 10, 3).astype(np.float32)),
fields.InputDataFields.groundtruth_boxes:
tf.constant(np.array([[.5, .5, 1., 1.]], np.float32)),
fields.InputDataFields.groundtruth_keypoints:
tf.constant(np.array([[[0.5, 1.0], [0.5, 0.5]]], np.float32))
}
augmented_tensor_dict = data_augmentation_fn(tensor_dict=tensor_dict)
with self.test_session() as sess:
augmented_tensor_dict_out = sess.run(augmented_tensor_dict)
self.assertAllEqual(
augmented_tensor_dict_out[fields.InputDataFields.image].shape,
[20, 20, 3]
)
self.assertAllClose(
augmented_tensor_dict_out[fields.InputDataFields.groundtruth_boxes],
[[10, 10, 20, 20]]
)
self.assertAllClose(
augmented_tensor_dict_out[fields.InputDataFields.groundtruth_keypoints],
[[[10, 20], [10, 10]]]
)
def _fake_model_preprocessor_fn(image):
return (image, tf.expand_dims(tf.shape(image)[1:], axis=0))
def _fake_image_resizer_fn(image, mask):
return (image, mask, tf.shape(image))
class DataTransformationFnTest(tf.test.TestCase):
def test_returns_correct_class_label_encodings(self):
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np.random.rand(4, 4, 3).astype(np.float32)),
fields.InputDataFields.groundtruth_boxes:
tf.constant(np.array([[0, 0, 1, 1], [.5, .5, 1, 1]], np.float32)),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([3, 1], np.int32))
}
num_classes = 3
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=num_classes)
with self.test_session() as sess:
transformed_inputs = sess.run(
input_transformation_fn(tensor_dict=tensor_dict))
self.assertAllClose(
transformed_inputs[fields.InputDataFields.groundtruth_classes],
[[0, 0, 1], [1, 0, 0]])
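  # For reference: transform_input_data one-hot encodes the 1-indexed class
  # labels, i.e. tf.one_hot(labels - 1, depth=num_classes), so [3, 1] with
  # num_classes=3 yields [[0, 0, 1], [1, 0, 0]] as asserted above (a sketch of
  # the expected behavior, not necessarily the library's exact implementation).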
def test_returns_correct_merged_boxes(self):
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np.random.rand(4, 4, 3).astype(np.float32)),
fields.InputDataFields.groundtruth_boxes:
tf.constant(np.array([[.5, .5, 1, 1], [.5, .5, 1, 1]], np.float32)),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([3, 1], np.int32))
}
num_classes = 3
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=num_classes,
merge_multiple_boxes=True)
with self.test_session() as sess:
transformed_inputs = sess.run(
input_transformation_fn(tensor_dict=tensor_dict))
self.assertAllClose(
transformed_inputs[fields.InputDataFields.groundtruth_boxes],
[[.5, .5, 1., 1.]])
self.assertAllClose(
transformed_inputs[fields.InputDataFields.groundtruth_classes],
[[1, 0, 1]])
def test_returns_resized_masks(self):
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np.random.rand(4, 4, 3).astype(np.float32)),
fields.InputDataFields.groundtruth_instance_masks:
tf.constant(np.random.rand(2, 4, 4).astype(np.float32)),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([3, 1], np.int32))
}
def fake_image_resizer_fn(image, masks):
resized_image = tf.image.resize_images(image, [8, 8])
resized_masks = tf.transpose(
tf.image.resize_images(tf.transpose(masks, [1, 2, 0]), [8, 8]),
[2, 0, 1])
return resized_image, resized_masks, tf.shape(resized_image)
num_classes = 3
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_model_preprocessor_fn,
image_resizer_fn=fake_image_resizer_fn,
num_classes=num_classes)
with self.test_session() as sess:
transformed_inputs = sess.run(
input_transformation_fn(tensor_dict=tensor_dict))
self.assertAllEqual(transformed_inputs[
fields.InputDataFields.groundtruth_instance_masks].shape, [2, 8, 8])
def test_applies_model_preprocess_fn_to_image_tensor(self):
np_image = np.random.randint(256, size=(4, 4, 3))
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np_image),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([3, 1], np.int32))
}
def fake_model_preprocessor_fn(image):
return (image / 255., tf.expand_dims(tf.shape(image)[1:], axis=0))
num_classes = 3
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=fake_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=num_classes)
with self.test_session() as sess:
transformed_inputs = sess.run(
input_transformation_fn(tensor_dict=tensor_dict))
self.assertAllClose(transformed_inputs[fields.InputDataFields.image],
np_image / 255.)
self.assertAllClose(transformed_inputs[fields.InputDataFields.
true_image_shape],
[4, 4, 3])
def test_applies_data_augmentation_fn_to_tensor_dict(self):
np_image = np.random.randint(256, size=(4, 4, 3))
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np_image),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([3, 1], np.int32))
}
def add_one_data_augmentation_fn(tensor_dict):
return {key: value + 1 for key, value in tensor_dict.items()}
num_classes = 4
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=num_classes,
data_augmentation_fn=add_one_data_augmentation_fn)
with self.test_session() as sess:
augmented_tensor_dict = sess.run(
input_transformation_fn(tensor_dict=tensor_dict))
self.assertAllEqual(augmented_tensor_dict[fields.InputDataFields.image],
np_image + 1)
self.assertAllEqual(
augmented_tensor_dict[fields.InputDataFields.groundtruth_classes],
[[0, 0, 0, 1], [0, 1, 0, 0]])
def test_applies_data_augmentation_fn_before_model_preprocess_fn(self):
np_image = np.random.randint(256, size=(4, 4, 3))
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np_image),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([3, 1], np.int32))
}
def mul_two_model_preprocessor_fn(image):
return (image * 2, tf.expand_dims(tf.shape(image)[1:], axis=0))
def add_five_to_image_data_augmentation_fn(tensor_dict):
tensor_dict[fields.InputDataFields.image] += 5
return tensor_dict
num_classes = 4
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=mul_two_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=num_classes,
data_augmentation_fn=add_five_to_image_data_augmentation_fn)
with self.test_session() as sess:
augmented_tensor_dict = sess.run(
input_transformation_fn(tensor_dict=tensor_dict))
self.assertAllEqual(augmented_tensor_dict[fields.InputDataFields.image],
(np_image + 5) * 2)
if __name__ == '__main__':
  tf.test.main()
...@@ -55,7 +55,8 @@ class ArgMaxMatcher(matcher.Matcher):
               matched_threshold,
               unmatched_threshold=None,
               negatives_lower_than_unmatched=True,
               force_match_for_each_row=False,
               use_matmul_gather=False):
    """Construct ArgMaxMatcher.

    Args:
...@@ -74,11 +75,15 @@ class ArgMaxMatcher(matcher.Matcher):
        at least one column (which is not guaranteed otherwise if the
        matched_threshold is high). Defaults to False. See
        argmax_matcher_test.testMatcherForceMatch() for an example.
      use_matmul_gather: Force constructed match objects to use matrix
        multiplication based gather instead of standard tf.gather.
        (Default: False).

    Raises:
      ValueError: if unmatched_threshold is set but matched_threshold is not set
        or if unmatched_threshold > matched_threshold.
    """
    super(ArgMaxMatcher, self).__init__(use_matmul_gather=use_matmul_gather)
    if (matched_threshold is None) and (unmatched_threshold is not None):
      raise ValueError('Need to also define matched_threshold when '
                       'unmatched_threshold is defined')
...
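# A minimal sketch of the matmul-based gather enabled above (illustrative
# only; the library's actual helper may differ in name and generality).
# Assumes `import tensorflow as tf` and a rank-2 `params` tensor:
def _matmul_gather_sketch(params, indices):
  # Selecting rows of `params` equals multiplying by one-hot index rows,
  # which avoids tf.gather where it is poorly supported (e.g. on TPU).
  one_hot = tf.one_hot(indices, tf.shape(params)[0], dtype=params.dtype)
  return tf.matmul(one_hot, params)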
...@@ -24,6 +24,17 @@ from object_detection.core import matcher
class GreedyBipartiteMatcher(matcher.Matcher):
  """Wraps a Tensorflow greedy bipartite matcher."""

  def __init__(self, use_matmul_gather=False):
    """Constructs a Matcher.

    Args:
      use_matmul_gather: Force constructed match objects to use matrix
        multiplication based gather instead of standard tf.gather.
        (Default: False).
    """
    super(GreedyBipartiteMatcher, self).__init__(
        use_matmul_gather=use_matmul_gather)

  def _match(self, similarity_matrix, num_valid_rows=-1):
    """Bipartite matches a collection of rows and columns. A greedy bi-partite.
...
...@@ -251,7 +251,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
               second_stage_classification_loss,
               second_stage_mask_prediction_loss_weight=1.0,
               hard_example_miner=None,
               parallel_iterations=16,
               add_summaries=True):
    """FasterRCNNMetaArch Constructor.

    Args:
...@@ -355,12 +356,17 @@ class FasterRCNNMetaArch(model.DetectionModel):
      hard_example_miner: A losses.HardExampleMiner object (can be None).
      parallel_iterations: (Optional) The number of iterations allowed to run
        in parallel for calls to tf.map_fn.
      add_summaries: boolean (default: True) controlling whether summary ops
        should be added to tensorflow graph.

    Raises:
      ValueError: If `second_stage_batch_size` > `first_stage_max_proposals` at
        training time.
      ValueError: If first_stage_anchor_generator is not of type
        grid_anchor_generator.GridAnchorGenerator.
    """
    # TODO: add_summaries is currently unused. Respect that directive
    # in the future.
    super(FasterRCNNMetaArch, self).__init__(num_classes=num_classes)
    if is_training and second_stage_batch_size > first_stage_max_proposals:
...
...@@ -75,7 +75,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
               second_stage_classification_loss_weight,
               second_stage_classification_loss,
               hard_example_miner,
               parallel_iterations=16,
               add_summaries=True):
    """RFCNMetaArch Constructor.

    Args:
...@@ -155,11 +156,16 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
      hard_example_miner: A losses.HardExampleMiner object (can be None).
      parallel_iterations: (Optional) The number of iterations allowed to run
        in parallel for calls to tf.map_fn.
      add_summaries: boolean (default: True) controlling whether summary ops
        should be added to tensorflow graph.

    Raises:
      ValueError: If `second_stage_batch_size` > `first_stage_max_proposals`
      ValueError: If first_stage_anchor_generator is not of type
        grid_anchor_generator.GridAnchorGenerator.
    """
    # TODO: add_summaries is currently unused. Respect that directive
    # in the future.
    super(RFCNMetaArch, self).__init__(
        is_training,
        num_classes,
...
...@@ -44,7 +44,8 @@ class SSDFeatureExtractor(object):
               conv_hyperparams,
               batch_norm_trainable=True,
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False):
    """Constructor.

    Args:
...@@ -61,6 +62,7 @@ class SSDFeatureExtractor(object):
      reuse_weights: whether to reuse variables. Default is None.
      use_explicit_padding: Whether to use explicit padding when extracting
        features. Default is False.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
    """
    self._is_training = is_training
    self._depth_multiplier = depth_multiplier
...@@ -70,6 +72,7 @@ class SSDFeatureExtractor(object):
    self._batch_norm_trainable = batch_norm_trainable
    self._reuse_weights = reuse_weights
    self._use_explicit_padding = use_explicit_padding
    self._use_depthwise = use_depthwise

  @abstractmethod
  def preprocess(self, resized_inputs):
...@@ -130,7 +133,7 @@ class SSDMetaArch(model.DetectionModel):
               add_summaries=True):
    """SSDMetaArch Constructor.

    TODO(rathodv,jonathanhuang): group NMS parameters + score converter into
    a class and loss parameters into a class and write config protos for
    postprocessing and losses.
...@@ -330,7 +333,8 @@ class SSDMetaArch(model.DetectionModel):
    feature_maps = self._feature_extractor.extract_features(
        preprocessed_inputs)
    feature_map_spatial_dims = self._get_feature_map_spatial_dims(feature_maps)
    image_shape = shape_utils.combined_static_and_dynamic_shape(
        preprocessed_inputs)
    self._anchors = self._anchor_generator.generate(
        feature_map_spatial_dims,
        im_height=image_shape[1],
...@@ -472,11 +476,14 @@ class SSDMetaArch(model.DetectionModel):
    keypoints = None
    if self.groundtruth_has_field(fields.BoxListFields.keypoints):
      keypoints = self.groundtruth_lists(fields.BoxListFields.keypoints)
    weights = None
    if self.groundtruth_has_field(fields.BoxListFields.weights):
      weights = self.groundtruth_lists(fields.BoxListFields.weights)
    (batch_cls_targets, batch_cls_weights, batch_reg_targets,
     batch_reg_weights, match_list) = self._assign_targets(
         self.groundtruth_lists(fields.BoxListFields.boxes),
         self.groundtruth_lists(fields.BoxListFields.classes),
         keypoints, weights)
    if self._add_summaries:
      self._summarize_input(
          self.groundtruth_lists(fields.BoxListFields.boxes), match_list)
...@@ -539,7 +546,8 @@ class SSDMetaArch(model.DetectionModel):
                        'NegativeAnchorLossCDF')

  def _assign_targets(self, groundtruth_boxes_list, groundtruth_classes_list,
                      groundtruth_keypoints_list=None,
                      groundtruth_weights_list=None):
    """Assign groundtruth targets.

    Adds a background class to each one-hot encoding of groundtruth classes
...@@ -556,6 +564,8 @@ class SSDMetaArch(model.DetectionModel):
        index assumed to map to the first non-background class.
      groundtruth_keypoints_list: (optional) a list of 3-D tensors of shape
        [num_boxes, num_keypoints, 2]
      groundtruth_weights_list: A list of 1-D tf.float32 tensors of shape
        [num_boxes] containing weights for groundtruth boxes.

    Returns:
      batch_cls_targets: a tensor with shape [batch_size, num_anchors,
...@@ -582,7 +592,7 @@ class SSDMetaArch(model.DetectionModel):
        boxlist.add_field(fields.BoxListFields.keypoints, keypoints)
    return target_assigner.batch_assign_targets(
        self._target_assigner, self.anchors, groundtruth_boxlists,
        groundtruth_classes_with_background_list, groundtruth_weights_list)

  def _summarize_input(self, groundtruth_boxes_list, match_list):
    """Creates tensorflow summaries for the input boxes and anchors.
...
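# For context on the image_shape change above, a sketch of what
# shape_utils.combined_static_and_dynamic_shape provides (assumed behavior,
# not necessarily the exact library code): static dimensions come back as
# Python ints, and unknown ones fall back to dynamic tf.shape() slices.
def _combined_shape_sketch(tensor):
  static_shape = tensor.shape.as_list()
  dynamic_shape = tf.shape(tensor)
  return [dynamic_shape[i] if dim is None else dim
          for i, dim in enumerate(static_shape)]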
...@@ -24,13 +24,17 @@ from object_detection.utils import object_detection_evaluation
class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
  """Class to evaluate COCO detection metrics."""

  def __init__(self,
               categories,
               include_metrics_per_category=False,
               all_metrics_per_category=False):
    """Constructor.

    Args:
      categories: A list of dicts, each of which has the following keys -
        'id': (required) an integer id uniquely identifying this category.
        'name': (required) string representing category name e.g., 'cat', 'dog'.
      include_metrics_per_category: If True, include metrics for each category.
      all_metrics_per_category: Whether to include all the summary metrics for
        each category in per_category_ap. Be careful with setting it to true if
        you have more than a handful of categories, because it will pollute
...@@ -45,6 +49,7 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
    self._category_id_set = set([cat['id'] for cat in self._categories])
    self._annotation_id = 1
    self._metrics = None
    self._include_metrics_per_category = include_metrics_per_category
    self._all_metrics_per_category = all_metrics_per_category

  def clear(self):
...@@ -166,7 +171,8 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
      'DetectionBoxes_Recall/AR@100 (large)': average recall for large objects
        with 100 detections.

      2. per_category_ap: if include_metrics_per_category is True, category
      specific results with keys of the form:
      'Precision mAP ByCategory/category' (without the supercategory part if
      no supercategories exist). For backward compatibility
      'PerformanceByCategory' is included in the output regardless of
...@@ -183,6 +189,7 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
    box_evaluator = coco_tools.COCOEvalWrapper(
        coco_wrapped_groundtruth, coco_wrapped_detections, agnostic_mode=False)
    box_metrics, box_per_category_ap = box_evaluator.ComputeMetrics(
        include_metrics_per_category=self._include_metrics_per_category,
        all_metrics_per_category=self._all_metrics_per_category)
    box_metrics.update(box_per_category_ap)
    box_metrics = {'DetectionBoxes_' + key: value
...@@ -253,9 +260,10 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
        'DetectionBoxes_Recall/AR@100 (large)',
        'DetectionBoxes_Recall/AR@100 (medium)',
        'DetectionBoxes_Recall/AR@100 (small)']
    if self._include_metrics_per_category:
      for category_dict in self._categories:
        metric_names.append('DetectionBoxes_PerformanceByCategory/mAP/' +
                            category_dict['name'])

    def first_value_func():
      self._metrics = self.evaluate()
...@@ -289,13 +297,14 @@ def _check_mask_type_and_value(array_name, masks):
class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
  """Class to evaluate COCO mask metrics."""

  def __init__(self, categories, include_metrics_per_category=False):
    """Constructor.

    Args:
      categories: A list of dicts, each of which has the following keys -
        'id': (required) an integer id uniquely identifying this category.
        'name': (required) string representing category name e.g., 'cat', 'dog'.
      include_metrics_per_category: If True, include metrics for each category.
    """
    super(CocoMaskEvaluator, self).__init__(categories)
    self._image_id_to_mask_shape_map = {}
...@@ -304,6 +313,7 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
    self._detection_masks_list = []
    self._category_id_set = set([cat['id'] for cat in self._categories])
    self._annotation_id = 1
    self._include_metrics_per_category = include_metrics_per_category

  def clear(self):
    """Clears the state to prepare for a fresh evaluation."""
...@@ -438,7 +448,8 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
      'Recall/AR@100 (large)': average recall for large objects with 100
        detections

      2. per_category_ap: if include_metrics_per_category is True, category
      specific results with keys of the form:
      'Precision mAP ByCategory/category' (without the supercategory part if
      no supercategories exist). For backward compatibility
      'PerformanceByCategory' is included in the output regardless of
...@@ -458,7 +469,8 @@ class CocoMaskEvaluator(object_detection_evaluation.DetectionEvaluator):
    mask_evaluator = coco_tools.COCOEvalWrapper(
        coco_wrapped_groundtruth, coco_wrapped_detection_masks,
        agnostic_mode=False, iou_type='segm')
    mask_metrics, mask_per_category_ap = mask_evaluator.ComputeMetrics(
        include_metrics_per_category=self._include_metrics_per_category)
    mask_metrics.update(mask_per_category_ap)
    mask_metrics = {'DetectionMasks_' + key: value
                    for key, value in mask_metrics.iteritems()}
...
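# Hypothetical usage of the include_metrics_per_category flag wired in above
# (assumes `numpy as np` and `standard_fields` are imported; the add_single_*
# calls mirror the evaluator API exercised elsewhere in this commit):
def _per_category_map_sketch(categories):
  evaluator = CocoDetectionEvaluator(
      categories, include_metrics_per_category=True)
  evaluator.add_single_ground_truth_image_info(
      image_id='image1',
      groundtruth_dict={
          standard_fields.InputDataFields.groundtruth_boxes:
              np.array([[100., 100., 200., 200.]]),
          standard_fields.InputDataFields.groundtruth_classes: np.array([1])
      })
  evaluator.add_single_detected_image_info(
      image_id='image1',
      detections_dict={
          standard_fields.DetectionResultFields.detection_boxes:
              np.array([[100., 100., 200., 200.]]),
          standard_fields.DetectionResultFields.detection_scores:
              np.array([.8]),
          standard_fields.DetectionResultFields.detection_classes:
              np.array([1])
      })
  # The result now also contains keys like
  # 'DetectionBoxes_PerformanceByCategory/mAP/<category name>'.
  return evaluator.evaluate()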
...@@ -12,13 +12,12 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for tensorflow_models.object_detection.metrics.coco_evaluation."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf

from object_detection.core import standard_fields
...@@ -87,43 +86,6 @@ class CocoDetectionEvaluationTest(tf.test.TestCase):
    metrics = coco_evaluator.evaluate()
    self.assertAlmostEqual(metrics['DetectionBoxes_Precision/mAP'], 1.0)

  def testRejectionOnDuplicateGroundtruth(self):
    """Tests that groundtruth cannot be added more than once for an image."""
    categories = [{'id': 1, 'name': 'cat'},
...@@ -279,12 +241,6 @@ class CocoEvaluationPyFuncTest(tf.test.TestCase):
    self.assertAlmostEqual(metrics['DetectionBoxes_Recall/AR@100 (medium)'],
                           -1.0)
    self.assertAlmostEqual(metrics['DetectionBoxes_Recall/AR@100 (small)'], 1.0)
    self.assertFalse(coco_evaluator._groundtruth_list)
    self.assertFalse(coco_evaluator._detection_boxes_list)
    self.assertFalse(coco_evaluator._image_ids)
...
...@@ -189,14 +189,18 @@ class COCOEvalWrapper(cocoeval.COCOeval):
    """Returns list of valid category ids."""
    return self.params.catIds

  def ComputeMetrics(self,
                     include_metrics_per_category=False,
                     all_metrics_per_category=False):
    """Computes detection metrics.

    Args:
      include_metrics_per_category: If True, will include metrics per category.
      all_metrics_per_category: If true, include all the summary metrics for
        each category in per_category_ap. Be careful with setting it to true if
        you have more than a handful of categories, because it will pollute
        your mldash.

    Returns:
      1. summary_metrics: a dictionary holding:
        'Precision/mAP': mean average precision over classes averaged over IOU
...@@ -225,6 +229,9 @@ class COCOEvalWrapper(cocoeval.COCOeval):
        output regardless of all_metrics_per_category.
        If evaluating class-agnostic mode, per_category_ap is an empty
        dictionary.

    Raises:
      ValueError: If category_stats does not exist.
    """
    self.evaluate()
    self.accumulate()
...@@ -244,6 +251,10 @@ class COCOEvalWrapper(cocoeval.COCOeval):
        ('Recall/AR@100 (medium)', self.stats[10]),
        ('Recall/AR@100 (large)', self.stats[11])
    ])
    if not include_metrics_per_category:
      return summary_metrics, {}
    if not hasattr(self, 'category_stats'):
      raise ValueError('Category stats do not exist')
    per_category_ap = OrderedDict([])
    if self.GetAgnosticMode():
      return summary_metrics, per_category_ap
...
...@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for tensorflow_models.object_detection.metrics.coco_tools."""
import json
import os
import re
...
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Creates and runs `Experiment` for object detection model.
This uses the TF.learn framework to define and run an object detection model
wrapped in an `Estimator`.
Note that this module is only compatible with SSD Meta architecture at the
moment.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import functools
import os
import tensorflow as tf
from google.protobuf import text_format
from tensorflow.contrib.learn.python.learn import learn_runner
from tensorflow.contrib.tpu.python.tpu import tpu_optimizer
from object_detection import eval_util
from object_detection import inputs
from object_detection import model_hparams
from object_detection.builders import model_builder
from object_detection.builders import optimizer_builder
from object_detection.core import standard_fields as fields
from object_detection.metrics import coco_evaluation
from object_detection.utils import config_util
from object_detection.utils import label_map_util
from object_detection.utils import shape_utils
from object_detection.utils import variables_helper
from object_detection.utils import visualization_utils as vis_utils
tf.flags.DEFINE_string('model_dir', None, 'Path to output model directory '
'where event and checkpoint files will be written.')
tf.flags.DEFINE_string('pipeline_config_path', None, 'Path to pipeline config '
'file.')
tf.flags.DEFINE_integer('num_train_steps', 500000, 'Number of train steps.')
tf.flags.DEFINE_integer('num_eval_steps', 10000, 'Number of eval steps.')
FLAGS = tf.flags.FLAGS
def _get_groundtruth_data(detection_model, class_agnostic):
"""Extracts groundtruth data from detection_model.
Args:
detection_model: A `DetectionModel` object.
class_agnostic: Whether the detections are class_agnostic.
  Returns:
    groundtruth: Dictionary with the following fields:
      'groundtruth_boxes': [num_boxes, 4] float32 tensor of boxes, in
        normalized coordinates.
      'groundtruth_classes': [num_boxes] int64 tensor of 1-indexed classes.
      'groundtruth_masks': 3D float32 tensor of instance masks (if provided in
        groundtruth)
"""
input_data_fields = fields.InputDataFields()
groundtruth_boxes = detection_model.groundtruth_lists(
fields.BoxListFields.boxes)[0]
# For class-agnostic models, groundtruth one-hot encodings collapse to all
# ones.
if class_agnostic:
groundtruth_boxes_shape = tf.shape(groundtruth_boxes)
groundtruth_classes_one_hot = tf.ones([groundtruth_boxes_shape[0], 1])
else:
groundtruth_classes_one_hot = detection_model.groundtruth_lists(
fields.BoxListFields.classes)[0]
label_id_offset = 1 # Applying label id offset (b/63711816)
groundtruth_classes = (
tf.argmax(groundtruth_classes_one_hot, axis=1) + label_id_offset)
groundtruth = {
input_data_fields.groundtruth_boxes: groundtruth_boxes,
input_data_fields.groundtruth_classes: groundtruth_classes
}
if detection_model.groundtruth_has_field(fields.BoxListFields.masks):
groundtruth[input_data_fields.groundtruth_instance_masks] = (
detection_model.groundtruth_lists(fields.BoxListFields.masks)[0])
return groundtruth
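# For intuition on the argmax + label offset above: a one-hot row such as
# [0., 1., 0.] maps to the 1-indexed class id 2. A standalone sketch:
def _one_hot_to_class_ids_sketch(one_hot_rows):
  label_id_offset = 1  # Class ids are 1-indexed; one-hot indices are 0-based.
  return tf.argmax(one_hot_rows, axis=1) + label_id_offset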
def unstack_batch(tensor_dict, unpad_groundtruth_tensors=True):
"""Unstacks all tensors in `tensor_dict` along 0th dimension.
Unstacks tensor from the tensor dict along 0th dimension and returns a
tensor_dict containing values that are lists of unstacked tensors.
Tensors in the `tensor_dict` are expected to be of one of the three shapes:
1. [batch_size]
2. [batch_size, height, width, channels]
3. [batch_size, num_boxes, d1, d2, ... dn]
  When unpad_groundtruth_tensors is set to True, unstacked tensors of form 3
  above are sliced along the `num_boxes` dimension using the value in tensor
  fields.InputDataFields.num_groundtruth_boxes.
Note that this function has a static list of input data fields and has to be
kept in sync with the InputDataFields defined in core/standard_fields.py
Args:
tensor_dict: A dictionary of batched groundtruth tensors.
unpad_groundtruth_tensors: Whether to remove padding along `num_boxes`
dimension of the groundtruth tensors.
Returns:
A dictionary where the keys are from fields.InputDataFields and values are
a list of unstacked (optionally unpadded) tensors.
Raises:
    ValueError: If unpad_groundtruth_tensors is True and `tensor_dict` does not contain
`num_groundtruth_boxes` tensor.
"""
unbatched_tensor_dict = {key: tf.unstack(tensor)
for key, tensor in tensor_dict.items()}
if unpad_groundtruth_tensors:
if (fields.InputDataFields.num_groundtruth_boxes not in
unbatched_tensor_dict):
raise ValueError('`num_groundtruth_boxes` not found in tensor_dict. '
'Keys available: {}'.format(
unbatched_tensor_dict.keys()))
unbatched_unpadded_tensor_dict = {}
unpad_keys = set([
# List of input data fields that are padded along the num_boxes
# dimension. This list has to be kept in sync with InputDataFields in
# standard_fields.py.
fields.InputDataFields.groundtruth_instance_masks,
fields.InputDataFields.groundtruth_classes,
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_keypoints,
fields.InputDataFields.groundtruth_group_of,
fields.InputDataFields.groundtruth_difficult,
fields.InputDataFields.groundtruth_is_crowd,
fields.InputDataFields.groundtruth_area,
fields.InputDataFields.groundtruth_weights
]).intersection(set(unbatched_tensor_dict.keys()))
for key in unpad_keys:
unpadded_tensor_list = []
for num_gt, padded_tensor in zip(
unbatched_tensor_dict[fields.InputDataFields.num_groundtruth_boxes],
unbatched_tensor_dict[key]):
tensor_shape = shape_utils.combined_static_and_dynamic_shape(
padded_tensor)
slice_begin = tf.zeros([len(tensor_shape)], dtype=tf.int32)
slice_size = tf.stack(
[num_gt] + [-1 if dim is None else dim for dim in tensor_shape[1:]])
unpadded_tensor = tf.slice(padded_tensor, slice_begin, slice_size)
unpadded_tensor_list.append(unpadded_tensor)
unbatched_unpadded_tensor_dict[key] = unpadded_tensor_list
unbatched_tensor_dict.update(unbatched_unpadded_tensor_dict)
return unbatched_tensor_dict
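# A hypothetical call illustrating unstack_batch on a padded two-image batch
# (tensor values invented for this example):
def _unstack_batch_example():
  tensor_dict = {
      fields.InputDataFields.num_groundtruth_boxes: tf.constant([1, 2]),
      fields.InputDataFields.groundtruth_boxes: tf.zeros([2, 3, 4]),
  }
  unbatched = unstack_batch(tensor_dict)
  # unbatched[fields.InputDataFields.groundtruth_boxes] is a list of two
  # tensors with shapes [1, 4] and [2, 4]: per-image padding removed using
  # each image's num_groundtruth_boxes value.
  return unbatched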
def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
"""Creates a model function for `Estimator`.
Args:
detection_model_fn: Function that returns a `DetectionModel` instance.
configs: Dictionary of pipeline config objects.
hparams: `HParams` object.
use_tpu: Boolean indicating whether model should be constructed for
use on TPU.
Returns:
`model_fn` for `Estimator`.
"""
train_config = configs['train_config']
eval_input_config = configs['eval_input_config']
def model_fn(features, labels, mode, params=None):
"""Constructs the object detection model.
Args:
features: Dictionary of feature tensors, returned from `input_fn`.
labels: Dictionary of groundtruth tensors if mode is TRAIN or EVAL,
otherwise None.
mode: Mode key from tf.estimator.ModeKeys.
params: Parameter dictionary passed from the estimator.
Returns:
An `EstimatorSpec` that encapsulates the model and its serving
configurations.
"""
params = params or {}
total_loss, train_op, detections, export_outputs = None, None, None, None
is_training = mode == tf.estimator.ModeKeys.TRAIN
detection_model = detection_model_fn(is_training=is_training,
add_summaries=(not use_tpu))
scaffold_fn = None
if mode == tf.estimator.ModeKeys.TRAIN:
labels = unstack_batch(
labels,
unpad_groundtruth_tensors=train_config.unpad_groundtruth_tensors)
elif mode == tf.estimator.ModeKeys.EVAL:
labels = unstack_batch(labels, unpad_groundtruth_tensors=False)
if mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL):
gt_boxes_list = labels[fields.InputDataFields.groundtruth_boxes]
gt_classes_list = labels[fields.InputDataFields.groundtruth_classes]
gt_masks_list = None
if fields.InputDataFields.groundtruth_instance_masks in labels:
gt_masks_list = labels[
fields.InputDataFields.groundtruth_instance_masks]
gt_keypoints_list = None
if fields.InputDataFields.groundtruth_keypoints in labels:
gt_keypoints_list = labels[fields.InputDataFields.groundtruth_keypoints]
detection_model.provide_groundtruth(
groundtruth_boxes_list=gt_boxes_list,
groundtruth_classes_list=gt_classes_list,
groundtruth_masks_list=gt_masks_list,
groundtruth_keypoints_list=gt_keypoints_list)
preprocessed_images = features[fields.InputDataFields.image]
prediction_dict = detection_model.predict(
preprocessed_images, features[fields.InputDataFields.true_image_shape])
detections = detection_model.postprocess(
prediction_dict, features[fields.InputDataFields.true_image_shape])
if mode == tf.estimator.ModeKeys.TRAIN:
if train_config.fine_tune_checkpoint and hparams.load_pretrained:
asg_map = detection_model.restore_map(
from_detection_checkpoint=train_config.from_detection_checkpoint,
load_all_detection_checkpoint_vars=(
train_config.load_all_detection_checkpoint_vars))
available_var_map = (
variables_helper.get_variables_available_in_checkpoint(
asg_map, train_config.fine_tune_checkpoint,
include_global_step=False))
if use_tpu:
def tpu_scaffold():
tf.train.init_from_checkpoint(train_config.fine_tune_checkpoint,
available_var_map)
return tf.train.Scaffold()
scaffold_fn = tpu_scaffold
else:
tf.train.init_from_checkpoint(train_config.fine_tune_checkpoint,
available_var_map)
if mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL):
losses_dict = detection_model.loss(
prediction_dict, features[fields.InputDataFields.true_image_shape])
losses = [loss_tensor for loss_tensor in losses_dict.itervalues()]
total_loss = tf.add_n(losses, name='total_loss')
if mode == tf.estimator.ModeKeys.TRAIN:
global_step = tf.train.get_or_create_global_step()
training_optimizer, optimizer_summary_vars = optimizer_builder.build(
train_config.optimizer)
if use_tpu:
training_optimizer = tpu_optimizer.CrossShardOptimizer(
training_optimizer)
# Optionally freeze some layers by setting their gradients to be zero.
trainable_variables = None
if train_config.freeze_variables:
trainable_variables = tf.contrib.framework.filter_variables(
tf.trainable_variables(),
exclude_patterns=train_config.freeze_variables)
clip_gradients_value = None
if train_config.gradient_clipping_by_norm > 0:
clip_gradients_value = train_config.gradient_clipping_by_norm
if not use_tpu:
for var in optimizer_summary_vars:
tf.summary.scalar(var.op.name, var)
summaries = [] if use_tpu else None
train_op = tf.contrib.layers.optimize_loss(
loss=total_loss,
global_step=global_step,
learning_rate=None,
clip_gradients=clip_gradients_value,
optimizer=training_optimizer,
variables=trainable_variables,
summaries=summaries,
name='') # Preventing scope prefix on all variables.
if mode == tf.estimator.ModeKeys.PREDICT:
export_outputs = {
tf.saved_model.signature_constants.PREDICT_METHOD_NAME:
tf.estimator.export.PredictOutput(detections)
}
eval_metric_ops = None
if mode == tf.estimator.ModeKeys.EVAL:
# Detection summaries during eval.
class_agnostic = (fields.DetectionResultFields.detection_classes
not in detections)
groundtruth = _get_groundtruth_data(detection_model, class_agnostic)
eval_dict = eval_util.result_dict_for_single_example(
tf.expand_dims(features[fields.InputDataFields.original_image][0], 0),
features[inputs.HASH_KEY][0],
detections,
groundtruth,
class_agnostic=class_agnostic,
scale_to_absolute=False)
if class_agnostic:
category_index = label_map_util.create_class_agnostic_category_index()
else:
category_index = label_map_util.create_category_index_from_labelmap(
eval_input_config.label_map_path)
detection_and_groundtruth = vis_utils.draw_side_by_side_evaluation_image(
eval_dict, category_index, max_boxes_to_draw=20, min_score_thresh=0.2)
if not use_tpu:
tf.summary.image('Detections_Left_Groundtruth_Right',
detection_and_groundtruth)
# Eval metrics on a single image.
detection_fields = fields.DetectionResultFields()
input_data_fields = fields.InputDataFields()
coco_evaluator = coco_evaluation.CocoDetectionEvaluator(
list(category_index.values()))
eval_metric_ops = coco_evaluator.get_estimator_eval_metric_ops(
image_id=eval_dict[input_data_fields.key],
groundtruth_boxes=eval_dict[input_data_fields.groundtruth_boxes],
groundtruth_classes=eval_dict[input_data_fields.groundtruth_classes],
detection_boxes=eval_dict[detection_fields.detection_boxes],
detection_scores=eval_dict[detection_fields.detection_scores],
detection_classes=eval_dict[detection_fields.detection_classes])
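# Note: the two spec classes name their metrics argument differently:
# `TPUEstimatorSpec` takes `eval_metrics` (as a `(metric_fn, tensors)` pair),
# while `EstimatorSpec` takes an `eval_metric_ops` dictionary.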
if use_tpu:
return tf.contrib.tpu.TPUEstimatorSpec(
mode=mode,
scaffold_fn=scaffold_fn,
predictions=detections,
loss=total_loss,
train_op=train_op,
eval_metrics=eval_metric_ops,
export_outputs=export_outputs)
else:
return tf.estimator.EstimatorSpec(
mode=mode,
predictions=detections,
loss=total_loss,
train_op=train_op,
eval_metric_ops=eval_metric_ops,
export_outputs=export_outputs)
return model_fn
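# Illustrative sketch (the helper and its argument names are placeholders,
# not part of the original API): the `model_fn` built above is meant to be
# handed straight to an Estimator, exactly as `populate_experiment` does
# below.
def _example_estimator_from_model_fn(detection_model_fn, configs, hparams,
                                     run_config):
  """Wires the `model_fn` from `create_model_fn` into an Estimator."""
  model_fn = create_model_fn(detection_model_fn, configs, hparams)
  return tf.estimator.Estimator(model_fn=model_fn, config=run_config)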
def _build_experiment_fn(train_steps, eval_steps):
"""Returns a function that creates an `Experiment`."""
def build_experiment(run_config, hparams):
"""Builds an `Experiment` from configuration and hyperparameters.
Args:
run_config: A `RunConfig`.
hparams: A `HParams`.
Returns:
An `Experiment` object.
"""
return populate_experiment(run_config, hparams, FLAGS.pipeline_config_path,
train_steps, eval_steps)
return build_experiment
def populate_experiment(run_config,
hparams,
pipeline_config_path,
train_steps=None,
eval_steps=None,
model_fn_creator=create_model_fn,
**kwargs):
"""Populates an `Experiment` object.
Args:
run_config: A `RunConfig`.
hparams: A `HParams`.
pipeline_config_path: A path to a pipeline config file.
train_steps: Number of training steps. If None, the number of training steps
is set from the `TrainConfig` proto.
eval_steps: Number of evaluation steps per evaluation cycle. If None, the
number of evaluation steps is set from the `EvalConfig` proto.
model_fn_creator: A function that creates a `model_fn` for `Estimator`.
Follows the signature:
* Args:
* `detection_model_fn`: Function that returns `DetectionModel` instance.
* `configs`: Dictionary of pipeline config objects.
* `hparams`: `HParams` object.
* Returns:
`model_fn` for `Estimator`.
**kwargs: Additional keyword arguments for configuration override.
Returns:
An `Experiment` that defines all aspects of training, evaluation, and
export.
"""
configs = config_util.get_configs_from_pipeline_file(pipeline_config_path)
configs = config_util.merge_external_params_with_configs(
configs,
hparams,
train_steps=train_steps,
eval_steps=eval_steps,
**kwargs)
model_config = configs['model']
train_config = configs['train_config']
train_input_config = configs['train_input_config']
eval_config = configs['eval_config']
eval_input_config = configs['eval_input_config']
if train_steps is None:
train_steps = train_config.num_steps if train_config.num_steps else None
if eval_steps is None:
eval_steps = eval_config.num_examples if eval_config.num_examples else None
detection_model_fn = functools.partial(
model_builder.build, model_config=model_config)
# Create the input functions for TRAIN/EVAL.
train_input_fn = inputs.create_train_input_fn(
train_config=train_config,
train_input_config=train_input_config,
model_config=model_config)
eval_input_fn = inputs.create_eval_input_fn(
eval_config=eval_config,
eval_input_config=eval_input_config,
model_config=model_config)
export_strategies = [
tf.contrib.learn.utils.saved_model_export_utils.make_export_strategy(
serving_input_fn=inputs.create_predict_input_fn(
model_config=model_config))
]
estimator = tf.estimator.Estimator(
model_fn=model_fn_creator(detection_model_fn, configs, hparams),
config=run_config)
if run_config.is_chief:
# Store the final pipeline config for traceability.
pipeline_config_final = config_util.create_pipeline_proto_from_configs(
configs)
pipeline_config_final_path = os.path.join(estimator.model_dir,
'pipeline.config')
config_text = text_format.MessageToString(pipeline_config_final)
with tf.gfile.Open(pipeline_config_final_path, 'wb') as f:
tf.logging.info('Writing as-run pipeline config file to %s',
pipeline_config_final_path)
f.write(config_text)
return tf.contrib.learn.Experiment(
estimator=estimator,
train_input_fn=train_input_fn,
eval_input_fn=eval_input_fn,
train_steps=train_steps,
eval_steps=eval_steps,
export_strategies=export_strategies,
eval_delay_secs=120)
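# Illustrative usage (a sketch only; the paths and step counts below are
# placeholders, not values from this change):
#
#   experiment = populate_experiment(
#       run_config=tf.contrib.learn.RunConfig(model_dir='/tmp/od_model'),
#       hparams=model_hparams.create_hparams(),
#       pipeline_config_path='path/to/pipeline.config',
#       train_steps=10000,
#       eval_steps=2000)
#   experiment.train_and_evaluate()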
def main(unused_argv):
tf.flags.mark_flag_as_required('model_dir')
tf.flags.mark_flag_as_required('pipeline_config_path')
config = tf.contrib.learn.RunConfig(model_dir=FLAGS.model_dir)
learn_runner.run(
experiment_fn=_build_experiment_fn(FLAGS.num_train_steps,
FLAGS.num_eval_steps),
run_config=config,
hparams=model_hparams.create_hparams())
if __name__ == '__main__':
tf.app.run()
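# Example invocation (paths and step counts are placeholders; the flag names
# match the ones marked as required in `main` above):
#
#   python model.py --model_dir=/tmp/od_model \
#     --pipeline_config_path=path/to/pipeline.config \
#     --num_train_steps=10000 --num_eval_steps=2000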
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Hyperparameters for the object detection model in TF.learn.
This file consolidates and documents the hyperparameters used by the model.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
def create_hparams(hparams_overrides=None):
"""Returns hyperparameters, including any flag value overrides.
Args:
hparams_overrides: Optional hparams overrides, represented as a
string containing comma-separated hparam_name=value pairs.
Returns:
The hyperparameters as a `tf.contrib.training.HParams` object.
"""
hparams = tf.contrib.training.HParams(
# Whether a fine tuning checkpoint (provided in the pipeline config)
# should be loaded for training.
load_pretrained=True)
# Override any of the preceding hyperparameter values.
if hparams_overrides:
hparams = hparams.parse(hparams_overrides)
return hparams
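# Example (a minimal sketch; the helper name below is illustrative, not part
# of the original file): hyperparameters can be overridden from a
# comma-separated string, mirroring the usage in the tests in this change.
def _example_disable_pretrained():
  """Returns hparams with pretrained-checkpoint loading turned off."""
  return create_hparams(hparams_overrides='load_pretrained=false')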
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for object detection model."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import functools
import os
import numpy as np
import tensorflow as tf
from object_detection import inputs
from object_detection import model
from object_detection import model_hparams
from object_detection import model_test_util
from object_detection.builders import model_builder
from object_detection.core import standard_fields as fields
from object_detection.utils import config_util
FLAGS = tf.flags.FLAGS
MODEL_NAME_FOR_TEST = model_test_util.SSD_INCEPTION_MODEL_NAME
def _get_data_path():
"""Returns an absolute path to TFRecord file."""
return os.path.join(FLAGS.test_srcdir, model_test_util.PATH_BASE, 'test_data',
'pets_examples.record')
def _get_labelmap_path():
"""Returns an absolute path to label map file."""
return os.path.join(FLAGS.test_srcdir, model_test_util.PATH_BASE, 'data',
'pet_label_map.pbtxt')
def _get_configs_for_model(model_name):
"""Returns configurations for model."""
filename = model_test_util.GetPipelineConfigPath(model_name)
data_path = _get_data_path()
label_map_path = _get_labelmap_path()
configs = config_util.get_configs_from_pipeline_file(filename)
configs = config_util.merge_external_params_with_configs(
configs,
train_input_path=data_path,
eval_input_path=data_path,
label_map_path=label_map_path)
return configs
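# The returned dictionary uses the same keys consumed by the model code above:
# 'model', 'train_config', 'train_input_config', 'eval_config' and
# 'eval_input_config'.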
def setUpModule():
model_test_util.InitializeFlags(MODEL_NAME_FOR_TEST)
class ModelTflearnTest(tf.test.TestCase):
@classmethod
def setUpClass(cls):
tf.reset_default_graph()
def _assert_outputs_for_train_eval(self, configs, mode, class_agnostic=False):
model_config = configs['model']
train_config = configs['train_config']
with tf.Graph().as_default():
if mode == tf.estimator.ModeKeys.TRAIN:
features, labels = inputs.create_train_input_fn(
configs['train_config'],
configs['train_input_config'],
configs['model'])()
batch_size = train_config.batch_size
else:
features, labels = inputs.create_eval_input_fn(
configs['eval_config'],
configs['eval_input_config'],
configs['model'])()
batch_size = 1
detection_model_fn = functools.partial(
model_builder.build, model_config=model_config, is_training=True)
hparams = model_hparams.create_hparams(
hparams_overrides='load_pretrained=false')
model_fn = model.create_model_fn(detection_model_fn, configs, hparams)
estimator_spec = model_fn(features, labels, mode)
self.assertIsNotNone(estimator_spec.loss)
self.assertIsNotNone(estimator_spec.predictions)
if class_agnostic:
self.assertNotIn('detection_classes', estimator_spec.predictions)
else:
detection_classes = estimator_spec.predictions['detection_classes']
self.assertEqual(batch_size, detection_classes.shape.as_list()[0])
self.assertEqual(tf.float32, detection_classes.dtype)
detection_boxes = estimator_spec.predictions['detection_boxes']
detection_scores = estimator_spec.predictions['detection_scores']
num_detections = estimator_spec.predictions['num_detections']
self.assertEqual(batch_size, detection_boxes.shape.as_list()[0])
self.assertEqual(tf.float32, detection_boxes.dtype)
self.assertEqual(batch_size, detection_scores.shape.as_list()[0])
self.assertEqual(tf.float32, detection_scores.dtype)
self.assertEqual(tf.float32, num_detections.dtype)
if mode == tf.estimator.ModeKeys.TRAIN:
self.assertIsNotNone(estimator_spec.train_op)
return estimator_spec
def _assert_outputs_for_predict(self, configs):
model_config = configs['model']
with tf.Graph().as_default():
features, _ = inputs.create_eval_input_fn(
configs['eval_config'],
configs['eval_input_config'],
configs['model'])()
detection_model_fn = functools.partial(
model_builder.build, model_config=model_config, is_training=False)
hparams = model_hparams.create_hparams(
hparams_overrides='load_pretrained=false')
model_fn = model.create_model_fn(detection_model_fn, configs, hparams)
estimator_spec = model_fn(features, None, tf.estimator.ModeKeys.PREDICT)
self.assertIsNone(estimator_spec.loss)
self.assertIsNone(estimator_spec.train_op)
self.assertIsNotNone(estimator_spec.predictions)
self.assertIsNotNone(estimator_spec.export_outputs)
self.assertIn(tf.saved_model.signature_constants.PREDICT_METHOD_NAME,
estimator_spec.export_outputs)
def testModelFnInTrainMode(self):
"""Tests the model function in TRAIN mode."""
configs = _get_configs_for_model(MODEL_NAME_FOR_TEST)
self._assert_outputs_for_train_eval(configs, tf.estimator.ModeKeys.TRAIN)
def testModelFnInEvalMode(self):
"""Tests the model function in EVAL mode."""
configs = _get_configs_for_model(MODEL_NAME_FOR_TEST)
self._assert_outputs_for_train_eval(configs, tf.estimator.ModeKeys.EVAL)
def testModelFnInPredictMode(self):
"""Tests the model function in PREDICT mode."""
configs = _get_configs_for_model(MODEL_NAME_FOR_TEST)
self._assert_outputs_for_predict(configs)
def testExperiment(self):
"""Tests that the `Experiment` object is constructed correctly."""
experiment = model_test_util.BuildExperiment()
model_dir = experiment.estimator.model_dir
pipeline_config_path = os.path.join(model_dir, 'pipeline.config')
self.assertTrue(tf.gfile.Exists(pipeline_config_path))
class UnbatchTensorsTest(tf.test.TestCase):
def test_unbatch_without_unpadding(self):
image_placeholder = tf.placeholder(tf.float32, [2, None, None, None])
groundtruth_boxes_placeholder = tf.placeholder(tf.float32, [2, None, None])
groundtruth_classes_placeholder = tf.placeholder(tf.float32,
[2, None, None])
groundtruth_weights_placeholder = tf.placeholder(tf.float32, [2, None])
tensor_dict = {
fields.InputDataFields.image:
image_placeholder,
fields.InputDataFields.groundtruth_boxes:
groundtruth_boxes_placeholder,
fields.InputDataFields.groundtruth_classes:
groundtruth_classes_placeholder,
fields.InputDataFields.groundtruth_weights:
groundtruth_weights_placeholder
}
unbatched_tensor_dict = model.unstack_batch(
tensor_dict, unpad_groundtruth_tensors=False)
with self.test_session() as sess:
unbatched_tensor_dict_out = sess.run(
unbatched_tensor_dict,
feed_dict={
image_placeholder:
np.random.rand(2, 4, 4, 3).astype(np.float32),
groundtruth_boxes_placeholder:
np.random.rand(2, 5, 4).astype(np.float32),
groundtruth_classes_placeholder:
np.random.rand(2, 5, 6).astype(np.float32),
groundtruth_weights_placeholder:
np.random.rand(2, 5).astype(np.float32)
})
for image_out in unbatched_tensor_dict_out[fields.InputDataFields.image]:
self.assertAllEqual(image_out.shape, [4, 4, 3])
for groundtruth_boxes_out in unbatched_tensor_dict_out[
fields.InputDataFields.groundtruth_boxes]:
self.assertAllEqual(groundtruth_boxes_out.shape, [5, 4])
for groundtruth_classes_out in unbatched_tensor_dict_out[
fields.InputDataFields.groundtruth_classes]:
self.assertAllEqual(groundtruth_classes_out.shape, [5, 6])
for groundtruth_weights_out in unbatched_tensor_dict_out[
fields.InputDataFields.groundtruth_weights]:
self.assertAllEqual(groundtruth_weights_out.shape, [5])
def test_unbatch_and_unpad_groundtruth_tensors(self):
image_placeholder = tf.placeholder(tf.float32, [2, None, None, None])
groundtruth_boxes_placeholder = tf.placeholder(tf.float32, [2, 5, None])
groundtruth_classes_placeholder = tf.placeholder(tf.float32, [2, 5, None])
groundtruth_weights_placeholder = tf.placeholder(tf.float32, [2, 5])
num_groundtruth_placeholder = tf.placeholder(tf.int32, [2])
tensor_dict = {
fields.InputDataFields.image:
image_placeholder,
fields.InputDataFields.groundtruth_boxes:
groundtruth_boxes_placeholder,
fields.InputDataFields.groundtruth_classes:
groundtruth_classes_placeholder,
fields.InputDataFields.groundtruth_weights:
groundtruth_weights_placeholder,
fields.InputDataFields.num_groundtruth_boxes:
num_groundtruth_placeholder
}
unbatched_tensor_dict = model.unstack_batch(
tensor_dict, unpad_groundtruth_tensors=True)
with self.test_session() as sess:
unbatched_tensor_dict_out = sess.run(
unbatched_tensor_dict,
feed_dict={
image_placeholder:
np.random.rand(2, 4, 4, 3).astype(np.float32),
groundtruth_boxes_placeholder:
np.random.rand(2, 5, 4).astype(np.float32),
groundtruth_classes_placeholder:
np.random.rand(2, 5, 6).astype(np.float32),
groundtruth_weights_placeholder:
np.random.rand(2, 5).astype(np.float32),
num_groundtruth_placeholder:
np.array([3, 3], np.int32)
})
for image_out in unbatched_tensor_dict_out[fields.InputDataFields.image]:
self.assertAllEqual(image_out.shape, [4, 4, 3])
for groundtruth_boxes_out in unbatched_tensor_dict_out[
fields.InputDataFields.groundtruth_boxes]:
self.assertAllEqual(groundtruth_boxes_out.shape, [3, 4])
for groundtruth_classes_out in unbatched_tensor_dict_out[
fields.InputDataFields.groundtruth_classes]:
self.assertAllEqual(groundtruth_classes_out.shape, [3, 6])
for groundtruth_weights_out in unbatched_tensor_dict_out[
fields.InputDataFields.groundtruth_weights]:
self.assertAllEqual(groundtruth_weights_out.shape, [3])
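# Summary of the behavior exercised above: with num_groundtruth_boxes set to
# [3, 3], `unstack_batch` first slices each padded groundtruth tensor down to
# its valid rows, so a [2, 5, 4] boxes tensor becomes a list of two [3, 4]
# tensors. Without unpadding (the previous test), the padded [5, ...] shapes
# are preserved per example.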
if __name__ == '__main__':
tf.test.main()