Unverified commit 32e7d660 authored by pkulzc, committed by GitHub

Open Images Challenge 2018 tools, minor fixes and refactors. (#4661)

* Merged commit includes the following changes:
202804536  by Zhichao Lu:

    Return a tf.data.Dataset from the input_fn that goes into the estimator, and use the PER_HOST_V2 option for the TPU input pipeline config.

    This change shaves off 100 ms per step, resulting in 25 minutes less total training time for ssd mobilenet v1 (15k steps to convergence).
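    A minimal sketch of this pattern under the TF 1.x `tf.contrib.tpu` API (the parse function and file pattern below are illustrative placeholders, not part of this change):

```python
import tensorflow as tf

def _parse_fn(serialized):
  # Placeholder decode; the real pipeline uses the object detection decoder.
  return tf.parse_single_example(
      serialized, {'image/encoded': tf.FixedLenFeature((), tf.string)})

def input_fn(params):
  # With PER_HOST_V2, TPUEstimator calls input_fn once per host and passes
  # the per-host batch size via `params`; return the Dataset itself.
  dataset = tf.data.TFRecordDataset(tf.gfile.Glob('/tmp/train-*.record'))
  dataset = dataset.map(_parse_fn).repeat()
  return dataset.apply(
      tf.contrib.data.batch_and_drop_remainder(params['batch_size']))

run_config = tf.contrib.tpu.RunConfig(
    tpu_config=tf.contrib.tpu.TPUConfig(
        iterations_per_loop=100,
        per_host_input_for_training=(
            tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2)))
```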

--
202769340  by Zhichao Lu:

    Adding as_matrix() transformation for image-level labels.

--
202768721  by Zhichao Lu:

    Challenge evaluation protocol modification: adding labelmaps creation.

--
202750966  by Zhichao Lu:

    Add the explicit names to two output nodes.

--
202732783  by Zhichao Lu:

    Enforcing that batch size is 1 for evaluation, and no original images are retained during evaluation when use_tpu=False (to avoid dynamic shapes).

--
202425430  by Zhichao Lu:

    Refactor input pipeline to improve performance.

--
202406389  by Zhichao Lu:

    Only check the validity of `warmup_learning_rate` if it will be used.

--
202330450  by Zhichao Lu:

    Adding the description of the flag input_image_label_annotations_csv to add
      image-level labels to tf.Example.

--
202029012  by Zhichao Lu:

    Enabling display of the relationship name in the final metrics output.

--
202024010  by Zhichao Lu:

    Update to the public README.

--
201999677  by Zhichao Lu:

    Fixing the way negative labels are handled in VRD evaluation.

--
201962313  by Zhichao Lu:

    Fix a bug in resize_to_range.

--
201808488  by Zhichao Lu:

    Update ssd_inception_v2_pets.config to use the right filenames for the pets dataset TFRecords.

--
201779225  by Zhichao Lu:

    Update object detection API installation doc

--
201766518  by Zhichao Lu:

    Add shell script to create pycocotools package for CMLE.

--
201722377  by Zhichao Lu:

    Removes verified_labels field and uses groundtruth_image_classes field instead.

--
201616819  by Zhichao Lu:

    Disable eval_on_tpu since eval_metrics is not set up to execute on TPU.
    Do not use run_config.task_type to switch TPU mode for EVAL,
    since that won't work in a unit test.
    Expand the unit test to verify that the same instantiation of the Estimator can disable eval on TPU while training remains enabled on TPU.

--
201524716  by Zhichao Lu:

    Disable model export to TPU; inference is not compatible with TPU.
    Add GOOGLE_INTERNAL support in the object detection copy.bara.sky.

--
201453347  by Zhichao Lu:

    Fixing bug when evaluating the quantized model.

--
200795826  by Zhichao Lu:

    Fixing parsing bug: image-level labels were parsed as tuples instead of
    numpy arrays.

--
200746134  by Zhichao Lu:

    Adding image_class_text and image_class_label fields into tf_example_decoder.py

--
200743003  by Zhichao Lu:

    Changes to model_main.py and model_tpu_main to enable training and continuous eval.
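    The continuous eval setup follows the standard `tf.estimator.train_and_evaluate` loop; a self-contained toy sketch of that pattern (the model and input below are stand-ins, not the actual object detection model_fn):

```python
import tensorflow as tf

def model_fn(features, labels, mode):
  # Trivial stand-in model; model_main.py builds the real model_fn from the
  # pipeline config.
  w = tf.get_variable('w', initializer=1.0)
  loss = tf.reduce_mean(tf.square(w * features['x']))
  train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
      loss, global_step=tf.train.get_or_create_global_step())
  return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

def input_fn():
  return tf.data.Dataset.from_tensors(({'x': [1.0]}, 0.0)).repeat().batch(8)

estimator = tf.estimator.Estimator(model_fn=model_fn)
tf.estimator.train_and_evaluate(
    estimator,
    tf.estimator.TrainSpec(input_fn=input_fn, max_steps=100),
    tf.estimator.EvalSpec(input_fn=input_fn, steps=10, throttle_secs=0))
```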

--
200736324  by Zhichao Lu:

    Replace deprecated squeeze_dims argument.
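    The rename is mechanical, as in the preprocessor change further down this diff:

```python
import tensorflow as tf

images = tf.zeros([1, 4, 5, 3])
image = tf.squeeze(images, axis=[0])            # current argument name
# image = tf.squeeze(images, squeeze_dims=[0])  # deprecated spelling
```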

--
200730072  by Zhichao Lu:

    Make detections only during predict and eval modes when creating the model function.

--
200729699  by Zhichao Lu:

    Minor correction to internal documentation (definition of Huber loss)

--
200727142  by Zhichao Lu:

    Add command line parsing as a set of flags using argparse, and add a header
    to the resulting file.

--
200726169  by Zhichao Lu:

    A tutorial on running evaluation for the Open Images Challenge 2018.

--
200665093  by Zhichao Lu:

    Cleanup on variables_helper_test.py.

--
200652145  by Zhichao Lu:

    Add an option to write (non-frozen) graph when exporting inference graph.
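    Judging from the exporter changes below, the option is exposed as a `write_inference_graph` keyword argument and a matching command-line flag; a sketch of the library call (config and checkpoint paths are placeholders):

```python
from google.protobuf import text_format
from object_detection import exporter
from object_detection.protos import pipeline_pb2

pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
with open('/path/to/pipeline.config') as f:
  text_format.Merge(f.read(), pipeline_config)

exporter.export_inference_graph(
    input_type='image_tensor',
    pipeline_config=pipeline_config,
    trained_checkpoint_prefix='/path/to/model.ckpt',
    output_directory='/path/to/exported_model',
    write_inference_graph=True)  # also writes inference_graph.pbtxt
```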

--
200573810  by Zhichao Lu:

    Update ssd_mobilenet_v1_coco and ssd_inception_v2_coco download links to point to a newer version.

--
200498014  by Zhichao Lu:

    Add test for groundtruth mask resizing.

--
200453245  by Zhichao Lu:

    Cleaning up exporting_models.md along with exporting scripts

--
200311747  by Zhichao Lu:

    Resize groundtruth mask to match the size of the original image.

--
200287269  by Zhichao Lu:

    Add an option to use a custom MatMul-based crop_and_resize op as an alternative to the TF op in Faster R-CNN.

--
200127859  by Zhichao Lu:

    Updating the instructions to run locally with the new binary. Also updating the pets configs, since file path naming has changed.

--
200127044  by Zhichao Lu:

    A simpler evaluation util to compute Open Images Challenge
    2018 metric (object detection track).

--
200124019  by Zhichao Lu:

    Freshening up configuring_jobs.md

--
200086825  by Zhichao Lu:

    Make merge_multiple_label_boxes work for ssd model.

--
199843258  by Zhichao Lu:

    Allows inconsistent feature channels to be compatible with WeightSharedConvolutionalBoxPredictor.

--
199676082  by Zhichao Lu:

    Enable an override for `InputReader.shuffle` for object detection pipelines.

--
199599212  by Zhichao Lu:

    Markdown fixes.

--
199535432  by Zhichao Lu:

    Pass num_additional_channels to tf.example decoder in predict_input_fn.

--
199399439  by Zhichao Lu:

    Adding `num_additional_channels` field to specify how many additional channels to use in the model.
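    As the `dataset_builder` changes below show, the field lives on the `InputReader` proto and is read via `input_reader_config.num_additional_channels`; a sketch of configuring it (the record path is a placeholder):

```python
from google.protobuf import text_format
from object_detection.builders import dataset_builder
from object_detection.protos import input_reader_pb2

input_reader_text = """
  shuffle: false
  num_readers: 1
  num_additional_channels: 2
  tf_record_input_reader { input_path: '/path/to/records' }
"""
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text, input_reader_proto)
dataset = dataset_builder.build(input_reader_proto, batch_size=2)
```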

--

PiperOrigin-RevId: 202804536

* Add original model builder and docs back.
parent 86ac7a47
@@ -72,6 +72,8 @@ Extras:
Inference and evaluation on the Open Images dataset</a><br>
* <a href='g3doc/instance_segmentation.md'>
Run an instance segmentation model</a><br>
* <a href='g3doc/challenge_evaluation.md'>
Run the evaluation for the Open Images Challenge 2018.</a><br>
## Getting Help
@@ -90,6 +92,20 @@ reporting an issue.
## Release information
### June 25, 2018
Additional evaluation tools for the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) are out.
Check out our short tutorial on data preparation and running evaluation [here](g3doc/challenge_evaluation.md)!
<b>Thanks to contributors</b>: Alina Kuznetsova
### June 5, 2018
We have released the implementation of evaluation metrics for both tracks of the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) as a part of the Object Detection API - see the [evaluation protocols](g3doc/evaluation_protocols.md) for more details.
Additionally, we have released a tool for hierarchical labels expansion for the Open Images Challenge: check out [oid_hierarchical_labels_expansion.py](dataset_tools/oid_hierarchical_labels_expansion.py).
<b>Thanks to contributors</b>: Alina Kuznetsova, Vittorio Ferrari, Jasper Uijlings
### April 30, 2018
We have released a Faster R-CNN detector with ResNet-101 feature extractor trained on [AVA](https://research.google.com/ava/) v2.1.
@@ -24,111 +24,66 @@ that wraps the build function.
import functools
import tensorflow as tf
from object_detection.core import standard_fields as fields
from object_detection.data_decoders import tf_example_decoder
from object_detection.protos import input_reader_pb2
from object_detection.utils import dataset_util
def _get_padding_shapes(dataset, max_num_boxes=None, num_classes=None,
spatial_image_shape=None):
"""Returns shapes to pad dataset tensors to before batching.
def make_initializable_iterator(dataset):
"""Creates an iterator, and initializes tables.
This is useful in cases where make_one_shot_iterator wouldn't work because
the graph contains a hash table that needs to be initialized.
Args:
dataset: tf.data.Dataset object.
max_num_boxes: Max number of groundtruth boxes needed to computes shapes for
padding.
num_classes: Number of classes in the dataset needed to compute shapes for
padding.
spatial_image_shape: A list of two integers of the form [height, width]
containing expected spatial shape of the image.
dataset: A `tf.data.Dataset` object.
Returns:
A dictionary keyed by fields.InputDataFields containing padding shapes for
tensors in the dataset.
Raises:
ValueError: If groundtruth classes is neither rank 1 nor rank 2.
A `tf.data.Iterator`.
"""
iterator = dataset.make_initializable_iterator()
tf.add_to_collection(tf.GraphKeys.TABLE_INITIALIZERS, iterator.initializer)
return iterator
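# Illustrative usage (a sketch, mirroring the tests further below): chain this
# helper with `build` to obtain input tensors,
#   tensors = make_initializable_iterator(
#       build(input_reader_proto, batch_size=1)).get_next()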
if not spatial_image_shape or spatial_image_shape == [-1, -1]:
height, width = None, None
else:
height, width = spatial_image_shape # pylint: disable=unpacking-non-sequence
num_additional_channels = 0
if fields.InputDataFields.image_additional_channels in dataset.output_shapes:
num_additional_channels = dataset.output_shapes[
fields.InputDataFields.image_additional_channels].dims[2].value
padding_shapes = {
# Additional channels are merged before batching.
fields.InputDataFields.image: [
height, width, 3 + num_additional_channels
],
fields.InputDataFields.image_additional_channels: [
height, width, num_additional_channels
],
fields.InputDataFields.source_id: [],
fields.InputDataFields.filename: [],
fields.InputDataFields.key: [],
fields.InputDataFields.groundtruth_difficult: [max_num_boxes],
fields.InputDataFields.groundtruth_boxes: [max_num_boxes, 4],
fields.InputDataFields.groundtruth_instance_masks: [
max_num_boxes, height, width
],
fields.InputDataFields.groundtruth_is_crowd: [max_num_boxes],
fields.InputDataFields.groundtruth_group_of: [max_num_boxes],
fields.InputDataFields.groundtruth_area: [max_num_boxes],
fields.InputDataFields.groundtruth_weights: [max_num_boxes],
fields.InputDataFields.num_groundtruth_boxes: [],
fields.InputDataFields.groundtruth_label_types: [max_num_boxes],
fields.InputDataFields.groundtruth_label_scores: [max_num_boxes],
fields.InputDataFields.true_image_shape: [3],
fields.InputDataFields.multiclass_scores: [
max_num_boxes, num_classes + 1 if num_classes is not None else None
],
}
# Determine whether groundtruth_classes are integers or one-hot encodings, and
# apply batching appropriately.
classes_shape = dataset.output_shapes[
fields.InputDataFields.groundtruth_classes]
if len(classes_shape) == 1: # Class integers.
padding_shapes[fields.InputDataFields.groundtruth_classes] = [max_num_boxes]
elif len(classes_shape) == 2: # One-hot or k-hot encoding.
padding_shapes[fields.InputDataFields.groundtruth_classes] = [
max_num_boxes, num_classes]
else:
raise ValueError('Groundtruth classes must be a rank 1 tensor (classes) or '
'rank 2 tensor (one-hot encodings)')
if fields.InputDataFields.original_image in dataset.output_shapes:
padding_shapes[fields.InputDataFields.original_image] = [
None, None, 3 + num_additional_channels
]
if fields.InputDataFields.groundtruth_keypoints in dataset.output_shapes:
tensor_shape = dataset.output_shapes[fields.InputDataFields.
groundtruth_keypoints]
padding_shape = [max_num_boxes, tensor_shape[1].value,
tensor_shape[2].value]
padding_shapes[fields.InputDataFields.groundtruth_keypoints] = padding_shape
if (fields.InputDataFields.groundtruth_keypoint_visibilities
in dataset.output_shapes):
tensor_shape = dataset.output_shapes[fields.InputDataFields.
groundtruth_keypoint_visibilities]
padding_shape = [max_num_boxes, tensor_shape[1].value]
padding_shapes[fields.InputDataFields.
groundtruth_keypoint_visibilities] = padding_shape
return {tensor_key: padding_shapes[tensor_key]
for tensor_key, _ in dataset.output_shapes.items()}
def build(input_reader_config,
transform_input_data_fn=None,
batch_size=None,
max_num_boxes=None,
num_classes=None,
spatial_image_shape=None,
num_additional_channels=0):
def read_dataset(file_read_func, input_files, config):
"""Reads a dataset, and handles repetition and shuffling.
Args:
file_read_func: Function to use in tf.contrib.data.parallel_interleave, to
read every individual file into a tf.data.Dataset.
input_files: A list of file paths to read.
config: A input_reader_builder.InputReader object.
Returns:
A tf.data.Dataset of (undecoded) tf-records based on config.
"""
# Shard, shuffle, and read files.
filenames = tf.gfile.Glob(input_files)
num_readers = config.num_readers
if num_readers > len(filenames):
num_readers = len(filenames)
tf.logging.warning('num_readers has been reduced to %d to match input file '
'shards.' % num_readers)
filename_dataset = tf.data.Dataset.from_tensor_slices(filenames)
if config.shuffle:
filename_dataset = filename_dataset.shuffle(
config.filenames_shuffle_buffer_size)
elif num_readers > 1:
tf.logging.warning('`shuffle` is false, but the input data stream is '
'still slightly shuffled since `num_readers` > 1.')
filename_dataset = filename_dataset.repeat(config.num_epochs or None)
records_dataset = filename_dataset.apply(
tf.contrib.data.parallel_interleave(
file_read_func,
cycle_length=num_readers,
block_length=config.read_block_length,
sloppy=config.shuffle))
if config.shuffle:
records_dataset = records_dataset.shuffle(config.shuffle_buffer_size)
return records_dataset
def build(input_reader_config, batch_size=None, transform_input_data_fn=None):
"""Builds a tf.data.Dataset.
Builds a tf.data.Dataset by applying the `transform_input_data_fn` on all
@@ -136,17 +91,9 @@ def build(input_reader_config,
Args:
input_reader_config: A input_reader_pb2.InputReader object.
transform_input_data_fn: Function to apply to all records, or None if
no extra decoding is required.
batch_size: Batch size. If None, batching is not performed.
max_num_boxes: Max number of groundtruth boxes needed to compute shapes for
padding. If None, will use a dynamic shape.
num_classes: Number of classes in the dataset needed to compute shapes for
padding. If None, will use a dynamic shape.
spatial_image_shape: A list of two integers of the form [height, width]
containing expected spatial shape of the image after applying
transform_input_data_fn. If None, will use dynamic shapes.
num_additional_channels: Number of additional channels to use in the input.
batch_size: Batch size. If batch size is None, no batching is performed.
transform_input_data_fn: Function to apply transformation to all records,
or None if no extra decoding is required.
Returns:
A tf.data.Dataset based on the input_reader_config.
@@ -173,24 +120,31 @@ def build(input_reader_config,
instance_mask_type=input_reader_config.mask_type,
label_map_proto_file=label_map_proto_file,
use_display_name=input_reader_config.use_display_name,
num_additional_channels=num_additional_channels)
num_additional_channels=input_reader_config.num_additional_channels)
def process_fn(value):
processed = decoder.decode(value)
"""Sets up tf graph that decodes, transforms and pads input data."""
processed_tensors = decoder.decode(value)
if transform_input_data_fn is not None:
return transform_input_data_fn(processed)
return processed
processed_tensors = transform_input_data_fn(processed_tensors)
return processed_tensors
dataset = dataset_util.read_dataset(
dataset = read_dataset(
functools.partial(tf.data.TFRecordDataset, buffer_size=8 * 1000 * 1000),
process_fn, config.input_path[:], input_reader_config)
config.input_path[:], input_reader_config)
# TODO(rathodv): make batch size a required argument once the old binaries
# are deleted.
if batch_size:
num_parallel_calls = batch_size * input_reader_config.num_parallel_batches
else:
num_parallel_calls = input_reader_config.num_parallel_map_calls
dataset = dataset.map(
process_fn,
num_parallel_calls=num_parallel_calls)
if batch_size:
padding_shapes = _get_padding_shapes(dataset, max_num_boxes, num_classes,
spatial_image_shape)
dataset = dataset.apply(
tf.contrib.data.padded_batch_and_drop_remainder(batch_size,
padding_shapes))
tf.contrib.data.batch_and_drop_remainder(batch_size))
dataset = dataset.prefetch(input_reader_config.num_prefetch_batches)
return dataset
raise ValueError('Unsupported input_reader_config.')
@@ -25,7 +25,6 @@ from tensorflow.core.example import feature_pb2
from object_detection.builders import dataset_builder
from object_detection.core import standard_fields as fields
from object_detection.protos import input_reader_pb2
from object_detection.utils import dataset_util
class DatasetBuilderTest(tf.test.TestCase):
@@ -91,7 +90,7 @@ class DatasetBuilderTest(tf.test.TestCase):
""".format(tf_record_path)
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text_proto, input_reader_proto)
tensor_dict = dataset_util.make_initializable_iterator(
tensor_dict = dataset_builder.make_initializable_iterator(
dataset_builder.build(input_reader_proto, batch_size=1)).get_next()
sv = tf.train.Supervisor(logdir=self.get_temp_dir())
@@ -124,7 +123,7 @@ class DatasetBuilderTest(tf.test.TestCase):
""".format(tf_record_path)
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text_proto, input_reader_proto)
tensor_dict = dataset_util.make_initializable_iterator(
tensor_dict = dataset_builder.make_initializable_iterator(
dataset_builder.build(input_reader_proto, batch_size=1)).get_next()
sv = tf.train.Supervisor(logdir=self.get_temp_dir())
@@ -153,14 +152,11 @@ class DatasetBuilderTest(tf.test.TestCase):
tensor_dict[fields.InputDataFields.groundtruth_classes] - 1, depth=3)
return tensor_dict
tensor_dict = dataset_util.make_initializable_iterator(
tensor_dict = dataset_builder.make_initializable_iterator(
dataset_builder.build(
input_reader_proto,
transform_input_data_fn=one_hot_class_encoding_fn,
batch_size=2,
max_num_boxes=2,
num_classes=3,
spatial_image_shape=[4, 5])).get_next()
batch_size=2)).get_next()
sv = tf.train.Supervisor(logdir=self.get_temp_dir())
with sv.prepare_or_wait_for_session() as sess:
@@ -169,17 +165,15 @@ class DatasetBuilderTest(tf.test.TestCase):
self.assertAllEqual([2, 4, 5, 3],
output_dict[fields.InputDataFields.image].shape)
self.assertAllEqual([2, 2, 3],
self.assertAllEqual([2, 1, 3],
output_dict[fields.InputDataFields.groundtruth_classes].
shape)
self.assertAllEqual([2, 2, 4],
self.assertAllEqual([2, 1, 4],
output_dict[fields.InputDataFields.groundtruth_boxes].
shape)
self.assertAllEqual(
[[[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 0.0, 0.0]],
[[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 0.0, 0.0]]],
[[[0.0, 0.0, 1.0, 1.0]],
[[0.0, 0.0, 1.0, 1.0]]],
output_dict[fields.InputDataFields.groundtruth_boxes])
def test_build_tf_record_input_reader_with_batch_size_two_and_masks(self):
@@ -201,14 +195,11 @@ class DatasetBuilderTest(tf.test.TestCase):
tensor_dict[fields.InputDataFields.groundtruth_classes] - 1, depth=3)
return tensor_dict
tensor_dict = dataset_util.make_initializable_iterator(
tensor_dict = dataset_builder.make_initializable_iterator(
dataset_builder.build(
input_reader_proto,
transform_input_data_fn=one_hot_class_encoding_fn,
batch_size=2,
max_num_boxes=2,
num_classes=3,
spatial_image_shape=[4, 5])).get_next()
batch_size=2)).get_next()
sv = tf.train.Supervisor(logdir=self.get_temp_dir())
with sv.prepare_or_wait_for_session() as sess:
@@ -216,34 +207,9 @@ class DatasetBuilderTest(tf.test.TestCase):
output_dict = sess.run(tensor_dict)
self.assertAllEqual(
[2, 2, 4, 5],
[2, 1, 4, 5],
output_dict[fields.InputDataFields.groundtruth_instance_masks].shape)
def test_build_tf_record_input_reader_with_additional_channels(self):
tf_record_path = self.create_tf_record(has_additional_channels=True)
input_reader_text_proto = """
shuffle: false
num_readers: 1
tf_record_input_reader {{
input_path: '{0}'
}}
""".format(tf_record_path)
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text_proto, input_reader_proto)
tensor_dict = dataset_util.make_initializable_iterator(
dataset_builder.build(
input_reader_proto, batch_size=2,
num_additional_channels=2)).get_next()
sv = tf.train.Supervisor(logdir=self.get_temp_dir())
with sv.prepare_or_wait_for_session() as sess:
sv.start_queue_runners(sess)
output_dict = sess.run(tensor_dict)
self.assertEquals((2, 4, 5, 5),
output_dict[fields.InputDataFields.image].shape)
def test_raises_error_with_no_input_paths(self):
input_reader_text_proto = """
shuffle: false
@@ -253,7 +219,114 @@ class DatasetBuilderTest(tf.test.TestCase):
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text_proto, input_reader_proto)
with self.assertRaises(ValueError):
dataset_builder.build(input_reader_proto)
dataset_builder.build(input_reader_proto, batch_size=1)
class ReadDatasetTest(tf.test.TestCase):
def setUp(self):
self._path_template = os.path.join(self.get_temp_dir(), 'examples_%s.txt')
for i in range(5):
path = self._path_template % i
with tf.gfile.Open(path, 'wb') as f:
f.write('\n'.join([str(i + 1), str((i + 1) * 10)]))
self._shuffle_path_template = os.path.join(self.get_temp_dir(),
'shuffle_%s.txt')
for i in range(2):
path = self._shuffle_path_template % i
with tf.gfile.Open(path, 'wb') as f:
f.write('\n'.join([str(i)] * 5))
def _get_dataset_next(self, files, config, batch_size):
def decode_func(value):
return [tf.string_to_number(value, out_type=tf.int32)]
dataset = dataset_builder.read_dataset(
tf.data.TextLineDataset, files, config)
dataset = dataset.map(decode_func)
dataset = dataset.batch(batch_size)
return dataset.make_one_shot_iterator().get_next()
def test_make_initializable_iterator_with_hashTable(self):
keys = [1, 0, -1]
dataset = tf.data.Dataset.from_tensor_slices([[1, 2, -1, 5]])
table = tf.contrib.lookup.HashTable(
initializer=tf.contrib.lookup.KeyValueTensorInitializer(
keys=keys,
values=list(reversed(keys))),
default_value=100)
dataset = dataset.map(table.lookup)
data = dataset_builder.make_initializable_iterator(dataset).get_next()
init = tf.tables_initializer()
with self.test_session() as sess:
sess.run(init)
self.assertAllEqual(sess.run(data), [-1, 100, 1, 100])
def test_read_dataset(self):
config = input_reader_pb2.InputReader()
config.num_readers = 1
config.shuffle = False
data = self._get_dataset_next([self._path_template % '*'], config,
batch_size=20)
with self.test_session() as sess:
self.assertAllEqual(sess.run(data),
[[1, 10, 2, 20, 3, 30, 4, 40, 5, 50, 1, 10, 2, 20, 3,
30, 4, 40, 5, 50]])
def test_reduce_num_reader(self):
config = input_reader_pb2.InputReader()
config.num_readers = 10
config.shuffle = False
data = self._get_dataset_next([self._path_template % '*'], config,
batch_size=20)
with self.test_session() as sess:
self.assertAllEqual(sess.run(data),
[[1, 10, 2, 20, 3, 30, 4, 40, 5, 50, 1, 10, 2, 20, 3,
30, 4, 40, 5, 50]])
def test_enable_shuffle(self):
config = input_reader_pb2.InputReader()
config.num_readers = 1
config.shuffle = True
tf.set_random_seed(1) # Set graph level seed.
data = self._get_dataset_next(
[self._shuffle_path_template % '*'], config, batch_size=10)
expected_non_shuffle_output = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
with self.test_session() as sess:
self.assertTrue(
np.any(np.not_equal(sess.run(data), expected_non_shuffle_output)))
def test_disable_shuffle_(self):
config = input_reader_pb2.InputReader()
config.num_readers = 1
config.shuffle = False
data = self._get_dataset_next(
[self._shuffle_path_template % '*'], config, batch_size=10)
expected_non_shuffle_output = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
with self.test_session() as sess:
self.assertAllEqual(sess.run(data), [expected_non_shuffle_output])
def test_read_dataset_single_epoch(self):
config = input_reader_pb2.InputReader()
config.num_epochs = 1
config.num_readers = 1
config.shuffle = False
data = self._get_dataset_next([self._path_template % '0'], config,
batch_size=30)
with self.test_session() as sess:
# First batch will retrieve as much as it can, second batch will fail.
self.assertAllEqual(sess.run(data), [[1, 10]])
self.assertRaises(tf.errors.OutOfRangeError, sess.run, data)
if __name__ == '__main__':
@@ -840,7 +840,9 @@ class WeightSharedConvolutionalBoxPredictor(BoxPredictor):
Args:
image_features: A list of float tensors of shape [batch_size, height_i,
width_i, channels] containing features for a batch of images. Note that
all tensors in the list must have the same number of channels.
when not all tensors in the list have the same number of channels, an
additional projection layer will be added on top of the tensor to generate a
feature map with a number of channels consistent with the majority.
num_predictions_per_location_list: A list of integers representing the
number of box predictions to be made per spatial location for each
feature map. Note that all values must be the same since the weights are
@@ -869,11 +871,17 @@ class WeightSharedConvolutionalBoxPredictor(BoxPredictor):
feature_channels = [
image_feature.shape[3].value for image_feature in image_features
]
if len(set(feature_channels)) > 1:
raise ValueError('all feature maps must have the same number of '
'channels, found: {}'.format(feature_channels))
has_different_feature_channels = len(set(feature_channels)) > 1
if has_different_feature_channels:
inserted_layer_counter = 0
target_channel = max(set(feature_channels), key=feature_channels.count)
tf.logging.info('Not all feature maps have the same number of '
'channels, found: {}, adding projection layers '
'to bring all feature maps to uniform channels '
'of {}'.format(feature_channels, target_channel))
box_encodings_list = []
class_predictions_list = []
num_class_slots = self.num_classes + 1
for feature_index, (image_feature,
num_predictions_per_location) in enumerate(
zip(image_features,
@@ -881,11 +889,28 @@ class WeightSharedConvolutionalBoxPredictor(BoxPredictor):
# Add a slot for the background class.
with tf.variable_scope('WeightSharedConvolutionalBoxPredictor',
reuse=tf.AUTO_REUSE):
num_class_slots = self.num_classes + 1
box_encodings_net = image_feature
class_predictions_net = image_feature
with slim.arg_scope(self._conv_hyperparams_fn()) as sc:
apply_batch_norm = _arg_scope_func_key(slim.batch_norm) in sc
# Insert an additional projection layer if necessary.
if (has_different_feature_channels and
image_feature.shape[3].value != target_channel):
image_feature = slim.conv2d(
image_feature,
target_channel, [1, 1],
stride=1,
padding='SAME',
activation_fn=None,
normalizer_fn=(tf.identity if apply_batch_norm else None),
scope='ProjectionLayer/conv2d_{}'.format(
inserted_layer_counter))
if apply_batch_norm:
image_feature = slim.batch_norm(
image_feature,
scope='ProjectionLayer/conv2d_{}/BatchNorm'.format(
inserted_layer_counter))
inserted_layer_counter += 1
box_encodings_net = image_feature
class_predictions_net = image_feature
for i in range(self._num_layers_before_predictor):
box_encodings_net = slim.conv2d(
box_encodings_net,
@@ -565,6 +565,38 @@ class WeightSharedConvolutionalBoxPredictorTest(test_case.TestCase):
self.assertAllEqual(class_predictions_with_background.shape,
[4, 640, num_classes_without_background+1])
def test_get_multi_class_predictions_from_feature_maps_of_different_depth(
self):
num_classes_without_background = 6
def graph_fn(image_features1, image_features2, image_features3):
conv_box_predictor = box_predictor.WeightSharedConvolutionalBoxPredictor(
is_training=False,
num_classes=num_classes_without_background,
conv_hyperparams_fn=self._build_arg_scope_with_conv_hyperparams(),
depth=32,
num_layers_before_predictor=1,
box_code_size=4)
box_predictions = conv_box_predictor.predict(
[image_features1, image_features2, image_features3],
num_predictions_per_location=[5, 5, 5],
scope='BoxPredictor')
box_encodings = tf.concat(
box_predictions[box_predictor.BOX_ENCODINGS], axis=1)
class_predictions_with_background = tf.concat(
box_predictions[box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND],
axis=1)
return (box_encodings, class_predictions_with_background)
image_features1 = np.random.rand(4, 8, 8, 64).astype(np.float32)
image_features2 = np.random.rand(4, 8, 8, 64).astype(np.float32)
image_features3 = np.random.rand(4, 8, 8, 32).astype(np.float32)
(box_encodings, class_predictions_with_background) = self.execute(
graph_fn, [image_features1, image_features2, image_features3])
self.assertAllEqual(box_encodings.shape, [4, 960, 4])
self.assertAllEqual(class_predictions_with_background.shape,
[4, 960, num_classes_without_background+1])
def test_predictions_from_multiple_feature_maps_share_weights_not_batchnorm(
self):
num_classes_without_background = 6
@@ -120,7 +120,7 @@ class WeightedSmoothL1LocalizationLoss(Loss):
"""Smooth L1 localization loss function aka Huber Loss..
The smooth L1_loss is defined elementwise as .5 x^2 if |x| <= delta and
0.5 x^2 + delta * (|x|-delta) otherwise, where x is the difference between
delta * (|x|- 0.5*delta) otherwise, where x is the difference between
predictions and target.
See also Equation (3) in the Fast R-CNN paper by Ross Girshick (ICCV 2015)
@@ -2207,10 +2207,10 @@ def resize_to_range(image,
new_size[:-1],
method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
align_corners=align_corners)
new_masks = tf.squeeze(new_masks, 3)
if pad_to_max_dimension:
new_masks = tf.image.pad_to_bounding_box(
new_masks, 0, 0, max_dimension, max_dimension)
new_masks = tf.squeeze(new_masks, 3)
result.append(new_masks)
result.append(new_size)
@@ -3136,7 +3136,7 @@ def preprocess(tensor_dict,
images = tensor_dict[fields.InputDataFields.image]
if len(images.get_shape()) != 4:
raise ValueError('images in tensor_dict should be rank 4')
image = tf.squeeze(images, squeeze_dims=[0])
image = tf.squeeze(images, axis=0)
tensor_dict[fields.InputDataFields.image] = image
# Preprocess inputs based on preprocess_options
@@ -2377,6 +2377,40 @@ class PreprocessorTest(tf.test.TestCase):
self.assertAllEqual(out_masks.get_shape().as_list(), expected_mask_shape)
self.assertAllEqual(out_image.get_shape().as_list(), expected_image_shape)
def testResizeToRangeWithMasksAndPadToMaxDimension(self):
"""Tests image resizing, checking output sizes."""
in_image_shape_list = [[60, 40, 3], [15, 30, 3]]
in_masks_shape_list = [[15, 60, 40], [10, 15, 30]]
min_dim = 50
max_dim = 100
expected_image_shape_list = [[100, 100, 3], [100, 100, 3]]
expected_masks_shape_list = [[15, 100, 100], [10, 100, 100]]
for (in_image_shape,
expected_image_shape, in_masks_shape, expected_mask_shape) in zip(
in_image_shape_list, expected_image_shape_list,
in_masks_shape_list, expected_masks_shape_list):
in_image = tf.placeholder(tf.float32, shape=(None, None, 3))
in_masks = tf.placeholder(tf.float32, shape=(None, None, None))
out_image, out_masks, _ = preprocessor.resize_to_range(
in_image,
in_masks,
min_dimension=min_dim,
max_dimension=max_dim,
pad_to_max_dimension=True)
out_image_shape = tf.shape(out_image)
out_masks_shape = tf.shape(out_masks)
with self.test_session() as sess:
out_image_shape, out_masks_shape = sess.run(
[out_image_shape, out_masks_shape],
feed_dict={
in_image: np.random.randn(*in_image_shape),
in_masks: np.random.randn(*in_masks_shape)
})
self.assertAllEqual(out_image_shape, expected_image_shape)
self.assertAllEqual(out_masks_shape, expected_mask_shape)
def testResizeToRangeWithMasksAndDynamicSpatialShape(self):
"""Tests image resizing, checking output sizes."""
in_image_shape_list = [[60, 40, 3], [15, 30, 3]]
@@ -62,8 +62,6 @@ class InputDataFields(object):
num_groundtruth_boxes: number of groundtruth boxes.
true_image_shapes: true shapes of images in the resized images, as resized
images can be padded with zeros.
verified_labels: list of human-verified image-level labels (note, that a
label can be verified both as positive and negative).
multiclass_scores: the label score per class for each box.
"""
image = 'image'
@@ -91,7 +89,6 @@ class InputDataFields(object):
groundtruth_weights = 'groundtruth_weights'
num_groundtruth_boxes = 'num_groundtruth_boxes'
true_image_shape = 'true_image_shape'
verified_labels = 'verified_labels'
multiclass_scores = 'multiclass_scores'
@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tensorflow Example proto decoder for object detection.
A decoder to decode string tensors containing serialized tensorflow.Example
@@ -156,6 +155,11 @@ class TfExampleDecoder(data_decoder.DataDecoder):
tf.FixedLenFeature((), tf.int64, default_value=1),
'image/width':
tf.FixedLenFeature((), tf.int64, default_value=1),
# Image-level labels.
'image/class/text':
tf.VarLenFeature(tf.string),
'image/class/label':
tf.VarLenFeature(tf.int64),
# Object boxes and classes.
'image/object/bbox/xmin':
tf.VarLenFeature(tf.float32),
@@ -281,10 +285,18 @@ class TfExampleDecoder(data_decoder.DataDecoder):
label_handler = BackupHandler(
LookupTensor('image/object/class/text', table, default_value=''),
slim_example_decoder.Tensor('image/object/class/label'))
image_label_handler = BackupHandler(
LookupTensor(
fields.TfExampleFields.image_class_text, table, default_value=''),
slim_example_decoder.Tensor(fields.TfExampleFields.image_class_label))
else:
label_handler = slim_example_decoder.Tensor('image/object/class/label')
image_label_handler = slim_example_decoder.Tensor(
fields.TfExampleFields.image_class_label)
self.items_to_handlers[
fields.InputDataFields.groundtruth_classes] = label_handler
self.items_to_handlers[
fields.InputDataFields.groundtruth_image_classes] = image_label_handler
def decode(self, tf_example_string_tensor):
"""Decodes serialized tensorflow example and returns a tensor dictionary.
@@ -328,6 +340,8 @@ class TfExampleDecoder(data_decoder.DataDecoder):
the keypoints are ordered (y, x).
fields.InputDataFields.groundtruth_instance_masks - 3D float32 tensor of
shape [None, None, None] containing instance masks.
fields.InputDataFields.groundtruth_image_classes - 1D int64 of shape
[None] containing the image-level class labels.
"""
serialized_example = tf.reshape(tf_example_string_tensor, shape=[])
decoder = slim_example_decoder.TFExampleDecoder(self.keys_to_features,
@@ -762,6 +762,57 @@ class TfExampleDecoderTest(tf.test.TestCase):
self.assertTrue(fields.InputDataFields.groundtruth_instance_masks
not in tensor_dict)
def testDecodeImageLabels(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
example = tf.train.Example(
features=tf.train.Features(
feature={
'image/encoded': self._BytesFeature(encoded_jpeg),
'image/format': self._BytesFeature('jpeg'),
'image/class/label': self._Int64Feature([1, 2]),
})).SerializeToString()
example_decoder = tf_example_decoder.TfExampleDecoder()
tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))
with self.test_session() as sess:
tensor_dict = sess.run(tensor_dict)
self.assertTrue(
fields.InputDataFields.groundtruth_image_classes in tensor_dict)
self.assertAllEqual(
tensor_dict[fields.InputDataFields.groundtruth_image_classes],
np.array([1, 2]))
example = tf.train.Example(
features=tf.train.Features(
feature={
'image/encoded': self._BytesFeature(encoded_jpeg),
'image/format': self._BytesFeature('jpeg'),
'image/class/text': self._BytesFeature(['dog', 'cat']),
})).SerializeToString()
label_map_string = """
item {
id:3
name:'cat'
}
item {
id:1
name:'dog'
}
"""
label_map_path = os.path.join(self.get_temp_dir(), 'label_map.pbtxt')
with tf.gfile.Open(label_map_path, 'wb') as f:
f.write(label_map_string)
example_decoder = tf_example_decoder.TfExampleDecoder(
label_map_proto_file=label_map_path)
tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))
with self.test_session() as sess:
sess.run(tf.tables_initializer())
tensor_dict = sess.run(tensor_dict)
self.assertTrue(
fields.InputDataFields.groundtruth_image_classes in tensor_dict)
self.assertAllEqual(
tensor_dict[fields.InputDataFields.groundtruth_image_classes],
np.array([1, 3]))
if __name__ == '__main__':
tf.test.main()
#!/bin/bash
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
# Script to download pycocotools and make package for CMLE jobs.
#
# usage:
# bash object_detection/dataset_tools/create_pycocotools_package.sh \
# /tmp/pycocotools
set -e
if [ -z "$1" ]; then
echo "usage create_pycocotools_package.sh [output dir]"
exit
fi
# Create the output directory.
OUTPUT_DIR="${1%/}"
SCRATCH_DIR="${OUTPUT_DIR}/raw"
mkdir -p "${OUTPUT_DIR}"
mkdir -p "${SCRATCH_DIR}"
cd ${SCRATCH_DIR}
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI && mv ../common ./
sed "s/\.\.\/common/common/g" setup.py > setup.py.updated
cp -f setup.py.updated setup.py
rm setup.py.updated
sed "s/\.\.\/common/common/g" pycocotools/_mask.pyx > _mask.pyx.updated
cp -f _mask.pyx.updated pycocotools/_mask.pyx
rm _mask.pyx.updated
sed "s/import matplotlib\.pyplot as plt/import matplotlib\nmatplotlib\.use\(\'Agg\'\)\nimport matplotlib\.pyplot as plt/g" pycocotools/coco.py > coco.py.updated
cp -f coco.py.updated pycocotools/coco.py
rm coco.py.updated
cd "${OUTPUT_DIR}"
tar -czf pycocotools-2.0.tar.gz -C "${SCRATCH_DIR}/cocoapi/" PythonAPI/
rm -rf ${SCRATCH_DIR}
@@ -12,15 +12,19 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""A class and executable to expand hierarchically image-level labels and boxes.
r"""An executable to expand hierarchically image-level labels and boxes.
Example usage:
./hierarchical_labels_expansion <path to JSON hierarchy> <input csv file>
<output csv file> [optional]labels_file
python models/research/object_detection/dataset_tools/\
oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=<path to JSON hierarchy> \
--input_annotations=<input csv file> \
--output_annotations=<output csv file> \
--annotation_type=<1 (for boxes) or 2 (for image-level labels)>
"""
import argparse
import json
import sys
def _update_dict(initial_dict, update):
@@ -80,7 +84,7 @@ class OIDHierarchicalLabelsExpansion(object):
"""Constructor.
Args:
hierarchy: labels hierarchy as JSON file.
hierarchy: labels hierarchy as JSON object.
"""
self._hierarchy_keyed_parent, self._hierarchy_keyed_child, _ = (
@@ -100,14 +104,14 @@ class OIDHierarchicalLabelsExpansion(object):
# Row header is expected to be exactly:
# ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,
# IsTruncated,IsGroupOf,IsDepiction,IsInside
cvs_row_splited = csv_row.split(',')
assert len(cvs_row_splited) == 13
cvs_row_splitted = csv_row.split(',')
assert len(cvs_row_splitted) == 13
result = [csv_row]
assert cvs_row_splited[2] in self._hierarchy_keyed_child
parent_nodes = self._hierarchy_keyed_child[cvs_row_splited[2]]
assert cvs_row_splitted[2] in self._hierarchy_keyed_child
parent_nodes = self._hierarchy_keyed_child[cvs_row_splitted[2]]
for parent_node in parent_nodes:
cvs_row_splited[2] = parent_node
result.append(','.join(cvs_row_splited))
cvs_row_splitted[2] = parent_node
result.append(','.join(cvs_row_splitted))
return result
def expand_labels_from_csv(self, csv_row):
@@ -141,32 +145,55 @@ class OIDHierarchicalLabelsExpansion(object):
return result
def main(argv):
def main(parsed_args):
if len(argv) < 4:
print """Missing arguments. \n
Usage: ./hierarchical_labels_expansion <path to JSON hierarchy>
<input csv file> <output csv file> [optional]labels_file"""
return
with open(argv[1]) as f:
with open(parsed_args.json_hierarchy_file) as f:
hierarchy = json.load(f)
expansion_generator = OIDHierarchicalLabelsExpansion(hierarchy)
labels_file = False
if len(argv) > 4 and argv[4] == 'labels_file':
if parsed_args.annotation_type == 2:
labels_file = True
with open(argv[2], 'r') as source:
with open(argv[3], 'w') as target:
header_skipped = False
elif parsed_args.annotation_type != 1:
print '--annotation_type expected value is 1 or 2.'
return -1
with open(parsed_args.input_annotations, 'r') as source:
with open(parsed_args.output_annotations, 'w') as target:
header = None
for line in source:
if not header_skipped:
header_skipped = True
if not header:
header = line
continue
if labels_file:
expanded_lines = expansion_generator.expand_labels_from_csv(line)
else:
expanded_lines = expansion_generator.expand_boxes_from_csv(line)
expanded_lines = [header] + expanded_lines
target.writelines(expanded_lines)
if __name__ == '__main__':
main(sys.argv)
parser = argparse.ArgumentParser(
description='Hierarchically expand annotations (excluding root node).')
parser.add_argument(
'--json_hierarchy_file',
required=True,
help='Path to the file containing label hierarchy in JSON format.')
parser.add_argument(
'--input_annotations',
required=True,
help="""Path to Open Images annotations file (either bounding boxes or
image-level labels).""")
parser.add_argument(
'--output_annotations',
required=True,
help="""Path to the output file.""")
parser.add_argument(
'--annotation_type',
type=int,
required=True,
help="""Type of the input annotations: 1 - boxes, 2 - image-level
labels"""
)
args = parser.parse_args()
main(args)
@@ -52,7 +52,6 @@ from object_detection.builders import dataset_builder
from object_detection.builders import graph_rewriter_builder
from object_detection.builders import model_builder
from object_detection.utils import config_util
from object_detection.utils import dataset_util
from object_detection.utils import label_map_util
@@ -115,7 +114,7 @@ def main(unused_argv):
is_training=False)
def get_next(config):
return dataset_util.make_initializable_iterator(
return dataset_builder.make_initializable_iterator(
dataset_builder.build(config)).get_next()
create_input_dict_fn = functools.partial(get_next, input_config)
@@ -556,8 +556,16 @@ def result_dict_for_single_example(image,
if groundtruth:
if input_data_fields.groundtruth_instance_masks in groundtruth:
masks = groundtruth[input_data_fields.groundtruth_instance_masks]
masks = tf.expand_dims(masks, 3)
masks = tf.image.resize_images(
masks,
image_shape[1:3],
method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
align_corners=True)
masks = tf.squeeze(masks, 3)
groundtruth[input_data_fields.groundtruth_instance_masks] = tf.cast(
groundtruth[input_data_fields.groundtruth_instance_masks], tf.uint8)
masks, tf.uint8)
output_dict.update(groundtruth)
if scale_to_absolute:
groundtruth_boxes = groundtruth[input_data_fields.groundtruth_boxes]
@@ -641,5 +649,3 @@ def get_eval_metric_ops_for_evaluators(evaluation_metrics,
'Found {} in the evaluation metrics'.format(metric))
return eval_metric_ops
@@ -32,7 +32,7 @@ class EvalUtilTest(tf.test.TestCase):
{'id': 1, 'name': 'dog'},
{'id': 2, 'name': 'cat'}]
def _make_evaluation_dict(self):
def _make_evaluation_dict(self, resized_groundtruth_masks=False):
input_data_fields = fields.InputDataFields
detection_fields = fields.DetectionResultFields
@@ -46,6 +46,8 @@ class EvalUtilTest(tf.test.TestCase):
groundtruth_boxes = tf.constant([[0., 0., 1., 1.]])
groundtruth_classes = tf.constant([1])
groundtruth_instance_masks = tf.ones(shape=[1, 20, 20], dtype=tf.uint8)
if resized_groundtruth_masks:
groundtruth_instance_masks = tf.ones(shape=[1, 10, 10], dtype=tf.uint8)
detections = {
detection_fields.detection_boxes: detection_boxes,
detection_fields.detection_scores: detection_scores,
@@ -99,6 +101,26 @@ class EvalUtilTest(tf.test.TestCase):
self.assertAlmostEqual(1.0, metrics['DetectionBoxes_Precision/mAP'])
self.assertAlmostEqual(1.0, metrics['DetectionMasks_Precision/mAP'])
def test_get_eval_metric_ops_for_coco_detections_and_resized_masks(self):
evaluation_metrics = ['coco_detection_metrics',
'coco_mask_metrics']
categories = self._get_categories_list()
eval_dict = self._make_evaluation_dict(resized_groundtruth_masks=True)
metric_ops = eval_util.get_eval_metric_ops_for_evaluators(
evaluation_metrics, categories, eval_dict)
_, update_op_boxes = metric_ops['DetectionBoxes_Precision/mAP']
_, update_op_masks = metric_ops['DetectionMasks_Precision/mAP']
with self.test_session() as sess:
metrics = {}
for key, (value_op, _) in metric_ops.iteritems():
metrics[key] = value_op
sess.run(update_op_boxes)
sess.run(update_op_masks)
metrics = sess.run(metrics)
self.assertAlmostEqual(1.0, metrics['DetectionBoxes_Precision/mAP'])
self.assertAlmostEqual(1.0, metrics['DetectionMasks_Precision/mAP'])
def test_get_eval_metric_ops_raises_error_with_unsupported_metric(self):
evaluation_metrics = ['unsupported_metrics']
categories = self._get_categories_list()
@@ -16,7 +16,7 @@
r"""Tool to export an object detection model for inference.
Prepares an object detection tensorflow graph for inference using model
configuration and an optional trained checkpoint. Outputs inference
configuration and a trained checkpoint. Outputs inference
graph, associated checkpoint files, a frozen inference graph and a
SavedModel (https://tensorflow.github.io/serving/serving_basic.html).
@@ -59,7 +59,7 @@ python export_inference_graph \
The expected output would be in the directory
path/to/exported_model_directory (which is created if it does not exist)
with contents:
- graph.pbtxt
- inference_graph.pbtxt
- model.ckpt.data-00000-of-00001
- model.ckpt.info
- model.ckpt.meta
@@ -120,6 +120,8 @@ flags.DEFINE_string('output_directory', None, 'Path to write outputs.')
flags.DEFINE_string('config_override', '',
'pipeline_pb2.TrainEvalPipelineConfig '
'text proto to override pipeline_config_path.')
flags.DEFINE_boolean('write_inference_graph', False,
'If true, writes inference graph to disk.')
tf.app.flags.mark_flag_as_required('pipeline_config_path')
tf.app.flags.mark_flag_as_required('trained_checkpoint_prefix')
tf.app.flags.mark_flag_as_required('output_directory')
@@ -140,7 +142,8 @@ def main(_):
input_shape = None
exporter.export_inference_graph(FLAGS.input_type, pipeline_config,
FLAGS.trained_checkpoint_prefix,
FLAGS.output_directory, input_shape)
FLAGS.output_directory, input_shape,
FLAGS.write_inference_graph)
if __name__ == '__main__':
@@ -18,7 +18,6 @@ import logging
import os
import tempfile
import tensorflow as tf
from google.protobuf import text_format
from tensorflow.core.protobuf import saver_pb2
from tensorflow.python import pywrap_tensorflow
from tensorflow.python.client import session
@@ -29,6 +28,7 @@ from tensorflow.python.training import saver as saver_lib
from object_detection.builders import model_builder
from object_detection.core import standard_fields as fields
from object_detection.data_decoders import tf_example_decoder
from object_detection.utils import config_util
slim = tf.contrib.slim
@@ -243,9 +243,7 @@ def _add_output_tensor_nodes(postprocessed_tensors,
masks, name=detection_fields.detection_masks)
for output_key in outputs:
tf.add_to_collection(output_collection_name, outputs[output_key])
if masks is not None:
tf.add_to_collection(output_collection_name,
outputs[detection_fields.detection_masks])
return outputs
@@ -276,7 +274,7 @@ def write_saved_model(saved_model_path,
Args:
saved_model_path: Path to write SavedModel.
frozen_graph_def: tf.GraphDef holding frozen graph.
inputs: The input image tensor to use for detection.
inputs: The input placeholder tensor.
outputs: A tensor dictionary containing the outputs of a DetectionModel.
"""
with tf.Graph().as_default():
@@ -370,7 +368,8 @@ def _export_inference_graph(input_type,
additional_output_tensor_names=None,
input_shape=None,
output_collection_name='inference_op',
graph_hook_fn=None):
graph_hook_fn=None,
write_inference_graph=False):
"""Export helper."""
tf.gfile.MakeDirs(output_directory)
frozen_graph_path = os.path.join(output_directory,
@@ -408,6 +407,14 @@
model_path=model_path,
input_saver_def=input_saver_def,
trained_checkpoint_prefix=checkpoint_to_use)
if write_inference_graph:
inference_graph_def = tf.get_default_graph().as_graph_def()
inference_graph_path = os.path.join(output_directory,
'inference_graph.pbtxt')
for node in inference_graph_def.node:
node.device = ''
with gfile.GFile(inference_graph_path, 'wb') as f:
f.write(str(inference_graph_def))
if additional_output_tensor_names is not None:
output_node_names = ','.join(outputs.keys()+additional_output_tensor_names)
@@ -434,12 +441,13 @@ def export_inference_graph(input_type,
output_directory,
input_shape=None,
output_collection_name='inference_op',
additional_output_tensor_names=None):
additional_output_tensor_names=None,
write_inference_graph=False):
"""Exports inference graph for the model specified in the pipeline config.
Args:
input_type: Type of input for the graph. Can be one of [`image_tensor`,
`tf_example`].
input_type: Type of input for the graph. Can be one of ['image_tensor',
'encoded_image_string_tensor', 'tf_example'].
pipeline_config: pipeline_pb2.TrainAndEvalPipelineConfig proto.
trained_checkpoint_prefix: Path to the trained checkpoint file.
output_directory: Path to write outputs.
@@ -449,17 +457,20 @@ def export_inference_graph(input_type,
If None, does not add output tensors to a collection.
additional_output_tensor_names: list of additional output
tensors to include in the frozen graph.
write_inference_graph: If true, writes inference graph to disk.
"""
detection_model = model_builder.build(pipeline_config.model,
is_training=False)
_export_inference_graph(input_type, detection_model,
pipeline_config.eval_config.use_moving_averages,
trained_checkpoint_prefix,
output_directory, additional_output_tensor_names,
input_shape, output_collection_name,
graph_hook_fn=None)
_export_inference_graph(
input_type,
detection_model,
pipeline_config.eval_config.use_moving_averages,
trained_checkpoint_prefix,
output_directory,
additional_output_tensor_names,
input_shape,
output_collection_name,
graph_hook_fn=None,
write_inference_graph=write_inference_graph)
pipeline_config.eval_config.use_moving_averages = False
config_text = text_format.MessageToString(pipeline_config)
with tf.gfile.Open(
os.path.join(output_directory, 'pipeline.config'), 'wb') as f:
f.write(config_text)
config_util.save_pipeline_config(pipeline_config, output_directory)
@@ -134,6 +134,26 @@ class ExportInferenceGraphTest(tf.test.TestCase):
self.assertTrue(os.path.exists(os.path.join(
output_directory, 'saved_model', 'saved_model.pb')))
def test_write_inference_graph(self):
tmp_dir = self.get_temp_dir()
trained_checkpoint_prefix = os.path.join(tmp_dir, 'model.ckpt')
self._save_checkpoint_from_mock_model(trained_checkpoint_prefix,
use_moving_averages=False)
with mock.patch.object(
model_builder, 'build', autospec=True) as mock_builder:
mock_builder.return_value = FakeModel()
output_directory = os.path.join(tmp_dir, 'output')
pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
pipeline_config.eval_config.use_moving_averages = False
exporter.export_inference_graph(
input_type='image_tensor',
pipeline_config=pipeline_config,
trained_checkpoint_prefix=trained_checkpoint_prefix,
output_directory=output_directory,
write_inference_graph=True)
self.assertTrue(os.path.exists(os.path.join(
output_directory, 'inference_graph.pbtxt')))
def test_export_graph_with_fixed_size_image_tensor_input(self):
input_shape = [1, 320, 320, 3]
# Open Images Challenge Evaluation
The Object Detection API currently supports several evaluation metrics used in the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html).
In addition, several data processing tools are available. Detailed instructions on using the tools for each track are available below.
**NOTE**: links to external websites in this tutorial may change after the Open Images Challenge 2018 is finished.
## Object Detection Track
The [Object Detection metric](https://storage.googleapis.com/openimages/web/object_detection_metric.html) protocol requires pre-processing of the released data to ensure correct evaluation: the released data contains only the leaf-most bounding box annotations and image-level labels.
The evaluation metric implementation is available in the class `OpenImagesDetectionChallengeEvaluator`.
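As a rough sketch, the evaluator follows the common `DetectionEvaluator` interface from `object_detection.utils.object_detection_evaluation`; the category list, image id and boxes below are made up, and the exact set of groundtruth keys shown is an assumption:

```python
import numpy as np
from object_detection.core import standard_fields
from object_detection.utils import object_detection_evaluation

categories = [{'id': 1, 'name': '/m/01g317'}]  # illustrative category list
evaluator = object_detection_evaluation.OpenImagesDetectionChallengeEvaluator(
    categories)

evaluator.add_single_ground_truth_image_info(
    image_id='img1',
    groundtruth_dict={
        standard_fields.InputDataFields.groundtruth_boxes:
            np.array([[0.1, 0.1, 0.5, 0.5]], dtype=np.float32),
        standard_fields.InputDataFields.groundtruth_classes:
            np.array([1], dtype=np.int64),
        # Verified image-level labels (this commit switches to this field).
        standard_fields.InputDataFields.groundtruth_image_classes:
            np.array([1], dtype=np.int64),
    })
evaluator.add_single_detected_image_info(
    image_id='img1',
    detections_dict={
        standard_fields.DetectionResultFields.detection_boxes:
            np.array([[0.1, 0.1, 0.5, 0.5]], dtype=np.float32),
        standard_fields.DetectionResultFields.detection_scores:
            np.array([0.9], dtype=np.float32),
        standard_fields.DetectionResultFields.detection_classes:
            np.array([1], dtype=np.int64),
    })
print(evaluator.evaluate())  # dict of metric name -> value
```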
1. Download the class hierarchy of the Open Images Challenge 2018 in JSON format from [here](https://storage.googleapis.com/openimages/challenge_2018/bbox_labels_500_hierarchy.json).
2. Download the ground-truth [bounding boxes](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-annotations-bbox.csv) and [image-level labels](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-annotations-human-imagelabels.csv).
3. Filter the rows corresponding to the validation set images you want to use and store the results in the same CSV format.
4. Run the following command to create the hierarchical expansion of the bounding box annotations:
```
HIERARCHY_FILE=/path/to/bbox_labels_500_hierarchy.json
BOUNDING_BOXES=/path/to/challenge-2018-train-annotations-bbox
IMAGE_LABELS=/path/to/challenge-2018-train-annotations-human-imagelabels
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
--input_annotations=${BOUNDING_BOXES}.csv \
--output_annotations=${BOUNDING_BOXES}_expanded.csv \
--annotation_type=1
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
--input_annotations=${IMAGE_LABELS}.csv \
--output_annotations=${IMAGE_LABELS}_expanded.csv \
--annotation_type=2
```
After step 4 you will have produced the ground-truth files suitable for running 'OID Challenge Object Detection Metric 2018' evaluation.

### Running evaluation on CSV files directly

5. If you are not using Tensorflow, you can run evaluation directly using your algorithm's output and the generated ground-truth files: {value=5}

```
INPUT_PREDICTIONS=/path/to/detection_predictions.csv
OUTPUT_METRICS=/path/to/output/metrics/file

python models/research/object_detection/metrics/oid_od_challenge_evaluation.py \
--input_annotations_boxes=${BOUNDING_BOXES}_expanded.csv \
--input_annotations_labels=${IMAGE_LABELS}_expanded.csv \
--input_class_labelmap=object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt \
--input_predictions=${INPUT_PREDICTIONS} \
--output_metrics=${OUTPUT_METRICS}
```
### Running evaluation using TF Object Detection API
5. Produce tf.Example files suitable for running inference: {value=5}
```
RAW_IMAGES_DIR=/path/to/raw_images_location
OUTPUT_DIR=/path/to/output_tfrecords
python object_detection/dataset_tools/create_oid_tf_record.py \
--input_box_annotations_csv ${BOUNDING_BOXES}_expanded.csv \
--input_image_label_annotations_csv ${IMAGE_LABELS}_expanded.csv \
--input_images_directory ${RAW_IMAGES_DIR} \
--input_label_map object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt \
--output_tf_record_path_prefix ${OUTPUT_DIR} \
--num_shards=100
```
6. Run inference with your model and fill in the corresponding fields of the tf.Examples: see [this tutorial](object_detection/g3doc/oid_inference_and_evaluation.md) on running inference with Tensorflow Object Detection API models. {value=6}
7. Finally, run the evaluation script to produce the final evaluation results:
```
INPUT_TFRECORDS_WITH_DETECTIONS=/path/to/tf_records_with_detections
OUTPUT_CONFIG_DIR=/path/to/configs
echo "
label_map_path: 'object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt'
tf_record_input_reader: { input_path: '${INPUT_TFRECORDS_WITH_DETECTIONS}' }
" > ${OUTPUT_CONFIG_DIR}/input_config.pbtxt
echo "
metrics_set: 'oid_challenge_object_detection_metrics'
" > ${OUTPUT_CONFIG_DIR}/eval_config.pbtxt
OUTPUT_METRICS_DIR=/path/to/metrics_csv
python object_detection/metrics/offline_eval_map_corloc.py \
--eval_dir=${OUTPUT_METRICS_DIR} \
--eval_config_path=${OUTPUT_CONFIG_DIR}/eval_config.pbtxt \
--input_config_path=${OUTPUT_CONFIG_DIR}/input_config.pbtxt
```
The result of the evaluation will be stored in `${OUTPUT_METRICS_DIR}/metrics.csv`.
For the Object Detection Track, the participants will be ranked on:
- "OpenImagesChallenge2018_Precision/mAP@0.5IOU"
## Visual Relationships Detection Track
The [Visual Relationships Detection metrics](https://storage.googleapis.com/openimages/web/vrd_detection_metric.html) can be directly evaluated using the ground-truth data and model predictions. The evaluation metric implementation is available in the classes `VRDRelationDetectionEvaluator` and `VRDPhraseDetectionEvaluator`.
1. Download the ground-truth [visual relationships annotations](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-vrd.csv) and [image-level labels](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-vrd-labels.csv).
2. Filter the rows corresponding to the validation set images you want to use and store the results in the same CSV format.
3. Run the following command to produce the final metrics:
```
INPUT_ANNOTATIONS_BOXES=/path/to/challenge-2018-train-vrd.csv
INPUT_ANNOTATIONS_LABELS=/path/to/challenge-2018-train-vrd-labels.csv
INPUT_PREDICTIONS=/path/to/predictions.csv
INPUT_CLASS_LABELMAP=/path/to/oid_object_detection_challenge_500_label_map.pbtxt
INPUT_RELATIONSHIP_LABELMAP=/path/to/relationships_labelmap.pbtxt
OUTPUT_METRICS=/path/to/output/metrics/file
echo "item { name: '/m/02gy9n' id: 602 display_name: 'Transparent' }
item { name: '/m/05z87' id: 603 display_name: 'Plastic' }
item { name: '/m/0dnr7' id: 604 display_name: '(made of)Textile' }
item { name: '/m/04lbp' id: 605 display_name: '(made of)Leather' }
item { name: '/m/083vt' id: 606 display_name: 'Wooden'}
">>${INPUT_CLASS_LABELMAP}
echo "item { name: 'at' id: 1 display_name: 'at' }
item { name: 'on' id: 2 display_name: 'on (top of)' }
item { name: 'holds' id: 3 display_name: 'holds' }
item { name: 'plays' id: 4 display_name: 'plays' }
item { name: 'interacts_with' id: 5 display_name: 'interacts with' }
item { name: 'wears' id: 6 display_name: 'wears' }
item { name: 'is' id: 7 display_name: 'is' }
item { name: 'inside_of' id: 8 display_name: 'inside of' }
item { name: 'under' id: 9 display_name: 'under' }
item { name: 'hits' id: 10 display_name: 'hits' }
"> ${INPUT_RELATIONSHIP_LABELMAP}
python object_detection/metrics/oid_vrd_challenge_evaluation.py \
--input_annotations_boxes=${INPUT_ANNOTATIONS_BOXES} \
--input_annotations_labels=${INPUT_ANNOTATIONS_LABELS} \
--input_predictions=${INPUT_PREDICTIONS} \
--input_class_labelmap=${INPUT_CLASS_LABELMAP} \
--input_relationship_labelmap=${INPUT_RELATIONSHIP_LABELMAP} \
--output_metrics=${OUTPUT_METRICS}
```
The participants of the challenge will be evaluated by a weighted average of the following three metrics:
- "VRDMetric_Relationships_mAP@0.5IOU"
- "VRDMetric_Relationships_Recall@50@0.5IOU"
- "VRDMetric_Phrases_mAP@0.5IOU"