Unverified commit 32e7d660 authored by pkulzc, committed by GitHub

Open Images Challenge 2018 tools, minor fixes and refactors. (#4661)

* Merged commit includes the following changes:
202804536  by Zhichao Lu:

    Return a tf.data.Dataset from the input_fn that goes into the estimator, and use the PER_HOST_V2 option for the TPU input pipeline config.

    This change shaves off 100 ms per step, resulting in 25 minutes less total training time for ssd mobilenet v1 (15k steps to convergence).
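    A minimal sketch of this pattern under the TF 1.x `tf.contrib.tpu` API (the parse function and file pattern below are illustrative placeholders, not part of this change):

```python
import tensorflow as tf

def _parse_fn(serialized):
  # Placeholder decode; the real pipeline uses the object detection decoder.
  return tf.parse_single_example(
      serialized, {'image/encoded': tf.FixedLenFeature((), tf.string)})

def input_fn(params):
  # With PER_HOST_V2, TPUEstimator calls input_fn once per host and passes
  # the per-host batch size via `params`; return the Dataset itself.
  dataset = tf.data.TFRecordDataset(tf.gfile.Glob('/tmp/train-*.record'))
  dataset = dataset.map(_parse_fn).repeat()
  return dataset.apply(
      tf.contrib.data.batch_and_drop_remainder(params['batch_size']))

run_config = tf.contrib.tpu.RunConfig(
    tpu_config=tf.contrib.tpu.TPUConfig(
        iterations_per_loop=100,
        per_host_input_for_training=(
            tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2)))
```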

--
202769340  by Zhichao Lu:

    Adding as_matrix() transformation for image-level labels.

--
202768721  by Zhichao Lu:

    Challenge evaluation protocol modification: adding labelmaps creation.

--
202750966  by Zhichao Lu:

    Add the explicit names to two output nodes.

--
202732783  by Zhichao Lu:

    Enforcing that batch size is 1 for evaluation, and no original images are retained during evaluation when use_tpu=False (to avoid dynamic shapes).

--
202425430  by Zhichao Lu:

    Refactor input pipeline to improve performance.

--
202406389  by Zhichao Lu:

    Only check the validity of `warmup_learning_rate` if it will be used.

--
202330450  by Zhichao Lu:

    Adding the description of the flag input_image_label_annotations_csv to add
      image-level labels to tf.Example.

--
202029012  by Zhichao Lu:

    Enabling display of the relationship name in the final metrics output.

--
202024010  by Zhichao Lu:

    Update to the public README.

--
201999677  by Zhichao Lu:

    Fixing the way negative labels are handled in VRD evaluation.

--
201962313  by Zhichao Lu:

    Fix a bug in resize_to_range.

--
201808488  by Zhichao Lu:

    Update ssd_inception_v2_pets.config to use the right filenames for the pets dataset TFRecords.

--
201779225  by Zhichao Lu:

    Update object detection API installation doc

--
201766518  by Zhichao Lu:

    Add shell script to create pycocotools package for CMLE.

--
201722377  by Zhichao Lu:

    Removes verified_labels field and uses groundtruth_image_classes field instead.

--
201616819  by Zhichao Lu:

    Disable eval_on_tpu since eval_metrics is not set up to execute on TPU.
    Do not use run_config.task_type to switch TPU mode for EVAL,
    since that won't work in a unit test.
    Expand the unit test to verify that the same instantiation of the Estimator can disable eval on TPU while training remains enabled on TPU.

--
201524716  by Zhichao Lu:

    Disable model export to TPU; inference is not compatible with TPU.
    Add GOOGLE_INTERNAL support in the object detection copy.bara.sky.

--
201453347  by Zhichao Lu:

    Fixing bug when evaluating the quantized model.

--
200795826  by Zhichao Lu:

    Fixing parsing bug: image-level labels were parsed as tuples instead of
    numpy arrays.

--
200746134  by Zhichao Lu:

    Adding image_class_text and image_class_label fields into tf_example_decoder.py

--
200743003  by Zhichao Lu:

    Changes to model_main.py and model_tpu_main to enable training and continuous eval.
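    The continuous eval setup follows the standard `tf.estimator.train_and_evaluate` loop; a self-contained toy sketch of that pattern (the model and input below are stand-ins, not the actual object detection model_fn):

```python
import tensorflow as tf

def model_fn(features, labels, mode):
  # Trivial stand-in model; model_main.py builds the real model_fn from the
  # pipeline config.
  w = tf.get_variable('w', initializer=1.0)
  loss = tf.reduce_mean(tf.square(w * features['x']))
  train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
      loss, global_step=tf.train.get_or_create_global_step())
  return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

def input_fn():
  return tf.data.Dataset.from_tensors(({'x': [1.0]}, 0.0)).repeat().batch(8)

estimator = tf.estimator.Estimator(model_fn=model_fn)
tf.estimator.train_and_evaluate(
    estimator,
    tf.estimator.TrainSpec(input_fn=input_fn, max_steps=100),
    tf.estimator.EvalSpec(input_fn=input_fn, steps=10, throttle_secs=0))
```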

--
200736324  by Zhichao Lu:

    Replace deprecated squeeze_dims argument.
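    The rename is mechanical, as in the preprocessor change further down this diff:

```python
import tensorflow as tf

images = tf.zeros([1, 4, 5, 3])
image = tf.squeeze(images, axis=[0])            # current argument name
# image = tf.squeeze(images, squeeze_dims=[0])  # deprecated spelling
```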

--
200730072  by Zhichao Lu:

    Make detections only during predict and eval modes when creating the model function.

--
200729699  by Zhichao Lu:

    Minor correction to internal documentation (definition of Huber loss)

--
200727142  by Zhichao Lu:

    Add command line parsing as a set of flags using argparse, and add a header
    to the resulting file.

--
200726169  by Zhichao Lu:

    A tutorial on running evaluation for the Open Images Challenge 2018.

--
200665093  by Zhichao Lu:

    Cleanup on variables_helper_test.py.

--
200652145  by Zhichao Lu:

    Add an option to write (non-frozen) graph when exporting inference graph.
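    Judging from the exporter changes below, the option is exposed as a `write_inference_graph` keyword argument and a matching command-line flag; a sketch of the library call (config and checkpoint paths are placeholders):

```python
from google.protobuf import text_format
from object_detection import exporter
from object_detection.protos import pipeline_pb2

pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
with open('/path/to/pipeline.config') as f:
  text_format.Merge(f.read(), pipeline_config)

exporter.export_inference_graph(
    input_type='image_tensor',
    pipeline_config=pipeline_config,
    trained_checkpoint_prefix='/path/to/model.ckpt',
    output_directory='/path/to/exported_model',
    write_inference_graph=True)  # also writes inference_graph.pbtxt
```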

--
200573810  by Zhichao Lu:

    Update ssd_mobilenet_v1_coco and ssd_inception_v2_coco download links to point to a newer version.

--
200498014  by Zhichao Lu:

    Add test for groundtruth mask resizing.

--
200453245  by Zhichao Lu:

    Cleaning up exporting_models.md along with exporting scripts

--
200311747  by Zhichao Lu:

    Resize groundtruth mask to match the size of the original image.

--
200287269  by Zhichao Lu:

    Add an option to use a custom MatMul-based crop_and_resize op as an alternative to the TF op in Faster R-CNN.

--
200127859  by Zhichao Lu:

    Updating the instructions to run locally with the new binary. Also updating the pets configs, since file path naming has changed.

--
200127044  by Zhichao Lu:

    A simpler evaluation util to compute Open Images Challenge
    2018 metric (object detection track).

--
200124019  by Zhichao Lu:

    Freshening up configuring_jobs.md

--
200086825  by Zhichao Lu:

    Make merge_multiple_label_boxes work for ssd model.

--
199843258  by Zhichao Lu:

    Allows inconsistent feature channels to be compatible with WeightSharedConvolutionalBoxPredictor.

--
199676082  by Zhichao Lu:

    Enable an override for `InputReader.shuffle` for object detection pipelines.

--
199599212  by Zhichao Lu:

    Markdown fixes.

--
199535432  by Zhichao Lu:

    Pass num_additional_channels to tf.example decoder in predict_input_fn.

--
199399439  by Zhichao Lu:

    Adding `num_additional_channels` field to specify how many additional channels to use in the model.
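    As the `dataset_builder` changes below show, the field lives on the `InputReader` proto and is read via `input_reader_config.num_additional_channels`; a sketch of configuring it (the record path is a placeholder):

```python
from google.protobuf import text_format
from object_detection.builders import dataset_builder
from object_detection.protos import input_reader_pb2

input_reader_text = """
  shuffle: false
  num_readers: 1
  num_additional_channels: 2
  tf_record_input_reader { input_path: '/path/to/records' }
"""
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text, input_reader_proto)
dataset = dataset_builder.build(input_reader_proto, batch_size=2)
```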

--

PiperOrigin-RevId: 202804536

* Add original model builder and docs back.
parent 86ac7a47
@@ -72,6 +72,8 @@ Extras:
Inference and evaluation on the Open Images dataset</a><br>
* <a href='g3doc/instance_segmentation.md'>
Run an instance segmentation model</a><br>
* <a href='g3doc/challenge_evaluation.md'>
Run the evaluation for the Open Images Challenge 2018.</a><br>
## Getting Help
@@ -90,6 +92,20 @@ reporting an issue.
## Release information
### June 25, 2018
Additional evaluation tools for the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) are out.
Check out our short tutorial on data preparation and running evaluation [here](g3doc/challenge_evaluation.md)!
<b>Thanks to contributors</b>: Alina Kuznetsova
### June 5, 2018
We have released the implementation of evaluation metrics for both tracks of the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html) as a part of the Object Detection API - see the [evaluation protocols](g3doc/evaluation_protocols.md) for more details.
Additionally, we have released a tool for hierarchical labels expansion for the Open Images Challenge: check out [oid_hierarchical_labels_expansion.py](dataset_tools/oid_hierarchical_labels_expansion.py).
<b>Thanks to contributors</b>: Alina Kuznetsova, Vittorio Ferrari, Jasper Uijlings
### April 30, 2018
We have released a Faster R-CNN detector with ResNet-101 feature extractor trained on [AVA](https://research.google.com/ava/) v2.1.
@@ -24,111 +24,66 @@ that wraps the build function.
import functools
import tensorflow as tf
from object_detection.core import standard_fields as fields
from object_detection.data_decoders import tf_example_decoder
from object_detection.protos import input_reader_pb2
from object_detection.utils import dataset_util
def _get_padding_shapes(dataset, max_num_boxes=None, num_classes=None,
spatial_image_shape=None):
"""Returns shapes to pad dataset tensors to before batching.
def make_initializable_iterator(dataset):
"""Creates an iterator, and initializes tables.
This is useful in cases where make_one_shot_iterator wouldn't work because
the graph contains a hash table that needs to be initialized.
Args:
dataset: tf.data.Dataset object.
max_num_boxes: Max number of groundtruth boxes needed to computes shapes for
padding.
num_classes: Number of classes in the dataset needed to compute shapes for
padding.
spatial_image_shape: A list of two integers of the form [height, width]
containing expected spatial shape of the image.
dataset: A `tf.data.Dataset` object.
Returns:
A dictionary keyed by fields.InputDataFields containing padding shapes for
tensors in the dataset.
Raises:
ValueError: If groundtruth classes is neither rank 1 nor rank 2.
A `tf.data.Iterator`.
"""
iterator = dataset.make_initializable_iterator()
tf.add_to_collection(tf.GraphKeys.TABLE_INITIALIZERS, iterator.initializer)
return iterator
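# Illustrative usage (a sketch, mirroring the tests further below): chain this
# helper with `build` to obtain input tensors,
#   tensors = make_initializable_iterator(
#       build(input_reader_proto, batch_size=1)).get_next()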
if not spatial_image_shape or spatial_image_shape == [-1, -1]:
height, width = None, None
else:
height, width = spatial_image_shape # pylint: disable=unpacking-non-sequence
num_additional_channels = 0
if fields.InputDataFields.image_additional_channels in dataset.output_shapes:
num_additional_channels = dataset.output_shapes[
fields.InputDataFields.image_additional_channels].dims[2].value
padding_shapes = {
# Additional channels are merged before batching.
fields.InputDataFields.image: [
height, width, 3 + num_additional_channels
],
fields.InputDataFields.image_additional_channels: [
height, width, num_additional_channels
],
fields.InputDataFields.source_id: [],
fields.InputDataFields.filename: [],
fields.InputDataFields.key: [],
fields.InputDataFields.groundtruth_difficult: [max_num_boxes],
fields.InputDataFields.groundtruth_boxes: [max_num_boxes, 4],
fields.InputDataFields.groundtruth_instance_masks: [
max_num_boxes, height, width
],
fields.InputDataFields.groundtruth_is_crowd: [max_num_boxes],
fields.InputDataFields.groundtruth_group_of: [max_num_boxes],
fields.InputDataFields.groundtruth_area: [max_num_boxes],
fields.InputDataFields.groundtruth_weights: [max_num_boxes],
fields.InputDataFields.num_groundtruth_boxes: [],
fields.InputDataFields.groundtruth_label_types: [max_num_boxes],
fields.InputDataFields.groundtruth_label_scores: [max_num_boxes],
fields.InputDataFields.true_image_shape: [3],
fields.InputDataFields.multiclass_scores: [
max_num_boxes, num_classes + 1 if num_classes is not None else None
],
}
# Determine whether groundtruth_classes are integers or one-hot encodings, and
# apply batching appropriately.
classes_shape = dataset.output_shapes[
fields.InputDataFields.groundtruth_classes]
if len(classes_shape) == 1: # Class integers.
padding_shapes[fields.InputDataFields.groundtruth_classes] = [max_num_boxes]
elif len(classes_shape) == 2: # One-hot or k-hot encoding.
padding_shapes[fields.InputDataFields.groundtruth_classes] = [
max_num_boxes, num_classes]
else:
raise ValueError('Groundtruth classes must be a rank 1 tensor (classes) or '
'rank 2 tensor (one-hot encodings)')
if fields.InputDataFields.original_image in dataset.output_shapes:
padding_shapes[fields.InputDataFields.original_image] = [
None, None, 3 + num_additional_channels
]
if fields.InputDataFields.groundtruth_keypoints in dataset.output_shapes:
tensor_shape = dataset.output_shapes[fields.InputDataFields.
groundtruth_keypoints]
padding_shape = [max_num_boxes, tensor_shape[1].value,
tensor_shape[2].value]
padding_shapes[fields.InputDataFields.groundtruth_keypoints] = padding_shape
if (fields.InputDataFields.groundtruth_keypoint_visibilities
in dataset.output_shapes):
tensor_shape = dataset.output_shapes[fields.InputDataFields.
groundtruth_keypoint_visibilities]
padding_shape = [max_num_boxes, tensor_shape[1].value]
padding_shapes[fields.InputDataFields.
groundtruth_keypoint_visibilities] = padding_shape
return {tensor_key: padding_shapes[tensor_key]
for tensor_key, _ in dataset.output_shapes.items()}
def build(input_reader_config,
transform_input_data_fn=None,
batch_size=None,
max_num_boxes=None,
num_classes=None,
spatial_image_shape=None,
num_additional_channels=0):
def read_dataset(file_read_func, input_files, config):
"""Reads a dataset, and handles repetition and shuffling.
Args:
file_read_func: Function to use in tf.contrib.data.parallel_interleave, to
read every individual file into a tf.data.Dataset.
input_files: A list of file paths to read.
config: A input_reader_builder.InputReader object.
Returns:
A tf.data.Dataset of (undecoded) tf-records based on config.
"""
# Shard, shuffle, and read files.
filenames = tf.gfile.Glob(input_files)
num_readers = config.num_readers
if num_readers > len(filenames):
num_readers = len(filenames)
tf.logging.warning('num_readers has been reduced to %d to match input file '
'shards.' % num_readers)
filename_dataset = tf.data.Dataset.from_tensor_slices(filenames)
if config.shuffle:
filename_dataset = filename_dataset.shuffle(
config.filenames_shuffle_buffer_size)
elif num_readers > 1:
tf.logging.warning('`shuffle` is false, but the input data stream is '
'still slightly shuffled since `num_readers` > 1.')
filename_dataset = filename_dataset.repeat(config.num_epochs or None)
records_dataset = filename_dataset.apply(
tf.contrib.data.parallel_interleave(
file_read_func,
cycle_length=num_readers,
block_length=config.read_block_length,
sloppy=config.shuffle))
if config.shuffle:
records_dataset = records_dataset.shuffle(config.shuffle_buffer_size)
return records_dataset
def build(input_reader_config, batch_size=None, transform_input_data_fn=None):
"""Builds a tf.data.Dataset.
Builds a tf.data.Dataset by applying the `transform_input_data_fn` on all
@@ -136,17 +91,9 @@ def build(input_reader_config,
Args:
input_reader_config: A input_reader_pb2.InputReader object.
transform_input_data_fn: Function to apply to all records, or None if
no extra decoding is required.
batch_size: Batch size. If None, batching is not performed.
max_num_boxes: Max number of groundtruth boxes needed to compute shapes for
padding. If None, will use a dynamic shape.
num_classes: Number of classes in the dataset needed to compute shapes for
padding. If None, will use a dynamic shape.
spatial_image_shape: A list of two integers of the form [height, width]
containing expected spatial shape of the image after applying
transform_input_data_fn. If None, will use dynamic shapes.
num_additional_channels: Number of additional channels to use in the input.
batch_size: Batch size. If batch size is None, no batching is performed.
transform_input_data_fn: Function to apply transformation to all records,
or None if no extra decoding is required.
Returns:
A tf.data.Dataset based on the input_reader_config.
@@ -173,24 +120,31 @@ def build(input_reader_config,
instance_mask_type=input_reader_config.mask_type,
label_map_proto_file=label_map_proto_file,
use_display_name=input_reader_config.use_display_name,
num_additional_channels=num_additional_channels)
num_additional_channels=input_reader_config.num_additional_channels)
def process_fn(value):
processed = decoder.decode(value)
"""Sets up tf graph that decodes, transforms and pads input data."""
processed_tensors = decoder.decode(value)
if transform_input_data_fn is not None:
return transform_input_data_fn(processed)
return processed
processed_tensors = transform_input_data_fn(processed_tensors)
return processed_tensors
dataset = dataset_util.read_dataset(
dataset = read_dataset(
functools.partial(tf.data.TFRecordDataset, buffer_size=8 * 1000 * 1000),
process_fn, config.input_path[:], input_reader_config)
config.input_path[:], input_reader_config)
# TODO(rathodv): make batch size a required argument once the old binaries
# are deleted.
if batch_size:
num_parallel_calls = batch_size * input_reader_config.num_parallel_batches
else:
num_parallel_calls = input_reader_config.num_parallel_map_calls
dataset = dataset.map(
process_fn,
num_parallel_calls=num_parallel_calls)
if batch_size:
padding_shapes = _get_padding_shapes(dataset, max_num_boxes, num_classes,
spatial_image_shape)
dataset = dataset.apply(
tf.contrib.data.padded_batch_and_drop_remainder(batch_size,
padding_shapes))
tf.contrib.data.batch_and_drop_remainder(batch_size))
dataset = dataset.prefetch(input_reader_config.num_prefetch_batches)
return dataset
raise ValueError('Unsupported input_reader_config.')
@@ -25,7 +25,6 @@ from tensorflow.core.example import feature_pb2
from object_detection.builders import dataset_builder
from object_detection.core import standard_fields as fields
from object_detection.protos import input_reader_pb2
from object_detection.utils import dataset_util
class DatasetBuilderTest(tf.test.TestCase):
@@ -91,7 +90,7 @@ class DatasetBuilderTest(tf.test.TestCase):
""".format(tf_record_path)
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text_proto, input_reader_proto)
tensor_dict = dataset_util.make_initializable_iterator(
tensor_dict = dataset_builder.make_initializable_iterator(
dataset_builder.build(input_reader_proto, batch_size=1)).get_next()
sv = tf.train.Supervisor(logdir=self.get_temp_dir())
@@ -124,7 +123,7 @@ class DatasetBuilderTest(tf.test.TestCase):
""".format(tf_record_path)
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text_proto, input_reader_proto)
tensor_dict = dataset_util.make_initializable_iterator(
tensor_dict = dataset_builder.make_initializable_iterator(
dataset_builder.build(input_reader_proto, batch_size=1)).get_next()
sv = tf.train.Supervisor(logdir=self.get_temp_dir())
@@ -153,14 +152,11 @@ class DatasetBuilderTest(tf.test.TestCase):
tensor_dict[fields.InputDataFields.groundtruth_classes] - 1, depth=3)
return tensor_dict
tensor_dict = dataset_util.make_initializable_iterator(
tensor_dict = dataset_builder.make_initializable_iterator(
dataset_builder.build(
input_reader_proto,
transform_input_data_fn=one_hot_class_encoding_fn,
batch_size=2,
max_num_boxes=2,
num_classes=3,
spatial_image_shape=[4, 5])).get_next()
batch_size=2)).get_next()
sv = tf.train.Supervisor(logdir=self.get_temp_dir())
with sv.prepare_or_wait_for_session() as sess:
@@ -169,17 +165,15 @@ class DatasetBuilderTest(tf.test.TestCase):
self.assertAllEqual([2, 4, 5, 3],
output_dict[fields.InputDataFields.image].shape)
self.assertAllEqual([2, 2, 3],
self.assertAllEqual([2, 1, 3],
output_dict[fields.InputDataFields.groundtruth_classes].
shape)
self.assertAllEqual([2, 2, 4],
self.assertAllEqual([2, 1, 4],
output_dict[fields.InputDataFields.groundtruth_boxes].
shape)
self.assertAllEqual(
[[[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 0.0, 0.0]],
[[0.0, 0.0, 1.0, 1.0],
[0.0, 0.0, 0.0, 0.0]]],
[[[0.0, 0.0, 1.0, 1.0]],
[[0.0, 0.0, 1.0, 1.0]]],
output_dict[fields.InputDataFields.groundtruth_boxes])
def test_build_tf_record_input_reader_with_batch_size_two_and_masks(self):
@@ -201,14 +195,11 @@ class DatasetBuilderTest(tf.test.TestCase):
tensor_dict[fields.InputDataFields.groundtruth_classes] - 1, depth=3)
return tensor_dict
tensor_dict = dataset_util.make_initializable_iterator(
tensor_dict = dataset_builder.make_initializable_iterator(
dataset_builder.build(
input_reader_proto,
transform_input_data_fn=one_hot_class_encoding_fn,
batch_size=2,
max_num_boxes=2,
num_classes=3,
spatial_image_shape=[4, 5])).get_next()
batch_size=2)).get_next()
sv = tf.train.Supervisor(logdir=self.get_temp_dir())
with sv.prepare_or_wait_for_session() as sess:
@@ -216,34 +207,9 @@ class DatasetBuilderTest(tf.test.TestCase):
output_dict = sess.run(tensor_dict)
self.assertAllEqual(
[2, 2, 4, 5],
[2, 1, 4, 5],
output_dict[fields.InputDataFields.groundtruth_instance_masks].shape)
def test_build_tf_record_input_reader_with_additional_channels(self):
tf_record_path = self.create_tf_record(has_additional_channels=True)
input_reader_text_proto = """
shuffle: false
num_readers: 1
tf_record_input_reader {{
input_path: '{0}'
}}
""".format(tf_record_path)
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text_proto, input_reader_proto)
tensor_dict = dataset_util.make_initializable_iterator(
dataset_builder.build(
input_reader_proto, batch_size=2,
num_additional_channels=2)).get_next()
sv = tf.train.Supervisor(logdir=self.get_temp_dir())
with sv.prepare_or_wait_for_session() as sess:
sv.start_queue_runners(sess)
output_dict = sess.run(tensor_dict)
self.assertEquals((2, 4, 5, 5),
output_dict[fields.InputDataFields.image].shape)
def test_raises_error_with_no_input_paths(self):
input_reader_text_proto = """
shuffle: false
@@ -253,7 +219,114 @@ class DatasetBuilderTest(tf.test.TestCase):
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text_proto, input_reader_proto)
with self.assertRaises(ValueError):
dataset_builder.build(input_reader_proto)
dataset_builder.build(input_reader_proto, batch_size=1)
class ReadDatasetTest(tf.test.TestCase):
def setUp(self):
self._path_template = os.path.join(self.get_temp_dir(), 'examples_%s.txt')
for i in range(5):
path = self._path_template % i
with tf.gfile.Open(path, 'wb') as f:
f.write('\n'.join([str(i + 1), str((i + 1) * 10)]))
self._shuffle_path_template = os.path.join(self.get_temp_dir(),
'shuffle_%s.txt')
for i in range(2):
path = self._shuffle_path_template % i
with tf.gfile.Open(path, 'wb') as f:
f.write('\n'.join([str(i)] * 5))
def _get_dataset_next(self, files, config, batch_size):
def decode_func(value):
return [tf.string_to_number(value, out_type=tf.int32)]
dataset = dataset_builder.read_dataset(
tf.data.TextLineDataset, files, config)
dataset = dataset.map(decode_func)
dataset = dataset.batch(batch_size)
return dataset.make_one_shot_iterator().get_next()
def test_make_initializable_iterator_with_hashTable(self):
keys = [1, 0, -1]
dataset = tf.data.Dataset.from_tensor_slices([[1, 2, -1, 5]])
table = tf.contrib.lookup.HashTable(
initializer=tf.contrib.lookup.KeyValueTensorInitializer(
keys=keys,
values=list(reversed(keys))),
default_value=100)
dataset = dataset.map(table.lookup)
data = dataset_builder.make_initializable_iterator(dataset).get_next()
init = tf.tables_initializer()
with self.test_session() as sess:
sess.run(init)
self.assertAllEqual(sess.run(data), [-1, 100, 1, 100])
def test_read_dataset(self):
config = input_reader_pb2.InputReader()
config.num_readers = 1
config.shuffle = False
data = self._get_dataset_next([self._path_template % '*'], config,
batch_size=20)
with self.test_session() as sess:
self.assertAllEqual(sess.run(data),
[[1, 10, 2, 20, 3, 30, 4, 40, 5, 50, 1, 10, 2, 20, 3,
30, 4, 40, 5, 50]])
def test_reduce_num_reader(self):
config = input_reader_pb2.InputReader()
config.num_readers = 10
config.shuffle = False
data = self._get_dataset_next([self._path_template % '*'], config,
batch_size=20)
with self.test_session() as sess:
self.assertAllEqual(sess.run(data),
[[1, 10, 2, 20, 3, 30, 4, 40, 5, 50, 1, 10, 2, 20, 3,
30, 4, 40, 5, 50]])
def test_enable_shuffle(self):
config = input_reader_pb2.InputReader()
config.num_readers = 1
config.shuffle = True
tf.set_random_seed(1) # Set graph level seed.
data = self._get_dataset_next(
[self._shuffle_path_template % '*'], config, batch_size=10)
expected_non_shuffle_output = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
with self.test_session() as sess:
self.assertTrue(
np.any(np.not_equal(sess.run(data), expected_non_shuffle_output)))
def test_disable_shuffle_(self):
config = input_reader_pb2.InputReader()
config.num_readers = 1
config.shuffle = False
data = self._get_dataset_next(
[self._shuffle_path_template % '*'], config, batch_size=10)
expected_non_shuffle_output = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
with self.test_session() as sess:
self.assertAllEqual(sess.run(data), [expected_non_shuffle_output])
def test_read_dataset_single_epoch(self):
config = input_reader_pb2.InputReader()
config.num_epochs = 1
config.num_readers = 1
config.shuffle = False
data = self._get_dataset_next([self._path_template % '0'], config,
batch_size=30)
with self.test_session() as sess:
# First batch will retrieve as much as it can, second batch will fail.
self.assertAllEqual(sess.run(data), [[1, 10]])
self.assertRaises(tf.errors.OutOfRangeError, sess.run, data)
if __name__ == '__main__':
@@ -840,7 +840,9 @@ class WeightSharedConvolutionalBoxPredictor(BoxPredictor):
Args:
image_features: A list of float tensors of shape [batch_size, height_i,
width_i, channels] containing features for a batch of images. Note that
all tensors in the list must have the same number of channels.
when not all tensors in the list have the same number of channels, an
additional projection layer will be added on top of the tensor to generate a
feature map with a number of channels consistent with the majority.
num_predictions_per_location_list: A list of integers representing the
number of box predictions to be made per spatial location for each
feature map. Note that all values must be the same since the weights are
@@ -869,11 +871,17 @@ class WeightSharedConvolutionalBoxPredictor(BoxPredictor):
feature_channels = [
image_feature.shape[3].value for image_feature in image_features
]
if len(set(feature_channels)) > 1:
raise ValueError('all feature maps must have the same number of '
'channels, found: {}'.format(feature_channels))
has_different_feature_channels = len(set(feature_channels)) > 1
if has_different_feature_channels:
inserted_layer_counter = 0
target_channel = max(set(feature_channels), key=feature_channels.count)
tf.logging.info('Not all feature maps have the same number of '
'channels, found: {}, adding projection layers '
'to bring all feature maps to uniform channels '
'of {}'.format(feature_channels, target_channel))
box_encodings_list = []
class_predictions_list = []
num_class_slots = self.num_classes + 1
for feature_index, (image_feature,
num_predictions_per_location) in enumerate(
zip(image_features,
@@ -881,11 +889,28 @@ class WeightSharedConvolutionalBoxPredictor(BoxPredictor):
# Add a slot for the background class.
with tf.variable_scope('WeightSharedConvolutionalBoxPredictor',
reuse=tf.AUTO_REUSE):
num_class_slots = self.num_classes + 1
box_encodings_net = image_feature
class_predictions_net = image_feature
with slim.arg_scope(self._conv_hyperparams_fn()) as sc:
apply_batch_norm = _arg_scope_func_key(slim.batch_norm) in sc
# Insert an additional projection layer if necessary.
if (has_different_feature_channels and
image_feature.shape[3].value != target_channel):
image_feature = slim.conv2d(
image_feature,
target_channel, [1, 1],
stride=1,
padding='SAME',
activation_fn=None,
normalizer_fn=(tf.identity if apply_batch_norm else None),
scope='ProjectionLayer/conv2d_{}'.format(
inserted_layer_counter))
if apply_batch_norm:
image_feature = slim.batch_norm(
image_feature,
scope='ProjectionLayer/conv2d_{}/BatchNorm'.format(
inserted_layer_counter))
inserted_layer_counter += 1
box_encodings_net = image_feature
class_predictions_net = image_feature
for i in range(self._num_layers_before_predictor):
box_encodings_net = slim.conv2d(
box_encodings_net,
@@ -565,6 +565,38 @@ class WeightSharedConvolutionalBoxPredictorTest(test_case.TestCase):
self.assertAllEqual(class_predictions_with_background.shape,
[4, 640, num_classes_without_background+1])
def test_get_multi_class_predictions_from_feature_maps_of_different_depth(
self):
num_classes_without_background = 6
def graph_fn(image_features1, image_features2, image_features3):
conv_box_predictor = box_predictor.WeightSharedConvolutionalBoxPredictor(
is_training=False,
num_classes=num_classes_without_background,
conv_hyperparams_fn=self._build_arg_scope_with_conv_hyperparams(),
depth=32,
num_layers_before_predictor=1,
box_code_size=4)
box_predictions = conv_box_predictor.predict(
[image_features1, image_features2, image_features3],
num_predictions_per_location=[5, 5, 5],
scope='BoxPredictor')
box_encodings = tf.concat(
box_predictions[box_predictor.BOX_ENCODINGS], axis=1)
class_predictions_with_background = tf.concat(
box_predictions[box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND],
axis=1)
return (box_encodings, class_predictions_with_background)
image_features1 = np.random.rand(4, 8, 8, 64).astype(np.float32)
image_features2 = np.random.rand(4, 8, 8, 64).astype(np.float32)
image_features3 = np.random.rand(4, 8, 8, 32).astype(np.float32)
(box_encodings, class_predictions_with_background) = self.execute(
graph_fn, [image_features1, image_features2, image_features3])
self.assertAllEqual(box_encodings.shape, [4, 960, 4])
self.assertAllEqual(class_predictions_with_background.shape,
[4, 960, num_classes_without_background+1])
def test_predictions_from_multiple_feature_maps_share_weights_not_batchnorm(
self):
num_classes_without_background = 6
@@ -120,7 +120,7 @@ class WeightedSmoothL1LocalizationLoss(Loss):
"""Smooth L1 localization loss function aka Huber Loss..
The smooth L1_loss is defined elementwise as .5 x^2 if |x| <= delta and
0.5 x^2 + delta * (|x|-delta) otherwise, where x is the difference between
delta * (|x|- 0.5*delta) otherwise, where x is the difference between
predictions and target.
See also Equation (3) in the Fast R-CNN paper by Ross Girshick (ICCV 2015)
@@ -2207,10 +2207,10 @@ def resize_to_range(image,
new_size[:-1],
method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
align_corners=align_corners)
new_masks = tf.squeeze(new_masks, 3)
if pad_to_max_dimension:
new_masks = tf.image.pad_to_bounding_box(
new_masks, 0, 0, max_dimension, max_dimension)
new_masks = tf.squeeze(new_masks, 3)
result.append(new_masks)
result.append(new_size)
@@ -3136,7 +3136,7 @@ def preprocess(tensor_dict,
images = tensor_dict[fields.InputDataFields.image]
if len(images.get_shape()) != 4:
raise ValueError('images in tensor_dict should be rank 4')
image = tf.squeeze(images, squeeze_dims=[0])
image = tf.squeeze(images, axis=0)
tensor_dict[fields.InputDataFields.image] = image
# Preprocess inputs based on preprocess_options
@@ -2377,6 +2377,40 @@ class PreprocessorTest(tf.test.TestCase):
self.assertAllEqual(out_masks.get_shape().as_list(), expected_mask_shape)
self.assertAllEqual(out_image.get_shape().as_list(), expected_image_shape)
def testResizeToRangeWithMasksAndPadToMaxDimension(self):
"""Tests image resizing, checking output sizes."""
in_image_shape_list = [[60, 40, 3], [15, 30, 3]]
in_masks_shape_list = [[15, 60, 40], [10, 15, 30]]
min_dim = 50
max_dim = 100
expected_image_shape_list = [[100, 100, 3], [100, 100, 3]]
expected_masks_shape_list = [[15, 100, 100], [10, 100, 100]]
for (in_image_shape,
expected_image_shape, in_masks_shape, expected_mask_shape) in zip(
in_image_shape_list, expected_image_shape_list,
in_masks_shape_list, expected_masks_shape_list):
in_image = tf.placeholder(tf.float32, shape=(None, None, 3))
in_masks = tf.placeholder(tf.float32, shape=(None, None, None))
out_image, out_masks, _ = preprocessor.resize_to_range(
in_image,
in_masks,
min_dimension=min_dim,
max_dimension=max_dim,
pad_to_max_dimension=True)
out_image_shape = tf.shape(out_image)
out_masks_shape = tf.shape(out_masks)
with self.test_session() as sess:
out_image_shape, out_masks_shape = sess.run(
[out_image_shape, out_masks_shape],
feed_dict={
in_image: np.random.randn(*in_image_shape),
in_masks: np.random.randn(*in_masks_shape)
})
self.assertAllEqual(out_image_shape, expected_image_shape)
self.assertAllEqual(out_masks_shape, expected_mask_shape)
def testResizeToRangeWithMasksAndDynamicSpatialShape(self):
"""Tests image resizing, checking output sizes."""
in_image_shape_list = [[60, 40, 3], [15, 30, 3]]
@@ -62,8 +62,6 @@ class InputDataFields(object):
num_groundtruth_boxes: number of groundtruth boxes.
true_image_shapes: true shapes of images in the resized images, as resized
images can be padded with zeros.
verified_labels: list of human-verified image-level labels (note, that a
label can be verified both as positive and negative).
multiclass_scores: the label score per class for each box.
"""
image = 'image'
@@ -91,7 +89,6 @@ class InputDataFields(object):
groundtruth_weights = 'groundtruth_weights'
num_groundtruth_boxes = 'num_groundtruth_boxes'
true_image_shape = 'true_image_shape'
verified_labels = 'verified_labels'
multiclass_scores = 'multiclass_scores'
@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tensorflow Example proto decoder for object detection.
A decoder to decode string tensors containing serialized tensorflow.Example
@@ -156,6 +155,11 @@ class TfExampleDecoder(data_decoder.DataDecoder):
tf.FixedLenFeature((), tf.int64, default_value=1),
'image/width':
tf.FixedLenFeature((), tf.int64, default_value=1),
# Image-level labels.
'image/class/text':
tf.VarLenFeature(tf.string),
'image/class/label':
tf.VarLenFeature(tf.int64),
# Object boxes and classes.
'image/object/bbox/xmin':
tf.VarLenFeature(tf.float32),
@@ -281,10 +285,18 @@ class TfExampleDecoder(data_decoder.DataDecoder):
label_handler = BackupHandler(
LookupTensor('image/object/class/text', table, default_value=''),
slim_example_decoder.Tensor('image/object/class/label'))
image_label_handler = BackupHandler(
LookupTensor(
fields.TfExampleFields.image_class_text, table, default_value=''),
slim_example_decoder.Tensor(fields.TfExampleFields.image_class_label))
else:
label_handler = slim_example_decoder.Tensor('image/object/class/label')
image_label_handler = slim_example_decoder.Tensor(
fields.TfExampleFields.image_class_label)
self.items_to_handlers[
fields.InputDataFields.groundtruth_classes] = label_handler
self.items_to_handlers[
fields.InputDataFields.groundtruth_image_classes] = image_label_handler
def decode(self, tf_example_string_tensor):
"""Decodes serialized tensorflow example and returns a tensor dictionary.
@@ -328,6 +340,8 @@ class TfExampleDecoder(data_decoder.DataDecoder):
the keypoints are ordered (y, x).
fields.InputDataFields.groundtruth_instance_masks - 3D float32 tensor of
shape [None, None, None] containing instance masks.
fields.InputDataFields.groundtruth_image_classes - 1D int64 of shape
[None] containing the image-level class labels.
"""
serialized_example = tf.reshape(tf_example_string_tensor, shape=[])
decoder = slim_example_decoder.TFExampleDecoder(self.keys_to_features,
@@ -762,6 +762,57 @@ class TfExampleDecoderTest(tf.test.TestCase):
self.assertTrue(fields.InputDataFields.groundtruth_instance_masks
not in tensor_dict)
def testDecodeImageLabels(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
example = tf.train.Example(
features=tf.train.Features(
feature={
'image/encoded': self._BytesFeature(encoded_jpeg),
'image/format': self._BytesFeature('jpeg'),
'image/class/label': self._Int64Feature([1, 2]),
})).SerializeToString()
example_decoder = tf_example_decoder.TfExampleDecoder()
tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))
with self.test_session() as sess:
tensor_dict = sess.run(tensor_dict)
self.assertTrue(
fields.InputDataFields.groundtruth_image_classes in tensor_dict)
self.assertAllEqual(
tensor_dict[fields.InputDataFields.groundtruth_image_classes],
np.array([1, 2]))
example = tf.train.Example(
features=tf.train.Features(
feature={
'image/encoded': self._BytesFeature(encoded_jpeg),
'image/format': self._BytesFeature('jpeg'),
'image/class/text': self._BytesFeature(['dog', 'cat']),
})).SerializeToString()
label_map_string = """
item {
id:3
name:'cat'
}
item {
id:1
name:'dog'
}
"""
label_map_path = os.path.join(self.get_temp_dir(), 'label_map.pbtxt')
with tf.gfile.Open(label_map_path, 'wb') as f:
f.write(label_map_string)
example_decoder = tf_example_decoder.TfExampleDecoder(
label_map_proto_file=label_map_path)
tensor_dict = example_decoder.decode(tf.convert_to_tensor(example))
with self.test_session() as sess:
sess.run(tf.tables_initializer())
tensor_dict = sess.run(tensor_dict)
self.assertTrue(
fields.InputDataFields.groundtruth_image_classes in tensor_dict)
self.assertAllEqual(
tensor_dict[fields.InputDataFields.groundtruth_image_classes],
np.array([1, 3]))
if __name__ == '__main__':
tf.test.main()
#!/bin/bash
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
# Script to download pycocotools and make package for CMLE jobs.
#
# usage:
# bash object_detection/dataset_tools/create_pycocotools_package.sh \
# /tmp/pycocotools
set -e
if [ -z "$1" ]; then
echo "usage create_pycocotools_package.sh [output dir]"
exit
fi
# Create the output directory.
OUTPUT_DIR="${1%/}"
SCRATCH_DIR="${OUTPUT_DIR}/raw"
mkdir -p "${OUTPUT_DIR}"
mkdir -p "${SCRATCH_DIR}"
cd ${SCRATCH_DIR}
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI && mv ../common ./
sed "s/\.\.\/common/common/g" setup.py > setup.py.updated
cp -f setup.py.updated setup.py
rm setup.py.updated
sed "s/\.\.\/common/common/g" pycocotools/_mask.pyx > _mask.pyx.updated
cp -f _mask.pyx.updated pycocotools/_mask.pyx
rm _mask.pyx.updated
sed "s/import matplotlib\.pyplot as plt/import matplotlib\nmatplotlib\.use\(\'Agg\'\)\nimport matplotlib\.pyplot as plt/g" pycocotools/coco.py > coco.py.updated
cp -f coco.py.updated pycocotools/coco.py
rm coco.py.updated
cd "${OUTPUT_DIR}"
tar -czf pycocotools-2.0.tar.gz -C "${SCRATCH_DIR}/cocoapi/" PythonAPI/
rm -rf ${SCRATCH_DIR}
@@ -12,15 +12,19 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""A class and executable to expand hierarchically image-level labels and boxes.
r"""An executable to expand hierarchically image-level labels and boxes.
Example usage:
./hierarchical_labels_expansion <path to JSON hierarchy> <input csv file>
<output csv file> [optional]labels_file
python models/research/object_detection/dataset_tools/\
oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=<path to JSON hierarchy> \
--input_annotations=<input csv file> \
--output_annotations=<output csv file> \
--annotation_type=<1 (for boxes) or 2 (for image-level labels)>
"""
import argparse
import json
import sys
def _update_dict(initial_dict, update):
@@ -80,7 +84,7 @@ class OIDHierarchicalLabelsExpansion(object):
"""Constructor.
Args:
hierarchy: labels hierarchy as JSON file.
hierarchy: labels hierarchy as JSON object.
"""
self._hierarchy_keyed_parent, self._hierarchy_keyed_child, _ = (
@@ -100,14 +104,14 @@ class OIDHierarchicalLabelsExpansion(object):
# Row header is expected to be exactly:
# ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,
# IsTruncated,IsGroupOf,IsDepiction,IsInside
cvs_row_splited = csv_row.split(',')
assert len(cvs_row_splited) == 13
cvs_row_splitted = csv_row.split(',')
assert len(cvs_row_splitted) == 13
result = [csv_row]
assert cvs_row_splited[2] in self._hierarchy_keyed_child
parent_nodes = self._hierarchy_keyed_child[cvs_row_splited[2]]
assert cvs_row_splitted[2] in self._hierarchy_keyed_child
parent_nodes = self._hierarchy_keyed_child[cvs_row_splitted[2]]
for parent_node in parent_nodes:
cvs_row_splited[2] = parent_node
result.append(','.join(cvs_row_splited))
cvs_row_splitted[2] = parent_node
result.append(','.join(cvs_row_splitted))
return result
def expand_labels_from_csv(self, csv_row):
@@ -141,32 +145,55 @@ class OIDHierarchicalLabelsExpansion(object):
return result
def main(argv):
def main(parsed_args):
if len(argv) < 4:
print """Missing arguments. \n
Usage: ./hierarchical_labels_expansion <path to JSON hierarchy>
<input csv file> <output csv file> [optional]labels_file"""
return
with open(argv[1]) as f:
with open(parsed_args.json_hierarchy_file) as f:
hierarchy = json.load(f)
expansion_generator = OIDHierarchicalLabelsExpansion(hierarchy)
labels_file = False
if len(argv) > 4 and argv[4] == 'labels_file':
if parsed_args.annotation_type == 2:
labels_file = True
with open(argv[2], 'r') as source:
with open(argv[3], 'w') as target:
header_skipped = False
elif parsed_args.annotation_type != 1:
print '--annotation_type expected value is 1 or 2.'
return -1
with open(parsed_args.input_annotations, 'r') as source:
with open(parsed_args.output_annotations, 'w') as target:
header = None
for line in source:
if not header_skipped:
header_skipped = True
if not header:
header = line
continue
if labels_file:
expanded_lines = expansion_generator.expand_labels_from_csv(line)
else:
expanded_lines = expansion_generator.expand_boxes_from_csv(line)
expanded_lines = [header] + expanded_lines
target.writelines(expanded_lines)
if __name__ == '__main__':
main(sys.argv)
parser = argparse.ArgumentParser(
description='Hierarchically expand annotations (excluding root node).')
parser.add_argument(
'--json_hierarchy_file',
required=True,
help='Path to the file containing label hierarchy in JSON format.')
parser.add_argument(
'--input_annotations',
required=True,
help="""Path to Open Images annotations file (either bounding boxes or
image-level labels).""")
parser.add_argument(
'--output_annotations',
required=True,
help="""Path to the output file.""")
parser.add_argument(
'--annotation_type',
type=int,
required=True,
help="""Type of the input annotations: 1 - boxes, 2 - image-level
labels"""
)
args = parser.parse_args()
main(args)
@@ -52,7 +52,6 @@ from object_detection.builders import dataset_builder
from object_detection.builders import graph_rewriter_builder
from object_detection.builders import model_builder
from object_detection.utils import config_util
from object_detection.utils import dataset_util
from object_detection.utils import label_map_util
@@ -115,7 +114,7 @@ def main(unused_argv):
is_training=False)
def get_next(config):
return dataset_util.make_initializable_iterator(
return dataset_builder.make_initializable_iterator(
dataset_builder.build(config)).get_next()
create_input_dict_fn = functools.partial(get_next, input_config)
@@ -556,8 +556,16 @@ def result_dict_for_single_example(image,
if groundtruth:
if input_data_fields.groundtruth_instance_masks in groundtruth:
masks = groundtruth[input_data_fields.groundtruth_instance_masks]
masks = tf.expand_dims(masks, 3)
masks = tf.image.resize_images(
masks,
image_shape[1:3],
method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
align_corners=True)
masks = tf.squeeze(masks, 3)
groundtruth[input_data_fields.groundtruth_instance_masks] = tf.cast(
groundtruth[input_data_fields.groundtruth_instance_masks], tf.uint8)
masks, tf.uint8)
output_dict.update(groundtruth)
if scale_to_absolute:
groundtruth_boxes = groundtruth[input_data_fields.groundtruth_boxes]
@@ -641,5 +649,3 @@ def get_eval_metric_ops_for_evaluators(evaluation_metrics,
'Found {} in the evaluation metrics'.format(metric))
return eval_metric_ops
@@ -32,7 +32,7 @@ class EvalUtilTest(tf.test.TestCase):
{'id': 1, 'name': 'dog'},
{'id': 2, 'name': 'cat'}]
def _make_evaluation_dict(self):
def _make_evaluation_dict(self, resized_groundtruth_masks=False):
input_data_fields = fields.InputDataFields
detection_fields = fields.DetectionResultFields
@@ -46,6 +46,8 @@ class EvalUtilTest(tf.test.TestCase):
groundtruth_boxes = tf.constant([[0., 0., 1., 1.]])
groundtruth_classes = tf.constant([1])
groundtruth_instance_masks = tf.ones(shape=[1, 20, 20], dtype=tf.uint8)
if resized_groundtruth_masks:
groundtruth_instance_masks = tf.ones(shape=[1, 10, 10], dtype=tf.uint8)
detections = {
detection_fields.detection_boxes: detection_boxes,
detection_fields.detection_scores: detection_scores,
@@ -99,6 +101,26 @@ class EvalUtilTest(tf.test.TestCase):
self.assertAlmostEqual(1.0, metrics['DetectionBoxes_Precision/mAP'])
self.assertAlmostEqual(1.0, metrics['DetectionMasks_Precision/mAP'])
def test_get_eval_metric_ops_for_coco_detections_and_resized_masks(self):
evaluation_metrics = ['coco_detection_metrics',
'coco_mask_metrics']
categories = self._get_categories_list()
eval_dict = self._make_evaluation_dict(resized_groundtruth_masks=True)
metric_ops = eval_util.get_eval_metric_ops_for_evaluators(
evaluation_metrics, categories, eval_dict)
_, update_op_boxes = metric_ops['DetectionBoxes_Precision/mAP']
_, update_op_masks = metric_ops['DetectionMasks_Precision/mAP']
with self.test_session() as sess:
metrics = {}
for key, (value_op, _) in metric_ops.iteritems():
metrics[key] = value_op
sess.run(update_op_boxes)
sess.run(update_op_masks)
metrics = sess.run(metrics)
self.assertAlmostEqual(1.0, metrics['DetectionBoxes_Precision/mAP'])
self.assertAlmostEqual(1.0, metrics['DetectionMasks_Precision/mAP'])
def test_get_eval_metric_ops_raises_error_with_unsupported_metric(self):
evaluation_metrics = ['unsupported_metrics']
categories = self._get_categories_list()
@@ -16,7 +16,7 @@
r"""Tool to export an object detection model for inference.
Prepares an object detection tensorflow graph for inference using model
configuration and an optional trained checkpoint. Outputs inference
configuration and a trained checkpoint. Outputs inference
graph, associated checkpoint files, a frozen inference graph and a
SavedModel (https://tensorflow.github.io/serving/serving_basic.html).
@@ -59,7 +59,7 @@ python export_inference_graph \
The expected output would be in the directory
path/to/exported_model_directory (which is created if it does not exist)
with contents:
- graph.pbtxt
- inference_graph.pbtxt
- model.ckpt.data-00000-of-00001
- model.ckpt.info
- model.ckpt.meta
@@ -120,6 +120,8 @@ flags.DEFINE_string('output_directory', None, 'Path to write outputs.')
flags.DEFINE_string('config_override', '',
'pipeline_pb2.TrainEvalPipelineConfig '
'text proto to override pipeline_config_path.')
flags.DEFINE_boolean('write_inference_graph', False,
'If true, writes inference graph to disk.')
tf.app.flags.mark_flag_as_required('pipeline_config_path')
tf.app.flags.mark_flag_as_required('trained_checkpoint_prefix')
tf.app.flags.mark_flag_as_required('output_directory')
@@ -140,7 +142,8 @@ def main(_):
input_shape = None
exporter.export_inference_graph(FLAGS.input_type, pipeline_config,
FLAGS.trained_checkpoint_prefix,
FLAGS.output_directory, input_shape)
FLAGS.output_directory, input_shape,
FLAGS.write_inference_graph)
if __name__ == '__main__':
@@ -18,7 +18,6 @@ import logging
import os
import tempfile
import tensorflow as tf
from google.protobuf import text_format
from tensorflow.core.protobuf import saver_pb2
from tensorflow.python import pywrap_tensorflow
from tensorflow.python.client import session
@@ -29,6 +28,7 @@ from tensorflow.python.training import saver as saver_lib
from object_detection.builders import model_builder
from object_detection.core import standard_fields as fields
from object_detection.data_decoders import tf_example_decoder
from object_detection.utils import config_util
slim = tf.contrib.slim
@@ -243,9 +243,7 @@ def _add_output_tensor_nodes(postprocessed_tensors,
masks, name=detection_fields.detection_masks)
for output_key in outputs:
tf.add_to_collection(output_collection_name, outputs[output_key])
if masks is not None:
tf.add_to_collection(output_collection_name,
outputs[detection_fields.detection_masks])
return outputs
@@ -276,7 +274,7 @@ def write_saved_model(saved_model_path,
Args:
saved_model_path: Path to write SavedModel.
frozen_graph_def: tf.GraphDef holding frozen graph.
inputs: The input image tensor to use for detection.
inputs: The input placeholder tensor.
outputs: A tensor dictionary containing the outputs of a DetectionModel.
"""
with tf.Graph().as_default():
@@ -370,7 +368,8 @@ def _export_inference_graph(input_type,
additional_output_tensor_names=None,
input_shape=None,
output_collection_name='inference_op',
graph_hook_fn=None):
graph_hook_fn=None,
write_inference_graph=False):
"""Export helper."""
tf.gfile.MakeDirs(output_directory)
frozen_graph_path = os.path.join(output_directory,
@@ -408,6 +407,14 @@
model_path=model_path,
input_saver_def=input_saver_def,
trained_checkpoint_prefix=checkpoint_to_use)
if write_inference_graph:
inference_graph_def = tf.get_default_graph().as_graph_def()
inference_graph_path = os.path.join(output_directory,
'inference_graph.pbtxt')
for node in inference_graph_def.node:
node.device = ''
with gfile.GFile(inference_graph_path, 'wb') as f:
f.write(str(inference_graph_def))
if additional_output_tensor_names is not None:
output_node_names = ','.join(outputs.keys()+additional_output_tensor_names)
@@ -434,12 +441,13 @@ def export_inference_graph(input_type,
output_directory,
input_shape=None,
output_collection_name='inference_op',
additional_output_tensor_names=None):
additional_output_tensor_names=None,
write_inference_graph=False):
"""Exports inference graph for the model specified in the pipeline config.
Args:
input_type: Type of input for the graph. Can be one of [`image_tensor`,
`tf_example`].
input_type: Type of input for the graph. Can be one of ['image_tensor',
'encoded_image_string_tensor', 'tf_example'].
pipeline_config: pipeline_pb2.TrainAndEvalPipelineConfig proto.
trained_checkpoint_prefix: Path to the trained checkpoint file.
output_directory: Path to write outputs.
@@ -449,17 +457,20 @@ def export_inference_graph(input_type,
If None, does not add output tensors to a collection.
additional_output_tensor_names: list of additional output
tensors to include in the frozen graph.
write_inference_graph: If true, writes inference graph to disk.
"""
detection_model = model_builder.build(pipeline_config.model,
is_training=False)
_export_inference_graph(input_type, detection_model,
pipeline_config.eval_config.use_moving_averages,
trained_checkpoint_prefix,
output_directory, additional_output_tensor_names,
input_shape, output_collection_name,
graph_hook_fn=None)
_export_inference_graph(
input_type,
detection_model,
pipeline_config.eval_config.use_moving_averages,
trained_checkpoint_prefix,
output_directory,
additional_output_tensor_names,
input_shape,
output_collection_name,
graph_hook_fn=None,
write_inference_graph=write_inference_graph)
pipeline_config.eval_config.use_moving_averages = False
config_text = text_format.MessageToString(pipeline_config)
with tf.gfile.Open(
os.path.join(output_directory, 'pipeline.config'), 'wb') as f:
f.write(config_text)
config_util.save_pipeline_config(pipeline_config, output_directory)
@@ -134,6 +134,26 @@ class ExportInferenceGraphTest(tf.test.TestCase):
self.assertTrue(os.path.exists(os.path.join(
output_directory, 'saved_model', 'saved_model.pb')))
def test_write_inference_graph(self):
tmp_dir = self.get_temp_dir()
trained_checkpoint_prefix = os.path.join(tmp_dir, 'model.ckpt')
self._save_checkpoint_from_mock_model(trained_checkpoint_prefix,
use_moving_averages=False)
with mock.patch.object(
model_builder, 'build', autospec=True) as mock_builder:
mock_builder.return_value = FakeModel()
output_directory = os.path.join(tmp_dir, 'output')
pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
pipeline_config.eval_config.use_moving_averages = False
exporter.export_inference_graph(
input_type='image_tensor',
pipeline_config=pipeline_config,
trained_checkpoint_prefix=trained_checkpoint_prefix,
output_directory=output_directory,
write_inference_graph=True)
self.assertTrue(os.path.exists(os.path.join(
output_directory, 'inference_graph.pbtxt')))
def test_export_graph_with_fixed_size_image_tensor_input(self):
input_shape = [1, 320, 320, 3]
# Open Images Challenge Evaluation
The Object Detection API currently supports several evaluation metrics used in the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html).
In addition, several data processing tools are available. Detailed instructions on using the tools for each track are available below.
**NOTE**: links to external websites in this tutorial may change after the Open Images Challenge 2018 is finished.
## Object Detection Track
The [Object Detection metric](https://storage.googleapis.com/openimages/web/object_detection_metric.html) protocol requires pre-processing of the released data to ensure correct evaluation: the released data contains only the leaf-most bounding box annotations and image-level labels.
The evaluation metric implementation is available in the class `OpenImagesDetectionChallengeEvaluator`.
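As a rough sketch, the evaluator follows the common `DetectionEvaluator` interface from `object_detection.utils.object_detection_evaluation`; the category list, image id and boxes below are made up, and the exact set of groundtruth keys shown is an assumption:

```python
import numpy as np
from object_detection.core import standard_fields
from object_detection.utils import object_detection_evaluation

categories = [{'id': 1, 'name': '/m/01g317'}]  # illustrative category list
evaluator = object_detection_evaluation.OpenImagesDetectionChallengeEvaluator(
    categories)

evaluator.add_single_ground_truth_image_info(
    image_id='img1',
    groundtruth_dict={
        standard_fields.InputDataFields.groundtruth_boxes:
            np.array([[0.1, 0.1, 0.5, 0.5]], dtype=np.float32),
        standard_fields.InputDataFields.groundtruth_classes:
            np.array([1], dtype=np.int64),
        # Verified image-level labels (this commit switches to this field).
        standard_fields.InputDataFields.groundtruth_image_classes:
            np.array([1], dtype=np.int64),
    })
evaluator.add_single_detected_image_info(
    image_id='img1',
    detections_dict={
        standard_fields.DetectionResultFields.detection_boxes:
            np.array([[0.1, 0.1, 0.5, 0.5]], dtype=np.float32),
        standard_fields.DetectionResultFields.detection_scores:
            np.array([0.9], dtype=np.float32),
        standard_fields.DetectionResultFields.detection_classes:
            np.array([1], dtype=np.int64),
    })
print(evaluator.evaluate())  # dict of metric name -> value
```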
1. Download the class hierarchy of the Open Images Challenge 2018 in JSON format from [here](https://storage.googleapis.com/openimages/challenge_2018/bbox_labels_500_hierarchy.json).
2. Download the ground-truth [bounding boxes](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-annotations-bbox.csv) and [image-level labels](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-annotations-human-imagelabels.csv).
3. Filter the rows corresponding to the validation set images you want to use and store the results in the same CSV format.
4. Run the following command to create the hierarchical expansion of the bounding box annotations:
```
HIERARCHY_FILE=/path/to/bbox_labels_500_hierarchy.json
BOUNDING_BOXES=/path/to/challenge-2018-train-annotations-bbox
IMAGE_LABELS=/path/to/challenge-2018-train-annotations-human-imagelabels
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
--input_annotations=${BOUNDING_BOXES}.csv \
--output_annotations=${BOUNDING_BOXES}_expanded.csv \
--annotation_type=1
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
--input_annotations=${IMAGE_LABELS}.csv \
--output_annotations=${IMAGE_LABELS}_expanded.csv \
--annotation_type=2
```
After step 4 you will have produced the ground-truth files suitable for running 'OID Challenge Object Detection Metric 2018' evaluation.

### Running evaluation on CSV files directly

5. If you are not using Tensorflow, you can run evaluation directly using your algorithm's output and the generated ground-truth files: {value=5}

```
INPUT_PREDICTIONS=/path/to/detection_predictions.csv
OUTPUT_METRICS=/path/to/output/metrics/file

python models/research/object_detection/metrics/oid_od_challenge_evaluation.py \
--input_annotations_boxes=${BOUNDING_BOXES}_expanded.csv \
--input_annotations_labels=${IMAGE_LABELS}_expanded.csv \
--input_class_labelmap=object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt \
--input_predictions=${INPUT_PREDICTIONS} \
--output_metrics=${OUTPUT_METRICS}
```
### Running evaluation using TF Object Detection API
5. Produce tf.Example files suitable for running inference: {value=5}
```
RAW_IMAGES_DIR=/path/to/raw_images_location
OUTPUT_DIR=/path/to/output_tfrecords
python object_detection/dataset_tools/create_oid_tf_record.py \
--input_box_annotations_csv ${BOUNDING_BOXES}_expanded.csv \
--input_image_label_annotations_csv ${IMAGE_LABELS}_expanded.csv \
--input_images_directory ${RAW_IMAGES_DIR} \
--input_label_map object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt \
--output_tf_record_path_prefix ${OUTPUT_DIR} \
--num_shards=100
```
6. Run inference with your model and fill in the corresponding fields of the tf.Examples: see [this tutorial](object_detection/g3doc/oid_inference_and_evaluation.md) on running inference with Tensorflow Object Detection API models. {value=6}
7. Finally, run the evaluation script to produce the final evaluation results:
```
INPUT_TFRECORDS_WITH_DETECTIONS=/path/to/tf_records_with_detections
OUTPUT_CONFIG_DIR=/path/to/configs
echo "
label_map_path: 'object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt'
tf_record_input_reader: { input_path: '${INPUT_TFRECORDS_WITH_DETECTIONS}' }
" > ${OUTPUT_CONFIG_DIR}/input_config.pbtxt
echo "
metrics_set: 'oid_challenge_object_detection_metrics'
" > ${OUTPUT_CONFIG_DIR}/eval_config.pbtxt
OUTPUT_METRICS_DIR=/path/to/metrics_csv
python object_detection/metrics/offline_eval_map_corloc.py \
--eval_dir=${OUTPUT_METRICS_DIR} \
--eval_config_path=${OUTPUT_CONFIG_DIR}/eval_config.pbtxt \
--input_config_path=${OUTPUT_CONFIG_DIR}/input_config.pbtxt
```
The result of the evaluation will be stored in `${OUTPUT_METRICS_DIR}/metrics.csv`.
For the Object Detection Track, the participants will be ranked on:
- "OpenImagesChallenge2018_Precision/mAP@0.5IOU"
## Visual Relationships Detection Track
The [Visual Relationships Detection metrics](https://storage.googleapis.com/openimages/web/vrd_detection_metric.html) can be directly evaluated using the ground-truth data and model predictions. The evaluation metric implementation is available in the classes `VRDRelationDetectionEvaluator` and `VRDPhraseDetectionEvaluator`.
1. Download the ground-truth [visual relationships annotations](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-vrd.csv) and [image-level labels](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-vrd-labels.csv).
2. Filter the rows corresponding to the validation set images you want to use and store the results in the same CSV format.
3. Run the following command to produce the final metrics:
```
INPUT_ANNOTATIONS_BOXES=/path/to/challenge-2018-train-vrd.csv
INPUT_ANNOTATIONS_LABELS=/path/to/challenge-2018-train-vrd-labels.csv
INPUT_PREDICTIONS=/path/to/predictions.csv
INPUT_CLASS_LABELMAP=/path/to/oid_object_detection_challenge_500_label_map.pbtxt
INPUT_RELATIONSHIP_LABELMAP=/path/to/relationships_labelmap.pbtxt
OUTPUT_METRICS=/path/to/output/metrics/file
echo "item { name: '/m/02gy9n' id: 602 display_name: 'Transparent' }
item { name: '/m/05z87' id: 603 display_name: 'Plastic' }
item { name: '/m/0dnr7' id: 604 display_name: '(made of)Textile' }
item { name: '/m/04lbp' id: 605 display_name: '(made of)Leather' }
item { name: '/m/083vt' id: 606 display_name: 'Wooden'}
">>${INPUT_CLASS_LABELMAP}
echo "item { name: 'at' id: 1 display_name: 'at' }
item { name: 'on' id: 2 display_name: 'on (top of)' }
item { name: 'holds' id: 3 display_name: 'holds' }
item { name: 'plays' id: 4 display_name: 'plays' }
item { name: 'interacts_with' id: 5 display_name: 'interacts with' }
item { name: 'wears' id: 6 display_name: 'wears' }
item { name: 'is' id: 7 display_name: 'is' }
item { name: 'inside_of' id: 8 display_name: 'inside of' }
item { name: 'under' id: 9 display_name: 'under' }
item { name: 'hits' id: 10 display_name: 'hits' }
"> ${INPUT_RELATIONSHIP_LABELMAP}
python object_detection/metrics/oid_vrd_challenge_evaluation.py \
--input_annotations_boxes=${INPUT_ANNOTATIONS_BOXES} \
--input_annotations_labels=${INPUT_ANNOTATIONS_LABELS} \
--input_predictions=${INPUT_PREDICTIONS} \
--input_class_labelmap=${INPUT_CLASS_LABELMAP} \
--input_relationship_labelmap=${INPUT_RELATIONSHIP_LABELMAP} \
--output_metrics=${OUTPUT_METRICS}
```
The participants of the challenge will be evaluated by a weighted average of the following three metrics:
- "VRDMetric_Relationships_mAP@0.5IOU"
- "VRDMetric_Relationships_Recall@50@0.5IOU"
- "VRDMetric_Phrases_mAP@0.5IOU"