Unverified commit fe748d4a authored by pkulzc and committed by GitHub

Object detection changes: (#7208)

257914648  by lzc:

    Internal changes

--
257525973  by Zhichao Lu:

    Fixes a bug that silently prevents checkpoints from loading when training with eager + functions. Also sets up scripts to run training.

--
257296614  by Zhichao Lu:

    Adding detection_features to model outputs

--
257234565  by Zhichao Lu:

    Fix wrong order of `classes_with_max_scores` in class-agnostic NMS caused by
    sorting in partitioned-NMS.

--
257232002  by ronnyvotel:

    Supporting `filter_nonoverlapping` option in np_box_list_ops.clip_to_window().

--
257198282  by Zhichao Lu:

    Adding the focal loss and l1 loss from the Objects as Points paper.
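    A hedged sketch of the two losses as described in the Objects as Points (CenterNet) paper; this follows the paper's formulation rather than the exact code added here, and the `pred`/`gt` heatmap names are illustrative:

```
import tensorflow as tf

def penalty_reduced_focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-6):
  """Focal loss on keypoint heatmaps (Objects as Points, Eq. 1)."""
  pred = tf.clip_by_value(pred, eps, 1.0 - eps)
  pos_mask = tf.cast(tf.equal(gt, 1.0), tf.float32)  # exact object centers
  neg_mask = 1.0 - pos_mask
  pos_loss = -tf.pow(1.0 - pred, alpha) * tf.log(pred) * pos_mask
  neg_loss = (-tf.pow(1.0 - gt, beta) * tf.pow(pred, alpha) *
              tf.log(1.0 - pred) * neg_mask)
  num_pos = tf.maximum(tf.reduce_sum(pos_mask), 1.0)
  return (tf.reduce_sum(pos_loss) + tf.reduce_sum(neg_loss)) / num_pos

def l1_localization_loss(pred, target, weights):
  """Plain L1 loss over offset/size predictions at object locations."""
  return tf.reduce_sum(tf.abs(pred - target) * weights) / tf.maximum(
      tf.reduce_sum(weights), 1.0)
```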

--
257089535  by Zhichao Lu:

    Create Keras based ssd + resnetv1 + fpn.

--
257087407  by Zhichao Lu:

    Make object_detection/data_decoders Python3-compatible.

--
257004582  by Zhichao Lu:

    Updates _decode_raw_data_into_masks_and_boxes to the latest binary masks-to-string encoding format.

--
257002124  by Zhichao Lu:

    Make object_detection/utils Python3-compatible, except json_utils.

    The patching trick used in json_utils is not going to work in Python 3.

--
256795056  by lzc:

    Add a detection_anchor_indices field to detection outputs.

--
256477542  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
256387593  by Zhichao Lu:

    Edit class_id_function_approximations builder to skip class ids not present in label map.

--
256259039  by Zhichao Lu:

    Move NMS to TPU for FasterRCNN.

--
256071360  by rathodv:

    When multiclass_scores is empty, add one-hot encoding of groundtruth_classes as multiclass scores so that data_augmentation ops that expect the presence of multiclass_scores don't have to individually handle this case.

    Also copy input tensor_dict to out_tensor_dict first to avoid inplace modification.

--
256023645  by Zhichao Lu:

    Adds the first WIP iterations of TensorFlow v2 eager + functions style custom training & evaluation loops.

--
255980623  by Zhichao Lu:

    Adds a new data augmentation operation "remap_labels" which remaps a set of labels to a new label.
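    A minimal, hypothetical sketch of the remapping semantics (names are illustrative; the actual op lives in preprocessor.py):

```
import tensorflow as tf

def remap_labels(labels, original_labels, new_label):
  """Maps every label in `original_labels` to `new_label`; others unchanged."""
  match = tf.reduce_any(
      tf.equal(tf.expand_dims(labels, -1),
               tf.constant(original_labels, dtype=labels.dtype)),
      axis=-1)
  return tf.where(match,
                  tf.fill(tf.shape(labels), tf.constant(new_label, dtype=labels.dtype)),
                  labels)
```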

--
255753259  by Zhichao Lu:

    Announcement of the released evaluation tutorial for Open Images Challenge
    2019.

--
255698776  by lzc:

    Fix the rewrite_nn_resize_op function, which was broken by the TF forward compatibility change.

--
255623150  by Zhichao Lu:

    Add Keras-based ResnetV1 models.

--
255504992  by Zhichao Lu:

    Fixing the typo in specifying label expansion for ground truth segmentation
    file.

--
255470768  by Zhichao Lu:

    1. Fixing Python bug with parsed arguments.
    2. Adding capability to parse relevant columns from CSV header.
    3. Fixing bug with duplicated labels expansion.

--
255462432  by Zhichao Lu:

    Adds a new data augmentation operation "drop_label_probabilistically" which drops a given label with the given probability. This supports experiments on training in the presence of label noise.
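    A hypothetical sketch of the dropping semantics (illustrative only; whether the actual op samples per box or per image may differ):

```
import tensorflow as tf

def drop_label_probabilistically(boxes, labels, dropped_label,
                                 drop_probability, seed=None):
  """Drops boxes carrying `dropped_label`, each with probability `drop_probability`."""
  random_draw = tf.random_uniform(tf.shape(labels), seed=seed)
  dropped = tf.logical_and(tf.equal(labels, dropped_label),
                           random_draw < drop_probability)
  keep = tf.logical_not(dropped)
  return tf.boolean_mask(boxes, keep), tf.boolean_mask(labels, keep)
```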

--
255441632  by rathodv:

    Fallback on groundtruth classes when multiclass_scores tensor is empty.

--
255434899  by Zhichao Lu:

    Ensuring the evaluation binary can run even with big files by synchronizing the
    processing of ground truth and predictions: this way, ground truth is not stored
    but is immediately used for evaluation. In the case of object-mask ground truth,
    this allows evaluations to run on relatively large sets.

--
255337855  by lzc:

    Internal change.

--
255308908  by Zhichao Lu:

    Add comment to clarify usage of calibration parameters proto.

--
255266371  by Zhichao Lu:

    Ensuring correct processing of the case when no groundtruth masks are provided
    for an image.

--
255236648  by Zhichao Lu:

    Refactor model_builder in faster_rcnn.py to use a util_map, so that it can be overridden.

--
255093285  by Zhichao Lu:

    Updating capability to subsample data during evaluation

--
255081222  by rathodv:

    Convert groundtruth masks to type float32 before they are used in the loss function.

    When using mixed precision training, masks are represented using bfloat16 tensors in the input pipeline for performance reasons. We need to convert them to float32 before using them in the loss function.

--
254788436  by Zhichao Lu:

    Add forward_compatible to non_max_suppression_with_scores to make it
    compatible with older TensorFlow versions.

--
254442362  by Zhichao Lu:

    Add num_layer field to ssd feature extractor proto.

--
253911582  by jonathanhuang:

    Plumbs Soft-NMS options (using the new tf.image.non_max_suppression_with_scores op) into the TF Object Detection API.  It adds a `soft_nms_sigma` field to the postprocessing proto file and plumbs this through to both the multiclass and class_agnostic versions of NMS. Note that there is no effect on behavior of NMS when soft_nms_sigma=0 (which it is set to by default).

    See also "Soft-NMS -- Improving Object Detection With One Line of Code" by Bodla et al (https://arxiv.org/abs/1704.04503)

--
253703949  by Zhichao Lu:

    Internal test fixes.

--
253151266  by Zhichao Lu:

    Fix the op type check for FusedBatchNorm, given that we introduced
    FusedBatchNormV3 in a previous change.

--
252718956  by Zhichao Lu:

    Customize activation function to enable relu6 instead of relu for saliency
    prediction model seastarization

--
252158593  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
252150717  by Zhichao Lu:

    Make object_detection/core Python3-compatible.

--
251967048  by Zhichao Lu:

    Make GraphRewriter proto extensible.

--
251950039  by Zhichao Lu:

    Remove experimental_export_device_assignment from TPUEstimator.export_savedmodel(), so as to remove rewrite_for_inference().

    As a replacement, the export_savedmodel() V2 API supports device_assignment, where the user calls tpu.rewrite in model_fn and passes in the device_assignment there.

--
251890697  by rathodv:

    Updated docstring to include new output nodes.

--
251662894  by Zhichao Lu:

    Add an AutoAugment augmentation option to the object detection API codebase. This
    is an available option in preprocessor.py.

    The intended usage is to apply AutoAugment along with random flipping and
    cropping for best results.

--
251532908  by Zhichao Lu:

    Add TrainingDataType enum to track whether class-specific or agnostic data was used to fit the calibration function.

    This is useful, since classes with few observations may require a calibration function fit on all classes.

--
251511339  by Zhichao Lu:

    Add multiclass isotonic regression to the calibration builder.
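    A hedged sketch of per-class isotonic calibration (using scikit-learn purely for illustration; the builder's own implementation and names may differ):

```
import numpy as np
from sklearn.isotonic import IsotonicRegression

def fit_per_class_calibrators(scores, is_true_positive, class_ids, num_classes):
  """Fits one monotone score -> calibrated-score mapping per class."""
  calibrators = {}
  for class_id in range(num_classes):
    mask = class_ids == class_id
    if mask.sum() < 2:
      # Too few observations; a class-agnostic fit is the usual fallback.
      continue
    calibrators[class_id] = IsotonicRegression(out_of_bounds='clip').fit(
        scores[mask], is_true_positive[mask])
  return calibrators
```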

--
251317769  by pengchong:

    Internal Change.

--
250729989  by Zhichao Lu:

    Fixing a bug in the groundtruth statistics count in the case of mask and box annotations.

--
250729627  by Zhichao Lu:

    Label expansion for segmentation.

--
250724905  by Zhichao Lu:

    Fix use_depthwise in fpn and test it with fpnlite on ssd + mobilenet v2.

--
250670379  by Zhichao Lu:

    Internal change

250630364  by lzc:

    Fix detection_model_zoo footnotes

--
250560654  by Zhichao Lu:

    Fix static shape issue in matmul_crop_and_resize.

--
250534857  by Zhichao Lu:

    Edit class agnostic calibration function docstring to more accurately describe the function's outputs.

--
250533277  by Zhichao Lu:

    Edit the multiclass messages to use class ids instead of labels.

--

PiperOrigin-RevId: 257914648
parent 81123ebf
# Open Images Challenge Evaluation
The Object Detection API is currently supporting several evaluation metrics used in the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html).
In addition, several data processing tools are available. Detailed instructions on using the tools for each track are available below.
The Object Detection API is currently supporting several evaluation metrics used
in the
[Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html)
and
[Open Images Challenge 2019](https://storage.googleapis.com/openimages/web/challenge2019.html).
In addition, several data processing tools are available. Detailed instructions
on using the tools for each track are available below.
**NOTE**: links to the external website in this tutorial may change after the Open Images Challenge 2018 is finished.
**NOTE:** all data links are updated to the Open Images Challenge 2019.
## Object Detection Track
The [Object Detection metric](https://storage.googleapis.com/openimages/web/object_detection_metric.html) protocol requires a pre-processing of the released data to ensure correct evaluation. The released data contains only leaf-most bounding box annotations and image-level labels.
The evaluation metric implementation is available in the class `OpenImagesDetectionChallengeEvaluator`.
1. Download class hierarchy of Open Images Challenge 2018 in JSON format from [here](https://storage.googleapis.com/openimages/challenge_2018/bbox_labels_500_hierarchy.json).
2. Download ground-truth [bounding boxes](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-annotations-bbox.csv) and [image-level labels](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-annotations-human-imagelabels.csv).
3. Filter the rows corresponding to the validation set images you want to use and store the results in the same CSV format.
4. Run the following command to create hierarchical expansion of the bounding boxes annotations:
The
[Object Detection metric](https://storage.googleapis.com/openimages/web/evaluation.html#object_detection_eval)
protocol requires a pre-processing of the released data to ensure correct
evaluation. The released data contains only leaf-most bounding box annotations
and image-level labels. The evaluation metric implementation is available in the
class `OpenImagesChallengeEvaluator`.
1. Download
[class hierarchy of Open Images Detection Challenge 2019](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-label500-hierarchy.json)
in JSON format.
2. Download
[ground-truth bounding boxes](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-detection-bbox.csv)
and
[image-level labels](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-detection-human-imagelabels.csv).
3. Run the following command to create hierarchical expansion of the bounding
boxes and image-level label annotations:
```
HIERARCHY_FILE=/path/to/bbox_labels_500_hierarchy.json
BOUNDING_BOXES=/path/to/challenge-2018-train-annotations-bbox
IMAGE_LABELS=/path/to/challenge-2018-train-annotations-human-imagelabels
HIERARCHY_FILE=/path/to/challenge-2019-label500-hierarchy.json
BOUNDING_BOXES=/path/to/challenge-2019-validation-detection-bbox
IMAGE_LABELS=/path/to/challenge-2019-validation-detection-human-imagelabels
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
......@@ -33,13 +47,18 @@ python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--annotation_type=2
```
After step 4 you will have produced the ground-truth files suitable for running 'OID Challenge Object Detection Metric 2018' evaluation.
1. If you are not using Tensorflow, you can run evaluation directly using your
algorithm's output and generated ground-truth files. {value=4}
After step 3 you produced the ground-truth files suitable for running 'OID
Challenge Object Detection Metric 2019' evaluation. To run the evaluation, use
the following command:
```
INPUT_PREDICTIONS=/path/to/detection_predictions.csv
OUTPUT_METRICS=/path/to/output/metrics/file
python models/research/object_detection/metrics/oid_od_challenge_evaluation.py \
python models/research/object_detection/metrics/oid_challenge_evaluation.py \
--input_annotations_boxes=${BOUNDING_BOXES}_expanded.csv \
--input_annotations_labels=${IMAGE_LABELS}_expanded.csv \
--input_class_labelmap=object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt \
......@@ -47,66 +66,99 @@ python models/research/object_detection/metrics/oid_od_challenge_evaluation.py \
--output_metrics=${OUTPUT_METRICS} \
```
### Running evaluation on CSV files directly
For the Object Detection Track, the participants will be ranked on:
5. If you are not using Tensorflow, you can run evaluation directly using your algorithm's output and generated ground-truth files. {value=5}
- "OpenImagesDetectionChallenge_Precision/mAP@0.5IOU"
To use evaluation within Tensorflow training, use metric name
`oid_challenge_detection_metrics` in the evaluation config.
### Running evaluation using TF Object Detection API
## Instance Segmentation Track
5. Produce tf.Example files suitable for running inference: {value=5}
The
[Instance Segmentation metric](https://storage.googleapis.com/openimages/web/evaluation.html#instance_segmentation_eval)
can be directly evaluated using the ground-truth data and model predictions. The
evaluation metric implementation is available in the class
`OpenImagesChallengeEvaluator`.
```
RAW_IMAGES_DIR=/path/to/raw_images_location
OUTPUT_DIR=/path/to/output_tfrecords
python object_detection/dataset_tools/create_oid_tf_record.py \
--input_box_annotations_csv ${BOUNDING_BOXES}_expanded.csv \
--input_image_label_annotations_csv ${IMAGE_LABELS}_expanded.csv \
--input_images_directory ${RAW_IMAGES_DIR} \
--input_label_map object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt \
--output_tf_record_path_prefix ${OUTPUT_DIR} \
--num_shards=100
```
1. Download
[class hierarchy of Open Images Instance Segmentation Challenge 2019](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-label300-segmentable-hierarchy.json)
in JSON format.
2. Download
[ground-truth bounding boxes](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-segmentation-bbox.csv)
and
[image-level labels](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-segmentation-labels.csv).
3. Download instance segmentation files for the validation set (see
[Open Images Challenge Downloads page](https://storage.googleapis.com/openimages/web/challenge2019_downloads.html)).
The download consists of a set of .zip archives containing binary .png
masks.
Those should be transformed into a single CSV file in the format:
ImageID,LabelName,ImageWidth,ImageHeight,XMin,YMin,XMax,YMax,GroupOf,Mask
where Mask is the MS COCO RLE encoding of the binary mask stored in the .png file.
6. Run inference of your model and fill corresponding fields in tf.Example: see [this tutorial](object_detection/g3doc/oid_inference_and_evaluation.md) on running the inference with Tensorflow Object Detection API models. {value=6}
NOTE: the util to make the transformation will be released soon.
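In the meantime, here is a minimal sketch of the expected encoding for a single-channel
binary .png mask (assuming pycocotools and PIL are available; it mirrors the
RLE -> zlib -> base64 order described above, and the exact utility may differ):

```
import base64
import zlib

import numpy as np
from PIL import Image
from pycocotools import mask as coco_mask

def encode_png_mask(png_path):
  """Turns a binary .png mask into the base64(zlib(COCO RLE)) CSV cell value."""
  binary_mask = np.squeeze(np.asarray(Image.open(png_path)) > 0).astype(np.uint8)
  binary_mask = np.asfortranarray(binary_mask.reshape(
      binary_mask.shape[0], binary_mask.shape[1], 1))
  rle = coco_mask.encode(binary_mask)[0]['counts']
  return base64.b64encode(zlib.compress(rle, zlib.Z_BEST_COMPRESSION))
```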
7. Finally, run the evaluation script to produce the final evaluation result.
1. Run the following command to create hierarchical expansion of the instance
segmentation, bounding boxes and image-level label annotations: {value=4}
```
INPUT_TFRECORDS_WITH_DETECTIONS=/path/to/tf_records_with_detections
OUTPUT_CONFIG_DIR=/path/to/configs
HIERARCHY_FILE=/path/to/challenge-2019-label300-hierarchy.json
BOUNDING_BOXES=/path/to/challenge-2019-validation-detection-bbox
IMAGE_LABELS=/path/to/challenge-2019-validation-detection-human-imagelabels
echo "
label_map_path: 'object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt'
tf_record_input_reader: { input_path: '${INPUT_TFRECORDS_WITH_DETECTIONS}' }
" > ${OUTPUT_CONFIG_DIR}/input_config.pbtxt
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
--input_annotations=${BOUNDING_BOXES}.csv \
--output_annotations=${BOUNDING_BOXES}_expanded.csv \
--annotation_type=1
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
--input_annotations=${IMAGE_LABELS}.csv \
--output_annotations=${IMAGE_LABELS}_expanded.csv \
--annotation_type=2
echo "
metrics_set: 'oid_challenge_detection_metrics'
" > ${OUTPUT_CONFIG_DIR}/eval_config.pbtxt
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
--input_annotations=${INSTANCE_SEGMENTATIONS}.csv \
--output_annotations=${INSTANCE_SEGMENTATIONS}_expanded.csv \
--annotation_type=1
```
OUTPUT_METRICS_DIR=/path/to/metrics_csv
1. If you are not using Tensorflow, you can run evaluation directly using your
algorithm's output and generated ground-truth files. {value=4}
python object_detection/metrics/offline_eval_map_corloc.py \
--eval_dir=${OUTPUT_METRICS_DIR} \
--eval_config_path=${OUTPUT_CONFIG_DIR}/eval_config.pbtxt \
--input_config_path=${OUTPUT_CONFIG_DIR}/input_config.pbtxt
```
INPUT_PREDICTIONS=/path/to/instance_segmentation_predictions.csv
OUTPUT_METRICS=/path/to/output/metrics/file
The result of the evaluation will be stored in `${OUTPUT_METRICS_DIR}/metrics.csv`
python models/research/object_detection/metrics/oid_challenge_evaluation.py \
--input_annotations_boxes=${BOUNDING_BOXES}_expanded.csv \
--input_annotations_labels=${IMAGE_LABELS}_expanded.csv \
--input_class_labelmap=object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt \
--input_predictions=${INPUT_PREDICTIONS} \
--input_annotations_segm=${INSTANCE_SEGMENTATIONS}_expanded.csv \
--output_metrics=${OUTPUT_METRICS} \
```
For the Object Detection Track, the participants will be ranked on:
For the Instance Segmentation Track, the participants will be ranked on:
- "OpenImagesChallenge2018_Precision/mAP@0.5IOU"
- "OpenImagesInstanceSegmentationChallenge_Precision/mAP@0.5IOU"
## Visual Relationships Detection Track
The [Visual Relationships Detection metrics](https://storage.googleapis.com/openimages/web/vrd_detection_metric.html) can be directly evaluated using the ground-truth data and model predictions. The evaluation metric implementation is available in the class `VRDRelationDetectionEvaluator`,`VRDPhraseDetectionEvaluator`.
1. Download the ground-truth [visual relationships annotations](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-vrd.csv) and [image-level labels](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-vrd-labels.csv).
2. Filter the rows corresponding to the validation set images you want to use and store the results in the same CSV format.
3. Run the following command to produce final metrics:
The
[Visual Relationships Detection metrics](https://storage.googleapis.com/openimages/web/evaluation.html#visual_relationships_eval)
can be directly evaluated using the ground-truth data and model predictions. The
evaluation metric implementation is available in the class
`VRDRelationDetectionEvaluator`,`VRDPhraseDetectionEvaluator`.
1. Download the ground-truth
[visual relationships annotations](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-vrd.csv)
and
[image-level labels](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-vrd-labels.csv).
2. Run the following command to produce final metrics:
```
INPUT_ANNOTATIONS_BOXES=/path/to/challenge-2018-train-vrd.csv
......
......@@ -138,6 +138,8 @@ Model name
[^2]: This is PASCAL mAP with a slightly different way of computing true positives: see [Open Images evaluation protocols](evaluation_protocols.md), oid_V2_detection_metrics.
[^3]: Non-face boxes are dropped during training and non-face groundtruth boxes are ignored when evaluating.
[^4]: This is Open Images Challenge metric: see [Open Images evaluation protocols](evaluation_protocols.md), oid_challenge_detection_metrics.
......@@ -135,22 +135,29 @@ output bounding-boxes labelled in the same manner.
The old metric name is DEPRECATED.
`EvalConfig.metrics_set='open_images_V2_detection_metrics'`
## OID Challenge Object Detection Metric 2018
## OID Challenge Object Detection Metric
`EvalConfig.metrics_set='oid_challenge_detection_metrics'`
The metric for the OID Challenge Object Detection Metric 2018, Object Detection
track. The description is provided on the [Open Images Challenge
website](https://storage.googleapis.com/openimages/web/challenge.html).
The metric for the OID Challenge Object Detection Metric 2018/2019 Object
Detection track. The description is provided on the
[Open Images Challenge website](https://storage.googleapis.com/openimages/web/evaluation.html#object_detection_eval).
The old metric name is DEPRECATED.
`EvalConfig.metrics_set='oid_challenge_object_detection_metrics'`
## OID Challenge Visual Relationship Detection Metric 2018
## OID Challenge Visual Relationship Detection Metric
The metric for the OID Challenge Visual Relationship Detection Metric 2018, Visual
Relationship Detection track. The description is provided on the [Open Images
Challenge
website](https://storage.googleapis.com/openimages/web/challenge.html). Note:
this is currently a stand-alone metric, that can be used only through the
The metric for the OID Challenge Visual Relationship Detection Metric 2018/2019,
Visual Relationship Detection track. The description is provided on the
[Open Images Challenge website](https://storage.googleapis.com/openimages/web/evaluation.html#visual_relationships_eval).
Note: this is currently a stand-alone metric that can be used only through the
`metrics/oid_vrd_challenge_evaluation.py` util.
## OID Challenge Instance Segmentation Metric
`EvalConfig.metrics_set='oid_challenge_segmentation_metrics'`
The metric for the OID Challenge Instance Segmentation Metric 2019, Instance
Segmentation track. The description is provided on the
[Open Images Challenge website](https://storage.googleapis.com/openimages/web/evaluation.html#instance_segmentation_eval).
......@@ -47,6 +47,22 @@ INPUT_BUILDER_UTIL_MAP = {
}
def _multiclass_scores_or_one_hot_labels(multiclass_scores,
                                         groundtruth_boxes,
                                         groundtruth_classes, num_classes):
  """Returns one-hot encoding of classes when multiclass_scores is empty."""
  # Replace the groundtruth_classes tensor with the multiclass_scores tensor
  # when it is non-empty. If multiclass_scores is empty, fall back on the
  # groundtruth_classes tensor.
  def true_fn():
    return tf.reshape(multiclass_scores,
                      [tf.shape(groundtruth_boxes)[0], num_classes])
  def false_fn():
    return tf.one_hot(groundtruth_classes, num_classes)
  return tf.cond(tf.size(multiclass_scores) > 0, true_fn, false_fn)
def transform_input_data(tensor_dict,
model_preprocess_fn,
image_resizer_fn,
......@@ -89,102 +105,106 @@ def transform_input_data(tensor_dict,
and classes for a given image if the boxes are exactly the same.
retain_original_image: (optional) whether to retain original image in the
output dictionary.
use_multiclass_scores: whether to use multiclass scores as
class targets instead of one-hot encoding of `groundtruth_classes`.
use_multiclass_scores: whether to use multiclass scores as class targets
instead of one-hot encoding of `groundtruth_classes`. When
this is True and multiclass_scores is empty, one-hot encoding of
`groundtruth_classes` is used as a fallback.
use_bfloat16: (optional) a bool, whether to use bfloat16 in training.
Returns:
A dictionary keyed by fields.InputDataFields containing the tensors obtained
after applying all the transformations.
"""
# Reshape flattened multiclass scores tensor into a 2D tensor of shape
# [num_boxes, num_classes].
if fields.InputDataFields.multiclass_scores in tensor_dict:
tensor_dict[fields.InputDataFields.multiclass_scores] = tf.reshape(
tensor_dict[fields.InputDataFields.multiclass_scores], [
tf.shape(tensor_dict[fields.InputDataFields.groundtruth_boxes])[0],
num_classes
])
if fields.InputDataFields.groundtruth_boxes in tensor_dict:
tensor_dict = util_ops.filter_groundtruth_with_nan_box_coordinates(
tensor_dict)
tensor_dict = util_ops.filter_unrecognized_classes(tensor_dict)
out_tensor_dict = tensor_dict.copy()
if fields.InputDataFields.multiclass_scores in out_tensor_dict:
out_tensor_dict[
fields.InputDataFields
.multiclass_scores] = _multiclass_scores_or_one_hot_labels(
out_tensor_dict[fields.InputDataFields.multiclass_scores],
out_tensor_dict[fields.InputDataFields.groundtruth_boxes],
out_tensor_dict[fields.InputDataFields.groundtruth_classes],
num_classes)
if fields.InputDataFields.groundtruth_boxes in out_tensor_dict:
out_tensor_dict = util_ops.filter_groundtruth_with_nan_box_coordinates(
out_tensor_dict)
out_tensor_dict = util_ops.filter_unrecognized_classes(out_tensor_dict)
if retain_original_image:
tensor_dict[fields.InputDataFields.original_image] = tf.cast(
image_resizer_fn(tensor_dict[fields.InputDataFields.image], None)[0],
tf.uint8)
out_tensor_dict[fields.InputDataFields.original_image] = tf.cast(
image_resizer_fn(out_tensor_dict[fields.InputDataFields.image],
None)[0], tf.uint8)
if fields.InputDataFields.image_additional_channels in tensor_dict:
channels = tensor_dict[fields.InputDataFields.image_additional_channels]
tensor_dict[fields.InputDataFields.image] = tf.concat(
[tensor_dict[fields.InputDataFields.image], channels], axis=2)
if fields.InputDataFields.image_additional_channels in out_tensor_dict:
channels = out_tensor_dict[fields.InputDataFields.image_additional_channels]
out_tensor_dict[fields.InputDataFields.image] = tf.concat(
[out_tensor_dict[fields.InputDataFields.image], channels], axis=2)
# Apply data augmentation ops.
if data_augmentation_fn is not None:
tensor_dict = data_augmentation_fn(tensor_dict)
out_tensor_dict = data_augmentation_fn(out_tensor_dict)
# Apply model preprocessing ops and resize instance masks.
image = tensor_dict[fields.InputDataFields.image]
image = out_tensor_dict[fields.InputDataFields.image]
preprocessed_resized_image, true_image_shape = model_preprocess_fn(
tf.expand_dims(tf.cast(image, dtype=tf.float32), axis=0))
if use_bfloat16:
preprocessed_resized_image = tf.cast(
preprocessed_resized_image, tf.bfloat16)
tensor_dict[fields.InputDataFields.image] = tf.squeeze(
out_tensor_dict[fields.InputDataFields.image] = tf.squeeze(
preprocessed_resized_image, axis=0)
tensor_dict[fields.InputDataFields.true_image_shape] = tf.squeeze(
out_tensor_dict[fields.InputDataFields.true_image_shape] = tf.squeeze(
true_image_shape, axis=0)
if fields.InputDataFields.groundtruth_instance_masks in tensor_dict:
masks = tensor_dict[fields.InputDataFields.groundtruth_instance_masks]
if fields.InputDataFields.groundtruth_instance_masks in out_tensor_dict:
masks = out_tensor_dict[fields.InputDataFields.groundtruth_instance_masks]
_, resized_masks, _ = image_resizer_fn(image, masks)
if use_bfloat16:
resized_masks = tf.cast(resized_masks, tf.bfloat16)
tensor_dict[fields.InputDataFields.
groundtruth_instance_masks] = resized_masks
out_tensor_dict[
fields.InputDataFields.groundtruth_instance_masks] = resized_masks
# Transform groundtruth classes to one hot encodings.
label_offset = 1
zero_indexed_groundtruth_classes = tensor_dict[
zero_indexed_groundtruth_classes = out_tensor_dict[
fields.InputDataFields.groundtruth_classes] - label_offset
tensor_dict[fields.InputDataFields.groundtruth_classes] = tf.one_hot(
zero_indexed_groundtruth_classes, num_classes)
if use_multiclass_scores:
tensor_dict[fields.InputDataFields.groundtruth_classes] = tensor_dict[
fields.InputDataFields.multiclass_scores]
tensor_dict.pop(fields.InputDataFields.multiclass_scores, None)
out_tensor_dict[
fields.InputDataFields.groundtruth_classes] = out_tensor_dict[
fields.InputDataFields.multiclass_scores]
else:
out_tensor_dict[fields.InputDataFields.groundtruth_classes] = tf.one_hot(
zero_indexed_groundtruth_classes, num_classes)
out_tensor_dict.pop(fields.InputDataFields.multiclass_scores, None)
if fields.InputDataFields.groundtruth_confidences in tensor_dict:
groundtruth_confidences = tensor_dict[
if fields.InputDataFields.groundtruth_confidences in out_tensor_dict:
groundtruth_confidences = out_tensor_dict[
fields.InputDataFields.groundtruth_confidences]
# Map the confidences to the one-hot encoding of classes
tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
out_tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
tf.reshape(groundtruth_confidences, [-1, 1]) *
tensor_dict[fields.InputDataFields.groundtruth_classes])
out_tensor_dict[fields.InputDataFields.groundtruth_classes])
else:
groundtruth_confidences = tf.ones_like(
zero_indexed_groundtruth_classes, dtype=tf.float32)
tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
tensor_dict[fields.InputDataFields.groundtruth_classes])
out_tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
out_tensor_dict[fields.InputDataFields.groundtruth_classes])
if merge_multiple_boxes:
merged_boxes, merged_classes, merged_confidences, _ = (
util_ops.merge_boxes_with_multiple_labels(
tensor_dict[fields.InputDataFields.groundtruth_boxes],
out_tensor_dict[fields.InputDataFields.groundtruth_boxes],
zero_indexed_groundtruth_classes,
groundtruth_confidences,
num_classes))
merged_classes = tf.cast(merged_classes, tf.float32)
tensor_dict[fields.InputDataFields.groundtruth_boxes] = merged_boxes
tensor_dict[fields.InputDataFields.groundtruth_classes] = merged_classes
tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
out_tensor_dict[fields.InputDataFields.groundtruth_boxes] = merged_boxes
out_tensor_dict[fields.InputDataFields.groundtruth_classes] = merged_classes
out_tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
merged_confidences)
if fields.InputDataFields.groundtruth_boxes in tensor_dict:
tensor_dict[fields.InputDataFields.num_groundtruth_boxes] = tf.shape(
tensor_dict[fields.InputDataFields.groundtruth_boxes])[0]
if fields.InputDataFields.groundtruth_boxes in out_tensor_dict:
out_tensor_dict[fields.InputDataFields.num_groundtruth_boxes] = tf.shape(
out_tensor_dict[fields.InputDataFields.groundtruth_boxes])[0]
return tensor_dict
return out_tensor_dict
def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
......
......@@ -611,6 +611,62 @@ class DataTransformationFnTest(test_case.TestCase):
self.assertAllClose(transformed_inputs[fields.InputDataFields.image],
np.concatenate((image, additional_channels), axis=2))
def test_use_multiclass_scores_when_present(self):
image = np.random.rand(4, 4, 3).astype(np.float32)
tensor_dict = {
fields.InputDataFields.image:
tf.constant(image),
fields.InputDataFields.groundtruth_boxes:
tf.constant(np.array([[.5, .5, 1, 1], [.5, .5, 1, 1]], np.float32)),
fields.InputDataFields.multiclass_scores:
tf.constant(np.array([0.2, 0.3, 0.5, 0.1, 0.6, 0.3], np.float32)),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([1, 2], np.int32))
}
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=3, use_multiclass_scores=True)
with self.test_session() as sess:
transformed_inputs = sess.run(
input_transformation_fn(tensor_dict=tensor_dict))
self.assertAllClose(
np.array([[0.2, 0.3, 0.5], [0.1, 0.6, 0.3]], np.float32),
transformed_inputs[fields.InputDataFields.groundtruth_classes])
def test_use_multiclass_scores_when_not_present(self):
image = np.random.rand(4, 4, 3).astype(np.float32)
tensor_dict = {
fields.InputDataFields.image:
tf.constant(image),
fields.InputDataFields.groundtruth_boxes:
tf.constant(np.array([[.5, .5, 1, 1], [.5, .5, 1, 1]], np.float32)),
fields.InputDataFields.multiclass_scores:
tf.placeholder(tf.float32),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([1, 2], np.int32))
}
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=3, use_multiclass_scores=True)
with self.test_session() as sess:
transformed_inputs = sess.run(
input_transformation_fn(tensor_dict=tensor_dict),
feed_dict={
tensor_dict[fields.InputDataFields.multiclass_scores]:
np.array([], dtype=np.float32)
})
self.assertAllClose(
np.array([[0, 1, 0], [0, 0, 1]], np.float32),
transformed_inputs[fields.InputDataFields.groundtruth_classes])
def test_returns_correct_class_label_encodings(self):
tensor_dict = {
fields.InputDataFields.image:
......
......@@ -108,6 +108,7 @@ from object_detection.core import standard_fields as fields
from object_detection.core import target_assigner
from object_detection.utils import ops
from object_detection.utils import shape_utils
from object_detection.utils import variables_helper
slim = tf.contrib.slim
......@@ -210,7 +211,7 @@ class FasterRCNNFeatureExtractor(object):
the model graph.
"""
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
for scope_name in [first_stage_feature_extractor_scope,
second_stage_feature_extractor_scope]:
if variable.op.name.startswith(scope_name):
......@@ -275,7 +276,7 @@ class FasterRCNNKerasFeatureExtractor(object):
the model graph.
"""
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
for scope_name in [first_stage_feature_extractor_scope,
second_stage_feature_extractor_scope]:
if variable.op.name.startswith(scope_name):
......@@ -1193,6 +1194,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
detection_masks = self._gather_instance_masks(
detection_masks, detection_classes)
detection_masks = tf.cast(detection_masks, tf.float32)
prediction_dict[fields.DetectionResultFields.detection_masks] = (
tf.reshape(tf.sigmoid(detection_masks),
[batch_size, max_detection, mask_height, mask_width]))
......@@ -1461,9 +1463,10 @@ class FasterRCNNMetaArch(model.DetectionModel):
mask_predictions=mask_predictions)
if 'rpn_features_to_crop' in prediction_dict and self._initial_crop_size:
self._add_detection_features_output_node(
detections_dict[fields.DetectionResultFields.detection_boxes],
prediction_dict['rpn_features_to_crop'])
detections_dict[
'detection_features'] = self._add_detection_features_output_node(
detections_dict[fields.DetectionResultFields.detection_boxes],
prediction_dict['rpn_features_to_crop'])
return detections_dict
......@@ -1474,18 +1477,25 @@ class FasterRCNNMetaArch(model.DetectionModel):
def _add_detection_features_output_node(self, detection_boxes,
rpn_features_to_crop):
"""Add the detection features to the output node.
"""Add detection features to outputs.
The detection features are from cropping rpn_features with boxes.
Each bounding box has one feature vector of length depth, which comes from
mean_pooling of the cropped rpn_features.
This function extracts box features for each box in rpn_features_to_crop.
It returns the extracted box features, reshaped to
[batch size, max_detections, height, width, depth], and average pools
the extracted features across the spatial dimensions and adds a graph node
to the pooled features named 'pooled_detection_features'
Args:
detection_boxes: a 3-D float32 tensor of shape
[batch_size, max_detection, 4] which represents the bounding boxes.
[batch_size, max_detections, 4] which represents the bounding boxes.
rpn_features_to_crop: A 4-D float32 tensor with shape
[batch, height, width, depth] representing image features to crop using
the proposals boxes.
Returns:
detection_features: a 4-D float32 tensor of shape
[batch size, max_detections, height, width, depth] representing
cropped image features
"""
with tf.name_scope('SecondStageDetectionFeaturesExtract'):
flattened_detected_feature_maps = (
......@@ -1495,15 +1505,23 @@ class FasterRCNNMetaArch(model.DetectionModel):
flattened_detected_feature_maps)
batch_size = tf.shape(detection_boxes)[0]
max_detection = tf.shape(detection_boxes)[1]
max_detections = tf.shape(detection_boxes)[1]
detection_features_pool = tf.reduce_mean(
detection_features_unpooled, axis=[1, 2])
detection_features = tf.reshape(
reshaped_detection_features_pool = tf.reshape(
detection_features_pool,
[batch_size, max_detection, tf.shape(detection_features_pool)[-1]])
[batch_size, max_detections, tf.shape(detection_features_pool)[-1]])
reshaped_detection_features_pool = tf.identity(
reshaped_detection_features_pool, 'pooled_detection_features')
detection_features = tf.identity(
detection_features, 'detection_features')
reshaped_detection_features = tf.reshape(
detection_features_unpooled,
[batch_size, max_detections,
tf.shape(detection_features_unpooled)[1],
tf.shape(detection_features_unpooled)[2],
tf.shape(detection_features_unpooled)[3]])
return reshaped_detection_features
def _postprocess_rpn(self,
rpn_box_encodings_batch,
......@@ -1749,6 +1767,15 @@ class FasterRCNNMetaArch(model.DetectionModel):
resized_masks_list.append(resized_mask)
groundtruth_masks_list = resized_masks_list
# Masks could be set to bfloat16 in the input pipeline for performance
# reasons. Convert masks back to floating point space here since the rest of
# this module assumes groundtruth to be of float32 type.
float_groundtruth_masks_list = []
if groundtruth_masks_list:
for mask in groundtruth_masks_list:
float_groundtruth_masks_list.append(tf.cast(mask, tf.float32))
groundtruth_masks_list = float_groundtruth_masks_list
if self.groundtruth_has_field(fields.BoxListFields.weights):
groundtruth_weights_list = self.groundtruth_lists(
fields.BoxListFields.weights)
......@@ -2619,7 +2646,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
self.first_stage_feature_extractor_scope,
self.second_stage_feature_extractor_scope)
variables_to_restore = tf.global_variables()
variables_to_restore = variables_helper.get_global_variables_safely()
variables_to_restore.append(slim.get_or_create_global_step())
# Only load feature extractor variables to be consistent with loading from
# a classification checkpoint.
......
......@@ -383,6 +383,11 @@ class FasterRCNNMetaArchTest(
class_predictions_with_background_shapes = [(16, 3), (None, 3)]
proposal_boxes_shapes = [(2, 8, 4), (None, 8, 4)]
batch_size = 2
initial_crop_size = 3
maxpool_stride = 1
height = initial_crop_size/maxpool_stride
width = initial_crop_size/maxpool_stride
depth = 3
image_shape = np.array((2, 36, 48, 3), dtype=np.int32)
for (num_proposals_shape, refined_box_encoding_shape,
class_predictions_with_background_shape,
......@@ -433,6 +438,7 @@ class FasterRCNNMetaArchTest(
'detection_scores': tf.zeros([2, 5]),
'detection_classes': tf.zeros([2, 5]),
'num_detections': tf.zeros([2]),
'detection_features': tf.zeros([2, 5, width, height, depth])
}, true_image_shapes)
with self.test_session(graph=tf_graph) as sess:
detections_out = sess.run(
......@@ -453,6 +459,9 @@ class FasterRCNNMetaArchTest(
self.assertAllClose(detections_out['num_detections'].shape, [2])
self.assertTrue(np.amax(detections_out['detection_masks'] <= 1.0))
self.assertTrue(np.amin(detections_out['detection_masks'] >= 0.0))
self.assertAllEqual(detections_out['detection_features'].shape,
[2, 5, width, height, depth])
self.assertGreaterEqual(np.amax(detections_out['detection_features']), 0)
def _get_box_classifier_features_shape(self,
image_size,
......
......@@ -28,6 +28,7 @@ from object_detection.core import standard_fields as fields
from object_detection.core import target_assigner
from object_detection.utils import ops
from object_detection.utils import shape_utils
from object_detection.utils import variables_helper
from object_detection.utils import visualization_utils
slim = tf.contrib.slim
......@@ -45,6 +46,7 @@ class SSDFeatureExtractor(object):
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
num_layers=6,
override_base_feature_extractor_hyperparams=False):
"""Constructor.
......@@ -61,6 +63,7 @@ class SSDFeatureExtractor(object):
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
num_layers: Number of SSD layers.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_fn`.
......@@ -73,6 +76,7 @@ class SSDFeatureExtractor(object):
self._reuse_weights = reuse_weights
self._use_explicit_padding = use_explicit_padding
self._use_depthwise = use_depthwise
self._num_layers = num_layers
self._override_base_feature_extractor_hyperparams = (
override_base_feature_extractor_hyperparams)
......@@ -126,7 +130,7 @@ class SSDFeatureExtractor(object):
the model graph.
"""
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
var_name = variable.op.name
if var_name.startswith(feature_extractor_scope + '/'):
var_name = var_name.replace(feature_extractor_scope + '/', '')
......@@ -148,6 +152,7 @@ class SSDKerasFeatureExtractor(tf.keras.Model):
inplace_batchnorm_update,
use_explicit_padding=False,
use_depthwise=False,
num_layers=6,
override_base_feature_extractor_hyperparams=False,
name=None):
"""Constructor.
......@@ -172,6 +177,7 @@ class SSDKerasFeatureExtractor(tf.keras.Model):
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
num_layers: Number of SSD layers.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_config`.
......@@ -189,6 +195,7 @@ class SSDKerasFeatureExtractor(tf.keras.Model):
self._inplace_batchnorm_update = inplace_batchnorm_update
self._use_explicit_padding = use_explicit_padding
self._use_depthwise = use_depthwise
self._num_layers = num_layers
self._override_base_feature_extractor_hyperparams = (
override_base_feature_extractor_hyperparams)
......@@ -247,11 +254,13 @@ class SSDKerasFeatureExtractor(tf.keras.Model):
the model graph.
"""
variables_to_restore = {}
for variable in tf.global_variables():
var_name = variable.op.name
for variable in self.variables:
# variable.name includes ":0" at the end, but the names in the checkpoint
# do not have the suffix ":0". So, we strip it here.
var_name = variable.name[:-2]
if var_name.startswith(feature_extractor_scope + '/'):
var_name = var_name.replace(feature_extractor_scope + '/', '')
variables_to_restore[var_name] = variable
variables_to_restore[var_name] = variable
return variables_to_restore
......@@ -709,6 +718,14 @@ class SSDMetaArch(model.DetectionModel):
additional_fields = {
'multiclass_scores': detection_scores_with_background
}
if self._anchors is not None:
anchor_indices = tf.range(self._anchors.num_boxes_static())
batch_anchor_indices = tf.tile(
tf.expand_dims(anchor_indices, 0), [batch_size, 1])
# All additional fields need to be float.
additional_fields.update({
'anchor_indices': tf.cast(batch_anchor_indices, tf.float32),
})
if detection_keypoints is not None:
detection_keypoints = tf.identity(
detection_keypoints, 'raw_keypoint_locations')
......@@ -737,6 +754,12 @@ class SSDMetaArch(model.DetectionModel):
fields.DetectionResultFields.raw_detection_scores:
detection_scores_with_background
}
if (nmsed_additional_fields is not None and
'anchor_indices' in nmsed_additional_fields):
detection_dict.update({
fields.DetectionResultFields.detection_anchor_indices:
tf.cast(nmsed_additional_fields['anchor_indices'], tf.int32),
})
if (nmsed_additional_fields is not None and
fields.BoxListFields.keypoints in nmsed_additional_fields):
detection_dict[fields.DetectionResultFields.detection_keypoints] = (
......@@ -1218,13 +1241,24 @@ class SSDMetaArch(model.DetectionModel):
if fine_tune_checkpoint_type == 'detection':
variables_to_restore = {}
for variable in tf.global_variables():
var_name = variable.op.name
if load_all_detection_checkpoint_vars:
variables_to_restore[var_name] = variable
else:
if var_name.startswith(self._extract_features_scope):
if tf.executing_eagerly():
for variable in self.variables:
# variable.name includes ":0" at the end, but the names in the
# checkpoint do not have the suffix ":0". So, we strip it here.
var_name = variable.name[:-2]
if load_all_detection_checkpoint_vars:
variables_to_restore[var_name] = variable
else:
if var_name.startswith(self._extract_features_scope):
variables_to_restore[var_name] = variable
else:
for variable in variables_helper.get_global_variables_safely():
var_name = variable.op.name
if load_all_detection_checkpoint_vars:
variables_to_restore[var_name] = variable
else:
if var_name.startswith(self._extract_features_scope):
variables_to_restore[var_name] = variable
return variables_to_restore
......
......@@ -188,6 +188,7 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
[0.5, 0., 1., 0.5], [1., 1., 1.5, 1.5]]]
raw_detection_scores = [[[0, 0], [0, 0], [0, 0], [0, 0]],
[[0, 0], [0, 0], [0, 0], [0, 0]]]
detection_anchor_indices = [[0, 2, 1, 0, 0], [0, 2, 1, 0, 0]]
for input_shape in input_shapes:
tf_graph = tf.Graph()
......@@ -229,6 +230,8 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
raw_detection_boxes)
self.assertAllEqual(detections_out['raw_detection_scores'],
raw_detection_scores)
self.assertAllEqual(detections_out['detection_anchor_indices'],
detection_anchor_indices)
def test_postprocess_results_are_correct_static(self, use_keras):
with tf.Graph().as_default():
......
......@@ -13,7 +13,12 @@
# limitations under the License.
# ==============================================================================
"""Class for evaluating object detections with COCO metrics."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from six.moves import zip
import tensorflow as tf
from object_detection.core import standard_fields
......
......@@ -39,6 +39,10 @@ then evaluation (in multi-class mode) can be invoked as follows:
metrics = evaluator.ComputeMetrics()
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from collections import OrderedDict
import copy
import time
......@@ -48,6 +52,8 @@ from pycocotools import coco
from pycocotools import cocoeval
from pycocotools import mask
from six.moves import range
from six.moves import zip
import tensorflow as tf
from object_detection.utils import json_utils
......
......@@ -40,6 +40,8 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import logging
from absl import app
from absl import flags
import pandas as pd
......@@ -120,20 +122,22 @@ def main(unused_argv):
object_detection_evaluation.OpenImagesChallengeEvaluator(
categories, evaluate_masks=is_instance_segmentation_eval))
all_predictions = pd.read_csv(FLAGS.input_predictions)
images_processed = 0
for _, groundtruth in enumerate(all_annotations.groupby('ImageID')):
logging.info('Processing image %d', images_processed)
image_id, image_groundtruth = groundtruth
groundtruth_dictionary = utils.build_groundtruth_dictionary(
image_groundtruth, class_label_map)
challenge_evaluator.add_single_ground_truth_image_info(
image_id, groundtruth_dictionary)
all_predictions = pd.read_csv(FLAGS.input_predictions)
for _, prediction_data in enumerate(all_predictions.groupby('ImageID')):
image_id, image_predictions = prediction_data
prediction_dictionary = utils.build_predictions_dictionary(
image_predictions, class_label_map)
all_predictions.loc[all_predictions['ImageID'] == image_id],
class_label_map)
challenge_evaluator.add_single_detected_image_info(image_id,
prediction_dictionary)
images_processed += 1
metrics = challenge_evaluator.evaluate()
......
......@@ -18,10 +18,13 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import base64
import zlib
import numpy as np
import pandas as pd
from pycocotools import mask as coco_mask
from pycocotools import mask
from object_detection.core import standard_fields
......@@ -53,33 +56,42 @@ def _decode_raw_data_into_masks_and_boxes(segments, image_widths,
"""Decods binary segmentation masks into np.arrays and boxes.
Args:
segments: pandas Series object containing either None entries or strings
with COCO-encoded binary masks. All masks are expected to be the same size.
segments: pandas Series object containing either
None entries, or strings with
base64, zlib compressed, COCO RLE-encoded binary masks.
All masks are expected to be the same size.
image_widths: pandas Series of mask widths.
image_heights: pandas Series of mask heights.
Returns:
a np.ndarray of the size NxWxH, where W and H is determined from the encoded
masks; for the None values, zero arrays of size WxH are created. if input
masks; for the None values, zero arrays of size WxH are created. If input
contains only None values, W=1, H=1.
"""
segment_masks = []
segment_boxes = []
ind = segments.first_valid_index()
if ind is not None:
size = [int(image_heights.iloc[ind]), int(image_widths[ind])]
size = [int(image_heights[ind]), int(image_widths[ind])]
else:
# It does not matter which size we pick since no masks will ever be
# evaluated.
size = [1, 1]
return np.zeros((segments.shape[0], 1, 1), dtype=np.uint8), np.zeros(
(segments.shape[0], 4), dtype=np.float32)
for segment, im_width, im_height in zip(segments, image_widths,
image_heights):
if pd.isnull(segment):
segment_masks.append(np.zeros([1, size[0], size[1]], dtype=np.uint8))
segment_boxes.append(np.expand_dims(np.array([0.0, 0.0, 0.0, 0.0]), 0))
else:
encoding_dict = {'size': [im_height, im_width], 'counts': segment}
mask_tensor = mask.decode(encoding_dict)
compressed_mask = base64.b64decode(segment)
rle_encoded_mask = zlib.decompress(compressed_mask)
decoding_dict = {
'size': [im_height, im_width],
'counts': rle_encoded_mask
}
mask_tensor = coco_mask.decode(decoding_dict)
segment_masks.append(np.expand_dims(mask_tensor, 0))
segment_boxes.append(np.expand_dims(_to_normalized_box(mask_tensor), 0))
......
......@@ -18,15 +18,43 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import base64
import zlib
import numpy as np
import pandas as pd
from pycocotools import mask
from pycocotools import mask as coco_mask
import tensorflow as tf
from object_detection.core import standard_fields
from object_detection.metrics import oid_challenge_evaluation_utils as utils
def encode_mask(mask_to_encode):
  """Encodes a binary mask into the Kaggle challenge text format.

  The encoding is done in three stages:
   - COCO RLE-encoding,
   - zlib compression,
   - base64 encoding (to use as entry in csv file).

  Args:
    mask_to_encode: binary np.ndarray of dtype bool and 2d shape.

  Returns:
    A (base64) text string of the encoded mask.
  """
  mask_to_encode = np.squeeze(mask_to_encode)
  mask_to_encode = mask_to_encode.reshape(mask_to_encode.shape[0],
                                          mask_to_encode.shape[1], 1)
  mask_to_encode = mask_to_encode.astype(np.uint8)
  mask_to_encode = np.asfortranarray(mask_to_encode)
  encoded_mask = coco_mask.encode(mask_to_encode)[0]['counts']
  compressed_mask = zlib.compress(encoded_mask, zlib.Z_BEST_COMPRESSION)
  base64_mask = base64.b64encode(compressed_mask)
  return base64_mask
class OidUtilTest(tf.test.TestCase):
def testMaskToNormalizedBox(self):
......@@ -44,10 +72,10 @@ class OidUtilTest(tf.test.TestCase):
mask1 = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]], dtype=np.uint8)
mask2 = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], dtype=np.uint8)
encoding1 = mask.encode(np.asfortranarray(mask1))
encoding2 = mask.encode(np.asfortranarray(mask2))
encoding1 = encode_mask(mask1)
encoding2 = encode_mask(mask2)
vals = pd.Series([encoding1['counts'], encoding2['counts']])
vals = pd.Series([encoding1, encoding2])
image_widths = pd.Series([mask1.shape[1], mask2.shape[1]])
image_heights = pd.Series([mask1.shape[0], mask2.shape[0]])
......@@ -60,6 +88,15 @@ class OidUtilTest(tf.test.TestCase):
self.assertAllEqual(expected_segm, segm)
self.assertAllEqual(expected_bbox, bbox)
def testDecodeToTensorsNoMasks(self):
vals = pd.Series([None, None])
image_widths = pd.Series([None, None])
image_heights = pd.Series([None, None])
segm, bbox = utils._decode_raw_data_into_masks_and_boxes(
vals, image_widths, image_heights)
self.assertAllEqual(np.zeros((2, 1, 1), dtype=np.uint8), segm)
self.assertAllEqual(np.zeros((2, 4), dtype=np.float32), bbox)
class OidChallengeEvaluationUtilTest(tf.test.TestCase):
......@@ -140,13 +177,13 @@ class OidChallengeEvaluationUtilTest(tf.test.TestCase):
mask2 = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
dtype=np.uint8)
encoding1 = mask.encode(np.asfortranarray(mask1))
encoding2 = mask.encode(np.asfortranarray(mask2))
encoding1 = encode_mask(mask1)
encoding2 = encode_mask(mask2)
np_data = pd.DataFrame(
[[
'fe58ec1b06db2bb7', mask1.shape[1], mask1.shape[0], '/m/04bcr3',
0.0, 0.3, 0.5, 0.6, 0, None, encoding1['counts']
0.0, 0.3, 0.5, 0.6, 0, None, encoding1
],
[
'fe58ec1b06db2bb7', None, None, '/m/02gy9n', 0.1, 0.2, 0.3, 0.4, 1,
......@@ -154,7 +191,7 @@ class OidChallengeEvaluationUtilTest(tf.test.TestCase):
],
[
'fe58ec1b06db2bb7', mask2.shape[1], mask2.shape[0], '/m/02gy9n',
0.5, 0.6, 0.8, 0.9, 0, None, encoding2['counts']
0.5, 0.6, 0.8, 0.9, 0, None, encoding2
],
[
'fe58ec1b06db2bb7', None, None, '/m/04bcr3', None, None, None,
......@@ -218,21 +255,21 @@ class OidChallengeEvaluationUtilTest(tf.test.TestCase):
mask2 = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
dtype=np.uint8)
encoding1 = mask.encode(np.asfortranarray(mask1))
encoding2 = mask.encode(np.asfortranarray(mask2))
encoding1 = encode_mask(mask1)
encoding2 = encode_mask(mask2)
np_data = pd.DataFrame(
[[
'fe58ec1b06db2bb7', mask1.shape[1], mask1.shape[0], '/m/04bcr3',
encoding1['counts'], 0.8
],
[
'fe58ec1b06db2bb7', mask2.shape[1], mask2.shape[0], '/m/02gy9n',
encoding2['counts'], 0.6
]],
columns=[
'ImageID', 'ImageWidth', 'ImageHeight', 'LabelName', 'Mask', 'Score'
])
np_data = pd.DataFrame([[
'fe58ec1b06db2bb7', mask1.shape[1], mask1.shape[0], '/m/04bcr3',
encoding1, 0.8
],
[
'fe58ec1b06db2bb7', mask2.shape[1],
mask2.shape[0], '/m/02gy9n', encoding2, 0.6
]],
columns=[
'ImageID', 'ImageWidth', 'ImageHeight',
'LabelName', 'Mask', 'Score'
])
class_label_map = {'/m/04bcr3': 1, '/m/02gy9n': 3}
prediction_dictionary = utils.build_predictions_dictionary(
np_data, class_label_map)
......
......@@ -24,7 +24,6 @@ import os
import tensorflow as tf
from tensorflow.python.util import function_utils
from object_detection import eval_util
from object_detection import exporter as exporter_lib
from object_detection import inputs
......@@ -187,7 +186,7 @@ def unstack_batch(tensor_dict, unpad_groundtruth_tensors=True):
return unbatched_tensor_dict
def _provide_groundtruth(model, labels):
def provide_groundtruth(model, labels):
"""Provides the labels to a model as groundtruth.
This helper function extracts the corresponding boxes, classes,
......@@ -287,7 +286,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
labels, unpad_groundtruth_tensors=unpad_groundtruth_tensors)
if mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL):
_provide_groundtruth(detection_model, labels)
provide_groundtruth(detection_model, labels)
preprocessed_images = features[fields.InputDataFields.image]
if use_tpu and train_config.use_bfloat16:
......@@ -524,7 +523,7 @@ def create_estimator_and_inputs(run_config,
pipeline_config_path,
config_override=None,
train_steps=None,
sample_1_of_n_eval_examples=1,
sample_1_of_n_eval_examples=None,
sample_1_of_n_eval_on_train_examples=1,
model_fn_creator=create_model_fn,
use_tpu_estimator=False,
......@@ -606,9 +605,12 @@ def create_estimator_and_inputs(run_config,
pipeline_config_path, config_override=config_override)
kwargs.update({
'train_steps': train_steps,
'sample_1_of_n_eval_examples': sample_1_of_n_eval_examples,
'use_bfloat16': configs['train_config'].use_bfloat16 and use_tpu
})
if sample_1_of_n_eval_examples >= 1:
kwargs.update({
'sample_1_of_n_eval_examples': sample_1_of_n_eval_examples
})
if override_eval_num_epochs:
kwargs.update({'eval_num_epochs': 1})
tf.logging.warning(
......@@ -667,11 +669,6 @@ def create_estimator_and_inputs(run_config,
model_fn = model_fn_creator(detection_model_fn, configs, hparams, use_tpu,
postprocess_on_cpu)
if use_tpu_estimator:
# Multicore inference disabled due to b/129367127
tpu_estimator_args = function_utils.fn_args(tf.contrib.tpu.TPUEstimator)
kwargs = {}
if 'experimental_export_device_assignment' in tpu_estimator_args:
kwargs['experimental_export_device_assignment'] = True
estimator = tf.contrib.tpu.TPUEstimator(
model_fn=model_fn,
train_batch_size=train_config.batch_size,
......@@ -681,8 +678,7 @@ def create_estimator_and_inputs(run_config,
config=run_config,
export_to_tpu=export_to_tpu,
eval_on_tpu=False, # Eval runs on CPU, so disable eval on TPU
params=params if params else {},
**kwargs)
params=params if params else {})
else:
estimator = tf.estimator.Estimator(model_fn=model_fn, config=run_config)
......
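Editor's note: with the new `sample_1_of_n_eval_examples=None` default, the sampling rate is only forwarded to the config merge when an explicit value >= 1 is passed. A minimal usage sketch follows; the paths are hypothetical and the return key accessed here follows the function's documented dictionary output.

import tensorflow as tf
from object_detection import model_hparams
from object_detection import model_lib

run_config = tf.estimator.RunConfig(model_dir='/tmp/model_dir')
train_and_eval_dict = model_lib.create_estimator_and_inputs(
    run_config=run_config,
    hparams=model_hparams.create_hparams(None),
    pipeline_config_path='/path/to/pipeline.config',  # hypothetical path
    sample_1_of_n_eval_examples=None)  # fall back to the eval_input config
estimator = train_and_eval_dict['estimator']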
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Constructs model, inputs, and training environment."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import copy
import time
import tensorflow as tf
from object_detection import eval_util
from object_detection import inputs
from object_detection import model_lib
from object_detection.builders import model_builder
from object_detection.builders import optimizer_builder
from object_detection.core import standard_fields as fields
from object_detection.utils import config_util
from object_detection.utils import label_map_util
from object_detection.utils import ops
from object_detection.utils import variables_helper
MODEL_BUILD_UTIL_MAP = model_lib.MODEL_BUILD_UTIL_MAP
### NOTE: This file is a work in progress.
### TODO(kaftan): Explore adding unit tests for individual methods
### TODO(kaftan): Add unit test that checks training on a single image w/
#### groundtruth, and verify that loss goes to zero.
#### Possibly have version that takes it as the whole train & eval dataset,
#### & verify the loss output from the eval_loop method.
### TODO(kaftan): Make sure the unit tests run in TAP presubmits or Kokoro
def _compute_losses_and_predictions_dicts(
model, features, labels,
add_regularization_loss=True,
use_tpu=False,
use_bfloat16=False):
"""Computes the losses dict and predictions dict for a model on inputs.
Args:
model: a DetectionModel (based on Keras).
features: Dictionary of feature tensors from the input dataset.
Should be in the format output by `inputs.train_input` and
`inputs.eval_input`.
features[fields.InputDataFields.image] is a [batch_size, H, W, C]
float32 tensor with preprocessed images.
features[HASH_KEY] is a [batch_size] int32 tensor representing unique
identifiers for the images.
features[fields.InputDataFields.true_image_shape] is a [batch_size, 3]
int32 tensor representing the true image shapes, as preprocessed
images could be padded.
features[fields.InputDataFields.original_image] (optional) is a
[batch_size, H, W, C] float32 tensor with original images.
labels: A dictionary of groundtruth tensors post-unstacking. The original
labels are of the form returned by `inputs.train_input` and
`inputs.eval_input`. The shapes may have been modified by unstacking with
`model_lib.unstack_batch`. However, the dictionary includes the following
fields.
labels[fields.InputDataFields.num_groundtruth_boxes] is an
int32 tensor indicating the number of valid groundtruth boxes
per image.
labels[fields.InputDataFields.groundtruth_boxes] is a float32 tensor
containing the corners of the groundtruth boxes.
labels[fields.InputDataFields.groundtruth_classes] is a float32
one-hot tensor of classes.
labels[fields.InputDataFields.groundtruth_weights] is a float32 tensor
containing groundtruth weights for the boxes.
-- Optional --
labels[fields.InputDataFields.groundtruth_instance_masks] is a
float32 tensor containing only binary values, which represent
instance masks for objects.
labels[fields.InputDataFields.groundtruth_keypoints] is a
float32 tensor containing keypoints for each box.
add_regularization_loss: Whether or not to include the model's
regularization loss in the losses dictionary.
use_tpu: Whether computation should happen on a TPU.
use_bfloat16: Whether computation on a TPU should use bfloat16.
Returns:
A tuple containing the losses dictionary (with the total loss under
the key 'Loss/total_loss'), and the predictions dictionary produced by
`model.predict`.
"""
model_lib.provide_groundtruth(model, labels)
preprocessed_images = features[fields.InputDataFields.image]
# TODO(kaftan): Check how we're supposed to do this mixed precision stuff
## in TF2 TPUStrategy + Keras
if use_tpu and use_bfloat16:
with tf.contrib.tpu.bfloat16_scope():
prediction_dict = model.predict(
preprocessed_images,
features[fields.InputDataFields.true_image_shape])
prediction_dict = ops.bfloat16_to_float32_nested(prediction_dict)
else:
prediction_dict = model.predict(
preprocessed_images,
features[fields.InputDataFields.true_image_shape])
losses_dict = model.loss(
prediction_dict, features[fields.InputDataFields.true_image_shape])
losses = [loss_tensor for loss_tensor in losses_dict.values()]
if add_regularization_loss:
# TODO(kaftan): As we figure out mixed precision & bfloat 16, we may
## need to convert these regularization losses from bfloat16 to float32
## as well.
regularization_losses = model.regularization_losses()
if regularization_losses:
regularization_loss = tf.add_n(
regularization_losses, name='regularization_loss')
losses.append(regularization_loss)
losses_dict['Loss/regularization_loss'] = regularization_loss
total_loss = tf.add_n(losses, name='total_loss')
losses_dict['Loss/total_loss'] = total_loss
return losses_dict, prediction_dict
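Editor's note: for orientation, a short sketch of how the two returned dictionaries are typically consumed. The `detection_model`, `features`, and `labels` names stand in for real inputs built elsewhere in this file, and the regularization key only appears when it is requested and present.

losses_dict, prediction_dict = _compute_losses_and_predictions_dicts(
    detection_model, features, labels, add_regularization_loss=True)
total_loss = losses_dict['Loss/total_loss']              # always present
reg_loss = losses_dict.get('Loss/regularization_loss')   # optional
# prediction_dict is later fed to detection_model.postprocess(...) in eval.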
# TODO(kaftan): Explore removing learning_rate from this method & returning
## The full losses dict instead of just total_loss, then doing all summaries
## saving in a utility method called by the outer training loop.
# TODO(kaftan): Explore adding gradient summaries
def eager_train_step(detection_model,
features,
labels,
unpad_groundtruth_tensors,
optimizer,
learning_rate,
add_regularization_loss=True,
clip_gradients_value=None,
use_tpu=False,
use_bfloat16=False,
global_step=None,
num_replicas=1.0):
"""Process a single training batch.
This method computes the loss for the model on a single training batch,
while tracking the gradients with a gradient tape. It then updates the
model variables with the optimizer, clipping the gradients if
clip_gradients_value is present.
This method can run eagerly or inside a tf.function.
Args:
detection_model: A DetectionModel (based on Keras) to train.
features: Dictionary of feature tensors from the input dataset.
Should be in the format output by `inputs.train_input`.
features[fields.InputDataFields.image] is a [batch_size, H, W, C]
float32 tensor with preprocessed images.
features[HASH_KEY] is a [batch_size] int32 tensor representing unique
identifiers for the images.
features[fields.InputDataFields.true_image_shape] is a [batch_size, 3]
int32 tensor representing the true image shapes, as preprocessed
images could be padded.
features[fields.InputDataFields.original_image] (optional, not used
during training) is a
[batch_size, H, W, C] float32 tensor with original images.
labels: A dictionary of groundtruth tensors. This method unstacks
these labels using model_lib.unstack_batch. The stacked labels are of
the form returned by `inputs.train_input` and `inputs.eval_input`.
labels[fields.InputDataFields.num_groundtruth_boxes] is a [batch_size]
int32 tensor indicating the number of valid groundtruth boxes
per image.
labels[fields.InputDataFields.groundtruth_boxes] is a
[batch_size, num_boxes, 4] float32 tensor containing the corners of
the groundtruth boxes.
labels[fields.InputDataFields.groundtruth_classes] is a
[batch_size, num_boxes, num_classes] float32 one-hot tensor of
classes. num_classes includes the background class.
labels[fields.InputDataFields.groundtruth_weights] is a
[batch_size, num_boxes] float32 tensor containing groundtruth weights
for the boxes.
-- Optional --
labels[fields.InputDataFields.groundtruth_instance_masks] is a
[batch_size, num_boxes, H, W] float32 tensor containing only binary
values, which represent instance masks for objects.
labels[fields.InputDataFields.groundtruth_keypoints] is a
[batch_size, num_boxes, num_keypoints, 2] float32 tensor containing
keypoints for each box.
unpad_groundtruth_tensors: A parameter passed to unstack_batch.
optimizer: The training optimizer that will update the variables.
learning_rate: The learning rate tensor for the current training step.
This is used only for TensorBoard logging purposes; it does not affect
model training.
add_regularization_loss: Whether or not to include the model's
regularization loss in the losses dictionary.
clip_gradients_value: If this is present, clip the gradients global norm
at this value using `tf.clip_by_global_norm`.
use_tpu: Whether computation should happen on a TPU.
use_bfloat16: Whether computation on a TPU should use bfloat16.
global_step: The current training step. Used for TensorBoard logging
purposes. This step is not updated by this function and must be
incremented separately.
num_replicas: The number of replicas in the current distribution strategy.
This is used to scale the total loss so that training in a distribution
strategy works correctly.
Returns:
The total loss observed at this training step.
"""
# """Execute a single training step in the TF v2 style loop."""
is_training = True
detection_model._is_training = is_training # pylint: disable=protected-access
tf.keras.backend.set_learning_phase(is_training)
labels = model_lib.unstack_batch(
labels, unpad_groundtruth_tensors=unpad_groundtruth_tensors)
with tf.GradientTape() as tape:
losses_dict, _ = _compute_losses_and_predictions_dicts(
detection_model, features, labels, add_regularization_loss, use_tpu,
use_bfloat16)
total_loss = losses_dict['Loss/total_loss']
# Normalize loss for num replicas
total_loss = tf.math.divide(total_loss,
tf.constant(num_replicas, dtype=tf.float32))
losses_dict['Loss/normalized_total_loss'] = total_loss
for loss_type in losses_dict:
tf.compat.v2.summary.scalar(
loss_type, losses_dict[loss_type], step=global_step)
trainable_variables = detection_model.trainable_variables
gradients = tape.gradient(total_loss, trainable_variables)
if clip_gradients_value:
gradients, _ = tf.clip_by_global_norm(gradients, clip_gradients_value)
optimizer.apply_gradients(zip(gradients, trainable_variables))
if not use_tpu:
tf.compat.v2.summary.scalar('learning_rate', learning_rate,
step=global_step)
return total_loss
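Editor's note: a small worked example of the `num_replicas` scaling above, in plain Python rather than library code. Dividing each replica's local total loss by the replica count means that the SUM reduction performed by the caller recovers the cross-replica mean.

per_replica_losses = [4.0, 2.0]   # hypothetical local total losses
num_replicas = len(per_replica_losses)
scaled = [loss / num_replicas for loss in per_replica_losses]
# strategy.reduce(tf.distribute.ReduceOp.SUM, ...) over the scaled values
# yields the mean of the unscaled losses:
assert sum(scaled) == sum(per_replica_losses) / num_replicas  # == 3.0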
def load_fine_tune_checkpoint(
model, checkpoint_path, checkpoint_type,
load_all_detection_checkpoint_vars, input_dataset,
unpad_groundtruth_tensors, use_tpu, use_bfloat16):
"""Load a fine tuning classification or detection checkpoint.
To make sure the model variables are all built, this method first executes
the model by computing a dummy loss. (Models might not have built their
variables before their first execution.)
It then loads a variable-name based classification or detection checkpoint
that comes from converted TF 1.x slim model checkpoints.
This method updates the model in-place and does not return a value.
Args:
model: A DetectionModel (based on Keras) to load a fine-tuning
checkpoint for.
checkpoint_path: Directory with checkpoints file or path to checkpoint.
checkpoint_type: Whether to restore from a full detection
checkpoint (with compatible variable names) or to restore from a
classification checkpoint for initialization prior to training.
Valid values: `detection`, `classification`.
load_all_detection_checkpoint_vars: whether to load all variables (when
`fine_tune_checkpoint_type` is `detection`). If False, only variables
within the feature extractor scopes are included. Default False.
input_dataset: The tf.data Dataset the model is being trained on. Needed
to get the shapes for the dummy loss computation.
unpad_groundtruth_tensors: A parameter passed to unstack_batch.
use_tpu: Whether computation should happen on a TPU.
use_bfloat16: Whether computation on a TPU should use bfloat16.
"""
features, labels = iter(input_dataset).next()
def _dummy_computation_fn(features, labels):
model._is_training = False # pylint: disable=protected-access
tf.keras.backend.set_learning_phase(False)
labels = model_lib.unstack_batch(
labels, unpad_groundtruth_tensors=unpad_groundtruth_tensors)
return _compute_losses_and_predictions_dicts(
model,
features,
labels,
use_tpu=use_tpu,
use_bfloat16=use_bfloat16)
strategy = tf.compat.v2.distribute.get_strategy()
strategy.experimental_run_v2(
_dummy_computation_fn, args=(
features,
labels,
))
var_map = model.restore_map(
fine_tune_checkpoint_type=checkpoint_type,
load_all_detection_checkpoint_vars=(
load_all_detection_checkpoint_vars))
available_var_map = (
variables_helper.get_variables_available_in_checkpoint(
var_map,
checkpoint_path,
include_global_step=False))
tf.train.init_from_checkpoint(checkpoint_path,
available_var_map)
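Editor's note: a minimal usage sketch for the function above, assuming `detection_model` and `train_input` were built as in `train_loop` below and that the checkpoint path actually exists; the values shown are placeholders.

load_fine_tune_checkpoint(
    model=detection_model,
    checkpoint_path='/path/to/model.ckpt',    # hypothetical
    checkpoint_type='detection',              # or 'classification'
    load_all_detection_checkpoint_vars=True,
    input_dataset=train_input,
    unpad_groundtruth_tensors=True,
    use_tpu=False,
    use_bfloat16=False)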
def train_loop(
hparams,
pipeline_config_path,
model_dir,
config_override=None,
train_steps=None,
use_tpu=False,
save_final_config=False,
export_to_tpu=None,
checkpoint_every_n=1000, **kwargs):
"""Trains a model using eager + functions.
This method:
1. Processes the pipeline configs
2. (Optionally) saves the as-run config
3. Builds the model & optimizer
4. Gets the training input data
5. Loads a fine-tuning detection or classification checkpoint if requested
6. Loops over the train data, executing distributed training steps inside
tf.functions.
7. Checkpoints the model every `checkpoint_every_n` training steps.
8. Logs the training metrics as TensorBoard summaries.
Args:
hparams: A `HParams`.
pipeline_config_path: A path to a pipeline config file.
model_dir:
The directory to save checkpoints and summaries to.
config_override: A pipeline_pb2.TrainEvalPipelineConfig text proto to
override the config from `pipeline_config_path`.
train_steps: Number of training steps. If None, the number of training steps
is set from the `TrainConfig` proto.
use_tpu: Boolean, whether training and evaluation should run on TPU.
save_final_config: Whether to save final config (obtained after applying
overrides) to `model_dir`.
export_to_tpu: When use_tpu and export_to_tpu are true,
`export_savedmodel()` exports a metagraph for serving on TPU besides the
one on CPU. If export_to_tpu is not provided, we will look for it in
hparams too.
checkpoint_every_n:
Checkpoint every n training steps.
**kwargs: Additional keyword arguments for configuration override.
"""
## Parse the configs
get_configs_from_pipeline_file = MODEL_BUILD_UTIL_MAP[
'get_configs_from_pipeline_file']
merge_external_params_with_configs = MODEL_BUILD_UTIL_MAP[
'merge_external_params_with_configs']
create_pipeline_proto_from_configs = MODEL_BUILD_UTIL_MAP[
'create_pipeline_proto_from_configs']
configs = get_configs_from_pipeline_file(
pipeline_config_path, config_override=config_override)
kwargs.update({
'train_steps': train_steps,
'use_bfloat16': configs['train_config'].use_bfloat16 and use_tpu
})
configs = merge_external_params_with_configs(
configs, hparams, kwargs_dict=kwargs)
model_config = configs['model']
train_config = configs['train_config']
train_input_config = configs['train_input_config']
unpad_groundtruth_tensors = train_config.unpad_groundtruth_tensors
use_bfloat16 = train_config.use_bfloat16
add_regularization_loss = train_config.add_regularization_loss
clip_gradients_value = None
if train_config.gradient_clipping_by_norm > 0:
clip_gradients_value = train_config.gradient_clipping_by_norm
# update train_steps from config but only when non-zero value is provided
if train_steps is None and train_config.num_steps != 0:
train_steps = train_config.num_steps
# Read export_to_tpu from hparams if not passed.
if export_to_tpu is None:
export_to_tpu = hparams.get('export_to_tpu', False)
tf.logging.info(
'train_loop: use_tpu %s, export_to_tpu %s', use_tpu,
export_to_tpu)
# Parse the checkpoint fine tuning configs
if hparams.load_pretrained:
fine_tune_checkpoint_path = train_config.fine_tune_checkpoint
else:
fine_tune_checkpoint_path = None
load_all_detection_checkpoint_vars = (
train_config.load_all_detection_checkpoint_vars)
# TODO(kaftan) (or anyone else): move this piece of config munging to
## utils/config_util.py
if not train_config.fine_tune_checkpoint_type:
# train_config.from_detection_checkpoint field is deprecated. For
# backward compatibility, set train_config.fine_tune_checkpoint_type
# based on train_config.from_detection_checkpoint.
if train_config.from_detection_checkpoint:
train_config.fine_tune_checkpoint_type = 'detection'
else:
train_config.fine_tune_checkpoint_type = 'classification'
fine_tune_checkpoint_type = train_config.fine_tune_checkpoint_type
# Write the as-run pipeline config to disk.
if save_final_config:
pipeline_config_final = create_pipeline_proto_from_configs(configs)
config_util.save_pipeline_config(pipeline_config_final, model_dir)
# TODO(kaftan): Either make strategy a parameter of this method, or
## grab it w/ Distribution strategy's get_scope
# Build the model, optimizer, and training input
strategy = tf.compat.v2.distribute.MirroredStrategy()
with strategy.scope():
detection_model = model_builder.build(
model_config=model_config, is_training=True)
# Create the inputs.
train_input = inputs.train_input(
train_config=train_config,
train_input_config=train_input_config,
model_config=model_config,
model=detection_model)
train_input = strategy.experimental_distribute_dataset(
train_input.repeat())
global_step = tf.compat.v2.Variable(
0, trainable=False, dtype=tf.compat.v2.dtypes.int64)
optimizer, (learning_rate,) = optimizer_builder.build(
train_config.optimizer, global_step=global_step)
if callable(learning_rate):
learning_rate_fn = learning_rate
else:
learning_rate_fn = lambda: learning_rate
## Train the model
summary_writer = tf.compat.v2.summary.create_file_writer(model_dir + '/train')
with summary_writer.as_default():
with strategy.scope():
# Load a fine-tuning checkpoint.
if fine_tune_checkpoint_path:
load_fine_tune_checkpoint(detection_model, fine_tune_checkpoint_path,
fine_tune_checkpoint_type,
load_all_detection_checkpoint_vars,
train_input,
unpad_groundtruth_tensors, use_tpu,
use_bfloat16)
ckpt = tf.compat.v2.train.Checkpoint(
step=global_step, model=detection_model)
manager = tf.compat.v2.train.CheckpointManager(
ckpt, model_dir, max_to_keep=7)
## Maybe re-enable checkpoint restoration depending on how it works:
# ckpt.restore(manager.latest_checkpoint)
def train_step_fn(features, labels):
return eager_train_step(
detection_model,
features,
labels,
unpad_groundtruth_tensors,
optimizer,
learning_rate=learning_rate_fn(),
use_bfloat16=use_bfloat16,
add_regularization_loss=add_regularization_loss,
clip_gradients_value=clip_gradients_value,
use_tpu=use_tpu,
global_step=global_step,
num_replicas=strategy.num_replicas_in_sync)
@tf.function
def _dist_train_step(data_iterator):
"""A distributed train step."""
features, labels = data_iterator.next()
per_replica_losses = strategy.experimental_run_v2(
train_step_fn, args=(
features,
labels,
))
# TODO(anjalisridhar): explore if it is safe to remove the
## num_replicas scaling of the loss and switch this to a ReduceOp.Mean
mean_loss = strategy.reduce(
tf.distribute.ReduceOp.SUM, per_replica_losses, axis=None)
return mean_loss
train_input_iter = iter(train_input)
for _ in range(train_steps):
start_time = time.time()
loss = _dist_train_step(train_input_iter)
global_step.assign_add(1)
end_time = time.time()
tf.compat.v2.summary.scalar(
'steps_per_sec', 1.0 / (end_time - start_time), step=global_step)
# TODO(kaftan): Remove this print after it is no longer helpful for
## debugging.
tf.print('Finished step', global_step, end_time, loss)
if int(global_step.value().numpy()) % checkpoint_every_n == 0:
manager.save()
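Editor's note: a minimal sketch of driving the loop above, mirroring the unit test later in this diff. The config path is hypothetical and must describe a Keras-based model; any input-path overrides can be passed through **kwargs.

from object_detection import model_hparams
from object_detection import model_lib_v2

model_lib_v2.train_loop(
    hparams=model_hparams.create_hparams(
        hparams_overrides='load_pretrained=false'),
    pipeline_config_path='/path/to/pipeline.config',  # hypothetical
    model_dir='/tmp/train_dir',
    train_steps=1000,
    checkpoint_every_n=100)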
def eager_eval_loop(
detection_model,
configs,
eval_dataset,
use_tpu=False,
postprocess_on_cpu=False,
global_step=None):
"""Evaluate the model eagerly on the evaluation dataset.
This method will compute the evaluation metrics specified in the configs on
the entire evaluation dataset, then return the metrics. It will also log
the metrics to TensorBoard.
Args:
detection_model: A DetectionModel (based on Keras) to evaluate.
configs: Object detection configs that specify the evaluators that should
be used, as well as whether regularization loss should be included and
if bfloat16 should be used on TPUs.
eval_dataset: Dataset containing evaluation data.
use_tpu: Whether a TPU is being used to execute the model for evaluation.
postprocess_on_cpu: Whether model postprocessing should happen on
the CPU when using a TPU to execute the model.
global_step: A variable containing the training step this model was trained
to. Used for logging purposes.
Returns:
A dict of evaluation metrics representing the results of this evaluation.
"""
train_config = configs['train_config']
eval_input_config = configs['eval_input_config']
eval_config = configs['eval_config']
use_bfloat16 = train_config.use_bfloat16
add_regularization_loss = train_config.add_regularization_loss
is_training = False
detection_model._is_training = is_training # pylint: disable=protected-access
tf.keras.backend.set_learning_phase(is_training)
evaluator_options = eval_util.evaluator_options_from_eval_config(
eval_config)
class_agnostic_category_index = (
label_map_util.create_class_agnostic_category_index())
class_agnostic_evaluators = eval_util.get_evaluators(
eval_config,
list(class_agnostic_category_index.values()),
evaluator_options)
class_aware_evaluators = None
if eval_input_config.label_map_path:
class_aware_category_index = (
label_map_util.create_category_index_from_labelmap(
eval_input_config.label_map_path))
class_aware_evaluators = eval_util.get_evaluators(
eval_config,
list(class_aware_category_index.values()),
evaluator_options)
evaluators = None
loss_metrics = {}
@tf.function
def compute_eval_dict(features, labels):
"""Compute the evaluation result on an image."""
# For evaluating on train data, it is necessary to check whether groundtruth
# must be unpadded.
boxes_shape = (
labels[fields.InputDataFields.groundtruth_boxes].get_shape().as_list())
unpad_groundtruth_tensors = boxes_shape[1] is not None and not use_tpu
labels = model_lib.unstack_batch(
labels, unpad_groundtruth_tensors=unpad_groundtruth_tensors)
losses_dict, prediction_dict = _compute_losses_and_predictions_dicts(
detection_model, features, labels, add_regularization_loss, use_tpu,
use_bfloat16)
def postprocess_wrapper(args):
return detection_model.postprocess(args[0], args[1])
# TODO(kaftan): Depending on how postprocessing will work for TPUS w/
## TPUStrategy, may be good to move wrapping to a utility method
if use_tpu and postprocess_on_cpu:
detections = tf.contrib.tpu.outside_compilation(
postprocess_wrapper,
(prediction_dict, features[fields.InputDataFields.true_image_shape]))
else:
detections = postprocess_wrapper(
(prediction_dict, features[fields.InputDataFields.true_image_shape]))
class_agnostic = (
fields.DetectionResultFields.detection_classes not in detections)
# TODO(kaftan) (or anyone): move `_prepare_groundtruth_for_eval to eval_util
## and call this from there.
groundtruth = model_lib._prepare_groundtruth_for_eval( # pylint: disable=protected-access
detection_model, class_agnostic, eval_input_config.max_number_of_boxes)
use_original_images = fields.InputDataFields.original_image in features
if use_original_images:
eval_images = features[fields.InputDataFields.original_image]
true_image_shapes = tf.slice(
features[fields.InputDataFields.true_image_shape], [0, 0], [-1, 3])
original_image_spatial_shapes = features[
fields.InputDataFields.original_image_spatial_shape]
else:
eval_images = features[fields.InputDataFields.image]
true_image_shapes = None
original_image_spatial_shapes = None
eval_dict = eval_util.result_dict_for_batched_example(
eval_images,
features[inputs.HASH_KEY],
detections,
groundtruth,
class_agnostic=class_agnostic,
scale_to_absolute=True,
original_image_spatial_shapes=original_image_spatial_shapes,
true_image_shapes=true_image_shapes)
return eval_dict, losses_dict, class_agnostic
i = 0
for features, labels in eval_dataset:
eval_dict, losses_dict, class_agnostic = compute_eval_dict(features, labels)
end_time = time.time()
# TODO(kaftan): Remove this print after it is no longer helpful for
## debugging.
tf.print('Finished eval dict computation', i, end_time)
i += 1
if evaluators is None:
if class_agnostic:
evaluators = class_agnostic_evaluators
else:
evaluators = class_aware_evaluators
for evaluator in evaluators:
evaluator.add_eval_dict(eval_dict)
for loss_key, loss_tensor in iter(losses_dict.items()):
if loss_key not in loss_metrics:
loss_metrics[loss_key] = tf.keras.metrics.Mean()
loss_metrics[loss_key].update_state(loss_tensor)
eval_metrics = {}
for evaluator in evaluators:
eval_metrics.update(evaluator.evaluate())
for loss_key in loss_metrics:
eval_metrics[loss_key] = loss_metrics[loss_key].result()
eval_metrics = {str(k): v for k, v in eval_metrics.items()}
for k in eval_metrics:
tf.compat.v2.summary.scalar(k, eval_metrics[k], step=global_step)
return eval_metrics
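Editor's note: a one-shot use of the eval loop above, sketched under assumptions: `detection_model`, `configs`, and `eval_input` are built as in `eval_continuously` below, and a training checkpoint has already been restored into the model so the evaluated weights are meaningful.

import tensorflow as tf

writer = tf.compat.v2.summary.create_file_writer('/tmp/eval')  # hypothetical dir
with writer.as_default():
  metrics = eager_eval_loop(
      detection_model,
      configs,
      eval_input,
      global_step=tf.compat.v2.Variable(0, dtype=tf.compat.v2.dtypes.int64))
print(metrics)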
def eval_continuously(
hparams,
pipeline_config_path,
config_override=None,
train_steps=None,
sample_1_of_n_eval_examples=1,
sample_1_of_n_eval_on_train_examples=1,
use_tpu=False,
override_eval_num_epochs=True,
postprocess_on_cpu=False,
export_to_tpu=None,
model_dir=None,
checkpoint_dir=None,
wait_interval=180,
**kwargs):
"""Run continuous evaluation of a detection model eagerly.
This method builds the model, and continuously restores it from the most
recent training checkpoint in the checkpoint directory and evaluates it
on the evaluation data.
Args:
hparams: A `HParams`.
pipeline_config_path: A path to a pipeline config file.
config_override: A pipeline_pb2.TrainEvalPipelineConfig text proto to
override the config from `pipeline_config_path`.
train_steps: Number of training steps. If None, the number of training steps
is set from the `TrainConfig` proto.
sample_1_of_n_eval_examples: Integer representing how often an eval example
should be sampled. If 1, will sample all examples.
sample_1_of_n_eval_on_train_examples: Similar to
`sample_1_of_n_eval_examples`, except controls the sampling of training
data for evaluation.
use_tpu: Boolean, whether training and evaluation should run on TPU.
override_eval_num_epochs: Whether to overwrite the number of epochs to 1 for
eval_input.
postprocess_on_cpu: When use_tpu and postprocess_on_cpu are true,
postprocess is scheduled on the host cpu.
export_to_tpu: When use_tpu and export_to_tpu are true,
`export_savedmodel()` exports a metagraph for serving on TPU besides the
one on CPU. If export_to_tpu is not provided, we will look for it in
hparams too.
model_dir:
Directory to output resulting evaluation summaries to.
checkpoint_dir:
Directory that contains the training checkpoints.
wait_interval:
Terminate evaluation if no new checkpoints arrive within this wait
interval (in seconds).
**kwargs: Additional keyword arguments for configuration override.
"""
get_configs_from_pipeline_file = MODEL_BUILD_UTIL_MAP[
'get_configs_from_pipeline_file']
merge_external_params_with_configs = MODEL_BUILD_UTIL_MAP[
'merge_external_params_with_configs']
configs = get_configs_from_pipeline_file(
pipeline_config_path, config_override=config_override)
kwargs.update({
'sample_1_of_n_eval_examples': sample_1_of_n_eval_examples,
'use_bfloat16': configs['train_config'].use_bfloat16 and use_tpu
})
if train_steps is not None:
kwargs['train_steps'] = train_steps
if override_eval_num_epochs:
kwargs.update({'eval_num_epochs': 1})
tf.logging.warning(
'Forced number of epochs for all eval validations to be 1.')
configs = merge_external_params_with_configs(
configs, hparams, kwargs_dict=kwargs)
model_config = configs['model']
train_input_config = configs['train_input_config']
eval_config = configs['eval_config']
eval_input_configs = configs['eval_input_configs']
eval_on_train_input_config = copy.deepcopy(train_input_config)
eval_on_train_input_config.sample_1_of_n_examples = (
sample_1_of_n_eval_on_train_examples)
if override_eval_num_epochs and eval_on_train_input_config.num_epochs != 1:
tf.logging.warning('Expected number of evaluation epochs is 1, but '
'instead encountered `eval_on_train_input_config'
'.num_epochs` = '
'{}. Overwriting `num_epochs` to 1.'.format(
eval_on_train_input_config.num_epochs))
eval_on_train_input_config.num_epochs = 1
detection_model = model_builder.build(
model_config=model_config, is_training=True)
# Create the inputs.
eval_inputs = []
for eval_input_config in eval_input_configs:
next_eval_input = inputs.eval_input(
eval_config=eval_config,
eval_input_config=eval_input_config,
model_config=model_config,
model=detection_model)
eval_inputs.append((eval_input_config.name, next_eval_input))
# Read export_to_tpu from hparams if not passed.
if export_to_tpu is None:
export_to_tpu = hparams.get('export_to_tpu', False)
tf.logging.info('eval_continuously: use_tpu %s, export_to_tpu %s',
use_tpu, export_to_tpu)
global_step = tf.compat.v2.Variable(
0, trainable=False, dtype=tf.compat.v2.dtypes.int64)
prev_checkpoint = None
waiting = False
while True:
ckpt = tf.compat.v2.train.Checkpoint(
step=global_step, model=detection_model)
manager = tf.compat.v2.train.CheckpointManager(
ckpt, checkpoint_dir, max_to_keep=3)
latest_checkpoint = manager.latest_checkpoint
if prev_checkpoint == latest_checkpoint:
if prev_checkpoint is None:
tf.logging.info('No checkpoints found yet. Trying again in %s seconds.'
% wait_interval)
time.sleep(wait_interval)
else:
if waiting:
tf.logging.info('Terminating eval after %s seconds of no new '
'checkpoints.' % wait_interval)
break
else:
tf.logging.info('No new checkpoint found. Will try again '
'in %s seconds and terminate if no checkpoint '
'appears.' % wait_interval)
waiting = True
time.sleep(wait_interval)
else:
tf.logging.info('New checkpoint found. Starting evaluation.')
waiting = False
prev_checkpoint = latest_checkpoint
ckpt.restore(latest_checkpoint)
for eval_name, eval_input in eval_inputs:
summary_writer = tf.compat.v2.summary.create_file_writer(
model_dir + '/eval' + eval_name)
with summary_writer.as_default():
eager_eval_loop(
detection_model,
configs,
eval_input,
use_tpu=use_tpu,
postprocess_on_cpu=postprocess_on_cpu,
global_step=global_step)
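Editor's note: a minimal usage sketch for running continuous evaluation alongside a separate training job. Directories and the config path are hypothetical; evaluation terminates once no new checkpoint appears within `wait_interval` seconds.

from object_detection import model_hparams
from object_detection import model_lib_v2

model_lib_v2.eval_continuously(
    hparams=model_hparams.create_hparams(None),
    pipeline_config_path='/path/to/pipeline.config',  # hypothetical
    model_dir='/tmp/eval_dir',        # where eval summaries are written
    checkpoint_dir='/tmp/train_dir',  # where the trainer writes checkpoints
    wait_interval=300)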
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for object detection model library."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import tensorflow as tf
from object_detection import model_hparams
from object_detection import model_lib_v2
from object_detection.utils import config_util
# Model for test. Current options are:
# 'ssd_mobilenet_v2_pets_keras'
MODEL_NAME_FOR_TEST = 'ssd_mobilenet_v2_pets_keras'
def _get_data_path():
"""Returns an absolute path to TFRecord file."""
return os.path.join(tf.resource_loader.get_data_files_path(), 'test_data',
'pets_examples.record')
def get_pipeline_config_path(model_name):
"""Returns path to the local pipeline config file."""
return os.path.join(tf.resource_loader.get_data_files_path(), 'samples',
'configs', model_name + '.config')
def _get_labelmap_path():
"""Returns an absolute path to label map file."""
return os.path.join(tf.resource_loader.get_data_files_path(), 'data',
'pet_label_map.pbtxt')
def _get_config_kwarg_overrides():
"""Returns overrides to the configs that insert the correct local paths."""
data_path = _get_data_path()
label_map_path = _get_labelmap_path()
return {
'train_input_path': data_path,
'eval_input_path': data_path,
'label_map_path': label_map_path
}
def _get_configs_for_model(model_name):
"""Returns configurations for model."""
filename = get_pipeline_config_path(model_name)
configs = config_util.get_configs_from_pipeline_file(filename)
configs = config_util.merge_external_params_with_configs(
configs, kwargs_dict=_get_config_kwarg_overrides())
return configs
class ModelLibTest(tf.test.TestCase):
@classmethod
def setUpClass(cls):
tf.keras.backend.clear_session()
def test_train_loop_then_eval_loop(self):
"""Tests that Estimator and input function are constructed correctly."""
hparams = model_hparams.create_hparams(
hparams_overrides='load_pretrained=false')
pipeline_config_path = get_pipeline_config_path(MODEL_NAME_FOR_TEST)
config_kwarg_overrides = _get_config_kwarg_overrides()
model_dir = tf.test.get_temp_dir()
train_steps = 2
model_lib_v2.train_loop(
hparams,
pipeline_config_path,
model_dir=model_dir,
train_steps=train_steps,
checkpoint_every_n=1,
**config_kwarg_overrides)
model_lib_v2.eval_continuously(
hparams,
pipeline_config_path,
model_dir=model_dir,
checkpoint_dir=model_dir,
train_steps=train_steps,
wait_interval=10,
**config_kwarg_overrides)
......@@ -25,6 +25,7 @@ Huang et al. (https://arxiv.org/abs/1611.10012)
import tensorflow as tf
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.utils import variables_helper
from nets import inception_resnet_v2
slim = tf.contrib.slim
......@@ -195,7 +196,7 @@ class FasterRCNNInceptionResnetV2FeatureExtractor(
"""
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
if variable.op.name.startswith(
first_stage_feature_extractor_scope):
var_name = variable.op.name.replace(
......
......@@ -30,6 +30,7 @@ import tensorflow as tf
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.models.keras_models import inception_resnet_v2
from object_detection.utils import model_util
from object_detection.utils import variables_helper
class FasterRCNNInceptionResnetV2KerasFeatureExtractor(
......@@ -1070,7 +1071,7 @@ class FasterRCNNInceptionResnetV2KerasFeatureExtractor(
}
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
var_name = keras_to_slim_name_mapping.get(variable.op.name)
if var_name:
variables_to_restore[var_name] = variable
......
......@@ -23,6 +23,7 @@ https://arxiv.org/abs/1707.07012
import tensorflow as tf
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.utils import variables_helper
from nets.nasnet import nasnet
from nets.nasnet import nasnet_utils
......@@ -307,7 +308,7 @@ class FasterRCNNNASFeatureExtractor(
# Note that the NAS checkpoint only contains the moving average version of
# the Variables so we need to generate an appropriate dictionary mapping.
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
if variable.op.name.startswith(
first_stage_feature_extractor_scope):
var_name = variable.op.name.replace(
......
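Editor's note: the three hunks above replace direct `tf.global_variables()` calls with `variables_helper.get_global_variables_safely()`. A plausible sketch of such a guard is shown below; this is an assumption about its behavior, not the actual implementation, which lives in object_detection/utils/variables_helper.py.

import tensorflow as tf

def get_global_variables_safely():
  # Assumed behavior: the global variables collection is not populated under
  # eager execution, so fail loudly instead of silently returning an empty
  # restore map.
  if tf.executing_eagerly():
    raise ValueError('Global variable collections are not available when '
                     'executing eagerly.')
  return tf.global_variables()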