Unverified Commit fe748d4a authored by pkulzc, committed by GitHub

Object detection changes: (#7208)

257914648  by lzc:

    Internal changes

--
257525973  by Zhichao Lu:

    Fixes bug that silently prevents checkpoints from loading when training w/ eager + functions. Also sets up scripts to run training.

--
257296614  by Zhichao Lu:

    Adding detection_features to model outputs

--
257234565  by Zhichao Lu:

    Fix wrong order of `classes_with_max_scores` in class-agnostic NMS caused by
    sorting in partitioned-NMS.

--
257232002  by ronnyvotel:

    Supporting `filter_nonoverlapping` option in np_box_list_ops.clip_to_window().

--
257198282  by Zhichao Lu:

    Adding the focal loss and l1 loss from the Objects as Points paper.

--
257089535  by Zhichao Lu:

    Create Keras based ssd + resnetv1 + fpn.

--
257087407  by Zhichao Lu:

    Make object_detection/data_decoders Python3-compatible.

--
257004582  by Zhichao Lu:

    Updates _decode_raw_data_into_masks_and_boxes to the latest binary masks-to-string encoding fo...
parent 81123ebf
# Open Images Challenge Evaluation
The Object Detection API is currently supporting several evaluation metrics used in the [Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html).
In addition, several data processing tools are available. Detailed instructions on using the tools for each track are available below.
The Object Detection API is currently supporting several evaluation metrics used
in the
[Open Images Challenge 2018](https://storage.googleapis.com/openimages/web/challenge.html)
and
[Open Images Challenge 2019](https://storage.googleapis.com/openimages/web/challenge2019.html).
In addition, several data processing tools are available. Detailed instructions
on using the tools for each track are available below.
**NOTE**: links to the external website in this tutorial may change after the Open Images Challenge 2018 is finished.
**NOTE:** all data links are updated to the Open Images Challenge 2019.
## Object Detection Track
The [Object Detection metric](https://storage.googleapis.com/openimages/web/object_detection_metric.html) protocol requires a pre-processing of the released data to ensure correct evaluation. The released data contains only leaf-most bounding box annotations and image-level labels.
The evaluation metric implementation is available in the class `OpenImagesDetectionChallengeEvaluator`.
1. Download class hierarchy of Open Images Challenge 2018 in JSON format from [here](https://storage.googleapis.com/openimages/challenge_2018/bbox_labels_500_hierarchy.json).
2. Download ground-truth [bounding boxes](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-annotations-bbox.csv) and [image-level labels](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-annotations-human-imagelabels.csv).
3. Filter the rows corresponding to the validation set images you want to use and store the results in the same CSV format.
4. Run the following command to create hierarchical expansion of the bounding boxes annotations:
The
[Object Detection metric](https://storage.googleapis.com/openimages/web/evaluation.html#object_detection_eval)
protocol requires a pre-processing of the released data to ensure correct
evaluation. The released data contains only leaf-most bounding box annotations
and image-level labels. The evaluation metric implementation is available in the
class `OpenImagesChallengeEvaluator`.
1. Download
[class hierarchy of Open Images Detection Challenge 2019](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-label500-hierarchy.json)
in JSON format.
2. Download
[ground-truth bounding boxes](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-detection-bbox.csv)
and
[image-level labels](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-detection-human-imagelabels.csv).
3. Run the following command to create hierarchical expansion of the bounding
boxes and image-level label annotations:
```
HIERARCHY_FILE=/path/to/bbox_labels_500_hierarchy.json
BOUNDING_BOXES=/path/to/challenge-2018-train-annotations-bbox
IMAGE_LABELS=/path/to/challenge-2018-train-annotations-human-imagelabels
HIERARCHY_FILE=/path/to/challenge-2019-label500-hierarchy.json
BOUNDING_BOXES=/path/to/challenge-2019-validation-detection-bbox
IMAGE_LABELS=/path/to/challenge-2019-validation-detection-human-imagelabels
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
......@@ -33,13 +47,18 @@ python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--annotation_type=2
```
After step 4 you will have produced the ground-truth files suitable for running 'OID Challenge Object Detection Metric 2018' evaluation.
4. If you are not using Tensorflow, you can run evaluation directly using your
algorithm's output and generated ground-truth files.
After step 3 you will have produced the ground-truth files suitable for running 'OID
Challenge Object Detection Metric 2019' evaluation. To run the evaluation, use
the following command:
```
INPUT_PREDICTIONS=/path/to/detection_predictions.csv
OUTPUT_METRICS=/path/to/output/metrics/file
python models/research/object_detection/metrics/oid_od_challenge_evaluation.py \
python models/research/object_detection/metrics/oid_challenge_evaluation.py \
--input_annotations_boxes=${BOUNDING_BOXES}_expanded.csv \
--input_annotations_labels=${IMAGE_LABELS}_expanded.csv \
--input_class_labelmap=object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt \
......@@ -47,66 +66,99 @@ python models/research/object_detection/metrics/oid_od_challenge_evaluation.py \
--output_metrics=${OUTPUT_METRICS} \
```
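The script above is a thin wrapper around the `OpenImagesChallengeEvaluator` class. If you prefer to drive the evaluator from Python directly, a minimal sketch is shown below; it mirrors `metrics/oid_challenge_evaluation.py` further down in this diff, and the module paths, label map entries and file names are illustrative placeholders only.

```
import pandas as pd

from object_detection.metrics import oid_challenge_evaluation_utils as utils
from object_detection.utils import object_detection_evaluation

# Illustrative subset of the 500 challenge classes; in practice build these
# from oid_object_detection_challenge_500_label_map.pbtxt.
categories = [{'id': 1, 'name': '/m/04bcr3'}, {'id': 3, 'name': '/m/02gy9n'}]
class_label_map = {'/m/04bcr3': 1, '/m/02gy9n': 3}

evaluator = object_detection_evaluation.OpenImagesChallengeEvaluator(
    categories, evaluate_masks=False)

# Expanded ground truth produced in step 3; box and image-level label
# annotations are concatenated here (an assumption) so both are visible when
# grouping by image.
all_annotations = pd.concat([
    pd.read_csv('challenge-2019-validation-detection-bbox_expanded.csv'),
    pd.read_csv('challenge-2019-validation-detection-human-imagelabels_expanded.csv')
])
all_predictions = pd.read_csv('detection_predictions.csv')

for image_id, image_groundtruth in all_annotations.groupby('ImageID'):
  evaluator.add_single_ground_truth_image_info(
      image_id,
      utils.build_groundtruth_dictionary(image_groundtruth, class_label_map))

for image_id, image_predictions in all_predictions.groupby('ImageID'):
  evaluator.add_single_detected_image_info(
      image_id,
      utils.build_predictions_dictionary(image_predictions, class_label_map))

metrics = evaluator.evaluate()  # dict mapping metric names (e.g. mAP@0.5IOU) to values
```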
### Running evaluation on CSV files directly
For the Object Detection Track, the participants will be ranked on:
5. If you are not using Tensorflow, you can run evaluation directly using your algorithm's output and generated ground-truth files.
- "OpenImagesDetectionChallenge_Precision/mAP@0.5IOU"
To use evaluation within Tensorflow training, use metric name
`oid_challenge_detection_metrics` in the evaluation config.
### Running evaluation using TF Object Detection API
## Instance Segmentation Track
5. Produce tf.Example files suitable for running inference:
The
[Instance Segmentation metric](https://storage.googleapis.com/openimages/web/evaluation.html#instance_segmentation_eval)
can be directly evaluated using the ground-truth data and model predictions. The
evaluation metric implementation is available in the class
`OpenImagesChallengeEvaluator`.
```
RAW_IMAGES_DIR=/path/to/raw_images_location
OUTPUT_DIR=/path/to/output_tfrecords
python object_detection/dataset_tools/create_oid_tf_record.py \
--input_box_annotations_csv ${BOUNDING_BOXES}_expanded.csv \
--input_image_label_annotations_csv ${IMAGE_LABELS}_expanded.csv \
--input_images_directory ${RAW_IMAGES_DIR} \
--input_label_map object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt \
--output_tf_record_path_prefix ${OUTPUT_DIR} \
--num_shards=100
```
1. Download
[class hierarchy of Open Images Instance Segmentation Challenge 2019](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-label300-segmentable-hierarchy.json)
in JSON format.
2. Download
[ground-truth bounding boxes](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-segmentation-bbox.csv)
and
[image-level labels](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-segmentation-labels.csv).
3. Download instance segmentation files for the validation set (see
[Open Images Challenge Downloads page](https://storage.googleapis.com/openimages/web/challenge2019_downloads.html)).
The download consists of a set of .zip archives containing binary .png
masks.
Those should be transformed into a single CSV file in the format:
ImageID,LabelName,ImageWidth,ImageHeight,XMin,YMin,XMax,YMax,GroupOf,Mask
where `Mask` is the base64-encoded, zlib-compressed MS COCO RLE encoding of the binary mask stored in the .png file.
6. Run inference of your model and fill corresponding fields in tf.Example: see [this tutorial](object_detection/g3doc/oid_inference_and_evaluation.md) on running the inference with Tensorflow Object Detection API models.
NOTE: the util to make the transformation will be released soon.
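In the meantime, the per-mask encoding can be sketched as follows, mirroring the `encode_mask` test helper and the `_decode_raw_data_into_masks_and_boxes` decoder further down in this diff (reading the .png with PIL is an assumption, not part of the released tooling):

```
import base64
import zlib

import numpy as np
from PIL import Image  # assumed here only for reading the challenge .png masks
from pycocotools import mask as coco_mask


def encode_binary_mask_png(png_path):
  """Encodes a binary .png mask into the text expected in the Mask column."""
  binary_mask = np.asarray(Image.open(png_path)) > 0
  # pycocotools expects a Fortran-ordered HxWx1 uint8 array.
  mask_to_encode = np.asfortranarray(
      binary_mask.astype(np.uint8)[:, :, np.newaxis])
  rle_counts = coco_mask.encode(mask_to_encode)[0]['counts']
  # zlib-compress and base64-encode so the mask fits in a CSV text field.
  return base64.b64encode(zlib.compress(rle_counts, zlib.Z_BEST_COMPRESSION))
```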
7. Finally, run the evaluation script to produce the final evaluation result.
4. Run the following command to create hierarchical expansion of the instance
segmentation, bounding boxes and image-level label annotations:
```
INPUT_TFRECORDS_WITH_DETECTIONS=/path/to/tf_records_with_detections
OUTPUT_CONFIG_DIR=/path/to/configs
HIERARCHY_FILE=/path/to/challenge-2019-label300-hierarchy.json
BOUNDING_BOXES=/path/to/challenge-2019-validation-detection-bbox
IMAGE_LABELS=/path/to/challenge-2019-validation-detection-human-imagelabels
echo "
label_map_path: 'object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt'
tf_record_input_reader: { input_path: '${INPUT_TFRECORDS_WITH_DETECTIONS}' }
" > ${OUTPUT_CONFIG_DIR}/input_config.pbtxt
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
--input_annotations=${BOUNDING_BOXES}.csv \
--output_annotations=${BOUNDING_BOXES}_expanded.csv \
--annotation_type=1
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
--input_annotations=${IMAGE_LABELS}.csv \
--output_annotations=${IMAGE_LABELS}_expanded.csv \
--annotation_type=2
echo "
metrics_set: 'oid_challenge_detection_metrics'
" > ${OUTPUT_CONFIG_DIR}/eval_config.pbtxt
python object_detection/dataset_tools/oid_hierarchical_labels_expansion.py \
--json_hierarchy_file=${HIERARCHY_FILE} \
--input_annotations=${INSTANCE_SEGMENTATIONS}.csv \
--output_annotations=${INSTANCE_SEGMENTATIONS}_expanded.csv \
--annotation_type=1
```
OUTPUT_METRICS_DIR=/path/to/metrics_csv
4. If you are not using Tensorflow, you can run evaluation directly using your
algorithm's output and generated ground-truth files.
python object_detection/metrics/offline_eval_map_corloc.py \
--eval_dir=${OUTPUT_METRICS_DIR} \
--eval_config_path=${OUTPUT_CONFIG_DIR}/eval_config.pbtxt \
--input_config_path=${OUTPUT_CONFIG_DIR}/input_config.pbtxt
```
INPUT_PREDICTIONS=/path/to/instance_segmentation_predictions.csv
OUTPUT_METRICS=/path/to/output/metrics/file
The result of the evaluation will be stored in `${OUTPUT_METRICS_DIR}/metrics.csv`
python models/research/object_detection/metrics/oid_challenge_evaluation.py \
--input_annotations_boxes=${BOUNDING_BOXES}_expanded.csv \
--input_annotations_labels=${IMAGE_LABELS}_expanded.csv \
--input_class_labelmap=object_detection/data/oid_object_detection_challenge_500_label_map.pbtxt \
--input_predictions=${INPUT_PREDICTIONS} \
--input_annotations_segm=${INSTANCE_SEGMENTATIONS}_expanded.csv
--output_metrics=${OUTPUT_METRICS} \
```
For the Object Detection Track, the participants will be ranked on:
For the Instance Segmentation Track, the participants will be ranked on:
- "OpenImagesChallenge2018_Precision/mAP@0.5IOU"
- "OpenImagesInstanceSegmentationChallenge_Precision/mAP@0.5IOU"
## Visual Relationships Detection Track
The [Visual Relationships Detection metrics](https://storage.googleapis.com/openimages/web/vrd_detection_metric.html) can be directly evaluated using the ground-truth data and model predictions. The evaluation metric implementation is available in the class `VRDRelationDetectionEvaluator`,`VRDPhraseDetectionEvaluator`.
1. Download the ground-truth [visual relationships annotations](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-vrd.csv) and [image-level labels](https://storage.googleapis.com/openimages/challenge_2018/train/challenge-2018-train-vrd-labels.csv).
2. Filter the rows corresponding to the validation set images you want to use and store the results in the same CSV format.
3. Run the following command to produce final metrics:
The
[Visual Relationships Detection metrics](https://storage.googleapis.com/openimages/web/evaluation.html#visual_relationships_eval)
can be directly evaluated using the ground-truth data and model predictions. The
evaluation metric implementation is available in the classes
`VRDRelationDetectionEvaluator` and `VRDPhraseDetectionEvaluator`.
1. Download the ground-truth
[visual relationships annotations](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-vrd.csv)
and
[image-level labels](https://storage.googleapis.com/openimages/challenge_2019/challenge-2019-validation-vrd-labels.csv).
2. Run the following command to produce final metrics:
```
INPUT_ANNOTATIONS_BOXES=/path/to/challenge-2018-train-vrd.csv
......
......@@ -138,6 +138,8 @@ Model name
[^2]: This is PASCAL mAP with a slightly different way of true positives computation: see [Open Images evaluation protocols](evaluation_protocols.md), oid_V2_detection_metrics.
[^3]: Non-face boxes are dropped during training and non-face groundtruth boxes are ignored when evaluating.
[^4]: This is Open Images Challenge metric: see [Open Images evaluation protocols](evaluation_protocols.md), oid_challenge_detection_metrics.
......@@ -135,22 +135,29 @@ output bounding-boxes labelled in the same manner.
The old metric name is DEPRECATED.
`EvalConfig.metrics_set='open_images_V2_detection_metrics'`
## OID Challenge Object Detection Metric 2018
## OID Challenge Object Detection Metric
`EvalConfig.metrics_set='oid_challenge_detection_metrics'`
The metric for the OID Challenge Object Detection Metric 2018, Object Detection
track. The description is provided on the [Open Images Challenge
website](https://storage.googleapis.com/openimages/web/challenge.html).
The metric for the OID Challenge 2018/2019, Object Detection track. The
description is provided on the
[Open Images Challenge website](https://storage.googleapis.com/openimages/web/evaluation.html#object_detection_eval).
The old metric name is DEPRECATED.
`EvalConfig.metrics_set='oid_challenge_object_detection_metrics'`
## OID Challenge Visual Relationship Detection Metric 2018
## OID Challenge Visual Relationship Detection Metric
The metric for the OID Challenge Visual Relationship Detection Metric 2018, Visual
Relationship Detection track. The description is provided on the [Open Images
Challenge
website](https://storage.googleapis.com/openimages/web/challenge.html). Note:
this is currently a stand-alone metric, that can be used only through the
The metric for the OID Challenge 2018/2019, Visual Relationship Detection
track. The description is provided on the
[Open Images Challenge website](https://storage.googleapis.com/openimages/web/evaluation.html#visual_relationships_eval).
Note: this is currently a stand-alone metric that can be used only through the
`metrics/oid_vrd_challenge_evaluation.py` util.
## OID Challenge Instance Segmentation Metric
`EvalConfig.metrics_set='oid_challenge_segmentation_metrics'`
The metric for the OID Challenge Instance Segmentation Metric 2019, Instance
Segmentation track. The description is provided on the
[Open Images Challenge website](https://storage.googleapis.com/openimages/web/evaluation.html#instance_segmentation_eval).
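As a hedged illustration (assuming the `eval_pb2` module generated from `object_detection/protos/eval.proto`, in which `metrics_set` is a repeated string field), the same metric selection can be made from Python instead of a hand-written pbtxt:

```
from google.protobuf import text_format

from object_detection.protos import eval_pb2

# Equivalent to the eval_config.pbtxt snippets above; several metrics can be
# requested at once since metrics_set is repeated.
eval_config = text_format.Parse(
    "metrics_set: 'oid_challenge_detection_metrics'", eval_pb2.EvalConfig())
eval_config.metrics_set.append('oid_challenge_segmentation_metrics')
```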
......@@ -47,6 +47,22 @@ INPUT_BUILDER_UTIL_MAP = {
}
def _multiclass_scores_or_one_hot_labels(multiclass_scores,
groundtruth_boxes,
groundtruth_classes, num_classes):
"""Returns one-hot encoding of classes when multiclass_scores is empty."""
# Replace the groundtruth_classes tensor with the multiclass_scores tensor
# when it is non-empty. If multiclass_scores is empty, fall back on the
# groundtruth_classes tensor.
def true_fn():
return tf.reshape(multiclass_scores,
[tf.shape(groundtruth_boxes)[0], num_classes])
def false_fn():
return tf.one_hot(groundtruth_classes, num_classes)
return tf.cond(tf.size(multiclass_scores) > 0, true_fn, false_fn)
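# Illustration of the fallback (mirroring the tests added further down in this
# diff): with two groundtruth boxes and num_classes=3,
#   multiclass_scores = [0.2, 0.3, 0.5, 0.1, 0.6, 0.3]
#     -> reshaped to [[0.2, 0.3, 0.5], [0.1, 0.6, 0.3]];
#   multiclass_scores = []  (empty)
#     -> tf.one_hot(groundtruth_classes, num_classes) is used instead.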
def transform_input_data(tensor_dict,
model_preprocess_fn,
image_resizer_fn,
......@@ -89,102 +105,106 @@ def transform_input_data(tensor_dict,
and classes for a given image if the boxes are exactly the same.
retain_original_image: (optional) whether to retain original image in the
output dictionary.
use_multiclass_scores: whether to use multiclass scores as
class targets instead of one-hot encoding of `groundtruth_classes`.
use_multiclass_scores: whether to use multiclass scores as class targets
instead of one-hot encoding of `groundtruth_classes`. When
this is True and multiclass_scores is empty, one-hot encoding of
`groundtruth_classes` is used as a fallback.
use_bfloat16: (optional) a bool, whether to use bfloat16 in training.
Returns:
A dictionary keyed by fields.InputDataFields containing the tensors obtained
after applying all the transformations.
"""
# Reshape flattened multiclass scores tensor into a 2D tensor of shape
# [num_boxes, num_classes].
if fields.InputDataFields.multiclass_scores in tensor_dict:
tensor_dict[fields.InputDataFields.multiclass_scores] = tf.reshape(
tensor_dict[fields.InputDataFields.multiclass_scores], [
tf.shape(tensor_dict[fields.InputDataFields.groundtruth_boxes])[0],
num_classes
])
if fields.InputDataFields.groundtruth_boxes in tensor_dict:
tensor_dict = util_ops.filter_groundtruth_with_nan_box_coordinates(
tensor_dict)
tensor_dict = util_ops.filter_unrecognized_classes(tensor_dict)
out_tensor_dict = tensor_dict.copy()
if fields.InputDataFields.multiclass_scores in out_tensor_dict:
out_tensor_dict[
fields.InputDataFields
.multiclass_scores] = _multiclass_scores_or_one_hot_labels(
out_tensor_dict[fields.InputDataFields.multiclass_scores],
out_tensor_dict[fields.InputDataFields.groundtruth_boxes],
out_tensor_dict[fields.InputDataFields.groundtruth_classes],
num_classes)
if fields.InputDataFields.groundtruth_boxes in out_tensor_dict:
out_tensor_dict = util_ops.filter_groundtruth_with_nan_box_coordinates(
out_tensor_dict)
out_tensor_dict = util_ops.filter_unrecognized_classes(out_tensor_dict)
if retain_original_image:
tensor_dict[fields.InputDataFields.original_image] = tf.cast(
image_resizer_fn(tensor_dict[fields.InputDataFields.image], None)[0],
tf.uint8)
out_tensor_dict[fields.InputDataFields.original_image] = tf.cast(
image_resizer_fn(out_tensor_dict[fields.InputDataFields.image],
None)[0], tf.uint8)
if fields.InputDataFields.image_additional_channels in tensor_dict:
channels = tensor_dict[fields.InputDataFields.image_additional_channels]
tensor_dict[fields.InputDataFields.image] = tf.concat(
[tensor_dict[fields.InputDataFields.image], channels], axis=2)
if fields.InputDataFields.image_additional_channels in out_tensor_dict:
channels = out_tensor_dict[fields.InputDataFields.image_additional_channels]
out_tensor_dict[fields.InputDataFields.image] = tf.concat(
[out_tensor_dict[fields.InputDataFields.image], channels], axis=2)
# Apply data augmentation ops.
if data_augmentation_fn is not None:
tensor_dict = data_augmentation_fn(tensor_dict)
out_tensor_dict = data_augmentation_fn(out_tensor_dict)
# Apply model preprocessing ops and resize instance masks.
image = tensor_dict[fields.InputDataFields.image]
image = out_tensor_dict[fields.InputDataFields.image]
preprocessed_resized_image, true_image_shape = model_preprocess_fn(
tf.expand_dims(tf.cast(image, dtype=tf.float32), axis=0))
if use_bfloat16:
preprocessed_resized_image = tf.cast(
preprocessed_resized_image, tf.bfloat16)
tensor_dict[fields.InputDataFields.image] = tf.squeeze(
out_tensor_dict[fields.InputDataFields.image] = tf.squeeze(
preprocessed_resized_image, axis=0)
tensor_dict[fields.InputDataFields.true_image_shape] = tf.squeeze(
out_tensor_dict[fields.InputDataFields.true_image_shape] = tf.squeeze(
true_image_shape, axis=0)
if fields.InputDataFields.groundtruth_instance_masks in tensor_dict:
masks = tensor_dict[fields.InputDataFields.groundtruth_instance_masks]
if fields.InputDataFields.groundtruth_instance_masks in out_tensor_dict:
masks = out_tensor_dict[fields.InputDataFields.groundtruth_instance_masks]
_, resized_masks, _ = image_resizer_fn(image, masks)
if use_bfloat16:
resized_masks = tf.cast(resized_masks, tf.bfloat16)
tensor_dict[fields.InputDataFields.
groundtruth_instance_masks] = resized_masks
out_tensor_dict[
fields.InputDataFields.groundtruth_instance_masks] = resized_masks
# Transform groundtruth classes to one hot encodings.
label_offset = 1
zero_indexed_groundtruth_classes = tensor_dict[
zero_indexed_groundtruth_classes = out_tensor_dict[
fields.InputDataFields.groundtruth_classes] - label_offset
tensor_dict[fields.InputDataFields.groundtruth_classes] = tf.one_hot(
zero_indexed_groundtruth_classes, num_classes)
if use_multiclass_scores:
tensor_dict[fields.InputDataFields.groundtruth_classes] = tensor_dict[
fields.InputDataFields.multiclass_scores]
tensor_dict.pop(fields.InputDataFields.multiclass_scores, None)
out_tensor_dict[
fields.InputDataFields.groundtruth_classes] = out_tensor_dict[
fields.InputDataFields.multiclass_scores]
else:
out_tensor_dict[fields.InputDataFields.groundtruth_classes] = tf.one_hot(
zero_indexed_groundtruth_classes, num_classes)
out_tensor_dict.pop(fields.InputDataFields.multiclass_scores, None)
if fields.InputDataFields.groundtruth_confidences in tensor_dict:
groundtruth_confidences = tensor_dict[
if fields.InputDataFields.groundtruth_confidences in out_tensor_dict:
groundtruth_confidences = out_tensor_dict[
fields.InputDataFields.groundtruth_confidences]
# Map the confidences to the one-hot encoding of classes
tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
out_tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
tf.reshape(groundtruth_confidences, [-1, 1]) *
tensor_dict[fields.InputDataFields.groundtruth_classes])
out_tensor_dict[fields.InputDataFields.groundtruth_classes])
else:
groundtruth_confidences = tf.ones_like(
zero_indexed_groundtruth_classes, dtype=tf.float32)
tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
tensor_dict[fields.InputDataFields.groundtruth_classes])
out_tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
out_tensor_dict[fields.InputDataFields.groundtruth_classes])
if merge_multiple_boxes:
merged_boxes, merged_classes, merged_confidences, _ = (
util_ops.merge_boxes_with_multiple_labels(
tensor_dict[fields.InputDataFields.groundtruth_boxes],
out_tensor_dict[fields.InputDataFields.groundtruth_boxes],
zero_indexed_groundtruth_classes,
groundtruth_confidences,
num_classes))
merged_classes = tf.cast(merged_classes, tf.float32)
tensor_dict[fields.InputDataFields.groundtruth_boxes] = merged_boxes
tensor_dict[fields.InputDataFields.groundtruth_classes] = merged_classes
tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
out_tensor_dict[fields.InputDataFields.groundtruth_boxes] = merged_boxes
out_tensor_dict[fields.InputDataFields.groundtruth_classes] = merged_classes
out_tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
merged_confidences)
if fields.InputDataFields.groundtruth_boxes in tensor_dict:
tensor_dict[fields.InputDataFields.num_groundtruth_boxes] = tf.shape(
tensor_dict[fields.InputDataFields.groundtruth_boxes])[0]
if fields.InputDataFields.groundtruth_boxes in out_tensor_dict:
out_tensor_dict[fields.InputDataFields.num_groundtruth_boxes] = tf.shape(
out_tensor_dict[fields.InputDataFields.groundtruth_boxes])[0]
return tensor_dict
return out_tensor_dict
def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
......
......@@ -611,6 +611,62 @@ class DataTransformationFnTest(test_case.TestCase):
self.assertAllClose(transformed_inputs[fields.InputDataFields.image],
np.concatenate((image, additional_channels), axis=2))
def test_use_multiclass_scores_when_present(self):
image = np.random.rand(4, 4, 3).astype(np.float32)
tensor_dict = {
fields.InputDataFields.image:
tf.constant(image),
fields.InputDataFields.groundtruth_boxes:
tf.constant(np.array([[.5, .5, 1, 1], [.5, .5, 1, 1]], np.float32)),
fields.InputDataFields.multiclass_scores:
tf.constant(np.array([0.2, 0.3, 0.5, 0.1, 0.6, 0.3], np.float32)),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([1, 2], np.int32))
}
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=3, use_multiclass_scores=True)
with self.test_session() as sess:
transformed_inputs = sess.run(
input_transformation_fn(tensor_dict=tensor_dict))
self.assertAllClose(
np.array([[0.2, 0.3, 0.5], [0.1, 0.6, 0.3]], np.float32),
transformed_inputs[fields.InputDataFields.groundtruth_classes])
def test_use_multiclass_scores_when_not_present(self):
image = np.random.rand(4, 4, 3).astype(np.float32)
tensor_dict = {
fields.InputDataFields.image:
tf.constant(image),
fields.InputDataFields.groundtruth_boxes:
tf.constant(np.array([[.5, .5, 1, 1], [.5, .5, 1, 1]], np.float32)),
fields.InputDataFields.multiclass_scores:
tf.placeholder(tf.float32),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([1, 2], np.int32))
}
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=3, use_multiclass_scores=True)
with self.test_session() as sess:
transformed_inputs = sess.run(
input_transformation_fn(tensor_dict=tensor_dict),
feed_dict={
tensor_dict[fields.InputDataFields.multiclass_scores]:
np.array([], dtype=np.float32)
})
self.assertAllClose(
np.array([[0, 1, 0], [0, 0, 1]], np.float32),
transformed_inputs[fields.InputDataFields.groundtruth_classes])
def test_returns_correct_class_label_encodings(self):
tensor_dict = {
fields.InputDataFields.image:
......
......@@ -108,6 +108,7 @@ from object_detection.core import standard_fields as fields
from object_detection.core import target_assigner
from object_detection.utils import ops
from object_detection.utils import shape_utils
from object_detection.utils import variables_helper
slim = tf.contrib.slim
......@@ -210,7 +211,7 @@ class FasterRCNNFeatureExtractor(object):
the model graph.
"""
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
for scope_name in [first_stage_feature_extractor_scope,
second_stage_feature_extractor_scope]:
if variable.op.name.startswith(scope_name):
......@@ -275,7 +276,7 @@ class FasterRCNNKerasFeatureExtractor(object):
the model graph.
"""
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
for scope_name in [first_stage_feature_extractor_scope,
second_stage_feature_extractor_scope]:
if variable.op.name.startswith(scope_name):
......@@ -1193,6 +1194,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
detection_masks = self._gather_instance_masks(
detection_masks, detection_classes)
detection_masks = tf.cast(detection_masks, tf.float32)
prediction_dict[fields.DetectionResultFields.detection_masks] = (
tf.reshape(tf.sigmoid(detection_masks),
[batch_size, max_detection, mask_height, mask_width]))
......@@ -1461,9 +1463,10 @@ class FasterRCNNMetaArch(model.DetectionModel):
mask_predictions=mask_predictions)
if 'rpn_features_to_crop' in prediction_dict and self._initial_crop_size:
self._add_detection_features_output_node(
detections_dict[fields.DetectionResultFields.detection_boxes],
prediction_dict['rpn_features_to_crop'])
detections_dict[
'detection_features'] = self._add_detection_features_output_node(
detections_dict[fields.DetectionResultFields.detection_boxes],
prediction_dict['rpn_features_to_crop'])
return detections_dict
......@@ -1474,18 +1477,25 @@ class FasterRCNNMetaArch(model.DetectionModel):
def _add_detection_features_output_node(self, detection_boxes,
rpn_features_to_crop):
"""Add the detection features to the output node.
"""Add detection features to outputs.
The detection features are from cropping rpn_features with boxes.
Each bounding box has one feature vector of length depth, which comes from
mean_pooling of the cropped rpn_features.
This function extracts box features for each box in rpn_features_to_crop.
It returns the extracted box features, reshaped to
[batch size, max_detections, height, width, depth], average pools the
extracted features across the spatial dimensions, and adds a graph node for
the pooled features named 'pooled_detection_features'.
Args:
detection_boxes: a 3-D float32 tensor of shape
[batch_size, max_detection, 4] which represents the bounding boxes.
[batch_size, max_detections, 4] which represents the bounding boxes.
rpn_features_to_crop: A 4-D float32 tensor with shape
[batch, height, width, depth] representing image features to crop using
the proposals boxes.
Returns:
detection_features: a 5-D float32 tensor of shape
[batch size, max_detections, height, width, depth] representing
cropped image features
"""
with tf.name_scope('SecondStageDetectionFeaturesExtract'):
flattened_detected_feature_maps = (
......@@ -1495,15 +1505,23 @@ class FasterRCNNMetaArch(model.DetectionModel):
flattened_detected_feature_maps)
batch_size = tf.shape(detection_boxes)[0]
max_detection = tf.shape(detection_boxes)[1]
max_detections = tf.shape(detection_boxes)[1]
detection_features_pool = tf.reduce_mean(
detection_features_unpooled, axis=[1, 2])
detection_features = tf.reshape(
reshaped_detection_features_pool = tf.reshape(
detection_features_pool,
[batch_size, max_detection, tf.shape(detection_features_pool)[-1]])
[batch_size, max_detections, tf.shape(detection_features_pool)[-1]])
reshaped_detection_features_pool = tf.identity(
reshaped_detection_features_pool, 'pooled_detection_features')
detection_features = tf.identity(
detection_features, 'detection_features')
reshaped_detection_features = tf.reshape(
detection_features_unpooled,
[batch_size, max_detections,
tf.shape(detection_features_unpooled)[1],
tf.shape(detection_features_unpooled)[2],
tf.shape(detection_features_unpooled)[3]])
return reshaped_detection_features
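# Illustration (not part of the API surface): given the `detections` dict
# returned by postprocess(), 'detection_features' has shape
# [batch_size, max_detections, height, width, depth], so a per-box embedding
# of length depth (the same values exposed by the 'pooled_detection_features'
# node above) can be recovered with
#   tf.reduce_mean(detections['detection_features'], axis=[2, 3])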
def _postprocess_rpn(self,
rpn_box_encodings_batch,
......@@ -1749,6 +1767,15 @@ class FasterRCNNMetaArch(model.DetectionModel):
resized_masks_list.append(resized_mask)
groundtruth_masks_list = resized_masks_list
# Masks could be set to bfloat16 in the input pipeline for performance
# reasons. Convert masks back to floating point space here since the rest of
# this module assumes groundtruth to be of float32 type.
float_groundtruth_masks_list = []
if groundtruth_masks_list:
for mask in groundtruth_masks_list:
float_groundtruth_masks_list.append(tf.cast(mask, tf.float32))
groundtruth_masks_list = float_groundtruth_masks_list
if self.groundtruth_has_field(fields.BoxListFields.weights):
groundtruth_weights_list = self.groundtruth_lists(
fields.BoxListFields.weights)
......@@ -2619,7 +2646,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
self.first_stage_feature_extractor_scope,
self.second_stage_feature_extractor_scope)
variables_to_restore = tf.global_variables()
variables_to_restore = variables_helper.get_global_variables_safely()
variables_to_restore.append(slim.get_or_create_global_step())
# Only load feature extractor variables to be consistent with loading from
# a classification checkpoint.
......
......@@ -383,6 +383,11 @@ class FasterRCNNMetaArchTest(
class_predictions_with_background_shapes = [(16, 3), (None, 3)]
proposal_boxes_shapes = [(2, 8, 4), (None, 8, 4)]
batch_size = 2
initial_crop_size = 3
maxpool_stride = 1
height = initial_crop_size/maxpool_stride
width = initial_crop_size/maxpool_stride
depth = 3
image_shape = np.array((2, 36, 48, 3), dtype=np.int32)
for (num_proposals_shape, refined_box_encoding_shape,
class_predictions_with_background_shape,
......@@ -433,6 +438,7 @@ class FasterRCNNMetaArchTest(
'detection_scores': tf.zeros([2, 5]),
'detection_classes': tf.zeros([2, 5]),
'num_detections': tf.zeros([2]),
'detection_features': tf.zeros([2, 5, width, height, depth])
}, true_image_shapes)
with self.test_session(graph=tf_graph) as sess:
detections_out = sess.run(
......@@ -453,6 +459,9 @@ class FasterRCNNMetaArchTest(
self.assertAllClose(detections_out['num_detections'].shape, [2])
self.assertTrue(np.amax(detections_out['detection_masks'] <= 1.0))
self.assertTrue(np.amin(detections_out['detection_masks'] >= 0.0))
self.assertAllEqual(detections_out['detection_features'].shape,
[2, 5, width, height, depth])
self.assertGreaterEqual(np.amax(detections_out['detection_features']), 0)
def _get_box_classifier_features_shape(self,
image_size,
......
......@@ -28,6 +28,7 @@ from object_detection.core import standard_fields as fields
from object_detection.core import target_assigner
from object_detection.utils import ops
from object_detection.utils import shape_utils
from object_detection.utils import variables_helper
from object_detection.utils import visualization_utils
slim = tf.contrib.slim
......@@ -45,6 +46,7 @@ class SSDFeatureExtractor(object):
reuse_weights=None,
use_explicit_padding=False,
use_depthwise=False,
num_layers=6,
override_base_feature_extractor_hyperparams=False):
"""Constructor.
......@@ -61,6 +63,7 @@ class SSDFeatureExtractor(object):
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
num_layers: Number of SSD layers.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_fn`.
......@@ -73,6 +76,7 @@ class SSDFeatureExtractor(object):
self._reuse_weights = reuse_weights
self._use_explicit_padding = use_explicit_padding
self._use_depthwise = use_depthwise
self._num_layers = num_layers
self._override_base_feature_extractor_hyperparams = (
override_base_feature_extractor_hyperparams)
......@@ -126,7 +130,7 @@ class SSDFeatureExtractor(object):
the model graph.
"""
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
var_name = variable.op.name
if var_name.startswith(feature_extractor_scope + '/'):
var_name = var_name.replace(feature_extractor_scope + '/', '')
......@@ -148,6 +152,7 @@ class SSDKerasFeatureExtractor(tf.keras.Model):
inplace_batchnorm_update,
use_explicit_padding=False,
use_depthwise=False,
num_layers=6,
override_base_feature_extractor_hyperparams=False,
name=None):
"""Constructor.
......@@ -172,6 +177,7 @@ class SSDKerasFeatureExtractor(tf.keras.Model):
use_explicit_padding: Whether to use explicit padding when extracting
features. Default is False.
use_depthwise: Whether to use depthwise convolutions. Default is False.
num_layers: Number of SSD layers.
override_base_feature_extractor_hyperparams: Whether to override
hyperparameters of the base feature extractor with the one from
`conv_hyperparams_config`.
......@@ -189,6 +195,7 @@ class SSDKerasFeatureExtractor(tf.keras.Model):
self._inplace_batchnorm_update = inplace_batchnorm_update
self._use_explicit_padding = use_explicit_padding
self._use_depthwise = use_depthwise
self._num_layers = num_layers
self._override_base_feature_extractor_hyperparams = (
override_base_feature_extractor_hyperparams)
......@@ -247,11 +254,13 @@ class SSDKerasFeatureExtractor(tf.keras.Model):
the model graph.
"""
variables_to_restore = {}
for variable in tf.global_variables():
var_name = variable.op.name
for variable in self.variables:
# variable.name includes ":0" at the end, but the names in the checkpoint
# do not have the suffix ":0". So, we strip it here.
var_name = variable.name[:-2]
if var_name.startswith(feature_extractor_scope + '/'):
var_name = var_name.replace(feature_extractor_scope + '/', '')
variables_to_restore[var_name] = variable
variables_to_restore[var_name] = variable
return variables_to_restore
......@@ -709,6 +718,14 @@ class SSDMetaArch(model.DetectionModel):
additional_fields = {
'multiclass_scores': detection_scores_with_background
}
if self._anchors is not None:
anchor_indices = tf.range(self._anchors.num_boxes_static())
batch_anchor_indices = tf.tile(
tf.expand_dims(anchor_indices, 0), [batch_size, 1])
# All additional fields need to be float.
additional_fields.update({
'anchor_indices': tf.cast(batch_anchor_indices, tf.float32),
})
if detection_keypoints is not None:
detection_keypoints = tf.identity(
detection_keypoints, 'raw_keypoint_locations')
......@@ -737,6 +754,12 @@ class SSDMetaArch(model.DetectionModel):
fields.DetectionResultFields.raw_detection_scores:
detection_scores_with_background
}
if (nmsed_additional_fields is not None and
'anchor_indices' in nmsed_additional_fields):
detection_dict.update({
fields.DetectionResultFields.detection_anchor_indices:
tf.cast(nmsed_additional_fields['anchor_indices'], tf.int32),
})
if (nmsed_additional_fields is not None and
fields.BoxListFields.keypoints in nmsed_additional_fields):
detection_dict[fields.DetectionResultFields.detection_keypoints] = (
......@@ -1218,13 +1241,24 @@ class SSDMetaArch(model.DetectionModel):
if fine_tune_checkpoint_type == 'detection':
variables_to_restore = {}
for variable in tf.global_variables():
var_name = variable.op.name
if load_all_detection_checkpoint_vars:
variables_to_restore[var_name] = variable
else:
if var_name.startswith(self._extract_features_scope):
if tf.executing_eagerly():
for variable in self.variables:
# variable.name includes ":0" at the end, but the names in the
# checkpoint do not have the suffix ":0". So, we strip it here.
var_name = variable.name[:-2]
if load_all_detection_checkpoint_vars:
variables_to_restore[var_name] = variable
else:
if var_name.startswith(self._extract_features_scope):
variables_to_restore[var_name] = variable
else:
for variable in variables_helper.get_global_variables_safely():
var_name = variable.op.name
if load_all_detection_checkpoint_vars:
variables_to_restore[var_name] = variable
else:
if var_name.startswith(self._extract_features_scope):
variables_to_restore[var_name] = variable
return variables_to_restore
......
......@@ -188,6 +188,7 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
[0.5, 0., 1., 0.5], [1., 1., 1.5, 1.5]]]
raw_detection_scores = [[[0, 0], [0, 0], [0, 0], [0, 0]],
[[0, 0], [0, 0], [0, 0], [0, 0]]]
detection_anchor_indices = [[0, 2, 1, 0, 0], [0, 2, 1, 0, 0]]
for input_shape in input_shapes:
tf_graph = tf.Graph()
......@@ -229,6 +230,8 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
raw_detection_boxes)
self.assertAllEqual(detections_out['raw_detection_scores'],
raw_detection_scores)
self.assertAllEqual(detections_out['detection_anchor_indices'],
detection_anchor_indices)
def test_postprocess_results_are_correct_static(self, use_keras):
with tf.Graph().as_default():
......
......@@ -13,7 +13,12 @@
# limitations under the License.
# ==============================================================================
"""Class for evaluating object detections with COCO metrics."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from six.moves import zip
import tensorflow as tf
from object_detection.core import standard_fields
......
......@@ -39,6 +39,10 @@ then evaluation (in multi-class mode) can be invoked as follows:
metrics = evaluator.ComputeMetrics()
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from collections import OrderedDict
import copy
import time
......@@ -48,6 +52,8 @@ from pycocotools import coco
from pycocotools import cocoeval
from pycocotools import mask
from six.moves import range
from six.moves import zip
import tensorflow as tf
from object_detection.utils import json_utils
......
......@@ -40,6 +40,8 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import logging
from absl import app
from absl import flags
import pandas as pd
......@@ -120,20 +122,22 @@ def main(unused_argv):
object_detection_evaluation.OpenImagesChallengeEvaluator(
categories, evaluate_masks=is_instance_segmentation_eval))
all_predictions = pd.read_csv(FLAGS.input_predictions)
images_processed = 0
for _, groundtruth in enumerate(all_annotations.groupby('ImageID')):
logging.info('Processing image %d', images_processed)
image_id, image_groundtruth = groundtruth
groundtruth_dictionary = utils.build_groundtruth_dictionary(
image_groundtruth, class_label_map)
challenge_evaluator.add_single_ground_truth_image_info(
image_id, groundtruth_dictionary)
all_predictions = pd.read_csv(FLAGS.input_predictions)
for _, prediction_data in enumerate(all_predictions.groupby('ImageID')):
image_id, image_predictions = prediction_data
prediction_dictionary = utils.build_predictions_dictionary(
image_predictions, class_label_map)
all_predictions.loc[all_predictions['ImageID'] == image_id],
class_label_map)
challenge_evaluator.add_single_detected_image_info(image_id,
prediction_dictionary)
images_processed += 1
metrics = challenge_evaluator.evaluate()
......
......@@ -18,10 +18,13 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import base64
import zlib
import numpy as np
import pandas as pd
from pycocotools import mask as coco_mask
from pycocotools import mask
from object_detection.core import standard_fields
......@@ -53,33 +56,42 @@ def _decode_raw_data_into_masks_and_boxes(segments, image_widths,
"""Decods binary segmentation masks into np.arrays and boxes.
Args:
segments: pandas Series object containing either None entries or strings
with COCO-encoded binary masks. All masks are expected to be the same size.
segments: pandas Series object containing either
None entries, or strings with
base64, zlib compressed, COCO RLE-encoded binary masks.
All masks are expected to be the same size.
image_widths: pandas Series of mask widths.
image_heights: pandas Series of mask heights.
Returns:
a np.ndarray of the size NxWxH, where W and H are determined from the encoded
masks; for the None values, zero arrays of size WxH are created. if input
masks; for the None values, zero arrays of size WxH are created. If input
contains only None values, W=1, H=1.
"""
segment_masks = []
segment_boxes = []
ind = segments.first_valid_index()
if ind is not None:
size = [int(image_heights.iloc[ind]), int(image_widths[ind])]
size = [int(image_heights[ind]), int(image_widths[ind])]
else:
# It does not matter which size we pick since no masks will ever be
# evaluated.
size = [1, 1]
return np.zeros((segments.shape[0], 1, 1), dtype=np.uint8), np.zeros(
(segments.shape[0], 4), dtype=np.float32)
for segment, im_width, im_height in zip(segments, image_widths,
image_heights):
if pd.isnull(segment):
segment_masks.append(np.zeros([1, size[0], size[1]], dtype=np.uint8))
segment_boxes.append(np.expand_dims(np.array([0.0, 0.0, 0.0, 0.0]), 0))
else:
encoding_dict = {'size': [im_height, im_width], 'counts': segment}
mask_tensor = mask.decode(encoding_dict)
compressed_mask = base64.b64decode(segment)
rle_encoded_mask = zlib.decompress(compressed_mask)
decoding_dict = {
'size': [im_height, im_width],
'counts': rle_encoded_mask
}
mask_tensor = coco_mask.decode(decoding_dict)
segment_masks.append(np.expand_dims(mask_tensor, 0))
segment_boxes.append(np.expand_dims(_to_normalized_box(mask_tensor), 0))
......
......@@ -18,15 +18,43 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import base64
import zlib
import numpy as np
import pandas as pd
from pycocotools import mask
from pycocotools import mask as coco_mask
import tensorflow as tf
from object_detection.core import standard_fields
from object_detection.metrics import oid_challenge_evaluation_utils as utils
def encode_mask(mask_to_encode):
"""Encodes a binary mask into the Kaggle challenge text format.
The encoding is done in three stages:
- COCO RLE-encoding,
- zlib compression,
- base64 encoding (to use as entry in csv file).
Args:
mask_to_encode: binary np.ndarray of dtype bool and 2d shape.
Returns:
A (base64) text string of the encoded mask.
"""
mask_to_encode = np.squeeze(mask_to_encode)
mask_to_encode = mask_to_encode.reshape(mask_to_encode.shape[0],
mask_to_encode.shape[1], 1)
mask_to_encode = mask_to_encode.astype(np.uint8)
mask_to_encode = np.asfortranarray(mask_to_encode)
encoded_mask = coco_mask.encode(mask_to_encode)[0]['counts']
compressed_mask = zlib.compress(encoded_mask, zlib.Z_BEST_COMPRESSION)
base64_mask = base64.b64encode(compressed_mask)
return base64_mask
class OidUtilTest(tf.test.TestCase):
def testMaskToNormalizedBox(self):
......@@ -44,10 +72,10 @@ class OidUtilTest(tf.test.TestCase):
mask1 = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]], dtype=np.uint8)
mask2 = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], dtype=np.uint8)
encoding1 = mask.encode(np.asfortranarray(mask1))
encoding2 = mask.encode(np.asfortranarray(mask2))
encoding1 = encode_mask(mask1)
encoding2 = encode_mask(mask2)
vals = pd.Series([encoding1['counts'], encoding2['counts']])
vals = pd.Series([encoding1, encoding2])
image_widths = pd.Series([mask1.shape[1], mask2.shape[1]])
image_heights = pd.Series([mask1.shape[0], mask2.shape[0]])
......@@ -60,6 +88,15 @@ class OidUtilTest(tf.test.TestCase):
self.assertAllEqual(expected_segm, segm)
self.assertAllEqual(expected_bbox, bbox)
def testDecodeToTensorsNoMasks(self):
vals = pd.Series([None, None])
image_widths = pd.Series([None, None])
image_heights = pd.Series([None, None])
segm, bbox = utils._decode_raw_data_into_masks_and_boxes(
vals, image_widths, image_heights)
self.assertAllEqual(np.zeros((2, 1, 1), dtype=np.uint8), segm)
self.assertAllEqual(np.zeros((2, 4), dtype=np.float32), bbox)
class OidChallengeEvaluationUtilTest(tf.test.TestCase):
......@@ -140,13 +177,13 @@ class OidChallengeEvaluationUtilTest(tf.test.TestCase):
mask2 = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
dtype=np.uint8)
encoding1 = mask.encode(np.asfortranarray(mask1))
encoding2 = mask.encode(np.asfortranarray(mask2))
encoding1 = encode_mask(mask1)
encoding2 = encode_mask(mask2)
np_data = pd.DataFrame(
[[
'fe58ec1b06db2bb7', mask1.shape[1], mask1.shape[0], '/m/04bcr3',
0.0, 0.3, 0.5, 0.6, 0, None, encoding1['counts']
0.0, 0.3, 0.5, 0.6, 0, None, encoding1
],
[
'fe58ec1b06db2bb7', None, None, '/m/02gy9n', 0.1, 0.2, 0.3, 0.4, 1,
......@@ -154,7 +191,7 @@ class OidChallengeEvaluationUtilTest(tf.test.TestCase):
],
[
'fe58ec1b06db2bb7', mask2.shape[1], mask2.shape[0], '/m/02gy9n',
0.5, 0.6, 0.8, 0.9, 0, None, encoding2['counts']
0.5, 0.6, 0.8, 0.9, 0, None, encoding2
],
[
'fe58ec1b06db2bb7', None, None, '/m/04bcr3', None, None, None,
......@@ -218,21 +255,21 @@ class OidChallengeEvaluationUtilTest(tf.test.TestCase):
mask2 = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
dtype=np.uint8)
encoding1 = mask.encode(np.asfortranarray(mask1))
encoding2 = mask.encode(np.asfortranarray(mask2))
encoding1 = encode_mask(mask1)
encoding2 = encode_mask(mask2)
np_data = pd.DataFrame(
[[
'fe58ec1b06db2bb7', mask1.shape[1], mask1.shape[0], '/m/04bcr3',
encoding1['counts'], 0.8
],
[
'fe58ec1b06db2bb7', mask2.shape[1], mask2.shape[0], '/m/02gy9n',
encoding2['counts'], 0.6
]],
columns=[
'ImageID', 'ImageWidth', 'ImageHeight', 'LabelName', 'Mask', 'Score'
])
np_data = pd.DataFrame([[
'fe58ec1b06db2bb7', mask1.shape[1], mask1.shape[0], '/m/04bcr3',
encoding1, 0.8
],
[
'fe58ec1b06db2bb7', mask2.shape[1],
mask2.shape[0], '/m/02gy9n', encoding2, 0.6
]],
columns=[
'ImageID', 'ImageWidth', 'ImageHeight',
'LabelName', 'Mask', 'Score'
])
class_label_map = {'/m/04bcr3': 1, '/m/02gy9n': 3}
prediction_dictionary = utils.build_predictions_dictionary(
np_data, class_label_map)
......
......@@ -24,7 +24,6 @@ import os
import tensorflow as tf
from tensorflow.python.util import function_utils
from object_detection import eval_util
from object_detection import exporter as exporter_lib
from object_detection import inputs
......@@ -187,7 +186,7 @@ def unstack_batch(tensor_dict, unpad_groundtruth_tensors=True):
return unbatched_tensor_dict
def _provide_groundtruth(model, labels):
def provide_groundtruth(model, labels):
"""Provides the labels to a model as groundtruth.
This helper function extracts the corresponding boxes, classes,
......@@ -287,7 +286,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
labels, unpad_groundtruth_tensors=unpad_groundtruth_tensors)
if mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL):
_provide_groundtruth(detection_model, labels)
provide_groundtruth(detection_model, labels)
preprocessed_images = features[fields.InputDataFields.image]
if use_tpu and train_config.use_bfloat16:
......@@ -524,7 +523,7 @@ def create_estimator_and_inputs(run_config,
pipeline_config_path,
config_override=None,
train_steps=None,
sample_1_of_n_eval_examples=1,
sample_1_of_n_eval_examples=None,
sample_1_of_n_eval_on_train_examples=1,
model_fn_creator=create_model_fn,
use_tpu_estimator=False,
......@@ -606,9 +605,12 @@ def create_estimator_and_inputs(run_config,
pipeline_config_path, config_override=config_override)
kwargs.update({
'train_steps': train_steps,
'sample_1_of_n_eval_examples': sample_1_of_n_eval_examples,
'use_bfloat16': configs['train_config'].use_bfloat16 and use_tpu
})
if sample_1_of_n_eval_examples >= 1:
kwargs.update({
'sample_1_of_n_eval_examples': sample_1_of_n_eval_examples
})
if override_eval_num_epochs:
kwargs.update({'eval_num_epochs': 1})
tf.logging.warning(
......@@ -667,11 +669,6 @@ def create_estimator_and_inputs(run_config,
model_fn = model_fn_creator(detection_model_fn, configs, hparams, use_tpu,
postprocess_on_cpu)
if use_tpu_estimator:
# Multicore inference disabled due to b/129367127
tpu_estimator_args = function_utils.fn_args(tf.contrib.tpu.TPUEstimator)
kwargs = {}
if 'experimental_export_device_assignment' in tpu_estimator_args:
kwargs['experimental_export_device_assignment'] = True
estimator = tf.contrib.tpu.TPUEstimator(
model_fn=model_fn,
train_batch_size=train_config.batch_size,
......@@ -681,8 +678,7 @@ def create_estimator_and_inputs(run_config,
config=run_config,
export_to_tpu=export_to_tpu,
eval_on_tpu=False, # Eval runs on CPU, so disable eval on TPU
params=params if params else {},
**kwargs)
params=params if params else {})
else:
estimator = tf.estimator.Estimator(model_fn=model_fn, config=run_config)
......
This diff is collapsed.
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for object detection model library."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import tensorflow as tf
from object_detection import model_hparams
from object_detection import model_lib_v2
from object_detection.utils import config_util
# Model for test. Current options are:
# 'ssd_mobilenet_v2_pets_keras'
MODEL_NAME_FOR_TEST = 'ssd_mobilenet_v2_pets_keras'
def _get_data_path():
"""Returns an absolute path to TFRecord file."""
return os.path.join(tf.resource_loader.get_data_files_path(), 'test_data',
'pets_examples.record')
def get_pipeline_config_path(model_name):
"""Returns path to the local pipeline config file."""
return os.path.join(tf.resource_loader.get_data_files_path(), 'samples',
'configs', model_name + '.config')
def _get_labelmap_path():
"""Returns an absolute path to label map file."""
return os.path.join(tf.resource_loader.get_data_files_path(), 'data',
'pet_label_map.pbtxt')
def _get_config_kwarg_overrides():
"""Returns overrides to the configs that insert the correct local paths."""
data_path = _get_data_path()
label_map_path = _get_labelmap_path()
return {
'train_input_path': data_path,
'eval_input_path': data_path,
'label_map_path': label_map_path
}
def _get_configs_for_model(model_name):
"""Returns configurations for model."""
filename = get_pipeline_config_path(model_name)
configs = config_util.get_configs_from_pipeline_file(filename)
configs = config_util.merge_external_params_with_configs(
configs, kwargs_dict=_get_config_kwarg_overrides())
return configs
class ModelLibTest(tf.test.TestCase):
@classmethod
def setUpClass(cls):
tf.keras.backend.clear_session()
def test_train_loop_then_eval_loop(self):
"""Tests that Estimator and input function are constructed correctly."""
hparams = model_hparams.create_hparams(
hparams_overrides='load_pretrained=false')
pipeline_config_path = get_pipeline_config_path(MODEL_NAME_FOR_TEST)
config_kwarg_overrides = _get_config_kwarg_overrides()
model_dir = tf.test.get_temp_dir()
train_steps = 2
model_lib_v2.train_loop(
hparams,
pipeline_config_path,
model_dir=model_dir,
train_steps=train_steps,
checkpoint_every_n=1,
**config_kwarg_overrides)
model_lib_v2.eval_continuously(
hparams,
pipeline_config_path,
model_dir=model_dir,
checkpoint_dir=model_dir,
train_steps=train_steps,
wait_interval=10,
**config_kwarg_overrides)
......@@ -25,6 +25,7 @@ Huang et al. (https://arxiv.org/abs/1611.10012)
import tensorflow as tf
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.utils import variables_helper
from nets import inception_resnet_v2
slim = tf.contrib.slim
......@@ -195,7 +196,7 @@ class FasterRCNNInceptionResnetV2FeatureExtractor(
"""
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
if variable.op.name.startswith(
first_stage_feature_extractor_scope):
var_name = variable.op.name.replace(
......
......@@ -30,6 +30,7 @@ import tensorflow as tf
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.models.keras_models import inception_resnet_v2
from object_detection.utils import model_util
from object_detection.utils import variables_helper
class FasterRCNNInceptionResnetV2KerasFeatureExtractor(
......@@ -1070,7 +1071,7 @@ class FasterRCNNInceptionResnetV2KerasFeatureExtractor(
}
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
var_name = keras_to_slim_name_mapping.get(variable.op.name)
if var_name:
variables_to_restore[var_name] = variable
......
......@@ -23,6 +23,7 @@ https://arxiv.org/abs/1707.07012
import tensorflow as tf
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.utils import variables_helper
from nets.nasnet import nasnet
from nets.nasnet import nasnet_utils
......@@ -307,7 +308,7 @@ class FasterRCNNNASFeatureExtractor(
# Note that the NAS checkpoint only contains the moving average version of
# the Variables so we need to generate an appropriate dictionary mapping.
variables_to_restore = {}
for variable in tf.global_variables():
for variable in variables_helper.get_global_variables_safely():
if variable.op.name.startswith(
first_stage_feature_extractor_scope):
var_name = variable.op.name.replace(
......