Commit a1337e01 authored by Zhichao Lu, committed by pkulzc

Merged commit includes the following changes:

223075771  by lzc:

    Bring in external fixes.

--
222919755  by ronnyvotel:

    Bug fix in the Faster R-CNN model builder: `inplace_batchnorm_update` was previously being passed as `reuse_weights`.
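
    A hedged sketch of the bug class being fixed; the names below are illustrative stand-ins, not the actual model_builder code:

    ```python
    def build_feature_extractor(is_training, reuse_weights=None,
                                inplace_batchnorm_update=False):
      """Stand-in builder; only the call pattern matters here."""
      return {'reuse_weights': reuse_weights,
              'inplace_batchnorm_update': inplace_batchnorm_update}

    inplace_batchnorm_update = True

    # Before (buggy): the batch-norm flag was forwarded as the reuse flag.
    extractor = build_feature_extractor(
        is_training=True, reuse_weights=inplace_batchnorm_update)

    # After (fixed): each flag goes to its own parameter.
    extractor = build_feature_extractor(
        is_training=True,
        reuse_weights=None,
        inplace_batchnorm_update=inplace_batchnorm_update)
    ```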

--
222885680  by Zhichao Lu:

    Use result_dict_for_batched_example in models_lib.
    Also fixes the visualization size when eval runs on GPU.

--
222883648  by Zhichao Lu:

    Fix _unmatched_class_label for the _add_background_class == False case in ssd_meta_arch.py.
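
    A minimal sketch of the corrected construction, distilled from the ssd_meta_arch.py hunk in the diff below; with no implicit background class, the unmatched label must have num_classes entries rather than num_classes + 1:

    ```python
    import tensorflow as tf

    def make_unmatched_class_label(num_classes, add_background_class,
                                   encode_background_as_zeros):
      # Background slot is 0 when background is encoded as zeros, else 1.
      background_class = [0] if encode_background_as_zeros else [1]
      # Without an implicit background class, one of the num_classes slots
      # already is the background, so only num_classes - 1 foreground
      # slots remain.
      num_foreground_classes = (num_classes if add_background_class
                                else num_classes - 1)
      return tf.constant(background_class + num_foreground_classes * [0],
                         tf.float32)
    ```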

--
222836663  by Zhichao Lu:

    Adding support for visualizing grayscale images. Without this change, the images are black-red instead of grayscale.

--
222501978  by Zhichao Lu:

    Fix a bug that caused convert_to_grayscale flag not to be respected.

--
222432846  by richardmunoz:

    Fix mapping of groundtruth_confidences from shape [num_boxes] to [num_boxes, num_classes] when the input contains the groundtruth_confidences field.
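
    The fix (visible in the inputs.py hunk below) broadcasts each box's scalar confidence over its one-hot class row. A self-contained TF 1.x illustration, using the same values as the new unit test:

    ```python
    import tensorflow as tf

    # Two boxes with zero-indexed classes [2, 0] and confidences [1.0, -1.0].
    groundtruth_classes = tf.one_hot([2, 0], depth=3)   # [num_boxes, num_classes]
    groundtruth_confidences = tf.constant([1.0, -1.0])  # [num_boxes]

    # Broadcast each scalar confidence over its box's one-hot class row.
    mapped = tf.reshape(groundtruth_confidences, [-1, 1]) * groundtruth_classes

    with tf.Session() as sess:
      print(sess.run(mapped))  # [[0. 0. 1.], [-1. 0. 0.]]
    ```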

--
221725755  by richardmunoz:

    Internal change.

--
221458536  by Zhichao Lu:

    Fix saver defer build bug in object detection train codepath.
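
    The commit message does not spell out the details; as background, a minimal sketch of the TF 1.x defer-build pattern such a fix typically involves, assuming the Saver was being built before the graph was complete:

    ```python
    import tensorflow as tf

    a = tf.Variable(1.0, name='a')
    # defer_build=True postpones construction of the save/restore ops.
    saver = tf.train.Saver(defer_build=True)
    b = tf.Variable(2.0, name='b')  # created after the Saver object
    saver.build()  # build once the graph is final, so 'b' is covered too

    with tf.Session() as sess:
      sess.run(tf.global_variables_initializer())
      saver.save(sess, '/tmp/model.ckpt')
    ```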

--
221391590  by Zhichao Lu:

    Add support for group normalization in the object detection API. Just adding MobileNet-v1 SSD currently. This may serve as a road map for other models that wish to support group normalization as an option.
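
    Illustrative only (not the exact wiring in the hyperparams builder): group normalization as exposed in TF 1.x contrib, which such a feature-extractor option would typically call:

    ```python
    import tensorflow as tf

    inputs = tf.placeholder(tf.float32, [None, 32, 32, 64])
    # Normalize over 32 channel groups instead of batch statistics,
    # which behaves better with small per-GPU batch sizes.
    outputs = tf.contrib.layers.group_norm(inputs, groups=32)
    ```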

--
221367993  by Zhichao Lu:

    Bug fixes: (1) make RandomPadImage work; (2) fix keep_checkpoint_every_n_hours.

--
221266403  by rathodv:

    Use detection boxes as proposals to compute correct mask loss in eval jobs.

--
220845934  by lzc:

    Internal change.

--
220778850  by Zhichao Lu:

    Incorporating existing metrics into the Estimator framework.
    Should restore:
    - oid_challenge_detection_metrics
    - pascal_voc_detection_metrics
    - weighted_pascal_voc_detection_metrics
    - pascal_voc_instance_segmentation_metrics
    - weighted_pascal_voc_instance_segmentation_metrics
    - oid_V2_detection_metrics

--
220370391  by alirezafathi:

    Adding precision and recall to the metrics.

--
220321268  by Zhichao Lu:

    Allow the option of setting max_examples_to_draw to zero.

--
220193337  by Zhichao Lu:

    This CL fixes a bug where the Keras convolutional box predictor applied its heads in non-deterministic dict order. As a consequence, variables were created in a non-deterministic order, which in turn led different workers in a multi-GPU training setup to build slightly different graphs with variables assigned to mismatched parameter servers. As a result, roughly half of all workers were unable to initialize and did no work, and training was slowed down approximately 2x.
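
    A minimal sketch of the idea behind the fix (not the predictor's actual code): traverse the head dict in sorted order so variable creation order is identical on every worker:

    ```python
    prediction_heads = {
        'class_predictions_with_background': lambda: 'build cls layers',
        'box_encodings': lambda: 'build box layers',
    }

    # Buggy pattern: plain dict iteration order is not guaranteed to match
    # across processes (pre-3.7 Python), so workers build different graphs:
    #   for name, build_fn in prediction_heads.items(): build_fn()

    # Fixed pattern: a deterministic, sorted traversal.
    for name in sorted(prediction_heads):
      prediction_heads[name]()
    ```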

--
220136508  by huizhongc:

    Add weight equalization loss to SSD meta arch.

--
220125875  by pengchong:

    Rename label_scores to label_weights

--
219730108  by Zhichao Lu:

    Add a description of detection_keypoints in postprocessed_tensors to the docstring.

--
219577519  by pengchong:

    Support parsing the class confidences and training using them.

--
219547611  by lzc:

    Stop using static shapes in GPU eval jobs.

--
219536476  by Zhichao Lu:

    Migrate TensorFlow Lite out of tensorflow/contrib

    This change moves //tensorflow/contrib/lite to //tensorflow/lite in preparation
    for TensorFlow 2.0's deprecation of contrib/. If you refer to TF Lite build
    targets or headers, you will need to update them manually. If you use TF Lite
    from the TensorFlow python package, "tf.contrib.lite" now points to "tf.lite".
    Please update your imports as soon as possible.

    For more details, see https://groups.google.com/a/tensorflow.org/forum/#!topic/tflite/iIIXOTOFvwQ

    @angersson and @aselle are conducting this migration. Please contact them if
    you have any further questions.
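
    For Python users the change is mostly a rename; e.g. constructing an interpreter (illustrative, using the stable tf.lite API):

    ```python
    import tensorflow as tf

    # Before the migration: tf.contrib.lite.Interpreter(...)
    # After: tf.contrib.lite points to tf.lite, so prefer the new path.
    interpreter = tf.lite.Interpreter(model_path='/tmp/tflite/detect.tflite')
    interpreter.allocate_tensors()
    ```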

--
219190083  by Zhichao Lu:

    Add a second expected_loss_weights function that uses an alternative expectation calculation. Integrate this op into ssd_meta_arch and the losses builder; files that call losses_builder.build are updated to handle the additional returned element.
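
    A self-contained sketch of how the returned weights get applied, mirroring the ssd_meta_arch.py hunk further down; the stand-in weight function here is hypothetical, the real ones live in object_detection/utils/ops.py:

    ```python
    import tensorflow as tf

    def expected_loss_weights_fn(batch_cls_targets):
      """Hypothetical stand-in; returns (foreground, background) weights."""
      foreground = batch_cls_targets
      background = 1.0 - batch_cls_targets
      return foreground, background

    batch_cls_targets = tf.constant([[0., 1.], [1., 0.]])
    cls_losses = tf.constant([[0.3, 0.7], [0.2, 0.1]])
    unmatched_cls_losses = tf.constant([[0.9, 0.1], [0.4, 0.6]])

    foreground_weights, background_weights = expected_loss_weights_fn(
        batch_cls_targets)
    # Matched losses weighted by the foreground expectation, losses against
    # the unmatched (background) label weighted by the background expectation.
    cls_losses = (foreground_weights * cls_losses +
                  background_weights * unmatched_cls_losses)
    ```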

--
218924451  by pengchong:

    Add a new way to assign training targets using groundtruth confidences.
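
    From the ssd_meta_arch.py hunk below: when an implicit background class is in play, the [num_boxes, num_classes] confidences get a zero background column padded on, just like the one-hot class targets:

    ```python
    import tensorflow as tf

    # [num_boxes, num_classes]; -1 marks an explicit negative label.
    groundtruth_confidences = tf.constant([[0., 0., 1.],
                                           [-1., 0., 0.]])

    # Pad one zero column at the front for the implicit background class.
    with_background = tf.pad(groundtruth_confidences, [[0, 0], [1, 0]],
                             mode='CONSTANT')

    with tf.Session() as sess:
      print(sess.run(with_background))
      # [[ 0.  0.  0.  1.]
      #  [ 0. -1.  0.  0.]]
    ```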

--
218760524  by chowdhery:

    Modify export script to add option for regular NMS in TFLite post-processing op.

--

PiperOrigin-RevId: 223075771
parent 2c680af3
@@ -195,6 +195,8 @@ def add_output_tensor_nodes(postprocessed_tensors,
       'detection_classes': [batch, max_detections]
       'detection_masks': [batch, max_detections, mask_height, mask_width]
         (optional).
+      'detection_keypoints': [batch, max_detections, num_keypoints, 2]
+        (optional).
       'num_detections': [batch]
     output_collection_name: Name of collection to add output tensors to.
...
@@ -83,7 +83,7 @@ tf_record_input_reader: { input_path: '${INPUT_TFRECORDS_WITH_DETECTIONS}' }
 " > ${OUTPUT_CONFIG_DIR}/input_config.pbtxt
 echo "
-metrics_set: 'oid_challenge_object_detection_metrics'
+metrics_set: 'oid_challenge_detection_metrics'
 " > ${OUTPUT_CONFIG_DIR}/eval_config.pbtxt
 OUTPUT_METRICS_DIR=/path/to/metrics_csv
...
@@ -109,9 +109,11 @@ Model name
 ## Open Images-trained models
 Model name | Speed (ms) | Open Images mAP@0.5[^2] | Outputs
 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------: | :---------------------: | :-----:
-[faster_rcnn_inception_resnet_v2_atrous_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28.tar.gz) | 727 | 37 | Boxes
+[faster_rcnn_inception_resnet_v2_atrous_oidv2](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28.tar.gz) | 727 | 37 | Boxes
-[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes
+[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oidv2](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes
+[faster_rcnn_inception_resnet_v2_atrous_oidv4](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oidv4_2018_10_30.tar.gz) | 455 | | Boxes
+[ssd_mobilenetv2_oidv4](http://download.tensorflow.org/models/object_detection/ssd_mobilenetv2_oidv4_2018_10_30.tar.gz) | 24 | | Boxes
 [facessd_mobilenet_v2_quantized_open_image_v4](http://download.tensorflow.org/models/object_detection/facessd_mobilenet_v2_quantized_320x320_open_image_v4.tar.gz) [^3] | 20 | 73 (faces) | Boxes
 ## iNaturalist Species-trained models
...
@@ -65,7 +65,7 @@ intersection over union based on the object masks instead of object boxes.
 ## Open Images V2 detection metric
-`EvalConfig.metrics_set='open_images_V2_detection_metrics'`
+`EvalConfig.metrics_set='oid_V2_detection_metrics'`
 This metric is defined originally for evaluating detector performance on [Open
 Images V2 dataset](https://github.com/openimages/dataset) and is fairly similar
@@ -132,14 +132,20 @@ convention, the evaluation software treats all classes independently, ignoring
 the hierarchy. To achieve high performance values, object detectors should
 output bounding-boxes labelled in the same manner.
+The old metric name is DEPRECATED.
+`EvalConfig.metrics_set='open_images_V2_detection_metrics'`
 ## OID Challenge Object Detection Metric 2018
-`EvalConfig.metrics_set='oid_challenge_object_detection_metrics'`
+`EvalConfig.metrics_set='oid_challenge_detection_metrics'`
 The metric for the OID Challenge Object Detection Metric 2018, Object Detection
 track. The description is provided on the [Open Images Challenge
 website](https://storage.googleapis.com/openimages/web/challenge.html).
+The old metric name is DEPRECATED.
+`EvalConfig.metrics_set='oid_challenge_object_detection_metrics'`
 ## OID Challenge Visual Relationship Detection Metric 2018
 The metric for the OID Challenge Visual Relationship Detection Metric 2018, Visual
...
@@ -216,7 +216,7 @@ tf_record_input_reader: { input_path: '${SPLIT}_detections.tfrecord@${NUM_SHARDS
 " > ${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt
 echo "
-metrics_set: 'open_images_V2_detection_metrics'
+metrics_set: 'oid_V2_detection_metrics'
 " > ${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt
 ```
...
@@ -49,14 +49,14 @@ will output the frozen graph that we can input to TensorFlow Lite directly and
 is the one we’ll be using.
 Next we’ll use TensorFlow Lite to get the optimized model by using
-[TOCO](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/toco),
+[TOCO](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/toco),
 the TensorFlow Lite Optimizing Converter. This will convert the resulting frozen
 graph (tflite_graph.pb) to the TensorFlow Lite flatbuffer format (detect.tflite)
 via the following command. For a quantized model, run this from the tensorflow/
 directory:
 ```shell
-bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
+bazel run --config=opt tensorflow/lite/toco:toco -- \
 --input_file=$OUTPUT_DIR/tflite_graph.pb \
 --output_file=$OUTPUT_DIR/detect.tflite \
 --input_shapes=1,300,300,3 \
@@ -75,14 +75,14 @@ are named 'TFLite_Detection_PostProcess', 'TFLite_Detection_PostProcess:1',
 'TFLite_Detection_PostProcess:2', and 'TFLite_Detection_PostProcess:3' and
 represent four arrays: detection_boxes, detection_classes, detection_scores, and
 num_detections. The documentation for other flags used in this command is
-[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/toco/g3doc/cmdline_reference.md).
+[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/toco/g3doc/cmdline_reference.md).
 If things ran successfully, you should now see a third file in the /tmp/tflite
 directory called detect.tflite. This file contains the graph and all model
 parameters and can be run via the TensorFlow Lite interpreter on the Android
 device. For a floating point model, run this from the tensorflow/ directory:
 ```shell
-bazel run -c opt tensorflow/lite/toco:toco -- \
+bazel run --config=opt tensorflow/lite/toco:toco -- \
 --input_file=$OUTPUT_DIR/tflite_graph.pb \
 --output_file=$OUTPUT_DIR/detect.tflite \
 --input_shapes=1,300,300,3 \
@@ -105,7 +105,7 @@ Studio](https://developer.android.com/studio/index.html). To build the
 TensorFlow Lite Android demo, build tools require API >= 23 (but it will run on
 devices with API >= 21). Additional details are available on the [TensorFlow
 Lite Android App
-page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md).
+page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/java/demo/README.md).
 Next we need to point the app to our new detect.tflite file and give it the
 names of our new labels. Specifically, we will copy our TensorFlow Lite
@@ -113,24 +113,24 @@ flatbuffer to the app assets directory with the following command:
 ```shell
 cp /tmp/tflite/detect.tflite \
-//tensorflow/contrib/lite/examples/android/app/src/main/assets
+//tensorflow/lite/examples/android/app/src/main/assets
 ```
 You will also need to copy your new labelmap labels_list.txt to the assets
 directory.
 We will now edit the BUILD file to point to this new model. First, open the
-BUILD file tensorflow/contrib/lite/examples/android/BUILD. Then find the assets
+BUILD file tensorflow/lite/examples/android/BUILD. Then find the assets
 section, and replace the line “@tflite_mobilenet_ssd_quant//:detect.tflite”
 (which by default points to a COCO pretrained model) with the path to your new
 TFLite model
-“//tensorflow/contrib/lite/examples/android/app/src/main/assets:detect.tflite”.
+“//tensorflow/lite/examples/android/app/src/main/assets:detect.tflite”.
 Finally, change the last line in assets section to use the new label map as
 well.
 We will also need to tell our app to use the new label map. In order to do this,
 open up the
-tensorflow/contrib/lite/examples/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java
+tensorflow/lite/examples/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java
 file in a text editor and find the definition of TF_OD_API_LABELS_FILE. Update
 this path to point to your new label map file:
 "file:///android_asset/labels_list.txt". Note that if your model is quantized,
@@ -150,7 +150,7 @@ from the tensorflow directory:
 ```shell
 bazel build -c opt --config=android_arm{,64} --cxxopt='--std=c++11'
-"//tensorflow/contrib/lite/examples/android:tflite_demo"
+"//tensorflow/lite/examples/android:tflite_demo"
 ```
 Now install the demo on a
@@ -159,5 +159,5 @@ Android phone via [Android Debug
 Bridge](https://developer.android.com/studio/command-line/adb) (adb):
 ```shell
-adb install bazel-bin/tensorflow/contrib/lite/examples/android/tflite_demo.apk
+adb install bazel-bin/tensorflow/lite/examples/android/tflite_demo.apk
 ```
@@ -139,12 +139,10 @@ def transform_input_data(tensor_dict,
   if fields.InputDataFields.groundtruth_confidences in tensor_dict:
     groundtruth_confidences = tensor_dict[
         fields.InputDataFields.groundtruth_confidences]
+    # Map the confidences to the one-hot encoding of classes
     tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
-        tf.sparse_to_dense(
-            zero_indexed_groundtruth_classes,
-            [num_classes],
-            groundtruth_confidences,
-            validate_indices=False))
+        tf.reshape(groundtruth_confidences, [-1, 1]) *
+        tensor_dict[fields.InputDataFields.groundtruth_classes])
   else:
     groundtruth_confidences = tf.ones_like(
         zero_indexed_groundtruth_classes, dtype=tf.float32)
@@ -200,10 +198,14 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
   if fields.InputDataFields.image_additional_channels in tensor_dict:
     num_additional_channels = tensor_dict[
         fields.InputDataFields.image_additional_channels].shape[2].value
+  num_image_channels = 3
+  if fields.InputDataFields.image in tensor_dict:
+    num_image_channels = tensor_dict[fields.InputDataFields
+                                     .image].shape[2].value
   padding_shapes = {
       # Additional channels are merged before batching.
       fields.InputDataFields.image: [
-          height, width, 3 + num_additional_channels
+          height, width, num_image_channels + num_additional_channels
       ],
       fields.InputDataFields.original_image_spatial_shape: [2],
       fields.InputDataFields.image_additional_channels: [
@@ -215,8 +217,6 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
      fields.InputDataFields.groundtruth_difficult: [max_num_boxes],
      fields.InputDataFields.groundtruth_boxes: [max_num_boxes, 4],
      fields.InputDataFields.groundtruth_classes: [max_num_boxes, num_classes],
-     fields.InputDataFields.groundtruth_confidences: [
-         max_num_boxes, num_classes],
      fields.InputDataFields.groundtruth_instance_masks: [
          max_num_boxes, height, width
      ],
@@ -224,9 +224,12 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
      fields.InputDataFields.groundtruth_group_of: [max_num_boxes],
      fields.InputDataFields.groundtruth_area: [max_num_boxes],
      fields.InputDataFields.groundtruth_weights: [max_num_boxes],
+     fields.InputDataFields.groundtruth_confidences: [
+         max_num_boxes, num_classes
+     ],
      fields.InputDataFields.num_groundtruth_boxes: [],
      fields.InputDataFields.groundtruth_label_types: [max_num_boxes],
-     fields.InputDataFields.groundtruth_label_scores: [max_num_boxes],
+     fields.InputDataFields.groundtruth_label_weights: [max_num_boxes],
      fields.InputDataFields.true_image_shape: [3],
      fields.InputDataFields.multiclass_scores: [
          max_num_boxes, num_classes + 1 if num_classes is not None else None
@@ -237,7 +240,7 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
   if fields.InputDataFields.original_image in tensor_dict:
     padding_shapes[fields.InputDataFields.original_image] = [
-        height, width, 3 + num_additional_channels
+        height, width, num_image_channels + num_additional_channels
     ]
   if fields.InputDataFields.groundtruth_keypoints in tensor_dict:
     tensor_shape = (
@@ -287,9 +290,15 @@ def augment_input_data(tensor_dict, data_augmentation_options):
                            in tensor_dict)
   include_keypoints = (fields.InputDataFields.groundtruth_keypoints
                        in tensor_dict)
+  include_label_weights = (fields.InputDataFields.groundtruth_weights
+                           in tensor_dict)
+  include_label_confidences = (fields.InputDataFields.groundtruth_confidences
+                               in tensor_dict)
   tensor_dict = preprocessor.preprocess(
       tensor_dict, data_augmentation_options,
       func_arg_map=preprocessor.get_default_func_arg_map(
+          include_label_weights=include_label_weights,
+          include_label_confidences=include_label_confidences,
          include_instance_masks=include_instance_masks,
          include_keypoints=include_keypoints))
   tensor_dict[fields.InputDataFields.image] = tf.squeeze(
@@ -303,7 +312,7 @@ def _get_labels_dict(input_dict):
       fields.InputDataFields.num_groundtruth_boxes,
       fields.InputDataFields.groundtruth_boxes,
       fields.InputDataFields.groundtruth_classes,
-      fields.InputDataFields.groundtruth_weights
+      fields.InputDataFields.groundtruth_weights,
   ]
   labels_dict = {}
   for key in required_label_keys:
...
@@ -93,17 +93,17 @@ class InputsTest(test_case.TestCase, parameterized.TestCase):
         labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
     self.assertEqual(tf.float32,
                      labels[fields.InputDataFields.groundtruth_classes].dtype)
-    self.assertAllEqual(
-        [1, 100],
-        labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
-    self.assertEqual(tf.float32,
-                     labels[fields.InputDataFields.groundtruth_weights].dtype)
     self.assertAllEqual(
         [1, 100, model_config.faster_rcnn.num_classes],
         labels[fields.InputDataFields.groundtruth_confidences].shape.as_list())
     self.assertEqual(
         tf.float32,
         labels[fields.InputDataFields.groundtruth_confidences].dtype)
+    self.assertAllEqual(
+        [1, 100],
+        labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
+    self.assertEqual(tf.float32,
+                     labels[fields.InputDataFields.groundtruth_weights].dtype)

   @parameterized.parameters(
       {'eval_batch_size': 1},
@@ -141,11 +141,11 @@ class InputsTest(test_case.TestCase, parameterized.TestCase):
     self.assertEqual(tf.float32,
                      labels[fields.InputDataFields.groundtruth_classes].dtype)
     self.assertAllEqual(
-        [eval_batch_size, 100, model_config.faster_rcnn.num_classes],
-        labels[fields.InputDataFields.groundtruth_confidences].shape.as_list())
+        [eval_batch_size, 100],
+        labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
     self.assertEqual(
         tf.float32,
-        labels[fields.InputDataFields.groundtruth_confidences].dtype)
+        labels[fields.InputDataFields.groundtruth_weights].dtype)
     self.assertAllEqual(
         [eval_batch_size, 100],
         labels[fields.InputDataFields.groundtruth_area].shape.as_list())
@@ -194,16 +194,11 @@ class InputsTest(test_case.TestCase, parameterized.TestCase):
     self.assertEqual(tf.float32,
                      labels[fields.InputDataFields.groundtruth_classes].dtype)
     self.assertAllEqual(
-        [batch_size, 100, model_config.ssd.num_classes],
+        [batch_size, 100],
         labels[
-            fields.InputDataFields.groundtruth_confidences].shape.as_list())
+            fields.InputDataFields.groundtruth_weights].shape.as_list())
     self.assertEqual(
         tf.float32,
-        labels[fields.InputDataFields.groundtruth_confidences].dtype)
-    self.assertAllEqual(
-        [batch_size, 100],
-        labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
-    self.assertEqual(tf.float32,
                      labels[fields.InputDataFields.groundtruth_weights].dtype)

   @parameterized.parameters(
@@ -242,12 +237,12 @@ class InputsTest(test_case.TestCase, parameterized.TestCase):
     self.assertEqual(tf.float32,
                      labels[fields.InputDataFields.groundtruth_classes].dtype)
     self.assertAllEqual(
-        [eval_batch_size, 100, model_config.ssd.num_classes],
+        [eval_batch_size, 100],
         labels[
-            fields.InputDataFields.groundtruth_confidences].shape.as_list())
+            fields.InputDataFields.groundtruth_weights].shape.as_list())
     self.assertEqual(
         tf.float32,
-        labels[fields.InputDataFields.groundtruth_confidences].dtype)
+        labels[fields.InputDataFields.groundtruth_weights].dtype)
     self.assertAllEqual(
         [eval_batch_size, 100],
         labels[fields.InputDataFields.groundtruth_area].shape.as_list())
@@ -447,7 +442,7 @@ class DataAugmentationFnTest(test_case.TestCase):
             tf.constant(np.array([[.5, .5, 1., 1.]], np.float32)),
         fields.InputDataFields.groundtruth_classes:
             tf.constant(np.array([1.0], np.float32)),
-        fields.InputDataFields.groundtruth_confidences:
+        fields.InputDataFields.groundtruth_weights:
            tf.constant(np.array([0.8], np.float32)),
     }
     augmented_tensor_dict = data_augmentation_fn(tensor_dict=tensor_dict)
@@ -468,7 +463,7 @@ class DataAugmentationFnTest(test_case.TestCase):
     )
     self.assertAllClose(
         augmented_tensor_dict_out[
-            fields.InputDataFields.groundtruth_confidences],
+            fields.InputDataFields.groundtruth_weights],
         [0.8]
     )
@@ -634,6 +629,34 @@ class DataTransformationFnTest(test_case.TestCase):
         transformed_inputs[fields.InputDataFields.num_groundtruth_boxes],
         1)

+  def test_returns_correct_groundtruth_confidences_when_input_present(self):
+    tensor_dict = {
+        fields.InputDataFields.image:
+            tf.constant(np.random.rand(4, 4, 3).astype(np.float32)),
+        fields.InputDataFields.groundtruth_boxes:
+            tf.constant(np.array([[0, 0, 1, 1], [.5, .5, 1, 1]], np.float32)),
+        fields.InputDataFields.groundtruth_classes:
+            tf.constant(np.array([3, 1], np.int32)),
+        fields.InputDataFields.groundtruth_confidences:
+            tf.constant(np.array([1.0, -1.0], np.float32))
+    }
+    num_classes = 3
+    input_transformation_fn = functools.partial(
+        inputs.transform_input_data,
+        model_preprocess_fn=_fake_model_preprocessor_fn,
+        image_resizer_fn=_fake_image_resizer_fn,
+        num_classes=num_classes)
+    with self.test_session() as sess:
+      transformed_inputs = sess.run(
+          input_transformation_fn(tensor_dict=tensor_dict))
+    self.assertAllClose(
+        transformed_inputs[fields.InputDataFields.groundtruth_classes],
+        [[0, 0, 1], [1, 0, 0]])
+    self.assertAllClose(
+        transformed_inputs[fields.InputDataFields.groundtruth_confidences],
+        [[0, 0, 1], [-1, 0, 0]])
+
   def test_returns_resized_masks(self):
     tensor_dict = {
         fields.InputDataFields.image:
@@ -879,6 +902,41 @@ class PadInputDataToStaticShapesFnTest(test_case.TestCase):
         padded_tensor_dict[fields.InputDataFields.image_additional_channels]
         .shape.as_list(), [5, 6, 2])

+  def test_gray_images(self):
+    input_tensor_dict = {
+        fields.InputDataFields.image:
+            tf.placeholder(tf.float32, [None, None, 1]),
+    }
+    padded_tensor_dict = inputs.pad_input_data_to_static_shapes(
+        tensor_dict=input_tensor_dict,
+        max_num_boxes=3,
+        num_classes=3,
+        spatial_image_shape=[5, 6])
+    self.assertAllEqual(
+        padded_tensor_dict[fields.InputDataFields.image].shape.as_list(),
+        [5, 6, 1])
+
+  def test_gray_images_and_additional_channels(self):
+    input_tensor_dict = {
+        fields.InputDataFields.image:
+            tf.placeholder(tf.float32, [None, None, 1]),
+        fields.InputDataFields.image_additional_channels:
+            tf.placeholder(tf.float32, [None, None, 2]),
+    }
+    padded_tensor_dict = inputs.pad_input_data_to_static_shapes(
+        tensor_dict=input_tensor_dict,
+        max_num_boxes=3,
+        num_classes=3,
+        spatial_image_shape=[5, 6])
+    self.assertAllEqual(
+        padded_tensor_dict[fields.InputDataFields.image].shape.as_list(),
+        [5, 6, 3])
+    self.assertAllEqual(
+        padded_tensor_dict[fields.InputDataFields.image_additional_channels]
+        .shape.as_list(), [5, 6, 2])
+
   def test_keypoints(self):
     input_tensor_dict = {
         fields.InputDataFields.groundtruth_keypoints:
...
@@ -39,12 +39,18 @@ EVAL_METRICS_CLASS_DICT = {
        object_detection_evaluation.PascalInstanceSegmentationEvaluator,
    'weighted_pascal_voc_instance_segmentation_metrics':
        object_detection_evaluation.WeightedPascalInstanceSegmentationEvaluator,
+   'oid_V2_detection_metrics':
+       object_detection_evaluation.OpenImagesDetectionEvaluator,
+   # DEPRECATED: please use oid_V2_detection_metrics instead
    'open_images_V2_detection_metrics':
        object_detection_evaluation.OpenImagesDetectionEvaluator,
    'coco_detection_metrics':
        coco_evaluation.CocoDetectionEvaluator,
    'coco_mask_metrics':
        coco_evaluation.CocoMaskEvaluator,
+   'oid_challenge_detection_metrics':
+       object_detection_evaluation.OpenImagesDetectionChallengeEvaluator,
+   # DEPRECATED: please use oid_challenge_detection_metrics instead
    'oid_challenge_object_detection_metrics':
        object_detection_evaluation.OpenImagesDetectionChallengeEvaluator,
 }
@@ -146,6 +152,16 @@ def get_evaluators(eval_config, categories):
   for eval_metric_fn_key in eval_metric_fn_keys:
     if eval_metric_fn_key not in EVAL_METRICS_CLASS_DICT:
       raise ValueError('Metric not found: {}'.format(eval_metric_fn_key))
+    if eval_metric_fn_key == 'oid_challenge_object_detection_metrics':
+      logging.warning(
+          'oid_challenge_object_detection_metrics is deprecated; '
+          'use oid_challenge_detection_metrics instead'
+      )
+    if eval_metric_fn_key == 'oid_V2_detection_metrics':
+      logging.warning(
+          'open_images_V2_detection_metrics is deprecated; '
+          'use oid_V2_detection_metrics instead'
+      )
     evaluators_list.append(
         EVAL_METRICS_CLASS_DICT[eval_metric_fn_key](categories=categories))
   return evaluators_list
...
@@ -75,6 +75,7 @@ def create_input_queue(batch_size_per_clone, create_tensor_dict_fn,
     tensor_dict = preprocessor.preprocess(
         tensor_dict, data_augmentation_options,
         func_arg_map=preprocessor.get_default_func_arg_map(
+            include_label_weights=True,
             include_multiclass_scores=include_multiclass_scores,
             include_instance_masks=include_instance_masks,
             include_keypoints=include_keypoints))
...
@@ -1568,7 +1568,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
     Returns:
       A dictionary containing:
-        `detection_boxes`: [batch, max_detection, 4]
+        `detection_boxes`: [batch, max_detection, 4] in normalized co-ordinates.
        `detection_scores`: [batch, max_detections]
        `detection_classes`: [batch, max_detections]
        `num_detections`: [batch]
@@ -1701,14 +1701,14 @@ class FasterRCNNMetaArch(model.DetectionModel):
           prediction_dict['refined_box_encodings'],
           prediction_dict['class_predictions_with_background'],
           prediction_dict['proposal_boxes'],
-          prediction_dict['num_proposals'],
-          groundtruth_boxlists,
+          prediction_dict['num_proposals'], groundtruth_boxlists,
           groundtruth_classes_with_background_list,
-          groundtruth_weights_list,
-          prediction_dict['image_shape'],
-          prediction_dict.get('mask_predictions'),
-          groundtruth_masks_list,
-      ))
+          groundtruth_weights_list, prediction_dict['image_shape'],
+          prediction_dict.get('mask_predictions'), groundtruth_masks_list,
+          prediction_dict.get(
+              fields.DetectionResultFields.detection_boxes),
+          prediction_dict.get(
+              fields.DetectionResultFields.num_detections)))
     return loss_dict

   def _loss_rpn(self, rpn_box_encodings,
@@ -1811,7 +1811,9 @@ class FasterRCNNMetaArch(model.DetectionModel):
                            groundtruth_weights_list,
                            image_shape,
                            prediction_masks=None,
-                           groundtruth_masks_list=None):
+                           groundtruth_masks_list=None,
+                           detection_boxes=None,
+                           num_detections=None):
     """Computes scalar box classifier loss tensors.
     Uses self._detector_target_assigner to obtain regression and classification
@@ -1854,6 +1856,11 @@ class FasterRCNNMetaArch(model.DetectionModel):
       groundtruth_masks_list: an optional list of 3-D tensors of shape
         [num_boxes, image_height, image_width] containing the instance masks for
         each of the boxes.
+      detection_boxes: 3-D float tensor of shape [batch,
+        max_total_detections, 4] containing post-processed detection boxes in
+        normalized co-ordinates.
+      num_detections: 1-D int32 tensor of shape [batch] containing number of
+        valid detections in `detection_boxes`.
     Returns:
       a dictionary mapping loss keys ('second_stage_localization_loss',
@@ -1867,7 +1874,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
     """
     with tf.name_scope('BoxClassifierLoss'):
       paddings_indicator = self._padded_batched_proposals_indicator(
-          num_proposals, self.max_num_proposals)
+          num_proposals, proposal_boxes.shape[1])
       proposal_boxlists = [
          box_list.BoxList(proposal_boxes_single_image)
          for proposal_boxes_single_image in tf.unstack(proposal_boxes)]
@@ -1958,6 +1965,13 @@ class FasterRCNNMetaArch(model.DetectionModel):
          raise ValueError('Groundtruth instance masks not provided. '
                           'Please configure input reader.')
+        if not self._is_training:
+          (proposal_boxes, proposal_boxlists, paddings_indicator,
+           one_hot_flat_cls_targets_with_background
+          ) = self._get_mask_proposal_boxes_and_classes(
+              detection_boxes, num_detections, image_shape,
+              groundtruth_boxlists, groundtruth_classes_with_background_list,
+              groundtruth_weights_list)
        unmatched_mask_label = tf.zeros(image_shape[1:3], dtype=tf.float32)
        (batch_mask_targets, _, _, batch_mask_target_weights,
         _) = target_assigner.batch_assign_targets(
@@ -2031,6 +2045,64 @@ class FasterRCNNMetaArch(model.DetectionModel):
      loss_dict[mask_loss.op.name] = mask_loss
     return loss_dict

+  def _get_mask_proposal_boxes_and_classes(
+      self, detection_boxes, num_detections, image_shape, groundtruth_boxlists,
+      groundtruth_classes_with_background_list, groundtruth_weights_list):
+    """Returns proposal boxes and class targets to compute evaluation mask loss.
+
+    During evaluation, detection boxes are used to extract features for mask
+    prediction. Therefore, to compute mask loss during evaluation detection
+    boxes must be used to compute correct class and mask targets. This function
+    returns boxes and classes in the correct format for computing mask targets
+    during evaluation.
+
+    Args:
+      detection_boxes: A 3-D float tensor of shape [batch, max_detection_boxes,
+        4] containing detection boxes in normalized co-ordinates.
+      num_detections: A 1-D float tensor of shape [batch] containing number of
+        valid boxes in `detection_boxes`.
+      image_shape: A 1-D tensor of shape [4] containing image tensor shape.
+      groundtruth_boxlists: A list of groundtruth boxlists.
+      groundtruth_classes_with_background_list: A list of groundtruth classes.
+      groundtruth_weights_list: A list of groundtruth weights.
+
+    Return:
+      mask_proposal_boxes: detection boxes to use for mask proposals in absolute
+        co-ordinates.
+      mask_proposal_boxlists: `mask_proposal_boxes` in a list of BoxLists in
+        absolute co-ordinates.
+      mask_proposal_paddings_indicator: a tensor indicating valid boxes.
+      mask_proposal_one_hot_flat_cls_targets_with_background: Class targets
+        computed using detection boxes.
+    """
+    batch, max_num_detections, _ = detection_boxes.shape.as_list()
+    proposal_boxes = tf.reshape(box_list_ops.to_absolute_coordinates(
+        box_list.BoxList(tf.reshape(detection_boxes, [-1, 4])), image_shape[1],
+        image_shape[2]).get(), [batch, max_num_detections, 4])
+    proposal_boxlists = [
+        box_list.BoxList(detection_boxes_single_image)
+        for detection_boxes_single_image in tf.unstack(proposal_boxes)
+    ]
+    paddings_indicator = self._padded_batched_proposals_indicator(
+        tf.to_int32(num_detections), detection_boxes.shape[1])
+    (batch_cls_targets_with_background, _, _, _,
+     _) = target_assigner.batch_assign_targets(
+         target_assigner=self._detector_target_assigner,
+         anchors_batch=proposal_boxlists,
+         gt_box_batch=groundtruth_boxlists,
+         gt_class_targets_batch=groundtruth_classes_with_background_list,
+         unmatched_class_label=tf.constant(
+             [1] + self._num_classes * [0], dtype=tf.float32),
+         gt_weights_batch=groundtruth_weights_list)
+    flat_cls_targets_with_background = tf.reshape(
+        batch_cls_targets_with_background, [-1, self._num_classes + 1])
+    one_hot_flat_cls_targets_with_background = tf.argmax(
+        flat_cls_targets_with_background, axis=1)
+    one_hot_flat_cls_targets_with_background = tf.one_hot(
+        one_hot_flat_cls_targets_with_background,
+        flat_cls_targets_with_background.get_shape()[1])
+    return (proposal_boxes, proposal_boxlists, paddings_indicator,
+            one_hot_flat_cls_targets_with_background)
+
   def _get_refined_encodings_for_postitive_class(
       self, refined_box_encodings, flat_cls_targets_with_background,
       batch_size):
@@ -2185,4 +2257,3 @@ class FasterRCNNMetaArch(model.DetectionModel):
       A list of update operators.
     """
     return tf.get_collection(tf.GraphKeys.UPDATE_OPS)
...@@ -281,8 +281,12 @@ class SSDMetaArch(model.DetectionModel): ...@@ -281,8 +281,12 @@ class SSDMetaArch(model.DetectionModel):
freeze_batchnorm=False, freeze_batchnorm=False,
inplace_batchnorm_update=False, inplace_batchnorm_update=False,
add_background_class=True, add_background_class=True,
explicit_background_class=False,
random_example_sampler=None, random_example_sampler=None,
expected_classification_loss_under_sampling=None): expected_loss_weights_fn=None,
use_confidences_as_targets=False,
implicit_example_weight=0.5,
equalization_loss_config=None):
"""SSDMetaArch Constructor. """SSDMetaArch Constructor.
TODO(rathodv,jonathanhuang): group NMS parameters + score converter into TODO(rathodv,jonathanhuang): group NMS parameters + score converter into
...@@ -335,17 +339,29 @@ class SSDMetaArch(model.DetectionModel): ...@@ -335,17 +339,29 @@ class SSDMetaArch(model.DetectionModel):
dependency on tf.graphkeys.UPDATE_OPS collection in order to update dependency on tf.graphkeys.UPDATE_OPS collection in order to update
batch norm statistics. batch norm statistics.
add_background_class: Whether to add an implicit background class to add_background_class: Whether to add an implicit background class to
one-hot encodings of groundtruth labels. Set to false if using one-hot encodings of groundtruth labels. Set to false if training a
groundtruth labels with an explicit background class or using multiclass single class model or using groundtruth labels with an explicit
scores instead of truth in the case of distillation. background class.
explicit_background_class: Set to true if using groundtruth labels with an
explicit background class, as in multiclass scores.
random_example_sampler: a BalancedPositiveNegativeSampler object that can random_example_sampler: a BalancedPositiveNegativeSampler object that can
perform random example sampling when computing loss. If None, random perform random example sampling when computing loss. If None, random
sampling process is skipped. Note that random example sampler and hard sampling process is skipped. Note that random example sampler and hard
example miner can both be applied to the model. In that case, random example miner can both be applied to the model. In that case, random
sampler will take effect first and hard example miner can only process sampler will take effect first and hard example miner can only process
the random sampled examples. the random sampled examples.
expected_classification_loss_under_sampling: If not None, use expected_loss_weights_fn: If not None, use to calculate
to calcualte classification loss by background/foreground weighting. loss by background/foreground weighting. Should take batch_cls_targets
as inputs and return foreground_weights, background_weights. See
expected_classification_loss_by_expected_sampling and
expected_classification_loss_by_reweighting_unmatched_anchors in
third_party/tensorflow_models/object_detection/utils/ops.py as examples.
use_confidences_as_targets: Whether to use groundtruth_condifences field
to assign the targets.
implicit_example_weight: a float number that specifies the weight used
for the implicit negative examples.
equalization_loss_config: a namedtuple that specifies configs for
computing equalization loss.
""" """
super(SSDMetaArch, self).__init__(num_classes=box_predictor.num_classes) super(SSDMetaArch, self).__init__(num_classes=box_predictor.num_classes)
self._is_training = is_training self._is_training = is_training
...@@ -358,6 +374,11 @@ class SSDMetaArch(model.DetectionModel): ...@@ -358,6 +374,11 @@ class SSDMetaArch(model.DetectionModel):
self._box_coder = box_coder self._box_coder = box_coder
self._feature_extractor = feature_extractor self._feature_extractor = feature_extractor
self._add_background_class = add_background_class self._add_background_class = add_background_class
self._explicit_background_class = explicit_background_class
if add_background_class and explicit_background_class:
raise ValueError("Cannot have both 'add_background_class' and"
" 'explicit_background_class' true.")
# Needed for fine-tuning from classification checkpoints whose # Needed for fine-tuning from classification checkpoints whose
# variables do not have the feature extractor scope. # variables do not have the feature extractor scope.
...@@ -370,15 +391,18 @@ class SSDMetaArch(model.DetectionModel): ...@@ -370,15 +391,18 @@ class SSDMetaArch(model.DetectionModel):
# Slim feature extractors get an explicit naming scope # Slim feature extractors get an explicit naming scope
self._extract_features_scope = 'FeatureExtractor' self._extract_features_scope = 'FeatureExtractor'
if self._add_background_class and encode_background_as_zeros: if encode_background_as_zeros:
self._unmatched_class_label = tf.constant((self.num_classes + 1) * [0], background_class = [0]
tf.float32) else:
elif self._add_background_class: background_class = [1]
self._unmatched_class_label = tf.constant([1] + self.num_classes * [0],
tf.float32) if self._add_background_class:
num_foreground_classes = self.num_classes
else: else:
self._unmatched_class_label = tf.constant(self.num_classes * [0], num_foreground_classes = self.num_classes - 1
tf.float32)
self._unmatched_class_label = tf.constant(
background_class + num_foreground_classes * [0], tf.float32)
self._target_assigner = target_assigner_instance self._target_assigner = target_assigner_instance
...@@ -399,8 +423,11 @@ class SSDMetaArch(model.DetectionModel): ...@@ -399,8 +423,11 @@ class SSDMetaArch(model.DetectionModel):
self._anchors = None self._anchors = None
self._add_summaries = add_summaries self._add_summaries = add_summaries
self._batched_prediction_tensor_names = [] self._batched_prediction_tensor_names = []
self._expected_classification_loss_under_sampling = ( self._expected_loss_weights_fn = expected_loss_weights_fn
expected_classification_loss_under_sampling) self._use_confidences_as_targets = use_confidences_as_targets
self._implicit_example_weight = implicit_example_weight
self._equalization_loss_config = equalization_loss_config
@property @property
def anchors(self): def anchors(self):
...@@ -647,7 +674,7 @@ class SSDMetaArch(model.DetectionModel): ...@@ -647,7 +674,7 @@ class SSDMetaArch(model.DetectionModel):
detection_scores = self._score_conversion_fn(class_predictions) detection_scores = self._score_conversion_fn(class_predictions)
detection_scores = tf.identity(detection_scores, 'raw_box_scores') detection_scores = tf.identity(detection_scores, 'raw_box_scores')
if self._add_background_class: if self._add_background_class or self._explicit_background_class:
detection_scores = tf.slice(detection_scores, [0, 0, 1], [-1, -1, -1]) detection_scores = tf.slice(detection_scores, [0, 0, 1], [-1, -1, -1])
additional_fields = None additional_fields = None
...@@ -720,11 +747,14 @@ class SSDMetaArch(model.DetectionModel): ...@@ -720,11 +747,14 @@ class SSDMetaArch(model.DetectionModel):
weights = None weights = None
if self.groundtruth_has_field(fields.BoxListFields.weights): if self.groundtruth_has_field(fields.BoxListFields.weights):
weights = self.groundtruth_lists(fields.BoxListFields.weights) weights = self.groundtruth_lists(fields.BoxListFields.weights)
confidences = None
if self.groundtruth_has_field(fields.BoxListFields.confidences):
confidences = self.groundtruth_lists(fields.BoxListFields.confidences)
(batch_cls_targets, batch_cls_weights, batch_reg_targets, (batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, match_list) = self._assign_targets( batch_reg_weights, match_list) = self._assign_targets(
self.groundtruth_lists(fields.BoxListFields.boxes), self.groundtruth_lists(fields.BoxListFields.boxes),
self.groundtruth_lists(fields.BoxListFields.classes), self.groundtruth_lists(fields.BoxListFields.classes),
keypoints, weights) keypoints, weights, confidences)
if self._add_summaries: if self._add_summaries:
self._summarize_target_assignment( self._summarize_target_assignment(
self.groundtruth_lists(fields.BoxListFields.boxes), match_list) self.groundtruth_lists(fields.BoxListFields.boxes), match_list)
...@@ -762,7 +792,7 @@ class SSDMetaArch(model.DetectionModel): ...@@ -762,7 +792,7 @@ class SSDMetaArch(model.DetectionModel):
weights=batch_cls_weights, weights=batch_cls_weights,
losses_mask=losses_mask) losses_mask=losses_mask)
if self._expected_classification_loss_under_sampling: if self._expected_loss_weights_fn:
# Need to compute losses for assigned targets against the # Need to compute losses for assigned targets against the
# unmatched_class_label as well as their assigned targets. # unmatched_class_label as well as their assigned targets.
# simplest thing (but wasteful) is just to calculate all losses # simplest thing (but wasteful) is just to calculate all losses
...@@ -787,8 +817,16 @@ class SSDMetaArch(model.DetectionModel): ...@@ -787,8 +817,16 @@ class SSDMetaArch(model.DetectionModel):
batch_cls_targets = tf.concat( batch_cls_targets = tf.concat(
[1 - batch_cls_targets, batch_cls_targets], axis=-1) [1 - batch_cls_targets, batch_cls_targets], axis=-1)
cls_losses = self._expected_classification_loss_under_sampling( location_losses = tf.tile(location_losses, [1, num_classes])
batch_cls_targets, cls_losses, unmatched_cls_losses)
foreground_weights, background_weights = (
self._expected_loss_weights_fn(batch_cls_targets))
cls_losses = (
foreground_weights * cls_losses +
background_weights * unmatched_cls_losses)
location_losses *= foreground_weights
classification_loss = tf.reduce_sum(cls_losses) classification_loss = tf.reduce_sum(cls_losses)
localization_loss = tf.reduce_sum(location_losses) localization_loss = tf.reduce_sum(location_losses)
...@@ -824,6 +862,8 @@ class SSDMetaArch(model.DetectionModel): ...@@ -824,6 +862,8 @@ class SSDMetaArch(model.DetectionModel):
str(localization_loss.op.name): localization_loss, str(localization_loss.op.name): localization_loss,
str(classification_loss.op.name): classification_loss str(classification_loss.op.name): classification_loss
} }
return loss_dict return loss_dict
   def _minibatch_subsample_fn(self, inputs):
@@ -864,9 +904,12 @@ class SSDMetaArch(model.DetectionModel):
       visualization_utils.add_cdf_image_summary(negative_anchor_cls_loss,
                                                 'NegativeAnchorLossCDF')

-  def _assign_targets(self, groundtruth_boxes_list, groundtruth_classes_list,
-                      groundtruth_keypoints_list=None,
-                      groundtruth_weights_list=None):
+  def _assign_targets(self,
+                      groundtruth_boxes_list,
+                      groundtruth_classes_list,
+                      groundtruth_keypoints_list=None,
+                      groundtruth_weights_list=None,
+                      groundtruth_confidences_list=None):
     """Assign groundtruth targets.

     Adds a background class to each one-hot encoding of groundtruth classes
@@ -885,6 +928,9 @@ class SSDMetaArch(model.DetectionModel):
         [num_boxes, num_keypoints, 2]
       groundtruth_weights_list: A list of 1-D tf.float32 tensors of shape
         [num_boxes] containing weights for groundtruth boxes.
+      groundtruth_confidences_list: A list of 2-D tf.float32 tensors of shape
+        [num_boxes, num_classes] containing class confidences for
+        groundtruth boxes.

     Returns:
       batch_cls_targets: a tensor with shape [batch_size, num_anchors,
@@ -901,11 +947,18 @@ class SSDMetaArch(model.DetectionModel):
     groundtruth_boxlists = [
         box_list.BoxList(boxes) for boxes in groundtruth_boxes_list
     ]
+    train_using_confidences = (self._is_training and
+                               self._use_confidences_as_targets)
     if self._add_background_class:
       groundtruth_classes_with_background_list = [
           tf.pad(one_hot_encoding, [[0, 0], [1, 0]], mode='CONSTANT')
           for one_hot_encoding in groundtruth_classes_list
       ]
+      if train_using_confidences:
+        groundtruth_confidences_with_background_list = [
+            tf.pad(groundtruth_confidences, [[0, 0], [1, 0]], mode='CONSTANT')
+            for groundtruth_confidences in groundtruth_confidences_list
+        ]
     else:
       groundtruth_classes_with_background_list = groundtruth_classes_list
@@ -913,9 +966,23 @@ class SSDMetaArch(model.DetectionModel):
       for boxlist, keypoints in zip(
           groundtruth_boxlists, groundtruth_keypoints_list):
         boxlist.add_field(fields.BoxListFields.keypoints, keypoints)
-    return target_assigner.batch_assign_targets(
-        self._target_assigner, self.anchors, groundtruth_boxlists,
-        groundtruth_classes_with_background_list, self._unmatched_class_label,
-        groundtruth_weights_list)
+    if train_using_confidences:
+      return target_assigner.batch_assign_confidences(
+          self._target_assigner,
+          self.anchors,
+          groundtruth_boxlists,
+          groundtruth_confidences_with_background_list,
+          groundtruth_weights_list,
+          self._unmatched_class_label,
+          self._add_background_class,
+          self._implicit_example_weight)
+    else:
+      return target_assigner.batch_assign_targets(
+          self._target_assigner,
+          self.anchors,
+          groundtruth_boxlists,
+          groundtruth_classes_with_background_list,
+          self._unmatched_class_label,
+          groundtruth_weights_list)

   def _summarize_target_assignment(self, groundtruth_boxes_list, match_list):
......
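One detail of the `_assign_targets` change worth calling out: when training from class confidences, the confidence targets get a leading background column via `tf.pad`, exactly like the one-hot class targets. A standalone check of that padding step (TF 1.x):

```python
import tensorflow as tf

# Two boxes, three classes; rows are per-box class confidences in [0, 1].
groundtruth_confidences = tf.constant([[0.9, 0.0, 0.1],
                                       [0.0, 0.6, 0.4]])
# [[0, 0], [1, 0]] pads zero rows before/after axis 0 and one zero column
# before axis 1, i.e. an all-zero background column at index 0.
with_background = tf.pad(groundtruth_confidences, [[0, 0], [1, 0]],
                         mode='CONSTANT')

with tf.Session() as sess:
  print(sess.run(with_background))
  # [[0.  0.9 0.  0.1]
  #  [0.  0.  0.6 0.4]]
```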
@@ -22,6 +22,7 @@ import tensorflow as tf
 from object_detection.meta_architectures import ssd_meta_arch
 from object_detection.meta_architectures import ssd_meta_arch_test_lib
+from object_detection.protos import model_pb2
 from object_detection.utils import test_utils

 slim = tf.contrib.slim
@@ -35,13 +36,13 @@ keras = tf.keras.layers
 class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
                       parameterized.TestCase):

-  def _create_model(self,
+  def _create_model(
+      self,
       apply_hard_mining=True,
       normalize_loc_loss_by_codesize=False,
       add_background_class=True,
       random_example_sampling=False,
-      weight_regression_loss_by_score=False,
-      use_expected_classification_loss_under_sampling=False,
+      expected_loss_weights=model_pb2.DetectionModel().ssd.loss.NONE,
       min_num_negative_samples=1,
       desired_negative_sampling_ratio=3,
       use_keras=False,
@@ -54,9 +55,7 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
         normalize_loc_loss_by_codesize=normalize_loc_loss_by_codesize,
         add_background_class=add_background_class,
         random_example_sampling=random_example_sampling,
-        weight_regression_loss_by_score=weight_regression_loss_by_score,
-        use_expected_classification_loss_under_sampling=
-        use_expected_classification_loss_under_sampling,
+        expected_loss_weights=expected_loss_weights,
         min_num_negative_samples=min_num_negative_samples,
         desired_negative_sampling_ratio=desired_negative_sampling_ratio,
         use_keras=use_keras,
@@ -358,91 +357,6 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
     self.assertAllClose(localization_loss, expected_localization_loss)
     self.assertAllClose(classification_loss, expected_classification_loss)

-  def test_loss_with_expected_classification_loss(self, use_keras):
-    with tf.Graph().as_default():
-      _, num_classes, num_anchors, _ = self._create_model(use_keras=use_keras)
-
-      def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
-                   groundtruth_classes1, groundtruth_classes2):
-        groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
-        groundtruth_classes_list = [groundtruth_classes1, groundtruth_classes2]
-        model, _, _, _ = self._create_model(
-            apply_hard_mining=False,
-            add_background_class=True,
-            use_expected_classification_loss_under_sampling=True,
-            min_num_negative_samples=1,
-            desired_negative_sampling_ratio=desired_negative_sampling_ratio)
-        model.provide_groundtruth(groundtruth_boxes_list,
-                                  groundtruth_classes_list)
-        prediction_dict = model.predict(
-            preprocessed_tensor, true_image_shapes=None)
-        loss_dict = model.loss(prediction_dict, true_image_shapes=None)
-        return (loss_dict['Loss/localization_loss'],
-                loss_dict['Loss/classification_loss'])
-
-      batch_size = 2
-      desired_negative_sampling_ratio = 4
-      preprocessed_input = np.random.rand(batch_size, 2, 2,
-                                          3).astype(np.float32)
-      groundtruth_boxes1 = np.array([[0, 0, .5, .5]], dtype=np.float32)
-      groundtruth_boxes2 = np.array([[0, 0, .5, .5]], dtype=np.float32)
-      groundtruth_classes1 = np.array([[1]], dtype=np.float32)
-      groundtruth_classes2 = np.array([[1]], dtype=np.float32)
-      expected_localization_loss = 0.0
-      expected_classification_loss = (
-          batch_size * (num_anchors + num_classes * num_anchors) * np.log(2.0))
-      (localization_loss, classification_loss) = self.execute(
-          graph_fn, [
-              preprocessed_input, groundtruth_boxes1, groundtruth_boxes2,
-              groundtruth_classes1, groundtruth_classes2
-          ])
-      self.assertAllClose(localization_loss, expected_localization_loss)
-      self.assertAllClose(classification_loss, expected_classification_loss)
-
-  def test_loss_results_are_correct_with_weight_regression_loss_by_score(
-      self, use_keras):
-    with tf.Graph().as_default():
-      _, num_classes, num_anchors, _ = self._create_model(
-          use_keras=use_keras,
-          add_background_class=False,
-          weight_regression_loss_by_score=True)
-
-      def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
-                   groundtruth_classes1, groundtruth_classes2):
-        groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
-        groundtruth_classes_list = [groundtruth_classes1, groundtruth_classes2]
-        model, _, _, _ = self._create_model(
-            use_keras=use_keras,
-            apply_hard_mining=False,
-            add_background_class=False,
-            weight_regression_loss_by_score=True)
-        model.provide_groundtruth(groundtruth_boxes_list,
-                                  groundtruth_classes_list)
-        prediction_dict = model.predict(
-            preprocessed_tensor, true_image_shapes=None)
-        loss_dict = model.loss(prediction_dict, true_image_shapes=None)
-        return (loss_dict['Loss/localization_loss'],
-                loss_dict['Loss/classification_loss'])
-
-      batch_size = 2
-      preprocessed_input = np.random.rand(batch_size, 2, 2,
-                                          3).astype(np.float32)
-      groundtruth_boxes1 = np.array([[0, 0, 1, 1]], dtype=np.float32)
-      groundtruth_boxes2 = np.array([[0, 0, 1, 1]], dtype=np.float32)
-      groundtruth_classes1 = np.array([[1]], dtype=np.float32)
-      groundtruth_classes2 = np.array([[0]], dtype=np.float32)
-      expected_localization_loss = 0.25
-      expected_classification_loss = (
-          batch_size * num_anchors * num_classes * np.log(2.0))
-      (localization_loss, classification_loss) = self.execute(
-          graph_fn, [
-              preprocessed_input, groundtruth_boxes1, groundtruth_boxes2,
-              groundtruth_classes1, groundtruth_classes2
-          ])
-      self.assertAllClose(localization_loss, expected_localization_loss)
-      self.assertAllClose(classification_loss, expected_classification_loss)

   def test_loss_results_are_correct_with_losses_mask(self, use_keras):
......
@@ -25,6 +25,7 @@ from object_detection.core import post_processing
 from object_detection.core import region_similarity_calculator as sim_calc
 from object_detection.core import target_assigner
 from object_detection.meta_architectures import ssd_meta_arch
+from object_detection.protos import model_pb2
 from object_detection.utils import ops
 from object_detection.utils import test_case
 from object_detection.utils import test_utils
@@ -111,14 +112,14 @@ class MockAnchorGenerator2x2(anchor_generator.AnchorGenerator):
 class SSDMetaArchTestBase(test_case.TestCase):
   """Base class to test SSD based meta architectures."""

-  def _create_model(self,
+  def _create_model(
+      self,
       model_fn=ssd_meta_arch.SSDMetaArch,
       apply_hard_mining=True,
       normalize_loc_loss_by_codesize=False,
       add_background_class=True,
       random_example_sampling=False,
-      weight_regression_loss_by_score=False,
-      use_expected_classification_loss_under_sampling=False,
+      expected_loss_weights=model_pb2.DetectionModel().ssd.loss.NONE,
       min_num_negative_samples=1,
       desired_negative_sampling_ratio=3,
       use_keras=False,
@@ -130,12 +131,10 @@ class SSDMetaArchTestBase(test_case.TestCase):
     mock_anchor_generator = MockAnchorGenerator2x2()
     if use_keras:
-      mock_box_predictor = test_utils.MockKerasBoxPredictor(
-          is_training, num_classes, add_background_class=add_background_class,
-          predict_mask=predict_mask)
+      mock_box_predictor = test_utils.MockKerasBoxPredictor(
+          is_training, num_classes, add_background_class=add_background_class)
     else:
-      mock_box_predictor = test_utils.MockBoxPredictor(
-          is_training, num_classes, add_background_class=add_background_class,
-          predict_mask=predict_mask)
+      mock_box_predictor = test_utils.MockBoxPredictor(
+          is_training, num_classes, add_background_class=add_background_class)
     mock_box_coder = test_utils.MockBoxCoder()
     if use_keras:
       fake_feature_extractor = FakeSSDKerasFeatureExtractor()
@@ -177,17 +176,22 @@ class SSDMetaArchTestBase(test_case.TestCase):
         region_similarity_calculator,
         mock_matcher,
         mock_box_coder,
-        negative_class_weight=negative_class_weight,
-        weight_regression_loss_by_score=weight_regression_loss_by_score)
+        negative_class_weight=negative_class_weight)

-    expected_classification_loss_under_sampling = None
-    if use_expected_classification_loss_under_sampling:
-      expected_classification_loss_under_sampling = functools.partial(
-          ops.expected_classification_loss_under_sampling,
-          min_num_negative_samples=min_num_negative_samples,
-          desired_negative_sampling_ratio=desired_negative_sampling_ratio)
+    model_config = model_pb2.DetectionModel()
+    if expected_loss_weights == model_config.ssd.loss.NONE:
+      expected_loss_weights_fn = None
+    else:
+      raise ValueError('Not a valid value for expected_loss_weights.')

     code_size = 4
+    kwargs = {}
+    if predict_mask:
+      kwargs.update({
+          'mask_prediction_fn': test_utils.MockMaskHead(num_classes=1).predict,
+      })
     model = model_fn(
         is_training=is_training,
         anchor_generator=mock_anchor_generator,
@@ -211,8 +215,8 @@ class SSDMetaArchTestBase(test_case.TestCase):
         inplace_batchnorm_update=False,
         add_background_class=add_background_class,
         random_example_sampler=random_example_sampler,
-        expected_classification_loss_under_sampling=
-        expected_classification_loss_under_sampling)
+        expected_loss_weights_fn=expected_loss_weights_fn,
+        **kwargs)
     return model, num_classes, mock_anchor_generator.num_anchors(), code_size

   def _get_value_for_matching_key(self, dictionary, suffix):
......
@@ -54,49 +54,59 @@ MODEL_BUILD_UTIL_MAP = {
 }

-def _prepare_groundtruth_for_eval(detection_model, class_agnostic):
+def _prepare_groundtruth_for_eval(detection_model, class_agnostic,
+                                  max_number_of_boxes):
   """Extracts groundtruth data from detection_model and prepares it for eval.

   Args:
     detection_model: A `DetectionModel` object.
     class_agnostic: Whether the detections are class_agnostic.
+    max_number_of_boxes: Max number of groundtruth boxes.

   Returns:
     A tuple of:
     groundtruth: Dictionary with the following fields:
-      'groundtruth_boxes': [num_boxes, 4] float32 tensor of boxes, in
-        normalized coordinates.
-      'groundtruth_classes': [num_boxes] int64 tensor of 1-indexed classes.
-      'groundtruth_masks': 3D float32 tensor of instance masks (if provided in
-        groundtruth)
-      'groundtruth_is_crowd': [num_boxes] bool tensor indicating is_crowd
-        annotations (if provided in groundtruth).
+      'groundtruth_boxes': [batch_size, num_boxes, 4] float32 tensor of boxes,
+        in normalized coordinates.
+      'groundtruth_classes': [batch_size, num_boxes] int64 tensor of 1-indexed
+        classes.
+      'groundtruth_masks': 4D float32 tensor of instance masks (if provided in
+        groundtruth)
+      'groundtruth_is_crowd': [batch_size, num_boxes] bool tensor indicating
+        is_crowd annotations (if provided in groundtruth).
+      'num_groundtruth_boxes': [batch_size] tensor containing the maximum
+        number of groundtruth boxes per image.
     class_agnostic: Boolean indicating whether detections are class agnostic.
   """
   input_data_fields = fields.InputDataFields()
-  groundtruth_boxes = detection_model.groundtruth_lists(
-      fields.BoxListFields.boxes)[0]
+  groundtruth_boxes = tf.stack(
+      detection_model.groundtruth_lists(fields.BoxListFields.boxes))
+  groundtruth_boxes_shape = tf.shape(groundtruth_boxes)
   # For class-agnostic models, groundtruth one-hot encodings collapse to all
   # ones.
   if class_agnostic:
-    groundtruth_boxes_shape = tf.shape(groundtruth_boxes)
-    groundtruth_classes_one_hot = tf.ones([groundtruth_boxes_shape[0], 1])
+    groundtruth_classes_one_hot = tf.ones(
+        [groundtruth_boxes_shape[0], groundtruth_boxes_shape[1], 1])
   else:
-    groundtruth_classes_one_hot = detection_model.groundtruth_lists(
-        fields.BoxListFields.classes)[0]
+    groundtruth_classes_one_hot = tf.stack(
+        detection_model.groundtruth_lists(fields.BoxListFields.classes))
   label_id_offset = 1  # Applying label id offset (b/63711816)
   groundtruth_classes = (
-      tf.argmax(groundtruth_classes_one_hot, axis=1) + label_id_offset)
+      tf.argmax(groundtruth_classes_one_hot, axis=2) + label_id_offset)
   groundtruth = {
       input_data_fields.groundtruth_boxes: groundtruth_boxes,
       input_data_fields.groundtruth_classes: groundtruth_classes
   }
   if detection_model.groundtruth_has_field(fields.BoxListFields.masks):
-    groundtruth[input_data_fields.groundtruth_instance_masks] = (
-        detection_model.groundtruth_lists(fields.BoxListFields.masks)[0])
+    groundtruth[input_data_fields.groundtruth_instance_masks] = tf.stack(
+        detection_model.groundtruth_lists(fields.BoxListFields.masks))
   if detection_model.groundtruth_has_field(fields.BoxListFields.is_crowd):
-    groundtruth[input_data_fields.groundtruth_is_crowd] = (
-        detection_model.groundtruth_lists(fields.BoxListFields.is_crowd)[0])
+    groundtruth[input_data_fields.groundtruth_is_crowd] = tf.stack(
+        detection_model.groundtruth_lists(fields.BoxListFields.is_crowd))
+  groundtruth[input_data_fields.num_groundtruth_boxes] = (
+      tf.tile([max_number_of_boxes], multiples=[groundtruth_boxes_shape[0]]))
   return groundtruth
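The switch from `[0]`-indexing to `tf.stack` above is what turns the per-image groundtruth lists into the batched `[batch_size, num_boxes, ...]` tensors that the batched eval path expects. A stripped-down illustration (shapes and values are examples, not taken from the diff):

```python
import tensorflow as tf

# Per-image groundtruth, already padded to the same num_boxes (2 here).
boxes_image_1 = tf.constant([[0.0, 0.0, 0.5, 0.5],
                             [0.0, 0.0, 1.0, 1.0]])
boxes_image_2 = tf.constant([[0.2, 0.2, 0.8, 0.8],
                             [0.0, 0.0, 0.0, 0.0]])  # padding row

# Stacking the list gives one batched [batch_size, num_boxes, 4] tensor.
batched_boxes = tf.stack([boxes_image_1, boxes_image_2])
# Matching 'num_groundtruth_boxes' entry: one max_number_of_boxes per image,
# built with the same tf.tile pattern as the diff.
num_boxes = tf.tile([2], multiples=[tf.shape(batched_boxes)[0]])

with tf.Session() as sess:
  print(sess.run(tf.shape(batched_boxes)))  # [2 2 4]
  print(sess.run(num_boxes))                # [2 2]
```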
@@ -226,7 +236,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
       boxes_shape = (
           labels[fields.InputDataFields.groundtruth_boxes].get_shape()
           .as_list())
-      unpad_groundtruth_tensors = True if boxes_shape[1] is not None else False
+      unpad_groundtruth_tensors = boxes_shape[1] is not None and not use_tpu
       labels = unstack_batch(
           labels, unpad_groundtruth_tensors=unpad_groundtruth_tensors)
@@ -243,12 +253,17 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
     gt_weights_list = None
     if fields.InputDataFields.groundtruth_weights in labels:
       gt_weights_list = labels[fields.InputDataFields.groundtruth_weights]
+    gt_confidences_list = None
+    if fields.InputDataFields.groundtruth_confidences in labels:
+      gt_confidences_list = labels[
+          fields.InputDataFields.groundtruth_confidences]
     gt_is_crowd_list = None
     if fields.InputDataFields.groundtruth_is_crowd in labels:
       gt_is_crowd_list = labels[fields.InputDataFields.groundtruth_is_crowd]
     detection_model.provide_groundtruth(
         groundtruth_boxes_list=gt_boxes_list,
         groundtruth_classes_list=gt_classes_list,
+        groundtruth_confidences_list=gt_confidences_list,
         groundtruth_masks_list=gt_masks_list,
         groundtruth_keypoints_list=gt_keypoints_list,
         groundtruth_weights_list=gt_weights_list,
@@ -378,24 +393,30 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
     if mode == tf.estimator.ModeKeys.EVAL:
       class_agnostic = (
           fields.DetectionResultFields.detection_classes not in detections)
-      groundtruth = _prepare_groundtruth_for_eval(detection_model,
-                                                  class_agnostic)
+      groundtruth = _prepare_groundtruth_for_eval(
+          detection_model, class_agnostic,
+          eval_input_config.max_number_of_boxes)
       use_original_images = fields.InputDataFields.original_image in features
       if use_original_images:
-        eval_images = tf.cast(tf.image.resize_bilinear(
-            features[fields.InputDataFields.original_image][0:1],
-            features[fields.InputDataFields.original_image_spatial_shape][0]),
-            tf.uint8)
+        eval_images = features[fields.InputDataFields.original_image]
+        true_image_shapes = tf.slice(
+            features[fields.InputDataFields.true_image_shape], [0, 0], [-1, 3])
+        original_image_spatial_shapes = features[fields.InputDataFields
+                                                 .original_image_spatial_shape]
       else:
         eval_images = features[fields.InputDataFields.image]
+        true_image_shapes = None
+        original_image_spatial_shapes = None

-      eval_dict = eval_util.result_dict_for_single_example(
-          eval_images[0:1],
-          features[inputs.HASH_KEY][0],
+      eval_dict = eval_util.result_dict_for_batched_example(
+          eval_images,
+          features[inputs.HASH_KEY],
           detections,
           groundtruth,
           class_agnostic=class_agnostic,
-          scale_to_absolute=True)
+          scale_to_absolute=True,
+          original_image_spatial_shapes=original_image_spatial_shapes,
+          true_image_shapes=true_image_shapes)

       if class_agnostic:
         category_index = label_map_util.create_class_agnostic_category_index()
@@ -445,6 +466,15 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
           eval_metrics=eval_metric_ops,
           export_outputs=export_outputs)
     else:
+      if scaffold is None:
+        keep_checkpoint_every_n_hours = (
+            train_config.keep_checkpoint_every_n_hours)
+        saver = tf.train.Saver(
+            sharded=True,
+            keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours,
+            save_relative_paths=True)
+        tf.add_to_collection(tf.GraphKeys.SAVERS, saver)
+        scaffold = tf.train.Scaffold(saver=saver)
       return tf.estimator.EstimatorSpec(
           mode=mode,
           predictions=detections,
......
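The new `else:` branch above exists so that `keep_checkpoint_every_n_hours` from the train config actually reaches the `Saver` (the Estimator's default scaffold would otherwise ignore it). A self-contained sketch of the same wiring; the dummy variable and the hour value are placeholders for illustration only:

```python
import tensorflow as tf

_ = tf.Variable(0.0, name='dummy')  # Saver needs at least one variable

keep_checkpoint_every_n_hours = 1.0  # stand-in for the train_config value

saver = tf.train.Saver(
    sharded=True,  # shard checkpoint writes across devices
    keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours,
    save_relative_paths=True)  # keeps checkpoint metadata portable
# Registering in the SAVERS collection lets other graph consumers find it.
tf.add_to_collection(tf.GraphKeys.SAVERS, saver)
scaffold = tf.train.Scaffold(saver=saver)
```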
@@ -301,8 +301,6 @@ class WeightSharedConvolutionalBoxPredictor(box_predictor.BoxPredictor):
                           num_predictions_per_location):
     if head_name == CLASS_PREDICTIONS_WITH_BACKGROUND:
       tower_name_scope = 'ClassPredictionTower'
-    elif head_name == MASK_PREDICTIONS:
-      tower_name_scope = 'MaskPredictionTower'
     else:
       raise ValueError('Unknown head')
     if self._share_prediction_tower:
......
@@ -119,6 +119,10 @@ class ConvolutionalBoxPredictor(box_predictor.KerasBoxPredictor):
     if other_heads:
       self._prediction_heads.update(other_heads)

+    # We generate a consistent ordering for the prediction head names,
+    # so that all workers build the model in the exact same order.
+    self._sorted_head_names = sorted(self._prediction_heads.keys())

     self._conv_hyperparams = conv_hyperparams
     self._min_depth = min_depth
     self._max_depth = max_depth
@@ -187,7 +191,7 @@ class ConvolutionalBoxPredictor(box_predictor.KerasBoxPredictor):
       for layer in self._shared_nets[index]:
         net = layer(net)

-      for head_name in self._prediction_heads:
+      for head_name in self._sorted_head_names:
         head_obj = self._prediction_heads[head_name][index]
         prediction = head_obj(net)
         predictions[head_name].append(prediction)
......
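The `_sorted_head_names` change fixes the non-deterministic head ordering described in the commit log: iterating a dict can create variables in different orders on different workers, mismatching parameter-server assignments. A minimal illustration of the core idea, independent of the box predictor classes:

```python
# Head names mapped to head objects; dict insertion order may differ
# across workers depending on how the dict was populated.
prediction_heads = {
    'class_predictions_with_background': object(),
    'box_encodings': object(),
}

# Sorting once at construction time gives a deterministic build order,
# regardless of how the dict was populated.
sorted_head_names = sorted(prediction_heads.keys())
assert sorted_head_names == [
    'box_encodings', 'class_predictions_with_background']

for head_name in sorted_head_names:
  head = prediction_heads[head_name]  # build/apply heads in a fixed order
```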
@@ -188,6 +188,8 @@ class ConvolutionalKerasBoxPredictorTest(test_case.TestCase):
             'BoxPredictor/ConvolutionalClassHead_0/ClassPredictor/bias',
             'BoxPredictor/ConvolutionalClassHead_0/ClassPredictor/kernel'])
     self.assertEqual(expected_variable_set, actual_variable_set)
+    self.assertEqual(conv_box_predictor._sorted_head_names,
+                     ['box_encodings', 'class_predictions_with_background'])

 # TODO(kaftan): Remove conditional after CMLE moves to TF 1.10
......
@@ -15,18 +15,6 @@ message BoxPredictor {
   }
 }

-// Configuration proto for MaskHead in predictors.
-// Next id: 4
-message MaskHead {
-  // The height and the width of the predicted mask. Only used when
-  // predict_instance_masks is true.
-  optional int32 mask_height = 1 [default = 15];
-  optional int32 mask_width = 2 [default = 15];
-
-  // Whether to predict class agnostic masks. Only used when
-  // predict_instance_masks is true.
-  optional bool masks_are_class_agnostic = 3 [default = true];
-}

 // Configuration proto for Convolutional box predictor.
 // Next id: 13
@@ -69,9 +57,6 @@ message ConvolutionalBoxPredictor {
   // Whether to use depthwise separable convolution for box predictor layers.
   optional bool use_depthwise = 11 [default = false];

-  // Configs for a mask prediction head.
-  optional MaskHead mask_head = 12;
 }

 // Configuration proto for weight shared convolutional box predictor.
@@ -113,9 +98,6 @@ message WeightSharedConvolutionalBoxPredictor {
   // Whether to use depthwise separable convolution for box predictor layers.
   optional bool use_depthwise = 14 [default = false];

-  // Configs for a mask prediction head.
-  optional MaskHead mask_head = 15;

   // Enum to specify how to convert the detection scores at inference time.
   enum ScoreConverter {
     // Input scores equals output scores.
......
@@ -35,9 +35,16 @@ message Hyperparams {
   }
   optional Activation activation = 4 [default = RELU];

-  // BatchNorm hyperparameters. If this parameter is NOT set then BatchNorm is
-  // not applied!
-  optional BatchNorm batch_norm = 5;
+  oneof normalizer_oneof {
+    // Note that if nothing below is selected, then no normalization is
+    // applied.
+    // BatchNorm hyperparameters.
+    BatchNorm batch_norm = 5;
+    // GroupNorm hyperparameters. This is only supported on a subset of models.
+    // Note that the current implementation of group norm instantiated in
+    // tf.contrib.group.layers.group_norm() only supports fixed_size_resizer
+    // for image preprocessing.
+    GroupNorm group_norm = 7;
+  }

   // Whether depthwise convolutions should be regularized. If this parameter is
   // NOT set then the conv hyperparams will default to the parent scope.
@@ -113,3 +120,8 @@ message BatchNorm {
   // forward pass but they are never updated.
   optional bool train = 5 [default = true];
 }

+// Configuration proto for group normalization to apply after convolution op.
+// https://arxiv.org/abs/1803.08494
+message GroupNorm {
+}
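The `GroupNorm` message carries no fields, so selecting it in `normalizer_oneof` simply switches normalization to group norm. As a hedged sketch of what the builder plugs in (the comment above spells the module path slightly differently; `tf.contrib.layers.group_norm` with its default of 32 groups is assumed here, not confirmed by the diff):

```python
import tensorflow as tf

inputs = tf.random_normal([8, 16, 16, 64])  # NHWC feature map, 64 channels

# Splits channels into groups (default 32) and normalizes each group over
# (H, W, channels-in-group), per https://arxiv.org/abs/1803.08494.
normalized = tf.contrib.layers.group_norm(inputs)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())  # beta/gamma variables
  print(sess.run(tf.shape(normalized)))  # [ 8 16 16 64]
```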