Commit a1337e01 authored by Zhichao Lu, committed by pkulzc

Merged commit includes the following changes:

223075771  by lzc:

    Bring in external fixes.

--
222919755  by ronnyvotel:

    Bug fix in the Faster R-CNN model builder: `inplace_batchnorm_update` was previously being passed as `reuse_weights`.
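
    A minimal sketch of the bug class (hypothetical signature, not the
    builder's actual one): a positional argument silently binding to the
    wrong parameter.

```python
def build_faster_rcnn(reuse_weights=None, inplace_batchnorm_update=False):
  # Stand-in for the real builder; only the argument binding matters here.
  return {'reuse_weights': reuse_weights,
          'inplace_batchnorm_update': inplace_batchnorm_update}

# Buggy call: the batch-norm flag lands in `reuse_weights`.
assert build_faster_rcnn(True)['reuse_weights'] is True
# Fixed call: bind the flag by keyword.
assert build_faster_rcnn(inplace_batchnorm_update=True)['reuse_weights'] is None
```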

--
222885680  by Zhichao Lu:

    Use result_dict_for_batched_example in model_lib.
    Also fixes the visualization size when eval runs on GPU.

--
222883648  by Zhichao Lu:

    Fix _unmatched_class_label for the _add_background_class == False case in ssd_meta_arch.py.

--
222836663  by Zhichao Lu:

    Adding support for visualizing grayscale images. Without this change, the images are black-red instead of grayscale.

--
222501978  by Zhichao Lu:

    Fix a bug that caused convert_to_grayscale flag not to be respected.

--
222432846  by richardmunoz:

    Fix mapping of groundtruth_confidences from shape [num_boxes] to [num_boxes, num_classes] when the input contains the groundtruth_confidences field.
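
    A toy illustration of the fixed mapping (TF 1.x; values mirror the new
    unit test): each per-box confidence is broadcast across that box's
    one-hot class row, yielding a [num_boxes, num_classes] tensor.

```python
import tensorflow as tf

confidences = tf.constant([1.0, -1.0])         # shape [num_boxes]
classes_one_hot = tf.constant([[0., 0., 1.],
                               [1., 0., 0.]])  # shape [num_boxes, num_classes]

confidences_2d = tf.reshape(confidences, [-1, 1]) * classes_one_hot

with tf.Session() as sess:
  print(sess.run(confidences_2d))  # [[0., 0., 1.], [-1., 0., 0.]]
```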

--
221725755  by richardmunoz:

    Internal change.

--
221458536  by Zhichao Lu:

    Fix saver defer build bug in object detection train codepath.

--
221391590  by Zhichao Lu:

    Add support for group normalization in the object detection API. Just adding MobileNet-v1 SSD currently. This may serve as a road map for other models that wish to support group normalization as an option.
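
    For reference, a minimal sketch of the underlying op the new config
    option selects (the real wiring goes through hyperparams_builder; this
    only shows the contrib call):

```python
import tensorflow as tf

inputs = tf.random_normal([1, 32, 32, 64])  # NHWC feature map
# Group norm normalizes over groups of channels (32 by default) rather
# than batch statistics, so it is insensitive to batch size.
outputs = tf.contrib.layers.group_norm(inputs)
```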

--
221367993  by Zhichao Lu:

    Bug fixes: (1) make RandomPadImage work; (2) fix keep_checkpoint_every_n_hours.

--
221266403  by rathodv:

    Use detection boxes as proposals to compute correct mask loss in eval jobs.

--
220845934  by lzc:

    Internal change.

--
220778850  by Zhichao Lu:

    Incorporating existing metrics into Estimator framework.
    Should restore:
    - oid_challenge_detection_metrics
    - pascal_voc_detection_metrics
    - weighted_pascal_voc_detection_metrics
    - pascal_voc_instance_segmentation_metrics
    - weighted_pascal_voc_instance_segmentation_metrics
    - oid_V2_detection_metrics

--
220370391  by alirezafathi:

    Adding precision and recall to the metrics.

--
220321268  by Zhichao Lu:

    Allow the option of setting max_examples_to_draw to zero.

--
220193337  by Zhichao Lu:

    This CL fixes a bug where the Keras convolutional box predictor was applying heads in the non-deterministic dict order. The consequence of this bug was that variables were created in non-deterministic orders. This in turn led different workers in a multi-gpu training setup to have slightly different graphs which had variables assigned to mismatched parameter servers. As a result, roughly half of all workers were unable to initialize and did no work, and training time was slowed down approximately 2x.
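
    A plain-Python sketch of the fix: iterate heads in sorted-name order so
    that variable creation order is identical on every worker.

```python
prediction_heads = {
    'class_predictions_with_background': object(),  # stand-in head objects
    'box_encodings': object(),
}

# Before: `for head_name in prediction_heads:` left the iteration (and
# hence variable creation) order up to the dict, which did not reliably
# match across workers. After: a consistent ordering shared by all workers.
for head_name in sorted(prediction_heads):
  head_obj = prediction_heads[head_name]
```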

--
220136508  by huizhongc:

    Add weight equalization loss to SSD meta arch.

--
220125875  by pengchong:

    Rename label_scores to label_weights

--
219730108  by Zhichao Lu:

    Add description of detection_keypoints in postprocessed_tensors to docstring.

--
219577519  by pengchong:

    Support parsing the class confidences and training using them.

--
219547611  by lzc:

    Stop using static shapes in GPU eval jobs.

--
219536476  by Zhichao Lu:

    Migrate TensorFlow Lite out of tensorflow/contrib

    This change moves //tensorflow/contrib/lite to //tensorflow/lite in preparation
    for TensorFlow 2.0's deprecation of contrib/. If you refer to TF Lite build
    targets or headers, you will need to update them manually. If you use TF Lite
    from the TensorFlow python package, "tf.contrib.lite" now points to "tf.lite".
    Please update your imports as soon as possible.

    For more details, see https://groups.google.com/a/tensorflow.org/forum/#!topic/tflite/iIIXOTOFvwQ

    @angersson and @aselle are conducting this migration. Please contact them if
    you have any further questions.

--
219190083  by Zhichao Lu:

    Add a second expected_loss_weights function that uses an alternative expectation calculation. Integrate this op into ssd_meta_arch and the losses builder. Files that call losses_builder.build are updated to handle the additional returned element.
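
    A toy stand-in with the described interface (not the library's
    implementation): it takes batch_cls_targets and returns
    foreground/background weights used to mix the matched and unmatched
    classification losses.

```python
import tensorflow as tf

def toy_expected_loss_weights_fn(batch_cls_targets):
  # Assumes channel 0 holds the background probability; the foreground
  # weight is its complement. Purely illustrative.
  background_weights = batch_cls_targets[..., 0:1]
  foreground_weights = 1.0 - background_weights
  return foreground_weights, background_weights

# Usage mirrors the integration in ssd_meta_arch:
#   cls_losses = (foreground_weights * cls_losses +
#                 background_weights * unmatched_cls_losses)
```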

--
218924451  by pengchong:

    Add a new way to assign training targets using groundtruth confidences.

--
218760524  by chowdhery:

    Modify export script to add option for regular NMS in TFLite post-processing op.

--

PiperOrigin-RevId: 223075771
parent 2c680af3
......@@ -195,6 +195,8 @@ def add_output_tensor_nodes(postprocessed_tensors,
'detection_classes': [batch, max_detections]
'detection_masks': [batch, max_detections, mask_height, mask_width]
(optional).
'detection_keypoints': [batch, max_detections, num_keypoints, 2]
(optional).
'num_detections': [batch]
output_collection_name: Name of collection to add output tensors to.
......
......@@ -83,7 +83,7 @@ tf_record_input_reader: { input_path: '${INPUT_TFRECORDS_WITH_DETECTIONS}' }
" > ${OUTPUT_CONFIG_DIR}/input_config.pbtxt
echo "
metrics_set: 'oid_challenge_object_detection_metrics'
metrics_set: 'oid_challenge_detection_metrics'
" > ${OUTPUT_CONFIG_DIR}/eval_config.pbtxt
OUTPUT_METRICS_DIR=/path/to/metrics_csv
......
......@@ -109,9 +109,11 @@ Model name
## Open Images-trained models
Model name | Speed (ms) | Open Images mAP@0.5[^2] | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
[faster_rcnn_inception_resnet_v2_atrous_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28.tar.gz) | 727 | 37 | Boxes
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------: | :---------------------: | :-----:
[faster_rcnn_inception_resnet_v2_atrous_oidv2](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28.tar.gz) | 727 | 37 | Boxes
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oidv2](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes
[faster_rcnn_inception_resnet_v2_atrous_oidv4](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oidv4_2018_10_30.tar.gz) | 455 | | Boxes
[ssd_mobilenetv2_oidv4](http://download.tensorflow.org/models/object_detection/ssd_mobilenetv2_oidv4_2018_10_30.tar.gz) | 24 | | Boxes
[facessd_mobilenet_v2_quantized_open_image_v4](http://download.tensorflow.org/models/object_detection/facessd_mobilenet_v2_quantized_320x320_open_image_v4.tar.gz) [^3] | 20 | 73 (faces) | Boxes
## iNaturalist Species-trained models
......
......@@ -65,7 +65,7 @@ intersection over union based on the object masks instead of object boxes.
## Open Images V2 detection metric
`EvalConfig.metrics_set='open_images_V2_detection_metrics'`
`EvalConfig.metrics_set='oid_V2_detection_metrics'`
This metric is defined originally for evaluating detector performance on [Open
Images V2 dataset](https://github.com/openimages/dataset) and is fairly similar
......@@ -132,14 +132,20 @@ convention, the evaluation software treats all classes independently, ignoring
the hierarchy. To achieve high performance values, object detectors should
output bounding-boxes labelled in the same manner.
The old metric name is DEPRECATED.
`EvalConfig.metrics_set='open_images_V2_detection_metrics'`
## OID Challenge Object Detection Metric 2018
`EvalConfig.metrics_set='oid_challenge_object_detection_metrics'`
`EvalConfig.metrics_set='oid_challenge_detection_metrics'`
The metric for the OID Challenge Object Detection Metric 2018, Object Detection
track. The description is provided on the [Open Images Challenge
website](https://storage.googleapis.com/openimages/web/challenge.html).
The old metric name is DEPRECATED.
`EvalConfig.metrics_set='oid_challenge_object_detection_metrics'`
## OID Challenge Visual Relationship Detection Metric 2018
The metric for the OID Challenge Visual Relationship Detection Metric 2018, Visual
......
......@@ -216,7 +216,7 @@ tf_record_input_reader: { input_path: '${SPLIT}_detections.tfrecord@${NUM_SHARDS
" > ${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt
echo "
metrics_set: 'open_images_V2_detection_metrics'
metrics_set: 'oid_V2_detection_metrics'
" > ${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt
```
......
......@@ -49,14 +49,14 @@ will output the frozen graph that we can input to TensorFlow Lite directly and
is the one we’ll be using.
Next we’ll use TensorFlow Lite to get the optimized model by using
[TOCO](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/toco),
[TOCO](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/toco),
the TensorFlow Lite Optimizing Converter. This will convert the resulting frozen
graph (tflite_graph.pb) to the TensorFlow Lite flatbuffer format (detect.tflite)
via the following command. For a quantized model, run this from the tensorflow/
directory:
```shell
bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
bazel run --config=opt tensorflow/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,300,300,3 \
......@@ -75,14 +75,14 @@ are named 'TFLite_Detection_PostProcess', 'TFLite_Detection_PostProcess:1',
'TFLite_Detection_PostProcess:2', and 'TFLite_Detection_PostProcess:3' and
represent four arrays: detection_boxes, detection_classes, detection_scores, and
num_detections. The documentation for other flags used in this command is
[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/toco/g3doc/cmdline_reference.md).
[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/toco/g3doc/cmdline_reference.md).
If things ran successfully, you should now see a third file in the /tmp/tflite
directory called detect.tflite. This file contains the graph and all model
parameters and can be run via the TensorFlow Lite interpreter on the Android
device. For a floating point model, run this from the tensorflow/ directory:
```shell
bazel run -c opt tensorflow/lite/toco:toco -- \
bazel run --config=opt tensorflow/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,300,300,3 \
......@@ -105,7 +105,7 @@ Studio](https://developer.android.com/studio/index.html). To build the
TensorFlow Lite Android demo, build tools require API >= 23 (but it will run on
devices with API >= 21). Additional details are available on the [TensorFlow
Lite Android App
page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md).
page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/java/demo/README.md).
Next we need to point the app to our new detect.tflite file and give it the
names of our new labels. Specifically, we will copy our TensorFlow Lite
......@@ -113,24 +113,24 @@ flatbuffer to the app assets directory with the following command:
```shell
cp /tmp/tflite/detect.tflite \
//tensorflow/contrib/lite/examples/android/app/src/main/assets
//tensorflow/lite/examples/android/app/src/main/assets
```
You will also need to copy your new labelmap labels_list.txt to the assets
directory.
We will now edit the BUILD file to point to this new model. First, open the
BUILD file tensorflow/contrib/lite/examples/android/BUILD. Then find the assets
BUILD file tensorflow/lite/examples/android/BUILD. Then find the assets
section, and replace the line “@tflite_mobilenet_ssd_quant//:detect.tflite”
(which by default points to a COCO pretrained model) with the path to your new
TFLite model
“//tensorflow/contrib/lite/examples/android/app/src/main/assets:detect.tflite”.
“//tensorflow/lite/examples/android/app/src/main/assets:detect.tflite”.
Finally, change the last line in assets section to use the new label map as
well.
We will also need to tell our app to use the new label map. In order to do this,
open up the
tensorflow/contrib/lite/examples/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java
tensorflow/lite/examples/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java
file in a text editor and find the definition of TF_OD_API_LABELS_FILE. Update
this path to point to your new label map file:
"file:///android_asset/labels_list.txt". Note that if your model is quantized,
......@@ -150,7 +150,7 @@ from the tensorflow directory:
```shell
bazel build -c opt --config=android_arm{,64} --cxxopt='--std=c++11'
"//tensorflow/contrib/lite/examples/android:tflite_demo"
"//tensorflow/lite/examples/android:tflite_demo"
```
Now install the demo on a
......@@ -159,5 +159,5 @@ Android phone via [Android Debug
Bridge](https://developer.android.com/studio/command-line/adb) (adb):
```shell
adb install bazel-bin/tensorflow/contrib/lite/examples/android/tflite_demo.apk
adb install bazel-bin/tensorflow/lite/examples/android/tflite_demo.apk
```
......@@ -139,12 +139,10 @@ def transform_input_data(tensor_dict,
if fields.InputDataFields.groundtruth_confidences in tensor_dict:
groundtruth_confidences = tensor_dict[
fields.InputDataFields.groundtruth_confidences]
# Map the confidences to the one-hot encoding of classes
tensor_dict[fields.InputDataFields.groundtruth_confidences] = (
tf.sparse_to_dense(
zero_indexed_groundtruth_classes,
[num_classes],
groundtruth_confidences,
validate_indices=False))
tf.reshape(groundtruth_confidences, [-1, 1]) *
tensor_dict[fields.InputDataFields.groundtruth_classes])
else:
groundtruth_confidences = tf.ones_like(
zero_indexed_groundtruth_classes, dtype=tf.float32)
......@@ -200,10 +198,14 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
if fields.InputDataFields.image_additional_channels in tensor_dict:
num_additional_channels = tensor_dict[
fields.InputDataFields.image_additional_channels].shape[2].value
num_image_channels = 3
if fields.InputDataFields.image in tensor_dict:
num_image_channels = tensor_dict[fields.InputDataFields
.image].shape[2].value
padding_shapes = {
# Additional channels are merged before batching.
fields.InputDataFields.image: [
height, width, 3 + num_additional_channels
height, width, num_image_channels + num_additional_channels
],
fields.InputDataFields.original_image_spatial_shape: [2],
fields.InputDataFields.image_additional_channels: [
......@@ -215,8 +217,6 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
fields.InputDataFields.groundtruth_difficult: [max_num_boxes],
fields.InputDataFields.groundtruth_boxes: [max_num_boxes, 4],
fields.InputDataFields.groundtruth_classes: [max_num_boxes, num_classes],
fields.InputDataFields.groundtruth_confidences: [
max_num_boxes, num_classes],
fields.InputDataFields.groundtruth_instance_masks: [
max_num_boxes, height, width
],
......@@ -224,9 +224,12 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
fields.InputDataFields.groundtruth_group_of: [max_num_boxes],
fields.InputDataFields.groundtruth_area: [max_num_boxes],
fields.InputDataFields.groundtruth_weights: [max_num_boxes],
fields.InputDataFields.groundtruth_confidences: [
max_num_boxes, num_classes
],
fields.InputDataFields.num_groundtruth_boxes: [],
fields.InputDataFields.groundtruth_label_types: [max_num_boxes],
fields.InputDataFields.groundtruth_label_scores: [max_num_boxes],
fields.InputDataFields.groundtruth_label_weights: [max_num_boxes],
fields.InputDataFields.true_image_shape: [3],
fields.InputDataFields.multiclass_scores: [
max_num_boxes, num_classes + 1 if num_classes is not None else None
......@@ -237,7 +240,7 @@ def pad_input_data_to_static_shapes(tensor_dict, max_num_boxes, num_classes,
if fields.InputDataFields.original_image in tensor_dict:
padding_shapes[fields.InputDataFields.original_image] = [
height, width, 3 + num_additional_channels
height, width, num_image_channels + num_additional_channels
]
if fields.InputDataFields.groundtruth_keypoints in tensor_dict:
tensor_shape = (
......@@ -287,9 +290,15 @@ def augment_input_data(tensor_dict, data_augmentation_options):
in tensor_dict)
include_keypoints = (fields.InputDataFields.groundtruth_keypoints
in tensor_dict)
include_label_weights = (fields.InputDataFields.groundtruth_weights
in tensor_dict)
include_label_confidences = (fields.InputDataFields.groundtruth_confidences
in tensor_dict)
tensor_dict = preprocessor.preprocess(
tensor_dict, data_augmentation_options,
func_arg_map=preprocessor.get_default_func_arg_map(
include_label_weights=include_label_weights,
include_label_confidences=include_label_confidences,
include_instance_masks=include_instance_masks,
include_keypoints=include_keypoints))
tensor_dict[fields.InputDataFields.image] = tf.squeeze(
......@@ -303,7 +312,7 @@ def _get_labels_dict(input_dict):
fields.InputDataFields.num_groundtruth_boxes,
fields.InputDataFields.groundtruth_boxes,
fields.InputDataFields.groundtruth_classes,
fields.InputDataFields.groundtruth_weights
fields.InputDataFields.groundtruth_weights,
]
labels_dict = {}
for key in required_label_keys:
......
......@@ -93,17 +93,17 @@ class InputsTest(test_case.TestCase, parameterized.TestCase):
labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[1, 100],
labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_weights].dtype)
self.assertAllEqual(
[1, 100, model_config.faster_rcnn.num_classes],
labels[fields.InputDataFields.groundtruth_confidences].shape.as_list())
self.assertEqual(
tf.float32,
labels[fields.InputDataFields.groundtruth_confidences].dtype)
self.assertAllEqual(
[1, 100],
labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_weights].dtype)
@parameterized.parameters(
{'eval_batch_size': 1},
......@@ -141,11 +141,11 @@ class InputsTest(test_case.TestCase, parameterized.TestCase):
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[eval_batch_size, 100, model_config.faster_rcnn.num_classes],
labels[fields.InputDataFields.groundtruth_confidences].shape.as_list())
[eval_batch_size, 100],
labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(
tf.float32,
labels[fields.InputDataFields.groundtruth_confidences].dtype)
labels[fields.InputDataFields.groundtruth_weights].dtype)
self.assertAllEqual(
[eval_batch_size, 100],
labels[fields.InputDataFields.groundtruth_area].shape.as_list())
......@@ -194,16 +194,11 @@ class InputsTest(test_case.TestCase, parameterized.TestCase):
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[batch_size, 100, model_config.ssd.num_classes],
[batch_size, 100],
labels[
fields.InputDataFields.groundtruth_confidences].shape.as_list())
fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(
tf.float32,
labels[fields.InputDataFields.groundtruth_confidences].dtype)
self.assertAllEqual(
[batch_size, 100],
labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_weights].dtype)
@parameterized.parameters(
......@@ -242,12 +237,12 @@ class InputsTest(test_case.TestCase, parameterized.TestCase):
self.assertEqual(tf.float32,
labels[fields.InputDataFields.groundtruth_classes].dtype)
self.assertAllEqual(
[eval_batch_size, 100, model_config.ssd.num_classes],
[eval_batch_size, 100],
labels[
fields.InputDataFields.groundtruth_confidences].shape.as_list())
fields.InputDataFields.groundtruth_weights].shape.as_list())
self.assertEqual(
tf.float32,
labels[fields.InputDataFields.groundtruth_confidences].dtype)
labels[fields.InputDataFields.groundtruth_weights].dtype)
self.assertAllEqual(
[eval_batch_size, 100],
labels[fields.InputDataFields.groundtruth_area].shape.as_list())
......@@ -447,7 +442,7 @@ class DataAugmentationFnTest(test_case.TestCase):
tf.constant(np.array([[.5, .5, 1., 1.]], np.float32)),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([1.0], np.float32)),
fields.InputDataFields.groundtruth_confidences:
fields.InputDataFields.groundtruth_weights:
tf.constant(np.array([0.8], np.float32)),
}
augmented_tensor_dict = data_augmentation_fn(tensor_dict=tensor_dict)
......@@ -468,7 +463,7 @@ class DataAugmentationFnTest(test_case.TestCase):
)
self.assertAllClose(
augmented_tensor_dict_out[
fields.InputDataFields.groundtruth_confidences],
fields.InputDataFields.groundtruth_weights],
[0.8]
)
......@@ -634,6 +629,34 @@ class DataTransformationFnTest(test_case.TestCase):
transformed_inputs[fields.InputDataFields.num_groundtruth_boxes],
1)
def test_returns_correct_groundtruth_confidences_when_input_present(self):
tensor_dict = {
fields.InputDataFields.image:
tf.constant(np.random.rand(4, 4, 3).astype(np.float32)),
fields.InputDataFields.groundtruth_boxes:
tf.constant(np.array([[0, 0, 1, 1], [.5, .5, 1, 1]], np.float32)),
fields.InputDataFields.groundtruth_classes:
tf.constant(np.array([3, 1], np.int32)),
fields.InputDataFields.groundtruth_confidences:
tf.constant(np.array([1.0, -1.0], np.float32))
}
num_classes = 3
input_transformation_fn = functools.partial(
inputs.transform_input_data,
model_preprocess_fn=_fake_model_preprocessor_fn,
image_resizer_fn=_fake_image_resizer_fn,
num_classes=num_classes)
with self.test_session() as sess:
transformed_inputs = sess.run(
input_transformation_fn(tensor_dict=tensor_dict))
self.assertAllClose(
transformed_inputs[fields.InputDataFields.groundtruth_classes],
[[0, 0, 1], [1, 0, 0]])
self.assertAllClose(
transformed_inputs[fields.InputDataFields.groundtruth_confidences],
[[0, 0, 1], [-1, 0, 0]])
def test_returns_resized_masks(self):
tensor_dict = {
fields.InputDataFields.image:
......@@ -879,6 +902,41 @@ class PadInputDataToStaticShapesFnTest(test_case.TestCase):
padded_tensor_dict[fields.InputDataFields.image_additional_channels]
.shape.as_list(), [5, 6, 2])
def test_gray_images(self):
input_tensor_dict = {
fields.InputDataFields.image:
tf.placeholder(tf.float32, [None, None, 1]),
}
padded_tensor_dict = inputs.pad_input_data_to_static_shapes(
tensor_dict=input_tensor_dict,
max_num_boxes=3,
num_classes=3,
spatial_image_shape=[5, 6])
self.assertAllEqual(
padded_tensor_dict[fields.InputDataFields.image].shape.as_list(),
[5, 6, 1])
def test_gray_images_and_additional_channels(self):
input_tensor_dict = {
fields.InputDataFields.image:
tf.placeholder(tf.float32, [None, None, 1]),
fields.InputDataFields.image_additional_channels:
tf.placeholder(tf.float32, [None, None, 2]),
}
padded_tensor_dict = inputs.pad_input_data_to_static_shapes(
tensor_dict=input_tensor_dict,
max_num_boxes=3,
num_classes=3,
spatial_image_shape=[5, 6])
self.assertAllEqual(
padded_tensor_dict[fields.InputDataFields.image].shape.as_list(),
[5, 6, 3])
self.assertAllEqual(
padded_tensor_dict[fields.InputDataFields.image_additional_channels]
.shape.as_list(), [5, 6, 2])
def test_keypoints(self):
input_tensor_dict = {
fields.InputDataFields.groundtruth_keypoints:
......
......@@ -39,12 +39,18 @@ EVAL_METRICS_CLASS_DICT = {
object_detection_evaluation.PascalInstanceSegmentationEvaluator,
'weighted_pascal_voc_instance_segmentation_metrics':
object_detection_evaluation.WeightedPascalInstanceSegmentationEvaluator,
'oid_V2_detection_metrics':
object_detection_evaluation.OpenImagesDetectionEvaluator,
# DEPRECATED: please use oid_V2_detection_metrics instead
'open_images_V2_detection_metrics':
object_detection_evaluation.OpenImagesDetectionEvaluator,
'coco_detection_metrics':
coco_evaluation.CocoDetectionEvaluator,
'coco_mask_metrics':
coco_evaluation.CocoMaskEvaluator,
'oid_challenge_detection_metrics':
object_detection_evaluation.OpenImagesDetectionChallengeEvaluator,
# DEPRECATED: please use oid_challenge_detection_metrics instead
'oid_challenge_object_detection_metrics':
object_detection_evaluation.OpenImagesDetectionChallengeEvaluator,
}
......@@ -146,6 +152,16 @@ def get_evaluators(eval_config, categories):
for eval_metric_fn_key in eval_metric_fn_keys:
if eval_metric_fn_key not in EVAL_METRICS_CLASS_DICT:
raise ValueError('Metric not found: {}'.format(eval_metric_fn_key))
if eval_metric_fn_key == 'oid_challenge_object_detection_metrics':
logging.warning(
'oid_challenge_object_detection_metrics is deprecated; '
'use oid_challenge_detection_metrics instead'
)
if eval_metric_fn_key == 'open_images_V2_detection_metrics':
logging.warning(
'open_images_V2_detection_metrics is deprecated; '
'use oid_V2_detection_metrics instead'
)
evaluators_list.append(
EVAL_METRICS_CLASS_DICT[eval_metric_fn_key](categories=categories))
return evaluators_list
......
......@@ -75,6 +75,7 @@ def create_input_queue(batch_size_per_clone, create_tensor_dict_fn,
tensor_dict = preprocessor.preprocess(
tensor_dict, data_augmentation_options,
func_arg_map=preprocessor.get_default_func_arg_map(
include_label_weights=True,
include_multiclass_scores=include_multiclass_scores,
include_instance_masks=include_instance_masks,
include_keypoints=include_keypoints))
......
......@@ -1568,7 +1568,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
Returns:
A dictionary containing:
`detection_boxes`: [batch, max_detection, 4]
`detection_boxes`: [batch, max_detection, 4] in normalized co-ordinates.
`detection_scores`: [batch, max_detections]
`detection_classes`: [batch, max_detections]
`num_detections`: [batch]
......@@ -1701,14 +1701,14 @@ class FasterRCNNMetaArch(model.DetectionModel):
prediction_dict['refined_box_encodings'],
prediction_dict['class_predictions_with_background'],
prediction_dict['proposal_boxes'],
prediction_dict['num_proposals'],
groundtruth_boxlists,
prediction_dict['num_proposals'], groundtruth_boxlists,
groundtruth_classes_with_background_list,
groundtruth_weights_list,
prediction_dict['image_shape'],
prediction_dict.get('mask_predictions'),
groundtruth_masks_list,
))
groundtruth_weights_list, prediction_dict['image_shape'],
prediction_dict.get('mask_predictions'), groundtruth_masks_list,
prediction_dict.get(
fields.DetectionResultFields.detection_boxes),
prediction_dict.get(
fields.DetectionResultFields.num_detections)))
return loss_dict
def _loss_rpn(self, rpn_box_encodings,
......@@ -1811,7 +1811,9 @@ class FasterRCNNMetaArch(model.DetectionModel):
groundtruth_weights_list,
image_shape,
prediction_masks=None,
groundtruth_masks_list=None):
groundtruth_masks_list=None,
detection_boxes=None,
num_detections=None):
"""Computes scalar box classifier loss tensors.
Uses self._detector_target_assigner to obtain regression and classification
......@@ -1854,6 +1856,11 @@ class FasterRCNNMetaArch(model.DetectionModel):
groundtruth_masks_list: an optional list of 3-D tensors of shape
[num_boxes, image_height, image_width] containing the instance masks for
each of the boxes.
detection_boxes: 3-D float tensor of shape [batch,
max_total_detections, 4] containing post-processed detection boxes in
normalized co-ordinates.
num_detections: 1-D int32 tensor of shape [batch] containing number of
valid detections in `detection_boxes`.
Returns:
a dictionary mapping loss keys ('second_stage_localization_loss',
......@@ -1867,7 +1874,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
"""
with tf.name_scope('BoxClassifierLoss'):
paddings_indicator = self._padded_batched_proposals_indicator(
num_proposals, self.max_num_proposals)
num_proposals, proposal_boxes.shape[1])
proposal_boxlists = [
box_list.BoxList(proposal_boxes_single_image)
for proposal_boxes_single_image in tf.unstack(proposal_boxes)]
......@@ -1958,6 +1965,13 @@ class FasterRCNNMetaArch(model.DetectionModel):
raise ValueError('Groundtruth instance masks not provided. '
'Please configure input reader.')
if not self._is_training:
(proposal_boxes, proposal_boxlists, paddings_indicator,
one_hot_flat_cls_targets_with_background
) = self._get_mask_proposal_boxes_and_classes(
detection_boxes, num_detections, image_shape,
groundtruth_boxlists, groundtruth_classes_with_background_list,
groundtruth_weights_list)
unmatched_mask_label = tf.zeros(image_shape[1:3], dtype=tf.float32)
(batch_mask_targets, _, _, batch_mask_target_weights,
_) = target_assigner.batch_assign_targets(
......@@ -2031,6 +2045,64 @@ class FasterRCNNMetaArch(model.DetectionModel):
loss_dict[mask_loss.op.name] = mask_loss
return loss_dict
def _get_mask_proposal_boxes_and_classes(
self, detection_boxes, num_detections, image_shape, groundtruth_boxlists,
groundtruth_classes_with_background_list, groundtruth_weights_list):
"""Returns proposal boxes and class targets to compute evaluation mask loss.
During evaluation, detection boxes are used to extract features for mask
prediction. Therefore, to compute mask loss during evaluation detection
boxes must be used to compute correct class and mask targets. This function
returns boxes and classes in the correct format for computing mask targets
during evaluation.
Args:
detection_boxes: A 3-D float tensor of shape [batch, max_detection_boxes,
4] containing detection boxes in normalized co-ordinates.
num_detections: A 1-D float tensor of shape [batch] containing number of
valid boxes in `detection_boxes`.
image_shape: A 1-D tensor of shape [4] containing image tensor shape.
groundtruth_boxlists: A list of groundtruth boxlists.
groundtruth_classes_with_background_list: A list of groundtruth classes.
groundtruth_weights_list: A list of groundtruth weights.
Return:
mask_proposal_boxes: detection boxes to use for mask proposals in absolute
co-ordinates.
mask_proposal_boxlists: `mask_proposal_boxes` in a list of BoxLists in
absolute co-ordinates.
mask_proposal_paddings_indicator: a tensor indicating valid boxes.
mask_proposal_one_hot_flat_cls_targets_with_background: Class targets
computed using detection boxes.
"""
batch, max_num_detections, _ = detection_boxes.shape.as_list()
proposal_boxes = tf.reshape(box_list_ops.to_absolute_coordinates(
box_list.BoxList(tf.reshape(detection_boxes, [-1, 4])), image_shape[1],
image_shape[2]).get(), [batch, max_num_detections, 4])
proposal_boxlists = [
box_list.BoxList(detection_boxes_single_image)
for detection_boxes_single_image in tf.unstack(proposal_boxes)
]
paddings_indicator = self._padded_batched_proposals_indicator(
tf.to_int32(num_detections), detection_boxes.shape[1])
(batch_cls_targets_with_background, _, _, _,
_) = target_assigner.batch_assign_targets(
target_assigner=self._detector_target_assigner,
anchors_batch=proposal_boxlists,
gt_box_batch=groundtruth_boxlists,
gt_class_targets_batch=groundtruth_classes_with_background_list,
unmatched_class_label=tf.constant(
[1] + self._num_classes * [0], dtype=tf.float32),
gt_weights_batch=groundtruth_weights_list)
flat_cls_targets_with_background = tf.reshape(
batch_cls_targets_with_background, [-1, self._num_classes + 1])
one_hot_flat_cls_targets_with_background = tf.argmax(
flat_cls_targets_with_background, axis=1)
one_hot_flat_cls_targets_with_background = tf.one_hot(
one_hot_flat_cls_targets_with_background,
flat_cls_targets_with_background.get_shape()[1])
return (proposal_boxes, proposal_boxlists, paddings_indicator,
one_hot_flat_cls_targets_with_background)
def _get_refined_encodings_for_postitive_class(
self, refined_box_encodings, flat_cls_targets_with_background,
batch_size):
......@@ -2185,4 +2257,3 @@ class FasterRCNNMetaArch(model.DetectionModel):
A list of update operators.
"""
return tf.get_collection(tf.GraphKeys.UPDATE_OPS)
......@@ -281,8 +281,12 @@ class SSDMetaArch(model.DetectionModel):
freeze_batchnorm=False,
inplace_batchnorm_update=False,
add_background_class=True,
explicit_background_class=False,
random_example_sampler=None,
expected_classification_loss_under_sampling=None):
expected_loss_weights_fn=None,
use_confidences_as_targets=False,
implicit_example_weight=0.5,
equalization_loss_config=None):
"""SSDMetaArch Constructor.
TODO(rathodv,jonathanhuang): group NMS parameters + score converter into
......@@ -335,17 +339,29 @@ class SSDMetaArch(model.DetectionModel):
dependency on tf.graphkeys.UPDATE_OPS collection in order to update
batch norm statistics.
add_background_class: Whether to add an implicit background class to
one-hot encodings of groundtruth labels. Set to false if using
groundtruth labels with an explicit background class or using multiclass
scores instead of truth in the case of distillation.
one-hot encodings of groundtruth labels. Set to false if training a
single class model or using groundtruth labels with an explicit
background class.
explicit_background_class: Set to true if using groundtruth labels with an
explicit background class, as in multiclass scores.
random_example_sampler: a BalancedPositiveNegativeSampler object that can
perform random example sampling when computing loss. If None, random
sampling process is skipped. Note that random example sampler and hard
example miner can both be applied to the model. In that case, random
sampler will take effect first and hard example miner can only process
the random sampled examples.
expected_classification_loss_under_sampling: If not None, use
to calculate classification loss by background/foreground weighting.
expected_loss_weights_fn: If not None, use to calculate
loss by background/foreground weighting. Should take batch_cls_targets
as inputs and return foreground_weights, background_weights. See
expected_classification_loss_by_expected_sampling and
expected_classification_loss_by_reweighting_unmatched_anchors in
third_party/tensorflow_models/object_detection/utils/ops.py as examples.
use_confidences_as_targets: Whether to use the groundtruth_confidences field
to assign the targets.
implicit_example_weight: a float number that specifies the weight used
for the implicit negative examples.
equalization_loss_config: a namedtuple that specifies configs for
computing equalization loss.
"""
super(SSDMetaArch, self).__init__(num_classes=box_predictor.num_classes)
self._is_training = is_training
......@@ -358,6 +374,11 @@ class SSDMetaArch(model.DetectionModel):
self._box_coder = box_coder
self._feature_extractor = feature_extractor
self._add_background_class = add_background_class
self._explicit_background_class = explicit_background_class
if add_background_class and explicit_background_class:
raise ValueError("Cannot have both 'add_background_class' and"
" 'explicit_background_class' true.")
# Needed for fine-tuning from classification checkpoints whose
# variables do not have the feature extractor scope.
......@@ -370,15 +391,18 @@ class SSDMetaArch(model.DetectionModel):
# Slim feature extractors get an explicit naming scope
self._extract_features_scope = 'FeatureExtractor'
if self._add_background_class and encode_background_as_zeros:
self._unmatched_class_label = tf.constant((self.num_classes + 1) * [0],
tf.float32)
elif self._add_background_class:
self._unmatched_class_label = tf.constant([1] + self.num_classes * [0],
tf.float32)
if encode_background_as_zeros:
background_class = [0]
else:
background_class = [1]
if self._add_background_class:
num_foreground_classes = self.num_classes
else:
self._unmatched_class_label = tf.constant(self.num_classes * [0],
tf.float32)
num_foreground_classes = self.num_classes - 1
self._unmatched_class_label = tf.constant(
background_class + num_foreground_classes * [0], tf.float32)
self._target_assigner = target_assigner_instance
......@@ -399,8 +423,11 @@ class SSDMetaArch(model.DetectionModel):
self._anchors = None
self._add_summaries = add_summaries
self._batched_prediction_tensor_names = []
self._expected_classification_loss_under_sampling = (
expected_classification_loss_under_sampling)
self._expected_loss_weights_fn = expected_loss_weights_fn
self._use_confidences_as_targets = use_confidences_as_targets
self._implicit_example_weight = implicit_example_weight
self._equalization_loss_config = equalization_loss_config
@property
def anchors(self):
......@@ -647,7 +674,7 @@ class SSDMetaArch(model.DetectionModel):
detection_scores = self._score_conversion_fn(class_predictions)
detection_scores = tf.identity(detection_scores, 'raw_box_scores')
if self._add_background_class:
if self._add_background_class or self._explicit_background_class:
detection_scores = tf.slice(detection_scores, [0, 0, 1], [-1, -1, -1])
additional_fields = None
......@@ -720,11 +747,14 @@ class SSDMetaArch(model.DetectionModel):
weights = None
if self.groundtruth_has_field(fields.BoxListFields.weights):
weights = self.groundtruth_lists(fields.BoxListFields.weights)
confidences = None
if self.groundtruth_has_field(fields.BoxListFields.confidences):
confidences = self.groundtruth_lists(fields.BoxListFields.confidences)
(batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, match_list) = self._assign_targets(
self.groundtruth_lists(fields.BoxListFields.boxes),
self.groundtruth_lists(fields.BoxListFields.classes),
keypoints, weights)
keypoints, weights, confidences)
if self._add_summaries:
self._summarize_target_assignment(
self.groundtruth_lists(fields.BoxListFields.boxes), match_list)
......@@ -762,7 +792,7 @@ class SSDMetaArch(model.DetectionModel):
weights=batch_cls_weights,
losses_mask=losses_mask)
if self._expected_classification_loss_under_sampling:
if self._expected_loss_weights_fn:
# Need to compute losses for assigned targets against the
# unmatched_class_label as well as their assigned targets.
# simplest thing (but wasteful) is just to calculate all losses
......@@ -787,8 +817,16 @@ class SSDMetaArch(model.DetectionModel):
batch_cls_targets = tf.concat(
[1 - batch_cls_targets, batch_cls_targets], axis=-1)
cls_losses = self._expected_classification_loss_under_sampling(
batch_cls_targets, cls_losses, unmatched_cls_losses)
location_losses = tf.tile(location_losses, [1, num_classes])
foreground_weights, background_weights = (
self._expected_loss_weights_fn(batch_cls_targets))
cls_losses = (
foreground_weights * cls_losses +
background_weights * unmatched_cls_losses)
location_losses *= foreground_weights
classification_loss = tf.reduce_sum(cls_losses)
localization_loss = tf.reduce_sum(location_losses)
......@@ -824,6 +862,8 @@ class SSDMetaArch(model.DetectionModel):
str(localization_loss.op.name): localization_loss,
str(classification_loss.op.name): classification_loss
}
return loss_dict
def _minibatch_subsample_fn(self, inputs):
......@@ -864,9 +904,12 @@ class SSDMetaArch(model.DetectionModel):
visualization_utils.add_cdf_image_summary(negative_anchor_cls_loss,
'NegativeAnchorLossCDF')
def _assign_targets(self, groundtruth_boxes_list, groundtruth_classes_list,
def _assign_targets(self,
groundtruth_boxes_list,
groundtruth_classes_list,
groundtruth_keypoints_list=None,
groundtruth_weights_list=None):
groundtruth_weights_list=None,
groundtruth_confidences_list=None):
"""Assign groundtruth targets.
Adds a background class to each one-hot encoding of groundtruth classes
......@@ -885,6 +928,9 @@ class SSDMetaArch(model.DetectionModel):
[num_boxes, num_keypoints, 2]
groundtruth_weights_list: A list of 1-D tf.float32 tensors of shape
[num_boxes] containing weights for groundtruth boxes.
groundtruth_confidences_list: A list of 2-D tf.float32 tensors of shape
[num_boxes, num_classes] containing class confidences for
groundtruth boxes.
Returns:
batch_cls_targets: a tensor with shape [batch_size, num_anchors,
......@@ -901,11 +947,18 @@ class SSDMetaArch(model.DetectionModel):
groundtruth_boxlists = [
box_list.BoxList(boxes) for boxes in groundtruth_boxes_list
]
train_using_confidences = (self._is_training and
self._use_confidences_as_targets)
if self._add_background_class:
groundtruth_classes_with_background_list = [
tf.pad(one_hot_encoding, [[0, 0], [1, 0]], mode='CONSTANT')
for one_hot_encoding in groundtruth_classes_list
]
if train_using_confidences:
groundtruth_confidences_with_background_list = [
tf.pad(groundtruth_confidences, [[0, 0], [1, 0]], mode='CONSTANT')
for groundtruth_confidences in groundtruth_confidences_list
]
else:
groundtruth_classes_with_background_list = groundtruth_classes_list
......@@ -913,9 +966,23 @@ class SSDMetaArch(model.DetectionModel):
for boxlist, keypoints in zip(
groundtruth_boxlists, groundtruth_keypoints_list):
boxlist.add_field(fields.BoxListFields.keypoints, keypoints)
if train_using_confidences:
return target_assigner.batch_assign_confidences(
self._target_assigner,
self.anchors,
groundtruth_boxlists,
groundtruth_confidences_with_background_list,
groundtruth_weights_list,
self._unmatched_class_label,
self._add_background_class,
self._implicit_example_weight)
else:
return target_assigner.batch_assign_targets(
self._target_assigner, self.anchors, groundtruth_boxlists,
groundtruth_classes_with_background_list, self._unmatched_class_label,
self._target_assigner,
self.anchors,
groundtruth_boxlists,
groundtruth_classes_with_background_list,
self._unmatched_class_label,
groundtruth_weights_list)
def _summarize_target_assignment(self, groundtruth_boxes_list, match_list):
......
......@@ -22,6 +22,7 @@ import tensorflow as tf
from object_detection.meta_architectures import ssd_meta_arch
from object_detection.meta_architectures import ssd_meta_arch_test_lib
from object_detection.protos import model_pb2
from object_detection.utils import test_utils
slim = tf.contrib.slim
......@@ -35,13 +36,13 @@ keras = tf.keras.layers
class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
parameterized.TestCase):
def _create_model(self,
def _create_model(
self,
apply_hard_mining=True,
normalize_loc_loss_by_codesize=False,
add_background_class=True,
random_example_sampling=False,
weight_regression_loss_by_score=False,
use_expected_classification_loss_under_sampling=False,
expected_loss_weights=model_pb2.DetectionModel().ssd.loss.NONE,
min_num_negative_samples=1,
desired_negative_sampling_ratio=3,
use_keras=False,
......@@ -54,9 +55,7 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
normalize_loc_loss_by_codesize=normalize_loc_loss_by_codesize,
add_background_class=add_background_class,
random_example_sampling=random_example_sampling,
weight_regression_loss_by_score=weight_regression_loss_by_score,
use_expected_classification_loss_under_sampling=
use_expected_classification_loss_under_sampling,
expected_loss_weights=expected_loss_weights,
min_num_negative_samples=min_num_negative_samples,
desired_negative_sampling_ratio=desired_negative_sampling_ratio,
use_keras=use_keras,
......@@ -358,91 +357,6 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
self.assertAllClose(localization_loss, expected_localization_loss)
self.assertAllClose(classification_loss, expected_classification_loss)
def test_loss_with_expected_classification_loss(self, use_keras):
with tf.Graph().as_default():
_, num_classes, num_anchors, _ = self._create_model(use_keras=use_keras)
def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
groundtruth_classes1, groundtruth_classes2):
groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
groundtruth_classes_list = [groundtruth_classes1, groundtruth_classes2]
model, _, _, _ = self._create_model(
apply_hard_mining=False,
add_background_class=True,
use_expected_classification_loss_under_sampling=True,
min_num_negative_samples=1,
desired_negative_sampling_ratio=desired_negative_sampling_ratio)
model.provide_groundtruth(groundtruth_boxes_list,
groundtruth_classes_list)
prediction_dict = model.predict(
preprocessed_tensor, true_image_shapes=None)
loss_dict = model.loss(prediction_dict, true_image_shapes=None)
return (loss_dict['Loss/localization_loss'],
loss_dict['Loss/classification_loss'])
batch_size = 2
desired_negative_sampling_ratio = 4
preprocessed_input = np.random.rand(batch_size, 2, 2, 3).astype(np.float32)
groundtruth_boxes1 = np.array([[0, 0, .5, .5]], dtype=np.float32)
groundtruth_boxes2 = np.array([[0, 0, .5, .5]], dtype=np.float32)
groundtruth_classes1 = np.array([[1]], dtype=np.float32)
groundtruth_classes2 = np.array([[1]], dtype=np.float32)
expected_localization_loss = 0.0
expected_classification_loss = (
batch_size * (num_anchors + num_classes * num_anchors) * np.log(2.0))
(localization_loss, classification_loss) = self.execute(
graph_fn, [
preprocessed_input, groundtruth_boxes1, groundtruth_boxes2,
groundtruth_classes1, groundtruth_classes2
])
self.assertAllClose(localization_loss, expected_localization_loss)
self.assertAllClose(classification_loss, expected_classification_loss)
def test_loss_results_are_correct_with_weight_regression_loss_by_score(
self, use_keras):
with tf.Graph().as_default():
_, num_classes, num_anchors, _ = self._create_model(
use_keras=use_keras,
add_background_class=False,
weight_regression_loss_by_score=True)
def graph_fn(preprocessed_tensor, groundtruth_boxes1, groundtruth_boxes2,
groundtruth_classes1, groundtruth_classes2):
groundtruth_boxes_list = [groundtruth_boxes1, groundtruth_boxes2]
groundtruth_classes_list = [groundtruth_classes1, groundtruth_classes2]
model, _, _, _ = self._create_model(
use_keras=use_keras,
apply_hard_mining=False,
add_background_class=False,
weight_regression_loss_by_score=True)
model.provide_groundtruth(groundtruth_boxes_list,
groundtruth_classes_list)
prediction_dict = model.predict(
preprocessed_tensor, true_image_shapes=None)
loss_dict = model.loss(prediction_dict, true_image_shapes=None)
return (loss_dict['Loss/localization_loss'],
loss_dict['Loss/classification_loss'])
batch_size = 2
preprocessed_input = np.random.rand(batch_size, 2, 2, 3).astype(np.float32)
groundtruth_boxes1 = np.array([[0, 0, 1, 1]], dtype=np.float32)
groundtruth_boxes2 = np.array([[0, 0, 1, 1]], dtype=np.float32)
groundtruth_classes1 = np.array([[1]], dtype=np.float32)
groundtruth_classes2 = np.array([[0]], dtype=np.float32)
expected_localization_loss = 0.25
expected_classification_loss = (
batch_size * num_anchors * num_classes * np.log(2.0))
(localization_loss, classification_loss) = self.execute(
graph_fn, [
preprocessed_input, groundtruth_boxes1, groundtruth_boxes2,
groundtruth_classes1, groundtruth_classes2
])
self.assertAllClose(localization_loss, expected_localization_loss)
self.assertAllClose(classification_loss, expected_classification_loss)
def test_loss_results_are_correct_with_losses_mask(self, use_keras):
......
......@@ -25,6 +25,7 @@ from object_detection.core import post_processing
from object_detection.core import region_similarity_calculator as sim_calc
from object_detection.core import target_assigner
from object_detection.meta_architectures import ssd_meta_arch
from object_detection.protos import model_pb2
from object_detection.utils import ops
from object_detection.utils import test_case
from object_detection.utils import test_utils
......@@ -111,14 +112,14 @@ class MockAnchorGenerator2x2(anchor_generator.AnchorGenerator):
class SSDMetaArchTestBase(test_case.TestCase):
"""Base class to test SSD based meta architectures."""
def _create_model(self,
def _create_model(
self,
model_fn=ssd_meta_arch.SSDMetaArch,
apply_hard_mining=True,
normalize_loc_loss_by_codesize=False,
add_background_class=True,
random_example_sampling=False,
weight_regression_loss_by_score=False,
use_expected_classification_loss_under_sampling=False,
expected_loss_weights=model_pb2.DetectionModel().ssd.loss.NONE,
min_num_negative_samples=1,
desired_negative_sampling_ratio=3,
use_keras=False,
......@@ -130,12 +131,10 @@ class SSDMetaArchTestBase(test_case.TestCase):
mock_anchor_generator = MockAnchorGenerator2x2()
if use_keras:
mock_box_predictor = test_utils.MockKerasBoxPredictor(
is_training, num_classes, add_background_class=add_background_class,
predict_mask=predict_mask)
is_training, num_classes, add_background_class=add_background_class)
else:
mock_box_predictor = test_utils.MockBoxPredictor(
is_training, num_classes, add_background_class=add_background_class,
predict_mask=predict_mask)
is_training, num_classes, add_background_class=add_background_class)
mock_box_coder = test_utils.MockBoxCoder()
if use_keras:
fake_feature_extractor = FakeSSDKerasFeatureExtractor()
......@@ -177,17 +176,22 @@ class SSDMetaArchTestBase(test_case.TestCase):
region_similarity_calculator,
mock_matcher,
mock_box_coder,
negative_class_weight=negative_class_weight,
weight_regression_loss_by_score=weight_regression_loss_by_score)
negative_class_weight=negative_class_weight)
expected_classification_loss_under_sampling = None
if use_expected_classification_loss_under_sampling:
expected_classification_loss_under_sampling = functools.partial(
ops.expected_classification_loss_under_sampling,
min_num_negative_samples=min_num_negative_samples,
desired_negative_sampling_ratio=desired_negative_sampling_ratio)
model_config = model_pb2.DetectionModel()
if expected_loss_weights == model_config.ssd.loss.NONE:
expected_loss_weights_fn = None
else:
raise ValueError('Not a valid value for expected_loss_weights.')
code_size = 4
kwargs = {}
if predict_mask:
kwargs.update({
'mask_prediction_fn': test_utils.MockMaskHead(num_classes=1).predict,
})
model = model_fn(
is_training=is_training,
anchor_generator=mock_anchor_generator,
......@@ -211,8 +215,8 @@ class SSDMetaArchTestBase(test_case.TestCase):
inplace_batchnorm_update=False,
add_background_class=add_background_class,
random_example_sampler=random_example_sampler,
expected_classification_loss_under_sampling=
expected_classification_loss_under_sampling)
expected_loss_weights_fn=expected_loss_weights_fn,
**kwargs)
return model, num_classes, mock_anchor_generator.num_anchors(), code_size
def _get_value_for_matching_key(self, dictionary, suffix):
......
......@@ -54,49 +54,59 @@ MODEL_BUILD_UTIL_MAP = {
}
def _prepare_groundtruth_for_eval(detection_model, class_agnostic):
def _prepare_groundtruth_for_eval(detection_model, class_agnostic,
max_number_of_boxes):
"""Extracts groundtruth data from detection_model and prepares it for eval.
Args:
detection_model: A `DetectionModel` object.
class_agnostic: Whether the detections are class_agnostic.
max_number_of_boxes: Max number of groundtruth boxes.
Returns:
A tuple of:
groundtruth: Dictionary with the following fields:
'groundtruth_boxes': [num_boxes, 4] float32 tensor of boxes, in
normalized coordinates.
'groundtruth_classes': [num_boxes] int64 tensor of 1-indexed classes.
'groundtruth_masks': 3D float32 tensor of instance masks (if provided in
'groundtruth_boxes': [batch_size, num_boxes, 4] float32 tensor of boxes,
in normalized coordinates.
'groundtruth_classes': [batch_size, num_boxes] int64 tensor of 1-indexed
classes.
'groundtruth_masks': 4D float32 tensor of instance masks (if provided in
groundtruth)
'groundtruth_is_crowd': [num_boxes] bool tensor indicating is_crowd
annotations (if provided in groundtruth).
'groundtruth_is_crowd': [batch_size, num_boxes] bool tensor indicating
is_crowd annotations (if provided in groundtruth).
'num_groundtruth_boxes': [batch_size] tensor containing the maximum number
of groundtruth boxes per image.
class_agnostic: Boolean indicating whether detections are class agnostic.
"""
input_data_fields = fields.InputDataFields()
groundtruth_boxes = detection_model.groundtruth_lists(
fields.BoxListFields.boxes)[0]
groundtruth_boxes = tf.stack(
detection_model.groundtruth_lists(fields.BoxListFields.boxes))
groundtruth_boxes_shape = tf.shape(groundtruth_boxes)
# For class-agnostic models, groundtruth one-hot encodings collapse to all
# ones.
if class_agnostic:
groundtruth_boxes_shape = tf.shape(groundtruth_boxes)
groundtruth_classes_one_hot = tf.ones([groundtruth_boxes_shape[0], 1])
groundtruth_classes_one_hot = tf.ones(
[groundtruth_boxes_shape[0], groundtruth_boxes_shape[1], 1])
else:
groundtruth_classes_one_hot = detection_model.groundtruth_lists(
fields.BoxListFields.classes)[0]
groundtruth_classes_one_hot = tf.stack(
detection_model.groundtruth_lists(fields.BoxListFields.classes))
label_id_offset = 1 # Applying label id offset (b/63711816)
groundtruth_classes = (
tf.argmax(groundtruth_classes_one_hot, axis=1) + label_id_offset)
tf.argmax(groundtruth_classes_one_hot, axis=2) + label_id_offset)
groundtruth = {
input_data_fields.groundtruth_boxes: groundtruth_boxes,
input_data_fields.groundtruth_classes: groundtruth_classes
}
if detection_model.groundtruth_has_field(fields.BoxListFields.masks):
groundtruth[input_data_fields.groundtruth_instance_masks] = (
detection_model.groundtruth_lists(fields.BoxListFields.masks)[0])
groundtruth[input_data_fields.groundtruth_instance_masks] = tf.stack(
detection_model.groundtruth_lists(fields.BoxListFields.masks))
if detection_model.groundtruth_has_field(fields.BoxListFields.is_crowd):
groundtruth[input_data_fields.groundtruth_is_crowd] = (
detection_model.groundtruth_lists(fields.BoxListFields.is_crowd)[0])
groundtruth[input_data_fields.groundtruth_is_crowd] = tf.stack(
detection_model.groundtruth_lists(fields.BoxListFields.is_crowd))
groundtruth[input_data_fields.num_groundtruth_boxes] = (
tf.tile([max_number_of_boxes], multiples=[groundtruth_boxes_shape[0]]))
return groundtruth
......@@ -226,7 +236,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
boxes_shape = (
labels[fields.InputDataFields.groundtruth_boxes].get_shape()
.as_list())
unpad_groundtruth_tensors = True if boxes_shape[1] is not None else False
unpad_groundtruth_tensors = boxes_shape[1] is not None and not use_tpu
labels = unstack_batch(
labels, unpad_groundtruth_tensors=unpad_groundtruth_tensors)
......@@ -243,12 +253,17 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
gt_weights_list = None
if fields.InputDataFields.groundtruth_weights in labels:
gt_weights_list = labels[fields.InputDataFields.groundtruth_weights]
gt_confidences_list = None
if fields.InputDataFields.groundtruth_confidences in labels:
gt_confidences_list = labels[
fields.InputDataFields.groundtruth_confidences]
gt_is_crowd_list = None
if fields.InputDataFields.groundtruth_is_crowd in labels:
gt_is_crowd_list = labels[fields.InputDataFields.groundtruth_is_crowd]
detection_model.provide_groundtruth(
groundtruth_boxes_list=gt_boxes_list,
groundtruth_classes_list=gt_classes_list,
groundtruth_confidences_list=gt_confidences_list,
groundtruth_masks_list=gt_masks_list,
groundtruth_keypoints_list=gt_keypoints_list,
groundtruth_weights_list=gt_weights_list,
......@@ -378,24 +393,30 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
if mode == tf.estimator.ModeKeys.EVAL:
class_agnostic = (
fields.DetectionResultFields.detection_classes not in detections)
groundtruth = _prepare_groundtruth_for_eval(detection_model,
class_agnostic)
groundtruth = _prepare_groundtruth_for_eval(
detection_model, class_agnostic,
eval_input_config.max_number_of_boxes)
use_original_images = fields.InputDataFields.original_image in features
if use_original_images:
eval_images = tf.cast(tf.image.resize_bilinear(
features[fields.InputDataFields.original_image][0:1],
features[fields.InputDataFields.original_image_spatial_shape][0]),
tf.uint8)
eval_images = features[fields.InputDataFields.original_image]
true_image_shapes = tf.slice(
features[fields.InputDataFields.true_image_shape], [0, 0], [-1, 3])
original_image_spatial_shapes = features[fields.InputDataFields
.original_image_spatial_shape]
else:
eval_images = features[fields.InputDataFields.image]
true_image_shapes = None
original_image_spatial_shapes = None
eval_dict = eval_util.result_dict_for_single_example(
eval_images[0:1],
features[inputs.HASH_KEY][0],
eval_dict = eval_util.result_dict_for_batched_example(
eval_images,
features[inputs.HASH_KEY],
detections,
groundtruth,
class_agnostic=class_agnostic,
scale_to_absolute=True)
scale_to_absolute=True,
original_image_spatial_shapes=original_image_spatial_shapes,
true_image_shapes=true_image_shapes)
if class_agnostic:
category_index = label_map_util.create_class_agnostic_category_index()
......@@ -445,6 +466,15 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
eval_metrics=eval_metric_ops,
export_outputs=export_outputs)
else:
if scaffold is None:
keep_checkpoint_every_n_hours = (
train_config.keep_checkpoint_every_n_hours)
saver = tf.train.Saver(
sharded=True,
keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours,
save_relative_paths=True)
tf.add_to_collection(tf.GraphKeys.SAVERS, saver)
scaffold = tf.train.Scaffold(saver=saver)
return tf.estimator.EstimatorSpec(
mode=mode,
predictions=detections,
......
......@@ -301,8 +301,6 @@ class WeightSharedConvolutionalBoxPredictor(box_predictor.BoxPredictor):
num_predictions_per_location):
if head_name == CLASS_PREDICTIONS_WITH_BACKGROUND:
tower_name_scope = 'ClassPredictionTower'
elif head_name == MASK_PREDICTIONS:
tower_name_scope = 'MaskPredictionTower'
else:
raise ValueError('Unknown head')
if self._share_prediction_tower:
......
......@@ -119,6 +119,10 @@ class ConvolutionalBoxPredictor(box_predictor.KerasBoxPredictor):
if other_heads:
self._prediction_heads.update(other_heads)
# We generate a consistent ordering for the prediction head names,
# so that all workers build the model in the exact same order.
self._sorted_head_names = sorted(self._prediction_heads.keys())
self._conv_hyperparams = conv_hyperparams
self._min_depth = min_depth
self._max_depth = max_depth
......@@ -187,7 +191,7 @@ class ConvolutionalBoxPredictor(box_predictor.KerasBoxPredictor):
for layer in self._shared_nets[index]:
net = layer(net)
for head_name in self._prediction_heads:
for head_name in self._sorted_head_names:
head_obj = self._prediction_heads[head_name][index]
prediction = head_obj(net)
predictions[head_name].append(prediction)
......
......@@ -188,6 +188,8 @@ class ConvolutionalKerasBoxPredictorTest(test_case.TestCase):
'BoxPredictor/ConvolutionalClassHead_0/ClassPredictor/bias',
'BoxPredictor/ConvolutionalClassHead_0/ClassPredictor/kernel'])
self.assertEqual(expected_variable_set, actual_variable_set)
self.assertEqual(conv_box_predictor._sorted_head_names,
['box_encodings', 'class_predictions_with_background'])
# TODO(kaftan): Remove conditional after CMLE moves to TF 1.10
......
......@@ -15,18 +15,6 @@ message BoxPredictor {
}
}
// Configuration proto for MaskHead in predictors.
// Next id: 4
message MaskHead {
// The height and the width of the predicted mask. Only used when
// predict_instance_masks is true.
optional int32 mask_height = 1 [default = 15];
optional int32 mask_width = 2 [default = 15];
// Whether to predict class agnostic masks. Only used when
// predict_instance_masks is true.
optional bool masks_are_class_agnostic = 3 [default = true];
}
// Configuration proto for Convolutional box predictor.
// Next id: 13
......@@ -69,9 +57,6 @@ message ConvolutionalBoxPredictor {
// Whether to use depthwise separable convolution for box predictor layers.
optional bool use_depthwise = 11 [default = false];
// Configs for a mask prediction head.
optional MaskHead mask_head = 12;
}
// Configuration proto for weight shared convolutional box predictor.
......@@ -113,9 +98,6 @@ message WeightSharedConvolutionalBoxPredictor {
// Whether to use depthwise separable convolution for box predictor layers.
optional bool use_depthwise = 14 [default = false];
// Configs for a mask prediction head.
optional MaskHead mask_head = 15;
// Enum to specify how to convert the detection scores at inference time.
enum ScoreConverter {
// Input scores equals output scores.
......
......@@ -35,9 +35,16 @@ message Hyperparams {
}
optional Activation activation = 4 [default = RELU];
// BatchNorm hyperparameters. If this parameter is NOT set then BatchNorm is
// not applied!
optional BatchNorm batch_norm = 5;
oneof normalizer_oneof {
// Note that if nothing below is selected, then no normalization is applied
// BatchNorm hyperparameters.
BatchNorm batch_norm = 5;
// GroupNorm hyperparameters. This is only supported on a subset of models.
// Note that the current implementation of group norm instantiated in
// tf.contrib.layers.group_norm() only supports fixed_size_resizer
// for image preprocessing.
GroupNorm group_norm = 7;
}
// Whether depthwise convolutions should be regularized. If this parameter is
// NOT set then the conv hyperparams will default to the parent scope.
......@@ -113,3 +120,8 @@ message BatchNorm {
// forward pass but they are never updated.
optional bool train = 5 [default = true];
}
// Configuration proto for group normalization to apply after convolution op.
// https://arxiv.org/abs/1803.08494
message GroupNorm {
}