Commit 80444539 authored by Zhuoran Liu's avatar Zhuoran Liu Committed by pkulzc

Add TPU SavedModel exporter and refactor OD code (#6737)

247226201  by ronnyvotel:

    Updating the visualization tools to accept unique_ids for color coding.

--
247067830  by Zhichao Lu:

    Add box_encodings_clip_range options for the convolutional box predictor (for TPU compatibility).
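
    Clipping box encodings to a fixed range is an elementwise clamp. Below is a
    minimal NumPy sketch of the idea; the function name and the default clip
    range are illustrative assumptions, not the actual option values used by
    the convolutional box predictor.

    ```python
    import numpy as np

    def clip_box_encodings(encodings, clip_min=-20.0, clip_max=20.0):
        """Clamp every box-encoding value into [clip_min, clip_max].

        Bounding the encodings keeps downstream ops within a fixed numeric
        range, which helps hardware like TPUs. The default range here is
        made up for illustration.
        """
        return np.clip(np.asarray(encodings, dtype=np.float32), clip_min, clip_max)
    ```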

--
246888475  by Zhichao Lu:

    Remove unused _update_eval_steps function.

--
246163259  by lzc:

    Add a gather op that can handle ignore indices (which are "-1"s in this case).
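
    A sketch of the idea in NumPy (the real op is a TensorFlow implementation;
    the name `gather_with_ignored_indices` and the fill behavior shown here are
    assumptions for illustration):

    ```python
    import numpy as np

    def gather_with_ignored_indices(params, indices, fill_value=0):
        """Gather params[i] for each index i, treating negative indices
        ("-1"s) as "ignore": those positions get fill_value instead of
        raising an out-of-range error."""
        params = np.asarray(params)
        indices = np.asarray(indices)
        # Substitute any valid index for the ignored slots, gather, then
        # overwrite the ignored positions with the fill value.
        safe = np.where(indices < 0, 0, indices)
        gathered = params[safe]
        return np.where(indices < 0, fill_value, gathered)
    ```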

--
246084944  by Zhichao Lu:

    Keras based implementation for SSD + MobilenetV2 + FPN.

--
245544227  by rathodv:

    Add batch_get_targets method to target assigner module to gather any groundtruth tensors based on the results of target assigner.
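
    The semantics (inferred from this commit's unit tests) can be mirrored in
    plain NumPy: for each image and anchor, a non-negative match index gathers
    the corresponding groundtruth row, while any negative index (unmatched or
    ignored) yields the supplied unmatched value and weight. This is a sketch,
    not the actual TF implementation:

    ```python
    import numpy as np

    def batch_get_targets(batch_match, gt_list, gt_weights_list,
                          unmatched_value, unmatched_weight):
        """Gather groundtruth per anchor based on match results.

        batch_match[b][a] >= 0 selects gt_list[b][match]; negative entries
        (-1 unmatched, -2 ignored) fall back to unmatched_value/weight.
        """
        targets, weights = [], []
        for match, gt, w in zip(batch_match, gt_list, gt_weights_list):
            targets.append([gt[m] if m >= 0 else unmatched_value for m in match])
            weights.append([w[m] if m >= 0 else unmatched_weight for m in match])
        return np.array(targets), np.array(weights)
    ```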

--
245540854  by rathodv:

    Update target assigner to return match tensor instead of a match object.

--
245434441  by Zhichao Lu:

    Add README for tpu_exporters package.

--
245381834  by lzc:

    Internal change.

--
245298983  by Zhichao Lu:

    Add conditional_shape_resizer to config_util

--
245134666  by Zhichao Lu:

    Adds ConditionalShapeResizer to the ImageResizer proto which enables resizing only if input image height or width is greater or smaller than a certain size. Also enables specification of resize method in resize_to_{max, min}_dimension methods.

--
245093975  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (faster-rcnn)

--
245072421  by Zhichao Lu:

    Adds a new image resizing method "resize_to_max_dimension" which resizes images only if a dimension is greater than the maximum desired value while maintaining aspect ratio.
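
    The shape arithmetic behind this conditional resize can be sketched as
    follows (helper name is hypothetical; the actual resizer operates on image
    tensors, not just shapes):

    ```python
    def resize_to_max_dimension_shape(height, width, max_dimension):
        """Compute the output size for the conditional resize rule: shrink
        (preserving aspect ratio) only when the larger side exceeds
        max_dimension; otherwise leave the size untouched."""
        largest_side = max(height, width)
        if largest_side <= max_dimension:
            return height, width
        scale = max_dimension / float(largest_side)
        return int(round(height * scale)), int(round(width * scale))
    ```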

--
244946998  by lzc:

    Internal Changes.

--
244943693  by Zhichao Lu:

    Add a custom config to mobilenet v2 that makes it more detection friendly.

--
244754158  by derekjchow:

    Internal change.

--
244699875  by Zhichao Lu:

    Add check_range=False to box_list_ops.to_normalized_coordinates when training
    for instance segmentation.  This is consistent with other calls when training
    for object detection.  There could be wrongly annotated boxes in the dataset.

--
244507425  by rathodv:

    Support bfloat16 for ssd models.

--
244399982  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd)

--
244209387  by Zhichao Lu:

    Internal change.

--
243922296  by rathodv:

    Change `raw_detection_scores` to contain softmax/sigmoid scores (not logits) for `raw_detection_boxes`.

--
243883978  by Zhichao Lu:

    Add a sample fully conv config.

--
243369455  by Zhichao Lu:

    Fix regularization loss gap in Keras and Slim.

--
243292002  by lzc:

    Internal changes.

--
243097958  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd model)

--
243007177  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd model)

--
242776550  by Zhichao Lu:

    Make object detection pre-processing run on GPU.  tf.map_fn() uses
    TensorArrayV3 ops, which have no int32 GPU implementation.  Cast to int64,
    then cast back to int32.

--
242723128  by Zhichao Lu:

    Using sorted dictionaries for additional heads in non_max_suppression to ensure tensor order

--
242495311  by Zhichao Lu:

    Update documentation to reflect new TFLite examples repo location

--
242230527  by Zhichao Lu:

    Fix Dropout bugs for WeightSharedConvolutionalBoxPred.

--
242226573  by Zhichao Lu:

    Create Keras-based WeightSharedConvolutionalBoxPredictor.

--
241806074  by Zhichao Lu:

    Add inference in unit tests of TFX OD template.

--
241641498  by lzc:

    Internal change.

--
241637481  by Zhichao Lu:

    matmul_crop_and_resize(): Switch to dynamic shaping, so that not all dimensions are required to be known.

--
241429980  by Zhichao Lu:

    Internal change

--
241167237  by Zhichao Lu:

    Adds a faster_rcnn_inception_resnet_v2 Keras feature extractor, and updates the model builder to construct it.

--
241088616  by Zhichao Lu:

    Make it compatible with different dtype, e.g. float32, bfloat16, etc.

--
240897364  by lzc:

    Use image_np_expanded in object_detection_tutorial notebook.

--
240890393  by Zhichao Lu:

    Disable multicore inference for OD template as it's not yet compatible.

--
240352168  by Zhichao Lu:

    Make SSDResnetV1FpnFeatureExtractor not protected to allow inheritance.

--
240351470  by lzc:

    Internal change.

--
239878928  by Zhichao Lu:

    Defines Keras box predictors for Faster RCNN and RFCN

--
239872103  by Zhichao Lu:

    Delete duplicated inputs in test.

--
239714273  by Zhichao Lu:

    Adding scope variable to all class heads

--
239698643  by Zhichao Lu:

    Create FPN feature extractor for object detection.

--
239696657  by Zhichao Lu:

    Internal Change.

--
239299404  by Zhichao Lu:

    Allows the faster rcnn meta-architecture to support Keras subcomponents

--
238502595  by Zhichao Lu:

    Lay the groundwork for symmetric quantization.

--
238496885  by Zhichao Lu:

    Add flexible_grid_anchor_generator

--
238138727  by lzc:

    Remove dead code.

    _USE_C_SHAPES has been forced True in TensorFlow releases since
    TensorFlow 1.9
    (https://github.com/tensorflow/tensorflow/commit/1d74a69443f741e69f9f52cb6bc2940b4d4ae3b7)

--
238123936  by rathodv:

    Add num_matched_groundtruth summary to target assigner in SSD.

--
238103345  by ronnyvotel:

    Raising error if input file pattern does not match any files.
    Also printing the number of evaluation images for coco metrics.

--
238044081  by Zhichao Lu:

    Fix docstring to state the correct dimensionality of `class_predictions_with_background`.

--
237920279  by Zhichao Lu:

    [XLA] Rework debug flags for dumping HLO.

    The following flags (usually passed via the XLA_FLAGS envvar) are removed:

      xla_dump_computations_to
      xla_dump_executions_to
      xla_dump_ir_to
      xla_dump_optimized_hlo_proto_to
      xla_dump_per_pass_hlo_proto_to
      xla_dump_unoptimized_hlo_proto_to
      xla_generate_hlo_graph
      xla_generate_hlo_text_to
      xla_hlo_dump_as_html
      xla_hlo_graph_path
      xla_log_hlo_text

    The following new flags are added:

      xla_dump_to
      xla_dump_hlo_module_re
      xla_dump_hlo_pass_re
      xla_dump_hlo_as_text
      xla_dump_hlo_as_proto
      xla_dump_hlo_as_dot
      xla_dump_hlo_as_url
      xla_dump_hlo_as_html
      xla_dump_ir
      xla_dump_hlo_snapshots

    The default is not to dump anything at all, but as soon as some dumping flag is
    specified, we enable the following defaults (most of which can be overridden).

     * dump to stdout (overridden by --xla_dump_to)
     * dump HLO modules at the very beginning and end of the optimization pipeline
     * don't dump between any HLO passes (overridden by --xla_dump_hlo_pass_re)
     * dump all HLO modules (overridden by --xla_dump_hlo_module_re)
     * dump in textual format (overridden by
       --xla_dump_hlo_as_{text,proto,dot,url,html}).

    For example, to dump optimized and unoptimized HLO text and protos to /tmp/foo,
    pass

      --xla_dump_to=/tmp/foo --xla_dump_hlo_as_text --xla_dump_hlo_as_proto

    For details on these flags' meanings, see xla.proto.

    The intent of this change is to make dumping both simpler to use and more
    powerful.

    For example:

     * Previously there was no way to dump the HLO module during the pass pipeline
       in HLO text format; the only option was --dump_per_pass_hlo_proto_to, which
       dumped in proto format.

       Now this is --xla_dump_hlo_pass_re=.* --xla_dump_hlo_as_text.  (In fact, the
       second flag is not necessary in this case, as dumping as text is the
       default.)

     * Previously there was no way to dump HLO as a graph before and after
       compilation; the only option was --xla_generate_hlo_graph, which would dump
       before/after every pass.

       Now this is --xla_dump_hlo_as_{dot,url,html} (depending on what format you
       want the graph in).

     * Previously, there was no coordination between the filenames written by the
       various flags, so info about one module might be dumped with various
       filename prefixes.  Now the filenames are consistent and all dumps from a
       particular module are next to each other.

    If you only specify some of these flags, we try to figure out what you wanted.
    For example:

     * --xla_dump_to implies --xla_dump_hlo_as_text unless you specify some
       other --xla_dump_as_* flag.

     * --xla_dump_hlo_as_text or --xla_dump_ir implies dumping to stdout unless you
       specify a different --xla_dump_to directory.  You can explicitly dump to
       stdout with --xla_dump_to=-.

    As part of this change, I simplified the debugging code in the HLO passes for
    dumping HLO modules.  Previously, many tests explicitly VLOG'ed the HLO module
    before, after, and sometimes during the pass.  I removed these VLOGs.  If you
    want dumps before/during/after an HLO pass, use --xla_dump_hlo_pass_re=<pass_name>.

--
237510043  by lzc:

    Internal Change.

--
237469515  by Zhichao Lu:

    Parameterize model_builder.build in inputs.py.

--
237293511  by rathodv:

    Remove multiclass_scores from tensor_dict in transform_data_fn always.

--
237260333  by ronnyvotel:

    Updating faster_rcnn_meta_arch to define prediction dictionary fields that are batched.

--

PiperOrigin-RevId: 247226201
parent c4f34e58
@@ -713,9 +713,6 @@ class BatchTargetAssignerTest(test_case.TestCase):
     groundtruth_boxlist2 = np.array([[0, 0.25123152, 1, 1],
                                      [0.015789, 0.0985, 0.55789, 0.3842]],
                                     dtype=np.float32)
-    class_targets1 = np.array([[0, 1, 0, 0]], dtype=np.float32)
-    class_targets2 = np.array([[0, 0, 0, 1],
-                               [0, 0, 1, 0]], dtype=np.float32)
     class_targets1 = np.array([[[0, 1, 1],
                                 [1, 1, 0]]], dtype=np.float32)
     class_targets2 = np.array([[[0, 1, 1],
@@ -821,6 +818,63 @@ class BatchTargetAssignerTest(test_case.TestCase):
     self.assertAllClose(reg_weights_out, exp_reg_weights)
 
 
+class BatchGetTargetsTest(test_case.TestCase):
+
+  def test_scalar_targets(self):
+    batch_match = np.array([[1, 0, 1],
+                            [-2, -1, 1]], dtype=np.int32)
+    groundtruth_tensors_list = np.array([[11, 12], [13, 14]], dtype=np.int32)
+    groundtruth_weights_list = np.array([[1.0, 1.0], [1.0, 0.5]],
+                                        dtype=np.float32)
+    unmatched_value = np.array(99, dtype=np.int32)
+    unmatched_weight = np.array(0.0, dtype=np.float32)
+
+    def graph_fn(batch_match, groundtruth_tensors_list,
+                 groundtruth_weights_list, unmatched_value, unmatched_weight):
+      targets, weights = targetassigner.batch_get_targets(
+          batch_match, tf.unstack(groundtruth_tensors_list),
+          tf.unstack(groundtruth_weights_list),
+          unmatched_value, unmatched_weight)
+      return (targets, weights)
+
+    (targets_np, weights_np) = self.execute(graph_fn, [
+        batch_match, groundtruth_tensors_list, groundtruth_weights_list,
+        unmatched_value, unmatched_weight
+    ])
+    self.assertAllEqual([[12, 11, 12],
+                         [99, 99, 14]], targets_np)
+    self.assertAllClose([[1.0, 1.0, 1.0],
+                         [0.0, 0.0, 0.5]], weights_np)
+
+  def test_1d_targets(self):
+    batch_match = np.array([[1, 0, 1],
+                            [-2, -1, 1]], dtype=np.int32)
+    groundtruth_tensors_list = np.array([[[11, 12], [12, 13]],
+                                         [[13, 14], [14, 15]]],
+                                        dtype=np.float32)
+    groundtruth_weights_list = np.array([[1.0, 1.0], [1.0, 0.5]],
+                                        dtype=np.float32)
+    unmatched_value = np.array([99, 99], dtype=np.float32)
+    unmatched_weight = np.array(0.0, dtype=np.float32)
+
+    def graph_fn(batch_match, groundtruth_tensors_list,
+                 groundtruth_weights_list, unmatched_value, unmatched_weight):
+      targets, weights = targetassigner.batch_get_targets(
+          batch_match, tf.unstack(groundtruth_tensors_list),
+          tf.unstack(groundtruth_weights_list),
+          unmatched_value, unmatched_weight)
+      return (targets, weights)
+
+    (targets_np, weights_np) = self.execute(graph_fn, [
+        batch_match, groundtruth_tensors_list, groundtruth_weights_list,
+        unmatched_value, unmatched_weight
+    ])
+    self.assertAllClose([[[12, 13], [11, 12], [12, 13]],
+                         [[99, 99], [99, 99], [14, 15]]], targets_np)
+    self.assertAllClose([[1.0, 1.0, 1.0],
+                         [0.0, 0.0, 0.5]], weights_np)
+
+
 class BatchTargetAssignConfidencesTest(test_case.TestCase):
 
   def _get_target_assigner(self):
...
@@ -18,7 +18,6 @@ import os
 import numpy as np
 import tensorflow as tf
 
-from tensorflow.python.framework import test_util
 from object_detection.core import standard_fields as fields
 from object_detection.data_decoders import tf_example_decoder
 from object_detection.protos import input_reader_pb2
@@ -257,7 +256,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
     self.assertAllEqual(expected_boxes,
                         tensor_dict[fields.InputDataFields.groundtruth_boxes])
 
-  @test_util.enable_c_shapes
   def testDecodeKeypoint(self):
     image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
     encoded_jpeg = self._EncodeImage(image_tensor)
@@ -346,7 +344,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
     self.assertAllClose(tensor_dict[fields.InputDataFields.groundtruth_weights],
                         np.ones(2, dtype=np.float32))
 
-  @test_util.enable_c_shapes
   def testDecodeObjectLabel(self):
     image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
     encoded_jpeg = self._EncodeImage(image_tensor)
@@ -669,7 +666,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
     self.assertAllEqual([3, 1],
                         tensor_dict[fields.InputDataFields.groundtruth_classes])
 
-  @test_util.enable_c_shapes
   def testDecodeObjectArea(self):
     image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
     encoded_jpeg = self._EncodeImage(image_tensor)
@@ -696,7 +692,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
     self.assertAllEqual(object_area,
                         tensor_dict[fields.InputDataFields.groundtruth_area])
 
-  @test_util.enable_c_shapes
   def testDecodeObjectIsCrowd(self):
     image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
     encoded_jpeg = self._EncodeImage(image_tensor)
@@ -725,7 +720,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
         [bool(item) for item in object_is_crowd],
         tensor_dict[fields.InputDataFields.groundtruth_is_crowd])
 
-  @test_util.enable_c_shapes
   def testDecodeObjectDifficult(self):
     image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
     encoded_jpeg = self._EncodeImage(image_tensor)
@@ -754,7 +748,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
         [bool(item) for item in object_difficult],
         tensor_dict[fields.InputDataFields.groundtruth_difficult])
 
-  @test_util.enable_c_shapes
   def testDecodeObjectGroupOf(self):
     image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
     encoded_jpeg = self._EncodeImage(image_tensor)
@@ -809,7 +802,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
     self.assertAllEqual(object_weights,
                         tensor_dict[fields.InputDataFields.groundtruth_weights])
 
-  @test_util.enable_c_shapes
   def testDecodeInstanceSegmentation(self):
     num_instances = 4
     image_height = 5
...
@@ -14,7 +14,7 @@ A couple words of warning:
    the container. When running through the tutorial,
    **do not close the container**.
 2. To be able to deploy the [Android app](
-   https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/examples/android/app)
+   https://github.com/tensorflow/examples/tree/master/lite/examples/object_detection/android)
    (which you will build at the end of the tutorial),
    you will need to kill any instances of `adb` running on the host machine. You
    can accomplish this by closing all instances of Android Studio, and then
...
@@ -94,43 +94,41 @@ bazel run --config=opt tensorflow/lite/toco:toco -- \
 # Running our model on Android
 
-To run our TensorFlow Lite model on device, we will need to install the Android
-NDK and SDK. The current recommended Android NDK version is 14b and can be found
-on the [NDK
-Archives](https://developer.android.com/ndk/downloads/older_releases.html#ndk-14b-downloads)
-page. Android SDK and build tools can be [downloaded
-separately](https://developer.android.com/tools/revisions/build-tools.html) or
-used as part of [Android
-Studio](https://developer.android.com/studio/index.html). To build the
-TensorFlow Lite Android demo, build tools require API >= 23 (but it will run on
-devices with API >= 21). Additional details are available on the [TensorFlow
-Lite Android App
-page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/java/demo/README.md).
+To run our TensorFlow Lite model on device, we will use Android Studio to build
+and run the TensorFlow Lite detection example with the new model. The example is
+found in the
+[TensorFlow examples repository](https://github.com/tensorflow/examples) under
+`/lite/examples/object_detection`. The example can be built with
+[Android Studio](https://developer.android.com/studio/index.html), and requires
+the
+[Android SDK with build tools](https://developer.android.com/tools/revisions/build-tools.html)
+that support API >= 21. Additional details are available on the
+[TensorFlow Lite example page](https://github.com/tensorflow/examples/tree/master/lite/examples/object_detection/android).
 
 Next we need to point the app to our new detect.tflite file and give it the
 names of our new labels. Specifically, we will copy our TensorFlow Lite
 flatbuffer to the app assets directory with the following command:
 
 ```shell
+mkdir $TF_EXAMPLES/lite/examples/object_detection/android/app/src/main/assets
 cp /tmp/tflite/detect.tflite \
-//tensorflow/lite/examples/android/app/src/main/assets
+$TF_EXAMPLES/lite/examples/object_detection/android/app/src/main/assets
 ```
 
-You will also need to copy your new labelmap labels_list.txt to the assets
+You will also need to copy your new labelmap labelmap.txt to the assets
 directory.
 
-We will now edit the BUILD file to point to this new model. First, open the
-BUILD file tensorflow/lite/examples/android/BUILD. Then find the assets
-section, and replace the line “@tflite_mobilenet_ssd_quant//:detect.tflite”
-(which by default points to a COCO pretrained model) with the path to your new
-TFLite model
-“//tensorflow/lite/examples/android/app/src/main/assets:detect.tflite”.
-
-Finally, change the last line in assets section to use the new label map as
-well.
-
-We will also need to tell our app to use the new label map. In order to do this,
-open up the
-tensorflow/lite/examples/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java
+We will now edit the gradle build file to use these assets. First, open the
+`build.gradle` file
+`$TF_EXAMPLES/lite/examples/object_detection/android/app/build.gradle`. Comment
+out the model download script to avoid your assets being overwritten:
+`// apply from:'download_model.gradle'`
+
+If your model is named `detect.tflite`, and your labels file `labelmap.txt`, the
+example will use them automatically as long as they've been properly copied into
+the base assets directory. If you need to use a custom path or filename, open up
+the
+$TF_EXAMPLES/lite/examples/object_detection/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java
 file in a text editor and find the definition of TF_OD_API_LABELS_FILE. Update
 this path to point to your new label map file:
 "file:///android_asset/labels_list.txt". Note that if your model is quantized,
@@ -144,20 +142,6 @@ DetectorActivity.java should now look as follows for a quantized model:
 private static final String TF_OD_API_LABELS_FILE = "file:///android_asset/labels_list.txt";
 ```
 
-Once you’ve copied the TensorFlow Lite file and edited your BUILD and
-DetectorActivity.java files, you can build the demo app, run this bazel command
-from the tensorflow directory:
-
-```shell
-bazel build -c opt --config=android_arm{,64} --cxxopt='--std=c++11'
-"//tensorflow/lite/examples/android:tflite_demo"
-```
-
-Now install the demo on a
-[debug-enabled](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#install)
-Android phone via [Android Debug
-Bridge](https://developer.android.com/studio/command-line/adb) (adb):
-
-```shell
-adb install bazel-bin/tensorflow/lite/examples/android/tflite_demo.apk
-```
+Once you’ve copied the TensorFlow Lite model and edited the gradle build script
+to not use the downloaded assets, you can build and deploy the app using the
+usual Android Studio build process.
# Object Detection TPU Inference Exporter

This package contains SavedModel Exporter for TPU Inference of object detection
models.

## Usage

This Exporter is intended for users who have trained models with CPUs / GPUs,
but would like to use them for inference on TPU without changing their code or
re-training their models.

Users are assumed to have:

+ `PIPELINE_CONFIG`: A pipeline_pb2.TrainEvalPipelineConfig config file;
+ `CHECKPOINT`: A model checkpoint trained on any device;

and need to correctly set:

+ `EXPORT_DIR`: Path to export SavedModel;
+ `INPUT_PLACEHOLDER`: Name of input placeholder in model's signature_def_map;
+ `INPUT_TYPE`: Type of input node, which can be one of 'image_tensor',
  'encoded_image_string_tensor', or 'tf_example';
+ `USE_BFLOAT16`: Whether to use bfloat16 instead of float32 on TPU.

The model can be exported with:

```
python object_detection/tpu_exporters/export_saved_model_tpu.py \
    --pipeline_config_file=<PIPELINE_CONFIG> \
    --ckpt_path=<CHECKPOINT> \
    --export_dir=<EXPORT_DIR> \
    --input_placeholder_name=<INPUT_PLACEHOLDER> \
    --input_type=<INPUT_TYPE> \
    --use_bfloat16=<USE_BFLOAT16>
```
@@ -43,6 +43,7 @@ SERVING_FED_EXAMPLE_KEY = 'serialized_example'
 # A map of names to methods that help build the input pipeline.
 INPUT_BUILDER_UTIL_MAP = {
     'dataset_build': dataset_builder.build,
+    'model_build': model_builder.build,
 }
@@ -152,7 +153,7 @@ def transform_input_data(tensor_dict,
   if use_multiclass_scores:
     tensor_dict[fields.InputDataFields.groundtruth_classes] = tensor_dict[
         fields.InputDataFields.multiclass_scores]
-    tensor_dict.pop(fields.InputDataFields.multiclass_scores, None)
+  tensor_dict.pop(fields.InputDataFields.multiclass_scores, None)
 
   if fields.InputDataFields.groundtruth_confidences in tensor_dict:
     groundtruth_confidences = tensor_dict[
@@ -498,11 +499,13 @@ def create_train_input_fn(train_config, train_input_config,
     data_augmentation_fn = functools.partial(
         augment_input_data,
         data_augmentation_options=data_augmentation_options)
-    model = model_builder.build(model_config, is_training=True)
+    model_preprocess_fn = INPUT_BUILDER_UTIL_MAP['model_build'](
+        model_config, is_training=True).preprocess
+
     image_resizer_config = config_util.get_image_resizer_config(model_config)
     image_resizer_fn = image_resizer_builder.build(image_resizer_config)
+
     transform_data_fn = functools.partial(
-        transform_input_data, model_preprocess_fn=model.preprocess,
+        transform_input_data, model_preprocess_fn=model_preprocess_fn,
         image_resizer_fn=image_resizer_fn,
         num_classes=config_util.get_number_of_classes(model_config),
         data_augmentation_fn=data_augmentation_fn,
@@ -593,12 +596,14 @@ def create_eval_input_fn(eval_config, eval_input_config, model_config):
   def transform_and_pad_input_data_fn(tensor_dict):
     """Combines transform and pad operation."""
     num_classes = config_util.get_number_of_classes(model_config)
-    model = model_builder.build(model_config, is_training=False)
+    model_preprocess_fn = INPUT_BUILDER_UTIL_MAP['model_build'](
+        model_config, is_training=False).preprocess
+
     image_resizer_config = config_util.get_image_resizer_config(model_config)
     image_resizer_fn = image_resizer_builder.build(image_resizer_config)
+
     transform_data_fn = functools.partial(
-        transform_input_data, model_preprocess_fn=model.preprocess,
+        transform_input_data, model_preprocess_fn=model_preprocess_fn,
         image_resizer_fn=image_resizer_fn,
         num_classes=num_classes,
         data_augmentation_fn=None,
@@ -643,12 +648,14 @@ def create_predict_input_fn(model_config, predict_input_config):
     example = tf.placeholder(dtype=tf.string, shape=[], name='tf_example')
     num_classes = config_util.get_number_of_classes(model_config)
-    model = model_builder.build(model_config, is_training=False)
+    model_preprocess_fn = INPUT_BUILDER_UTIL_MAP['model_build'](
+        model_config, is_training=False).preprocess
+
     image_resizer_config = config_util.get_image_resizer_config(model_config)
     image_resizer_fn = image_resizer_builder.build(image_resizer_config)
+
     transform_fn = functools.partial(
-        transform_input_data, model_preprocess_fn=model.preprocess,
+        transform_input_data, model_preprocess_fn=model_preprocess_fn,
         image_resizer_fn=image_resizer_fn,
         num_classes=num_classes,
         data_augmentation_fn=None)
...
...@@ -26,9 +26,15 @@ class FasterRCNNMetaArchTest( ...@@ -26,9 +26,15 @@ class FasterRCNNMetaArchTest(
faster_rcnn_meta_arch_test_lib.FasterRCNNMetaArchTestBase, faster_rcnn_meta_arch_test_lib.FasterRCNNMetaArchTestBase,
parameterized.TestCase): parameterized.TestCase):
def test_postprocess_second_stage_only_inference_mode_with_masks(self): @parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_postprocess_second_stage_only_inference_mode_with_masks(
self, use_keras=False):
model = self._build_model( model = self._build_model(
is_training=False, number_of_stages=2, second_stage_batch_size=6) is_training=False, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6)
batch_size = 2 batch_size = 2
total_num_padded_proposals = batch_size * model.max_num_proposals total_num_padded_proposals = batch_size * model.max_num_proposals
@@ -85,9 +91,15 @@ class FasterRCNNMetaArchTest(
    self.assertTrue(np.amax(detections_out['detection_masks'] <= 1.0))
    self.assertTrue(np.amin(detections_out['detection_masks'] >= 0.0))

-  def test_postprocess_second_stage_only_inference_mode_with_calibration(self):
+  @parameterized.parameters(
+      {'use_keras': True},
+      {'use_keras': False}
+  )
+  def test_postprocess_second_stage_only_inference_mode_with_calibration(
+      self, use_keras=False):
    model = self._build_model(
-        is_training=False, number_of_stages=2, second_stage_batch_size=6,
+        is_training=False, use_keras=use_keras,
+        number_of_stages=2, second_stage_batch_size=6,
        calibration_mapping_value=0.5)
    batch_size = 2
@@ -147,9 +159,15 @@ class FasterRCNNMetaArchTest(
    self.assertTrue(np.amax(detections_out['detection_masks'] <= 1.0))
    self.assertTrue(np.amin(detections_out['detection_masks'] >= 0.0))

-  def test_postprocess_second_stage_only_inference_mode_with_shared_boxes(self):
+  @parameterized.parameters(
+      {'use_keras': True},
+      {'use_keras': False}
+  )
+  def test_postprocess_second_stage_only_inference_mode_with_shared_boxes(
+      self, use_keras=False):
    model = self._build_model(
-        is_training=False, number_of_stages=2, second_stage_batch_size=6)
+        is_training=False, use_keras=use_keras,
+        number_of_stages=2, second_stage_batch_size=6)
    batch_size = 2
    total_num_padded_proposals = batch_size * model.max_num_proposals
@@ -188,11 +206,13 @@ class FasterRCNNMetaArchTest(
    self.assertAllClose(detections_out['num_detections'], [5, 4])

  @parameterized.parameters(
-      {'masks_are_class_agnostic': False},
-      {'masks_are_class_agnostic': True},
+      {'masks_are_class_agnostic': False, 'use_keras': True},
+      {'masks_are_class_agnostic': True, 'use_keras': True},
+      {'masks_are_class_agnostic': False, 'use_keras': False},
+      {'masks_are_class_agnostic': True, 'use_keras': False},
  )
  def test_predict_correct_shapes_in_inference_mode_three_stages_with_masks(
-      self, masks_are_class_agnostic):
+      self, masks_are_class_agnostic, use_keras):
    batch_size = 2
    image_size = 10
    max_num_proposals = 8
@@ -232,6 +252,7 @@ class FasterRCNNMetaArchTest(
    with test_graph.as_default():
      model = self._build_model(
          is_training=False,
+          use_keras=use_keras,
          number_of_stages=3,
          second_stage_batch_size=2,
          predict_masks=True,
@@ -267,15 +288,18 @@ class FasterRCNNMetaArchTest(
        [10, num_classes, 14, 14])
  @parameterized.parameters(
-      {'masks_are_class_agnostic': False},
-      {'masks_are_class_agnostic': True},
+      {'masks_are_class_agnostic': False, 'use_keras': True},
+      {'masks_are_class_agnostic': True, 'use_keras': True},
+      {'masks_are_class_agnostic': False, 'use_keras': False},
+      {'masks_are_class_agnostic': True, 'use_keras': False},
  )
  def test_predict_gives_correct_shapes_in_train_mode_both_stages_with_masks(
-      self, masks_are_class_agnostic):
+      self, masks_are_class_agnostic, use_keras):
    test_graph = tf.Graph()
    with test_graph.as_default():
      model = self._build_model(
          is_training=True,
+          use_keras=use_keras,
          number_of_stages=3,
          second_stage_batch_size=7,
          predict_masks=True,
@@ -348,7 +372,11 @@ class FasterRCNNMetaArchTest(
        tensor_dict_out['rpn_objectness_predictions_with_background'].shape,
        (2, num_anchors_out, 2))
-  def test_postprocess_third_stage_only_inference_mode(self):
+  @parameterized.parameters(
+      {'use_keras': True},
+      {'use_keras': False}
+  )
+  def test_postprocess_third_stage_only_inference_mode(self, use_keras=False):
    num_proposals_shapes = [(2), (None)]
    refined_box_encodings_shapes = [(16, 2, 4), (None, 2, 4)]
    class_predictions_with_background_shapes = [(16, 3), (None, 3)]
@@ -364,7 +392,7 @@ class FasterRCNNMetaArchTest(
    tf_graph = tf.Graph()
    with tf_graph.as_default():
      model = self._build_model(
-          is_training=False, number_of_stages=3,
+          is_training=False, use_keras=use_keras, number_of_stages=3,
          second_stage_batch_size=6, predict_masks=True)
      total_num_padded_proposals = batch_size * model.max_num_proposals
      proposal_boxes = np.array(
...
@@ -81,7 +81,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
               add_summaries=True,
               clip_anchors_to_image=False,
               use_static_shapes=False,
-               resize_masks=False):
+               resize_masks=False,
+               freeze_batchnorm=False):
    """RFCNMetaArch Constructor.

    Args:
@@ -110,9 +111,12 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
        denser resolutions. The atrous rate is used to compensate for the
        denser feature maps by using an effectively larger receptive field.
        (This should typically be set to 1).
-      first_stage_box_predictor_arg_scope_fn: A function to generate tf-slim
-        arg_scope for conv2d, separable_conv2d and fully_connected ops for the
-        RPN box predictor.
+      first_stage_box_predictor_arg_scope_fn: Either a
+        Keras layer hyperparams object or a function to construct tf-slim
+        arg_scope for conv2d, separable_conv2d and fully_connected ops. Used
+        for the RPN box predictor. If it is a keras hyperparams object the
+        RPN box predictor will be a Keras model. If it is a function to
+        construct an arg scope it will be a tf-slim box predictor.
      first_stage_box_predictor_kernel_size: Kernel size to use for the
        convolution op just prior to RPN box predictions.
      first_stage_box_predictor_depth: Output depth for the convolution op
@@ -180,6 +184,10 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
        guarantees.
      resize_masks: Indicates whether the masks present in the groundtruth
        should be resized in the model with `image_resizer_fn`.
+      freeze_batchnorm: Whether to freeze batch norm parameters during
+        training or not. When training with a small batch size (e.g. 1), it is
+        desirable to freeze batch norm update and use pretrained batch norm
+        params.

    Raises:
      ValueError: If `second_stage_batch_size` > `first_stage_max_proposals`
@@ -225,7 +233,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
        add_summaries,
        clip_anchors_to_image,
        use_static_shapes,
-        resize_masks)
+        resize_masks,
+        freeze_batchnorm=freeze_batchnorm)
    self._rfcn_box_predictor = second_stage_rfcn_box_predictor
@@ -293,9 +302,7 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
        anchors, image_shape_2d, true_image_shapes)
    box_classifier_features = (
-        self._feature_extractor.extract_box_classifier_features(
-            rpn_features,
-            scope=self.second_stage_feature_extractor_scope))
+        self._extract_box_classifier_features(rpn_features))

    if self._rfcn_box_predictor.is_keras_model:
      box_predictions = self._rfcn_box_predictor(
@@ -329,3 +336,37 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
        'proposal_boxes_normalized': proposal_boxes_normalized,
    }
    return prediction_dict
+
+  def regularization_losses(self):
+    """Returns a list of regularization losses for this model.
+
+    Returns a list of regularization losses for this model that the estimator
+    needs to use during training/optimization.
+
+    Returns:
+      A list of regularization loss tensors.
+    """
+    reg_losses = super(RFCNMetaArch, self).regularization_losses()
+    if self._rfcn_box_predictor.is_keras_model:
+      reg_losses.extend(self._rfcn_box_predictor.losses)
+    return reg_losses
+
+  def updates(self):
+    """Returns a list of update operators for this model.
+
+    Returns a list of update operators for this model that must be executed at
+    each training step. The estimator's train op needs to have a control
+    dependency on these updates.
+
+    Returns:
+      A list of update operators.
+    """
+    update_ops = super(RFCNMetaArch, self).updates()
+    if self._rfcn_box_predictor.is_keras_model:
+      update_ops.extend(
+          self._rfcn_box_predictor.get_updates_for(None))
+      update_ops.extend(
+          self._rfcn_box_predictor.get_updates_for(
+              self._rfcn_box_predictor.inputs))
+    return update_ops
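The override pattern added here is the same for both methods: call the parent, then append the Keras box predictor's own losses or updates. The shape of that pattern can be sketched without TensorFlow (the class names and loss values below are illustrative stand-ins, not from the codebase):

```python
class BaseModel:
    """Stand-in for the meta-arch base class (illustrative only)."""

    def regularization_losses(self):
        # The parent collects losses from its own components.
        return [0.1, 0.2]


class RFCNLikeModel(BaseModel):
    """Mirrors how the RFCN override extends the parent's loss list."""

    def __init__(self, predictor_losses):
        self._predictor_losses = predictor_losses

    def regularization_losses(self):
        # Start from the parent's losses, then append the box
        # predictor's own losses, as the new override does.
        reg_losses = super(RFCNLikeModel, self).regularization_losses()
        reg_losses.extend(self._predictor_losses)
        return reg_losses
```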
@@ -22,6 +22,7 @@ import tensorflow as tf

from object_detection.core import box_list
from object_detection.core import box_list_ops
+from object_detection.core import matcher
from object_detection.core import model
from object_detection.core import standard_fields as fields
from object_detection.core import target_assigner
@@ -663,8 +664,8 @@ class SSDMetaArch(model.DetectionModel):
        raw_detection_boxes: [batch, total_detections, 4] tensor with decoded
          detection boxes before Non-Max Suppression.
        raw_detection_score: [batch, total_detections,
-          num_classes_with_background] tensor of multi-class score logits for
-          raw detection boxes.
+          num_classes_with_background] tensor of multi-class scores for raw
+          detection boxes.

    Raises:
      ValueError: if prediction_dict does not contain `box_encodings` or
        `class_predictions_with_background` fields.
@@ -672,17 +673,23 @@ class SSDMetaArch(model.DetectionModel):
    if ('box_encodings' not in prediction_dict or
        'class_predictions_with_background' not in prediction_dict):
      raise ValueError('prediction_dict does not contain expected entries.')
+    if 'anchors' not in prediction_dict:
+      prediction_dict['anchors'] = self.anchors.get()
    with tf.name_scope('Postprocessor'):
      preprocessed_images = prediction_dict['preprocessed_inputs']
      box_encodings = prediction_dict['box_encodings']
      box_encodings = tf.identity(box_encodings, 'raw_box_encodings')
-      class_predictions = prediction_dict['class_predictions_with_background']
-      detection_boxes, detection_keypoints = self._batch_decode(box_encodings)
+      class_predictions_with_background = (
+          prediction_dict['class_predictions_with_background'])
+      detection_boxes, detection_keypoints = self._batch_decode(
+          box_encodings, prediction_dict['anchors'])
      detection_boxes = tf.identity(detection_boxes, 'raw_box_locations')
      detection_boxes = tf.expand_dims(detection_boxes, axis=2)
-      detection_scores = self._score_conversion_fn(class_predictions)
-      detection_scores = tf.identity(detection_scores, 'raw_box_scores')
+      detection_scores_with_background = self._score_conversion_fn(
+          class_predictions_with_background)
+      detection_scores = tf.identity(detection_scores_with_background,
+                                     'raw_box_scores')
      if self._add_background_class or self._explicit_background_class:
        detection_scores = tf.slice(detection_scores, [0, 0, 1], [-1, -1, -1])
      additional_fields = None
@@ -720,7 +727,7 @@ class SSDMetaArch(model.DetectionModel):
          fields.DetectionResultFields.raw_detection_boxes:
              tf.squeeze(detection_boxes, axis=2),
          fields.DetectionResultFields.raw_detection_scores:
-              class_predictions
+              detection_scores_with_background
      }
      if (nmsed_additional_fields is not None and
          fields.BoxListFields.keypoints in nmsed_additional_fields):
@@ -767,10 +774,11 @@ class SSDMetaArch(model.DetectionModel):
    if self.groundtruth_has_field(fields.BoxListFields.confidences):
      confidences = self.groundtruth_lists(fields.BoxListFields.confidences)
    (batch_cls_targets, batch_cls_weights, batch_reg_targets,
-     batch_reg_weights, match_list) = self._assign_targets(
+     batch_reg_weights, batch_match) = self._assign_targets(
         self.groundtruth_lists(fields.BoxListFields.boxes),
         self.groundtruth_lists(fields.BoxListFields.classes),
         keypoints, weights, confidences)
+    match_list = [matcher.Match(match) for match in tf.unstack(batch_match)]
    if self._add_summaries:
      self._summarize_target_assignment(
          self.groundtruth_lists(fields.BoxListFields.boxes), match_list)
@@ -1017,25 +1025,31 @@ class SSDMetaArch(model.DetectionModel):
        with rows of the Match objects corresponding to groundtruth boxes
        and columns corresponding to anchors.
    """
-    num_boxes_per_image = tf.stack(
-        [tf.shape(x)[0] for x in groundtruth_boxes_list])
-    pos_anchors_per_image = tf.stack(
-        [match.num_matched_columns() for match in match_list])
-    neg_anchors_per_image = tf.stack(
-        [match.num_unmatched_columns() for match in match_list])
-    ignored_anchors_per_image = tf.stack(
-        [match.num_ignored_columns() for match in match_list])
+    avg_num_gt_boxes = tf.reduce_mean(tf.to_float(tf.stack(
+        [tf.shape(x)[0] for x in groundtruth_boxes_list])))
+    avg_num_matched_gt_boxes = tf.reduce_mean(tf.to_float(tf.stack(
+        [match.num_matched_rows() for match in match_list])))
+    avg_pos_anchors = tf.reduce_mean(tf.to_float(tf.stack(
+        [match.num_matched_columns() for match in match_list])))
+    avg_neg_anchors = tf.reduce_mean(tf.to_float(tf.stack(
+        [match.num_unmatched_columns() for match in match_list])))
+    avg_ignored_anchors = tf.reduce_mean(tf.to_float(tf.stack(
+        [match.num_ignored_columns() for match in match_list])))
+    # TODO(rathodv): Add a test for these summaries.
    tf.summary.scalar('AvgNumGroundtruthBoxesPerImage',
-                      tf.reduce_mean(tf.to_float(num_boxes_per_image)),
+                      avg_num_gt_boxes,
                      family='TargetAssignment')
+    tf.summary.scalar('AvgNumGroundtruthBoxesMatchedPerImage',
+                      avg_num_matched_gt_boxes,
+                      family='TargetAssignment')
    tf.summary.scalar('AvgNumPositiveAnchorsPerImage',
-                      tf.reduce_mean(tf.to_float(pos_anchors_per_image)),
+                      avg_pos_anchors,
                      family='TargetAssignment')
    tf.summary.scalar('AvgNumNegativeAnchorsPerImage',
-                      tf.reduce_mean(tf.to_float(neg_anchors_per_image)),
+                      avg_neg_anchors,
                      family='TargetAssignment')
    tf.summary.scalar('AvgNumIgnoredAnchorsPerImage',
-                      tf.reduce_mean(tf.to_float(ignored_anchors_per_image)),
+                      avg_ignored_anchors,
                      family='TargetAssignment')
  def _apply_hard_mining(self, location_losses, cls_losses, prediction_dict,
@@ -1054,6 +1068,7 @@ class SSDMetaArch(model.DetectionModel):
          [batch_size, num_anchors, num_classes+1] containing class predictions
          (logits) for each of the anchors. Note that this tensor *includes*
          background class predictions.
+        3) anchors: (optional) 2-D float tensor of shape [num_anchors, 4].
      match_list: a list of matcher.Match objects encoding the match between
        anchors and groundtruth boxes for each image of the batch,
        with rows of the Match objects corresponding to groundtruth boxes
@@ -1069,7 +1084,10 @@ class SSDMetaArch(model.DetectionModel):
    if self._add_background_class:
      class_predictions = tf.slice(class_predictions, [0, 0, 1], [-1, -1, -1])

-    decoded_boxes, _ = self._batch_decode(prediction_dict['box_encodings'])
+    if 'anchors' not in prediction_dict:
+      prediction_dict['anchors'] = self.anchors.get()
+    decoded_boxes, _ = self._batch_decode(prediction_dict['box_encodings'],
+                                          prediction_dict['anchors'])
    decoded_box_tensors_list = tf.unstack(decoded_boxes)
    class_prediction_list = tf.unstack(class_predictions)
    decoded_boxlist_list = []
@@ -1084,12 +1102,13 @@ class SSDMetaArch(model.DetectionModel):
        decoded_boxlist_list=decoded_boxlist_list,
        match_list=match_list)

-  def _batch_decode(self, box_encodings):
+  def _batch_decode(self, box_encodings, anchors):
    """Decodes a batch of box encodings with respect to the anchors.

    Args:
      box_encodings: A float32 tensor of shape
        [batch_size, num_anchors, box_code_size] containing box encodings.
+      anchors: A tensor of shape [num_anchors, 4].

    Returns:
      decoded_boxes: A float32 tensor of shape
@@ -1101,8 +1120,7 @@ class SSDMetaArch(model.DetectionModel):
    combined_shape = shape_utils.combined_static_and_dynamic_shape(
        box_encodings)
    batch_size = combined_shape[0]
-    tiled_anchor_boxes = tf.tile(
-        tf.expand_dims(self.anchors.get(), 0), [batch_size, 1, 1])
+    tiled_anchor_boxes = tf.tile(tf.expand_dims(anchors, 0), [batch_size, 1, 1])
    tiled_anchors_boxlist = box_list.BoxList(
        tf.reshape(tiled_anchor_boxes, [-1, 4]))
    decoded_boxes = self._box_coder.decode(
...
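After this refactor `_batch_decode` no longer reads `self.anchors` internally; callers pass the anchors in, and the method tiles them across the batch before decoding. The tiling step is equivalent to this pure-Python sketch (no TensorFlow here; nested lists stand in for tensor dimensions, and the function name is illustrative):

```python
def tile_anchors_for_batch(anchors, batch_size):
    """Mimics tf.tile(tf.expand_dims(anchors, 0), [batch_size, 1, 1]).

    anchors: list of [ymin, xmin, ymax, xmax] rows.
    Returns a [batch_size, num_anchors, 4]-shaped nested list, one
    independent copy of the anchor set per batch element.
    """
    return [[list(box) for box in anchors] for _ in range(batch_size)]
```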
@@ -311,8 +311,8 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
                       [0.5, 0., 1., 0.5], [1., 1., 1.5, 1.5]],
                      [[0., 0., 0.5, 0.5], [0., 0.5, 0.5, 1.],
                       [0.5, 0., 1., 0.5], [1., 1., 1.5, 1.5]]]
-    raw_detection_scores = [[[0, 0], [0, 0], [0, 0], [0, 0]],
-                            [[0, 0], [0, 0], [0, 0], [0, 0]]]
+    raw_detection_scores = [[[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
+                            [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]]
    for input_shape in input_shapes:
      tf_graph = tf.Graph()
...
@@ -197,6 +197,7 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
        'PerformanceByCategory' is included in the output regardless of
        all_metrics_per_category.
    """
+    tf.logging.info('Performing evaluation on %d images.', len(self._image_ids))
    groundtruth_dict = {
        'annotations': self._groundtruth_list,
        'images': [{'id': image_id} for image_id in self._image_ids],
...
@@ -24,6 +24,7 @@ import os

import tensorflow as tf

+from tensorflow.python.util import function_utils
from object_detection import eval_util
from object_detection import exporter as exporter_lib
from object_detection import inputs
@@ -33,6 +34,7 @@ from object_detection.builders import optimizer_builder
from object_detection.core import standard_fields as fields
from object_detection.utils import config_util
from object_detection.utils import label_map_util
+from object_detection.utils import ops
from object_detection.utils import shape_utils
from object_detection.utils import variables_helper
from object_detection.utils import visualization_utils as vis_utils
@@ -279,9 +281,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
        prediction_dict = detection_model.predict(
            preprocessed_images,
            features[fields.InputDataFields.true_image_shape])
-      for k, v in prediction_dict.items():
-        if v.dtype == tf.bfloat16:
-          prediction_dict[k] = tf.cast(v, tf.float32)
+      prediction_dict = ops.bfloat16_to_float32_nested(prediction_dict)
    else:
      prediction_dict = detection_model.predict(
          preprocessed_images,
@@ -338,6 +338,9 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
      losses = [loss_tensor for loss_tensor in losses_dict.values()]
      if train_config.add_regularization_loss:
        regularization_losses = detection_model.regularization_losses()
+        if use_tpu and train_config.use_bfloat16:
+          regularization_losses = ops.bfloat16_to_float32_nested(
+              regularization_losses)
        if regularization_losses:
          regularization_loss = tf.add_n(
              regularization_losses, name='regularization_loss')
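The inline cast loop is replaced by `ops.bfloat16_to_float32_nested`, which walks a nested structure and casts only bfloat16 leaves. Its behavior can be sketched in plain Python, with a `dtype`-tagged stand-in instead of real tensors (all names below are illustrative, not the actual helper's implementation):

```python
class Leaf:
    """Minimal tensor stand-in carrying only a dtype tag."""

    def __init__(self, dtype):
        self.dtype = dtype


def cast_bf16_nested(structure):
    """Recursively convert 'bfloat16'-tagged leaves to 'float32'.

    Dicts and lists are traversed; a leaf with dtype 'bfloat16' is
    replaced by a 'float32' leaf; everything else passes through.
    """
    if isinstance(structure, dict):
        return {k: cast_bf16_nested(v) for k, v in structure.items()}
    if isinstance(structure, list):
        return [cast_bf16_nested(v) for v in structure]
    if getattr(structure, 'dtype', None) == 'bfloat16':
        return Leaf('float32')
    return structure
```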
@@ -650,6 +653,11 @@ def create_estimator_and_inputs(run_config,
  model_fn = model_fn_creator(detection_model_fn, configs, hparams, use_tpu,
                              postprocess_on_cpu)
  if use_tpu_estimator:
+    # Multicore inference disabled due to b/129367127
+    tpu_estimator_args = function_utils.fn_args(tf.contrib.tpu.TPUEstimator)
+    kwargs = {}
+    if 'experimental_export_device_assignment' in tpu_estimator_args:
+      kwargs['experimental_export_device_assignment'] = True
    estimator = tf.contrib.tpu.TPUEstimator(
        model_fn=model_fn,
        train_batch_size=train_config.batch_size,
@@ -659,7 +667,8 @@ def create_estimator_and_inputs(run_config,
        config=run_config,
        export_to_tpu=export_to_tpu,
        eval_on_tpu=False,  # Eval runs on CPU, so disable eval on TPU
-        params=params if params else {})
+        params=params if params else {},
+        **kwargs)
  else:
    estimator = tf.estimator.Estimator(model_fn=model_fn, config=run_config)
...
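The guarded `kwargs` in that hunk is a feature-detection pattern: a keyword argument is forwarded only if the installed API's signature actually accepts it, so the code keeps working against older TPUEstimator releases. A minimal standard-library sketch of the same idea (the target function here is made up for illustration):

```python
import inspect


def make_estimator(model_fn, extra_flag=False):
    """Hypothetical constructor; imagine older releases lack `extra_flag`."""
    return {'model_fn': model_fn, 'extra_flag': extra_flag}


# Only forward the optional kwarg when the signature supports it.
accepted = inspect.signature(make_estimator).parameters
kwargs = {}
if 'extra_flag' in accepted:
    kwargs['extra_flag'] = True

estimator = make_estimator(model_fn='my_model_fn', **kwargs)
```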
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for models.faster_rcnn_inception_resnet_v2_keras_feature_extractor."""
import tensorflow as tf
from object_detection.models import faster_rcnn_inception_resnet_v2_keras_feature_extractor as frcnn_inc_res
class FasterRcnnInceptionResnetV2KerasFeatureExtractorTest(tf.test.TestCase):
def _build_feature_extractor(self, first_stage_features_stride):
return frcnn_inc_res.FasterRCNNInceptionResnetV2KerasFeatureExtractor(
is_training=False,
first_stage_features_stride=first_stage_features_stride,
batch_norm_trainable=False,
weight_decay=0.0)
def test_extract_proposal_features_returns_expected_size(self):
feature_extractor = self._build_feature_extractor(
first_stage_features_stride=16)
preprocessed_inputs = tf.random_uniform(
[1, 299, 299, 3], maxval=255, dtype=tf.float32)
rpn_feature_map = feature_extractor.get_proposal_feature_extractor_model(
name='TestScope')(preprocessed_inputs)
features_shape = tf.shape(rpn_feature_map)
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
features_shape_out = sess.run(features_shape)
self.assertAllEqual(features_shape_out, [1, 19, 19, 1088])
def test_extract_proposal_features_stride_eight(self):
feature_extractor = self._build_feature_extractor(
first_stage_features_stride=8)
preprocessed_inputs = tf.random_uniform(
[1, 224, 224, 3], maxval=255, dtype=tf.float32)
rpn_feature_map = feature_extractor.get_proposal_feature_extractor_model(
name='TestScope')(preprocessed_inputs)
features_shape = tf.shape(rpn_feature_map)
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
features_shape_out = sess.run(features_shape)
self.assertAllEqual(features_shape_out, [1, 28, 28, 1088])
def test_extract_proposal_features_half_size_input(self):
feature_extractor = self._build_feature_extractor(
first_stage_features_stride=16)
preprocessed_inputs = tf.random_uniform(
[1, 112, 112, 3], maxval=255, dtype=tf.float32)
rpn_feature_map = feature_extractor.get_proposal_feature_extractor_model(
name='TestScope')(preprocessed_inputs)
features_shape = tf.shape(rpn_feature_map)
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
features_shape_out = sess.run(features_shape)
self.assertAllEqual(features_shape_out, [1, 7, 7, 1088])
def test_extract_proposal_features_dies_on_invalid_stride(self):
with self.assertRaises(ValueError):
self._build_feature_extractor(first_stage_features_stride=99)
def test_extract_proposal_features_dies_with_incorrect_rank_inputs(self):
feature_extractor = self._build_feature_extractor(
first_stage_features_stride=16)
preprocessed_inputs = tf.random_uniform(
[224, 224, 3], maxval=255, dtype=tf.float32)
with self.assertRaises(ValueError):
feature_extractor.get_proposal_feature_extractor_model(
name='TestScope')(preprocessed_inputs)
def test_extract_box_classifier_features_returns_expected_size(self):
feature_extractor = self._build_feature_extractor(
first_stage_features_stride=16)
proposal_feature_maps = tf.random_uniform(
[2, 17, 17, 1088], maxval=255, dtype=tf.float32)
model = feature_extractor.get_box_classifier_feature_extractor_model(
name='TestScope')
proposal_classifier_features = (
model(proposal_feature_maps))
features_shape = tf.shape(proposal_classifier_features)
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
features_shape_out = sess.run(features_shape)
self.assertAllEqual(features_shape_out, [2, 8, 8, 1536])
if __name__ == '__main__':
tf.test.main()
...@@ -51,6 +51,60 @@ def get_depth_fn(depth_multiplier, min_depth): ...@@ -51,6 +51,60 @@ def get_depth_fn(depth_multiplier, min_depth):
return multiply_depth return multiply_depth
def create_conv_block(
use_depthwise, kernel_size, padding, stride, layer_name, conv_hyperparams,
is_training, freeze_batchnorm, depth):
"""Create Keras layers for depthwise & non-depthwise convolutions.
Args:
use_depthwise: Whether to use depthwise separable conv instead of regular
conv.
kernel_size: A list of length 2: [kernel_height, kernel_width] of the
filters. Can be an int if both values are the same.
padding: One of 'VALID' or 'SAME'.
stride: A list of length 2: [stride_height, stride_width], specifying the
convolution stride. Can be an int if both strides are the same.
layer_name: String. The name of the layer.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
is_training: Indicates whether the feature generator is in training mode.
freeze_batchnorm: Bool. Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
depth: Depth of output feature maps.
Returns:
A list of conv layers.
"""
layers = []
if use_depthwise:
layers.append(tf.keras.layers.SeparableConv2D(
depth,
[kernel_size, kernel_size],
depth_multiplier=1,
padding=padding,
strides=stride,
name=layer_name + '_depthwise_conv',
**conv_hyperparams.params()))
else:
layers.append(tf.keras.layers.Conv2D(
depth,
[kernel_size, kernel_size],
padding=padding,
strides=stride,
name=layer_name + '_conv',
**conv_hyperparams.params()))
layers.append(
conv_hyperparams.build_batch_norm(
training=(is_training and not freeze_batchnorm),
name=layer_name + '_batchnorm'))
layers.append(
conv_hyperparams.build_activation_layer(
name=layer_name))
return layers
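`create_conv_block` returns a plain list of Keras layers (conv, batch norm, activation) and leaves it to the caller to apply them in order. The chaining pattern can be illustrated without TensorFlow; in this minimal sketch, ordinary callables stand in for the Keras layers, and all names and values are illustrative only:

```python
# Sketch of how a layer list returned by create_conv_block is consumed.
# Plain callables stand in for Conv2D / batch norm / activation layers.
def make_block():
    conv = lambda x: [v * 2.0 for v in x]           # stand-in for Conv2D
    batchnorm = lambda x: [v - 1.0 for v in x]      # stand-in for batch norm
    activation = lambda x: [max(v, 0.0) for v in x] # stand-in for ReLU
    return [conv, batchnorm, activation]

def apply_block(layers, features):
    # Callers chain the layers by applying each one to the previous output.
    for layer in layers:
        features = layer(features)
    return features

print(apply_block(make_block(), [0.5, -2.0, 3.0]))  # [0.0, 0.0, 5.0]
```

Returning a list rather than a composed model keeps the block reusable: the same layers can be applied inside different name scopes and interleaved with other ops, which is how the FPN code below consumes them.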
class KerasMultiResolutionFeatureMaps(tf.keras.Model):
"""Generates multi resolution feature maps from input image features.
...@@ -419,6 +473,180 @@ def multi_resolution_feature_maps(feature_map_layout, depth_multiplier,
[(x, y) for (x, y) in zip(feature_map_keys, feature_maps)])
class KerasFpnTopDownFeatureMaps(tf.keras.Model):
"""Generates Keras based `top-down` feature maps for Feature Pyramid Networks.
See https://arxiv.org/abs/1612.03144 for details.
"""
def __init__(self,
num_levels,
depth,
is_training,
conv_hyperparams,
freeze_batchnorm,
use_depthwise=False,
use_explicit_padding=False,
use_bounded_activations=False,
use_native_resize_op=False,
scope=None,
name=None):
"""Constructor.
Args:
num_levels: the number of image features.
depth: depth of output feature maps.
is_training: Indicates whether the feature generator is in training mode.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
freeze_batchnorm: Bool. Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
use_depthwise: whether to use depthwise separable conv instead of regular
conv.
use_explicit_padding: whether to use explicit padding.
use_bounded_activations: Whether or not to clip activations to range
[-ACTIVATION_BOUND, ACTIVATION_BOUND]. Bounded activations better lend
themselves to quantized inference.
use_native_resize_op: If True, uses tf.image.resize_nearest_neighbor op
for the upsampling process instead of reshape and broadcasting
implementation.
scope: A scope name to wrap this op under.
name: A string name scope to assign to the model. If 'None', Keras
will auto-generate one from the class name.
"""
super(KerasFpnTopDownFeatureMaps, self).__init__(name=name)
self.scope = scope if scope else 'top_down'
self.top_layers = []
self.residual_blocks = []
self.top_down_blocks = []
self.reshape_blocks = []
self.conv_layers = []
padding = 'VALID' if use_explicit_padding else 'SAME'
stride = 1
kernel_size = 3
def clip_by_value(features):
return tf.clip_by_value(features, -ACTIVATION_BOUND, ACTIVATION_BOUND)
# top layers
self.top_layers.append(tf.keras.layers.Conv2D(
depth, [1, 1], strides=stride, padding=padding,
name='projection_%d' % num_levels,
**conv_hyperparams.params(use_bias=True)))
if use_bounded_activations:
self.top_layers.append(tf.keras.layers.Lambda(
clip_by_value, name='clip_by_value'))
for level in reversed(range(num_levels - 1)):
# to generate residual from image features
residual_net = []
# to preprocess top_down (the image feature map from last layer)
top_down_net = []
# to reshape top_down according to residual if necessary
reshaped_residual = []
# to apply convolution layers to feature map
conv_net = []
# residual block
residual_net.append(tf.keras.layers.Conv2D(
depth, [1, 1], padding=padding, strides=1,
name='projection_%d' % (level + 1),
**conv_hyperparams.params(use_bias=True)))
if use_bounded_activations:
residual_net.append(tf.keras.layers.Lambda(
clip_by_value, name='clip_by_value'))
# top-down block
# TODO (b/128922690): clean-up of ops.nearest_neighbor_upsampling
if use_native_resize_op:
def resize_nearest_neighbor(image):
image_shape = image.shape.as_list()
return tf.image.resize_nearest_neighbor(
image, [image_shape[1] * 2, image_shape[2] * 2])
top_down_net.append(tf.keras.layers.Lambda(
resize_nearest_neighbor, name='nearest_neighbor_upsampling'))
else:
def nearest_neighbor_upsampling(image):
return ops.nearest_neighbor_upsampling(image, scale=2)
top_down_net.append(tf.keras.layers.Lambda(
nearest_neighbor_upsampling, name='nearest_neighbor_upsampling'))
# reshape block
if use_explicit_padding:
def reshape(inputs):
residual_shape = tf.shape(inputs[0])
return inputs[1][:, :residual_shape[1], :residual_shape[2], :]
reshaped_residual.append(
tf.keras.layers.Lambda(reshape, name='reshape'))
# down layers
if use_bounded_activations:
conv_net.append(tf.keras.layers.Lambda(
clip_by_value, name='clip_by_value'))
if use_explicit_padding:
def fixed_padding(features, kernel_size=kernel_size):
return ops.fixed_padding(features, kernel_size)
conv_net.append(tf.keras.layers.Lambda(
fixed_padding, name='fixed_padding'))
layer_name = 'smoothing_%d' % (level + 1)
conv_block = create_conv_block(
use_depthwise, kernel_size, padding, stride, layer_name,
conv_hyperparams, is_training, freeze_batchnorm, depth)
conv_net.extend(conv_block)
self.residual_blocks.append(residual_net)
self.top_down_blocks.append(top_down_net)
self.reshape_blocks.append(reshaped_residual)
self.conv_layers.append(conv_net)
def call(self, image_features):
"""Generate the multi-resolution feature maps.
Executed when calling the `.__call__` method on input.
Args:
image_features: list of tuples of (tensor_name, image_feature_tensor).
Spatial resolutions of successive tensors must reduce exactly by a factor
of 2.
Returns:
feature_maps: an OrderedDict mapping keys (feature map names) to
tensors where each tensor has shape [batch, height_i, width_i, depth_i].
"""
output_feature_maps_list = []
output_feature_map_keys = []
with tf.name_scope(self.scope):
top_down = image_features[-1][1]
for layer in self.top_layers:
top_down = layer(top_down)
output_feature_maps_list.append(top_down)
output_feature_map_keys.append('top_down_%s' % image_features[-1][0])
num_levels = len(image_features)
for index, level in enumerate(reversed(range(num_levels - 1))):
residual = image_features[level][1]
top_down = output_feature_maps_list[-1]
for layer in self.residual_blocks[index]:
residual = layer(residual)
for layer in self.top_down_blocks[index]:
top_down = layer(top_down)
for layer in self.reshape_blocks[index]:
top_down = layer([residual, top_down])
top_down += residual
for layer in self.conv_layers[index]:
top_down = layer(top_down)
output_feature_maps_list.append(top_down)
output_feature_map_keys.append('top_down_%s' % image_features[level][0])
return collections.OrderedDict(reversed(
list(zip(output_feature_map_keys, output_feature_maps_list))))
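The core of the top-down pass in `call` is: upsample the coarser map by a factor of 2 (nearest neighbor) and add the 1x1-projected lateral feature as a residual. That merge step can be sketched on tiny 2-D grids in pure Python (no TensorFlow; shapes and values are illustrative only):

```python
# Sketch of the FPN top-down merge: nearest-neighbor 2x upsampling of the
# coarser map followed by element-wise addition of the lateral residual,
# mirroring the `top_down += residual` step in call() above.
def upsample_nn_2x(grid):
    out = []
    for row in grid:
        wide = [v for v in row for _ in (0, 1)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                   # repeat each row twice
    return out

def merge(top_down, residual):
    up = upsample_nn_2x(top_down)
    return [[u + r for u, r in zip(ur, rr)]
            for ur, rr in zip(up, residual)]

coarse = [[1, 2],
          [3, 4]]
residual = [[0] * 4 for _ in range(4)]
print(merge(coarse, residual))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

This is why successive input levels must differ in resolution by exactly a factor of 2: the upsampled coarse map must align element-wise with the residual (the `use_explicit_padding` reshape block crops it when it does not).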
def fpn_top_down_feature_maps(image_features,
depth,
use_depthwise=False,
...
...@@ -403,21 +403,101 @@ class MultiResolutionFeatureMapGeneratorTest(tf.test.TestCase):
self.assertSetEqual(expected_slim_variables, actual_variable_set)
@parameterized.parameters({'use_native_resize_op': True, 'use_keras': False},
{'use_native_resize_op': False, 'use_keras': False},
{'use_native_resize_op': True, 'use_keras': True},
{'use_native_resize_op': False, 'use_keras': True})
class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
def _build_conv_hyperparams(self):
conv_hyperparams = hyperparams_pb2.Hyperparams()
conv_hyperparams_text_proto = """
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams)
return hyperparams_builder.KerasLayerHyperparams(conv_hyperparams)
def _build_feature_map_generator(
self, image_features, depth, use_keras, use_bounded_activations=False,
use_native_resize_op=False, use_explicit_padding=False,
use_depthwise=False):
if use_keras:
return feature_map_generators.KerasFpnTopDownFeatureMaps(
num_levels=len(image_features),
depth=depth,
is_training=True,
conv_hyperparams=self._build_conv_hyperparams(),
freeze_batchnorm=False,
use_depthwise=use_depthwise,
use_explicit_padding=use_explicit_padding,
use_bounded_activations=use_bounded_activations,
use_native_resize_op=use_native_resize_op,
scope=None,
name='FeatureMaps',
)
else:
def feature_map_generator(image_features):
return feature_map_generators.fpn_top_down_feature_maps(
image_features=image_features,
depth=depth,
use_depthwise=use_depthwise,
use_explicit_padding=use_explicit_padding,
use_bounded_activations=use_bounded_activations,
use_native_resize_op=use_native_resize_op)
return feature_map_generator
def test_get_expected_feature_map_shapes(
self, use_native_resize_op, use_keras):
image_features = [
('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_native_resize_op=use_native_resize_op)
feature_maps = feature_map_generator(image_features)
expected_feature_map_shapes = {
'top_down_block2': (4, 8, 8, 128),
'top_down_block3': (4, 4, 4, 128),
'top_down_block4': (4, 2, 2, 128),
'top_down_block5': (4, 1, 1, 128)
}
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
out_feature_maps = sess.run(feature_maps)
out_feature_map_shapes = {key: value.shape
for key, value in out_feature_maps.items()}
self.assertDictEqual(out_feature_map_shapes, expected_feature_map_shapes)
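The expected shapes in the test above follow a simple rule: each top-down output keeps its input level's batch and spatial dimensions and replaces the channel count with `depth` (128 here). A small helper can compute those expectations directly from the input shapes; this is a sketch for illustration, not part of the library:

```python
# Sketch: FPN top-down outputs preserve each level's (batch, height, width)
# and replace the channel dimension with `depth`.
def expected_fpn_shapes(input_shapes, depth):
    return {'top_down_' + name: (n, h, w, depth)
            for name, (n, h, w, _) in input_shapes.items()}

shapes = {
    'block2': (4, 8, 8, 256),
    'block3': (4, 4, 4, 256),
    'block4': (4, 2, 2, 256),
    'block5': (4, 1, 1, 256),
}
print(expected_fpn_shapes(shapes, 128))
```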
def test_get_expected_feature_map_shapes_with_explicit_padding(
self, use_native_resize_op, use_keras):
image_features = [
('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_explicit_padding=True,
use_native_resize_op=use_native_resize_op)
feature_maps = feature_map_generator(image_features)
expected_feature_map_shapes = {
'top_down_block2': (4, 8, 8, 128),
...@@ -434,7 +514,8 @@ class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
for key, value in out_feature_maps.items()}
self.assertDictEqual(out_feature_map_shapes, expected_feature_map_shapes)
def test_use_bounded_activations_add_operations(
self, use_native_resize_op, use_keras):
tf_graph = tf.Graph()
with tf_graph.as_default():
image_features = [('block2',
...@@ -445,22 +526,37 @@ class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5',
tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_bounded_activations=True,
use_native_resize_op=use_native_resize_op)
feature_map_generator(image_features)
if use_keras:
expected_added_operations = dict.fromkeys([
'FeatureMaps/top_down/clip_by_value/clip_by_value',
'FeatureMaps/top_down/clip_by_value_1/clip_by_value',
'FeatureMaps/top_down/clip_by_value_2/clip_by_value',
'FeatureMaps/top_down/clip_by_value_3/clip_by_value',
'FeatureMaps/top_down/clip_by_value_4/clip_by_value',
'FeatureMaps/top_down/clip_by_value_5/clip_by_value',
'FeatureMaps/top_down/clip_by_value_6/clip_by_value',
])
else:
expected_added_operations = dict.fromkeys([
'top_down/clip_by_value', 'top_down/clip_by_value_1',
'top_down/clip_by_value_2', 'top_down/clip_by_value_3',
'top_down/clip_by_value_4', 'top_down/clip_by_value_5',
'top_down/clip_by_value_6'
])
op_names = {op.name: None for op in tf_graph.get_operations()}
self.assertDictContainsSubset(expected_added_operations, op_names)
def test_use_bounded_activations_clip_value(
self, use_native_resize_op, use_keras):
tf_graph = tf.Graph()
with tf_graph.as_default():
image_features = [
...@@ -469,18 +565,31 @@ class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
('block4', 255 * tf.ones([4, 2, 2, 256], dtype=tf.float32)),
('block5', 255 * tf.ones([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_bounded_activations=True,
use_native_resize_op=use_native_resize_op)
feature_map_generator(image_features)
if use_keras:
expected_clip_by_value_ops = dict.fromkeys([
'FeatureMaps/top_down/clip_by_value/clip_by_value',
'FeatureMaps/top_down/clip_by_value_1/clip_by_value',
'FeatureMaps/top_down/clip_by_value_2/clip_by_value',
'FeatureMaps/top_down/clip_by_value_3/clip_by_value',
'FeatureMaps/top_down/clip_by_value_4/clip_by_value',
'FeatureMaps/top_down/clip_by_value_5/clip_by_value',
'FeatureMaps/top_down/clip_by_value_6/clip_by_value',
])
else:
expected_clip_by_value_ops = [
'top_down/clip_by_value', 'top_down/clip_by_value_1',
'top_down/clip_by_value_2', 'top_down/clip_by_value_3',
'top_down/clip_by_value_4', 'top_down/clip_by_value_5',
'top_down/clip_by_value_6'
]
# Gathers activation tensors before and after clip_by_value operations.
activations = {}
...@@ -522,18 +631,20 @@ class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
self.assertLessEqual(after_clipping_upper_bound, expected_upper_bound)
def test_get_expected_feature_map_shapes_with_depthwise(
self, use_native_resize_op, use_keras):
image_features = [
('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_depthwise=True,
use_native_resize_op=use_native_resize_op)
feature_maps = feature_map_generator(image_features)
expected_feature_map_shapes = {
'top_down_block2': (4, 8, 8, 128),
...@@ -550,6 +661,131 @@ class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
for key, value in out_feature_maps.items()}
self.assertDictEqual(out_feature_map_shapes, expected_feature_map_shapes)
def test_get_expected_variable_names(
self, use_native_resize_op, use_keras):
image_features = [
('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_native_resize_op=use_native_resize_op)
feature_maps = feature_map_generator(image_features)
expected_slim_variables = set([
'projection_1/weights',
'projection_1/biases',
'projection_2/weights',
'projection_2/biases',
'projection_3/weights',
'projection_3/biases',
'projection_4/weights',
'projection_4/biases',
'smoothing_1/weights',
'smoothing_1/biases',
'smoothing_2/weights',
'smoothing_2/biases',
'smoothing_3/weights',
'smoothing_3/biases',
])
expected_keras_variables = set([
'FeatureMaps/top_down/projection_1/kernel',
'FeatureMaps/top_down/projection_1/bias',
'FeatureMaps/top_down/projection_2/kernel',
'FeatureMaps/top_down/projection_2/bias',
'FeatureMaps/top_down/projection_3/kernel',
'FeatureMaps/top_down/projection_3/bias',
'FeatureMaps/top_down/projection_4/kernel',
'FeatureMaps/top_down/projection_4/bias',
'FeatureMaps/top_down/smoothing_1_conv/kernel',
'FeatureMaps/top_down/smoothing_1_conv/bias',
'FeatureMaps/top_down/smoothing_2_conv/kernel',
'FeatureMaps/top_down/smoothing_2_conv/bias',
'FeatureMaps/top_down/smoothing_3_conv/kernel',
'FeatureMaps/top_down/smoothing_3_conv/bias'
])
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
sess.run(feature_maps)
actual_variable_set = set(
[var.op.name for var in tf.trainable_variables()])
if use_keras:
self.assertSetEqual(expected_keras_variables, actual_variable_set)
else:
self.assertSetEqual(expected_slim_variables, actual_variable_set)
def test_get_expected_variable_names_with_depthwise(
self, use_native_resize_op, use_keras):
image_features = [
('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_depthwise=True,
use_native_resize_op=use_native_resize_op)
feature_maps = feature_map_generator(image_features)
expected_slim_variables = set([
'projection_1/weights',
'projection_1/biases',
'projection_2/weights',
'projection_2/biases',
'projection_3/weights',
'projection_3/biases',
'projection_4/weights',
'projection_4/biases',
'smoothing_1/depthwise_weights',
'smoothing_1/pointwise_weights',
'smoothing_1/biases',
'smoothing_2/depthwise_weights',
'smoothing_2/pointwise_weights',
'smoothing_2/biases',
'smoothing_3/depthwise_weights',
'smoothing_3/pointwise_weights',
'smoothing_3/biases',
])
expected_keras_variables = set([
'FeatureMaps/top_down/projection_1/kernel',
'FeatureMaps/top_down/projection_1/bias',
'FeatureMaps/top_down/projection_2/kernel',
'FeatureMaps/top_down/projection_2/bias',
'FeatureMaps/top_down/projection_3/kernel',
'FeatureMaps/top_down/projection_3/bias',
'FeatureMaps/top_down/projection_4/kernel',
'FeatureMaps/top_down/projection_4/bias',
'FeatureMaps/top_down/smoothing_1_depthwise_conv/depthwise_kernel',
'FeatureMaps/top_down/smoothing_1_depthwise_conv/pointwise_kernel',
'FeatureMaps/top_down/smoothing_1_depthwise_conv/bias',
'FeatureMaps/top_down/smoothing_2_depthwise_conv/depthwise_kernel',
'FeatureMaps/top_down/smoothing_2_depthwise_conv/pointwise_kernel',
'FeatureMaps/top_down/smoothing_2_depthwise_conv/bias',
'FeatureMaps/top_down/smoothing_3_depthwise_conv/depthwise_kernel',
'FeatureMaps/top_down/smoothing_3_depthwise_conv/pointwise_kernel',
'FeatureMaps/top_down/smoothing_3_depthwise_conv/bias'
])
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
sess.run(feature_maps)
actual_variable_set = set(
[var.op.name for var in tf.trainable_variables()])
if use_keras:
self.assertSetEqual(expected_keras_variables, actual_variable_set)
else:
self.assertSetEqual(expected_slim_variables, actual_variable_set)
class GetDepthFunctionTest(tf.test.TestCase):
...