Commit 80444539 authored by Zhuoran Liu's avatar Zhuoran Liu Committed by pkulzc

Add TPU SavedModel exporter and refactor OD code (#6737)

247226201  by ronnyvotel:

    Updating the visualization tools to accept unique_ids for color coding.

--
247067830  by Zhichao Lu:

    Add box_encodings_clip_range options for the convolutional box predictor (for TPU compatibility).
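    The clipping itself is simple; a minimal sketch of the effect (the
    [-10, 10] bounds below are illustrative assumptions, not the library
    defaults):

    ```python
    import tensorflow as tf

    def clip_box_encodings(box_encodings, clip_min=-10.0, clip_max=10.0):
      # Clamp each encoding elementwise so extreme regression targets stay
      # in a TPU-friendly numeric range; the bounds here are assumptions.
      return tf.clip_by_value(box_encodings, clip_min, clip_max)
    ```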

--
246888475  by Zhichao Lu:

    Remove unused _update_eval_steps function.

--
246163259  by lzc:

    Add a gather op that can handle ignore indices (which are "-1"s in this case).
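    One way to implement such a gather (an illustrative sketch, not
    necessarily the committed implementation): prepend a row holding the
    fill value and shift indices by one, so every -1 lands on that row.

    ```python
    import tensorflow as tf

    def gather_with_ignored_indices(params, indices, fill_value):
      # Row 0 of `padded` holds the fill value; shifting indices by one
      # maps each -1 ("ignore") index onto it and index i onto params[i].
      padded = tf.concat([tf.expand_dims(fill_value, 0), params], axis=0)
      return tf.gather(padded, indices + 1)
    ```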

--
246084944  by Zhichao Lu:

    Keras-based implementation for SSD + MobilenetV2 + FPN.

--
245544227  by rathodv:

    Add batch_get_targets method to target assigner module to gather any groundtruth tensors based on the results of target assigner.
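    Usage, as exercised by the new BatchGetTargetsTest further down this
    page:

    ```python
    targets, weights = targetassigner.batch_get_targets(
        batch_match,                           # [batch, num_anchors] matches
        tf.unstack(groundtruth_tensors_list),  # one tensor per image
        tf.unstack(groundtruth_weights_list),
        unmatched_value, unmatched_weight)     # fallbacks for -1/-2 entries
    ```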

--
245540854  by rathodv:

    Update target assigner to return match tensor instead of a match object.
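    The raw tensor keeps the matcher.Match conventions, so callers can
    work with it directly; a small sketch:

    ```python
    import tensorflow as tf

    def matched_gt_indices(match):
      # match[i] >= 0 means anchor i is matched to groundtruth row match[i];
      # -1 means unmatched and -2 means ignored.
      return tf.boolean_mask(match, tf.greater_equal(match, 0))
    ```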

--
245434441  by Zhichao Lu:

    Add README for tpu_exporters package.

--
245381834  by lzc:

    Internal change.

--
245298983  by Zhichao Lu:

    Add conditional_shape_resizer to config_util

--
245134666  by Zhichao Lu:

    Adds ConditionalShapeResizer to the ImageResizer proto, which enables resizing only if the input image height or width is greater or smaller than a certain size. Also enables specification of the resize method in the resize_to_{max, min}_dimension methods.

--
245093975  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (faster-rcnn)

--
245072421  by Zhichao Lu:

    Adds a new image resizing method "resize_to_max_dimension" which resizes images only if a dimension is greater than the maximum desired value while maintaining aspect ratio.
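    A minimal sketch of the idea (this also underlies the
    ConditionalShapeResizer above), assuming a 3-D float image tensor and
    the default resize method:

    ```python
    import tensorflow as tf

    def resize_to_max_dimension(image, max_dimension=1024):
      # Scale is 1.0 unless the longest side exceeds max_dimension, so
      # smaller images pass through unchanged; aspect ratio is preserved.
      shape = tf.cast(tf.shape(image), tf.float32)
      scale = tf.minimum(1.0, max_dimension / tf.maximum(shape[0], shape[1]))
      new_size = tf.cast([shape[0] * scale, shape[1] * scale], tf.int32)
      return tf.image.resize_images(image, new_size)
    ```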

--
244946998  by lzc:

    Internal Changes.

--
244943693  by Zhichao Lu:

    Add a custom config to mobilenet v2 that makes it more detection friendly.

--
244754158  by derekjchow:

    Internal change.

--
244699875  by Zhichao Lu:

    Add check_range=False to box_list_ops.to_normalized_coordinates when training
    for instance segmentation.  This is consistent with other calls when training
    for object detection.  There could be wrongly annotated boxes in the dataset.
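    The call in question (see the faster_rcnn_meta_arch diff below):

    ```python
    flat_normalized_proposals = box_list_ops.to_normalized_coordinates(
        box_list.BoxList(tf.reshape(proposal_boxes, [-1, 4])),
        image_shape[1], image_shape[2], check_range=False).get()
    ```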

--
244507425  by rathodv:

    Support bfloat16 for ssd models.
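    A sketch of the usual TF 1.x pattern (assuming `tf.contrib` is
    available; the exact wiring in this change may differ): run feature
    extraction under a bfloat16 scope and cast back to float32 before
    numerically sensitive losses and postprocessing.

    ```python
    import tensorflow as tf

    def extract_features_bfloat16(feature_extractor, preprocessed_inputs):
      with tf.contrib.tpu.bfloat16_scope():
        feature_maps = feature_extractor.extract_features(preprocessed_inputs)
      return [tf.cast(fm, tf.float32) for fm in feature_maps]
    ```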

--
244399982  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd)

--
244209387  by Zhichao Lu:

    Internal change.

--
243922296  by rathodv:

    Change `raw_detection_scores` to contain softmax/sigmoid scores (not logits) for `raw_detection_boxes`.
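    For the RPN this corresponds to exporting post-softmax scores rather
    than logits (see the faster_rcnn_meta_arch diff below):

    ```python
    rpn_objectness_softmax = tf.nn.softmax(
        rpn_objectness_predictions_with_background_batch)
    ```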

--
243883978  by Zhichao Lu:

    Add a sample fully conv config.

--
243369455  by Zhichao Lu:

    Fix regularization loss gap in Keras and Slim.

--
243292002  by lzc:

    Internal changes.

--
243097958  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd model)

--
243007177  by Zhichao Lu:

    Exporting SavedModel for Object Detection TPU inference. (ssd model)

--
242776550  by Zhichao Lu:

    Make object detection pre-processing run on GPU.  tf.map_fn() uses
    TensorArrayV3 ops, which have no int32 GPU implementation.  Cast to int64,
    then cast back to int32.
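    A sketch of the workaround, assuming a per-element function `fn` that
    consumes and produces int64:

    ```python
    import tensorflow as tf

    def gpu_safe_map_fn(fn, int32_elems):
      # TensorArrayV3 (used inside tf.map_fn) has no int32 GPU kernel, so
      # route the loop through int64 and cast the result back to int32.
      mapped = tf.map_fn(fn, tf.cast(int32_elems, tf.int64), dtype=tf.int64)
      return tf.cast(mapped, tf.int32)
    ```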

--
242723128  by Zhichao Lu:

    Using sorted dictionaries for additional heads in non_max_suppression to ensure deterministic tensor order
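    Illustrative pattern: iterating the extra heads in sorted key order
    keeps the packed tensor order deterministic across runs and workers.

    ```python
    def pack_additional_fields(additional_fields):
      # e.g. keypoints, masks, or extra score heads keyed by name.
      keys = sorted(additional_fields)
      return keys, [additional_fields[key] for key in keys]
    ```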

--
242495311  by Zhichao Lu:

    Update documentation to reflect new TFLite examples repo location

--
242230527  by Zhichao Lu:

    Fix Dropout bugs for WeightSharedConvolutionalBoxPred.

--
242226573  by Zhichao Lu:

    Create Keras-based WeightSharedConvolutionalBoxPredictor.

--
241806074  by Zhichao Lu:

    Add inference in unit tests of TFX OD template.

--
241641498  by lzc:

    Internal change.

--
241637481  by Zhichao Lu:

    matmul_crop_and_resize(): Switch to dynamic shaping, so that not all dimensions are required to be known.

--
241429980  by Zhichao Lu:

    Internal change

--
241167237  by Zhichao Lu:

    Adds a faster_rcnn_inception_resnet_v2 Keras feature extractor, and updates the model builder to construct it.

--
241088616  by Zhichao Lu:

    Make it compatible with different dtypes, e.g. float32, bfloat16, etc.

--
240897364  by lzc:

    Use image_np_expanded in object_detection_tutorial notebook.

--
240890393  by Zhichao Lu:

    Disable multicore inference for the OD template as it's not yet compatible.

--
240352168  by Zhichao Lu:

    Make SSDResnetV1FpnFeatureExtractor not protected to allow inheritance.

--
240351470  by lzc:

    Internal change.

--
239878928  by Zhichao Lu:

    Defines Keras box predictors for Faster RCNN and RFCN

--
239872103  by Zhichao Lu:

    Delete duplicated inputs in test.

--
239714273  by Zhichao Lu:

    Adding scope variable to all class heads

--
239698643  by Zhichao Lu:

    Create FPN feature extractor for object detection.

--
239696657  by Zhichao Lu:

    Internal Change.

--
239299404  by Zhichao Lu:

    Allows the faster rcnn meta-architecture to support Keras subcomponents

--
238502595  by Zhichao Lu:

    Lay the groundwork for symmetric quantization.

--
238496885  by Zhichao Lu:

    Add flexible_grid_anchor_generator
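    For reference, the basic anchor-grid arithmetic such a generator
    performs (an illustrative sketch, not the library's API; here
    `aspect_ratio` is assumed to be width/height):

    ```python
    import numpy as np

    def anchor_grid(grid_height, grid_width, stride, base_size, aspect_ratio):
      # Centers sit on a stride-spaced grid; each anchor's height/width
      # derive from a base size and an aspect ratio.
      ys = (np.arange(grid_height) + 0.5) * stride
      xs = (np.arange(grid_width) + 0.5) * stride
      cy, cx = np.meshgrid(ys, xs, indexing='ij')
      h = base_size / np.sqrt(aspect_ratio)
      w = base_size * np.sqrt(aspect_ratio)
      boxes = np.stack(
          [cy - h / 2., cx - w / 2., cy + h / 2., cx + w / 2.], axis=-1)
      return boxes.reshape(-1, 4)  # rows of [y_min, x_min, y_max, x_max]
    ```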

--
238138727  by lzc:

    Remove dead code.

    _USE_C_SHAPES has been forced True in TensorFlow releases since
    TensorFlow 1.9
    (https://github.com/tensorflow/tensorflow/commit/1d74a69443f741e69f9f52cb6bc2940b4d4ae3b7)

--
238123936  by rathodv:

    Add num_matched_groundtruth summary to target assigner in SSD.

--
238103345  by ronnyvotel:

    Raising error if input file pattern does not match any files.
    Also printing the number of evaluation images for coco metrics.

--
238044081  by Zhichao Lu:

    Fix docstring to state the correct dimensionality of `class_predictions_with_background`.

--
237920279  by Zhichao Lu:

    [XLA] Rework debug flags for dumping HLO.

    The following flags (usually passed via the XLA_FLAGS envvar) are removed:

      xla_dump_computations_to
      xla_dump_executions_to
      xla_dump_ir_to
      xla_dump_optimized_hlo_proto_to
      xla_dump_per_pass_hlo_proto_to
      xla_dump_unoptimized_hlo_proto_to
      xla_generate_hlo_graph
      xla_generate_hlo_text_to
      xla_hlo_dump_as_html
      xla_hlo_graph_path
      xla_log_hlo_text

    The following new flags are added:

      xla_dump_to
      xla_dump_hlo_module_re
      xla_dump_hlo_pass_re
      xla_dump_hlo_as_text
      xla_dump_hlo_as_proto
      xla_dump_hlo_as_dot
      xla_dump_hlo_as_url
      xla_dump_hlo_as_html
      xla_dump_ir
      xla_dump_hlo_snapshots

    The default is not to dump anything at all, but as soon as some dumping flag is
    specified, we enable the following defaults (most of which can be overridden).

     * dump to stdout (overridden by --xla_dump_to)
     * dump HLO modules at the very beginning and end of the optimization pipeline
     * don't dump between any HLO passes (overridden by --xla_dump_hlo_pass_re)
     * dump all HLO modules (overridden by --xla_dump_hlo_module_re)
     * dump in textual format (overridden by
       --xla_dump_hlo_as_{text,proto,dot,url,html}).

    For example, to dump optimized and unoptimized HLO text and protos to /tmp/foo,
    pass

      --xla_dump_to=/tmp/foo --xla_dump_hlo_as_text --xla_dump_hlo_as_proto

    For details on these flags' meanings, see xla.proto.

    The intent of this change is to make dumping both simpler to use and more
    powerful.

    For example:

     * Previously there was no way to dump the HLO module during the pass pipeline
       in HLO text format; the only option was --xla_dump_per_pass_hlo_proto_to,
       which
       dumped in proto format.

       Now this is --xla_dump_hlo_pass_re=.* --xla_dump_hlo_as_text.  (In fact, the
       second flag is not necessary in this case, as dumping as text is the
       default.)

     * Previously there was no way to dump HLO as a graph before and after
       compilation; the only option was --xla_generate_hlo_graph, which would dump
       before/after every pass.

       Now this is --xla_dump_hlo_as_{dot,url,html} (depending on what format you
       want the graph in).

     * Previously, there was no coordination between the filenames written by the
       various flags, so info about one module might be dumped with various
       filename prefixes.  Now the filenames are consistent and all dumps from a
       particular module are next to each other.

    If you only specify some of these flags, we try to figure out what you wanted.
    For example:

     * --xla_dump_to implies --xla_dump_hlo_as_text unless you specify some
       other --xla_dump_hlo_as_* flag.

     * --xla_dump_hlo_as_text or --xla_dump_ir implies dumping to stdout unless you
       specify a different --xla_dump_to directory.  You can explicitly dump to
       stdout with --xla_dump_to=-.

    As part of this change, I simplified the debugging code in the HLO passes for
    dumping HLO modules.  Previously, many tests explicitly VLOG'ed the HLO module
    before, after, and sometimes during the pass.  I removed these VLOGs.  If you
    want dumps before/during/after an HLO pass, use --xla_dump_hlo_pass_re=<pass_name>.

--
237510043  by lzc:

    Internal Change.

--
237469515  by Zhichao Lu:

    Parameterize model_builder.build in inputs.py.
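    The mechanism, as in the inputs.py diff below: the build call is
    routed through an indirection map so it can be swapped out:

    ```python
    INPUT_BUILDER_UTIL_MAP = {
        'dataset_build': dataset_builder.build,
        'model_build': model_builder.build,
    }

    model_preprocess_fn = INPUT_BUILDER_UTIL_MAP['model_build'](
        model_config, is_training=True).preprocess
    ```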

--
237293511  by rathodv:

    Remove multiclass_scores from tensor_dict in transform_data_fn always.

--
237260333  by ronnyvotel:

    Updating faster_rcnn_meta_arch to define prediction dictionary fields that are batched.

--

PiperOrigin-RevId: 247226201
parent c4f34e58
......@@ -713,9 +713,6 @@ class BatchTargetAssignerTest(test_case.TestCase):
groundtruth_boxlist2 = np.array([[0, 0.25123152, 1, 1],
[0.015789, 0.0985, 0.55789, 0.3842]],
dtype=np.float32)
class_targets1 = np.array([[0, 1, 0, 0]], dtype=np.float32)
class_targets2 = np.array([[0, 0, 0, 1],
[0, 0, 1, 0]], dtype=np.float32)
class_targets1 = np.array([[[0, 1, 1],
[1, 1, 0]]], dtype=np.float32)
class_targets2 = np.array([[[0, 1, 1],
......@@ -821,6 +818,63 @@ class BatchTargetAssignerTest(test_case.TestCase):
self.assertAllClose(reg_weights_out, exp_reg_weights)
class BatchGetTargetsTest(test_case.TestCase):
def test_scalar_targets(self):
batch_match = np.array([[1, 0, 1],
[-2, -1, 1]], dtype=np.int32)
groundtruth_tensors_list = np.array([[11, 12], [13, 14]], dtype=np.int32)
groundtruth_weights_list = np.array([[1.0, 1.0], [1.0, 0.5]],
dtype=np.float32)
unmatched_value = np.array(99, dtype=np.int32)
unmatched_weight = np.array(0.0, dtype=np.float32)
def graph_fn(batch_match, groundtruth_tensors_list,
groundtruth_weights_list, unmatched_value, unmatched_weight):
targets, weights = targetassigner.batch_get_targets(
batch_match, tf.unstack(groundtruth_tensors_list),
tf.unstack(groundtruth_weights_list),
unmatched_value, unmatched_weight)
return (targets, weights)
(targets_np, weights_np) = self.execute(graph_fn, [
batch_match, groundtruth_tensors_list, groundtruth_weights_list,
unmatched_value, unmatched_weight
])
self.assertAllEqual([[12, 11, 12],
[99, 99, 14]], targets_np)
self.assertAllClose([[1.0, 1.0, 1.0],
[0.0, 0.0, 0.5]], weights_np)
def test_1d_targets(self):
batch_match = np.array([[1, 0, 1],
[-2, -1, 1]], dtype=np.int32)
groundtruth_tensors_list = np.array([[[11, 12], [12, 13]],
[[13, 14], [14, 15]]],
dtype=np.float32)
groundtruth_weights_list = np.array([[1.0, 1.0], [1.0, 0.5]],
dtype=np.float32)
unmatched_value = np.array([99, 99], dtype=np.float32)
unmatched_weight = np.array(0.0, dtype=np.float32)
def graph_fn(batch_match, groundtruth_tensors_list,
groundtruth_weights_list, unmatched_value, unmatched_weight):
targets, weights = targetassigner.batch_get_targets(
batch_match, tf.unstack(groundtruth_tensors_list),
tf.unstack(groundtruth_weights_list),
unmatched_value, unmatched_weight)
return (targets, weights)
(targets_np, weights_np) = self.execute(graph_fn, [
batch_match, groundtruth_tensors_list, groundtruth_weights_list,
unmatched_value, unmatched_weight
])
self.assertAllClose([[[12, 13], [11, 12], [12, 13]],
[[99, 99], [99, 99], [14, 15]]], targets_np)
self.assertAllClose([[1.0, 1.0, 1.0],
[0.0, 0.0, 0.5]], weights_np)
class BatchTargetAssignConfidencesTest(test_case.TestCase):
def _get_target_assigner(self):
......
......@@ -18,7 +18,6 @@ import os
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import test_util
from object_detection.core import standard_fields as fields
from object_detection.data_decoders import tf_example_decoder
from object_detection.protos import input_reader_pb2
......@@ -257,7 +256,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
self.assertAllEqual(expected_boxes,
tensor_dict[fields.InputDataFields.groundtruth_boxes])
@test_util.enable_c_shapes
def testDecodeKeypoint(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
......@@ -346,7 +344,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
self.assertAllClose(tensor_dict[fields.InputDataFields.groundtruth_weights],
np.ones(2, dtype=np.float32))
@test_util.enable_c_shapes
def testDecodeObjectLabel(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
......@@ -669,7 +666,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
self.assertAllEqual([3, 1],
tensor_dict[fields.InputDataFields.groundtruth_classes])
@test_util.enable_c_shapes
def testDecodeObjectArea(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
......@@ -696,7 +692,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
self.assertAllEqual(object_area,
tensor_dict[fields.InputDataFields.groundtruth_area])
@test_util.enable_c_shapes
def testDecodeObjectIsCrowd(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
......@@ -725,7 +720,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
[bool(item) for item in object_is_crowd],
tensor_dict[fields.InputDataFields.groundtruth_is_crowd])
@test_util.enable_c_shapes
def testDecodeObjectDifficult(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
......@@ -754,7 +748,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
[bool(item) for item in object_difficult],
tensor_dict[fields.InputDataFields.groundtruth_difficult])
@test_util.enable_c_shapes
def testDecodeObjectGroupOf(self):
image_tensor = np.random.randint(256, size=(4, 5, 3)).astype(np.uint8)
encoded_jpeg = self._EncodeImage(image_tensor)
......@@ -809,7 +802,6 @@ class TfExampleDecoderTest(tf.test.TestCase):
self.assertAllEqual(object_weights,
tensor_dict[fields.InputDataFields.groundtruth_weights])
@test_util.enable_c_shapes
def testDecodeInstanceSegmentation(self):
num_instances = 4
image_height = 5
......
......@@ -14,7 +14,7 @@ A couple words of warning:
the container. When running through the tutorial,
**do not close the container**.
2. To be able to deploy the [Android app](
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/examples/android/app)
https://github.com/tensorflow/examples/tree/master/lite/examples/object_detection/android)
(which you will build at the end of the tutorial),
you will need to kill any instances of `adb` running on the host machine. You
can accomplish this by closing all instances of Android Studio, and then
......
......@@ -94,43 +94,41 @@ bazel run --config=opt tensorflow/lite/toco:toco -- \
# Running our model on Android
To run our TensorFlow Lite model on device, we will need to install the Android
NDK and SDK. The current recommended Android NDK version is 14b and can be found
on the [NDK
Archives](https://developer.android.com/ndk/downloads/older_releases.html#ndk-14b-downloads)
page. Android SDK and build tools can be [downloaded
separately](https://developer.android.com/tools/revisions/build-tools.html) or
used as part of [Android
Studio](https://developer.android.com/studio/index.html). To build the
TensorFlow Lite Android demo, build tools require API >= 23 (but it will run on
devices with API >= 21). Additional details are available on the [TensorFlow
Lite Android App
page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/java/demo/README.md).
To run our TensorFlow Lite model on device, we will use Android Studio to build
and run the TensorFlow Lite detection example with the new model. The example is
found in the
[TensorFlow examples repository](https://github.com/tensorflow/examples) under
`/lite/examples/object_detection`. The example can be built with
[Android Studio](https://developer.android.com/studio/index.html), and requires
the
[Android SDK with build tools](https://developer.android.com/tools/revisions/build-tools.html)
that support API >= 21. Additional details are available on the
[TensorFlow Lite example page](https://github.com/tensorflow/examples/tree/master/lite/examples/object_detection/android).
Next we need to point the app to our new detect.tflite file and give it the
names of our new labels. Specifically, we will copy our TensorFlow Lite
flatbuffer to the app assets directory with the following command:
```shell
mkdir $TF_EXAMPLES/lite/examples/object_detection/android/app/src/main/assets
cp /tmp/tflite/detect.tflite \
//tensorflow/lite/examples/android/app/src/main/assets
$TF_EXAMPLES/lite/examples/object_detection/android/app/src/main/assets
```
You will also need to copy your new labelmap labels_list.txt to the assets
You will also need to copy your new labelmap labelmap.txt to the assets
directory.
We will now edit the BUILD file to point to this new model. First, open the
BUILD file tensorflow/lite/examples/android/BUILD. Then find the assets
section, and replace the line “@tflite_mobilenet_ssd_quant//:detect.tflite”
(which by default points to a COCO pretrained model) with the path to your new
TFLite model
“//tensorflow/lite/examples/android/app/src/main/assets:detect.tflite”.
Finally, change the last line in assets section to use the new label map as
well.
We will also need to tell our app to use the new label map. In order to do this,
open up the
tensorflow/lite/examples/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java
We will now edit the gradle build file to use these assets. First, open the
`build.gradle` file
`$TF_EXAMPLES/lite/examples/object_detection/android/app/build.gradle`. Comment
out the model download script to avoid your assets being overwritten:
`// apply from:'download_model.gradle'`
If your model is named `detect.tflite`, and your labels file `labelmap.txt`, the
example will use them automatically as long as they've been properly copied into
the base assets directory. If you need to use a custom path or filename, open up
the
$TF_EXAMPLES/lite/examples/object_detection/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java
file in a text editor and find the definition of TF_OD_API_LABELS_FILE. Update
this path to point to your new label map file:
"file:///android_asset/labels_list.txt". Note that if your model is quantized,
......@@ -144,20 +142,6 @@ DetectorActivity.java should now look as follows for a quantized model:
private static final String TF_OD_API_LABELS_FILE = "file:///android_asset/labels_list.txt";
```
Once you’ve copied the TensorFlow Lite file and edited your BUILD and
DetectorActivity.java files, you can build the demo app, run this bazel command
from the tensorflow directory:
```shell
bazel build -c opt --config=android_arm{,64} --cxxopt='--std=c++11'
"//tensorflow/lite/examples/android:tflite_demo"
```
Now install the demo on a
[debug-enabled](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#install)
Android phone via [Android Debug
Bridge](https://developer.android.com/studio/command-line/adb) (adb):
```shell
adb install bazel-bin/tensorflow/lite/examples/android/tflite_demo.apk
```
Once you’ve copied the TensorFlow Lite model and edited the gradle build script
to not use the downloaded assets, you can build and deploy the app using the
usual Android Studio build process.
# Object Detection TPU Inference Exporter
This package contains a SavedModel exporter for TPU inference of object
detection models.
## Usage
This exporter is intended for users who have trained models on CPUs/GPUs,
but would like to use them for inference on TPU without changing their code or
re-training their models.
Users are assumed to have:
+ `PIPELINE_CONFIG`: A pipeline_pb2.TrainEvalPipelineConfig config file;
+ `CHECKPOINT`: A model checkpoint trained on any device;
and need to correctly set:
+ `EXPORT_DIR`: Path to export SavedModel;
+ `INPUT_PLACEHOLDER`: Name of input placeholder in model's signature_def_map;
+ `INPUT_TYPE`: Type of input node, which can be one of 'image_tensor',
'encoded_image_string_tensor', or 'tf_example';
+ `USE_BFLOAT16`: Whether to use bfloat16 instead of float32 on TPU.
The model can be exported with:
```
python object_detection/tpu_exporters/export_saved_model_tpu.py \
--pipeline_config_file=<PIPELINE_CONFIG> \
--ckpt_path=<CHECKPOINT> \
--export_dir=<EXPORT_DIR> \
--input_placeholder_name=<INPUT_PLACEHOLDER> \
--input_type=<INPUT_TYPE> \
--use_bfloat16=<USE_BFLOAT16>
```
......@@ -43,6 +43,7 @@ SERVING_FED_EXAMPLE_KEY = 'serialized_example'
# A map of names to methods that help build the input pipeline.
INPUT_BUILDER_UTIL_MAP = {
'dataset_build': dataset_builder.build,
'model_build': model_builder.build,
}
......@@ -152,7 +153,7 @@ def transform_input_data(tensor_dict,
if use_multiclass_scores:
tensor_dict[fields.InputDataFields.groundtruth_classes] = tensor_dict[
fields.InputDataFields.multiclass_scores]
tensor_dict.pop(fields.InputDataFields.multiclass_scores, None)
tensor_dict.pop(fields.InputDataFields.multiclass_scores, None)
if fields.InputDataFields.groundtruth_confidences in tensor_dict:
groundtruth_confidences = tensor_dict[
......@@ -498,11 +499,13 @@ def create_train_input_fn(train_config, train_input_config,
data_augmentation_fn = functools.partial(
augment_input_data,
data_augmentation_options=data_augmentation_options)
model = model_builder.build(model_config, is_training=True)
model_preprocess_fn = INPUT_BUILDER_UTIL_MAP['model_build'](
model_config, is_training=True).preprocess
image_resizer_config = config_util.get_image_resizer_config(model_config)
image_resizer_fn = image_resizer_builder.build(image_resizer_config)
transform_data_fn = functools.partial(
transform_input_data, model_preprocess_fn=model.preprocess,
transform_input_data, model_preprocess_fn=model_preprocess_fn,
image_resizer_fn=image_resizer_fn,
num_classes=config_util.get_number_of_classes(model_config),
data_augmentation_fn=data_augmentation_fn,
......@@ -593,12 +596,14 @@ def create_eval_input_fn(eval_config, eval_input_config, model_config):
def transform_and_pad_input_data_fn(tensor_dict):
"""Combines transform and pad operation."""
num_classes = config_util.get_number_of_classes(model_config)
model = model_builder.build(model_config, is_training=False)
model_preprocess_fn = INPUT_BUILDER_UTIL_MAP['model_build'](
model_config, is_training=False).preprocess
image_resizer_config = config_util.get_image_resizer_config(model_config)
image_resizer_fn = image_resizer_builder.build(image_resizer_config)
transform_data_fn = functools.partial(
transform_input_data, model_preprocess_fn=model.preprocess,
transform_input_data, model_preprocess_fn=model_preprocess_fn,
image_resizer_fn=image_resizer_fn,
num_classes=num_classes,
data_augmentation_fn=None,
......@@ -643,12 +648,14 @@ def create_predict_input_fn(model_config, predict_input_config):
example = tf.placeholder(dtype=tf.string, shape=[], name='tf_example')
num_classes = config_util.get_number_of_classes(model_config)
model = model_builder.build(model_config, is_training=False)
model_preprocess_fn = INPUT_BUILDER_UTIL_MAP['model_build'](
model_config, is_training=False).preprocess
image_resizer_config = config_util.get_image_resizer_config(model_config)
image_resizer_fn = image_resizer_builder.build(image_resizer_config)
transform_fn = functools.partial(
transform_input_data, model_preprocess_fn=model.preprocess,
transform_input_data, model_preprocess_fn=model_preprocess_fn,
image_resizer_fn=image_resizer_fn,
num_classes=num_classes,
data_augmentation_fn=None)
......
......@@ -98,6 +98,7 @@ import tensorflow as tf
from object_detection.anchor_generators import grid_anchor_generator
from object_detection.builders import box_predictor_builder
from object_detection.builders import hyperparams_builder
from object_detection.core import box_list
from object_detection.core import box_list_ops
from object_detection.core import box_predictor
......@@ -110,6 +111,8 @@ from object_detection.utils import shape_utils
slim = tf.contrib.slim
_UNINITIALIZED_FEATURE_EXTRACTOR = '__uninitialized__'
class FasterRCNNFeatureExtractor(object):
"""Faster R-CNN Feature Extractor definition."""
......@@ -216,6 +219,71 @@ class FasterRCNNFeatureExtractor(object):
return variables_to_restore
class FasterRCNNKerasFeatureExtractor(object):
"""Keras-based Faster R-CNN Feature Extractor definition."""
def __init__(self,
is_training,
first_stage_features_stride,
batch_norm_trainable=False,
weight_decay=0.0):
"""Constructor.
Args:
is_training: A boolean indicating whether the training version of the
computation graph should be constructed.
first_stage_features_stride: Output stride of extracted RPN feature map.
batch_norm_trainable: Whether to update batch norm parameters during
training or not. When training with a relatively large batch size
(e.g. 8), it could be desirable to enable batch norm update.
weight_decay: float weight decay for feature extractor (default: 0.0).
"""
self._is_training = is_training
self._first_stage_features_stride = first_stage_features_stride
self._train_batch_norm = (batch_norm_trainable and is_training)
self._weight_decay = weight_decay
@abc.abstractmethod
def preprocess(self, resized_inputs):
"""Feature-extractor specific preprocessing (minus image resizing)."""
pass
@abc.abstractmethod
def get_proposal_feature_extractor_model(self, name):
"""Get model that extracts first stage RPN features, to be overridden."""
pass
@abc.abstractmethod
def get_box_classifier_feature_extractor_model(self, name):
"""Get model that extracts second stage box classifier features."""
pass
def restore_from_classification_checkpoint_fn(
self,
first_stage_feature_extractor_scope,
second_stage_feature_extractor_scope):
"""Returns a map of variables to load from a foreign checkpoint.
Args:
first_stage_feature_extractor_scope: A scope name for the first stage
feature extractor.
second_stage_feature_extractor_scope: A scope name for the second stage
feature extractor.
Returns:
A dict mapping variable names (to load from a checkpoint) to variables in
the model graph.
"""
variables_to_restore = {}
for variable in tf.global_variables():
for scope_name in [first_stage_feature_extractor_scope,
second_stage_feature_extractor_scope]:
if variable.op.name.startswith(scope_name):
var_name = variable.op.name.replace(scope_name + '/', '')
variables_to_restore[var_name] = variable
return variables_to_restore
class FasterRCNNMetaArch(model.DetectionModel):
"""Faster R-CNN Meta-architecture definition."""
......@@ -256,7 +324,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
add_summaries=True,
clip_anchors_to_image=False,
use_static_shapes=False,
resize_masks=True):
resize_masks=True,
freeze_batchnorm=False):
"""FasterRCNNMetaArch Constructor.
Args:
......@@ -296,9 +365,12 @@ class FasterRCNNMetaArch(model.DetectionModel):
denser resolutions. The atrous rate is used to compensate for the
denser feature maps by using an effectively larger receptive field.
(This should typically be set to 1).
first_stage_box_predictor_arg_scope_fn: A function to construct tf-slim
arg_scope for conv2d, separable_conv2d and fully_connected ops for the
RPN box predictor.
first_stage_box_predictor_arg_scope_fn: Either a
Keras layer hyperparams object or a function to construct tf-slim
arg_scope for conv2d, separable_conv2d and fully_connected ops. Used
for the RPN box predictor. If it is a keras hyperparams object the
RPN box predictor will be a Keras model. If it is a function to
construct an arg scope it will be a tf-slim box predictor.
first_stage_box_predictor_kernel_size: Kernel size to use for the
convolution op just prior to RPN box predictions.
first_stage_box_predictor_depth: Output depth for the convolution op
......@@ -378,6 +450,10 @@ class FasterRCNNMetaArch(model.DetectionModel):
guarantees.
resize_masks: Indicates whether the masks present in the groundtruth
should be resized in the model with `image_resizer_fn`
freeze_batchnorm: Whether to freeze batch norm parameters in the first
stage box predictor during training or not. When training with a small
batch size (e.g. 1), it is desirable to freeze batch norm update and
use pretrained batch norm params.
Raises:
ValueError: If `second_stage_batch_size` > `first_stage_max_proposals` at
......@@ -398,6 +474,19 @@ class FasterRCNNMetaArch(model.DetectionModel):
self._image_resizer_fn = image_resizer_fn
self._resize_masks = resize_masks
self._feature_extractor = feature_extractor
if isinstance(feature_extractor, FasterRCNNKerasFeatureExtractor):
# We delay building the feature extractor until it is used,
# to avoid creating the variables when a model is built just for data
# preprocessing. (This prevents a subtle bug where variable names are
# mismatched across workers, causing only one worker to be able to train)
self._feature_extractor_for_proposal_features = (
_UNINITIALIZED_FEATURE_EXTRACTOR)
self._feature_extractor_for_box_classifier_features = (
_UNINITIALIZED_FEATURE_EXTRACTOR)
else:
self._feature_extractor_for_proposal_features = None
self._feature_extractor_for_box_classifier_features = None
self._number_of_stages = number_of_stages
self._proposal_target_assigner = first_stage_target_assigner
......@@ -408,25 +497,82 @@ class FasterRCNNMetaArch(model.DetectionModel):
# (First stage) Region proposal network parameters
self._first_stage_anchor_generator = first_stage_anchor_generator
self._first_stage_atrous_rate = first_stage_atrous_rate
self._first_stage_box_predictor_arg_scope_fn = (
first_stage_box_predictor_arg_scope_fn)
self._first_stage_box_predictor_depth = first_stage_box_predictor_depth
self._first_stage_box_predictor_kernel_size = (
first_stage_box_predictor_kernel_size)
self._first_stage_box_predictor_depth = first_stage_box_predictor_depth
self._first_stage_minibatch_size = first_stage_minibatch_size
self._first_stage_sampler = first_stage_sampler
self._first_stage_box_predictor = (
box_predictor_builder.build_convolutional_box_predictor(
is_training=self._is_training,
num_classes=1,
conv_hyperparams_fn=self._first_stage_box_predictor_arg_scope_fn,
use_dropout=False,
dropout_keep_prob=1.0,
box_code_size=self._box_coder.code_size,
kernel_size=1,
num_layers_before_predictor=0,
min_depth=0,
max_depth=0))
if isinstance(first_stage_box_predictor_arg_scope_fn,
hyperparams_builder.KerasLayerHyperparams):
num_anchors_per_location = (
self._first_stage_anchor_generator.num_anchors_per_location())
if len(num_anchors_per_location) != 1:
raise ValueError('anchor_generator is expected to generate anchors '
'corresponding to a single feature map.')
conv_hyperparams = (
first_stage_box_predictor_arg_scope_fn)
self._first_stage_box_predictor_first_conv = (
tf.keras.Sequential([
tf.keras.layers.Conv2D(
self._first_stage_box_predictor_depth,
kernel_size=[self._first_stage_box_predictor_kernel_size,
self._first_stage_box_predictor_kernel_size],
dilation_rate=self._first_stage_atrous_rate,
padding='SAME',
name='RPNConv',
**conv_hyperparams.params()),
conv_hyperparams.build_batch_norm(
(self._is_training and not freeze_batchnorm),
name='RPNBatchNorm'),
tf.keras.layers.Lambda(
tf.nn.relu6,
name='RPNActivation')
], name='FirstStageRPNFeatures'))
self._first_stage_box_predictor = (
box_predictor_builder.build_convolutional_keras_box_predictor(
is_training=self._is_training,
num_classes=1,
conv_hyperparams=conv_hyperparams,
freeze_batchnorm=freeze_batchnorm,
inplace_batchnorm_update=False,
num_predictions_per_location_list=num_anchors_per_location,
use_dropout=False,
dropout_keep_prob=1.0,
box_code_size=self._box_coder.code_size,
kernel_size=1,
num_layers_before_predictor=0,
min_depth=0,
max_depth=0,
name=self.first_stage_box_predictor_scope))
else:
self._first_stage_box_predictor_arg_scope_fn = (
first_stage_box_predictor_arg_scope_fn)
def rpn_box_predictor_feature_extractor(rpn_features_to_crop):
with slim.arg_scope(self._first_stage_box_predictor_arg_scope_fn()):
reuse = tf.get_variable_scope().reuse
return slim.conv2d(
rpn_features_to_crop,
self._first_stage_box_predictor_depth,
kernel_size=[self._first_stage_box_predictor_kernel_size,
self._first_stage_box_predictor_kernel_size],
rate=self._first_stage_atrous_rate,
activation_fn=tf.nn.relu6,
scope='Conv',
reuse=reuse)
self._first_stage_box_predictor_first_conv = (
rpn_box_predictor_feature_extractor)
self._first_stage_box_predictor = (
box_predictor_builder.build_convolutional_box_predictor(
is_training=self._is_training,
num_classes=1,
conv_hyperparams_fn=self._first_stage_box_predictor_arg_scope_fn,
use_dropout=False,
dropout_keep_prob=1.0,
box_code_size=self._box_coder.code_size,
kernel_size=1,
num_layers_before_predictor=0,
min_depth=0,
max_depth=0))
self._first_stage_nms_fn = first_stage_non_max_suppression_fn
self._first_stage_max_proposals = first_stage_max_proposals
......@@ -444,6 +590,12 @@ class FasterRCNNMetaArch(model.DetectionModel):
self._initial_crop_size = initial_crop_size
self._maxpool_kernel_size = maxpool_kernel_size
self._maxpool_stride = maxpool_stride
# If max pooling is to be used, build the layer
if maxpool_kernel_size:
self._maxpool_layer = tf.keras.layers.MaxPooling2D(
[self._maxpool_kernel_size, self._maxpool_kernel_size],
strides=self._maxpool_stride,
name='MaxPool2D')
self._mask_rcnn_box_predictor = second_stage_mask_rcnn_box_predictor
......@@ -469,6 +621,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
if self._number_of_stages <= 0 or self._number_of_stages > 3:
raise ValueError('Number of stages should be a value in {1, 2, 3}.')
self._batched_prediction_tensor_names = []
@property
def first_stage_feature_extractor_scope(self):
......@@ -510,6 +663,13 @@ class FasterRCNNMetaArch(model.DetectionModel):
raise RuntimeError('anchors should be a BoxList object, but is not.')
return self._anchors
@property
def batched_prediction_tensor_names(self):
if not self._batched_prediction_tensor_names:
raise RuntimeError('Must call predict() method to get batched prediction '
'tensor names.')
return self._batched_prediction_tensor_names
def preprocess(self, inputs):
"""Feature-extractor specific preprocessing.
......@@ -567,6 +727,17 @@ class FasterRCNNMetaArch(model.DetectionModel):
clip_heights, clip_widths], axis=1))
return clip_window
def _proposal_postprocess(self, rpn_box_encodings,
rpn_objectness_predictions_with_background, anchors,
image_shape, true_image_shapes):
"""Wraps over FasterRCNNMetaArch._postprocess_rpn()."""
image_shape_2d = self._image_batch_shape_2d(image_shape)
proposal_boxes_normalized, _, num_proposals, _, _ = \
self._postprocess_rpn(
rpn_box_encodings, rpn_objectness_predictions_with_background,
anchors, image_shape_2d, true_image_shapes)
return proposal_boxes_normalized, num_proposals
def predict(self, preprocessed_inputs, true_image_shapes):
"""Predicts unpostprocessed tensors from input tensor.
......@@ -643,6 +814,60 @@ class FasterRCNNMetaArch(model.DetectionModel):
Raises:
ValueError: If `predict` is called before `preprocess`.
"""
prediction_dict = self._predict_first_stage(preprocessed_inputs)
if self._number_of_stages >= 2:
# If mixed-precision training on TPU is enabled, rpn_box_encodings and
# rpn_objectness_predictions_with_background are bfloat16 tensors.
# Being prediction results, they need to be cast to float32
# tensors for correct postprocess_rpn computation in predict_second_stage.
prediction_dict.update(
self._predict_second_stage(
tf.to_float(prediction_dict['rpn_box_encodings']),
tf.to_float(
prediction_dict['rpn_objectness_predictions_with_background']
), prediction_dict['rpn_features_to_crop'],
prediction_dict['anchors'], prediction_dict['image_shape'],
true_image_shapes))
if self._number_of_stages == 3:
prediction_dict = self._predict_third_stage(prediction_dict,
true_image_shapes)
self._batched_prediction_tensor_names = [
x for x in prediction_dict if x not in ('image_shape', 'anchors')
]
return prediction_dict
def _predict_first_stage(self, preprocessed_inputs):
"""First stage of prediction.
Args:
preprocessed_inputs: a [batch, height, width, channels] float tensor
representing a batch of images.
Returns:
prediction_dict: a dictionary holding "raw" prediction tensors:
1) rpn_box_predictor_features: A 4-D float32 tensor with shape
[batch_size, height, width, depth] to be used for predicting proposal
boxes and corresponding objectness scores.
2) rpn_features_to_crop: A 4-D float32 tensor with shape
[batch_size, height, width, depth] representing image features to crop
using the proposal boxes predicted by the RPN.
3) image_shape: a 1-D tensor of shape [4] representing the input
image shape.
4) rpn_box_encodings: 3-D float tensor of shape
[batch_size, num_anchors, self._box_coder.code_size] containing
predicted boxes.
5) rpn_objectness_predictions_with_background: 3-D float tensor of shape
[batch_size, num_anchors, 2] containing class
predictions (logits) for each of the anchors. Note that this
tensor *includes* background class predictions (at class index 0).
6) anchors: A 2-D tensor of shape [num_anchors, 4] representing anchors
for the first stage RPN (in absolute coordinates). Note that
`num_anchors` can differ depending on whether the model is created in
training or inference mode.
"""
(rpn_box_predictor_features, rpn_features_to_crop, anchors_boxlist,
image_shape) = self._extract_rpn_feature_maps(preprocessed_inputs)
(rpn_box_encodings, rpn_objectness_predictions_with_background
......@@ -667,30 +892,19 @@ class FasterRCNNMetaArch(model.DetectionModel):
self._anchors = anchors_boxlist
prediction_dict = {
'rpn_box_predictor_features': rpn_box_predictor_features,
'rpn_features_to_crop': rpn_features_to_crop,
'image_shape': image_shape,
'rpn_box_encodings': rpn_box_encodings,
'rpn_box_predictor_features':
rpn_box_predictor_features,
'rpn_features_to_crop':
rpn_features_to_crop,
'image_shape':
image_shape,
'rpn_box_encodings':
rpn_box_encodings,
'rpn_objectness_predictions_with_background':
rpn_objectness_predictions_with_background,
'anchors': self._anchors.get()
rpn_objectness_predictions_with_background,
'anchors':
anchors_boxlist.data['boxes'],
}
if self._number_of_stages >= 2:
# If mixed-precision training on TPU is enabled, rpn_box_encodings and
# rpn_objectness_predictions_with_background are bfloat16 tensors.
# Considered prediction results, they need to be casted to float32
# tensors for correct postprocess_rpn computation in predict_second_stage.
prediction_dict.update(self._predict_second_stage(
tf.to_float(rpn_box_encodings),
tf.to_float(rpn_objectness_predictions_with_background),
rpn_features_to_crop,
self._anchors.get(), image_shape, true_image_shapes))
if self._number_of_stages == 3:
prediction_dict = self._predict_third_stage(
prediction_dict, true_image_shapes)
return prediction_dict
def _image_batch_shape_2d(self, image_batch_shape_1d):
......@@ -769,11 +983,55 @@ class FasterRCNNMetaArch(model.DetectionModel):
6) box_classifier_features: a 4-D float32 or bfloat16 tensor
representing the features for each proposal.
"""
image_shape_2d = self._image_batch_shape_2d(image_shape)
proposal_boxes_normalized, _, num_proposals, _, _ = self._postprocess_rpn(
rpn_box_encodings, rpn_objectness_predictions_with_background,
anchors, image_shape_2d, true_image_shapes)
proposal_boxes_normalized, num_proposals = self._proposal_postprocess(
rpn_box_encodings, rpn_objectness_predictions_with_background, anchors,
image_shape, true_image_shapes)
prediction_dict = self._box_prediction(rpn_features_to_crop,
proposal_boxes_normalized,
image_shape)
prediction_dict['num_proposals'] = num_proposals
return prediction_dict
def _box_prediction(self, rpn_features_to_crop, proposal_boxes_normalized,
image_shape):
"""Predicts the output tensors from second stage of Faster R-CNN.
Args:
rpn_features_to_crop: A 4-D float32 or bfloat16 tensor with shape
[batch_size, height, width, depth] representing image features to crop
using the proposal boxes predicted by the RPN.
proposal_boxes_normalized: A float tensor with shape [batch_size,
max_num_proposals, 4] representing the (potentially zero padded)
proposal boxes for all images in the batch. These boxes are represented
as normalized coordinates.
image_shape: A 1D int32 tensors of size [4] containing the image shape.
Returns:
prediction_dict: a dictionary holding "raw" prediction tensors:
1) refined_box_encodings: a 3-D tensor with shape
[total_num_proposals, num_classes, self._box_coder.code_size]
representing predicted (final) refined box encodings, where
total_num_proposals=batch_size*self._max_num_proposals. If using a
shared box across classes the shape will instead be
[total_num_proposals, 1, self._box_coder.code_size].
2) class_predictions_with_background: a 3-D tensor with shape
[total_num_proposals, num_classes + 1] containing class
predictions (logits) for each of the anchors, where
total_num_proposals=batch_size*self._max_num_proposals.
Note that this tensor *includes* background class predictions
(at class index 0).
3) proposal_boxes: A float32 tensor of shape
[batch_size, self.max_num_proposals, 4] representing
decoded proposal bounding boxes in absolute coordinates.
4) proposal_boxes_normalized: A float32 tensor of shape
[batch_size, self.max_num_proposals, 4] representing decoded proposal
bounding boxes in normalized coordinates. Can be used to override the
boxes proposed by the RPN, thus enabling one to extract features and
get box classification and prediction for externally selected areas
of the image.
5) box_classifier_features: a 4-D float32 or bfloat16 tensor
representing the features for each proposal.
"""
# If mixed-precision training on TPU is enabled, the dtype of
# rpn_features_to_crop is bfloat16, otherwise it is float32. tf.cast is
# used to match the dtype of proposal_boxes_normalized to that of
......@@ -783,10 +1041,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
rpn_features_to_crop,
tf.cast(proposal_boxes_normalized, rpn_features_to_crop.dtype)))
box_classifier_features = (
self._feature_extractor.extract_box_classifier_features(
flattened_proposal_feature_maps,
scope=self.second_stage_feature_extractor_scope))
box_classifier_features = self._extract_box_classifier_features(
flattened_proposal_feature_maps)
if self._mask_rcnn_box_predictor.is_keras_model:
box_predictions = self._mask_rcnn_box_predictor(
......@@ -813,7 +1069,6 @@ class FasterRCNNMetaArch(model.DetectionModel):
'refined_box_encodings': refined_box_encodings,
'class_predictions_with_background':
class_predictions_with_background,
'num_proposals': num_proposals,
'proposal_boxes': absolute_proposal_boxes,
'box_classifier_features': box_classifier_features,
'proposal_boxes_normalized': proposal_boxes_normalized,
......@@ -821,6 +1076,24 @@ class FasterRCNNMetaArch(model.DetectionModel):
return prediction_dict
def _extract_box_classifier_features(self, flattened_feature_maps):
if self._feature_extractor_for_box_classifier_features == (
_UNINITIALIZED_FEATURE_EXTRACTOR):
self._feature_extractor_for_box_classifier_features = (
self._feature_extractor.get_box_classifier_feature_extractor_model(
name=self.second_stage_feature_extractor_scope))
if self._feature_extractor_for_box_classifier_features:
box_classifier_features = (
self._feature_extractor_for_box_classifier_features(
flattened_feature_maps))
else:
box_classifier_features = (
self._feature_extractor.extract_box_classifier_features(
flattened_feature_maps,
scope=self.second_stage_feature_extractor_scope))
return box_classifier_features
def _predict_third_stage(self, prediction_dict, image_shapes):
"""Predicts non-box, non-class outputs using refined detections.
......@@ -896,10 +1169,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
flattened_detected_feature_maps = (
self._compute_second_stage_input_feature_maps(
rpn_features_to_crop, detection_boxes))
curr_box_classifier_features = (
self._feature_extractor.extract_box_classifier_features(
flattened_detected_feature_maps,
scope=self.second_stage_feature_extractor_scope))
curr_box_classifier_features = self._extract_box_classifier_features(
flattened_detected_feature_maps)
if self._mask_rcnn_box_predictor.is_keras_model:
mask_predictions = self._mask_rcnn_box_predictor(
......@@ -972,29 +1243,35 @@ class FasterRCNNMetaArch(model.DetectionModel):
"""
image_shape = tf.shape(preprocessed_inputs)
rpn_features_to_crop, self.endpoints = (
self._feature_extractor.extract_proposal_features(
preprocessed_inputs,
scope=self.first_stage_feature_extractor_scope))
rpn_features_to_crop, self.endpoints = self._extract_proposal_features(
preprocessed_inputs)
feature_map_shape = tf.shape(rpn_features_to_crop)
anchors = box_list_ops.concatenate(
self._first_stage_anchor_generator.generate([(feature_map_shape[1],
feature_map_shape[2])]))
with slim.arg_scope(self._first_stage_box_predictor_arg_scope_fn()):
kernel_size = self._first_stage_box_predictor_kernel_size
reuse = tf.get_variable_scope().reuse
rpn_box_predictor_features = slim.conv2d(
rpn_features_to_crop,
self._first_stage_box_predictor_depth,
kernel_size=[kernel_size, kernel_size],
rate=self._first_stage_atrous_rate,
activation_fn=tf.nn.relu6,
scope='Conv',
reuse=reuse)
rpn_box_predictor_features = (
self._first_stage_box_predictor_first_conv(rpn_features_to_crop))
return (rpn_box_predictor_features, rpn_features_to_crop,
anchors, image_shape)
def _extract_proposal_features(self, preprocessed_inputs):
if self._feature_extractor_for_proposal_features == (
_UNINITIALIZED_FEATURE_EXTRACTOR):
self._feature_extractor_for_proposal_features = (
self._feature_extractor.get_proposal_feature_extractor_model(
name=self.first_stage_feature_extractor_scope))
if self._feature_extractor_for_proposal_features:
proposal_features = (
self._feature_extractor_for_proposal_features(preprocessed_inputs),
{})
else:
proposal_features = (
self._feature_extractor.extract_proposal_features(
preprocessed_inputs,
scope=self.first_stage_feature_extractor_scope))
return proposal_features
def _predict_rpn_proposals(self, rpn_box_predictor_features):
"""Adds box predictors to RPN feature map to predict proposals.
......@@ -1141,6 +1418,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
detection_classes: [batch, max_detections]
(this entry is only created if rpn_mode=False)
num_detections: [batch]
raw_detection_boxes: [batch, max_detections, 4]
raw_detection_scores: [batch, max_detections, num_classes + 1]
Raises:
ValueError: If `predict` is called before `preprocess`.
......@@ -1210,10 +1489,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
flattened_detected_feature_maps = (
self._compute_second_stage_input_feature_maps(
rpn_features_to_crop, detection_boxes))
detection_features_unpooled = (
self._feature_extractor.extract_box_classifier_features(
flattened_detected_feature_maps,
scope=self.second_stage_feature_extractor_scope))
detection_features_unpooled = self._extract_box_classifier_features(
flattened_detected_feature_maps)
batch_size = tf.shape(detection_boxes)[0]
max_detection = tf.shape(detection_boxes)[1]
......@@ -1273,9 +1550,9 @@ class FasterRCNNMetaArch(model.DetectionModel):
the batch.
raw_detection_boxes: [batch, total_detections, 4] tensor with decoded
proposal boxes before Non-Max Suppression.
raw_detection_score: [batch, total_detections,
num_classes_with_background] tensor of class score logits for
raw proposal boxes.
raw_detection_scores: [batch, total_detections,
num_classes_with_background] tensor of multi-class scores for raw
proposal boxes.
"""
rpn_box_encodings_batch = tf.expand_dims(rpn_box_encodings_batch, axis=2)
rpn_encodings_shape = shape_utils.combined_static_and_dynamic_shape(
......@@ -1285,8 +1562,9 @@ class FasterRCNNMetaArch(model.DetectionModel):
proposal_boxes = self._batch_decode_boxes(rpn_box_encodings_batch,
tiled_anchor_boxes)
raw_proposal_boxes = tf.squeeze(proposal_boxes, axis=2)
rpn_objectness_softmax_without_background = tf.nn.softmax(
rpn_objectness_predictions_with_background_batch)[:, :, 1]
rpn_objectness_softmax = tf.nn.softmax(
rpn_objectness_predictions_with_background_batch)
rpn_objectness_softmax_without_background = rpn_objectness_softmax[:, :, 1]
clip_window = self._compute_clip_window(image_shapes)
(proposal_boxes, proposal_scores, _, _, _,
num_proposals) = self._first_stage_nms_fn(
......@@ -1319,8 +1597,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
elems=[raw_proposal_boxes, image_shapes],
dtype=tf.float32)
return (normalized_proposal_boxes, proposal_scores, num_proposals,
raw_normalized_proposal_boxes,
rpn_objectness_predictions_with_background_batch)
raw_normalized_proposal_boxes, rpn_objectness_softmax)
def _sample_box_classifier_batch(
self,
......@@ -1547,10 +1824,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
self._crop_and_resize_fn(
features_to_crop, proposal_boxes_normalized,
[self._initial_crop_size, self._initial_crop_size]))
return slim.max_pool2d(
cropped_regions,
[self._maxpool_kernel_size, self._maxpool_kernel_size],
stride=self._maxpool_stride)
return self._maxpool_layer(cropped_regions)
def _postprocess_box_classifier(self,
refined_box_encodings,
......@@ -1567,7 +1841,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
representing predicted (final) refined box encodings. If using a shared
box across classes the shape will instead be
[total_num_padded_proposals, 1, 4]
class_predictions_with_background: a 3-D tensor float with shape
class_predictions_with_background: a 2-D tensor float with shape
[total_num_padded_proposals, num_classes + 1] containing class
predictions (logits) for each of the proposals. Note that this tensor
*includes* background class predictions (at class index 0).
......@@ -1594,8 +1868,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
masks.
`raw_detection_boxes`: [batch, total_detections, 4] tensor with decoded
detection boxes before Non-Max Suppression.
`raw_detection_score`: [batch, total_detections,
num_classes_with_background] tensor of multi-class score logits for
`raw_detection_scores`: [batch, total_detections,
num_classes_with_background] tensor of multi-class scores for
raw detection boxes.
"""
refined_box_encodings_batch = tf.reshape(
......@@ -1679,7 +1953,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
fields.DetectionResultFields.raw_detection_boxes:
raw_normalized_detection_boxes,
fields.DetectionResultFields.raw_detection_scores:
class_predictions_with_background_batch
class_predictions_with_background_batch_normalized
}
if nmsed_masks is not None:
detections[fields.DetectionResultFields.detection_masks] = nmsed_masks
......@@ -2072,7 +2346,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
# Use normalized proposals to crop mask targets from image masks.
flat_normalized_proposals = box_list_ops.to_normalized_coordinates(
box_list.BoxList(tf.reshape(proposal_boxes, [-1, 4])),
image_shape[1], image_shape[2]).get()
image_shape[1], image_shape[2], check_range=False).get()
flat_cropped_gt_mask = self._crop_and_resize_fn(
tf.expand_dims(flat_gt_masks, -1),
......@@ -2261,7 +2535,32 @@ class FasterRCNNMetaArch(model.DetectionModel):
Returns:
A list of regularization loss tensors.
"""
return tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
all_losses = []
slim_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
# Copy the slim losses to avoid modifying the collection
if slim_losses:
all_losses.extend(slim_losses)
# TODO(kaftan): Possibly raise an error if the feature extractors are
# uninitialized in Keras.
if self._feature_extractor_for_proposal_features:
if (self._feature_extractor_for_proposal_features !=
_UNINITIALIZED_FEATURE_EXTRACTOR):
all_losses.extend(self._feature_extractor_for_proposal_features.losses)
if isinstance(self._first_stage_box_predictor_first_conv,
tf.keras.Model):
all_losses.extend(
self._first_stage_box_predictor_first_conv.losses)
if self._first_stage_box_predictor.is_keras_model:
all_losses.extend(self._first_stage_box_predictor.losses)
if self._feature_extractor_for_box_classifier_features:
if (self._feature_extractor_for_box_classifier_features !=
_UNINITIALIZED_FEATURE_EXTRACTOR):
all_losses.extend(
self._feature_extractor_for_box_classifier_features.losses)
if self._mask_rcnn_box_predictor:
if self._mask_rcnn_box_predictor.is_keras_model:
all_losses.extend(self._mask_rcnn_box_predictor.losses)
return all_losses
def restore_map(self,
fine_tune_checkpoint_type='detection',
......@@ -2318,4 +2617,55 @@ class FasterRCNNMetaArch(model.DetectionModel):
Returns:
A list of update operators.
"""
return tf.get_collection(tf.GraphKeys.UPDATE_OPS)
update_ops = []
slim_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
# Copy the slim ops to avoid modifying the collection
if slim_update_ops:
update_ops.extend(slim_update_ops)
# Passing None to get_updates_for grabs updates that should always be
# executed and don't depend on any model inputs in the graph.
# (E.g. if there was some count that should be incremented every time a
# model is run).
#
# Passing inputs grabs updates that are transitively computed from the
# model inputs being passed in.
# (E.g. a batchnorm update depends on the observed inputs)
if self._feature_extractor_for_proposal_features:
if (self._feature_extractor_for_proposal_features !=
_UNINITIALIZED_FEATURE_EXTRACTOR):
update_ops.extend(
self._feature_extractor_for_proposal_features.get_updates_for(None))
update_ops.extend(
self._feature_extractor_for_proposal_features.get_updates_for(
self._feature_extractor_for_proposal_features.inputs))
if isinstance(self._first_stage_box_predictor_first_conv,
tf.keras.Model):
update_ops.extend(
self._first_stage_box_predictor_first_conv.get_updates_for(
None))
update_ops.extend(
self._first_stage_box_predictor_first_conv.get_updates_for(
self._first_stage_box_predictor_first_conv.inputs))
if self._first_stage_box_predictor.is_keras_model:
update_ops.extend(
self._first_stage_box_predictor.get_updates_for(None))
update_ops.extend(
self._first_stage_box_predictor.get_updates_for(
self._first_stage_box_predictor.inputs))
if self._feature_extractor_for_box_classifier_features:
if (self._feature_extractor_for_box_classifier_features !=
_UNINITIALIZED_FEATURE_EXTRACTOR):
update_ops.extend(
self._feature_extractor_for_box_classifier_features.get_updates_for(
None))
update_ops.extend(
self._feature_extractor_for_box_classifier_features.get_updates_for(
self._feature_extractor_for_box_classifier_features.inputs))
if self._mask_rcnn_box_predictor:
if self._mask_rcnn_box_predictor.is_keras_model:
update_ops.extend(
self._mask_rcnn_box_predictor.get_updates_for(None))
update_ops.extend(
self._mask_rcnn_box_predictor.get_updates_for(
self._mask_rcnn_box_predictor.inputs))
return update_ops
......@@ -26,9 +26,15 @@ class FasterRCNNMetaArchTest(
faster_rcnn_meta_arch_test_lib.FasterRCNNMetaArchTestBase,
parameterized.TestCase):
def test_postprocess_second_stage_only_inference_mode_with_masks(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_postprocess_second_stage_only_inference_mode_with_masks(
self, use_keras=False):
model = self._build_model(
is_training=False, number_of_stages=2, second_stage_batch_size=6)
is_training=False, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6)
batch_size = 2
total_num_padded_proposals = batch_size * model.max_num_proposals
......@@ -85,9 +91,15 @@ class FasterRCNNMetaArchTest(
self.assertTrue(np.amax(detections_out['detection_masks'] <= 1.0))
self.assertTrue(np.amin(detections_out['detection_masks'] >= 0.0))
def test_postprocess_second_stage_only_inference_mode_with_calibration(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_postprocess_second_stage_only_inference_mode_with_calibration(
self, use_keras=False):
model = self._build_model(
is_training=False, number_of_stages=2, second_stage_batch_size=6,
is_training=False, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6,
calibration_mapping_value=0.5)
batch_size = 2
......@@ -147,9 +159,15 @@ class FasterRCNNMetaArchTest(
self.assertTrue(np.amax(detections_out['detection_masks'] <= 1.0))
self.assertTrue(np.amin(detections_out['detection_masks'] >= 0.0))
def test_postprocess_second_stage_only_inference_mode_with_shared_boxes(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_postprocess_second_stage_only_inference_mode_with_shared_boxes(
self, use_keras=False):
model = self._build_model(
is_training=False, number_of_stages=2, second_stage_batch_size=6)
is_training=False, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6)
batch_size = 2
total_num_padded_proposals = batch_size * model.max_num_proposals
......@@ -188,11 +206,13 @@ class FasterRCNNMetaArchTest(
self.assertAllClose(detections_out['num_detections'], [5, 4])
@parameterized.parameters(
{'masks_are_class_agnostic': False},
{'masks_are_class_agnostic': True},
{'masks_are_class_agnostic': False, 'use_keras': True},
{'masks_are_class_agnostic': True, 'use_keras': True},
{'masks_are_class_agnostic': False, 'use_keras': False},
{'masks_are_class_agnostic': True, 'use_keras': False},
)
def test_predict_correct_shapes_in_inference_mode_three_stages_with_masks(
self, masks_are_class_agnostic):
self, masks_are_class_agnostic, use_keras):
batch_size = 2
image_size = 10
max_num_proposals = 8
......@@ -232,6 +252,7 @@ class FasterRCNNMetaArchTest(
with test_graph.as_default():
model = self._build_model(
is_training=False,
use_keras=use_keras,
number_of_stages=3,
second_stage_batch_size=2,
predict_masks=True,
......@@ -267,15 +288,18 @@ class FasterRCNNMetaArchTest(
[10, num_classes, 14, 14])
@parameterized.parameters(
{'masks_are_class_agnostic': False},
{'masks_are_class_agnostic': True},
{'masks_are_class_agnostic': False, 'use_keras': True},
{'masks_are_class_agnostic': True, 'use_keras': True},
{'masks_are_class_agnostic': False, 'use_keras': False},
{'masks_are_class_agnostic': True, 'use_keras': False},
)
def test_predict_gives_correct_shapes_in_train_mode_both_stages_with_masks(
self, masks_are_class_agnostic):
self, masks_are_class_agnostic, use_keras):
test_graph = tf.Graph()
with test_graph.as_default():
model = self._build_model(
is_training=True,
use_keras=use_keras,
number_of_stages=3,
second_stage_batch_size=7,
predict_masks=True,
......@@ -348,7 +372,11 @@ class FasterRCNNMetaArchTest(
tensor_dict_out['rpn_objectness_predictions_with_background'].shape,
(2, num_anchors_out, 2))
def test_postprocess_third_stage_only_inference_mode(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_postprocess_third_stage_only_inference_mode(self, use_keras=False):
num_proposals_shapes = [(2), (None)]
refined_box_encodings_shapes = [(16, 2, 4), (None, 2, 4)]
class_predictions_with_background_shapes = [(16, 3), (None, 3)]
......@@ -364,7 +392,7 @@ class FasterRCNNMetaArchTest(
tf_graph = tf.Graph()
with tf_graph.as_default():
model = self._build_model(
is_training=False, number_of_stages=3,
is_training=False, use_keras=use_keras, number_of_stages=3,
second_stage_batch_size=6, predict_masks=True)
total_num_padded_proposals = batch_size * model.max_num_proposals
proposal_boxes = np.array(
......
......@@ -43,7 +43,7 @@ BOX_CODE_SIZE = 4
class FakeFasterRCNNFeatureExtractor(
faster_rcnn_meta_arch.FasterRCNNFeatureExtractor):
"""Fake feature extracture to use in tests."""
"""Fake feature extractor to use in tests."""
def __init__(self):
super(FakeFasterRCNNFeatureExtractor, self).__init__(
......@@ -67,6 +67,42 @@ class FakeFasterRCNNFeatureExtractor(
num_outputs=3, kernel_size=1, scope='layer2')
class FakeFasterRCNNKerasFeatureExtractor(
faster_rcnn_meta_arch.FasterRCNNKerasFeatureExtractor):
"""Fake feature extractor to use in tests."""
def __init__(self):
super(FakeFasterRCNNKerasFeatureExtractor, self).__init__(
is_training=False,
first_stage_features_stride=32,
weight_decay=0.0)
def preprocess(self, resized_inputs):
return tf.identity(resized_inputs)
def get_proposal_feature_extractor_model(self, name):
class ProposalFeatureExtractor(tf.keras.Model):
"""Dummy proposal feature extraction."""
def __init__(self, name):
super(ProposalFeatureExtractor, self).__init__(name=name)
self.conv = None
def build(self, input_shape):
self.conv = tf.keras.layers.Conv2D(
3, kernel_size=1, padding='SAME', name='layer1')
def call(self, inputs):
return self.conv(inputs)
return ProposalFeatureExtractor(name=name)
def get_box_classifier_feature_extractor_model(self, name):
return tf.keras.Sequential([tf.keras.layers.Conv2D(
3, kernel_size=1, padding='SAME', name=name + '_layer2')])
class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
"""Base class to test Faster R-CNN and R-FCN meta architectures."""
......@@ -77,6 +113,11 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
text_format.Merge(hyperparams_text_proto, hyperparams)
return hyperparams_builder.build(hyperparams, is_training=is_training)
def _build_keras_layer_hyperparams(self, hyperparams_text_proto):
hyperparams = hyperparams_pb2.Hyperparams()
text_format.Merge(hyperparams_text_proto, hyperparams)
return hyperparams_builder.KerasLayerHyperparams(hyperparams)
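# A hedged example of the kind of text proto this helper parses; the field
# names come from hyperparams.proto and the values here are illustrative.
example_hyperparams_text_proto = """
  op: CONV
  regularizer { l2_regularizer { weight: 0.0004 } }
  initializer { truncated_normal_initializer { stddev: 0.03 } }
  activation: RELU_6
"""
# keras_hyperparams = self._build_keras_layer_hyperparams(
#     example_hyperparams_text_proto)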
def _get_second_stage_box_predictor_text_proto(self):
box_predictor_text_proto = """
mask_rcnn_box_predictor {
......@@ -127,7 +168,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
return box_predictor_text_proto
def _get_second_stage_box_predictor(self, num_classes, is_training,
predict_masks, masks_are_class_agnostic):
predict_masks, masks_are_class_agnostic,
use_keras=False):
box_predictor_proto = box_predictor_pb2.BoxPredictor()
text_format.Merge(self._get_second_stage_box_predictor_text_proto(),
box_predictor_proto)
......@@ -137,13 +179,23 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
masks_are_class_agnostic),
box_predictor_proto)
return box_predictor_builder.build(
hyperparams_builder.build,
box_predictor_proto,
num_classes=num_classes,
is_training=is_training)
if use_keras:
return box_predictor_builder.build_keras(
hyperparams_builder.KerasLayerHyperparams,
inplace_batchnorm_update=False,
freeze_batchnorm=False,
box_predictor_config=box_predictor_proto,
num_classes=num_classes,
num_predictions_per_location_list=None,
is_training=is_training)
else:
return box_predictor_builder.build(
hyperparams_builder.build,
box_predictor_proto,
num_classes=num_classes,
is_training=is_training)
def _get_model(self, box_predictor, **common_kwargs):
def _get_model(self, box_predictor, keras_model=False, **common_kwargs):
return faster_rcnn_meta_arch.FasterRCNNMetaArch(
initial_crop_size=3,
maxpool_kernel_size=1,
......@@ -155,6 +207,7 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
is_training,
number_of_stages,
second_stage_batch_size,
use_keras=False,
first_stage_max_proposals=8,
num_classes=2,
hard_mining=False,
......@@ -204,7 +257,10 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
'proposal',
use_matmul_gather=use_matmul_gather_in_matcher)
fake_feature_extractor = FakeFasterRCNNFeatureExtractor()
if use_keras:
fake_feature_extractor = FakeFasterRCNNKerasFeatureExtractor()
else:
fake_feature_extractor = FakeFasterRCNNFeatureExtractor()
first_stage_box_predictor_hyperparams_text_proto = """
op: CONV
......@@ -220,9 +276,14 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
}
}
"""
first_stage_box_predictor_arg_scope_fn = (
self._build_arg_scope_with_hyperparams(
first_stage_box_predictor_hyperparams_text_proto, is_training))
if use_keras:
first_stage_box_predictor_arg_scope_fn = (
self._build_keras_layer_hyperparams(
first_stage_box_predictor_hyperparams_text_proto))
else:
first_stage_box_predictor_arg_scope_fn = (
self._build_arg_scope_with_hyperparams(
first_stage_box_predictor_hyperparams_text_proto, is_training))
first_stage_box_predictor_kernel_size = 3
first_stage_atrous_rate = 1
......@@ -349,15 +410,18 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self._get_second_stage_box_predictor(
num_classes=num_classes,
is_training=is_training,
use_keras=use_keras,
predict_masks=predict_masks,
masks_are_class_agnostic=masks_are_class_agnostic), **common_kwargs)
@parameterized.parameters(
{'use_static_shapes': False},
{'use_static_shapes': True}
{'use_static_shapes': False, 'use_keras': True},
{'use_static_shapes': False, 'use_keras': False},
{'use_static_shapes': True, 'use_keras': True},
{'use_static_shapes': True, 'use_keras': False},
)
def test_predict_gives_correct_shapes_in_inference_mode_first_stage_only(
self, use_static_shapes=False):
self, use_static_shapes=False, use_keras=False):
batch_size = 2
height = 10
width = 12
......@@ -367,6 +431,7 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
"""Function to construct tf graph for the test."""
model = self._build_model(
is_training=False,
use_keras=use_keras,
number_of_stages=1,
second_stage_batch_size=2,
clip_anchors_to_image=use_static_shapes,
......@@ -423,11 +488,44 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self.assertTrue(np.all(np.less_equal(anchors[:, 2], height)))
self.assertTrue(np.all(np.less_equal(anchors[:, 3], width)))
def test_predict_gives_valid_anchors_in_training_mode_first_stage_only(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_regularization_losses(
self, use_keras=False):
test_graph = tf.Graph()
with test_graph.as_default():
model = self._build_model(
is_training=True, number_of_stages=1, second_stage_batch_size=2)
is_training=True, use_keras=use_keras,
number_of_stages=1, second_stage_batch_size=2)
batch_size = 2
height = 10
width = 12
input_image_shape = (batch_size, height, width, 3)
_, true_image_shapes = model.preprocess(tf.zeros(input_image_shape))
preprocessed_inputs = tf.placeholder(
dtype=tf.float32, shape=(batch_size, None, None, 3))
model.predict(preprocessed_inputs, true_image_shapes)
reg_losses = tf.math.add_n(model.regularization_losses())
init_op = tf.global_variables_initializer()
with self.test_session(graph=test_graph) as sess:
sess.run(init_op)
self.assertGreaterEqual(sess.run(reg_losses), 0)
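# A hedged sketch (not from this commit) of how these regularization losses
# are typically folded into the training objective; `model` and `loss_dict`
# are assumed to come from a built detection model.
def _total_loss_with_regularization(model, loss_dict):
  total_loss = tf.add_n(list(loss_dict.values()), name='task_loss')
  reg_losses = model.regularization_losses()
  if reg_losses:
    total_loss += tf.add_n(reg_losses, name='regularization_loss')
  return total_loss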
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_predict_gives_valid_anchors_in_training_mode_first_stage_only(
self, use_keras=False):
test_graph = tf.Graph()
with test_graph.as_default():
model = self._build_model(
is_training=True, use_keras=use_keras,
number_of_stages=1, second_stage_batch_size=2)
batch_size = 2
height = 10
width = 12
......@@ -478,11 +576,13 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
(batch_size, num_anchors_out, 2))
@parameterized.parameters(
{'use_static_shapes': False},
{'use_static_shapes': True}
{'use_static_shapes': False, 'use_keras': True},
{'use_static_shapes': False, 'use_keras': False},
{'use_static_shapes': True, 'use_keras': True},
{'use_static_shapes': True, 'use_keras': False},
)
def test_predict_correct_shapes_in_inference_mode_two_stages(
self, use_static_shapes=False):
self, use_static_shapes=False, use_keras=False):
def compare_results(results, expected_output_shapes):
"""Checks if the shape of the predictions are as expected."""
......@@ -528,6 +628,7 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
"""Function to construct tf graph for the test."""
model = self._build_model(
is_training=False,
use_keras=use_keras,
number_of_stages=2,
second_stage_batch_size=2,
predict_masks=False,
......@@ -584,6 +685,7 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
with test_graph.as_default():
model = self._build_model(
is_training=False,
use_keras=use_keras,
number_of_stages=2,
second_stage_batch_size=2,
predict_masks=False)
......@@ -603,12 +705,15 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self.assertAllEqual(tensor_dict_out[key].shape, expected_shapes[key])
@parameterized.parameters(
{'use_static_shapes': False},
{'use_static_shapes': True}
{'use_static_shapes': False, 'use_keras': True},
{'use_static_shapes': False, 'use_keras': False},
{'use_static_shapes': True, 'use_keras': True},
{'use_static_shapes': True, 'use_keras': False},
)
def test_predict_gives_correct_shapes_in_train_mode_both_stages(
self,
use_static_shapes=False):
use_static_shapes=False,
use_keras=False):
batch_size = 2
image_size = 10
max_num_proposals = 7
......@@ -619,6 +724,7 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
"""Function to construct tf graph for the test."""
model = self._build_model(
is_training=True,
use_keras=use_keras,
number_of_stages=2,
second_stage_batch_size=7,
predict_masks=False,
......@@ -632,6 +738,7 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
groundtruth_classes_list=tf.unstack(gt_classes),
groundtruth_weights_list=tf.unstack(gt_weights))
result_tensor_dict = model.predict(preprocessed_inputs, true_image_shapes)
updates = model.updates()
return (result_tensor_dict['refined_box_encodings'],
result_tensor_dict['class_predictions_with_background'],
result_tensor_dict['proposal_boxes'],
......@@ -641,6 +748,7 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
result_tensor_dict['rpn_objectness_predictions_with_background'],
result_tensor_dict['rpn_features_to_crop'],
result_tensor_dict['rpn_box_predictor_features'],
updates
)
image_shape = (batch_size, image_size, image_size, 3)
......@@ -699,13 +807,27 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
expected_shapes['rpn_box_predictor_features'])
@parameterized.parameters(
{'use_static_shapes': False, 'pad_to_max_dimension': None},
{'use_static_shapes': True, 'pad_to_max_dimension': None},
{'use_static_shapes': False, 'pad_to_max_dimension': 56},
{'use_static_shapes': True, 'pad_to_max_dimension': 56}
{'use_static_shapes': False, 'pad_to_max_dimension': None,
'use_keras': True},
{'use_static_shapes': True, 'pad_to_max_dimension': None,
'use_keras': True},
{'use_static_shapes': False, 'pad_to_max_dimension': 56,
'use_keras': True},
{'use_static_shapes': True, 'pad_to_max_dimension': 56,
'use_keras': True},
{'use_static_shapes': False, 'pad_to_max_dimension': None,
'use_keras': False},
{'use_static_shapes': True, 'pad_to_max_dimension': None,
'use_keras': False},
{'use_static_shapes': False, 'pad_to_max_dimension': 56,
'use_keras': False},
{'use_static_shapes': True, 'pad_to_max_dimension': 56,
'use_keras': False}
)
def test_postprocess_first_stage_only_inference_mode(
self, use_static_shapes=False, pad_to_max_dimension=None):
self, use_static_shapes=False,
pad_to_max_dimension=None,
use_keras=False):
batch_size = 2
first_stage_max_proposals = 4 if use_static_shapes else 8
......@@ -716,7 +838,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
anchors):
"""Function to construct tf graph for the test."""
model = self._build_model(
is_training=False, number_of_stages=1, second_stage_batch_size=6,
is_training=False, use_keras=use_keras,
number_of_stages=1, second_stage_batch_size=6,
use_matmul_crop_and_resize=use_static_shapes,
clip_anchors_to_image=use_static_shapes,
use_static_shapes=use_static_shapes,
......@@ -779,8 +902,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
[0.5, 0., 1., 0.5], [0.5, 0.5, 1., 1.]],
[[0., 0., 0.5, 0.5], [0., 0.5, 0.5, 1.],
[0.5, 0., 1., 0.5], [0.5, 0.5, 1., 1.]]]
expected_raw_scores = [[[-10., 13.], [10., -10.], [10., -11.], [-10., 12.]],
[[10., -10.], [-10., 13.], [-10., 12.], [10., -11.]]]
expected_raw_scores = [[[0., 1.], [1., 0.], [1., 0.], [0., 1.]],
[[1., 0.], [0., 1.], [0., 1.], [1., 0.]]]
self.assertAllClose(results[0], expected_num_proposals)
for indx, num_proposals in enumerate(expected_num_proposals):
......@@ -792,9 +915,11 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self.assertAllClose(results[4], expected_raw_scores)
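# Aside: the expected raw scores changed from logits to values in [0, 1]
# because raw_detection_scores now pass through the score conversion
# function (softmax here). A quick numeric check (np comes from this
# file's imports):
logits = np.array([-10., 13.])
softmax = np.exp(logits) / np.sum(np.exp(logits))  # approximately [0., 1.]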
def _test_postprocess_first_stage_only_train_mode(self,
use_keras=False,
pad_to_max_dimension=None):
model = self._build_model(
is_training=True, number_of_stages=1, second_stage_batch_size=2,
is_training=True, use_keras=use_keras,
number_of_stages=1, second_stage_batch_size=2,
pad_to_max_dimension=pad_to_max_dimension)
batch_size = 2
anchors = tf.constant(
......@@ -847,9 +972,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
[0.5, 0., 1., 0.5], [0.5, 0.5, 1., 1.]],
[[0., 0., 0.5, 0.5], [0., 0.5, 0.5, 1.],
[0.5, 0., 1., 0.5], [0.5, 0.5, 1., 1.]]]
expected_raw_scores = [[[-10., 13.], [-10., 12.], [-10., 11.], [-10., 10.]],
[[-10., 13.], [-10., 12.], [-10., 11.], [-10., 10.]]]
expected_raw_scores = [[[0., 1.], [0., 1.], [0., 1.], [0., 1.]],
[[0., 1.], [0., 1.], [0., 1.], [0., 1.]]]
expected_output_keys = set([
'detection_boxes', 'detection_scores', 'num_detections',
'raw_detection_boxes', 'raw_detection_scores'
......@@ -872,20 +996,44 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self.assertAllClose(proposals_out['raw_detection_scores'],
expected_raw_scores)
def test_postprocess_first_stage_only_train_mode(self):
self._test_postprocess_first_stage_only_train_mode()
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_postprocess_first_stage_only_train_mode(self, use_keras=False):
self._test_postprocess_first_stage_only_train_mode(use_keras=use_keras)
def test_postprocess_first_stage_only_train_mode_padded_image(self):
self._test_postprocess_first_stage_only_train_mode(pad_to_max_dimension=56)
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_postprocess_first_stage_only_train_mode_padded_image(
self, use_keras=False):
self._test_postprocess_first_stage_only_train_mode(pad_to_max_dimension=56,
use_keras=use_keras)
@parameterized.parameters(
{'use_static_shapes': False, 'pad_to_max_dimension': None},
{'use_static_shapes': True, 'pad_to_max_dimension': None},
{'use_static_shapes': False, 'pad_to_max_dimension': 56},
{'use_static_shapes': True, 'pad_to_max_dimension': 56}
{'use_static_shapes': False, 'pad_to_max_dimension': None,
'use_keras': True},
{'use_static_shapes': True, 'pad_to_max_dimension': None,
'use_keras': True},
{'use_static_shapes': False, 'pad_to_max_dimension': 56,
'use_keras': True},
{'use_static_shapes': True, 'pad_to_max_dimension': 56,
'use_keras': True},
{'use_static_shapes': False, 'pad_to_max_dimension': None,
'use_keras': False},
{'use_static_shapes': True, 'pad_to_max_dimension': None,
'use_keras': False},
{'use_static_shapes': False, 'pad_to_max_dimension': 56,
'use_keras': False},
{'use_static_shapes': True, 'pad_to_max_dimension': 56,
'use_keras': False}
)
def test_postprocess_second_stage_only_inference_mode(
self, use_static_shapes=False, pad_to_max_dimension=None):
self, use_static_shapes=False,
pad_to_max_dimension=None,
use_keras=False):
batch_size = 2
num_classes = 2
image_shape = np.array((2, 36, 48, 3), dtype=np.int32)
......@@ -899,7 +1047,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
proposal_boxes):
"""Function to construct tf graph for the test."""
model = self._build_model(
is_training=False, number_of_stages=2,
is_training=False, use_keras=use_keras,
number_of_stages=2,
second_stage_batch_size=6,
use_matmul_crop_and_resize=use_static_shapes,
clip_anchors_to_image=use_static_shapes,
......@@ -972,21 +1121,31 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
if not use_static_shapes:
self.assertAllEqual(results[1].shape, [2, 5, 4])
def test_preprocess_preserves_input_shapes(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_preprocess_preserves_input_shapes(self, use_keras=False):
image_shapes = [(3, None, None, 3),
(None, 10, 10, 3),
(None, None, None, 3)]
for image_shape in image_shapes:
model = self._build_model(
is_training=False, number_of_stages=2, second_stage_batch_size=6)
is_training=False, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6)
image_placeholder = tf.placeholder(tf.float32, shape=image_shape)
preprocessed_inputs, _ = model.preprocess(image_placeholder)
self.assertAllEqual(preprocessed_inputs.shape.as_list(), image_shape)
# TODO(rathodv): Split test into two - with and without masks.
def test_loss_first_stage_only_mode(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_loss_first_stage_only_mode(self, use_keras=False):
model = self._build_model(
is_training=True, number_of_stages=1, second_stage_batch_size=6)
is_training=True, use_keras=use_keras,
number_of_stages=1, second_stage_batch_size=6)
batch_size = 2
anchors = tf.constant(
[[0, 0, 16, 16],
......@@ -1038,9 +1197,14 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
loss_dict_out)
# TODO(rathodv): Split test into two - with and without masks.
def test_loss_full(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_loss_full(self, use_keras=False):
model = self._build_model(
is_training=True, number_of_stages=2, second_stage_batch_size=6)
is_training=True, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6)
batch_size = 3
anchors = tf.constant(
[[0, 0, 16, 16],
......@@ -1153,9 +1317,14 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
'Loss/BoxClassifierLoss/classification_loss'], 0)
self.assertAllClose(loss_dict_out['Loss/BoxClassifierLoss/mask_loss'], 0)
def test_loss_full_zero_padded_proposals(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_loss_full_zero_padded_proposals(self, use_keras=False):
model = self._build_model(
is_training=True, number_of_stages=2, second_stage_batch_size=6)
is_training=True, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6)
batch_size = 1
anchors = tf.constant(
[[0, 0, 16, 16],
......@@ -1243,9 +1412,14 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
'Loss/BoxClassifierLoss/classification_loss'], 0)
self.assertAllClose(loss_dict_out['Loss/BoxClassifierLoss/mask_loss'], 0)
def test_loss_full_multiple_label_groundtruth(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_loss_full_multiple_label_groundtruth(self, use_keras=False):
model = self._build_model(
is_training=True, number_of_stages=2, second_stage_batch_size=6,
is_training=True, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6,
softmax_second_stage_classification_loss=False)
batch_size = 1
anchors = tf.constant(
......@@ -1342,14 +1516,17 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self.assertAllClose(loss_dict_out['Loss/BoxClassifierLoss/mask_loss'], 0)
@parameterized.parameters(
{'use_static_shapes': False, 'shared_boxes': False},
{'use_static_shapes': False, 'shared_boxes': True},
{'use_static_shapes': True, 'shared_boxes': False},
{'use_static_shapes': True, 'shared_boxes': True},
{'use_static_shapes': False, 'shared_boxes': False, 'use_keras': True},
{'use_static_shapes': False, 'shared_boxes': True, 'use_keras': True},
{'use_static_shapes': True, 'shared_boxes': False, 'use_keras': True},
{'use_static_shapes': True, 'shared_boxes': True, 'use_keras': True},
{'use_static_shapes': False, 'shared_boxes': False, 'use_keras': False},
{'use_static_shapes': False, 'shared_boxes': True, 'use_keras': False},
{'use_static_shapes': True, 'shared_boxes': False, 'use_keras': False},
{'use_static_shapes': True, 'shared_boxes': True, 'use_keras': False},
)
def test_loss_full_zero_padded_proposals_nonzero_loss_with_two_images(
self, use_static_shapes=False, shared_boxes=False):
self, use_static_shapes=False, shared_boxes=False, use_keras=False):
batch_size = 2
first_stage_max_proposals = 8
second_stage_batch_size = 6
......@@ -1361,7 +1538,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
groundtruth_classes):
"""Function to construct tf graph for the test."""
model = self._build_model(
is_training=True, number_of_stages=2,
is_training=True, use_keras=use_keras,
number_of_stages=2,
second_stage_batch_size=second_stage_batch_size,
first_stage_max_proposals=first_stage_max_proposals,
num_classes=num_classes,
......@@ -1476,8 +1654,13 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self.assertAllClose(results[2], exp_loc_loss, rtol=1e-4, atol=1e-4)
self.assertAllClose(results[3], 0.0)
def test_loss_with_hard_mining(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_loss_with_hard_mining(self, use_keras=False):
model = self._build_model(is_training=True,
use_keras=use_keras,
number_of_stages=2,
second_stage_batch_size=None,
first_stage_max_proposals=6,
......@@ -1564,8 +1747,13 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self.assertAllClose(loss_dict_out[
'Loss/BoxClassifierLoss/classification_loss'], 0)
def test_loss_with_hard_mining_and_losses_mask(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_loss_with_hard_mining_and_losses_mask(self, use_keras=False):
model = self._build_model(is_training=True,
use_keras=use_keras,
number_of_stages=2,
second_stage_batch_size=None,
first_stage_max_proposals=6,
......@@ -1678,7 +1866,11 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self.assertAllClose(loss_dict_out[
'Loss/BoxClassifierLoss/classification_loss'], 0)
def test_restore_map_for_classification_ckpt(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_restore_map_for_classification_ckpt(self, use_keras=False):
# Define mock tensorflow classification graph and save variables.
test_graph_classification = tf.Graph()
with test_graph_classification.as_default():
......@@ -1699,7 +1891,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
test_graph_detection = tf.Graph()
with test_graph_detection.as_default():
model = self._build_model(
is_training=False, number_of_stages=2, second_stage_batch_size=6)
is_training=False, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6)
inputs_shape = (2, 20, 20, 3)
inputs = tf.to_float(tf.random_uniform(
......@@ -1716,12 +1909,17 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self.assertNotIn(model.first_stage_feature_extractor_scope, var)
self.assertNotIn(model.second_stage_feature_extractor_scope, var)
def test_restore_map_for_detection_ckpt(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_restore_map_for_detection_ckpt(self, use_keras=False):
# Define first detection graph and save variables.
test_graph_detection1 = tf.Graph()
with test_graph_detection1.as_default():
model = self._build_model(
is_training=False, number_of_stages=2, second_stage_batch_size=6)
is_training=False, use_keras=use_keras,
number_of_stages=2, second_stage_batch_size=6)
inputs_shape = (2, 20, 20, 3)
inputs = tf.to_float(tf.random_uniform(
inputs_shape, minval=0, maxval=255, dtype=tf.int32))
......@@ -1739,7 +1937,8 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
# Define second detection graph and restore variables.
test_graph_detection2 = tf.Graph()
with test_graph_detection2.as_default():
model2 = self._build_model(is_training=False, number_of_stages=2,
model2 = self._build_model(is_training=False, use_keras=use_keras,
number_of_stages=2,
second_stage_batch_size=6, num_classes=42)
inputs_shape2 = (2, 20, 20, 3)
......@@ -1760,11 +1959,16 @@ class FasterRCNNMetaArchTestBase(test_case.TestCase, parameterized.TestCase):
self.assertNotIn(model2.first_stage_feature_extractor_scope, var)
self.assertNotIn(model2.second_stage_feature_extractor_scope, var)
def test_load_all_det_checkpoint_vars(self):
@parameterized.parameters(
{'use_keras': True},
{'use_keras': False}
)
def test_load_all_det_checkpoint_vars(self, use_keras=False):
test_graph_detection = tf.Graph()
with test_graph_detection.as_default():
model = self._build_model(
is_training=False,
use_keras=use_keras,
number_of_stages=2,
second_stage_batch_size=6,
num_classes=42)
......
......@@ -81,7 +81,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
add_summaries=True,
clip_anchors_to_image=False,
use_static_shapes=False,
resize_masks=False):
resize_masks=False,
freeze_batchnorm=False):
"""RFCNMetaArch Constructor.
Args:
......@@ -110,9 +111,12 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
denser resolutions. The atrous rate is used to compensate for the
denser feature maps by using an effectively larger receptive field.
(This should typically be set to 1).
first_stage_box_predictor_arg_scope_fn: A function to generate tf-slim
arg_scope for conv2d, separable_conv2d and fully_connected ops for the
RPN box predictor.
first_stage_box_predictor_arg_scope_fn: Either a
Keras layer hyperparams object or a function to construct a tf-slim
arg_scope for conv2d, separable_conv2d and fully_connected ops. Used
for the RPN box predictor. If it is a Keras hyperparams object, the
RPN box predictor will be a Keras model; if it is a function that
constructs an arg scope, it will be a tf-slim box predictor.
first_stage_box_predictor_kernel_size: Kernel size to use for the
convolution op just prior to RPN box predictions.
first_stage_box_predictor_depth: Output depth for the convolution op
......@@ -180,6 +184,10 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
guarantees.
resize_masks: Indicates whether the masks present in the groundtruth
should be resized in the model with `image_resizer_fn`.
freeze_batchnorm: Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm updates and use pretrained batch norm
params.
Raises:
ValueError: If `second_stage_batch_size` > `first_stage_max_proposals`
......@@ -225,7 +233,8 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
add_summaries,
clip_anchors_to_image,
use_static_shapes,
resize_masks)
resize_masks,
freeze_batchnorm=freeze_batchnorm)
self._rfcn_box_predictor = second_stage_rfcn_box_predictor
......@@ -293,9 +302,7 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
anchors, image_shape_2d, true_image_shapes)
box_classifier_features = (
self._feature_extractor.extract_box_classifier_features(
rpn_features,
scope=self.second_stage_feature_extractor_scope))
self._extract_box_classifier_features(rpn_features))
if self._rfcn_box_predictor.is_keras_model:
box_predictions = self._rfcn_box_predictor(
......@@ -329,3 +336,37 @@ class RFCNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
'proposal_boxes_normalized': proposal_boxes_normalized,
}
return prediction_dict
def regularization_losses(self):
"""Returns a list of regularization losses for this model.
Returns a list of regularization losses for this model that the estimator
needs to use during training/optimization.
Returns:
A list of regularization loss tensors.
"""
reg_losses = super(RFCNMetaArch, self).regularization_losses()
if self._rfcn_box_predictor.is_keras_model:
reg_losses.extend(self._rfcn_box_predictor.losses)
return reg_losses
def updates(self):
"""Returns a list of update operators for this model.
Returns a list of update operators for this model that must be executed at
each training step. The estimator's train op needs to have a control
dependency on these updates.
Returns:
A list of update operators.
"""
update_ops = super(RFCNMetaArch, self).updates()
if self._rfcn_box_predictor.is_keras_model:
update_ops.extend(
self._rfcn_box_predictor.get_updates_for(None))
update_ops.extend(
self._rfcn_box_predictor.get_updates_for(
self._rfcn_box_predictor.inputs))
return update_ops
......@@ -22,6 +22,7 @@ import tensorflow as tf
from object_detection.core import box_list
from object_detection.core import box_list_ops
from object_detection.core import matcher
from object_detection.core import model
from object_detection.core import standard_fields as fields
from object_detection.core import target_assigner
......@@ -663,8 +664,8 @@ class SSDMetaArch(model.DetectionModel):
raw_detection_boxes: [batch, total_detections, 4] tensor with decoded
detection boxes before Non-Max Suppression.
raw_detection_scores: [batch, total_detections,
num_classes_with_background] tensor of multi-class score logits for
raw detection boxes.
num_classes_with_background] tensor of multi-class scores for raw
detection boxes.
Raises:
ValueError: if prediction_dict does not contain `box_encodings` or
`class_predictions_with_background` fields.
......@@ -672,17 +673,23 @@ class SSDMetaArch(model.DetectionModel):
if ('box_encodings' not in prediction_dict or
'class_predictions_with_background' not in prediction_dict):
raise ValueError('prediction_dict does not contain expected entries.')
if 'anchors' not in prediction_dict:
prediction_dict['anchors'] = self.anchors.get()
with tf.name_scope('Postprocessor'):
preprocessed_images = prediction_dict['preprocessed_inputs']
box_encodings = prediction_dict['box_encodings']
box_encodings = tf.identity(box_encodings, 'raw_box_encodings')
class_predictions = prediction_dict['class_predictions_with_background']
detection_boxes, detection_keypoints = self._batch_decode(box_encodings)
class_predictions_with_background = (
prediction_dict['class_predictions_with_background'])
detection_boxes, detection_keypoints = self._batch_decode(
box_encodings, prediction_dict['anchors'])
detection_boxes = tf.identity(detection_boxes, 'raw_box_locations')
detection_boxes = tf.expand_dims(detection_boxes, axis=2)
detection_scores = self._score_conversion_fn(class_predictions)
detection_scores = tf.identity(detection_scores, 'raw_box_scores')
detection_scores_with_background = self._score_conversion_fn(
class_predictions_with_background)
detection_scores = tf.identity(detection_scores_with_background,
'raw_box_scores')
if self._add_background_class or self._explicit_background_class:
detection_scores = tf.slice(detection_scores, [0, 0, 1], [-1, -1, -1])
additional_fields = None
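A hedged sketch of the new anchors plumbing (names as used in this file): postprocess can now run from a `prediction_dict` alone, e.g. in a TPU-exported SavedModel, falling back to the model's own anchors only when the dict lacks them:

prediction_dict = model.predict(preprocessed_inputs, true_image_shapes)
# 'anchors' may already be present in a restored prediction_dict; if not,
# postprocess inserts model.anchors.get() before _batch_decode runs.
detections = model.postprocess(prediction_dict, true_image_shapes)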
......@@ -720,7 +727,7 @@ class SSDMetaArch(model.DetectionModel):
fields.DetectionResultFields.raw_detection_boxes:
tf.squeeze(detection_boxes, axis=2),
fields.DetectionResultFields.raw_detection_scores:
class_predictions
detection_scores_with_background
}
if (nmsed_additional_fields is not None and
fields.BoxListFields.keypoints in nmsed_additional_fields):
......@@ -767,10 +774,11 @@ class SSDMetaArch(model.DetectionModel):
if self.groundtruth_has_field(fields.BoxListFields.confidences):
confidences = self.groundtruth_lists(fields.BoxListFields.confidences)
(batch_cls_targets, batch_cls_weights, batch_reg_targets,
batch_reg_weights, match_list) = self._assign_targets(
batch_reg_weights, batch_match) = self._assign_targets(
self.groundtruth_lists(fields.BoxListFields.boxes),
self.groundtruth_lists(fields.BoxListFields.classes),
keypoints, weights, confidences)
match_list = [matcher.Match(match) for match in tf.unstack(batch_match)]
if self._add_summaries:
self._summarize_target_assignment(
self.groundtruth_lists(fields.BoxListFields.boxes), match_list)
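For context, a toy reconstruction of per-image `matcher.Match` objects from the batched match tensor, mirroring the list comprehension above (values illustrative: -1 means unmatched, -2 means ignored):

batch_match = tf.constant([[0, -1, 1], [2, -2, 0]], dtype=tf.int32)
match_list = [matcher.Match(m) for m in tf.unstack(batch_match)]
# Per-image queries such as match_list[0].num_matched_columns() now work.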
......@@ -1017,25 +1025,31 @@ class SSDMetaArch(model.DetectionModel):
with rows of the Match objects corresponding to groundtruth boxes
and columns corresponding to anchors.
"""
num_boxes_per_image = tf.stack(
[tf.shape(x)[0] for x in groundtruth_boxes_list])
pos_anchors_per_image = tf.stack(
[match.num_matched_columns() for match in match_list])
neg_anchors_per_image = tf.stack(
[match.num_unmatched_columns() for match in match_list])
ignored_anchors_per_image = tf.stack(
[match.num_ignored_columns() for match in match_list])
avg_num_gt_boxes = tf.reduce_mean(tf.to_float(tf.stack(
[tf.shape(x)[0] for x in groundtruth_boxes_list])))
avg_num_matched_gt_boxes = tf.reduce_mean(tf.to_float(tf.stack(
[match.num_matched_rows() for match in match_list])))
avg_pos_anchors = tf.reduce_mean(tf.to_float(tf.stack(
[match.num_matched_columns() for match in match_list])))
avg_neg_anchors = tf.reduce_mean(tf.to_float(tf.stack(
[match.num_unmatched_columns() for match in match_list])))
avg_ignored_anchors = tf.reduce_mean(tf.to_float(tf.stack(
[match.num_ignored_columns() for match in match_list])))
# TODO(rathodv): Add a test for these summaries.
tf.summary.scalar('AvgNumGroundtruthBoxesPerImage',
tf.reduce_mean(tf.to_float(num_boxes_per_image)),
avg_num_gt_boxes,
family='TargetAssignment')
tf.summary.scalar('AvgNumGroundtruthBoxesMatchedPerImage',
avg_num_matched_gt_boxes,
family='TargetAssignment')
tf.summary.scalar('AvgNumPositiveAnchorsPerImage',
tf.reduce_mean(tf.to_float(pos_anchors_per_image)),
avg_pos_anchors,
family='TargetAssignment')
tf.summary.scalar('AvgNumNegativeAnchorsPerImage',
tf.reduce_mean(tf.to_float(neg_anchors_per_image)),
avg_neg_anchors,
family='TargetAssignment')
tf.summary.scalar('AvgNumIgnoredAnchorsPerImage',
tf.reduce_mean(tf.to_float(ignored_anchors_per_image)),
avg_ignored_anchors,
family='TargetAssignment')
def _apply_hard_mining(self, location_losses, cls_losses, prediction_dict,
......@@ -1054,6 +1068,7 @@ class SSDMetaArch(model.DetectionModel):
[batch_size, num_anchors, num_classes+1] containing class predictions
(logits) for each of the anchors. Note that this tensor *includes*
background class predictions.
3) anchors: (optional) 2-D float tensor of shape [num_anchors, 4].
match_list: a list of matcher.Match objects encoding the match between
anchors and groundtruth boxes for each image of the batch,
with rows of the Match objects corresponding to groundtruth boxes
......@@ -1069,7 +1084,10 @@ class SSDMetaArch(model.DetectionModel):
if self._add_background_class:
class_predictions = tf.slice(class_predictions, [0, 0, 1], [-1, -1, -1])
decoded_boxes, _ = self._batch_decode(prediction_dict['box_encodings'])
if 'anchors' not in prediction_dict:
prediction_dict['anchors'] = self.anchors.get()
decoded_boxes, _ = self._batch_decode(prediction_dict['box_encodings'],
prediction_dict['anchors'])
decoded_box_tensors_list = tf.unstack(decoded_boxes)
class_prediction_list = tf.unstack(class_predictions)
decoded_boxlist_list = []
......@@ -1084,12 +1102,13 @@ class SSDMetaArch(model.DetectionModel):
decoded_boxlist_list=decoded_boxlist_list,
match_list=match_list)
def _batch_decode(self, box_encodings):
def _batch_decode(self, box_encodings, anchors):
"""Decodes a batch of box encodings with respect to the anchors.
Args:
box_encodings: A float32 tensor of shape
[batch_size, num_anchors, box_code_size] containing box encodings.
anchors: A tensor of shape [num_anchors, 4].
Returns:
decoded_boxes: A float32 tensor of shape
......@@ -1101,8 +1120,7 @@ class SSDMetaArch(model.DetectionModel):
combined_shape = shape_utils.combined_static_and_dynamic_shape(
box_encodings)
batch_size = combined_shape[0]
tiled_anchor_boxes = tf.tile(
tf.expand_dims(self.anchors.get(), 0), [batch_size, 1, 1])
tiled_anchor_boxes = tf.tile(tf.expand_dims(anchors, 0), [batch_size, 1, 1])
tiled_anchors_boxlist = box_list.BoxList(
tf.reshape(tiled_anchor_boxes, [-1, 4]))
decoded_boxes = self._box_coder.decode(
......
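A quick toy check of the tiling step in `_batch_decode`: anchors of shape [num_anchors, 4] are expanded and tiled to [batch_size, num_anchors, 4] before decoding:

import tensorflow as tf

anchors = tf.constant([[0., 0., 1., 1.], [0., 0., 2., 2.]])
tiled = tf.tile(tf.expand_dims(anchors, 0), [3, 1, 1])  # batch_size = 3
# tiled has static shape (3, 2, 4).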
......@@ -311,8 +311,8 @@ class SsdMetaArchTest(ssd_meta_arch_test_lib.SSDMetaArchTestBase,
[0.5, 0., 1., 0.5], [1., 1., 1.5, 1.5]],
[[0., 0., 0.5, 0.5], [0., 0.5, 0.5, 1.],
[0.5, 0., 1., 0.5], [1., 1., 1.5, 1.5]]]
raw_detection_scores = [[[0, 0], [0, 0], [0, 0], [0, 0]],
[[0, 0], [0, 0], [0, 0], [0, 0]]]
raw_detection_scores = [[[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
[[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]]
for input_shape in input_shapes:
tf_graph = tf.Graph()
......
......@@ -197,6 +197,7 @@ class CocoDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
'PerformanceByCategory' is included in the output regardless of
all_metrics_per_category.
"""
tf.logging.info('Performing evaluation on %d images.', len(self._image_ids))
groundtruth_dict = {
'annotations': self._groundtruth_list,
'images': [{'id': image_id} for image_id in self._image_ids],
......
......@@ -24,6 +24,7 @@ import os
import tensorflow as tf
from tensorflow.python.util import function_utils
from object_detection import eval_util
from object_detection import exporter as exporter_lib
from object_detection import inputs
......@@ -33,6 +34,7 @@ from object_detection.builders import optimizer_builder
from object_detection.core import standard_fields as fields
from object_detection.utils import config_util
from object_detection.utils import label_map_util
from object_detection.utils import ops
from object_detection.utils import shape_utils
from object_detection.utils import variables_helper
from object_detection.utils import visualization_utils as vis_utils
......@@ -279,9 +281,7 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
prediction_dict = detection_model.predict(
preprocessed_images,
features[fields.InputDataFields.true_image_shape])
for k, v in prediction_dict.items():
if v.dtype == tf.bfloat16:
prediction_dict[k] = tf.cast(v, tf.float32)
prediction_dict = ops.bfloat16_to_float32_nested(prediction_dict)
else:
prediction_dict = detection_model.predict(
preprocessed_images,
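A minimal sketch of what `ops.bfloat16_to_float32_nested` is assumed to do, inferred from the per-key cast loop it replaces: walk a nested structure and cast any bfloat16 tensors to float32.

import tensorflow as tf

def bfloat16_to_float32_nested(nested):
  # Assumed behavior, inferred from the replaced loop; the real helper
  # lives in object_detection/utils/ops.py.
  if isinstance(nested, tf.Tensor):
    if nested.dtype == tf.bfloat16:
      return tf.cast(nested, tf.float32)
    return nested
  if isinstance(nested, dict):
    return {k: bfloat16_to_float32_nested(v) for k, v in nested.items()}
  if isinstance(nested, (list, tuple)):
    return type(nested)(bfloat16_to_float32_nested(v) for v in nested)
  return nested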
......@@ -338,6 +338,9 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False,
losses = [loss_tensor for loss_tensor in losses_dict.values()]
if train_config.add_regularization_loss:
regularization_losses = detection_model.regularization_losses()
if use_tpu and train_config.use_bfloat16:
regularization_losses = ops.bfloat16_to_float32_nested(
regularization_losses)
if regularization_losses:
regularization_loss = tf.add_n(
regularization_losses, name='regularization_loss')
......@@ -650,6 +653,11 @@ def create_estimator_and_inputs(run_config,
model_fn = model_fn_creator(detection_model_fn, configs, hparams, use_tpu,
postprocess_on_cpu)
if use_tpu_estimator:
# Multicore inference disabled due to b/129367127
tpu_estimator_args = function_utils.fn_args(tf.contrib.tpu.TPUEstimator)
kwargs = {}
if 'experimental_export_device_assignment' in tpu_estimator_args:
kwargs['experimental_export_device_assignment'] = True
estimator = tf.contrib.tpu.TPUEstimator(
model_fn=model_fn,
train_batch_size=train_config.batch_size,
......@@ -659,7 +667,8 @@ def create_estimator_and_inputs(run_config,
config=run_config,
export_to_tpu=export_to_tpu,
eval_on_tpu=False, # Eval runs on CPU, so disable eval on TPU
params=params if params else {})
params=params if params else {},
**kwargs)
else:
estimator = tf.estimator.Estimator(model_fn=model_fn, config=run_config)
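The guard above forwards `experimental_export_device_assignment` only when the installed TPUEstimator accepts it. The same feature-detection pattern, written with only the standard library for illustration:

import inspect

def accepts_kwarg(fn, name):
  # True if `fn` declares a parameter called `name` (or takes **kwargs).
  params = inspect.signature(fn).parameters
  return name in params or any(
      p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())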
......
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Inception Resnet v2 Faster R-CNN implementation in Keras.
See "Inception-v4, Inception-ResNet and the Impact of Residual Connections on
Learning" by Szegedy et al. (https://arxiv.org/abs/1602.07261)
as well as
"Speed/accuracy trade-offs for modern convolutional object detectors" by
Huang et al. (https://arxiv.org/abs/1611.10012)
"""
# Skip pylint for this file because it times out
# pylint: skip-file
import tensorflow as tf
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.models.keras_models import inception_resnet_v2
from object_detection.utils import model_util
class FasterRCNNInceptionResnetV2KerasFeatureExtractor(
faster_rcnn_meta_arch.FasterRCNNKerasFeatureExtractor):
"""Faster R-CNN with Inception Resnet v2 feature extractor implementation."""
def __init__(self,
is_training,
first_stage_features_stride,
batch_norm_trainable=False,
weight_decay=0.0):
"""Constructor.
Args:
is_training: See base class.
first_stage_features_stride: See base class.
batch_norm_trainable: See base class.
weight_decay: See base class.
Raises:
ValueError: If `first_stage_features_stride` is not 8 or 16.
"""
if first_stage_features_stride != 8 and first_stage_features_stride != 16:
raise ValueError('`first_stage_features_stride` must be 8 or 16.')
super(FasterRCNNInceptionResnetV2KerasFeatureExtractor, self).__init__(
is_training, first_stage_features_stride, batch_norm_trainable,
weight_decay)
def preprocess(self, resized_inputs):
"""Faster R-CNN with Inception Resnet v2 preprocessing.
Maps pixel values to the range [-1, 1].
Args:
resized_inputs: A [batch, height_in, width_in, channels] float32 tensor
representing a batch of images with values between 0 and 255.0.
Returns:
preprocessed_inputs: A [batch, height_out, width_out, channels] float32
tensor representing a batch of images.
"""
return (2.0 / 255.0) * resized_inputs - 1.0
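# Quick numeric check of the mapping above (0 -> -1.0, 127.5 -> 0.0,
# 255 -> 1.0):
for value in (0.0, 127.5, 255.0):
  print((2.0 / 255.0) * value - 1.0)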
def get_proposal_feature_extractor_model(self, name=None):
"""Returns a model that extracts first stage RPN features.
Extracts features using the first half of the Inception Resnet v2 network.
We construct the network in `align_feature_maps=True` mode, which means
that all VALID paddings in the network are changed to SAME padding so that
the feature maps are aligned.
Args:
name: A scope name to construct all variables within.
Returns:
A Keras model that takes preprocessed_inputs:
A [batch, height, width, channels] float32 tensor
representing a batch of images.
And returns rpn_feature_map:
A tensor with shape [batch, height, width, depth]
"""
with tf.name_scope(name):
with tf.name_scope('InceptionResnetV2'):
model = inception_resnet_v2.inception_resnet_v2(
self._train_batch_norm,
output_stride=self._first_stage_features_stride,
align_feature_maps=True,
weight_decay=self._weight_decay,
weights=None,
include_top=False)
proposal_features = model.get_layer(
name='block17_20_ac').output
return tf.keras.Model(
inputs=model.inputs,
outputs=proposal_features)
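# Hedged usage sketch (names illustrative): the returned Keras model maps
# preprocessed images to the RPN feature map.
def _example_extract_rpn_features(feature_extractor, preprocessed_inputs):
  rpn_model = feature_extractor.get_proposal_feature_extractor_model(
      name='FirstStageFeatureExtractor')
  return rpn_model(preprocessed_inputs)  # [batch, height', width', depth]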
def get_box_classifier_feature_extractor_model(self, name=None):
"""Returns a model that extracts second stage box classifier features.
This function reconstructs the "second half" of the Inception ResNet v2
network after the part defined in `get_proposal_feature_extractor_model`.
Args:
name: A scope name to construct all variables within.
Returns:
A Keras model that takes proposal_feature_maps:
A 4-D float tensor with shape
[batch_size * self.max_num_proposals, crop_height, crop_width, depth]
representing the feature map cropped to each proposal.
And returns proposal_classifier_features:
A 4-D float tensor with shape
[batch_size * self.max_num_proposals, height, width, depth]
representing box classifier features for each proposal.
"""
with tf.name_scope(name):
with tf.name_scope('InceptionResnetV2'):
model = inception_resnet_v2.inception_resnet_v2(
self._train_batch_norm,
output_stride=16,
align_feature_maps=False,
weight_decay=self._weight_decay,
weights=None,
include_top=False)
proposal_feature_maps = model.get_layer(
name='block17_20_ac').output
proposal_classifier_features = model.get_layer(
name='conv_7b_ac').output
return model_util.extract_submodel(
model=model,
inputs=proposal_feature_maps,
outputs=proposal_classifier_features)
def restore_from_classification_checkpoint_fn(
self,
first_stage_feature_extractor_scope,
second_stage_feature_extractor_scope):
"""Returns a map of variables to load from a foreign checkpoint.
This uses a hard-coded conversion to load into Keras from a slim-trained
inception_resnet_v2 checkpoint.
Note that this overrides the default implementation in
faster_rcnn_meta_arch.FasterRCNNKerasFeatureExtractor which does not work
for InceptionResnetV2 checkpoints.
Args:
first_stage_feature_extractor_scope: A scope name for the first stage
feature extractor.
second_stage_feature_extractor_scope: A scope name for the second stage
feature extractor.
Returns:
A dict mapping variable names (to load from a checkpoint) to variables in
the model graph.
"""
keras_to_slim_name_mapping = {
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d/kernel': 'InceptionResnetV2/Conv2d_1a_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm/beta': 'InceptionResnetV2/Conv2d_1a_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm/moving_mean': 'InceptionResnetV2/Conv2d_1a_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm/moving_variance': 'InceptionResnetV2/Conv2d_1a_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_1/kernel': 'InceptionResnetV2/Conv2d_2a_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_1/beta': 'InceptionResnetV2/Conv2d_2a_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_1/moving_mean': 'InceptionResnetV2/Conv2d_2a_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_1/moving_variance': 'InceptionResnetV2/Conv2d_2a_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_2/kernel': 'InceptionResnetV2/Conv2d_2b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_2/beta': 'InceptionResnetV2/Conv2d_2b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_2/moving_mean': 'InceptionResnetV2/Conv2d_2b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_2/moving_variance': 'InceptionResnetV2/Conv2d_2b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_3/kernel': 'InceptionResnetV2/Conv2d_3b_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_3/beta': 'InceptionResnetV2/Conv2d_3b_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_3/moving_mean': 'InceptionResnetV2/Conv2d_3b_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_3/moving_variance': 'InceptionResnetV2/Conv2d_3b_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_4/kernel': 'InceptionResnetV2/Conv2d_4a_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_4/beta': 'InceptionResnetV2/Conv2d_4a_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_4/moving_mean': 'InceptionResnetV2/Conv2d_4a_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_4/moving_variance': 'InceptionResnetV2/Conv2d_4a_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_5/kernel': 'InceptionResnetV2/Mixed_5b/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_5/beta': 'InceptionResnetV2/Mixed_5b/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_5/moving_mean': 'InceptionResnetV2/Mixed_5b/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_5/moving_variance': 'InceptionResnetV2/Mixed_5b/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_6/kernel': 'InceptionResnetV2/Mixed_5b/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_6/beta': 'InceptionResnetV2/Mixed_5b/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_6/moving_mean': 'InceptionResnetV2/Mixed_5b/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_6/moving_variance': 'InceptionResnetV2/Mixed_5b/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_7/kernel': 'InceptionResnetV2/Mixed_5b/Branch_1/Conv2d_0b_5x5/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_7/beta': 'InceptionResnetV2/Mixed_5b/Branch_1/Conv2d_0b_5x5/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_7/moving_mean': 'InceptionResnetV2/Mixed_5b/Branch_1/Conv2d_0b_5x5/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_7/moving_variance': 'InceptionResnetV2/Mixed_5b/Branch_1/Conv2d_0b_5x5/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_8/kernel': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_8/beta': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_8/moving_mean': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_8/moving_variance': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_9/kernel': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_9/beta': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_9/moving_mean': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_9/moving_variance': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_10/kernel': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_10/beta': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_10/moving_mean': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_10/moving_variance': 'InceptionResnetV2/Mixed_5b/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_11/kernel': 'InceptionResnetV2/Mixed_5b/Branch_3/Conv2d_0b_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_11/beta': 'InceptionResnetV2/Mixed_5b/Branch_3/Conv2d_0b_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_11/moving_mean': 'InceptionResnetV2/Mixed_5b/Branch_3/Conv2d_0b_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_11/moving_variance': 'InceptionResnetV2/Mixed_5b/Branch_3/Conv2d_0b_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_12/kernel': 'InceptionResnetV2/Repeat/block35_1/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_12/beta': 'InceptionResnetV2/Repeat/block35_1/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_12/moving_mean': 'InceptionResnetV2/Repeat/block35_1/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_12/moving_variance': 'InceptionResnetV2/Repeat/block35_1/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_13/kernel': 'InceptionResnetV2/Repeat/block35_1/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_13/beta': 'InceptionResnetV2/Repeat/block35_1/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_13/moving_mean': 'InceptionResnetV2/Repeat/block35_1/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_13/moving_variance': 'InceptionResnetV2/Repeat/block35_1/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_14/kernel': 'InceptionResnetV2/Repeat/block35_1/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_14/beta': 'InceptionResnetV2/Repeat/block35_1/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_14/moving_mean': 'InceptionResnetV2/Repeat/block35_1/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_14/moving_variance': 'InceptionResnetV2/Repeat/block35_1/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_15/kernel': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_15/beta': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_15/moving_mean': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_15/moving_variance': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_16/kernel': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_16/beta': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_16/moving_mean': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_16/moving_variance': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_17/kernel': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_17/beta': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_17/moving_mean': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_17/moving_variance': 'InceptionResnetV2/Repeat/block35_1/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_1_conv/kernel': 'InceptionResnetV2/Repeat/block35_1/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_1_conv/bias': 'InceptionResnetV2/Repeat/block35_1/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_18/kernel': 'InceptionResnetV2/Repeat/block35_2/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_18/beta': 'InceptionResnetV2/Repeat/block35_2/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_18/moving_mean': 'InceptionResnetV2/Repeat/block35_2/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_18/moving_variance': 'InceptionResnetV2/Repeat/block35_2/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_19/kernel': 'InceptionResnetV2/Repeat/block35_2/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_19/beta': 'InceptionResnetV2/Repeat/block35_2/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_19/moving_mean': 'InceptionResnetV2/Repeat/block35_2/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_19/moving_variance': 'InceptionResnetV2/Repeat/block35_2/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_20/kernel': 'InceptionResnetV2/Repeat/block35_2/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_20/beta': 'InceptionResnetV2/Repeat/block35_2/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_20/moving_mean': 'InceptionResnetV2/Repeat/block35_2/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_20/moving_variance': 'InceptionResnetV2/Repeat/block35_2/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_21/kernel': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_21/beta': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_21/moving_mean': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_21/moving_variance': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_22/kernel': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_22/beta': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_22/moving_mean': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_22/moving_variance': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_23/kernel': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_23/beta': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_23/moving_mean': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_23/moving_variance': 'InceptionResnetV2/Repeat/block35_2/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_2_conv/kernel': 'InceptionResnetV2/Repeat/block35_2/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_2_conv/bias': 'InceptionResnetV2/Repeat/block35_2/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_24/kernel': 'InceptionResnetV2/Repeat/block35_3/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_24/beta': 'InceptionResnetV2/Repeat/block35_3/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_24/moving_mean': 'InceptionResnetV2/Repeat/block35_3/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_24/moving_variance': 'InceptionResnetV2/Repeat/block35_3/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_25/kernel': 'InceptionResnetV2/Repeat/block35_3/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_25/beta': 'InceptionResnetV2/Repeat/block35_3/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_25/moving_mean': 'InceptionResnetV2/Repeat/block35_3/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_25/moving_variance': 'InceptionResnetV2/Repeat/block35_3/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_26/kernel': 'InceptionResnetV2/Repeat/block35_3/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_26/beta': 'InceptionResnetV2/Repeat/block35_3/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_26/moving_mean': 'InceptionResnetV2/Repeat/block35_3/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_26/moving_variance': 'InceptionResnetV2/Repeat/block35_3/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_27/kernel': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_27/beta': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_27/moving_mean': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_27/moving_variance': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_28/kernel': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_28/beta': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_28/moving_mean': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_28/moving_variance': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_29/kernel': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_29/beta': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_29/moving_mean': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_29/moving_variance': 'InceptionResnetV2/Repeat/block35_3/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_3_conv/kernel': 'InceptionResnetV2/Repeat/block35_3/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_3_conv/bias': 'InceptionResnetV2/Repeat/block35_3/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_30/kernel': 'InceptionResnetV2/Repeat/block35_4/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_30/beta': 'InceptionResnetV2/Repeat/block35_4/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_30/moving_mean': 'InceptionResnetV2/Repeat/block35_4/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_30/moving_variance': 'InceptionResnetV2/Repeat/block35_4/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_31/kernel': 'InceptionResnetV2/Repeat/block35_4/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_31/beta': 'InceptionResnetV2/Repeat/block35_4/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_31/moving_mean': 'InceptionResnetV2/Repeat/block35_4/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_31/moving_variance': 'InceptionResnetV2/Repeat/block35_4/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_32/kernel': 'InceptionResnetV2/Repeat/block35_4/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_32/beta': 'InceptionResnetV2/Repeat/block35_4/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_32/moving_mean': 'InceptionResnetV2/Repeat/block35_4/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_32/moving_variance': 'InceptionResnetV2/Repeat/block35_4/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_33/kernel': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_33/beta': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_33/moving_mean': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_33/moving_variance': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_34/kernel': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_34/beta': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_34/moving_mean': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_34/moving_variance': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_35/kernel': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_35/beta': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_35/moving_mean': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_35/moving_variance': 'InceptionResnetV2/Repeat/block35_4/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_4_conv/kernel': 'InceptionResnetV2/Repeat/block35_4/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_4_conv/bias': 'InceptionResnetV2/Repeat/block35_4/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_36/kernel': 'InceptionResnetV2/Repeat/block35_5/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_36/beta': 'InceptionResnetV2/Repeat/block35_5/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_36/moving_mean': 'InceptionResnetV2/Repeat/block35_5/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_36/moving_variance': 'InceptionResnetV2/Repeat/block35_5/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_37/kernel': 'InceptionResnetV2/Repeat/block35_5/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_37/beta': 'InceptionResnetV2/Repeat/block35_5/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_37/moving_mean': 'InceptionResnetV2/Repeat/block35_5/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_37/moving_variance': 'InceptionResnetV2/Repeat/block35_5/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_38/kernel': 'InceptionResnetV2/Repeat/block35_5/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_38/beta': 'InceptionResnetV2/Repeat/block35_5/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_38/moving_mean': 'InceptionResnetV2/Repeat/block35_5/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_38/moving_variance': 'InceptionResnetV2/Repeat/block35_5/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_39/kernel': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_39/beta': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_39/moving_mean': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_39/moving_variance': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_40/kernel': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_40/beta': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_40/moving_mean': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_40/moving_variance': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_41/kernel': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_41/beta': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_41/moving_mean': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_41/moving_variance': 'InceptionResnetV2/Repeat/block35_5/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_5_conv/kernel': 'InceptionResnetV2/Repeat/block35_5/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_5_conv/bias': 'InceptionResnetV2/Repeat/block35_5/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_42/kernel': 'InceptionResnetV2/Repeat/block35_6/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_42/beta': 'InceptionResnetV2/Repeat/block35_6/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_42/moving_mean': 'InceptionResnetV2/Repeat/block35_6/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_42/moving_variance': 'InceptionResnetV2/Repeat/block35_6/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_43/kernel': 'InceptionResnetV2/Repeat/block35_6/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_43/beta': 'InceptionResnetV2/Repeat/block35_6/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_43/moving_mean': 'InceptionResnetV2/Repeat/block35_6/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_43/moving_variance': 'InceptionResnetV2/Repeat/block35_6/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_44/kernel': 'InceptionResnetV2/Repeat/block35_6/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_44/beta': 'InceptionResnetV2/Repeat/block35_6/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_44/moving_mean': 'InceptionResnetV2/Repeat/block35_6/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_44/moving_variance': 'InceptionResnetV2/Repeat/block35_6/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_45/kernel': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_45/beta': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_45/moving_mean': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_45/moving_variance': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_46/kernel': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_46/beta': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_46/moving_mean': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_46/moving_variance': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_47/kernel': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_47/beta': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_47/moving_mean': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_47/moving_variance': 'InceptionResnetV2/Repeat/block35_6/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_6_conv/kernel': 'InceptionResnetV2/Repeat/block35_6/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_6_conv/bias': 'InceptionResnetV2/Repeat/block35_6/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_48/kernel': 'InceptionResnetV2/Repeat/block35_7/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_48/beta': 'InceptionResnetV2/Repeat/block35_7/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_48/moving_mean': 'InceptionResnetV2/Repeat/block35_7/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_48/moving_variance': 'InceptionResnetV2/Repeat/block35_7/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_49/kernel': 'InceptionResnetV2/Repeat/block35_7/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_49/beta': 'InceptionResnetV2/Repeat/block35_7/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_49/moving_mean': 'InceptionResnetV2/Repeat/block35_7/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_49/moving_variance': 'InceptionResnetV2/Repeat/block35_7/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_50/kernel': 'InceptionResnetV2/Repeat/block35_7/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_50/beta': 'InceptionResnetV2/Repeat/block35_7/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_50/moving_mean': 'InceptionResnetV2/Repeat/block35_7/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_50/moving_variance': 'InceptionResnetV2/Repeat/block35_7/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_51/kernel': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_51/beta': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_51/moving_mean': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_51/moving_variance': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_52/kernel': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_52/beta': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_52/moving_mean': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_52/moving_variance': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_53/kernel': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_53/beta': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_53/moving_mean': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_53/moving_variance': 'InceptionResnetV2/Repeat/block35_7/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_7_conv/kernel': 'InceptionResnetV2/Repeat/block35_7/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_7_conv/bias': 'InceptionResnetV2/Repeat/block35_7/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_54/kernel': 'InceptionResnetV2/Repeat/block35_8/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_54/beta': 'InceptionResnetV2/Repeat/block35_8/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_54/moving_mean': 'InceptionResnetV2/Repeat/block35_8/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_54/moving_variance': 'InceptionResnetV2/Repeat/block35_8/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_55/kernel': 'InceptionResnetV2/Repeat/block35_8/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_55/beta': 'InceptionResnetV2/Repeat/block35_8/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_55/moving_mean': 'InceptionResnetV2/Repeat/block35_8/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_55/moving_variance': 'InceptionResnetV2/Repeat/block35_8/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_56/kernel': 'InceptionResnetV2/Repeat/block35_8/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_56/beta': 'InceptionResnetV2/Repeat/block35_8/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_56/moving_mean': 'InceptionResnetV2/Repeat/block35_8/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_56/moving_variance': 'InceptionResnetV2/Repeat/block35_8/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_57/kernel': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_57/beta': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_57/moving_mean': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_57/moving_variance': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_58/kernel': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_58/beta': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_58/moving_mean': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_58/moving_variance': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_59/kernel': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_59/beta': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_59/moving_mean': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_59/moving_variance': 'InceptionResnetV2/Repeat/block35_8/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_8_conv/kernel': 'InceptionResnetV2/Repeat/block35_8/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_8_conv/bias': 'InceptionResnetV2/Repeat/block35_8/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_60/kernel': 'InceptionResnetV2/Repeat/block35_9/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_60/beta': 'InceptionResnetV2/Repeat/block35_9/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_60/moving_mean': 'InceptionResnetV2/Repeat/block35_9/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_60/moving_variance': 'InceptionResnetV2/Repeat/block35_9/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_61/kernel': 'InceptionResnetV2/Repeat/block35_9/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_61/beta': 'InceptionResnetV2/Repeat/block35_9/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_61/moving_mean': 'InceptionResnetV2/Repeat/block35_9/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_61/moving_variance': 'InceptionResnetV2/Repeat/block35_9/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_62/kernel': 'InceptionResnetV2/Repeat/block35_9/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_62/beta': 'InceptionResnetV2/Repeat/block35_9/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_62/moving_mean': 'InceptionResnetV2/Repeat/block35_9/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_62/moving_variance': 'InceptionResnetV2/Repeat/block35_9/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_63/kernel': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_63/beta': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_63/moving_mean': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_63/moving_variance': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_64/kernel': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_64/beta': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_64/moving_mean': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_64/moving_variance': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_65/kernel': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_65/beta': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_65/moving_mean': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_65/moving_variance': 'InceptionResnetV2/Repeat/block35_9/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_9_conv/kernel': 'InceptionResnetV2/Repeat/block35_9/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_9_conv/bias': 'InceptionResnetV2/Repeat/block35_9/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_66/kernel': 'InceptionResnetV2/Repeat/block35_10/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_66/beta': 'InceptionResnetV2/Repeat/block35_10/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_66/moving_mean': 'InceptionResnetV2/Repeat/block35_10/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_66/moving_variance': 'InceptionResnetV2/Repeat/block35_10/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_67/kernel': 'InceptionResnetV2/Repeat/block35_10/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_67/beta': 'InceptionResnetV2/Repeat/block35_10/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_67/moving_mean': 'InceptionResnetV2/Repeat/block35_10/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_67/moving_variance': 'InceptionResnetV2/Repeat/block35_10/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_68/kernel': 'InceptionResnetV2/Repeat/block35_10/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_68/beta': 'InceptionResnetV2/Repeat/block35_10/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_68/moving_mean': 'InceptionResnetV2/Repeat/block35_10/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_68/moving_variance': 'InceptionResnetV2/Repeat/block35_10/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_69/kernel': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_69/beta': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_69/moving_mean': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_69/moving_variance': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_70/kernel': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_70/beta': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_70/moving_mean': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_70/moving_variance': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_71/kernel': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0c_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_71/beta': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0c_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_71/moving_mean': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_71/moving_variance': 'InceptionResnetV2/Repeat/block35_10/Branch_2/Conv2d_0c_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_10_conv/kernel': 'InceptionResnetV2/Repeat/block35_10/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block35_10_conv/bias': 'InceptionResnetV2/Repeat/block35_10/Conv2d_1x1/biases',
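# Reduction module 'Mixed_6a': unlike the residual blocks above, it has no
# biased projection conv, so only plain conv/BN pairs are mapped here
# (the Branch_0 3x3 conv and the three-conv tower on Branch_1).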
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_72/kernel': 'InceptionResnetV2/Mixed_6a/Branch_0/Conv2d_1a_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_72/beta': 'InceptionResnetV2/Mixed_6a/Branch_0/Conv2d_1a_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_72/moving_mean': 'InceptionResnetV2/Mixed_6a/Branch_0/Conv2d_1a_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_72/moving_variance': 'InceptionResnetV2/Mixed_6a/Branch_0/Conv2d_1a_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_73/kernel': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_73/beta': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_73/moving_mean': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_73/moving_variance': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_74/kernel': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_0b_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_74/beta': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_0b_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_74/moving_mean': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_74/moving_variance': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_0b_3x3/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_75/kernel': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_1a_3x3/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_75/beta': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_1a_3x3/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_75/moving_mean': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_1a_3x3/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_75/moving_variance': 'InceptionResnetV2/Mixed_6a/Branch_1/Conv2d_1a_3x3/BatchNorm/moving_variance',
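# Inception-ResNet-B blocks ('Repeat_1/block17_N'): four conv/BN pairs each
# (Branch_0 1x1; Branch_1 1x1 followed by the factorized 1x7 and 7x1 convs),
# plus the biased 'block17_N_conv' projection, mirroring the block35 pattern.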
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_76/kernel': 'InceptionResnetV2/Repeat_1/block17_1/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_76/beta': 'InceptionResnetV2/Repeat_1/block17_1/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_76/moving_mean': 'InceptionResnetV2/Repeat_1/block17_1/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_76/moving_variance': 'InceptionResnetV2/Repeat_1/block17_1/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_77/kernel': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_77/beta': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_77/moving_mean': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_77/moving_variance': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_78/kernel': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_78/beta': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_78/moving_mean': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_78/moving_variance': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_79/kernel': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_79/beta': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_79/moving_mean': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_79/moving_variance': 'InceptionResnetV2/Repeat_1/block17_1/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_1_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_1/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_1_conv/bias': 'InceptionResnetV2/Repeat_1/block17_1/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_80/kernel': 'InceptionResnetV2/Repeat_1/block17_2/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_80/beta': 'InceptionResnetV2/Repeat_1/block17_2/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_80/moving_mean': 'InceptionResnetV2/Repeat_1/block17_2/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_80/moving_variance': 'InceptionResnetV2/Repeat_1/block17_2/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_81/kernel': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_81/beta': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_81/moving_mean': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_81/moving_variance': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_82/kernel': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_82/beta': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_82/moving_mean': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_82/moving_variance': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_83/kernel': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_83/beta': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_83/moving_mean': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_83/moving_variance': 'InceptionResnetV2/Repeat_1/block17_2/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_2_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_2/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_2_conv/bias': 'InceptionResnetV2/Repeat_1/block17_2/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_84/kernel': 'InceptionResnetV2/Repeat_1/block17_3/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_84/beta': 'InceptionResnetV2/Repeat_1/block17_3/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_84/moving_mean': 'InceptionResnetV2/Repeat_1/block17_3/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_84/moving_variance': 'InceptionResnetV2/Repeat_1/block17_3/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_85/kernel': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_85/beta': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_85/moving_mean': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_85/moving_variance': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_86/kernel': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_86/beta': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_86/moving_mean': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_86/moving_variance': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_87/kernel': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_87/beta': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_87/moving_mean': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_87/moving_variance': 'InceptionResnetV2/Repeat_1/block17_3/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_3_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_3/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_3_conv/bias': 'InceptionResnetV2/Repeat_1/block17_3/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_88/kernel': 'InceptionResnetV2/Repeat_1/block17_4/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_88/beta': 'InceptionResnetV2/Repeat_1/block17_4/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_88/moving_mean': 'InceptionResnetV2/Repeat_1/block17_4/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_88/moving_variance': 'InceptionResnetV2/Repeat_1/block17_4/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_89/kernel': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_89/beta': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_89/moving_mean': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_89/moving_variance': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_90/kernel': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_90/beta': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_90/moving_mean': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_90/moving_variance': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_91/kernel': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_91/beta': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_91/moving_mean': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_91/moving_variance': 'InceptionResnetV2/Repeat_1/block17_4/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_4_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_4/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_4_conv/bias': 'InceptionResnetV2/Repeat_1/block17_4/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_92/kernel': 'InceptionResnetV2/Repeat_1/block17_5/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_92/beta': 'InceptionResnetV2/Repeat_1/block17_5/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_92/moving_mean': 'InceptionResnetV2/Repeat_1/block17_5/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_92/moving_variance': 'InceptionResnetV2/Repeat_1/block17_5/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_93/kernel': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_93/beta': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_93/moving_mean': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_93/moving_variance': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_94/kernel': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_94/beta': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_94/moving_mean': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_94/moving_variance': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_95/kernel': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_95/beta': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_95/moving_mean': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_95/moving_variance': 'InceptionResnetV2/Repeat_1/block17_5/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_5_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_5/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_5_conv/bias': 'InceptionResnetV2/Repeat_1/block17_5/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_96/kernel': 'InceptionResnetV2/Repeat_1/block17_6/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_96/beta': 'InceptionResnetV2/Repeat_1/block17_6/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_96/moving_mean': 'InceptionResnetV2/Repeat_1/block17_6/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_96/moving_variance': 'InceptionResnetV2/Repeat_1/block17_6/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_97/kernel': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_97/beta': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_97/moving_mean': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_97/moving_variance': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_98/kernel': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_98/beta': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_98/moving_mean': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_98/moving_variance': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_99/kernel': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_99/beta': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_99/moving_mean': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_99/moving_variance': 'InceptionResnetV2/Repeat_1/block17_6/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_6_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_6/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_6_conv/bias': 'InceptionResnetV2/Repeat_1/block17_6/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_100/kernel': 'InceptionResnetV2/Repeat_1/block17_7/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_100/beta': 'InceptionResnetV2/Repeat_1/block17_7/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_100/moving_mean': 'InceptionResnetV2/Repeat_1/block17_7/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_100/moving_variance': 'InceptionResnetV2/Repeat_1/block17_7/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_101/kernel': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_101/beta': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_101/moving_mean': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_101/moving_variance': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_102/kernel': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_102/beta': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_102/moving_mean': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_102/moving_variance': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_103/kernel': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_103/beta': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_103/moving_mean': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_103/moving_variance': 'InceptionResnetV2/Repeat_1/block17_7/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_7_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_7/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_7_conv/bias': 'InceptionResnetV2/Repeat_1/block17_7/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_104/kernel': 'InceptionResnetV2/Repeat_1/block17_8/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_104/beta': 'InceptionResnetV2/Repeat_1/block17_8/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_104/moving_mean': 'InceptionResnetV2/Repeat_1/block17_8/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_104/moving_variance': 'InceptionResnetV2/Repeat_1/block17_8/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_105/kernel': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_105/beta': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_105/moving_mean': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_105/moving_variance': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_106/kernel': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_106/beta': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_106/moving_mean': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_106/moving_variance': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_107/kernel': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_107/beta': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_107/moving_mean': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_107/moving_variance': 'InceptionResnetV2/Repeat_1/block17_8/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_8_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_8/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_8_conv/bias': 'InceptionResnetV2/Repeat_1/block17_8/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_108/kernel': 'InceptionResnetV2/Repeat_1/block17_9/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_108/beta': 'InceptionResnetV2/Repeat_1/block17_9/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_108/moving_mean': 'InceptionResnetV2/Repeat_1/block17_9/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_108/moving_variance': 'InceptionResnetV2/Repeat_1/block17_9/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_109/kernel': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_109/beta': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_109/moving_mean': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_109/moving_variance': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_110/kernel': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_110/beta': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_110/moving_mean': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_110/moving_variance': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_111/kernel': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_111/beta': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_111/moving_mean': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_111/moving_variance': 'InceptionResnetV2/Repeat_1/block17_9/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_9_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_9/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_9_conv/bias': 'InceptionResnetV2/Repeat_1/block17_9/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_112/kernel': 'InceptionResnetV2/Repeat_1/block17_10/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_112/beta': 'InceptionResnetV2/Repeat_1/block17_10/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_112/moving_mean': 'InceptionResnetV2/Repeat_1/block17_10/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_112/moving_variance': 'InceptionResnetV2/Repeat_1/block17_10/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_113/kernel': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_113/beta': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_113/moving_mean': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_113/moving_variance': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_114/kernel': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_114/beta': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_114/moving_mean': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_114/moving_variance': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_115/kernel': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_115/beta': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_115/moving_mean': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_115/moving_variance': 'InceptionResnetV2/Repeat_1/block17_10/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_10_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_10/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_10_conv/bias': 'InceptionResnetV2/Repeat_1/block17_10/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_116/kernel': 'InceptionResnetV2/Repeat_1/block17_11/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_116/beta': 'InceptionResnetV2/Repeat_1/block17_11/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_116/moving_mean': 'InceptionResnetV2/Repeat_1/block17_11/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_116/moving_variance': 'InceptionResnetV2/Repeat_1/block17_11/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_117/kernel': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_117/beta': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_117/moving_mean': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_117/moving_variance': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_118/kernel': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_118/beta': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_118/moving_mean': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_118/moving_variance': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_119/kernel': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_119/beta': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_119/moving_mean': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_119/moving_variance': 'InceptionResnetV2/Repeat_1/block17_11/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_11_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_11/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_11_conv/bias': 'InceptionResnetV2/Repeat_1/block17_11/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_120/kernel': 'InceptionResnetV2/Repeat_1/block17_12/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_120/beta': 'InceptionResnetV2/Repeat_1/block17_12/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_120/moving_mean': 'InceptionResnetV2/Repeat_1/block17_12/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_120/moving_variance': 'InceptionResnetV2/Repeat_1/block17_12/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_121/kernel': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_121/beta': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_121/moving_mean': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_121/moving_variance': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_122/kernel': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_122/beta': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_122/moving_mean': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_122/moving_variance': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_123/kernel': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_123/beta': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_123/moving_mean': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_123/moving_variance': 'InceptionResnetV2/Repeat_1/block17_12/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_12_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_12/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_12_conv/bias': 'InceptionResnetV2/Repeat_1/block17_12/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_124/kernel': 'InceptionResnetV2/Repeat_1/block17_13/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_124/beta': 'InceptionResnetV2/Repeat_1/block17_13/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_124/moving_mean': 'InceptionResnetV2/Repeat_1/block17_13/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_124/moving_variance': 'InceptionResnetV2/Repeat_1/block17_13/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_125/kernel': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_125/beta': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_125/moving_mean': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_125/moving_variance': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_126/kernel': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_126/beta': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_126/moving_mean': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_126/moving_variance': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_127/kernel': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_127/beta': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_127/moving_mean': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_127/moving_variance': 'InceptionResnetV2/Repeat_1/block17_13/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_13_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_13/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_13_conv/bias': 'InceptionResnetV2/Repeat_1/block17_13/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_128/kernel': 'InceptionResnetV2/Repeat_1/block17_14/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_128/beta': 'InceptionResnetV2/Repeat_1/block17_14/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_128/moving_mean': 'InceptionResnetV2/Repeat_1/block17_14/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_128/moving_variance': 'InceptionResnetV2/Repeat_1/block17_14/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_129/kernel': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_129/beta': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_129/moving_mean': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_129/moving_variance': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_130/kernel': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_130/beta': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_130/moving_mean': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_130/moving_variance': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_131/kernel': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_131/beta': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_131/moving_mean': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_131/moving_variance': 'InceptionResnetV2/Repeat_1/block17_14/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_14_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_14/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_14_conv/bias': 'InceptionResnetV2/Repeat_1/block17_14/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_132/kernel': 'InceptionResnetV2/Repeat_1/block17_15/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_132/beta': 'InceptionResnetV2/Repeat_1/block17_15/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_132/moving_mean': 'InceptionResnetV2/Repeat_1/block17_15/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_132/moving_variance': 'InceptionResnetV2/Repeat_1/block17_15/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_133/kernel': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_133/beta': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_133/moving_mean': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_133/moving_variance': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_134/kernel': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_134/beta': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_134/moving_mean': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_134/moving_variance': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_135/kernel': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_135/beta': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_135/moving_mean': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_135/moving_variance': 'InceptionResnetV2/Repeat_1/block17_15/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_15_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_15/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_15_conv/bias': 'InceptionResnetV2/Repeat_1/block17_15/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_136/kernel': 'InceptionResnetV2/Repeat_1/block17_16/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_136/beta': 'InceptionResnetV2/Repeat_1/block17_16/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_136/moving_mean': 'InceptionResnetV2/Repeat_1/block17_16/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_136/moving_variance': 'InceptionResnetV2/Repeat_1/block17_16/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_137/kernel': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_137/beta': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_137/moving_mean': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_137/moving_variance': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_138/kernel': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_138/beta': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_138/moving_mean': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_138/moving_variance': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_139/kernel': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_139/beta': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_139/moving_mean': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_139/moving_variance': 'InceptionResnetV2/Repeat_1/block17_16/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_16_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_16/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_16_conv/bias': 'InceptionResnetV2/Repeat_1/block17_16/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_140/kernel': 'InceptionResnetV2/Repeat_1/block17_17/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_140/beta': 'InceptionResnetV2/Repeat_1/block17_17/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_140/moving_mean': 'InceptionResnetV2/Repeat_1/block17_17/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_140/moving_variance': 'InceptionResnetV2/Repeat_1/block17_17/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_141/kernel': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_141/beta': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_141/moving_mean': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_141/moving_variance': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_142/kernel': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_142/beta': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_142/moving_mean': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_142/moving_variance': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_143/kernel': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_143/beta': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_143/moving_mean': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_143/moving_variance': 'InceptionResnetV2/Repeat_1/block17_17/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_17_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_17/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_17_conv/bias': 'InceptionResnetV2/Repeat_1/block17_17/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_144/kernel': 'InceptionResnetV2/Repeat_1/block17_18/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_144/beta': 'InceptionResnetV2/Repeat_1/block17_18/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_144/moving_mean': 'InceptionResnetV2/Repeat_1/block17_18/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_144/moving_variance': 'InceptionResnetV2/Repeat_1/block17_18/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_145/kernel': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_145/beta': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_145/moving_mean': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_145/moving_variance': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_146/kernel': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_146/beta': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_146/moving_mean': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_146/moving_variance': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_147/kernel': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_147/beta': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_147/moving_mean': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_147/moving_variance': 'InceptionResnetV2/Repeat_1/block17_18/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_18_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_18/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_18_conv/bias': 'InceptionResnetV2/Repeat_1/block17_18/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_148/kernel': 'InceptionResnetV2/Repeat_1/block17_19/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_148/beta': 'InceptionResnetV2/Repeat_1/block17_19/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_148/moving_mean': 'InceptionResnetV2/Repeat_1/block17_19/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_148/moving_variance': 'InceptionResnetV2/Repeat_1/block17_19/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_149/kernel': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_149/beta': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_149/moving_mean': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_149/moving_variance': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_150/kernel': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_150/beta': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_150/moving_mean': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_150/moving_variance': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_151/kernel': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_151/beta': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_151/moving_mean': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_151/moving_variance': 'InceptionResnetV2/Repeat_1/block17_19/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_19_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_19/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_19_conv/bias': 'InceptionResnetV2/Repeat_1/block17_19/Conv2d_1x1/biases',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_152/kernel': 'InceptionResnetV2/Repeat_1/block17_20/Branch_0/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_152/beta': 'InceptionResnetV2/Repeat_1/block17_20/Branch_0/Conv2d_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_152/moving_mean': 'InceptionResnetV2/Repeat_1/block17_20/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_152/moving_variance': 'InceptionResnetV2/Repeat_1/block17_20/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_153/kernel': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0a_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_153/beta': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_153/moving_mean': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_153/moving_variance': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_154/kernel': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0b_1x7/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_154/beta': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0b_1x7/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_154/moving_mean': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_154/moving_variance': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0b_1x7/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/conv2d_155/kernel': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0c_7x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_155/beta': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0c_7x1/BatchNorm/beta',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_155/moving_mean': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_mean',
'FirstStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_155/moving_variance': 'InceptionResnetV2/Repeat_1/block17_20/Branch_1/Conv2d_0c_7x1/BatchNorm/moving_variance',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_20_conv/kernel': 'InceptionResnetV2/Repeat_1/block17_20/Conv2d_1x1/weights',
'FirstStageFeatureExtractor/InceptionResnetV2/block17_20_conv/bias': 'InceptionResnetV2/Repeat_1/block17_20/Conv2d_1x1/biases',
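
# Second-stage feature extractor: the keys below switch from
# 'FirstStageFeatureExtractor/...' to 'SecondStageFeatureExtractor/...',
# starting with the Mixed_7a module mappings.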
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_359/kernel': 'InceptionResnetV2/Mixed_7a/Branch_0/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_359/beta': 'InceptionResnetV2/Mixed_7a/Branch_0/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_359/moving_mean': 'InceptionResnetV2/Mixed_7a/Branch_0/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_359/moving_variance': 'InceptionResnetV2/Mixed_7a/Branch_0/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_360/kernel': 'InceptionResnetV2/Mixed_7a/Branch_0/Conv2d_1a_3x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_360/beta': 'InceptionResnetV2/Mixed_7a/Branch_0/Conv2d_1a_3x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_360/moving_mean': 'InceptionResnetV2/Mixed_7a/Branch_0/Conv2d_1a_3x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_360/moving_variance': 'InceptionResnetV2/Mixed_7a/Branch_0/Conv2d_1a_3x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_361/kernel': 'InceptionResnetV2/Mixed_7a/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_361/beta': 'InceptionResnetV2/Mixed_7a/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_361/moving_mean': 'InceptionResnetV2/Mixed_7a/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_361/moving_variance': 'InceptionResnetV2/Mixed_7a/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_362/kernel': 'InceptionResnetV2/Mixed_7a/Branch_1/Conv2d_1a_3x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_362/beta': 'InceptionResnetV2/Mixed_7a/Branch_1/Conv2d_1a_3x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_362/moving_mean': 'InceptionResnetV2/Mixed_7a/Branch_1/Conv2d_1a_3x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_362/moving_variance': 'InceptionResnetV2/Mixed_7a/Branch_1/Conv2d_1a_3x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_363/kernel': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_363/beta': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_363/moving_mean': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_363/moving_variance': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_364/kernel': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_0b_3x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_364/beta': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_0b_3x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_364/moving_mean': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_364/moving_variance': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_0b_3x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_365/kernel': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_1a_3x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_365/beta': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_1a_3x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_365/moving_mean': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_1a_3x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_365/moving_variance': 'InceptionResnetV2/Mixed_7a/Branch_2/Conv2d_1a_3x3/BatchNorm/moving_variance',
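
# Second-stage feature extractor: Repeat_2 (block8) unit mappings.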
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_366/kernel': 'InceptionResnetV2/Repeat_2/block8_1/Branch_0/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_366/beta': 'InceptionResnetV2/Repeat_2/block8_1/Branch_0/Conv2d_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_366/moving_mean': 'InceptionResnetV2/Repeat_2/block8_1/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_366/moving_variance': 'InceptionResnetV2/Repeat_2/block8_1/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_367/kernel': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_367/beta': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_367/moving_mean': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_367/moving_variance': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_368/kernel': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0b_1x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_368/beta': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0b_1x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_368/moving_mean': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_368/moving_variance': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_369/kernel': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0c_3x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_369/beta': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0c_3x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_369/moving_mean': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_369/moving_variance': 'InceptionResnetV2/Repeat_2/block8_1/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_1_conv/kernel': 'InceptionResnetV2/Repeat_2/block8_1/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_1_conv/bias': 'InceptionResnetV2/Repeat_2/block8_1/Conv2d_1x1/biases',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_370/kernel': 'InceptionResnetV2/Repeat_2/block8_2/Branch_0/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_370/beta': 'InceptionResnetV2/Repeat_2/block8_2/Branch_0/Conv2d_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_370/moving_mean': 'InceptionResnetV2/Repeat_2/block8_2/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_370/moving_variance': 'InceptionResnetV2/Repeat_2/block8_2/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_371/kernel': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_371/beta': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_371/moving_mean': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_371/moving_variance': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_372/kernel': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0b_1x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_372/beta': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0b_1x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_372/moving_mean': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_372/moving_variance': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_373/kernel': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0c_3x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_373/beta': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0c_3x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_373/moving_mean': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_373/moving_variance': 'InceptionResnetV2/Repeat_2/block8_2/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_2_conv/kernel': 'InceptionResnetV2/Repeat_2/block8_2/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_2_conv/bias': 'InceptionResnetV2/Repeat_2/block8_2/Conv2d_1x1/biases',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_374/kernel': 'InceptionResnetV2/Repeat_2/block8_3/Branch_0/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_374/beta': 'InceptionResnetV2/Repeat_2/block8_3/Branch_0/Conv2d_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_374/moving_mean': 'InceptionResnetV2/Repeat_2/block8_3/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_374/moving_variance': 'InceptionResnetV2/Repeat_2/block8_3/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_375/kernel': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_375/beta': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_375/moving_mean': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_375/moving_variance': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_376/kernel': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0b_1x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_376/beta': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0b_1x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_376/moving_mean': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_376/moving_variance': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_377/kernel': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0c_3x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_377/beta': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0c_3x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_377/moving_mean': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_377/moving_variance': 'InceptionResnetV2/Repeat_2/block8_3/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_3_conv/kernel': 'InceptionResnetV2/Repeat_2/block8_3/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_3_conv/bias': 'InceptionResnetV2/Repeat_2/block8_3/Conv2d_1x1/biases',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_378/kernel': 'InceptionResnetV2/Repeat_2/block8_4/Branch_0/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_378/beta': 'InceptionResnetV2/Repeat_2/block8_4/Branch_0/Conv2d_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_378/moving_mean': 'InceptionResnetV2/Repeat_2/block8_4/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_378/moving_variance': 'InceptionResnetV2/Repeat_2/block8_4/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_379/kernel': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_379/beta': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_379/moving_mean': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_379/moving_variance': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_380/kernel': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0b_1x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_380/beta': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0b_1x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_380/moving_mean': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_380/moving_variance': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_381/kernel': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0c_3x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_381/beta': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0c_3x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_381/moving_mean': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_381/moving_variance': 'InceptionResnetV2/Repeat_2/block8_4/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_4_conv/kernel': 'InceptionResnetV2/Repeat_2/block8_4/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_4_conv/bias': 'InceptionResnetV2/Repeat_2/block8_4/Conv2d_1x1/biases',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_382/kernel': 'InceptionResnetV2/Repeat_2/block8_5/Branch_0/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_382/beta': 'InceptionResnetV2/Repeat_2/block8_5/Branch_0/Conv2d_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_382/moving_mean': 'InceptionResnetV2/Repeat_2/block8_5/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_382/moving_variance': 'InceptionResnetV2/Repeat_2/block8_5/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_383/kernel': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_383/beta': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_383/moving_mean': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_383/moving_variance': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_384/kernel': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0b_1x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_384/beta': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0b_1x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_384/moving_mean': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_384/moving_variance': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_385/kernel': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0c_3x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_385/beta': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0c_3x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_385/moving_mean': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_385/moving_variance': 'InceptionResnetV2/Repeat_2/block8_5/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_5_conv/kernel': 'InceptionResnetV2/Repeat_2/block8_5/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_5_conv/bias': 'InceptionResnetV2/Repeat_2/block8_5/Conv2d_1x1/biases',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_386/kernel': 'InceptionResnetV2/Repeat_2/block8_6/Branch_0/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_386/beta': 'InceptionResnetV2/Repeat_2/block8_6/Branch_0/Conv2d_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_386/moving_mean': 'InceptionResnetV2/Repeat_2/block8_6/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_386/moving_variance': 'InceptionResnetV2/Repeat_2/block8_6/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_387/kernel': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_387/beta': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_387/moving_mean': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_387/moving_variance': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_388/kernel': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0b_1x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_388/beta': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0b_1x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_388/moving_mean': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_388/moving_variance': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_389/kernel': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0c_3x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_389/beta': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0c_3x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_389/moving_mean': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_389/moving_variance': 'InceptionResnetV2/Repeat_2/block8_6/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_6_conv/kernel': 'InceptionResnetV2/Repeat_2/block8_6/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_6_conv/bias': 'InceptionResnetV2/Repeat_2/block8_6/Conv2d_1x1/biases',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_390/kernel': 'InceptionResnetV2/Repeat_2/block8_7/Branch_0/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_390/beta': 'InceptionResnetV2/Repeat_2/block8_7/Branch_0/Conv2d_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_390/moving_mean': 'InceptionResnetV2/Repeat_2/block8_7/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_390/moving_variance': 'InceptionResnetV2/Repeat_2/block8_7/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_391/kernel': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_391/beta': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_391/moving_mean': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_391/moving_variance': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_392/kernel': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0b_1x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_392/beta': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0b_1x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_392/moving_mean': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_392/moving_variance': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_393/kernel': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0c_3x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_393/beta': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0c_3x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_393/moving_mean': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_393/moving_variance': 'InceptionResnetV2/Repeat_2/block8_7/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_7_conv/kernel': 'InceptionResnetV2/Repeat_2/block8_7/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_7_conv/bias': 'InceptionResnetV2/Repeat_2/block8_7/Conv2d_1x1/biases',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_394/kernel': 'InceptionResnetV2/Repeat_2/block8_8/Branch_0/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_394/beta': 'InceptionResnetV2/Repeat_2/block8_8/Branch_0/Conv2d_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_394/moving_mean': 'InceptionResnetV2/Repeat_2/block8_8/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_394/moving_variance': 'InceptionResnetV2/Repeat_2/block8_8/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_395/kernel': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_395/beta': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_395/moving_mean': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_395/moving_variance': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_396/kernel': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0b_1x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_396/beta': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0b_1x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_396/moving_mean': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_396/moving_variance': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_397/kernel': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0c_3x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_397/beta': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0c_3x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_397/moving_mean': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_397/moving_variance': 'InceptionResnetV2/Repeat_2/block8_8/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_8_conv/kernel': 'InceptionResnetV2/Repeat_2/block8_8/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_8_conv/bias': 'InceptionResnetV2/Repeat_2/block8_8/Conv2d_1x1/biases',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_398/kernel': 'InceptionResnetV2/Repeat_2/block8_9/Branch_0/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_398/beta': 'InceptionResnetV2/Repeat_2/block8_9/Branch_0/Conv2d_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_398/moving_mean': 'InceptionResnetV2/Repeat_2/block8_9/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_398/moving_variance': 'InceptionResnetV2/Repeat_2/block8_9/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_399/kernel': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_399/beta': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_399/moving_mean': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_399/moving_variance': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_400/kernel': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0b_1x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_400/beta': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0b_1x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_400/moving_mean': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_400/moving_variance': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_401/kernel': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0c_3x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_401/beta': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0c_3x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_401/moving_mean': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_401/moving_variance': 'InceptionResnetV2/Repeat_2/block8_9/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_9_conv/kernel': 'InceptionResnetV2/Repeat_2/block8_9/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_9_conv/bias': 'InceptionResnetV2/Repeat_2/block8_9/Conv2d_1x1/biases',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_402/kernel': 'InceptionResnetV2/Block8/Branch_0/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_402/beta': 'InceptionResnetV2/Block8/Branch_0/Conv2d_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_402/moving_mean': 'InceptionResnetV2/Block8/Branch_0/Conv2d_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_402/moving_variance': 'InceptionResnetV2/Block8/Branch_0/Conv2d_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_403/kernel': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0a_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_403/beta': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0a_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_403/moving_mean': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_403/moving_variance': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_404/kernel': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0b_1x3/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_404/beta': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0b_1x3/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_404/moving_mean': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_404/moving_variance': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/conv2d_405/kernel': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0c_3x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_405/beta': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0c_3x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_405/moving_mean': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/freezable_batch_norm_405/moving_variance': 'InceptionResnetV2/Block8/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_10_conv/kernel': 'InceptionResnetV2/Block8/Conv2d_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/block8_10_conv/bias': 'InceptionResnetV2/Block8/Conv2d_1x1/biases',
'SecondStageFeatureExtractor/InceptionResnetV2/conv_7b/kernel': 'InceptionResnetV2/Conv2d_7b_1x1/weights',
'SecondStageFeatureExtractor/InceptionResnetV2/conv_7b_bn/beta': 'InceptionResnetV2/Conv2d_7b_1x1/BatchNorm/beta',
'SecondStageFeatureExtractor/InceptionResnetV2/conv_7b_bn/moving_mean': 'InceptionResnetV2/Conv2d_7b_1x1/BatchNorm/moving_mean',
'SecondStageFeatureExtractor/InceptionResnetV2/conv_7b_bn/moving_variance': 'InceptionResnetV2/Conv2d_7b_1x1/BatchNorm/moving_variance',
}
variables_to_restore = {}
for variable in tf.global_variables():
var_name = keras_to_slim_name_mapping.get(variable.op.name)
if var_name:
variables_to_restore[var_name] = variable
return variables_to_restore
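# Editorial sketch (not in the original module): because the returned
# `variables_to_restore` dict is keyed by the Slim checkpoint variable names,
# it can be handed directly to tf.train.Saver to load a Slim-trained
# Inception ResNet v2 checkpoint into the Keras variables. The checkpoint
# path below is a placeholder.
#
#   saver = tf.train.Saver(var_list=variables_to_restore)
#   with tf.Session() as sess:
#     saver.restore(sess, '/path/to/inception_resnet_v2.ckpt')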
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for models.faster_rcnn_inception_resnet_v2_keras_feature_extractor."""
import tensorflow as tf
from object_detection.models import faster_rcnn_inception_resnet_v2_keras_feature_extractor as frcnn_inc_res
class FasterRcnnInceptionResnetV2KerasFeatureExtractorTest(tf.test.TestCase):
def _build_feature_extractor(self, first_stage_features_stride):
return frcnn_inc_res.FasterRCNNInceptionResnetV2KerasFeatureExtractor(
is_training=False,
first_stage_features_stride=first_stage_features_stride,
batch_norm_trainable=False,
weight_decay=0.0)
def test_extract_proposal_features_returns_expected_size(self):
feature_extractor = self._build_feature_extractor(
first_stage_features_stride=16)
preprocessed_inputs = tf.random_uniform(
[1, 299, 299, 3], maxval=255, dtype=tf.float32)
rpn_feature_map = feature_extractor.get_proposal_feature_extractor_model(
name='TestScope')(preprocessed_inputs)
features_shape = tf.shape(rpn_feature_map)
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
features_shape_out = sess.run(features_shape)
self.assertAllEqual(features_shape_out, [1, 19, 19, 1088])
def test_extract_proposal_features_stride_eight(self):
feature_extractor = self._build_feature_extractor(
first_stage_features_stride=8)
preprocessed_inputs = tf.random_uniform(
[1, 224, 224, 3], maxval=255, dtype=tf.float32)
rpn_feature_map = feature_extractor.get_proposal_feature_extractor_model(
name='TestScope')(preprocessed_inputs)
features_shape = tf.shape(rpn_feature_map)
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
features_shape_out = sess.run(features_shape)
self.assertAllEqual(features_shape_out, [1, 28, 28, 1088])
def test_extract_proposal_features_half_size_input(self):
feature_extractor = self._build_feature_extractor(
first_stage_features_stride=16)
preprocessed_inputs = tf.random_uniform(
[1, 112, 112, 3], maxval=255, dtype=tf.float32)
rpn_feature_map = feature_extractor.get_proposal_feature_extractor_model(
name='TestScope')(preprocessed_inputs)
features_shape = tf.shape(rpn_feature_map)
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
features_shape_out = sess.run(features_shape)
self.assertAllEqual(features_shape_out, [1, 7, 7, 1088])
def test_extract_proposal_features_dies_on_invalid_stride(self):
with self.assertRaises(ValueError):
self._build_feature_extractor(first_stage_features_stride=99)
def test_extract_proposal_features_dies_with_incorrect_rank_inputs(self):
feature_extractor = self._build_feature_extractor(
first_stage_features_stride=16)
preprocessed_inputs = tf.random_uniform(
[224, 224, 3], maxval=255, dtype=tf.float32)
with self.assertRaises(ValueError):
feature_extractor.get_proposal_feature_extractor_model(
name='TestScope')(preprocessed_inputs)
def test_extract_box_classifier_features_returns_expected_size(self):
feature_extractor = self._build_feature_extractor(
first_stage_features_stride=16)
proposal_feature_maps = tf.random_uniform(
[2, 17, 17, 1088], maxval=255, dtype=tf.float32)
model = feature_extractor.get_box_classifier_feature_extractor_model(
name='TestScope')
proposal_classifier_features = (
model(proposal_feature_maps))
features_shape = tf.shape(proposal_classifier_features)
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
features_shape_out = sess.run(features_shape)
self.assertAllEqual(features_shape_out, [2, 8, 8, 1536])
if __name__ == '__main__':
tf.test.main()
......@@ -51,6 +51,60 @@ def get_depth_fn(depth_multiplier, min_depth):
return multiply_depth
def create_conv_block(
use_depthwise, kernel_size, padding, stride, layer_name, conv_hyperparams,
is_training, freeze_batchnorm, depth):
"""Create Keras layers for depthwise & non-depthwise convolutions.
Args:
use_depthwise: Whether to use depthwise separable conv instead of regular
conv.
kernel_size: An integer kernel height/width. The block constructs square
[kernel_size, kernel_size] filters from it.
padding: One of 'VALID' or 'SAME'.
stride: A list of length 2: [stride_height, stride_width], specifying the
convolution stride. Can be an int if both strides are the same.
layer_name: String. The name of the layer.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
is_training: Indicates whether the feature generator is in training mode.
freeze_batchnorm: Bool. Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
depth: Depth of output feature maps.
Returns:
A list of conv layers.
"""
layers = []
if use_depthwise:
layers.append(tf.keras.layers.SeparableConv2D(
depth,
[kernel_size, kernel_size],
depth_multiplier=1,
padding=padding,
strides=stride,
name=layer_name + '_depthwise_conv',
**conv_hyperparams.params()))
else:
layers.append(tf.keras.layers.Conv2D(
depth,
[kernel_size, kernel_size],
padding=padding,
strides=stride,
name=layer_name + '_conv',
**conv_hyperparams.params()))
layers.append(
conv_hyperparams.build_batch_norm(
training=(is_training and not freeze_batchnorm),
name=layer_name + '_batchnorm'))
layers.append(
conv_hyperparams.build_activation_layer(
name=layer_name))
return layers
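# Editorial example (not in the original file): a self-contained sketch of
# applying the layers returned by create_conv_block in sequence, giving the
# conv -> batch norm -> activation pattern used below. The hyperparams text
# proto mirrors the one in the unit tests and is purely illustrative.
def _example_create_conv_block_usage():
  from google.protobuf import text_format
  from object_detection.builders import hyperparams_builder
  from object_detection.protos import hyperparams_pb2
  proto = hyperparams_pb2.Hyperparams()
  text_format.Merge(
      'regularizer { l2_regularizer {} } '
      'initializer { truncated_normal_initializer {} }', proto)
  conv_hyperparams = hyperparams_builder.KerasLayerHyperparams(proto)
  layers = create_conv_block(
      use_depthwise=False, kernel_size=3, padding='SAME', stride=1,
      layer_name='smoothing_1', conv_hyperparams=conv_hyperparams,
      is_training=True, freeze_batchnorm=False, depth=256)
  net = tf.random_uniform([2, 8, 8, 128], dtype=tf.float32)
  for layer in layers:  # conv, then batch norm, then activation
    net = layer(net)
  return net  # shape [2, 8, 8, 256]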
class KerasMultiResolutionFeatureMaps(tf.keras.Model):
"""Generates multi resolution feature maps from input image features.
......@@ -419,6 +473,180 @@ def multi_resolution_feature_maps(feature_map_layout, depth_multiplier,
[(x, y) for (x, y) in zip(feature_map_keys, feature_maps)])
class KerasFpnTopDownFeatureMaps(tf.keras.Model):
"""Generates Keras based `top-down` feature maps for Feature Pyramid Networks.
See https://arxiv.org/abs/1612.03144 for details.
"""
def __init__(self,
num_levels,
depth,
is_training,
conv_hyperparams,
freeze_batchnorm,
use_depthwise=False,
use_explicit_padding=False,
use_bounded_activations=False,
use_native_resize_op=False,
scope=None,
name=None):
"""Constructor.
Args:
num_levels: the number of image features.
depth: depth of output feature maps.
is_training: Indicates whether the feature generator is in training mode.
conv_hyperparams: A `hyperparams_builder.KerasLayerHyperparams` object
containing hyperparameters for convolution ops.
freeze_batchnorm: Bool. Whether to freeze batch norm parameters during
training or not. When training with a small batch size (e.g. 1), it is
desirable to freeze batch norm update and use pretrained batch norm
params.
use_depthwise: whether to use depthwise separable conv instead of regular
conv.
use_explicit_padding: whether to use explicit padding.
use_bounded_activations: Whether or not to clip activations to range
[-ACTIVATION_BOUND, ACTIVATION_BOUND]. Bounded activations better lend
themselves to quantized inference.
use_native_resize_op: If True, uses tf.image.resize_nearest_neighbor op
for the upsampling process instead of reshape and broadcasting
implementation.
scope: A scope name to wrap this op under.
name: A string name scope to assign to the model. If 'None', Keras
will auto-generate one from the class name.
"""
super(KerasFpnTopDownFeatureMaps, self).__init__(name=name)
self.scope = scope if scope else 'top_down'
self.top_layers = []
self.residual_blocks = []
self.top_down_blocks = []
self.reshape_blocks = []
self.conv_layers = []
padding = 'VALID' if use_explicit_padding else 'SAME'
stride = 1
kernel_size = 3
def clip_by_value(features):
return tf.clip_by_value(features, -ACTIVATION_BOUND, ACTIVATION_BOUND)
# top layers
self.top_layers.append(tf.keras.layers.Conv2D(
depth, [1, 1], strides=stride, padding=padding,
name='projection_%d' % num_levels,
**conv_hyperparams.params(use_bias=True)))
if use_bounded_activations:
self.top_layers.append(tf.keras.layers.Lambda(
clip_by_value, name='clip_by_value'))
for level in reversed(range(num_levels - 1)):
# to generate residual from image features
residual_net = []
# to preprocess top_down (the image feature map from last layer)
top_down_net = []
# to reshape top_down according to residual if necessary
reshaped_residual = []
# to apply convolution layers to feature map
conv_net = []
# residual block
residual_net.append(tf.keras.layers.Conv2D(
depth, [1, 1], padding=padding, strides=1,
name='projection_%d' % (level + 1),
**conv_hyperparams.params(use_bias=True)))
if use_bounded_activations:
residual_net.append(tf.keras.layers.Lambda(
clip_by_value, name='clip_by_value'))
# top-down block
# TODO (b/128922690): clean-up of ops.nearest_neighbor_upsampling
if use_native_resize_op:
def resize_nearest_neighbor(image):
image_shape = image.shape.as_list()
return tf.image.resize_nearest_neighbor(
image, [image_shape[1] * 2, image_shape[2] * 2])
top_down_net.append(tf.keras.layers.Lambda(
resize_nearest_neighbor, name='nearest_neighbor_upsampling'))
else:
def nearest_neighbor_upsampling(image):
return ops.nearest_neighbor_upsampling(image, scale=2)
top_down_net.append(tf.keras.layers.Lambda(
nearest_neighbor_upsampling, name='nearest_neighbor_upsampling'))
# reshape block
if use_explicit_padding:
def reshape(inputs):
residual_shape = tf.shape(inputs[0])
return inputs[1][:, :residual_shape[1], :residual_shape[2], :]
reshaped_residual.append(
tf.keras.layers.Lambda(reshape, name='reshape'))
# down layers
if use_bounded_activations:
conv_net.append(tf.keras.layers.Lambda(
clip_by_value, name='clip_by_value'))
if use_explicit_padding:
def fixed_padding(features, kernel_size=kernel_size):
return ops.fixed_padding(features, kernel_size)
conv_net.append(tf.keras.layers.Lambda(
fixed_padding, name='fixed_padding'))
layer_name = 'smoothing_%d' % (level + 1)
conv_block = create_conv_block(
use_depthwise, kernel_size, padding, stride, layer_name,
conv_hyperparams, is_training, freeze_batchnorm, depth)
conv_net.extend(conv_block)
self.residual_blocks.append(residual_net)
self.top_down_blocks.append(top_down_net)
self.reshape_blocks.append(reshaped_residual)
self.conv_layers.append(conv_net)
def call(self, image_features):
"""Generate the multi-resolution feature maps.
Executed when calling the `.__call__` method on input.
Args:
image_features: list of tuples of (tensor_name, image_feature_tensor).
Spatial resolutions of successive tensors must reduce exactly by a factor
of 2.
Returns:
feature_maps: an OrderedDict mapping keys (feature map names) to
tensors where each tensor has shape [batch, height_i, width_i, depth_i].
"""
output_feature_maps_list = []
output_feature_map_keys = []
with tf.name_scope(self.scope):
top_down = image_features[-1][1]
for layer in self.top_layers:
top_down = layer(top_down)
output_feature_maps_list.append(top_down)
output_feature_map_keys.append('top_down_%s' % image_features[-1][0])
num_levels = len(image_features)
for index, level in enumerate(reversed(range(num_levels - 1))):
residual = image_features[level][1]
top_down = output_feature_maps_list[-1]
for layer in self.residual_blocks[index]:
residual = layer(residual)
for layer in self.top_down_blocks[index]:
top_down = layer(top_down)
for layer in self.reshape_blocks[index]:
top_down = layer([residual, top_down])
top_down += residual
for layer in self.conv_layers[index]:
top_down = layer(top_down)
output_feature_maps_list.append(top_down)
output_feature_map_keys.append('top_down_%s' % image_features[level][0])
return collections.OrderedDict(reversed(
list(zip(output_feature_map_keys, output_feature_maps_list))))
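# Editorial example (names and setup are assumptions, mirroring
# _build_feature_map_generator in the unit tests): a minimal sketch of
# driving the Keras FPN generator above. Spatial resolutions must halve
# from each level to the next.
def _example_keras_fpn_top_down_usage(conv_hyperparams):
  """`conv_hyperparams` is a hyperparams_builder.KerasLayerHyperparams."""
  image_features = [
      ('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
      ('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
      ('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
      ('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))]
  fpn = KerasFpnTopDownFeatureMaps(
      num_levels=len(image_features), depth=128, is_training=True,
      conv_hyperparams=conv_hyperparams, freeze_batchnorm=False,
      name='FeatureMaps')
  # Returns an OrderedDict keyed 'top_down_block2' ... 'top_down_block5',
  # each feature map with 128 channels at its level's input resolution.
  return fpn(image_features)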
def fpn_top_down_feature_maps(image_features,
depth,
use_depthwise=False,
......
......@@ -403,21 +403,101 @@ class MultiResolutionFeatureMapGeneratorTest(tf.test.TestCase):
self.assertSetEqual(expected_slim_variables, actual_variable_set)
@parameterized.parameters({'use_native_resize_op': True, 'use_keras': False},
{'use_native_resize_op': False, 'use_keras': False},
{'use_native_resize_op': True, 'use_keras': True},
{'use_native_resize_op': False, 'use_keras': True})
class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
def _build_conv_hyperparams(self):
conv_hyperparams = hyperparams_pb2.Hyperparams()
conv_hyperparams_text_proto = """
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams)
return hyperparams_builder.KerasLayerHyperparams(conv_hyperparams)
def _build_feature_map_generator(
self, image_features, depth, use_keras, use_bounded_activations=False,
use_native_resize_op=False, use_explicit_padding=False,
use_depthwise=False):
if use_keras:
return feature_map_generators.KerasFpnTopDownFeatureMaps(
num_levels=len(image_features),
depth=depth,
is_training=True,
conv_hyperparams=self._build_conv_hyperparams(),
freeze_batchnorm=False,
use_depthwise=use_depthwise,
use_explicit_padding=use_explicit_padding,
use_bounded_activations=use_bounded_activations,
use_native_resize_op=use_native_resize_op,
scope=None,
name='FeatureMaps',
)
else:
def feature_map_generator(image_features):
return feature_map_generators.fpn_top_down_feature_maps(
image_features=image_features,
depth=depth,
use_depthwise=use_depthwise,
use_explicit_padding=use_explicit_padding,
use_bounded_activations=use_bounded_activations,
use_native_resize_op=use_native_resize_op)
return feature_map_generator
def test_get_expected_feature_map_shapes(
self, use_native_resize_op, use_keras):
image_features = [
('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_native_resize_op=use_native_resize_op)
feature_maps = feature_map_generator(image_features)
expected_feature_map_shapes = {
'top_down_block2': (4, 8, 8, 128),
'top_down_block3': (4, 4, 4, 128),
'top_down_block4': (4, 2, 2, 128),
'top_down_block5': (4, 1, 1, 128)
}
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
out_feature_maps = sess.run(feature_maps)
out_feature_map_shapes = {key: value.shape
for key, value in out_feature_maps.items()}
self.assertDictEqual(out_feature_map_shapes, expected_feature_map_shapes)
def test_get_expected_feature_map_shapes_with_explicit_padding(
self, use_native_resize_op, use_keras):
image_features = [
('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_explicit_padding=True,
use_native_resize_op=use_native_resize_op)
feature_maps = feature_map_generator(image_features)
expected_feature_map_shapes = {
'top_down_block2': (4, 8, 8, 128),
......@@ -434,7 +514,8 @@ class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
for key, value in out_feature_maps.items()}
self.assertDictEqual(out_feature_map_shapes, expected_feature_map_shapes)
def test_use_bounded_activations_add_operations(
self, use_native_resize_op, use_keras):
tf_graph = tf.Graph()
with tf_graph.as_default():
image_features = [('block2',
......@@ -445,22 +526,37 @@ class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5',
tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_bounded_activations=True,
use_native_resize_op=use_native_resize_op)
feature_map_generator(image_features)
if use_keras:
expected_added_operations = dict.fromkeys([
'FeatureMaps/top_down/clip_by_value/clip_by_value',
'FeatureMaps/top_down/clip_by_value_1/clip_by_value',
'FeatureMaps/top_down/clip_by_value_2/clip_by_value',
'FeatureMaps/top_down/clip_by_value_3/clip_by_value',
'FeatureMaps/top_down/clip_by_value_4/clip_by_value',
'FeatureMaps/top_down/clip_by_value_5/clip_by_value',
'FeatureMaps/top_down/clip_by_value_6/clip_by_value',
])
else:
expected_added_operations = dict.fromkeys([
'top_down/clip_by_value', 'top_down/clip_by_value_1',
'top_down/clip_by_value_2', 'top_down/clip_by_value_3',
'top_down/clip_by_value_4', 'top_down/clip_by_value_5',
'top_down/clip_by_value_6'
])
op_names = {op.name: None for op in tf_graph.get_operations()}
self.assertDictContainsSubset(expected_added_operations, op_names)
def test_use_bounded_activations_clip_value(
self, use_native_resize_op, use_keras):
tf_graph = tf.Graph()
with tf_graph.as_default():
image_features = [
......@@ -469,18 +565,31 @@ class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
('block4', 255 * tf.ones([4, 2, 2, 256], dtype=tf.float32)),
('block5', 255 * tf.ones([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_bounded_activations=True,
use_native_resize_op=use_native_resize_op)
feature_map_generator(image_features)
if use_keras:
expected_clip_by_value_ops = dict.fromkeys([
'FeatureMaps/top_down/clip_by_value/clip_by_value',
'FeatureMaps/top_down/clip_by_value_1/clip_by_value',
'FeatureMaps/top_down/clip_by_value_2/clip_by_value',
'FeatureMaps/top_down/clip_by_value_3/clip_by_value',
'FeatureMaps/top_down/clip_by_value_4/clip_by_value',
'FeatureMaps/top_down/clip_by_value_5/clip_by_value',
'FeatureMaps/top_down/clip_by_value_6/clip_by_value',
])
else:
expected_clip_by_value_ops = [
'top_down/clip_by_value', 'top_down/clip_by_value_1',
'top_down/clip_by_value_2', 'top_down/clip_by_value_3',
'top_down/clip_by_value_4', 'top_down/clip_by_value_5',
'top_down/clip_by_value_6'
]
# Gathers activation tensors before and after clip_by_value operations.
activations = {}
......@@ -522,18 +631,20 @@ class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
self.assertLessEqual(after_clipping_upper_bound, expected_upper_bound)
def test_get_expected_feature_map_shapes_with_depthwise(
self, use_native_resize_op, use_keras):
image_features = [
('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_depthwise=True,
use_native_resize_op=use_native_resize_op)
feature_maps = feature_map_generator(image_features)
expected_feature_map_shapes = {
'top_down_block2': (4, 8, 8, 128),
......@@ -550,6 +661,131 @@ class FPNFeatureMapGeneratorTest(tf.test.TestCase, parameterized.TestCase):
for key, value in out_feature_maps.items()}
self.assertDictEqual(out_feature_map_shapes, expected_feature_map_shapes)
def test_get_expected_variable_names(
self, use_native_resize_op, use_keras):
image_features = [
('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_native_resize_op=use_native_resize_op)
feature_maps = feature_map_generator(image_features)
expected_slim_variables = set([
'projection_1/weights',
'projection_1/biases',
'projection_2/weights',
'projection_2/biases',
'projection_3/weights',
'projection_3/biases',
'projection_4/weights',
'projection_4/biases',
'smoothing_1/weights',
'smoothing_1/biases',
'smoothing_2/weights',
'smoothing_2/biases',
'smoothing_3/weights',
'smoothing_3/biases',
])
expected_keras_variables = set([
'FeatureMaps/top_down/projection_1/kernel',
'FeatureMaps/top_down/projection_1/bias',
'FeatureMaps/top_down/projection_2/kernel',
'FeatureMaps/top_down/projection_2/bias',
'FeatureMaps/top_down/projection_3/kernel',
'FeatureMaps/top_down/projection_3/bias',
'FeatureMaps/top_down/projection_4/kernel',
'FeatureMaps/top_down/projection_4/bias',
'FeatureMaps/top_down/smoothing_1_conv/kernel',
'FeatureMaps/top_down/smoothing_1_conv/bias',
'FeatureMaps/top_down/smoothing_2_conv/kernel',
'FeatureMaps/top_down/smoothing_2_conv/bias',
'FeatureMaps/top_down/smoothing_3_conv/kernel',
'FeatureMaps/top_down/smoothing_3_conv/bias'
])
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
sess.run(feature_maps)
actual_variable_set = set(
[var.op.name for var in tf.trainable_variables()])
if use_keras:
self.assertSetEqual(expected_keras_variables, actual_variable_set)
else:
self.assertSetEqual(expected_slim_variables, actual_variable_set)
def test_get_expected_variable_names_with_depthwise(
self, use_native_resize_op, use_keras):
image_features = [
('block2', tf.random_uniform([4, 8, 8, 256], dtype=tf.float32)),
('block3', tf.random_uniform([4, 4, 4, 256], dtype=tf.float32)),
('block4', tf.random_uniform([4, 2, 2, 256], dtype=tf.float32)),
('block5', tf.random_uniform([4, 1, 1, 256], dtype=tf.float32))
]
feature_map_generator = self._build_feature_map_generator(
image_features=image_features,
depth=128,
use_keras=use_keras,
use_depthwise=True,
use_native_resize_op=use_native_resize_op)
feature_maps = feature_map_generator(image_features)
expected_slim_variables = set([
'projection_1/weights',
'projection_1/biases',
'projection_2/weights',
'projection_2/biases',
'projection_3/weights',
'projection_3/biases',
'projection_4/weights',
'projection_4/biases',
'smoothing_1/depthwise_weights',
'smoothing_1/pointwise_weights',
'smoothing_1/biases',
'smoothing_2/depthwise_weights',
'smoothing_2/pointwise_weights',
'smoothing_2/biases',
'smoothing_3/depthwise_weights',
'smoothing_3/pointwise_weights',
'smoothing_3/biases',
])
expected_keras_variables = set([
'FeatureMaps/top_down/projection_1/kernel',
'FeatureMaps/top_down/projection_1/bias',
'FeatureMaps/top_down/projection_2/kernel',
'FeatureMaps/top_down/projection_2/bias',
'FeatureMaps/top_down/projection_3/kernel',
'FeatureMaps/top_down/projection_3/bias',
'FeatureMaps/top_down/projection_4/kernel',
'FeatureMaps/top_down/projection_4/bias',
'FeatureMaps/top_down/smoothing_1_depthwise_conv/depthwise_kernel',
'FeatureMaps/top_down/smoothing_1_depthwise_conv/pointwise_kernel',
'FeatureMaps/top_down/smoothing_1_depthwise_conv/bias',
'FeatureMaps/top_down/smoothing_2_depthwise_conv/depthwise_kernel',
'FeatureMaps/top_down/smoothing_2_depthwise_conv/pointwise_kernel',
'FeatureMaps/top_down/smoothing_2_depthwise_conv/bias',
'FeatureMaps/top_down/smoothing_3_depthwise_conv/depthwise_kernel',
'FeatureMaps/top_down/smoothing_3_depthwise_conv/pointwise_kernel',
'FeatureMaps/top_down/smoothing_3_depthwise_conv/bias'
])
init_op = tf.global_variables_initializer()
with self.test_session() as sess:
sess.run(init_op)
sess.run(feature_maps)
actual_variable_set = set(
[var.op.name for var in tf.trainable_variables()])
if use_keras:
self.assertSetEqual(expected_keras_variables, actual_variable_set)
else:
self.assertSetEqual(expected_slim_variables, actual_variable_set)
class GetDepthFunctionTest(tf.test.TestCase):
......
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""MobileNet v2 models for Keras.
MobileNetV2 is a general architecture and can be used for multiple use cases.
Depending on the use case, it can use different input layer size and
different width factors. This allows different width models to reduce
the number of multiply-adds and thereby
reduce inference cost on mobile devices.
MobileNetV2 is very similar to the original MobileNet,
except that it uses inverted residual blocks with
bottlenecking features. It has a drastically lower
parameter count than the original MobileNet.
MobileNets support any input size greater
than 32 x 32, with larger image sizes
offering better performance.
The number of parameters and number of multiply-adds
can be modified by using the `alpha` parameter,
which increases/decreases the number of filters in each layer.
By altering the image size and `alpha` parameter,
all 22 models from the paper can be built, with ImageNet weights provided.
The paper demonstrates the performance of MobileNets using `alpha` values of
0.35, 0.5, 0.75, 1.0 (also called 100% MobileNet), 1.3, and 1.4.
For each of these `alpha` values, weights for 5 different input image sizes
are provided (224, 192, 160, 128, and 96).
The following table describes the performance of
MobileNet on various input sizes:
------------------------------------------------------------------------
MACs stands for Multiply Adds
Classification Checkpoint | MACs (M) | Parameters (M)| Top 1 Acc| Top 5 Acc
--------------------------|----------|---------------|----------|----------
| [mobilenet_v2_1.4_224] | 582 | 6.06 | 75.0 | 92.5 |
| [mobilenet_v2_1.3_224] | 509 | 5.34 | 74.4 | 92.1 |
| [mobilenet_v2_1.0_224] | 300 | 3.47 | 71.8 | 91.0 |
| [mobilenet_v2_1.0_192] | 221 | 3.47 | 70.7 | 90.1 |
| [mobilenet_v2_1.0_160] | 154 | 3.47 | 68.8 | 89.0 |
| [mobilenet_v2_1.0_128] | 99 | 3.47 | 65.3 | 86.9 |
| [mobilenet_v2_1.0_96] | 56 | 3.47 | 60.3 | 83.2 |
| [mobilenet_v2_0.75_224] | 209 | 2.61 | 69.8 | 89.6 |
| [mobilenet_v2_0.75_192] | 153 | 2.61 | 68.7 | 88.9 |
| [mobilenet_v2_0.75_160] | 107 | 2.61 | 66.4 | 87.3 |
| [mobilenet_v2_0.75_128] | 69 | 2.61 | 63.2 | 85.3 |
| [mobilenet_v2_0.75_96] | 39 | 2.61 | 58.8 | 81.6 |
| [mobilenet_v2_0.5_224] | 97 | 1.95 | 65.4 | 86.4 |
| [mobilenet_v2_0.5_192] | 71 | 1.95 | 63.9 | 85.4 |
| [mobilenet_v2_0.5_160] | 50 | 1.95 | 61.0 | 83.2 |
| [mobilenet_v2_0.5_128] | 32 | 1.95 | 57.7 | 80.8 |
| [mobilenet_v2_0.5_96] | 18 | 1.95 | 51.2 | 75.8 |
| [mobilenet_v2_0.35_224] | 59 | 1.66 | 60.3 | 82.9 |
| [mobilenet_v2_0.35_192] | 43 | 1.66 | 58.2 | 81.2 |
| [mobilenet_v2_0.35_160] | 30 | 1.66 | 55.7 | 79.1 |
| [mobilenet_v2_0.35_128] | 20 | 1.66 | 50.8 | 75.0 |
| [mobilenet_v2_0.35_96] | 11 | 1.66 | 45.5 | 70.4 |
The weights for all 16 models are obtained and translated from the TensorFlow
checkpoints found at
https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/README.md
# Reference
This file contains building code for MobileNetV2, based on
[MobileNetV2: Inverted Residuals and Linear Bottlenecks]
(https://arxiv.org/abs/1801.04381)
Tests comparing this model to the existing Tensorflow model can be
found at
[mobilenet_v2_keras](https://github.com/JonathanCMitchell/mobilenet_v2_keras)
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import warnings
import numpy as np
import tensorflow as tf
Model = tf.keras.Model
Input = tf.keras.layers.Input
Activation = tf.keras.layers.Activation
BatchNormalization = tf.keras.layers.BatchNormalization
Conv2D = tf.keras.layers.Conv2D
DepthwiseConv2D = tf.keras.layers.DepthwiseConv2D
GlobalAveragePooling2D = tf.keras.layers.GlobalAveragePooling2D
Add = tf.keras.layers.Add
Dense = tf.keras.layers.Dense
K = tf.keras.backend
def relu6(x):
return K.relu(x, max_value=6)
def _obtain_input_shape(
input_shape,
default_size,
min_size,
data_format,
require_flatten):
"""Internal utility to compute/validate an ImageNet model's input shape.
Arguments:
input_shape: either None (will return the default network input shape),
or a user-provided shape to be validated.
default_size: default input width/height for the model.
min_size: minimum input width/height accepted by the model.
data_format: image data format to use.
require_flatten: whether the model is expected to
be linked to a classifier via a Flatten layer.
Returns:
An integer shape tuple (may include None entries).
Raises:
ValueError: in case of invalid argument values.
"""
if input_shape and len(input_shape) == 3:
if data_format == 'channels_first':
if input_shape[0] not in {1, 3}:
warnings.warn(
'This model usually expects 1 or 3 input channels. '
'However, it was passed an input_shape with ' +
str(input_shape[0]) + ' input channels.')
default_shape = (input_shape[0], default_size, default_size)
else:
if input_shape[-1] not in {1, 3}:
warnings.warn(
'This model usually expects 1 or 3 input channels. '
'However, it was passed an input_shape with ' +
str(input_shape[-1]) + ' input channels.')
default_shape = (default_size, default_size, input_shape[-1])
else:
if data_format == 'channels_first':
default_shape = (3, default_size, default_size)
else:
default_shape = (default_size, default_size, 3)
if input_shape:
if data_format == 'channels_first':
if input_shape is not None:
if len(input_shape) != 3:
raise ValueError(
'`input_shape` must be a tuple of three integers.')
if ((input_shape[1] is not None and input_shape[1] < min_size) or
(input_shape[2] is not None and input_shape[2] < min_size)):
raise ValueError('Input size must be at least ' +
str(min_size) + 'x' + str(min_size) +
'; got `input_shape=' +
str(input_shape) + '`')
else:
if input_shape is not None:
if len(input_shape) != 3:
raise ValueError(
'`input_shape` must be a tuple of three integers.')
if ((input_shape[0] is not None and input_shape[0] < min_size) or
(input_shape[1] is not None and input_shape[1] < min_size)):
raise ValueError('Input size must be at least ' +
str(min_size) + 'x' + str(min_size) +
'; got `input_shape=' +
str(input_shape) + '`')
else:
if require_flatten:
input_shape = default_shape
else:
if data_format == 'channels_first':
input_shape = (3, None, None)
else:
input_shape = (None, None, 3)
if require_flatten:
if None in input_shape:
raise ValueError('If `include_top` is True, '
'you should specify a static `input_shape`. '
'Got `input_shape=' + str(input_shape) + '`')
return input_shape
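# Illustrative behavior (editorial note): with data_format='channels_last',
# _obtain_input_shape(None, 224, 32, 'channels_last', require_flatten=False)
# returns the dynamic shape (None, None, 3), while require_flatten=True
# yields the static default (224, 224, 3); user-provided shapes with a
# spatial dimension below min_size raise a ValueError.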
def preprocess_input(x):
"""Preprocesses a numpy array encoding a batch of images.
This function applies the "Inception"-style preprocessing, which rescales
the RGB values from [0, 255] to approximately [-1, 1]. Note that this
preprocessing function is different from `imagenet_utils.preprocess_input()`.
Arguments:
x: a 4D numpy array consists of RGB values within [0, 255].
Returns:
Preprocessed array.
"""
x /= 128.
x -= 1.
return x.astype(np.float32)
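# Illustrative values (editorial note):
# preprocess_input(np.array([[[[0., 128., 255.]]]])) maps 0 -> -1.0,
# 128 -> 0.0 and 255 -> 0.9921875, i.e. approximately [-1, 1].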
# This function is taken from the original tf repo.
# It ensures that all layers have a channel number that is divisible by 8
# It can be seen here:
# https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
def _make_divisible(v, divisor, min_value=None):
if min_value is None:
min_value = divisor
new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
# Make sure that round down does not go down by more than 10%.
if new_v < 0.9 * v:
new_v += divisor
return new_v
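# Worked examples (editorial note): _make_divisible(32 * 0.75, 8) returns 24,
# since 24 is already a multiple of 8. For _make_divisible(32 * 0.35, 8),
# 11.2 first rounds to 8, which is below 0.9 * 11.2 = 10.08, so one extra
# divisor is added back and 16 is returned.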
def mobilenet_v2(input_shape=None,
alpha=1.0,
include_top=True,
classes=1000):
"""Instantiates the MobileNetV2 architecture.
To load a MobileNetV2 model via `load_model`, import the custom
object `relu6` and pass it in the `custom_objects` parameter.
E.g.
model = load_model('mobilenet.h5', custom_objects={
'relu6': mobilenet.relu6})
Arguments:
input_shape: optional shape tuple, to be specified if you would
like to use a model with an input image resolution that is not
(224, 224, 3). It should have exactly 3 input channels;
e.g. `(160, 160, 3)` would be one valid value. If omitted, the
shape defaults to (224, 224, 3) when `include_top` is True and
to a dynamic (None, None, 3) otherwise.
alpha: controls the width of the network. This is known as the
width multiplier in the MobileNetV2 paper.
- If `alpha` < 1.0, proportionally decreases the number
of filters in each layer.
- If `alpha` > 1.0, proportionally increases the number
of filters in each layer.
- If `alpha` = 1, default number of filters from the paper
are used at each layer.
include_top: whether to include the fully-connected
layer at the top of the network.
classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
Returns:
A Keras model instance.
Raises:
ValueError: in case of an invalid `input_shape` argument.
"""
# Determine proper input shape and default size.
# If input_shape is None, fall back to the default 224x224 size.
if input_shape is None:
default_size = 224
# Otherwise, derive the default size from the provided input_shape.
else:
if K.image_data_format() == 'channels_first':
rows = input_shape[1]
cols = input_shape[2]
else:
rows = input_shape[0]
cols = input_shape[1]
if rows == cols and rows in [96, 128, 160, 192, 224]:
default_size = rows
else:
default_size = 224
input_shape = _obtain_input_shape(input_shape,
default_size=default_size,
min_size=32,
data_format=K.image_data_format(),
require_flatten=include_top)
if K.image_data_format() == 'channels_last':
row_axis, col_axis = (0, 1)
else:
row_axis, col_axis = (1, 2)
rows = input_shape[row_axis]
cols = input_shape[col_axis]
if K.image_data_format() != 'channels_last':
warnings.warn('The MobileNet family of models is only available '
'for the input data format "channels_last" '
'(height, width, channels). '
'However your settings specify the default '
'data format "channels_first" (channels, width, height).'
' You should set `image_data_format="channels_last"` '
'in your Keras config located at ~/.keras/keras.json. '
'The model being returned right now will expect inputs '
'to follow the "channels_last" data format.')
K.set_image_data_format('channels_last')
old_data_format = 'channels_first'
else:
old_data_format = None
img_input = Input(shape=input_shape)
first_block_filters = _make_divisible(32 * alpha, 8)
x = Conv2D(first_block_filters,
kernel_size=3,
strides=(2, 2), padding='same',
use_bias=False, name='Conv1')(img_input)
x = BatchNormalization(epsilon=1e-3, momentum=0.999, name='bn_Conv1')(x)
x = Activation(relu6, name='Conv1_relu')(x)
x = _first_inverted_res_block(x,
filters=16,
alpha=alpha,
stride=1,
block_id=0)
x = _inverted_res_block(x, filters=24, alpha=alpha, stride=2,
expansion=6, block_id=1)
x = _inverted_res_block(x, filters=24, alpha=alpha, stride=1,
expansion=6, block_id=2)
x = _inverted_res_block(x, filters=32, alpha=alpha, stride=2,
expansion=6, block_id=3)
x = _inverted_res_block(x, filters=32, alpha=alpha, stride=1,
expansion=6, block_id=4)
x = _inverted_res_block(x, filters=32, alpha=alpha, stride=1,
expansion=6, block_id=5)
x = _inverted_res_block(x, filters=64, alpha=alpha, stride=2,
expansion=6, block_id=6)
x = _inverted_res_block(x, filters=64, alpha=alpha, stride=1,
expansion=6, block_id=7)
x = _inverted_res_block(x, filters=64, alpha=alpha, stride=1,
expansion=6, block_id=8)
x = _inverted_res_block(x, filters=64, alpha=alpha, stride=1,
expansion=6, block_id=9)
x = _inverted_res_block(x, filters=96, alpha=alpha, stride=1,
expansion=6, block_id=10)
x = _inverted_res_block(x, filters=96, alpha=alpha, stride=1,
expansion=6, block_id=11)
x = _inverted_res_block(x, filters=96, alpha=alpha, stride=1,
expansion=6, block_id=12)
x = _inverted_res_block(x, filters=160, alpha=alpha, stride=2,
expansion=6, block_id=13)
x = _inverted_res_block(x, filters=160, alpha=alpha, stride=1,
expansion=6, block_id=14)
x = _inverted_res_block(x, filters=160, alpha=alpha, stride=1,
expansion=6, block_id=15)
x = _inverted_res_block(x, filters=320, alpha=alpha, stride=1,
expansion=6, block_id=16)
# As stated in the paper, no alpha is applied to the last conv layer;
# instead, when the width multiplier is greater than 1, the number of
# output channels is increased.
if alpha > 1.0:
last_block_filters = _make_divisible(1280 * alpha, 8)
else:
last_block_filters = 1280
x = Conv2D(last_block_filters,
kernel_size=1,
use_bias=False,
name='Conv_1')(x)
x = BatchNormalization(epsilon=1e-3, momentum=0.999, name='Conv_1_bn')(x)
x = Activation(relu6, name='out_relu')(x)
if include_top:
x = GlobalAveragePooling2D()(x)
x = Dense(classes, activation='softmax',
use_bias=True, name='Logits')(x)
# The model input is the `img_input` tensor created above.
inputs = img_input
# Create model.
model = Model(inputs, x, name='mobilenetv2_%0.2f_%s' % (alpha, rows))
if old_data_format:
K.set_image_data_format(old_data_format)
return model
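# Editorial usage sketch (weights here are randomly initialized; this
# stripped-down variant does not load pretrained checkpoints):
#
#   model = mobilenet_v2(input_shape=(224, 224, 3), alpha=1.0,
#                        include_top=True, classes=1000)
#   model.summary()  # roughly 3.5M parameters at alpha=1.0, per the table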
def _inverted_res_block(inputs, expansion, stride, alpha, filters, block_id):
"""Build an inverted res block."""
in_channels = int(inputs.shape[-1])
pointwise_conv_filters = int(filters * alpha)
pointwise_filters = _make_divisible(pointwise_conv_filters, 8)
# Expand
x = Conv2D(expansion * in_channels, kernel_size=1, padding='same',
use_bias=False, activation=None,
name='mobl%d_conv_expand' % block_id)(inputs)
x = BatchNormalization(epsilon=1e-3, momentum=0.999,
name='bn%d_conv_bn_expand' %
block_id)(x)
x = Activation(relu6, name='conv_%d_relu' % block_id)(x)
# Depthwise
x = DepthwiseConv2D(kernel_size=3, strides=stride, activation=None,
use_bias=False, padding='same',
name='mobl%d_conv_depthwise' % block_id)(x)
x = BatchNormalization(epsilon=1e-3, momentum=0.999,
name='bn%d_conv_depthwise' % block_id)(x)
x = Activation(relu6, name='conv_dw_%d_relu' % block_id)(x)
# Project
x = Conv2D(pointwise_filters,
kernel_size=1, padding='same', use_bias=False, activation=None,
name='mobl%d_conv_project' % block_id)(x)
x = BatchNormalization(epsilon=1e-3, momentum=0.999,
name='bn%d_conv_bn_project' % block_id)(x)
if in_channels == pointwise_filters and stride == 1:
return Add(name='res_connect_' + str(block_id))([inputs, x])
return x
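# For example (editorial note): with 24 input channels, expansion=6,
# filters=24, alpha=1.0 and stride=1, the block expands to 6 * 24 = 144
# channels, applies a 3x3 depthwise conv, then projects back down to
# _make_divisible(24 * 1.0, 8) = 24 channels; since the channel counts match
# and stride == 1, the input is added back as a residual connection.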
def _first_inverted_res_block(inputs,
stride,
alpha, filters, block_id):
"""Build the first inverted res block."""
in_channels = int(inputs.shape[-1])
pointwise_conv_filters = int(filters * alpha)
pointwise_filters = _make_divisible(pointwise_conv_filters, 8)
# Depthwise
x = DepthwiseConv2D(kernel_size=3,
strides=stride, activation=None,
use_bias=False, padding='same',
name='mobl%d_conv_depthwise' %
block_id)(inputs)
x = BatchNormalization(epsilon=1e-3, momentum=0.999,
name='bn%d_conv_depthwise' %
block_id)(x)
x = Activation(relu6, name='conv_dw_%d_relu' % block_id)(x)
# Project
x = Conv2D(pointwise_filters,
kernel_size=1,
padding='same',
use_bias=False,
activation=None,
name='mobl%d_conv_project' %
block_id)(x)
x = BatchNormalization(epsilon=1e-3, momentum=0.999,
name='bn%d_conv_project' %
block_id)(x)
if in_channels == pointwise_filters and stride == 1:
return Add(name='res_connect_' + str(block_id))([inputs, x])
return x
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""A wrapper around the Keras InceptionResnetV2 models for object detection."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from object_detection.core import freezable_batch_norm
class _LayersOverride(object):
"""Alternative Keras layers interface for the Keras InceptionResNetV2."""
def __init__(self,
batchnorm_training,
output_stride=16,
align_feature_maps=False,
batchnorm_scale=False,
default_batchnorm_momentum=0.999,
default_batchnorm_epsilon=1e-3,
weight_decay=0.00004):
"""Alternative tf.keras.layers interface, for use by InceptionResNetV2.
It is used by the Keras applications kwargs injection API to
modify the Inception Resnet V2 Keras application with changes required by
the Object Detection API.
These injected interfaces make the following changes to the network:
- Supports freezing batch norm layers
- Adds support for feature map alignment (like in the Slim model)
- Adds support for changing the output stride (like in the Slim model)
- Adds support for overriding various batch norm hyperparameters
Because the Keras inception resnet v2 application does not assign explicit
names to most individual layers, the injection of output stride support
works by identifying convolution layers according to their filter counts
and pre-feature-map-alignment padding arguments.
Args:
batchnorm_training: Bool. Assigned to Batch norm layer `training` param
when constructing `freezable_batch_norm.FreezableBatchNorm` layers.
output_stride: A scalar that specifies the requested ratio of input to
output spatial resolution. Only supports 8 and 16.
align_feature_maps: When true, changes all the VALID paddings in the
network to SAME padding so that the feature maps are aligned.
batchnorm_scale: If True, uses an explicit `gamma` multiplier to scale the
activations in the batch normalization layer.
default_batchnorm_momentum: Float. Batch norm layers will be constructed
using this value as the momentum.
default_batchnorm_epsilon: small float added to variance to avoid
dividing by zero.
weight_decay: the l2 regularization weight decay for weights variables.
(gets multiplied by 0.5 to map from slim l2 regularization weight to
Keras l2 regularization weight).
"""
self._use_atrous = output_stride == 8
self._align_feature_maps = align_feature_maps
self._batchnorm_training = batchnorm_training
self._batchnorm_scale = batchnorm_scale
self._default_batchnorm_momentum = default_batchnorm_momentum
self._default_batchnorm_epsilon = default_batchnorm_epsilon
self.regularizer = tf.keras.regularizers.l2(weight_decay * 0.5)
def Conv2D(self, filters, kernel_size, **kwargs):
"""Builds a Conv2D layer according to the current Object Detection config.
Overrides the Keras InceptionResnetV2 application's convolutions with ones
that follow the spec given by the Object Detection hyperparameters.
If feature map alignment is enabled, the padding is forced to 'same'.
If output_stride is 8, some conv2d layers are matched according to their
name, filter count, or pre-alignment padding parameters, and have the
correct `dilation_rate` or `strides` set.
Args:
filters: The number of filters to use for the convolution.
kernel_size: The kernel size to specify the height and width of the 2D
convolution window.
**kwargs: Keyword args specified by the Keras application for
constructing the convolution.
Returns:
A Keras Conv2D layer specified by the Object Detection hyperparameter
configurations.
"""
kwargs['kernel_regularizer'] = self.regularizer
kwargs['bias_regularizer'] = self.regularizer
# Because the Keras application does not set explicit names for most layers
# (instead allowing names to auto-increment), we must match individual
# layers in the model according to their filter count, name, or
# pre-alignment padding parameters. This means we can only align the
# feature maps after we have applied our updates in cases where
# output_stride=8.
if self._use_atrous and (filters == 384):
kwargs['strides'] = 1
name = kwargs.get('name')
if self._use_atrous and (
(name and 'block17' in name) or
(filters == 128 or filters == 160 or
(filters == 192 and kwargs.get('padding', '').lower() != 'valid'))):
kwargs['dilation_rate'] = 2
if self._align_feature_maps:
kwargs['padding'] = 'same'
return tf.keras.layers.Conv2D(filters, kernel_size, **kwargs)
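# Editorial note on the matching heuristic above: with output_stride=8, every
# 384-filter conv is forced to stride 1 (undoing one spatial downsampling),
# while the 128-, 160-, and non-VALID-padded 192-filter convs (the block17
# stage) receive dilation_rate=2 to preserve their receptive field.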
def MaxPooling2D(self, pool_size, strides, **kwargs):
"""Builds a pooling layer according to the current Object Detection config.
Overrides the Keras InceptionResnetV2 application's MaxPooling2D layers with
ones that follow the spec given by the Object Detection hyperparameters.
If feature map alignment is enabled, the padding is forced to 'same'.
If output_stride is 8, some pooling layers are matched according to their
pre-alignment padding parameters and have their `strides` argument
overridden.
Args:
pool_size: The pool size specified by the Keras application.
strides: The strides specified by the unwrapped Keras application.
**kwargs: Keyword args specified by the Keras application for
constructing the max pooling layer.
Returns:
A MaxPool2D layer specified by the Object Detection hyperparameter
configurations.
"""
if self._use_atrous and kwargs.get('padding', '').lower() == 'valid':
strides = 1
if self._align_feature_maps:
kwargs['padding'] = 'same'
return tf.keras.layers.MaxPool2D(pool_size, strides=strides, **kwargs)
# We alias MaxPool2D because Keras has that alias
MaxPool2D = MaxPooling2D # pylint: disable=invalid-name
def BatchNormalization(self, **kwargs):
"""Builds a normalization layer.
Overrides the Keras application batch norm with the norm specified by the
Object Detection configuration.
Args:
**kwargs: Keyword arguments from the `layers.BatchNormalization` calls in
the Keras application.
Returns:
A normalization layer specified by the Object Detection hyperparameter
configurations.
"""
kwargs['scale'] = self._batchnorm_scale
return freezable_batch_norm.FreezableBatchNorm(
training=self._batchnorm_training,
epsilon=self._default_batchnorm_epsilon,
momentum=self._default_batchnorm_momentum,
**kwargs)
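# Editorial note: `FreezableBatchNorm` is understood to be a
# BatchNormalization variant whose `training` behavior is fixed at
# construction, which is what allows batchnorm_training=False to freeze
# batch norm layers throughout the network.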
# Forward all non-overridden methods to the keras layers
def __getattr__(self, item):
return getattr(tf.keras.layers, item)
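# Illustrative sketch of the injection mechanism (editorial, not original
# code): the Keras application receives an instance of this class through
# its `layers` kwarg and calls, e.g., layers.Conv2D(32, 3, strides=2,
# padding='valid') for the stem conv. Calls to Conv2D,
# MaxPooling2D/MaxPool2D, and BatchNormalization hit the overrides above;
# every other attribute lookup (Activation, Concatenate, etc.) falls
# through __getattr__ to the stock tf.keras.layers implementation.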
# pylint: disable=invalid-name
def inception_resnet_v2(
batchnorm_training,
output_stride=16,
align_feature_maps=False,
batchnorm_scale=False,
weight_decay=0.00004,
default_batchnorm_momentum=0.9997,
default_batchnorm_epsilon=0.001,
**kwargs):
"""Instantiates the InceptionResnetV2 architecture.
(Modified for object detection)
This wraps the InceptionResnetV2 tensorflow Keras application, but uses the
Keras application's kwargs-based monkey-patching API to override the Keras
architecture with the following changes:
- Supports freezing batch norm layers with FreezableBatchNorms
- Adds support for feature map alignment (like in the Slim model)
- Adds support for changing the output stride (like in the Slim model)
- Changes the default batchnorm momentum to 0.9997
- Adds support for overriding various batchnorm hyperparameters
Args:
batchnorm_training: Bool. Assigned to Batch norm layer `training` param
when constructing `freezable_batch_norm.FreezableBatchNorm` layers.
output_stride: A scalar that specifies the requested ratio of input to
output spatial resolution. Only supports 8 and 16.
align_feature_maps: When true, changes all the VALID paddings in the
network to SAME padding so that the feature maps are aligned.
batchnorm_scale: If True, uses an explicit `gamma` multiplier to scale the
activations in the batch normalization layer.
weight_decay: the l2 regularization weight decay for weights variables.
(gets multiplied by 0.5 to map from slim l2 regularization weight to
Keras l2 regularization weight).
default_batchnorm_momentum: Float. Batch norm layers will be constructed
using this value as the momentum.
default_batchnorm_epsilon: small float added to variance to avoid
dividing by zero.
**kwargs: Keyword arguments forwarded directly to the
`tf.keras.applications.InceptionResNetV2` method that constructs the
Keras model.
Returns:
A Keras model instance.
"""
if output_stride != 8 and output_stride != 16:
raise ValueError('output_stride must be 8 or 16.')
layers_override = _LayersOverride(
batchnorm_training,
output_stride,
align_feature_maps=align_feature_maps,
batchnorm_scale=batchnorm_scale,
default_batchnorm_momentum=default_batchnorm_momentum,
default_batchnorm_epsilon=default_batchnorm_epsilon,
weight_decay=weight_decay)
return tf.keras.applications.InceptionResNetV2(
layers=layers_override, **kwargs)
# pylint: enable=invalid-name
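# A minimal usage sketch (editorial; assumes standard Keras application
# kwargs such as `include_top` and `weights` are simply forwarded through
# **kwargs, and the variable names are illustrative):
#
#   model = inception_resnet_v2(
#       batchnorm_training=False,   # freeze batch norm statistics
#       output_stride=8,            # dilate instead of downsampling past 1/8
#       align_feature_maps=True,    # force SAME padding network-wide
#       weights=None,
#       include_top=False)
#   features = model(images)        # `images`: a float batch of input images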