Commit 63054210 authored by Zhichao Lu's avatar Zhichao Lu Committed by pkulzc

Merged commit includes the following changes:

195269567  by Zhichao Lu:

    Removing image summaries during train mode.

--
195147413  by Zhichao Lu:

    SSDLite config for mobilenet v2.

--
194883585  by Zhichao Lu:

    Simplify TPU compatible nearest neighbor upsampling using reshape and broadcasting.
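
A minimal standalone sketch of the reshape-and-broadcast trick this entry describes (the real change is in nearest_neighbor_upsampling, shown in the diff further down; this sketch assumes static shapes for brevity):

```python
import tensorflow as tf

def nearest_neighbor_upsample(x, scale):
  """Upsample [batch, h, w, c] to [batch, h*scale, w*scale, c] using only
  reshape and broadcasting, which lowers cleanly to TPU (no tf.tile)."""
  batch, height, width, channels = x.shape.as_list()  # assumes static shapes
  # Insert singleton axes after height and width, then broadcast against an
  # all-ones tensor so every pixel is copied scale x scale times.
  x = tf.reshape(x, [batch, height, 1, width, 1, channels])
  x = x * tf.ones([1, 1, scale, 1, scale, 1], dtype=x.dtype)
  return tf.reshape(x, [batch, height * scale, width * scale, channels])
```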

--
194851009  by Zhichao Lu:

    Include ava v2.1 detection models in model zoo.

--
194292198  by Zhichao Lu:

    Add option to evaluate any checkpoint (without requiring write access to that directory and overwriting any existing logs there).

--
194122420  by Zhichao Lu:

    Fix incorrect num_gt_boxes_per_image and num_det_boxes_per_image values:
    they should not be taken from the expanded dimension.

--
193974479  by Zhichao Lu:

    Fixing a bug in the coco evaluator.

--
193959861  by Zhichao Lu:

    Read the default batch size from config file.

--
193737238  by Zhichao Lu:

    Fix data augmentation functions.

--
193576336  by Zhichao Lu:

    Add support for training keypoints.

--
193409179  by Zhichao Lu:

    Update protobuf requirements to 3+ in installation docs.

--
193382651  by Zhichao Lu:

    Updating coco evaluation metrics to allow for a batch of image info, rather than a single image.

--
193244778  by Zhichao Lu:

    Remove deprecated batch_norm_trainable field from ssd mobilenet v2 config

--
193228972  by Zhichao Lu:

    Make sure the final layers are also resized proportionally to conv_depth_ratio.

--
193204364  by Zhichao Lu:

    Do not add batch norm parameters to final conv2d ops that predict boxes encodings and class scores in weight shared conv box predictor.

    This allows us to set proper bias and force initial predictions to be background when using focal loss.
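
The bias trick referred to here is the standard focal-loss initialization: with no batch norm on the final conv, its bias can be set so that every anchor starts out scoring as background. A hedged sketch using slim, where features, num_anchors, num_classes are hypothetical placeholders and prior_prob is an assumed hyperparameter:

```python
import math
import tensorflow as tf
slim = tf.contrib.slim

prior_prob = 0.01  # assumed prior probability of foreground at initialization
# With sigmoid scores, a bias of -log((1 - p) / p) makes the initial
# foreground probability of every anchor roughly prior_prob.
bias_init = -math.log((1.0 - prior_prob) / prior_prob)

class_predictions = slim.conv2d(
    features,                       # hypothetical feature map tensor
    num_anchors * num_classes,      # one score per anchor per class
    [3, 3],
    activation_fn=None,
    normalizer_fn=None,             # no batch norm on the final prediction op
    biases_initializer=tf.constant_initializer(bias_init))
```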

--
193137342  by Zhichao Lu:

    Add a util function to visualize value histogram as a tf.summary.image.
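
One way to get this effect, sketched here with matplotlib rendered through tf.py_func; this only illustrates the idea and is not the util function the change adds:

```python
import io
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

def histogram_image_summary(name, values, bins=30):
  """Render a histogram of a 1-D tensor as a PNG and log it as an image."""
  def _render(vals):
    fig, ax = plt.subplots()
    ax.hist(vals, bins=bins)
    buf = io.BytesIO()
    fig.savefig(buf, format='png')
    plt.close(fig)
    buf.seek(0)
    image = plt.imread(buf)                      # [h, w, 4] float32 in [0, 1]
    return np.expand_dims(image, 0).astype(np.float32)

  image = tf.py_func(_render, [values], tf.float32)
  return tf.summary.image(name, image)
```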

--
193119411  by Zhichao Lu:

    Adding support for reading in logits as groundtruth labels and applying an optional temperature (scaling) before softmax in support of distillation.
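
Temperature scaling here is the usual distillation formulation: divide the teacher logits by a temperature T before the softmax so the target distribution is softer than a one-hot label. A minimal sketch with illustrative values:

```python
import tensorflow as tf

temperature = 2.0  # assumed value; T > 1 softens the target distribution
# Teacher logits read in as groundtruth labels (one row per box, illustrative).
groundtruth_logits = tf.constant([[6.0, 2.0, -1.0]])
soft_targets = tf.nn.softmax(groundtruth_logits / temperature)  # softer than one-hot
```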

--
193087707  by Zhichao Lu:

    Post-process now works again in train mode.

--
193067658  by Zhichao Lu:

    Fix flakiness in testSSDRandomCropWithMultiClassScores due to randomness.

--
192922089  by Zhichao Lu:

    Add option to set dropout for classification net in weight shared box predictor.

--
192850747  by Zhichao Lu:

    Remove inaccurate caveat from proto file.

--
192837477  by Zhichao Lu:

    Extend to accept different ratios of conv channels.

--
192813444  by Zhichao Lu:

    Adding option for one_box_for_all_classes to the box_predictor

--
192624207  by Zhichao Lu:

    Update the trainer to allow reading multiclass scores.

--
192583425  by Zhichao Lu:

    Add an implementation of the Visual Relations Detection evaluation metric
    (per-image evaluation).

--
192529600  by Zhichao Lu:

    Modify the ssd meta arch to allow the option of not adding an implicit background class.

--
192512429  by Zhichao Lu:

    Refactor model_tpu_main.py files and move continuous eval loop into model_lib.py

--
192494267  by Zhichao Lu:

    Update create_pascal_tf_record.py and create_pet_tf_record.py

--
192485456  by Zhichao Lu:

    Enforcing that all eval metric ops have valid python strings.

--
192472546  by Zhichao Lu:

    Set regularize_depthwise to true in mobilenet_v1_argscope.

--
192421843  by Zhichao Lu:

    Refactoring of Mask-RCNN to put all mask prediction code in third stage.

--
192320460  by Zhichao Lu:

    Returning eval_on_train_input_fn from create_estimator_and_inputs(), rather than using train_input_fn in EVAL mode (which will still have data augmentation).

--
192226678  by Zhichao Lu:

    Access TPUEstimator and CrossShardOptimizer from the tf namespace.

--
192195514  by Zhichao Lu:

    Fix test that was flaky due to randomness

--
192166224  by Zhichao Lu:

    Minor fixes to match git repo.

--
192147130  by Zhichao Lu:

    Use shape utils for assertions in the feature extractor.

--
192132440  by Zhichao Lu:

    Class agnostic masks for mask_rcnn

--
192006190  by Zhichao Lu:

    Add learning rate summary in EVAL mode in model.py

--
192004845  by Zhichao Lu:

    Migrating away from Experiment class, as it is now deprecated. Also, refactoring into a separate model library and binaries.

--
191957195  by Zhichao Lu:

    Add classification_loss and localization_loss metrics for TPU jobs.

--
191932855  by Zhichao Lu:

    Add an option to skip the last striding in mobilenet. The modified network has nominal output stride 16 instead of 32.

--
191787921  by Zhichao Lu:

    Add option to override base feature extractor hyperparams in SSD models. This would allow us to use the same set of hyperparams for the complete feature extractor (base + new layers) if desired.

--
191743097  by Zhichao Lu:

    Adding an attribute to SSD model to indicate which fields in prediction dictionary have a batch dimension. This will be useful for future video models.

--
191668425  by Zhichao Lu:

    Internal change.

--
191649512  by Zhichao Lu:

    Introduce two parameters in ssd.proto - freeze_batchnorm and inplace_batchnorm_update - and set up slim arg_scopes in ssd_meta_arch.py so that they apply to all batchnorm ops in the predict() method.

    This centralizes the control of freezing and doing inplace batchnorm updates.
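
A hedged sketch of the centralizing arg_scope pattern described above (argument names follow slim.batch_norm; this is not the literal code in ssd_meta_arch.py):

```python
import tensorflow as tf
slim = tf.contrib.slim

def batchnorm_scope(is_training, freeze_batchnorm, inplace_batchnorm_update):
  # updates_collections=None folds the moving-average updates into the op
  # itself (in-place updates); otherwise they go to the UPDATE_OPS collection.
  updates_collections = (None if inplace_batchnorm_update
                         else tf.GraphKeys.UPDATE_OPS)
  # Freezing batchnorm means running it in inference mode even while training.
  return slim.arg_scope(
      [slim.batch_norm],
      is_training=(is_training and not freeze_batchnorm),
      updates_collections=updates_collections)
```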

--
191620303  by Zhichao Lu:

    Modifications to the preprocessor to support multiclass scores

--

PiperOrigin-RevId: 195269567
parent 5f9f6b84
......@@ -90,6 +90,15 @@ reporting an issue.
## Release information
### April 30, 2018
We have released a Faster R-CNN detector with ResNet-101 feature extractor trained on [AVA](https://research.google.com/ava/) v2.1.
Compared with other commonly used object detectors, it changes the action classification loss function to per-class Sigmoid loss to handle boxes with multiple labels.
The model is trained on the training split of AVA v2.1 for 1.5M iterations and achieves a mean AP of 11.25% over 60 classes on the validation split of AVA v2.1.
For more details please refer to this [paper](https://arxiv.org/abs/1705.08421).
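
As a brief illustration of why a per-class sigmoid loss handles boxes with multiple labels (illustrative values, not the actual training code):

```python
import tensorflow as tf

# A single AVA box can carry several action labels at once, so labels are
# multi-hot rather than one-hot.
labels = tf.constant([[1.0, 0.0, 1.0, 0.0]])    # e.g. "stand" and "talk to"
logits = tf.constant([[2.3, -1.0, 0.7, -3.1]])  # per-class scores for the box
# Independent sigmoid cross-entropy per class; a softmax would force the
# action classes to compete for a single label.
loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
```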
<b>Thanks to contributors</b>: Chen Sun, David Ross
### April 2, 2018
Supercharge your mobile phones with the next generation mobile object detector!
......
item {
name: "bend/bow (at the waist)"
id: 1
}
item {
name: "crouch/kneel"
id: 3
}
item {
name: "dance"
id: 4
}
item {
name: "fall down"
id: 5
}
item {
name: "get up"
id: 6
}
item {
name: "jump/leap"
id: 7
}
item {
name: "lie/sleep"
id: 8
}
item {
name: "martial art"
id: 9
}
item {
name: "run/jog"
id: 10
}
item {
name: "sit"
id: 11
}
item {
name: "stand"
id: 12
}
item {
name: "swim"
id: 13
}
item {
name: "walk"
id: 14
}
item {
name: "answer phone"
id: 15
}
item {
name: "carry/hold (an object)"
id: 17
}
item {
name: "climb (e.g., a mountain)"
id: 20
}
item {
name: "close (e.g., a door, a box)"
id: 22
}
item {
name: "cut"
id: 24
}
item {
name: "dress/put on clothing"
id: 26
}
item {
name: "drink"
id: 27
}
item {
name: "drive (e.g., a car, a truck)"
id: 28
}
item {
name: "eat"
id: 29
}
item {
name: "enter"
id: 30
}
item {
name: "hit (an object)"
id: 34
}
item {
name: "lift/pick up"
id: 36
}
item {
name: "listen (e.g., to music)"
id: 37
}
item {
name: "open (e.g., a window, a car door)"
id: 38
}
item {
name: "play musical instrument"
id: 41
}
item {
name: "point to (an object)"
id: 43
}
item {
name: "pull (an object)"
id: 45
}
item {
name: "push (an object)"
id: 46
}
item {
name: "put down"
id: 47
}
item {
name: "read"
id: 48
}
item {
name: "ride (e.g., a bike, a car, a horse)"
id: 49
}
item {
name: "sail boat"
id: 51
}
item {
name: "shoot"
id: 52
}
item {
name: "smoke"
id: 54
}
item {
name: "take a photo"
id: 56
}
item {
name: "text on/look at a cellphone"
id: 57
}
item {
name: "throw"
id: 58
}
item {
name: "touch (an object)"
id: 59
}
item {
name: "turn (e.g., a screwdriver)"
id: 60
}
item {
name: "watch (e.g., TV)"
id: 61
}
item {
name: "work on a computer"
id: 62
}
item {
name: "write"
id: 63
}
item {
name: "fight/hit (a person)"
id: 64
}
item {
name: "give/serve (an object) to (a person)"
id: 65
}
item {
name: "grab (a person)"
id: 66
}
item {
name: "hand clap"
id: 67
}
item {
name: "hand shake"
id: 68
}
item {
name: "hand wave"
id: 69
}
item {
name: "hug (a person)"
id: 70
}
item {
name: "kiss (a person)"
id: 72
}
item {
name: "lift (a person)"
id: 73
}
item {
name: "listen to (a person)"
id: 74
}
item {
name: "push (another person)"
id: 76
}
item {
name: "sing to (e.g., self, a person, a group)"
id: 77
}
item {
name: "take (an object) from (a person)"
id: 78
}
item {
name: "talk to (e.g., self, a person, a group)"
id: 79
}
item {
name: "watch (a person)"
id: 80
}
......@@ -91,7 +91,7 @@ Some remarks on frozen inference graphs:
## Kitti-trained models {#kitti-models}
Model name | Speed (ms) | Pascal mAP@0.5 (ms) | Outputs
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
[faster_rcnn_resnet101_kitti](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_kitti_2018_01_28.tar.gz) | 79 | 87 | Boxes
......@@ -103,6 +103,13 @@ Model name
[faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28.tar.gz) | 347 | | Boxes
## AVA v2.1 trained models {#ava-models}
Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---: | :-------------: | :-----:
[faster_rcnn_resnet101_ava_v2.1](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_ava_v2.1_2018_04_30.tar.gz) | 93 | 11 | Boxes
[^1]: See [MSCOCO evaluation protocol](http://cocodataset.org/#detections-eval).
[^2]: This is PASCAL mAP with a slightly different way of true positives computation: see [Open Images evaluation protocol](evaluation_protocols.md#open-images).
......@@ -325,16 +325,16 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
}
eval_metric_ops = None
if mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL):
if mode == tf.estimator.ModeKeys.EVAL:
class_agnostic = (fields.DetectionResultFields.detection_classes
not in detections)
groundtruth = _get_groundtruth_data(detection_model, class_agnostic)
use_original_images = fields.InputDataFields.original_image in features
original_images = (
eval_images = (
features[fields.InputDataFields.original_image] if use_original_images
else features[fields.InputDataFields.image])
eval_dict = eval_util.result_dict_for_single_example(
original_images[0:1],
eval_images[0:1],
features[inputs.HASH_KEY][0],
detections,
groundtruth,
......@@ -355,22 +355,21 @@ def create_model_fn(detection_model_fn, configs, hparams, use_tpu=False):
img_summary = tf.summary.image('Detections_Left_Groundtruth_Right',
detection_and_groundtruth)
if mode == tf.estimator.ModeKeys.EVAL:
# Eval metrics on a single example.
eval_metrics = eval_config.metrics_set
if not eval_metrics:
eval_metrics = ['coco_detection_metrics']
eval_metric_ops = eval_util.get_eval_metric_ops_for_evaluators(
eval_metrics, category_index.values(), eval_dict,
include_metrics_per_category=False)
for loss_key, loss_tensor in iter(losses_dict.items()):
eval_metric_ops[loss_key] = tf.metrics.mean(loss_tensor)
for var in optimizer_summary_vars:
eval_metric_ops[var.op.name] = (var, tf.no_op())
if img_summary is not None:
eval_metric_ops['Detections_Left_Groundtruth_Right'] = (
img_summary, tf.no_op())
eval_metric_ops = {str(k): v for k, v in eval_metric_ops.iteritems()}
# Eval metrics on a single example.
eval_metrics = eval_config.metrics_set
if not eval_metrics:
eval_metrics = ['coco_detection_metrics']
eval_metric_ops = eval_util.get_eval_metric_ops_for_evaluators(
eval_metrics, category_index.values(), eval_dict,
include_metrics_per_category=False)
for loss_key, loss_tensor in iter(losses_dict.items()):
eval_metric_ops[loss_key] = tf.metrics.mean(loss_tensor)
for var in optimizer_summary_vars:
eval_metric_ops[var.op.name] = (var, tf.no_op())
if img_summary is not None:
eval_metric_ops['Detections_Left_Groundtruth_Right'] = (
img_summary, tf.no_op())
eval_metric_ops = {str(k): v for k, v in eval_metric_ops.iteritems()}
if use_tpu:
return tf.contrib.tpu.TPUEstimatorSpec(
......
# Faster R-CNN with Resnet-101 (v1), configuration for AVA v2.1.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.
model {
faster_rcnn {
num_classes: 80
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 600
max_dimension: 1024
}
}
feature_extractor {
type: 'faster_rcnn_resnet101'
first_stage_features_stride: 16
}
first_stage_anchor_generator {
grid_anchor_generator {
scales: [0.25, 0.5, 1.0, 2.0]
aspect_ratios: [0.5, 1.0, 2.0]
height_stride: 16
width_stride: 16
}
}
first_stage_box_predictor_conv_hyperparams {
op: CONV
regularizer {
l2_regularizer {
weight: 0.0
}
}
initializer {
truncated_normal_initializer {
stddev: 0.01
}
}
}
first_stage_nms_score_threshold: 0.0
first_stage_nms_iou_threshold: 0.7
first_stage_max_proposals: 300
first_stage_localization_loss_weight: 2.0
first_stage_objectness_loss_weight: 1.0
initial_crop_size: 14
maxpool_kernel_size: 2
maxpool_stride: 2
second_stage_box_predictor {
mask_rcnn_box_predictor {
use_dropout: false
dropout_keep_probability: 1.0
fc_hyperparams {
op: FC
regularizer {
l2_regularizer {
weight: 0.0
}
}
initializer {
variance_scaling_initializer {
factor: 1.0
uniform: true
mode: FAN_AVG
}
}
}
}
}
second_stage_post_processing {
batch_non_max_suppression {
score_threshold: 0.0
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 300
}
score_converter: SIGMOID
}
second_stage_localization_loss_weight: 2.0
second_stage_classification_loss_weight: 1.0
second_stage_classification_loss {
weighted_sigmoid {
anchorwise_output: true
}
}
}
}
train_config: {
batch_size: 1
num_steps: 1500000
optimizer {
momentum_optimizer: {
learning_rate: {
manual_step_learning_rate {
initial_learning_rate: 0.0003
schedule {
step: 1200000
learning_rate: .00003
}
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
gradient_clipping_by_norm: 10.0
merge_multiple_label_boxes: true
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
data_augmentation_options {
random_horizontal_flip {
}
}
max_number_of_boxes: 100
}
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/ava_train.record"
}
label_map_path: "PATH_TO_BE_CONFIGURED/ava_label_map_v2.1.pbtxt"
}
eval_config: {
metrics_set: "pascal_voc_detection_metrics"
use_moving_averages: false
num_examples: 57371
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/ava_val.record"
}
label_map_path: "PATH_TO_BE_CONFIGURED/ava_label_map_v2.1.pbtxt"
shuffle: false
num_readers: 1
}
......@@ -54,6 +54,7 @@ model {
use_dropout: false
dropout_keep_probability: 0.8
kernel_size: 3
use_depthwise: true
box_code_size: 4
apply_sigmoid_to_scores: false
conv_hyperparams {
......
......@@ -774,8 +774,8 @@ def nearest_neighbor_upsampling(input_tensor, scale):
Nearest neighbor upsampling function that maps input tensor with shape
[batch_size, height, width, channels] to [batch_size, height * scale
, width * scale, channels]. This implementation only uses reshape and tile to
make it compatible with certain hardware.
, width * scale, channels]. This implementation only uses reshape and
broadcasting to make it TPU compatible.
Args:
input_tensor: A float32 tensor of size [batch, height_in, width_in,
......@@ -785,13 +785,14 @@ def nearest_neighbor_upsampling(input_tensor, scale):
data_up: A float32 tensor of size
[batch, height_in*scale, width_in*scale, channels].
"""
shape = shape_utils.combined_static_and_dynamic_shape(input_tensor)
shape_before_tile = [shape[0], shape[1], 1, shape[2], 1, shape[3]]
shape_after_tile = [shape[0], shape[1] * scale, shape[2] * scale, shape[3]]
data_reshaped = tf.reshape(input_tensor, shape_before_tile)
resized_tensor = tf.tile(data_reshaped, [1, 1, scale, 1, scale, 1])
resized_tensor = tf.reshape(resized_tensor, shape_after_tile)
return resized_tensor
with tf.name_scope('nearest_neighbor_upsampling'):
(batch_size, height, width,
channels) = shape_utils.combined_static_and_dynamic_shape(input_tensor)
output_tensor = tf.reshape(
input_tensor, [batch_size, height, 1, width, 1, channels]) * tf.ones(
[1, 1, scale, 1, scale, 1], dtype=input_tensor.dtype)
return tf.reshape(output_tensor,
[batch_size, height * scale, width * scale, channels])
def matmul_gather_on_zeroth_axis(params, indices, scope=None):
......