Commit 505f554c authored by pkulzc's avatar pkulzc Committed by Sergio Guadarrama
Browse files

Internal changes to slim and object detection (#4100)

* Adding option for one_box_for_all_classes to the box_predictor

PiperOrigin-RevId: 192813444

* Extend to accept different ratios of conv channels.

PiperOrigin-RevId: 192837477

* Remove inaccurate caveat from proto file.

PiperOrigin-RevId: 192850747

* Add option to set dropout for classification net in weight shared box predictor.

PiperOrigin-RevId: 192922089

* fix flakiness in testSSDRandomCropWithMultiClassScores due to randomness.

PiperOrigin-RevId: 193067658

* Post-process now works again in train mode.

PiperOrigin-RevId: 193087707

* Adding support for reading in logits as groundtruth labels and applying an optional temperature (scaling) before softmax in support of distillation.

PiperOrigin-RevId: 193119411

* Add a util function to visualize value histogram as a tf.summary.image.

PiperOrigin-RevId: 193137342

* Do not add batch norm parameters to final conv2d ops that predict boxes encodings and class scores in weight shared conv box predictor.

This allows us to set proper bias and force initial predictions to be background when using focal loss.

PiperOrigin-RevId: 193204364

* Make sure the final layers are also resized proportional to conv_depth_ratio.

PiperOrigin-RevId: 193228972

* Remove deprecated batch_norm_trainable field from ssd mobilenet v2 config

PiperOrigin-RevId: 193244778

* Updating coco evaluation metrics to allow for a batch of image info, rather than a single image.

PiperOrigin-RevId: 193382651

* Update protobuf requirements to 3+ in installation docs.

PiperOrigin-RevId: 193409179

* Add support for training keypoints.

PiperOrigin-RevId: 193576336

* Fix data augmentation functions.

PiperOrigin-RevId: 193737238

* Read the default batch size from config file.

PiperOrigin-RevId: 193959861

* Fixing a bug in the coco evaluator.

PiperOrigin-RevId: 193974479

* Fix incorrect num_gt_boxes_per_image and num_det_boxes_per_image values.
They should not use the expanded dimensions.

PiperOrigin-RevId: 194122420

* Add option to evaluate any checkpoint (without requiring write access to that directory and overwriting any existing logs there).

PiperOrigin-RevId: 194292198

* PiperOrigin-RevId: 190346687

* - Expose slim arg_scope function to compute keys to enable testing.
- Add is_training=None option to mobilenet arg_scopes. This allows the users to set is_training from an outer scope.

PiperOrigin-RevId: 190997959

* Add an option to not set slim arg_scope for batch_norm is_training parameter. This enables users to set the is_training parameter from an outer scope.

PiperOrigin-RevId: 191611934

* PiperOrigin-RevId: 191955231

* PiperOrigin-RevId: 193254125

* PiperOrigin-RevId: 193371562

* PiperOrigin-RevId: 194085628
parent 5c78b9d7
......@@ -14,6 +14,8 @@
# ==============================================================================
"""Mobilenet v1 Faster R-CNN implementation."""
import numpy as np
import tensorflow as tf
from object_detection.meta_architectures import faster_rcnn_meta_arch
......@@ -23,22 +25,31 @@ from nets import mobilenet_v1
slim = tf.contrib.slim
# MobileNet v1 conv definitions at depth multiplier 1.0 with the final
# downsampling removed: per the name, the last DepthSepConv layers use
# stride 1, keeping a larger final feature map (useful when this backbone is
# used as a Faster R-CNN feature extractor).
# NOTE(review): _get_mobilenet_conv_no_last_stride_defs below generates the
# same defs for configurable depth ratios (100 reproduces this list).
_MOBILENET_V1_100_CONV_NO_LAST_STRIDE_DEFS = [
    mobilenet_v1.Conv(kernel=[3, 3], stride=2, depth=32),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=64),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=2, depth=128),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=128),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=2, depth=256),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=256),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=2, depth=512),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=1024),
    mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=1024)
]
def _get_mobilenet_conv_no_last_stride_defs(conv_depth_ratio_in_percentage):
  """Builds MobileNet v1 conv defs with the final downsampling stride removed.

  Channel counts follow the standard MobileNet v1 (depth multiplier 1.0)
  layers, scaled by `conv_depth_ratio_in_percentage`. Per the function name,
  the trailing layers use stride 1 so the final feature map keeps a higher
  spatial resolution.

  Args:
    conv_depth_ratio_in_percentage: int, percentage by which to scale every
      layer's channel depth. Must be one of 25, 50, 75 or 100.

  Returns:
    A list of mobilenet_v1.Conv / mobilenet_v1.DepthSepConv definitions
    suitable for the `conv_defs` argument of mobilenet_v1_base.

  Raises:
    ValueError: if `conv_depth_ratio_in_percentage` is not a supported value.
  """
  if conv_depth_ratio_in_percentage not in [25, 50, 75, 100]:
    raise ValueError(
        'Only the following ratio percentages are supported: 25, 50, 75, 100')
  depth_ratio = float(conv_depth_ratio_in_percentage) / 100.0
  channels = np.array([
      32, 64, 128, 128, 256, 256, 512, 512, 512, 512, 512, 512, 1024, 1024
  ], dtype=np.float32)
  channels = (channels * depth_ratio).astype(np.int32)
  # Per-layer strides; the first entry belongs to the initial Conv layer and
  # the rest to DepthSepConv layers. The last stride-2 layer of standard
  # MobileNet v1 is replaced with stride 1 here ("no last stride").
  strides = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1]
  conv_defs = [
      mobilenet_v1.Conv(kernel=[3, 3], stride=strides[0], depth=channels[0])
  ]
  conv_defs.extend(
      mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=stride, depth=depth)
      for stride, depth in zip(strides[1:], channels[1:]))
  return conv_defs
class FasterRCNNMobilenetV1FeatureExtractor(
......@@ -53,7 +64,8 @@ class FasterRCNNMobilenetV1FeatureExtractor(
weight_decay=0.0,
depth_multiplier=1.0,
min_depth=16,
skip_last_stride=False):
skip_last_stride=False,
conv_depth_ratio_in_percentage=100):
"""Constructor.
Args:
......@@ -65,6 +77,8 @@ class FasterRCNNMobilenetV1FeatureExtractor(
depth_multiplier: float depth multiplier for feature extractor.
min_depth: minimum feature extractor depth.
skip_last_stride: Skip the last stride if True.
conv_depth_ratio_in_percentage: Conv depth ratio in percentage. Only
applied if skip_last_stride is True.
Raises:
ValueError: If `first_stage_features_stride` is not 8 or 16.
......@@ -74,6 +88,7 @@ class FasterRCNNMobilenetV1FeatureExtractor(
self._depth_multiplier = depth_multiplier
self._min_depth = min_depth
self._skip_last_stride = skip_last_stride
self._conv_depth_ratio_in_percentage = conv_depth_ratio_in_percentage
super(FasterRCNNMobilenetV1FeatureExtractor, self).__init__(
is_training, first_stage_features_stride, batch_norm_trainable,
reuse_weights, weight_decay)
......@@ -124,7 +139,9 @@ class FasterRCNNMobilenetV1FeatureExtractor(
reuse=self._reuse_weights) as scope:
params = {}
if self._skip_last_stride:
params['conv_defs'] = _MOBILENET_V1_100_CONV_NO_LAST_STRIDE_DEFS
params['conv_defs'] = _get_mobilenet_conv_no_last_stride_defs(
conv_depth_ratio_in_percentage=self.
_conv_depth_ratio_in_percentage)
_, activations = mobilenet_v1.mobilenet_v1_base(
preprocessed_inputs,
final_endpoint='Conv2d_11_pointwise',
......@@ -150,6 +167,11 @@ class FasterRCNNMobilenetV1FeatureExtractor(
"""
net = proposal_feature_maps
conv_depth = 1024
if self._skip_last_stride:
conv_depth_ratio = float(self._conv_depth_ratio_in_percentage) / 100.0
conv_depth = int(float(conv_depth) * conv_depth_ratio)
depth = lambda d: max(int(d * 1.0), 16)
with tf.variable_scope('MobilenetV1', reuse=self._reuse_weights):
with slim.arg_scope(
......@@ -160,13 +182,13 @@ class FasterRCNNMobilenetV1FeatureExtractor(
[slim.conv2d, slim.separable_conv2d], padding='SAME'):
net = slim.separable_conv2d(
net,
depth(1024), [3, 3],
depth(conv_depth), [3, 3],
depth_multiplier=1,
stride=2,
scope='Conv2d_12_pointwise')
return slim.separable_conv2d(
net,
depth(1024), [3, 3],
depth(conv_depth), [3, 3],
depth_multiplier=1,
stride=1,
scope='Conv2d_13_pointwise')
......@@ -20,7 +20,7 @@ message ConvolutionalBoxPredictor {
// Hyperparameters for convolution ops used in the box predictor.
optional Hyperparams conv_hyperparams = 1;
// Minumum feature depth prior to predicting box encodings and class
// Minimum feature depth prior to predicting box encodings and class
// predictions.
optional int32 min_depth = 2 [default = 0];
......@@ -81,6 +81,12 @@ message WeightSharedConvolutionalBoxPredictor {
// training where there are large number of negative boxes. See
// https://arxiv.org/abs/1708.02002 for details.
optional float class_prediction_bias_init = 10 [default = 0.0];
// Whether to use dropout for class prediction.
optional bool use_dropout = 11 [default = false];
// Keep probability for dropout
optional float dropout_keep_probability = 12 [default = 0.8];
}
message MaskRCNNBoxPredictor {
......@@ -119,6 +125,10 @@ message MaskRCNNBoxPredictor {
// branch.
optional int32 mask_prediction_num_conv_layers = 11 [default = 2];
optional bool masks_are_class_agnostic = 12 [default = false];
// Whether to use one box for all classes rather than a different box for each
// class.
optional bool share_box_across_classes = 13 [default = false];
}
message RfcnBoxPredictor {
......
......@@ -60,6 +60,9 @@ message InputReader {
// Number of parallel decode ops to apply.
optional uint32 num_parallel_map_calls = 14 [default = 64];
// Number of groundtruth keypoints per object.
optional uint32 num_keypoints = 16 [default = 0];
// Whether to load groundtruth instance masks.
optional bool load_instance_masks = 7 [default = false];
......
......@@ -60,6 +60,7 @@ message ClassificationLoss {
oneof classification_loss {
WeightedSigmoidClassificationLoss weighted_sigmoid = 1;
WeightedSoftmaxClassificationLoss weighted_softmax = 2;
WeightedSoftmaxClassificationAgainstLogitsLoss weighted_logits_softmax = 5;
BootstrappedSigmoidClassificationLoss bootstrapped_sigmoid = 3;
SigmoidFocalClassificationLoss weighted_sigmoid_focal = 4;
}
......@@ -93,6 +94,17 @@ message WeightedSoftmaxClassificationLoss {
optional float logit_scale = 2 [default = 1.0];
}
// Classification loss using a softmax function over class predictions and
// a softmax function over the groundtruth labels (assumed to be logits).
message WeightedSoftmaxClassificationAgainstLogitsLoss {
// DEPRECATED, do not use.
optional bool anchorwise_output = 1 [default = false];
// Scale and softmax groundtruth logits before calculating softmax
// classification loss. Typically used for softmax distillation with teacher
// annotations stored as logits.
optional float logit_scale = 2 [default = 1.0];
}
// Classification loss using a sigmoid function over the class prediction with
// the highest prediction score.
message BootstrappedSigmoidClassificationLoss {
......
......@@ -113,7 +113,8 @@ message SsdFeatureExtractor {
optional int32 pad_to_multiple = 5 [default = 1];
// Whether to use explicit padding when extracting SSD multiresolution
// features. Note that this does not apply to the base feature extractor.
// features. This will also apply to the base feature extractor if a MobileNet
// architecture is used.
optional bool use_explicit_padding = 7 [default=false];
// Whether to use depthwise separable convolutions for to extract additional
......
......@@ -105,7 +105,6 @@ model {
epsilon: 0.001,
}
}
batch_norm_trainable: true
}
loss {
classification_loss {
......
......@@ -689,3 +689,32 @@ def add_cdf_image_summary(values, name):
return image
cdf_plot = tf.py_func(cdf_plot, [values], tf.uint8)
tf.summary.image(name, cdf_plot)
def add_hist_image_summary(values, bins, name):
  """Adds a tf.summary.image for a histogram plot of the values.

  Plots the histogram of values and creates a tf image summary.

  Args:
    values: a 1-D float32 tensor containing the values.
    bins: bin edges which will be directly passed to np.histogram.
    name: name for the image summary.
  """

  def hist_plot(values, bins):
    """Numpy function to plot hist."""
    fig = plt.figure(frameon=False)
    ax = fig.add_subplot(111)
    y, x = np.histogram(values, bins=bins)
    # np.histogram returns len(x) == len(y) + 1 (bin edges), so drop the last
    # edge to plot counts against left edges.
    ax.plot(x[:-1], y)
    ax.set_ylabel('count')
    ax.set_xlabel('value')
    fig.canvas.draw()
    width, height = fig.get_size_inches() * fig.get_dpi()
    # np.frombuffer (not the deprecated np.fromstring) decodes the rendered
    # RGB canvas into a [1, height, width, 3] uint8 image batch.
    image = np.frombuffer(
        fig.canvas.tostring_rgb(), dtype='uint8').reshape(
            1, int(height), int(width), 3)
    # Close the figure so repeated calls (e.g. one per eval step) do not leak
    # matplotlib figures.
    plt.close(fig)
    return image

  hist_plot = tf.py_func(hist_plot, [values, bins], tf.uint8)
  tf.summary.image(name, hist_plot)
......@@ -187,6 +187,15 @@ class VisualizationUtilsTest(tf.test.TestCase):
with self.test_session():
cdf_image_summary.eval()
def test_add_hist_image_summary(self):
  """Smoke test: the histogram image summary builds and evaluates cleanly."""
  values = [0.1, 0.2, 0.3, 0.4, 0.42, 0.44, 0.46, 0.48, 0.50]
  bins = [0.01 * i for i in range(101)]  # 100 uniform bins over [0, 1].
  visualization_utils.add_hist_image_summary(values, bins,
                                             'ScoresDistribution')
  # add_hist_image_summary registers its op in the graph SUMMARIES collection.
  hist_image_summary = tf.get_collection(key=tf.GraphKeys.SUMMARIES)[0]
  with self.test_session():
    # eval() exercises the underlying py_func plot; failure raises here.
    hist_image_summary.eval()
if __name__ == '__main__':
tf.test.main()
......@@ -97,14 +97,19 @@ def cyclegan_upsample(net, num_outputs, stride, method='conv2d_transpose'):
net, [stride[0] * height, stride[1] * width])
net = tf.pad(net, spatial_pad_1, 'REFLECT')
net = layers.conv2d(net, num_outputs, kernel_size=[3, 3], padding='valid')
if method == 'bilinear_upsample_conv':
elif method == 'bilinear_upsample_conv':
net = tf.image.resize_bilinear(
net, [stride[0] * height, stride[1] * width])
net = tf.pad(net, spatial_pad_1, 'REFLECT')
net = layers.conv2d(net, num_outputs, kernel_size=[3, 3], padding='valid')
elif method == 'conv2d_transpose':
# This corrects 1 pixel offset for images with even width and height.
# conv2d is left aligned and conv2d_transpose is right aligned for even
# sized images (while doing 'SAME' padding).
# Note: This doesn't reflect actual model in paper.
net = layers.conv2d_transpose(
net, num_outputs, kernel_size=[3, 3], stride=stride, padding='same')
net, num_outputs, kernel_size=[3, 3], stride=stride, padding='valid')
net = net[:, 1:, 1:, :]
else:
raise ValueError('Unknown method: [%s]', method)
......
......@@ -175,6 +175,7 @@ def expanded_conv(input_tensor,
depthwise_channel_multiplier=1,
endpoints=None,
use_explicit_padding=False,
padding='SAME',
scope=None):
"""Depthwise Convolution Block with expansion.
......@@ -214,6 +215,7 @@ def expanded_conv(input_tensor,
use_explicit_padding: Use 'VALID' padding for convolutions, but prepad
inputs so that the output dimensions are the same as if 'SAME' padding
were used.
padding: Padding type to use if `use_explicit_padding` is not set.
scope: optional scope.
Returns:
......@@ -228,8 +230,10 @@ def expanded_conv(input_tensor,
if depthwise_location not in [None, 'input', 'output', 'expansion']:
raise TypeError('%r is unknown value for depthwise_location' %
depthwise_location)
padding = 'SAME'
if use_explicit_padding:
if padding != 'SAME':
raise TypeError('`use_explicit_padding` should only be used with '
'"SAME" padding.')
padding = 'VALID'
depthwise_func = functools.partial(
slim.separable_conv2d,
......
......@@ -114,6 +114,37 @@ def op(opfunc, **params):
return _Op(opfunc, params=params, multiplier_func=multiplier)
class NoOpScope(object):
  """Context manager that performs no setup and no teardown.

  Used as a drop-in stand-in where an `arg_scope` would otherwise go but no
  arguments need to be applied.
  """

  def __enter__(self):
    # Nothing to acquire; the `as` target (if any) is bound to None.
    return None

  def __exit__(self, exc_type, exc_value, traceback):
    # Returning False lets any exception raised in the block propagate.
    return False
def safe_arg_scope(funcs, **kwargs):
  """Returns `slim.arg_scope` with all None arguments removed.

  Arguments:
    funcs: Functions to pass to `arg_scope`.
    **kwargs: Arguments to pass to `arg_scope`.

  Returns:
    arg_scope or No-op context manager.

  Note: can be useful if None value should be interpreted as "do not overwrite
    this parameter value".
  """
  non_none_args = {key: val for key, val in kwargs.items() if val is not None}
  if not non_none_args:
    # Every argument was None: apply no scope at all.
    return NoOpScope()
  return slim.arg_scope(funcs, **non_none_args)
@slim.add_arg_scope
def mobilenet_base( # pylint: disable=invalid-name
inputs,
......@@ -163,7 +194,9 @@ def mobilenet_base( # pylint: disable=invalid-name
only. It is safe to set it to the value matching
training_scope(is_training=...). It is also safe to explicitly set
it to False, even if there is an outer training_scope set to training.
(The network will be built in inference mode).
(The network will be built in inference mode). If this is set to None,
no arg_scope is added for slim.batch_norm's is_training parameter.
Returns:
tensor_out: output tensor.
end_points: a set of activations for external use, for example summaries or
......@@ -194,7 +227,7 @@ def mobilenet_base( # pylint: disable=invalid-name
# c) set all defaults
# d) set all extra overrides.
with _scope_all(scope, default_scope='Mobilenet'), \
slim.arg_scope([slim.batch_norm], is_training=is_training), \
safe_arg_scope([slim.batch_norm], is_training=is_training), \
_set_arg_scope_defaults(conv_defs_defaults), \
_set_arg_scope_defaults(conv_defs_overrides):
# The current_stride variable keeps track of the output stride of the
......@@ -394,14 +427,16 @@ def training_scope(is_training=True,
# initialized appropriately.
Args:
is_training: if set to False this will ensure that all customizations are
set to non-training mode. This might be helpful for code that is reused
across both training/evaluation, but most of the time training_scope with
value False is not needed.
set to non-training mode. This might be helpful for code that is reused
across both training/evaluation, but most of the time training_scope with
value False is not needed. If this is set to None, the parameter is not
added to the batch_norm arg_scope.
weight_decay: The weight decay to use for regularizing the model.
stddev: Standard deviation for initialization, if negative uses xavier.
dropout_keep_prob: dropout keep probability
bn_decay: decay for the batch norm moving averages.
dropout_keep_prob: dropout keep probability (not set if equals to None).
bn_decay: decay for the batch norm moving averages (not set if equals to
None).
Returns:
An argument scope to use via arg_scope.
......@@ -409,10 +444,9 @@ def training_scope(is_training=True,
# Note: do not introduce parameters that would change the inference
# model here (for example whether to use bias), modify conv_def instead.
batch_norm_params = {
'is_training': is_training,
'decay': bn_decay,
'is_training': is_training
}
if stddev < 0:
weight_intitializer = slim.initializers.xavier_initializer()
else:
......@@ -424,8 +458,8 @@ def training_scope(is_training=True,
weights_initializer=weight_intitializer,
normalizer_fn=slim.batch_norm), \
slim.arg_scope([mobilenet_base, mobilenet], is_training=is_training),\
slim.arg_scope([slim.batch_norm], **batch_norm_params), \
slim.arg_scope([slim.dropout], is_training=is_training,
safe_arg_scope([slim.batch_norm], **batch_norm_params), \
safe_arg_scope([slim.dropout], is_training=is_training,
keep_prob=dropout_keep_prob), \
slim.arg_scope([slim.conv2d], \
weights_regularizer=slim.l2_regularizer(weight_decay)), \
......
......@@ -171,6 +171,19 @@ class MobilenetV2Test(tf.test.TestCase):
use_explicit_padding=True)
self.assertEqual(out.get_shape().as_list()[1:3], [14, 14])
def testBatchNormScopeDoesNotHaveIsTrainingWhenItsSetToNone(self):
  # is_training=None must leave is_training out of the batch_norm arg_scope
  # so callers can supply it from an outer scope.
  sc = mobilenet.training_scope(is_training=None)
  self.assertNotIn('is_training', sc[slim.arg_scope_func_key(
      slim.batch_norm)])
def testBatchNormScopeDoesHasIsTrainingWhenItsNotNone(self):
  # Any non-None value (explicit False/True or the default) must record
  # is_training in the batch_norm arg_scope.
  sc = mobilenet.training_scope(is_training=False)
  self.assertIn('is_training', sc[slim.arg_scope_func_key(slim.batch_norm)])
  sc = mobilenet.training_scope(is_training=True)
  self.assertIn('is_training', sc[slim.arg_scope_func_key(slim.batch_norm)])
  sc = mobilenet.training_scope()
  self.assertIn('is_training', sc[slim.arg_scope_func_key(slim.batch_norm)])
if __name__ == '__main__':
tf.test.main()
......@@ -434,7 +434,8 @@ def mobilenet_v1_arg_scope(is_training=True,
"""Defines the default MobilenetV1 arg scope.
Args:
is_training: Whether or not we're training the model.
is_training: Whether or not we're training the model. If this is set to
None, the parameter is not added to the batch_norm arg_scope.
weight_decay: The weight decay to use for regularizing the model.
stddev: The standard deviation of the truncated normal weight initializer.
regularize_depthwise: Whether or not apply regularization on depthwise.
......@@ -446,12 +447,13 @@ def mobilenet_v1_arg_scope(is_training=True,
An `arg_scope` to use for the mobilenet v1 model.
"""
batch_norm_params = {
'is_training': is_training,
'center': True,
'scale': True,
'decay': batch_norm_decay,
'epsilon': batch_norm_epsilon,
}
if is_training is not None:
batch_norm_params['is_training'] = is_training
# Set weight_decay for weights in Conv and DepthSepConv layers.
weights_init = tf.truncated_normal_initializer(stddev=stddev)
......
......@@ -517,6 +517,18 @@ class MobilenetV1Test(tf.test.TestCase):
logits_out = sess.run(logits)
self.assertListEqual(list(logits_out.shape), [1, 1, 1, num_classes])
def testBatchNormScopeDoesNotHaveIsTrainingWhenItsSetToNone(self):
  # With is_training=None, mobilenet_v1_arg_scope must omit is_training from
  # the batch_norm arg_scope so an outer scope can control it.
  sc = mobilenet_v1.mobilenet_v1_arg_scope(is_training=None)
  self.assertNotIn('is_training', sc[slim.arg_scope_func_key(
      slim.batch_norm)])
def testBatchNormScopeDoesHasIsTrainingWhenItsNotNone(self):
  # Explicit True/False and the default must all record is_training in the
  # batch_norm arg_scope.
  sc = mobilenet_v1.mobilenet_v1_arg_scope(is_training=True)
  self.assertIn('is_training', sc[slim.arg_scope_func_key(slim.batch_norm)])
  sc = mobilenet_v1.mobilenet_v1_arg_scope(is_training=False)
  self.assertIn('is_training', sc[slim.arg_scope_func_key(slim.batch_norm)])
  sc = mobilenet_v1.mobilenet_v1_arg_scope()
  self.assertIn('is_training', sc[slim.arg_scope_func_key(slim.batch_norm)])
if __name__ == '__main__':
tf.test.main()
......@@ -154,9 +154,6 @@ def pix2pix_generator(net,
blocks = blocks or _default_generator_blocks()
input_size = net.get_shape().as_list()
height, width = input_size[1], input_size[2]
if height != width:
raise ValueError('The input height must match the input width.')
input_size[3] = num_outputs
......@@ -213,7 +210,10 @@ def pix2pix_generator(net,
end_points['decoder%d' % block_id] = net
with tf.variable_scope('output'):
logits = layers.conv2d(net, num_outputs, [4, 4], activation_fn=None)
# Explicitly set the normalizer_fn to None to override any default value
# that may come from an arg_scope, such as pix2pix_arg_scope.
logits = layers.conv2d(
net, num_outputs, [4, 4], activation_fn=None, normalizer_fn=None)
logits = tf.reshape(logits, input_size)
end_points['logits'] = logits
......
......@@ -24,18 +24,6 @@ from nets import pix2pix
class GeneratorTest(tf.test.TestCase):
def test_nonsquare_inputs_raise_exception(self):
  """Non-square inputs make pix2pix_generator raise ValueError."""
  batch_size = 2
  height, width = 240, 320  # deliberately non-square
  num_outputs = 4
  images = tf.ones((batch_size, height, width, 3))
  with self.assertRaises(ValueError):
    with tf.contrib.framework.arg_scope(pix2pix.pix2pix_arg_scope()):
      pix2pix.pix2pix_generator(
          images, num_outputs, upsample_method='nn_upsample_conv')
def _reduced_default_blocks(self):
"""Returns the default blocks, scaled down to make test run faster."""
return [pix2pix.Block(b.num_filters // 32, b.decoder_keep_prob)
......
......@@ -65,6 +65,16 @@ resnet_arg_scope = resnet_utils.resnet_arg_scope
slim = tf.contrib.slim
class NoOpScope(object):
  """Do-nothing context manager.

  Stands in for `slim.arg_scope` when no scope should be applied; entering
  yields None and exiting never suppresses exceptions.
  """

  def __enter__(self):
    # No resource to provide.
    return None

  def __exit__(self, exc_type, exc_value, traceback):
    # False: propagate whatever the managed block raised.
    return False
@slim.add_arg_scope
def bottleneck(inputs,
depth,
......@@ -169,7 +179,9 @@ def resnet_v1(inputs,
is a resnet_utils.Block object describing the units in the block.
num_classes: Number of predicted classes for classification tasks.
If 0 or None, we return the features before the logit layer.
is_training: whether batch_norm layers are in training mode.
is_training: whether batch_norm layers are in training mode. If this is set
to None, the callers can specify slim.batch_norm's is_training parameter
from an outer slim.arg_scope.
global_pool: If True, we perform global average pooling before computing the
logits. Set to True for image classification, False for dense prediction.
output_stride: If None, then the output will be computed at the nominal
......@@ -211,7 +223,8 @@ def resnet_v1(inputs,
with slim.arg_scope([slim.conv2d, bottleneck,
resnet_utils.stack_blocks_dense],
outputs_collections=end_points_collection):
with slim.arg_scope([slim.batch_norm], is_training=is_training):
with (slim.arg_scope([slim.batch_norm], is_training=is_training)
if is_training is not None else NoOpScope()):
net = inputs
if include_root_block:
if output_stride is not None:
......
......@@ -353,6 +353,25 @@ class ResnetCompleteNetworkTest(tf.test.TestCase):
self.assertListEqual(end_points['global_pool'].get_shape().as_list(),
[2, 1, 1, 32])
def testClassificationEndPointsWithNoBatchNormArgscope(self):
  """Network builds when is_training=None skips the batch_norm arg_scope."""
  global_pool = True
  num_classes = 10
  inputs = create_test_input(2, 224, 224, 3)
  with slim.arg_scope(resnet_utils.resnet_arg_scope()):
    # is_training=None: resnet_v1 applies no is_training arg_scope, leaving
    # batch_norm's training mode to any outer scope.
    logits, end_points = self._resnet_small(inputs, num_classes,
                                            global_pool=global_pool,
                                            spatial_squeeze=False,
                                            is_training=None,
                                            scope='resnet')
  self.assertTrue(logits.op.name.startswith('resnet/logits'))
  self.assertListEqual(logits.get_shape().as_list(), [2, 1, 1, num_classes])
  self.assertTrue('predictions' in end_points)
  self.assertListEqual(end_points['predictions'].get_shape().as_list(),
                       [2, 1, 1, num_classes])
  self.assertTrue('global_pool' in end_points)
  self.assertListEqual(end_points['global_pool'].get_shape().as_list(),
                       [2, 1, 1, 32])
def testEndpointNames(self):
# Like ResnetUtilsTest.testEndPointsV1(), but for the public API.
global_pool = True
......
......@@ -552,8 +552,6 @@ def main(_):
# Merge all summaries together.
summary_op = tf.summary.merge(list(summaries), name='summary_op')
# Add config to avoid 'could not satisfy explicit device' problem
sess_config = tf.ConfigProto(allow_soft_placement=True)
###########################
# Kicks off the training. #
......@@ -569,8 +567,7 @@ def main(_):
log_every_n_steps=FLAGS.log_every_n_steps,
save_summaries_secs=FLAGS.save_summaries_secs,
save_interval_secs=FLAGS.save_interval_secs,
sync_optimizer=optimizer if FLAGS.sync_replicas else None,
session_config=sess_config)
sync_optimizer=optimizer if FLAGS.sync_replicas else None)
if __name__ == '__main__':
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment