Unverified commit 70255908, authored by pkulzc and committed by GitHub

Object detection Internal Changes. (#4757)

* Merged commit includes the following changes:
204316992  by Zhichao Lu:

    Update docs to prepare inputs

--
204309254  by Zhichao Lu:

    Update running_pets.md to use new binaries and correct a few things in running_on_cloud.md

--
204306734  by Zhichao Lu:

    Move old binaries into legacy folder and add deprecation notice.

--
204267757  by Zhichao Lu:

    Fix a problem in VRD evaluation with missing groundtruth annotations for
    images that do not contain objects from the 62 groundtruth classes.

--
204167430  by Zhichao Lu:

    This fixes a flaky losses test failure.

--
203670721  by Zhichao Lu:

    Internal change.

--
203569388  by Zhichao Lu:

    Internal change

--
203546580  by Zhichao Lu:

    * Expand TPU compatibility g3doc with config snippets
    * Change mscoco dataset path in sample configs to the sharded versions

--
203325694  by Zhichao Lu:

    Make merge_multiple_label_boxes work for model_main code path.

--
203305655  by Zhichao Lu:

    Remove the 1x1 conv layer before pooling in MobileNet-v1-PPN feature extractor.

--
203139608  by Zhichao Lu:

    - Support exponential_decay with burnin learning rate schedule.
    - Add the minimum learning rate option.
    - Make the exponential decay start only after the burnin steps.
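
    For reference, a rough NumPy sketch of the resulting schedule (illustrative
    parameter values, not the actual TensorFlow implementation): the rate is held
    at the burn-in value for burnin_steps, decays exponentially from the end of
    burn-in rather than from step 0, and is clamped at the configured minimum.

    import numpy as np

    def decayed_learning_rate(step, base_lr=0.004, decay_steps=1000,
                              decay_factor=0.95, burnin_lr=0.001,
                              burnin_steps=2000, min_lr=1e-5, staircase=True):
      # During burn-in the learning rate is held constant.
      if step < burnin_steps:
        return burnin_lr
      # Decay is measured from the end of the burn-in period, not from step 0.
      exponent = (step - burnin_steps) / float(decay_steps)
      if staircase:
        exponent = np.floor(exponent)
      decayed = base_lr * decay_factor ** exponent
      # Never drop below the configured minimum learning rate.
      return max(decayed, min_lr)

    print([decayed_learning_rate(s) for s in (0, 1999, 2000, 12000, 500000)])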

--
203068703  by Zhichao Lu:

    Modify create_coco_tf_record.py to output sharded files.
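
    The sharded output files follow the usual name-00000-of-00100 naming scheme
    that the updated sample configs now point at. A minimal sketch of that
    sharding pattern (an assumption about the general approach, not the actual
    script, which may organize its writers differently):

    import tensorflow as tf

    def write_sharded_tfrecords(tf_examples, output_path, num_shards=100):
      # One TFRecordWriter per shard, named e.g. train.record-00000-of-00100.
      writers = [
          tf.python_io.TFRecordWriter(
              '%s-%05d-of-%05d' % (output_path, shard, num_shards))
          for shard in range(num_shards)
      ]
      for index, example in enumerate(tf_examples):
        # Spread tf.train.Example protos across shards round-robin.
        writers[index % num_shards].write(example.SerializeToString())
      for writer in writers:
        writer.close()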

--
203025308  by Zhichao Lu:

    Add an option to share the prediction tower in WeightSharedBoxPredictor.

--
203024942  by Zhichao Lu:

    Move ssd mobilenet v1 ppn configs to third party.

--
202901259  by Zhichao Lu:

    Delete obsolete ssd mobilenet v1 focal loss configs and update pets dataset path

--
202894154  by Zhichao Lu:

    Move all TPU compatible ssd mobilenet v1 coco14/pet configs to third party.

--
202861774  by Zhichao Lu:

    Move Retinanet (SSD + FPN + Shared box predictor) configs to third_party.

--

PiperOrigin-RevId: 204316992

* Add original files back.
parent ee6fdda1
# SSD with Mobilenet v1 with quantized training.
# Trained on COCO, initialized from Imagenet classification checkpoint
# Achieves 18.2 mAP on coco14 minival dataset.
# This config is TPU compatible
model {
ssd {
inplace_batchnorm_update: true
freeze_batchnorm: false
num_classes: 90
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
use_matmul_gather: true
}
}
similarity_calculator {
iou_similarity {
}
}
encode_background_as_zeros: true
anchor_generator {
ssd_anchor_generator {
num_layers: 6
min_scale: 0.2
max_scale: 0.95
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
aspect_ratios: 3.0
aspect_ratios: 0.3333
}
}
image_resizer {
fixed_shape_resizer {
height: 300
width: 300
}
}
box_predictor {
convolutional_box_predictor {
min_depth: 0
max_depth: 0
num_layers_before_predictor: 0
use_dropout: false
dropout_keep_probability: 0.8
kernel_size: 1
box_code_size: 4
apply_sigmoid_to_scores: false
class_prediction_bias_init: -4.6
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
scale: true,
center: true,
decay: 0.97,
epsilon: 0.001,
}
}
}
}
feature_extractor {
type: 'ssd_mobilenet_v1'
min_depth: 16
depth_multiplier: 0.75
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
scale: true,
center: true,
decay: 0.97,
epsilon: 0.001,
}
}
override_base_feature_extractor_hyperparams: true
}
loss {
classification_loss {
weighted_sigmoid_focal {
alpha: 0.75,
gamma: 2.0
}
}
localization_loss {
weighted_smooth_l1 {
}
}
classification_weight: 1.0
localization_weight: 1.0
}
normalize_loss_by_num_matches: true
normalize_loc_loss_by_codesize: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
}
}
train_config: {
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
batch_size: 128
sync_replicas: true
startup_delay_steps: 0
replicas_to_aggregate: 8
num_steps: 50000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}
optimizer {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: .2
total_steps: 50000
warmup_learning_rate: 0.06
warmup_steps: 2000
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
}
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-00000-of-00100"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
num_examples: 8000
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-00000-of-00010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
shuffle: false
num_readers: 1
}
graph_rewriter {
quantization {
delay: 48000
activation_bits: 8
weight_bits: 8
}
}
# SSD with Mobilenet v1 0.75 depth multiplied feature extractor, focal loss and
# quantized training.
# Trained on IIIT-Oxford pets, initialized from COCO detection checkpoint
# This config is TPU compatible
model {
ssd {
inplace_batchnorm_update: true
freeze_batchnorm: false
num_classes: 37
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
use_matmul_gather: true
}
}
similarity_calculator {
iou_similarity {
}
}
encode_background_as_zeros: true
anchor_generator {
ssd_anchor_generator {
num_layers: 6
min_scale: 0.2
max_scale: 0.95
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
aspect_ratios: 3.0
aspect_ratios: 0.3333
}
}
image_resizer {
fixed_shape_resizer {
height: 300
width: 300
}
}
box_predictor {
convolutional_box_predictor {
min_depth: 0
max_depth: 0
num_layers_before_predictor: 0
use_dropout: false
dropout_keep_probability: 0.8
kernel_size: 1
box_code_size: 4
apply_sigmoid_to_scores: false
class_prediction_bias_init: -4.6
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
train: true,
scale: true,
center: true,
decay: 0.9,
epsilon: 0.001,
}
}
}
}
feature_extractor {
type: 'ssd_mobilenet_v1'
min_depth: 16
depth_multiplier: 0.75
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
scale: true,
center: true,
decay: 0.9,
epsilon: 0.001,
}
}
override_base_feature_extractor_hyperparams: true
}
loss {
classification_loss {
weighted_sigmoid_focal {
alpha: 0.75,
gamma: 2.0
}
}
localization_loss {
weighted_smooth_l1 {
delta: 1.0
}
}
classification_weight: 1.0
localization_weight: 1.0
}
normalize_loss_by_num_matches: true
normalize_loc_loss_by_codesize: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
}
}
train_config: {
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
fine_tune_checkpoint_type: "detection"
load_all_detection_checkpoint_vars: true
batch_size: 128
sync_replicas: true
startup_delay_steps: 0
replicas_to_aggregate: 8
num_steps: 2000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}
optimizer {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: 0.2
total_steps: 2000
warmup_steps: 0
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
}
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/pet_faces_train.record-?????-of-00010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
num_examples: 1100
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/pet_faces_val.record-?????-of-00010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
shuffle: false
num_readers: 1
}
graph_rewriter {
quantization {
delay: 1800
activation_bits: 8
weight_bits: 8
}
}
\ No newline at end of file
-# SSD with Mobilenet v1 configuration for MSCOCO Dataset.
-# Users should configure the fine_tune_checkpoint field in the train config as
-# well as the label_map_path and input_path fields in the train_input_reader and
-# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
-# should be configured.
+# SSD with Mobilenet v1 feature extractor and focal loss.
+# Trained on COCO14, initialized from Imagenet classification checkpoint
+# Achieves 19.3 mAP on COCO14 minival dataset. Doubling the number of training
+# steps gets to 20.6 mAP.
+# TPU-compatible
+# This config is TPU compatible
 model {
 ssd {
+inplace_batchnorm_update: true
+freeze_batchnorm: false
 num_classes: 90
 box_coder {
 faster_rcnn_box_coder {
@@ -23,12 +26,14 @@ model {
 ignore_thresholds: false
 negatives_lower_than_unmatched: true
 force_match_for_each_row: true
+use_matmul_gather: true
 }
 }
 similarity_calculator {
 iou_similarity {
 }
 }
+encode_background_as_zeros: true
 anchor_generator {
 ssd_anchor_generator {
 num_layers: 6
@@ -57,6 +62,7 @@ model {
 kernel_size: 1
 box_code_size: 4
 apply_sigmoid_to_scores: false
+class_prediction_bias_init: -4.6
 conv_hyperparams {
 activation: RELU_6,
 regularizer {
@@ -65,8 +71,8 @@ model {
 }
 }
 initializer {
-truncated_normal_initializer {
-stddev: 0.03
+random_normal_initializer {
+stddev: 0.01
 mean: 0.0
 }
 }
@@ -74,7 +80,7 @@ model {
 train: true,
 scale: true,
 center: true,
-decay: 0.9997,
+decay: 0.97,
 epsilon: 0.001,
 }
 }
@@ -101,26 +107,29 @@ model {
 train: true,
 scale: true,
 center: true,
-decay: 0.9997,
+decay: 0.97,
 epsilon: 0.001,
 }
 }
+override_base_feature_extractor_hyperparams: true
 }
 loss {
 classification_loss {
 weighted_sigmoid_focal {
-alpha: 0.75
+alpha: 0.75,
 gamma: 2.0
 }
 }
 localization_loss {
 weighted_smooth_l1 {
+delta: 1.0
 }
 }
 classification_weight: 1.0
 localization_weight: 1.0
 }
 normalize_loss_by_num_matches: true
+normalize_loc_loss_by_codesize: true
 post_processing {
 batch_non_max_suppression {
 score_threshold: 1e-8
@@ -134,28 +143,12 @@ model {
 }
 train_config: {
-batch_size: 24
-optimizer {
-rms_prop_optimizer: {
-learning_rate: {
-exponential_decay_learning_rate {
-initial_learning_rate: 0.004
-decay_steps: 800720
-decay_factor: 0.95
-}
-}
-momentum_optimizer_value: 0.9
-decay: 0.9
-epsilon: 1.0
-}
-}
 fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
-from_detection_checkpoint: true
-# Note: The below line limits the training process to 200K steps, which we
-# empirically found to be sufficient enough to train the pets dataset. This
-# effectively bypasses the learning rate schedule (the learning rate will
-# never decay). Remove the below line to train indefinitely.
-num_steps: 200000
+batch_size: 2048
+sync_replicas: true
+startup_delay_steps: 0
+replicas_to_aggregate: 8
+num_steps: 10000
 data_augmentation_options {
 random_horizontal_flip {
 }
@@ -164,29 +157,42 @@ train_config: {
 ssd_random_crop {
 }
 }
-max_number_of_boxes: 50
+optimizer {
+momentum_optimizer: {
+learning_rate: {
+cosine_decay_learning_rate {
+learning_rate_base: 0.9
+total_steps: 10000
+warmup_learning_rate: 0.3
+warmup_steps: 300
+}
+}
+momentum_optimizer_value: 0.9
+}
+use_moving_average: false
+}
+max_number_of_boxes: 100
 unpad_groundtruth_tensors: false
 }
 train_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record"
+input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-00000-of-00100"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
 }
 eval_config: {
+metrics_set: "coco_detection_metrics"
+use_moving_averages: false
 num_examples: 8000
-# Note: The below line limits the evaluation process to 10 evaluations.
-# Remove the below line to evaluate indefinitely.
-max_evals: 10
 }
 eval_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record"
+input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-00000-of-00010"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
 shuffle: false
 num_readers: 1
 }
\ No newline at end of file
@@ -172,7 +172,7 @@ train_config: {
 train_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record"
+input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
 }
@@ -186,7 +186,7 @@ eval_config: {
 eval_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record"
+input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-?????-of-00010"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
 shuffle: false
@@ -171,7 +171,7 @@ train_config: {
 train_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/pet_faces_train.record-?????-of-?????"
+input_path: "PATH_TO_BE_CONFIGURED/pet_faces_train.record-?????-of-00010"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
 }
@@ -183,7 +183,7 @@ eval_config: {
 eval_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/pet_faces_val.record-?????-of-?????"
+input_path: "PATH_TO_BE_CONFIGURED/pet_faces_val.record-?????-of-00010"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
 shuffle: false
# SSD with Mobilenet v1 FPN feature extractor, shared box predictor and focal
# loss (a.k.a Retinanet).
# See Lin et al, https://arxiv.org/abs/1708.02002
# Trained on COCO, initialized from Imagenet classification checkpoint
# Achieves 29.6 mAP on COCO14 minival dataset. Doubling the number of training
# steps to 25k gets 31.5 mAP
# This config is TPU compatible
model {
ssd {
inplace_batchnorm_update: true
freeze_batchnorm: false
num_classes: 90
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
use_matmul_gather: true
}
}
similarity_calculator {
iou_similarity {
}
}
encode_background_as_zeros: true
anchor_generator {
multiscale_anchor_generator {
min_level: 3
max_level: 7
anchor_scale: 4.0
aspect_ratios: [1.0, 2.0, 0.5]
scales_per_octave: 2
}
}
image_resizer {
fixed_shape_resizer {
height: 640
width: 640
}
}
box_predictor {
weight_shared_convolutional_box_predictor {
depth: 256
class_prediction_bias_init: -4.6
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
scale: true,
decay: 0.997,
epsilon: 0.001,
}
}
num_layers_before_predictor: 4
kernel_size: 3
}
}
feature_extractor {
type: 'ssd_mobilenet_v1_fpn'
min_depth: 16
depth_multiplier: 1.0
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
scale: true,
decay: 0.997,
epsilon: 0.001,
}
}
override_base_feature_extractor_hyperparams: true
}
loss {
classification_loss {
weighted_sigmoid_focal {
alpha: 0.25
gamma: 2.0
}
}
localization_loss {
weighted_smooth_l1 {
}
}
classification_weight: 1.0
localization_weight: 1.0
}
normalize_loss_by_num_matches: true
normalize_loc_loss_by_codesize: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
}
}
train_config: {
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
batch_size: 128
sync_replicas: true
startup_delay_steps: 0
replicas_to_aggregate: 8
num_steps: 12500
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
random_crop_image {
min_object_covered: 0.0
min_aspect_ratio: 0.75
max_aspect_ratio: 3.0
min_area: 0.75
max_area: 1.0
overlap_thresh: 0.0
}
}
optimizer {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: .08
total_steps: 12500
warmup_learning_rate: .026666
warmup_steps: 1000
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
}
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-00000-of-00100"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
num_examples: 8000
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-00000-of-00010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
shuffle: false
num_readers: 1
}
\ No newline at end of file
@@ -173,19 +173,19 @@ train_config: {
 train_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/pet_faces_train.record-?????-of-?????"
+input_path: "PATH_TO_BE_CONFIGURED/pet_faces_train.record-?????-of-00010"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
 }
 eval_config: {
 metrics_set: "coco_detection_metrics"
-num_examples: 1101
+num_examples: 1100
 }
 eval_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/pet_faces_val.record-?????-of-?????"
+input_path: "PATH_TO_BE_CONFIGURED/pet_faces_val.record-?????-of-00010"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
 shuffle: false
# SSD with Mobilenet v1 PPN feature extractor.
# Trained on COCO, initialized from Imagenet classification checkpoint
# Achieves 19.7 mAP on COCO14 minival dataset.
# This config is TPU compatible.
model {
ssd {
inplace_batchnorm_update: true
freeze_batchnorm: false
num_classes: 90
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
use_matmul_gather: true
}
}
similarity_calculator {
iou_similarity {
}
}
encode_background_as_zeros: true
anchor_generator {
ssd_anchor_generator {
num_layers: 6
min_scale: 0.15
max_scale: 0.95
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
aspect_ratios: 3.0
aspect_ratios: 0.3333
reduce_boxes_in_lowest_layer: false
}
}
image_resizer {
fixed_shape_resizer {
height: 300
width: 300
}
}
box_predictor {
weight_shared_convolutional_box_predictor {
depth: 512
class_prediction_bias_init: -4.6
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
scale: true
center: true
train: true
decay: 0.97
epsilon: 0.001
}
}
num_layers_before_predictor: 1
kernel_size: 1
share_prediction_tower: true
}
}
feature_extractor {
type: 'ssd_mobilenet_v1_ppn'
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
scale: true
center: true
decay: 0.97
epsilon: 0.001
}
}
override_base_feature_extractor_hyperparams: true
}
loss {
classification_loss {
weighted_sigmoid_focal {
alpha: 0.75
gamma: 2.0
}
}
localization_loss {
weighted_smooth_l1 {
}
}
classification_weight: 1.0
localization_weight: 1.5
}
normalize_loss_by_num_matches: true
normalize_loc_loss_by_codesize: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
}
}
train_config: {
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
batch_size: 512
sync_replicas: true
startup_delay_steps: 0
replicas_to_aggregate: 8
num_steps: 50000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}
optimizer {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: 0.7
total_steps: 50000
warmup_learning_rate: 0.1333
warmup_steps: 2000
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
}
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-00000-of-00100"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
num_examples: 8000
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-00000-of-00010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
shuffle: false
num_readers: 1
}
# SSD with Mobilenet v1 with quantized training.
# Trained on COCO, initialized from Imagenet classification checkpoint
# Achieves 18.2 mAP on coco14 minival dataset.
# This config is TPU compatible
model {
ssd {
inplace_batchnorm_update: true
freeze_batchnorm: false
num_classes: 90
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
use_matmul_gather: true
}
}
similarity_calculator {
iou_similarity {
}
}
encode_background_as_zeros: true
anchor_generator {
ssd_anchor_generator {
num_layers: 6
min_scale: 0.2
max_scale: 0.95
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
aspect_ratios: 3.0
aspect_ratios: 0.3333
}
}
image_resizer {
fixed_shape_resizer {
height: 300
width: 300
}
}
box_predictor {
convolutional_box_predictor {
min_depth: 0
max_depth: 0
num_layers_before_predictor: 0
use_dropout: false
dropout_keep_probability: 0.8
kernel_size: 1
box_code_size: 4
apply_sigmoid_to_scores: false
class_prediction_bias_init: -4.6
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
scale: true,
center: true,
decay: 0.97,
epsilon: 0.001,
}
}
}
}
feature_extractor {
type: 'ssd_mobilenet_v1'
min_depth: 16
depth_multiplier: 1.0
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
scale: true,
center: true,
decay: 0.97,
epsilon: 0.001,
}
}
override_base_feature_extractor_hyperparams: true
}
loss {
classification_loss {
weighted_sigmoid_focal {
alpha: 0.75,
gamma: 2.0
}
}
localization_loss {
weighted_smooth_l1 {
}
}
classification_weight: 1.0
localization_weight: 1.0
}
normalize_loss_by_num_matches: true
normalize_loc_loss_by_codesize: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
}
}
train_config: {
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
batch_size: 128
sync_replicas: true
startup_delay_steps: 0
replicas_to_aggregate: 8
num_steps: 50000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}
optimizer {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: .2
total_steps: 50000
warmup_learning_rate: 0.06
warmup_steps: 2000
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
}
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-00000-of-00100"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
num_examples: 8000
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-00000-of-00010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
shuffle: false
num_readers: 1
}
graph_rewriter {
quantization {
delay: 48000
activation_bits: 8
weight_bits: 8
}
}
@@ -172,7 +172,7 @@ train_config: {
 train_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record"
+input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
 }
@@ -186,7 +186,7 @@ eval_config: {
 eval_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record"
+input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-?????-of-00010"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
 shuffle: false
# SSD with Resnet 50 v1 FPN feature extractor, shared box predictor and focal
# loss (a.k.a Retinanet).
# See Lin et al, https://arxiv.org/abs/1708.02002
# Trained on COCO, initialized from Imagenet classification checkpoint
# Achieves 35.2 mAP on COCO14 minival dataset. Doubling the number of training
# steps to 50k gets 36.9 mAP
# This config is TPU compatible
model {
ssd {
inplace_batchnorm_update: true
freeze_batchnorm: false
num_classes: 90
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
use_matmul_gather: true
}
}
similarity_calculator {
iou_similarity {
}
}
encode_background_as_zeros: true
anchor_generator {
multiscale_anchor_generator {
min_level: 3
max_level: 7
anchor_scale: 4.0
aspect_ratios: [1.0, 2.0, 0.5]
scales_per_octave: 2
}
}
image_resizer {
fixed_shape_resizer {
height: 640
width: 640
}
}
box_predictor {
weight_shared_convolutional_box_predictor {
depth: 256
class_prediction_bias_init: -4.6
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.0004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
scale: true,
decay: 0.997,
epsilon: 0.001,
}
}
num_layers_before_predictor: 4
kernel_size: 3
}
}
feature_extractor {
type: 'ssd_resnet50_v1_fpn'
min_depth: 16
depth_multiplier: 1.0
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.0004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
scale: true,
decay: 0.997,
epsilon: 0.001,
}
}
override_base_feature_extractor_hyperparams: true
}
loss {
classification_loss {
weighted_sigmoid_focal {
alpha: 0.25
gamma: 2.0
}
}
localization_loss {
weighted_smooth_l1 {
}
}
classification_weight: 1.0
localization_weight: 1.0
}
normalize_loss_by_num_matches: true
normalize_loc_loss_by_codesize: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
}
}
train_config: {
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
batch_size: 64
sync_replicas: true
startup_delay_steps: 0
replicas_to_aggregate: 8
num_steps: 25000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
random_crop_image {
min_object_covered: 0.0
min_aspect_ratio: 0.75
max_aspect_ratio: 3.0
min_area: 0.75
max_area: 1.0
overlap_thresh: 0.0
}
}
optimizer {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: .04
total_steps: 25000
warmup_learning_rate: .013333
warmup_steps: 2000
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
}
train_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-00000-of-00100"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
num_examples: 8000
}
eval_input_reader: {
tf_record_input_reader {
input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-00000-of-00010"
}
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
shuffle: false
num_readers: 1
}
\ No newline at end of file
@@ -174,7 +174,7 @@ train_config: {
 train_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record"
+input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
 }
@@ -188,7 +188,7 @@ eval_config: {
 eval_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record"
+input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-?????-of-00010"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
 shuffle: false
@@ -174,7 +174,7 @@ train_config: {
 train_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record"
+input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
 }
@@ -188,7 +188,7 @@ eval_config: {
 eval_input_reader: {
 tf_record_input_reader {
-input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record"
+input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-?????-of-00010"
 }
 label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
 shuffle: false
@@ -23,7 +23,9 @@ def exponential_decay_with_burnin(global_step,
                                   learning_rate_decay_steps,
                                   learning_rate_decay_factor,
                                   burnin_learning_rate=0.0,
-                                  burnin_steps=0):
+                                  burnin_steps=0,
+                                  min_learning_rate=0.0,
+                                  staircase=True):
   """Exponential decay schedule with burn-in period.
 
   In this schedule, learning rate is fixed at burnin_learning_rate
@@ -41,6 +43,8 @@ def exponential_decay_with_burnin(global_step,
       0.0 (which is the default), then the burn-in learning rate is simply
       set to learning_rate_base.
     burnin_steps: number of steps to use burnin learning rate.
+    min_learning_rate: the minimum learning rate.
+    staircase: whether use staircase decay.
 
   Returns:
     a (scalar) float tensor representing learning rate
@@ -49,14 +53,14 @@ def exponential_decay_with_burnin(global_step,
     burnin_learning_rate = learning_rate_base
   post_burnin_learning_rate = tf.train.exponential_decay(
       learning_rate_base,
-      global_step,
+      global_step - burnin_steps,
       learning_rate_decay_steps,
       learning_rate_decay_factor,
-      staircase=True)
-  return tf.where(
+      staircase=staircase)
+  return tf.maximum(tf.where(
       tf.less(tf.cast(global_step, tf.int32), tf.constant(burnin_steps)),
       tf.constant(burnin_learning_rate),
-      post_burnin_learning_rate, name='learning_rate')
+      post_burnin_learning_rate), min_learning_rate, name='learning_rate')
 
 
 def cosine_decay_with_warmup(global_step,
@@ -30,17 +30,19 @@ class LearningSchedulesTest(test_case.TestCase):
       learning_rate_decay_factor = .1
       burnin_learning_rate = .5
       burnin_steps = 2
+      min_learning_rate = .05
       learning_rate = learning_schedules.exponential_decay_with_burnin(
           global_step, learning_rate_base, learning_rate_decay_steps,
-          learning_rate_decay_factor, burnin_learning_rate, burnin_steps)
+          learning_rate_decay_factor, burnin_learning_rate, burnin_steps,
+          min_learning_rate)
       assert learning_rate.op.name.endswith('learning_rate')
       return (learning_rate,)
 
     output_rates = [
-        self.execute(graph_fn, [np.array(i).astype(np.int64)]) for i in range(8)
+        self.execute(graph_fn, [np.array(i).astype(np.int64)]) for i in range(9)
     ]
 
-    exp_rates = [.5, .5, 1, .1, .1, .1, .01, .01]
+    exp_rates = [.5, .5, 1, 1, 1, .1, .1, .1, .05]
     self.assertAllClose(output_rates, exp_rates, rtol=1e-4)
 
   def testCosineDecayWithWarmup(self):
@@ -137,9 +137,14 @@ class PerImageVRDEvaluation(object):
       result_tp_fp_labels.append(tp_fp_labels)
       result_mapping.append(selector_mapping[sorted_indices])
 
-    result_scores = np.concatenate(result_scores)
-    result_tp_fp_labels = np.concatenate(result_tp_fp_labels)
-    result_mapping = np.concatenate(result_mapping)
+    if result_scores:
+      result_scores = np.concatenate(result_scores)
+      result_tp_fp_labels = np.concatenate(result_tp_fp_labels)
+      result_mapping = np.concatenate(result_mapping)
+    else:
+      result_scores = np.array([], dtype=float)
+      result_tp_fp_labels = np.array([], dtype=bool)
+      result_mapping = np.array([], dtype=int)
 
     sorted_indices = np.argsort(result_scores)
     sorted_indices = sorted_indices[::-1]
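
The guard above is needed because np.concatenate raises a ValueError when given an empty list, which is what happens for images whose detections produce no per-class results. A small standalone illustration of the failure mode and the fallback (variable names mirror the diff; this is not the evaluator itself):

import numpy as np

result_scores = []  # no per-class results were accumulated for this image

# np.concatenate([]) raises "ValueError: need at least one array to concatenate",
# so the evaluator now substitutes typed empty arrays instead.
if result_scores:
  scores = np.concatenate(result_scores)
else:
  scores = np.array([], dtype=float)

print(np.argsort(scores)[::-1])  # argsort on an empty array is fine: []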
@@ -178,6 +178,14 @@ class VRDDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
         corresponding bounding boxes and possibly additional classes (see
         datatype label_data_type above).
     """
+    if image_id not in self._image_ids:
+      logging.warn('No groundtruth for the image with id %s.', image_id)
+      # Since for the correct work of evaluator it is assumed that groundtruth
+      # is inserted first we make sure to break the code if is it not the case.
+      self._image_ids.update([image_id])
+      self._negative_labels[image_id] = np.array([])
+      self._evaluatable_labels[image_id] = np.array([])
+
     num_detections = detections_dict[
         standard_fields.DetectionResultFields.detection_boxes].shape[0]
     detection_class_tuples = detections_dict[
@@ -186,7 +194,6 @@ class VRDDetectionEvaluator(object_detection_evaluation.DetectionEvaluator):
         standard_fields.DetectionResultFields.detection_boxes]
     negative_selector = np.zeros(num_detections, dtype=bool)
     selector = np.ones(num_detections, dtype=bool)
-
     # Only check boxable labels
     for field in detection_box_tuples.dtype.fields:
       # Verify if one of the labels is negative (this is sure FP)
@@ -483,8 +490,9 @@ class _VRDDetectionEvaluation(object):
       groundtruth_box_tuples = self._groundtruth_box_tuples[image_key]
       groundtruth_class_tuples = self._groundtruth_class_tuples[image_key]
     else:
-      groundtruth_box_tuples = np.empty(shape=[0, 4], dtype=float)
-      groundtruth_class_tuples = np.array([], dtype=int)
+      groundtruth_box_tuples = np.empty(
+          shape=[0, 4], dtype=detected_box_tuples.dtype)
+      groundtruth_class_tuples = np.array([], dtype=detected_class_tuples.dtype)
 
     scores, tp_fp_labels, mapping = (
         self._per_image_eval.compute_detection_tp_fp(
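
The dtype change above matters because the VRD evaluator stores box and class tuples as NumPy structured arrays; an empty placeholder with a plain float or int dtype cannot be processed field-by-field alongside the detections. A hedged illustration (the field names below are only an example of such a structured dtype, not necessarily the exact one the evaluator uses):

import numpy as np

# Example structured dtype holding a subject box and an object box per tuple.
box_tuple_dtype = np.dtype([('subject', 'f4', (4,)), ('object', 'f4', (4,))])
detected_box_tuples = np.zeros(2, dtype=box_tuple_dtype)

# Building the empty groundtruth placeholder from the detections' own dtype
# keeps per-field operations (e.g. IoU computed per named box field) well defined.
empty_groundtruth = np.empty(shape=[0, 4], dtype=detected_box_tuples.dtype)
print(empty_groundtruth.dtype == detected_box_tuples.dtype)  # True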