Commit dff0f0c1 authored by Alexander Gorban

Merge branch 'master' of github.com:tensorflow/models

parents da341f70 36203f09
...@@ -11,5 +11,5 @@ jupyter notebook
```
The notebook should open in your favorite web browser. Click the
[`object_detection_tutorial.ipynb`](../object_detection_tutorial.ipynb) link to
open the demo.
...@@ -88,7 +88,7 @@ training checkpoints and events will be written to and
Google Cloud Storage.
Users can monitor the progress of their training job on the [ML Engine
Dashboard](https://console.cloud.google.com/mlengine/jobs).
## Running an Evaluation Job on Cloud
......
...@@ -51,29 +51,35 @@ dataset for Oxford-IIIT Pets lives
[here](http://www.robots.ox.ac.uk/~vgg/data/pets/). You will need to download
both the image dataset [`images.tar.gz`](http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz)
and the groundtruth data [`annotations.tar.gz`](http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz)
to the `tensorflow/models` directory and extract them. This may take some time.
``` bash
# From tensorflow/models/
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
tar -xvf images.tar.gz
tar -xvf annotations.tar.gz
```
After downloading the tarballs, your `tensorflow/models` directory should appear
as follows:
```lang-none
- images.tar.gz
- annotations.tar.gz
+ images/
+ annotations/
+ object_detection/
... other files and directories
```
The Tensorflow Object Detection API expects data to be in the TFRecord format,
so we'll now run the `create_pet_tf_record` script to convert from the raw
Oxford-IIIT Pet dataset into TFRecords. Run the following commands from the
`tensorflow/models` directory:
``` bash
# From tensorflow/models/
python object_detection/create_pet_tf_record.py \
    --label_map_path=object_detection/data/pet_label_map.pbtxt \
    --data_dir=`pwd` \
...@@ -83,8 +89,8 @@ python object_detection/create_pet_tf_record.py \
Note: It is normal to see some warnings when running this script. You may ignore
them.
Two TFRecord files named `pet_train.record` and `pet_val.record` should be
generated in the `tensorflow/models` directory.
Now that the data has been generated, we'll need to upload it to Google Cloud
Storage so the data can be accessed by ML Engine. Run the following command to
...@@ -263,7 +269,10 @@ Note: It takes roughly 10 minutes for a job to get started on ML Engine, and
roughly an hour for the system to evaluate the validation dataset. It may take
some time to populate the dashboards. If you do not see any entries after half
an hour, check the logs from the [ML Engine
Dashboard](https://console.cloud.google.com/mlengine/jobs). Note that by default
the training jobs are configured to go for much longer than is necessary for
convergence. To save money, we recommend killing your jobs once you've seen
that they've converged.
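One way to stop a job that has already converged is through the Cloud SDK; a minimal sketch, assuming the `gcloud` tool is installed and `${JOB_NAME}` is the name that was used when the job was submitted:
``` bash
# ${JOB_NAME} is a placeholder for the name used when submitting the job.
gcloud ml-engine jobs cancel ${JOB_NAME}
```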
## Exporting the Tensorflow Graph
...@@ -279,7 +288,7 @@ three files:
* `model.ckpt-${CHECKPOINT_NUMBER}.meta`
After you've identified a candidate checkpoint to export, run the following
command from `tensorflow/models`:
``` bash
# From tensorflow/models
......
# Preparing Inputs
To use your own dataset with the Tensorflow Object Detection API, you must convert it
into the [TFRecord file format](https://www.tensorflow.org/api_guides/python/python_io#tfrecords_format_details).
This document outlines how to write a script to generate the TFRecord file.
## Label Maps
Each dataset is required to have a label map associated with it. This label map
defines a mapping from string class names to integer class IDs. The label map
should be a `StringIntLabelMap` text protobuf. Sample label maps can be found in
object_detection/data. Label maps should always start from ID 1.
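For reference, the API also ships a small helper for reading label maps into Python; a minimal sketch, assuming it is run from the `tensorflow/models` directory so that `object_detection` is importable:
```python
from object_detection.utils import label_map_util

# Maps class name strings to the integer IDs defined in the label map
# (IDs start from 1).
label_map_dict = label_map_util.get_label_map_dict(
    'object_detection/data/pet_label_map.pbtxt')
print(label_map_dict)
```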
## Dataset Requirements
For every example in your dataset, you should have the following information:
1. An RGB image for the dataset encoded as jpeg or png.
2. A list of bounding boxes for the image. Each bounding box should contain:
    1. Bounding box coordinates (with origin in the top left corner) defined by 4
       floating point numbers [ymin, xmin, ymax, xmax]. Note that we store the
       _normalized_ coordinates (x / width, y / height) in the TFRecord dataset.
    2. The class of the object in the bounding box.
## Example Image
Consider the following image:
![Example Image](img/example_cat.jpg "Example Image")
with the following label map:
```
item {
  id: 1
  name: 'Cat'
}
item {
  id: 2
  name: 'Dog'
}
```
We can generate a tf.Example proto for this image using the following code:
```python
def create_cat_tf_example(encoded_cat_image_data):
  """Creates a tf.Example proto from sample cat image.

  Args:
    encoded_cat_image_data: The jpg encoded data of the cat image.

  Returns:
    example: The created tf.Example.
  """
  # Dimensions and box coordinates are given in pixels; boxes are stored as
  # normalized coordinates (pixel value / image dimension).
  height = 1032
  width = 1200
  filename = 'example_cat.jpg'
  image_format = b'jpg'

  xmins = [322.0 / 1200.0]
  xmaxs = [1062.0 / 1200.0]
  ymins = [174.0 / 1032.0]
  ymaxs = [761.0 / 1032.0]
  classes_text = ['Cat']
  classes = [1]

  tf_example = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_cat_image_data),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
  }))
  return tf_example
```
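To call this function, the encoded image bytes first have to be read from disk; a minimal sketch, assuming the image is stored at the path referenced above:
```python
import tensorflow as tf

with tf.gfile.GFile('img/example_cat.jpg', 'rb') as fid:
  encoded_cat_image_data = fid.read()
cat_tf_example = create_cat_tf_example(encoded_cat_image_data)
```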
## Conversion Script Outline
A typical conversion script will look like the following:
```python
import tensorflow as tf

from object_detection.utils import dataset_util


flags = tf.app.flags
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


def create_tf_example(example):
  # TODO(user): Populate the following variables from your example.
  height = None # Image height
  width = None # Image width
  filename = None # Filename of the image. Empty if image is not from file
  encoded_image_data = None # Encoded image bytes
  image_format = None # b'jpeg' or b'png'

  xmins = [] # List of normalized left x coordinates in bounding box (1 per box)
  xmaxs = [] # List of normalized right x coordinates in bounding box
             # (1 per box)
  ymins = [] # List of normalized top y coordinates in bounding box (1 per box)
  ymaxs = [] # List of normalized bottom y coordinates in bounding box
             # (1 per box)
  classes_text = [] # List of string class name of bounding box (1 per box)
  classes = [] # List of integer class id of bounding box (1 per box)

  tf_example = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_image_data),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
  }))
  return tf_example


def main(_):
  writer = tf.python_io.TFRecordWriter(FLAGS.output_path)

  # TODO(user): Write code to read in your dataset to examples variable

  for example in examples:
    tf_example = create_tf_example(example)
    writer.write(tf_example.SerializeToString())

  writer.close()


if __name__ == '__main__':
  tf.app.run()
```
Note: You may notice additional fields in some other datasets. They are
currently unused by the API and are optional.
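As a usage sketch (the script name and paths below are hypothetical), a conversion script following this outline would be invoked with the `output_path` flag defined above:
``` bash
python create_my_tf_record.py --output_path=/path/to/train.record
```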
...@@ -13,12 +13,11 @@ py_library(
    srcs = ["ssd_meta_arch.py"],
    deps = [
        "//tensorflow",
        "//tensorflow_models/object_detection/core:box_list",
        "//tensorflow_models/object_detection/core:box_predictor",
        "//tensorflow_models/object_detection/core:model",
        "//tensorflow_models/object_detection/core:target_assigner",
        "//tensorflow_models/object_detection/utils:shape_utils",
    ],
)
...@@ -56,7 +55,7 @@ py_library(
        "//tensorflow_models/object_detection/core:standard_fields",
        "//tensorflow_models/object_detection/core:target_assigner",
        "//tensorflow_models/object_detection/utils:ops",
        "//tensorflow_models/object_detection/utils:shape_utils",
    ],
)
......
...@@ -80,7 +80,7 @@ from object_detection.core import post_processing
from object_detection.core import standard_fields as fields
from object_detection.core import target_assigner
from object_detection.utils import ops
from object_detection.utils import shape_utils
slim = tf.contrib.slim
...@@ -159,21 +159,19 @@ class FasterRCNNFeatureExtractor(object):
  def restore_from_classification_checkpoint_fn(
      self,
      first_stage_feature_extractor_scope,
      second_stage_feature_extractor_scope):
    """Returns a map of variables to load from a foreign checkpoint.

    Args:
      first_stage_feature_extractor_scope: A scope name for the first stage
        feature extractor.
      second_stage_feature_extractor_scope: A scope name for the second stage
        feature extractor.

    Returns:
      A dict mapping variable names (to load from a checkpoint) to variables in
      the model graph.
    """
    variables_to_restore = {}
    for variable in tf.global_variables():
...@@ -182,13 +180,7 @@ class FasterRCNNFeatureExtractor(object):
        if variable.op.name.startswith(scope_name):
          var_name = variable.op.name.replace(scope_name + '/', '')
          variables_to_restore[var_name] = variable
    return variables_to_restore


class FasterRCNNMetaArch(model.DetectionModel):
...@@ -774,10 +766,9 @@ class FasterRCNNMetaArch(model.DetectionModel):
      A float tensor with shape [A * B, ..., depth] (where the first and last
      dimension are statically defined.
    """
    combined_shape = shape_utils.combined_static_and_dynamic_shape(inputs)
    flattened_shape = tf.stack([combined_shape[0] * combined_shape[1]] +
                               combined_shape[2:])
    return tf.reshape(inputs, flattened_shape)
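For readers unfamiliar with the new helper, `shape_utils.combined_static_and_dynamic_shape` returns, per dimension, the static size where it is known and a dynamic `tf.shape` slice where it is not, which is what lets the code above handle partially-defined input shapes. A minimal sketch of such a helper (an illustration, not necessarily the exact implementation in `object_detection/utils/shape_utils.py`):
```python
import tensorflow as tf


def combined_static_and_dynamic_shape(tensor):
  """Returns a list mixing static dims (Python ints) and dynamic dims (tensors)."""
  static_shape = tensor.get_shape().as_list()
  dynamic_shape = tf.shape(tensor)
  combined_shape = []
  for index, dim in enumerate(static_shape):
    # Prefer the statically known size; fall back to the runtime shape
    # tensor for dimensions that are None at graph construction time.
    combined_shape.append(dim if dim is not None else dynamic_shape[index])
  return combined_shape
```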
  def postprocess(self, prediction_dict):
...@@ -875,52 +866,128 @@ class FasterRCNNMetaArch(model.DetectionModel):
        representing the number of proposals predicted for each image in
        the batch.
    """
    rpn_box_encodings_batch = tf.expand_dims(rpn_box_encodings_batch, axis=2)
    rpn_encodings_shape = shape_utils.combined_static_and_dynamic_shape(
        rpn_box_encodings_batch)
    tiled_anchor_boxes = tf.tile(
        tf.expand_dims(anchors, 0), [rpn_encodings_shape[0], 1, 1])
    proposal_boxes = self._batch_decode_boxes(rpn_box_encodings_batch,
                                              tiled_anchor_boxes)
    proposal_boxes = tf.squeeze(proposal_boxes, axis=2)
    rpn_objectness_softmax_without_background = tf.nn.softmax(
        rpn_objectness_predictions_with_background_batch)[:, :, 1]
    clip_window = tf.to_float(tf.stack([0, 0, image_shape[1], image_shape[2]]))
    (proposal_boxes, proposal_scores, _, _,
     num_proposals) = post_processing.batch_multiclass_non_max_suppression(
         tf.expand_dims(proposal_boxes, axis=2),
         tf.expand_dims(rpn_objectness_softmax_without_background,
                        axis=2),
         self._first_stage_nms_score_threshold,
         self._first_stage_nms_iou_threshold,
         self._first_stage_max_proposals,
         self._first_stage_max_proposals,
         clip_window=clip_window)
    if self._is_training:
      proposal_boxes = tf.stop_gradient(proposal_boxes)
      if not self._hard_example_miner:
        (groundtruth_boxlists, groundtruth_classes_with_background_list,
        ) = self._format_groundtruth_data(image_shape)
        (proposal_boxes, proposal_scores,
         num_proposals) = self._unpad_proposals_and_sample_box_classifier_batch(
             proposal_boxes, proposal_scores, num_proposals,
             groundtruth_boxlists, groundtruth_classes_with_background_list)
    # normalize proposal boxes
    proposal_boxes_reshaped = tf.reshape(proposal_boxes, [-1, 4])
    normalized_proposal_boxes_reshaped = box_list_ops.to_normalized_coordinates(
        box_list.BoxList(proposal_boxes_reshaped),
        image_shape[1], image_shape[2], check_range=False).get()
    proposal_boxes = tf.reshape(normalized_proposal_boxes_reshaped,
                                [-1, proposal_boxes.shape[1].value, 4])
    return proposal_boxes, proposal_scores, num_proposals

  def _unpad_proposals_and_sample_box_classifier_batch(
      self,
      proposal_boxes,
      proposal_scores,
      num_proposals,
      groundtruth_boxlists,
      groundtruth_classes_with_background_list):
    """Unpads proposals and samples a minibatch for second stage.

    Args:
      proposal_boxes: A float tensor with shape
        [batch_size, num_proposals, 4] representing the (potentially zero
        padded) proposal boxes for all images in the batch. These boxes are
        represented as normalized coordinates.
      proposal_scores: A float tensor with shape
        [batch_size, num_proposals] representing the (potentially zero
        padded) proposal objectness scores for all images in the batch.
      num_proposals: A Tensor of type `int32`. A 1-D tensor of shape [batch]
        representing the number of proposals predicted for each image in
        the batch.
      groundtruth_boxlists: A list of BoxLists containing (absolute) coordinates
        of the groundtruth boxes.
      groundtruth_classes_with_background_list: A list of 2-D one-hot
        (or k-hot) tensors of shape [num_boxes, num_classes+1] containing the
        class targets with the 0th index assumed to map to the background class.

    Returns:
      proposal_boxes: A float tensor with shape
        [batch_size, second_stage_batch_size, 4] representing the (potentially
        zero padded) proposal boxes for all images in the batch. These boxes
        are represented as normalized coordinates.
      proposal_scores: A float tensor with shape
        [batch_size, second_stage_batch_size] representing the (potentially zero
        padded) proposal objectness scores for all images in the batch.
      num_proposals: A Tensor of type `int32`. A 1-D tensor of shape [batch]
        representing the number of proposals predicted for each image in
        the batch.
    """
    single_image_proposal_box_sample = []
    single_image_proposal_score_sample = []
    single_image_num_proposals_sample = []
    for (single_image_proposal_boxes,
         single_image_proposal_scores,
         single_image_num_proposals,
         single_image_groundtruth_boxlist,
         single_image_groundtruth_classes_with_background) in zip(
             tf.unstack(proposal_boxes),
             tf.unstack(proposal_scores),
             tf.unstack(num_proposals),
             groundtruth_boxlists,
             groundtruth_classes_with_background_list):
      static_shape = single_image_proposal_boxes.get_shape()
      sliced_static_shape = tf.TensorShape([tf.Dimension(None),
                                            static_shape.dims[-1]])
      single_image_proposal_boxes = tf.slice(
          single_image_proposal_boxes,
          [0, 0],
          [single_image_num_proposals, -1])
      single_image_proposal_boxes.set_shape(sliced_static_shape)
      single_image_proposal_scores = tf.slice(single_image_proposal_scores,
                                              [0],
                                              [single_image_num_proposals])
      single_image_boxlist = box_list.BoxList(single_image_proposal_boxes)
      single_image_boxlist.add_field(fields.BoxListFields.scores,
                                     single_image_proposal_scores)
      sampled_boxlist = self._sample_box_classifier_minibatch(
          single_image_boxlist,
          single_image_groundtruth_boxlist,
          single_image_groundtruth_classes_with_background)
      sampled_padded_boxlist = box_list_ops.pad_or_clip_box_list(
          sampled_boxlist,
          num_boxes=self._second_stage_batch_size)
      single_image_num_proposals_sample.append(tf.minimum(
          sampled_boxlist.num_boxes(),
          self._second_stage_batch_size))
      bb = sampled_padded_boxlist.get()
      single_image_proposal_box_sample.append(bb)
      single_image_proposal_score_sample.append(
          sampled_padded_boxlist.get_field(fields.BoxListFields.scores))
    return (tf.stack(single_image_proposal_box_sample),
            tf.stack(single_image_proposal_score_sample),
            tf.stack(single_image_num_proposals_sample))
  def _format_groundtruth_data(self, image_shape):
    """Helper function for preparing groundtruth data for target assignment.
...@@ -1074,7 +1141,7 @@ class FasterRCNNMetaArch(model.DetectionModel):
        class_predictions_with_background,
        [-1, self.max_num_proposals, self.num_classes + 1]
    )
    refined_decoded_boxes_batch = self._batch_decode_boxes(
        refined_box_encodings_batch, proposal_boxes)
    class_predictions_with_background_batch = (
        self._second_stage_score_conversion_fn(
...@@ -1092,19 +1159,26 @@ class FasterRCNNMetaArch(model.DetectionModel):
      mask_predictions_batch = tf.reshape(
          mask_predictions, [-1, self.max_num_proposals,
                             self.num_classes, mask_height, mask_width])
    (nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks,
     num_detections) = self._second_stage_nms_fn(
         refined_decoded_boxes_batch,
         class_predictions_batch,
         clip_window=clip_window,
         change_coordinate_frame=True,
         num_valid_boxes=num_proposals,
         masks=mask_predictions_batch)
    detections = {'detection_boxes': nmsed_boxes,
                  'detection_scores': nmsed_scores,
                  'detection_classes': nmsed_classes,
                  'num_detections': tf.to_float(num_detections)}
    if nmsed_masks is not None:
      detections['detection_masks'] = nmsed_masks
    if mask_predictions is not None:
      detections['detection_masks'] = tf.to_float(
          tf.greater_equal(detections['detection_masks'], mask_threshold))
    return detections
  def _batch_decode_boxes(self, box_encodings, anchor_boxes):
    """Decode tensor of refined box encodings.
...@@ -1119,15 +1193,33 @@
    Args:
      float tensor representing (padded) refined bounding box predictions
      (for each image in batch, proposal and class).
    """
    """Decodes box encodings with respect to the anchor boxes.

    Args:
      box_encodings: a 4-D tensor with shape
        [batch_size, num_anchors, num_classes, self._box_coder.code_size]
        representing box encodings.
      anchor_boxes: [batch_size, num_anchors, 4] representing
        decoded bounding boxes.

    Returns:
      decoded_boxes: a [batch_size, num_anchors, num_classes, 4]
        float tensor representing bounding box predictions
        (for each image in batch, proposal and class).
    """
    combined_shape = shape_utils.combined_static_and_dynamic_shape(
        box_encodings)
    num_classes = combined_shape[2]
    tiled_anchor_boxes = tf.tile(
        tf.expand_dims(anchor_boxes, 2), [1, 1, num_classes, 1])
    tiled_anchors_boxlist = box_list.BoxList(
        tf.reshape(tiled_anchor_boxes, [-1, 4]))
    decoded_boxes = self._box_coder.decode(
        tf.reshape(box_encodings, [-1, self._box_coder.code_size]),
        tiled_anchors_boxlist)
    return tf.reshape(decoded_boxes.get(),
                      tf.stack([combined_shape[0], combined_shape[1],
                                num_classes, 4]))
  def loss(self, prediction_dict, scope=None):
    """Compute scalar loss tensors given prediction tensors.
...@@ -1413,25 +1505,22 @@ class FasterRCNNMetaArch(model.DetectionModel):
          cls_losses=tf.expand_dims(single_image_cls_loss, 0),
          decoded_boxlist_list=[proposal_boxlist])

  def restore_map(self, from_detection_checkpoint=True):
    """Returns a map of variables to load from a foreign checkpoint.

    See parent class for details.

    Args:
      from_detection_checkpoint: whether to restore from a full detection
        checkpoint (with compatible variable names) or to restore from a
        classification checkpoint for initialization prior to training.

    Returns:
      A dict mapping variable names (to load from a checkpoint) to variables in
      the model graph.
    """
    if not from_detection_checkpoint:
      return self._feature_extractor.restore_from_classification_checkpoint_fn(
          self.first_stage_feature_extractor_scope,
          self.second_stage_feature_extractor_scope)
...@@ -1439,13 +1528,8 @@ class FasterRCNNMetaArch(model.DetectionModel):
    variables_to_restore.append(slim.get_or_create_global_step())
    # Only load feature extractor variables to be consistent with loading from
    # a classification checkpoint.
    feature_extractor_variables = tf.contrib.framework.filter_variables(
        variables_to_restore,
        include_patterns=[self.first_stage_feature_extractor_scope,
                          self.second_stage_feature_extractor_scope])
    return {var.op.name: var for var in feature_extractor_variables}
...@@ -226,61 +226,47 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
    return self._get_model(self._get_second_stage_box_predictor(
        num_classes=num_classes, is_training=is_training), **common_kwargs)

  def test_predict_correct_shapes_in_inference_mode_both_stages(
      self):
    batch_size = 2
    image_size = 10
    input_shapes = [(batch_size, image_size, image_size, 3),
                    (None, image_size, image_size, 3),
                    (batch_size, None, None, 3),
                    (None, None, None, 3)]
    expected_num_anchors = image_size * image_size * 3 * 3
    expected_shapes = {
        'rpn_box_predictor_features':
            (2, image_size, image_size, 512),
        'rpn_features_to_crop': (2, image_size, image_size, 3),
        'image_shape': (4,),
        'rpn_box_encodings': (2, expected_num_anchors, 4),
        'rpn_objectness_predictions_with_background':
            (2, expected_num_anchors, 2),
        'anchors': (expected_num_anchors, 4),
        'refined_box_encodings': (2 * 8, 2, 4),
        'class_predictions_with_background': (2 * 8, 2 + 1),
        'num_proposals': (2,),
        'proposal_boxes': (2, 8, 4),
    }
    for input_shape in input_shapes:
      test_graph = tf.Graph()
      with test_graph.as_default():
        model = self._build_model(
            is_training=False, first_stage_only=False,
            second_stage_batch_size=2)
        preprocessed_inputs = tf.placeholder(tf.float32, shape=input_shape)
        result_tensor_dict = model.predict(preprocessed_inputs)
        init_op = tf.global_variables_initializer()
      with self.test_session(graph=test_graph) as sess:
        sess.run(init_op)
        tensor_dict_out = sess.run(result_tensor_dict, feed_dict={
            preprocessed_inputs:
            np.zeros((batch_size, image_size, image_size, 3))})
      self.assertEqual(set(tensor_dict_out.keys()),
                       set(expected_shapes.keys()))
      for key in expected_shapes:
        self.assertAllEqual(tensor_dict_out[key].shape, expected_shapes[key])
  def test_predict_gives_valid_anchors_in_training_mode_first_stage_only(self):
    test_graph = tf.Graph()
...@@ -535,35 +521,67 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
                        expected_num_proposals)

  def test_postprocess_second_stage_only_inference_mode(self):
    num_proposals_shapes = [(2), (None)]
    refined_box_encodings_shapes = [(16, 2, 4), (None, 2, 4)]
    class_predictions_with_background_shapes = [(16, 3), (None, 3)]
    proposal_boxes_shapes = [(2, 8, 4), (None, 8, 4)]
    batch_size = 2
    image_shape = np.array((2, 36, 48, 3), dtype=np.int32)
    for (num_proposals_shape, refined_box_encoding_shape,
         class_predictions_with_background_shape,
         proposal_boxes_shape) in zip(num_proposals_shapes,
                                      refined_box_encodings_shapes,
                                      class_predictions_with_background_shapes,
                                      proposal_boxes_shapes):
      tf_graph = tf.Graph()
      with tf_graph.as_default():
        model = self._build_model(
            is_training=False, first_stage_only=False,
            second_stage_batch_size=6)
        total_num_padded_proposals = batch_size * model.max_num_proposals
        proposal_boxes = np.array(
            [[[1, 1, 2, 3],
              [0, 0, 1, 1],
              [.5, .5, .6, .6],
              4*[0], 4*[0], 4*[0], 4*[0], 4*[0]],
             [[2, 3, 6, 8],
              [1, 2, 5, 3],
              4*[0], 4*[0], 4*[0], 4*[0], 4*[0], 4*[0]]])
        num_proposals = np.array([3, 2], dtype=np.int32)
        refined_box_encodings = np.zeros(
            [total_num_padded_proposals, model.num_classes, 4])
        class_predictions_with_background = np.ones(
            [total_num_padded_proposals, model.num_classes+1])

        num_proposals_placeholder = tf.placeholder(tf.int32,
                                                   shape=num_proposals_shape)
        refined_box_encodings_placeholder = tf.placeholder(
            tf.float32, shape=refined_box_encoding_shape)
        class_predictions_with_background_placeholder = tf.placeholder(
            tf.float32, shape=class_predictions_with_background_shape)
        proposal_boxes_placeholder = tf.placeholder(
            tf.float32, shape=proposal_boxes_shape)
        image_shape_placeholder = tf.placeholder(tf.int32, shape=(4))
        detections = model.postprocess({
            'refined_box_encodings': refined_box_encodings_placeholder,
            'class_predictions_with_background':
                class_predictions_with_background_placeholder,
            'num_proposals': num_proposals_placeholder,
            'proposal_boxes': proposal_boxes_placeholder,
            'image_shape': image_shape_placeholder,
        })
      with self.test_session(graph=tf_graph) as sess:
        detections_out = sess.run(
            detections,
            feed_dict={
                refined_box_encodings_placeholder: refined_box_encodings,
                class_predictions_with_background_placeholder:
                    class_predictions_with_background,
                num_proposals_placeholder: num_proposals,
                proposal_boxes_placeholder: proposal_boxes,
                image_shape_placeholder: image_shape
            })
      self.assertAllEqual(detections_out['detection_boxes'].shape, [2, 5, 4])
      self.assertAllClose(detections_out['detection_scores'],
                          [[1, 1, 1, 1, 1], [1, 1, 1, 1, 0]])
...@@ -571,6 +589,17 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
                          [[0, 0, 0, 1, 1], [0, 0, 1, 1, 0]])
      self.assertAllClose(detections_out['num_detections'], [5, 4])
  def test_preprocess_preserves_input_shapes(self):
    image_shapes = [(3, None, None, 3),
                    (None, 10, 10, 3),
                    (None, None, None, 3)]
    for image_shape in image_shapes:
      model = self._build_model(
          is_training=False, first_stage_only=False, second_stage_batch_size=6)
      image_placeholder = tf.placeholder(tf.float32, shape=image_shape)
      preprocessed_inputs = model.preprocess(image_placeholder)
      self.assertAllEqual(preprocessed_inputs.shape.as_list(), image_shape)
  def test_loss_first_stage_only_mode(self):
    model = self._build_model(
        is_training=True, first_stage_only=True, second_stage_batch_size=6)
...@@ -957,7 +986,7 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
                        exp_loc_loss)
    self.assertAllClose(loss_dict_out['second_stage_classification_loss'], 0)

  def test_restore_map_for_classification_ckpt(self):
    # Define mock tensorflow classification graph and save variables.
    test_graph_classification = tf.Graph()
    with test_graph_classification.as_default():
...@@ -986,12 +1015,17 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
      preprocessed_inputs = model.preprocess(inputs)
      prediction_dict = model.predict(preprocessed_inputs)
      model.postprocess(prediction_dict)
      var_map = model.restore_map(from_detection_checkpoint=False)
      self.assertIsInstance(var_map, dict)
      saver = tf.train.Saver(var_map)
      with self.test_session() as sess:
        saver.restore(sess, saved_model_path)
        for var in sess.run(tf.report_uninitialized_variables()):
          self.assertNotIn(model.first_stage_feature_extractor_scope, var.name)
          self.assertNotIn(model.second_stage_feature_extractor_scope,
                           var.name)
  def test_restore_map_for_detection_ckpt(self):
    # Define first detection graph and save variables.
    test_graph_detection1 = tf.Graph()
    with test_graph_detection1.as_default():
...@@ -1022,10 +1056,11 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
      preprocessed_inputs2 = model2.preprocess(inputs2)
      prediction_dict2 = model2.predict(preprocessed_inputs2)
      model2.postprocess(prediction_dict2)
      var_map = model2.restore_map(from_detection_checkpoint=True)
      self.assertIsInstance(var_map, dict)
      saver = tf.train.Saver(var_map)
      with self.test_session() as sess:
        saver.restore(sess, saved_model_path)
        for var in sess.run(tf.report_uninitialized_variables()):
          self.assertNotIn(model2.first_stage_feature_extractor_scope, var.name)
          self.assertNotIn(model2.second_stage_feature_extractor_scope,
......
...@@ -23,13 +23,12 @@ from abc import abstractmethod
import re
import tensorflow as tf
from object_detection.core import box_list
from object_detection.core import box_predictor as bpredictor
from object_detection.core import model
from object_detection.core import standard_fields as fields
from object_detection.core import target_assigner
from object_detection.utils import shape_utils
slim = tf.contrib.slim
...@@ -324,7 +323,8 @@ class SSDMetaArch(model.DetectionModel):
      a list of pairs (height, width) for each feature map in feature_maps
    """
    feature_map_shapes = [
        shape_utils.combined_static_and_dynamic_shape(
            feature_map) for feature_map in feature_maps
    ]
    return [(shape[1], shape[2]) for shape in feature_map_shapes]
...@@ -365,8 +365,7 @@ class SSDMetaArch(model.DetectionModel):
    with tf.name_scope('Postprocessor'):
      box_encodings = prediction_dict['box_encodings']
      class_predictions = prediction_dict['class_predictions_with_background']
      detection_boxes = self._batch_decode(box_encodings)
      detection_boxes = tf.expand_dims(detection_boxes, axis=2)
      class_predictions_without_background = tf.slice(class_predictions,
...@@ -375,10 +374,14 @@ class SSDMetaArch(model.DetectionModel):
      detection_scores = self._score_conversion_fn(
          class_predictions_without_background)
      clip_window = tf.constant([0, 0, 1, 1], tf.float32)
      (nmsed_boxes, nmsed_scores, nmsed_classes, _,
       num_detections) = self._non_max_suppression_fn(detection_boxes,
                                                      detection_scores,
                                                      clip_window=clip_window)
      return {'detection_boxes': nmsed_boxes,
              'detection_scores': nmsed_scores,
              'detection_classes': nmsed_classes,
              'num_detections': tf.to_float(num_detections)}
  def loss(self, prediction_dict, scope=None):
    """Compute scalar loss tensors with respect to provided groundtruth.
...@@ -546,8 +549,7 @@ class SSDMetaArch(model.DetectionModel):
        tf.slice(prediction_dict['class_predictions_with_background'],
                 [0, 0, 1], class_pred_shape), class_pred_shape)
    decoded_boxes = self._batch_decode(prediction_dict['box_encodings'])
    decoded_box_tensors_list = tf.unstack(decoded_boxes)
    class_prediction_list = tf.unstack(class_predictions)
    decoded_boxlist_list = []
...@@ -562,33 +564,51 @@ class SSDMetaArch(model.DetectionModel):
        decoded_boxlist_list=decoded_boxlist_list,
        match_list=match_list)
  def _batch_decode(self, box_encodings):
    """Decodes a batch of box encodings with respect to the anchors.

    Args:
      box_encodings: A float32 tensor of shape
        [batch_size, num_anchors, box_code_size] containing box encodings.

    Returns:
      decoded_boxes: A float32 tensor of shape
        [batch_size, num_anchors, 4] containing the decoded boxes.
    """
    combined_shape = shape_utils.combined_static_and_dynamic_shape(
        box_encodings)
    batch_size = combined_shape[0]
    tiled_anchor_boxes = tf.tile(
        tf.expand_dims(self.anchors.get(), 0), [batch_size, 1, 1])
    tiled_anchors_boxlist = box_list.BoxList(
        tf.reshape(tiled_anchor_boxes, [-1, self._box_coder.code_size]))
    decoded_boxes = self._box_coder.decode(
        tf.reshape(box_encodings, [-1, self._box_coder.code_size]),
        tiled_anchors_boxlist)
    return tf.reshape(decoded_boxes.get(),
                      tf.stack([combined_shape[0], combined_shape[1],
                                4]))

  def restore_map(self, from_detection_checkpoint=True):
    """Returns a map of variables to load from a foreign checkpoint.

    See parent class for details.

    Args:
      from_detection_checkpoint: whether to restore from a full detection
        checkpoint (with compatible variable names) or to restore from a
        classification checkpoint for initialization prior to training.

    Returns:
      A dict mapping variable names (to load from a checkpoint) to variables in
      the model graph.
    """
    variables_to_restore = {}
    for variable in tf.all_variables():
      if variable.op.name.startswith(self._extract_features_scope):
        var_name = variable.op.name
        if not from_detection_checkpoint:
          var_name = (re.split('^' + self._extract_features_scope + '/',
                               var_name)[-1])
        variables_to_restore[var_name] = variable
    return variables_to_restore
...@@ -116,24 +116,46 @@ class SsdMetaArchTest(tf.test.TestCase):
        localization_loss_weight, normalize_loss_by_num_matches,
        hard_example_miner)

  def test_preprocess_preserves_input_shapes(self):
    image_shapes = [(3, None, None, 3),
                    (None, 10, 10, 3),
                    (None, None, None, 3)]
    for image_shape in image_shapes:
      image_placeholder = tf.placeholder(tf.float32, shape=image_shape)
      preprocessed_inputs = self._model.preprocess(image_placeholder)
      self.assertAllEqual(preprocessed_inputs.shape.as_list(), image_shape)

  def test_predict_results_have_correct_keys_and_shapes(self):
    batch_size = 3
    image_size = 2
    input_shapes = [(batch_size, image_size, image_size, 3),
                    (None, image_size, image_size, 3),
                    (batch_size, None, None, 3),
                    (None, None, None, 3)]
    expected_box_encodings_shape_out = (
        batch_size, self._num_anchors, self._code_size)
    expected_class_predictions_with_background_shape_out = (
        batch_size, self._num_anchors, self._num_classes+1)
    for input_shape in input_shapes:
      tf_graph = tf.Graph()
      with tf_graph.as_default():
        preprocessed_input_placeholder = tf.placeholder(tf.float32,
                                                        shape=input_shape)
        prediction_dict = self._model.predict(preprocessed_input_placeholder)
        self.assertTrue('box_encodings' in prediction_dict)
        self.assertTrue('class_predictions_with_background' in prediction_dict)
        self.assertTrue('feature_maps' in prediction_dict)
        init_op = tf.global_variables_initializer()
      with self.test_session(graph=tf_graph) as sess:
        sess.run(init_op)
        prediction_out = sess.run(prediction_dict,
                                  feed_dict={
                                      preprocessed_input_placeholder:
                                          np.random.uniform(
                                              size=(batch_size, 2, 2, 3))})
      self.assertAllEqual(prediction_out['box_encodings'].shape,
                          expected_box_encodings_shape_out)
      self.assertAllEqual(
...@@ -142,10 +164,11 @@ class SsdMetaArchTest(tf.test.TestCase):
  def test_postprocess_results_are_correct(self):
    batch_size = 2
    image_size = 2
    input_shapes = [(batch_size, image_size, image_size, 3),
                    (None, image_size, image_size, 3),
                    (batch_size, None, None, 3),
                    (None, None, None, 3)]
    expected_boxes = np.array([[[0, 0, .5, .5],
                                [0, .5, .5, 1],
...@@ -163,15 +186,25 @@ class SsdMetaArchTest(tf.test.TestCase):
                                [0, 0, 0, 0, 0]])
    expected_num_detections = np.array([4, 4])
    for input_shape in input_shapes:
      tf_graph = tf.Graph()
      with tf_graph.as_default():
        preprocessed_input_placeholder = tf.placeholder(tf.float32,
                                                        shape=input_shape)
        prediction_dict = self._model.predict(preprocessed_input_placeholder)
        detections = self._model.postprocess(prediction_dict)
        self.assertTrue('detection_boxes' in detections)
        self.assertTrue('detection_scores' in detections)
        self.assertTrue('detection_classes' in detections)
        self.assertTrue('num_detections' in detections)
        init_op = tf.global_variables_initializer()
      with self.test_session(graph=tf_graph) as sess:
        sess.run(init_op)
        detections_out = sess.run(detections,
                                  feed_dict={
                                      preprocessed_input_placeholder:
                                          np.random.uniform(
                                              size=(batch_size, 2, 2, 3))})
      self.assertAllClose(detections_out['detection_boxes'], expected_boxes)
      self.assertAllClose(detections_out['detection_scores'], expected_scores)
      self.assertAllClose(detections_out['detection_classes'], expected_classes)
...@@ -207,20 +240,21 @@ class SsdMetaArchTest(tf.test.TestCase):
    self.assertAllClose(losses_out['classification_loss'],
                        expected_classification_loss)

  def test_restore_map_for_detection_ckpt(self):
    init_op = tf.global_variables_initializer()
    saver = tf_saver.Saver()
    save_path = self.get_temp_dir()
    with self.test_session() as sess:
      sess.run(init_op)
      saved_model_path = saver.save(sess, save_path)
      var_map = self._model.restore_map(from_detection_checkpoint=True)
      self.assertIsInstance(var_map, dict)
      saver = tf.train.Saver(var_map)
      saver.restore(sess, saved_model_path)
      for var in sess.run(tf.report_uninitialized_variables()):
        self.assertNotIn('FeatureExtractor', var.name)

  def test_restore_map_for_classification_ckpt(self):
    # Define mock tensorflow classification graph and save variables.
    test_graph_classification = tf.Graph()
    with test_graph_classification.as_default():
...@@ -246,10 +280,11 @@ class SsdMetaArchTest(tf.test.TestCase):
      preprocessed_inputs = self._model.preprocess(inputs)
      prediction_dict = self._model.predict(preprocessed_inputs)
      self._model.postprocess(prediction_dict)
      var_map = self._model.restore_map(from_detection_checkpoint=False)
      self.assertIsInstance(var_map, dict)
      saver = tf.train.Saver(var_map)
      with self.test_session() as sess:
        saver.restore(sess, saved_model_path)
        for var in sess.run(tf.report_uninitialized_variables()):
          self.assertNotIn('FeatureExtractor', var.name)
......
@@ -94,7 +94,6 @@ py_library(
     deps = [
         "//tensorflow",
         "//tensorflow_models/object_detection/meta_architectures:faster_rcnn_meta_arch",
-        "//tensorflow_models/object_detection/utils:variables_helper",
        "//tensorflow_models/slim:inception_resnet_v2",
     ],
 )
...
@@ -25,7 +25,6 @@ Huang et al. (https://arxiv.org/abs/1611.10012)
 import tensorflow as tf

 from object_detection.meta_architectures import faster_rcnn_meta_arch
-from object_detection.utils import variables_helper
 from nets import inception_resnet_v2

 slim = tf.contrib.slim
@@ -168,30 +167,30 @@ class FasterRCNNInceptionResnetV2FeatureExtractor(
   def restore_from_classification_checkpoint_fn(
       self,
-      checkpoint_path,
       first_stage_feature_extractor_scope,
       second_stage_feature_extractor_scope):
-    """Returns callable for loading a checkpoint into the tensorflow graph.
+    """Returns a map of variables to load from a foreign checkpoint.

     Note that this overrides the default implementation in
     faster_rcnn_meta_arch.FasterRCNNFeatureExtractor which does not work for
     InceptionResnetV2 checkpoints.

-    TODO: revisit whether it's possible to force the `Repeat` namescope as
-    created in `_extract_box_classifier_features` to start counting at 2 (e.g.
-    `Repeat_2`) so that the default restore_fn can be used.
+    TODO: revisit whether it's possible to force the
+    `Repeat` namescope as created in `_extract_box_classifier_features` to
+    start counting at 2 (e.g. `Repeat_2`) so that the default restore_fn can
+    be used.

     Args:
-      checkpoint_path: Path to checkpoint to restore.
       first_stage_feature_extractor_scope: A scope name for the first stage
         feature extractor.
       second_stage_feature_extractor_scope: A scope name for the second stage
         feature extractor.

     Returns:
-      a callable which takes a tf.Session as input and loads a checkpoint when
-      run.
+      A dict mapping variable names (to load from a checkpoint) to variables in
+      the model graph.
     """
     variables_to_restore = {}
     for variable in tf.global_variables():
       if variable.op.name.startswith(
@@ -207,10 +206,4 @@ class FasterRCNNInceptionResnetV2FeatureExtractor(
         var_name = var_name.replace(
             second_stage_feature_extractor_scope + '/', '')
         variables_to_restore[var_name] = variable
-    variables_to_restore = (
-        variables_helper.get_variables_available_in_checkpoint(
-            variables_to_restore, checkpoint_path))
-    saver = tf.train.Saver(variables_to_restore)
-    def restore(sess):
-      saver.restore(sess, checkpoint_path)
-    return restore
+    return variables_to_restore
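With the session callable gone, restoring from a checkpoint is now the caller's job. A minimal, hypothetical sketch of that caller-side pattern (not part of this diff; `model` is assumed to be an already-built DetectionModel and `checkpoint_path` is a placeholder): build a `tf.train.Saver` from the dict returned by `restore_map()` and restore it inside a session.

``` python
import tensorflow as tf


def restore_from_checkpoint(sess, model, checkpoint_path,
                            from_detection_checkpoint=True):
  """Hypothetical helper showing how the dict-based restore API is consumed."""
  # restore_map() returns {checkpoint_variable_name: variable_in_graph}.
  var_map = model.restore_map(
      from_detection_checkpoint=from_detection_checkpoint)
  # A Saver built from that dict restores exactly those variables.
  saver = tf.train.Saver(var_map)
  saver.restore(sess, checkpoint_path)
```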
@@ -63,7 +63,7 @@ class MultiResolutionFeatureMapGeneratorTest(tf.test.TestCase):
       sess.run(init_op)
       out_feature_maps = sess.run(feature_maps)
       out_feature_map_shapes = dict(
-          (key, value.shape) for key, value in out_feature_maps.iteritems())
+          (key, value.shape) for key, value in out_feature_maps.items())
       self.assertDictEqual(out_feature_map_shapes, expected_feature_map_shapes)

   def test_get_expected_feature_map_shapes_with_inception_v3(self):
@@ -93,7 +93,7 @@ class MultiResolutionFeatureMapGeneratorTest(tf.test.TestCase):
       sess.run(init_op)
       out_feature_maps = sess.run(feature_maps)
       out_feature_map_shapes = dict(
-          (key, value.shape) for key, value in out_feature_maps.iteritems())
+          (key, value.shape) for key, value in out_feature_maps.items())
       self.assertDictEqual(out_feature_map_shapes, expected_feature_map_shapes)
...
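For context, `dict.iteritems()` exists only in Python 2, while `dict.items()` works under both Python 2 and 3 (a list in 2, a view in 3), so the substitution keeps these tests interpreter-agnostic. A trivial illustration with made-up feature-map names:

``` python
# items() runs under Python 2 and 3; iteritems() raises AttributeError on 3.
feature_maps = {'layer_1': (4, 35, 35, 256), 'layer_2': (4, 17, 17, 512)}
shape_dict = dict((key, value) for key, value in feature_maps.items())
print(shape_dict)
```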
@@ -140,9 +140,9 @@
     "opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)\n",
     "tar_file = tarfile.open(MODEL_FILE)\n",
     "for file in tar_file.getmembers():\n",
     "  file_name = os.path.basename(file.name)\n",
     "  if 'frozen_inference_graph.pb' in file_name:\n",
     "    tar_file.extract(file, os.getcwd())"
   ]
 },
 {
@@ -162,11 +162,11 @@
   "source": [
     "detection_graph = tf.Graph()\n",
     "with detection_graph.as_default():\n",
     "  od_graph_def = tf.GraphDef()\n",
     "  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:\n",
     "    serialized_graph = fid.read()\n",
     "    od_graph_def.ParseFromString(serialized_graph)\n",
     "    tf.import_graph_def(od_graph_def, name='')"
   ]
 },
 {
...
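Unescaped from the notebook JSON, the two touched cells correspond roughly to the Python below. `DOWNLOAD_BASE`, `MODEL_FILE`, `PATH_TO_CKPT`, `opener`, and the imports are defined in earlier notebook cells that are not part of this diff; the values shown here are illustrative placeholders, included only so the cells are readable outside the notebook.

``` python
import os
import tarfile
import tensorflow as tf
from six.moves import urllib

# Defined in earlier notebook cells (placeholder values for illustration).
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
MODEL_FILE = 'ssd_mobilenet_v1_coco_11_06_2017.tar.gz'
PATH_TO_CKPT = os.path.join(MODEL_FILE.replace('.tar.gz', ''),
                            'frozen_inference_graph.pb')
opener = urllib.request.URLopener()

# Cell 1: download the model tarball and extract only the frozen graph.
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
  file_name = os.path.basename(file.name)
  if 'frozen_inference_graph.pb' in file_name:
    tar_file.extract(file, os.getcwd())

# Cell 2: load the frozen graph into its own tf.Graph.
detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')
```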
@@ -111,6 +111,11 @@ train_config: {
   gradient_clipping_by_norm: 10.0
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
@@ -126,6 +131,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
@@ -109,6 +109,11 @@ train_config: {
   gradient_clipping_by_norm: 10.0
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
@@ -124,6 +129,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
@@ -109,6 +109,11 @@ train_config: {
   gradient_clipping_by_norm: 10.0
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
@@ -124,6 +129,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
@@ -109,6 +109,11 @@ train_config: {
   gradient_clipping_by_norm: 10.0
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
@@ -124,6 +129,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
@@ -106,6 +106,11 @@ train_config: {
   gradient_clipping_by_norm: 10.0
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
@@ -121,6 +126,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
@@ -151,6 +151,11 @@ train_config: {
   }
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
@@ -170,6 +175,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
@@ -157,6 +157,11 @@ train_config: {
   }
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
@@ -176,6 +181,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...