ModelZoo / ResNet50_tensorflow · Commits

Commit dff0f0c1, authored Aug 08, 2017 by Alexander Gorban
Merge branch 'master' of github.com:tensorflow/models
Parents: da341f70, 36203f09
Changes: 187. Showing 20 changed files with 681 additions and 294 deletions (+681 −294).
- object_detection/g3doc/running_notebook.md (+2 −2)
- object_detection/g3doc/running_on_cloud.md (+1 −1)
- object_detection/g3doc/running_pets.md (+25 −16)
- object_detection/g3doc/using_your_own_dataset.md (+157 −0)
- object_detection/meta_architectures/BUILD (+2 −3)
- object_detection/meta_architectures/faster_rcnn_meta_arch.py (+180 −96)
- object_detection/meta_architectures/faster_rcnn_meta_arch_test_lib.py (+124 −89)
- object_detection/meta_architectures/ssd_meta_arch.py (+47 −27)
- object_detection/meta_architectures/ssd_meta_arch_test.py (+68 −33)
- object_detection/models/BUILD (+0 −1)
- object_detection/models/faster_rcnn_inception_resnet_v2_feature_extractor.py (+9 −16)
- object_detection/models/feature_map_generators_test.py (+2 −2)
- object_detection/object_detection_tutorial.ipynb (+8 −8)
- object_detection/samples/configs/faster_rcnn_inception_resnet_v2_atrous_pets.config (+8 −0)
- object_detection/samples/configs/faster_rcnn_resnet101_pets.config (+8 −0)
- object_detection/samples/configs/faster_rcnn_resnet152_pets.config (+8 −0)
- object_detection/samples/configs/faster_rcnn_resnet50_pets.config (+8 −0)
- object_detection/samples/configs/rfcn_resnet101_pets.config (+8 −0)
- object_detection/samples/configs/ssd_inception_v2_pets.config (+8 −0)
- object_detection/samples/configs/ssd_mobilenet_v1_pets.config (+8 −0)
object_detection/g3doc/running_notebook.md

@@ -11,5 +11,5 @@
The paragraph following the `jupyter notebook` snippet now reads:
"The notebook should open in your favorite web browser. Click the
[`object_detection_tutorial.ipynb`](../object_detection_tutorial.ipynb) link to
open the demo."
object_detection/g3doc/running_on_cloud.md

@@ -88,7 +88,7 @@
"...training checkpoints and events will be written to and Google Cloud Storage.
Users can monitor the progress of their training job on the
[ML Engine Dashboard](https://console.cloud.google.com/mlengine/jobs)."
The dashboard link changes from `https://pantheon.corp.google.com/mlengine/jobs`
to `https://console.cloud.google.com/mlengine/jobs`.

## Running an Evaluation Job on Cloud
object_detection/g3doc/running_pets.md

@@ -51,29 +51,35 @@
The download instructions are expanded. The dataset for Oxford-IIIT Pets lives
[here](http://www.robots.ox.ac.uk/~vgg/data/pets/). You will need to download
both the image dataset
[`images.tar.gz`](http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz)
and the groundtruth data
[`annotations.tar.gz`](http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz)
to the `tensorflow/models` directory and unzip them. This may take some time.
(Previously the sentence ended "...directory. This may take some time. After
downloading the tarballs, your `object_detection` directory should appear as
follows:".) The download commands are now spelled out:

```bash
# From tensorflow/models/
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
tar -xvf images.tar.gz
tar -xvf annotations.tar.gz
```

After downloading the tarballs, your `tensorflow/models` directory should appear
as follows (removed and added listing lines shown together):

```lang-none
- images.tar.gz
- annotations.tar.gz
+ images/
+ annotations/
+ object_detection/
+ data/
- images.tar.gz
- annotations.tar.gz
- create_pet_tf_record.py
... other files and directories
```

The Tensorflow Object Detection API expects data to be in the TFRecord format,
so we'll now run the `create_pet_tf_record` script to convert from the raw
Oxford-IIIT Pet dataset into TFRecords. Run the following commands from the
`tensorflow/models` directory (previously: the `object_detection` directory):

```bash
# From tensorflow/models/
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
tar -xvf annotations.tar.gz
tar -xvf images.tar.gz
python object_detection/create_pet_tf_record.py \
    --label_map_path=object_detection/data/pet_label_map.pbtxt \
    --data_dir=`pwd` \
    ...
```

@@ -83,8 +89,8 @@
Note: It is normal to see some warnings when running this script. You may ignore
them.

Two TFRecord files named `pet_train.record` and `pet_val.record` should be
generated in the `tensorflow/models` directory (previously: the
`object_detection` directory).

Now that the data has been generated, we'll need to upload it to Google Cloud
Storage so the data can be accessed by ML Engine. Run the following command to
...

@@ -263,7 +269,10 @@
Note: It takes roughly 10 minutes for a job to get started on ML Engine, and
roughly an hour for the system to evaluate the validation dataset. It may take
some time to populate the dashboards. If you do not see any entries after half
an hour, check the logs from the
[ML Engine Dashboard](https://console.cloud.google.com/mlengine/jobs). Note that
by default the training jobs are configured to go for much longer than is
necessary for convergence. To save money, we recommend killing your jobs once
you've seen that they've converged.

## Exporting the Tensorflow Graph

@@ -279,7 +288,7 @@
* `model.ckpt-${CHECKPOINT_NUMBER}.meta`

After you've identified a candidate checkpoint to export, run the following
command from `tensorflow/models` (previously: `tensorflow/models/object_detection`):

```bash
# From tensorflow/models
...
```
object_detection/g3doc/using_your_own_dataset.md (new file, mode 100644)

# Preparing Inputs

To use your own dataset in Tensorflow Object Detection API, you must convert it
into the
[TFRecord file format](https://www.tensorflow.org/api_guides/python/python_io#tfrecords_format_details).
This document outlines how to write a script to generate the TFRecord file.

## Label Maps

Each dataset is required to have a label map associated with it. This label map
defines a mapping from string class names to integer class Ids. The label map
should be a `StringIntLabelMap` text protobuf. Sample label maps can be found in
object_detection/data. Label maps should always start from id 1.

## Dataset Requirements

For every example in your dataset, you should have the following information:

1. An RGB image for the dataset encoded as jpeg or png.
2. A list of bounding boxes for the image. Each bounding box should contain:
    1. Bounding box coordinates (with origin in the top left corner) defined by
       4 floating point numbers [ymin, xmin, ymax, xmax]. Note that we store
       the _normalized_ coordinates (x / width, y / height) in the TFRecord
       dataset, as shown in the sketch below.
    2. The class of the object in the bounding box.
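As a quick illustration of the normalization convention in item 2 above, here is a minimal sketch; the helper name is only for illustration, and the pixel values are the ones used in the cat example further down:

```python
def normalize_box(ymin_px, xmin_px, ymax_px, xmax_px, height, width):
  """Converts pixel-space corners to the normalized [ymin, xmin, ymax, xmax] form."""
  return [ymin_px / height, xmin_px / width, ymax_px / height, xmax_px / width]

# A 1200x1032 image with a box spanning x in [322, 1062] and y in [174, 761]:
print(normalize_box(174.0, 322.0, 761.0, 1062.0, height=1032.0, width=1200.0))
# -> [0.1686..., 0.2683..., 0.7374..., 0.885]
```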
# Example Image

Consider the following image:

(image of the example cat)

with the following label map:

```
item {
  id: 1
  name: 'Cat'
}

item {
  id: 2
  name: 'Dog'
}
```
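If you need to read such a label map back inside a script, the `label_map_util` helpers under `object_detection/utils` can parse it. A minimal sketch, assuming that module is available in your checkout; the path is hypothetical:

```python
from object_detection.utils import label_map_util

# Parses the StringIntLabelMap text protobuf shown above into {'Cat': 1, 'Dog': 2}.
label_map_dict = label_map_util.get_label_map_dict('path/to/label_map.pbtxt')
```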
We can generate a tf.Example proto for this image using the following code:

```python
def create_cat_tf_example(encoded_cat_image_data):
  """Creates a tf.Example proto from sample cat image.

  Args:
    encoded_cat_image_data: The jpg encoded data of the cat image.

  Returns:
    example: The created tf.Example.
  """
  height = 1032
  width = 1200
  filename = 'example_cat.jpg'
  image_format = b'jpg'

  xmins = [322.0 / 1200.0]
  xmaxs = [1062.0 / 1200.0]
  ymins = [174.0 / 1032.0]
  ymaxs = [761.0 / 1032.0]
  classes_text = ['Cat']
  classes = [1]

  tf_example = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_cat_image_data),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
  }))
  return tf_example
```

## Conversion Script Outline

A typical conversion script will look like the following:

```python
import tensorflow as tf

from object_detection.utils import dataset_util


flags = tf.app.flags
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


def create_tf_example(example):
  # TODO(user): Populate the following variables from your example.
  height = None  # Image height
  width = None  # Image width
  filename = None  # Filename of the image. Empty if image is not from file
  encoded_image_data = None  # Encoded image bytes
  image_format = None  # b'jpeg' or b'png'

  xmins = []  # List of normalized left x coordinates in bounding box (1 per box)
  xmaxs = []  # List of normalized right x coordinates in bounding box
              # (1 per box)
  ymins = []  # List of normalized top y coordinates in bounding box (1 per box)
  ymaxs = []  # List of normalized bottom y coordinates in bounding box
              # (1 per box)
  classes_text = []  # List of string class name of bounding box (1 per box)
  classes = []  # List of integer class id of bounding box (1 per box)

  tf_example = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_image_data),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
  }))
  return tf_example


def main(_):
  writer = tf.python_io.TFRecordWriter(FLAGS.output_path)

  # TODO(user): Write code to read in your dataset to examples variable

  for example in examples:
    tf_example = create_tf_example(example)
    writer.write(tf_example.SerializeToString())

  writer.close()


if __name__ == '__main__':
  tf.app.run()
```

Note: You may notice additional fields in some other datasets. They are
currently unused by the API and are optional.
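As a quick sanity check after running such a conversion script, the generated file can be read back with the TF 1.x record iterator. A minimal sketch; the output filename is hypothetical:

```python
import tensorflow as tf

count = 0
for record in tf.python_io.tf_record_iterator('pets_train.record'):
  example = tf.train.Example()
  example.ParseFromString(record)  # raises if the record is not a valid Example
  count += 1
print('Read %d tf.Example records' % count)
```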
object_detection/meta_architectures/BUILD

@@ -13,12 +13,11 @@ py_library(
In the `ssd_meta_arch` library, the `variables_helper` dependency is swapped for `shape_utils`:

    srcs = ["ssd_meta_arch.py"],
    deps = [
        "//tensorflow",
        "//tensorflow_models/object_detection/core:box_coder",
        "//tensorflow_models/object_detection/core:box_list",
        "//tensorflow_models/object_detection/core:box_predictor",
        "//tensorflow_models/object_detection/core:model",
        "//tensorflow_models/object_detection/core:target_assigner",
    -   "//tensorflow_models/object_detection/utils:variables_helper",
    +   "//tensorflow_models/object_detection/utils:shape_utils",
    ],
    )

@@ -56,7 +55,7 @@ py_library(
The same swap is made in the `faster_rcnn_meta_arch` library:

        "//tensorflow_models/object_detection/core:standard_fields",
        "//tensorflow_models/object_detection/core:target_assigner",
        "//tensorflow_models/object_detection/utils:ops",
    -   "//tensorflow_models/object_detection/utils:variables_helper",
    +   "//tensorflow_models/object_detection/utils:shape_utils",
    ],
    )
object_detection/meta_architectures/faster_rcnn_meta_arch.py

@@ -80,7 +80,7 @@ (imports)

    from object_detection.core import post_processing
    from object_detection.core import standard_fields as fields
    from object_detection.core import target_assigner
    from object_detection.utils import ops
    -from object_detection.utils import variables_helper
    +from object_detection.utils import shape_utils

    slim = tf.contrib.slim

@@ -159,21 +159,19 @@ class FasterRCNNFeatureExtractor(object):
`restore_from_classification_checkpoint_fn` no longer builds a restore callable; it returns the variable map and leaves checkpoint handling to the caller:

      def restore_from_classification_checkpoint_fn(
          self,
    -     checkpoint_path,
          first_stage_feature_extractor_scope,
          second_stage_feature_extractor_scope):
    -   """Returns callable for loading a checkpoint into the tensorflow graph.
    +   """Returns a map of variables to load from a foreign checkpoint.

        Args:
    -     checkpoint_path: path to checkpoint to restore.
          first_stage_feature_extractor_scope: A scope name for the first stage
            feature extractor.
          second_stage_feature_extractor_scope: A scope name for the second stage
            feature extractor.

        Returns:
    -     a callable which takes a tf.Session as input and loads a checkpoint
    -       when run.
    +     A dict mapping variable names (to load from a checkpoint) to variables
    +       in the model graph.
        """
        variables_to_restore = {}
        for variable in tf.global_variables():

@@ -182,13 +180,7 @@

          if variable.op.name.startswith(scope_name):
            var_name = variable.op.name.replace(scope_name + '/', '')
            variables_to_restore[var_name] = variable
    -   variables_to_restore = (
    -       variables_helper.get_variables_available_in_checkpoint(
    -           variables_to_restore, checkpoint_path))
    -   saver = tf.train.Saver(variables_to_restore)
    -   def restore(sess):
    -     saver.restore(sess, checkpoint_path)
    -   return restore
    +   return variables_to_restore

@@ -774,10 +766,9 @@ class FasterRCNNMetaArch(model.DetectionModel):
The helper that flattens the first two dimensions (its docstring still promises "A float tensor with shape [A * B, ..., depth] (where the first and last dimension are statically defined)") now mixes static and dynamic shape information via `shape_utils`:

    -   inputs_shape = inputs.get_shape().as_list()
    -   flattened_shape = tf.concat([[inputs_shape[0] * inputs_shape[1]],
    -                                tf.shape(inputs)[2:-1],
    -                                [inputs_shape[-1]]], 0)
    +   combined_shape = shape_utils.combined_static_and_dynamic_shape(inputs)
    +   flattened_shape = tf.stack([combined_shape[0] * combined_shape[1]] +
    +                              combined_shape[2:])
        return tf.reshape(inputs, flattened_shape)

      def postprocess(self, prediction_dict):
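The mixed static/dynamic reshape above is the recurring pattern of this commit. A minimal standalone sketch, assuming `shape_utils.combined_static_and_dynamic_shape` returns a Python list whose entries are ints for statically known dimensions and scalar tensors otherwise:

```python
import tensorflow as tf
from object_detection.utils import shape_utils

# Static batch and channel dimensions, dynamic spatial dimensions.
inputs = tf.placeholder(tf.float32, shape=(2, 4, None, None, 3))
combined_shape = shape_utils.combined_static_and_dynamic_shape(inputs)

# Collapse the first two (static) dimensions; tf.stack happily mixes the
# remaining int and scalar-tensor entries into one shape tensor.
flattened_shape = tf.stack([combined_shape[0] * combined_shape[1]] +
                           combined_shape[2:])
flattened = tf.reshape(inputs, flattened_shape)  # -> shape (8, ?, ?, 3)
```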
@@ -875,52 +866,128 @@ class FasterRCNNMetaArch(model.DetectionModel):
`_postprocess_rpn` is rewritten to run in batch. The removed implementation unstacked the batch and, for each image, decoded the RPN encodings against the anchors (`self._box_coder.decode(rpn_box_encodings, box_list.BoxList(anchors))`), took the foreground objectness softmax, ran `post_processing.multiclass_non_max_suppression` with the first-stage thresholds, applied `tf.stop_gradient` and `_sample_box_classifier_minibatch` when training, converted to normalized coordinates with `box_list_ops.to_normalized_coordinates`, padded each BoxList with `box_list_ops.pad_or_clip_box_list(..., num_boxes=self.max_num_proposals)`, and finally stacked the per-image boxes, scores and proposal counts. The new implementation decodes and suppresses the whole batch at once:

        rpn_box_encodings_batch = tf.expand_dims(rpn_box_encodings_batch, axis=2)
        rpn_encodings_shape = shape_utils.combined_static_and_dynamic_shape(
            rpn_box_encodings_batch)
        tiled_anchor_boxes = tf.tile(
            tf.expand_dims(anchors, 0), [rpn_encodings_shape[0], 1, 1])
        proposal_boxes = self._batch_decode_boxes(rpn_box_encodings_batch,
                                                  tiled_anchor_boxes)
        proposal_boxes = tf.squeeze(proposal_boxes, axis=2)
        rpn_objectness_softmax_without_background = tf.nn.softmax(
            rpn_objectness_predictions_with_background_batch)[:, :, 1]
        clip_window = tf.to_float(tf.stack([0, 0, image_shape[1], image_shape[2]]))
        (proposal_boxes, proposal_scores, _, _,
         num_proposals) = post_processing.batch_multiclass_non_max_suppression(
             tf.expand_dims(proposal_boxes, axis=2),
             tf.expand_dims(rpn_objectness_softmax_without_background, axis=2),
             self._first_stage_nms_score_threshold,
             self._first_stage_nms_iou_threshold,
             self._first_stage_max_proposals,
             self._first_stage_max_proposals,
             clip_window=clip_window)
        if self._is_training:
          proposal_boxes = tf.stop_gradient(proposal_boxes)
          if not self._hard_example_miner:
            (groundtruth_boxlists,
             groundtruth_classes_with_background_list,
            ) = self._format_groundtruth_data(image_shape)
            (proposal_boxes, proposal_scores,
             num_proposals) = self._unpad_proposals_and_sample_box_classifier_batch(
                 proposal_boxes, proposal_scores, num_proposals,
                 groundtruth_boxlists, groundtruth_classes_with_background_list)
        # normalize proposal boxes
        proposal_boxes_reshaped = tf.reshape(proposal_boxes, [-1, 4])
        normalized_proposal_boxes_reshaped = box_list_ops.to_normalized_coordinates(
            box_list.BoxList(proposal_boxes_reshaped),
            image_shape[1], image_shape[2], check_range=False).get()
        proposal_boxes = tf.reshape(normalized_proposal_boxes_reshaped,
                                    [-1, proposal_boxes.shape[1].value, 4])
        return proposal_boxes, proposal_scores, num_proposals

The per-image unpadding and minibatch sampling moves into a new helper:

      def _unpad_proposals_and_sample_box_classifier_batch(
          self,
          proposal_boxes,
          proposal_scores,
          num_proposals,
          groundtruth_boxlists,
          groundtruth_classes_with_background_list):
        """Unpads proposals and samples a minibatch for second stage.

        Args:
          proposal_boxes: A float tensor with shape
            [batch_size, num_proposals, 4] representing the (potentially zero
            padded) proposal boxes for all images in the batch.  These boxes are
            represented as normalized coordinates.
          proposal_scores: A float tensor with shape
            [batch_size, num_proposals] representing the (potentially zero
            padded) proposal objectness scores for all images in the batch.
          num_proposals: A Tensor of type `int32`. A 1-D tensor of shape [batch]
            representing the number of proposals predicted for each image in
            the batch.
          groundtruth_boxlists: A list of BoxLists containing (absolute)
            coordinates of the groundtruth boxes.
          groundtruth_classes_with_background_list: A list of 2-D one-hot
            (or k-hot) tensors of shape [num_boxes, num_classes+1] containing
            the class targets with the 0th index assumed to map to the
            background class.

        Returns:
          proposal_boxes: A float tensor with shape
            [batch_size, second_stage_batch_size, 4] representing the
            (potentially zero padded) proposal boxes for all images in the
            batch.  These boxes are represented as normalized coordinates.
          proposal_scores: A float tensor with shape
            [batch_size, second_stage_batch_size] representing the (potentially
            zero padded) proposal objectness scores for all images in the batch.
          num_proposals: A Tensor of type `int32`. A 1-D tensor of shape [batch]
            representing the number of proposals predicted for each image in
            the batch.
        """
        single_image_proposal_box_sample = []
        single_image_proposal_score_sample = []
        single_image_num_proposals_sample = []
        for (single_image_proposal_boxes,
             single_image_proposal_scores,
             single_image_num_proposals,
             single_image_groundtruth_boxlist,
             single_image_groundtruth_classes_with_background) in zip(
                 tf.unstack(proposal_boxes),
                 tf.unstack(proposal_scores),
                 tf.unstack(num_proposals),
                 groundtruth_boxlists,
                 groundtruth_classes_with_background_list):
          static_shape = single_image_proposal_boxes.get_shape()
          sliced_static_shape = tf.TensorShape([tf.Dimension(None),
                                                static_shape.dims[-1]])
          single_image_proposal_boxes = tf.slice(
              single_image_proposal_boxes,
              [0, 0],
              [single_image_num_proposals, -1])
          single_image_proposal_boxes.set_shape(sliced_static_shape)
          single_image_proposal_scores = tf.slice(single_image_proposal_scores,
                                                  [0],
                                                  [single_image_num_proposals])
          single_image_boxlist = box_list.BoxList(single_image_proposal_boxes)
          single_image_boxlist.add_field(fields.BoxListFields.scores,
                                         single_image_proposal_scores)
          sampled_boxlist = self._sample_box_classifier_minibatch(
              single_image_boxlist,
              single_image_groundtruth_boxlist,
              single_image_groundtruth_classes_with_background)
          sampled_padded_boxlist = box_list_ops.pad_or_clip_box_list(
              sampled_boxlist,
              num_boxes=self._second_stage_batch_size)
          single_image_num_proposals_sample.append(tf.minimum(
              sampled_boxlist.num_boxes(),
              self._second_stage_batch_size))
          bb = sampled_padded_boxlist.get()
          single_image_proposal_box_sample.append(bb)
          single_image_proposal_score_sample.append(
              sampled_padded_boxlist.get_field(fields.BoxListFields.scores))
        return (tf.stack(single_image_proposal_box_sample),
                tf.stack(single_image_proposal_score_sample),
                tf.stack(single_image_num_proposals_sample))

      def _format_groundtruth_data(self, image_shape):
        """Helper function for preparing groundtruth data for target assignment.
@@ -1074,7 +1141,7 @@ class FasterRCNNMetaArch(model.DetectionModel):

            class_predictions_with_background,
            [-1, self.max_num_proposals, self.num_classes + 1])
    -   refined_decoded_boxes_batch = self._batch_decode_refined_boxes(
    +   refined_decoded_boxes_batch = self._batch_decode_boxes(
            refined_box_encodings_batch, proposal_boxes)
        class_predictions_with_background_batch = (
            self._second_stage_score_conversion_fn(

@@ -1092,19 +1159,26 @@
The second-stage NMS call now unpacks its outputs and assembles the detections dictionary explicitly:

          mask_predictions_batch = tf.reshape(
              mask_predictions, [-1, self.max_num_proposals,
                                 self.num_classes, mask_height, mask_width])
    -   detections = self._second_stage_nms_fn(
    -       refined_decoded_boxes_batch,
    -       class_predictions_batch,
    -       clip_window=clip_window,
    -       change_coordinate_frame=True,
    -       num_valid_boxes=num_proposals,
    -       masks=mask_predictions_batch)
    +   (nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks,
    +    num_detections) = self._second_stage_nms_fn(
    +        refined_decoded_boxes_batch,
    +        class_predictions_batch,
    +        clip_window=clip_window,
    +        change_coordinate_frame=True,
    +        num_valid_boxes=num_proposals,
    +        masks=mask_predictions_batch)
    +   detections = {
    +       'detection_boxes': nmsed_boxes,
    +       'detection_scores': nmsed_scores,
    +       'detection_classes': nmsed_classes,
    +       'num_detections': tf.to_float(num_detections)}
    +   if nmsed_masks is not None:
    +     detections['detection_masks'] = nmsed_masks
        if mask_predictions is not None:
          detections['detection_masks'] = tf.to_float(
              tf.greater_equal(detections['detection_masks'], mask_threshold))
        return detections

@@ -1119,15 +1193,33 @@
`_batch_decode_refined_boxes` is generalized into `_batch_decode_boxes`. The removed version tiled `proposal_boxes` over `self.num_classes`, decoded, and reshaped to `[-1, self.max_num_proposals, self.num_classes, 4]`; its docstring described "(padded) refined bounding box predictions (for each image in batch, proposal and class)". The replacement decodes against arbitrary anchor boxes and derives all shapes from the input:

      def _batch_decode_boxes(self, box_encodings, anchor_boxes):
        """Decodes box encodings with respect to the anchor boxes.

        Args:
          box_encodings: a 4-D tensor with shape
            [batch_size, num_anchors, num_classes, self._box_coder.code_size]
            representing box encodings.
          anchor_boxes: [batch_size, num_anchors, 4] representing decoded
            bounding boxes.

        Returns:
          decoded_boxes: a [batch_size, num_anchors, num_classes, 4]
            float tensor representing bounding box predictions
            (for each image in batch, proposal and class).
        """
        combined_shape = shape_utils.combined_static_and_dynamic_shape(
            box_encodings)
        num_classes = combined_shape[2]
        tiled_anchor_boxes = tf.tile(
            tf.expand_dims(anchor_boxes, 2), [1, 1, num_classes, 1])
        tiled_anchors_boxlist = box_list.BoxList(
            tf.reshape(tiled_anchor_boxes, [-1, 4]))
        decoded_boxes = self._box_coder.decode(
            tf.reshape(box_encodings, [-1, self._box_coder.code_size]),
            tiled_anchors_boxlist)
        return tf.reshape(decoded_boxes.get(),
                          tf.stack([combined_shape[0], combined_shape[1],
                                    num_classes, 4]))

      def loss(self, prediction_dict, scope=None):
        """Compute scalar loss tensors given prediction tensors.
@@ -1413,25 +1505,22 @@ class FasterRCNNMetaArch(model.DetectionModel):
`restore_fn` becomes `restore_map`, which returns a variable map instead of a session callable:

            cls_losses=tf.expand_dims(single_image_cls_loss, 0),
            decoded_boxlist_list=[proposal_boxlist])

    -  def restore_fn(self, checkpoint_path, from_detection_checkpoint=True):
    -    """Returns callable for loading a checkpoint into the tensorflow graph.
    +  def restore_map(self, from_detection_checkpoint=True):
    +    """Returns a map of variables to load from a foreign checkpoint.

         See parent class for details.

         Args:
    -      checkpoint_path: path to checkpoint to restore.
    -      from_detection_checkpoint: whether to restore from a detection
    -        checkpoint (with compatible variable names) or to restore from a
    -        classification checkpoint for initialization prior to training.
    -        Note that when from_detection_checkpoint=True, the current
    -        implementation only supports restoration from an (exactly) identical
    -        model (with exception of the num_classes parameter).
    +      from_detection_checkpoint: whether to restore from a full detection
    +        checkpoint (with compatible variable names) or to restore from a
    +        classification checkpoint for initialization prior to training.

         Returns:
    -      a callable which takes a tf.Session as input and loads a checkpoint
    -        when run.
    +      A dict mapping variable names (to load from a checkpoint) to variables
    +        in the model graph.
         """
         if not from_detection_checkpoint:
           return self._feature_extractor.restore_from_classification_checkpoint_fn(
               self.first_stage_feature_extractor_scope,
               self.second_stage_feature_extractor_scope)

@@ -1439,13 +1528,8 @@

         variables_to_restore.append(slim.get_or_create_global_step())
         # Only load feature extractor variables to be consistent with loading from
         # a classification checkpoint.
    -    first_stage_variables = tf.contrib.framework.filter_variables(
    +    feature_extractor_variables = tf.contrib.framework.filter_variables(
             variables_to_restore,
             include_patterns=[self.first_stage_feature_extractor_scope,
                               self.second_stage_feature_extractor_scope])
    -    saver = tf.train.Saver(first_stage_variables)
    -    def restore(sess):
    -      saver.restore(sess, checkpoint_path)
    -    return restore
    +    return {var.op.name: var for var in feature_extractor_variables}
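The updated tests below exercise this new contract; in condensed form the caller-side pattern looks like this (a sketch, with `model` and `checkpoint_path` standing in for a built detection model and a real checkpoint path):

```python
import tensorflow as tf

# The model only reports which variables to load; the caller owns the Saver.
var_map = model.restore_map(from_detection_checkpoint=False)
saver = tf.train.Saver(var_map)
with tf.Session() as sess:
  saver.restore(sess, checkpoint_path)
```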
object_detection/meta_architectures/faster_rcnn_meta_arch_test_lib.py

@@ -226,61 +226,47 @@ class FasterRCNNMetaArchTestBase(tf.test.TestCase):
`test_predict_gives_correct_shapes_in_inference_mode_first_stage_only` is replaced by `test_predict_correct_shapes_in_inference_mode_both_stages`. The removed test built a single first-stage-only model, fed one `np.zeros((2, 10, 12, 3))` batch through a `(batch_size, None, None, 3)` placeholder, checked the RPN output keys and shapes (with `expected_num_anchors = height * width * 3 * 3`), and asserted that the anchors were clipped to the image window. The new test builds a full two-stage model and loops over several static and dynamic input shapes:

          batch_size = 2
          image_size = 10
          input_shapes = [(batch_size, image_size, image_size, 3),
                          (None, image_size, image_size, 3),
                          (batch_size, None, None, 3),
                          (None, None, None, 3)]
          expected_num_anchors = image_size * image_size * 3 * 3
          expected_shapes = {
              'rpn_box_predictor_features': (2, image_size, image_size, 512),
              'rpn_features_to_crop': (2, image_size, image_size, 3),
              'image_shape': (4,),
              'rpn_box_encodings': (2, expected_num_anchors, 4),
              'rpn_objectness_predictions_with_background':
                  (2, expected_num_anchors, 2),
              'anchors': (expected_num_anchors, 4),
              'refined_box_encodings': (2 * 8, 2, 4),
              'class_predictions_with_background': (2 * 8, 2 + 1),
              'num_proposals': (2,),
              'proposal_boxes': (2, 8, 4),
          }
          for input_shape in input_shapes:
            test_graph = tf.Graph()
            with test_graph.as_default():
              model = self._build_model(
                  is_training=False, first_stage_only=False,
                  second_stage_batch_size=2)
              preprocessed_inputs = tf.placeholder(tf.float32, shape=input_shape)
              result_tensor_dict = model.predict(preprocessed_inputs)
              init_op = tf.global_variables_initializer()
            with self.test_session(graph=test_graph) as sess:
              sess.run(init_op)
              tensor_dict_out = sess.run(result_tensor_dict, feed_dict={
                  preprocessed_inputs:
                  np.zeros((batch_size, image_size, image_size, 3))})
            self.assertEqual(set(tensor_dict_out.keys()),
                             set(expected_shapes.keys()))
            for key in expected_shapes:
              self.assertAllEqual(tensor_dict_out[key].shape,
                                  expected_shapes[key])

      def test_predict_gives_valid_anchors_in_training_mode_first_stage_only(self):
        test_graph = tf.Graph()

@@ -535,35 +521,67 @@
`test_postprocess_second_stage_only_inference_mode` now exercises both fully static and partially dynamic input shapes. The inputs (the two-image `proposal_boxes` array, `num_proposals = [3, 2]`, zero `refined_box_encodings`, all-ones `class_predictions_with_background`, `image_shape = (2, 36, 48, 3)`) become numpy arrays, and for each combination of `num_proposals_shapes = [(2), (None)]`, `refined_box_encodings_shapes = [(16, 2, 4), (None, 2, 4)]`, `class_predictions_with_background_shapes = [(16, 3), (None, 3)]` and `proposal_boxes_shapes = [(2, 8, 4), (None, 8, 4)]`, the model is rebuilt in a fresh graph, `model.postprocess` is wired to placeholders of those shapes, and the outputs are computed through a `feed_dict`. The assertions are unchanged: `detection_boxes` has shape `[2, 5, 4]`, `detection_scores` is `[[1, 1, 1, 1, 1], [1, 1, 1, 1, 0]]`, `detection_classes` is `[[0, 0, 0, 1, 1], [0, 0, 1, 1, 0]]`, and `num_detections` is `[5, 4]`.

@@ -571,6 +589,17 @@
A new preprocessing test is added:

      def test_preprocess_preserves_input_shapes(self):
        image_shapes = [(3, None, None, 3),
                        (None, 10, 10, 3),
                        (None, None, None, 3)]
        for image_shape in image_shapes:
          model = self._build_model(
              is_training=False, first_stage_only=False,
              second_stage_batch_size=6)
          image_placeholder = tf.placeholder(tf.float32, shape=image_shape)
          preprocessed_inputs = model.preprocess(image_placeholder)
          self.assertAllEqual(preprocessed_inputs.shape.as_list(), image_shape)

      def test_loss_first_stage_only_mode(self):
        model = self._build_model(
            is_training=True, first_stage_only=True, second_stage_batch_size=6)

@@ -957,7 +986,7 @@ and @@ -986,12 +1015,17 @@
`test_restore_fn_classification` is renamed `test_restore_map_for_classification_ckpt`. Instead of calling `model.restore_fn(saved_model_path, from_detection_checkpoint=False)` and invoking the returned callable on a session, the test builds its own Saver from the returned map:

          var_map = model.restore_map(from_detection_checkpoint=False)
          self.assertIsInstance(var_map, dict)
          saver = tf.train.Saver(var_map)
          with self.test_session() as sess:
            saver.restore(sess, saved_model_path)
            for var in sess.run(tf.report_uninitialized_variables()):
              self.assertNotIn(model.first_stage_feature_extractor_scope, var.name)
              self.assertNotIn(model.second_stage_feature_extractor_scope, var.name)

@@ -1022,10 +1056,11 @@
`test_restore_fn_detection` is likewise renamed `test_restore_map_for_detection_ckpt` and updated the same way, using `model2.restore_map(from_detection_checkpoint=True)`, `assertIsInstance(var_map, dict)`, `tf.train.Saver(var_map)` and `saver.restore(sess, saved_model_path)` before checking that no first- or second-stage feature extractor variables remain uninitialized.
object_detection/meta_architectures/ssd_meta_arch.py

@@ -23,13 +23,12 @@ (imports)

    from abc import abstractmethod
    import re
    import tensorflow as tf
    -from object_detection.core import box_coder as bcoder
    from object_detection.core import box_list
    from object_detection.core import box_predictor as bpredictor
    from object_detection.core import model
    from object_detection.core import standard_fields as fields
    from object_detection.core import target_assigner
    -from object_detection.utils import variables_helper
    +from object_detection.utils import shape_utils

    slim = tf.contrib.slim

@@ -324,7 +323,8 @@ class SSDMetaArch(model.DetectionModel):
The feature-map spatial dimensions ("a list of pairs (height, width) for each feature map in feature_maps") are now computed from the combined static/dynamic shape:

        feature_map_shapes = [
    -       feature_map.get_shape().as_list() for feature_map in feature_maps
    +       shape_utils.combined_static_and_dynamic_shape(feature_map)
    +       for feature_map in feature_maps
        ]
        return [(shape[1], shape[2]) for shape in feature_map_shapes]

@@ -365,8 +365,7 @@ and @@ -375,10 +374,14 @@ (postprocess)

        with tf.name_scope('Postprocessor'):
          box_encodings = prediction_dict['box_encodings']
          class_predictions = prediction_dict['class_predictions_with_background']
    -     detection_boxes = bcoder.batch_decode(box_encodings, self._box_coder,
    -                                           self.anchors)
    +     detection_boxes = self._batch_decode(box_encodings)
          detection_boxes = tf.expand_dims(detection_boxes, axis=2)
          class_predictions_without_background = tf.slice(class_predictions, ...
          detection_scores = self._score_conversion_fn(
              class_predictions_without_background)
          clip_window = tf.constant([0, 0, 1, 1], tf.float32)
    -     detections = self._non_max_suppression_fn(detection_boxes,
    -                                               detection_scores,
    -                                               clip_window=clip_window)
    -     return detections
    +     (nmsed_boxes, nmsed_scores, nmsed_classes, _,
    +      num_detections) = self._non_max_suppression_fn(detection_boxes,
    +                                                     detection_scores,
    +                                                     clip_window=clip_window)
    +     return {'detection_boxes': nmsed_boxes,
    +             'detection_scores': nmsed_scores,
    +             'detection_classes': nmsed_classes,
    +             'num_detections': tf.to_float(num_detections)}

      def loss(self, prediction_dict, scope=None):
        """Compute scalar loss tensors with respect to provided groundtruth.

@@ -546,8 +549,7 @@

            tf.slice(prediction_dict['class_predictions_with_background'],
                     [0, 0, 1], class_pred_shape),
            class_pred_shape)
    -   decoded_boxes = bcoder.batch_decode(prediction_dict['box_encodings'],
    -                                       self._box_coder, self.anchors)
    +   decoded_boxes = self._batch_decode(prediction_dict['box_encodings'])
        decoded_box_tensors_list = tf.unstack(decoded_boxes)
        class_prediction_list = tf.unstack(class_predictions)
        decoded_boxlist_list = []

@@ -562,33 +564,51 @@
A new `_batch_decode` helper is added, and `restore_fn` becomes `restore_map`:

      def _batch_decode(self, box_encodings):
        """Decodes a batch of box encodings with respect to the anchors.

        Args:
          box_encodings: A float32 tensor of shape
            [batch_size, num_anchors, box_code_size] containing box encodings.

        Returns:
          decoded_boxes: A float32 tensor of shape
            [batch_size, num_anchors, 4] containing the decoded boxes.
        """
        combined_shape = shape_utils.combined_static_and_dynamic_shape(
            box_encodings)
        batch_size = combined_shape[0]
        tiled_anchor_boxes = tf.tile(
            tf.expand_dims(self.anchors.get(), 0), [batch_size, 1, 1])
        tiled_anchors_boxlist = box_list.BoxList(
            tf.reshape(tiled_anchor_boxes, [-1, self._box_coder.code_size]))
        decoded_boxes = self._box_coder.decode(
            tf.reshape(box_encodings, [-1, self._box_coder.code_size]),
            tiled_anchors_boxlist)
        return tf.reshape(decoded_boxes.get(),
                          tf.stack([combined_shape[0], combined_shape[1], 4]))

      def restore_map(self, from_detection_checkpoint=True):
        """Returns a map of variables to load from a foreign checkpoint.

        See parent class for details.

        Args:
          from_detection_checkpoint: whether to restore from a full detection
            checkpoint (with compatible variable names) or to restore from a
            classification checkpoint for initialization prior to training.

        Returns:
          A dict mapping variable names (to load from a checkpoint) to variables
            in the model graph.
        """
        variables_to_restore = {}
        for variable in tf.all_variables():
          if variable.op.name.startswith(self._extract_features_scope):
            var_name = variable.op.name
            if not from_detection_checkpoint:
              var_name = (re.split('^' + self._extract_features_scope + '/',
                                   var_name)[-1])
            variables_to_restore[var_name] = variable
        # TODO: Load variables selectively using scopes.
        return variables_to_restore

The removed `restore_fn(self, checkpoint_path, from_detection_checkpoint=True)` filtered the same variables through `variables_helper.get_variables_available_in_checkpoint`, built a `tf.train.Saver`, and returned a `restore(sess)` closure.
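With this change SSD's `postprocess` hands back plain tensors keyed by name. A minimal sketch of consuming the new output (`model` and `inputs` stand in for a built SSD model and its input batch; they are not identifiers from this file):

```python
prediction_dict = model.predict(model.preprocess(inputs))
detections = model.postprocess(prediction_dict)

boxes = detections['detection_boxes']      # [batch, max_detections, 4]
scores = detections['detection_scores']    # [batch, max_detections]
classes = detections['detection_classes']  # [batch, max_detections]
num = detections['num_detections']         # [batch], float32
```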
object_detection/meta_architectures/ssd_meta_arch_test.py

@@ -116,24 +116,46 @@ class SsdMetaArchTest(tf.test.TestCase):
A new preprocessing test is added:

      def test_preprocess_preserves_input_shapes(self):
        image_shapes = [(3, None, None, 3),
                        (None, 10, 10, 3),
                        (None, None, None, 3)]
        for image_shape in image_shapes:
          image_placeholder = tf.placeholder(tf.float32, shape=image_shape)
          preprocessed_inputs = self._model.preprocess(image_placeholder)
          self.assertAllEqual(preprocessed_inputs.shape.as_list(), image_shape)

`test_predict_results_have_correct_keys_and_shapes` previously ran one `tf.random_uniform((3, 2, 2, 3))` input. It now loops over `input_shapes = [(batch_size, image_size, image_size, 3), (None, image_size, image_size, 3), (batch_size, None, None, 3), (None, None, None, 3)]` with `batch_size = 3` and `image_size = 2`; for each shape it builds `self._model.predict` on a placeholder in a fresh graph, checks that `box_encodings`, `class_predictions_with_background` and `feature_maps` are present, runs the graph with `np.random.uniform(size=(batch_size, 2, 2, 3))` fed in, and asserts the expected `box_encodings` shape `(batch_size, self._num_anchors, self._code_size)` and `class_predictions_with_background` shape `(batch_size, self._num_anchors, self._num_classes + 1)`.

@@ -142,10 +164,11 @@ and @@ -163,15 +186,25 @@
`test_postprocess_results_are_correct` gets the same treatment with `batch_size = 2`: the predict/postprocess graph is rebuilt per input shape, the presence of `detection_boxes`, `detection_scores`, `detection_classes` and `num_detections` in the returned dict is asserted before running, and the session run feeds the placeholder with `np.random.uniform(size=(batch_size, 2, 2, 3))`. The expected boxes, scores, classes and `num_detections = [4, 4]` values are unchanged.

@@ -207,20 +240,21 @@ and @@ -246,10 +280,11 @@
`test_restore_fn_detection` becomes `test_restore_map_for_detection_ckpt` and `test_restore_fn_classification` becomes `test_restore_map_for_classification_ckpt`. As in the Faster R-CNN tests, `self._model.restore_map(from_detection_checkpoint=...)` now returns a dict that is checked with `assertIsInstance(var_map, dict)`, passed to `tf.train.Saver(var_map)`, and restored with `saver.restore(sess, saved_model_path)`; the check that no `FeatureExtractor` variables remain uninitialized is unchanged.
object_detection/models/BUILD

@@ -94,7 +94,6 @@ py_library(

        deps = [
            "//tensorflow",
            "//tensorflow_models/object_detection/meta_architectures:faster_rcnn_meta_arch",
    -       "//tensorflow_models/object_detection/utils:variables_helper",
            "//tensorflow_models/slim:inception_resnet_v2",
        ],
    )
object_detection/models/faster_rcnn_inception_resnet_v2_feature_extractor.py

@@ -25,7 +25,6 @@ Huang et al. (https://arxiv.org/abs/1611.10012)

    import tensorflow as tf

    from object_detection.meta_architectures import faster_rcnn_meta_arch
    -from object_detection.utils import variables_helper
    from nets import inception_resnet_v2

    slim = tf.contrib.slim

@@ -168,30 +167,30 @@ class FasterRCNNInceptionResnetV2FeatureExtractor(
As in the base class, `restore_from_classification_checkpoint_fn` now returns a variable map; the surrounding docstring (including the re-wrapped TODO about forcing the `Repeat` namescope created in `_extract_box_classifier_features` to start counting at 2, e.g. `Repeat_2`, so that the default restore_fn can be used, and the note that this overrides the default implementation in faster_rcnn_meta_arch.FasterRCNNFeatureExtractor, which does not work for InceptionResnetV2 checkpoints) is updated accordingly:

      def restore_from_classification_checkpoint_fn(
          self,
    -     checkpoint_path,
          first_stage_feature_extractor_scope,
          second_stage_feature_extractor_scope):
    -   """Returns callable for loading a checkpoint into the tensorflow graph.
    +   """Returns a map of variables to load from a foreign checkpoint.

        Args:
    -     checkpoint_path: Path to checkpoint to restore.
          first_stage_feature_extractor_scope: A scope name for the first stage
            feature extractor.
          second_stage_feature_extractor_scope: A scope name for the second stage
            feature extractor.

        Returns:
    -     a callable which takes a tf.Session as input and loads a checkpoint
    -       when run.
    +     A dict mapping variable names (to load from a checkpoint) to variables
    +       in the model graph.
        """
        variables_to_restore = {}
        for variable in tf.global_variables():
          if variable.op.name.startswith(

@@ -207,10 +206,4 @@

          var_name = var_name.replace(
              second_stage_feature_extractor_scope + '/', '')
          variables_to_restore[var_name] = variable
    -   variables_to_restore = (
    -       variables_helper.get_variables_available_in_checkpoint(
    -           variables_to_restore, checkpoint_path))
    -   saver = tf.train.Saver(variables_to_restore)
    -   def restore(sess):
    -     saver.restore(sess, checkpoint_path)
    -   return restore
    +   return variables_to_restore
object_detection/models/feature_map_generators_test.py

@@ -63,7 +63,7 @@ and @@ -93,7 +93,7 @@ (MultiResolutionFeatureMapGeneratorTest)
In both tests the Python 2-only dict iterator is dropped:

          sess.run(init_op)
          out_feature_maps = sess.run(feature_maps)
          out_feature_map_shapes = dict(
    -         (key, value.shape) for key, value in out_feature_maps.iteritems())
    +         (key, value.shape) for key, value in out_feature_maps.items())
          self.assertDictEqual(out_feature_map_shapes, expected_feature_map_shapes)

The second occurrence, in `test_get_expected_feature_map_shapes_with_inception_v3`, receives the same one-line change.
object_detection/object_detection_tutorial.ipynb
View file @ dff0f0c1
...
@@ -140,9 +140,9 @@ (indentation of the cell source is normalized; the code itself is unchanged)
     "opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)\n",
     "tar_file = tarfile.open(MODEL_FILE)\n",
     "for file in tar_file.getmembers():\n",
     "  file_name = os.path.basename(file.name)\n",
     "  if 'frozen_inference_graph.pb' in file_name:\n",
     "    tar_file.extract(file, os.getcwd())"
   ]
 },
 {
...
@@ -162,11 +162,11 @@ (indentation of the cell source is normalized; the code itself is unchanged)
   "source": [
     "detection_graph = tf.Graph()\n",
     "with detection_graph.as_default():\n",
     "  od_graph_def = tf.GraphDef()\n",
     "  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:\n",
     "    serialized_graph = fid.read()\n",
     "    od_graph_def.ParseFromString(serialized_graph)\n",
     "    tf.import_graph_def(od_graph_def, name='')"
   ]
 },
 {
...
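The second cell above still imports the frozen graph into `detection_graph`. For context, a hedged sketch of how that graph is then queried for detections — the tensor names match the tutorial's exported detection graphs but may differ for other models, and the zero image is only a stand-in for real input:

```python
import numpy as np
import tensorflow as tf

# Sketch only: run one placeholder image through the imported frozen graph
# and fetch the standard detection outputs by tensor name.
image_np_expanded = np.zeros((1, 300, 300, 3), dtype=np.uint8)

with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    scores = detection_graph.get_tensor_by_name('detection_scores:0')
    classes = detection_graph.get_tensor_by_name('detection_classes:0')
    (out_boxes, out_scores, out_classes) = sess.run(
        [boxes, scores, classes],
        feed_dict={image_tensor: image_np_expanded})
```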
object_detection/samples/configs/faster_rcnn_inception_resnet_v2_atrous_pets.config
View file @ dff0f0c1
...
@@ -111,6 +111,11 @@ train_config: {
   gradient_clipping_by_norm: 10.0
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient enough to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
...
@@ -126,6 +131,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
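The same two additions are applied to every pets config that follows: `num_steps: 200000` caps training and `max_evals: 10` caps evaluation. A hedged sketch of how a slim-based training driver can honor such a step cap — the toy loss and field plumbing below are illustrative, not the repository's exact trainer:

```python
import tensorflow as tf

slim = tf.contrib.slim

# Illustrative only: a toy loss stands in for the detection loss; the point is
# how a num_steps cap from the config maps onto slim's training loop.
weights = tf.Variable([1.0, 2.0], name='toy_weights')
total_loss = tf.reduce_sum(tf.square(weights))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
train_op = slim.learning.create_train_op(total_loss, optimizer)

num_steps = 200000  # in practice this would come from train_config.num_steps
number_of_steps = num_steps if num_steps else None  # None trains indefinitely

slim.learning.train(train_op, logdir='/tmp/pets_train',
                    number_of_steps=number_of_steps)
```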
object_detection/samples/configs/faster_rcnn_resnet101_pets.config
View file @ dff0f0c1
...
@@ -109,6 +109,11 @@ train_config: {
   gradient_clipping_by_norm: 10.0
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient enough to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
...
@@ -124,6 +129,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
object_detection/samples/configs/faster_rcnn_resnet152_pets.config
View file @ dff0f0c1
...
@@ -109,6 +109,11 @@ train_config: {
   gradient_clipping_by_norm: 10.0
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient enough to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
...
@@ -124,6 +129,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
object_detection/samples/configs/faster_rcnn_resnet50_pets.config
View file @ dff0f0c1
...
@@ -109,6 +109,11 @@ train_config: {
   gradient_clipping_by_norm: 10.0
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient enough to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
...
@@ -124,6 +129,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
object_detection/samples/configs/rfcn_resnet101_pets.config
View file @ dff0f0c1
...
@@ -106,6 +106,11 @@ train_config: {
   gradient_clipping_by_norm: 10.0
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient enough to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
...
@@ -121,6 +126,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
object_detection/samples/configs/ssd_inception_v2_pets.config
View file @ dff0f0c1
...
@@ -151,6 +151,11 @@ train_config: {
   }
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient enough to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
...
@@ -170,6 +175,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...
object_detection/samples/configs/ssd_mobilenet_v1_pets.config
View file @ dff0f0c1
...
@@ -157,6 +157,11 @@ train_config: {
   }
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   from_detection_checkpoint: true
+  # Note: The below line limits the training process to 200K steps, which we
+  # empirically found to be sufficient enough to train the pets dataset. This
+  # effectively bypasses the learning rate schedule (the learning rate will
+  # never decay). Remove the below line to train indefinitely.
+  num_steps: 200000
   data_augmentation_options {
     random_horizontal_flip {
     }
...
@@ -176,6 +181,9 @@ train_input_reader: {
 eval_config: {
   num_examples: 2000
+  # Note: The below line limits the evaluation process to 10 evaluations.
+  # Remove the below line to evaluate indefinitely.
+  max_evals: 10
 }
 eval_input_reader: {
...