ModelZoo > ResNet50_tensorflow > Commits > 5a2cf36f

Commit 5a2cf36f authored Jul 23, 2020 by Kaushik Shivakumar

Merge remote-tracking branch 'upstream/master' into newavarecords

Parents: 258ddfc3, a829e648

Changes: 330 | Showing 20 changed files with 1917 additions and 90 deletions (+1917, -90)
research/object_detection/g3doc/running_notebook.md (+3, -0)
research/object_detection/g3doc/running_on_mobile_tensorflowlite.md (+2, -0)
research/object_detection/g3doc/running_pets.md (+12, -10)
research/object_detection/g3doc/tf1.md (+94, -0)
research/object_detection/g3doc/tf1_detection_zoo.md (+6, -3)
research/object_detection/g3doc/tf1_training_and_evaluation.md (+237, -0)
research/object_detection/g3doc/tf2.md (+84, -0)
research/object_detection/g3doc/tf2_classification_zoo.md (+25, -0)
research/object_detection/g3doc/tf2_detection_zoo.md (+65, -0)
research/object_detection/g3doc/tf2_training_and_evaluation.md (+285, -0)
research/object_detection/g3doc/tpu_compatibility.md (+3, -3)
research/object_detection/g3doc/tpu_exporters.md (+2, -0)
research/object_detection/g3doc/using_your_own_dataset.md (+1, -1)
research/object_detection/inputs.py (+6, -2)
research/object_detection/inputs_test.py (+43, -2)
research/object_detection/meta_architectures/center_net_meta_arch.py (+460, -57)
research/object_detection/meta_architectures/center_net_meta_arch_tf2_test.py (+216, -6)
research/object_detection/meta_architectures/context_rcnn_lib_tf2.py (+238, -0)
research/object_detection/meta_architectures/context_rcnn_lib_tf2_test.py (+120, -0)
research/object_detection/meta_architectures/context_rcnn_meta_arch.py (+15, -6)
research/object_detection/g3doc/running_notebook.md (View file @ 5a2cf36f)

# Quick Start: Jupyter notebook for off-the-shelf inference

[TensorFlow 2.2](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[TensorFlow 1.15](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)

If you'd like to hit the ground running and run detection on a few example
images right out of the box, we recommend trying out the Jupyter notebook demo.
To run the Jupyter notebook, run the following command from
...
research/object_detection/g3doc/running_on_mobile_tensorflowlite.md (View file @ 5a2cf36f)

# Running on mobile with TensorFlow Lite

[TensorFlow 1.15](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)

In this section, we will show you how to use
[TensorFlow Lite](https://www.tensorflow.org/mobile/tflite/) to get a smaller
model and allow you to take advantage of ops that have been optimized for
mobile devices.
...
research/object_detection/g3doc/running_pets.md (View file @ 5a2cf36f)

# Quick Start: Distributed Training on the Oxford-IIIT Pets Dataset on Google Cloud

[TensorFlow 1.15](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)

This page is a walkthrough for training an object detector using the TensorFlow
Object Detection API. In this tutorial, we'll be training on the Oxford-IIIT Pets
dataset to build a system to detect various breeds of cats and dogs. The output
of the detector will look like the following:

...
@@ -40,10 +42,10 @@ export YOUR_GCS_BUCKET=${YOUR_GCS_BUCKET}

It is also possible to run locally by following
[the running locally instructions](running_locally.md).

## Installing TensorFlow and the TensorFlow Object Detection API

Please run through the [installation instructions](installation.md) to install
TensorFlow and all its dependencies. Ensure the Protobuf libraries are compiled
and the library directories are added to `PYTHONPATH`.

## Getting the Oxford-IIIT Pets Dataset and Uploading it to Google Cloud Storage

...
@@ -77,7 +79,7 @@ should appear as follows:

```
... other files and directories
```

The TensorFlow Object Detection API expects data to be in the TFRecord format,
so we'll now run the `create_pet_tf_record` script to convert from the raw
Oxford-IIIT Pet dataset into TFRecords. Run the following commands from the
`tensorflow/models/research/` directory:

...
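The conversion commands themselves are elided in this excerpt. As a rough,
hedged illustration only (the dataset URLs and script flags below are
assumptions drawn from the pets walkthrough, not values shown in this diff),
the step looks something like:

```bash
# Illustrative sketch only: URLs and flag names are assumptions, not part of
# this diff. Run from tensorflow/models/research/.
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
tar -xvf images.tar.gz
tar -xvf annotations.tar.gz
python object_detection/dataset_tools/create_pet_tf_record.py \
    --label_map_path=object_detection/data/pet_label_map.pbtxt \
    --data_dir=`pwd` \
    --output_dir=`pwd`
```

The resulting TFRecord shards are what get uploaded to the GCS bucket in the
following steps.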
@@ -134,7 +136,7 @@ in the following step.

## Configuring the Object Detection Pipeline

In the TensorFlow Object Detection API, the model parameters, training
parameters and eval parameters are all defined by a config file. More details
can be found [here](configuring_jobs.md). For this tutorial, we will use some
predefined templates provided with the source code. In the
...
@@ -188,10 +190,10 @@ browser](https://console.cloud.google.com/storage/browser).

Before we can start a job on Google Cloud ML Engine, we must:

1.  Package the TensorFlow Object Detection code.
2.  Write a cluster configuration for our Google Cloud ML job.

To package the TensorFlow Object Detection code, run the following commands from
the `tensorflow/models/research/` directory:

```bash
...
```
@@ -248,7 +250,7 @@ web browser. You should see something similar to the following:

Make sure your Tensorboard version is the same minor version as your TensorFlow (1.x)

You will also want to click on the images tab to see example detections made by
the model while it trains. After about an hour and a half of training, you can
...
@@ -265,9 +267,9 @@ the training jobs are configured to go for much longer than is necessary for

convergence. To save money, we recommend killing your jobs once you've seen
that they've converged.

## Exporting the TensorFlow Graph

After your model has been trained, you should export it to a TensorFlow graph
proto. First, you need to identify a candidate checkpoint to export. You can
search your bucket using the
[Google Cloud Storage Browser](https://console.cloud.google.com/storage/browser).
The file should be
...
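The rest of the export walkthrough is elided above. As a hedged illustration
only (the checkpoint number, config file, and output directory are placeholder
assumptions, not values from this diff), exporting a candidate checkpoint
typically looks like:

```bash
# Illustrative sketch only: checkpoint number, config path, and output
# directory are assumed placeholders. Run from tensorflow/models/research/.
gsutil cp "gs://${YOUR_GCS_BUCKET}/train/model.ckpt-${CHECKPOINT_NUMBER}.*" .
python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path object_detection/samples/configs/faster_rcnn_resnet101_pets.config \
    --trained_checkpoint_prefix model.ckpt-${CHECKPOINT_NUMBER} \
    --output_directory exported_graphs
```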
research/object_detection/g3doc/tf1.md (0 → 100644, View file @ 5a2cf36f)

# Object Detection API with TensorFlow 1

## Requirements

[Python 3.6](https://www.python.org/downloads/release/python-360/)
[TensorFlow 1.15](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
[Protobuf compiler](https://grpc.io/docs/protoc-installation/#install-using-a-package-manager)

## Installation

You can install the TensorFlow Object Detection API either with Python Package
Installer (pip) or Docker. For local runs we recommend using Docker and for
Google Cloud runs we recommend using pip.

Clone the TensorFlow Models repository and proceed to one of the installation
options.

```bash
git clone https://github.com/tensorflow/models.git
```

### Docker Installation

```bash
# From the root of the git repository
docker build -f research/object_detection/dockerfiles/tf1/Dockerfile -t od .
docker run -it od
```
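One practical note: the container started above has no access to data or
configs on the host. A bind mount is a common way to expose them; this is a
hedged sketch (the mount target is an arbitrary choice, not something these
instructions prescribe):

```bash
# Hedged sketch: expose the host's current directory inside the container.
# The image tag `od` comes from the build command above; the mount target
# /workspace is an arbitrary choice, not part of the official instructions.
docker run -it -v "$(pwd)":/workspace od
```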
### Python Package Installation

```bash
cd models/research
# Compile protos.
protoc object_detection/protos/*.proto --python_out=.
# Install TensorFlow Object Detection API.
cp object_detection/packages/tf1/setup.py .
python -m pip install .
```

```bash
# Test the installation.
python object_detection/builders/model_builder_tf1_test.py
```

## Quick Start

### Colabs

*   [Jupyter notebook for off-the-shelf inference](../colab_tutorials/object_detection_tutorial.ipynb)
*   [Training a pet detector](running_pets.md)

### Training and Evaluation

To train and evaluate your models either locally or on Google Cloud see
[instructions](tf1_training_and_evaluation.md).

## Model Zoo

We provide a large collection of models that are trained on several datasets in
the [Model Zoo](tf1_detection_zoo.md).

## Guides

*   <a href='configuring_jobs.md'>Configuring an object detection pipeline</a><br>
*   <a href='preparing_inputs.md'>Preparing inputs</a><br>
*   <a href='defining_your_own_model.md'>Defining your own model architecture</a><br>
*   <a href='using_your_own_dataset.md'>Bringing in your own dataset</a><br>
*   <a href='evaluation_protocols.md'>Supported object detection evaluation protocols</a><br>
*   <a href='tpu_compatibility.md'>TPU compatible detection pipelines</a><br>
*   <a href='tf1_training_and_evaluation.md'>Training and evaluation guide (CPU, GPU, or TPU)</a><br>

## Extras:

*   <a href='exporting_models.md'>Exporting a trained model for inference</a><br>
*   <a href='tpu_exporters.md'>Exporting a trained model for TPU inference</a><br>
*   <a href='oid_inference_and_evaluation.md'>Inference and evaluation on the Open Images dataset</a><br>
*   <a href='instance_segmentation.md'>Run an instance segmentation model</a><br>
*   <a href='challenge_evaluation.md'>Run the evaluation for the Open Images Challenge 2018/2019</a><br>
*   <a href='running_on_mobile_tensorflowlite.md'>Running object detection on mobile devices with TensorFlow Lite</a><br>
*   <a href='context_rcnn.md'>Context R-CNN documentation for data preparation, training, and export</a><br>
research/object_detection/g3doc/detection_model_zoo.md → research/object_detection/g3doc/tf1_detection_zoo.md (View file @ 5a2cf36f)

# TensorFlow 1 Detection Model Zoo

[TensorFlow 1.15](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)
[Python 3.6](https://www.python.org/downloads/release/python-360/)

We provide a collection of detection models pre-trained on the
[COCO dataset](http://cocodataset.org), the
...
@@ -64,9 +67,9 @@ Some remarks on frozen inference graphs:

    metrics.

*   Our frozen inference graphs are generated using the
    [v1.12.0](https://github.com/tensorflow/tensorflow/tree/v1.12.0) release
    version of TensorFlow and we do not guarantee that these will work with
    other versions; this being said, each frozen inference graph can be
    regenerated using your current version of TensorFlow by re-running the
    [exporter](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/exporting_models.md),
    pointing it at the model directory as well as the corresponding config file
    in
...
research/object_detection/g3doc/running_on_cloud.md → research/object_detection/g3doc/tf1_training_and_evaluation.md (View file @ 5a2cf36f)

# Training and Evaluation with TensorFlow 1

[Python 3.6](https://www.python.org/downloads/release/python-360/)
[TensorFlow 1.15](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)

This page walks through the steps required to train an object detection model.
It assumes the reader has completed the following prerequisites:

1.  The TensorFlow Object Detection API has been installed as documented in the
    [installation instructions](tf1.md#installation).
2.  A valid data set has been created. See [this page](preparing_inputs.md) for
    instructions on how to generate a dataset for the PASCAL VOC challenge or
    the Oxford-IIIT Pet dataset.

## Recommended Directory Structure for Training and Evaluation

```bash
.
├── data/
│   ├── eval-00000-of-00001.tfrecord
│   ├── label_map.txt
│   ├── train-00000-of-00002.tfrecord
│   └── train-00001-of-00002.tfrecord
└── models/
    └── my_model_dir/
        ├── eval/                        # Created by evaluation job.
        ├── my_model.config
        └── train/                       # Created by training job.
            ├── model_ckpt-100-data@1
            ├── model_ckpt-100-index
            └── checkpoint
```

## Writing a model configuration

Please refer to sample [TF1 configs](../samples/configs) and
[configuring jobs](configuring_jobs.md) to create a model config.

### Model Parameter Initialization

While optional, it is highly recommended that users utilize classification or
object detection checkpoints. Training an object detector from scratch can take
days. To speed up the training process, it is recommended that users re-use the
feature extractor parameters from a pre-existing image classification or object
detection checkpoint. The `train_config` section in the config provides two
fields to specify pre-existing checkpoints:

*   `fine_tune_checkpoint`: a path prefix to the pre-existing checkpoint
    (ie: "/usr/home/username/checkpoint/model.ckpt-#####").
*   `fine_tune_checkpoint_type`: with value `classification` or `detection`
    depending on the type.

A list of detection checkpoints can be found [here](tf1_detection_zoo.md).
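To make the two checkpoint fields above concrete, here is a hedged sketch of
pulling a checkpoint from the TF1 detection zoo and wiring it into a config;
the archive name, its URL, and the extracted layout are assumptions used only
for illustration:

```bash
# Hedged sketch: archive choice, URL, and extracted layout are assumptions.
wget http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2018_01_28.tar.gz
tar -xzf faster_rcnn_resnet101_coco_2018_01_28.tar.gz
ls faster_rcnn_resnet101_coco_2018_01_28/
# The model.ckpt.* files define the prefix to reference in the pipeline config:
#   fine_tune_checkpoint: ".../faster_rcnn_resnet101_coco_2018_01_28/model.ckpt"
#   fine_tune_checkpoint_type: "detection"
```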
## Local

### Training

A local training job can be run with the following command:

```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
NUM_TRAIN_STEPS=50000
SAMPLE_1_OF_N_EVAL_EXAMPLES=1
python object_detection/model_main.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --num_train_steps=${NUM_TRAIN_STEPS} \
    --sample_1_of_n_eval_examples=${SAMPLE_1_OF_N_EVAL_EXAMPLES} \
    --alsologtostderr
```

where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the directory in which training checkpoints and events will be
written. Note that this binary will interleave both training and evaluation.
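If you instead want an evaluation-only pass over checkpoints an existing run
has already produced, the TF1 `model_main.py` binary also accepts
`--checkpoint_dir` and `--run_once` flags for that purpose; they are not shown
in this diff, so treat the following as a hedged sketch:

```bash
# Hedged sketch: evaluation-only pass over existing checkpoints. The
# --checkpoint_dir and --run_once flags are assumed behavior of model_main.py
# and are not part of this diff.
python object_detection/model_main.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --checkpoint_dir=${MODEL_DIR} \
    --run_once=true \
    --alsologtostderr
```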
## Google Cloud AI Platform

The TensorFlow Object Detection API supports training on Google Cloud AI
Platform. This section documents instructions on how to train and evaluate your
model using Cloud AI Platform. The reader should complete the following
prerequisites:

1.  The reader has created and configured a project on Google Cloud AI
    Platform. See the
    [Using GPUs](https://cloud.google.com/ai-platform/training/docs/using-gpus)
    and
    [Using TPUs](https://cloud.google.com/ai-platform/training/docs/using-tpus)
    guides.
2.  The reader has a valid data set and stored it in a Google Cloud Storage
    bucket. See [this page](preparing_inputs.md) for instructions on how to
    generate a dataset for the PASCAL VOC challenge or the Oxford-IIIT Pet
    dataset.

Additionally, it is recommended users test their job by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).

### Training with multiple workers with single GPU

Google Cloud ML requires a YAML configuration file for a multiworker training
job using GPUs. A sample YAML file is given below:

```
trainingInput:
  runtimeVersion: "1.15"
  scaleTier: CUSTOM
  masterType: standard_gpu
  workerCount: 9
  ...
  parameterServerCount: 3
  parameterServerType: standard
```
@@ -52,30 +113,32 @@ trainingInput:

Please keep the following guidelines in mind when writing the YAML
configuration:

*   A job with n workers will have n + 1 training machines (n workers + 1
    master).
*   The number of parameter servers used should be an odd number to prevent a
    parameter server from storing only weight variables or only bias variables
    (due to round robin parameter scheduling).
*   The learning rate in the training config should be decreased when using a
    larger number of workers. Some experimentation is required to find the
    optimal learning rate.

The YAML file should be saved on the local machine (not on GCP). Once it has
been written, a user can start a training job on Cloud ML Engine using the
following command:

```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf1/setup.py .
gcloud ml-engine jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version 1.15 \
    --python-version 3.6 \
    --job-dir=gs://${MODEL_DIR} \
    --package-path ./object_detection \
    --module-name object_detection.model_main \
    --region us-central1 \
    --config ${PATH_TO_LOCAL_YAML_FILE} \
    ...
```
@@ -90,41 +153,42 @@ training checkpoints and events will be written to and

`gs://${PIPELINE_CONFIG_PATH}` points to the pipeline configuration stored on
Google Cloud Storage.

Users can monitor the progress of their training job on the
[ML Engine Dashboard](https://console.cloud.google.com/ai-platform/jobs).

## Training with TPU

Launching a training job with a TPU compatible pipeline config requires using a
similar command:

```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf1/setup.py .
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_%H_%M_%S` \
    --job-dir=gs://${MODEL_DIR} \
    --package-path ./object_detection \
    --module-name object_detection.model_tpu_main \
    --runtime-version 1.15 \
    --python-version 3.6 \
    --scale-tier BASIC_TPU \
    --region us-central1 \
    -- \
    --tpu_zone us-central1 \
    --model_dir=gs://${MODEL_DIR} \
    --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```

In contrast with the GPU training command, there is no need to specify a YAML
file, and we point to the *object_detection.model_tpu_main* binary instead of
*object_detection.model_main*. We must also now set `scale-tier` to be
`BASIC_TPU` and provide a `tpu_zone`. Finally, as before, `pipeline_config_path`
points to the pipeline configuration stored on Google Cloud Storage (but it
must now be a TPU compatible model).

## Evaluation with GPU

Note: You only need to do this when using TPU for training, as it does not
interleave evaluation during training, as in the case of Multiworker GPU
training.

Evaluation jobs run on a single machine, so it is not necessary to write a YAML
...
@@ -132,10 +196,13 @@ configuration for evaluation. Run the following command to start the evaluation

job:

```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf1/setup.py .
gcloud ml-engine jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version 1.15 \
    --python-version 3.6 \
    --job-dir=gs://${MODEL_DIR} \
    --package-path ./object_detection \
    --module-name object_detection.model_main \
    --region us-central1 \
    --scale-tier BASIC_GPU \
    ...
```
@@ -146,25 +213,25 @@ gcloud ml-engine jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%

Where `gs://${MODEL_DIR}` points to the directory on Google Cloud Storage where
training checkpoints are saved (same as the training job), as well as to where
evaluation events will be saved on Google Cloud Storage and
`gs://${PIPELINE_CONFIG_PATH}` points to where the pipeline configuration is
stored on Google Cloud Storage.

Typically one starts an evaluation job concurrently with the training job. Note
that we do not support running evaluation on TPU, so the above command line for
launching evaluation jobs is the same whether you are training on GPU or TPU.

## Running Tensorboard

Progress for training and eval jobs can be inspected using Tensorboard. If using
the recommended directory structure, Tensorboard can be run using the following
command:

```bash
tensorboard --logdir=${MODEL_DIR}
```

where `${MODEL_DIR}` points to the directory that contains the train and eval
directories. Please note it may take Tensorboard a couple minutes to populate
with data.
research/object_detection/g3doc/tf2.md (0 → 100644, View file @ 5a2cf36f)

# Object Detection API with TensorFlow 2

## Requirements

[Python 3.6](https://www.python.org/downloads/release/python-360/)
[TensorFlow 2.2](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[Protobuf compiler](https://grpc.io/docs/protoc-installation/#install-using-a-package-manager)

## Installation

You can install the TensorFlow Object Detection API either with Python Package
Installer (pip) or Docker. For local runs we recommend using Docker and for
Google Cloud runs we recommend using pip.

Clone the TensorFlow Models repository and proceed to one of the installation
options.

```bash
git clone https://github.com/tensorflow/models.git
```

### Docker Installation

```bash
# From the root of the git repository
docker build -f research/object_detection/dockerfiles/tf2/Dockerfile -t od .
docker run -it od
```

### Python Package Installation

```bash
cd models/research
# Compile protos.
protoc object_detection/protos/*.proto --python_out=.
# Install TensorFlow Object Detection API.
cp object_detection/packages/tf2/setup.py .
python -m pip install .
```

```bash
# Test the installation.
python object_detection/builders/model_builder_tf2_test.py
```

## Quick Start

### Colabs

<!-- mdlint off(URL_BAD_G3DOC_PATH) -->

*   Training - [Fine-tune a pre-trained detector in eager mode on custom data](../colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb)
*   Inference - [Run inference with models from the zoo](../colab_tutorials/inference_tf2_colab.ipynb)

<!-- mdlint on -->

## Training and Evaluation

To train and evaluate your models either locally or on Google Cloud see
[instructions](tf2_training_and_evaluation.md).

## Model Zoo

We provide a large collection of models that are trained on COCO 2017 in the
[Model Zoo](tf2_detection_zoo.md).

## Guides

*   <a href='configuring_jobs.md'>Configuring an object detection pipeline</a><br>
*   <a href='preparing_inputs.md'>Preparing inputs</a><br>
*   <a href='defining_your_own_model.md'>Defining your own model architecture</a><br>
*   <a href='using_your_own_dataset.md'>Bringing in your own dataset</a><br>
*   <a href='evaluation_protocols.md'>Supported object detection evaluation protocols</a><br>
*   <a href='tpu_compatibility.md'>TPU compatible detection pipelines</a><br>
*   <a href='tf2_training_and_evaluation.md'>Training and evaluation guide (CPU, GPU, or TPU)</a><br>
research/object_detection/g3doc/tf2_classification_zoo.md (0 → 100644, View file @ 5a2cf36f)

# TensorFlow 2 Classification Model Zoo

[TensorFlow 2.2](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[Python 3.6](https://www.python.org/downloads/release/python-360/)

We provide a collection of classification models pre-trained on the
[Imagenet](http://www.image-net.org). These can be used to initialize detection
model parameters.

Model name |
---------- |
[EfficientNet B0](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b0.tar.gz) |
[EfficientNet B1](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b1.tar.gz) |
[EfficientNet B2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b2.tar.gz) |
[EfficientNet B3](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b3.tar.gz) |
[EfficientNet B4](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b4.tar.gz) |
[EfficientNet B5](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b5.tar.gz) |
[EfficientNet B6](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b6.tar.gz) |
[EfficientNet B7](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/efficientnet_b7.tar.gz) |
[Resnet V1 50](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet50_v1.tar.gz) |
[Resnet V1 101](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet101_v1.tar.gz) |
[Resnet V1 152](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet152_v1.tar.gz) |
[Inception Resnet V2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/inception_resnet_v2.tar.gz) |
[MobileNet V1](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/mobilnet_v1.tar.gz) |
[MobileNet V2](http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/mobilnet_v2.tar.gz) |
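As a hedged illustration of how these archives are typically consumed (the
ResNet-50 choice and the extracted layout are assumptions, not something this
table specifies), you might fetch one and reference it from the fine-tuning
fields of a pipeline config:

```bash
# Hedged sketch: archive choice and extracted layout are assumptions.
wget http://download.tensorflow.org/models/object_detection/classification/tf2/20200710/resnet50_v1.tar.gz
tar -xzf resnet50_v1.tar.gz
ls resnet50_v1/
# Point fine_tune_checkpoint in the pipeline config at the checkpoint prefix
# inside the extracted directory, with fine_tune_checkpoint_type: "classification".
```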
research/object_detection/g3doc/tf2_detection_zoo.md (0 → 100644, View file @ 5a2cf36f)

# TensorFlow 2 Detection Model Zoo

[TensorFlow 2.2](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[Python 3.6](https://www.python.org/downloads/release/python-360/)

<!-- mdlint off(URL_BAD_G3DOC_PATH) -->

We provide a collection of detection models pre-trained on the
[COCO 2017 dataset](http://cocodataset.org). These models can be useful for
out-of-the-box inference if you are interested in categories already in those
datasets. You can try it in our inference
[colab](../colab_tutorials/inference_tf2_colab.ipynb).

They are also useful for initializing your models when training on novel
datasets. You can try this out on our few-shot training
[colab](../colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb).

<!-- mdlint on -->

Finally, if you would like to train these models from scratch, you can find the
model configs in this [directory](../configs/tf2) (also in the linked
`tar.gz`s).

Model name | Speed (ms) | COCO mAP | Outputs
---------- | :--------: | :------: | :-----:
[CenterNet HourGlass104 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_512x512_coco17_tpu-8.tar.gz) | 70 | 41.6 | Boxes
[CenterNet HourGlass104 Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_512x512_kpts_coco17_tpu-32.tar.gz) | 76 | 40.0/61.4 | Boxes/Keypoints
[CenterNet HourGlass104 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_1024x1024_coco17_tpu-32.tar.gz) | 197 | 43.5 | Boxes
[CenterNet HourGlass104 Keypoints 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_hg104_1024x1024_kpts_coco17_tpu-32.tar.gz) | 211 | 42.8/64.5 | Boxes/Keypoints
[CenterNet Resnet50 V1 FPN 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v1_fpn_512x512_coco17_tpu-8.tar.gz) | 27 | 31.2 | Boxes
[CenterNet Resnet50 V1 FPN Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v1_fpn_512x512_kpts_coco17_tpu-8.tar.gz) | 30 | 29.3/50.7 | Boxes/Keypoints
[CenterNet Resnet101 V1 FPN 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet101_v1_fpn_512x512_coco17_tpu-8.tar.gz) | 34 | 34.2 | Boxes
[CenterNet Resnet50 V2 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v2_512x512_coco17_tpu-8.tar.gz) | 27 | 29.5 | Boxes
[CenterNet Resnet50 V2 Keypoints 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/centernet_resnet50_v2_512x512_kpts_coco17_tpu-8.tar.gz) | 30 | 27.6/48.2 | Boxes/Keypoints
[EfficientDet D0 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d0_coco17_tpu-32.tar.gz) | 39 | 33.6 | Boxes
[EfficientDet D1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d1_coco17_tpu-32.tar.gz) | 54 | 38.4 | Boxes
[EfficientDet D2 768x768](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d2_coco17_tpu-32.tar.gz) | 67 | 41.8 | Boxes
[EfficientDet D3 896x896](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d3_coco17_tpu-32.tar.gz) | 95 | 45.4 | Boxes
[EfficientDet D4 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d4_coco17_tpu-32.tar.gz) | 133 | 48.5 | Boxes
[EfficientDet D5 1280x1280](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d5_coco17_tpu-32.tar.gz) | 222 | 49.7 | Boxes
[EfficientDet D6 1280x1280](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d6_coco17_tpu-32.tar.gz) | 268 | 50.5 | Boxes
[EfficientDet D7 1536x1536](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d7_coco17_tpu-32.tar.gz) | 325 | 51.2 | Boxes
[SSD MobileNet v2 320x320](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz) | 19 | 20.2 | Boxes
[SSD MobileNet V1 FPN 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 48 | 29.1 | Boxes
[SSD MobileNet V2 FPNLite 320x320](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz) | 22 | 22.2 | Boxes
[SSD MobileNet V2 FPNLite 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8.tar.gz) | 39 | 28.2 | Boxes
[SSD ResNet50 V1 FPN 640x640 (RetinaNet50)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 46 | 34.3 | Boxes
[SSD ResNet50 V1 FPN 1024x1024 (RetinaNet50)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 87 | 38.3 | Boxes
[SSD ResNet101 V1 FPN 640x640 (RetinaNet101)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet101_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 57 | 35.6 | Boxes
[SSD ResNet101 V1 FPN 1024x1024 (RetinaNet101)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet101_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 104 | 39.5 | Boxes
[SSD ResNet152 V1 FPN 640x640 (RetinaNet152)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet152_v1_fpn_640x640_coco17_tpu-8.tar.gz) | 80 | 35.4 | Boxes
[SSD ResNet152 V1 FPN 1024x1024 (RetinaNet152)](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet152_v1_fpn_1024x1024_coco17_tpu-8.tar.gz) | 111 | 39.6 | Boxes
[Faster R-CNN ResNet50 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.tar.gz) | 53 | 29.3 | Boxes
[Faster R-CNN ResNet50 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_1024x1024_coco17_tpu-8.tar.gz) | 65 | 31.0 | Boxes
[Faster R-CNN ResNet50 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet50_v1_800x1333_coco17_gpu-8.tar.gz) | 65 | 31.6 | Boxes
[Faster R-CNN ResNet101 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_640x640_coco17_tpu-8.tar.gz) | 55 | 31.8 | Boxes
[Faster R-CNN ResNet101 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_1024x1024_coco17_tpu-8.tar.gz) | 72 | 37.1 | Boxes
[Faster R-CNN ResNet101 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet101_v1_800x1333_coco17_gpu-8.tar.gz) | 77 | 36.6 | Boxes
[Faster R-CNN ResNet152 V1 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_640x640_coco17_tpu-8.tar.gz) | 64 | 32.4 | Boxes
[Faster R-CNN ResNet152 V1 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_1024x1024_coco17_tpu-8.tar.gz) | 85 | 37.6 | Boxes
[Faster R-CNN ResNet152 V1 800x1333](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_resnet152_v1_800x1333_coco17_gpu-8.tar.gz) | 101 | 37.4 | Boxes
[Faster R-CNN Inception ResNet V2 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8.tar.gz) | 206 | 37.7 | Boxes
[Faster R-CNN Inception ResNet V2 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_1024x1024_coco17_tpu-8.tar.gz) | 236 | 38.7 | Boxes
[Mask R-CNN Inception ResNet V2 1024x1024](http://download.tensorflow.org/models/object_detection/tf2/20200711/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz) | 301 | 39.0/34.6 | Boxes/Masks
[ExtremeNet](http://download.tensorflow.org/models/object_detection/tf2/20200711/extremenet.tar.gz) | -- | -- | Boxes
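To make the table actionable, here is a hedged sketch of fetching one of the
archives listed above; the EfficientDet D0 choice is arbitrary, and the exact
extracted layout is an assumption (the text above only guarantees that the
model config is bundled):

```bash
# Hedged sketch: archive choice is arbitrary; extracted layout is assumed.
wget http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d0_coco17_tpu-32.tar.gz
tar -xzf efficientdet_d0_coco17_tpu-32.tar.gz
ls efficientdet_d0_coco17_tpu-32/
# Expect the bundled pipeline config plus checkpoint artifacts, which the
# inference and few-shot training colabs linked above consume.
```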
research/object_detection/g3doc/tf2_training_and_evaluation.md (0 → 100644, View file @ 5a2cf36f)

# Training and Evaluation with TensorFlow 2

[TensorFlow 2.2](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[Python 3.6](https://www.python.org/downloads/release/python-360/)

This page walks through the steps required to train an object detection model.
It assumes the reader has completed the following prerequisites:

1.  The TensorFlow Object Detection API has been installed as documented in the
    [installation instructions](tf2.md#installation).
2.  A valid data set has been created. See [this page](preparing_inputs.md) for
    instructions on how to generate a dataset for the PASCAL VOC challenge or
    the Oxford-IIIT Pet dataset.

## Recommended Directory Structure for Training and Evaluation

```bash
.
├── data/
│   ├── eval-00000-of-00001.tfrecord
│   ├── label_map.txt
│   ├── train-00000-of-00002.tfrecord
│   └── train-00001-of-00002.tfrecord
└── models/
    └── my_model_dir/
        ├── eval/                        # Created by evaluation job.
        ├── my_model.config
        ├── model_ckpt-100-data@1        # Created by training job.
        ├── model_ckpt-100-index
        └── checkpoint
```

## Writing a model configuration

Please refer to sample [TF2 configs](../configs/tf2) and
[configuring jobs](configuring_jobs.md) to create a model config.

### Model Parameter Initialization

While optional, it is highly recommended that users utilize classification or
object detection checkpoints. Training an object detector from scratch can take
days. To speed up the training process, it is recommended that users re-use the
feature extractor parameters from a pre-existing image classification or object
detection checkpoint. The `train_config` section in the config provides two
fields to specify pre-existing checkpoints:

*   `fine_tune_checkpoint`: a path prefix to the pre-existing checkpoint
    (ie: "/usr/home/username/checkpoint/model.ckpt-#####").
*   `fine_tune_checkpoint_type`: with value `classification` or `detection`
    depending on the type.

A list of classification checkpoints can be found [here](tf2_classification_zoo.md).

A list of detection checkpoints can be found [here](tf2_detection_zoo.md).

## Local

### Training

A local training job can be run with the following command:

```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --alsologtostderr
```

where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the directory in which training checkpoints and events will be
written.

### Evaluation

A local evaluation job can be run with the following command:

```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
CHECKPOINT_DIR=${MODEL_DIR}
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --checkpoint_dir=${CHECKPOINT_DIR} \
    --alsologtostderr
```

where `${CHECKPOINT_DIR}` points to the directory with checkpoints produced by
the training job. Evaluation events are written to `${MODEL_DIR}/eval`.

## Google Cloud VM

The TensorFlow Object Detection API supports training on Google Cloud with Deep
Learning GPU VMs and TPU VMs. This section documents instructions on how to
train and evaluate your model on them. The reader should complete the following
prerequisites:

1.  The reader has created and configured a GPU VM or TPU VM on Google Cloud
    with TensorFlow >= 2.2.0. See
    [TPU quickstart](https://cloud.google.com/tpu/docs/quickstart) and
    [GPU quickstart](https://cloud.google.com/ai-platform/deep-learning-vm/docs/tensorflow_start_instance#with-one-or-more-gpus).
2.  The reader has installed the TensorFlow Object Detection API as documented
    in the [installation instructions](tf2.md#installation) on the VM.
3.  The reader has a valid data set and stored it in a Google Cloud Storage
    bucket or locally on the VM. See [this page](preparing_inputs.md) for
    instructions on how to generate a dataset for the PASCAL VOC challenge or
    the Oxford-IIIT Pet dataset.

Additionally, it is recommended users test their job by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).

### Training

Training on GPU or TPU VMs is similar to local training. It can be launched
using the following command.

```bash
# From the tensorflow/models/research/ directory
# Note: --use_tpu and --tpu_name are only required for TPU training.
USE_TPU=true
TPU_NAME="MY_TPU_NAME"
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --use_tpu=${USE_TPU} \
    --tpu_name=${TPU_NAME} \
    --alsologtostderr
```

where `${PIPELINE_CONFIG_PATH}` points to the pipeline config and `${MODEL_DIR}`
points to the root directory for the files produced. Training checkpoints and
events are written to `${MODEL_DIR}`. Note that the paths can be either local or
a path to GCS bucket.

### Evaluation

Evaluation is only supported on GPU. Similar to local evaluation it can be
launched using the following command:

```bash
# From the tensorflow/models/research/ directory
PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
CHECKPOINT_DIR=${MODEL_DIR}
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --checkpoint_dir=${CHECKPOINT_DIR} \
    --alsologtostderr
```

where `${CHECKPOINT_DIR}` points to the directory with checkpoints produced by
the training job. Evaluation events are written to `${MODEL_DIR}/eval`. Note
that the paths can be either local or a path to GCS bucket.

## Google Cloud AI Platform

The TensorFlow Object Detection API also supports training on Google Cloud AI
Platform. This section documents instructions on how to train and evaluate your
model using Cloud ML. The reader should complete the following prerequisites:

1.  The reader has created and configured a project on Google Cloud AI
    Platform. See the
    [Using GPUs](https://cloud.google.com/ai-platform/training/docs/using-gpus)
    and
    [Using TPUs](https://cloud.google.com/ai-platform/training/docs/using-tpus)
    guides.
2.  The reader has a valid data set and stored it in a Google Cloud Storage
    bucket. See [this page](preparing_inputs.md) for instructions on how to
    generate a dataset for the PASCAL VOC challenge or the Oxford-IIIT Pet
    dataset.

Additionally, it is recommended users test their job by running training and
evaluation jobs for a few iterations [locally on their own machines](#local).

### Training with multiple GPUs

A user can start a training job on Cloud AI Platform using the following
command:

```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training object_detection_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version 2.1 \
    --python-version 3.6 \
    --job-dir=gs://${MODEL_DIR} \
    --package-path ./object_detection \
    --module-name object_detection.model_main_tf2 \
    --region us-central1 \
    --master-machine-type n1-highcpu-16 \
    --master-accelerator count=8,type=nvidia-tesla-v100 \
    -- \
    --model_dir=gs://${MODEL_DIR} \
    --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```

Where `gs://${MODEL_DIR}` specifies the directory on Google Cloud Storage where
the training checkpoints and events will be written to and
`gs://${PIPELINE_CONFIG_PATH}` points to the pipeline configuration stored on
Google Cloud Storage.

Users can monitor the progress of their training job on the
[ML Engine Dashboard](https://console.cloud.google.com/ai-platform/jobs).

### Training with TPU

Launching a training job with a TPU compatible pipeline config requires using a
similar command:

```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training `whoami`_object_detection_`date +%m_%d_%Y_%H_%M_%S` \
    --job-dir=gs://${MODEL_DIR} \
    --package-path ./object_detection \
    --module-name object_detection.model_main_tf2 \
    --runtime-version 2.1 \
    --python-version 3.6 \
    --scale-tier BASIC_TPU \
    --region us-central1 \
    -- \
    --use_tpu true \
    --model_dir=gs://${MODEL_DIR} \
    --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
```

As before, `pipeline_config_path` points to the pipeline configuration stored on
Google Cloud Storage (but it must now be a TPU compatible model).

### Evaluating with GPU

Evaluation jobs run on a single machine. Run the following command to start the
evaluation job:

```bash
# From the tensorflow/models/research/ directory
cp object_detection/packages/tf2/setup.py .
gcloud ai-platform jobs submit training object_detection_eval_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version 2.1 \
    --python-version 3.6 \
    --job-dir=gs://${MODEL_DIR} \
    --package-path ./object_detection \
    --module-name object_detection.model_main_tf2 \
    --region us-central1 \
    --scale-tier BASIC_GPU \
    -- \
    --model_dir=gs://${MODEL_DIR} \
    --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH} \
    --checkpoint_dir=gs://${MODEL_DIR}
```

where `gs://${MODEL_DIR}` points to the directory on Google Cloud Storage where
training checkpoints are saved and `gs://${PIPELINE_CONFIG_PATH}` points to
where the model configuration file is stored on Google Cloud Storage. Evaluation
events are written to `gs://${MODEL_DIR}/eval`.

Typically one starts an evaluation job concurrently with the training job. Note
that we do not support running evaluation on TPU.

## Running Tensorboard

Progress for training and eval jobs can be inspected using Tensorboard. If using
the recommended directory structure, Tensorboard can be run using the following
command:

```bash
tensorboard --logdir=${MODEL_DIR}
```

where `${MODEL_DIR}` points to the directory that contains the train and eval
directories. Please note it may take Tensorboard a couple minutes to populate
with data.
research/object_detection/g3doc/tpu_compatibility.md (View file @ 5a2cf36f)

...
@@ -2,7 +2,7 @@

[TOC]

The TensorFlow Object Detection API supports TPU training for some models. To
make models TPU compatible you need to make a few tweaks to the model config as
mentioned below. We also provide several sample configs that you can use as a
template.
...
@@ -11,7 +11,7 @@ template.

### Static shaped tensors

TPU training currently requires all tensors in the TensorFlow Graph to have
static shapes. However, most of the sample configs in Object Detection API have
a few different tensors that are dynamically shaped. Fortunately, we provide
simple alternatives in the model configuration that modify these tensors to
...
@@ -62,7 +62,7 @@ have static shape:

### TPU friendly ops

Although TPU supports a vast number of tensorflow ops, a few used in the
TensorFlow Object Detection API are unsupported. We list such ops below and
recommend compatible substitutes.

*   **Anchor sampling** - Typically we use hard example mining in standard SSD
...
research/object_detection/g3doc/tpu_exporters.md (View file @ 5a2cf36f)

# Object Detection TPU Inference Exporter

[TensorFlow 1.15](https://github.com/tensorflow/tensorflow/releases/tag/v1.15.0)

This package contains a SavedModel Exporter for TPU Inference of object
detection models.
...
research/object_detection/g3doc/using_your_own_dataset.md (View file @ 5a2cf36f)

...
@@ -2,7 +2,7 @@

[TOC]

To use your own dataset in the TensorFlow Object Detection API, you must convert
it into the
[TFRecord file format](https://www.tensorflow.org/api_guides/python/python_io#tfrecords_format_details).
This document outlines how to write a script to generate the TFRecord file.
...
research/object_detection/inputs.py  View file @ 5a2cf36f

@@ -1094,8 +1094,12 @@ def get_reduce_to_frame_fn(input_reader_config, is_training):
       num_frames = tf.cast(
           tf.shape(tensor_dict[fields.InputDataFields.source_id])[0],
           dtype=tf.int32)
-      frame_index = tf.random.uniform((), minval=0, maxval=num_frames,
-                                      dtype=tf.int32)
+      if input_reader_config.frame_index == -1:
+        frame_index = tf.random.uniform((), minval=0, maxval=num_frames,
+                                        dtype=tf.int32)
+      else:
+        frame_index = tf.constant(input_reader_config.frame_index,
+                                  dtype=tf.int32)
       out_tensor_dict = {}
       for key in tensor_dict:
         if key in fields.SEQUENCE_FIELDS:
...
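The hunk above makes the frame used when reducing sequence examples configurable: a `frame_index` of -1 keeps the previous behaviour of sampling a random frame, while any non-negative value pins the reduction to that frame. The standalone sketch below mirrors that selection rule outside of the input pipeline; the values passed in are illustrative only.

```python
# Minimal sketch of the frame-selection rule introduced above (illustrative
# values; not the actual input pipeline).
import tensorflow as tf


def select_frame_index(num_frames, frame_index=-1):
  """Returns a scalar int32 frame index.

  frame_index == -1 means "sample a random frame"; any other value is used
  as-is, matching the behaviour added to get_reduce_to_frame_fn.
  """
  num_frames = tf.cast(num_frames, tf.int32)
  if frame_index == -1:
    return tf.random.uniform((), minval=0, maxval=num_frames, dtype=tf.int32)
  return tf.constant(frame_index, dtype=tf.int32)


# Example usage: random frame vs. a fixed frame.
random_choice = select_frame_index(num_frames=8)                # in [0, 8)
fixed_choice = select_frame_index(num_frames=8, frame_index=2)  # always 2
```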
research/object_detection/inputs_test.py  View file @ 5a2cf36f

@@ -61,7 +61,7 @@ def _get_configs_for_model(model_name):
       configs, kwargs_dict=override_dict)


-def _get_configs_for_model_sequence_example(model_name):
+def _get_configs_for_model_sequence_example(model_name, frame_index=-1):
   """Returns configurations for model."""
   fname = os.path.join(tf.resource_loader.get_data_files_path(),
                        'test_data/' + model_name + '.config')
@@ -74,7 +74,8 @@ def _get_configs_for_model_sequence_example(model_name):
   override_dict = {
       'train_input_path': data_path,
       'eval_input_path': data_path,
-      'label_map_path': label_map_path
+      'label_map_path': label_map_path,
+      'frame_index': frame_index
   }
   return config_util.merge_external_params_with_configs(
       configs, kwargs_dict=override_dict)
@@ -312,6 +313,46 @@ class InputFnTest(test_case.TestCase, parameterized.TestCase):
                      tf.float32,
                      labels[fields.InputDataFields.groundtruth_weights].dtype)

+  def test_context_rcnn_resnet50_train_input_with_sequence_example_frame_index(
+      self, train_batch_size=8):
+    """Tests the training input function for FasterRcnnResnet50."""
+    configs = _get_configs_for_model_sequence_example(
+        'context_rcnn_camera_trap', frame_index=2)
+    model_config = configs['model']
+    train_config = configs['train_config']
+    train_config.batch_size = train_batch_size
+    train_input_fn = inputs.create_train_input_fn(
+        train_config, configs['train_input_config'], model_config)
+    features, labels = _make_initializable_iterator(
+        train_input_fn()).get_next()
+
+    self.assertAllEqual([train_batch_size, 640, 640, 3],
+                        features[fields.InputDataFields.image].shape.as_list())
+    self.assertEqual(tf.float32,
+                     features[fields.InputDataFields.image].dtype)
+    self.assertAllEqual([train_batch_size],
+                        features[inputs.HASH_KEY].shape.as_list())
+    self.assertEqual(tf.int32, features[inputs.HASH_KEY].dtype)
+    self.assertAllEqual(
+        [train_batch_size, 100, 4],
+        labels[fields.InputDataFields.groundtruth_boxes].shape.as_list())
+    self.assertEqual(tf.float32,
+                     labels[fields.InputDataFields.groundtruth_boxes].dtype)
+    self.assertAllEqual(
+        [train_batch_size, 100, model_config.faster_rcnn.num_classes],
+        labels[fields.InputDataFields.groundtruth_classes].shape.as_list())
+    self.assertEqual(tf.float32,
+                     labels[fields.InputDataFields.groundtruth_classes].dtype)
+    self.assertAllEqual(
+        [train_batch_size, 100],
+        labels[fields.InputDataFields.groundtruth_weights].shape.as_list())
+    self.assertEqual(tf.float32,
+                     labels[fields.InputDataFields.groundtruth_weights].dtype)
+    self.assertAllEqual(
+        [train_batch_size, 100, model_config.faster_rcnn.num_classes],
+        labels[fields.InputDataFields.groundtruth_confidences].shape.as_list())
+    self.assertEqual(
+        tf.float32,
+        labels[fields.InputDataFields.groundtruth_confidences].dtype)
+
   def test_ssd_inceptionV2_train_input(self):
     """Tests the training input function for SSDInceptionV2."""
     configs = _get_configs_for_model('ssd_inception_v2_pets')
...
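For reference, the new `frame_index` setting in the test above is threaded through the standard config-override mechanism. A minimal sketch of that pattern might look like the following; the file paths are hypothetical placeholders, while the two `config_util` calls are the same ones the test relies on.

```python
# Sketch: overriding frame_index through the config merge utility used in the
# test above. The paths below are hypothetical placeholders.
from object_detection.utils import config_util

configs = config_util.get_configs_from_pipeline_file(
    'path/to/context_rcnn_camera_trap.config')
override_dict = {
    'train_input_path': 'path/to/train.tfrecord',
    'eval_input_path': 'path/to/eval.tfrecord',
    'label_map_path': 'path/to/label_map.pbtxt',
    'frame_index': 2,  # always use the third frame of every sequence example
}
configs = config_util.merge_external_params_with_configs(
    configs, kwargs_dict=override_dict)
```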
research/object_detection/meta_architectures/center_net_meta_arch.py  View file @ 5a2cf36f

@@ -924,13 +924,16 @@ def convert_strided_predictions_to_normalized_keypoints(
 def convert_strided_predictions_to_instance_masks(
-    boxes, classes, masks, stride, mask_height, mask_width,
-    true_image_shapes,
-    score_threshold=0.5):
+    boxes, classes, masks, true_image_shapes,
+    densepose_part_heatmap=None, densepose_surface_coords=None, stride=4,
+    mask_height=256, mask_width=256, score_threshold=0.5,
+    densepose_class_index=-1):
   """Converts predicted full-image masks into instance masks.

   For each predicted detection box:
-  * Crop and resize the predicted mask based on the detected bounding box
-    coordinates and class prediction. Uses bilinear resampling.
+  * Crop and resize the predicted mask (and optionally DensePose coordinates)
+    based on the detected bounding box coordinates and class prediction. Uses
+    bilinear resampling.
   * Binarize the mask using the provided score threshold.

   Args:
@@ -940,57 +943,212 @@ def convert_strided_predictions_to_instance_masks(
       detected class for each box (0-indexed).
     masks: A [batch, output_height, output_width, num_classes] float32
       tensor with class probabilities.
+    true_image_shapes: A tensor of shape [batch, 3] representing the true
+      shape of the inputs not considering padding.
+    densepose_part_heatmap: (Optional) A [batch, output_height, output_width,
+      num_parts] float32 tensor with part scores (i.e. logits).
+    densepose_surface_coords: (Optional) A [batch, output_height, output_width,
+      2 * num_parts] float32 tensor with predicted part coordinates (in
+      vu-format).
     stride: The stride in the output space.
     mask_height: The desired resized height for instance masks.
     mask_width: The desired resized width for instance masks.
-    true_image_shapes: A tensor of shape [batch, 3] representing the true
-      shape of the inputs not considering padding.
     score_threshold: The threshold at which to convert predicted mask
       into foreground pixels.
+    densepose_class_index: The class index (0-indexed) corresponding to the
+      class which has DensePose labels (e.g. person class).

   Returns:
-    A [batch_size, max_detections, mask_height, mask_width] uint8 tensor with
-      predicted foreground mask for each instance. The masks take values in
-      {0, 1}.
+    A tuple of masks and surface_coords.
+    instance_masks: A [batch_size, max_detections, mask_height, mask_width]
+      uint8 tensor with predicted foreground mask for each
+      instance. If DensePose tensors are provided, then each pixel value in the
+      mask encodes the 1-indexed part.
+    surface_coords: A [batch_size, max_detections, mask_height, mask_width, 2]
+      float32 tensor with (v, u) coordinates. Note that v, u coordinates are
+      only defined on instance masks, and the coordinates at each location of
+      the foreground mask correspond to coordinates on a local part coordinate
+      system (the specific part can be inferred from the `instance_masks`
+      output. If DensePose feature maps are not passed to this function, this
+      output will be None.
+
+  Raises:
+    ValueError: If one but not both of `densepose_part_heatmap` and
+      `densepose_surface_coords` is provided.
   """
-  _, output_height, output_width, _ = (
+  batch_size, output_height, output_width, _ = (
       shape_utils.combined_static_and_dynamic_shape(masks))
   input_height = stride * output_height
   input_width = stride * output_width

+  true_heights, true_widths, _ = tf.unstack(true_image_shapes, axis=1)
+  # If necessary, create dummy DensePose tensors to simplify the map function.
+  densepose_present = True
+  if ((densepose_part_heatmap is not None) ^
+      (densepose_surface_coords is not None)):
+    raise ValueError('To use DensePose, both `densepose_part_heatmap` and '
+                     '`densepose_surface_coords` must be provided')
+  if densepose_part_heatmap is None and densepose_surface_coords is None:
+    densepose_present = False
+    densepose_part_heatmap = tf.zeros(
+        (batch_size, output_height, output_width, 1), dtype=tf.float32)
+    densepose_surface_coords = tf.zeros(
+        (batch_size, output_height, output_width, 2), dtype=tf.float32)
+  crop_and_threshold_fn = functools.partial(
+      crop_and_threshold_masks, input_height=input_height,
+      input_width=input_width, mask_height=mask_height, mask_width=mask_width,
+      score_threshold=score_threshold,
+      densepose_class_index=densepose_class_index)
+
+  instance_masks, surface_coords = shape_utils.static_or_dynamic_map_fn(
+      crop_and_threshold_fn,
+      elems=[boxes, classes, masks, densepose_part_heatmap,
+             densepose_surface_coords, true_heights, true_widths],
+      dtype=[tf.uint8, tf.float32],
+      back_prop=False)
+  surface_coords = surface_coords if densepose_present else None
+  return instance_masks, surface_coords
+
+
+def crop_and_threshold_masks(elems, input_height, input_width, mask_height=256,
+                             mask_width=256, score_threshold=0.5,
+                             densepose_class_index=-1):
+  """Crops and thresholds masks based on detection boxes.
+
+  Args:
+    elems: A tuple of
+      boxes - float32 tensor of shape [max_detections, 4]
+      classes - int32 tensor of shape [max_detections] (0-indexed)
+      masks - float32 tensor of shape [output_height, output_width, num_classes]
+      part_heatmap - float32 tensor of shape [output_height, output_width,
+        num_parts]
+      surf_coords - float32 tensor of shape [output_height, output_width,
+        2 * num_parts]
+      true_height - scalar int tensor
+      true_width - scalar int tensor
+    input_height: Input height to network.
+    input_width: Input width to network.
+    mask_height: Height for resizing mask crops.
+    mask_width: Width for resizing mask crops.
+    score_threshold: The threshold at which to convert predicted mask
+      into foreground pixels.
+    densepose_class_index: scalar int tensor with the class index (0-indexed)
+      for DensePose.
+
+  Returns:
+    A tuple of
+    all_instances: A [max_detections, mask_height, mask_width] uint8 tensor
+      with a predicted foreground mask for each instance. Background is encoded
+      as 0, and foreground is encoded as a positive integer. Specific part
+      indices are encoded as 1-indexed parts (for classes that have part
+      information).
+    surface_coords: A [max_detections, mask_height, mask_width, 2]
+      float32 tensor with (v, u) coordinates. for each part.
+  """
+  (boxes, classes, masks, part_heatmap, surf_coords, true_height,
+   true_width) = elems
   # Boxes are in normalized coordinates relative to true image shapes. Convert
   # coordinates to be normalized relative to input image shapes (since masks
   # may still have padding).
-
-  def crop_and_threshold_masks(args):
-    """Crops masks based on detection boxes."""
-    boxes, classes, masks, true_height, true_width = args
-    boxlist = box_list.BoxList(boxes)
-    y_scale = true_height / input_height
-    x_scale = true_width / input_width
-    boxlist = box_list_ops.scale(boxlist, y_scale, x_scale)
-    boxes = boxlist.get()
-    # Convert masks from [input_height, input_width, num_classes] to
-    # [num_classes, input_height, input_width, 1].
-    masks_4d = tf.transpose(masks, perm=[2, 0, 1])[:, :, :, tf.newaxis]
-    cropped_masks = tf2.image.crop_and_resize(
-        masks_4d,
-        boxes=boxes,
-        box_indices=classes,
-        crop_size=[mask_height, mask_width],
-        method='bilinear')
-    masks_3d = tf.squeeze(cropped_masks, axis=3)
-    masks_binarized = tf.math.greater_equal(masks_3d, score_threshold)
-    return tf.cast(masks_binarized, tf.uint8)
-
-  true_heights, true_widths, _ = tf.unstack(true_image_shapes, axis=1)
-  masks_for_image = shape_utils.static_or_dynamic_map_fn(
-      crop_and_threshold_masks,
-      elems=[boxes, classes, masks, true_heights, true_widths],
-      dtype=tf.uint8,
-      back_prop=False)
-  masks = tf.stack(masks_for_image, axis=0)
-  return masks
+  # Then crop and resize each mask.
+  boxlist = box_list.BoxList(boxes)
+  y_scale = true_height / input_height
+  x_scale = true_width / input_width
+  boxlist = box_list_ops.scale(boxlist, y_scale, x_scale)
+  boxes = boxlist.get()
+  # Convert masks from [output_height, output_width, num_classes] to
+  # [num_classes, output_height, output_width, 1].
+  num_classes = tf.shape(masks)[-1]
+  masks_4d = tf.transpose(masks, perm=[2, 0, 1])[:, :, :, tf.newaxis]
+  # Tile part and surface coordinate masks for all classes.
+  part_heatmap_4d = tf.tile(part_heatmap[tf.newaxis, :, :, :],
+                            multiples=[num_classes, 1, 1, 1])
+  surf_coords_4d = tf.tile(surf_coords[tf.newaxis, :, :, :],
+                           multiples=[num_classes, 1, 1, 1])
+  feature_maps_concat = tf.concat([masks_4d, part_heatmap_4d, surf_coords_4d],
+                                  axis=-1)
+  # The following tensor has shape
+  # [max_detections, mask_height, mask_width, 1 + 3 * num_parts].
+  cropped_masks = tf2.image.crop_and_resize(
+      feature_maps_concat,
+      boxes=boxes,
+      box_indices=classes,
+      crop_size=[mask_height, mask_width],
+      method='bilinear')
+
+  # Split the cropped masks back into instance masks, part masks, and surface
+  # coordinates.
+  num_parts = tf.shape(part_heatmap)[-1]
+  instance_masks, part_heatmap_cropped, surface_coords_cropped = tf.split(
+      cropped_masks, [1, num_parts, 2 * num_parts], axis=-1)
+
+  # Threshold the instance masks. Resulting tensor has shape
+  # [max_detections, mask_height, mask_width, 1].
+  instance_masks_int = tf.cast(
+      tf.math.greater_equal(instance_masks, score_threshold), dtype=tf.int32)
+
+  # Produce a binary mask that is 1.0 only:
+  #   - in the foreground region for an instance
+  #   - in detections corresponding to the DensePose class
+  det_with_parts = tf.equal(classes, densepose_class_index)
+  det_with_parts = tf.cast(
+      tf.reshape(det_with_parts, [-1, 1, 1, 1]), dtype=tf.int32)
+  instance_masks_with_parts = tf.math.multiply(instance_masks_int,
+                                               det_with_parts)
+
+  # Similarly, produce a binary mask that holds the foreground masks only for
+  # instances without parts (i.e. non-DensePose classes).
+  det_without_parts = 1 - det_with_parts
+  instance_masks_without_parts = tf.math.multiply(instance_masks_int,
+                                                  det_without_parts)
+
+  # Assemble a tensor that has standard instance segmentation masks for
+  # non-DensePose classes (with values in [0, 1]), and part segmentation masks
+  # for DensePose classes (with vaues in [0, 1, ..., num_parts]).
+  part_mask_int_zero_indexed = tf.math.argmax(
+      part_heatmap_cropped, axis=-1, output_type=tf.int32)[:, :, :, tf.newaxis]
+  part_mask_int_one_indexed = part_mask_int_zero_indexed + 1
+  all_instances = (instance_masks_without_parts +
+                   instance_masks_with_parts * part_mask_int_one_indexed)
+
+  # Gather the surface coordinates for the parts.
+  surface_coords_cropped = tf.reshape(
+      surface_coords_cropped, [-1, mask_height, mask_width, num_parts, 2])
+  surface_coords = gather_surface_coords_for_parts(surface_coords_cropped,
+                                                   part_mask_int_zero_indexed)
+  surface_coords = (
+      surface_coords * tf.cast(instance_masks_with_parts, tf.float32))
+
+  return [tf.squeeze(all_instances, axis=3), surface_coords]
+
+
+def gather_surface_coords_for_parts(surface_coords_cropped,
+                                    highest_scoring_part):
+  """Gathers the (v, u) coordinates for the highest scoring DensePose parts.
+
+  Args:
+    surface_coords_cropped: A [max_detections, height, width, num_parts, 2]
+      float32 tensor with (v, u) surface coordinates.
+    highest_scoring_part: A [max_detections, height, width] integer tensor with
+      the highest scoring part (0-indexed) indices for each location.
+
+  Returns:
+    A [max_detections, height, width, 2] float32 tensor with the (v, u)
+    coordinates selected from the highest scoring parts.
+  """
+  max_detections, height, width, num_parts, _ = (
+      shape_utils.combined_static_and_dynamic_shape(surface_coords_cropped))
+  flattened_surface_coords = tf.reshape(surface_coords_cropped, [-1, 2])
+  flattened_part_ids = tf.reshape(highest_scoring_part, [-1])
+
+  # Produce lookup indices that represent the locations of the highest scoring
+  # parts in the `flattened_surface_coords` tensor.
+  flattened_lookup_indices = (
+      num_parts * tf.range(max_detections * height * width) +
+      flattened_part_ids)
+
+  vu_coords_flattened = tf.gather(flattened_surface_coords,
+                                  flattened_lookup_indices, axis=0)
+  return tf.reshape(vu_coords_flattened, [max_detections, height, width, 2])


 class ObjectDetectionParams(
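The key encoding detail in the code above is that, for detections of the DensePose class, the binary instance mask is multiplied by the 1-indexed argmax of the cropped part heatmap, while detections of other classes keep plain {0, 1} masks. A small NumPy sketch of that composition follows; the shapes and values are illustrative, not taken from the model.

```python
# Illustrative NumPy sketch of the mask/part encoding used above.
import numpy as np

# Two 2x2 "cropped" instance masks: detection 0 is the DensePose class,
# detection 1 is an ordinary class.
instance_masks_int = np.array([[[1, 0], [1, 1]],
                               [[0, 1], [0, 0]]], dtype=np.int32)
det_with_parts = np.array([1, 0], dtype=np.int32).reshape(-1, 1, 1)
det_without_parts = 1 - det_with_parts

# Hypothetical 0-indexed argmax over the cropped part heatmap.
part_mask_zero_indexed = np.array([[[4, 0], [7, 2]],
                                   [[3, 3], [1, 1]]], dtype=np.int32)
part_mask_one_indexed = part_mask_zero_indexed + 1

all_instances = (instance_masks_int * det_without_parts +
                 instance_masks_int * det_with_parts * part_mask_one_indexed)
# Detection 0 now encodes 1-indexed parts (5, 8, 3) on its foreground pixels;
# detection 1 stays a {0, 1} mask.
print(all_instances)
```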
@@ -1235,6 +1393,64 @@ class MaskParams(
         score_threshold, heatmap_bias_init)


+class DensePoseParams(
+    collections.namedtuple('DensePoseParams', [
+        'class_id', 'classification_loss', 'localization_loss',
+        'part_loss_weight', 'coordinate_loss_weight', 'num_parts',
+        'task_loss_weight', 'upsample_to_input_res', 'upsample_method',
+        'heatmap_bias_init'
+    ])):
+  """Namedtuple to store DensePose prediction related parameters."""
+
+  __slots__ = ()
+
+  def __new__(cls,
+              class_id,
+              classification_loss,
+              localization_loss,
+              part_loss_weight=1.0,
+              coordinate_loss_weight=1.0,
+              num_parts=24,
+              task_loss_weight=1.0,
+              upsample_to_input_res=True,
+              upsample_method='bilinear',
+              heatmap_bias_init=-2.19):
+    """Constructor with default values for DensePoseParams.
+
+    Args:
+      class_id: the ID of the class that contains the DensePose groundtruth.
+        This should typically correspond to the "person" class. Note that the
+        ID is 0-based, meaning that class 0 corresponds to the first
+        non-background object class.
+      classification_loss: an object_detection.core.losses.Loss object to
+        compute the loss for the body part predictions in CenterNet.
+      localization_loss: an object_detection.core.losses.Loss object to compute
+        the loss for the surface coordinate regression in CenterNet.
+      part_loss_weight: The loss weight to apply to part prediction.
+      coordinate_loss_weight: The loss weight to apply to surface coordinate
+        prediction.
+      num_parts: The number of DensePose parts to predict.
+      task_loss_weight: float, the loss weight for the DensePose task.
+      upsample_to_input_res: Whether to upsample the DensePose feature maps to
+        the input resolution before applying loss. Note that the prediction
+        outputs are still at the standard CenterNet output stride.
+      upsample_method: Method for upsampling DensePose feature maps. Options
+        are either 'bilinear' or 'nearest'). This takes no effect when
+        `upsample_to_input_res` is False.
+      heatmap_bias_init: float, the initial value of bias in the convolutional
+        kernel of the part prediction head. If set to None, the
+        bias is initialized with zeros.
+
+    Returns:
+      An initialized DensePoseParams namedtuple.
+    """
+    return super(DensePoseParams,
+                 cls).__new__(cls, class_id, classification_loss,
+                              localization_loss, part_loss_weight,
+                              coordinate_loss_weight, num_parts,
+                              task_loss_weight, upsample_to_input_res,
+                              upsample_method, heatmap_bias_init)
+
+
 # The following constants are used to generate the keys of the
 # (prediction, loss, target assigner,...) dictionaries used in CenterNetMetaArch
 # class.
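As a usage reference, the test changes later in this commit construct these parameters roughly as follows. This is a sketch based on the `get_fake_densepose_params` helper added further down; the loss objects come from `object_detection.core.losses`.

```python
# Sketch of constructing DensePoseParams, mirroring the test helper added in
# center_net_meta_arch_tf2_test.py further down in this commit.
from object_detection.core import losses
from object_detection.meta_architectures import center_net_meta_arch as cnma

densepose_params = cnma.DensePoseParams(
    class_id=1,  # 0-indexed class that carries DensePose labels (e.g. person)
    classification_loss=losses.WeightedSoftmaxClassificationLoss(),
    localization_loss=losses.L1LocalizationLoss(),
    part_loss_weight=1.0,
    coordinate_loss_weight=1.0,
    num_parts=24,
    task_loss_weight=1.0,
    upsample_to_input_res=True,
    upsample_method='nearest')
```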
@@ -1247,6 +1463,9 @@ KEYPOINT_HEATMAP = 'keypoint/heatmap'
 KEYPOINT_OFFSET = 'keypoint/offset'
 SEGMENTATION_TASK = 'segmentation_task'
 SEGMENTATION_HEATMAP = 'segmentation/heatmap'
+DENSEPOSE_TASK = 'densepose_task'
+DENSEPOSE_HEATMAP = 'densepose/heatmap'
+DENSEPOSE_REGRESSION = 'densepose/regression'
 LOSS_KEY_PREFIX = 'Loss'
@@ -1290,7 +1509,8 @@ class CenterNetMetaArch(model.DetectionModel):
                object_center_params,
                object_detection_params=None,
                keypoint_params_dict=None,
-               mask_params=None):
+               mask_params=None,
+               densepose_params=None):
     """Initializes a CenterNet model.

     Args:
@@ -1318,6 +1538,10 @@ class CenterNetMetaArch(model.DetectionModel):
       mask_params: A MaskParams namedtuple. This object
         holds the hyper-parameters for segmentation. Please see the class
         definition for more details.
+      densepose_params: A DensePoseParams namedtuple. This object holds the
+        hyper-parameters for DensePose prediction. Please see the class
+        definition for more details. Note that if this is provided, it is
+        expected that `mask_params` is also provided.
     """
     assert object_detection_params or keypoint_params_dict

     # Shorten the name for convenience and better formatting.
@@ -1333,6 +1557,10 @@ class CenterNetMetaArch(model.DetectionModel):
     self._od_params = object_detection_params
     self._kp_params_dict = keypoint_params_dict
     self._mask_params = mask_params
+    if densepose_params is not None and mask_params is None:
+      raise ValueError('To run DensePose prediction, `mask_params` must also '
+                       'be supplied.')
+    self._densepose_params = densepose_params

     # Construct the prediction head nets.
     self._prediction_head_dict = self._construct_prediction_heads(
@@ -1413,8 +1641,18 @@ class CenterNetMetaArch(model.DetectionModel):
     if self._mask_params is not None:
       prediction_heads[SEGMENTATION_HEATMAP] = [
           make_prediction_net(num_classes,
-                              bias_fill=class_prediction_bias_init)
+                              bias_fill=self._mask_params.heatmap_bias_init)
           for _ in range(num_feature_outputs)]
+    if self._densepose_params is not None:
+      prediction_heads[DENSEPOSE_HEATMAP] = [
+          make_prediction_net(  # pylint: disable=g-complex-comprehension
+              self._densepose_params.num_parts,
+              bias_fill=self._densepose_params.heatmap_bias_init)
+          for _ in range(num_feature_outputs)]
+      prediction_heads[DENSEPOSE_REGRESSION] = [
+          make_prediction_net(2 * self._densepose_params.num_parts)
+          for _ in range(num_feature_outputs)
+      ]
     return prediction_heads

   def _initialize_target_assigners(self, stride, min_box_overlap_iou):
@@ -1449,6 +1687,10 @@ class CenterNetMetaArch(model.DetectionModel):
     if self._mask_params is not None:
       target_assigners[SEGMENTATION_TASK] = (
           cn_assigner.CenterNetMaskTargetAssigner(stride))
+    if self._densepose_params is not None:
+      dp_stride = 1 if self._densepose_params.upsample_to_input_res else stride
+      target_assigners[DENSEPOSE_TASK] = (
+          cn_assigner.CenterNetDensePoseTargetAssigner(dp_stride))

     return target_assigners
@@ -1860,6 +2102,113 @@ class CenterNetMetaArch(model.DetectionModel):
           float(len(segmentation_predictions)) * total_pixels_in_loss)
     return total_loss

+  def _compute_densepose_losses(self, input_height, input_width,
+                                prediction_dict):
+    """Computes the weighted DensePose losses.
+
+    Args:
+      input_height: An integer scalar tensor representing input image height.
+      input_width: An integer scalar tensor representing input image width.
+      prediction_dict: A dictionary holding predicted tensors output by the
+        "predict" function. See the "predict" function for more detailed
+        description.
+
+    Returns:
+      A dictionary of scalar float tensors representing the weighted losses for
+      the DensePose task:
+        DENSEPOSE_HEATMAP: the weighted part segmentation loss.
+        DENSEPOSE_REGRESSION: the weighted part surface coordinate loss.
+    """
+    dp_heatmap_loss, dp_regression_loss = (
+        self._compute_densepose_part_and_coordinate_losses(
+            input_height=input_height,
+            input_width=input_width,
+            part_predictions=prediction_dict[DENSEPOSE_HEATMAP],
+            surface_coord_predictions=prediction_dict[DENSEPOSE_REGRESSION]))
+    loss_dict = {}
+    loss_dict[DENSEPOSE_HEATMAP] = (
+        self._densepose_params.part_loss_weight * dp_heatmap_loss)
+    loss_dict[DENSEPOSE_REGRESSION] = (
+        self._densepose_params.coordinate_loss_weight * dp_regression_loss)
+    return loss_dict
+
+  def _compute_densepose_part_and_coordinate_losses(
+      self, input_height, input_width, part_predictions,
+      surface_coord_predictions):
+    """Computes the individual losses for the DensePose task.
+
+    Args:
+      input_height: An integer scalar tensor representing input image height.
+      input_width: An integer scalar tensor representing input image width.
+      part_predictions: A list of float tensors of shape [batch_size,
+        out_height, out_width, num_parts].
+      surface_coord_predictions: A list of float tensors of shape [batch_size,
+        out_height, out_width, 2 * num_parts].
+
+    Returns:
+      A tuple with two scalar loss tensors: part_prediction_loss and
+      surface_coord_loss.
+    """
+    gt_dp_num_points_list = self.groundtruth_lists(
+        fields.BoxListFields.densepose_num_points)
+    gt_dp_part_ids_list = self.groundtruth_lists(
+        fields.BoxListFields.densepose_part_ids)
+    gt_dp_surface_coords_list = self.groundtruth_lists(
+        fields.BoxListFields.densepose_surface_coords)
+    gt_weights_list = self.groundtruth_lists(fields.BoxListFields.weights)
+
+    assigner = self._target_assigner_dict[DENSEPOSE_TASK]
+    batch_indices, batch_part_ids, batch_surface_coords, batch_weights = (
+        assigner.assign_part_and_coordinate_targets(
+            height=input_height,
+            width=input_width,
+            gt_dp_num_points_list=gt_dp_num_points_list,
+            gt_dp_part_ids_list=gt_dp_part_ids_list,
+            gt_dp_surface_coords_list=gt_dp_surface_coords_list,
+            gt_weights_list=gt_weights_list))
+
+    part_prediction_loss = 0
+    surface_coord_loss = 0
+    classification_loss_fn = self._densepose_params.classification_loss
+    localization_loss_fn = self._densepose_params.localization_loss
+    num_predictions = float(len(part_predictions))
+    num_valid_points = tf.math.count_nonzero(batch_weights)
+    num_valid_points = tf.cast(tf.math.maximum(num_valid_points, 1), tf.float32)
+    for part_pred, surface_coord_pred in zip(part_predictions,
+                                             surface_coord_predictions):
+      # Potentially upsample the feature maps, so that better quality (i.e.
+      # higher res) groundtruth can be applied.
+      if self._densepose_params.upsample_to_input_res:
+        part_pred = tf.keras.layers.UpSampling2D(
+            self._stride,
+            interpolation=self._densepose_params.upsample_method)(
+                part_pred)
+        surface_coord_pred = tf.keras.layers.UpSampling2D(
+            self._stride,
+            interpolation=self._densepose_params.upsample_method)(
+                surface_coord_pred)
+      # Compute the part prediction loss.
+      part_pred = cn_assigner.get_batch_predictions_from_indices(
+          part_pred, batch_indices[:, 0:3])
+      part_prediction_loss += classification_loss_fn(
+          part_pred[:, tf.newaxis, :],
+          batch_part_ids[:, tf.newaxis, :],
+          weights=batch_weights[:, tf.newaxis, tf.newaxis])
+      # Compute the surface coordinate loss.
+      batch_size, out_height, out_width, _ = _get_shape(
+          surface_coord_pred, 4)
+      surface_coord_pred = tf.reshape(
+          surface_coord_pred, [batch_size, out_height, out_width, -1, 2])
+      surface_coord_pred = cn_assigner.get_batch_predictions_from_indices(
+          surface_coord_pred, batch_indices)
+      surface_coord_loss += localization_loss_fn(
+          surface_coord_pred,
+          batch_surface_coords,
+          weights=batch_weights[:, tf.newaxis])
+    part_prediction_loss = tf.reduce_sum(part_prediction_loss) / (
+        num_predictions * num_valid_points)
+    surface_coord_loss = tf.reduce_sum(surface_coord_loss) / (
+        num_predictions * num_valid_points)
+    return part_prediction_loss, surface_coord_loss
+
   def preprocess(self, inputs):
     outputs = shape_utils.resize_images_and_return_shapes(
         inputs, self._image_resizer_fn)
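To make the weighting explicit: each individual DensePose loss is first scaled by its own weight (`part_loss_weight`, `coordinate_loss_weight`) and then, in the overall `loss()` method shown further below, by `task_loss_weight`. A small sketch of that arithmetic with illustrative numbers:

```python
# Illustrative arithmetic only; the real values come from DensePoseParams and
# the losses computed above.
part_loss_weight = 1.0
coordinate_loss_weight = 1.0
task_loss_weight = 0.5          # hypothetical task-level weight

dp_heatmap_loss = 0.8           # hypothetical raw part-classification loss
dp_regression_loss = 0.2        # hypothetical raw surface-coordinate loss

loss_densepose_heatmap = task_loss_weight * (part_loss_weight * dp_heatmap_loss)
loss_densepose_regression = (
    task_loss_weight * (coordinate_loss_weight * dp_regression_loss))
print(loss_densepose_heatmap, loss_densepose_regression)  # 0.4 0.1
```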
@@ -1909,6 +2258,13 @@ class CenterNetMetaArch(model.DetectionModel):
         'segmentation/heatmap' - [optional] A list of size num_feature_outputs
           holding float tensors of size [batch_size, output_height,
           output_width, num_classes] representing the mask logits.
+        'densepose/heatmap' - [optional] A list of size num_feature_outputs
+          holding float tensors of size [batch_size, output_height,
+          output_width, num_parts] representing the mask logits for each part.
+        'densepose/regression' - [optional] A list of size num_feature_outputs
+          holding float tensors of size [batch_size, output_height,
+          output_width, 2 * num_parts] representing the DensePose surface
+          coordinate predictions.
       Note the $TASK_NAME is provided by the KeypointEstimation namedtuple
       used to differentiate between different keypoint tasks.
     """
@@ -1938,10 +2294,16 @@ class CenterNetMetaArch(model.DetectionModel):
       scope: Optional scope name.

     Returns:
-      A dictionary mapping the keys ['Loss/object_center', 'Loss/box/scale',
-      'Loss/box/offset', 'Loss/$TASK_NAME/keypoint/heatmap',
-      'Loss/$TASK_NAME/keypoint/offset',
-      'Loss/$TASK_NAME/keypoint/regression', 'Loss/segmentation/heatmap'] to
+      A dictionary mapping the keys [
+        'Loss/object_center',
+        'Loss/box/scale', (optional)
+        'Loss/box/offset', (optional)
+        'Loss/$TASK_NAME/keypoint/heatmap', (optional)
+        'Loss/$TASK_NAME/keypoint/offset', (optional)
+        'Loss/$TASK_NAME/keypoint/regression', (optional)
+        'Loss/segmentation/heatmap', (optional)
+        'Loss/densepose/heatmap', (optional)
+        'Loss/densepose/regression]' (optional)
       scalar tensors corresponding to the losses for different tasks. Note the
       $TASK_NAME is provided by the KeypointEstimation namedtuple used to
       differentiate between different keypoint tasks.
@@ -1999,6 +2361,16 @@ class CenterNetMetaArch(model.DetectionModel):
         seg_losses[key] = seg_losses[key] * self._mask_params.task_loss_weight
       losses.update(seg_losses)

+    if self._densepose_params is not None:
+      densepose_losses = self._compute_densepose_losses(
+          input_height=input_height,
+          input_width=input_width,
+          prediction_dict=prediction_dict)
+      for key in densepose_losses:
+        densepose_losses[key] = (
+            densepose_losses[key] * self._densepose_params.task_loss_weight)
+      losses.update(densepose_losses)
+
     # Prepend the LOSS_KEY_PREFIX to the keys in the dictionary such that the
     # losses will be grouped together in Tensorboard.
     return dict([('%s/%s' % (LOSS_KEY_PREFIX, key), val)
@@ -2033,9 +2405,14 @@ class CenterNetMetaArch(model.DetectionModel):
         invalid keypoints have their coordinates and scores set to 0.0.
       detection_keypoint_scores: (Optional) A float tensor of shape [batch,
         max_detection, num_keypoints] with scores for each keypoint.
-      detection_masks: (Optional) An int tensor of shape [batch,
-        max_detections, mask_height, mask_width] with binarized masks for each
-        detection.
+      detection_masks: (Optional) A uint8 tensor of shape [batch,
+        max_detections, mask_height, mask_width] with masks for each
+        detection. Background is specified with 0, and foreground is specified
+        with positive integers (1 for standard instance segmentation mask, and
+        1-indexed parts for DensePose task).
+      detection_surface_coords: (Optional) A float32 tensor of shape [batch,
+        max_detection, mask_height, mask_width, 2] with DensePose surface
+        coordinates, in (v, u) format.
     """
     object_center_prob = tf.nn.sigmoid(prediction_dict[OBJECT_CENTER][-1])
     # Get x, y and channel indices corresponding to the top indices in the class
@@ -2076,14 +2453,27 @@ class CenterNetMetaArch(model.DetectionModel):
     if self._mask_params:
       masks = tf.nn.sigmoid(prediction_dict[SEGMENTATION_HEATMAP][-1])
-      instance_masks = convert_strided_predictions_to_instance_masks(
-          boxes, classes, masks, self._stride, self._mask_params.mask_height,
-          self._mask_params.mask_width, true_image_shapes,
-          self._mask_params.score_threshold)
-      postprocess_dict.update({
-          fields.DetectionResultFields.detection_masks:
-              instance_masks
-      })
+      densepose_part_heatmap, densepose_surface_coords = None, None
+      densepose_class_index = 0
+      if self._densepose_params:
+        densepose_part_heatmap = prediction_dict[DENSEPOSE_HEATMAP][-1]
+        densepose_surface_coords = prediction_dict[DENSEPOSE_REGRESSION][-1]
+        densepose_class_index = self._densepose_params.class_id
+      instance_masks, surface_coords = (
+          convert_strided_predictions_to_instance_masks(
+              boxes, classes, masks, true_image_shapes,
+              densepose_part_heatmap, densepose_surface_coords,
+              stride=self._stride,
+              mask_height=self._mask_params.mask_height,
+              mask_width=self._mask_params.mask_width,
+              score_threshold=self._mask_params.score_threshold,
+              densepose_class_index=densepose_class_index))
+      postprocess_dict[
+          fields.DetectionResultFields.detection_masks] = instance_masks
+      if self._densepose_params:
+        postprocess_dict[
+            fields.DetectionResultFields.detection_surface_coords] = (
+                surface_coords)
     return postprocess_dict

   def _postprocess_keypoints(self, prediction_dict, classes, y_indices,
@@ -2359,6 +2749,14 @@ class CenterNetMetaArch(model.DetectionModel):
...
@@ -2359,6 +2749,14 @@ class CenterNetMetaArch(model.DetectionModel):
checkpoint (with compatible variable names) or to restore from a
checkpoint (with compatible variable names) or to restore from a
classification checkpoint for initialization prior to training.
classification checkpoint for initialization prior to training.
Valid values: `detection`, `classification`. Default 'detection'.
Valid values: `detection`, `classification`. Default 'detection'.
'detection': used when loading in the Hourglass model pre-trained on
other detection task.
'classification': used when loading in the ResNet model pre-trained on
image classification task. Note that only the image feature encoding
part is loaded but not those upsampling layers.
'fine_tune': used when loading the entire CenterNet feature extractor
pre-trained on other tasks. The checkpoints saved during CenterNet
model training can be directly loaded using this mode.
Returns:
Returns:
A dict mapping keys to Trackable objects (tf.Module or Checkpoint).
A dict mapping keys to Trackable objects (tf.Module or Checkpoint).
...
@@ -2367,9 +2765,14 @@ class CenterNetMetaArch(model.DetectionModel):
...
@@ -2367,9 +2765,14 @@ class CenterNetMetaArch(model.DetectionModel):
if
fine_tune_checkpoint_type
==
'classification'
:
if
fine_tune_checkpoint_type
==
'classification'
:
return
{
'feature_extractor'
:
self
.
_feature_extractor
.
get_base_model
()}
return
{
'feature_extractor'
:
self
.
_feature_extractor
.
get_base_model
()}
if
fine_tune_checkpoint_type
==
'detection'
:
el
if
fine_tune_checkpoint_type
==
'detection'
:
return
{
'feature_extractor'
:
self
.
_feature_extractor
.
get_model
()}
return
{
'feature_extractor'
:
self
.
_feature_extractor
.
get_model
()}
elif
fine_tune_checkpoint_type
==
'fine_tune'
:
feature_extractor_model
=
tf
.
train
.
Checkpoint
(
_feature_extractor
=
self
.
_feature_extractor
)
return
{
'model'
:
feature_extractor_model
}
else
:
else
:
raise
ValueError
(
'Not supported fine tune checkpoint type - {}'
.
format
(
raise
ValueError
(
'Not supported fine tune checkpoint type - {}'
.
format
(
fine_tune_checkpoint_type
))
fine_tune_checkpoint_type
))
...
...
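For orientation, the new `fine_tune` mode returns the feature extractor wrapped in a `tf.train.Checkpoint` under the key `'model'`; restoring a previously trained CenterNet checkpoint would then look roughly like the sketch below. The `detection_model` object and the checkpoint path are hypothetical placeholders, not part of this diff.

```python
# Sketch only: restoring a CenterNet feature extractor with the new
# 'fine_tune' checkpoint type. `detection_model` and `checkpoint_path` are
# hypothetical placeholders supplied by the caller.
import tensorflow as tf


def restore_fine_tune_weights(detection_model, checkpoint_path):
  """Restores a CenterNet model using the new 'fine_tune' mode (sketch)."""
  restore_objects = detection_model.restore_from_objects(
      fine_tune_checkpoint_type='fine_tune')   # {'model': tf.train.Checkpoint}
  ckpt = tf.train.Checkpoint(**restore_objects)
  ckpt.restore(checkpoint_path).expect_partial()
```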
research/object_detection/meta_architectures/center_net_meta_arch_tf2_test.py
View file @
5a2cf36f
...
@@ -266,7 +266,7 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
...
@@ -266,7 +266,7 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
masks_np
[
0
,
:,
:
3
,
1
]
=
1
# Class 1.
masks_np
[
0
,
:,
:
3
,
1
]
=
1
# Class 1.
masks
=
tf
.
constant
(
masks_np
)
masks
=
tf
.
constant
(
masks_np
)
true_image_shapes
=
tf
.
constant
([[
6
,
8
,
3
]])
true_image_shapes
=
tf
.
constant
([[
6
,
8
,
3
]])
instance_masks
=
cnma
.
convert_strided_predictions_to_instance_masks
(
instance_masks
,
_
=
cnma
.
convert_strided_predictions_to_instance_masks
(
boxes
,
classes
,
masks
,
stride
=
2
,
mask_height
=
2
,
mask_width
=
2
,
boxes
,
classes
,
masks
,
stride
=
2
,
mask_height
=
2
,
mask_width
=
2
,
true_image_shapes
=
true_image_shapes
)
true_image_shapes
=
true_image_shapes
)
return
instance_masks
return
instance_masks
...
@@ -289,6 +289,104 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
...
@@ -289,6 +289,104 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
])
])
np
.
testing
.
assert_array_equal
(
expected_instance_masks
,
instance_masks
)
np
.
testing
.
assert_array_equal
(
expected_instance_masks
,
instance_masks
)
def
test_convert_strided_predictions_raises_error_with_one_tensor
(
self
):
def
graph_fn
():
boxes
=
tf
.
constant
(
[
[[
0.5
,
0.5
,
1.0
,
1.0
],
[
0.0
,
0.5
,
0.5
,
1.0
],
[
0.0
,
0.0
,
0.0
,
0.0
]],
],
tf
.
float32
)
classes
=
tf
.
constant
(
[
[
0
,
1
,
0
],
],
tf
.
int32
)
masks_np
=
np
.
zeros
((
1
,
4
,
4
,
2
),
dtype
=
np
.
float32
)
masks_np
[
0
,
:,
2
:,
0
]
=
1
# Class 0.
masks_np
[
0
,
:,
:
3
,
1
]
=
1
# Class 1.
masks
=
tf
.
constant
(
masks_np
)
true_image_shapes
=
tf
.
constant
([[
6
,
8
,
3
]])
densepose_part_heatmap
=
tf
.
random
.
uniform
(
[
1
,
4
,
4
,
24
])
instance_masks
,
_
=
cnma
.
convert_strided_predictions_to_instance_masks
(
boxes
,
classes
,
masks
,
true_image_shapes
,
densepose_part_heatmap
=
densepose_part_heatmap
,
densepose_surface_coords
=
None
)
return
instance_masks
with
self
.
assertRaises
(
ValueError
):
self
.
execute_cpu
(
graph_fn
,
[])
def
test_crop_and_threshold_masks
(
self
):
boxes_np
=
np
.
array
(
[[
0.
,
0.
,
0.5
,
0.5
],
[
0.25
,
0.25
,
1.0
,
1.0
]],
dtype
=
np
.
float32
)
classes_np
=
np
.
array
([
0
,
2
],
dtype
=
np
.
int32
)
masks_np
=
np
.
zeros
((
4
,
4
,
_NUM_CLASSES
),
dtype
=
np
.
float32
)
masks_np
[
0
,
0
,
0
]
=
0.8
masks_np
[
1
,
1
,
0
]
=
0.6
masks_np
[
3
,
3
,
2
]
=
0.7
part_heatmap_np
=
np
.
zeros
((
4
,
4
,
_DENSEPOSE_NUM_PARTS
),
dtype
=
np
.
float32
)
part_heatmap_np
[
0
,
0
,
4
]
=
1
part_heatmap_np
[
0
,
0
,
2
]
=
0.6
# Lower scoring.
part_heatmap_np
[
1
,
1
,
8
]
=
0.2
part_heatmap_np
[
3
,
3
,
4
]
=
0.5
surf_coords_np
=
np
.
zeros
((
4
,
4
,
2
*
_DENSEPOSE_NUM_PARTS
),
dtype
=
np
.
float32
)
surf_coords_np
[:,
:,
8
:
10
]
=
0.2
,
0.9
surf_coords_np
[:,
:,
16
:
18
]
=
0.3
,
0.5
true_height
,
true_width
=
10
,
10
input_height
,
input_width
=
10
,
10
mask_height
=
4
mask_width
=
4
def
graph_fn
():
elems
=
[
tf
.
constant
(
boxes_np
),
tf
.
constant
(
classes_np
),
tf
.
constant
(
masks_np
),
tf
.
constant
(
part_heatmap_np
),
tf
.
constant
(
surf_coords_np
),
tf
.
constant
(
true_height
,
dtype
=
tf
.
int32
),
tf
.
constant
(
true_width
,
dtype
=
tf
.
int32
)
]
part_masks
,
surface_coords
=
cnma
.
crop_and_threshold_masks
(
elems
,
input_height
,
input_width
,
mask_height
=
mask_height
,
mask_width
=
mask_width
,
densepose_class_index
=
0
)
return
part_masks
,
surface_coords
part_masks
,
surface_coords
=
self
.
execute_cpu
(
graph_fn
,
[])
expected_part_masks
=
np
.
zeros
((
2
,
4
,
4
),
dtype
=
np
.
uint8
)
expected_part_masks
[
0
,
0
,
0
]
=
5
# Recall classes are 1-indexed in output.
expected_part_masks
[
0
,
2
,
2
]
=
9
# Recall classes are 1-indexed in output.
expected_part_masks
[
1
,
3
,
3
]
=
1
# Standard instance segmentation mask.
expected_surface_coords
=
np
.
zeros
((
2
,
4
,
4
,
2
),
dtype
=
np
.
float32
)
expected_surface_coords
[
0
,
0
,
0
,
:]
=
0.2
,
0.9
expected_surface_coords
[
0
,
2
,
2
,
:]
=
0.3
,
0.5
np
.
testing
.
assert_allclose
(
expected_part_masks
,
part_masks
)
np
.
testing
.
assert_allclose
(
expected_surface_coords
,
surface_coords
)
def
test_gather_surface_coords_for_parts
(
self
):
surface_coords_cropped_np
=
np
.
zeros
((
2
,
5
,
5
,
_DENSEPOSE_NUM_PARTS
,
2
),
dtype
=
np
.
float32
)
surface_coords_cropped_np
[
0
,
0
,
0
,
5
]
=
0.3
,
0.4
surface_coords_cropped_np
[
0
,
1
,
0
,
9
]
=
0.5
,
0.6
highest_scoring_part_np
=
np
.
zeros
((
2
,
5
,
5
),
dtype
=
np
.
int32
)
highest_scoring_part_np
[
0
,
0
,
0
]
=
5
highest_scoring_part_np
[
0
,
1
,
0
]
=
9
def
graph_fn
():
surface_coords_cropped
=
tf
.
constant
(
surface_coords_cropped_np
,
tf
.
float32
)
highest_scoring_part
=
tf
.
constant
(
highest_scoring_part_np
,
tf
.
int32
)
surface_coords_gathered
=
cnma
.
gather_surface_coords_for_parts
(
surface_coords_cropped
,
highest_scoring_part
)
return
surface_coords_gathered
surface_coords_gathered
=
self
.
execute_cpu
(
graph_fn
,
[])
np
.
testing
.
assert_allclose
([
0.3
,
0.4
],
surface_coords_gathered
[
0
,
0
,
0
])
np
.
testing
.
assert_allclose
([
0.5
,
0.6
],
surface_coords_gathered
[
0
,
1
,
0
])
def
test_top_k_feature_map_locations
(
self
):
def
test_top_k_feature_map_locations
(
self
):
feature_map_np
=
np
.
zeros
((
2
,
3
,
3
,
2
),
dtype
=
np
.
float32
)
feature_map_np
=
np
.
zeros
((
2
,
3
,
3
,
2
),
dtype
=
np
.
float32
)
feature_map_np
[
0
,
2
,
0
,
1
]
=
1.0
feature_map_np
[
0
,
2
,
0
,
1
]
=
1.0
...
@@ -535,6 +633,8 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
...
@@ -535,6 +633,8 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
keypoint_heatmap_np
[
1
,
0
,
1
,
1
]
=
0.9
keypoint_heatmap_np
[
1
,
0
,
1
,
1
]
=
0.9
keypoint_heatmap_np
[
1
,
2
,
0
,
1
]
=
0.8
keypoint_heatmap_np
[
1
,
2
,
0
,
1
]
=
0.8
# Note that the keypoint offsets are now per keypoint (as opposed to
# keypoint agnostic, in the test test_keypoint_candidate_prediction).
keypoint_heatmap_offsets_np
=
np
.
zeros
((
2
,
3
,
3
,
4
),
dtype
=
np
.
float32
)
keypoint_heatmap_offsets_np
=
np
.
zeros
((
2
,
3
,
3
,
4
),
dtype
=
np
.
float32
)
keypoint_heatmap_offsets_np
[
0
,
0
,
0
]
=
[
0.5
,
0.25
,
0.0
,
0.0
]
keypoint_heatmap_offsets_np
[
0
,
0
,
0
]
=
[
0.5
,
0.25
,
0.0
,
0.0
]
keypoint_heatmap_offsets_np
[
0
,
2
,
1
]
=
[
-
0.25
,
0.5
,
0.0
,
0.0
]
keypoint_heatmap_offsets_np
[
0
,
2
,
1
]
=
[
-
0.25
,
0.5
,
0.0
,
0.0
]
...
@@ -949,6 +1049,7 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
...
@@ -949,6 +1049,7 @@ class CenterNetMetaArchHelpersTest(test_case.TestCase, parameterized.TestCase):
_NUM_CLASSES
=
10
_NUM_CLASSES
=
10
_KEYPOINT_INDICES
=
[
0
,
1
,
2
,
3
]
_KEYPOINT_INDICES
=
[
0
,
1
,
2
,
3
]
_NUM_KEYPOINTS
=
len
(
_KEYPOINT_INDICES
)
_NUM_KEYPOINTS
=
len
(
_KEYPOINT_INDICES
)
_DENSEPOSE_NUM_PARTS
=
24
_TASK_NAME
=
'human_pose'
_TASK_NAME
=
'human_pose'
...
@@ -991,6 +1092,20 @@ def get_fake_mask_params():
...
@@ -991,6 +1092,20 @@ def get_fake_mask_params():
mask_width
=
4
)
mask_width
=
4
)
def
get_fake_densepose_params
():
"""Returns the fake DensePose estimation parameter namedtuple."""
return
cnma
.
DensePoseParams
(
class_id
=
1
,
classification_loss
=
losses
.
WeightedSoftmaxClassificationLoss
(),
localization_loss
=
losses
.
L1LocalizationLoss
(),
part_loss_weight
=
1.0
,
coordinate_loss_weight
=
1.0
,
num_parts
=
_DENSEPOSE_NUM_PARTS
,
task_loss_weight
=
1.0
,
upsample_to_input_res
=
True
,
upsample_method
=
'nearest'
)
def
build_center_net_meta_arch
(
build_resnet
=
False
):
def
build_center_net_meta_arch
(
build_resnet
=
False
):
"""Builds the CenterNet meta architecture."""
"""Builds the CenterNet meta architecture."""
if
build_resnet
:
if
build_resnet
:
...
@@ -1018,7 +1133,8 @@ def build_center_net_meta_arch(build_resnet=False):
...
@@ -1018,7 +1133,8 @@ def build_center_net_meta_arch(build_resnet=False):
object_center_params
=
get_fake_center_params
(),
object_center_params
=
get_fake_center_params
(),
object_detection_params
=
get_fake_od_params
(),
object_detection_params
=
get_fake_od_params
(),
keypoint_params_dict
=
{
_TASK_NAME
:
get_fake_kp_params
()},
keypoint_params_dict
=
{
_TASK_NAME
:
get_fake_kp_params
()},
mask_params
=
get_fake_mask_params
())
mask_params
=
get_fake_mask_params
(),
densepose_params
=
get_fake_densepose_params
())
def
_logit
(
p
):
def
_logit
(
p
):
...
@@ -1102,6 +1218,16 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
...
@@ -1102,6 +1218,16 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
fake_feature_map
)
fake_feature_map
)
self
.
assertEqual
((
4
,
128
,
128
,
_NUM_CLASSES
),
output
.
shape
)
self
.
assertEqual
((
4
,
128
,
128
,
_NUM_CLASSES
),
output
.
shape
)
# "densepose parts" head:
output
=
model
.
_prediction_head_dict
[
cnma
.
DENSEPOSE_HEATMAP
][
-
1
](
fake_feature_map
)
self
.
assertEqual
((
4
,
128
,
128
,
_DENSEPOSE_NUM_PARTS
),
output
.
shape
)
# "densepose surface coordinates" head:
output
=
model
.
_prediction_head_dict
[
cnma
.
DENSEPOSE_REGRESSION
][
-
1
](
fake_feature_map
)
self
.
assertEqual
((
4
,
128
,
128
,
2
*
_DENSEPOSE_NUM_PARTS
),
output
.
shape
)
def
test_initialize_target_assigners
(
self
):
def
test_initialize_target_assigners
(
self
):
model
=
build_center_net_meta_arch
()
model
=
build_center_net_meta_arch
()
assigner_dict
=
model
.
_initialize_target_assigners
(
assigner_dict
=
model
.
_initialize_target_assigners
(
...
@@ -1125,6 +1251,10 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
...
@@ -1125,6 +1251,10 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
self
.
assertIsInstance
(
assigner_dict
[
cnma
.
SEGMENTATION_TASK
],
self
.
assertIsInstance
(
assigner_dict
[
cnma
.
SEGMENTATION_TASK
],
cn_assigner
.
CenterNetMaskTargetAssigner
)
cn_assigner
.
CenterNetMaskTargetAssigner
)
# DensePose estimation target assigner:
self
.
assertIsInstance
(
assigner_dict
[
cnma
.
DENSEPOSE_TASK
],
cn_assigner
.
CenterNetDensePoseTargetAssigner
)
def
test_predict
(
self
):
def
test_predict
(
self
):
"""Test the predict function."""
"""Test the predict function."""
...
@@ -1145,6 +1275,10 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
...
@@ -1145,6 +1275,10 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
(
2
,
32
,
32
,
2
))
(
2
,
32
,
32
,
2
))
self
.
assertEqual
(
prediction_dict
[
cnma
.
SEGMENTATION_HEATMAP
][
0
].
shape
,
self
.
assertEqual
(
prediction_dict
[
cnma
.
SEGMENTATION_HEATMAP
][
0
].
shape
,
(
2
,
32
,
32
,
_NUM_CLASSES
))
(
2
,
32
,
32
,
_NUM_CLASSES
))
self
.
assertEqual
(
prediction_dict
[
cnma
.
DENSEPOSE_HEATMAP
][
0
].
shape
,
(
2
,
32
,
32
,
_DENSEPOSE_NUM_PARTS
))
self
.
assertEqual
(
prediction_dict
[
cnma
.
DENSEPOSE_REGRESSION
][
0
].
shape
,
(
2
,
32
,
32
,
2
*
_DENSEPOSE_NUM_PARTS
))
def
test_loss
(
self
):
def
test_loss
(
self
):
"""Test the loss function."""
"""Test the loss function."""
...
@@ -1157,7 +1291,13 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
...
@@ -1157,7 +1291,13 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
groundtruth_keypoints_list
=
groundtruth_dict
[
groundtruth_keypoints_list
=
groundtruth_dict
[
fields
.
BoxListFields
.
keypoints
],
fields
.
BoxListFields
.
keypoints
],
groundtruth_masks_list
=
groundtruth_dict
[
groundtruth_masks_list
=
groundtruth_dict
[
fields
.
BoxListFields
.
masks
])
fields
.
BoxListFields
.
masks
],
groundtruth_dp_num_points_list
=
groundtruth_dict
[
fields
.
BoxListFields
.
densepose_num_points
],
groundtruth_dp_part_ids_list
=
groundtruth_dict
[
fields
.
BoxListFields
.
densepose_part_ids
],
groundtruth_dp_surface_coords_list
=
groundtruth_dict
[
fields
.
BoxListFields
.
densepose_surface_coords
])
prediction_dict
=
get_fake_prediction_dict
(
prediction_dict
=
get_fake_prediction_dict
(
input_height
=
16
,
input_width
=
32
,
stride
=
4
)
input_height
=
16
,
input_width
=
32
,
stride
=
4
)
...
@@ -1193,6 +1333,12 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
...
@@ -1193,6 +1333,12 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
self
.
assertGreater
(
self
.
assertGreater
(
0.01
,
loss_dict
[
'%s/%s'
%
(
cnma
.
LOSS_KEY_PREFIX
,
0.01
,
loss_dict
[
'%s/%s'
%
(
cnma
.
LOSS_KEY_PREFIX
,
cnma
.
SEGMENTATION_HEATMAP
)])
cnma
.
SEGMENTATION_HEATMAP
)])
self
.
assertGreater
(
0.01
,
loss_dict
[
'%s/%s'
%
(
cnma
.
LOSS_KEY_PREFIX
,
cnma
.
DENSEPOSE_HEATMAP
)])
self
.
assertGreater
(
0.01
,
loss_dict
[
'%s/%s'
%
(
cnma
.
LOSS_KEY_PREFIX
,
cnma
.
DENSEPOSE_REGRESSION
)])
@
parameterized
.
parameters
(
@
parameterized
.
parameters
(
{
'target_class_id'
:
1
},
{
'target_class_id'
:
1
},
...
@@ -1230,6 +1376,14 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
...
@@ -1230,6 +1376,14 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
segmentation_heatmap
[:,
14
:
18
,
14
:
18
,
target_class_id
]
=
1.0
segmentation_heatmap
[:,
14
:
18
,
14
:
18
,
target_class_id
]
=
1.0
segmentation_heatmap
=
_logit
(
segmentation_heatmap
)
segmentation_heatmap
=
_logit
(
segmentation_heatmap
)
dp_part_ind
=
4
dp_part_heatmap
=
np
.
zeros
((
1
,
32
,
32
,
_DENSEPOSE_NUM_PARTS
),
dtype
=
np
.
float32
)
dp_part_heatmap
[
0
,
14
:
18
,
14
:
18
,
dp_part_ind
]
=
1.0
dp_part_heatmap
=
_logit
(
dp_part_heatmap
)
dp_surf_coords
=
np
.
random
.
randn
(
1
,
32
,
32
,
2
*
_DENSEPOSE_NUM_PARTS
)
class_center
=
tf
.
constant
(
class_center
)
class_center
=
tf
.
constant
(
class_center
)
height_width
=
tf
.
constant
(
height_width
)
height_width
=
tf
.
constant
(
height_width
)
offset
=
tf
.
constant
(
offset
)
offset
=
tf
.
constant
(
offset
)
...
@@ -1237,6 +1391,8 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
...
@@ -1237,6 +1391,8 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
keypoint_offsets
=
tf
.
constant
(
keypoint_offsets
,
dtype
=
tf
.
float32
)
keypoint_offsets
=
tf
.
constant
(
keypoint_offsets
,
dtype
=
tf
.
float32
)
keypoint_regression
=
tf
.
constant
(
keypoint_regression
,
dtype
=
tf
.
float32
)
keypoint_regression
=
tf
.
constant
(
keypoint_regression
,
dtype
=
tf
.
float32
)
segmentation_heatmap
=
tf
.
constant
(
segmentation_heatmap
,
dtype
=
tf
.
float32
)
segmentation_heatmap
=
tf
.
constant
(
segmentation_heatmap
,
dtype
=
tf
.
float32
)
dp_part_heatmap
=
tf
.
constant
(
dp_part_heatmap
,
dtype
=
tf
.
float32
)
dp_surf_coords
=
tf
.
constant
(
dp_surf_coords
,
dtype
=
tf
.
float32
)
prediction_dict
=
{
prediction_dict
=
{
cnma
.
OBJECT_CENTER
:
[
class_center
],
cnma
.
OBJECT_CENTER
:
[
class_center
],
...
@@ -1249,6 +1405,8 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
...
@@ -1249,6 +1405,8 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
cnma
.
get_keypoint_name
(
_TASK_NAME
,
cnma
.
KEYPOINT_REGRESSION
):
cnma
.
get_keypoint_name
(
_TASK_NAME
,
cnma
.
KEYPOINT_REGRESSION
):
[
keypoint_regression
],
[
keypoint_regression
],
cnma
.
SEGMENTATION_HEATMAP
:
[
segmentation_heatmap
],
cnma
.
SEGMENTATION_HEATMAP
:
[
segmentation_heatmap
],
cnma
.
DENSEPOSE_HEATMAP
:
[
dp_part_heatmap
],
cnma
.
DENSEPOSE_REGRESSION
:
[
dp_surf_coords
]
}
}
def
graph_fn
():
def
graph_fn
():
...
@@ -1271,12 +1429,13 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
...
@@ -1271,12 +1429,13 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
self
.
assertAllEqual
([
1
,
max_detection
,
4
,
4
],
self
.
assertAllEqual
([
1
,
max_detection
,
4
,
4
],
detections
[
'detection_masks'
].
shape
)
detections
[
'detection_masks'
].
shape
)
# There should be some section of the first mask (correspond to the only
# Masks should be empty for everything but the first detection.
# detection) with non-zero mask values.
self
.
assertGreater
(
np
.
sum
(
detections
[
'detection_masks'
][
0
,
0
,
:,
:]
>
0
),
0
)
self
.
assertAllEqual
(
self
.
assertAllEqual
(
detections
[
'detection_masks'
][
0
,
1
:,
:,
:],
detections
[
'detection_masks'
][
0
,
1
:,
:,
:],
np
.
zeros_like
(
detections
[
'detection_masks'
][
0
,
1
:,
:,
:]))
np
.
zeros_like
(
detections
[
'detection_masks'
][
0
,
1
:,
:,
:]))
self
.
assertAllEqual
(
detections
[
'detection_surface_coords'
][
0
,
1
:,
:,
:],
np
.
zeros_like
(
detections
[
'detection_surface_coords'
][
0
,
1
:,
:,
:]))
if
target_class_id
==
1
:
if
target_class_id
==
1
:
expected_kpts_for_obj_0
=
np
.
array
(
expected_kpts_for_obj_0
=
np
.
array
(
...
@@ -1287,6 +1446,12 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
...
@@ -1287,6 +1446,12 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
          expected_kpts_for_obj_0, rtol=1e-6)
      np.testing.assert_allclose(detections['detection_keypoint_scores'][0][0],
                                 expected_kpt_scores_for_obj_0, rtol=1e-6)
      # First detection has DensePose parts.
      self.assertSameElements(
          np.unique(detections['detection_masks'][0, 0, :, :]),
          set([0, dp_part_ind + 1]))
      self.assertGreater(
          np.sum(np.abs(detections['detection_surface_coords'])), 0.0)
    else:
      # All keypoint outputs should be zeros.
      np.testing.assert_allclose(
...
@@ -1297,6 +1462,14 @@ class CenterNetMetaArchTest(test_case.TestCase, parameterized.TestCase):
          detections['detection_keypoint_scores'][0][0],
          np.zeros([num_keypoints], np.float),
          rtol=1e-6)
      # Binary segmentation mask.
      self.assertSameElements(
          np.unique(detections['detection_masks'][0, 0, :, :]), set([0, 1]))
      # No DensePose surface coordinates.
      np.testing.assert_allclose(
          detections['detection_surface_coords'][0, 0, :, :],
          np.zeros_like(detections['detection_surface_coords'][0, 0, :, :]))

  def test_get_instance_indices(self):
    classes = tf.constant([[0, 1, 2, 0], [2, 1, 2, 2]], dtype=tf.int32)
...
@@ -1353,6 +1526,17 @@ def get_fake_prediction_dict(input_height, input_width, stride):
  mask_heatmap[0, 2, 4, 1] = 1.0
  mask_heatmap = _logit(mask_heatmap)
  densepose_heatmap = np.zeros((2, output_height, output_width,
                                _DENSEPOSE_NUM_PARTS), dtype=np.float32)
  densepose_heatmap[0, 2, 4, 5] = 1.0
  densepose_heatmap = _logit(densepose_heatmap)
  densepose_regression = np.zeros((2, output_height, output_width,
                                   2 * _DENSEPOSE_NUM_PARTS), dtype=np.float32)
  # The surface coordinate indices for part index 5 are:
  # (5 * 2, 5 * 2 + 1), or (10, 11).
  densepose_regression[0, 2, 4, 10:12] = 0.4, 0.7
  prediction_dict = {
      'preprocessed_inputs':
          tf.zeros((2, input_height, input_width, 3)),
...
@@ -1383,6 +1567,14 @@ def get_fake_prediction_dict(input_height, input_width, stride):
      cnma.SEGMENTATION_HEATMAP: [
          tf.constant(mask_heatmap),
          tf.constant(mask_heatmap)
      ],
      cnma.DENSEPOSE_HEATMAP: [
          tf.constant(densepose_heatmap),
          tf.constant(densepose_heatmap),
      ],
      cnma.DENSEPOSE_REGRESSION: [
          tf.constant(densepose_regression),
          tf.constant(densepose_regression),
      ]
  }
  return prediction_dict
...
@@ -1427,12 +1619,30 @@ def get_fake_groundtruth_dict(input_height, input_width, stride):
      tf.constant(mask),
      tf.zeros_like(mask),
  ]
  densepose_num_points = [
      tf.constant([1], dtype=tf.int32),
      tf.constant([0], dtype=tf.int32),
  ]
  densepose_part_ids = [
      tf.constant([[5, 0, 0]], dtype=tf.int32),
      tf.constant([[0, 0, 0]], dtype=tf.int32),
  ]
  densepose_surface_coords_np = np.zeros((1, 3, 4), dtype=np.float32)
  densepose_surface_coords_np[0, 0, :] = 0.55, 0.55, 0.4, 0.7
  densepose_surface_coords = [
      tf.constant(densepose_surface_coords_np),
      tf.zeros_like(densepose_surface_coords_np)
  ]
  groundtruth_dict = {
      fields.BoxListFields.boxes: boxes,
      fields.BoxListFields.weights: weights,
      fields.BoxListFields.classes: classes,
      fields.BoxListFields.keypoints: keypoints,
      fields.BoxListFields.masks: masks,
      fields.BoxListFields.densepose_num_points: densepose_num_points,
      fields.BoxListFields.densepose_part_ids: densepose_part_ids,
      fields.BoxListFields.densepose_surface_coords: densepose_surface_coords,
      fields.InputDataFields.groundtruth_labeled_classes: labeled_classes,
  }
  return groundtruth_dict
...
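The DensePose fixtures above all follow the same channel convention: the surface-coordinate regression tensor stores one (v, u) pair per body part, so part index p occupies channels 2 * p and 2 * p + 1 (hence 10:12 for part 5). A minimal NumPy sketch of that indexing, using assumed shapes rather than the real test constants:

import numpy as np

num_parts = 24   # assumed DensePose part count, not taken from the tests
part_ind = 5     # the part index exercised by the fixtures above
dp_regression = np.zeros((1, 8, 8, 2 * num_parts), dtype=np.float32)

# The (v, u) surface coordinates for part `part_ind` live in channels
# (2 * part_ind, 2 * part_ind + 1), i.e. (10, 11) for part 5.
dp_regression[0, 2, 4, 2 * part_ind:2 * part_ind + 2] = 0.4, 0.7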
research/object_detection/meta_architectures/context_rcnn_lib_tf2.py
0 → 100644
View file @ 5a2cf36f
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Library functions for Context R-CNN."""
import tensorflow as tf

from object_detection.core import freezable_batch_norm

# The negative value used in padding the invalid weights.
_NEGATIVE_PADDING_VALUE = -100000


class ContextProjection(tf.keras.layers.Layer):
  """Custom layer to do batch normalization and projection."""

  def __init__(self, projection_dimension, **kwargs):
    self.batch_norm = freezable_batch_norm.FreezableBatchNorm(
        epsilon=0.001,
        center=True,
        scale=True,
        momentum=0.97,
        trainable=True)
    self.projection = tf.keras.layers.Dense(units=projection_dimension,
                                            activation=tf.nn.relu6,
                                            use_bias=True)
    super(ContextProjection, self).__init__(**kwargs)

  def build(self, input_shape):
    self.batch_norm.build(input_shape)
    self.projection.build(input_shape)

  def call(self, input_features, is_training=False):
    return self.projection(self.batch_norm(input_features, is_training))


class AttentionBlock(tf.keras.layers.Layer):
  """Custom layer to perform all attention."""

  def __init__(self, bottleneck_dimension, attention_temperature,
               output_dimension=None, is_training=False,
               name='AttentionBlock', **kwargs):
    """Constructs an attention block.

    Args:
      bottleneck_dimension: An int32 Tensor representing the bottleneck
        dimension for intermediate projections.
      attention_temperature: A float Tensor. It controls the temperature of
        the softmax for weights calculation. The formula for calculation is as
        follows:
          weights = exp(weights / temperature) / sum(exp(weights / temperature))
      output_dimension: An int32 Tensor representing the last dimension of the
        output feature.
      is_training: A boolean Tensor (affecting batch normalization).
      name: A string describing what to name the variables in this block.
      **kwargs: Additional keyword arguments.
    """
    self._key_proj = ContextProjection(bottleneck_dimension)
    self._val_proj = ContextProjection(bottleneck_dimension)
    self._query_proj = ContextProjection(bottleneck_dimension)
    self._feature_proj = None
    self._attention_temperature = attention_temperature
    self._bottleneck_dimension = bottleneck_dimension
    self._is_training = is_training
    self._output_dimension = output_dimension
    if self._output_dimension:
      self._feature_proj = ContextProjection(self._output_dimension)
    super(AttentionBlock, self).__init__(name=name, **kwargs)

  def build(self, input_shapes):
    """Finishes building the attention block.

    Args:
      input_shapes: the shape of the primary input box features.
    """
    if not self._feature_proj:
      self._output_dimension = input_shapes[-1]
      self._feature_proj = ContextProjection(self._output_dimension)

  def call(self, box_features, context_features, valid_context_size):
    """Handles a call by performing attention.

    Args:
      box_features: A float Tensor of shape [batch_size, input_size,
        num_input_features].
      context_features: A float Tensor of shape [batch_size, context_size,
        num_context_features].
      valid_context_size: An int32 Tensor of shape [batch_size].

    Returns:
      A float Tensor with shape [batch_size, input_size, num_input_features]
      containing output features after attention with context features.
    """
    _, context_size, _ = context_features.shape
    valid_mask = compute_valid_mask(valid_context_size, context_size)

    # Average pools over height and width dimension so that the shape of
    # box_features becomes [batch_size, max_num_proposals, channels].
    box_features = tf.reduce_mean(box_features, [2, 3])

    queries = project_features(
        box_features, self._bottleneck_dimension, self._is_training,
        self._query_proj, normalize=True)
    keys = project_features(
        context_features, self._bottleneck_dimension, self._is_training,
        self._key_proj, normalize=True)
    values = project_features(
        context_features, self._bottleneck_dimension, self._is_training,
        self._val_proj, normalize=True)

    weights = tf.matmul(queries, keys, transpose_b=True)
    weights, values = filter_weight_value(weights, values, valid_mask)
    weights = tf.nn.softmax(weights / self._attention_temperature)

    features = tf.matmul(weights, values)
    output_features = project_features(
        features, self._output_dimension, self._is_training,
        self._feature_proj, normalize=False)

    output_features = output_features[:, :, tf.newaxis, tf.newaxis, :]

    return output_features


def filter_weight_value(weights, values, valid_mask):
  """Filters weights and values based on valid_mask.

  _NEGATIVE_PADDING_VALUE will be added to invalid elements in the weights to
  avoid their contribution in softmax. 0 will be set for the invalid elements
  in the values.

  Args:
    weights: A float Tensor of shape [batch_size, input_size, context_size].
    values: A float Tensor of shape [batch_size, context_size,
      projected_dimension].
    valid_mask: A boolean Tensor of shape [batch_size, context_size]. True
      means valid and False means invalid.

  Returns:
    weights: A float Tensor of shape [batch_size, input_size, context_size].
    values: A float Tensor of shape [batch_size, context_size,
      projected_dimension].

  Raises:
    ValueError: If the shapes of the inputs don't match.
  """
  w_batch_size, _, w_context_size = weights.shape
  v_batch_size, v_context_size, _ = values.shape
  m_batch_size, m_context_size = valid_mask.shape
  if w_batch_size != v_batch_size or v_batch_size != m_batch_size:
    raise ValueError('Please make sure the first dimension of the input'
                     ' tensors are the same.')

  if w_context_size != v_context_size:
    raise ValueError('Please make sure the third dimension of weights matches'
                     ' the second dimension of values.')

  if w_context_size != m_context_size:
    raise ValueError('Please make sure the third dimension of the weights'
                     ' matches the second dimension of the valid_mask.')

  valid_mask = valid_mask[..., tf.newaxis]

  # Force the invalid weights to be very negative so it won't contribute to
  # the softmax.
  weights += tf.transpose(
      tf.cast(tf.math.logical_not(valid_mask), weights.dtype) *
      _NEGATIVE_PADDING_VALUE,
      perm=[0, 2, 1])

  # Force the invalid values to be 0.
  values *= tf.cast(valid_mask, values.dtype)

  return weights, values


def project_features(features, bottleneck_dimension, is_training, layer,
                     normalize=True):
  """Projects features to another feature space.

  Args:
    features: A float Tensor of shape [batch_size, features_size,
      num_features].
    bottleneck_dimension: An int32 Tensor.
    is_training: A boolean Tensor (affecting batch normalization).
    layer: Contains a custom layer specific to the particular operation
      being performed (key, value, query, features).
    normalize: A boolean Tensor. If true, the output features will be l2
      normalized on the last dimension.

  Returns:
    A float Tensor of shape [batch, features_size, projection_dimension].
  """
  shape_arr = features.shape
  batch_size, _, num_features = shape_arr
  features = tf.reshape(features, [-1, num_features])

  projected_features = layer(features, is_training)

  projected_features = tf.reshape(projected_features,
                                  [batch_size, -1, bottleneck_dimension])

  if normalize:
    projected_features = tf.keras.backend.l2_normalize(projected_features,
                                                       axis=-1)

  return projected_features


def compute_valid_mask(num_valid_elements, num_elements):
  """Computes mask of valid entries within padded context feature.

  Args:
    num_valid_elements: An int32 Tensor of shape [batch_size].
    num_elements: An int32 Tensor.

  Returns:
    A boolean Tensor of the shape [batch_size, num_elements]. True means
    valid and False means invalid.
  """
  batch_size = num_valid_elements.shape[0]
  element_idxs = tf.range(num_elements, dtype=tf.int32)
  batch_element_idxs = tf.tile(element_idxs[tf.newaxis, ...], [batch_size, 1])
  num_valid_elements = num_valid_elements[..., tf.newaxis]
  valid_mask = tf.less(batch_element_idxs, num_valid_elements)
  return valid_mask
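Taken together, the new layers compose into a single attention call over padded per-image context features. The snippet below is only a usage sketch with made-up shapes (it mirrors the shapes exercised by the unit tests that follow) and assumes a TF2 runtime; it is not part of the committed file:

import tensorflow as tf
from object_detection.meta_architectures import context_rcnn_lib_tf2

# Cropped box features: [batch, num_proposals, height, width, channels].
box_features = tf.ones([2, 8, 3, 3, 3], tf.float32)
# Padded context features: [batch, context_size, num_context_features].
context_features = tf.ones([2, 20, 10], tf.float32)
# Number of valid (non-padded) context rows per image.
valid_context_size = tf.constant([20, 7], tf.int32)

block = context_rcnn_lib_tf2.AttentionBlock(
    bottleneck_dimension=4, attention_temperature=0.2, is_training=False)
# With output_dimension unset, the block keeps the input channel count,
# so the result has shape [2, 8, 1, 1, 3].
attended = block(box_features, context_features, valid_context_size)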
research/object_detection/meta_architectures/context_rcnn_lib_tf2_test.py
0 → 100644
View file @ 5a2cf36f
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for context_rcnn_lib."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import unittest
from absl.testing import parameterized
import tensorflow.compat.v1 as tf

from object_detection.meta_architectures import context_rcnn_lib_tf2 as context_rcnn_lib
from object_detection.utils import test_case
from object_detection.utils import tf_version

_NEGATIVE_PADDING_VALUE = -100000


@unittest.skipIf(tf_version.is_tf1(), 'Skipping TF2.X only test.')
class ContextRcnnLibTest(parameterized.TestCase, test_case.TestCase):
  """Tests for the functions in context_rcnn_lib."""

  def test_compute_valid_mask(self):
    num_elements = tf.constant(3, tf.int32)
    num_valid_elementss = tf.constant((1, 2), tf.int32)
    valid_mask = context_rcnn_lib.compute_valid_mask(num_valid_elementss,
                                                     num_elements)
    expected_valid_mask = tf.constant([[1, 0, 0], [1, 1, 0]], tf.float32)
    self.assertAllEqual(valid_mask, expected_valid_mask)

  def test_filter_weight_value(self):
    weights = tf.ones((2, 3, 2), tf.float32) * 4
    values = tf.ones((2, 2, 4), tf.float32)
    valid_mask = tf.constant([[True, True], [True, False]], tf.bool)
    filtered_weights, filtered_values = context_rcnn_lib.filter_weight_value(
        weights, values, valid_mask)
    expected_weights = tf.constant([[[4, 4], [4, 4], [4, 4]],
                                    [[4, _NEGATIVE_PADDING_VALUE + 4],
                                     [4, _NEGATIVE_PADDING_VALUE + 4],
                                     [4, _NEGATIVE_PADDING_VALUE + 4]]])
    expected_values = tf.constant([[[1, 1, 1, 1], [1, 1, 1, 1]],
                                   [[1, 1, 1, 1], [0, 0, 0, 0]]])
    self.assertAllEqual(filtered_weights, expected_weights)
    self.assertAllEqual(filtered_values, expected_values)

    # Changes the valid_mask so the results will be different.
    valid_mask = tf.constant([[True, True], [False, False]], tf.bool)
    filtered_weights, filtered_values = context_rcnn_lib.filter_weight_value(
        weights, values, valid_mask)
    expected_weights = tf.constant(
        [[[4, 4], [4, 4], [4, 4]],
         [[_NEGATIVE_PADDING_VALUE + 4, _NEGATIVE_PADDING_VALUE + 4],
          [_NEGATIVE_PADDING_VALUE + 4, _NEGATIVE_PADDING_VALUE + 4],
          [_NEGATIVE_PADDING_VALUE + 4, _NEGATIVE_PADDING_VALUE + 4]]])
    expected_values = tf.constant([[[1, 1, 1, 1], [1, 1, 1, 1]],
                                   [[0, 0, 0, 0], [0, 0, 0, 0]]])
    self.assertAllEqual(filtered_weights, expected_weights)
    self.assertAllEqual(filtered_values, expected_values)

  @parameterized.parameters((2, True, True), (2, False, True),
                            (10, True, False), (10, False, False))
  def test_project_features(self, projection_dimension, is_training,
                            normalize):
    features = tf.ones([2, 3, 4], tf.float32)
    projected_features = context_rcnn_lib.project_features(
        features,
        projection_dimension,
        is_training,
        context_rcnn_lib.ContextProjection(projection_dimension),
        normalize=normalize)

    # Makes sure the shape is correct.
    self.assertAllEqual(projected_features.shape,
                        [2, 3, projection_dimension])

  @parameterized.parameters(
      (2, 10, 1),
      (3, 10, 2),
      (4, None, 3),
      (5, 20, 4),
      (7, None, 5),
  )
  def test_attention_block(self, bottleneck_dimension, output_dimension,
                           attention_temperature):
    input_features = tf.ones([2, 8, 3, 3, 3], tf.float32)
    context_features = tf.ones([2, 20, 10], tf.float32)
    attention_block = context_rcnn_lib.AttentionBlock(
        bottleneck_dimension,
        attention_temperature,
        output_dimension=output_dimension,
        is_training=False)
    valid_context_size = tf.random_uniform((2,),
                                           minval=0,
                                           maxval=10,
                                           dtype=tf.int32)
    output_features = attention_block(input_features, context_features,
                                      valid_context_size)

    # Makes sure the shape is correct.
    self.assertAllEqual(output_features.shape,
                        [2, 8, 1, 1, (output_dimension or 3)])


if __name__ == '__main__':
  tf.test.main()
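The large negative padding constant asserted on above is what keeps padded context slots out of the attention result: after filter_weight_value, a padded logit of 4 - 100000 is driven to effectively zero probability by the softmax inside AttentionBlock. A small hand-worked check with assumed values, not part of the test suite:

import numpy as np

temperature = 2.0
logits = np.array([4.0, 4.0 - 100000.0])   # valid slot vs. padded slot
weights = np.exp(logits / temperature)
weights /= weights.sum()
# weights is approximately [1.0, 0.0]; the padded slot contributes nothing.
print(weights)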
research/object_detection/meta_architectures/context_rcnn_meta_arch.py
View file @ 5a2cf36f
...
@@ -27,7 +27,9 @@ import functools
from object_detection.core import standard_fields as fields
from object_detection.meta_architectures import context_rcnn_lib
from object_detection.meta_architectures import context_rcnn_lib_tf2
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.utils import tf_version


class ContextRCNNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
...
@@ -264,11 +266,17 @@ class ContextRCNNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
            return_raw_detections_during_predict),
        output_final_box_features=output_final_box_features)

    if tf_version.is_tf1():
      self._context_feature_extract_fn = functools.partial(
          context_rcnn_lib.compute_box_context_attention,
          bottleneck_dimension=attention_bottleneck_dimension,
          attention_temperature=attention_temperature,
          is_training=is_training)
    else:
      self._context_feature_extract_fn = context_rcnn_lib_tf2.AttentionBlock(
          bottleneck_dimension=attention_bottleneck_dimension,
          attention_temperature=attention_temperature,
          is_training=is_training)

  @staticmethod
  def get_side_inputs(features):
...
@@ -323,8 +331,9 @@ class ContextRCNNMetaArch(faster_rcnn_meta_arch.FasterRCNNMetaArch):
    Returns:
      A float32 Tensor with shape [K, new_height, new_width, depth].
    """
    box_features = self._crop_and_resize_fn(
        [features_to_crop], proposal_boxes_normalized, None,
        [self._initial_crop_size, self._initial_crop_size])
    attention_features = self._context_feature_extract_fn(
...
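The net effect of the constructor change above is that self._context_feature_extract_fn always ends up as a callable, whether it is a functools.partial around the TF1 helper or a Keras AttentionBlock under TF2. Below is a rough sketch of that selection pattern using a hypothetical helper name rather than the real constructor wiring; the call-site arguments are elided in the hunk above and are not reproduced here:

import functools

from object_detection.meta_architectures import context_rcnn_lib
from object_detection.meta_architectures import context_rcnn_lib_tf2
from object_detection.utils import tf_version


def make_context_feature_extract_fn(attention_bottleneck_dimension,
                                    attention_temperature, is_training):
  # Hypothetical helper mirroring the branch in the diff: both branches yield
  # something callable, so downstream code does not care which API is active.
  if tf_version.is_tf1():
    return functools.partial(
        context_rcnn_lib.compute_box_context_attention,
        bottleneck_dimension=attention_bottleneck_dimension,
        attention_temperature=attention_temperature,
        is_training=is_training)
  else:
    return context_rcnn_lib_tf2.AttentionBlock(
        bottleneck_dimension=attention_bottleneck_dimension,
        attention_temperature=attention_temperature,
        is_training=is_training)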