# Mask R-CNN with deep mask heads

This project brings insights from the DeepMAC model into the Mask-RCNN
architecture. Please see the paper
[The surprising impact of mask-head architecture on novel class segmentation](https://arxiv.org/abs/2104.00613)
for more details.

## Code structure

*   This folder contains forks of a few Mask R-CNN files and repurposes them to
    support deep mask heads.
*   To see the benefits of using deep mask heads, it is important to train the
    mask head with only groundtruth boxes. This is configured via the
    `task.model.use_gt_boxes_for_masks` flag.
*   Architecture of the mask head can be changed via the config value
    `task.model.mask_head.convnet_variant`. Supported values are `"default"`,
    `"hourglass20"`, `"hourglass52"`, and `"hourglass100"`.
*   The flag `task.model.mask_head.class_agnostic` trains the model in class
    agnostic mode and `task.allowed_mask_class_ids` controls which classes are
    allowed to have masks during training.
*   The majority of experiments and ablations from the paper were performed with the
    [DeepMAC model](../../../../../research/object_detection/g3doc/deepmac.md)
    in the Object Detection API code base.
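
The mask-head settings listed above can also be passed as `--params_override` entries instead of being edited in the YAML file. The sketch below is only an illustration: the variable name `MASK_HEAD_OVERRIDES` and the chosen values are assumptions, not settings recommended by the paper. The comma-separated `key=value` format mirrors the overrides used in the training command of this README.

```shell
# Illustrative values only; the flag names come from the list above.
MASK_HEAD_OVERRIDES="task.model.use_gt_boxes_for_masks=true,\
task.model.mask_head.convnet_variant=hourglass52,\
task.model.mask_head.class_agnostic=true"

# Print the combined string so it can be inspected before launching a job.
echo "${MASK_HEAD_OVERRIDES}"
```

The resulting string can be joined with other overrides using a comma.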

## Prerequisites

### Prepare dataset

Use [create_coco_tf_record.py](../../data/create_coco_tf_record.py) to create
the COCO dataset. The data needs to be stored in a
[Google cloud storage bucket](https://cloud.google.com/storage/docs/creating-buckets)
so that it can be accessed by the TPU.

### Start a TPU v3-32 instance

See [TPU Quickstart](https://cloud.google.com/tpu/docs/quickstart) for
instructions. An example command would look like:

```shell
ctpu up --name <tpu-name> --zone <zone> --tpu-size=v3-32 --tf-version nightly
```

This model requires TF version `>= 2.5`. Currently, that is only available via a
`nightly` build on Cloud.
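
One way to confirm the version requirement on the TPU host is a small check like the one below. It is a sketch, not part of the project: `version_at_least` is a hypothetical helper, and it assumes GNU `sort -V` is available for version comparison.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Query the installed TensorFlow version, if any, and compare it with 2.5.
if TF_VERSION="$(python3 -c 'import tensorflow as tf; print(tf.__version__)' 2>/dev/null)"; then
  if version_at_least "${TF_VERSION}" "2.5"; then
    echo "TensorFlow ${TF_VERSION} is new enough"
  else
    echo "TensorFlow ${TF_VERSION} is older than 2.5"
  fi
else
  echo "TensorFlow is not installed in this environment"
fi
```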

### Install requirements

SSH into the TPU host with `gcloud compute ssh <tpu-name>` and execute the
following.

```shell
$ git clone https://github.com/tensorflow/models.git
$ cd models
$ pip3 install -r official/requirements.txt
```

## Training Models

The configurations can be found in the `configs/experiments` directory. You can
launch a training job by executing the following commands.

```shell
$ export CONFIG=./official/projects/deepmac_maskrcnn/configs/experiments/deep_mask_head_rcnn_voc_r50.yaml
$ export MODEL_DIR="gs://<path-for-checkpoints>"
$ export ANNOTATION_FILE="gs://<path-to-coco-annotation-json>"
$ export TRAIN_DATA="gs://<path-to-train-data>"
$ export EVAL_DATA="gs://<path-to-eval-data>"
# Overrides to access data. These can also be changed in the config file.
$ export OVERRIDES="task.validation_data.input_path=${EVAL_DATA},\
task.train_data.input_path=${TRAIN_DATA},\
task.annotation_file=${ANNOTATION_FILE},\
runtime.distribution_strategy=tpu"

$ python3 -m official.projects.deepmac_maskrcnn.train \
  --logtostderr \
  --mode=train_and_eval \
  --experiment=deep_mask_head_rcnn_resnetfpn_coco \
  --model_dir=$MODEL_DIR \
  --config_file=$CONFIG \
  --params_override=$OVERRIDES \
  --tpu=<tpu-name>
```

`CONFIG` can be any file in the `configs/experiments` directory.
When using SpineNet models, specify
`--experiment=deep_mask_head_rcnn_spinenet_coco` instead.

**Note:** The default eval batch size of 32 discards some samples during
validation. For accurate validation statistics, launch a dedicated eval job on
a TPU `v3-8` and set the batch size to 8.
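
A dedicated eval job as described in the note might be configured along these lines. This is a sketch under assumptions: `--mode=eval` and the `task.validation_data.global_batch_size` override are assumed to be accepted by the trainer and are not confirmed by this README, and `EVAL_OVERRIDES` and `TPU_NAME` are hypothetical names. The command is printed rather than executed so that the values can be checked first.

```shell
# Placeholders: replace with real paths and the name of a v3-8 TPU.
MODEL_DIR="gs://<path-for-checkpoints>"
EVAL_DATA="gs://<path-to-eval-data>"
ANNOTATION_FILE="gs://<path-to-coco-annotation-json>"
TPU_NAME="<tpu-name>"
CONFIG=./official/projects/deepmac_maskrcnn/configs/experiments/deep_mask_head_rcnn_voc_r50.yaml

# Assumption: the eval batch size is controlled by global_batch_size.
EVAL_OVERRIDES="task.validation_data.input_path=${EVAL_DATA},\
task.validation_data.global_batch_size=8,\
task.annotation_file=${ANNOTATION_FILE},\
runtime.distribution_strategy=tpu"

# Print the eval command instead of running it (assumes --mode=eval is supported).
echo python3 -m official.projects.deepmac_maskrcnn.train \
  --logtostderr \
  --mode=eval \
  --experiment=deep_mask_head_rcnn_resnetfpn_coco \
  "--model_dir=${MODEL_DIR}" \
  "--config_file=${CONFIG}" \
  "--params_override=${EVAL_OVERRIDES}" \
  "--tpu=${TPU_NAME}"
```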

## Configurations

In the following table, we report the mask mAP of our models on the non-VOC
classes when training with masks for only the VOC classes. Performance is
measured on the `coco-val2017` set.

Backbone     | Mask head    | Config name                                     | Mask mAP
:------------| :----------- | :-----------------------------------------------| -------:
ResNet-50    | Default      | `deep_mask_head_rcnn_voc_r50.yaml`              | 25.9
ResNet-50    | Hourglass-52 | `deep_mask_head_rcnn_voc_r50_hg52.yaml`         | 33.1
ResNet-101   | Hourglass-52 | `deep_mask_head_rcnn_voc_r101_hg52.yaml`        | 34.4
SpineNet-143 | Hourglass-52 | `deep_mask_head_rcnn_voc_spinenet143_hg52.yaml` | 38.7
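
To train one of the configurations in the table, the config file and the matching `--experiment` value need to be chosen together. The sketch below shows one way to do that. The helper `experiment_for_config` and the variable `EXPERIMENT` are hypothetical. The directory path and the two experiment names are the ones used elsewhere in this README.

```shell
CONFIG_DIR=./official/projects/deepmac_maskrcnn/configs/experiments

# Hypothetical helper: choose the --experiment value from a config file name.
experiment_for_config() {
  case "$1" in
    *spinenet*) echo "deep_mask_head_rcnn_spinenet_coco" ;;
    *)          echo "deep_mask_head_rcnn_resnetfpn_coco" ;;
  esac
}

# Pick any config name from the table above.
CONFIG_NAME=deep_mask_head_rcnn_voc_spinenet143_hg52.yaml
export CONFIG="${CONFIG_DIR}/${CONFIG_NAME}"
export EXPERIMENT="$(experiment_for_config "${CONFIG_NAME}")"
echo "config=${CONFIG} experiment=${EXPERIMENT}"
```

`EXPERIMENT` can then be passed to the training command as `--experiment=${EXPERIMENT}`.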

## Checkpoints

This model takes an image and boxes as input and produces per-box instance
masks as output.

*   [Mask-RCNN SpineNet backbone](https://storage.cloud.google.com/tf_model_garden/vision/deepmac_maskrcnn/deepmarc_spinenet.zip)

## See also

*   [DeepMAC model](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/deepmac.md)
    in the Object Detection API code base.
*   Project website - [git.io/deepmac](https://google.github.io/deepmac/)

## Citation

```
@misc{birodkar2021surprising,
      title={The surprising impact of mask-head architecture on novel class segmentation}, 
      author={Vighnesh Birodkar and Zhichao Lu and Siyang Li and Vivek Rathod and Jonathan Huang},
      year={2021},
      eprint={2104.00613},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```