# DeepMAC model

<!-- TODO(vighneshb) add correct arxiv links and test this page.-->

**DeepMAC** (Deep Mask heads Above CenterNet) is a neural network architecture
designed for the partially supervised instance segmentation task. For details,
see the paper
[The surprising impact of mask-head architecture on novel class segmentation](https://arxiv.org/abs/2104.00613).
The figure below shows how mask predictions for unseen classes improve as we
use better mask-head architectures.

<p align="center">
<img src="./img/mask_improvement.png" width=50%/>
</p>

Just by using better mask-head architectures (no extra losses or modules) we
achieve state-of-the-art performance in the partially supervised instance
segmentation task.

## Code structure

*   `deepmac_meta_arch.py` implements our main architecture, DeepMAC, on top of
    the CenterNet detection architecture.
*   The proto message `DeepMACMaskEstimation` in `center_net.proto` controls the
    configuration of the mask head used.
*   The field `allowed_masked_classes_ids` controls which classes receive mask
    supervision during training.
*   Mask R-CNN based ablations in the paper are implemented in the
    [TF model garden](../../../official/vision/beta/projects/deepmac_maskrcnn)
    code base.
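
The effect of restricting mask supervision to an allow-list of classes can be
illustrated with a minimal sketch. This is not the code in
`deepmac_meta_arch.py`: the function name `mask_loss_weights` and the example
class IDs are hypothetical, and the sketch only shows the general idea of
zeroing the mask loss for instances whose class is not in the list.

```python
def mask_loss_weights(instance_class_ids, allowed_class_ids):
    """Return a per-instance weight for the mask loss.

    Hypothetical helper: an instance gets weight 1.0 when its class ID is in
    the allow-list and 0.0 otherwise, so only allowed classes contribute
    mask supervision.
    """
    allowed = set(allowed_class_ids)
    return [1.0 if class_id in allowed else 0.0
            for class_id in instance_class_ids]


# Three ground-truth instances with made-up class IDs; only classes 1 and 17
# are allowed to supervise the mask head.
weights = mask_loss_weights([1, 17, 25], allowed_class_ids=[1, 17])
print(weights)
```

Box supervision is unaffected in this picture: every instance still trains the
detector, and only the mask term is masked out.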

## Prerequisites

1.  Follow the [TF2 install instructions](tf2.md) to install the Object
    Detection API.
2.  Generate the COCO dataset using
    [create_coco_tf_record.py](../../../official/vision/beta/data/create_coco_tf_record.py).

## Configurations

We provide pre-defined configs which can be run as a
[TF2 training pipeline](tf2_training_and_evaluation.md). Each of these
configurations needs to be passed as the `pipeline_config_path` argument to the
`object_detection/model_main_tf2.py` binary. Note that the `512x512` resolution
models require a TPU `v3-32` and the `1024x1024` resolution models require a TPU
`v3-128` to train. The configs can be found in the [configs/tf2](../configs/tf2)
directory. In the table below `X->Y` indicates that we train with masks from `X`
and evaluate with masks from `Y`. Performance is measured on the `coco-val2017`
set.
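
As a rough sketch of how a config is passed to the training binary, the snippet
below assembles the command line in Python. It assumes the working directory is
the one that contains `object_detection/`; the output directory
`/tmp/deepmac_voc_only` and the helper name `build_train_command` are
hypothetical. TPU-specific flags are left out, so this shows only the shape of
the invocation and is not a complete launch recipe.

```python
import shlex


def build_train_command(pipeline_config_path, model_dir):
    """Return the argv list for launching training with model_main_tf2.py."""
    return [
        "python",
        "object_detection/model_main_tf2.py",
        f"--pipeline_config_path={pipeline_config_path}",
        f"--model_dir={model_dir}",
    ]


command = build_train_command(
    "object_detection/configs/tf2/center_net_deepmac_512x512_voc_only.config",
    "/tmp/deepmac_voc_only",  # hypothetical directory for checkpoints and logs
)

# Print a copy-pasteable shell command; pass `command` to subprocess.run to
# actually launch it.
print(shlex.join(command))
```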

### Partially supervised models

Resolution | Mask head     | Train->Eval    | Config name                                        | Mask mAP
:--------- | :------------ | :------------- | :------------------------------------------------- | -------:
512x512    | Hourglass-52  | VOC -> Non-VOC | `center_net_deepmac_512x512_voc_only.config`       | 32.5
1024x1024  | Hourglass-100 | VOC -> Non-VOC | `center_net_deepmac_1024x1024_voc_only.config`     | 35.5
1024x1024  | Hourglass-100 | Non-VOC -> VOC | `center_net_deepmac_1024x1024_non_voc_only.config` | 39.1
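
To make the `VOC -> Non-VOC` notation concrete, the sketch below partitions
category names into the two groups. It assumes that "VOC" means the 20 COCO
categories that also appear in PASCAL VOC, written here with COCO's category
names; the helper `split_by_voc` is hypothetical and is not part of the Object
Detection API.

```python
# The 20 PASCAL VOC categories, using COCO's naming
# (e.g. "couch" for VOC's "sofa", "tv" for "tvmonitor").
VOC_CLASS_NAMES = frozenset([
    "airplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "dining table", "dog", "horse", "motorcycle", "person",
    "potted plant", "sheep", "couch", "train", "tv",
])


def split_by_voc(category_names):
    """Split category names into (VOC, non-VOC) lists, preserving order."""
    voc = [name for name in category_names if name in VOC_CLASS_NAMES]
    non_voc = [name for name in category_names if name not in VOC_CLASS_NAMES]
    return voc, non_voc


# In a VOC -> Non-VOC run, masks from the first group are used for training
# and mask quality is evaluated on the second group.
train_mask_classes, eval_mask_classes = split_by_voc(
    ["person", "giraffe", "dog", "zebra"])
print(train_mask_classes, eval_mask_classes)
```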

### Fully supervised models

Here we report the Mask mAP averaged over all COCO classes on the `test-dev2017`
set.

Resolution | Mask head     | Config name                                | Mask mAP
:--------- | :------------ | :----------------------------------------- | -------:
1024x1024  | Hourglass-100 | `center_net_deepmac_1024x1024_coco.config` | 39.4

## Demos

*   [DeepMAC Colab](../colab_tutorials/deepmac_colab.ipynb) lets you run a
    pre-trained DeepMAC model on user-specified boxes. Note that you are not
    restricted to COCO classes!
*   [iWildCam Notebook](https://www.kaggle.com/vighneshbgoogle/iwildcam-visualize-instance-masks)
    to visualize instance masks generated by DeepMAC on the iWildCam dataset.

## Pre-trained models

*   [COCO Checkpoint](http://download.tensorflow.org/models/object_detection/tf2/20210329/deepmac_1024x1024_coco17.tar.gz) -
    Takes an image and boxes as input and produces per-box instance masks as
    output.

## See also

*   [Mask RCNN code](https://github.com/tensorflow/models/tree/master/official/vision/beta/projects/deepmac_maskrcnn)
    in TF Model garden code base.
*   Project website - [git.io/deepmac](https://git.io/deepmac)

## Citation

```
@misc{birodkar2021surprising,
      title={The surprising impact of mask-head architecture on novel class segmentation},
      author={Vighnesh Birodkar and Zhichao Lu and Siyang Li and Vivek Rathod and Jonathan Huang},
      year={2021},
      eprint={2104.00613},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```