MODEL_ZOO.md 30 KB
Newer Older
Kai Chen's avatar
Kai Chen committed
1
2
3
4
5
6
7
8
9
10
11
# Benchmark and Model Zoo

## Environment

### Hardware

- 8 NVIDIA Tesla V100 GPUs
- Intel Xeon 4114 CPU @ 2.20GHz

### Software environment

Kai Chen's avatar
Kai Chen committed
12
- Python 3.6 / 3.7
Cao Yuhang's avatar
Cao Yuhang committed
13
- PyTorch Nightly
Kai Chen's avatar
Kai Chen committed
14
15
16
17
- CUDA 9.0.176
- CUDNN 7.0.4
- NCCL 2.1.15

Kai Chen's avatar
Kai Chen committed
18
19
20
## Mirror sites

We use AWS as the main site to host our model zoo, and maintain a mirror on aliyun.
Kai Chen's avatar
Kai Chen committed
21
You can replace `https://s3.ap-northeast-2.amazonaws.com/open-mmlab` with `https://open-mmlab.oss-cn-beijing.aliyuncs.com` in model urls.
Kai Chen's avatar
Kai Chen committed
22
23
24
25
26
27
28
29

## Common settings

- All baselines were trained using 8 GPU with a batch size of 16 (2 images per GPU).
- All models were trained on `coco_2017_train`, and tested on the `coco_2017_val`.
- We use distributed training and BN layer stats are fixed.
- We adopt the same training schedules as Detectron. 1x indicates 12 epochs and 2x indicates 24 epochs, which corresponds to slightly less iterations than Detectron and the difference can be ignored.
- All pytorch-style pretrained backbones on ImageNet are from PyTorch model zoo.
30
31
- For fair comparison with other codebases, we report the GPU memory as the maximum value of `torch.cuda.max_memory_allocated()` for all 8 GPUs. Note that this value is usually less than what `nvidia-smi` shows.
- We report the inference time as the overall time including data loading, network forwarding and post processing.
Kai Chen's avatar
Kai Chen committed
32
33
34
35


## Baselines

36
More models with different backbones will be added to the model zoo.
Kai Chen's avatar
Kai Chen committed
37
38
39

### RPN

Kai Chen's avatar
Kai Chen committed
40
41
| Backbone | Style   | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | AR1000 | Download |
|:--------:|:-------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:--------:|
Cao Yuhang's avatar
Cao Yuhang committed
42
43
| R-50-FPN | caffe   | 1x      | 3.3      | 0.253               | 16.9           | 58.2   | -        |
| R-50-FPN | pytorch | 1x      | 3.5      | 0.276               | 17.7           | 57.1   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_r50_fpn_1x_20181010-4a9c0712.pth) |
44
| R-50-FPN | pytorch | 2x      | -        | -                   | -              | 57.6   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_r50_fpn_2x_20181010-88a4a471.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
45
46
| R-101-FPN | caffe   | 1x      | 5.2      | 0.379               | 13.9           | 59.4   | -        |
| R-101-FPN | pytorch | 1x      | 5.4      | 0.396               | 14.4           | 58.6   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_r101_fpn_1x_20181129-f50da4bd.pth) |
47
| R-101-FPN | pytorch | 2x      | -        | -                   | -              | 59.1   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_r101_fpn_2x_20181129-e42c6c9a.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
48
| X-101-32x4d-FPN | pytorch |1x | 6.6      | 0.589               | 11.8            | 59.4   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_x101_32x4d_fpn_1x_20181218-7e379d26.pth)
49
| X-101-32x4d-FPN | pytorch |2x | -        | -                   | -               | 59.9   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_x101_32x4d_fpn_2x_20181218-0510af40.pth)
Cao Yuhang's avatar
Cao Yuhang committed
50
| X-101-64x4d-FPN | pytorch |1x | 9.5      | 0.955               | 8.3            | 59.8   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_x101_64x4d_fpn_1x_20181218-c1a24f1f.pth)
51
| X-101-64x4d-FPN | pytorch |2x | -        | -                   | -              | 60.0   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_x101_64x4d_fpn_2x_20181218-c22bdd70.pth)
Kai Chen's avatar
Kai Chen committed
52
53
54

### Faster R-CNN

Kai Chen's avatar
Kai Chen committed
55
56
| Backbone | Style   | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
|:--------:|:-------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:--------:|
Cao Yuhang's avatar
Cao Yuhang committed
57
58
| R-50-FPN | caffe   | 1x      | 3.6      | 0.333               | 12.9           | 36.7   | -        |
| R-50-FPN | pytorch | 1x      | 3.8      | 0.353               | 12.5           | 36.4   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth) |
59
| R-50-FPN | pytorch | 2x      | -        | -                   |  -             | 37.7   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r50_fpn_2x_20181010-443129e1.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
60
61
| R-101-FPN | caffe   | 1x      | 5.5      | 0.465               | 10.7          | 38.8   | -        |
| R-101-FPN | pytorch | 1x      | 5.7      | 0.474               | 10.8          | 38.6   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r101_fpn_1x_20181129-d1468807.pth) |
62
| R-101-FPN | pytorch | 2x      | -        | -                   | -             | 39.4   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r101_fpn_2x_20181129-73e7ade7.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
63
| X-101-32x4d-FPN | pytorch | 1x| 6.9      | 0.672               | 9.3           | 40.2    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_x101_32x4d_fpn_1x_20181218-ad81c133.pth)
64
| X-101-32x4d-FPN | pytorch | 2x| -        | -                   | -             | 40.5    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_x101_32x4d_fpn_2x_20181218-0ed58946.pth)
Cao Yuhang's avatar
Cao Yuhang committed
65
| X-101-64x4d-FPN | pytorch | 1x| 9.8      | 1.040               | 7.1           | 41.3    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_x101_64x4d_fpn_1x_20181218-c9c69c8f.pth)
66
| X-101-64x4d-FPN | pytorch | 2x| -        | -                   | -             | 40.7    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_x101_64x4d_fpn_2x_20181218-fe94f9b8.pth)
Kai Chen's avatar
Kai Chen committed
67
68
69

### Mask R-CNN

Kai Chen's avatar
Kai Chen committed
70
71
| Backbone | Style   | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|:--------:|:-------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:-------:|:--------:|
Cao Yuhang's avatar
Cao Yuhang committed
72
73
| R-50-FPN | caffe   | 1x      | 3.8      | 0.430               | 9.9            | 37.5   | 34.4    | -        |
| R-50-FPN | pytorch | 1x      | 3.9      | 0.453               | 9.6            | 37.3   | 34.2    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth) |
74
| R-50-FPN | pytorch | 2x      | -        | -                   | -              | 38.6   | 35.1    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r50_fpn_2x_20181010-41d35c05.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
75
76
| R-101-FPN | caffe   | 1x      | 5.7      | 0.534               | 8.8            | 39.9   | 36.1    | -        |
| R-101-FPN | pytorch | 1x      | 5.8      | 0.571               | 8.9            | 39.4   | 35.9    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r101_fpn_1x_20181129-34ad1961.pth) |
77
| R-101-FPN | pytorch | 2x      | -        | -                   | -              | 40.4   | 36.6    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r101_fpn_2x_20181129-a254bdfc.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
78
| X-101-32x4d-FPN | pytorch | 1x| 7.1      | 0.759               | 7.9            | 41.2   | 37.2    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_x101_32x4d_fpn_1x_20181218-44e635cc.pth)
79
| X-101-32x4d-FPN | pytorch | 2x| -        | -                   | -              | 41.4   | 37.1    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_x101_32x4d_fpn_2x_20181218-f023dffa.pth)
Cao Yuhang's avatar
Cao Yuhang committed
80
| X-101-64x4d-FPN | pytorch | 1x| 10.0     | 1.102               | 5.8            | 42.2   | 38.1    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_x101_64x4d_fpn_1x_20181218-cb159987.pth)
81
| X-101-64x4d-FPN | pytorch | 2x| -        | -                   | -              | 42.0   | 37.8    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_x101_64x4d_fpn_2x_20181218-ea936e44.pth)
Kai Chen's avatar
Kai Chen committed
82

Kai Chen's avatar
Kai Chen committed
83
### Fast R-CNN (with pre-computed proposals)
Kai Chen's avatar
Kai Chen committed
84

Kai Chen's avatar
Kai Chen committed
85
86
| Backbone | Style   | Type   | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|:--------:|:-------:|:------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:-------:|:--------:|
Cao Yuhang's avatar
Cao Yuhang committed
87
88
| R-50-FPN | caffe   | Faster | 1x      | 3.3      | 0.242               | 18.4           | 36.6   | -       | -        |
| R-50-FPN | pytorch | Faster | 1x      | 3.5      | 0.250               | 16.5           | 35.8   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/fast_rcnn_r50_fpn_1x_20181010-08160859.pth) |
89
| R-50-FPN | pytorch | Faster | 2x      | -        | -                   | -              | 37.1   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/fast_rcnn_r50_fpn_2x_20181010-d263ada5.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
90
91
| R-101-FPN| caffe   | Faster | 1x      | 5.2      | 0.355               | 14.4           | 38.6   | -       | -        |
| R-101-FPN| pytorch | Faster | 1x      | 5.4      | 0.388               | 13.2           | 38.1   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/fast_rcnn_r101_fpn_1x_20181129-ffaa2eb0.pth) |
92
| R-101-FPN| pytorch | Faster | 2x      | -        | -                   | -              | 38.8   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/fast_rcnn_r101_fpn_2x_20181129-9dba92ce.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
93
94
| R-50-FPN | caffe   | Mask   | 1x      | 3.4      | 0.328               | 12.8           | 37.3   | 34.5    | -        |
| R-50-FPN | pytorch | Mask   | 1x      | 3.5      | 0.346               | 12.7           | 36.8   | 34.1    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/fast_mask_rcnn_r50_fpn_1x_20181010-e030a38f.pth) |
95
| R-50-FPN | pytorch | Mask   | 2x      | -        | -                   | -              | 37.9   | 34.8    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/fast_mask_rcnn_r50_fpn_2x_20181010-5048cb03.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
96
97
| R-101-FPN| caffe   | Mask   | 1x      | 5.2      | 0.429               | 11.2            | 39.4   | 36.1    | -        |
| R-101-FPN| pytorch | Mask   | 1x      | 5.4      | 0.462               | 10.9            | 38.9   | 35.8    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/fast_mask_rcnn_r101_fpn_1x_20181129-2273fa9b.pth) |
98
| R-101-FPN| pytorch | Mask   | 2x      | -        | -                   | -               | 39.9   | 36.4    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/fast_mask_rcnn_r101_fpn_2x_20181129-bf63ec5e.pth) |
Kai Chen's avatar
Kai Chen committed
99

Kai Chen's avatar
Kai Chen committed
100
### RetinaNet
Kai Chen's avatar
Kai Chen committed
101

Kai Chen's avatar
Kai Chen committed
102
| Backbone | Style   | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
Kai Chen's avatar
Kai Chen committed
103
|:--------:|:-------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:--------:|
Cao Yuhang's avatar
Cao Yuhang committed
104
105
| R-50-FPN | caffe   | 1x      | 3.4      | 0.285               | 12.5            | 35.8   | -        |
| R-50-FPN | pytorch | 1x      | 3.6      | 0.308               | 12.1            | 35.6   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_r50_fpn_1x_20181125-3d3c2142.pth) |
106
| R-50-FPN | pytorch | 2x      | -        | -                   | -               | 36.5   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_r50_fpn_2x_20181125-e0dbec97.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
107
108
| R-101-FPN | caffe   | 1x      | 5.3      | 0.410               | 10.4            | 37.8   | -        |
| R-101-FPN | pytorch | 1x      | 5.5      | 0.429               | 10.9            | 37.7   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_r101_fpn_1x_20181129-f738a02f.pth) |
109
| R-101-FPN | pytorch | 2x      | -        | -                   | -               | 38.1   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_r101_fpn_2x_20181129-f654534b.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
110
| X-101-32x4d-FPN | pytorch | 1x| 6.7      | 0.632               | 9.3            | 39.0   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_x101_32x4d_fpn_1x_20190501-967812ba.pth)
111
| X-101-32x4d-FPN | pytorch | 2x| -        | -                   | -              | 39.3   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_x101_32x4d_fpn_2x_20181218-605dcd0a.pth)
Cao Yuhang's avatar
Cao Yuhang committed
112
| X-101-64x4d-FPN | pytorch | 1x| 9.6      | 0.993               | 7.0            | 40.0   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_x101_64x4d_fpn_1x_20181218-2f6f778b.pth)
113
| X-101-64x4d-FPN | pytorch | 2x| -        | -                   | -              | 39.6   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_x101_64x4d_fpn_2x_20181218-2f598dc5.pth)
Kai Chen's avatar
Kai Chen committed
114

Kai Chen's avatar
Kai Chen committed
115
116
117
118
### Cascade R-CNN

| Backbone | Style   | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
|:--------:|:-------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:--------:|
Cao Yuhang's avatar
Cao Yuhang committed
119
120
| R-50-FPN | caffe   | 1x      | 3.9      | 0.464               | 9.7            | 40.6   | -        |
| R-50-FPN | pytorch | 1x      | 4.1      | 0.455               | 10.1           | 40.5   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_r50_fpn_1x_20190501-3b6211ab.pth) |
121
| R-50-FPN | pytorch | 20e     | -        | -                   | -              | 41.1   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_r50_fpn_20e_20181123-db483a09.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
122
123
| R-101-FPN | caffe   | 1x      | 5.8      | 0.569               | 8.7            | 42.5   | -        |
| R-101-FPN | pytorch | 1x      | 6.0      | 0.584               | 8.7            | 42.1   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_r101_fpn_1x_20181129-d64ebac7.pth) |
124
| R-101-FPN | pytorch | 20e     | -        | -                   | -              | 42.6   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_r101_fpn_20e_20181129-b46dcede.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
125
| X-101-32x4d-FPN | pytorch | 1x| 7.2      | 0.770               | 7.8            | 43.7   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_x101_32x4d_fpn_1x_20190501-af628be5.pth)
126
| X-101-32x4d-FPN | pytorch |20e| -        | -                   | -              | 44.1   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_x101_32x4d_fpn_2x_20181218-28f73c4c.pth)
Cao Yuhang's avatar
Cao Yuhang committed
127
| X-101-64x4d-FPN | pytorch | 1x| 10.0     | 1.133               | 6.1            | 44.6   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_x101_64x4d_fpn_1x_20181218-e2dc376a.pth)
128
| X-101-64x4d-FPN | pytorch |20e| -        | -                   | -              | 44.8   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_x101_64x4d_fpn_2x_20181218-5add321e.pth)
Kai Chen's avatar
Kai Chen committed
129
130
131
132
133

### Cascade Mask R-CNN

| Backbone | Style   | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|:--------:|:-------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:-------:|:--------:|
Cao Yuhang's avatar
Cao Yuhang committed
134
135
| R-50-FPN | caffe   | 1x      | 5.1      | 0.692               | 6.7            | 41.0   | 35.6    | -        |
| R-50-FPN | pytorch | 1x      | 5.3      | 0.683               | 6.5            | 41.3   | 35.7    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_r50_fpn_1x_20181123-88b170c9.pth) |
136
| R-50-FPN | pytorch | 20e     | -        | -                   | -              | 42.4   | 36.6    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_r50_fpn_20e_20181123-6e0c9713.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
137
138
| R-101-FPN | caffe   | 1x      | 7.0     | 0.803               | 6.3            | 43.1   | 37.3    | -        |
| R-101-FPN | pytorch | 1x      | 7.2     | 0.807               | 6.1            | 42.7   | 37.1    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_r101_fpn_1x_20181129-64f00602.pth) |
139
| R-101-FPN | pytorch | 20e     | -       | -                   | -              | 43.4   | 37.6    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_r101_fpn_20e_20181129-cb85151d.pth) |
Cao Yuhang's avatar
Cao Yuhang committed
140
| X-101-32x4d-FPN | pytorch | 1x| 8.4      | 0.976               | 5.7            | 44.4   | 38.3    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_x101_32x4d_fpn_1x_20181218-1d944c89.pth)
141
| X-101-32x4d-FPN | pytorch |20e| -        | -                   | -              | 44.9   | 38.7    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_x101_32x4d_fpn_20e_20181218-761a3473.pth)
Cao Yuhang's avatar
Cao Yuhang committed
142
| X-101-64x4d-FPN | pytorch | 1x| 11.4     | 1.33                | 4.7            | 45.3   | 39.1    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_x101_64x4d_fpn_1x_20190501-827e0a70.pth)
143
| X-101-64x4d-FPN | pytorch |20e| -        | -                   | -              | 45.8   | 39.5    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_x101_64x4d_fpn_20e_20181218-630773a7.pth)
Kai Chen's avatar
Kai Chen committed
144

pangjm's avatar
pangjm committed
145
146
**Notes:**

Kai Chen's avatar
Kai Chen committed
147
- The `20e` schedule in Cascade (Mask) R-CNN indicates decreasing the lr at 16 and 19 epochs, with a total of 20 epochs.
Kai Chen's avatar
Kai Chen committed
148

149
150
151
152
### Hybrid Task Cascade (HTC)

Please refer to [HTC](configs/htc/README.md) for details.

Kai Chen's avatar
Kai Chen committed
153
154
155
156
### SSD

| Backbone | Size | Style  | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
|:--------:|:----:|:------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:--------:|
Cao Yuhang's avatar
Cao Yuhang committed
157
158
| VGG16    | 300  | caffe  | 120e    | 3.5      | 0.256               | 25.9 / 34.6    | 25.7   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/ssd300_coco_vgg16_caffe_120e_20181221-84d7110b.pth)  |
| VGG16    | 512  | caffe  | 120e    | 7.6      | 0.412               | 20.7 / 25.4    | 29.3   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/ssd512_coco_vgg16_caffe_120e_20181221-d48b0be8.pth) |
Kai Chen's avatar
Kai Chen committed
159
160
161
162
163

### SSD (PASCAL VOC)

| Backbone | Size | Style  | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
|:--------:|:----:|:------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:--------:|
Cao Yuhang's avatar
Cao Yuhang committed
164
165
| VGG16    | 300  | caffe  | 240e    | 2.5      | 0.159               | 35.7 / 53.6    | 77.5   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/ssd300_voc_vgg16_caffe_240e_20190501-7160d09a.pth)  |
| VGG16    | 512  | caffe  | 240e    | 4.3      | 0.214               | 27.5 / 35.9    | 80.0   | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/ssd512_voc_vgg16_caffe_240e_20190501-ff194be1.pth) |
Kai Chen's avatar
Kai Chen committed
166
167
168
169
170
171
172

**Notes:**

- `cudnn.benchmark` is set as `True` for SSD training and testing.
- Inference time is reported for batch size = 1 and batch size = 8.
- The speed difference between VOC and COCO is caused by model parameters and nms.

Kai Chen's avatar
Kai Chen committed
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
### Group Normalization (GN)

| Backbone      | model      | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|:-------------:|:----------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:-------:|:--------:|
| R-50-FPN (d)  | Mask R-CNN | 2x      | 7.2      | 0.806               | 5.4            | 39.9   | 36.1    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r50_fpn_gn_2x_20180113-86832cf2.pth) |
| R-50-FPN (d)  | Mask R-CNN | 3x      | 7.2      | 0.806               | 5.4            | 40.2   | 36.5    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r50_fpn_gn_3x_20180113-8e82f48d.pth) |
| R-101-FPN (d) | Mask R-CNN | 2x      | 9.9      | 0.970               | 4.8            | 41.6   | 37.1    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r101_fpn_gn_2x_20180113-9598649c.pth) |
| R-101-FPN (d) | Mask R-CNN | 3x      | 9.9      | 0.970               | 4.8            | 41.7   | 37.3    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r101_fpn_gn_3x_20180113-a14ffb96.pth) |
| R-50-FPN (c)  | Mask R-CNN | 2x      | 7.2      | 0.806               | 5.4            | 39.7   | 35.9    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r50_fpn_gn_contrib_2x_20180113-ec93305c.pth) |
| R-50-FPN (c)  | Mask R-CNN | 3x      | 7.2      | 0.806               | 5.4            | 40.1   | 36.2    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r50_fpn_gn_contrib_3x_20180113-9d230cab.pth) |

**Notes:**
- (d) means pretrained model converted from Detectron, and (c) means the contributed model pretrained by [@thangvubk](https://github.com/thangvubk).
- The `3x` schedule is epoch [28, 34, 36].

Kai Chen's avatar
Kai Chen committed
188
189
190
191
192
193
194
195
196
### Deformable Convolution v2

| Backbone  | Model        | Style   | Conv          | Pool   | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|:---------:|:------------:|:-------:|:-------------:|:------:|:-------:|:--------:|:-------------------:|:--------------:|:------:|:-------:|:--------:|
| R-50-FPN  | Faster       | pytorch | dconv(c3-c5)  | -      | 1x      | 3.9      | 0.594               | 10.2           | 40.0   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/faster_rcnn_dconv_c3-c5_r50_fpn_1x_20190125-e41688c9.pth) |
| R-50-FPN  | Faster       | pytorch | mdconv(c3-c5) | -      | 1x      | 3.7      | 0.598               | 10.0           | 40.3   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/faster_rcnn_mdconv_c3-c5_r50_fpn_1x_20190125-1b768045.pth) |
| R-50-FPN  | Faster       | pytorch | -             | dpool  | 1x      | 4.6      | 0.714               | 8.7            | 37.9   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/faster_rcnn_dpool_r50_fpn_1x_20190125-f4fc1d70.pth) |
| R-50-FPN  | Faster       | pytorch | -             | mdpool | 1x      | 5.2      | 0.769               | 8.2            | 38.1   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/faster_rcnn_mdpool_r50_fpn_1x_20190125-473d0f3d.pth) |
| R-101-FPN | Faster       | pytorch | dconv(c3-c5)  | -      | 1x      | 5.8      | 0.811               | 8.0            | 42.1   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/faster_rcnn_dconv_c3-c5_r101_fpn_1x_20190125-a7e31b65.pth) |
Kai Chen's avatar
Kai Chen committed
197
| X-101-32x4d-FPN | Faster       | pytorch | dconv(c3-c5)  | -      | 1x      | 7.1      | 1.126               | 6.6            | 43.5   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/faster_rcnn_dconv_c3-c5_x101_32x4d_fpn_1x_20190201-6d46376f.pth) |
Kai Chen's avatar
Kai Chen committed
198
199
200
201
202
203
204
205
206
207
208
209
| R-50-FPN  | Mask         | pytorch | dconv(c3-c5)  | -      | 1x      | 4.5      | 0.712               | 7.7            | 41.1   | 37.2    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/mask_rcnn_dconv_c3-c5_r50_fpn_1x_20190125-4f94ff79.pth) |
| R-50-FPN  | Mask         | pytorch | mdconv(c3-c5) | -      | 1x      | 4.5      | 0.712               | 7.7            | 41.4   | 37.4    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/mask_rcnn_mdconv_c3-c5_r50_fpn_1x_20190125-c5601dc3.pth) |
| R-101-FPN | Mask         | pytorch | dconv(c3-c5)  | -      | 1x      | 6.4      | 0.939               | 6.5            | 43.2   | 38.7    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/mask_rcnn_dconv_c3-c5_r101_fpn_1x_20190125-decb6db5.pth) |
| R-50-FPN  | Cascade      | pytorch | dconv(c3-c5)  | -      | 1x      | 4.4      | 0.660               | 7.6            | 44.1   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/cascade_rcnn_dconv_c3-c5_r50_fpn_1x_20190125-dfa53166.pth) |
| R-101-FPN | Cascade      | pytorch | dconv(c3-c5)  | -      | 1x      | 6.3      | 0.881               | 6.8            | 45.1   | -       | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/cascade_rcnn_dconv_c3-c5_r101_fpn_1x_20190125-aaa877cc.pth) |
| R-50-FPN  | Cascade Mask | pytorch | dconv(c3-c5)  | -      | 1x      | 6.6      | 0.942               | 5.7            | 44.5   | 38.3    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/cascade_mask_rcnn_dconv_c3-c5_r50_fpn_1x_20190125-09d8a443.pth) |
| R-101-FPN | Cascade Mask | pytorch | dconv(c3-c5)  | -      | 1x      | 8.5      | 1.156               | 5.1            | 45.8   | 39.5    | [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/dcn/cascade_mask_rcnn_dconv_c3-c5_r101_fpn_1x_20190125-0d62c190.pth) |

**Notes:**

- `dconv` and `mdconv` denote (modulated) deformable convolution, `c3-c5` means adding dconv in resnet stage 3 to 5. `dpool` and `mdpool` denote (modulated) deformable roi pooling.
- The dcn ops are modified from https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch, which should be more memory efficient and slightly faster.
Kai Chen's avatar
Kai Chen committed
210

211
## Comparison with Detectron and maskrcnn-benchmark
Kai Chen's avatar
Kai Chen committed
212
213

We compare mmdetection with [Detectron](https://github.com/facebookresearch/Detectron)
214
and [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark). The backbone used is R-50-FPN.
Kai Chen's avatar
Kai Chen committed
215

Kai Chen's avatar
Kai Chen committed
216
217
218
219
220
221
In general, mmdetection has 3 advantages over Detectron.

- **Higher performance** (especially in terms of mask AP)
- **Faster training speed**
- **Memory efficient**

Kai Chen's avatar
Kai Chen committed
222
223
### Performance

224
Detectron and maskrcnn-benchmark use caffe-style ResNet as the backbone.
Kai Chen's avatar
Kai Chen committed
225
226
227
228
229
We report results using both caffe-style (weights converted from
[here](https://github.com/facebookresearch/Detectron/blob/master/MODEL_ZOO.md#imagenet-pretrained-models))
and pytorch-style (weights from the official model zoo) ResNet backbone,
indicated as *pytorch-style results* / *caffe-style results*.

230
231
232
233
We find that pytorch-style ResNet usually converges slower than caffe-style ResNet,
thus leading to slightly lower results in 1x schedule, but the final results
of 2x schedule is higher.

Kai Chen's avatar
Kai Chen committed
234
235
236
237
238
<table>
  <tr>
    <th>Type</th>
    <th>Lr schd</th>
    <th>Detectron</th>
239
    <th>maskrcnn-benchmark</th>
Kai Chen's avatar
Kai Chen committed
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
    <th>mmdetection</th>
  </tr>
  <tr>
    <td rowspan="2">RPN</td>
    <td>1x</td>
    <td>57.2</td>
    <td>-</td>
    <td>57.1 / 58.2</td>
  </tr>
  <tr>
    <td>2x</td>
    <td>-</td>
    <td>-</td>
    <td>57.6 / -</td>
  </tr>
  <tr>
    <td rowspan="2">Faster R-CNN</td>
    <td>1x</td>
    <td>36.7</td>
259
    <td>36.8</td>
Kai Chen's avatar
Kai Chen committed
260
261
262
263
264
265
266
267
268
269
270
271
    <td>36.4 / 36.7</td>
  </tr>
  <tr>
    <td>2x</td>
    <td>37.9</td>
    <td>-</td>
    <td>37.7 / -</td>
  </tr>
  <tr>
    <td rowspan="2">Mask R-CNN</td>
    <td>1x</td>
    <td>37.7 &amp; 33.9</td>
272
    <td>37.8 &amp; 34.2</td>
Kai Chen's avatar
Kai Chen committed
273
274
275
276
277
278
279
280
    <td>37.3 &amp; 34.2 / 37.5 &amp; 34.4</td>
  </tr>
  <tr>
    <td>2x</td>
    <td>38.6 &amp; 34.5</td>
    <td>-</td>
    <td>38.6 &amp; 35.1 / -</td>
  </tr>
Kai Chen's avatar
Kai Chen committed
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
  <tr>
    <td rowspan="2">Fast R-CNN</td>
    <td>1x</td>
    <td>36.4</td>
    <td>-</td>
    <td>35.8 / 36.6</td>
  </tr>
  <tr>
    <td>2x</td>
    <td>36.8</td>
    <td>-</td>
    <td>37.1 / -</td>
  </tr>
  <tr>
    <td rowspan="2">Fast R-CNN (w/mask)</td>
    <td>1x</td>
    <td>37.3 &amp; 33.7</td>
    <td>-</td>
    <td>36.8 &amp; 34.1 / 37.3 &amp; 34.5</td>
  </tr>
  <tr>
    <td>2x</td>
    <td>37.7 &amp; 34.0</td>
    <td>-</td>
    <td>37.9 &amp; 34.8 / -</td>
  </tr>
Kai Chen's avatar
Kai Chen committed
307
308
</table>

Kai Chen's avatar
Kai Chen committed
309
### Training Speed
Kai Chen's avatar
Kai Chen committed
310

Kai Chen's avatar
Kai Chen committed
311
The training speed is measure with s/iter. The lower, the better.
Kai Chen's avatar
Kai Chen committed
312
313
314
315
316

<table>
  <tr>
    <th>Type</th>
    <th>Detectron (P100<sup>1</sup>)</th>
317
318
    <th>maskrcnn-benchmark (V100)</th>
    <th>mmdetection (V100<sup>2</sup>)</th>
Kai Chen's avatar
Kai Chen committed
319
320
321
322
323
  </tr>
  <tr>
    <td>RPN</td>
    <td>0.416</td>
    <td>-</td>
324
    <td>0.253</td>
Kai Chen's avatar
Kai Chen committed
325
326
327
328
  </tr>
  <tr>
    <td>Faster R-CNN</td>
    <td>0.544</td>
329
330
    <td>0.353</td>
    <td>0.333</td>
Kai Chen's avatar
Kai Chen committed
331
332
333
334
  </tr>
  <tr>
    <td>Mask R-CNN</td>
    <td>0.889</td>
335
336
    <td>0.454</td>
    <td>0.430</td>
Kai Chen's avatar
Kai Chen committed
337
  </tr>
Kai Chen's avatar
Kai Chen committed
338
339
340
341
  <tr>
    <td>Fast R-CNN</td>
    <td>0.285</td>
    <td>-</td>
342
    <td>0.242</td>
Kai Chen's avatar
Kai Chen committed
343
344
345
346
347
  </tr>
  <tr>
    <td>Fast R-CNN (w/mask)</td>
    <td>0.377</td>
    <td>-</td>
348
    <td>0.328</td>
Kai Chen's avatar
Kai Chen committed
349
  </tr>
Kai Chen's avatar
Kai Chen committed
350
351
</table>

352
\*1. Facebook's Big Basin servers (P100/V100) is slightly faster than the servers we use. mmdetection can also run slightly faster on FB's servers.
Kai Chen's avatar
Kai Chen committed
353

354
\*2. For fair comparison, we list the caffe-style results here.
Kai Chen's avatar
Kai Chen committed
355

Kai Chen's avatar
Kai Chen committed
356
357
358
359
360
361
362
363
364

### Inference Speed

The inference speed is measured with fps (img/s) on a single GPU. The higher, the better.

<table>
  <tr>
    <th>Type</th>
    <th>Detectron (P100)</th>
365
366
    <th>maskrcnn-benchmark (V100)</th>
    <th>mmdetection (V100)</th>
Kai Chen's avatar
Kai Chen committed
367
368
369
370
371
  </tr>
  <tr>
    <td>RPN</td>
    <td>12.5</td>
    <td>-</td>
372
    <td>16.9</td>
Kai Chen's avatar
Kai Chen committed
373
374
375
376
  </tr>
  <tr>
    <td>Faster R-CNN</td>
    <td>10.3</td>
377
378
    <td>7.9</td>
    <td>12.9</td>
Kai Chen's avatar
Kai Chen committed
379
380
381
382
  </tr>
  <tr>
    <td>Mask R-CNN</td>
    <td>8.5</td>
383
384
    <td>7.7</td>
    <td>9.9</td>
Kai Chen's avatar
Kai Chen committed
385
  </tr>
Kai Chen's avatar
Kai Chen committed
386
387
388
  <tr>
    <td>Fast R-CNN</td>
    <td>12.5</td>
389
390
    <td>-</td>
    <td>18.4</td>
Kai Chen's avatar
Kai Chen committed
391
392
393
394
  </tr>
  <tr>
    <td>Fast R-CNN (w/mask)</td>
    <td>9.9</td>
395
396
    <td>-</td>
    <td>12.8</td>
Kai Chen's avatar
Kai Chen committed
397
  </tr>
Kai Chen's avatar
Kai Chen committed
398
399
</table>

Kai Chen's avatar
Kai Chen committed
400
401
### Training memory

402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
<table>
  <tr>
    <th>Type</th>
    <th>Detectron</th>
    <th>maskrcnn-benchmark</th>
    <th>mmdetection</th>
  </tr>
  <tr>
    <td>RPN</td>
    <td>6.4</td>
    <td>-</td>
    <td>3.3</td>
  </tr>
  <tr>
    <td>Faster R-CNN</td>
    <td>7.2</td>
    <td>4.4</td>
    <td>3.6</td>
  </tr>
  <tr>
    <td>Mask R-CNN</td>
    <td>8.6</td>
    <td>5.2</td>
    <td>3.8</td>
  </tr>
  <tr>
    <td>Fast R-CNN</td>
    <td>6.0</td>
    <td>-</td>
    <td>3.3</td>
  </tr>
  <tr>
    <td>Fast R-CNN (w/mask)</td>
    <td>7.9</td>
    <td>-</td>
    <td>3.4</td>
  </tr>
</table>

There is no doubt that maskrcnn-benchmark and mmdetection is more memory efficient than Detectron,
and the main advantage is PyTorch itself. We also perform some memory optimizations to push it forward.

Note that Caffe2 and PyTorch have different apis to obtain memory usage with different implementations.
For all codebases, `nvidia-smi` shows a larger memory usage than the reported number in the above table.

Kai Chen's avatar
Kai Chen committed
447
448