# Benchmark and Model Zoo

## Mirror sites

We use AWS as the main site to host our model zoo and maintain a mirror on aliyun.
To use the mirror, replace `https://s3.ap-northeast-2.amazonaws.com/open-mmlab` with `https://open-mmlab.oss-cn-beijing.aliyuncs.com` in model URLs.
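The prefix swap is a plain string replacement; a minimal sketch (the checkpoint filename below is hypothetical, used only for illustration):

```python
# Rewrite a model-zoo URL to point at the aliyun mirror.
AWS_PREFIX = "https://s3.ap-northeast-2.amazonaws.com/open-mmlab"
ALIYUN_PREFIX = "https://open-mmlab.oss-cn-beijing.aliyuncs.com"

def to_mirror(url: str) -> str:
    """Replace the AWS host prefix with the aliyun mirror prefix."""
    return url.replace(AWS_PREFIX, ALIYUN_PREFIX)

# Hypothetical checkpoint path, only to illustrate the swap.
url = AWS_PREFIX + "/mmdetection/models/faster_rcnn_r50_fpn_1x.pth"
print(to_mirror(url))
# -> https://open-mmlab.oss-cn-beijing.aliyuncs.com/mmdetection/models/faster_rcnn_r50_fpn_1x.pth
```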

## Common settings

- All FPN baselines and RPN-C4 baselines were trained using 8 GPUs with a batch size of 16 (2 images per GPU). Other C4 baselines were trained using 8 GPUs with a batch size of 8 (1 image per GPU).
- All models were trained on `coco_2017_train` and tested on `coco_2017_val`.
- We use distributed training, and BN layer statistics are fixed.
- We adopt the same training schedules as Detectron: 1x indicates 12 epochs and 2x indicates 24 epochs, which corresponds to slightly fewer iterations than Detectron; the difference can be ignored.
- All pytorch-style pretrained backbones on ImageNet are from the PyTorch model zoo.
- For fair comparison with other codebases, we report the GPU memory as the maximum value of `torch.cuda.max_memory_allocated()` over all 8 GPUs. Note that this value is usually less than what `nvidia-smi` shows.
- We report the inference time as the overall time, including data loading, network forwarding, and post-processing.
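The memory convention above can be sketched as follows; the per-GPU byte counts are made-up placeholders standing in for `torch.cuda.max_memory_allocated(device=i)` on each of the 8 GPUs:

```python
# Reported GPU memory = max over all GPUs of max_memory_allocated(), in GB.
# The byte counts below are placeholder values, not real measurements.
per_gpu_bytes = [
    4_100_000_000, 4_050_000_000, 4_200_000_000, 3_990_000_000,
    4_080_000_000, 4_120_000_000, 4_010_000_000, 4_150_000_000,
]

# Take the maximum across GPUs (not the sum) and convert bytes to GB.
reported_gb = max(per_gpu_bytes) / 1024**3
print(f"reported memory: {reported_gb:.1f} GB")  # prints "reported memory: 3.9 GB"
```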


## Baselines

### RPN

Please refer to [RPN](https://github.com/open-mmlab/mmdetection/blob/master/configs/rpn) for details.

### Faster R-CNN

Please refer to [Faster R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/faster_rcnn) for details.

### Mask R-CNN

Please refer to [Mask R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/mask_rcnn) for details.

### Fast R-CNN (with pre-computed proposals)

Please refer to [Fast R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/fast_rcnn) for details.

### RetinaNet

Please refer to [RetinaNet](https://github.com/open-mmlab/mmdetection/blob/master/configs/retinanet) for details.

### Cascade R-CNN and Cascade Mask R-CNN

Please refer to [Cascade R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/cascade_rcnn) for details.

### Hybrid Task Cascade (HTC)

Please refer to [HTC](https://github.com/open-mmlab/mmdetection/blob/master/configs/htc) for details.

### SSD

Please refer to [SSD](https://github.com/open-mmlab/mmdetection/blob/master/configs/ssd) for details.

### Group Normalization (GN)

Please refer to [Group Normalization](https://github.com/open-mmlab/mmdetection/blob/master/configs/gn) for details.

### Weight Standardization

Please refer to [Weight Standardization](https://github.com/open-mmlab/mmdetection/blob/master/configs/gn+ws) for details.

### Deformable Convolution v2

Please refer to [Deformable Convolutional Networks](https://github.com/open-mmlab/mmdetection/blob/master/configs/dcn) for details.

### CARAFE: Content-Aware ReAssembly of FEatures

Please refer to [CARAFE](https://github.com/open-mmlab/mmdetection/blob/master/configs/carafe) for details.

### Instaboost

Please refer to [Instaboost](https://github.com/open-mmlab/mmdetection/blob/master/configs/instaboost) for details.

### Libra R-CNN

Please refer to [Libra R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/libra_rcnn) for details.

### Guided Anchoring

Please refer to [Guided Anchoring](https://github.com/open-mmlab/mmdetection/blob/master/configs/guided_anchoring) for details.

### FCOS

Please refer to [FCOS](https://github.com/open-mmlab/mmdetection/blob/master/configs/fcos) for details.

### FoveaBox

Please refer to [FoveaBox](https://github.com/open-mmlab/mmdetection/blob/master/configs/foveabox) for details.

### RepPoints

Please refer to [RepPoints](https://github.com/open-mmlab/mmdetection/blob/master/configs/reppoints) for details.

### FreeAnchor

Please refer to [FreeAnchor](https://github.com/open-mmlab/mmdetection/blob/master/configs/free_anchor) for details.

### Grid R-CNN (plus)

Please refer to [Grid R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/grid_rcnn) for details.

### GHM

Please refer to [GHM](https://github.com/open-mmlab/mmdetection/blob/master/configs/ghm) for details.

### GCNet

Please refer to [GCNet](https://github.com/open-mmlab/mmdetection/blob/master/configs/gcnet) for details.

### HRNet

Please refer to [HRNet](https://github.com/open-mmlab/mmdetection/blob/master/configs/hrnet) for details.

### Mask Scoring R-CNN

Please refer to [Mask Scoring R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/ms_rcnn) for details.

### Train from Scratch

Please refer to [Rethinking ImageNet Pre-training](https://github.com/open-mmlab/mmdetection/blob/master/configs/scratch) for details.

### NAS-FPN

Please refer to [NAS-FPN](https://github.com/open-mmlab/mmdetection/blob/master/configs/nas_fpn) for details.

### ATSS

Please refer to [ATSS](https://github.com/open-mmlab/mmdetection/blob/master/configs/atss) for details.

### FSAF

Please refer to [FSAF](https://github.com/open-mmlab/mmdetection/blob/master/configs/fsaf) for details.

### Other datasets

We also benchmark some methods on [PASCAL VOC](https://github.com/open-mmlab/mmdetection/blob/master/configs/pascal_voc), [Cityscapes](https://github.com/open-mmlab/mmdetection/blob/master/configs/cityscapes) and [WIDER FACE](https://github.com/open-mmlab/mmdetection/blob/master/configs/wider_face).


## Speed benchmark

We compare the training speed of Mask R-CNN with some other popular frameworks (the data is copied from [detectron2](https://github.com/facebookresearch/detectron2/blob/master/docs/notes/benchmarks.md)).

| Implementation       | Throughput (img/s) |
|----------------------|--------------------|
| [Detectron2](https://github.com/facebookresearch/detectron2) | 61 |
| [MMDetection](https://github.com/open-mmlab/mmdetection) | 60 |
| [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark/)   | 51 |
| [tensorpack](https://github.com/tensorpack/tensorpack/tree/master/examples/FasterRCNN) | 50 |
| [simpledet](https://github.com/TuSimple/simpledet/) | 39 |
| [Detectron](https://github.com/facebookresearch/Detectron) | 19 |
| [matterport/Mask_RCNN](https://github.com/matterport/Mask_RCNN/) | 14 |

## Comparison with Detectron2

We compare mmdetection with [Detectron2](https://github.com/facebookresearch/detectron2.git) in terms of speed and performance.
We use the Detectron2 commit [185c27e](https://github.com/facebookresearch/detectron2/tree/185c27e4b4d2d4c68b5627b3765420c6d7f5a659) (30/4/2020).
For a fair comparison, we install and run both frameworks on the same machine.

### Hardware

- 8 NVIDIA Tesla V100 (32 GB) GPUs
- Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz

### Software environment

- Python 3.7
- PyTorch 1.4
- CUDA 10.1
- cuDNN 7.6.03
- NCCL 2.4.08

### Performance

The performance is reported as box AP on `coco_2017_val` (box AP & mask AP for Mask R-CNN).
<table border="1">
  <tr>
    <th>Type</th>
    <th>Lr schd</th>
    <th>Detectron2</th>
    <th>mmdetection</th>
  </tr>
  <tr>
    <td rowspan="2">Faster R-CNN</td>
    <td>1x</td>
    <td>37.9</td>
    <td>38.0</td>
  </tr>
  <tr>
    <td>3x</td>
    <td>40.2</td>
    <td>-</td>
  </tr>
  <tr>
    <td rowspan="2">Mask R-CNN</td>
    <td>1x</td>
    <td>38.6 &amp; 35.2</td>
    <td>38.8 &amp; 35.4</td>
  </tr>
  <tr>
    <td>3x</td>
    <td>41.0 &amp; 37.2 </td>
    <td>-</td>
  </tr>
  <tr>
    <td rowspan="2">RetinaNet</td>
    <td>1x</td>
    <td>36.5</td>
    <td>37.0</td>
  </tr>
  <tr>
    <td>3x</td>
    <td>37.9</td>
    <td>-</td>
  </tr>
</table>

### Training Speed

The training speed is measured in s/iter; lower is better.

<table border="1">
  <tr>
    <th>Type</th>
    <th>Detectron2</th>
    <th>mmdetection</th>
  </tr>
  <tr>
    <td>Faster R-CNN</td>
    <td>0.210</td>
    <td>0.216</td>
  </tr>
  <tr>
    <td>Mask R-CNN</td>
    <td>0.261</td>
    <td>0.265</td>
  </tr>
  <tr>
    <td>RetinaNet</td>
    <td>0.200</td>
    <td>0.205</td>
  </tr>
</table>
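The s/iter numbers relate directly to the img/s throughput reported in the speed benchmark above: with 2 images per GPU on 8 GPUs, each iteration processes 16 images, so throughput is simply batch size divided by s/iter. A quick check against the Mask R-CNN numbers:

```python
# Convert training speed (s/iter) to throughput (img/s) for a global batch of 16.
BATCH_SIZE = 16  # 2 images per GPU x 8 GPUs

def throughput(sec_per_iter: float) -> float:
    return BATCH_SIZE / sec_per_iter

# Mask R-CNN s/iter from the table above reproduces the img/s benchmark:
print(round(throughput(0.261)))  # Detectron2  -> 61 img/s
print(round(throughput(0.265)))  # mmdetection -> 60 img/s
```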


### Inference Speed

The inference speed is measured in fps (img/s) on a single GPU; higher is better.
To be consistent with Detectron2, we report the pure inference speed (excluding data loading time).
For Mask R-CNN, we exclude the time of RLE encoding in post-processing.
We also include the officially reported speed in parentheses, which is slightly higher than the results measured on our server due to hardware differences.
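A minimal sketch of this measurement protocol, with a dummy function standing in for the detector's forward pass (a real benchmark would also warm up the GPU and preprocess images in advance):

```python
import time

def dummy_forward(image):
    """Placeholder for network forwarding + post-processing on one image."""
    return sum(image)

# Pretend the images are already loaded and preprocessed, so the timed loop
# contains no data-loading time ("pure inference speed").
images = [[float(i)] * 1000 for i in range(50)]

start = time.perf_counter()
for img in images:
    dummy_forward(img)
elapsed = time.perf_counter() - start

fps = len(images) / elapsed  # img/s; higher is better
print(f"{fps:.1f} img/s")
```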

<table border="1">
  <tr>
    <th>Type</th>
    <th>Detectron2</th>
    <th>mmdetection</th>
  </tr>
  <tr>
    <td>Faster R-CNN</td>
    <td>25.6 (26.3)</td>
    <td>22.2</td>
  </tr>
  <tr>
    <td>Mask R-CNN</td>
    <td>22.5 (23.3)</td>
    <td>19.6</td>
  </tr>
  <tr>
    <td>RetinaNet</td>
    <td>17.8 (18.2)</td>
    <td>20.6</td>
  </tr>
</table>

### Training memory

The training memory is reported in GB.

<table border="1">
  <tr>
    <th>Type</th>
    <th>Detectron2</th>
    <th>mmdetection</th>
  </tr>
  <tr>
    <td>Faster R-CNN</td>
    <td>3.0</td>
    <td>3.8</td>
  </tr>
  <tr>
    <td>Mask R-CNN</td>
    <td>3.4</td>
    <td>3.9</td>
  </tr>
  <tr>
    <td>RetinaNet</td>
    <td>3.9</td>
    <td>3.4</td>
  </tr>
</table>