# mmdetection

## Introduction

The master branch works with **PyTorch 1.0**. If you would like to use PyTorch 0.4.1,
please check out the [pytorch-0.4.1](https://github.com/open-mmlab/mmdetection/tree/pytorch-0.4.1) branch.

mmdetection is an open source object detection toolbox based on PyTorch. It is
a part of the open-mmlab project developed by [Multimedia Laboratory, CUHK](http://mmlab.ie.cuhk.edu.hk/).

![demo image](demo/coco_test_12510.jpg)

### Major features

- **Modular Design**

  One can easily construct a customized object detection framework by combining different components.

- **Support of multiple frameworks out of the box**

  The toolbox directly supports popular detection frameworks, *e.g.* Faster RCNN, Mask RCNN, RetinaNet, etc.

- **Efficient**

  All basic bbox and mask operations now run on GPUs.
  Training is about 5% to 20% faster than Detectron, depending on the model.

- **State of the art**

  This was the codebase of the *MMDet* team, who won the [COCO Detection 2018 challenge](http://cocodataset.org/#detection-leaderboard).

Apart from mmdetection, we also released [mmcv](https://github.com/open-mmlab/mmcv), a library for computer vision research
on which this toolbox heavily depends.

## License

This project is released under the [Apache 2.0 license](LICENSE).

## Updates

v0.6.0 (14/04/2019)
- Up to 30% speedup compared to the model zoo.
- Support both PyTorch stable and nightly versions.
- Replace NMS and SigmoidFocalLoss with PyTorch CUDA extensions.

v0.6rc0 (06/02/2019)
- Migrate to PyTorch 1.0.

v0.5.7 (06/02/2019)
- Add support for Deformable ConvNet v2. (Many thanks to the authors and [@chengdazhi](https://github.com/chengdazhi))
- This is the last release based on PyTorch 0.4.1.

v0.5.6 (17/01/2019)
- Add support for Group Normalization.
- Unify RPNHead and single stage heads (RetinaHead, SSDHead) with AnchorHead.

v0.5.5 (22/12/2018)
- Add SSD for COCO and PASCAL VOC.
- Add ResNeXt backbones and detection models.
- Refactor Samplers/Assigners and add OHEM.
- Add VOC dataset and evaluation scripts.

v0.5.4 (27/11/2018)
- Add SingleStageDetector and RetinaNet.

v0.5.3 (26/11/2018)
- Add Cascade R-CNN and Cascade Mask R-CNN.
- Add support for Soft-NMS in config files.

v0.5.2 (21/10/2018)
- Add support for custom datasets.
- Add a script to convert PASCAL VOC annotations to the expected format.

v0.5.1 (20/10/2018)
- Add BBoxAssigner and BBoxSampler; the `train_cfg` field in config files is restructured.
- `ConvFCRoIHead` / `SharedFCRoIHead` are renamed to `ConvFCBBoxHead` / `SharedFCBBoxHead` for consistency.

## Benchmark and model zoo

Supported methods and backbones are shown in the table below.
Results and models are available in the [Model zoo](MODEL_ZOO.md).

|                    | ResNet   | ResNeXt  | SENet    | VGG      |
|--------------------|:--------:|:--------:|:--------:|:--------:|
| RPN                | ✓        | ✓        | ☐        | ✗        |
| Fast R-CNN         | ✓        | ✓        | ☐        | ✗        |
| Faster R-CNN       | ✓        | ✓        | ☐        | ✗        |
| Mask R-CNN         | ✓        | ✓        | ☐        | ✗        |
| Cascade R-CNN      | ✓        | ✓        | ☐        | ✗        |
| Cascade Mask R-CNN | ✓        | ✓        | ☐        | ✗        |
| SSD                | ✗        | ✗        | ✗        | ✓        |
| RetinaNet          | ✓        | ✓        | ☐        | ✗        |
| Hybrid Task Cascade| ✓        | ✓        | ☐        | ✗        |

Other features
- [x] DCNv2
- [x] Group Normalization
- [x] Weight Standardization
- [x] OHEM
- [x] Soft-NMS


## Installation

Please refer to [INSTALL.md](INSTALL.md) for installation and dataset preparation.


## Inference with pretrained models

### Test a dataset

- [x] single GPU testing
- [x] multiple GPU testing
- [x] visualize detection results

You can run one or more processes on each GPU, e.g. 8 processes on 8 GPUs
or 16 processes on 8 GPUs. When the GPU workload of a single process is not very heavy,
running multiple processes per GPU speeds up testing. Set the number of processes per GPU
with the argument `--proc_per_gpu <PROCESS_NUM>`.


To test a dataset and save the results:

```shell
python tools/test.py <CONFIG_FILE> <CHECKPOINT_FILE> --gpus <GPU_NUM> --out <OUT_FILE>
```

To perform evaluation after testing, add `--eval <EVAL_TYPES>`. Supported types are:
`[proposal_fast, proposal, bbox, segm, keypoints]`.
`proposal_fast` evaluates proposal recalls with our own implementation;
the others evaluate the corresponding metric with the official COCO API.

For example, to evaluate Mask R-CNN with 8 GPUs and save the result as `results.pkl`:

```shell
python tools/test.py configs/mask_rcnn_r50_fpn_1x.py <CHECKPOINT_FILE> --gpus 8 --out results.pkl --eval bbox segm
```
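
Once testing finishes, the saved file can be inspected from Python. The sketch below assumes that `results.pkl` is a plain pickle with one entry per image, where each entry is either a per-class list of `(n, 5)` arrays with rows `[x1, y1, x2, y2, score]` or, for mask models, a `(bbox_result, segm_result)` tuple. That layout is an assumption about this version of the toolbox, and `count_detections` is a hypothetical helper, not part of mmdetection. A small stand-in file with synthetic values is written first so the snippet runs on its own.

```python
import os
import pickle
import tempfile


def count_detections(result, score_thr=0.5):
    """Count boxes per class whose score is at least score_thr.

    Assumed layout (not confirmed by the README): `result` is either a
    per-class list of rows [x1, y1, x2, y2, score], or a tuple
    (bbox_result, segm_result) for mask models.
    """
    bbox_result = result[0] if isinstance(result, tuple) else result
    return [sum(1 for row in dets if row[4] >= score_thr)
            for dets in bbox_result]


# Synthetic stand-in for the file written by `--out`:
# two images, two classes each, masks omitted (None).
fake_results = [
    ([[[10, 10, 50, 50, 0.9], [20, 20, 60, 60, 0.3]], []], None),
    ([[], [[5, 5, 30, 30, 0.7]]], None),
]
out_file = os.path.join(tempfile.mkdtemp(), 'results.pkl')
with open(out_file, 'wb') as f:
    pickle.dump(fake_results, f)

# Load the results back and summarise them image by image.
with open(out_file, 'rb') as f:
    results = pickle.load(f)
counts = [count_detections(r) for r in results]
print(counts)
```

With real output, the inner rows would be NumPy arrays rather than lists; the helper only indexes and iterates, so it should work unchanged under the stated assumption.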

You can also visualize the results during testing by adding the `--show` argument.

```shell
python tools/test.py <CONFIG_FILE> <CHECKPOINT_FILE> --show
```

### Test image(s)

We provide some high-level APIs (experimental) to test an image.

```python
import mmcv
from mmcv.runner import load_checkpoint
from mmdet.models import build_detector
from mmdet.apis import inference_detector, show_result

cfg = mmcv.Config.fromfile('configs/faster_rcnn_r50_fpn_1x.py')
cfg.model.pretrained = None

# construct the model and load checkpoint
model = build_detector(cfg.model, test_cfg=cfg.test_cfg)
_ = load_checkpoint(model, 'https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth')

# test a single image
img = mmcv.imread('test.jpg')
result = inference_detector(model, img, cfg)
show_result(img, result)

# test a list of images
imgs = ['test1.jpg', 'test2.jpg']
for i, result in enumerate(inference_detector(model, imgs, cfg, device='cuda:0')):
    print(i, imgs[i])
    show_result(imgs[i], result)
```


## Train a model

mmdetection implements distributed training and non-distributed training,
which use `MMDistributedDataParallel` and `MMDataParallel` respectively.

### Distributed training (single or multiple machines)

mmdetection potentially supports multiple launch methods, e.g., PyTorch’s built-in launch utility, Slurm, and MPI.

We provide a training script using the launch utility provided by PyTorch.

```shell
./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]
```

Supported arguments are:

- `--validate`: perform evaluation every k epochs (default: 1) during training.
- `--work_dir <WORK_DIR>`: if specified, it replaces the path set in the config file.

Expected results in WORK_DIR:

- log file
- saved checkpoints (every k epochs, default=1)
- a symbolic link to the latest checkpoint

**Important**: The default learning rate is for 8 GPUs. If you use fewer or more than 8 GPUs, you need to set the learning rate proportional to the number of GPUs, e.g., change lr to 0.01 for 4 GPUs or 0.04 for 16 GPUs.
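
The scaling rule above amounts to a one-line calculation. A minimal sketch, assuming a base learning rate of 0.02 for 8 GPUs (the value implied by the 0.01 and 0.04 examples); `scaled_lr` is a hypothetical helper, not an mmdetection function:

```python
def scaled_lr(num_gpus, base_lr=0.02, base_gpus=8):
    """Scale the learning rate linearly with the number of GPUs.

    base_lr=0.02 for base_gpus=8 is an assumption inferred from the
    examples in the text, not a value read from a config file.
    """
    if num_gpus < 1:
        raise ValueError('num_gpus must be a positive integer')
    return base_lr * num_gpus / base_gpus


# Print the learning rate for a few GPU counts.
for n in (4, 8, 16):
    print(n, scaled_lr(n))
```

The resulting value would then replace the learning rate in the config file before launching training.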

### Non-distributed training

Please refer to `tools/train.py` for non-distributed training, which is not recommended
and left for debugging. Even on a single machine, distributed training is preferred.

### Train on custom datasets

We define a simple annotation format.

The annotation of a dataset is a list of dicts, where each dict corresponds to an image.
There are 3 fields, `filename` (relative path), `width`, and `height`, for testing,
and an additional field `ann` for training. `ann` is also a dict containing at least 2 fields,
`bboxes` and `labels`, both of which are numpy arrays. Some datasets may provide
annotations such as crowd/difficult/ignored bboxes; we use `bboxes_ignore` and `labels_ignore`
to cover them.

Here is an example.
```
[
    {
        'filename': 'a.jpg',
        'width': 1280,
        'height': 720,
        'ann': {
            'bboxes': <np.ndarray> (n, 4),
            'labels': <np.ndarray> (n, ),
            'bboxes_ignore': <np.ndarray> (k, 4),
            'labels_ignore': <np.ndarray> (k, ) (optional field)
        }
    },
    ...
]
```
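
To make the format concrete, the sketch below builds one annotation entry with NumPy and checks it against the shapes listed above. The `check_annotation` helper and the sample values are hypothetical and only illustrate the layout; the dtypes (`float32` boxes, `int64` labels) are likewise an assumption, since the format description does not fix them.

```python
import numpy as np


def check_annotation(info, training=True):
    """Raise ValueError if `info` does not match the annotation format above."""
    for key in ('filename', 'width', 'height'):
        if key not in info:
            raise ValueError('missing field: ' + key)
    if not training:
        return  # `ann` is only required for training
    ann = info['ann']
    bboxes, labels = ann['bboxes'], ann['labels']
    if bboxes.ndim != 2 or bboxes.shape[1] != 4:
        raise ValueError('bboxes must have shape (n, 4)')
    if labels.shape != (bboxes.shape[0],):
        raise ValueError('labels must have shape (n,)')
    ignore = ann.get('bboxes_ignore')
    if ignore is not None and (ignore.ndim != 2 or ignore.shape[1] != 4):
        raise ValueError('bboxes_ignore must have shape (k, 4)')


# One image with two boxes and no ignored boxes (illustrative values).
sample = {
    'filename': 'a.jpg',
    'width': 1280,
    'height': 720,
    'ann': {
        'bboxes': np.array([[10, 20, 200, 300], [50, 60, 400, 500]],
                           dtype=np.float32),
        'labels': np.array([1, 3], dtype=np.int64),
        'bboxes_ignore': np.zeros((0, 4), dtype=np.float32),
    },
}
check_annotation(sample)  # passes silently for a well-formed entry
```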

There are two ways to work with custom datasets.

- online conversion

  You can write a new Dataset class that inherits from `CustomDataset` and override two methods,
  `load_annotations(self, ann_file)` and `get_ann_info(self, idx)`, like [CocoDataset](mmdet/datasets/coco.py) and [VOCDataset](mmdet/datasets/voc.py).

- offline conversion

  You can convert the annotation format to the expected format above and save it to
  a pickle or json file, like [pascal_voc.py](tools/convert_datasets/pascal_voc.py).
  Then you can simply use `CustomDataset`.
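
An offline conversion can be as small as the sketch below. It assumes a hypothetical raw format, a list of `(filename, width, height, boxes, labels)` tuples, and writes the converted annotations to a pickle file. `convert_records` and the file name `custom_train.pkl` are illustrative names, not part of the toolbox, and the chosen dtypes are an assumption.

```python
import os
import pickle
import tempfile

import numpy as np


def convert_records(records):
    """Convert (filename, width, height, boxes, labels) tuples to the
    annotation format described above."""
    annotations = []
    for filename, width, height, boxes, labels in records:
        annotations.append({
            'filename': filename,
            'width': width,
            'height': height,
            'ann': {
                # reshape keeps the (n, 4) shape even for images without boxes
                'bboxes': np.array(boxes, dtype=np.float32).reshape(-1, 4),
                'labels': np.array(labels, dtype=np.int64),
                # no ignored boxes in this hypothetical raw format
                'bboxes_ignore': np.zeros((0, 4), dtype=np.float32),
            },
        })
    return annotations


# Hypothetical raw records: one image with a box, one without.
raw = [
    ('a.jpg', 1280, 720, [[10, 20, 200, 300]], [1]),
    ('b.jpg', 640, 480, [], []),
]
annotations = convert_records(raw)

# Save the converted annotations to a pickle file.
ann_file = os.path.join(tempfile.mkdtemp(), 'custom_train.pkl')
with open(ann_file, 'wb') as f:
    pickle.dump(annotations, f)
```

The path of the saved file would then serve as the annotation file for `CustomDataset` in the config.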

## Technical details

Some implementation details and project structures are described in the [technical details](TECHNICAL_DETAILS.md).

## Citation

If you use our codebase or models in your research, please cite this project.
We will release a paper or technical report later.

```
@misc{mmdetection2018,
  author =       {Kai Chen and Jiangmiao Pang and Jiaqi Wang and Yu Xiong and Xiaoxiao Li
                  and Shuyang Sun and Wansen Feng and Ziwei Liu and Jianping Shi and
                  Wanli Ouyang and Chen Change Loy and Dahua Lin},
  title =        {mmdetection},
  howpublished = {\url{https://github.com/open-mmlab/mmdetection}},
  year =         {2018}
}
```