# mmdetection

## Introduction

mmdetection is an open-source object detection toolbox based on PyTorch. It is
part of the open-mmlab project developed by [Multimedia Laboratory, CUHK](http://mmlab.ie.cuhk.edu.hk/).

![demo image](demo/coco_test_12510.jpg)

### Major features

- **Modular Design**

  One can easily construct a customized object detection framework by combining different components.

- **Support for multiple frameworks out of the box**

  The toolbox directly supports popular detection frameworks, *e.g.* Faster R-CNN, Mask R-CNN, and RetinaNet.

- **Efficient**

  All basic bbox and mask operations now run on GPUs.
  Training is about 5% to 20% faster than Detectron, depending on the model.

- **State of the art**

  This was the codebase of the *MMDet* team, who won the [COCO Detection 2018 challenge](http://cocodataset.org/#detection-leaderboard).

Apart from mmdetection, we also released [mmcv](https://github.com/open-mmlab/mmcv), a library for computer vision research,
which this toolbox depends on heavily.

## License

This project is released under the [Apache 2.0 license](LICENSE).

## Updates

v0.5.3 (26/11/2018)
- Add Cascade R-CNN and Cascade Mask R-CNN.

v0.5.2 (21/10/2018)
- Add support for custom datasets.
- Add a script to convert PASCAL VOC annotations to the expected format.

v0.5.1 (20/10/2018)
- Add BBoxAssigner and BBoxSampler; the `train_cfg` field in config files is restructured.
- `ConvFCRoIHead` / `SharedFCRoIHead` are renamed to `ConvFCBBoxHead` / `SharedFCBBoxHead` for consistency.

## Benchmark and model zoo

We provide our baseline results and a comparison with Detectron, one of the most
popular detection projects. Results and models are available in the [Model zoo](MODEL_ZOO.md).

## Installation

### Requirements

- Linux (tested on Ubuntu 16.04 and CentOS 7.2)
- Python 3.4+
- PyTorch 0.4.1 and torchvision
- Cython
- [mmcv](https://github.com/open-mmlab/mmcv)
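
Before installing, it can help to check the requirements above from Python. The following is a minimal sketch, not part of mmdetection: the helper name `missing_requirements` is hypothetical, and it only checks the Python version and whether each package can be found, not the PyTorch version.

```python
import importlib.util
import sys

# Import names of the packages listed above (PyTorch is imported as "torch").
REQUIRED_MODULES = ["torch", "torchvision", "Cython", "mmcv"]


def missing_requirements(modules=REQUIRED_MODULES, min_python=(3, 4)):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < min_python:
        problems.append("Python %d.%d+ is required" % min_python)
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            problems.append("module '%s' is not installed" % name)
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```

An empty output means all listed packages were found; mmcv will still be reported as missing until it is installed.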

### Install mmdetection

a. Install PyTorch 0.4.1 and torchvision following the [official instructions](https://pytorch.org/).

b. Clone the mmdetection repository.

```shell
git clone https://github.com/open-mmlab/mmdetection.git
```

c. Compile the CUDA extensions.

```shell
cd mmdetection
pip install cython  # or "conda install cython" if you prefer conda
./compile.sh  # or "PYTHON=python3 ./compile.sh" if you use system python3 without virtual environments
```

d. Install mmdetection (other dependencies will be installed automatically).

```shell
python(3) setup.py install  # add --user if you want to install it locally
# or "pip install ."
```

Note: Rerun the last step each time you pull updates from GitHub.
The git commit id will be written to the version number and also saved in trained models.

### Prepare COCO dataset

It is recommended to symlink the dataset root to `$MMDETECTION/data`.

```
mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017

```
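
The symlink can also be created from Python. This is a minimal sketch under assumptions: `link_dataset_root` is a hypothetical helper (not part of mmdetection), and the dataset root is assumed to contain the `coco` directory shown in the layout above.

```python
import os

# Sub-directories shown in the layout above, relative to the dataset root.
EXPECTED_DIRS = [
    os.path.join("coco", "annotations"),
    os.path.join("coco", "train2017"),
    os.path.join("coco", "val2017"),
    os.path.join("coco", "test2017"),
]


def link_dataset_root(dataset_root, mmdet_root):
    """Create <mmdet_root>/data as a symlink to dataset_root.

    Returns the expected sub-directories that are missing behind the link.
    """
    link_path = os.path.join(mmdet_root, "data")
    # lexists is also true for a dangling symlink, so an existing link is kept.
    if not os.path.lexists(link_path):
        os.symlink(os.path.abspath(dataset_root), link_path, target_is_directory=True)
    return [d for d in EXPECTED_DIRS if not os.path.isdir(os.path.join(link_path, d))]


# Example with placeholder paths:
# missing = link_dataset_root("/path/to/datasets", "/path/to/mmdetection")
```

An empty return value means the layout matches the tree above.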

> [Here](https://gist.github.com/hellock/bf23cd7348c727d69d48682cb6909047) is
a script for setting up mmdetection with conda for reference.

## Inference with pretrained models

### Test a dataset

- [x] single GPU testing
- [x] multiple GPU testing
- [x] visualize detection results

One or more processes can run on each GPU, e.g. 8 processes on 8 GPUs
or 16 processes on 8 GPUs. When a single process does not fully load a GPU,
running multiple processes per GPU speeds up testing. Set the number of processes
per GPU with the argument `--proc_per_gpu <PROCESS_NUM>`.


To test a dataset and save the results:

```shell
python tools/test.py <CONFIG_FILE> <CHECKPOINT_FILE> --gpus <GPU_NUM> --out <OUT_FILE>
```

To perform evaluation after testing, add `--eval <EVAL_TYPES>`. Supported types are:

- proposal_fast: evaluate proposal recalls with our own code (expected to give the same results as the official evaluation).
- proposal: evaluate proposal recalls with the official code provided by COCO.
- bbox: evaluate box AP with the official code provided by COCO.
- segm: evaluate mask AP with the official code provided by COCO.
- keypoints: evaluate keypoint AP with the official code provided by COCO.

For example, to evaluate Mask R-CNN with 8 GPUs and save the results as `results.pkl`:

```shell
python tools/test.py configs/mask_rcnn_r50_fpn_1x.py <CHECKPOINT_FILE> --gpus 8 --out results.pkl --eval bbox segm
```
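
When testing is driven from a script, the same command line can be assembled programmatically. The sketch below uses only the flags described in this section (`--gpus`, `--proc_per_gpu`, `--out`, `--eval`); the helper `build_test_command` and the checkpoint name `checkpoint.pth` are hypothetical placeholders.

```python
import shlex


def build_test_command(config, checkpoint, gpus, out=None, eval_types=(), proc_per_gpu=None):
    """Assemble the argument list for tools/test.py from the options above."""
    cmd = ["python", "tools/test.py", config, checkpoint, "--gpus", str(gpus)]
    if proc_per_gpu is not None:
        cmd += ["--proc_per_gpu", str(proc_per_gpu)]
    if out is not None:
        cmd += ["--out", out]
    if eval_types:
        cmd += ["--eval"] + list(eval_types)
    return cmd


cmd = build_test_command(
    "configs/mask_rcnn_r50_fpn_1x.py",
    "checkpoint.pth",  # placeholder for <CHECKPOINT_FILE>
    gpus=8,
    out="results.pkl",
    eval_types=["bbox", "segm"],
    proc_per_gpu=2,  # 2 processes per GPU, i.e. 16 processes on 8 GPUs
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` from the repository root.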

You can also visualize the results during testing by adding the `--show` argument.

```shell
python tools/test.py <CONFIG_FILE> <CHECKPOINT_FILE> --show
```

### Test image(s)

We provide some high-level APIs (experimental) to test an image.

```python
import mmcv
from mmcv.runner import load_checkpoint
from mmdet.models import build_detector
from mmdet.apis import inference_detector, show_result

cfg = mmcv.Config.fromfile('configs/faster_rcnn_r50_fpn_1x.py')
cfg.model.pretrained = None

# construct the model and load checkpoint
model = build_detector(cfg.model, test_cfg=cfg.test_cfg)
_ = load_checkpoint(model, 'https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth')

# test a single image
img = mmcv.imread('test.jpg')
result = inference_detector(model, img, cfg)
show_result(img, result)

# test a list of images
imgs = ['test1.jpg', 'test2.jpg']
for i, result in enumerate(inference_detector(model, imgs, cfg, device='cuda:0')):
    print(i, imgs[i])
    show_result(imgs[i], result)
```

## Train a model

mmdetection implements distributed training and non-distributed training,
which use `MMDistributedDataParallel` and `MMDataParallel` respectively.

### Distributed training

mmdetection potentially supports multiple launch methods, e.g., PyTorch’s built-in launch utility, Slurm, and MPI.

We provide a training script using the launch utility provided by PyTorch.

```shell
./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]
```

Supported arguments are:

- `--validate`: perform evaluation every k (default=1) epochs during training.
- `--work_dir <WORK_DIR>`: if specified, the path in the config file will be overridden.

### Non-distributed training

```shell
python tools/train.py <CONFIG_FILE> --gpus <GPU_NUM> --work_dir <WORK_DIR> --validate
```

Expected results in WORK_DIR:

- log file
- saved checkpoints (every k epochs, default=1)
- a symbolic link to the latest checkpoint

> **Note**
> 1. We recommend using distributed training with NCCL2 even on a single machine, since it is faster. Non-distributed training is intended for debugging or other purposes.
> 2. The default learning rate is for 8 GPUs. If you use fewer or more than 8 GPUs, set the learning rate proportional to the number of GPUs, e.g. change lr to 0.01 for 4 GPUs or 0.04 for 16 GPUs.
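
The proportional rule in the note can be written as a one-line computation. This is an illustrative sketch: `scale_lr` is a hypothetical helper, and the base value of 0.02 for 8 GPUs is inferred from the examples in the note (0.01 for 4 GPUs, 0.04 for 16 GPUs), so check the config file you actually use.

```python
def scale_lr(num_gpus, base_lr=0.02, base_gpus=8):
    """Scale the learning rate linearly with the number of GPUs.

    base_lr and base_gpus are assumptions inferred from the note above.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return base_lr * num_gpus / base_gpus


for gpus in (4, 8, 16):
    print(gpus, scale_lr(gpus))
```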

### Train on custom datasets

We define a simple annotation format.

The annotation of a dataset is a list of dicts, where each dict corresponds to an image.
Three fields, `filename` (relative path), `width`, and `height`, are required for testing,
and an additional field `ann` is required for training. `ann` is also a dict containing at least 2 fields,
`bboxes` and `labels`, both of which are numpy arrays. Some datasets provide
annotations such as crowd/difficult/ignored bboxes; we use `bboxes_ignore` and `labels_ignore`
to cover them.

Here is an example.
```
[
    {
        'filename': 'a.jpg',
        'width': 1280,
        'height': 720,
        'ann': {
            'bboxes': <np.ndarray> (n, 4),
            'labels': <np.ndarray> (n, ),
            'bboxes_ignore': <np.ndarray> (k, 4),
            'labels_ignore': <np.ndarray> (k, ) (optional field)
        }
    },
    ...
]
```

There are two ways to work with custom datasets.

- online conversion

  You can write a new Dataset class inherited from `CustomDataset`, and override two methods
  `load_annotations(self, ann_file)` and `get_ann_info(self, idx)`, like [CocoDataset](mmdet/datasets/coco.py).

- offline conversion

  You can convert the annotation format to the expected format above and save it to
  a pickle file, like [pascal_voc.py](tools/convert_datasets/pascal_voc.py).
  Then you can simply use `CustomDataset`.
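
For the offline route, the expected format can be produced with numpy and the standard `pickle` module. The following is a minimal sketch under assumptions: `make_annotation` is a hypothetical helper, the image names and box values are made up, and the array dtypes (`float32` boxes, `int64` labels) and the coordinate ordering of each box are not specified above, so check them against your dataset class.

```python
import pickle

import numpy as np


def make_annotation(filename, width, height, boxes, labels):
    """Build one per-image dict in the annotation format described above."""
    return {
        "filename": filename,
        "width": width,
        "height": height,
        "ann": {
            # reshape keeps the (n, 4) shape even when an image has no boxes
            "bboxes": np.array(boxes, dtype=np.float32).reshape(-1, 4),
            "labels": np.array(labels, dtype=np.int64),
        },
    }


# Made-up example data: one image with two boxes, one image with none.
annotations = [
    make_annotation("a.jpg", 1280, 720, [[10, 20, 200, 300], [50, 60, 400, 500]], [1, 2]),
    make_annotation("b.jpg", 640, 480, [], []),
]

if __name__ == "__main__":
    # The output file name is a placeholder.
    with open("custom_annotations.pkl", "wb") as f:
        pickle.dump(annotations, f)
```

The resulting pickle file is the kind of annotation file the offline route expects to pass to `CustomDataset`.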

## Technical details

Implementation details and the project structure are described in the [technical details](TECHNICAL_DETAILS.md).

## Citation

If you use our codebase or models in your research, please cite this project.
We will release a paper or technical report later.

```
@misc{mmdetection2018,
  author =       {Kai Chen and Jiangmiao Pang and Jiaqi Wang and Yu Xiong and Xiaoxiao Li
                  and Shuyang Sun and Wansen Feng and Ziwei Liu and Jianping Shi and
                  Wanli Ouyang and Chen Change Loy and Dahua Lin},
  title =        {mmdetection},
  howpublished = {\url{https://github.com/open-mmlab/mmdetection}},
  year =         {2018}
}
```