# InternImage for Object Detection

This folder contains the implementation of the InternImage for object detection.

Our detection code is developed on top of [MMDetection v2.28.1](https://github.com/open-mmlab/mmdetection/tree/v2.28.1).

<!-- TOC -->

- [Installation](#installation)
- [Data Preparation](#data-preparation)
- [Released Models](#released-models)
- [Evaluation](#evaluation)
- [Training](#training)
- [Manage Jobs with Slurm](#manage-jobs-with-slurm)
- [Export](#export)

<!-- TOC -->

## Installation

- Clone this repository:

```bash
git clone https://github.com/OpenGVLab/InternImage.git
cd InternImage
```

- Create a conda virtual environment and activate it:

```bash
conda create -n internimage python=3.9
conda activate internimage
```

- Install `CUDA>=10.2` with `cudnn>=7` following
  the [official installation instructions](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html)
- Install `PyTorch>=1.10.0` and `torchvision>=0.9.0` with `CUDA>=10.2`:

For example, to install `torch==1.11` with `CUDA==11.3`:

```bash
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
```

- Install other requirements:

  Note: the conda build of opencv breaks torchvision's GPU support, so install opencv with pip instead.

```bash
conda install -c conda-forge termcolor yacs pyyaml scipy pip -y
pip install opencv-python
```

- Install `timm`, `mmcv-full`, and `mmsegmentation`:

```bash
pip install -U openmim
mim install mmcv-full==1.5.0
mim install mmsegmentation==0.27.0
pip install timm==0.6.11 mmdet==2.28.1
```

- Install the remaining pinned dependencies:

```bash
# Please use a version of numpy lower than 2.0
pip install numpy==1.26.4
pip install pydantic==1.10.13
pip install yapf==0.40.1
```

- Compile the CUDA operators:

Before compiling, please use the `nvcc -V` command to check whether your `nvcc` version matches the CUDA version of PyTorch.
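This check can also be scripted. Below is a small helper (a sketch — the `grep -oP` extraction in the comment assumes GNU grep, and the real invocation requires `nvcc` and `torch` to be installed; the values passed at the end are illustrative):

```shell
# Compare the CUDA release reported by nvcc with the CUDA version PyTorch was built against.
check_cuda_match() {
  if [ "$1" = "$2" ]; then
    echo "CUDA versions match: $1"
  else
    echo "MISMATCH: nvcc reports '$1' but PyTorch was built with '$2'"
  fi
}

# Typical invocation on a real machine:
#   check_cuda_match "$(nvcc -V | grep -oP 'release \K[0-9.]+')" \
#                    "$(python -c 'import torch; print(torch.version.cuda)')"
check_cuda_match "11.3" "11.3"  # illustrative values
```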

```bash
cd ./ops_dcnv3
sh ./make.sh
# unit test (all checks should print True)
python test.py
```

- Alternatively, you can install the operator from the precompiled `.whl` files:
  [DCNv3-1.0-whl](https://github.com/OpenGVLab/InternImage/releases/tag/whl_files)

## Data Preparation

Prepare datasets according to the guidelines in [MMDetection v2.28.1](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/1_exist_data_model.md).
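For COCO, the directory layout expected by the default configs typically looks like this (the `data/coco` root follows the MMDetection convention; adjust `data_root` in the configs if your data lives elsewhere):

```
detection/
├── data/
│   └── coco/
│       ├── annotations/
│       │   ├── instances_train2017.json
│       │   └── instances_val2017.json
│       ├── train2017/
│       └── val2017/
```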

## Released Models

<details open>
<summary> Dataset: COCO </summary>
<br>
<div>

|   method   |    backbone    | schd | box mAP | mask mAP | #param | FLOPs |                             Config                              |                                                                                                          Download                                                                                                          |
| :--------: | :------------: | :--: | :-----: | :------: | :----: | :---: | :-------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| Mask R-CNN | InternImage-T  |  1x  |  47.2   |   42.5   |  49M   | 270G  | [config](./configs/coco/mask_rcnn_internimage_t_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_1x_coco.log.json) |
| Mask R-CNN | InternImage-T  |  3x  |  49.1   |   43.7   |  49M   | 270G  | [config](./configs/coco/mask_rcnn_internimage_t_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_3x_coco.log.json) |
| Mask R-CNN | InternImage-S  |  1x  |  47.8   |   43.3   |  69M   | 340G  | [config](./configs/coco/mask_rcnn_internimage_s_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_1x_coco.log.json) |
| Mask R-CNN | InternImage-S  |  3x  |  49.7   |   44.5   |  69M   | 340G  | [config](./configs/coco/mask_rcnn_internimage_s_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_3x_coco.log.json) |
| Mask R-CNN | InternImage-B  |  1x  |  48.8   |   44.0   |  115M  | 501G  | [config](./configs/coco/mask_rcnn_internimage_b_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_1x_coco.log.json) |
| Mask R-CNN | InternImage-B  |  3x  |  50.3   |   44.8   |  115M  | 501G  | [config](./configs/coco/mask_rcnn_internimage_b_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_3x_coco.log.json) |
|  Cascade   | InternImage-L  |  1x  |  54.9   |   47.7   |  277M  | 1399G |  [config](./configs/coco/cascade_internimage_l_fpn_1x_coco.py)  |                                                          [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_1x_coco.pth)                                                           |
|  Cascade   | InternImage-L  |  3x  |  56.1   |   48.5   |  277M  | 1399G |  [config](./configs/coco/cascade_internimage_l_fpn_3x_coco.py)  |   [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_3x_coco.log.json)   |
|  Cascade   | InternImage-XL |  1x  |  55.3   |   48.1   |  387M  | 1782G | [config](./configs/coco/cascade_internimage_xl_fpn_1x_coco.py)  |  [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_1x_coco.log.json)  |
|  Cascade   | InternImage-XL |  3x  |  56.2   |   48.8   |  387M  | 1782G | [config](./configs/coco/cascade_internimage_xl_fpn_3x_coco.py)  |  [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_3x_coco.log.json)  |

|   method   |     backbone     | schd | box mAP | #param |                                     Config                                     |                                                                                                                         Download                                                                                                                         |
| :--------: | :--------------: | :--: | :-----: | :----: | :----------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
|    DINO    |  InternImage-T   |  1x  |  53.9   |  49M   |  [config](./configs/coco/dino_4scale_internimage_t_1x_coco_layer_wise_lr.py)   |                    [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_t_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_t_1x_coco.json)                    |
|    DINO    |  InternImage-L   |  1x  |  57.6   |  241M  | [config](./configs/coco/dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.log.json) |
|    DINO    | CB-InternImage-H |  1x  |  64.5   | 2.18B  |   [config](./configs/coco/dino_4scale_cbinternimage_h_objects365_coco_ss.py)   |                                                                    [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_cbinternimage_h_objects365_coco.pth)                                                                     |
| DINO (TTA) | CB-InternImage-H |  1x  |  65.0   | 2.18B  |                                       -                                        |                                                                    [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_cbinternimage_h_objects365_coco.pth)                                                                     |

</div>

</details>

<details>
<summary> Dataset: LVIS </summary>
<br>
<div>

| method |     backbone     | minival (ss) | val (ss/ms) | #param |                                       Config                                       |                                                     Download                                                      |
| :----: | :--------------: | :----------: | :---------: | :----: | :--------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------: |
|  DINO  | CB-InternImage-H |     65.8     | 62.3 / 63.2 | 2.18B  | [config](./configs/lvis/dino_4scale_cbinternimage_h_objects365_lvis_minival_ss.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_cbinternimage_h_objects365_lvis.pth) |

</div>

</details>

<details>
<summary> Dataset: OpenImages </summary>
<br>
<div>

| method |     backbone     | mAP (ss) | #param |                                         Config                                         |                                                        Download                                                         |
| :----: | :--------------: | :------: | :----: | :------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------: |
|  DINO  | CB-InternImage-H |   74.1   | 2.18B  | [config](./configs/openimages/dino_4scale_cbinternimage_h_objects365_openimages_ss.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_cbinternimage_h_objects365_openimages.pth) |

</div>

</details>

<details>
<summary> Dataset: VOC 2007 & 2012 </summary>
<br>
<div>

| method |     backbone     | VOC 2007 | VOC 2012 | #param |                                 Config                                  |                                                       Download                                                       |
| :----: | :--------------: | :------: | :------: | :----: | :---------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------: |
|  DINO  | CB-InternImage-H |   94.0   |   97.2   | 2.18B  | [config](./configs/voc/dino_4scale_cbinternimage_h_objects365_voc07.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_cbinternimage_h_objects365_voc0712.pth) |

</div>

</details>

## Evaluation

To evaluate our `InternImage` on COCO val, run:

```bash
sh dist_test.sh <config-file> <checkpoint> <gpu-num> --eval bbox segm
```

For example, to evaluate `InternImage-T` on a single GPU:

```bash
python test.py configs/coco/mask_rcnn_internimage_t_fpn_1x_coco.py pretrained/mask_rcnn_internimage_t_fpn_1x_coco.pth --eval bbox segm
```

For example, to evaluate `InternImage-B` on a single node with 8 GPUs:

```bash
sh dist_test.sh configs/coco/mask_rcnn_internimage_b_fpn_1x_coco.py pretrained/mask_rcnn_internimage_b_fpn_1x_coco.pth 8 --eval bbox segm
```

## Training

To train an `InternImage` on COCO, run:

```bash
sh dist_train.sh <config-file> <gpu-num>
```

For example, to train `InternImage-T` with 8 GPUs on 1 node, run:

```bash
sh dist_train.sh configs/coco/mask_rcnn_internimage_t_fpn_1x_coco.py 8
```

## Manage Jobs with Slurm

For example, to train `InternImage-XL` with 32 GPUs on 4 nodes, run:

```bash
GPUS=32 sh slurm_train.sh <partition> <job-name> configs/coco/cascade_internimage_xl_fpn_3x_coco.py work_dirs/cascade_internimage_xl_fpn_3x_coco
```
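The `GPUS` variable sets the total GPU count; MMDetection-style `slurm_train.sh` scripts typically derive the node allocation from it together with a per-node GPU count (variable names below are illustrative of that convention, not a quote from the script):

```shell
# Illustrative: with 8 GPUs per node, GPUS=32 maps to a 4-node allocation.
GPUS=32
GPUS_PER_NODE=8
NODES=$((GPUS / GPUS_PER_NODE))
echo "srun will request ${NODES} nodes x ${GPUS_PER_NODE} GPUs"
```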

## Export

First, install `mmdeploy`:

```shell
pip install mmdeploy==0.14.0
```

To export a detection model from PyTorch to TensorRT, run:

```shell
MODEL="model_name"
CKPT_PATH="/path/to/model/ckpt.pth"

python deploy.py \
    "./deploy/configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py" \
    "./configs/coco/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.jpg" \
    --work-dir "./work_dirs/mmdet/instance-seg/${MODEL}" \
    --device cuda \
    --dump-info
```

For example, to export `mask_rcnn_internimage_t_fpn_1x_coco` from PyTorch to TensorRT, run:

```shell
MODEL="mask_rcnn_internimage_t_fpn_1x_coco"
CKPT_PATH="/path/to/model/ckpt/mask_rcnn_internimage_t_fpn_1x_coco.pth"

python deploy.py \
    "./deploy/configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py" \
    "./configs/coco/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.jpg" \
    --work-dir "./work_dirs/mmdet/instance-seg/${MODEL}" \
    --device cuda \
    --dump-info
```