# InternImage for Object Detection

This folder contains the object detection implementation of InternImage.

Our detection code is developed on top of [MMDetection v2.28.1](https://github.com/open-mmlab/mmdetection/tree/v2.28.1).

<!-- TOC -->

- [Installation](#installation)
- [Data Preparation](#data-preparation)
- [Released Models](#released-models)
- [Evaluation](#evaluation)
- [Training](#training)
- [Manage Jobs with Slurm](#manage-jobs-with-slurm)
- [Export](#export)

<!-- TOC -->

## Installation

- Clone this repository:

```bash
git clone https://github.com/OpenGVLab/InternImage.git
cd InternImage
```

- Create a conda virtual environment and activate it:

```bash
conda create -n internimage python=3.9
conda activate internimage
```

- Install `CUDA>=10.2` with `cudnn>=7` following
  the [official installation instructions](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html)
- Install `PyTorch>=1.10.0` and `torchvision>=0.9.0` with `CUDA>=10.2`:

For example, to install `torch==1.11` with `CUDA==11.3`:

```bash
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
```

- Install other requirements:

  Note: the conda build of OpenCV breaks torchvision's GPU support, so install OpenCV via pip.

```bash
conda install -c conda-forge termcolor yacs pyyaml scipy pip -y
pip install opencv-python
```

- Install `timm`, `mmcv-full`, and `mmsegmentation`:

```bash
pip install -U openmim
mim install mmcv-full==1.5.0
mim install mmsegmentation==0.27.0
pip install timm==0.6.11 mmdet==2.28.1
```

- Install the remaining pinned requirements:

```bash
# NumPy must stay below 2.0 for this codebase
pip install numpy==1.26.4
pip install pydantic==1.10.13
pip install yapf==0.40.1
```

- Compile the CUDA operators:

Before compiling, please use the `nvcc -V` command to check whether your `nvcc` version matches the CUDA version of PyTorch.

```bash
cd ./ops_dcnv3
sh ./make.sh
# unit test (all checks should report True)
python test.py
```
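
If compilation fails, a quick way to perform the version check mentioned above (a sketch; it assumes `python` resolves to the activated conda environment):

```bash
# The nvcc release should match the CUDA version PyTorch was built against.
NVCC_CUDA=$(nvcc -V 2>/dev/null | grep -o 'release [0-9.]*' || echo "nvcc not found")
TORCH_CUDA=$(python -c "import torch; print(torch.version.cuda)" 2>/dev/null || echo "torch not found")
echo "nvcc:    ${NVCC_CUDA}"
echo "pytorch: ${TORCH_CUDA}"
```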

- Alternatively, you can install the operator from the precompiled `.whl` files:
  [DCNv3-1.0-whl](https://github.com/OpenGVLab/InternImage/releases/tag/whl_files)

## Data Preparation

Prepare datasets according to the guidelines in [MMDetection v2.28.1](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/1_exist_data_model.md).
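
COCO data is expected under `data/` in the standard MMDetection layout. A minimal sketch of the expected skeleton (symlink an existing COCO copy rather than re-downloading, if you have one):

```bash
# Create the expected directory skeleton; place or symlink the COCO files inside:
# annotations/instances_{train,val}2017.json, train2017/, val2017/
mkdir -p data/coco/annotations data/coco/train2017 data/coco/val2017
```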

## Released Models

<details open>
<summary> Dataset: COCO </summary>
<br>
<div>

|   method   |    backbone    | schd | box mAP | mask mAP | #param | FLOPs |                             Config                              |                                                                                                          Download                                                                                                          |
| :--------: | :------------: | :--: | :-----: | :------: | :----: | :---: | :-------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| Mask R-CNN | InternImage-T  |  1x  |  47.2   |   42.5   |  49M   | 270G  | [config](./configs/coco/mask_rcnn_internimage_t_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_1x_coco.log.json) |
| Mask R-CNN | InternImage-T  |  3x  |  49.1   |   43.7   |  49M   | 270G  | [config](./configs/coco/mask_rcnn_internimage_t_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_3x_coco.log.json) |
| Mask R-CNN | InternImage-S  |  1x  |  47.8   |   43.3   |  69M   | 340G  | [config](./configs/coco/mask_rcnn_internimage_s_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_1x_coco.log.json) |
| Mask R-CNN | InternImage-S  |  3x  |  49.7   |   44.5   |  69M   | 340G  | [config](./configs/coco/mask_rcnn_internimage_s_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_3x_coco.log.json) |
| Mask R-CNN | InternImage-B  |  1x  |  48.8   |   44.0   |  115M  | 501G  | [config](./configs/coco/mask_rcnn_internimage_b_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_1x_coco.log.json) |
| Mask R-CNN | InternImage-B  |  3x  |  50.3   |   44.8   |  115M  | 501G  | [config](./configs/coco/mask_rcnn_internimage_b_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_3x_coco.log.json) |
|  Cascade   | InternImage-L  |  1x  |  54.9   |   47.7   |  277M  | 1399G |  [config](./configs/coco/cascade_internimage_l_fpn_1x_coco.py)  |                                                          [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_1x_coco.pth)                                                           |
|  Cascade   | InternImage-L  |  3x  |  56.1   |   48.5   |  277M  | 1399G |  [config](./configs/coco/cascade_internimage_l_fpn_3x_coco.py)  |   [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_3x_coco.log.json)   |
|  Cascade   | InternImage-XL |  1x  |  55.3   |   48.1   |  387M  | 1782G | [config](./configs/coco/cascade_internimage_xl_fpn_1x_coco.py)  |  [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_1x_coco.log.json)  |
|  Cascade   | InternImage-XL |  3x  |  56.2   |   48.8   |  387M  | 1782G | [config](./configs/coco/cascade_internimage_xl_fpn_3x_coco.py)  |  [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_3x_coco.log.json)  |

|   method   |     backbone     | schd | box mAP | #param |                                     Config                                     |                                                                                                                         Download                                                                                                                         |
| :--------: | :--------------: | :--: | :-----: | :----: | :----------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
|    DINO    |  InternImage-T   |  1x  |  53.9   |  49M   |  [config](./configs/coco/dino_4scale_internimage_t_1x_coco_layer_wise_lr.py)   |                    [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_t_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_t_1x_coco.json)                    |
|    DINO    |  InternImage-L   |  1x  |  57.6   |  241M  | [config](./configs/coco/dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.log.json) |
|    DINO    |  InternImage-H   |  1x  |  63.4   |  1.1B  |    [config](./configs/coco/dino_4scale_internimage_h_objects365_coco_ss.py)    |                                                                     [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_h_objects365_coco.pth)                                                                      |
|    DINO    | CB-InternImage-H |  1x  |  64.5   |  2.2B  |   [config](./configs/coco/dino_4scale_cbinternimage_h_objects365_coco_ss.py)   |                                                                    [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_cbinternimage_h_objects365_coco.pth)                                                                     |
| DINO (TTA) | CB-InternImage-H |  1x  |  65.0   |  2.2B  |                                      TODO                                      |                                                                    [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_cbinternimage_h_objects365_coco.pth)                                                                     |
|    DINO    |  InternImage-G   |  1x  |  64.2   |  3.1B  |    [config](./configs/coco/dino_4scale_internimage_g_objects365_coco_ss.py)    |                                                                     [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_g_objects365_coco.pth)                                                                      |
| DINO (TTA) | CB-InternImage-G |  1x  |  65.1   |   6B   |                                      TODO                                      |                                                                                                                           TODO                                                                                                                           |
| DINO (TTA) | CB-InternImage-G |  1x  |  65.3   |   6B   |                                      TODO                                      |                                                                                                                           TODO                                                                                                                           |

</div>

</details>
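
To use one of these checkpoints locally, download it into `pretrained/`, which is where the evaluation commands below expect it (URL taken from the table above; `wget` can be swapped for `curl -LO`):

```bash
mkdir -p pretrained
CKPT_URL="https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_1x_coco.pth"
wget -q -P pretrained/ "${CKPT_URL}" || echo "download failed; check network access"
```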

<details>
<summary> Dataset: LVIS </summary>
<br>
<div>

| method |     backbone     | minival (ss) | val (ss/ms) | #param |                                       Config                                       |                                                     Download                                                      |
| :----: | :--------------: | :----------: | :---------: | :----: | :--------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------: |
|  DINO  | CB-InternImage-H |     65.8     | 62.3 / 63.2 | 2.18B  | [config](./configs/lvis/dino_4scale_cbinternimage_h_objects365_lvis_minival_ss.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_cbinternimage_h_objects365_lvis.pth) |

</div>

</details>

<details>
<summary> Dataset: OpenImages </summary>
<br>
<div>

| method |     backbone     | mAP (ss) | #param |                                         Config                                         |                                                        Download                                                         |
| :----: | :--------------: | :------: | :----: | :------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------: |
|  DINO  | CB-InternImage-H |   74.1   | 2.18B  | [config](./configs/openimages/dino_4scale_cbinternimage_h_objects365_openimages_ss.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_cbinternimage_h_objects365_openimages.pth) |

</div>

</details>

<details>
<summary> Dataset: VOC 2007 & 2012 </summary>
<br>
<div>

| method |     backbone     | VOC 2007 | VOC 2012 | #param |                                 Config                                  |                                                       Download                                                       |
| :----: | :--------------: | :------: | :------: | :----: | :---------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------: |
|  DINO  | CB-InternImage-H |   94.0   |   97.2   | 2.18B  | [config](./configs/voc/dino_4scale_cbinternimage_h_objects365_voc07.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_cbinternimage_h_objects365_voc0712.pth) |

</div>

</details>

## Evaluation

To evaluate our `InternImage` on COCO val, run:

```bash
sh dist_test.sh <config-file> <checkpoint> <gpu-num> --eval bbox segm
```

For example, to evaluate `InternImage-T` on a single GPU:

```bash
python test.py configs/coco/mask_rcnn_internimage_t_fpn_1x_coco.py pretrained/mask_rcnn_internimage_t_fpn_1x_coco.pth --eval bbox segm
```

For example, to evaluate `InternImage-B` on a single node with 8 GPUs:

```bash
sh dist_test.sh configs/coco/mask_rcnn_internimage_b_fpn_1x_coco.py pretrained/mask_rcnn_internimage_b_fpn_1x_coco.pth 8 --eval bbox segm
```

## Training

To train an `InternImage` on COCO, run:

```bash
sh dist_train.sh <config-file> <gpu-num>
```

For example, to train `InternImage-T` with 8 GPUs on 1 node, run:

```bash
sh dist_train.sh configs/coco/mask_rcnn_internimage_t_fpn_1x_coco.py 8
```

## Manage Jobs with Slurm

For example, to train `InternImage-XL` with 32 GPUs on 4 nodes, run:

```bash
GPUS=32 sh slurm_train.sh <partition> <job-name> configs/coco/cascade_internimage_xl_fpn_3x_coco.py work_dirs/cascade_internimage_xl_fpn_3x_coco
```
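
The `GPUS=32` prefix is an environment-variable override read by the launcher script. Other knobs in MMDetection-style `slurm_train.sh` scripts can be set the same way; a sketch, noting that variable names other than `GPUS` follow the upstream MMDetection script and may differ in your copy:

```bash
GPUS=32 GPUS_PER_NODE=8 CPUS_PER_TASK=5 \
    sh slurm_train.sh <partition> <job-name> \
    configs/coco/cascade_internimage_xl_fpn_3x_coco.py \
    work_dirs/cascade_internimage_xl_fpn_3x_coco
```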

## Export

First, install `mmdeploy`:

```shell
pip install mmdeploy==0.14.0
```

To export a detection model from PyTorch to TensorRT, run:

```shell
MODEL="model_name"
CKPT_PATH="/path/to/model/ckpt.pth"

python deploy.py \
    "./deploy/configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py" \
    "./configs/coco/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.jpg" \
    --work-dir "./work_dirs/mmdet/instance-seg/${MODEL}" \
    --device cuda \
    --dump-info
```

For example, to export `mask_rcnn_internimage_t_fpn_1x_coco` from PyTorch to TensorRT, run:

```shell
MODEL="mask_rcnn_internimage_t_fpn_1x_coco"
CKPT_PATH="/path/to/model/ckpt/mask_rcnn_internimage_t_fpn_1x_coco.pth"

python deploy.py \
    "./deploy/configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py" \
    "./configs/coco/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.jpg" \
    --work-dir "./work_dirs/mmdet/instance-seg/${MODEL}" \
    --device cuda \
    --dump-info
```