Commit 41b18fd8 authored by zhe chen

Use pre-commit to reformat code


parent ff20ea39
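The edits below (import sorting, double-to-single quote normalization, added trailing newlines) match common pre-commit hooks. The repository's actual hook configuration is not part of this diff, so the following `.pre-commit-config.yaml` is an illustrative assumption, not the repo's file:

```yaml
# Hypothetical .pre-commit-config.yaml -- the actual hooks are not shown
# in this commit; these are the standard hooks that produce such edits.
repos:
  - repo: https://github.com/PyCQA/isort
    rev: 5.12.0
    hooks:
      - id: isort                       # sorts and groups imports
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: double-quote-string-fixer   # rewrites "..." to '...'
      - id: end-of-file-fixer           # adds missing trailing newlines
```

After `pip install pre-commit`, running `pre-commit run --all-files` applies the hooks to the whole tree, which is the kind of one-shot reformat a commit like this typically comes from.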
@@ -4,17 +4,15 @@
 # Licensed under The MIT License [see LICENSE for details]
 # --------------------------------------------------------
-from __future__ import absolute_import
-from __future__ import print_function
-from __future__ import division
-import math
+from __future__ import absolute_import, division, print_function
 import time
 import torch
 import torch.nn as nn
+import math
-from torch.autograd import gradcheck
 from functions.dcnv3_func import DCNv3Function, dcnv3_core_pytorch
+from torch.autograd import gradcheck
 H_in, W_in = 8, 8
 N, M, D = 2, 4, 16
...
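`test.py` verifies the CUDA operator with `torch.autograd.gradcheck`, which compares analytic gradients against finite differences in double precision. The mechanism can be sketched without torch (a toy scalar function, not the DCNv3 test itself):

```python
# Minimal, torch-free illustration of what torch.autograd.gradcheck does:
# compare an analytic gradient against central finite differences.
def numeric_grad(f, x, eps=1e-6):
    """Central-difference gradient of scalar f at point x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        grad.append((f(xp) - f(xm)) / (2 * eps))
    return grad

def gradcheck_like(f, analytic_grad, x, tol=1e-4):
    """True if the analytic and numeric gradients agree within tol."""
    return all(abs(a - n) <= tol
               for a, n in zip(analytic_grad(x), numeric_grad(f, x)))

# f(x) = sum(x_i^2) has gradient 2x, so the check passes:
f = lambda x: sum(v * v for v in x)
grad_f = lambda x: [2 * v for v in x]
print(gradcheck_like(f, grad_f, [1.0, -2.0, 0.5]))  # True
```

The real test runs this comparison on `DCNv3Function` against the reference `dcnv3_core_pytorch`, which is why "all checking is True" is the expected output.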
@@ -35,7 +35,7 @@ def build_optimizer(config, model):
     optimizer = None
     use_zero = config.TRAIN.OPTIMIZER.USE_ZERO
     if use_zero:
-        print(f"\nUse Zero!")
+        print(f'\nUse Zero!')
     if opt_lower == 'sgd':
         # an ugly implementation
         # this problem is fixed after torch 1.12
@@ -119,7 +119,7 @@ def set_weight_decay_and_lr(
         if f'levels.{i}' in name:
             param.requires_grad = False
         # 1. check wd
-        if len(param.shape) == 1 or name.endswith(".bias") or (
+        if len(param.shape) == 1 or name.endswith('.bias') or (
                 name in skip_list) or check_keywords_in_name(
                 name, skip_keywords):
            wd = 0.
...
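The condition reformatted above, from `set_weight_decay_and_lr`, disables weight decay for 1-D parameters, biases, explicitly skipped names, and keyword matches. A torch-free sketch of the same decision logic (the parameter names below are illustrative, not taken from the repo):

```python
def should_skip_weight_decay(name, shape, skip_list=(), skip_keywords=()):
    """Mirror the check in the diff: no weight decay for 1-D tensors
    (norm scales), biases, skipped names, or names containing keywords."""
    if len(shape) == 1 or name.endswith('.bias') or name in skip_list:
        return True
    return any(kw in name for kw in skip_keywords)

# Hypothetical (name, shape) pairs standing in for model.named_parameters():
params = [
    ('levels.0.conv.weight', (64, 3, 3, 3)),   # 4-D weight -> decayed
    ('levels.0.conv.bias',   (64,)),           # bias       -> not decayed
    ('levels.0.norm.weight', (64,)),           # 1-D        -> not decayed
    ('head.weight',          (1000, 64)),      # 2-D weight -> decayed
]
no_decay = [n for n, s in params if should_skip_weight_decay(n, s)]
print(no_decay)  # ['levels.0.conv.bias', 'levels.0.norm.weight']
```

In the real optimizer builder, the two groups become separate param groups, with `weight_decay=0` on the no-decay group, so regularization never shrinks norm scales or biases.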
@@ -4,13 +4,15 @@
 # Licensed under The MIT License [see LICENSE for details]
 # --------------------------------------------------------
-import os
 import math
-import torch
+import os
-from collections import OrderedDict
 import numpy as np
+import torch
 import torch.distributed as dist
+from collections import OrderedDict
 from timm.utils import get_state_dict
 try:
     # noinspection PyUnresolvedReferences
     from apex import amp
...
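The `try`/`except` around the apex import is the usual optional-dependency pattern: the module loads whether or not NVIDIA apex is installed, and downstream code branches on availability. A minimal sketch (the helper below is hypothetical, not from the repo):

```python
try:
    # noinspection PyUnresolvedReferences
    from apex import amp  # optional NVIDIA mixed-precision package
    HAS_APEX = True
except ImportError:
    amp = None
    HAS_APEX = False

def collect_params(optimizer):
    """Hypothetical helper: gather parameters from the right place
    depending on whether apex amp is available."""
    if HAS_APEX:
        # apex keeps fp32 master copies of the weights
        return list(amp.master_params(optimizer))
    # plain torch optimizers expose params via param_groups
    return [p for group in optimizer.param_groups for p in group['params']]
```

The same shape of fallback lets checkpoint-saving code include or skip `amp.state_dict()` without a hard dependency on apex.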
# InternImage for Object Detection

This folder contains the implementation of InternImage for object detection.
Our detection code is developed on top of [MMDetection v2.28.1](https://github.com/open-mmlab/mmdetection/tree/v2.28.1).

## Usage

### Install
@@ -28,6 +27,7 @@ conda activate internimage
- Install `PyTorch>=1.10.0` and `torchvision>=0.9.0` with `CUDA>=10.2`:

  For example, to install torch==1.11 with CUDA==11.3:

  ```bash
  pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
  ```
@@ -47,12 +47,14 @@ pip install opencv-python termcolor yacs pyyaml scipy
```
- Compile CUDA operators

  ```bash
  cd ./ops_dcnv3
  sh ./make.sh
  # unit test (should see all checking is True)
  python test.py
  ```
- You can also install the operator using the .whl files:
  [DCNv3-1.0-whl](https://github.com/OpenGVLab/InternImage/releases/tag/whl_files)
@@ -61,7 +63,6 @@ python test.py

Prepare COCO according to the guidelines in [MMDetection v2.28.1](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/1_exist_data_model.md).

### Evaluation

To evaluate our `InternImage` on COCO val, run:
@@ -107,6 +108,7 @@ GPUS=32 sh slurm_train.sh <partition> <job-name> configs/coco/cascade_internimag

### Export

To export a detection model from PyTorch to TensorRT, run:

```shell
MODEL="model_name"
CKPT_PATH="/path/to/model/ckpt.pth"
@@ -122,6 +124,7 @@ python deploy.py \
```

For example, to export `mask_rcnn_internimage_t_fpn_1x_coco` from PyTorch to TensorRT, run:

```shell
MODEL="mask_rcnn_internimage_t_fpn_1x_coco"
CKPT_PATH="/path/to/model/ckpt/mask_rcnn_internimage_t_fpn_1x_coco.pth"
...
@@ -46,4 +46,4 @@ data = dict(
         ann_file=data_root + 'annotations/instances_val2017.json',
         img_prefix=data_root + 'val2017/',
         pipeline=test_pipeline))
-evaluation = dict(interval=1, metric='bbox', classwise=True)
\ No newline at end of file
+evaluation = dict(interval=1, metric='bbox', classwise=True)
@@ -125,4 +125,4 @@ model = dict(
             score_thr=0.05,
             nms=dict(type='nms', iou_threshold=0.5),
             max_per_img=100,
-            mask_thr_binary=0.5)))
\ No newline at end of file
+            mask_thr_binary=0.5)))
# COCO

## Introduction

Introduced by Lin et al. in [Microsoft COCO: Common Objects in Context](https://arxiv.org/pdf/1405.0312v3.pdf)
@@ -11,19 +10,18 @@ Splits: The first version of MS COCO dataset was released in 2014. It contains 1
Based on community feedback, in 2017 the training/validation split was changed from 83K/41K to 118K/5K. The new split uses the same images and annotations. The 2017 test set is a subset of 41K images of the 2015 test set. Additionally, the 2017 release contains a new unannotated dataset of 123K images.

## Model Zoo

### Mask R-CNN + InternImage
| backbone | schd | box mAP | mask mAP | train speed | train time | #param | FLOPs | Config | Download |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| InternImage-T | 1x | 47.2 | 42.5 | 0.36s / iter | 9h | 49M | 270G | [config](./mask_rcnn_internimage_t_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_1x_coco.log.json) |
| InternImage-T | 3x | 49.1 | 43.7 | 0.34s / iter | 26h | 49M | 270G | [config](./mask_rcnn_internimage_t_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_3x_coco.log.json) |
| InternImage-S | 1x | 47.8 | 43.3 | 0.40s / iter | 10h | 69M | 340G | [config](./mask_rcnn_internimage_s_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_1x_coco.log.json) |
| InternImage-S | 3x | 49.7 | 44.5 | 0.40s / iter | 30h | 69M | 340G | [config](./mask_rcnn_internimage_s_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_3x_coco.log.json) |
| InternImage-B | 1x | 48.8 | 44.0 | 0.45s / iter | 11.5h | 115M | 501G | [config](./mask_rcnn_internimage_b_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_1x_coco.log.json) |
| InternImage-B | 3x | 50.3 | 44.8 | 0.45s / iter | 34h | 115M | 501G | [config](./mask_rcnn_internimage_b_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_3x_coco.log.json) |
- Training speed is measured on A100 GPUs with the current code and may be faster than the speed recorded in the logs.
- Some logs come from recently retrained models, so results may differ slightly from those in the paper.
@@ -31,22 +29,21 @@ Based on community feedback, in 2017 the training/validation split was changed f

### Cascade Mask R-CNN + InternImage
| backbone | schd | box mAP | mask mAP | train speed | train time | #param | FLOPs | Config | Download |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| InternImage-L | 1x | 54.9 | 47.7 | 0.73s / iter | 18h | 277M | 1399G | [config](./cascade_internimage_l_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_1x_coco.pth) |
| InternImage-L | 3x | 56.1 | 48.5 | 0.79s / iter | 15h (4n) | 277M | 1399G | [config](./cascade_internimage_l_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_3x_coco.log.json) |
| InternImage-XL | 1x | 55.3 | 48.1 | 0.82s / iter | 21h | 387M | 1782G | [config](./cascade_internimage_xl_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_1x_coco.log.json) |
| InternImage-XL | 3x | 56.2 | 48.8 | 0.91s / iter | 17h (4n) | 387M | 1782G | [config](./cascade_internimage_xl_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_3x_coco.log.json) |
- Training speed is measured on A100 GPUs with the current code and may be faster than the speed recorded in the logs.
- Some logs come from recently retrained models, so results may differ slightly from those in the paper.
- Set `with_cp=True` to save memory if you encounter out-of-memory issues.
### DINO + InternImage
| backbone | lr type | pretrain | schd | box mAP | train time | #param | Config | Download |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| InternImage-T | layer-wise lr | ImageNet-1K | 1x | 53.9 | 9.5h | 49M | [config](./dino_4scale_internimage_t_1x_coco_layer_wise_lr.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_t_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_t_1x_coco.json) |
| InternImage-L | layer-wise lr | ImageNet-22K | 1x | 57.5 | 18h | 241M | [config](./dino_4scale_internimage_l_1x_coco_layer_wise_lr.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_layer_wise_lr.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_layer_wise_lr.log.json) |
| InternImage-L | 0.1x backbone lr | ImageNet-22K | 1x | 57.6 | 18h | 241M | [config](./dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.log.json) |
@@ -106,4 +106,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -106,4 +106,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -175,4 +175,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -174,4 +174,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -174,4 +174,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -46,4 +46,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -89,4 +89,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -46,4 +46,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -89,4 +89,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -46,4 +46,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -47,4 +47,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -89,4 +89,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
# CrowdHuman

## Introduction

Introduced by Shao et al. in [CrowdHuman: A Benchmark for Detecting Human in a Crowd](https://arxiv.org/pdf/1805.00123.pdf)
@@ -8,6 +7,7 @@ Introduced by Shao et al. in [CrowdHuman: A Benchmark for Detecting Human in a C
CrowdHuman is a benchmark dataset for evaluating detectors in crowd scenarios. The dataset is large, richly annotated, and highly diverse. CrowdHuman contains 15,000, 4,370, and 5,000 images for training, validation, and testing, respectively. There are a total of 470K human instances in the train and validation subsets, with an average of 23 persons per image and various kinds of occlusion. Each human instance is annotated with a head bounding box, a human visible-region bounding box, and a human full-body bounding box. The authors hope the dataset will serve as a solid baseline and help promote future research in human detection tasks.

## Prepare the data

Download the original dataset from [CrowdHuman](https://www.crowdhuman.org/download.html), then convert the annotations with detection/tools/create_crowd_anno.py.

- The data tree of CrowdHuman should look like:
@@ -25,16 +25,15 @@ Download the original dataset from [CrowdHuman](https://www.crowdhuman.org/downl
├── 1074488,79d54000c6f9d9e5.jpg
└── ...
```

## Model Zoo
### Cascade Mask R-CNN + InternImage

| backbone | schd | box mAP | mask mAP | train speed | train time | #param | FLOPs | Config | Download |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| InternImage-XL | 3x | TBD | TBD | TBD | TBD | TBD | TBD | [config](./cascade_internimage_xl_fpn_3x_crowd_human.py) | TBD |
- Training speed is measured on A100 GPUs with the current code and may be faster than the speed recorded in the logs.
- Some logs come from recently retrained models, so results may differ slightly from those in the paper.