Commit 85529f35 authored by unknown's avatar unknown
Browse files

添加openmmlab测试用例

parent b21b0c01
_base_ = './mask_rcnn_hrnetv2p_w40_1x_coco.py'
# learning policy
lr_config = dict(step=[16, 22])
runner = dict(type='EpochBasedRunner', max_epochs=24)
Collections:
- Name: HRNet
Metadata:
Training Data: COCO
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- HRNet
Paper: https://arxiv.org/abs/1904.04514
README: configs/hrnet/README.md
Models:
- Name: faster_rcnn_hrnetv2p_w18_1x_coco
In Collection: HRNet
Config: configs/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco.py
Metadata:
Training Memory (GB): 6.6
inference time (s/im): 0.07463
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 36.9
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco/faster_rcnn_hrnetv2p_w18_1x_coco_20200130-56651a6d.pth
- Name: faster_rcnn_hrnetv2p_w18_2x_coco
In Collection: HRNet
Config: configs/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco.py
Metadata:
Training Memory (GB): 6.6
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 38.9
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco/faster_rcnn_hrnetv2p_w18_2x_coco_20200702_085731-a4ec0611.pth
- Name: faster_rcnn_hrnetv2p_w32_1x_coco
In Collection: HRNet
Config: configs/hrnet/faster_rcnn_hrnetv2p_w32_1x_coco.py
Metadata:
Training Memory (GB): 9.0
inference time (s/im): 0.08065
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 40.2
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w32_1x_coco/faster_rcnn_hrnetv2p_w32_1x_coco_20200130-6e286425.pth
- Name: faster_rcnn_hrnetv2p_w32_2x_coco
In Collection: HRNet
Config: configs/hrnet/faster_rcnn_hrnetv2p_w32_2x_coco.py
Metadata:
Training Memory (GB): 9.0
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 41.4
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w32_2x_coco/faster_rcnn_hrnetv2p_w32_2x_coco_20200529_015927-976a9c15.pth
- Name: faster_rcnn_hrnetv2p_w40_1x_coco
In Collection: HRNet
Config: configs/hrnet/faster_rcnn_hrnetv2p_w40_1x_coco.py
Metadata:
Training Memory (GB): 10.4
inference time (s/im): 0.09524
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 41.2
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w40_1x_coco/faster_rcnn_hrnetv2p_w40_1x_coco_20200210-95c1f5ce.pth
- Name: faster_rcnn_hrnetv2p_w40_2x_coco
In Collection: HRNet
Config: configs/hrnet/faster_rcnn_hrnetv2p_w40_2x_coco.py
Metadata:
Training Memory (GB): 10.4
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 42.1
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w40_2x_coco/faster_rcnn_hrnetv2p_w40_2x_coco_20200512_161033-0f236ef4.pth
- Name: mask_rcnn_hrnetv2p_w18_1x_coco
In Collection: HRNet
Config: configs/hrnet/mask_rcnn_hrnetv2p_w18_1x_coco.py
Metadata:
Training Memory (GB): 7.0
inference time (s/im): 0.08547
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 37.7
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 34.2
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w18_1x_coco/mask_rcnn_hrnetv2p_w18_1x_coco_20200205-1c3d78ed.pth
- Name: mask_rcnn_hrnetv2p_w18_2x_coco
In Collection: HRNet
Config: configs/hrnet/mask_rcnn_hrnetv2p_w18_2x_coco.py
Metadata:
Training Memory (GB): 7.0
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 39.8
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 36.0
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w18_2x_coco/mask_rcnn_hrnetv2p_w18_2x_coco_20200212-b3c825b1.pth
- Name: mask_rcnn_hrnetv2p_w32_1x_coco
In Collection: HRNet
Config: configs/hrnet/mask_rcnn_hrnetv2p_w32_1x_coco.py
Metadata:
Training Memory (GB): 9.4
inference time (s/im): 0.0885
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 41.2
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 37.1
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w32_1x_coco/mask_rcnn_hrnetv2p_w32_1x_coco_20200207-b29f616e.pth
- Name: mask_rcnn_hrnetv2p_w32_2x_coco
In Collection: HRNet
Config: configs/hrnet/mask_rcnn_hrnetv2p_w32_2x_coco.py
Metadata:
Training Memory (GB): 9.4
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 42.5
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 37.8
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w32_2x_coco/mask_rcnn_hrnetv2p_w32_2x_coco_20200213-45b75b4d.pth
- Name: mask_rcnn_hrnetv2p_w40_1x_coco
In Collection: HRNet
Config: configs/hrnet/mask_rcnn_hrnetv2p_w40_1x_coco.py
Metadata:
Training Memory (GB): 10.9
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 42.1
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 37.5
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w40_1x_coco/mask_rcnn_hrnetv2p_w40_1x_coco_20200511_015646-66738b35.pth
- Name: mask_rcnn_hrnetv2p_w40_2x_coco
In Collection: HRNet
Config: configs/hrnet/mask_rcnn_hrnetv2p_w40_2x_coco.py
Metadata:
Training Memory (GB): 10.9
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 42.8
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 38.2
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w40_2x_coco/mask_rcnn_hrnetv2p_w40_2x_coco_20200512_163732-aed5e4ab.pth
- Name: cascade_rcnn_hrnetv2p_w18_20e_coco
In Collection: HRNet
Config: configs/hrnet/cascade_rcnn_hrnetv2p_w18_20e_coco.py
Metadata:
Training Memory (GB): 7.0
inference time (s/im): 0.09091
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 41.2
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_rcnn_hrnetv2p_w18_20e_coco/cascade_rcnn_hrnetv2p_w18_20e_coco_20200210-434be9d7.pth
- Name: cascade_rcnn_hrnetv2p_w32_20e_coco
In Collection: HRNet
Config: configs/hrnet/cascade_rcnn_hrnetv2p_w32_20e_coco.py
Metadata:
Training Memory (GB): 9.4
inference time (s/im): 0.09091
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 43.3
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_rcnn_hrnetv2p_w32_20e_coco/cascade_rcnn_hrnetv2p_w32_20e_coco_20200208-928455a4.pth
- Name: cascade_rcnn_hrnetv2p_w40_20e_coco
In Collection: HRNet
Config: configs/hrnet/cascade_rcnn_hrnetv2p_w40_20e_coco.py
Metadata:
Training Memory (GB): 10.8
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 43.8
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_rcnn_hrnetv2p_w40_20e_coco/cascade_rcnn_hrnetv2p_w40_20e_coco_20200512_161112-75e47b04.pth
- Name: cascade_mask_rcnn_hrnetv2p_w18_20e_coco
In Collection: HRNet
Config: configs/hrnet/cascade_mask_rcnn_hrnetv2p_w18_20e_coco.py
Metadata:
Training Memory (GB): 8.5
inference time (s/im): 0.11765
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 41.6
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 36.4
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_mask_rcnn_hrnetv2p_w18_20e_coco/cascade_mask_rcnn_hrnetv2p_w18_20e_coco_20200210-b543cd2b.pth
- Name: cascade_mask_rcnn_hrnetv2p_w32_20e_coco
In Collection: HRNet
Config: configs/hrnet/cascade_mask_rcnn_hrnetv2p_w32_20e_coco.py
Metadata:
inference time (s/im): 0.12048
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 44.3
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 38.6
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_mask_rcnn_hrnetv2p_w32_20e_coco/cascade_mask_rcnn_hrnetv2p_w32_20e_coco_20200512_154043-39d9cf7b.pth
- Name: cascade_mask_rcnn_hrnetv2p_w40_20e_coco
In Collection: HRNet
Config: configs/hrnet/cascade_mask_rcnn_hrnetv2p_w40_20e_coco.py
Metadata:
Training Memory (GB): 12.5
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 45.1
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 39.3
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_mask_rcnn_hrnetv2p_w40_20e_coco/cascade_mask_rcnn_hrnetv2p_w40_20e_coco_20200527_204922-969c4610.pth
- Name: htc_hrnetv2p_w18_20e_coco
In Collection: HRNet
Config: configs/hrnet/htc_hrnetv2p_w18_20e_coco.py
Metadata:
Training Memory (GB): 10.8
inference time (s/im): 0.21277
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 42.8
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 37.9
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/htc_hrnetv2p_w18_20e_coco/htc_hrnetv2p_w18_20e_coco_20200210-b266988c.pth
- Name: htc_hrnetv2p_w32_20e_coco
In Collection: HRNet
Config: configs/hrnet/htc_hrnetv2p_w32_20e_coco.py
Metadata:
Training Memory (GB): 13.1
inference time (s/im): 0.20408
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 45.4
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 39.9
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/htc_hrnetv2p_w32_20e_coco/htc_hrnetv2p_w32_20e_coco_20200207-7639fa12.pth
- Name: htc_hrnetv2p_w40_20e_coco
In Collection: HRNet
Config: configs/hrnet/htc_hrnetv2p_w40_20e_coco.py
Metadata:
Training Memory (GB): 14.6
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 46.4
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 40.8
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/htc_hrnetv2p_w40_20e_coco/htc_hrnetv2p_w40_20e_coco_20200529_183411-417c4d5b.pth
- Name: fcos_hrnetv2p_w18_gn-head_4x4_1x_coco
In Collection: HRNet
Config: configs/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_1x_coco.py
Metadata:
Training Memory (GB): 13.0
inference time (s/im): 0.07752
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 35.3
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_1x_coco/fcos_hrnetv2p_w18_gn-head_4x4_1x_coco_20201212_100710-4ad151de.pth
- Name: fcos_hrnetv2p_w18_gn-head_4x4_2x_coco
In Collection: HRNet
Config: configs/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_2x_coco.py
Metadata:
Training Memory (GB): 13.0
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 38.2
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_2x_coco/fcos_hrnetv2p_w18_gn-head_4x4_2x_coco_20201212_101110-5c575fa5.pth
- Name: fcos_hrnetv2p_w32_gn-head_4x4_1x_coco
In Collection: HRNet
Config: configs/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_1x_coco.py
Metadata:
Training Memory (GB): 17.5
inference time (s/im): 0.07752
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 39.5
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_1x_coco/fcos_hrnetv2p_w32_gn-head_4x4_1x_coco_20201211_134730-cb8055c0.pth
- Name: fcos_hrnetv2p_w32_gn-head_4x4_2x_coco
In Collection: HRNet
Config: configs/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_2x_coco.py
Metadata:
Training Memory (GB): 17.5
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 40.8
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_2x_coco/fcos_hrnetv2p_w32_gn-head_4x4_2x_coco_20201212_112133-77b6b9bb.pth
- Name: fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco
In Collection: HRNet
Config: configs/hrnet/fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco.py
Metadata:
Training Memory (GB): 13.0
inference time (s/im): 0.07752
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 38.3
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco/fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco_20201212_111651-441e9d9f.pth
- Name: fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco
In Collection: HRNet
Config: configs/hrnet/fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco.py
Metadata:
Training Memory (GB): 17.5
inference time (s/im): 0.08065
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 41.9
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco/fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco_20201212_090846-b6f2b49f.pth
- Name: fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco
In Collection: HRNet
Config: configs/hrnet/fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco.py
Metadata:
Training Memory (GB): 20.3
inference time (s/im): 0.09259
Epochs: 24
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 42.7
Weights: https://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco/fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco_20201212_124752-f22d2ce5.pth
# Hybrid Task Cascade for Instance Segmentation
## Introduction
<!-- [ALGORITHM] -->
We provide config files to reproduce the results in the CVPR 2019 paper for [Hybrid Task Cascade](https://arxiv.org/abs/1901.07518).
```latex
@inproceedings{chen2019hybrid,
title={Hybrid task cascade for instance segmentation},
author={Chen, Kai and Pang, Jiangmiao and Wang, Jiaqi and Xiong, Yu and Li, Xiaoxiao and Sun, Shuyang and Feng, Wansen and Liu, Ziwei and Shi, Jianping and Ouyang, Wanli and Chen Change Loy and Dahua Lin},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
year={2019}
}
```
## Dataset
HTC requires COCO and [COCO-stuff](http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/stuffthingmaps_trainval2017.zip) dataset for training. You need to download and extract it in the COCO dataset path.
The directory should be like this.
```none
mmdetection
├── mmdet
├── tools
├── configs
├── data
│ ├── coco
│ │ ├── annotations
│ │ ├── train2017
│ │ ├── val2017
│ │ ├── test2017
| | ├── stuffthingmaps
```
## Results and Models
The results on COCO 2017val are shown in the below table. (results on test-dev are usually slightly higher than val)
| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:-------:|:------:|:--------:|
| R-50-FPN | pytorch | 1x | 8.2 | 5.8 | 42.3 | 37.4 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/htc/htc_r50_fpn_1x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_r50_fpn_1x_coco/htc_r50_fpn_1x_coco_20200317-7332cf16.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_r50_fpn_1x_coco/htc_r50_fpn_1x_coco_20200317_070435.log.json) |
| R-50-FPN | pytorch | 20e | 8.2 | - | 43.3 | 38.3 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/htc/htc_r50_fpn_20e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_r50_fpn_20e_coco/htc_r50_fpn_20e_coco_20200319-fe28c577.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_r50_fpn_20e_coco/htc_r50_fpn_20e_coco_20200319_070313.log.json) |
| R-101-FPN | pytorch | 20e | 10.2 | 5.5 | 44.8 | 39.6 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/htc/htc_r101_fpn_20e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_r101_fpn_20e_coco/htc_r101_fpn_20e_coco_20200317-9b41b48f.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_r101_fpn_20e_coco/htc_r101_fpn_20e_coco_20200317_153107.log.json) |
| X-101-32x4d-FPN | pytorch |20e| 11.4 | 5.0 | 46.1 | 40.5 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/htc/htc_x101_32x4d_fpn_16x1_20e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_32x4d_fpn_16x1_20e_coco/htc_x101_32x4d_fpn_16x1_20e_coco_20200318-de97ae01.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_32x4d_fpn_16x1_20e_coco/htc_x101_32x4d_fpn_16x1_20e_coco_20200318_034519.log.json) |
| X-101-64x4d-FPN | pytorch |20e| 14.5 | 4.4 | 47.0 | 41.4 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/htc/htc_x101_64x4d_fpn_16x1_20e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_64x4d_fpn_16x1_20e_coco/htc_x101_64x4d_fpn_16x1_20e_coco_20200318-b181fd7a.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_64x4d_fpn_16x1_20e_coco/htc_x101_64x4d_fpn_16x1_20e_coco_20200318_081711.log.json) |
- In the HTC paper and COCO 2018 Challenge, `score_thr` is set to 0.001 for both baselines and HTC.
- We use 8 GPUs with 2 images/GPU for R-50 and R-101 models, and 16 GPUs with 1 image/GPU for X-101 models.
If you would like to train X-101 HTC with 8 GPUs, you need to change the lr from 0.02 to 0.01.
We also provide a powerful HTC with DCN and multi-scale training model. No testing augmentation is used.
| Backbone | Style | DCN | training scales | Lr schd | box AP | mask AP | Config | Download |
|:----------------:|:-------:|:-----:|:---------------:|:-------:|:------:|:-------:|:------:|:--------:|
| X-101-64x4d-FPN | pytorch | c3-c5 | 400~1400 | 20e | 50.4 | 43.8 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/htc/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco_20200312-946fd751.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco_20200312_203410.log.json) |
_base_ = './htc_r50_fpn_1x_coco.py'
model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
# learning policy
lr_config = dict(step=[16, 19])
runner = dict(type='EpochBasedRunner', max_epochs=20)
_base_ = './htc_without_semantic_r50_fpn_1x_coco.py'
model = dict(
roi_head=dict(
semantic_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[8]),
semantic_head=dict(
type='FusedSemanticHead',
num_ins=5,
fusion_level=1,
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=183,
ignore_label=255,
loss_weight=0.2)))
data_root = 'data/coco/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='SegRescale', scale_factor=1 / 8),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_semantic_seg']),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
data = dict(
train=dict(
seg_prefix=data_root + 'stuffthingmaps/train2017/',
pipeline=train_pipeline),
val=dict(pipeline=test_pipeline),
test=dict(pipeline=test_pipeline))
_base_ = './htc_r50_fpn_1x_coco.py'
# learning policy
lr_config = dict(step=[16, 19])
runner = dict(type='EpochBasedRunner', max_epochs=20)
_base_ = [
'../_base_/datasets/coco_instance.py',
'../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]
# model settings
model = dict(
type='HybridTaskCascade',
pretrained='torchvision://resnet50',
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[8],
ratios=[0.5, 1.0, 2.0],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[.0, .0, .0, .0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
roi_head=dict(
type='HybridTaskCascadeRoIHead',
interleaved=True,
mask_info_flow=True,
num_stages=3,
stage_loss_weights=[1, 0.5, 0.25],
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=[
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=80,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
loss_weight=1.0)),
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=80,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.05, 0.05, 0.1, 0.1]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
loss_weight=1.0)),
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=80,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.033, 0.033, 0.067, 0.067]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
],
mask_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
mask_head=[
dict(
type='HTCMaskHead',
with_conv_res=False,
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=80,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
dict(
type='HTCMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=80,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
dict(
type='HTCMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=80,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))
]),
# model training and testing settings
train_cfg=dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=0,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_pre=2000,
max_per_img=2000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=[
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.6,
neg_iou_thr=0.6,
min_pos_iou=0.6,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.7,
min_pos_iou=0.7,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False)
]),
test_cfg=dict(
rpn=dict(
nms_pre=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
score_thr=0.001,
nms=dict(type='nms', iou_threshold=0.5),
max_per_img=100,
mask_thr_binary=0.5)))
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
data = dict(
val=dict(pipeline=test_pipeline), test=dict(pipeline=test_pipeline))
_base_ = './htc_r50_fpn_1x_coco.py'
model = dict(
pretrained='open-mmlab://resnext101_32x4d',
backbone=dict(
type='ResNeXt',
depth=101,
groups=32,
base_width=4,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch'))
data = dict(samples_per_gpu=1, workers_per_gpu=1)
# learning policy
lr_config = dict(step=[16, 19])
runner = dict(type='EpochBasedRunner', max_epochs=20)
_base_ = './htc_r50_fpn_1x_coco.py'
model = dict(
pretrained='open-mmlab://resnext101_64x4d',
backbone=dict(
type='ResNeXt',
depth=101,
groups=64,
base_width=4,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch'))
data = dict(samples_per_gpu=1, workers_per_gpu=1)
# learning policy
lr_config = dict(step=[16, 19])
runner = dict(type='EpochBasedRunner', max_epochs=20)
_base_ = './htc_r50_fpn_1x_coco.py'
model = dict(
pretrained='open-mmlab://resnext101_64x4d',
backbone=dict(
type='ResNeXt',
depth=101,
groups=64,
base_width=4,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch',
dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
stage_with_dcn=(False, True, True, True)))
# dataset settings
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
dict(
type='Resize',
img_scale=[(1600, 400), (1600, 1400)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='SegRescale', scale_factor=1 / 8),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_semantic_seg']),
]
data = dict(
samples_per_gpu=1, workers_per_gpu=1, train=dict(pipeline=train_pipeline))
# learning policy
lr_config = dict(step=[16, 19])
runner = dict(type='EpochBasedRunner', max_epochs=20)
Collections:
- Name: HTC
Metadata:
Training Data: COCO
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- FPN
- HTC
- RPN
- ResNet
- ResNeXt
- RoIAlign
Paper: https://arxiv.org/abs/1901.07518
README: configs/htc/README.md
Models:
- Name: htc_r50_fpn_1x_coco
In Collection: HTC
Config: configs/htc/htc_r50_fpn_1x_coco.py
Metadata:
Training Memory (GB): 8.2
inference time (s/im): 0.17241
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 42.3
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 37.4
Weights: https://download.openmmlab.com/mmdetection/v2.0/htc/htc_r50_fpn_1x_coco/htc_r50_fpn_1x_coco_20200317-7332cf16.pth
- Name: htc_r50_fpn_20e_coco
In Collection: HTC
Config: configs/htc/htc_r50_fpn_20e_coco.py
Metadata:
Training Memory (GB): 8.2
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 43.3
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 38.3
Weights: https://download.openmmlab.com/mmdetection/v2.0/htc/htc_r50_fpn_20e_coco/htc_r50_fpn_20e_coco_20200319-fe28c577.pth
- Name: htc_r101_fpn_20e_coco
In Collection: HTC
Config: configs/htc/htc_r101_fpn_20e_coco.py
Metadata:
Training Memory (GB): 10.2
inference time (s/im): 0.18182
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 44.8
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 39.6
Weights: https://download.openmmlab.com/mmdetection/v2.0/htc/htc_r101_fpn_20e_coco/htc_r101_fpn_20e_coco_20200317-9b41b48f.pth
- Name: htc_x101_32x4d_fpn_16x1_20e_coco
In Collection: HTC
Config: configs/htc/htc_x101_32x4d_fpn_16x1_20e_coco.py
Metadata:
Training Memory (GB): 11.4
inference time (s/im): 0.2
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 46.1
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 40.5
Weights: https://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_32x4d_fpn_16x1_20e_coco/htc_x101_32x4d_fpn_16x1_20e_coco_20200318-de97ae01.pth
- Name: htc_x101_64x4d_fpn_16x1_20e_coco
In Collection: HTC
Config: configs/htc/htc_x101_64x4d_fpn_16x1_20e_coco.py
Metadata:
Training Memory (GB): 14.5
inference time (s/im): 0.22727
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 47.0
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 41.4
Weights: https://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_64x4d_fpn_16x1_20e_coco/htc_x101_64x4d_fpn_16x1_20e_coco_20200318-b181fd7a.pth
- Name: htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco
In Collection: HTC
Config: configs/htc/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco.py
Metadata:
Epochs: 20
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 50.4
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 43.8
Weights: https://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco_20200312-946fd751.pth
# InstaBoost for MMDetection
<!-- [ALGORITHM] -->
Configs in this directory is the implementation for ICCV2019 paper "InstaBoost: Boosting Instance Segmentation Via Probability Map Guided Copy-Pasting" and provided by the authors of the paper. InstaBoost is a data augmentation method for object detection and instance segmentation. The paper has been released on [`arXiv`](https://arxiv.org/abs/1908.07801).
```latex
@inproceedings{fang2019instaboost,
title={Instaboost: Boosting instance segmentation via probability map guided copy-pasting},
author={Fang, Hao-Shu and Sun, Jianhua and Wang, Runzhong and Gou, Minghao and Li, Yong-Lu and Lu, Cewu},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
pages={682--691},
year={2019}
}
```
## Usage
### Requirements
You need to install `instaboostfast` before using it.
```shell
pip install instaboostfast
```
The code and more details can be found [here](https://github.com/GothicAi/Instaboost).
### Integration with MMDetection
InstaBoost have been already integrated in the data pipeline, thus all you need is to add or change **InstaBoost** configurations after **LoadImageFromFile**. We have provided examples like [this](mask_rcnn_r50_fpn_instaboost_4x#L121). You can refer to [`InstaBoostConfig`](https://github.com/GothicAi/InstaBoost-pypi#instaboostconfig) for more details.
## Results and Models
- All models were trained on `coco_2017_train` and tested on `coco_2017_val` for conveinience of evaluation and comparison. In the paper, the results are obtained from `test-dev`.
- To balance accuracy and training time when using InstaBoost, models released in this page are all trained for 48 Epochs. Other training and testing configs strictly follow the original framework.
- For results and models in MMDetection V1.x, please refer to [Instaboost](https://github.com/GothicAi/Instaboost).
| Network | Backbone | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
| :-------------: | :--------: | :-----: | :------: | :------------: | :------:| :-----: | :------: | :-----------------: |
| Mask R-CNN | R-50-FPN | 4x | 4.4 | 17.5 | 40.6 | 36.6 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/instaboost/mask_rcnn_r50_fpn_instaboost_4x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_r50_fpn_instaboost_4x_coco/mask_rcnn_r50_fpn_instaboost_4x_coco_20200307-d025f83a.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_r50_fpn_instaboost_4x_coco/mask_rcnn_r50_fpn_instaboost_4x_coco_20200307_223635.log.json) |
| Mask R-CNN | R-101-FPN | 4x | 6.4 | | 42.5 | 38.0 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/instaboost/mask_rcnn_r101_fpn_instaboost_4x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_r101_fpn_instaboost_4x_coco/mask_rcnn_r101_fpn_instaboost_4x_coco_20200703_235738-f23f3a5f.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_r101_fpn_instaboost_4x_coco/mask_rcnn_r101_fpn_instaboost_4x_coco_20200703_235738.log.json) |
| Mask R-CNN | X-101-64x4d-FPN | 4x | 10.7 | | 44.7 | 39.7 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/instaboost/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco_20200515_080947-8ed58c1b.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco_20200515_080947.log.json) |
| Cascade R-CNN | R-101-FPN | 4x | 6.0 | 12.0 | 43.7 | 38.0 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/instaboost/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/instaboost/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco_20200307-c19d98d9.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/instaboost/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco_20200307_223646.log.json) |
_base_ = './cascade_mask_rcnn_r50_fpn_instaboost_4x_coco.py'
model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
_base_ = '../cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.py'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='InstaBoost',
action_candidate=('normal', 'horizontal', 'skip'),
action_prob=(1, 0, 0),
scale=(0.8, 1.2),
dx=15,
dy=15,
theta=(-1, 1),
color_prob=0.5,
hflag=False,
aug_ratio=0.5),
dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
data = dict(train=dict(pipeline=train_pipeline))
# learning policy
lr_config = dict(step=[32, 44])
runner = dict(type='EpochBasedRunner', max_epochs=48)
_base_ = './cascade_mask_rcnn_r50_fpn_instaboost_4x_coco.py'
model = dict(
pretrained='open-mmlab://resnext101_64x4d',
backbone=dict(
type='ResNeXt',
depth=101,
groups=64,
base_width=4,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
style='pytorch'))
_base_ = './mask_rcnn_r50_fpn_instaboost_4x_coco.py'
model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='InstaBoost',
action_candidate=('normal', 'horizontal', 'skip'),
action_prob=(1, 0, 0),
scale=(0.8, 1.2),
dx=15,
dy=15,
theta=(-1, 1),
color_prob=0.5,
hflag=False,
aug_ratio=0.5),
dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
data = dict(train=dict(pipeline=train_pipeline))
# learning policy
lr_config = dict(step=[32, 44])
runner = dict(type='EpochBasedRunner', max_epochs=48)
_base_ = './mask_rcnn_r50_fpn_instaboost_4x_coco.py'
model = dict(
pretrained='open-mmlab://resnext101_64x4d',
backbone=dict(
type='ResNeXt',
depth=101,
groups=64,
base_width=4,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
style='pytorch'))
Collections:
- Name: InstaBoost
Metadata:
Training Data: COCO
Training Techniques:
- InstaBoost
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Paper: https://arxiv.org/abs/1908.07801
README: configs/instaboost/README.md
Models:
- Name: mask_rcnn_r50_fpn_instaboost_4x_coco
In Collection: InstaBoost
Config: configs/instaboost/mask_rcnn_r50_fpn_instaboost_4x_coco.py
Metadata:
Training Memory (GB): 4.4
inference time (s/im): 0.05714
Epochs: 48
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 40.6
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 36.6
Weights: https://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_r50_fpn_instaboost_4x_coco/mask_rcnn_r50_fpn_instaboost_4x_coco_20200307-d025f83a.pth
- Name: mask_rcnn_r101_fpn_instaboost_4x_coco
In Collection: InstaBoost
Config: configs/instaboost/mask_rcnn_r101_fpn_instaboost_4x_coco.py
Metadata:
Training Memory (GB): 6.4
Epochs: 48
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 42.5
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 38.0
Weights: https://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_r101_fpn_instaboost_4x_coco/mask_rcnn_r101_fpn_instaboost_4x_coco_20200703_235738-f23f3a5f.pth
- Name: mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco
In Collection: InstaBoost
Config: configs/instaboost/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco.py
Metadata:
Training Memory (GB): 10.7
Epochs: 48
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 44.7
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 39.7
Weights: https://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco_20200515_080947-8ed58c1b.pth
- Name: cascade_mask_rcnn_r50_fpn_instaboost_4x_coco
In Collection: InstaBoost
Config: configs/instaboost/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco.py
Metadata:
Training Memory (GB): 6.0
inference time (s/im): 0.08333
Epochs: 48
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 43.7
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 38.0
Weights: https://download.openmmlab.com/mmdetection/v2.0/instaboost/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco_20200307-c19d98d9.pth
# Localization Distillation for Object Detection
## Introduction
<!-- [ALGORITHM] -->
```latex
@Article{zheng2021LD,
title={Localization Distillation for Object Detection},
author= {Zhaohui Zheng, Rongguang Ye, Ping Wang, Jun Wang, Dongwei Ren, Wangmeng Zuo},
journal={arXiv:2102.12252},
year={2021}
}
```
### GFocalV1 with LD
| Teacher | Student | Training schedule | Mini-batch size | AP (val) | AP50 (val) | AP75 (val) | Config |
| :-------: | :-----: | :---------------: | :-------------: | :------: | :--------: | :--------: | :--------------: |
| -- | R-18 | 1x | 6 | 35.8 | 53.1 | 38.2 | |
| R-101 | R-18 | 1x | 6 | 36.5 | 52.9 | 39.3 | [config](https://github.com/open-mmlab/mmdetection/blob/master/configs/ld/ld_r18_gflv1_r101_fpn_coco_1x.py) |
| -- | R-34 | 1x | 6 | 38.9 | 56.6 | 42.2 | |
| R-101 | R-34 | 1x | 6 | 39.8 | 56.6 | 43.1 | [config](https://github.com/open-mmlab/mmdetection/blob/master/configs/ld/ld_r34_gflv1_r101_fpn_coco_1x.py) |
| -- | R-50 | 1x | 6 | 40.1 | 58.2 | 43.1 | |
| R-101 | R-50 | 1x | 6 | 41.1 | 58.7 | 44.9 | [config](https://github.com/open-mmlab/mmdetection/blob/master/configs/ld/ld_r50_gflv1_r101_fpn_coco_1x.py) |
| -- | R-101 | 2x | 6 | 44.6 | 62.9 | 48.4 | |
| R-101-DCN | R-101 | 2x | 6 | 45.4 | 63.1 | 49.5 | [config](https://github.com/open-mmlab/mmdetection/blob/master/configs/ld/ld_r101_gflv1_r101dcn_fpn_coco_1x.py) |
## Note
- Meaning of Config name: ld_r18(student model)_gflv1(based on gflv1)_r101(teacher model)_fpn(neck)_coco(dataset)_1x(12 epoch).py
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment