Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
ModelZoo
SOLOv2-pytorch
Commits
2017c81e
Commit
2017c81e
authored
Dec 20, 2018
by
Kai Chen
Browse files
Merge branch 'master' into pytorch-1.0
parents
c4408812
6594f862
Changes
48
Show whitespace changes
Inline
Side-by-side
Showing
20 changed files
with
2394 additions
and
21 deletions
+2394
-21
MODEL_ZOO.md
MODEL_ZOO.md
+27
-0
README.md
README.md
+10
-10
TECHNICAL_DETAILS.md
TECHNICAL_DETAILS.md
+2
-2
configs/cascade_mask_rcnn_x101_32x4d_fpn_1x.py
configs/cascade_mask_rcnn_x101_32x4d_fpn_1x.py
+229
-0
configs/cascade_mask_rcnn_x101_64x4d_fpn_1x.py
configs/cascade_mask_rcnn_x101_64x4d_fpn_1x.py
+229
-0
configs/cascade_rcnn_x101_32x4d_fpn_1x.py
configs/cascade_rcnn_x101_32x4d_fpn_1x.py
+212
-0
configs/cascade_rcnn_x101_64x4d_fpn_1x.py
configs/cascade_rcnn_x101_64x4d_fpn_1x.py
+212
-0
configs/faster_rcnn_ohem_r50_fpn_1x.py
configs/faster_rcnn_ohem_r50_fpn_1x.py
+156
-0
configs/faster_rcnn_x101_32x4d_fpn_1x.py
configs/faster_rcnn_x101_32x4d_fpn_1x.py
+158
-0
configs/faster_rcnn_x101_64x4d_fpn_1x.py
configs/faster_rcnn_x101_64x4d_fpn_1x.py
+158
-0
configs/mask_rcnn_x101_32x4d_fpn_1x.py
configs/mask_rcnn_x101_32x4d_fpn_1x.py
+170
-0
configs/mask_rcnn_x101_64x4d_fpn_1x.py
configs/mask_rcnn_x101_64x4d_fpn_1x.py
+170
-0
configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py
configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py
+157
-0
configs/retinanet_x101_32x4d_fpn_1x.py
configs/retinanet_x101_32x4d_fpn_1x.py
+122
-0
configs/retinanet_x101_64x4d_fpn_1x.py
configs/retinanet_x101_64x4d_fpn_1x.py
+122
-0
configs/rpn_x101_32x4d_fpn_1x.py
configs/rpn_x101_32x4d_fpn_1x.py
+123
-0
configs/rpn_x101_64x4d_fpn_1x.py
configs/rpn_x101_64x4d_fpn_1x.py
+123
-0
mmdet/apis/train.py
mmdet/apis/train.py
+8
-4
mmdet/core/bbox/assign_sampling.py
mmdet/core/bbox/assign_sampling.py
+4
-4
mmdet/core/bbox/samplers/__init__.py
mmdet/core/bbox/samplers/__init__.py
+2
-1
No files found.
MODEL_ZOO.md
View file @
2017c81e
...
@@ -46,6 +46,10 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
...
@@ -46,6 +46,10 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
| R-101-FPN | caffe | 1x | 7.4 | 0.513 | 11.1 | 59.4 | - |
| R-101-FPN | caffe | 1x | 7.4 | 0.513 | 11.1 | 59.4 | - |
| R-101-FPN | pytorch | 1x | 8.0 | 0.552 | 11.1 | 58.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_r101_fpn_1x_20181129-f50da4bd.pth
)
|
| R-101-FPN | pytorch | 1x | 8.0 | 0.552 | 11.1 | 58.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_r101_fpn_1x_20181129-f50da4bd.pth
)
|
| R-101-FPN | pytorch | 2x | 8.0 | 0.552 | 11.1 | 59.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_r101_fpn_2x_20181129-e42c6c9a.pth
)
|
| R-101-FPN | pytorch | 2x | 8.0 | 0.552 | 11.1 | 59.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_r101_fpn_2x_20181129-e42c6c9a.pth
)
|
| X-101-32x4d-FPN | pytorch |1x | 9.9 | 0.691 | 8.3 | 59.4 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_x101_32x4d_fpn_1x_20181218-7e379d26.pth
)
| X-101-32x4d-FPN | pytorch |2x | 9.9 | 0.691 | 8.3 | 59.9 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_x101_32x4d_fpn_2x_20181218-0510af40.pth
)
| X-101-64x4d-FPN | pytorch |1x | 14.6 | 1.032 | 6.2 | 59.8 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_x101_64x4d_fpn_1x_20181218-c1a24f1f.pth
)
| X-101-64x4d-FPN | pytorch |2x | 14.6 | 1.032 | 6.2 | 60.0 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/rpn_x101_64x4d_fpn_2x_20181218-c22bdd70.pth
)
### Faster R-CNN
### Faster R-CNN
...
@@ -57,6 +61,10 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
...
@@ -57,6 +61,10 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
| R-101-FPN | caffe | 1x | 7.4 | 0.663 | 8.4 | 38.8 | - |
| R-101-FPN | caffe | 1x | 7.4 | 0.663 | 8.4 | 38.8 | - |
| R-101-FPN | pytorch | 1x | 8.0 | 0.698 | 8.3 | 38.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r101_fpn_1x_20181129-d1468807.pth
)
|
| R-101-FPN | pytorch | 1x | 8.0 | 0.698 | 8.3 | 38.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r101_fpn_1x_20181129-d1468807.pth
)
|
| R-101-FPN | pytorch | 2x | 8.0 | 0.698 | 8.3 | 39.4 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r101_fpn_2x_20181129-73e7ade7.pth
)
|
| R-101-FPN | pytorch | 2x | 8.0 | 0.698 | 8.3 | 39.4 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r101_fpn_2x_20181129-73e7ade7.pth
)
|
| X-101-32x4d-FPN | pytorch | 1x| 9.9 | 0.842 | 7.0 | 40.2 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_x101_32x4d_fpn_1x_20181218-ad81c133.pth
)
| X-101-32x4d-FPN | pytorch | 2x| 9.9 | 0.842 | 7.0 | 40.5 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_x101_32x4d_fpn_2x_20181218-0ed58946.pth
)
| X-101-64x4d-FPN | pytorch | 1x| 14.1 | 1.181 | 5.2 | 41.3 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_x101_64x4d_fpn_1x_20181218-c9c69c8f.pth
)
| X-101-64x4d-FPN | pytorch | 2x| 14.1 | 1.181 | 5.2 | 40.7 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_x101_64x4d_fpn_2x_20181218-fe94f9b8.pth
)
### Mask R-CNN
### Mask R-CNN
...
@@ -68,6 +76,10 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
...
@@ -68,6 +76,10 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
| R-101-FPN | caffe | 1x | 8.8 | 0.791 | 7.0 | 39.9 | 36.1 | - |
| R-101-FPN | caffe | 1x | 8.8 | 0.791 | 7.0 | 39.9 | 36.1 | - |
| R-101-FPN | pytorch | 1x | 9.1 | 0.825 | 6.7 | 39.4 | 35.9 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r101_fpn_1x_20181129-34ad1961.pth
)
|
| R-101-FPN | pytorch | 1x | 9.1 | 0.825 | 6.7 | 39.4 | 35.9 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r101_fpn_1x_20181129-34ad1961.pth
)
|
| R-101-FPN | pytorch | 2x | 9.1 | 0.825 | 6.7 | 40.4 | 36.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r101_fpn_2x_20181129-a254bdfc.pth
)
|
| R-101-FPN | pytorch | 2x | 9.1 | 0.825 | 6.7 | 40.4 | 36.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r101_fpn_2x_20181129-a254bdfc.pth
)
|
| X-101-32x4d-FPN | pytorch | 1x| 10.9 | 0.972 | 5.8 | 41.2 | 37.2 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_x101_32x4d_fpn_1x_20181218-44e635cc.pth
)
| X-101-64x4d-FPN | pytorch | 2x| 10.9 | 0.972 | 5.8 | 41.4 | 37.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_x101_32x4d_fpn_2x_20181218-f023dffa.pth
)
| X-101-32x4d-FPN | pytorch | 1x| 14.1 | 1.302 | 4.7 | 42.2 | 38.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_x101_64x4d_fpn_1x_20181218-cb159987.pth
)
| X-101-64x4d-FPN | pytorch | 2x| 14.1 | 1.302 | 4.7 | 42.0 | 37.8 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_x101_64x4d_fpn_2x_20181218-ea936e44.pth
)
### Fast R-CNN (with pre-computed proposals)
### Fast R-CNN (with pre-computed proposals)
...
@@ -96,6 +108,10 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
...
@@ -96,6 +108,10 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
| R-101-FPN | caffe | 1x | 9.2 | 0.614 | 8.2 | 37.8 | - |
| R-101-FPN | caffe | 1x | 9.2 | 0.614 | 8.2 | 37.8 | - |
| R-101-FPN | pytorch | 1x | 9.6 | 0.643 | 8.1 | 37.7 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_r101_fpn_1x_20181129-f738a02f.pth
)
|
| R-101-FPN | pytorch | 1x | 9.6 | 0.643 | 8.1 | 37.7 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_r101_fpn_1x_20181129-f738a02f.pth
)
|
| R-101-FPN | pytorch | 2x | 9.6 | 0.643 | 8.1 | 38.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_r101_fpn_2x_20181129-f654534b.pth
)
|
| R-101-FPN | pytorch | 2x | 9.6 | 0.643 | 8.1 | 38.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_r101_fpn_2x_20181129-f654534b.pth
)
|
| X-101-32x4d-FPN | pytorch | 1x| 10.8 | 0.792 | 6.7 | 38.7 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_x101_32x4d_fpn_1x_20181218-c140fb82.pth
)
| X-101-32x4d-FPN | pytorch | 2x| 10.8 | 0.792 | 6.7 | 39.3 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_x101_32x4d_fpn_2x_20181218-605dcd0a.pth
)
| X-101-64x4d-FPN | pytorch | 1x| 14.6 | 1.128 | 5.3 | 40.0 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_x101_64x4d_fpn_1x_20181218-2f6f778b.pth
)
| X-101-64x4d-FPN | pytorch | 2x| 14.6 | 1.128 | 5.3 | 39.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_x101_64x4d_fpn_2x_20181218-2f598dc5.pth
)
### Cascade R-CNN
### Cascade R-CNN
...
@@ -107,6 +123,10 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
...
@@ -107,6 +123,10 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
| R-101-FPN | caffe | 1x | 8.5 | 0.731 | 7.0 | 42.2 | - |
| R-101-FPN | caffe | 1x | 8.5 | 0.731 | 7.0 | 42.2 | - |
| R-101-FPN | pytorch | 1x | 8.7 | 0.766 | 6.9 | 42.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_r101_fpn_1x_20181129-d64ebac7.pth
)
|
| R-101-FPN | pytorch | 1x | 8.7 | 0.766 | 6.9 | 42.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_r101_fpn_1x_20181129-d64ebac7.pth
)
|
| R-101-FPN | pytorch | 20e | 8.7 | 0.766 | 6.9 | 42.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_r101_fpn_20e_20181129-b46dcede.pth
)
|
| R-101-FPN | pytorch | 20e | 8.7 | 0.766 | 6.9 | 42.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_r101_fpn_20e_20181129-b46dcede.pth
)
|
| X-101-32x4d-FPN | pytorch | 1x| 10.6 | 0.902 | 5.7 | 43.5 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_x101_32x4d_fpn_1x_20181218-941c0925.pth
)
| X-101-32x4d-FPN | pytorch |20e| 10.6 | 0.902 | 5.7 | 44.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_x101_32x4d_fpn_2x_20181218-28f73c4c.pth
)
| X-101-64x4d-FPN | pytorch | 1x| 14.1 | 1.251 | 4.6 | 44.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_x101_64x4d_fpn_1x_20181218-e2dc376a.pth
)
| X-101-64x4d-FPN | pytorch |20e| 14.1 | 1.251 | 4.6 | 44.8 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_rcnn_x101_64x4d_fpn_2x_20181218-5add321e.pth
)
### Cascade Mask R-CNN
### Cascade Mask R-CNN
...
@@ -118,7 +138,14 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
...
@@ -118,7 +138,14 @@ We released RPN, Faster R-CNN and Mask R-CNN models in the first version. More m
| R-101-FPN | caffe | 1x | 10.5 | 1.024 | 5.3 | 43.1 | 37.3 | - |
| R-101-FPN | caffe | 1x | 10.5 | 1.024 | 5.3 | 43.1 | 37.3 | - |
| R-101-FPN | pytorch | 1x | 10.9 | 1.055 | 5.2 | 42.7 | 37.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_r101_fpn_1x_20181129-64f00602.pth
)
|
| R-101-FPN | pytorch | 1x | 10.9 | 1.055 | 5.2 | 42.7 | 37.1 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_r101_fpn_1x_20181129-64f00602.pth
)
|
| R-101-FPN | pytorch | 20e | 10.9 | 1.055 | 5.2 | 43.4 | 37.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_r101_fpn_20e_20181129-cb85151d.pth
)
|
| R-101-FPN | pytorch | 20e | 10.9 | 1.055 | 5.2 | 43.4 | 37.6 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_r101_fpn_20e_20181129-cb85151d.pth
)
|
| X-101-32x4d-FPN | pytorch | 1x| 12.67 | 1.181 | 4.2 | 44.4 | 38.3 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_x101_32x4d_fpn_1x_20181218-1d944c89.pth
)
| X-101-32x4d-FPN | pytorch |20e| 12.67 | 1.181 | 4.2 | 44.9 | 38.7 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_x101_32x4d_fpn_20e_20181218-761a3473.pth
)
| X-101-64x4d-FPN | pytorch | 1x| 10.87 | 1.125 | 3.6 | 45.5 | 39.2 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_x101_64x4d_fpn_1x_20181218-85953a91.pth
)
| X-101-64x4d-FPN | pytorch |20e| 10.87 | 1.125 | 3.6 | 45.8 | 39.5 |
[
model
](
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_x101_64x4d_fpn_20e_20181218-630773a7.pth
)
**Notes:**
-
Cascade Mask R-CNN with X-101-64x4d-FPN was trained using 16 GPU with a batch size of 16 (1 images per GPU).
## Comparison with Detectron
## Comparison with Detectron
...
...
README.md
View file @
2017c81e
...
@@ -58,14 +58,14 @@ Results and models are available in the [Model zoo](MODEL_ZOO.md).
...
@@ -58,14 +58,14 @@ Results and models are available in the [Model zoo](MODEL_ZOO.md).
| | ResNet | ResNeXt | SENet | VGG |
| | ResNet | ResNeXt | SENet | VGG |
|--------------------|:--------:|:--------:|:--------:|:--------:|
|--------------------|:--------:|:--------:|:--------:|:--------:|
| RPN | ✓ |
☐
| ☐ | ✗ |
| RPN | ✓ |
✓
| ☐ | ✗ |
| Fast R-CNN | ✓ |
☐
| ☐ | ✗ |
| Fast R-CNN | ✓ |
✓
| ☐ | ✗ |
| Faster R-CNN | ✓ |
☐
| ☐ | ✗ |
| Faster R-CNN | ✓ |
✓
| ☐ | ✗ |
| Mask R-CNN | ✓ |
☐
| ☐ | ✗ |
| Mask R-CNN | ✓ |
✓
| ☐ | ✗ |
| Cascade R-CNN | ✓ |
☐
| ☐ | ✗ |
| Cascade R-CNN | ✓ |
✓
| ☐ | ✗ |
| Cascade Mask R-CNN | ✓ |
☐
| ☐ | ✗ |
| Cascade Mask R-CNN | ✓ |
✓
| ☐ | ✗ |
| SSD | ✗ | ✗ | ✗ | ☐ |
| SSD | ✗ | ✗ | ✗ | ☐ |
| RetinaNet | ✓ |
☐
| ☐ | ✗ |
| RetinaNet | ✓ |
✓
| ☐ | ✗ |
## Installation
## Installation
...
@@ -194,7 +194,7 @@ Here is an example.
...
@@ -194,7 +194,7 @@ Here is an example.
'bboxes': <np.ndarray> (n, 4),
'bboxes': <np.ndarray> (n, 4),
'labels': <np.ndarray> (n, ),
'labels': <np.ndarray> (n, ),
'bboxes_ignore': <np.ndarray> (k, 4),
'bboxes_ignore': <np.ndarray> (k, 4),
'labels_ignore': <np.ndarray> (k,
4
) (optional field)
'labels_ignore': <np.ndarray> (k, ) (optional field)
}
}
},
},
...
...
...
@@ -206,12 +206,12 @@ There are two ways to work with custom datasets.
...
@@ -206,12 +206,12 @@ There are two ways to work with custom datasets.
-
online conversion
-
online conversion
You can write a new Dataset class inherited from
`CustomDataset`
, and overwrite two methods
You can write a new Dataset class inherited from
`CustomDataset`
, and overwrite two methods
`load_annotations(self, ann_file)`
and
`get_ann_info(self, idx)`
, like
[
CocoDataset
](
mmdet/datasets/coco.py
)
.
`load_annotations(self, ann_file)`
and
`get_ann_info(self, idx)`
, like
[
CocoDataset
](
mmdet/datasets/coco.py
)
and
[
VOCDataset
](
mmdet/datasets/voc.py
)
.
-
offline conversion
-
offline conversion
You can convert the annotation format to the expected format above and save it to
You can convert the annotation format to the expected format above and save it to
a pickle file, like
[
pascal_voc.py
](
tools/convert_datasets/pascal_voc.py
)
.
a pickle
or json
file, like
[
pascal_voc.py
](
tools/convert_datasets/pascal_voc.py
)
.
Then you can simply use
`CustomDataset`
.
Then you can simply use
`CustomDataset`
.
## Technical details
## Technical details
...
...
TECHNICAL_DETAILS.md
View file @
2017c81e
...
@@ -82,7 +82,7 @@ which defines the following abstract methods.
...
@@ -82,7 +82,7 @@ which defines the following abstract methods.
-
`extract_feat()`
: given an image batch of shape (n, c, h, w), extract the feature map(s).
-
`extract_feat()`
: given an image batch of shape (n, c, h, w), extract the feature map(s).
-
`forward_train()`
: forward method of the training mode
-
`forward_train()`
: forward method of the training mode
-
`simple_test()`
: single scale testing without augmentation
-
`simple_test()`
: single scale testing without augmentation
-
`aug_test()`
: testing with
out
augmentation (multi-scale, flip, etc.)
-
`aug_test()`
: testing with augmentation (multi-scale, flip, etc.)
[
TwoStageDetector
](
https://github.com/hellock/mmdetection/blob/master/mmdet/models/detectors/two_stage.py
)
[
TwoStageDetector
](
https://github.com/hellock/mmdetection/blob/master/mmdet/models/detectors/two_stage.py
)
is a good example which shows how to do that.
is a good example which shows how to do that.
...
...
configs/cascade_mask_rcnn_x101_32x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'CascadeRCNN'
,
num_stages
=
3
,
pretrained
=
'open-mmlab://resnext101_32x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
32
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
),
bbox_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
7
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
bbox_head
=
[
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.1
,
0.1
,
0.2
,
0.2
],
reg_class_agnostic
=
True
),
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.05
,
0.05
,
0.1
,
0.1
],
reg_class_agnostic
=
True
),
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.033
,
0.033
,
0.067
,
0.067
],
reg_class_agnostic
=
True
)
],
mask_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
14
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
mask_head
=
dict
(
type
=
'FCNMaskHead'
,
num_convs
=
4
,
in_channels
=
256
,
conv_out_channels
=
256
,
num_classes
=
81
))
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
),
rcnn
=
[
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.5
,
min_pos_iou
=
0.5
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
mask_size
=
28
,
pos_weight
=-
1
,
debug
=
False
),
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.6
,
neg_iou_thr
=
0.6
,
min_pos_iou
=
0.6
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
mask_size
=
28
,
pos_weight
=-
1
,
debug
=
False
),
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.7
,
min_pos_iou
=
0.7
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
mask_size
=
28
,
pos_weight
=-
1
,
debug
=
False
)
],
stage_loss_weights
=
[
1
,
0.5
,
0.25
])
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
),
rcnn
=
dict
(
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
,
mask_thr_binary
=
0.5
),
keep_all_stages
=
False
)
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
True
,
with_crowd
=
True
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
True
,
with_crowd
=
True
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
True
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/cascade_mask_rcnn_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/cascade_mask_rcnn_x101_64x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'CascadeRCNN'
,
num_stages
=
3
,
pretrained
=
'open-mmlab://resnext101_64x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
64
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
),
bbox_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
7
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
bbox_head
=
[
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.1
,
0.1
,
0.2
,
0.2
],
reg_class_agnostic
=
True
),
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.05
,
0.05
,
0.1
,
0.1
],
reg_class_agnostic
=
True
),
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.033
,
0.033
,
0.067
,
0.067
],
reg_class_agnostic
=
True
)
],
mask_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
14
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
mask_head
=
dict
(
type
=
'FCNMaskHead'
,
num_convs
=
4
,
in_channels
=
256
,
conv_out_channels
=
256
,
num_classes
=
81
))
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
),
rcnn
=
[
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.5
,
min_pos_iou
=
0.5
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
mask_size
=
28
,
pos_weight
=-
1
,
debug
=
False
),
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.6
,
neg_iou_thr
=
0.6
,
min_pos_iou
=
0.6
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
mask_size
=
28
,
pos_weight
=-
1
,
debug
=
False
),
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.7
,
min_pos_iou
=
0.7
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
mask_size
=
28
,
pos_weight
=-
1
,
debug
=
False
)
],
stage_loss_weights
=
[
1
,
0.5
,
0.25
])
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
),
rcnn
=
dict
(
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
,
mask_thr_binary
=
0.5
),
keep_all_stages
=
False
)
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
True
,
with_crowd
=
True
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
True
,
with_crowd
=
True
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
True
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/cascade_mask_rcnn_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/cascade_rcnn_x101_32x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'CascadeRCNN'
,
num_stages
=
3
,
pretrained
=
'open-mmlab://resnext101_32x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
32
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
),
bbox_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
7
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
bbox_head
=
[
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.1
,
0.1
,
0.2
,
0.2
],
reg_class_agnostic
=
True
),
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.05
,
0.05
,
0.1
,
0.1
],
reg_class_agnostic
=
True
),
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.033
,
0.033
,
0.067
,
0.067
],
reg_class_agnostic
=
True
)
])
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
),
rcnn
=
[
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.5
,
min_pos_iou
=
0.5
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
pos_weight
=-
1
,
debug
=
False
),
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.6
,
neg_iou_thr
=
0.6
,
min_pos_iou
=
0.6
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
pos_weight
=-
1
,
debug
=
False
),
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.7
,
min_pos_iou
=
0.7
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
pos_weight
=-
1
,
debug
=
False
)
],
stage_loss_weights
=
[
1
,
0.5
,
0.25
])
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
),
rcnn
=
dict
(
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
),
keep_all_stages
=
False
)
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/cascade_rcnn_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/cascade_rcnn_x101_64x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'CascadeRCNN'
,
num_stages
=
3
,
pretrained
=
'open-mmlab://resnext101_64x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
64
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
),
bbox_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
7
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
bbox_head
=
[
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.1
,
0.1
,
0.2
,
0.2
],
reg_class_agnostic
=
True
),
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.05
,
0.05
,
0.1
,
0.1
],
reg_class_agnostic
=
True
),
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.033
,
0.033
,
0.067
,
0.067
],
reg_class_agnostic
=
True
)
])
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
),
rcnn
=
[
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.5
,
min_pos_iou
=
0.5
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
pos_weight
=-
1
,
debug
=
False
),
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.6
,
neg_iou_thr
=
0.6
,
min_pos_iou
=
0.6
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
pos_weight
=-
1
,
debug
=
False
),
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.7
,
min_pos_iou
=
0.7
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
pos_weight
=-
1
,
debug
=
False
)
],
stage_loss_weights
=
[
1
,
0.5
,
0.25
])
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
),
rcnn
=
dict
(
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
),
keep_all_stages
=
False
)
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/cascade_rcnn_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/faster_rcnn_ohem_r50_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'FasterRCNN'
,
pretrained
=
'modelzoo://resnet50'
,
backbone
=
dict
(
type
=
'ResNet'
,
depth
=
50
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
),
bbox_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
7
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
bbox_head
=
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.1
,
0.1
,
0.2
,
0.2
],
reg_class_agnostic
=
False
))
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
),
rcnn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.5
,
min_pos_iou
=
0.5
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'OHEMSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
pos_weight
=-
1
,
debug
=
False
))
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
),
rcnn
=
dict
(
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
)
# soft-nms is also supported for rcnn testing
# e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05)
)
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/faster_rcnn_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/faster_rcnn_x101_32x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'FasterRCNN'
,
pretrained
=
'open-mmlab://resnext101_32x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
32
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
),
bbox_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
7
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
bbox_head
=
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.1
,
0.1
,
0.2
,
0.2
],
reg_class_agnostic
=
False
))
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
),
rcnn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.5
,
min_pos_iou
=
0.5
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
pos_weight
=-
1
,
debug
=
False
))
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
),
rcnn
=
dict
(
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
)
# soft-nms is also supported for rcnn testing
# e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05)
)
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/faster_rcnn_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/faster_rcnn_x101_64x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'FasterRCNN'
,
pretrained
=
'open-mmlab://resnext101_64x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
64
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
),
bbox_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
7
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
bbox_head
=
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.1
,
0.1
,
0.2
,
0.2
],
reg_class_agnostic
=
False
))
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
),
rcnn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.5
,
min_pos_iou
=
0.5
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
pos_weight
=-
1
,
debug
=
False
))
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
),
rcnn
=
dict
(
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
)
# soft-nms is also supported for rcnn testing
# e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05)
)
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/faster_rcnn_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/mask_rcnn_x101_32x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'MaskRCNN'
,
pretrained
=
'open-mmlab://resnext101_32x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
32
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
),
bbox_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
7
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
bbox_head
=
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.1
,
0.1
,
0.2
,
0.2
],
reg_class_agnostic
=
False
),
mask_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
14
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
mask_head
=
dict
(
type
=
'FCNMaskHead'
,
num_convs
=
4
,
in_channels
=
256
,
conv_out_channels
=
256
,
num_classes
=
81
))
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
),
rcnn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.5
,
min_pos_iou
=
0.5
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
mask_size
=
28
,
pos_weight
=-
1
,
debug
=
False
))
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
),
rcnn
=
dict
(
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
,
mask_thr_binary
=
0.5
))
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
True
,
with_crowd
=
True
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
True
,
with_crowd
=
True
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/mask_rcnn_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/mask_rcnn_x101_64x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'MaskRCNN'
,
pretrained
=
'open-mmlab://resnext101_64x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
64
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
),
bbox_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
7
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
bbox_head
=
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
81
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.1
,
0.1
,
0.2
,
0.2
],
reg_class_agnostic
=
False
),
mask_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
14
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
mask_head
=
dict
(
type
=
'FCNMaskHead'
,
num_convs
=
4
,
in_channels
=
256
,
conv_out_channels
=
256
,
num_classes
=
81
))
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
),
rcnn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.5
,
min_pos_iou
=
0.5
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
mask_size
=
28
,
pos_weight
=-
1
,
debug
=
False
))
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
),
rcnn
=
dict
(
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
,
mask_thr_binary
=
0.5
))
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
True
,
with_crowd
=
True
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
True
,
with_crowd
=
True
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/mask_rcnn_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'FasterRCNN'
,
pretrained
=
'modelzoo://resnet50'
,
backbone
=
dict
(
type
=
'ResNet'
,
depth
=
50
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
),
bbox_roi_extractor
=
dict
(
type
=
'SingleRoIExtractor'
,
roi_layer
=
dict
(
type
=
'RoIAlign'
,
out_size
=
7
,
sample_num
=
2
),
out_channels
=
256
,
featmap_strides
=
[
4
,
8
,
16
,
32
]),
bbox_head
=
dict
(
type
=
'SharedFCBBoxHead'
,
num_fcs
=
2
,
in_channels
=
256
,
fc_out_channels
=
1024
,
roi_feat_size
=
7
,
num_classes
=
21
,
target_means
=
[
0.
,
0.
,
0.
,
0.
],
target_stds
=
[
0.1
,
0.1
,
0.2
,
0.2
],
reg_class_agnostic
=
False
))
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
),
rcnn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.5
,
min_pos_iou
=
0.5
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
512
,
pos_fraction
=
0.25
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
True
),
pos_weight
=-
1
,
debug
=
False
))
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
),
rcnn
=
dict
(
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
)
# soft-nms is also supported for rcnn testing
# e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05)
)
# dataset settings
dataset_type
=
'VOCDataset'
data_root
=
'data/VOCdevkit/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
'RepeatDataset'
,
# to avoid reloading datasets frequently
times
=
3
,
dataset
=
dict
(
type
=
dataset_type
,
ann_file
=
[
data_root
+
'VOC2007/ImageSets/Main/trainval.txt'
,
data_root
+
'VOC2012/ImageSets/Main/trainval.txt'
],
img_prefix
=
[
data_root
+
'VOC2007/'
,
data_root
+
'VOC2012/'
],
img_scale
=
(
1000
,
600
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
)),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'VOC2007/ImageSets/Main/test.txt'
,
img_prefix
=
data_root
+
'VOC2007/'
,
img_scale
=
(
1000
,
600
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
True
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'VOC2007/ImageSets/Main/test.txt'
,
img_prefix
=
data_root
+
'VOC2007/'
,
img_scale
=
(
1000
,
600
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.01
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
step
=
[
3
])
# actual epoch = 3 * 3 = 9
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
4
# actual epoch = 4 * 3 = 12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/faster_rcnn_r50_fpn_1x_voc0712'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/retinanet_x101_32x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'RetinaNet'
,
pretrained
=
'open-mmlab://resnext101_32x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
32
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
start_level
=
1
,
add_extra_convs
=
True
,
num_outs
=
5
),
bbox_head
=
dict
(
type
=
'RetinaHead'
,
num_classes
=
81
,
in_channels
=
256
,
stacked_convs
=
4
,
feat_channels
=
256
,
octave_base_scale
=
4
,
scales_per_octave
=
3
,
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
8
,
16
,
32
,
64
,
128
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
]))
# training and testing settings
train_cfg
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.4
,
min_pos_iou
=
0
,
ignore_iof_thr
=-
1
),
smoothl1_beta
=
0.11
,
gamma
=
2.0
,
alpha
=
0.25
,
allowed_border
=-
1
,
pos_weight
=-
1
,
debug
=
False
)
test_cfg
=
dict
(
nms_pre
=
1000
,
min_bbox_size
=
0
,
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
)
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
False
,
with_crowd
=
False
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
False
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.01
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
device_ids
=
range
(
8
)
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/retinanet_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/retinanet_x101_64x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'RetinaNet'
,
pretrained
=
'open-mmlab://resnext101_64x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
64
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
start_level
=
1
,
add_extra_convs
=
True
,
num_outs
=
5
),
bbox_head
=
dict
(
type
=
'RetinaHead'
,
num_classes
=
81
,
in_channels
=
256
,
stacked_convs
=
4
,
feat_channels
=
256
,
octave_base_scale
=
4
,
scales_per_octave
=
3
,
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
8
,
16
,
32
,
64
,
128
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
]))
# training and testing settings
train_cfg
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.5
,
neg_iou_thr
=
0.4
,
min_pos_iou
=
0
,
ignore_iof_thr
=-
1
),
smoothl1_beta
=
0.11
,
gamma
=
2.0
,
alpha
=
0.25
,
allowed_border
=-
1
,
pos_weight
=-
1
,
debug
=
False
)
test_cfg
=
dict
(
nms_pre
=
1000
,
min_bbox_size
=
0
,
score_thr
=
0.05
,
nms
=
dict
(
type
=
'nms'
,
iou_thr
=
0.5
),
max_per_img
=
100
)
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
False
,
with_crowd
=
False
,
with_label
=
True
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
False
,
with_label
=
True
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.01
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
# learning policy
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
device_ids
=
range
(
8
)
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/retinanet_r50_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/rpn_x101_32x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'RPN'
,
pretrained
=
'open-mmlab://resnext101_32x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
32
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
))
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
))
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
))
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
False
,
with_crowd
=
False
,
with_label
=
False
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
False
,
with_label
=
False
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
# runner configs
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/rpn_r101_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
configs/rpn_x101_64x4d_fpn_1x.py
0 → 100644
View file @
2017c81e
# model settings
model
=
dict
(
type
=
'RPN'
,
pretrained
=
'open-mmlab://resnext101_64x4d'
,
backbone
=
dict
(
type
=
'ResNeXt'
,
depth
=
101
,
groups
=
64
,
base_width
=
4
,
num_stages
=
4
,
out_indices
=
(
0
,
1
,
2
,
3
),
frozen_stages
=
1
,
style
=
'pytorch'
),
neck
=
dict
(
type
=
'FPN'
,
in_channels
=
[
256
,
512
,
1024
,
2048
],
out_channels
=
256
,
num_outs
=
5
),
rpn_head
=
dict
(
type
=
'RPNHead'
,
in_channels
=
256
,
feat_channels
=
256
,
anchor_scales
=
[
8
],
anchor_ratios
=
[
0.5
,
1.0
,
2.0
],
anchor_strides
=
[
4
,
8
,
16
,
32
,
64
],
target_means
=
[.
0
,
.
0
,
.
0
,
.
0
],
target_stds
=
[
1.0
,
1.0
,
1.0
,
1.0
],
use_sigmoid_cls
=
True
))
# model training and testing settings
train_cfg
=
dict
(
rpn
=
dict
(
assigner
=
dict
(
type
=
'MaxIoUAssigner'
,
pos_iou_thr
=
0.7
,
neg_iou_thr
=
0.3
,
min_pos_iou
=
0.3
,
ignore_iof_thr
=-
1
),
sampler
=
dict
(
type
=
'RandomSampler'
,
num
=
256
,
pos_fraction
=
0.5
,
neg_pos_ub
=-
1
,
add_gt_as_proposals
=
False
),
allowed_border
=
0
,
pos_weight
=-
1
,
smoothl1_beta
=
1
/
9.0
,
debug
=
False
))
test_cfg
=
dict
(
rpn
=
dict
(
nms_across_levels
=
False
,
nms_pre
=
2000
,
nms_post
=
2000
,
max_num
=
2000
,
nms_thr
=
0.7
,
min_bbox_size
=
0
))
# dataset settings
dataset_type
=
'CocoDataset'
data_root
=
'data/coco/'
img_norm_cfg
=
dict
(
mean
=
[
123.675
,
116.28
,
103.53
],
std
=
[
58.395
,
57.12
,
57.375
],
to_rgb
=
True
)
data
=
dict
(
imgs_per_gpu
=
2
,
workers_per_gpu
=
2
,
train
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_train2017.json'
,
img_prefix
=
data_root
+
'train2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0.5
,
with_mask
=
False
,
with_crowd
=
False
,
with_label
=
False
),
val
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_crowd
=
False
,
with_label
=
False
),
test
=
dict
(
type
=
dataset_type
,
ann_file
=
data_root
+
'annotations/instances_val2017.json'
,
img_prefix
=
data_root
+
'val2017/'
,
img_scale
=
(
1333
,
800
),
img_norm_cfg
=
img_norm_cfg
,
size_divisor
=
32
,
flip_ratio
=
0
,
with_mask
=
False
,
with_label
=
False
,
test_mode
=
True
))
# optimizer
optimizer
=
dict
(
type
=
'SGD'
,
lr
=
0.02
,
momentum
=
0.9
,
weight_decay
=
0.0001
)
# runner configs
optimizer_config
=
dict
(
grad_clip
=
dict
(
max_norm
=
35
,
norm_type
=
2
))
lr_config
=
dict
(
policy
=
'step'
,
warmup
=
'linear'
,
warmup_iters
=
500
,
warmup_ratio
=
1.0
/
3
,
step
=
[
8
,
11
])
checkpoint_config
=
dict
(
interval
=
1
)
# yapf:disable
log_config
=
dict
(
interval
=
50
,
hooks
=
[
dict
(
type
=
'TextLoggerHook'
),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
# runtime settings
total_epochs
=
12
dist_params
=
dict
(
backend
=
'nccl'
)
log_level
=
'INFO'
work_dir
=
'./work_dirs/rpn_r101_fpn_1x'
load_from
=
None
resume_from
=
None
workflow
=
[(
'train'
,
1
)]
mmdet/apis/train.py
View file @
2017c81e
...
@@ -6,8 +6,8 @@ import torch
...
@@ -6,8 +6,8 @@ import torch
from
mmcv.runner
import
Runner
,
DistSamplerSeedHook
from
mmcv.runner
import
Runner
,
DistSamplerSeedHook
from
mmcv.parallel
import
MMDataParallel
,
MMDistributedDataParallel
from
mmcv.parallel
import
MMDataParallel
,
MMDistributedDataParallel
from
mmdet.core
import
(
DistOptimizerHook
,
Coco
DistEval
Recall
Hook
,
from
mmdet.core
import
(
DistOptimizerHook
,
DistEval
mAP
Hook
,
CocoDistEvalmAPHook
)
CocoDistEvalRecallHook
,
CocoDistEvalmAPHook
)
from
mmdet.datasets
import
build_dataloader
from
mmdet.datasets
import
build_dataloader
from
mmdet.models
import
RPN
from
mmdet.models
import
RPN
from
.env
import
get_root_logger
from
.env
import
get_root_logger
...
@@ -81,9 +81,13 @@ def _dist_train(model, dataset, cfg, validate=False):
...
@@ -81,9 +81,13 @@ def _dist_train(model, dataset, cfg, validate=False):
# register eval hooks
# register eval hooks
if
validate
:
if
validate
:
if
isinstance
(
model
.
module
,
RPN
):
if
isinstance
(
model
.
module
,
RPN
):
# TODO: implement recall hooks for other datasets
runner
.
register_hook
(
CocoDistEvalRecallHook
(
cfg
.
data
.
val
))
runner
.
register_hook
(
CocoDistEvalRecallHook
(
cfg
.
data
.
val
))
elif
cfg
.
data
.
val
.
type
==
'CocoDataset'
:
else
:
if
cfg
.
data
.
val
.
type
==
'CocoDataset'
:
runner
.
register_hook
(
CocoDistEvalmAPHook
(
cfg
.
data
.
val
))
runner
.
register_hook
(
CocoDistEvalmAPHook
(
cfg
.
data
.
val
))
else
:
runner
.
register_hook
(
DistEvalmAPHook
(
cfg
.
data
.
val
))
if
cfg
.
resume_from
:
if
cfg
.
resume_from
:
runner
.
resume
(
cfg
.
resume_from
)
runner
.
resume
(
cfg
.
resume_from
)
...
...
mmdet/core/bbox/assign_sampling.py
View file @
2017c81e
...
@@ -3,23 +3,23 @@ import mmcv
...
@@ -3,23 +3,23 @@ import mmcv
from
.
import
assigners
,
samplers
from
.
import
assigners
,
samplers
def
build_assigner
(
cfg
,
default_args
=
None
):
def
build_assigner
(
cfg
,
**
kwargs
):
if
isinstance
(
cfg
,
assigners
.
BaseAssigner
):
if
isinstance
(
cfg
,
assigners
.
BaseAssigner
):
return
cfg
return
cfg
elif
isinstance
(
cfg
,
dict
):
elif
isinstance
(
cfg
,
dict
):
return
mmcv
.
runner
.
obj_from_dict
(
return
mmcv
.
runner
.
obj_from_dict
(
cfg
,
assigners
,
default_args
=
default_
args
)
cfg
,
assigners
,
default_args
=
kw
args
)
else
:
else
:
raise
TypeError
(
'Invalid type {} for building a sampler'
.
format
(
raise
TypeError
(
'Invalid type {} for building a sampler'
.
format
(
type
(
cfg
)))
type
(
cfg
)))
def
build_sampler
(
cfg
,
default_args
=
None
):
def
build_sampler
(
cfg
,
**
kwargs
):
if
isinstance
(
cfg
,
samplers
.
BaseSampler
):
if
isinstance
(
cfg
,
samplers
.
BaseSampler
):
return
cfg
return
cfg
elif
isinstance
(
cfg
,
dict
):
elif
isinstance
(
cfg
,
dict
):
return
mmcv
.
runner
.
obj_from_dict
(
return
mmcv
.
runner
.
obj_from_dict
(
cfg
,
samplers
,
default_args
=
default_
args
)
cfg
,
samplers
,
default_args
=
kw
args
)
else
:
else
:
raise
TypeError
(
'Invalid type {} for building a sampler'
.
format
(
raise
TypeError
(
'Invalid type {} for building a sampler'
.
format
(
type
(
cfg
)))
type
(
cfg
)))
...
...
mmdet/core/bbox/samplers/__init__.py
View file @
2017c81e
...
@@ -4,10 +4,11 @@ from .random_sampler import RandomSampler
...
@@ -4,10 +4,11 @@ from .random_sampler import RandomSampler
from
.instance_balanced_pos_sampler
import
InstanceBalancedPosSampler
from
.instance_balanced_pos_sampler
import
InstanceBalancedPosSampler
from
.iou_balanced_neg_sampler
import
IoUBalancedNegSampler
from
.iou_balanced_neg_sampler
import
IoUBalancedNegSampler
from
.combined_sampler
import
CombinedSampler
from
.combined_sampler
import
CombinedSampler
from
.ohem_sampler
import
OHEMSampler
from
.sampling_result
import
SamplingResult
from
.sampling_result
import
SamplingResult
__all__
=
[
__all__
=
[
'BaseSampler'
,
'PseudoSampler'
,
'RandomSampler'
,
'BaseSampler'
,
'PseudoSampler'
,
'RandomSampler'
,
'InstanceBalancedPosSampler'
,
'IoUBalancedNegSampler'
,
'CombinedSampler'
,
'InstanceBalancedPosSampler'
,
'IoUBalancedNegSampler'
,
'CombinedSampler'
,
'SamplingResult'
'OHEMSampler'
,
'SamplingResult'
]
]
Prev
1
2
3
Next
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment