Commit 9fef99cc, authored by twang, committed by GitHub

[Enhance] SSN benchmark on nuScenes (#182)

* Add ssn benchmark on nuscenes

* Update preliminary nuscenes results

* Update model_zoo.md

* RegNetX-SSN updated

* Add ssn regnet benchmark on Lyft

* RegNetX-SSN on Lyft updated

* Modify the batch size in the ssn regnet config name

* Memory of RegNetX-SSN on Lyft updated

* Model and log links updated

* Model and log links updated
parent ac39ca96
......@@ -75,7 +75,7 @@ Results and models are available in the [model zoo](docs/model_zoo.md).
| Part-A2 | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
| MVXNet | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
| CenterPoint | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
| SSN | ☐ | ☐ | ☐ | ✗ | ☐ | | ☐ |
| SSN | ☐ | ☐ | ☐ | ✗ | ☐ | | ☐ |
Other features
- [x] [Dynamic Voxelization](configs/carafe/README.md)
......
......@@ -2,7 +2,7 @@
## Introduction
We implement PointPillars with Shape-aware grouping heads used in the SSN and provide the results and checkpoints on Lyft datasets.
We implement PointPillars with Shape-aware grouping heads used in the SSN and provide the results and checkpoints on the nuScenes and Lyft datasets.
```
@inproceedings{zhu2020ssn,
......@@ -16,12 +16,22 @@ We implement PointPillars with Shape-aware grouping heads used in the SSN and pr
## Results
### NuScenes
| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
|[SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py)|2x|16.4||35.17|49.76|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725-0817d270.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725.log.json)|
|[SSN](./hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d.py)|2x|9.62||41.56|54.83|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d_20201023_193737-5fda3f00.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d_20201023_193737.log.json)|
|[RegNetX-400MF-SECFPN](./hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d.py)|2x|16.4||41.15|55.20|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334-53044f32.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334.log.json)|
|[RegNetX-400MF-SSN](./hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d.py)|2x|10.26||46.95|58.24|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d_20201024_232447-7af3d8c8.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d_20201024_232447.log.json)|
### Lyft
| Backbone | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
|[SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_lyft-3d.py)|2x|||13.4|13.4||
|[SSN](./hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d.py)|2x|8.30||17.4|17.5|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d_20201016_220844-3058d9fc.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d_20201016_220844.log.json)|
|[RegNetX-400MF-SSN](./hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d.py)|2x|9.98||18.1|18.3|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_lyft-3d_20201025_213155-4532096c.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_lyft-3d_20201025_213155.log.json)|
Note:
......
_base_ = './hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d.py'
# model settings
model = dict(
    type='MVXFasterRCNN',
    pretrained=dict(pts='open-mmlab://regnetx_400mf'),
    pts_backbone=dict(
        _delete_=True,
        type='NoStemRegNet',
        arch=dict(w0=24, wa=24.48, wm=2.54, group_w=16, depth=22, bot_mul=1.0),
        out_indices=(1, 2, 3),
        frozen_stages=-1,
        strides=(1, 2, 2, 2),
        base_channels=64,
        stem_channels=64,
        norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01),
        norm_eval=False,
        style='pytorch'),
    pts_neck=dict(in_channels=[64, 160, 384]))
# dataset settings
data = dict(samples_per_gpu=1, workers_per_gpu=2)
_base_ = './hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d.py'
# model settings
model = dict(
    type='MVXFasterRCNN',
    pretrained=dict(pts='open-mmlab://regnetx_400mf'),
    pts_backbone=dict(
        _delete_=True,
        type='NoStemRegNet',
        arch=dict(w0=24, wa=24.48, wm=2.54, group_w=16, depth=22, bot_mul=1.0),
        out_indices=(1, 2, 3),
        frozen_stages=-1,
        strides=(1, 2, 2, 2),
        base_channels=64,
        stem_channels=64,
        norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01),
        norm_eval=False,
        style='pytorch'),
    pts_neck=dict(in_channels=[64, 160, 384]))
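
The two RegNetX configs above inherit from an SSN config through `_base_`. They replace the whole `pts_backbone` entry by setting `_delete_=True`, while `pts_neck` only overrides `in_channels`. The sketch below illustrates that merge behaviour in plain Python. It is a simplified illustration, not the config loader's actual code: `merge_cfg` is a hypothetical helper, and the base `pts_backbone` values are placeholders, since the real base backbone settings do not appear in this diff. The base `pts_neck` values are taken from the SSN nuScenes config.

```python
import copy


def merge_cfg(child, base):
    """Recursively merge child overrides into a copy of base.

    A dict override carrying `_delete_=True` replaces the base entry
    wholesale instead of being merged key by key. (Hypothetical helper,
    simplified for illustration.)
    """
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = dict(value)  # shallow copy so the caller's dict is untouched
            delete = value.pop('_delete_', False)
            if not delete and isinstance(merged.get(key), dict):
                merged[key] = merge_cfg(value, merged[key])
            else:
                merged[key] = copy.deepcopy(value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base_model = dict(
    # Placeholder backbone settings, assumed for the example only.
    pts_backbone=dict(type='SECOND', layer_nums=[3, 5, 5]),
    pts_neck=dict(
        type='SECONDFPN',
        in_channels=[64, 128, 256],
        upsample_strides=[1, 2, 4],
        out_channels=[128, 128, 128]))

child_model = dict(
    pts_backbone=dict(
        _delete_=True, type='NoStemRegNet', out_indices=(1, 2, 3)),
    pts_neck=dict(in_channels=[64, 160, 384]))

merged_model = merge_cfg(child_model, base_model)
print(merged_model['pts_backbone'])  # only the RegNet keys survive
print(merged_model['pts_neck'])      # SECONDFPN kept, in_channels replaced
```

Without `_delete_=True`, the placeholder `layer_nums` key would be carried over into the RegNet backbone settings, which is the situation the flag is there to avoid.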
_base_ = [
    '../_base_/models/hv_pointpillars_fpn_nus.py',
    '../_base_/datasets/nus-3d.py',
    '../_base_/schedules/schedule_2x.py',
    '../_base_/default_runtime.py',
]
# Note that the order of class names should be consistent with
# the following anchors' order
point_cloud_range = [-50, -50, -5, 50, 50, 3]
class_names = [
    'bicycle', 'motorcycle', 'pedestrian', 'traffic_cone', 'barrier', 'car',
    'truck', 'trailer', 'bus', 'construction_vehicle'
]
train_pipeline = [
    dict(type='LoadPointsFromFile', load_dim=5, use_dim=5),
    dict(type='LoadPointsFromMultiSweeps', sweeps_num=10),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.3925, 0.3925],
        scale_ratio_range=[0.95, 1.05],
        translation_std=[0, 0, 0]),
    dict(
        type='RandomFlip3D',
        sync_2d=False,
        flip_ratio_bev_horizontal=0.5,
        flip_ratio_bev_vertical=0.5),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(type='LoadPointsFromFile', load_dim=5, use_dim=5),
    dict(type='LoadPointsFromMultiSweeps', sweeps_num=10),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=4,
    train=dict(pipeline=train_pipeline, classes=class_names),
    val=dict(pipeline=test_pipeline, classes=class_names),
    test=dict(pipeline=test_pipeline, classes=class_names))
# model settings
model = dict(
    pts_voxel_layer=dict(max_num_points=20),
    pts_voxel_encoder=dict(feat_channels=[64, 64]),
    pts_neck=dict(
        _delete_=True,
        type='SECONDFPN',
        norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01),
        in_channels=[64, 128, 256],
        upsample_strides=[1, 2, 4],
        out_channels=[128, 128, 128]),
    pts_bbox_head=dict(
        _delete_=True,
        type='ShapeAwareHead',
        num_classes=10,
        in_channels=384,
        feat_channels=384,
        use_direction_classifier=True,
        anchor_generator=dict(
            type='AlignedAnchor3DRangeGeneratorPerCls',
            ranges=[[-50, -50, -1.67339111, 50, 50, -1.67339111],
                    [-50, -50, -1.71396371, 50, 50, -1.71396371],
                    [-50, -50, -1.61785072, 50, 50, -1.61785072],
                    [-50, -50, -1.80984986, 50, 50, -1.80984986],
                    [-50, -50, -1.76396500, 50, 50, -1.76396500],
                    [-50, -50, -1.80032795, 50, 50, -1.80032795],
                    [-50, -50, -1.74440365, 50, 50, -1.74440365],
                    [-50, -50, -1.68526504, 50, 50, -1.68526504],
                    [-50, -50, -1.80673031, 50, 50, -1.80673031],
                    [-50, -50, -1.64824291, 50, 50, -1.64824291]],
            sizes=[
                [0.60058911, 1.68452161, 1.27192197],  # bicycle
                [0.76279481, 2.09973778, 1.44403034],  # motorcycle
                [0.66344886, 0.72564370, 1.75748069],  # pedestrian
                [0.39694519, 0.40359262, 1.06232151],  # traffic cone
                [2.49008838, 0.48578221, 0.98297065],  # barrier
                [1.95017717, 4.60718145, 1.72270761],  # car
                [2.45609390, 6.73778078, 2.73004906],  # truck
                [2.87427237, 12.01320693, 3.81509561],  # trailer
                [2.94046906, 11.1885991, 3.47030982],  # bus
                [2.73050468, 6.38352896, 3.13312415]  # construction vehicle
            ],
            custom_values=[0, 0],
            rotations=[0, 1.57],
            reshape_out=False),
        tasks=[
            dict(
                num_class=2,
                class_names=['bicycle', 'motorcycle'],
                shared_conv_channels=(64, 64),
                shared_conv_strides=(1, 1),
                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
            dict(
                num_class=1,
                class_names=['pedestrian'],
                shared_conv_channels=(64, 64),
                shared_conv_strides=(1, 1),
                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
            dict(
                num_class=2,
                class_names=['traffic_cone', 'barrier'],
                shared_conv_channels=(64, 64),
                shared_conv_strides=(1, 1),
                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
            dict(
                num_class=1,
                class_names=['car'],
                shared_conv_channels=(64, 64, 64),
                shared_conv_strides=(2, 1, 1),
                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
            dict(
                num_class=4,
                class_names=[
                    'truck', 'trailer', 'bus', 'construction_vehicle'
                ],
                shared_conv_channels=(64, 64, 64),
                shared_conv_strides=(2, 1, 1),
                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01))
        ],
        assign_per_class=True,
        diff_rad_by_sin=True,
        dir_offset=0.7854,  # pi/4
        dir_limit_offset=0,
        bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder', code_size=9),
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0),
        loss_dir=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.2)))
# model training and testing settings
train_cfg = dict(
    _delete_=True,
    pts=dict(
        assigner=[
            dict(  # bicycle
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1),
            dict(  # motorcycle
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                ignore_iof_thr=-1),
            dict(  # pedestrian
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.6,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # traffic cone
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.6,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # barrier
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.55,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # car
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.6,
                neg_iou_thr=0.45,
                min_pos_iou=0.45,
                ignore_iof_thr=-1),
            dict(  # truck
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.55,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # trailer
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1),
            dict(  # bus
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.55,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # construction vehicle
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1)
        ],
        allowed_border=0,
        code_weight=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2],
        pos_weight=-1,
        debug=False))
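
The config above notes that the order of class names should be consistent with the anchors' order. The same ordering also runs through the `tasks` groups and the per-class assigner list. A minimal sketch of a sanity check for that ordering follows. The helper `check_class_order` is hypothetical and not part of the codebase; the class names, task groups, and anchor sizes are copied from the config.

```python
class_names = [
    'bicycle', 'motorcycle', 'pedestrian', 'traffic_cone', 'barrier', 'car',
    'truck', 'trailer', 'bus', 'construction_vehicle'
]

# Class groups as listed in `tasks` of the ShapeAwareHead settings.
task_groups = [
    ['bicycle', 'motorcycle'],
    ['pedestrian'],
    ['traffic_cone', 'barrier'],
    ['car'],
    ['truck', 'trailer', 'bus', 'construction_vehicle'],
]

# Anchor sizes in the order given by the anchor generator settings.
anchor_sizes = [
    [0.60058911, 1.68452161, 1.27192197],   # bicycle
    [0.76279481, 2.09973778, 1.44403034],   # motorcycle
    [0.66344886, 0.72564370, 1.75748069],   # pedestrian
    [0.39694519, 0.40359262, 1.06232151],   # traffic cone
    [2.49008838, 0.48578221, 0.98297065],   # barrier
    [1.95017717, 4.60718145, 1.72270761],   # car
    [2.45609390, 6.73778078, 2.73004906],   # truck
    [2.87427237, 12.01320693, 3.81509561],  # trailer
    [2.94046906, 11.1885991, 3.47030982],   # bus
    [2.73050468, 6.38352896, 3.13312415],   # construction vehicle
]


def check_class_order(class_names, task_groups, anchor_sizes):
    """Return True if the flattened task groups reproduce class_names
    and there is exactly one anchor size per class.
    (Hypothetical helper for illustration.)"""
    flattened = [name for group in task_groups for name in group]
    return (flattened == list(class_names)
            and len(anchor_sizes) == len(class_names))


order_ok = check_class_order(class_names, task_groups, anchor_sizes)
# Pair each class with its anchor size by position.
size_by_class = dict(zip(class_names, anchor_sizes))
print(order_ok)
print(size_by_class['car'])
```

Because the pairing is purely positional, swapping two entries in `class_names` without reordering the anchors would silently assign the wrong anchor size to both classes, which is why a check of this kind can be worth running after editing the lists.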
......@@ -52,4 +52,4 @@ Please refer to [CenterPoint](https://github.com/open-mmlab/mmdetection3d/blob/m
### SSN
Please refer to [SSN](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/ssn) for details. We provide pointpillars with shape-aware grouping heads used in SSN on Lyft datasets currently.
Please refer to [SSN](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/ssn) for details. We currently provide PointPillars with shape-aware grouping heads used in SSN on the nuScenes and Lyft datasets.