Commit e813610f authored by twang, committed by GitHub

[Feature] SSN benchmark on Lyft (#174)

* Add ssn config on Lyft

* Update readme of pointpillars

* Update SSN readme

* Update model_zoo.md

* Update README.md

* Fix a minor typo in the nus regnet config

* Update README.md
parent b016d90c
@@ -75,6 +75,7 @@ Results and models are available in the [model zoo](docs/model_zoo.md).
| Part-A2 | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
| MVXNet | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
| CenterPoint | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
+| SSN | ☐ | ☐ | ☐ | ✗ | ☐ | ✗ | ☐ |
Other features
- [x] [Dynamic Voxelization](configs/carafe/README.md)
......
@@ -2,7 +2,7 @@
## Introduction
-We implement PointPillars and provide the results and checkpoints on KITTI and nuScenes datasets.
+We implement PointPillars and provide the results and checkpoints on KITTI, nuScenes, Lyft and Waymo datasets.
``` ```
@inproceedings{lang2019pointpillars, @inproceedings{lang2019pointpillars,
......
_base_ = [
-    '../_base_/models/hv_pointpillars_fpn_lyft.py',
+    '../_base_/models/hv_pointpillars_fpn_nus.py',
    '../_base_/datasets/nus-3d.py',
    '../_base_/schedules/schedule_2x.py',
    '../_base_/default_runtime.py',
......
# SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds
## Introduction
We implement PointPillars with the shape-aware grouping heads used in SSN and provide the results and checkpoints on the Lyft dataset.
```
@inproceedings{zhu2020ssn,
title={SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds},
author={Zhu, Xinge and Ma, Yuexin and Wang, Tai and Xu, Yan and Shi, Jianping and Lin, Dahua},
booktitle={Proceedings of the European Conference on Computer Vision},
year={2020}
}
```
## Results
### Lyft
| Backbone | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
|[SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_lyft-3d.py)|2x|||13.4|13.4||
|[SSN](./hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d.py)|2x|8.30||17.4|17.5|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d_20201016_220844-3058d9fc.pth) &#124; [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d_20201016_220844.log.json)|
Note:
The main difference between the shape-aware grouping heads and the original SECOND FPN heads is that the former group objects with similar sizes and shapes together and use a shape-specific head for each group. Heavier heads (with more convolutions and larger strides) are designed for large objects, while lighter heads are used for small objects. Because the heads use different strides, their outputs may have different feature map sizes, so the implementation also needs an anchor generator tailored to these feature maps.
Users can try other head designs. Our settings largely follow the implementation [HERE](https://github.com/xinge008/SSN).
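As a rough illustration of why the heads can produce different feature map sizes, the sketch below applies each group's `shared_conv_strides` (copied from the Lyft config in this commit) to a shared input resolution. The 400 x 400 input size and the ceiling-division rule for strided convolutions are assumptions made for the example, and `head_feature_size` is a hypothetical helper, not part of MMDetection3D.

```python
from math import ceil


def head_feature_size(base_size, strides):
    """Estimate the spatial size after a stack of strided convolutions.

    Assumes each convolution is padded so that a stride of s maps a
    side of length n to ceil(n / s).
    """
    height, width = base_size
    for stride in strides:
        height, width = ceil(height / stride), ceil(width / stride)
    return height, width


# Strides copied from the `tasks` entries of the SSN Lyft config.
task_strides = {
    'bicycle/motorcycle': (1, 1),
    'pedestrian/animal': (1, 1),
    'car/emergency_vehicle': (2, 1, 1),
    'bus/other_vehicle/truck': (2, 1, 1),
}

# Assumed resolution of the neck output; the real value depends on the
# base model config, which is not shown here.
base_size = (400, 400)

feature_sizes = {
    name: head_feature_size(base_size, strides)
    for name, strides in task_strides.items()
}
for name, size in feature_sizes.items():
    print(name, size)
```

Under these assumptions the two small-object groups keep the input resolution while the two large-object groups halve it, which is why anchors have to be generated per feature map rather than once for all classes.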
_base_ = [
    '../_base_/models/hv_pointpillars_fpn_lyft.py',
    '../_base_/datasets/lyft-3d.py',
    '../_base_/schedules/schedule_2x.py',
    '../_base_/default_runtime.py',
]
point_cloud_range = [-100, -100, -5, 100, 100, 3]
# Note that the order of class names should be consistent with
# the following anchors' order
class_names = [
    'bicycle', 'motorcycle', 'pedestrian', 'animal', 'car',
    'emergency_vehicle', 'bus', 'other_vehicle', 'truck'
]
train_pipeline = [
    dict(type='LoadPointsFromFile', load_dim=5, use_dim=5),
    dict(type='LoadPointsFromMultiSweeps', sweeps_num=10),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.3925, 0.3925],
        scale_ratio_range=[0.95, 1.05],
        translation_std=[0, 0, 0]),
    dict(
        type='RandomFlip3D',
        sync_2d=False,
        flip_ratio_bev_horizontal=0.5,
        flip_ratio_bev_vertical=0.5),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(type='LoadPointsFromFile', load_dim=5, use_dim=5),
    dict(type='LoadPointsFromMultiSweeps', sweeps_num=10),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=4,
    train=dict(pipeline=train_pipeline, classes=class_names),
    val=dict(pipeline=test_pipeline, classes=class_names),
    test=dict(pipeline=test_pipeline, classes=class_names))
# model settings
model = dict(
    pts_voxel_layer=dict(point_cloud_range=[-100, -100, -5, 100, 100, 3]),
    pts_voxel_encoder=dict(
        feat_channels=[32, 64],
        point_cloud_range=[-100, -100, -5, 100, 100, 3]),
    pts_middle_encoder=dict(output_shape=[800, 800]),
    pts_neck=dict(
        _delete_=True,
        type='SECONDFPN',
        norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01),
        in_channels=[64, 128, 256],
        upsample_strides=[1, 2, 4],
        out_channels=[128, 128, 128]),
    pts_bbox_head=dict(
        _delete_=True,
        type='ShapeAwareHead',
        num_classes=9,
        in_channels=384,
        feat_channels=384,
        use_direction_classifier=True,
        anchor_generator=dict(
            type='AlignedAnchor3DRangeGeneratorPerCls',
            ranges=[[-100, -100, -1.0709302, 100, 100, -1.0709302],
                    [-100, -100, -1.3220503, 100, 100, -1.3220503],
                    [-100, -100, -0.9122268, 100, 100, -0.9122268],
                    [-100, -100, -1.8012227, 100, 100, -1.8012227],
                    [-100, -100, -1.0715024, 100, 100, -1.0715024],
                    [-100, -100, -0.8871424, 100, 100, -0.8871424],
                    [-100, -100, -0.3519405, 100, 100, -0.3519405],
                    [-100, -100, -0.6276341, 100, 100, -0.6276341],
                    [-100, -100, -0.3033737, 100, 100, -0.3033737]],
            sizes=[
                [0.63, 1.76, 1.44],  # bicycle
                [0.96, 2.35, 1.59],  # motorcycle
                [0.76, 0.80, 1.76],  # pedestrian
                [0.35, 0.73, 0.50],  # animal
                [1.92, 4.75, 1.71],  # car
                [2.42, 6.52, 2.34],  # emergency vehicle
                [2.92, 12.70, 3.42],  # bus
                [2.75, 8.17, 3.20],  # other vehicle
                [2.84, 10.24, 3.44]  # truck
            ],
            custom_values=[],
            rotations=[0, 1.57],
            reshape_out=False),
        tasks=[
            dict(
                num_class=2,
                class_names=['bicycle', 'motorcycle'],
                shared_conv_channels=(64, 64),
                shared_conv_strides=(1, 1),
                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
            dict(
                num_class=2,
                class_names=['pedestrian', 'animal'],
                shared_conv_channels=(64, 64),
                shared_conv_strides=(1, 1),
                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
            dict(
                num_class=2,
                class_names=['car', 'emergency_vehicle'],
                shared_conv_channels=(64, 64, 64),
                shared_conv_strides=(2, 1, 1),
                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
            dict(
                num_class=3,
                class_names=['bus', 'other_vehicle', 'truck'],
                shared_conv_channels=(64, 64, 64),
                shared_conv_strides=(2, 1, 1),
                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01))
        ],
        assign_per_class=True,
        diff_rad_by_sin=True,
        dir_offset=0.7854,  # pi/4
        dir_limit_offset=0,
        bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder', code_size=7),
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0),
        loss_dir=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.2)))
# model training and testing settings
train_cfg = dict(
    _delete_=True,
    pts=dict(
        assigner=[
            dict(  # bicycle
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.55,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # motorcycle
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.55,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # pedestrian
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.55,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # animal
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.55,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # car
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.6,
                neg_iou_thr=0.45,
                min_pos_iou=0.45,
                ignore_iof_thr=-1),
            dict(  # emergency vehicle
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.55,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # bus
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.6,
                neg_iou_thr=0.45,
                min_pos_iou=0.45,
                ignore_iof_thr=-1),
            dict(  # other vehicle
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.55,
                neg_iou_thr=0.4,
                min_pos_iou=0.4,
                ignore_iof_thr=-1),
            dict(  # truck
                type='MaxIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.6,
                neg_iou_thr=0.45,
                min_pos_iou=0.45,
                ignore_iof_thr=-1)
        ],
        allowed_border=0,
        code_weight=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
        pos_weight=-1,
        debug=False))
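The config above relies on the class order being the same in `class_names`, the per-class anchor `ranges`/`sizes`, the grouped `tasks`, and the per-class `assigner` list. A minimal sanity check for that convention might look like the sketch below. `check_class_order` is a hypothetical helper written for this example, not an MMDetection3D API, and the values are copied by hand from the config rather than loaded from it.

```python
def check_class_order(class_names, task_class_names, num_anchor_sizes,
                      num_assigners):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    # Concatenating the task groups should reproduce class_names exactly.
    flattened = [name for group in task_class_names for name in group]
    if flattened != class_names:
        problems.append('task groups do not match class_names order')
    # One anchor size and one assigner are expected per class.
    if num_anchor_sizes != len(class_names):
        problems.append('number of anchor sizes differs from classes')
    if num_assigners != len(class_names):
        problems.append('number of assigners differs from classes')
    return problems


# Values copied by hand from the SSN Lyft config.
class_names = [
    'bicycle', 'motorcycle', 'pedestrian', 'animal', 'car',
    'emergency_vehicle', 'bus', 'other_vehicle', 'truck'
]
task_class_names = [
    ['bicycle', 'motorcycle'],
    ['pedestrian', 'animal'],
    ['car', 'emergency_vehicle'],
    ['bus', 'other_vehicle', 'truck'],
]

problems = check_class_order(
    class_names, task_class_names, num_anchor_sizes=9, num_assigners=9)
print(problems or 'class order is consistent')
```

A check like this could be run before training whenever the grouping in `tasks` is changed, since a reordered group would otherwise silently pair classes with the wrong anchors.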
@@ -49,3 +49,7 @@ Please refer to [3DSSD](https://github.com/open-mmlab/mmdetection3d/blob/master/
### CenterPoint
Please refer to [CenterPoint](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/centerpoint) for details.
+### SSN
+Please refer to [SSN](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/ssn) for details. We currently provide PointPillars with the shape-aware grouping heads used in SSN on the Lyft dataset.