[Enhance] SSN benchmark on nuScenes (#182)

* Add ssn benchmark on nuscenes
* Update preliminary nuscenes results
* Update model_zoo.md
* RegNetX-SSN updated
* Add ssn regnet benchmark on Lyft
* RegNetX-SSN on Lyft updated
* Modify the batch size in the ssn regnet config name
* Memory of RegNetX-SSN on Lyft updated
* Model and log links updated
* Model and log links updated

Commit 9fef99cc, authored Oct 30, 2020 by twang, committed by GitHub Oct 30, 2020
6 changed files
--- a/README.md
+++ b/README.md
@@ -75,7 +75,7 @@ Results and models are available in the [model zoo](docs/model_zoo.md).
 | Part-A2            | ☐        | ☐        | ☐        | ✗         | ☐     | ✓        | ☐     |
 | MVXNet             | ☐        | ☐        | ☐        | ✗         | ☐     | ✓        | ☐     |
 | CenterPoint        | ☐        | ☐        | ☐        | ✗         | ☐     | ✓        | ☐     |
-| SSN                | ☐        | ☐        | ☐        | ✗         | ☐     | ✗        | ☐     |
+| SSN                | ☐        | ☐        | ☐        | ✗         | ☐     | ✓        | ☐     |

 Other features
 - [x] [Dynamic Voxelization](configs/carafe/README.md)

--- a/configs/ssn/README.md
+++ b/configs/ssn/README.md
@@ -2,7 +2,7 @@

 ## Introduction

-We implement PointPillars with Shape-aware grouping heads used in the SSN and provide the results and checkpoints on Lyft datasets.
+We implement PointPillars with the shape-aware grouping heads used in SSN and provide the results and checkpoints on the nuScenes and Lyft datasets.

 ```
 @inproceedings{zhu2020ssn,
@@ -16,12 +16,22 @@ We implement PointPillars with Shape-aware grouping heads used in the SSN and pr

 ## Results

+### NuScenes
+
+|  Backbone   | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
+| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
+|[SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py)|2x|16.4||35.17|49.76|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725-0817d270.pth) &#124; [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725.log.json)|
+|[SSN](./hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d.py)|2x|9.62||41.56|54.83|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d_20201023_193737-5fda3f00.pth) &#124; [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d_20201023_193737.log.json)|
+|[RegNetX-400MF-SECFPN](./hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d.py)|2x|16.4||41.15|55.20|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334-53044f32.pth) &#124; [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334.log.json)|
+|[RegNetX-400MF-SSN](./hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d.py)|2x|10.26||46.95|58.24|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d_20201024_232447-7af3d8c8.pth) &#124; [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d_20201024_232447.log.json)|
+
 ### Lyft

 |  Backbone   | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
 | :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
 |[SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_lyft-3d.py)|2x|||13.4|13.4||
 |[SSN](./hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d.py)|2x|8.30||17.4|17.5|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d_20201016_220844-3058d9fc.pth) &#124; [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d_20201016_220844.log.json)|
+|[RegNetX-400MF-SSN](./hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d.py)|2x|9.98||18.1|18.3|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_lyft-3d_20201025_213155-4532096c.pth) &#124; [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_lyft-3d_20201025_213155.log.json)|

 Note:


--- a/configs/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d.py
+++ b/configs/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d.py
+_base_ = './hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d.py'
+# model settings
+model = dict(
+    type='MVXFasterRCNN',
+    pretrained=dict(pts='open-mmlab://regnetx_400mf'),
+    pts_backbone=dict(
+        _delete_=True,
+        type='NoStemRegNet',
+        arch=dict(w0=24, wa=24.48, wm=2.54, group_w=16, depth=22, bot_mul=1.0),
+        out_indices=(1, 2, 3),
+        frozen_stages=-1,
+        strides=(1, 2, 2, 2),
+        base_channels=64,
+        stem_channels=64,
+        norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01),
+        norm_eval=False,
+        style='pytorch'),
+    pts_neck=dict(in_channels=[64, 160, 384]))
+# dataset settings
+data = dict(samples_per_gpu=1, workers_per_gpu=2)
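
Reviewer note: the `pts_neck` input channels `[64, 160, 384]` in the config above have to match the stage widths that the `arch` parameters produce. The sketch below recomputes those widths with the width rule from the RegNet design space as I understand it (linear widths, snapped to powers of `wm`, quantized to a multiple of 8, then adjusted for the group width). It is an independent illustration, not the `NoStemRegNet` code, and `regnet_stage_widths` is a hypothetical helper name.

```python
import math


def regnet_stage_widths(w0, wa, wm, depth, group_w, bot_mul=1.0, divisor=8):
    """Return (stage_widths, stage_depths) for a set of RegNet parameters.

    Hypothetical helper: a sketch of the published width rule, not the
    implementation used by the NoStemRegNet backbone.
    """
    block_widths = []
    for j in range(depth):
        u = w0 + wa * j                              # linear per-block width
        k = round(math.log(u / w0) / math.log(wm))   # nearest power of wm
        w = w0 * wm ** k
        block_widths.append(int(round(w / divisor) * divisor))

    # Blocks that share a width form one stage.
    unique_widths = sorted(set(block_widths))
    stage_depths = [block_widths.count(w) for w in unique_widths]

    # Make each bottleneck width divisible by the group width.
    stage_widths = []
    for w in unique_widths:
        w_bot = int(w * bot_mul)
        groups = min(group_w, w_bot)
        w_bot = int(round(w_bot / groups) * groups)
        stage_widths.append(int(w_bot / bot_mul))
    return stage_widths, stage_depths


# Values copied from the config in this diff.
arch = dict(w0=24, wa=24.48, wm=2.54, group_w=16, depth=22, bot_mul=1.0)
out_indices = (1, 2, 3)

widths, depths = regnet_stage_widths(**arch)
neck_in_channels = [widths[i] for i in out_indices]
print(widths, depths, neck_in_channels)
```

Under these assumptions the selected stages come out as `[64, 160, 384]`, which agrees with `pts_neck=dict(in_channels=[64, 160, 384])`.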
--- a/configs/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d.py
+++ b/configs/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d.py
+_base_ = './hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d.py'
+# model settings
+model = dict(
+    type='MVXFasterRCNN',
+    pretrained=dict(pts='open-mmlab://regnetx_400mf'),
+    pts_backbone=dict(
+        _delete_=True,
+        type='NoStemRegNet',
+        arch=dict(w0=24, wa=24.48, wm=2.54, group_w=16, depth=22, bot_mul=1.0),
+        out_indices=(1, 2, 3),
+        frozen_stages=-1,
+        strides=(1, 2, 2, 2),
+        base_channels=64,
+        stem_channels=64,
+        norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01),
+        norm_eval=False,
+        style='pytorch'),
+    pts_neck=dict(in_channels=[64, 160, 384]))
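
Reviewer note: both RegNet configs rely on two different override behaviours. `pts_backbone` carries `_delete_=True`, so the inherited backbone dict is replaced wholesale, while `pts_neck` has no such flag, so only `in_channels` changes and the other neck settings are inherited. The following is a simplified sketch of that merge rule, not the config loader's actual implementation; `merge_config` is a hypothetical name and the base backbone values are placeholders, since the base model file is not part of this diff.

```python
import copy

DELETE_KEY = '_delete_'


def merge_config(base, child):
    """Recursively merge `child` into a copy of `base`.

    A nested dict in `child` is merged key by key, unless it contains
    `_delete_=True`, in which case it replaces the base dict entirely.
    """
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = dict(value)
            replace = value.pop(DELETE_KEY, False)
            if not replace and isinstance(merged.get(key), dict):
                merged[key] = merge_config(merged[key], value)
            else:
                # Start from an empty dict so nothing is inherited.
                merged[key] = merge_config({}, value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base_model = dict(
    # Placeholder backbone: the real base values are not shown in this diff.
    pts_backbone=dict(type='SECOND', layer_nums=[3, 5, 5]),
    # Neck values taken from hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d.py.
    pts_neck=dict(
        type='SECONDFPN',
        in_channels=[64, 128, 256],
        upsample_strides=[1, 2, 4],
        out_channels=[128, 128, 128]))

child_model = dict(
    pts_backbone=dict(
        _delete_=True, type='NoStemRegNet', out_indices=(1, 2, 3)),
    pts_neck=dict(in_channels=[64, 160, 384]))

merged_model = merge_config(base_model, child_model)
print(merged_model['pts_backbone'])
print(merged_model['pts_neck'])
```

In the merged result the backbone keeps no key from the placeholder base, whereas the neck keeps `type`, `upsample_strides` and `out_channels` and only swaps `in_channels`.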
--- a/configs/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d.py
+++ b/configs/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d.py
+_base_ = [
+    '../_base_/models/hv_pointpillars_fpn_nus.py',
+    '../_base_/datasets/nus-3d.py',
+    '../_base_/schedules/schedule_2x.py',
+    '../_base_/default_runtime.py',
+]
+# Note that the order of class names should be consistent with
+# the following anchors' order
+point_cloud_range = [-50, -50, -5, 50, 50, 3]
+class_names = [
+    'bicycle', 'motorcycle', 'pedestrian', 'traffic_cone', 'barrier', 'car',
+    'truck', 'trailer', 'bus', 'construction_vehicle'
+]
+
+train_pipeline = [
+    dict(type='LoadPointsFromFile', load_dim=5, use_dim=5),
+    dict(type='LoadPointsFromMultiSweeps', sweeps_num=10),
+    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
+    dict(
+        type='GlobalRotScaleTrans',
+        rot_range=[-0.3925, 0.3925],
+        scale_ratio_range=[0.95, 1.05],
+        translation_std=[0, 0, 0]),
+    dict(
+        type='RandomFlip3D',
+        sync_2d=False,
+        flip_ratio_bev_horizontal=0.5,
+        flip_ratio_bev_vertical=0.5),
+    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
+    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
+    dict(type='PointShuffle'),
+    dict(type='DefaultFormatBundle3D', class_names=class_names),
+    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
+]
+test_pipeline = [
+    dict(type='LoadPointsFromFile', load_dim=5, use_dim=5),
+    dict(type='LoadPointsFromMultiSweeps', sweeps_num=10),
+    dict(
+        type='MultiScaleFlipAug3D',
+        img_scale=(1333, 800),
+        pts_scale_ratio=1,
+        flip=False,
+        transforms=[
+            dict(
+                type='GlobalRotScaleTrans',
+                rot_range=[0, 0],
+                scale_ratio_range=[1., 1.],
+                translation_std=[0, 0, 0]),
+            dict(type='RandomFlip3D'),
+            dict(
+                type='PointsRangeFilter', point_cloud_range=point_cloud_range),
+            dict(
+                type='DefaultFormatBundle3D',
+                class_names=class_names,
+                with_label=False),
+            dict(type='Collect3D', keys=['points'])
+        ])
+]
+data = dict(
+    samples_per_gpu=2,
+    workers_per_gpu=4,
+    train=dict(pipeline=train_pipeline, classes=class_names),
+    val=dict(pipeline=test_pipeline, classes=class_names),
+    test=dict(pipeline=test_pipeline, classes=class_names))
+
+# model settings
+model = dict(
+    pts_voxel_layer=dict(max_num_points=20),
+    pts_voxel_encoder=dict(feat_channels=[64, 64]),
+    pts_neck=dict(
+        _delete_=True,
+        type='SECONDFPN',
+        norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01),
+        in_channels=[64, 128, 256],
+        upsample_strides=[1, 2, 4],
+        out_channels=[128, 128, 128]),
+    pts_bbox_head=dict(
+        _delete_=True,
+        type='ShapeAwareHead',
+        num_classes=10,
+        in_channels=384,
+        feat_channels=384,
+        use_direction_classifier=True,
+        anchor_generator=dict(
+            type='AlignedAnchor3DRangeGeneratorPerCls',
+            ranges=[[-50, -50, -1.67339111, 50, 50, -1.67339111],
+                    [-50, -50, -1.71396371, 50, 50, -1.71396371],
+                    [-50, -50, -1.61785072, 50, 50, -1.61785072],
+                    [-50, -50, -1.80984986, 50, 50, -1.80984986],
+                    [-50, -50, -1.76396500, 50, 50, -1.76396500],
+                    [-50, -50, -1.80032795, 50, 50, -1.80032795],
+                    [-50, -50, -1.74440365, 50, 50, -1.74440365],
+                    [-50, -50, -1.68526504, 50, 50, -1.68526504],
+                    [-50, -50, -1.80673031, 50, 50, -1.80673031],
+                    [-50, -50, -1.64824291, 50, 50, -1.64824291]],
+            sizes=[
+                [0.60058911, 1.68452161, 1.27192197],  # bicycle
+                [0.76279481, 2.09973778, 1.44403034],  # motorcycle
+                [0.66344886, 0.72564370, 1.75748069],  # pedestrian
+                [0.39694519, 0.40359262, 1.06232151],  # traffic cone
+                [2.49008838, 0.48578221, 0.98297065],  # barrier
+                [1.95017717, 4.60718145, 1.72270761],  # car
+                [2.45609390, 6.73778078, 2.73004906],  # truck
+                [2.87427237, 12.01320693, 3.81509561],  # trailer
+                [2.94046906, 11.1885991, 3.47030982],  # bus
+                [2.73050468, 6.38352896, 3.13312415]  # construction vehicle
+            ],
+            custom_values=[0, 0],
+            rotations=[0, 1.57],
+            reshape_out=False),
+        tasks=[
+            dict(
+                num_class=2,
+                class_names=['bicycle', 'motorcycle'],
+                shared_conv_channels=(64, 64),
+                shared_conv_strides=(1, 1),
+                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
+            dict(
+                num_class=1,
+                class_names=['pedestrian'],
+                shared_conv_channels=(64, 64),
+                shared_conv_strides=(1, 1),
+                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
+            dict(
+                num_class=2,
+                class_names=['traffic_cone', 'barrier'],
+                shared_conv_channels=(64, 64),
+                shared_conv_strides=(1, 1),
+                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
+            dict(
+                num_class=1,
+                class_names=['car'],
+                shared_conv_channels=(64, 64, 64),
+                shared_conv_strides=(2, 1, 1),
+                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01)),
+            dict(
+                num_class=4,
+                class_names=[
+                    'truck', 'trailer', 'bus', 'construction_vehicle'
+                ],
+                shared_conv_channels=(64, 64, 64),
+                shared_conv_strides=(2, 1, 1),
+                norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01))
+        ],
+        assign_per_class=True,
+        diff_rad_by_sin=True,
+        dir_offset=0.7854,  # pi/4
+        dir_limit_offset=0,
+        bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder', code_size=9),
+        loss_cls=dict(
+            type='FocalLoss',
+            use_sigmoid=True,
+            gamma=2.0,
+            alpha=0.25,
+            loss_weight=1.0),
+        loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0),
+        loss_dir=dict(
+            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.2)))
+
+# model training and testing settings
+train_cfg = dict(
+    _delete_=True,
+    pts=dict(
+        assigner=[
+            dict(  # bicycle
+                type='MaxIoUAssigner',
+                iou_calculator=dict(type='BboxOverlapsNearest3D'),
+                pos_iou_thr=0.5,
+                neg_iou_thr=0.35,
+                min_pos_iou=0.35,
+                ignore_iof_thr=-1),
+            dict(  # motorcycle
+                type='MaxIoUAssigner',
+                iou_calculator=dict(type='BboxOverlapsNearest3D'),
+                pos_iou_thr=0.5,
+                neg_iou_thr=0.3,
+                min_pos_iou=0.3,
+                ignore_iof_thr=-1),
+            dict(  # pedestrian
+                type='MaxIoUAssigner',
+                iou_calculator=dict(type='BboxOverlapsNearest3D'),
+                pos_iou_thr=0.6,
+                neg_iou_thr=0.4,
+                min_pos_iou=0.4,
+                ignore_iof_thr=-1),
+            dict(  # traffic cone
+                type='MaxIoUAssigner',
+                iou_calculator=dict(type='BboxOverlapsNearest3D'),
+                pos_iou_thr=0.6,
+                neg_iou_thr=0.4,
+                min_pos_iou=0.4,
+                ignore_iof_thr=-1),
+            dict(  # barrier
+                type='MaxIoUAssigner',
+                iou_calculator=dict(type='BboxOverlapsNearest3D'),
+                pos_iou_thr=0.55,
+                neg_iou_thr=0.4,
+                min_pos_iou=0.4,
+                ignore_iof_thr=-1),
+            dict(  # car
+                type='MaxIoUAssigner',
+                iou_calculator=dict(type='BboxOverlapsNearest3D'),
+                pos_iou_thr=0.6,
+                neg_iou_thr=0.45,
+                min_pos_iou=0.45,
+                ignore_iof_thr=-1),
+            dict(  # truck
+                type='MaxIoUAssigner',
+                iou_calculator=dict(type='BboxOverlapsNearest3D'),
+                pos_iou_thr=0.55,
+                neg_iou_thr=0.4,
+                min_pos_iou=0.4,
+                ignore_iof_thr=-1),
+            dict(  # trailer
+                type='MaxIoUAssigner',
+                iou_calculator=dict(type='BboxOverlapsNearest3D'),
+                pos_iou_thr=0.5,
+                neg_iou_thr=0.35,
+                min_pos_iou=0.35,
+                ignore_iof_thr=-1),
+            dict(  # bus
+                type='MaxIoUAssigner',
+                iou_calculator=dict(type='BboxOverlapsNearest3D'),
+                pos_iou_thr=0.55,
+                neg_iou_thr=0.4,
+                min_pos_iou=0.4,
+                ignore_iof_thr=-1),
+            dict(  # construction vehicle
+                type='MaxIoUAssigner',
+                iou_calculator=dict(type='BboxOverlapsNearest3D'),
+                pos_iou_thr=0.5,
+                neg_iou_thr=0.35,
+                min_pos_iou=0.35,
+                ignore_iof_thr=-1)
+        ],
+        allowed_border=0,
+        code_weight=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2],
+        pos_weight=-1,
+        debug=False))
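
Reviewer note: the config above depends on several lists staying in the same class order (`class_names`, the anchor `ranges` and `sizes`, the `tasks` groups, and the per-class `assigner` entries), as its own comment points out. A small standalone check along the following lines could catch ordering mistakes before training. It is only a sketch: `check_ssn_layout` is a hypothetical helper, and the label lists are transcribed by hand from the inline comments in the diff.

```python
def check_ssn_layout(class_names, tasks, per_class_labels,
                     neck_out_channels, head_in_channels,
                     code_size, code_weight):
    """Return a list of human-readable problems (empty if consistent)."""
    problems = []

    # Task groups, flattened in order, must reproduce class_names.
    flattened = [name for task in tasks for name in task['class_names']]
    if flattened != list(class_names):
        problems.append('task class order differs from class_names')

    # Each task must declare as many classes as it lists.
    for task in tasks:
        if task['num_class'] != len(task['class_names']):
            problems.append('num_class mismatch in %s' % task['class_names'])

    # Anchor sizes and assigners are annotated per class, in order.
    for label, entries in per_class_labels.items():
        normalized = [entry.replace(' ', '_') for entry in entries]
        if normalized != list(class_names):
            problems.append('%s order differs from class_names' % label)

    # The head consumes the concatenated neck outputs.
    if sum(neck_out_channels) != head_in_channels:
        problems.append('head in_channels != sum of neck out_channels')

    # One regression weight per box code element.
    if len(code_weight) != code_size:
        problems.append('code_weight length != code_size')
    return problems


class_names = [
    'bicycle', 'motorcycle', 'pedestrian', 'traffic_cone', 'barrier', 'car',
    'truck', 'trailer', 'bus', 'construction_vehicle'
]
tasks = [
    dict(num_class=2, class_names=['bicycle', 'motorcycle']),
    dict(num_class=1, class_names=['pedestrian']),
    dict(num_class=2, class_names=['traffic_cone', 'barrier']),
    dict(num_class=1, class_names=['car']),
    dict(num_class=4,
         class_names=['truck', 'trailer', 'bus', 'construction_vehicle']),
]
# Labels copied from the inline comments next to `sizes` and `assigner`.
comment_order = [
    'bicycle', 'motorcycle', 'pedestrian', 'traffic cone', 'barrier', 'car',
    'truck', 'trailer', 'bus', 'construction vehicle'
]
problems = check_ssn_layout(
    class_names, tasks,
    per_class_labels=dict(anchor_sizes=comment_order, assigner=comment_order),
    neck_out_channels=[128, 128, 128], head_in_channels=384,
    code_size=9,
    code_weight=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2])
print(problems)
```

For the values in this diff the check reports no problems; reordering `class_names` without touching the other lists would be flagged.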
--- a/docs/model_zoo.md
+++ b/docs/model_zoo.md
@@ -52,4 +52,4 @@ Please refer to [CenterPoint](https://github.com/open-mmlab/mmdetection3d/blob/m

 ### SSN

-Please refer to [SSN](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/ssn) for details. We provide pointpillars with shape-aware grouping heads used in SSN on Lyft datasets currently.
+Please refer to [SSN](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/ssn) for details. We provide pointpillars with shape-aware grouping heads used in SSN on the nuScenes and Lyft datasets currently.