<div id="top" align="center">

# InternImage-based Baseline for Online HD Map Construction Challenge for Autonomous Driving

</div>
For detailed information about the challenge, please refer to https://github.com/Tsinghua-MARS-Lab/Online-HD-Map-Construction-CVPR2023/tree/master.
### 1. Requirements
```bash
python>=3.8
torch==1.11 # recommend
mmcv-full>=1.5.2
mmdet==2.28.1
mmsegmentation==0.29.1
timm
numpy==1.23.5
mmdet3d==1.0.0rc6 # recommend
```
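Since several of these pins are exact (`==`) while others are minimums (`>=`), a quick runtime sanity check can catch mismatches before training. A minimal sketch using only the standard library (the version-comparison logic is a simplification, not a full PEP 440 implementation):

```python
from importlib.metadata import version, PackageNotFoundError

# Pinned ("==") or minimum (">=") versions from the requirements list above.
REQUIRED = {
    "mmcv-full": (">=", "1.5.2"),
    "mmdet": ("==", "2.28.1"),
    "mmsegmentation": ("==", "0.29.1"),
    "numpy": ("==", "1.23.5"),
}

def parse(v):
    """Split a version string into a tuple of integers for comparison."""
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def satisfied(installed, op, wanted):
    """Check an installed version against a '==' or '>=' requirement."""
    if op == "==":
        # Exact pin: compare on the pinned precision (1.11 matches 1.11.0).
        return parse(installed)[:len(parse(wanted))] == parse(wanted)
    return parse(installed) >= parse(wanted)

for pkg, (op, wanted) in REQUIRED.items():
    try:
        print(pkg, "ok" if satisfied(version(pkg), op, wanted) else "MISMATCH")
    except PackageNotFoundError:
        print(pkg, "not installed")
```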
### 2. Install DCNv3 for InternImage
```bash
cd projects/ops_dcnv3
bash make.sh # requires torch>=1.10
```
### 3. Train with InternImage-Small
```bash
bash tools/dist_train.sh src/configs/vectormapnet_intern.py ${NUM_GPUS}
```
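Note that `dist_train.sh` launches one training process per GPU, so the effective batch size is `samples_per_gpu` (set in the config) multiplied by `${NUM_GPUS}`. A minimal illustration (the per-GPU value of 2 below is an assumption, not a value taken from the training config):

```python
def effective_batch_size(samples_per_gpu: int, num_gpus: int) -> int:
    """Total samples per optimizer step under data-parallel training."""
    return samples_per_gpu * num_gpus

# e.g. 2 samples per GPU on 8 GPUs
print(effective_batch_size(2, 8))
```

If you change the GPU count, consider scaling the learning rate accordingly, as is common practice for data-parallel training.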
Note: InternImage provides abundant pre-trained model weights that can be used.
### 4. Performance compared to baseline
| model name | weight | $\mathrm{mAP}$ | $\mathrm{AP}_{pc}$ | $\mathrm{AP}_{div}$ | $\mathrm{AP}_{bound}$ |
| ---- | :----: | :--: | :--: | :--: | :--: |
| vectormapnet_intern | [Checkpoint](https://github.com/OpenGVLab/InternImage/releases/download/track_model/vectormapnet_internimage.pth) | 49.35 | 45.05 | 56.78 | 46.22 |
| vectormapnet_base | [Google Drive](https://drive.google.com/file/d/16D1CMinwA8PG1sd9PV9_WtHzcBohvO-D/view) | 42.79 | 37.22 | 50.47 | 40.68 |
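The reported $\mathrm{mAP}$ is the arithmetic mean of the three per-class APs (pedestrian crossing, divider, boundary), as both table rows confirm:

```python
def mean_ap(ap_pc: float, ap_div: float, ap_bound: float) -> float:
    """mAP over the three map-element classes, rounded to 2 decimals."""
    return round((ap_pc + ap_div + ap_bound) / 3, 2)

print(mean_ap(45.05, 56.78, 46.22))  # vectormapnet_intern -> 49.35
print(mean_ap(37.22, 50.47, 40.68))  # vectormapnet_base   -> 42.79
```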
## Citation
The evaluation metrics of this challenge follow [HDMapNet](https://arxiv.org/abs/2107.06307). We provide [VectorMapNet](https://arxiv.org/abs/2206.08920) as the baseline. Please cite:
```
@article{li2021hdmapnet,
  title={HDMapNet: An Online HD Map Construction and Evaluation Framework},
  author={Qi Li and Yue Wang and Yilun Wang and Hang Zhao},
  journal={arXiv preprint arXiv:2107.06307},
  year={2021}
}
```
Our dataset is built on top of the [Argoverse 2](https://www.argoverse.org/av2.html) dataset. Please also cite:
```
@INPROCEEDINGS{Argoverse2,
  author = {Benjamin Wilson and William Qi and Tanmay Agarwal and John Lambert and Jagjeet Singh and Siddhesh Khandelwal and Bowen Pan and Ratnesh Kumar and Andrew Hartnett and Jhony Kaesemodel Pontes and Deva Ramanan and Peter Carr and James Hays},
  title = {Argoverse 2: Next Generation Datasets for Self-driving Perception and Forecasting},
  booktitle = {Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS Datasets and Benchmarks 2021)},
  year = {2021}
}
```
## License
Before participating in our challenge, you should register on the website and agree to the terms of use of the [Argoverse 2](https://www.argoverse.org/av2.html) dataset.
All code in this project is released under the [GNU General Public License v3.0](./LICENSE).
from .models import *
from .datasets import *
dataset_type = 'CocoDataset'
data_root = 'data/coco/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline))
evaluation = dict(metric=['bbox', 'segm'])
# dataset settings
dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
point_cloud_range = [0, -40, -3, 70.4, 40, 1]
input_modality = dict(use_lidar=True, use_camera=False)
db_sampler = dict(
    data_root=data_root,
    info_path=data_root + 'kitti_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(
        filter_by_difficulty=[-1],
        filter_by_min_points=dict(Car=5, Pedestrian=10, Cyclist=10)),
    classes=class_names,
    sample_groups=dict(Car=12, Pedestrian=6, Cyclist=6))
file_client_args = dict(backend='disk')
# Uncomment the following if use ceph or other file clients.
# See https://mmcv.readthedocs.io/en/latest/api.html#mmcv.fileio.FileClient
# for more details.
# file_client_args = dict(
#     backend='petrel', path_mapping=dict(data='s3://kitti_data/'))
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        file_client_args=file_client_args),
    dict(
        type='LoadAnnotations3D',
        with_bbox_3d=True,
        with_label_3d=True,
        file_client_args=file_client_args),
    dict(type='ObjectSample', db_sampler=db_sampler),
    dict(
        type='ObjectNoise',
        num_try=100,
        translation_std=[1.0, 1.0, 0.5],
        global_rot_range=[0.0, 0.0],
        rot_range=[-0.78539816, 0.78539816]),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        file_client_args=file_client_args),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        file_client_args=file_client_args),
    dict(
        type='DefaultFormatBundle3D',
        class_names=class_names,
        with_label=False),
    dict(type='Collect3D', keys=['points'])
]
data = dict(
    samples_per_gpu=6,
    workers_per_gpu=4,
    train=dict(
        type='RepeatDataset',
        times=2,
        dataset=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file=data_root + 'kitti_infos_train.pkl',
            split='training',
            pts_prefix='velodyne_reduced',
            pipeline=train_pipeline,
            modality=input_modality,
            classes=class_names,
            test_mode=False,
            # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
            # and box_type_3d='Depth' in sunrgbd and scannet dataset.
            box_type_3d='LiDAR')),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'kitti_infos_val.pkl',
        split='training',
        pts_prefix='velodyne_reduced',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'kitti_infos_val.pkl',
        split='training',
        pts_prefix='velodyne_reduced',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'))
evaluation = dict(interval=1, pipeline=eval_pipeline)
# dataset settings
dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Car']
point_cloud_range = [0, -40, -3, 70.4, 40, 1]
input_modality = dict(use_lidar=True, use_camera=False)
db_sampler = dict(
    data_root=data_root,
    info_path=data_root + 'kitti_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(filter_by_difficulty=[-1], filter_by_min_points=dict(Car=5)),
    classes=class_names,
    sample_groups=dict(Car=15))
file_client_args = dict(backend='disk')
# Uncomment the following if use ceph or other file clients.
# See https://mmcv.readthedocs.io/en/latest/api.html#mmcv.fileio.FileClient
# for more details.
# file_client_args = dict(
#     backend='petrel', path_mapping=dict(data='s3://kitti_data/'))
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        file_client_args=file_client_args),
    dict(
        type='LoadAnnotations3D',
        with_bbox_3d=True,
        with_label_3d=True,
        file_client_args=file_client_args),
    dict(type='ObjectSample', db_sampler=db_sampler),
    dict(
        type='ObjectNoise',
        num_try=100,
        translation_std=[1.0, 1.0, 0.5],
        global_rot_range=[0.0, 0.0],
        rot_range=[-0.78539816, 0.78539816]),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        file_client_args=file_client_args),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        file_client_args=file_client_args),
    dict(
        type='DefaultFormatBundle3D',
        class_names=class_names,
        with_label=False),
    dict(type='Collect3D', keys=['points'])
]
data = dict(
    samples_per_gpu=6,
    workers_per_gpu=4,
    train=dict(
        type='RepeatDataset',
        times=2,
        dataset=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file=data_root + 'kitti_infos_train.pkl',
            split='training',
            pts_prefix='velodyne_reduced',
            pipeline=train_pipeline,
            modality=input_modality,
            classes=class_names,
            test_mode=False,
            # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
            # and box_type_3d='Depth' in sunrgbd and scannet dataset.
            box_type_3d='LiDAR')),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'kitti_infos_val.pkl',
        split='training',
        pts_prefix='velodyne_reduced',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'kitti_infos_val.pkl',
        split='training',
        pts_prefix='velodyne_reduced',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'))
evaluation = dict(interval=1, pipeline=eval_pipeline)
# If point cloud range is changed, the models should also change their point
# cloud range accordingly
point_cloud_range = [-80, -80, -5, 80, 80, 3]
# For Lyft we usually do 9-class detection
class_names = [
    'car', 'truck', 'bus', 'emergency_vehicle', 'other_vehicle', 'motorcycle',
    'bicycle', 'pedestrian', 'animal'
]
dataset_type = 'LyftDataset'
data_root = 'data/lyft/'
# Input modality for Lyft dataset, this is consistent with the submission
# format which requires the information in input_modality.
input_modality = dict(
    use_lidar=True,
    use_camera=False,
    use_radar=False,
    use_map=False,
    use_external=False)
file_client_args = dict(backend='disk')
# Uncomment the following if use ceph or other file clients.
# See https://mmcv.readthedocs.io/en/latest/api.html#mmcv.fileio.FileClient
# for more details.
# file_client_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/lyft/': 's3://lyft/lyft/',
#         'data/lyft/': 's3://lyft/lyft/'
#     }))
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='LoadPointsFromMultiSweeps',
        sweeps_num=10,
        file_client_args=file_client_args),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.3925, 0.3925],
        scale_ratio_range=[0.95, 1.05],
        translation_std=[0, 0, 0]),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='LoadPointsFromMultiSweeps',
        sweeps_num=10,
        file_client_args=file_client_args),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='LoadPointsFromMultiSweeps',
        sweeps_num=10,
        file_client_args=file_client_args),
    dict(
        type='DefaultFormatBundle3D',
        class_names=class_names,
        with_label=False),
    dict(type='Collect3D', keys=['points'])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'lyft_infos_train.pkl',
        pipeline=train_pipeline,
        classes=class_names,
        modality=input_modality,
        test_mode=False),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'lyft_infos_val.pkl',
        pipeline=test_pipeline,
        classes=class_names,
        modality=input_modality,
        test_mode=True),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'lyft_infos_test.pkl',
        pipeline=test_pipeline,
        classes=class_names,
        modality=input_modality,
        test_mode=True))
# For Lyft dataset, we usually evaluate the model at the end of training.
# Since the models are trained by 24 epochs by default, we set evaluation
# interval to be 24. Please change the interval accordingly if you do not
# use a default schedule.
evaluation = dict(interval=24, pipeline=eval_pipeline)
dataset_type = 'CocoDataset'
data_root = 'data/nuimages/'
class_names = [
    'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle',
    'motorcycle', 'pedestrian', 'traffic_cone', 'barrier'
]
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(
        type='Resize',
        img_scale=[(1280, 720), (1920, 1080)],
        multiscale_mode='range',
        keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1600, 900),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/nuimages_v1.0-train.json',
        img_prefix=data_root,
        classes=class_names,
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/nuimages_v1.0-val.json',
        img_prefix=data_root,
        classes=class_names,
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/nuimages_v1.0-val.json',
        img_prefix=data_root,
        classes=class_names,
        pipeline=test_pipeline))
evaluation = dict(metric=['bbox', 'segm'])
# If the point cloud range is changed, the model's point cloud range should
# be changed accordingly.
point_cloud_range = [-50, -50, -5, 50, 50, 3] point_cloud_range = [-50, -50, -5, 50, 50, 3]
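# The range is given as [x_min, y_min, z_min, x_max, y_max, z_max] in meters
# in the LiDAR frame. As a purely hypothetical example, a forward-facing-only
# setup might instead use:
# point_cloud_range = [0, -50, -5, 100, 50, 3]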
# For nuScenes we usually do 10-class detection # For nuScenes we usually do 10-class detection
class_names = [ class_names = [
'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle', 'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle',
'motorcycle', 'pedestrian', 'traffic_cone', 'barrier' 'motorcycle', 'pedestrian', 'traffic_cone', 'barrier'
] ]
dataset_type = 'NuScenesDataset' dataset_type = 'NuScenesDataset'
data_root = 'data/nuscenes/' data_root = 'data/nuscenes/'
# Input modality for the nuScenes dataset; this is consistent with the
# submission format, which requires the information in input_modality.
input_modality = dict( input_modality = dict(
use_lidar=True, use_lidar=True,
use_camera=False, use_camera=False,
use_radar=False, use_radar=False,
use_map=False, use_map=False,
use_external=False) use_external=False)
file_client_args = dict(backend='disk') file_client_args = dict(backend='disk')
# Uncomment the following if using Ceph or other file clients.
# See https://mmcv.readthedocs.io/en/latest/api.html#mmcv.fileio.FileClient # See https://mmcv.readthedocs.io/en/latest/api.html#mmcv.fileio.FileClient
# for more details. # for more details.
# file_client_args = dict( # file_client_args = dict(
# backend='petrel', # backend='petrel',
# path_mapping=dict({ # path_mapping=dict({
# './data/nuscenes/': 's3://nuscenes/nuscenes/', # './data/nuscenes/': 's3://nuscenes/nuscenes/',
# 'data/nuscenes/': 's3://nuscenes/nuscenes/' # 'data/nuscenes/': 's3://nuscenes/nuscenes/'
# })) # }))
train_pipeline = [ train_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='LIDAR', coord_type='LIDAR',
load_dim=5, load_dim=5,
use_dim=5, use_dim=5,
file_client_args=file_client_args), file_client_args=file_client_args),
dict( dict(
type='LoadPointsFromMultiSweeps', type='LoadPointsFromMultiSweeps',
sweeps_num=10, sweeps_num=10,
file_client_args=file_client_args), file_client_args=file_client_args),
dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True), dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
dict( dict(
type='GlobalRotScaleTrans', type='GlobalRotScaleTrans',
rot_range=[-0.3925, 0.3925], rot_range=[-0.3925, 0.3925],
scale_ratio_range=[0.95, 1.05], scale_ratio_range=[0.95, 1.05],
translation_std=[0, 0, 0]), translation_std=[0, 0, 0]),
dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5), dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range), dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range), dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
dict(type='ObjectNameFilter', classes=class_names), dict(type='ObjectNameFilter', classes=class_names),
dict(type='PointShuffle'), dict(type='PointShuffle'),
dict(type='DefaultFormatBundle3D', class_names=class_names), dict(type='DefaultFormatBundle3D', class_names=class_names),
dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d']) dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
] ]
test_pipeline = [ test_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='LIDAR', coord_type='LIDAR',
load_dim=5, load_dim=5,
use_dim=5, use_dim=5,
file_client_args=file_client_args), file_client_args=file_client_args),
dict( dict(
type='LoadPointsFromMultiSweeps', type='LoadPointsFromMultiSweeps',
sweeps_num=10, sweeps_num=10,
file_client_args=file_client_args), file_client_args=file_client_args),
dict( dict(
type='MultiScaleFlipAug3D', type='MultiScaleFlipAug3D',
img_scale=(1333, 800), img_scale=(1333, 800),
pts_scale_ratio=1, pts_scale_ratio=1,
flip=False, flip=False,
transforms=[ transforms=[
dict( dict(
type='GlobalRotScaleTrans', type='GlobalRotScaleTrans',
rot_range=[0, 0], rot_range=[0, 0],
scale_ratio_range=[1., 1.], scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]), translation_std=[0, 0, 0]),
dict(type='RandomFlip3D'), dict(type='RandomFlip3D'),
dict( dict(
type='PointsRangeFilter', point_cloud_range=point_cloud_range), type='PointsRangeFilter', point_cloud_range=point_cloud_range),
dict( dict(
type='DefaultFormatBundle3D', type='DefaultFormatBundle3D',
class_names=class_names, class_names=class_names,
with_label=False), with_label=False),
dict(type='Collect3D', keys=['points']) dict(type='Collect3D', keys=['points'])
]) ])
] ]
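# With flip=False and pts_scale_ratio=1, MultiScaleFlipAug3D runs a single
# pass; it serves here mainly as a wrapper so the test-time interface stays
# uniform (no actual test-time augmentation is performed).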
# Construct a pipeline for data and GT loading in the show function.
# Please keep its loading consistent with test_pipeline (e.g. the file client).
eval_pipeline = [ eval_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='LIDAR', coord_type='LIDAR',
load_dim=5, load_dim=5,
use_dim=5, use_dim=5,
file_client_args=file_client_args), file_client_args=file_client_args),
dict( dict(
type='LoadPointsFromMultiSweeps', type='LoadPointsFromMultiSweeps',
sweeps_num=10, sweeps_num=10,
file_client_args=file_client_args), file_client_args=file_client_args),
dict( dict(
type='DefaultFormatBundle3D', type='DefaultFormatBundle3D',
class_names=class_names, class_names=class_names,
with_label=False), with_label=False),
dict(type='Collect3D', keys=['points']) dict(type='Collect3D', keys=['points'])
] ]
data = dict( data = dict(
samples_per_gpu=4, samples_per_gpu=4,
workers_per_gpu=4, workers_per_gpu=4,
train=dict( train=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'nuscenes_infos_train.pkl', ann_file=data_root + 'nuscenes_infos_train.pkl',
pipeline=train_pipeline, pipeline=train_pipeline,
classes=class_names, classes=class_names,
modality=input_modality, modality=input_modality,
test_mode=False, test_mode=False,
# We use box_type_3d='LiDAR' for the KITTI and nuScenes datasets,
# and box_type_3d='Depth' for the SUN RGB-D and ScanNet datasets.
box_type_3d='LiDAR'), box_type_3d='LiDAR'),
val=dict( val=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'nuscenes_infos_val.pkl', ann_file=data_root + 'nuscenes_infos_val.pkl',
pipeline=test_pipeline, pipeline=test_pipeline,
classes=class_names, classes=class_names,
modality=input_modality, modality=input_modality,
test_mode=True, test_mode=True,
box_type_3d='LiDAR'), box_type_3d='LiDAR'),
test=dict( test=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'nuscenes_infos_val.pkl', ann_file=data_root + 'nuscenes_infos_val.pkl',
pipeline=test_pipeline, pipeline=test_pipeline,
classes=class_names, classes=class_names,
modality=input_modality, modality=input_modality,
test_mode=True, test_mode=True,
box_type_3d='LiDAR')) box_type_3d='LiDAR'))
# For nuScenes dataset, we usually evaluate the model at the end of training. # For nuScenes dataset, we usually evaluate the model at the end of training.
# Since the models are trained for 24 epochs by default, we set the evaluation
# interval to 24. Please change the interval accordingly if you do not
# use the default schedule.
evaluation = dict(interval=24, pipeline=eval_pipeline) evaluation = dict(interval=24, pipeline=eval_pipeline)
dataset_type = 'NuScenesMonoDataset' dataset_type = 'NuScenesMonoDataset'
data_root = 'data/nuscenes/' data_root = 'data/nuscenes/'
class_names = [ class_names = [
'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle', 'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle',
'motorcycle', 'pedestrian', 'traffic_cone', 'barrier' 'motorcycle', 'pedestrian', 'traffic_cone', 'barrier'
] ]
# Input modality for the nuScenes dataset; this is consistent with the
# submission format, which requires the information in input_modality.
input_modality = dict( input_modality = dict(
use_lidar=False, use_lidar=False,
use_camera=True, use_camera=True,
use_radar=False, use_radar=False,
use_map=False, use_map=False,
use_external=False) use_external=False)
img_norm_cfg = dict( img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [ train_pipeline = [
dict(type='LoadImageFromFileMono3D'), dict(type='LoadImageFromFileMono3D'),
dict( dict(
type='LoadAnnotations3D', type='LoadAnnotations3D',
with_bbox=True, with_bbox=True,
with_label=True, with_label=True,
with_attr_label=True, with_attr_label=True,
with_bbox_3d=True, with_bbox_3d=True,
with_label_3d=True, with_label_3d=True,
with_bbox_depth=True), with_bbox_depth=True),
dict(type='Resize', img_scale=(1600, 900), keep_ratio=True), dict(type='Resize', img_scale=(1600, 900), keep_ratio=True),
dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5), dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
dict(type='Normalize', **img_norm_cfg), dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32), dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle3D', class_names=class_names), dict(type='DefaultFormatBundle3D', class_names=class_names),
dict( dict(
type='Collect3D', type='Collect3D',
keys=[ keys=[
'img', 'gt_bboxes', 'gt_labels', 'attr_labels', 'gt_bboxes_3d', 'img', 'gt_bboxes', 'gt_labels', 'attr_labels', 'gt_bboxes_3d',
'gt_labels_3d', 'centers2d', 'depths' 'gt_labels_3d', 'centers2d', 'depths'
]), ]),
] ]
test_pipeline = [ test_pipeline = [
dict(type='LoadImageFromFileMono3D'), dict(type='LoadImageFromFileMono3D'),
dict( dict(
type='MultiScaleFlipAug', type='MultiScaleFlipAug',
scale_factor=1.0, scale_factor=1.0,
flip=False, flip=False,
transforms=[ transforms=[
dict(type='RandomFlip3D'), dict(type='RandomFlip3D'),
dict(type='Normalize', **img_norm_cfg), dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32), dict(type='Pad', size_divisor=32),
dict( dict(
type='DefaultFormatBundle3D', type='DefaultFormatBundle3D',
class_names=class_names, class_names=class_names,
with_label=False), with_label=False),
dict(type='Collect3D', keys=['img']), dict(type='Collect3D', keys=['img']),
]) ])
] ]
# Construct a pipeline for data and GT loading in the show function.
# Please keep its loading consistent with test_pipeline (e.g. the file client).
eval_pipeline = [ eval_pipeline = [
dict(type='LoadImageFromFileMono3D'), dict(type='LoadImageFromFileMono3D'),
dict( dict(
type='DefaultFormatBundle3D', type='DefaultFormatBundle3D',
class_names=class_names, class_names=class_names,
with_label=False), with_label=False),
dict(type='Collect3D', keys=['img']) dict(type='Collect3D', keys=['img'])
] ]
data = dict( data = dict(
samples_per_gpu=2, samples_per_gpu=2,
workers_per_gpu=2, workers_per_gpu=2,
train=dict( train=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'nuscenes_infos_train_mono3d.coco.json', ann_file=data_root + 'nuscenes_infos_train_mono3d.coco.json',
img_prefix=data_root, img_prefix=data_root,
classes=class_names, classes=class_names,
pipeline=train_pipeline, pipeline=train_pipeline,
modality=input_modality, modality=input_modality,
test_mode=False, test_mode=False,
box_type_3d='Camera'), box_type_3d='Camera'),
val=dict( val=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'nuscenes_infos_val_mono3d.coco.json', ann_file=data_root + 'nuscenes_infos_val_mono3d.coco.json',
img_prefix=data_root, img_prefix=data_root,
classes=class_names, classes=class_names,
pipeline=test_pipeline, pipeline=test_pipeline,
modality=input_modality, modality=input_modality,
test_mode=True, test_mode=True,
box_type_3d='Camera'), box_type_3d='Camera'),
test=dict( test=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'nuscenes_infos_val_mono3d.coco.json', ann_file=data_root + 'nuscenes_infos_val_mono3d.coco.json',
img_prefix=data_root, img_prefix=data_root,
classes=class_names, classes=class_names,
pipeline=test_pipeline, pipeline=test_pipeline,
modality=input_modality, modality=input_modality,
test_mode=True, test_mode=True,
box_type_3d='Camera')) box_type_3d='Camera'))
evaluation = dict(interval=2) evaluation = dict(interval=2)
# If the point cloud range is changed, the model's point cloud range should
# be changed accordingly.
point_cloud_range = [-100, -100, -5, 100, 100, 3] point_cloud_range = [-100, -100, -5, 100, 100, 3]
# For Lyft we usually do 9-class detection # For Lyft we usually do 9-class detection
class_names = [ class_names = [
'car', 'truck', 'bus', 'emergency_vehicle', 'other_vehicle', 'motorcycle', 'car', 'truck', 'bus', 'emergency_vehicle', 'other_vehicle', 'motorcycle',
'bicycle', 'pedestrian', 'animal' 'bicycle', 'pedestrian', 'animal'
] ]
dataset_type = 'LyftDataset' dataset_type = 'LyftDataset'
data_root = 'data/lyft/' data_root = 'data/lyft/'
# Input modality for the Lyft dataset; this is consistent with the
# submission format, which requires the information in input_modality.
input_modality = dict( input_modality = dict(
use_lidar=True, use_lidar=True,
use_camera=False, use_camera=False,
use_radar=False, use_radar=False,
use_map=False, use_map=False,
use_external=False) use_external=False)
file_client_args = dict(backend='disk') file_client_args = dict(backend='disk')
# Uncomment the following if using Ceph or other file clients.
# See https://mmcv.readthedocs.io/en/latest/api.html#mmcv.fileio.FileClient # See https://mmcv.readthedocs.io/en/latest/api.html#mmcv.fileio.FileClient
# for more details. # for more details.
# file_client_args = dict( # file_client_args = dict(
# backend='petrel', # backend='petrel',
# path_mapping=dict({ # path_mapping=dict({
# './data/lyft/': 's3://lyft/lyft/', # './data/lyft/': 's3://lyft/lyft/',
# 'data/lyft/': 's3://lyft/lyft/' # 'data/lyft/': 's3://lyft/lyft/'
# })) # }))
train_pipeline = [ train_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='LIDAR', coord_type='LIDAR',
load_dim=5, load_dim=5,
use_dim=5, use_dim=5,
file_client_args=file_client_args), file_client_args=file_client_args),
dict( dict(
type='LoadPointsFromMultiSweeps', type='LoadPointsFromMultiSweeps',
sweeps_num=10, sweeps_num=10,
file_client_args=file_client_args), file_client_args=file_client_args),
dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True), dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
dict( dict(
type='GlobalRotScaleTrans', type='GlobalRotScaleTrans',
rot_range=[-0.3925, 0.3925], rot_range=[-0.3925, 0.3925],
scale_ratio_range=[0.95, 1.05], scale_ratio_range=[0.95, 1.05],
translation_std=[0, 0, 0]), translation_std=[0, 0, 0]),
dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5), dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range), dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range), dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
dict(type='PointShuffle'), dict(type='PointShuffle'),
dict(type='DefaultFormatBundle3D', class_names=class_names), dict(type='DefaultFormatBundle3D', class_names=class_names),
dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d']) dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
] ]
test_pipeline = [ test_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='LIDAR', coord_type='LIDAR',
load_dim=5, load_dim=5,
use_dim=5, use_dim=5,
file_client_args=file_client_args), file_client_args=file_client_args),
dict( dict(
type='LoadPointsFromMultiSweeps', type='LoadPointsFromMultiSweeps',
sweeps_num=10, sweeps_num=10,
file_client_args=file_client_args), file_client_args=file_client_args),
dict( dict(
type='MultiScaleFlipAug3D', type='MultiScaleFlipAug3D',
img_scale=(1333, 800), img_scale=(1333, 800),
pts_scale_ratio=1, pts_scale_ratio=1,
flip=False, flip=False,
transforms=[ transforms=[
dict( dict(
type='GlobalRotScaleTrans', type='GlobalRotScaleTrans',
rot_range=[0, 0], rot_range=[0, 0],
scale_ratio_range=[1., 1.], scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]), translation_std=[0, 0, 0]),
dict(type='RandomFlip3D'), dict(type='RandomFlip3D'),
dict( dict(
type='PointsRangeFilter', point_cloud_range=point_cloud_range), type='PointsRangeFilter', point_cloud_range=point_cloud_range),
dict( dict(
type='DefaultFormatBundle3D', type='DefaultFormatBundle3D',
class_names=class_names, class_names=class_names,
with_label=False), with_label=False),
dict(type='Collect3D', keys=['points']) dict(type='Collect3D', keys=['points'])
]) ])
] ]
# Construct a pipeline for data and GT loading in the show function.
# Please keep its loading consistent with test_pipeline (e.g. the file client).
eval_pipeline = [ eval_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='LIDAR', coord_type='LIDAR',
load_dim=5, load_dim=5,
use_dim=5, use_dim=5,
file_client_args=file_client_args), file_client_args=file_client_args),
dict( dict(
type='LoadPointsFromMultiSweeps', type='LoadPointsFromMultiSweeps',
sweeps_num=10, sweeps_num=10,
file_client_args=file_client_args), file_client_args=file_client_args),
dict( dict(
type='DefaultFormatBundle3D', type='DefaultFormatBundle3D',
class_names=class_names, class_names=class_names,
with_label=False), with_label=False),
dict(type='Collect3D', keys=['points']) dict(type='Collect3D', keys=['points'])
] ]
data = dict( data = dict(
samples_per_gpu=2, samples_per_gpu=2,
workers_per_gpu=2, workers_per_gpu=2,
train=dict( train=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'lyft_infos_train.pkl', ann_file=data_root + 'lyft_infos_train.pkl',
pipeline=train_pipeline, pipeline=train_pipeline,
classes=class_names, classes=class_names,
modality=input_modality, modality=input_modality,
test_mode=False), test_mode=False),
val=dict( val=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'lyft_infos_val.pkl', ann_file=data_root + 'lyft_infos_val.pkl',
pipeline=test_pipeline, pipeline=test_pipeline,
classes=class_names, classes=class_names,
modality=input_modality, modality=input_modality,
test_mode=True), test_mode=True),
test=dict( test=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'lyft_infos_test.pkl', ann_file=data_root + 'lyft_infos_test.pkl',
pipeline=test_pipeline, pipeline=test_pipeline,
classes=class_names, classes=class_names,
modality=input_modality, modality=input_modality,
test_mode=True)) test_mode=True))
# For Lyft dataset, we usually evaluate the model at the end of training. # For Lyft dataset, we usually evaluate the model at the end of training.
# Since the models are trained for 24 epochs by default, we set the evaluation
# interval to 24. Please change the interval accordingly if you do not
# use the default schedule.
evaluation = dict(interval=24, pipeline=eval_pipeline) evaluation = dict(interval=24, pipeline=eval_pipeline)
# dataset settings # dataset settings
dataset_type = 'S3DISSegDataset' dataset_type = 'S3DISSegDataset'
data_root = './data/s3dis/' data_root = './data/s3dis/'
class_names = ('ceiling', 'floor', 'wall', 'beam', 'column', 'window', 'door', class_names = ('ceiling', 'floor', 'wall', 'beam', 'column', 'window', 'door',
'table', 'chair', 'sofa', 'bookcase', 'board', 'clutter') 'table', 'chair', 'sofa', 'bookcase', 'board', 'clutter')
num_points = 4096 num_points = 4096
train_area = [1, 2, 3, 4, 6] train_area = [1, 2, 3, 4, 6]
test_area = 5 test_area = 5
train_pipeline = [ train_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='DEPTH', coord_type='DEPTH',
shift_height=False, shift_height=False,
use_color=True, use_color=True,
load_dim=6, load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5]), use_dim=[0, 1, 2, 3, 4, 5]),
dict( dict(
type='LoadAnnotations3D', type='LoadAnnotations3D',
with_bbox_3d=False, with_bbox_3d=False,
with_label_3d=False, with_label_3d=False,
with_mask_3d=False, with_mask_3d=False,
with_seg_3d=True), with_seg_3d=True),
dict( dict(
type='PointSegClassMapping', type='PointSegClassMapping',
valid_cat_ids=tuple(range(len(class_names))), valid_cat_ids=tuple(range(len(class_names))),
max_cat_id=13), max_cat_id=13),
dict( dict(
type='IndoorPatchPointSample', type='IndoorPatchPointSample',
num_points=num_points, num_points=num_points,
block_size=1.0, block_size=1.0,
ignore_index=len(class_names), ignore_index=len(class_names),
use_normalized_coord=True, use_normalized_coord=True,
enlarge_size=0.2, enlarge_size=0.2,
min_unique_num=None), min_unique_num=None),
dict(type='NormalizePointsColor', color_mean=None), dict(type='NormalizePointsColor', color_mean=None),
dict(type='DefaultFormatBundle3D', class_names=class_names), dict(type='DefaultFormatBundle3D', class_names=class_names),
dict(type='Collect3D', keys=['points', 'pts_semantic_mask']) dict(type='Collect3D', keys=['points', 'pts_semantic_mask'])
] ]
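# IndoorPatchPointSample crops a block_size x block_size (1 m x 1 m) column of
# the room and samples num_points=4096 points from it. With
# use_normalized_coord=True, the per-room normalized xyz is appended to the
# xyz+RGB features, so each point carries 9 input channels (assuming the
# 6-dim load above).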
test_pipeline = [ test_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='DEPTH', coord_type='DEPTH',
shift_height=False, shift_height=False,
use_color=True, use_color=True,
load_dim=6, load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5]), use_dim=[0, 1, 2, 3, 4, 5]),
dict(type='NormalizePointsColor', color_mean=None), dict(type='NormalizePointsColor', color_mean=None),
dict( dict(
            # a wrapper in order to successfully call the test function;
            # we do not actually perform test-time augmentation here
type='MultiScaleFlipAug3D', type='MultiScaleFlipAug3D',
img_scale=(1333, 800), img_scale=(1333, 800),
pts_scale_ratio=1, pts_scale_ratio=1,
flip=False, flip=False,
transforms=[ transforms=[
dict( dict(
type='GlobalRotScaleTrans', type='GlobalRotScaleTrans',
rot_range=[0, 0], rot_range=[0, 0],
scale_ratio_range=[1., 1.], scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]), translation_std=[0, 0, 0]),
dict( dict(
type='RandomFlip3D', type='RandomFlip3D',
sync_2d=False, sync_2d=False,
flip_ratio_bev_horizontal=0.0, flip_ratio_bev_horizontal=0.0,
flip_ratio_bev_vertical=0.0), flip_ratio_bev_vertical=0.0),
dict( dict(
type='DefaultFormatBundle3D', type='DefaultFormatBundle3D',
class_names=class_names, class_names=class_names,
with_label=False), with_label=False),
dict(type='Collect3D', keys=['points']) dict(type='Collect3D', keys=['points'])
]) ])
] ]
# Construct a pipeline for data and GT loading in the show function.
# Please keep its loading consistent with test_pipeline (e.g. the file client).
# Note that we also need to load the GT seg_mask here!
eval_pipeline = [ eval_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='DEPTH', coord_type='DEPTH',
shift_height=False, shift_height=False,
use_color=True, use_color=True,
load_dim=6, load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5]), use_dim=[0, 1, 2, 3, 4, 5]),
dict( dict(
type='LoadAnnotations3D', type='LoadAnnotations3D',
with_bbox_3d=False, with_bbox_3d=False,
with_label_3d=False, with_label_3d=False,
with_mask_3d=False, with_mask_3d=False,
with_seg_3d=True), with_seg_3d=True),
dict( dict(
type='PointSegClassMapping', type='PointSegClassMapping',
valid_cat_ids=tuple(range(len(class_names))), valid_cat_ids=tuple(range(len(class_names))),
max_cat_id=13), max_cat_id=13),
dict( dict(
type='DefaultFormatBundle3D', type='DefaultFormatBundle3D',
with_label=False, with_label=False,
class_names=class_names), class_names=class_names),
dict(type='Collect3D', keys=['points', 'pts_semantic_mask']) dict(type='Collect3D', keys=['points', 'pts_semantic_mask'])
] ]
data = dict( data = dict(
samples_per_gpu=8, samples_per_gpu=8,
workers_per_gpu=4, workers_per_gpu=4,
# train on area 1, 2, 3, 4, 6 # train on area 1, 2, 3, 4, 6
# test on area 5 # test on area 5
train=dict( train=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_files=[ ann_files=[
data_root + f's3dis_infos_Area_{i}.pkl' for i in train_area data_root + f's3dis_infos_Area_{i}.pkl' for i in train_area
], ],
pipeline=train_pipeline, pipeline=train_pipeline,
classes=class_names, classes=class_names,
test_mode=False, test_mode=False,
ignore_index=len(class_names), ignore_index=len(class_names),
scene_idxs=[ scene_idxs=[
data_root + f'seg_info/Area_{i}_resampled_scene_idxs.npy' data_root + f'seg_info/Area_{i}_resampled_scene_idxs.npy'
for i in train_area for i in train_area
]), ]),
val=dict( val=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_files=data_root + f's3dis_infos_Area_{test_area}.pkl', ann_files=data_root + f's3dis_infos_Area_{test_area}.pkl',
pipeline=test_pipeline, pipeline=test_pipeline,
classes=class_names, classes=class_names,
test_mode=True, test_mode=True,
ignore_index=len(class_names), ignore_index=len(class_names),
scene_idxs=data_root + scene_idxs=data_root +
f'seg_info/Area_{test_area}_resampled_scene_idxs.npy'), f'seg_info/Area_{test_area}_resampled_scene_idxs.npy'),
test=dict( test=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_files=data_root + f's3dis_infos_Area_{test_area}.pkl', ann_files=data_root + f's3dis_infos_Area_{test_area}.pkl',
pipeline=test_pipeline, pipeline=test_pipeline,
classes=class_names, classes=class_names,
test_mode=True, test_mode=True,
ignore_index=len(class_names))) ignore_index=len(class_names)))
evaluation = dict(pipeline=eval_pipeline) evaluation = dict(pipeline=eval_pipeline)
# dataset settings # dataset settings
dataset_type = 'ScanNetDataset' dataset_type = 'ScanNetDataset'
data_root = './data/scannet/' data_root = './data/scannet/'
class_names = ('cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window', class_names = ('cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window',
'bookshelf', 'picture', 'counter', 'desk', 'curtain', 'bookshelf', 'picture', 'counter', 'desk', 'curtain',
'refrigerator', 'showercurtrain', 'toilet', 'sink', 'bathtub', 'refrigerator', 'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin') 'garbagebin')
train_pipeline = [ train_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='DEPTH', coord_type='DEPTH',
shift_height=True, shift_height=True,
load_dim=6, load_dim=6,
use_dim=[0, 1, 2]), use_dim=[0, 1, 2]),
dict( dict(
type='LoadAnnotations3D', type='LoadAnnotations3D',
with_bbox_3d=True, with_bbox_3d=True,
with_label_3d=True, with_label_3d=True,
with_mask_3d=True, with_mask_3d=True,
with_seg_3d=True), with_seg_3d=True),
dict(type='GlobalAlignment', rotation_axis=2), dict(type='GlobalAlignment', rotation_axis=2),
dict( dict(
type='PointSegClassMapping', type='PointSegClassMapping',
valid_cat_ids=(3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28, 33, 34, valid_cat_ids=(3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28, 33, 34,
36, 39), 36, 39),
max_cat_id=40), max_cat_id=40),
dict(type='IndoorPointSample', num_points=40000), dict(type='IndoorPointSample', num_points=40000),
dict( dict(
type='RandomFlip3D', type='RandomFlip3D',
sync_2d=False, sync_2d=False,
flip_ratio_bev_horizontal=0.5, flip_ratio_bev_horizontal=0.5,
flip_ratio_bev_vertical=0.5), flip_ratio_bev_vertical=0.5),
dict( dict(
type='GlobalRotScaleTrans', type='GlobalRotScaleTrans',
rot_range=[-0.087266, 0.087266], rot_range=[-0.087266, 0.087266],
scale_ratio_range=[1.0, 1.0], scale_ratio_range=[1.0, 1.0],
shift_height=True), shift_height=True),
dict(type='DefaultFormatBundle3D', class_names=class_names), dict(type='DefaultFormatBundle3D', class_names=class_names),
dict( dict(
type='Collect3D', type='Collect3D',
keys=[ keys=[
'points', 'gt_bboxes_3d', 'gt_labels_3d', 'pts_semantic_mask', 'points', 'gt_bboxes_3d', 'gt_labels_3d', 'pts_semantic_mask',
'pts_instance_mask' 'pts_instance_mask'
]) ])
] ]
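# shift_height=True in LoadPointsFromFile appends each point's height above
# the (estimated) floor as an extra feature channel; GlobalRotScaleTrans above
# also sets shift_height=True so that channel stays consistent after scaling.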
test_pipeline = [ test_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='DEPTH', coord_type='DEPTH',
shift_height=True, shift_height=True,
load_dim=6, load_dim=6,
use_dim=[0, 1, 2]), use_dim=[0, 1, 2]),
dict(type='GlobalAlignment', rotation_axis=2), dict(type='GlobalAlignment', rotation_axis=2),
dict( dict(
type='MultiScaleFlipAug3D', type='MultiScaleFlipAug3D',
img_scale=(1333, 800), img_scale=(1333, 800),
pts_scale_ratio=1, pts_scale_ratio=1,
flip=False, flip=False,
transforms=[ transforms=[
dict( dict(
type='GlobalRotScaleTrans', type='GlobalRotScaleTrans',
rot_range=[0, 0], rot_range=[0, 0],
scale_ratio_range=[1., 1.], scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]), translation_std=[0, 0, 0]),
dict( dict(
type='RandomFlip3D', type='RandomFlip3D',
sync_2d=False, sync_2d=False,
flip_ratio_bev_horizontal=0.5, flip_ratio_bev_horizontal=0.5,
flip_ratio_bev_vertical=0.5), flip_ratio_bev_vertical=0.5),
dict(type='IndoorPointSample', num_points=40000), dict(type='IndoorPointSample', num_points=40000),
dict( dict(
type='DefaultFormatBundle3D', type='DefaultFormatBundle3D',
class_names=class_names, class_names=class_names,
with_label=False), with_label=False),
dict(type='Collect3D', keys=['points']) dict(type='Collect3D', keys=['points'])
]) ])
] ]
# construct a pipeline for data and gt loading in show function # construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client) # please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [ eval_pipeline = [
dict( dict(
type='LoadPointsFromFile', type='LoadPointsFromFile',
coord_type='DEPTH', coord_type='DEPTH',
shift_height=False, shift_height=False,
load_dim=6, load_dim=6,
use_dim=[0, 1, 2]), use_dim=[0, 1, 2]),
dict(type='GlobalAlignment', rotation_axis=2), dict(type='GlobalAlignment', rotation_axis=2),
dict( dict(
type='DefaultFormatBundle3D', type='DefaultFormatBundle3D',
class_names=class_names, class_names=class_names,
with_label=False), with_label=False),
dict(type='Collect3D', keys=['points']) dict(type='Collect3D', keys=['points'])
] ]
data = dict( data = dict(
samples_per_gpu=8, samples_per_gpu=8,
workers_per_gpu=4, workers_per_gpu=4,
train=dict( train=dict(
type='RepeatDataset', type='RepeatDataset',
times=5, times=5,
dataset=dict( dataset=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'scannet_infos_train.pkl', ann_file=data_root + 'scannet_infos_train.pkl',
pipeline=train_pipeline, pipeline=train_pipeline,
filter_empty_gt=False, filter_empty_gt=False,
classes=class_names, classes=class_names,
# we use box_type_3d='LiDAR' in kitti and nuscenes dataset # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
# and box_type_3d='Depth' in sunrgbd and scannet dataset. # and box_type_3d='Depth' in sunrgbd and scannet dataset.
box_type_3d='Depth')), box_type_3d='Depth')),
val=dict( val=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'scannet_infos_val.pkl', ann_file=data_root + 'scannet_infos_val.pkl',
pipeline=test_pipeline, pipeline=test_pipeline,
classes=class_names, classes=class_names,
test_mode=True, test_mode=True,
box_type_3d='Depth'), box_type_3d='Depth'),
test=dict( test=dict(
type=dataset_type, type=dataset_type,
data_root=data_root, data_root=data_root,
ann_file=data_root + 'scannet_infos_val.pkl', ann_file=data_root + 'scannet_infos_val.pkl',
pipeline=test_pipeline, pipeline=test_pipeline,
classes=class_names, classes=class_names,
test_mode=True, test_mode=True,
box_type_3d='Depth')) box_type_3d='Depth'))
evaluation = dict(pipeline=eval_pipeline) evaluation = dict(pipeline=eval_pipeline)
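The `PointSegClassMapping` step above remaps raw ScanNet category ids (up to `max_cat_id=40`) into contiguous train labels: ids listed in `valid_cat_ids` become `0..len(valid_cat_ids)-1`, and every other id becomes the ignore label `len(valid_cat_ids)`. A minimal pure-Python sketch of that remapping (the `point_seg_class_mapping` helper here is illustrative, not mmdet3d's implementation):

```python
def point_seg_class_mapping(seg_mask, valid_cat_ids, max_cat_id=40):
    """Map raw category ids to contiguous train ids: ids in valid_cat_ids
    become 0..N-1; anything else becomes the ignore label N."""
    n_valid = len(valid_cat_ids)
    lut = {cat_id: train_id for train_id, cat_id in enumerate(valid_cat_ids)}
    return [lut.get(cat_id, n_valid) for cat_id in seg_mask]

valid_cat_ids = (3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28, 33, 34, 36, 39)
# 3 -> 0, 4 -> 1, 1 is unlisted -> ignore (18), 39 is the 18th entry -> 17
mapped = point_seg_class_mapping([3, 4, 1, 39, 40], valid_cat_ids)
```

With 18 valid ids, the ignore label is 18, which matches the 18-class ScanNet detection setting used by this pipeline.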
# dataset settings
dataset_type = 'ScanNetSegDataset'
data_root = './data/scannet/'
class_names = ('wall', 'floor', 'cabinet', 'bed', 'chair', 'sofa', 'table',
               'door', 'window', 'bookshelf', 'picture', 'counter', 'desk',
               'curtain', 'refrigerator', 'showercurtrain', 'toilet', 'sink',
               'bathtub', 'otherfurniture')
num_points = 8192
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5]),
    dict(
        type='LoadAnnotations3D',
        with_bbox_3d=False,
        with_label_3d=False,
        with_mask_3d=False,
        with_seg_3d=True),
    dict(
        type='PointSegClassMapping',
        valid_cat_ids=(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28,
                       33, 34, 36, 39),
        max_cat_id=40),
    dict(
        type='IndoorPatchPointSample',
        num_points=num_points,
        block_size=1.5,
        ignore_index=len(class_names),
        use_normalized_coord=False,
        enlarge_size=0.2,
        min_unique_num=None),
    dict(type='NormalizePointsColor', color_mean=None),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'pts_semantic_mask'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5]),
    dict(type='NormalizePointsColor', color_mean=None),
    dict(
        # a wrapper in order to successfully call test function
        # actually we don't perform test-time-aug
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(
                type='RandomFlip3D',
                sync_2d=False,
                flip_ratio_bev_horizontal=0.0,
                flip_ratio_bev_vertical=0.0),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
# we need to load gt seg_mask!
eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5]),
    dict(
        type='LoadAnnotations3D',
        with_bbox_3d=False,
        with_label_3d=False,
        with_mask_3d=False,
        with_seg_3d=True),
    dict(
        type='PointSegClassMapping',
        valid_cat_ids=(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28,
                       33, 34, 36, 39),
        max_cat_id=40),
    dict(
        type='DefaultFormatBundle3D',
        with_label=False,
        class_names=class_names),
    dict(type='Collect3D', keys=['points', 'pts_semantic_mask'])
]
data = dict(
    samples_per_gpu=8,
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'scannet_infos_train.pkl',
        pipeline=train_pipeline,
        classes=class_names,
        test_mode=False,
        ignore_index=len(class_names),
        scene_idxs=data_root + 'seg_info/train_resampled_scene_idxs.npy'),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'scannet_infos_val.pkl',
        pipeline=test_pipeline,
        classes=class_names,
        test_mode=True,
        ignore_index=len(class_names)),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'scannet_infos_val.pkl',
        pipeline=test_pipeline,
        classes=class_names,
        test_mode=True,
        ignore_index=len(class_names)))
evaluation = dict(pipeline=eval_pipeline)
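In the segmentation config above, `ignore_index=len(class_names)` reserves one extra label (20 for the 20 ScanNet classes) for points that fall outside the valid categories, so the loss and metrics can skip them. A small illustrative sketch of that convention (the `supervised_mask` helper is hypothetical, not part of mmdet3d):

```python
class_names = ('wall', 'floor', 'cabinet', 'bed', 'chair', 'sofa', 'table',
               'door', 'window', 'bookshelf', 'picture', 'counter', 'desk',
               'curtain', 'refrigerator', 'showercurtrain', 'toilet', 'sink',
               'bathtub', 'otherfurniture')
ignore_index = len(class_names)  # 20: one past the last valid train label

def supervised_mask(labels, ignore_index):
    # True where a point carries a real class label and contributes to the loss
    return [label != ignore_index for label in labels]

mask = supervised_mask([0, 5, 20, 19, 20], ignore_index)
```

Valid train labels therefore run 0..19, and label 20 marks "unlabeled / ignore".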
dataset_type = 'SUNRGBDDataset'
data_root = 'data/sunrgbd/'
class_names = ('bed', 'table', 'sofa', 'chair', 'toilet', 'desk', 'dresser',
               'night_stand', 'bookshelf', 'bathtub')
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=True,
        load_dim=6,
        use_dim=[0, 1, 2]),
    dict(type='LoadAnnotations3D'),
    dict(
        type='RandomFlip3D',
        sync_2d=False,
        flip_ratio_bev_horizontal=0.5,
    ),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.523599, 0.523599],
        scale_ratio_range=[0.85, 1.15],
        shift_height=True),
    dict(type='IndoorPointSample', num_points=20000),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=True,
        load_dim=6,
        use_dim=[0, 1, 2]),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(
                type='RandomFlip3D',
                sync_2d=False,
                flip_ratio_bev_horizontal=0.5,
            ),
            dict(type='IndoorPointSample', num_points=20000),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=False,
        load_dim=6,
        use_dim=[0, 1, 2]),
    dict(
        type='DefaultFormatBundle3D',
        class_names=class_names,
        with_label=False),
    dict(type='Collect3D', keys=['points'])
]
data = dict(
    samples_per_gpu=16,
    workers_per_gpu=4,
    train=dict(
        type='RepeatDataset',
        times=5,
        dataset=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file=data_root + 'sunrgbd_infos_train.pkl',
            pipeline=train_pipeline,
            classes=class_names,
            filter_empty_gt=False,
            # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
            # and box_type_3d='Depth' in sunrgbd and scannet dataset.
            box_type_3d='Depth')),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'sunrgbd_infos_val.pkl',
        pipeline=test_pipeline,
        classes=class_names,
        test_mode=True,
        box_type_3d='Depth'),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'sunrgbd_infos_val.pkl',
        pipeline=test_pipeline,
        classes=class_names,
        test_mode=True,
        box_type_3d='Depth'))
evaluation = dict(pipeline=eval_pipeline)
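The `GlobalRotScaleTrans` augmentation above draws a random yaw in `rot_range=[-0.523599, 0.523599]`, i.e. roughly +/- 30 degrees about the vertical (z) axis. A self-contained sketch of that rotation (the `rotate_z` helper is illustrative; mmdet3d applies the same transform via its points/boxes classes):

```python
import math

def rotate_z(points, angle):
    """Rotate (x, y, z) points about the z (up) axis by `angle` radians,
    the axis GlobalRotScaleTrans's rot_range refers to."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

rot_range = (-0.523599, 0.523599)  # about +/- 30 degrees (pi / 6)
pts = rotate_z([(1.0, 0.0, 0.5)], rot_range[1])
```

Note the height (z) coordinate is untouched; only the horizontal plane rotates, which is why indoor configs pair it with `shift_height=True` rather than a full 3D rotation.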
# dataset settings
# D5 in the config name means the whole dataset is divided into 5 folds
# We only use one fold for efficient experiments
dataset_type = 'WaymoDataset'
data_root = 'data/waymo/kitti_format/'
file_client_args = dict(backend='disk')
# Uncomment the following if use ceph or other file clients.
# See https://mmcv.readthedocs.io/en/latest/api.html#mmcv.fileio.FileClient
# for more details.
# file_client_args = dict(
#     backend='petrel', path_mapping=dict(data='s3://waymo_data/'))
class_names = ['Car', 'Pedestrian', 'Cyclist']
point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4]
input_modality = dict(use_lidar=True, use_camera=False)
db_sampler = dict(
    data_root=data_root,
    info_path=data_root + 'waymo_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(
        filter_by_difficulty=[-1],
        filter_by_min_points=dict(Car=5, Pedestrian=10, Cyclist=10)),
    classes=class_names,
    sample_groups=dict(Car=15, Pedestrian=10, Cyclist=10),
    points_loader=dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=5,
        use_dim=[0, 1, 2, 3, 4],
        file_client_args=file_client_args))
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='LoadAnnotations3D',
        with_bbox_3d=True,
        with_label_3d=True,
        file_client_args=file_client_args),
    dict(type='ObjectSample', db_sampler=db_sampler),
    dict(
        type='RandomFlip3D',
        sync_2d=False,
        flip_ratio_bev_horizontal=0.5,
        flip_ratio_bev_vertical=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='DefaultFormatBundle3D',
        class_names=class_names,
        with_label=False),
    dict(type='Collect3D', keys=['points'])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=4,
    train=dict(
        type='RepeatDataset',
        times=2,
        dataset=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file=data_root + 'waymo_infos_train.pkl',
            split='training',
            pipeline=train_pipeline,
            modality=input_modality,
            classes=class_names,
            test_mode=False,
            # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
            # and box_type_3d='Depth' in sunrgbd and scannet dataset.
            box_type_3d='LiDAR',
            # load one frame every five frames
            load_interval=5)),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'waymo_infos_val.pkl',
        split='training',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'waymo_infos_val.pkl',
        split='training',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'))
evaluation = dict(interval=24, pipeline=eval_pipeline)
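`PointsRangeFilter` above drops points outside `point_cloud_range = [x_min, y_min, z_min, x_max, y_max, z_max]` so the detector only sees the region it is trained on. A rough sketch of that filtering under the assumption of inclusive bounds (mmdet3d's exact boundary handling may differ; the `points_in_range` helper is illustrative only):

```python
point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4]

def points_in_range(points, pc_range):
    """Keep (x, y, z) points that fall inside the axis-aligned box
    [x_min, y_min, z_min, x_max, y_max, z_max]."""
    x0, y0, z0, x1, y1, z1 = pc_range
    return [
        (x, y, z) for x, y, z in points
        if x0 <= x <= x1 and y0 <= y <= y1 and z0 <= z <= z1
    ]

kept = points_in_range([(0.0, 0.0, 0.0), (100.0, 0.0, 0.0),
                        (0.0, 0.0, -5.0)], point_cloud_range)
```

`ObjectRangeFilter` applies the analogous test to ground-truth box centers, so both steps use the same `point_cloud_range`.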
# dataset settings
# D5 in the config name means the whole dataset is divided into 5 folds
# We only use one fold for efficient experiments
dataset_type = 'WaymoDataset'
data_root = 'data/waymo/kitti_format/'
file_client_args = dict(backend='disk')
# Uncomment the following if use ceph or other file clients.
# See https://mmcv.readthedocs.io/en/latest/api.html#mmcv.fileio.FileClient
# for more details.
# file_client_args = dict(
#     backend='petrel', path_mapping=dict(data='s3://waymo_data/'))
class_names = ['Car']
point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4]
input_modality = dict(use_lidar=True, use_camera=False)
db_sampler = dict(
    data_root=data_root,
    info_path=data_root + 'waymo_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(filter_by_difficulty=[-1], filter_by_min_points=dict(Car=5)),
    classes=class_names,
    sample_groups=dict(Car=15),
    points_loader=dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=5,
        use_dim=[0, 1, 2, 3, 4],
        file_client_args=file_client_args))
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='LoadAnnotations3D',
        with_bbox_3d=True,
        with_label_3d=True,
        file_client_args=file_client_args),
    dict(type='ObjectSample', db_sampler=db_sampler),
    dict(
        type='RandomFlip3D',
        sync_2d=False,
        flip_ratio_bev_horizontal=0.5,
        flip_ratio_bev_vertical=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='DefaultFormatBundle3D',
        class_names=class_names,
        with_label=False),
    dict(type='Collect3D', keys=['points'])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=4,
    train=dict(
        type='RepeatDataset',
        times=2,
        dataset=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file=data_root + 'waymo_infos_train.pkl',
            split='training',
            pipeline=train_pipeline,
            modality=input_modality,
            classes=class_names,
            test_mode=False,
            # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
            # and box_type_3d='Depth' in sunrgbd and scannet dataset.
            box_type_3d='LiDAR',
            # load one frame every five frames
            load_interval=5)),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'waymo_infos_val.pkl',
        split='training',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'waymo_infos_val.pkl',
        split='training',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'))
evaluation = dict(interval=24, pipeline=eval_pipeline)
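The Waymo configs above set `load_interval=5` ("load one frame every five frames"), which thins the training split for faster experiments. The effect is a simple stride over the annotation infos, as this sketch shows (the `subsample` helper is illustrative, not the dataset's actual method name):

```python
def subsample(infos, load_interval=5):
    """Keep one frame every `load_interval` frames, mirroring the effect of
    the dataset's load_interval option on the list of annotation infos."""
    return infos[::load_interval]

kept = subsample(list(range(10)), load_interval=5)  # frames 0 and 5 remain
```

Combined with `RepeatDataset(times=2)`, each kept frame is then visited twice per epoch.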
checkpoint_config = dict(interval=1)
# yapf:disable
# By default we use textlogger hook and tensorboard
# For more loggers see
# https://mmcv.readthedocs.io/en/latest/api.html#mmcv.runner.LoggerHook
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = None
load_from = None
resume_from = None
workflow = [('train', 1)]