# Visualization

MMDetection3D provides `Det3DLocalVisualizer` to visualize and store the states and results of the model during training and testing. It has the following features:

1. Support the basic drawing interface for multi-modality data and multi-task.
2. Support multiple backends (such as local and TensorBoard) to write training states (such as `loss` and `lr`) or model evaluation metrics to one or more specified backends.
3. Support the visualization of the ground truths of multi-modality data and the cross-modal visualization of 3D detection results.

## Basic Drawing Interface

Inherited from `DetLocalVisualizer`, `Det3DLocalVisualizer` provides an interface for drawing common objects on 2D images, such as drawing detection boxes, points, text, lines, circles, polygons, and binary masks. More details about drawing 2D elements can be found in the [visualization documentation](https://mmengine.readthedocs.io/zh_CN/latest/advanced_tutorials/visualization.html) of MMDetection. Here we introduce the 3D drawing interface.
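For example, the inherited 2D primitives can be called directly on a `Det3DLocalVisualizer` instance. Below is a minimal sketch that reuses the KITTI demo image shipped with the repository; the box coordinates and the label text are made up for illustration:

```python
import mmcv
import numpy as np

from mmdet3d.visualization import Det3DLocalVisualizer

# read the demo image and convert it to RGB
img = mmcv.imread('demo/data/kitti/000008.png')
img = mmcv.imconvert(img, 'bgr', 'rgb')

visualizer = Det3DLocalVisualizer()
visualizer.set_image(img)
# 2D primitives inherited from the base visualizer
visualizer.draw_bboxes(np.array([[100, 100, 300, 300]]), edge_colors='g')
visualizer.draw_texts('car', positions=np.array([100, 90]))
visualizer.show()
```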
### Drawing Point Cloud on Image

Using `draw_points_on_image`, we support drawing point clouds on the image.
```python
import mmcv
import numpy as np
from mmengine import load

from mmdet3d.visualization import Det3DLocalVisualizer

info_file = load('demo/data/kitti/000008.pkl')
points = np.fromfile('demo/data/kitti/000008.bin', dtype=np.float32)
points = points.reshape(-1, 4)[:, :3]
lidar2img = np.array(info_file['data_list'][0]['images']['CAM2']['lidar2img'], dtype=np.float32)

visualizer = Det3DLocalVisualizer()
img = mmcv.imread('demo/data/kitti/000008.png')
img = mmcv.imconvert(img, 'bgr', 'rgb')
visualizer.set_image(img)
visualizer.draw_points_on_image(points, lidar2img)
visualizer.show()
```
![points_on_image](../../../resources/points_on_image.png)
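Here `lidar2img` is the projection matrix from the LiDAR coordinate frame to image pixel coordinates, read from the KITTI info file; `draw_points_on_image` uses it to project each point onto the image plane.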
### Drawing 3D Boxes on Point Cloud

Using `draw_bboxes_3d`, we support drawing 3D boxes on the point cloud.
```python
import torch
import numpy as np

from mmdet3d.visualization import Det3DLocalVisualizer
from mmdet3d.structures import LiDARInstance3DBoxes

points = np.fromfile('demo/data/kitti/000008.bin', dtype=np.float32)
points = points.reshape(-1, 4)

visualizer = Det3DLocalVisualizer()
# set point cloud in visualizer
visualizer.set_points(points)
bboxes_3d = LiDARInstance3DBoxes(
    torch.tensor([[8.7314, -1.8559, -1.5997, 4.2000, 3.4800, 1.8900,
                   -1.5808]]))
# Draw 3D bboxes
visualizer.draw_bboxes_3d(bboxes_3d)
visualizer.show()
```
![mono3d](../../../resources/pcd.png)
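Each row of the tensor passed to `LiDARInstance3DBoxes` is `(x, y, z, x_size, y_size, z_size, yaw)`, where `(x, y, z)` is by default the bottom center of the box in the LiDAR coordinate frame.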
### Drawing Projected 3D Boxes on Image

Using `draw_proj_bboxes_3d`, we support drawing projected 3D boxes on the image.
```python
import mmcv
import numpy as np
from mmengine import load

from mmdet3d.visualization import Det3DLocalVisualizer
from mmdet3d.structures import CameraInstance3DBoxes

info_file = load('demo/data/kitti/000008.pkl')
cam2img = np.array(info_file['data_list'][0]['images']['CAM2']['cam2img'], dtype=np.float32)
bboxes_3d = []
for instance in info_file['data_list'][0]['instances']:
    bboxes_3d.append(instance['bbox_3d'])
gt_bboxes_3d = np.array(bboxes_3d, dtype=np.float32)
gt_bboxes_3d = CameraInstance3DBoxes(gt_bboxes_3d)
input_meta = {'cam2img': cam2img}

visualizer = Det3DLocalVisualizer()
img = mmcv.imread('demo/data/kitti/000008.png')
img = mmcv.imconvert(img, 'bgr', 'rgb')
visualizer.set_image(img)
# project 3D bboxes to image
visualizer.draw_proj_bboxes_3d(gt_bboxes_3d, input_meta)
visualizer.show()
```
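Note that `draw_proj_bboxes_3d` looks up the projection matrix in `input_meta` according to the box type: `cam2img` for `CameraInstance3DBoxes` as in this example, and `lidar2img` / `depth2img` for LiDAR / depth boxes respectively.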
### Drawing BEV Boxes

Using `draw_bev_bboxes`, we support drawing boxes from the bird's-eye view (BEV).
```python
import numpy as np
from mmengine import load

from mmdet3d.visualization import Det3DLocalVisualizer
from mmdet3d.structures import CameraInstance3DBoxes

info_file = load('demo/data/kitti/000008.pkl')
bboxes_3d = []
for instance in info_file['data_list'][0]['instances']:
    bboxes_3d.append(instance['bbox_3d'])
gt_bboxes_3d = np.array(bboxes_3d, dtype=np.float32)
gt_bboxes_3d = CameraInstance3DBoxes(gt_bboxes_3d)

visualizer = Det3DLocalVisualizer()
# set bev image in visualizer
visualizer.set_bev_image()
# draw bev bboxes
visualizer.draw_bev_bboxes(gt_bboxes_3d, edge_colors='orange')
visualizer.show()
```
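`set_bev_image` initializes a blank bird's-eye-view canvas (its size can be controlled via the `bev_shape` argument), on which `draw_bev_bboxes` then draws the BEV projection of the boxes; `edge_colors` accepts any Matplotlib color specification.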
### Drawing 3D Semantic Mask

Using `draw_seg_mask`, we support drawing segmentation masks via per-point colorization.
```python
import numpy as np

from mmdet3d.visualization import Det3DLocalVisualizer

points = np.fromfile('demo/data/sunrgbd/000017.bin', dtype=np.float32)
points = points.reshape(-1, 3)

visualizer = Det3DLocalVisualizer()
mask = np.random.rand(points.shape[0], 3)
points_with_mask = np.concatenate((points, mask), axis=-1)
# Draw 3D points with mask
visualizer.set_points(points, pcd_mode=2, vis_mode='add')
visualizer.draw_seg_mask(points_with_mask)
visualizer.show()
```
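`draw_seg_mask` expects the points concatenated with per-point RGB colors (an `(N, 6)` array), which is why a random `(N, 3)` mask is appended above; in practice the colors would come from a palette indexed by the predicted semantic labels. Here `pcd_mode=2` indicates the points are in depth coordinates, as used by SUN RGB-D.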
## Results

To visualize the prediction results of a trained model, you can run the following command:
```bash
python tools/test.py ${CONFIG_FILE} ${CKPT_PATH} --show --show-dir ${SHOW_DIR}
```
After running this command, the plotted results, i.e. the input data together with the network outputs and ground-truth labels visualized on it (e.g. `***_gt.png` and `***_pred.png` in multi-modality detection tasks and vision-based detection tasks), will be saved in `${SHOW_DIR}`. When `show` is enabled, [Open3D](http://www.open3d.org/) will be used to visualize the results online. Online visualization is not supported when you are testing on a remote server without a GUI; in that case, you can download the `results.pkl` from the remote server and visualize the prediction results offline on your local machine.
To visualize the results offline using the `Open3D` backend, you can run the following command:
```bash
python tools/misc/visualize_results.py ${CONFIG_FILE} --result ${RESULTS_PATH} --show-dir ${SHOW_DIR}
```
![](../../../resources/open3d_visual.gif)
This requires that inference and result generation be done on the remote server, so that the user can then open the results with a GUI on the host machine.
## Dataset

We also provide scripts to visualize the dataset without inference. You can use `tools/misc/browse_dataset.py` to show the loaded ground-truth labels online and save them to disk. Currently we support single-modality 3D detection and 3D segmentation on all the datasets, multi-modality 3D detection on KITTI and SUN RGB-D, as well as monocular 3D detection on nuScenes. To browse the KITTI dataset, you can run the following command:
```shell
python tools/misc/browse_dataset.py configs/_base_/datasets/kitti-3d-3class.py --task lidar_det --output-dir ${OUTPUT_DIR}
```
**Note**: Once `--output-dir` is specified, the images of the views specified by the user will be saved when pressing `_ESC_` in the open3d window. If you want to zoom in or out on the point cloud to inspect more details, you can specify `--show-interval=0` in the command.

To verify the data consistency and the effect of data augmentation, you can also add the `--aug` flag to visualize the data after augmentation, as shown below:
```shell
python tools/misc/browse_dataset.py configs/_base_/datasets/kitti-3d-3class.py --task lidar_det --aug --output-dir ${OUTPUT_DIR}
```
If you also want to show 2D images with projected 3D bounding boxes, you need a config file that supports multi-modality data loading, and then change the `--task` argument to `multi-modality_det`. An example is shown below:
```shell
python tools/misc/browse_dataset.py configs/mvxnet/mvxnet_fpn_dv_second_secfpn_8xb2-80e_kitti-3d-3class.py --task multi-modality_det --output-dir ${OUTPUT_DIR}
```
![](../../../resources/browse_dataset_multi_modality.png)
You can browse different datasets with different configurations, e.g. visualizing the ScanNet dataset in the 3D semantic segmentation task:
```shell
python tools/misc/browse_dataset.py configs/_base_/datasets/scannet-seg.py --task lidar_seg --output-dir ${OUTPUT_DIR}
```
![](../../../resources/browse_dataset_seg.png)
And browsing the nuScenes dataset in the monocular 3D detection task:
```shell
python tools/misc/browse_dataset.py configs/_base_/datasets/nus-mono3d.py --task mono_det --output-dir ${OUTPUT_DIR}
```
![](../../../resources/browse_dataset_mono.png)
# Copyright (c) OpenMMLab. All rights reserved.
import mmcv
import mmdet
import mmengine
from mmengine.utils import digit_version

from .version import __version__, version_info

mmcv_minimum_version = '2.0.0rc4'
mmcv_maximum_version = '2.2.0'
mmcv_version = digit_version(mmcv.__version__)

mmengine_minimum_version = '0.8.0'
mmengine_maximum_version = '1.0.0'
mmengine_version = digit_version(mmengine.__version__)

mmdet_minimum_version = '3.0.0rc5'
mmdet_maximum_version = '3.4.0'
mmdet_version = digit_version(mmdet.__version__)

assert (mmcv_version >= digit_version(mmcv_minimum_version)
        and mmcv_version < digit_version(mmcv_maximum_version)), \
    f'MMCV=={mmcv.__version__} is used but incompatible. ' \
    f'Please install mmcv>={mmcv_minimum_version}, <{mmcv_maximum_version}.'

assert (mmengine_version >= digit_version(mmengine_minimum_version)
        and mmengine_version < digit_version(mmengine_maximum_version)), \
    f'MMEngine=={mmengine.__version__} is used but incompatible. ' \
    f'Please install mmengine>={mmengine_minimum_version}, ' \
    f'<{mmengine_maximum_version}.'

assert (mmdet_version >= digit_version(mmdet_minimum_version)
        and mmdet_version < digit_version(mmdet_maximum_version)), \
    f'MMDET=={mmdet.__version__} is used but incompatible. ' \
    f'Please install mmdet>={mmdet_minimum_version}, ' \
    f'<{mmdet_maximum_version}.'

__all__ = ['__version__', 'version_info', 'digit_version']
# Copyright (c) OpenMMLab. All rights reserved.
from .inference import (convert_SyncBN, inference_detector,
                        inference_mono_3d_detector,
                        inference_multi_modality_detector, inference_segmentor,
                        init_model)
from .inferencers import (Base3DInferencer, LidarDet3DInferencer,
                          LidarSeg3DInferencer, MonoDet3DInferencer,
                          MultiModalityDet3DInferencer)

__all__ = [
    'inference_detector', 'init_model', 'inference_mono_3d_detector',
    'convert_SyncBN', 'inference_multi_modality_detector',
    'inference_segmentor', 'Base3DInferencer', 'MonoDet3DInferencer',
    'LidarDet3DInferencer', 'LidarSeg3DInferencer',
    'MultiModalityDet3DInferencer'
]
# Copyright (c) OpenMMLab. All rights reserved.
import warnings
from copy import deepcopy
from os import path as osp
from pathlib import Path
from typing import Optional, Sequence, Union
import mmengine
import numpy as np
import torch
import torch.nn as nn
from mmengine.config import Config
from mmengine.dataset import Compose, pseudo_collate
from mmengine.registry import init_default_scope
from mmengine.runner import load_checkpoint
from mmdet3d.registry import DATASETS, MODELS
from mmdet3d.structures import Box3DMode, Det3DDataSample, get_box_type
from mmdet3d.structures.det3d_data_sample import SampleList
def convert_SyncBN(config):
"""Convert config's naiveSyncBN to BN.
Args:
config (str or :obj:`mmengine.Config`): Config file path or the config
object.
"""
if isinstance(config, dict):
for item in config:
if item == 'norm_cfg':
config[item]['type'] = config[item]['type']. \
replace('naiveSyncBN', 'BN')
else:
convert_SyncBN(config[item])
def init_model(config: Union[str, Path, Config],
checkpoint: Optional[str] = None,
device: str = 'cuda:0',
palette: str = 'none',
cfg_options: Optional[dict] = None):
"""Initialize a model from config file, which could be a 3D detector or a
3D segmentor.
Args:
config (str, :obj:`Path`, or :obj:`mmengine.Config`): Config file path,
:obj:`Path`, or the config object.
checkpoint (str, optional): Checkpoint path. If left as None, the model
will not load any weights.
device (str): Device to use.
palette (str): Color palette used for visualization.
Defaults to 'none'.
cfg_options (dict, optional): Options to override some settings in
the used config.
Returns:
nn.Module: The constructed detector.
"""
if isinstance(config, (str, Path)):
config = Config.fromfile(config)
elif not isinstance(config, Config):
raise TypeError('config must be a filename or Config object, '
f'but got {type(config)}')
if cfg_options is not None:
config.merge_from_dict(cfg_options)
convert_SyncBN(config.model)
config.model.train_cfg = None
init_default_scope(config.get('default_scope', 'mmdet3d'))
model = MODELS.build(config.model)
if checkpoint is not None:
checkpoint = load_checkpoint(model, checkpoint, map_location='cpu')
# save the dataset_meta in the model for convenience
if 'dataset_meta' in checkpoint.get('meta', {}):
# mmdet3d 1.x
model.dataset_meta = checkpoint['meta']['dataset_meta']
elif 'CLASSES' in checkpoint.get('meta', {}):
# < mmdet3d 1.x
classes = checkpoint['meta']['CLASSES']
model.dataset_meta = {'classes': classes}
if 'PALETTE' in checkpoint.get('meta', {}): # 3D Segmentor
model.dataset_meta['palette'] = checkpoint['meta']['PALETTE']
else:
# < mmdet3d 1.x
model.dataset_meta = {'classes': config.class_names}
if 'PALETTE' in checkpoint.get('meta', {}): # 3D Segmentor
model.dataset_meta['palette'] = checkpoint['meta']['PALETTE']
test_dataset_cfg = deepcopy(config.test_dataloader.dataset)
# lazy init. We only need the metainfo.
test_dataset_cfg['lazy_init'] = True
metainfo = DATASETS.build(test_dataset_cfg).metainfo
cfg_palette = metainfo.get('palette', None)
if cfg_palette is not None:
model.dataset_meta['palette'] = cfg_palette
else:
if 'palette' not in model.dataset_meta:
warnings.warn(
'palette does not exist, random is used by default. '
'You can also set the palette to customize.')
model.dataset_meta['palette'] = 'random'
model.cfg = config # save the config in the model for convenience
if device != 'cpu':
torch.cuda.set_device(device)
else:
warnings.warn('We do not suggest using the CPU device. '
'Some functions are not supported for now.')
model.to(device)
model.eval()
return model
PointsType = Union[str, np.ndarray, Sequence[str], Sequence[np.ndarray]]
ImagesType = Union[str, np.ndarray, Sequence[str], Sequence[np.ndarray]]
def inference_detector(model: nn.Module,
pcds: PointsType) -> Union[Det3DDataSample, SampleList]:
"""Inference point cloud with the detector.
Args:
model (nn.Module): The loaded detector.
pcds (str, ndarray, Sequence[str/ndarray]):
Either point cloud files or loaded point cloud.
Returns:
:obj:`Det3DDataSample` or list[:obj:`Det3DDataSample`]:
If pcds is a list or tuple, the same length list type results
will be returned, otherwise return the detection results directly.
"""
if isinstance(pcds, (list, tuple)):
is_batch = True
else:
pcds = [pcds]
is_batch = False
cfg = model.cfg
if not isinstance(pcds[0], str):
cfg = cfg.copy()
# set loading pipeline type
cfg.test_dataloader.dataset.pipeline[0].type = 'LoadPointsFromDict'
# build the data pipeline
test_pipeline = deepcopy(cfg.test_dataloader.dataset.pipeline)
test_pipeline = Compose(test_pipeline)
box_type_3d, box_mode_3d = \
get_box_type(cfg.test_dataloader.dataset.box_type_3d)
data = []
for pcd in pcds:
# prepare data
if isinstance(pcd, str):
# load from point cloud file
data_ = dict(
lidar_points=dict(lidar_path=pcd),
timestamp=1,
# for ScanNet demo we need axis_align_matrix
axis_align_matrix=np.eye(4),
box_type_3d=box_type_3d,
box_mode_3d=box_mode_3d)
else:
# directly use loaded point cloud
data_ = dict(
points=pcd,
timestamp=1,
# for ScanNet demo we need axis_align_matrix
axis_align_matrix=np.eye(4),
box_type_3d=box_type_3d,
box_mode_3d=box_mode_3d)
data_ = test_pipeline(data_)
data.append(data_)
collate_data = pseudo_collate(data)
# forward the model
with torch.no_grad():
results = model.test_step(collate_data)
if not is_batch:
return results[0], data[0]
else:
return results, data
def inference_multi_modality_detector(model: nn.Module,
pcds: Union[str, Sequence[str]],
imgs: Union[str, Sequence[str]],
ann_file: Union[str, Sequence[str]],
cam_type: str = 'CAM2'):
"""Inference point cloud with the multi-modality detector. Now we only
support multi-modality detector for KITTI and SUNRGBD datasets since the
multi-view image loading is not supported yet in this inference function.
Args:
model (nn.Module): The loaded detector.
pcds (str, Sequence[str]):
Either point cloud files or loaded point cloud.
imgs (str, Sequence[str]):
Either image files or loaded images.
ann_file (str, Sequence[str]): Annotation files.
cam_type (str): The camera view chosen for inference. When the detector
only uses a single-view image, we need to specify the camera view. For
the KITTI dataset it should be 'CAM2'. For SUNRGBD it should be 'CAM0'.
When the detector uses multi-view images, it should be set to 'all'.
Returns:
:obj:`Det3DDataSample` or list[:obj:`Det3DDataSample`]:
If pcds is a list or tuple, the same length list type results
will be returned, otherwise return the detection results directly.
"""
if isinstance(pcds, (list, tuple)):
is_batch = True
assert isinstance(imgs, (list, tuple))
assert len(pcds) == len(imgs)
else:
pcds = [pcds]
imgs = [imgs]
is_batch = False
cfg = model.cfg
# build the data pipeline
test_pipeline = deepcopy(cfg.test_dataloader.dataset.pipeline)
test_pipeline = Compose(test_pipeline)
box_type_3d, box_mode_3d = \
get_box_type(cfg.test_dataloader.dataset.box_type_3d)
data_list = mmengine.load(ann_file)['data_list']
data = []
for index, pcd in enumerate(pcds):
# get data info containing calib
data_info = data_list[index]
img = imgs[index]
if cam_type != 'all':
assert osp.isfile(img), f'{img} must be a file.'
img_path = data_info['images'][cam_type]['img_path']
if osp.basename(img_path) != osp.basename(img):
raise ValueError(
f'the info file of {img_path} is not provided.')
data_ = dict(
lidar_points=dict(lidar_path=pcd),
img_path=img,
box_type_3d=box_type_3d,
box_mode_3d=box_mode_3d)
data_info['images'][cam_type]['img_path'] = img
if 'cam2img' in data_info['images'][cam_type]:
# The data annotation in the SUNRGBD dataset does not contain
# `cam2img`
data_['cam2img'] = np.array(
data_info['images'][cam_type]['cam2img'])
# LiDAR to image conversion for KITTI dataset
if box_mode_3d == Box3DMode.LIDAR:
if 'lidar2img' in data_info['images'][cam_type]:
data_['lidar2img'] = np.array(
data_info['images'][cam_type]['lidar2img'])
# Depth to image conversion for SUNRGBD dataset
elif box_mode_3d == Box3DMode.DEPTH:
data_['depth2img'] = np.array(
data_info['images'][cam_type]['depth2img'])
else:
assert osp.isdir(img), f'{img} must be a directory.'
for _, img_info in data_info['images'].items():
img_info['img_path'] = osp.join(img, img_info['img_path'])
assert osp.isfile(img_info['img_path']
), f'{img_info["img_path"]} does not exist.'
data_ = dict(
lidar_points=dict(lidar_path=pcd),
images=data_info['images'],
box_type_3d=box_type_3d,
box_mode_3d=box_mode_3d)
if 'timestamp' in data_info:
# Using multi-sweeps need `timestamp`
data_['timestamp'] = data_info['timestamp']
data_ = test_pipeline(data_)
data.append(data_)
collate_data = pseudo_collate(data)
# forward the model
with torch.no_grad():
results = model.test_step(collate_data)
if not is_batch:
return results[0], data[0]
else:
return results, data
def inference_mono_3d_detector(model: nn.Module,
imgs: ImagesType,
ann_file: Union[str, Sequence[str]],
cam_type: str = 'CAM_FRONT'):
"""Inference image with the monocular 3D detector.
Args:
model (nn.Module): The loaded detector.
imgs (str, Sequence[str]):
Either image files or loaded images.
ann_file (str, Sequence[str]): Annotation files.
cam_type (str): The camera view chosen for inference.
For the KITTI dataset it should be 'CAM2',
and for the nuScenes dataset it should be
'CAM_FRONT'. Defaults to 'CAM_FRONT'.
Returns:
:obj:`Det3DDataSample` or list[:obj:`Det3DDataSample`]:
If imgs is a list or tuple, the same length list type results
will be returned, otherwise return the detection results directly.
"""
if isinstance(imgs, (list, tuple)):
is_batch = True
else:
imgs = [imgs]
is_batch = False
cfg = model.cfg
# build the data pipeline
test_pipeline = deepcopy(cfg.test_dataloader.dataset.pipeline)
test_pipeline = Compose(test_pipeline)
box_type_3d, box_mode_3d = \
get_box_type(cfg.test_dataloader.dataset.box_type_3d)
data_list = mmengine.load(ann_file)['data_list']
assert len(imgs) == len(data_list)
data = []
for index, img in enumerate(imgs):
# get data info containing calib
data_info = data_list[index]
img_path = data_info['images'][cam_type]['img_path']
if osp.basename(img_path) != osp.basename(img):
raise ValueError(f'the info file of {img_path} is not provided.')
# replace the img_path in data_info with img
data_info['images'][cam_type]['img_path'] = img
# avoid data_info['images'] having multiple keys about camera views.
mono_img_info = {f'{cam_type}': data_info['images'][cam_type]}
data_ = dict(
images=mono_img_info,
box_type_3d=box_type_3d,
box_mode_3d=box_mode_3d)
data_ = test_pipeline(data_)
data.append(data_)
collate_data = pseudo_collate(data)
# forward the model
with torch.no_grad():
results = model.test_step(collate_data)
if not is_batch:
return results[0]
else:
return results
def inference_segmentor(model: nn.Module, pcds: PointsType):
"""Inference point cloud with the segmentor.
Args:
model (nn.Module): The loaded segmentor.
pcds (str, Sequence[str]):
Either point cloud files or loaded point cloud.
Returns:
:obj:`Det3DDataSample` or list[:obj:`Det3DDataSample`]:
If pcds is a list or tuple, the same length list type results
will be returned, otherwise return the detection results directly.
"""
if isinstance(pcds, (list, tuple)):
is_batch = True
else:
pcds = [pcds]
is_batch = False
cfg = model.cfg
# build the data pipeline
test_pipeline = deepcopy(cfg.test_dataloader.dataset.pipeline)
new_test_pipeline = []
for pipeline in test_pipeline:
if pipeline['type'] != 'LoadAnnotations3D' and pipeline[
'type'] != 'PointSegClassMapping':
new_test_pipeline.append(pipeline)
test_pipeline = Compose(new_test_pipeline)
data = []
# TODO: support load points array
for pcd in pcds:
data_ = dict(lidar_points=dict(lidar_path=pcd))
data_ = test_pipeline(data_)
data.append(data_)
collate_data = pseudo_collate(data)
# forward the model
with torch.no_grad():
results = model.test_step(collate_data)
if not is_batch:
return results[0], data[0]
else:
return results, data
# Copyright (c) OpenMMLab. All rights reserved.
from .base_3d_inferencer import Base3DInferencer
from .lidar_det3d_inferencer import LidarDet3DInferencer
from .lidar_seg3d_inferencer import LidarSeg3DInferencer
from .mono_det3d_inferencer import MonoDet3DInferencer
from .multi_modality_det3d_inferencer import MultiModalityDet3DInferencer
__all__ = [
'Base3DInferencer', 'MonoDet3DInferencer', 'LidarDet3DInferencer',
'LidarSeg3DInferencer', 'MultiModalityDet3DInferencer'
]
# Copyright (c) OpenMMLab. All rights reserved.
import logging
import os.path as osp
from copy import deepcopy
from typing import Dict, List, Optional, Sequence, Tuple, Union
import numpy as np
import torch.nn as nn
from mmengine import dump, print_log
from mmengine.infer.infer import BaseInferencer, ModelType
from mmengine.model.utils import revert_sync_batchnorm
from mmengine.registry import init_default_scope
from mmengine.runner import load_checkpoint
from mmengine.structures import InstanceData
from mmengine.visualization import Visualizer
from rich.progress import track
from mmdet3d.registry import DATASETS, MODELS
from mmdet3d.structures import Box3DMode, Det3DDataSample
from mmdet3d.utils import ConfigType
InstanceList = List[InstanceData]
InputType = Union[str, np.ndarray]
InputsType = Union[InputType, Sequence[InputType]]
PredType = Union[InstanceData, InstanceList]
ImgType = Union[np.ndarray, Sequence[np.ndarray]]
ResType = Union[Dict, List[Dict], InstanceData, List[InstanceData]]
class Base3DInferencer(BaseInferencer):
"""Base 3D model inferencer.
Args:
model (str, optional): Path to the config file or the model name
defined in metafile. For example, it could be
"pgd-kitti" or
"configs/pgd/pgd_r101-caffe_fpn_head-gn_4xb3-4x_kitti-mono3d.py".
If model is not specified, user must provide the
`weights` saved by MMEngine which contains the config string.
Defaults to None.
weights (str, optional): Path to the checkpoint. If it is not specified
and model is a model name of metafile, the weights will be loaded
from metafile. Defaults to None.
device (str, optional): Device to run inference. If None, the available
device will be automatically used. Defaults to None.
scope (str): The scope of the model. Defaults to 'mmdet3d'.
palette (str): Color palette used for visualization. The order of
priority is palette -> config -> checkpoint. Defaults to 'none'.
"""
preprocess_kwargs: set = {'cam_type'}
forward_kwargs: set = set()
visualize_kwargs: set = {
'return_vis', 'show', 'wait_time', 'draw_pred', 'pred_score_thr',
'img_out_dir', 'no_save_vis', 'cam_type_dir'
}
postprocess_kwargs: set = {
'print_result', 'pred_out_dir', 'return_datasample', 'no_save_pred'
}
def __init__(self,
model: Union[ModelType, str, None] = None,
weights: Optional[str] = None,
device: Optional[str] = None,
scope: str = 'mmdet3d',
palette: str = 'none') -> None:
# A global counter tracking the number of frames processed, for
# naming of the output results
self.num_predicted_frames = 0
self.palette = palette
init_default_scope(scope)
super().__init__(
model=model, weights=weights, device=device, scope=scope)
self.model = revert_sync_batchnorm(self.model)
def _convert_syncbn(self, cfg: ConfigType):
"""Convert config's naiveSyncBN to BN.
Args:
config (str or :obj:`mmengine.Config`): Config file path
or the config object.
"""
if isinstance(cfg, dict):
for item in cfg:
if item == 'norm_cfg':
cfg[item]['type'] = cfg[item]['type']. \
replace('naiveSyncBN', 'BN')
else:
self._convert_syncbn(cfg[item])
def _init_model(
self,
cfg: ConfigType,
weights: str,
device: str = 'cpu',
) -> nn.Module:
self._convert_syncbn(cfg.model)
cfg.model.train_cfg = None
model = MODELS.build(cfg.model)
checkpoint = load_checkpoint(model, weights, map_location='cpu')
if 'dataset_meta' in checkpoint.get('meta', {}):
# mmdet3d 1.x
model.dataset_meta = checkpoint['meta']['dataset_meta']
elif 'CLASSES' in checkpoint.get('meta', {}):
# < mmdet3d 1.x
classes = checkpoint['meta']['CLASSES']
model.dataset_meta = {'classes': classes}
if 'PALETTE' in checkpoint.get('meta', {}): # 3D Segmentor
model.dataset_meta['palette'] = checkpoint['meta']['PALETTE']
else:
# < mmdet3d 1.x
model.dataset_meta = {'classes': cfg.class_names}
if 'PALETTE' in checkpoint.get('meta', {}): # 3D Segmentor
model.dataset_meta['palette'] = checkpoint['meta']['PALETTE']
test_dataset_cfg = deepcopy(cfg.test_dataloader.dataset)
# lazy init. We only need the metainfo.
test_dataset_cfg['lazy_init'] = True
metainfo = DATASETS.build(test_dataset_cfg).metainfo
cfg_palette = metainfo.get('palette', None)
if cfg_palette is not None:
model.dataset_meta['palette'] = cfg_palette
model.cfg = cfg # save the config in the model for convenience
model.to(device)
model.eval()
return model
def _get_transform_idx(self, pipeline_cfg: ConfigType, name: str) -> int:
"""Returns the index of the transform in a pipeline.
If the transform is not found, returns -1.
"""
for i, transform in enumerate(pipeline_cfg):
if transform['type'] == name:
return i
return -1
def _init_visualizer(self, cfg: ConfigType) -> Optional[Visualizer]:
visualizer = super()._init_visualizer(cfg)
visualizer.dataset_meta = self.model.dataset_meta
return visualizer
def _dispatch_kwargs(self,
out_dir: str = '',
cam_type: str = '',
**kwargs) -> Tuple[Dict, Dict, Dict, Dict]:
"""Dispatch kwargs to preprocess(), forward(), visualize() and
postprocess() according to the actual demands.
Args:
out_dir (str): Dir to save the inference results.
cam_type (str): Camera type. Defaults to ''.
**kwargs (dict): Key words arguments passed to :meth:`preprocess`,
:meth:`forward`, :meth:`visualize` and :meth:`postprocess`.
Each key in kwargs should be in the corresponding set of
``preprocess_kwargs``, ``forward_kwargs``, ``visualize_kwargs``
and ``postprocess_kwargs``.
Returns:
Tuple[Dict, Dict, Dict, Dict]: kwargs passed to preprocess,
forward, visualize and postprocess respectively.
"""
kwargs['img_out_dir'] = out_dir
kwargs['pred_out_dir'] = out_dir
if cam_type != '':
kwargs['cam_type_dir'] = cam_type
return super()._dispatch_kwargs(**kwargs)
def __call__(self,
inputs: InputsType,
batch_size: int = 1,
return_datasamples: bool = False,
**kwargs) -> Optional[dict]:
"""Call the inferencer.
Args:
inputs (InputsType): Inputs for the inferencer.
batch_size (int): Batch size. Defaults to 1.
return_datasamples (bool): Whether to return results as
:obj:`BaseDataElement`. Defaults to False.
**kwargs: Key words arguments passed to :meth:`preprocess`,
:meth:`forward`, :meth:`visualize` and :meth:`postprocess`.
Each key in kwargs should be in the corresponding set of
``preprocess_kwargs``, ``forward_kwargs``, ``visualize_kwargs``
and ``postprocess_kwargs``.
Returns:
dict: Inference and visualization results.
"""
(
preprocess_kwargs,
forward_kwargs,
visualize_kwargs,
postprocess_kwargs,
) = self._dispatch_kwargs(**kwargs)
cam_type = preprocess_kwargs.pop('cam_type', 'CAM2')
ori_inputs = self._inputs_to_list(inputs, cam_type=cam_type)
inputs = self.preprocess(
ori_inputs, batch_size=batch_size, **preprocess_kwargs)
preds = []
results_dict = {'predictions': [], 'visualization': []}
for data in (track(inputs, description='Inference')
if self.show_progress else inputs):
preds.extend(self.forward(data, **forward_kwargs))
visualization = self.visualize(ori_inputs, preds,
**visualize_kwargs)
results = self.postprocess(preds, visualization,
return_datasamples,
**postprocess_kwargs)
results_dict['predictions'].extend(results['predictions'])
if results['visualization'] is not None:
results_dict['visualization'].extend(results['visualization'])
return results_dict
def postprocess(
self,
preds: PredType,
visualization: Optional[List[np.ndarray]] = None,
return_datasample: bool = False,
print_result: bool = False,
no_save_pred: bool = False,
pred_out_dir: str = '',
) -> Union[ResType, Tuple[ResType, np.ndarray]]:
"""Process the predictions and visualization results from ``forward``
and ``visualize``.
This method should be responsible for the following tasks:
1. Convert datasamples into a json-serializable dict if needed.
2. Pack the predictions and visualization results and return them.
3. Dump or log the predictions.
Args:
preds (List[Dict]): Predictions of the model.
visualization (np.ndarray, optional): Visualized predictions.
Defaults to None.
return_datasample (bool): Whether to use Datasample to store
inference results. If False, dict will be used.
Defaults to False.
print_result (bool): Whether to print the inference result w/o
visualization to the console. Defaults to False.
pred_out_dir (str): Directory to save the inference results w/o
visualization. If left as empty, no file will be saved.
Defaults to ''.
Returns:
dict: Inference and visualization results with key ``predictions``
and ``visualization``.
- ``visualization`` (Any): Returned by :meth:`visualize`.
- ``predictions`` (dict or DataSample): Returned by
:meth:`forward` and processed in :meth:`postprocess`.
If ``return_datasample=False``, it usually should be a
json-serializable dict containing only basic data elements such
as strings and numbers.
"""
if no_save_pred is True:
pred_out_dir = ''
result_dict = {}
results = preds
if not return_datasample:
results = []
for pred in preds:
result = self.pred2dict(pred, pred_out_dir)
results.append(result)
elif pred_out_dir != '':
print_log(
'Currently does not support saving datasample '
'when return_datasample is set to True. '
'Prediction results are not saved!',
level=logging.WARNING)
# Add img to the results after printing and dumping
result_dict['predictions'] = results
if print_result:
print(result_dict)
result_dict['visualization'] = visualization
return result_dict
# TODO: The data format and fields saved in json need further discussion.
# Maybe should include model name, timestamp, filename, image info etc.
def pred2dict(self,
data_sample: Det3DDataSample,
pred_out_dir: str = '') -> Dict:
"""Extract elements necessary to represent a prediction into a
dictionary.
It's better to contain only basic data elements such as strings and
numbers in order to guarantee it's json-serializable.
Args:
data_sample (:obj:`Det3DDataSample`): Predictions of the model.
pred_out_dir: Dir to save the inference results w/o
visualization. If left as empty, no file will be saved.
Defaults to ''.
Returns:
dict: Prediction results.
"""
result = {}
if 'pred_instances_3d' in data_sample:
pred_instances_3d = data_sample.pred_instances_3d.numpy()
result = {
'labels_3d': pred_instances_3d.labels_3d.tolist(),
'scores_3d': pred_instances_3d.scores_3d.tolist(),
'bboxes_3d': pred_instances_3d.bboxes_3d.tensor.cpu().tolist()
}
if 'pred_pts_seg' in data_sample:
pred_pts_seg = data_sample.pred_pts_seg.numpy()
result['pts_semantic_mask'] = \
pred_pts_seg.pts_semantic_mask.tolist()
if data_sample.box_mode_3d == Box3DMode.LIDAR:
result['box_type_3d'] = 'LiDAR'
elif data_sample.box_mode_3d == Box3DMode.CAM:
result['box_type_3d'] = 'Camera'
elif data_sample.box_mode_3d == Box3DMode.DEPTH:
result['box_type_3d'] = 'Depth'
if pred_out_dir != '':
if 'lidar_path' in data_sample:
lidar_path = osp.basename(data_sample.lidar_path)
lidar_path = osp.splitext(lidar_path)[0]
out_json_path = osp.join(pred_out_dir, 'preds',
lidar_path + '.json')
elif 'img_path' in data_sample:
img_path = osp.basename(data_sample.img_path)
img_path = osp.splitext(img_path)[0]
out_json_path = osp.join(pred_out_dir, 'preds',
img_path + '.json')
else:
out_json_path = osp.join(
pred_out_dir, 'preds',
f'{str(self.num_predicted_frames).zfill(8)}.json')
dump(result, out_json_path)
return result
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
from typing import Dict, List, Optional, Sequence, Union
import mmengine
import numpy as np
import torch
from mmengine.dataset import Compose
from mmengine.fileio import (get_file_backend, isdir, join_path,
list_dir_or_file)
from mmengine.infer.infer import ModelType
from mmengine.structures import InstanceData
from mmdet3d.registry import INFERENCERS
from mmdet3d.structures import (CameraInstance3DBoxes, DepthInstance3DBoxes,
Det3DDataSample, LiDARInstance3DBoxes)
from mmdet3d.utils import ConfigType
from .base_3d_inferencer import Base3DInferencer
InstanceList = List[InstanceData]
InputType = Union[str, np.ndarray]
InputsType = Union[InputType, Sequence[InputType]]
PredType = Union[InstanceData, InstanceList]
ImgType = Union[np.ndarray, Sequence[np.ndarray]]
ResType = Union[Dict, List[Dict], InstanceData, List[InstanceData]]
@INFERENCERS.register_module(name='det3d-lidar')
@INFERENCERS.register_module()
class LidarDet3DInferencer(Base3DInferencer):
"""The inferencer of LiDAR-based detection.
Args:
model (str, optional): Path to the config file or the model name
defined in metafile. For example, it could be
"pointpillars_kitti-3class" or
"configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py". # noqa: E501
If model is not specified, user must provide the
`weights` saved by MMEngine which contains the config string.
Defaults to None.
weights (str, optional): Path to the checkpoint. If it is not specified
and model is a model name of metafile, the weights will be loaded
from metafile. Defaults to None.
device (str, optional): Device to run inference. If None, the available
device will be automatically used. Defaults to None.
scope (str): The scope of the model. Defaults to 'mmdet3d'.
palette (str): Color palette used for visualization. The order of
priority is palette -> config -> checkpoint. Defaults to 'none'.
"""
def __init__(self,
model: Union[ModelType, str, None] = None,
weights: Optional[str] = None,
device: Optional[str] = None,
scope: str = 'mmdet3d',
palette: str = 'none') -> None:
# A global counter tracking the number of frames processed, for
# naming of the output results
self.num_visualized_frames = 0
super(LidarDet3DInferencer, self).__init__(
model=model,
weights=weights,
device=device,
scope=scope,
palette=palette)
def _inputs_to_list(self, inputs: Union[dict, list], **kwargs) -> list:
"""Preprocess the inputs to a list.
Preprocess inputs to a list according to its type:
- list or tuple: return inputs
- dict: the value with key 'points' is
- Directory path: return all files in the directory
- other cases: return a list containing the string. The string
could be a path to file, a url or other types of string according
to the task.
Args:
inputs (Union[dict, list]): Inputs for the inferencer.
Returns:
list: List of input for the :meth:`preprocess`.
"""
if isinstance(inputs, dict) and isinstance(inputs['points'], str):
pcd = inputs['points']
backend = get_file_backend(pcd)
if hasattr(backend, 'isdir') and isdir(pcd):
# Backends like HttpsBackend do not implement `isdir`, so
# only those backends that implement `isdir` could accept
# the inputs as a directory
filename_list = list_dir_or_file(pcd, list_dir=False)
inputs = [{
'points': join_path(pcd, filename)
} for filename in filename_list]
if not isinstance(inputs, (list, tuple)):
inputs = [inputs]
return list(inputs)
def _init_pipeline(self, cfg: ConfigType) -> Compose:
"""Initialize the test pipeline."""
pipeline_cfg = cfg.test_dataloader.dataset.pipeline
load_point_idx = self._get_transform_idx(pipeline_cfg,
'LoadPointsFromFile')
if load_point_idx == -1:
raise ValueError(
'LoadPointsFromFile is not found in the test pipeline')
load_cfg = pipeline_cfg[load_point_idx]
self.coord_type, self.load_dim = load_cfg['coord_type'], load_cfg[
'load_dim']
self.use_dim = list(range(load_cfg['use_dim'])) if isinstance(
load_cfg['use_dim'], int) else load_cfg['use_dim']
pipeline_cfg[load_point_idx]['type'] = 'LidarDet3DInferencerLoader'
return Compose(pipeline_cfg)
def visualize(self,
inputs: InputsType,
preds: PredType,
return_vis: bool = False,
show: bool = False,
wait_time: int = -1,
draw_pred: bool = True,
pred_score_thr: float = 0.3,
no_save_vis: bool = False,
img_out_dir: str = '') -> Union[List[np.ndarray], None]:
"""Visualize predictions.
Args:
inputs (InputsType): Inputs for the inferencer.
preds (PredType): Predictions of the model.
return_vis (bool): Whether to return the visualization result.
Defaults to False.
show (bool): Whether to display the image in a popup window.
Defaults to False.
wait_time (float): The interval of show (s). Defaults to -1.
draw_pred (bool): Whether to draw predicted bounding boxes.
Defaults to True.
pred_score_thr (float): Minimum score of bboxes to draw.
Defaults to 0.3.
no_save_vis (bool): Whether to force not to save prediction
vis results. Defaults to False.
img_out_dir (str): Output directory of visualization results.
If left as empty, no file will be saved. Defaults to ''.
Returns:
List[np.ndarray] or None: Returns visualization results only if
applicable.
"""
if no_save_vis is True:
img_out_dir = ''
if not show and img_out_dir == '' and not return_vis:
return None
if getattr(self, 'visualizer') is None:
raise ValueError('Visualization needs the "visualizer" term '
'defined in the config, but got None.')
results = []
for single_input, pred in zip(inputs, preds):
single_input = single_input['points']
if isinstance(single_input, str):
pts_bytes = mmengine.fileio.get(single_input)
points = np.frombuffer(pts_bytes, dtype=np.float32)
points = points.reshape(-1, self.load_dim)
points = points[:, self.use_dim]
pc_name = osp.basename(single_input).split('.bin')[0]
pc_name = f'{pc_name}.png'
elif isinstance(single_input, np.ndarray):
points = single_input.copy()
pc_num = str(self.num_visualized_frames).zfill(8)
pc_name = f'{pc_num}.png'
else:
raise ValueError('Unsupported input type: '
f'{type(single_input)}')
if img_out_dir != '' and show:
o3d_save_path = osp.join(img_out_dir, 'vis_lidar', pc_name)
mmengine.mkdir_or_exist(osp.dirname(o3d_save_path))
else:
o3d_save_path = None
data_input = dict(points=points)
self.visualizer.add_datasample(
pc_name,
data_input,
pred,
show=show,
wait_time=wait_time,
draw_gt=False,
draw_pred=draw_pred,
pred_score_thr=pred_score_thr,
o3d_save_path=o3d_save_path,
vis_task='lidar_det',
)
results.append(points)
self.num_visualized_frames += 1
return results
def visualize_preds_fromfile(self, inputs: InputsType, preds: PredType,
**kwargs) -> Union[List[np.ndarray], None]:
"""Visualize predictions from `*.json` files.
Args:
inputs (InputsType): Inputs for the inferencer.
preds (PredType): Predictions of the model.
Returns:
List[np.ndarray] or None: Returns visualization results only if
applicable.
"""
data_samples = []
for pred in preds:
pred = mmengine.load(pred)
data_sample = Det3DDataSample()
data_sample.pred_instances_3d = InstanceData()
data_sample.pred_instances_3d.labels_3d = torch.tensor(
pred['labels_3d'])
data_sample.pred_instances_3d.scores_3d = torch.tensor(
pred['scores_3d'])
if pred['box_type_3d'] == 'LiDAR':
data_sample.pred_instances_3d.bboxes_3d = \
LiDARInstance3DBoxes(pred['bboxes_3d'])
elif pred['box_type_3d'] == 'Camera':
data_sample.pred_instances_3d.bboxes_3d = \
CameraInstance3DBoxes(pred['bboxes_3d'])
elif pred['box_type_3d'] == 'Depth':
data_sample.pred_instances_3d.bboxes_3d = \
DepthInstance3DBoxes(pred['bboxes_3d'])
else:
raise ValueError('Unsupported box type: '
f'{pred["box_type_3d"]}')
data_samples.append(data_sample)
return self.visualize(inputs=inputs, preds=data_samples, **kwargs)
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
from typing import Dict, List, Optional, Sequence, Union
import mmengine
import numpy as np
from mmengine.dataset import Compose
from mmengine.fileio import (get_file_backend, isdir, join_path,
list_dir_or_file)
from mmengine.infer.infer import ModelType
from mmengine.structures import InstanceData
from mmdet3d.registry import INFERENCERS
from mmdet3d.utils import ConfigType
from .base_3d_inferencer import Base3DInferencer
InstanceList = List[InstanceData]
InputType = Union[str, np.ndarray]
InputsType = Union[InputType, Sequence[InputType]]
PredType = Union[InstanceData, InstanceList]
ImgType = Union[np.ndarray, Sequence[np.ndarray]]
ResType = Union[Dict, List[Dict], InstanceData, List[InstanceData]]
@INFERENCERS.register_module(name='seg3d-lidar')
@INFERENCERS.register_module()
class LidarSeg3DInferencer(Base3DInferencer):
"""The inferencer of LiDAR-based segmentation.
Args:
model (str, optional): Path to the config file or the model name
defined in metafile. For example, it could be
"pointnet2-ssg_s3dis-seg" or
"configs/pointnet2/pointnet2_ssg_2xb16-cosine-50e_s3dis-seg.py".
If model is not specified, user must provide the
`weights` saved by MMEngine which contains the config string.
Defaults to None.
weights (str, optional): Path to the checkpoint. If it is not specified
and model is a model name of metafile, the weights will be loaded
from metafile. Defaults to None.
device (str, optional): Device to run inference. If None, the available
device will be automatically used. Defaults to None.
scope (str): The scope of the model. Defaults to 'mmdet3d'.
palette (str): Color palette used for visualization. The order of
priority is palette -> config -> checkpoint. Defaults to 'none'.
"""
def __init__(self,
model: Union[ModelType, str, None] = None,
weights: Optional[str] = None,
device: Optional[str] = None,
scope: str = 'mmdet3d',
palette: str = 'none') -> None:
# A global counter tracking the number of frames processed, for
# naming of the output results
self.num_visualized_frames = 0
super(LidarSeg3DInferencer, self).__init__(
model=model,
weights=weights,
device=device,
scope=scope,
palette=palette)
def _inputs_to_list(self, inputs: Union[dict, list], **kwargs) -> list:
"""Preprocess the inputs to a list.
Preprocess inputs to a list according to its type:
- list or tuple: return inputs
- dict: the value with key 'points' is
- Directory path: return all files in the directory
- other cases: return a list containing the string. The string
could be a path to file, a url or other types of string according
to the task.
Args:
inputs (Union[dict, list]): Inputs for the inferencer.
Returns:
list: List of input for the :meth:`preprocess`.
"""
if isinstance(inputs, dict) and isinstance(inputs['points'], str):
pcd = inputs['points']
backend = get_file_backend(pcd)
if hasattr(backend, 'isdir') and isdir(pcd):
# Backends like HttpsBackend do not implement `isdir`, so
# only those backends that implement `isdir` could accept
# the inputs as a directory
filename_list = list_dir_or_file(pcd, list_dir=False)
inputs = [{
'points': join_path(pcd, filename)
} for filename in filename_list]
if not isinstance(inputs, (list, tuple)):
inputs = [inputs]
return list(inputs)
def _init_pipeline(self, cfg: ConfigType) -> Compose:
"""Initialize the test pipeline."""
pipeline_cfg = cfg.test_dataloader.dataset.pipeline
# Load annotation is also not applicable
idx = self._get_transform_idx(pipeline_cfg, 'LoadAnnotations3D')
if idx != -1:
del pipeline_cfg[idx]
idx = self._get_transform_idx(pipeline_cfg, 'PointSegClassMapping')
if idx != -1:
del pipeline_cfg[idx]
load_point_idx = self._get_transform_idx(pipeline_cfg,
'LoadPointsFromFile')
if load_point_idx == -1:
raise ValueError(
'LoadPointsFromFile is not found in the test pipeline')
load_cfg = pipeline_cfg[load_point_idx]
self.coord_type, self.load_dim = load_cfg['coord_type'], load_cfg[
'load_dim']
self.use_dim = list(range(load_cfg['use_dim'])) if isinstance(
load_cfg['use_dim'], int) else load_cfg['use_dim']
pipeline_cfg[load_point_idx]['type'] = 'LidarDet3DInferencerLoader'
return Compose(pipeline_cfg)
def visualize(self,
inputs: InputsType,
preds: PredType,
return_vis: bool = False,
show: bool = False,
wait_time: int = 0,
draw_pred: bool = True,
pred_score_thr: float = 0.3,
no_save_vis: bool = False,
img_out_dir: str = '') -> Union[List[np.ndarray], None]:
"""Visualize predictions.
Args:
inputs (InputsType): Inputs for the inferencer.
preds (PredType): Predictions of the model.
return_vis (bool): Whether to return the visualization result.
Defaults to False.
show (bool): Whether to display the image in a popup window.
Defaults to False.
wait_time (float): The interval of show (s). Defaults to 0.
draw_pred (bool): Whether to draw predicted bounding boxes.
Defaults to True.
pred_score_thr (float): Minimum score of bboxes to draw.
Defaults to 0.3.
no_save_vis (bool): Whether to force not to save prediction
vis results. Defaults to False.
img_out_dir (str): Output directory of visualization results.
If left as empty, no file will be saved. Defaults to ''.
Returns:
List[np.ndarray] or None: Returns visualization results only if
applicable.
"""
if no_save_vis is True:
img_out_dir = ''
if not show and img_out_dir == '' and not return_vis:
return None
if getattr(self, 'visualizer') is None:
raise ValueError('Visualization needs the "visualizer" term '
'defined in the config, but got None.')
results = []
for single_input, pred in zip(inputs, preds):
single_input = single_input['points']
if isinstance(single_input, str):
pts_bytes = mmengine.fileio.get(single_input)
points = np.frombuffer(pts_bytes, dtype=np.float32)
points = points.reshape(-1, self.load_dim)
points = points[:, self.use_dim]
pc_name = osp.basename(single_input).split('.bin')[0]
pc_name = f'{pc_name}.png'
elif isinstance(single_input, np.ndarray):
points = single_input.copy()
pc_num = str(self.num_visualized_frames).zfill(8)
pc_name = f'{pc_num}.png'
else:
raise ValueError('Unsupported input type: '
f'{type(single_input)}')
if img_out_dir != '' and show:
o3d_save_path = osp.join(img_out_dir, 'vis_lidar', pc_name)
mmengine.mkdir_or_exist(osp.dirname(o3d_save_path))
else:
o3d_save_path = None
data_input = dict(points=points)
self.visualizer.add_datasample(
pc_name,
data_input,
pred,
show=show,
wait_time=wait_time,
draw_gt=False,
draw_pred=draw_pred,
pred_score_thr=pred_score_thr,
o3d_save_path=o3d_save_path,
vis_task='lidar_seg',
)
results.append(points)
self.num_visualized_frames += 1
return results
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
from typing import Dict, List, Optional, Sequence, Union
import mmcv
import mmengine
import numpy as np
from mmengine.dataset import Compose
from mmengine.fileio import (get_file_backend, isdir, join_path,
list_dir_or_file)
from mmengine.infer.infer import ModelType
from mmengine.structures import InstanceData
from mmdet3d.registry import INFERENCERS
from mmdet3d.utils import ConfigType
from .base_3d_inferencer import Base3DInferencer
InstanceList = List[InstanceData]
InputType = Union[str, np.ndarray]
InputsType = Union[InputType, Sequence[InputType]]
PredType = Union[InstanceData, InstanceList]
ImgType = Union[np.ndarray, Sequence[np.ndarray]]
ResType = Union[Dict, List[Dict], InstanceData, List[InstanceData]]
@INFERENCERS.register_module(name='det3d-mono')
@INFERENCERS.register_module()
class MonoDet3DInferencer(Base3DInferencer):
"""MMDet3D Monocular 3D object detection inferencer.
Args:
model (str, optional): Path to the config file or the model name
defined in metafile. For example, it could be
"pgd_kitti" or
"configs/pgd/pgd_r101-caffe_fpn_head-gn_4xb3-4x_kitti-mono3d.py".
If model is not specified, user must provide the
`weights` saved by MMEngine which contains the config string.
Defaults to None.
weights (str, optional): Path to the checkpoint. If it is not specified
and model is a model name of metafile, the weights will be loaded
from metafile. Defaults to None.
device (str, optional): Device to run inference. If None, the available
device will be automatically used. Defaults to None.
scope (str): The scope of the model. Defaults to 'mmdet3d'.
palette (str): Color palette used for visualization. The order of
priority is palette -> config -> checkpoint. Defaults to 'none'.
"""
def __init__(self,
model: Union[ModelType, str, None] = None,
weights: Optional[str] = None,
device: Optional[str] = None,
scope: str = 'mmdet3d',
palette: str = 'none') -> None:
# A global counter tracking the number of images processed, for
# naming of the output images
self.num_visualized_imgs = 0
super(MonoDet3DInferencer, self).__init__(
model=model,
weights=weights,
device=device,
scope=scope,
palette=palette)
def _inputs_to_list(self,
inputs: Union[dict, list],
cam_type='CAM2',
**kwargs) -> list:
"""Preprocess the inputs to a list.
Preprocess inputs to a list according to its type:
- list or tuple: return inputs
- dict: the value with key 'img' is
- Directory path: return all files in the directory
- other cases: return a list containing the string. The string
could be a path to file, a url or other types of string according
to the task.
Args:
inputs (Union[dict, list]): Inputs for the inferencer.
Returns:
list: List of input for the :meth:`preprocess`.
"""
if isinstance(inputs, dict):
assert 'infos' in inputs
infos = inputs.pop('infos')
if isinstance(inputs['img'], str):
img = inputs['img']
backend = get_file_backend(img)
if hasattr(backend, 'isdir') and isdir(img):
# Backends like HttpsBackend do not implement `isdir`, so
# only those backends that implement `isdir` could accept
# the inputs as a directory
filename_list = list_dir_or_file(img, list_dir=False)
inputs = [{
'img': join_path(img, filename)
} for filename in filename_list]
if not isinstance(inputs, (list, tuple)):
inputs = [inputs]
# get cam2img, lidar2cam and lidar2img from infos
info_list = mmengine.load(infos)['data_list']
assert len(info_list) == len(inputs)
for index, input in enumerate(inputs):
data_info = info_list[index]
img_path = data_info['images'][cam_type]['img_path']
if isinstance(input['img'], str) and \
osp.basename(img_path) != osp.basename(input['img']):
raise ValueError(
f'the info file of {img_path} is not provided.')
cam2img = np.asarray(
data_info['images'][cam_type]['cam2img'], dtype=np.float32)
lidar2cam = np.asarray(
data_info['images'][cam_type]['lidar2cam'],
dtype=np.float32)
if 'lidar2img' in data_info['images'][cam_type]:
lidar2img = np.asarray(
data_info['images'][cam_type]['lidar2img'],
dtype=np.float32)
else:
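# compose lidar2img from cam2img and lidar2cam when the info
# file does not provide it directly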
lidar2img = cam2img @ lidar2cam
input['cam2img'] = cam2img
input['lidar2cam'] = lidar2cam
input['lidar2img'] = lidar2img
elif isinstance(inputs, (list, tuple)):
# get cam2img, lidar2cam and lidar2img from infos
for input in inputs:
assert 'infos' in input
infos = input.pop('infos')
info_list = mmengine.load(infos)['data_list']
assert len(info_list) == 1, 'Only support single sample info ' \
'in `.pkl`, when inputs is a list.'
data_info = info_list[0]
img_path = data_info['images'][cam_type]['img_path']
if isinstance(input['img'], str) and \
osp.basename(img_path) != osp.basename(input['img']):
raise ValueError(
f'the info file of {img_path} is not provided.')
cam2img = np.asarray(
data_info['images'][cam_type]['cam2img'], dtype=np.float32)
lidar2cam = np.asarray(
data_info['images'][cam_type]['lidar2cam'],
dtype=np.float32)
if 'lidar2img' in data_info['images'][cam_type]:
lidar2img = np.asarray(
data_info['images'][cam_type]['lidar2img'],
dtype=np.float32)
else:
lidar2img = cam2img @ lidar2cam
input['cam2img'] = cam2img
input['lidar2cam'] = lidar2cam
input['lidar2img'] = lidar2img
return list(inputs)
def _init_pipeline(self, cfg: ConfigType) -> Compose:
"""Initialize the test pipeline."""
pipeline_cfg = cfg.test_dataloader.dataset.pipeline
load_img_idx = self._get_transform_idx(pipeline_cfg,
'LoadImageFromFileMono3D')
if load_img_idx == -1:
raise ValueError(
'LoadImageFromFileMono3D is not found in the test pipeline')
pipeline_cfg[load_img_idx]['type'] = 'MonoDet3DInferencerLoader'
return Compose(pipeline_cfg)
def visualize(self,
inputs: InputsType,
preds: PredType,
return_vis: bool = False,
show: bool = False,
wait_time: int = 0,
draw_pred: bool = True,
pred_score_thr: float = 0.3,
no_save_vis: bool = False,
img_out_dir: str = '',
cam_type_dir: str = 'CAM2') -> Union[List[np.ndarray], None]:
"""Visualize predictions.
Args:
inputs (List[Dict]): Inputs for the inferencer.
preds (List[Dict]): Predictions of the model.
return_vis (bool): Whether to return the visualization result.
Defaults to False.
show (bool): Whether to display the image in a popup window.
Defaults to False.
wait_time (float): The interval of show (s). Defaults to 0.
draw_pred (bool): Whether to draw predicted bounding boxes.
Defaults to True.
pred_score_thr (float): Minimum score of bboxes to draw.
Defaults to 0.3.
no_save_vis (bool): Whether to force not to save prediction
vis results. Defaults to False.
img_out_dir (str): Output directory of visualization results.
If left as empty, no file will be saved. Defaults to ''.
cam_type_dir (str): Camera type directory. Defaults to 'CAM2'.
Returns:
List[np.ndarray] or None: Returns visualization results only if
applicable.
"""
if no_save_vis is True:
img_out_dir = ''
if not show and img_out_dir == '' and not return_vis:
return None
if getattr(self, 'visualizer') is None:
raise ValueError('Visualization needs the "visualizer" term '
'defined in the config, but got None.')
results = []
for single_input, pred in zip(inputs, preds):
if isinstance(single_input['img'], str):
img_bytes = mmengine.fileio.get(single_input['img'])
img = mmcv.imfrombytes(img_bytes)
img = img[:, :, ::-1]
img_name = osp.basename(single_input['img'])
elif isinstance(single_input['img'], np.ndarray):
img = single_input['img'].copy()
img_num = str(self.num_visualized_imgs).zfill(8)
img_name = f'{img_num}.jpg'
else:
raise ValueError('Unsupported input type: '
f"{type(single_input['img'])}")
out_file = osp.join(img_out_dir, 'vis_camera', cam_type_dir,
img_name) if img_out_dir != '' else None
data_input = dict(img=img)
self.visualizer.add_datasample(
img_name,
data_input,
pred,
show=show,
wait_time=wait_time,
draw_gt=False,
draw_pred=draw_pred,
pred_score_thr=pred_score_thr,
out_file=out_file,
vis_task='mono_det',
)
results.append(img)
self.num_visualized_imgs += 1
return results
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import warnings
from typing import Dict, List, Optional, Sequence, Union
import mmcv
import mmengine
import numpy as np
from mmengine.dataset import Compose
from mmengine.fileio import (get_file_backend, isdir, join_path,
list_dir_or_file)
from mmengine.infer.infer import ModelType
from mmengine.structures import InstanceData
from mmdet3d.registry import INFERENCERS
from mmdet3d.utils import ConfigType
from .base_3d_inferencer import Base3DInferencer
InstanceList = List[InstanceData]
InputType = Union[str, np.ndarray]
InputsType = Union[InputType, Sequence[InputType]]
PredType = Union[InstanceData, InstanceList]
ImgType = Union[np.ndarray, Sequence[np.ndarray]]
ResType = Union[Dict, List[Dict], InstanceData, List[InstanceData]]
@INFERENCERS.register_module(name='det3d-multi_modality')
@INFERENCERS.register_module()
class MultiModalityDet3DInferencer(Base3DInferencer):
"""The inferencer of multi-modality detection.
Args:
model (str, optional): Path to the config file or the model name
defined in metafile. For example, it could be
"pointpillars_kitti-3class" or
"configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py". # noqa: E501
If model is not specified, user must provide the
`weights` saved by MMEngine which contains the config string.
Defaults to None.
weights (str, optional): Path to the checkpoint. If it is not specified
and model is a model name of metafile, the weights will be loaded
from metafile. Defaults to None.
device (str, optional): Device to run inference. If None, the available
device will be automatically used. Defaults to None.
scope (str): The scope of registry. Defaults to 'mmdet3d'.
palette (str): The palette of visualization. Defaults to 'none'.
"""
def __init__(self,
model: Union[ModelType, str, None] = None,
weights: Optional[str] = None,
device: Optional[str] = None,
scope: str = 'mmdet3d',
palette: str = 'none') -> None:
# A global counter tracking the number of frames processed, for
# naming of the output results
self.num_visualized_frames = 0
super(MultiModalityDet3DInferencer, self).__init__(
model=model,
weights=weights,
device=device,
scope=scope,
palette=palette)
def _inputs_to_list(self,
inputs: Union[dict, list],
cam_type: str = 'CAM2',
**kwargs) -> list:
"""Preprocess the inputs to a list.
Preprocess inputs to a list according to its type:
- list or tuple: return inputs
- dict: the value with key 'points' is
- Directory path: return all files in the directory
- other cases: return a list containing the string. The string
could be a path to file, a url or other types of string according
to the task.
Args:
inputs (Union[dict, list]): Inputs for the inferencer.
Returns:
list: List of input for the :meth:`preprocess`.
"""
if isinstance(inputs, dict):
assert 'infos' in inputs
infos = inputs.pop('infos')
if isinstance(inputs['img'], str):
img, pcd = inputs['img'], inputs['points']
backend = get_file_backend(img)
if hasattr(backend, 'isdir') and isdir(img) and isdir(pcd):
# Backends like HttpsBackend do not implement `isdir`, so
# only those backends that implement `isdir` could accept
# the inputs as a directory
# `list_dir_or_file` returns a generator, so materialize the
# filenames before taking their lengths
img_filename_list = list(
list_dir_or_file(img, list_dir=False, suffix=['.png', '.jpg']))
pcd_filename_list = list(
list_dir_or_file(pcd, list_dir=False, suffix='.bin'))
assert len(img_filename_list) == len(pcd_filename_list)
inputs = [{
'img': join_path(img, img_filename),
'points': join_path(pcd, pcd_filename)
} for pcd_filename, img_filename in zip(
pcd_filename_list, img_filename_list)]
if not isinstance(inputs, (list, tuple)):
inputs = [inputs]
# get cam2img, lidar2cam and lidar2img from infos
info_list = mmengine.load(infos)['data_list']
assert len(info_list) == len(inputs)
for index, input in enumerate(inputs):
data_info = info_list[index]
img_path = data_info['images'][cam_type]['img_path']
if isinstance(input['img'], str) and \
osp.basename(img_path) != osp.basename(input['img']):
                    raise ValueError(
                        f'the info file of {img_path} does not match the '
                        f'input image {input["img"]}.')
cam2img = np.asarray(
data_info['images'][cam_type]['cam2img'], dtype=np.float32)
lidar2cam = np.asarray(
data_info['images'][cam_type]['lidar2cam'],
dtype=np.float32)
if 'lidar2img' in data_info['images'][cam_type]:
lidar2img = np.asarray(
data_info['images'][cam_type]['lidar2img'],
dtype=np.float32)
else:
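                    # No precomputed lidar2img in the info file: compose it
                    # by chaining extrinsics and intrinsics
                    # (lidar -> camera -> image).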
lidar2img = cam2img @ lidar2cam
input['cam2img'] = cam2img
input['lidar2cam'] = lidar2cam
input['lidar2img'] = lidar2img
elif isinstance(inputs, (list, tuple)):
# get cam2img, lidar2cam and lidar2img from infos
for input in inputs:
assert 'infos' in input
infos = input.pop('infos')
info_list = mmengine.load(infos)['data_list']
                assert len(info_list) == 1, \
                    'Only a single sample info in the `.pkl` is supported ' \
                    'when the input is a list.'
data_info = info_list[0]
img_path = data_info['images'][cam_type]['img_path']
if isinstance(input['img'], str) and \
osp.basename(img_path) != osp.basename(input['img']):
                raise ValueError(
                    f'the info file of {img_path} does not match the '
                    f'input image {input["img"]}.')
cam2img = np.asarray(
data_info['images'][cam_type]['cam2img'], dtype=np.float32)
lidar2cam = np.asarray(
data_info['images'][cam_type]['lidar2cam'],
dtype=np.float32)
if 'lidar2img' in data_info['images'][cam_type]:
lidar2img = np.asarray(
data_info['images'][cam_type]['lidar2img'],
dtype=np.float32)
else:
lidar2img = cam2img @ lidar2cam
input['cam2img'] = cam2img
input['lidar2cam'] = lidar2cam
input['lidar2img'] = lidar2img
return list(inputs)
def _init_pipeline(self, cfg: ConfigType) -> Compose:
"""Initialize the test pipeline."""
pipeline_cfg = cfg.test_dataloader.dataset.pipeline
load_point_idx = self._get_transform_idx(pipeline_cfg,
'LoadPointsFromFile')
load_mv_img_idx = self._get_transform_idx(
pipeline_cfg, 'LoadMultiViewImageFromFiles')
if load_mv_img_idx != -1:
            warnings.warn(
                'LoadMultiViewImageFromFiles is not supported yet in the '
                'multi-modality inferencer. Please remove it from the '
                'pipeline.')
        # Currently, we only support ``LoadImageFromFile`` as the image loader
        # in the original pipeline. ``LoadMultiViewImageFromFiles`` is not
        # supported yet.
load_img_idx = self._get_transform_idx(pipeline_cfg,
'LoadImageFromFile')
if load_point_idx == -1 or load_img_idx == -1:
            raise ValueError(
                'Both LoadPointsFromFile and LoadImageFromFile must be '
                'specified in the pipeline, but got the index of '
                f'LoadPointsFromFile: {load_point_idx} and the index of '
                f'LoadImageFromFile: {load_img_idx} (-1 means not found).')
load_cfg = pipeline_cfg[load_point_idx]
self.coord_type, self.load_dim = load_cfg['coord_type'], load_cfg[
'load_dim']
self.use_dim = list(range(load_cfg['use_dim'])) if isinstance(
load_cfg['use_dim'], int) else load_cfg['use_dim']
load_point_args = pipeline_cfg[load_point_idx]
load_point_args.pop('type')
load_img_args = pipeline_cfg[load_img_idx]
load_img_args.pop('type')
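        # Fuse the two loaders into a single multi-modality loader: reuse the
        # earlier pipeline slot and drop the later one.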
load_idx = min(load_point_idx, load_img_idx)
pipeline_cfg.pop(max(load_point_idx, load_img_idx))
pipeline_cfg[load_idx] = dict(
type='MultiModalityDet3DInferencerLoader',
load_point_args=load_point_args,
load_img_args=load_img_args)
return Compose(pipeline_cfg)
def visualize(self,
inputs: InputsType,
preds: PredType,
return_vis: bool = False,
show: bool = False,
wait_time: int = 0,
draw_pred: bool = True,
pred_score_thr: float = 0.3,
no_save_vis: bool = False,
img_out_dir: str = '',
cam_type_dir: str = 'CAM2') -> Union[List[np.ndarray], None]:
"""Visualize predictions.
Args:
inputs (InputsType): Inputs for the inferencer.
preds (PredType): Predictions of the model.
return_vis (bool): Whether to return the visualization result.
Defaults to False.
show (bool): Whether to display the image in a popup window.
Defaults to False.
            wait_time (int): The interval of show (s). Defaults to 0.
            draw_pred (bool): Whether to draw predicted bounding boxes.
                Defaults to True.
            pred_score_thr (float): Minimum score of bboxes to draw.
                Defaults to 0.3.
            no_save_vis (bool): Whether to force not to save the
                visualization results. Defaults to False.
            img_out_dir (str): Output directory of visualization results.
                If left as empty, no file will be saved. Defaults to ''.
            cam_type_dir (str): Camera type directory used to name the image
                output sub-directory. Defaults to 'CAM2'.
Returns:
List[np.ndarray] or None: Returns visualization results only if
applicable.
"""
        if no_save_vis:
img_out_dir = ''
if not show and img_out_dir == '' and not return_vis:
return None
if getattr(self, 'visualizer') is None:
            raise ValueError('Visualization needs the "visualizer" term '
                             'defined in the config, but got None.')
results = []
for single_input, pred in zip(inputs, preds):
points_input = single_input['points']
if isinstance(points_input, str):
pts_bytes = mmengine.fileio.get(points_input)
points = np.frombuffer(pts_bytes, dtype=np.float32)
points = points.reshape(-1, self.load_dim)
points = points[:, self.use_dim]
pc_name = osp.basename(points_input).split('.bin')[0]
pc_name = f'{pc_name}.png'
elif isinstance(points_input, np.ndarray):
points = points_input.copy()
pc_num = str(self.num_visualized_frames).zfill(8)
pc_name = f'{pc_num}.png'
else:
raise ValueError('Unsupported input type: '
f'{type(points_input)}')
if img_out_dir != '' and show:
o3d_save_path = osp.join(img_out_dir, 'vis_lidar', pc_name)
mmengine.mkdir_or_exist(osp.dirname(o3d_save_path))
else:
o3d_save_path = None
img_input = single_input['img']
if isinstance(single_input['img'], str):
img_bytes = mmengine.fileio.get(img_input)
img = mmcv.imfrombytes(img_bytes)
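                # mmcv decodes images as BGR; reverse the channel order to
                # RGB for visualization.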
img = img[:, :, ::-1]
img_name = osp.basename(img_input)
elif isinstance(img_input, np.ndarray):
img = img_input.copy()
img_num = str(self.num_visualized_frames).zfill(8)
img_name = f'{img_num}.jpg'
else:
raise ValueError('Unsupported input type: '
f'{type(img_input)}')
out_file = osp.join(img_out_dir, 'vis_camera', cam_type_dir,
img_name) if img_out_dir != '' else None
data_input = dict(points=points, img=img)
self.visualizer.add_datasample(
pc_name,
data_input,
pred,
show=show,
wait_time=wait_time,
draw_gt=False,
draw_pred=draw_pred,
pred_score_thr=pred_score_thr,
o3d_save_path=o3d_save_path,
out_file=out_file,
vis_task='multi-modality_det',
)
results.append(points)
self.num_visualized_frames += 1
return results
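

# Usage sketch (illustrative only, not part of the original module; the
# config path, checkpoint path and demo file paths are placeholders):
if __name__ == '__main__':
    inferencer = MultiModalityDet3DInferencer(
        model='configs/mvxnet/mvxnet_fpn_dv_second_secfpn_8xb2-80e_kitti-3d-3class.py',  # noqa: E501
        weights='checkpoints/mvxnet_kitti-3d-3class.pth')
    # A single sample pairs the point cloud and the image with the info file
    # that carries the calibration matrices used above.
    inferencer(
        dict(
            points='demo/data/kitti/000008.bin',
            img='demo/data/kitti/000008.png',
            infos='demo/data/kitti/000008.pkl'),
        show=True)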
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.dataset_wrapper import RepeatDataset
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend
from mmdet3d.datasets.kitti_dataset import KittiDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
LoadPointsFromFile)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import ( # noqa
GlobalRotScaleTrans, ObjectNoise, ObjectRangeFilter, ObjectSample,
PointShuffle, PointsRangeFilter, RandomFlip3D)
from mmdet3d.evaluation.metrics.kitti_metric import KittiMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer
# dataset settings
dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
point_cloud_range = [0, -40, -3, 70.4, 40, 1]
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(classes=class_names)
# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer the backend from the path prefix (LMDB and Memcached
# are not supported yet)
# data_root = 's3://openmmlab/datasets/detection3d/kitti/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0)
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args = None
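# GT-sampling augmentation: `db_sampler` below is consumed by the
# ObjectSample transform in train_pipeline to paste database objects into
# training scenes.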
db_sampler = dict(
data_root=data_root,
info_path=data_root + 'kitti_dbinfos_train.pkl',
rate=1.0,
prepare=dict(
filter_by_difficulty=[-1],
filter_by_min_points=dict(Car=5, Pedestrian=10, Cyclist=10)),
classes=class_names,
sample_groups=dict(Car=12, Pedestrian=6, Cyclist=6),
points_loader=dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=4,
use_dim=4,
backend_args=backend_args),
backend_args=backend_args)
train_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=4, # x, y, z, intensity
use_dim=4,
backend_args=backend_args),
dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
dict(type=ObjectSample, db_sampler=db_sampler),
dict(
type=ObjectNoise,
num_try=100,
translation_std=[1.0, 1.0, 0.5],
global_rot_range=[0.0, 0.0],
rot_range=[-0.78539816, 0.78539816]),
dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
dict(
type=GlobalRotScaleTrans,
rot_range=[-0.78539816, 0.78539816],
scale_ratio_range=[0.95, 1.05]),
dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
dict(type=ObjectRangeFilter, point_cloud_range=point_cloud_range),
dict(type=PointShuffle),
dict(
type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=4,
use_dim=4,
backend_args=backend_args),
dict(
type=MultiScaleFlipAug3D,
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type=GlobalRotScaleTrans,
rot_range=[0, 0],
scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]),
dict(type=RandomFlip3D),
dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range)
]),
dict(type=Pack3DDetInputs, keys=['points'])
]
# Construct a pipeline for data and GT loading in the show function; keep its
# loading behavior consistent with test_pipeline (e.g. the same file client).
eval_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=4,
use_dim=4,
backend_args=backend_args),
dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
batch_size=6,
num_workers=4,
persistent_workers=True,
sampler=dict(type=DefaultSampler, shuffle=True),
dataset=dict(
type=RepeatDataset,
times=2,
dataset=dict(
type=KittiDataset,
data_root=data_root,
ann_file='kitti_infos_train.pkl',
data_prefix=dict(pts='training/velodyne_reduced'),
pipeline=train_pipeline,
modality=input_modality,
test_mode=False,
metainfo=metainfo,
            # we use box_type_3d='LiDAR' in KITTI and nuScenes datasets
            # and box_type_3d='Depth' in SUN RGB-D and ScanNet datasets.
box_type_3d='LiDAR',
backend_args=backend_args)))
val_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=KittiDataset,
data_root=data_root,
data_prefix=dict(pts='training/velodyne_reduced'),
ann_file='kitti_infos_val.pkl',
pipeline=test_pipeline,
modality=input_modality,
test_mode=True,
metainfo=metainfo,
box_type_3d='LiDAR',
backend_args=backend_args))
test_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=KittiDataset,
data_root=data_root,
data_prefix=dict(pts='training/velodyne_reduced'),
ann_file='kitti_infos_val.pkl',
pipeline=test_pipeline,
modality=input_modality,
test_mode=True,
metainfo=metainfo,
box_type_3d='LiDAR',
backend_args=backend_args))
val_evaluator = dict(
type=KittiMetric,
ann_file=data_root + 'kitti_infos_val.pkl',
metric='bbox',
backend_args=backend_args)
test_evaluator = val_evaluator
vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
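
# Usage sketch (illustrative, not part of the original config): this file
# uses the new-style pure-Python config format, so it can be loaded with
# mmengine and its dataloader dicts built directly. The config path is an
# assumption; the snippet is kept in comments so it does not become part of
# the parsed config.
#
#     from mmengine.config import Config
#     from mmengine.runner import Runner
#     cfg = Config.fromfile('configs/_base_/datasets/kitti-3d-3class.py')
#     train_loader = Runner.build_dataloader(cfg.train_dataloader)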
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.dataset_wrapper import RepeatDataset
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend
from mmdet3d.datasets.kitti_dataset import KittiDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
LoadPointsFromFile)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import ( # noqa
GlobalRotScaleTrans, ObjectNoise, ObjectRangeFilter, ObjectSample,
PointShuffle, PointsRangeFilter, RandomFlip3D)
from mmdet3d.evaluation.metrics.kitti_metric import KittiMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer
# dataset settings
dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Car']
point_cloud_range = [0, -40, -3, 70.4, 40, 1]
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(classes=class_names)
# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer the backend from the path prefix (LMDB and Memcached
# are not supported yet)
# data_root = 's3://openmmlab/datasets/detection3d/kitti/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0)
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args = None
db_sampler = dict(
data_root=data_root,
info_path=data_root + 'kitti_dbinfos_train.pkl',
rate=1.0,
prepare=dict(filter_by_difficulty=[-1], filter_by_min_points=dict(Car=5)),
classes=class_names,
sample_groups=dict(Car=15),
points_loader=dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=4,
use_dim=4,
backend_args=backend_args),
backend_args=backend_args)
train_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=4, # x, y, z, intensity
use_dim=4,
backend_args=backend_args),
dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
dict(type=ObjectSample, db_sampler=db_sampler),
dict(
type=ObjectNoise,
num_try=100,
translation_std=[1.0, 1.0, 0.5],
global_rot_range=[0.0, 0.0],
rot_range=[-0.78539816, 0.78539816]),
dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
dict(
type=GlobalRotScaleTrans,
rot_range=[-0.78539816, 0.78539816],
scale_ratio_range=[0.95, 1.05]),
dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
dict(type=ObjectRangeFilter, point_cloud_range=point_cloud_range),
dict(type=PointShuffle),
dict(
type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=4,
use_dim=4,
backend_args=backend_args),
dict(
type=MultiScaleFlipAug3D,
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type=GlobalRotScaleTrans,
rot_range=[0, 0],
scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]),
dict(type=RandomFlip3D),
dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range)
]),
dict(type=Pack3DDetInputs, keys=['points'])
]
# Construct a pipeline for data and GT loading in the show function; keep its
# loading behavior consistent with test_pipeline (e.g. the same file client).
eval_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=4,
use_dim=4,
backend_args=backend_args),
dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
batch_size=6,
num_workers=4,
persistent_workers=True,
sampler=dict(type=DefaultSampler, shuffle=True),
dataset=dict(
type=RepeatDataset,
times=2,
dataset=dict(
type=KittiDataset,
data_root=data_root,
ann_file='kitti_infos_train.pkl',
data_prefix=dict(pts='training/velodyne_reduced'),
pipeline=train_pipeline,
modality=input_modality,
test_mode=False,
metainfo=metainfo,
            # we use box_type_3d='LiDAR' in KITTI and nuScenes datasets
            # and box_type_3d='Depth' in SUN RGB-D and ScanNet datasets.
box_type_3d='LiDAR',
backend_args=backend_args)))
val_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=KittiDataset,
data_root=data_root,
data_prefix=dict(pts='training/velodyne_reduced'),
ann_file='kitti_infos_val.pkl',
pipeline=test_pipeline,
modality=input_modality,
test_mode=True,
metainfo=metainfo,
box_type_3d='LiDAR',
backend_args=backend_args))
test_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=KittiDataset,
data_root=data_root,
data_prefix=dict(pts='training/velodyne_reduced'),
ann_file='kitti_infos_val.pkl',
pipeline=test_pipeline,
modality=input_modality,
test_mode=True,
metainfo=metainfo,
box_type_3d='LiDAR',
backend_args=backend_args))
val_evaluator = dict(
type=KittiMetric,
ann_file=data_root + 'kitti_infos_val.pkl',
metric='bbox',
backend_args=backend_args)
test_evaluator = val_evaluator
vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
# Copyright (c) OpenMMLab. All rights reserved.
from mmcv.transforms.processing import Resize
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend
from mmdet3d.datasets.kitti_dataset import KittiDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
LoadImageFromFileMono3D)
from mmdet3d.datasets.transforms.transforms_3d import RandomFlip3D
from mmdet3d.evaluation.metrics.kitti_metric import KittiMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer
dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
input_modality = dict(use_lidar=False, use_camera=True)
metainfo = dict(classes=class_names)
# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer the backend from the path prefix (LMDB and Memcached
# are not supported yet)
# data_root = 's3://openmmlab/datasets/detection3d/kitti/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0)
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args = None
train_pipeline = [
dict(type=LoadImageFromFileMono3D, backend_args=backend_args),
dict(
type=LoadAnnotations3D,
with_bbox=True,
with_label=True,
with_attr_label=False,
with_bbox_3d=True,
with_label_3d=True,
with_bbox_depth=True),
dict(type=Resize, scale=(1242, 375), keep_ratio=True),
dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
dict(
type=Pack3DDetInputs,
keys=[
'img', 'gt_bboxes', 'gt_bboxes_labels', 'gt_bboxes_3d',
'gt_labels_3d', 'centers_2d', 'depths'
]),
]
test_pipeline = [
dict(type=LoadImageFromFileMono3D, backend_args=backend_args),
dict(type=Resize, scale=(1242, 375), keep_ratio=True),
dict(type=Pack3DDetInputs, keys=['img'])
]
eval_pipeline = [
dict(type=LoadImageFromFileMono3D, backend_args=backend_args),
dict(type=Pack3DDetInputs, keys=['img'])
]
train_dataloader = dict(
batch_size=2,
num_workers=2,
persistent_workers=True,
sampler=dict(type=DefaultSampler, shuffle=True),
dataset=dict(
type=KittiDataset,
data_root=data_root,
ann_file='kitti_infos_train.pkl',
data_prefix=dict(img='training/image_2'),
pipeline=train_pipeline,
modality=input_modality,
load_type='fov_image_based',
test_mode=False,
metainfo=metainfo,
        # we use box_type_3d='Camera' in the monocular 3D detection task
box_type_3d='Camera',
backend_args=backend_args))
val_dataloader = dict(
batch_size=1,
num_workers=2,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=KittiDataset,
data_root=data_root,
data_prefix=dict(img='training/image_2'),
ann_file='kitti_infos_val.pkl',
pipeline=test_pipeline,
modality=input_modality,
load_type='fov_image_based',
metainfo=metainfo,
test_mode=True,
box_type_3d='Camera',
backend_args=backend_args))
test_dataloader = val_dataloader
val_evaluator = dict(
type=KittiMetric,
ann_file=data_root + 'kitti_infos_val.pkl',
metric='bbox',
backend_args=backend_args)
test_evaluator = val_evaluator
vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend
from mmdet3d.datasets.lyft_dataset import LyftDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
LoadPointsFromFile,
LoadPointsFromMultiSweeps)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import (GlobalRotScaleTrans,
ObjectRangeFilter,
PointShuffle,
PointsRangeFilter,
RandomFlip3D)
from mmdet3d.evaluation.metrics.lyft_metric import LyftMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer
# If point cloud range is changed, the models should also change their point
# cloud range accordingly
point_cloud_range = [-80, -80, -5, 80, 80, 3]
# For Lyft we usually do 9-class detection
class_names = [
'car', 'truck', 'bus', 'emergency_vehicle', 'other_vehicle', 'motorcycle',
'bicycle', 'pedestrian', 'animal'
]
dataset_type = 'LyftDataset'
data_root = 'data/lyft/'
# Input modality for the Lyft dataset; this is consistent with the submission
# format, which requires the information in input_modality.
input_modality = dict(use_lidar=True, use_camera=False)
data_prefix = dict(pts='v1.01-train/lidar', img='', sweeps='v1.01-train/lidar')
# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer the backend from the path prefix (LMDB and Memcached
# are not supported yet)
# data_root = 's3://openmmlab/datasets/detection3d/lyft/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0)
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args = None
train_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=5,
use_dim=5,
backend_args=backend_args),
dict(
type=LoadPointsFromMultiSweeps,
sweeps_num=10,
backend_args=backend_args),
dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
dict(
type=GlobalRotScaleTrans,
rot_range=[-0.3925, 0.3925],
scale_ratio_range=[0.95, 1.05],
translation_std=[0, 0, 0]),
dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
dict(type=ObjectRangeFilter, point_cloud_range=point_cloud_range),
dict(type=PointShuffle),
dict(
type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=5,
use_dim=5,
backend_args=backend_args),
dict(
type=LoadPointsFromMultiSweeps,
sweeps_num=10,
backend_args=backend_args),
dict(
type=MultiScaleFlipAug3D,
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type=GlobalRotScaleTrans,
rot_range=[0, 0],
scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]),
dict(type=RandomFlip3D),
dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range)
]),
dict(type=Pack3DDetInputs, keys=['points'])
]
# Construct a pipeline for data and GT loading in the show function; keep its
# loading behavior consistent with test_pipeline (e.g. the same file client).
eval_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=5,
use_dim=5,
backend_args=backend_args),
dict(
type=LoadPointsFromMultiSweeps,
sweeps_num=10,
backend_args=backend_args),
dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
batch_size=2,
num_workers=2,
persistent_workers=True,
sampler=dict(type=DefaultSampler, shuffle=True),
dataset=dict(
type=LyftDataset,
data_root=data_root,
ann_file='lyft_infos_train.pkl',
pipeline=train_pipeline,
metainfo=dict(classes=class_names),
modality=input_modality,
data_prefix=data_prefix,
test_mode=False,
box_type_3d='LiDAR',
backend_args=backend_args))
test_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=LyftDataset,
data_root=data_root,
ann_file='lyft_infos_val.pkl',
pipeline=test_pipeline,
metainfo=dict(classes=class_names),
modality=input_modality,
data_prefix=data_prefix,
test_mode=True,
box_type_3d='LiDAR',
backend_args=backend_args))
val_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=LyftDataset,
data_root=data_root,
ann_file='lyft_infos_val.pkl',
pipeline=test_pipeline,
metainfo=dict(classes=class_names),
modality=input_modality,
test_mode=True,
data_prefix=data_prefix,
box_type_3d='LiDAR',
backend_args=backend_args))
val_evaluator = dict(
type=LyftMetric,
data_root=data_root,
ann_file='lyft_infos_val.pkl',
metric='bbox',
backend_args=backend_args)
test_evaluator = val_evaluator
vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend
from mmdet3d.datasets.lyft_dataset import LyftDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
LoadPointsFromFile,
LoadPointsFromMultiSweeps)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import (GlobalRotScaleTrans,
ObjectRangeFilter,
PointShuffle,
PointsRangeFilter,
RandomFlip3D)
from mmdet3d.evaluation.metrics.lyft_metric import LyftMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer
# If point cloud range is changed, the models should also change their point
# cloud range accordingly
point_cloud_range = [-100, -100, -5, 100, 100, 3]
# For Lyft we usually do 9-class detection
class_names = [
'car', 'truck', 'bus', 'emergency_vehicle', 'other_vehicle', 'motorcycle',
'bicycle', 'pedestrian', 'animal'
]
dataset_type = 'LyftDataset'
data_root = 'data/lyft/'
data_prefix = dict(pts='v1.01-train/lidar', img='', sweeps='v1.01-train/lidar')
# Input modality for the Lyft dataset; this is consistent with the submission
# format, which requires the information in input_modality.
input_modality = dict(
use_lidar=True,
use_camera=False,
use_radar=False,
use_map=False,
use_external=False)
# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer the backend from the path prefix (LMDB and Memcached
# are not supported yet)
# data_root = 's3://openmmlab/datasets/detection3d/lyft/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0)
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args = None
train_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=5,
use_dim=5,
backend_args=backend_args),
dict(
type=LoadPointsFromMultiSweeps,
sweeps_num=10,
backend_args=backend_args),
dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
dict(
type=GlobalRotScaleTrans,
rot_range=[-0.3925, 0.3925],
scale_ratio_range=[0.95, 1.05],
translation_std=[0, 0, 0]),
dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
dict(type=ObjectRangeFilter, point_cloud_range=point_cloud_range),
dict(type=PointShuffle),
dict(
type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=5,
use_dim=5,
backend_args=backend_args),
dict(
type=LoadPointsFromMultiSweeps,
sweeps_num=10,
backend_args=backend_args),
dict(
type=MultiScaleFlipAug3D,
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type=GlobalRotScaleTrans,
rot_range=[0, 0],
scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]),
dict(type=RandomFlip3D),
dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
]),
dict(type=Pack3DDetInputs, keys=['points'])
]
# Construct a pipeline for data and GT loading in the show function; keep its
# loading behavior consistent with test_pipeline (e.g. the same file client).
eval_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=5,
use_dim=5,
backend_args=backend_args),
dict(
type=LoadPointsFromMultiSweeps,
sweeps_num=10,
backend_args=backend_args),
dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
batch_size=2,
num_workers=2,
persistent_workers=True,
sampler=dict(type=DefaultSampler, shuffle=True),
dataset=dict(
type=LyftDataset,
data_root=data_root,
ann_file='lyft_infos_train.pkl',
pipeline=train_pipeline,
metainfo=dict(classes=class_names),
modality=input_modality,
data_prefix=data_prefix,
test_mode=False,
box_type_3d='LiDAR',
backend_args=backend_args))
val_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=LyftDataset,
data_root=data_root,
ann_file='lyft_infos_val.pkl',
pipeline=test_pipeline,
metainfo=dict(classes=class_names),
modality=input_modality,
test_mode=True,
data_prefix=data_prefix,
box_type_3d='LiDAR',
backend_args=backend_args))
test_dataloader = val_dataloader
val_evaluator = dict(
type=LyftMetric,
data_root=data_root,
ann_file='lyft_infos_val.pkl',
metric='bbox',
backend_args=backend_args)
test_evaluator = val_evaluator
vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
# Copyright (c) OpenMMLab. All rights reserved.
from mmcv.transforms.loading import LoadAnnotations, LoadImageFromFile
from mmcv.transforms.processing import MultiScaleFlipAug, RandomFlip, Resize
dataset_type = 'CocoDataset'
data_root = 'data/nuimages/'
class_names = [
'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle',
'motorcycle', 'pedestrian', 'traffic_cone', 'barrier'
]
# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer the backend from the path prefix (LMDB and Memcached
# are not supported yet)
# data_root = 's3://openmmlab/datasets/detection3d/nuimages/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0)
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args = None
train_pipeline = [
dict(type=LoadImageFromFile, backend_args=backend_args),
dict(type=LoadAnnotations, with_bbox=True, with_mask=True),
dict(
type=Resize,
img_scale=[(1280, 720), (1920, 1080)],
multiscale_mode='range',
keep_ratio=True),
dict(type=RandomFlip, flip_ratio=0.5),
dict(type='PackDetInputs'),
]
test_pipeline = [
dict(type=LoadImageFromFile, backend_args=backend_args),
dict(
type=MultiScaleFlipAug,
img_scale=(1600, 900),
flip=False,
transforms=[
dict(type=Resize, keep_ratio=True),
dict(type=RandomFlip),
]),
dict(
type='PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
'scale_factor')),
]
data = dict(
samples_per_gpu=2,
workers_per_gpu=2,
train=dict(
type=dataset_type,
ann_file=data_root + 'annotations/nuimages_v1.0-train.json',
img_prefix=data_root,
classes=class_names,
pipeline=train_pipeline),
val=dict(
type=dataset_type,
ann_file=data_root + 'annotations/nuimages_v1.0-val.json',
img_prefix=data_root,
classes=class_names,
pipeline=test_pipeline),
test=dict(
type=dataset_type,
ann_file=data_root + 'annotations/nuimages_v1.0-val.json',
img_prefix=data_root,
classes=class_names,
pipeline=test_pipeline))
evaluation = dict(metric=['bbox', 'segm'])
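
# Note (illustrative, not part of the original config): this file still uses
# the legacy MMDet 2.x fields (`data`, `evaluation`). In the 3.x convention
# they map to `train_dataloader`/`val_dataloader`/`test_dataloader` and
# `val_evaluator`/`test_evaluator`, e.g. a sketch of the evaluator:
#
#     val_evaluator = dict(
#         type='CocoMetric',
#         ann_file=data_root + 'annotations/nuimages_v1.0-val.json',
#         metric=['bbox', 'segm'])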
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend
from mmdet3d.datasets.nuscenes_dataset import NuScenesDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
LoadPointsFromFile,
LoadPointsFromMultiSweeps)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import ( # noqa
GlobalRotScaleTrans, ObjectNameFilter, ObjectRangeFilter, PointShuffle,
PointsRangeFilter, RandomFlip3D)
from mmdet3d.evaluation.metrics.nuscenes_metric import NuScenesMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer
# If point cloud range is changed, the models should also change their point
# cloud range accordingly
point_cloud_range = [-50, -50, -5, 50, 50, 3]
# Using the calibration info to convert the LiDAR-coordinate point cloud
# range to the ego-coordinate range could bring a small improvement on
# nuScenes.
# point_cloud_range = [-50, -50.8, -5, 50, 49.2, 3]
# For nuScenes we usually do 10-class detection
class_names = [
'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle',
'motorcycle', 'pedestrian', 'traffic_cone', 'barrier'
]
metainfo = dict(classes=class_names)
dataset_type = 'NuScenesDataset'
data_root = 'data/nuscenes/'
# Input modality for the nuScenes dataset; this is consistent with the
# submission format, which requires the information in input_modality.
input_modality = dict(use_lidar=True, use_camera=False)
data_prefix = dict(pts='samples/LIDAR_TOP', img='', sweeps='sweeps/LIDAR_TOP')
# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer the backend from the path prefix (LMDB and Memcached
# are not supported yet)
# data_root = 's3://openmmlab/datasets/detection3d/nuscenes/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0)
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args = None
train_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=5,
use_dim=5,
backend_args=backend_args),
dict(
type=LoadPointsFromMultiSweeps,
sweeps_num=10,
backend_args=backend_args),
dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
dict(
type=GlobalRotScaleTrans,
rot_range=[-0.3925, 0.3925],
scale_ratio_range=[0.95, 1.05],
translation_std=[0, 0, 0]),
dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
dict(type=ObjectRangeFilter, point_cloud_range=point_cloud_range),
dict(type=ObjectNameFilter, classes=class_names),
dict(type=PointShuffle),
dict(
type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=5,
use_dim=5,
backend_args=backend_args),
dict(
type=LoadPointsFromMultiSweeps,
sweeps_num=10,
test_mode=True,
backend_args=backend_args),
dict(
type=MultiScaleFlipAug3D,
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type=GlobalRotScaleTrans,
rot_range=[0, 0],
scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]),
dict(type=RandomFlip3D),
dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range)
]),
dict(type=Pack3DDetInputs, keys=['points'])
]
# Construct a pipeline for data and GT loading in the show function; keep its
# loading behavior consistent with test_pipeline (e.g. the same file client).
eval_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='LIDAR',
load_dim=5,
use_dim=5,
backend_args=backend_args),
dict(
type=LoadPointsFromMultiSweeps,
sweeps_num=10,
test_mode=True,
backend_args=backend_args),
dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
batch_size=4,
num_workers=4,
persistent_workers=True,
sampler=dict(type=DefaultSampler, shuffle=True),
dataset=dict(
type=NuScenesDataset,
data_root=data_root,
ann_file='nuscenes_infos_train.pkl',
pipeline=train_pipeline,
metainfo=metainfo,
modality=input_modality,
test_mode=False,
data_prefix=data_prefix,
        # we use box_type_3d='LiDAR' in KITTI and nuScenes datasets
        # and box_type_3d='Depth' in SUN RGB-D and ScanNet datasets.
box_type_3d='LiDAR',
backend_args=backend_args))
test_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=NuScenesDataset,
data_root=data_root,
ann_file='nuscenes_infos_val.pkl',
pipeline=test_pipeline,
metainfo=metainfo,
modality=input_modality,
data_prefix=data_prefix,
test_mode=True,
box_type_3d='LiDAR',
backend_args=backend_args))
val_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=NuScenesDataset,
data_root=data_root,
ann_file='nuscenes_infos_val.pkl',
pipeline=test_pipeline,
metainfo=metainfo,
modality=input_modality,
test_mode=True,
data_prefix=data_prefix,
box_type_3d='LiDAR',
backend_args=backend_args))
val_evaluator = dict(
type=NuScenesMetric,
data_root=data_root,
ann_file=data_root + 'nuscenes_infos_val.pkl',
metric='bbox',
backend_args=backend_args)
test_evaluator = val_evaluator
vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
# Copyright (c) OpenMMLab. All rights reserved.
from mmcv.transforms.processing import Resize
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend
from mmdet3d.datasets.nuscenes_dataset import NuScenesDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
LoadImageFromFileMono3D)
from mmdet3d.datasets.transforms.transforms_3d import RandomFlip3D
from mmdet3d.evaluation.metrics.nuscenes_metric import NuScenesMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer
dataset_type = 'NuScenesDataset'
data_root = 'data/nuscenes/'
class_names = [
'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle',
'motorcycle', 'pedestrian', 'traffic_cone', 'barrier'
]
metainfo = dict(classes=class_names)
# Input modality for the nuScenes dataset; this is consistent with the
# submission format, which requires the information in input_modality.
input_modality = dict(use_lidar=False, use_camera=True)
# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer the backend from the path prefix (LMDB and Memcached
# are not supported yet)
# data_root = 's3://openmmlab/datasets/detection3d/nuscenes/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0)
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args = None
train_pipeline = [
dict(type=LoadImageFromFileMono3D, backend_args=backend_args),
dict(
type=LoadAnnotations3D,
with_bbox=True,
with_label=True,
with_attr_label=True,
with_bbox_3d=True,
with_label_3d=True,
with_bbox_depth=True),
dict(type=Resize, scale=(1600, 900), keep_ratio=True),
dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
dict(
type=Pack3DDetInputs,
keys=[
'img', 'gt_bboxes', 'gt_bboxes_labels', 'attr_labels',
'gt_bboxes_3d', 'gt_labels_3d', 'centers_2d', 'depths'
]),
]
test_pipeline = [
dict(type=LoadImageFromFileMono3D, backend_args=backend_args),
dict(type=Resize, scale=(1600, 900), keep_ratio=True),
dict(type=Pack3DDetInputs, keys=['img'])
]
train_dataloader = dict(
batch_size=2,
num_workers=2,
persistent_workers=True,
sampler=dict(type=DefaultSampler, shuffle=True),
dataset=dict(
type=NuScenesDataset,
data_root=data_root,
data_prefix=dict(
pts='',
CAM_FRONT='samples/CAM_FRONT',
CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT',
CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT',
CAM_BACK='samples/CAM_BACK',
CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT',
CAM_BACK_LEFT='samples/CAM_BACK_LEFT'),
ann_file='nuscenes_infos_train.pkl',
load_type='mv_image_based',
pipeline=train_pipeline,
metainfo=metainfo,
modality=input_modality,
test_mode=False,
        # we use box_type_3d='Camera' in the monocular 3D detection task
box_type_3d='Camera',
use_valid_flag=True,
backend_args=backend_args))
val_dataloader = dict(
batch_size=1,
num_workers=2,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=NuScenesDataset,
data_root=data_root,
data_prefix=dict(
pts='',
CAM_FRONT='samples/CAM_FRONT',
CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT',
CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT',
CAM_BACK='samples/CAM_BACK',
CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT',
CAM_BACK_LEFT='samples/CAM_BACK_LEFT'),
ann_file='nuscenes_infos_val.pkl',
load_type='mv_image_based',
pipeline=test_pipeline,
modality=input_modality,
metainfo=metainfo,
test_mode=True,
box_type_3d='Camera',
use_valid_flag=True,
backend_args=backend_args))
test_dataloader = val_dataloader
val_evaluator = dict(
type=NuScenesMetric,
data_root=data_root,
ann_file=data_root + 'nuscenes_infos_val.pkl',
metric='bbox',
backend_args=backend_args)
test_evaluator = val_evaluator
vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.dataset_wrapper import ConcatDataset, RepeatDataset
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend
from mmdet3d.datasets.s3dis_dataset import S3DISDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
LoadPointsFromFile,
NormalizePointsColor)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import (GlobalRotScaleTrans,
PointSample,
RandomFlip3D)
from mmdet3d.evaluation.metrics.indoor_metric import IndoorMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer
# dataset settings
dataset_type = 'S3DISDataset'
data_root = 'data/s3dis/'
# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer the backend from the path prefix (LMDB and Memcached
# are not supported yet)
# data_root = 's3://openmmlab/datasets/detection3d/s3dis/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0)
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args = None
metainfo = dict(classes=('table', 'chair', 'sofa', 'bookcase', 'board'))
train_area = [1, 2, 3, 4, 6]
test_area = 5
train_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='DEPTH',
shift_height=False,
use_color=True,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5],
backend_args=backend_args),
dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
dict(type=PointSample, num_points=100000),
dict(
type=RandomFlip3D,
sync_2d=False,
flip_ratio_bev_horizontal=0.5,
flip_ratio_bev_vertical=0.5),
dict(
type=GlobalRotScaleTrans,
rot_range=[-0.087266, 0.087266],
scale_ratio_range=[0.9, 1.1],
translation_std=[.1, .1, .1],
shift_height=False),
dict(type=NormalizePointsColor, color_mean=None),
dict(
type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='DEPTH',
shift_height=False,
use_color=True,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5],
backend_args=backend_args),
dict(
type=MultiScaleFlipAug3D,
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type=GlobalRotScaleTrans,
rot_range=[0, 0],
scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]),
dict(
type=RandomFlip3D,
sync_2d=False,
flip_ratio_bev_horizontal=0.5,
flip_ratio_bev_vertical=0.5),
dict(type=PointSample, num_points=100000),
dict(type=NormalizePointsColor, color_mean=None),
]),
dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
batch_size=8,
num_workers=4,
sampler=dict(type=DefaultSampler, shuffle=True),
dataset=dict(
type=RepeatDataset,
times=13,
dataset=dict(
type=ConcatDataset,
datasets=[
dict(
type=S3DISDataset,
data_root=data_root,
ann_file=f's3dis_infos_Area_{i}.pkl',
pipeline=train_pipeline,
filter_empty_gt=True,
metainfo=metainfo,
box_type_3d='Depth',
backend_args=backend_args) for i in train_area
])))
val_dataloader = dict(
batch_size=1,
num_workers=1,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=S3DISDataset,
data_root=data_root,
ann_file=f's3dis_infos_Area_{test_area}.pkl',
pipeline=test_pipeline,
metainfo=metainfo,
test_mode=True,
box_type_3d='Depth',
backend_args=backend_args))
test_dataloader = dict(
batch_size=1,
num_workers=1,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=S3DISDataset,
data_root=data_root,
ann_file=f's3dis_infos_Area_{test_area}.pkl',
pipeline=test_pipeline,
metainfo=metainfo,
test_mode=True,
box_type_3d='Depth',
backend_args=backend_args))
val_evaluator = dict(type=IndoorMetric)
test_evaluator = val_evaluator
vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
# Copyright (c) OpenMMLab. All rights reserved.
from mmcv.transforms.processing import TestTimeAug
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend
from mmdet3d.datasets.s3dis_dataset import S3DISSegDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
LoadPointsFromFile,
NormalizePointsColor,
PointSegClassMapping)
from mmdet3d.datasets.transforms.transforms_3d import (IndoorPatchPointSample,
RandomFlip3D)
from mmdet3d.evaluation.metrics.seg_metric import SegMetric
from mmdet3d.models.segmentors.seg3d_tta import Seg3DTTAModel
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer
# For S3DIS seg we usually do 13-class segmentation
class_names = ('ceiling', 'floor', 'wall', 'beam', 'column', 'window', 'door',
'table', 'chair', 'sofa', 'bookcase', 'board', 'clutter')
metainfo = dict(classes=class_names)
dataset_type = 'S3DISSegDataset'
data_root = 'data/s3dis/'
input_modality = dict(use_lidar=True, use_camera=False)
data_prefix = dict(
pts='points',
pts_instance_mask='instance_mask',
pts_semantic_mask='semantic_mask')
# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer the backend from the path prefix (LMDB and Memcached
# are not supported yet)
# data_root = 's3://openmmlab/datasets/detection3d/s3dis/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0)
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args = None
num_points = 4096
train_area = [1, 2, 3, 4, 6]
test_area = 5
train_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='DEPTH',
shift_height=False,
use_color=True,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5],
backend_args=backend_args),
dict(
type=LoadAnnotations3D,
with_bbox_3d=False,
with_label_3d=False,
with_mask_3d=False,
with_seg_3d=True,
backend_args=backend_args),
dict(type=PointSegClassMapping),
dict(
type=IndoorPatchPointSample,
num_points=num_points,
block_size=1.0,
ignore_index=len(class_names),
use_normalized_coord=True,
enlarge_size=0.2,
min_unique_num=None),
dict(type=NormalizePointsColor, color_mean=None),
dict(type=Pack3DDetInputs, keys=['points', 'pts_semantic_mask'])
]
test_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='DEPTH',
shift_height=False,
use_color=True,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5],
backend_args=backend_args),
dict(
type=LoadAnnotations3D,
with_bbox_3d=False,
with_label_3d=False,
with_mask_3d=False,
with_seg_3d=True,
backend_args=backend_args),
dict(type=NormalizePointsColor, color_mean=None),
dict(type=Pack3DDetInputs, keys=['points'])
]
# Construct a pipeline for data and GT loading in the show function; keep its
# loading behavior consistent with test_pipeline (e.g. the same file client).
# We need to load the GT seg_mask here!
eval_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='DEPTH',
shift_height=False,
use_color=True,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5],
backend_args=backend_args),
dict(type=NormalizePointsColor, color_mean=None),
dict(type=Pack3DDetInputs, keys=['points'])
]
tta_pipeline = [
dict(
type=LoadPointsFromFile,
coord_type='DEPTH',
shift_height=False,
use_color=True,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5],
backend_args=backend_args),
dict(
type=LoadAnnotations3D,
with_bbox_3d=False,
with_label_3d=False,
with_mask_3d=False,
with_seg_3d=True,
backend_args=backend_args),
dict(type=NormalizePointsColor, color_mean=None),
dict(
type=TestTimeAug,
transforms=[[
dict(
type=RandomFlip3D,
sync_2d=False,
flip_ratio_bev_horizontal=0.,
flip_ratio_bev_vertical=0.)
], [dict(type=Pack3DDetInputs, keys=['points'])]])
]
# train on area 1, 2, 3, 4, 6
# test on area 5
train_dataloader = dict(
batch_size=8,
num_workers=4,
persistent_workers=True,
sampler=dict(type=DefaultSampler, shuffle=True),
dataset=dict(
type=S3DISSegDataset,
data_root=data_root,
ann_files=[f's3dis_infos_Area_{i}.pkl' for i in train_area],
metainfo=metainfo,
data_prefix=data_prefix,
pipeline=train_pipeline,
modality=input_modality,
ignore_index=len(class_names),
scene_idxs=[
f'seg_info/Area_{i}_resampled_scene_idxs.npy' for i in train_area
],
test_mode=False,
backend_args=backend_args))
test_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type=DefaultSampler, shuffle=False),
dataset=dict(
type=S3DISSegDataset,
data_root=data_root,
ann_files=f's3dis_infos_Area_{test_area}.pkl',
metainfo=metainfo,
data_prefix=data_prefix,
pipeline=test_pipeline,
modality=input_modality,
ignore_index=len(class_names),
scene_idxs=f'seg_info/Area_{test_area}_resampled_scene_idxs.npy',
test_mode=True,
backend_args=backend_args))
val_dataloader = test_dataloader
val_evaluator = dict(type=SegMetric)
test_evaluator = val_evaluator
vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
tta_model = dict(type=Seg3DTTAModel)
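
# Usage sketch (illustrative, not part of the original config): following the
# OpenMMLab 2.x convention, test-time augmentation is enabled by wrapping the
# trained model with `tta_model` and swapping the test pipeline for
# `tta_pipeline`, e.g.:
#
#     cfg.model = dict(**cfg.tta_model, module=cfg.model)
#     cfg.test_dataloader.dataset.pipeline = cfg.tta_pipeline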