Commit 7aa442d5, authored Apr 01, 2026 by raojy

raw_mmdetection

parent 9c03eaa8

Showing 20 of 465 changed files, with 3586 additions and 0 deletions (+3586, -0).
mmdetection3d/docs/zh_cn/user_guides/visualization.md (+202, -0)
mmdetection3d/mmdet3d/__init__.py (+38, -0)
mmdetection3d/mmdet3d/apis/__init__.py (+16, -0)
mmdetection3d/mmdet3d/apis/inference.py (+416, -0)
mmdetection3d/mmdet3d/apis/inferencers/__init__.py (+11, -0)
mmdetection3d/mmdet3d/apis/inferencers/base_3d_inferencer.py (+346, -0)
mmdetection3d/mmdet3d/apis/inferencers/lidar_det3d_inferencer.py (+242, -0)
mmdetection3d/mmdet3d/apis/inferencers/lidar_seg3d_inferencer.py (+209, -0)
mmdetection3d/mmdet3d/apis/inferencers/mono_det3d_inferencer.py (+251, -0)
mmdetection3d/mmdet3d/apis/inferencers/multi_modality_det3d_inferencer.py (+315, -0)
mmdetection3d/mmdet3d/configs/_base_/datasets/kitti_3d_3class.py (+181, -0)
mmdetection3d/mmdet3d/configs/_base_/datasets/kitti_3d_car.py (+179, -0)
mmdetection3d/mmdet3d/configs/_base_/datasets/kitti_mono3d.py (+113, -0)
mmdetection3d/mmdet3d/configs/_base_/datasets/lyft_3d.py (+176, -0)
mmdetection3d/mmdet3d/configs/_base_/datasets/lyft_3d_range100.py (+166, -0)
mmdetection3d/mmdet3d/configs/_base_/datasets/nuim_instance.py (+74, -0)
mmdetection3d/mmdet3d/configs/_base_/datasets/nus_3d.py (+183, -0)
mmdetection3d/mmdet3d/configs/_base_/datasets/nus_mono3d.py (+132, -0)
mmdetection3d/mmdet3d/configs/_base_/datasets/s3dis_3d.py (+150, -0)
mmdetection3d/mmdet3d/configs/_base_/datasets/s3dis_seg.py (+186, -0)
mmdetection3d/docs/zh_cn/user_guides/visualization.md (new file, mode 100644)
# Visualization

MMDetection3D provides `Det3DLocalVisualizer` to visualize and store the state and results of the model during training and testing, with the following features:

1. Support the basic drawing interface for multi-modality data and multi-task.
2. Support multiple backends such as local and TensorBoard, to write training status such as `loss` and `lr`, or model evaluation metrics, to one or more specified backends (a minimal config sketch is shown after this list).
3. Support ground truth visualization for multi-modality data, and cross-modal visualization of 3D detection results.
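As a hedged illustration of feature 2, the sketch below enables a TensorBoard backend next to the default local one. It is not part of this commit; the backend type names follow standard MMEngine/MMDetection3D conventions and should be checked against your installed versions.

```python
# A minimal sketch: write visualizer outputs to both the local backend and
# TensorBoard. Backend names assume the standard MMEngine registrations.
vis_backends = [
    dict(type='LocalVisBackend'),
    dict(type='TensorboardVisBackend'),
]
visualizer = dict(
    type='Det3DLocalVisualizer', vis_backends=vis_backends, name='visualizer')
```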
## Basic Drawing Interface

Inherited from `DetLocalVisualizer`, `Det3DLocalVisualizer` provides an interface for drawing common objects on 2D images, such as detection boxes, points, text, lines, circles, polygons, and binary masks. More details about 2D drawing can be found in the [visualization documentation](https://mmengine.readthedocs.io/zh_CN/latest/advanced_tutorials/visualization.html) of MMDetection. Here we introduce the 3D drawing interface.
### Drawing Point Clouds on Images

Using `draw_points_on_image`, we support drawing point clouds on images.
```python
import mmcv
import numpy as np
from mmengine import load

from mmdet3d.visualization import Det3DLocalVisualizer

info_file = load('demo/data/kitti/000008.pkl')
points = np.fromfile('demo/data/kitti/000008.bin', dtype=np.float32)
points = points.reshape(-1, 4)[:, :3]
lidar2img = np.array(
    info_file['data_list'][0]['images']['CAM2']['lidar2img'],
    dtype=np.float32)
visualizer = Det3DLocalVisualizer()
img = mmcv.imread('demo/data/kitti/000008.png')
img = mmcv.imconvert(img, 'bgr', 'rgb')
visualizer.set_image(img)
visualizer.draw_points_on_image(points, lidar2img)
visualizer.show()
```

### Drawing 3D Boxes on Point Clouds

Using `draw_bboxes_3d`, we support drawing 3D boxes on point clouds.
```python
import torch
import numpy as np

from mmdet3d.visualization import Det3DLocalVisualizer
from mmdet3d.structures import LiDARInstance3DBoxes

points = np.fromfile('demo/data/kitti/000008.bin', dtype=np.float32)
points = points.reshape(-1, 4)
visualizer = Det3DLocalVisualizer()
# set point cloud in visualizer
visualizer.set_points(points)
bboxes_3d = LiDARInstance3DBoxes(
    torch.tensor([[8.7314, -1.8559, -1.5997, 4.2000, 3.4800, 1.8900,
                   -1.5808]]))
# Draw 3D bboxes
visualizer.draw_bboxes_3d(bboxes_3d)
visualizer.show()
```

### Drawing Projected 3D Boxes on Images

Using `draw_proj_bboxes_3d`, we support drawing projected 3D boxes on images.
```python
import mmcv
import numpy as np
from mmengine import load

from mmdet3d.visualization import Det3DLocalVisualizer
from mmdet3d.structures import CameraInstance3DBoxes

info_file = load('demo/data/kitti/000008.pkl')
cam2img = np.array(
    info_file['data_list'][0]['images']['CAM2']['cam2img'],
    dtype=np.float32)
bboxes_3d = []
for instance in info_file['data_list'][0]['instances']:
    bboxes_3d.append(instance['bbox_3d'])
gt_bboxes_3d = np.array(bboxes_3d, dtype=np.float32)
gt_bboxes_3d = CameraInstance3DBoxes(gt_bboxes_3d)
input_meta = {'cam2img': cam2img}
visualizer = Det3DLocalVisualizer()
img = mmcv.imread('demo/data/kitti/000008.png')
img = mmcv.imconvert(img, 'bgr', 'rgb')
visualizer.set_image(img)
# project 3D bboxes to image
visualizer.draw_proj_bboxes_3d(gt_bboxes_3d, input_meta)
visualizer.show()
```
### Drawing BEV Boxes

Using `draw_bev_bboxes`, we support drawing boxes in the BEV (bird's-eye view).
```python
import numpy as np
from mmengine import load

from mmdet3d.visualization import Det3DLocalVisualizer
from mmdet3d.structures import CameraInstance3DBoxes

info_file = load('demo/data/kitti/000008.pkl')
bboxes_3d = []
for instance in info_file['data_list'][0]['instances']:
    bboxes_3d.append(instance['bbox_3d'])
gt_bboxes_3d = np.array(bboxes_3d, dtype=np.float32)
gt_bboxes_3d = CameraInstance3DBoxes(gt_bboxes_3d)
visualizer = Det3DLocalVisualizer()
# set bev image in visualizer
visualizer.set_bev_image()
# draw bev bboxes
visualizer.draw_bev_bboxes(gt_bboxes_3d, edge_colors='orange')
visualizer.show()
```
### Drawing 3D Semantic Masks

Using `draw_seg_mask`, we support drawing segmentation masks via per-point colorization.
```python
import numpy as np

from mmdet3d.visualization import Det3DLocalVisualizer

points = np.fromfile('demo/data/sunrgbd/000017.bin', dtype=np.float32)
points = points.reshape(-1, 3)
visualizer = Det3DLocalVisualizer()
mask = np.random.rand(points.shape[0], 3)
points_with_mask = np.concatenate((points, mask), axis=-1)
# Draw 3D points with mask
visualizer.set_points(points, pcd_mode=2, vis_mode='add')
visualizer.draw_seg_mask(points_with_mask)
visualizer.show()
```
## Results

To visualize the prediction results of a trained model, you can run the following command:
```bash
python tools/test.py ${CONFIG_FILE} ${CKPT_PATH} --show --show-dir ${SHOW_DIR}
```
After running this command, the plotted results, i.e. the input data and the visualization of the network outputs and ground truth labels on the input (e.g. `***_gt.png` and `***_pred.png` in multi-modality detection tasks and vision-based detection tasks), will be saved in `${SHOW_DIR}`. When `show` is enabled, [Open3D](http://www.open3d.org/) will be used to visualize the results online. Online visualization is not supported when testing on a remote server without a GUI; in that case, you can download the `results.pkl` from the remote server and visualize the prediction results offline on your local machine.
To visualize the results with the `Open3D` backend offline, you can run the following command:
```bash
python tools/misc/visualize_results.py ${CONFIG_FILE} --result ${RESULTS_PATH} --show-dir ${SHOW_DIR}
```

This allows inference and result generation to be done on the remote server, after which the user can open the results with a GUI on the local machine.
## Dataset

We also provide scripts to visualize the dataset without inference. You can use `tools/misc/browse_dataset.py` to show loaded ground truth labels online and save them on the disk. Currently we support single-modality 3D detection and 3D segmentation on all datasets, multi-modality 3D detection on KITTI and SUN RGB-D, as well as monocular 3D detection on nuScenes. To browse the KITTI dataset, you can run the following command:
```shell
python tools/misc/browse_dataset.py configs/_base_/datasets/kitti-3d-3class.py --task lidar_det --output-dir ${OUTPUT_DIR}
```
**Note**: Once `--output-dir` is specified, the images of the views specified by the user will be saved when pressing `_ESC_` in the open3d window. If you want to zoom in or out on the point cloud to observe more details, you can specify `--show-interval=0` in the command.
To verify the data consistency and the effect of data augmentation, you can also add the `--aug` flag to visualize the data after data augmentation, with the following command:
```shell
python tools/misc/browse_dataset.py configs/_base_/datasets/kitti-3d-3class.py --task lidar_det --aug --output-dir ${OUTPUT_DIR}
```
If you also want to display 2D images with projected 3D bounding boxes, you need a config file that supports multi-modality data loading, and change the `--task` argument to `multi-modality_det`. An example is as follows:
```shell
python tools/misc/browse_dataset.py configs/mvxnet/mvxnet_fpn_dv_second_secfpn_8xb2-80e_kitti-3d-3class.py --task multi-modality_det --output-dir ${OUTPUT_DIR}
```

You can browse different datasets with different configs, e.g. visualizing the ScanNet dataset in the 3D semantic segmentation task:
```shell
python tools/misc/browse_dataset.py configs/_base_/datasets/scannet-seg.py --task lidar_seg --output-dir ${OUTPUT_DIR}
```

And browsing the nuScenes dataset in the monocular 3D detection task:
```shell
python tools/misc/browse_dataset.py configs/_base_/datasets/nus-mono3d.py --task mono_det --output-dir ${OUTPUT_DIR}
```

mmdetection3d/mmdet3d/__init__.py (new file, mode 100644)
```python
# Copyright (c) OpenMMLab. All rights reserved.
import mmcv
import mmdet
import mmengine
from mmengine.utils import digit_version

from .version import __version__, version_info

mmcv_minimum_version = '2.0.0rc4'
mmcv_maximum_version = '2.2.0'
mmcv_version = digit_version(mmcv.__version__)

mmengine_minimum_version = '0.8.0'
mmengine_maximum_version = '1.0.0'
mmengine_version = digit_version(mmengine.__version__)

mmdet_minimum_version = '3.0.0rc5'
mmdet_maximum_version = '3.4.0'
mmdet_version = digit_version(mmdet.__version__)

assert (mmcv_version >= digit_version(mmcv_minimum_version)
        and mmcv_version < digit_version(mmcv_maximum_version)), \
    f'MMCV=={mmcv.__version__} is used but incompatible. ' \
    f'Please install mmcv>={mmcv_minimum_version}, <{mmcv_maximum_version}.'

assert (mmengine_version >= digit_version(mmengine_minimum_version)
        and mmengine_version < digit_version(mmengine_maximum_version)), \
    f'MMEngine=={mmengine.__version__} is used but incompatible. ' \
    f'Please install mmengine>={mmengine_minimum_version}, ' \
    f'<{mmengine_maximum_version}.'

assert (mmdet_version >= digit_version(mmdet_minimum_version)
        and mmdet_version < digit_version(mmdet_maximum_version)), \
    f'MMDET=={mmdet.__version__} is used but incompatible. ' \
    f'Please install mmdet>={mmdet_minimum_version}, ' \
    f'<{mmdet_maximum_version}.'

__all__ = ['__version__', 'version_info', 'digit_version']
```
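The assertions above depend on `digit_version` turning version strings into comparable tuples, with pre-release suffixes ordered below the final release. A small sketch of that ordering (behavior as implemented in MMEngine):

```python
from mmengine.utils import digit_version

# 'rc' pre-releases compare below the corresponding final release, so the
# '2.0.0rc4' lower bound also admits 2.0.x and 2.1.x final releases while
# staying below the '2.2.0' upper bound.
assert digit_version('2.0.0rc4') < digit_version('2.0.0')
assert digit_version('2.0.0rc4') < digit_version('2.1.0') < digit_version('2.2.0')
```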
mmdetection3d/mmdet3d/apis/__init__.py (new file, mode 100644)
```python
# Copyright (c) OpenMMLab. All rights reserved.
from .inference import (convert_SyncBN, inference_detector,
                        inference_mono_3d_detector,
                        inference_multi_modality_detector,
                        inference_segmentor, init_model)
from .inferencers import (Base3DInferencer, LidarDet3DInferencer,
                          LidarSeg3DInferencer, MonoDet3DInferencer,
                          MultiModalityDet3DInferencer)

__all__ = [
    'inference_detector', 'init_model', 'inference_mono_3d_detector',
    'convert_SyncBN', 'inference_multi_modality_detector',
    'inference_segmentor', 'Base3DInferencer', 'MonoDet3DInferencer',
    'LidarDet3DInferencer', 'LidarSeg3DInferencer',
    'MultiModalityDet3DInferencer'
]
```
mmdetection3d/mmdet3d/apis/inference.py (new file, mode 100644)
```python
# Copyright (c) OpenMMLab. All rights reserved.
import warnings
from copy import deepcopy
from os import path as osp
from pathlib import Path
from typing import Optional, Sequence, Union

import mmengine
import numpy as np
import torch
import torch.nn as nn
from mmengine.config import Config
from mmengine.dataset import Compose, pseudo_collate
from mmengine.registry import init_default_scope
from mmengine.runner import load_checkpoint

from mmdet3d.registry import DATASETS, MODELS
from mmdet3d.structures import Box3DMode, Det3DDataSample, get_box_type
from mmdet3d.structures.det3d_data_sample import SampleList


def convert_SyncBN(config):
    """Convert config's naiveSyncBN to BN.

    Args:
        config (str or :obj:`mmengine.Config`): Config file path or the config
            object.
    """
    if isinstance(config, dict):
        for item in config:
            if item == 'norm_cfg':
                config[item]['type'] = config[item]['type'].\
                    replace('naiveSyncBN', 'BN')
            else:
                convert_SyncBN(config[item])


def init_model(config: Union[str, Path, Config],
               checkpoint: Optional[str] = None,
               device: str = 'cuda:0',
               palette: str = 'none',
               cfg_options: Optional[dict] = None):
    """Initialize a model from config file, which could be a 3D detector or a
    3D segmentor.

    Args:
        config (str, :obj:`Path`, or :obj:`mmengine.Config`): Config file
            path, :obj:`Path`, or the config object.
        checkpoint (str, optional): Checkpoint path. If left as None, the
            model will not load any weights.
        device (str): Device to use.
        palette (str): Color palette used for visualization.
            Defaults to 'none'.
        cfg_options (dict, optional): Options to override some settings in
            the used config.

    Returns:
        nn.Module: The constructed detector.
    """
    if isinstance(config, (str, Path)):
        config = Config.fromfile(config)
    elif not isinstance(config, Config):
        raise TypeError('config must be a filename or Config object, '
                        f'but got {type(config)}')
    if cfg_options is not None:
        config.merge_from_dict(cfg_options)

    convert_SyncBN(config.model)
    config.model.train_cfg = None
    init_default_scope(config.get('default_scope', 'mmdet3d'))
    model = MODELS.build(config.model)

    if checkpoint is not None:
        checkpoint = load_checkpoint(model, checkpoint, map_location='cpu')
        # save the dataset_meta in the model for convenience
        if 'dataset_meta' in checkpoint.get('meta', {}):
            # mmdet3d 1.x
            model.dataset_meta = checkpoint['meta']['dataset_meta']
        elif 'CLASSES' in checkpoint.get('meta', {}):
            # < mmdet3d 1.x
            classes = checkpoint['meta']['CLASSES']
            model.dataset_meta = {'classes': classes}

            if 'PALETTE' in checkpoint.get('meta', {}):  # 3D Segmentor
                model.dataset_meta['palette'] = checkpoint['meta']['PALETTE']
        else:
            # < mmdet3d 1.x
            model.dataset_meta = {'classes': config.class_names}

            if 'PALETTE' in checkpoint.get('meta', {}):  # 3D Segmentor
                model.dataset_meta['palette'] = checkpoint['meta']['PALETTE']

        test_dataset_cfg = deepcopy(config.test_dataloader.dataset)
        # lazy init. We only need the metainfo.
        test_dataset_cfg['lazy_init'] = True
        metainfo = DATASETS.build(test_dataset_cfg).metainfo
        cfg_palette = metainfo.get('palette', None)
        if cfg_palette is not None:
            model.dataset_meta['palette'] = cfg_palette
        else:
            if 'palette' not in model.dataset_meta:
                warnings.warn(
                    'palette does not exist, random is used by default. '
                    'You can also set the palette to customize.')
                model.dataset_meta['palette'] = 'random'

    model.cfg = config  # save the config in the model for convenience
    if device != 'cpu':
        torch.cuda.set_device(device)
    else:
        warnings.warn('Don\'t suggest using CPU device. '
                      'Some functions are not supported for now.')

    model.to(device)
    model.eval()
    return model


PointsType = Union[str, np.ndarray, Sequence[str], Sequence[np.ndarray]]
ImagesType = Union[str, np.ndarray, Sequence[str], Sequence[np.ndarray]]


def inference_detector(
        model: nn.Module,
        pcds: PointsType) -> Union[Det3DDataSample, SampleList]:
    """Inference point cloud with the detector.

    Args:
        model (nn.Module): The loaded detector.
        pcds (str, ndarray, Sequence[str/ndarray]):
            Either point cloud files or loaded point cloud.

    Returns:
        :obj:`Det3DDataSample` or list[:obj:`Det3DDataSample`]:
        If pcds is a list or tuple, the same length list type results
        will be returned, otherwise return the detection results directly.
    """
    if isinstance(pcds, (list, tuple)):
        is_batch = True
    else:
        pcds = [pcds]
        is_batch = False

    cfg = model.cfg

    if not isinstance(pcds[0], str):
        cfg = cfg.copy()
        # set loading pipeline type
        cfg.test_dataloader.dataset.pipeline[0].type = 'LoadPointsFromDict'

    # build the data pipeline
    test_pipeline = deepcopy(cfg.test_dataloader.dataset.pipeline)
    test_pipeline = Compose(test_pipeline)
    box_type_3d, box_mode_3d = \
        get_box_type(cfg.test_dataloader.dataset.box_type_3d)

    data = []
    for pcd in pcds:
        # prepare data
        if isinstance(pcd, str):
            # load from point cloud file
            data_ = dict(
                lidar_points=dict(lidar_path=pcd),
                timestamp=1,
                # for ScanNet demo we need axis_align_matrix
                axis_align_matrix=np.eye(4),
                box_type_3d=box_type_3d,
                box_mode_3d=box_mode_3d)
        else:
            # directly use loaded point cloud
            data_ = dict(
                points=pcd,
                timestamp=1,
                # for ScanNet demo we need axis_align_matrix
                axis_align_matrix=np.eye(4),
                box_type_3d=box_type_3d,
                box_mode_3d=box_mode_3d)
        data_ = test_pipeline(data_)
        data.append(data_)

    collate_data = pseudo_collate(data)

    # forward the model
    with torch.no_grad():
        results = model.test_step(collate_data)

    if not is_batch:
        return results[0], data[0]
    else:
        return results, data


def inference_multi_modality_detector(model: nn.Module,
                                      pcds: Union[str, Sequence[str]],
                                      imgs: Union[str, Sequence[str]],
                                      ann_file: Union[str, Sequence[str]],
                                      cam_type: str = 'CAM2'):
    """Inference point cloud with the multi-modality detector. Now we only
    support multi-modality detector for KITTI and SUNRGBD datasets since the
    multi-view image loading is not supported yet in this inference function.

    Args:
        model (nn.Module): The loaded detector.
        pcds (str, Sequence[str]):
            Either point cloud files or loaded point cloud.
        imgs (str, Sequence[str]):
            Either image files or loaded images.
        ann_file (str, Sequence[str]): Annotation files.
        cam_type (str): The camera view used for inference. When the detector
            only uses a single-view image, we need to specify a camera view.
            For the kitti dataset, it should be 'CAM2'. For sunrgbd, it
            should be 'CAM0'. When the detector uses multi-view images, set
            it to 'all'.

    Returns:
        :obj:`Det3DDataSample` or list[:obj:`Det3DDataSample`]:
        If pcds is a list or tuple, the same length list type results
        will be returned, otherwise return the detection results directly.
    """
    if isinstance(pcds, (list, tuple)):
        is_batch = True
        assert isinstance(imgs, (list, tuple))
        assert len(pcds) == len(imgs)
    else:
        pcds = [pcds]
        imgs = [imgs]
        is_batch = False

    cfg = model.cfg

    # build the data pipeline
    test_pipeline = deepcopy(cfg.test_dataloader.dataset.pipeline)
    test_pipeline = Compose(test_pipeline)
    box_type_3d, box_mode_3d = \
        get_box_type(cfg.test_dataloader.dataset.box_type_3d)

    data_list = mmengine.load(ann_file)['data_list']

    data = []
    for index, pcd in enumerate(pcds):
        # get data info containing calib
        data_info = data_list[index]
        img = imgs[index]

        if cam_type != 'all':
            assert osp.isfile(img), f'{img} must be a file.'
            img_path = data_info['images'][cam_type]['img_path']
            if osp.basename(img_path) != osp.basename(img):
                raise ValueError(
                    f'the info file of {img_path} is not provided.')
            data_ = dict(
                lidar_points=dict(lidar_path=pcd),
                img_path=img,
                box_type_3d=box_type_3d,
                box_mode_3d=box_mode_3d)
            data_info['images'][cam_type]['img_path'] = img
            if 'cam2img' in data_info['images'][cam_type]:
                # The data annotation in the SUNRGBD dataset does not contain
                # `cam2img`
                data_['cam2img'] = np.array(
                    data_info['images'][cam_type]['cam2img'])

            # LiDAR to image conversion for KITTI dataset
            if box_mode_3d == Box3DMode.LIDAR:
                if 'lidar2img' in data_info['images'][cam_type]:
                    data_['lidar2img'] = np.array(
                        data_info['images'][cam_type]['lidar2img'])
            # Depth to image conversion for SUNRGBD dataset
            elif box_mode_3d == Box3DMode.DEPTH:
                data_['depth2img'] = np.array(
                    data_info['images'][cam_type]['depth2img'])
        else:
            assert osp.isdir(img), f'{img} must be a file directory'
            for _, img_info in data_info['images'].items():
                img_info['img_path'] = osp.join(img, img_info['img_path'])
                assert osp.isfile(img_info['img_path']
                                  ), f'{img_info["img_path"]} does not exist.'
            data_ = dict(
                lidar_points=dict(lidar_path=pcd),
                images=data_info['images'],
                box_type_3d=box_type_3d,
                box_mode_3d=box_mode_3d)

        if 'timestamp' in data_info:
            # Using multi-sweeps need `timestamp`
            data_['timestamp'] = data_info['timestamp']

        data_ = test_pipeline(data_)
        data.append(data_)

    collate_data = pseudo_collate(data)

    # forward the model
    with torch.no_grad():
        results = model.test_step(collate_data)

    if not is_batch:
        return results[0], data[0]
    else:
        return results, data


def inference_mono_3d_detector(model: nn.Module,
                               imgs: ImagesType,
                               ann_file: Union[str, Sequence[str]],
                               cam_type: str = 'CAM_FRONT'):
    """Inference image with the monocular 3D detector.

    Args:
        model (nn.Module): The loaded detector.
        imgs (str, Sequence[str]):
            Either image files or loaded images.
        ann_file (str, Sequence[str]): Annotation files.
        cam_type (str): The camera view used for inference. For the kitti
            dataset, it should be 'CAM_2'; for the nuscenes dataset, it
            should be 'CAM_FRONT'. Defaults to 'CAM_FRONT'.

    Returns:
        :obj:`Det3DDataSample` or list[:obj:`Det3DDataSample`]:
        If imgs is a list or tuple, the same length list type results
        will be returned, otherwise return the detection results directly.
    """
    if isinstance(imgs, (list, tuple)):
        is_batch = True
    else:
        imgs = [imgs]
        is_batch = False

    cfg = model.cfg

    # build the data pipeline
    test_pipeline = deepcopy(cfg.test_dataloader.dataset.pipeline)
    test_pipeline = Compose(test_pipeline)
    box_type_3d, box_mode_3d = \
        get_box_type(cfg.test_dataloader.dataset.box_type_3d)

    data_list = mmengine.load(ann_file)['data_list']
    assert len(imgs) == len(data_list)

    data = []
    for index, img in enumerate(imgs):
        # get data info containing calib
        data_info = data_list[index]
        img_path = data_info['images'][cam_type]['img_path']
        if osp.basename(img_path) != osp.basename(img):
            raise ValueError(f'the info file of {img_path} is not provided.')

        # replace the img_path in data_info with img
        data_info['images'][cam_type]['img_path'] = img
        # avoid data_info['images'] having multiple keys about camera views
        mono_img_info = {f'{cam_type}': data_info['images'][cam_type]}
        data_ = dict(
            images=mono_img_info,
            box_type_3d=box_type_3d,
            box_mode_3d=box_mode_3d)

        data_ = test_pipeline(data_)
        data.append(data_)

    collate_data = pseudo_collate(data)

    # forward the model
    with torch.no_grad():
        results = model.test_step(collate_data)

    if not is_batch:
        return results[0]
    else:
        return results


def inference_segmentor(model: nn.Module, pcds: PointsType):
    """Inference point cloud with the segmentor.

    Args:
        model (nn.Module): The loaded segmentor.
        pcds (str, Sequence[str]):
            Either point cloud files or loaded point cloud.

    Returns:
        :obj:`Det3DDataSample` or list[:obj:`Det3DDataSample`]:
        If pcds is a list or tuple, the same length list type results
        will be returned, otherwise return the segmentation results directly.
    """
    if isinstance(pcds, (list, tuple)):
        is_batch = True
    else:
        pcds = [pcds]
        is_batch = False

    cfg = model.cfg

    # build the data pipeline
    test_pipeline = deepcopy(cfg.test_dataloader.dataset.pipeline)

    new_test_pipeline = []
    for pipeline in test_pipeline:
        if pipeline['type'] != 'LoadAnnotations3D' and \
                pipeline['type'] != 'PointSegClassMapping':
            new_test_pipeline.append(pipeline)
    test_pipeline = Compose(new_test_pipeline)

    data = []
    # TODO: support load points array
    for pcd in pcds:
        data_ = dict(lidar_points=dict(lidar_path=pcd))
        data_ = test_pipeline(data_)
        data.append(data_)

    collate_data = pseudo_collate(data)

    # forward the model
    with torch.no_grad():
        results = model.test_step(collate_data)

    if not is_batch:
        return results[0], data[0]
    else:
        return results, data
```
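A minimal usage sketch for the APIs above; the config and checkpoint paths are illustrative and must be replaced with a real, compatible pair:

```python
from mmdet3d.apis import init_model, inference_detector

# Illustrative paths; substitute any compatible config/checkpoint pair.
config_file = 'configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py'
checkpoint_file = 'checkpoints/pointpillars_kitti_3class.pth'

model = init_model(config_file, checkpoint_file, device='cuda:0')
# For a single input, inference_detector returns (Det3DDataSample, dict),
# as defined above.
result, data = inference_detector(model, 'demo/data/kitti/000008.bin')
print(result.pred_instances_3d.scores_3d)
```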
mmdetection3d/mmdet3d/apis/inferencers/__init__.py (new file, mode 100644)
```python
# Copyright (c) OpenMMLab. All rights reserved.
from .base_3d_inferencer import Base3DInferencer
from .lidar_det3d_inferencer import LidarDet3DInferencer
from .lidar_seg3d_inferencer import LidarSeg3DInferencer
from .mono_det3d_inferencer import MonoDet3DInferencer
from .multi_modality_det3d_inferencer import MultiModalityDet3DInferencer

__all__ = [
    'Base3DInferencer', 'MonoDet3DInferencer', 'LidarDet3DInferencer',
    'LidarSeg3DInferencer', 'MultiModalityDet3DInferencer'
]
```
mmdetection3d/mmdet3d/apis/inferencers/base_3d_inferencer.py (new file, mode 100644)
```python
# Copyright (c) OpenMMLab. All rights reserved.
import logging
import os.path as osp
from copy import deepcopy
from typing import Dict, List, Optional, Sequence, Tuple, Union

import numpy as np
import torch.nn as nn
from mmengine import dump, print_log
from mmengine.infer.infer import BaseInferencer, ModelType
from mmengine.model.utils import revert_sync_batchnorm
from mmengine.registry import init_default_scope
from mmengine.runner import load_checkpoint
from mmengine.structures import InstanceData
from mmengine.visualization import Visualizer
from rich.progress import track

from mmdet3d.registry import DATASETS, MODELS
from mmdet3d.structures import Box3DMode, Det3DDataSample
from mmdet3d.utils import ConfigType

InstanceList = List[InstanceData]
InputType = Union[str, np.ndarray]
InputsType = Union[InputType, Sequence[InputType]]
PredType = Union[InstanceData, InstanceList]
ImgType = Union[np.ndarray, Sequence[np.ndarray]]
ResType = Union[Dict, List[Dict], InstanceData, List[InstanceData]]


class Base3DInferencer(BaseInferencer):
    """Base 3D model inferencer.

    Args:
        model (str, optional): Path to the config file or the model name
            defined in metafile. For example, it could be
            "pgd-kitti" or
            "configs/pgd/pgd_r101-caffe_fpn_head-gn_4xb3-4x_kitti-mono3d.py".
            If model is not specified, user must provide the
            `weights` saved by MMEngine which contains the config string.
            Defaults to None.
        weights (str, optional): Path to the checkpoint. If it is not
            specified and model is a model name of metafile, the weights
            will be loaded from metafile. Defaults to None.
        device (str, optional): Device to run inference. If None, the
            available device will be automatically used. Defaults to None.
        scope (str): The scope of the model. Defaults to 'mmdet3d'.
        palette (str): Color palette used for visualization. The order of
            priority is palette -> config -> checkpoint. Defaults to 'none'.
    """

    preprocess_kwargs: set = {'cam_type'}
    forward_kwargs: set = set()
    visualize_kwargs: set = {
        'return_vis', 'show', 'wait_time', 'draw_pred', 'pred_score_thr',
        'img_out_dir', 'no_save_vis', 'cam_type_dir'
    }
    postprocess_kwargs: set = {
        'print_result', 'pred_out_dir', 'return_datasample', 'no_save_pred'
    }

    def __init__(self,
                 model: Union[ModelType, str, None] = None,
                 weights: Optional[str] = None,
                 device: Optional[str] = None,
                 scope: str = 'mmdet3d',
                 palette: str = 'none') -> None:
        # A global counter tracking the number of frames processed, for
        # naming of the output results
        self.num_predicted_frames = 0
        self.palette = palette
        init_default_scope(scope)
        super().__init__(
            model=model, weights=weights, device=device, scope=scope)
        self.model = revert_sync_batchnorm(self.model)

    def _convert_syncbn(self, cfg: ConfigType):
        """Convert config's naiveSyncBN to BN.

        Args:
            cfg (str or :obj:`mmengine.Config`): Config file path
                or the config object.
        """
        if isinstance(cfg, dict):
            for item in cfg:
                if item == 'norm_cfg':
                    cfg[item]['type'] = cfg[item]['type'].\
                        replace('naiveSyncBN', 'BN')
                else:
                    self._convert_syncbn(cfg[item])

    def _init_model(
        self,
        cfg: ConfigType,
        weights: str,
        device: str = 'cpu',
    ) -> nn.Module:
        self._convert_syncbn(cfg.model)
        cfg.model.train_cfg = None
        model = MODELS.build(cfg.model)

        checkpoint = load_checkpoint(model, weights, map_location='cpu')
        if 'dataset_meta' in checkpoint.get('meta', {}):
            # mmdet3d 1.x
            model.dataset_meta = checkpoint['meta']['dataset_meta']
        elif 'CLASSES' in checkpoint.get('meta', {}):
            # < mmdet3d 1.x
            classes = checkpoint['meta']['CLASSES']
            model.dataset_meta = {'classes': classes}

            if 'PALETTE' in checkpoint.get('meta', {}):  # 3D Segmentor
                model.dataset_meta['palette'] = checkpoint['meta']['PALETTE']
        else:
            # < mmdet3d 1.x
            model.dataset_meta = {'classes': cfg.class_names}

            if 'PALETTE' in checkpoint.get('meta', {}):  # 3D Segmentor
                model.dataset_meta['palette'] = checkpoint['meta']['PALETTE']

        test_dataset_cfg = deepcopy(cfg.test_dataloader.dataset)
        # lazy init. We only need the metainfo.
        test_dataset_cfg['lazy_init'] = True
        metainfo = DATASETS.build(test_dataset_cfg).metainfo
        cfg_palette = metainfo.get('palette', None)
        if cfg_palette is not None:
            model.dataset_meta['palette'] = cfg_palette

        model.cfg = cfg  # save the config in the model for convenience
        model.to(device)
        model.eval()
        return model

    def _get_transform_idx(self, pipeline_cfg: ConfigType, name: str) -> int:
        """Return the index of the transform in a pipeline.

        If the transform is not found, returns -1.
        """
        for i, transform in enumerate(pipeline_cfg):
            if transform['type'] == name:
                return i
        return -1

    def _init_visualizer(self, cfg: ConfigType) -> Optional[Visualizer]:
        visualizer = super()._init_visualizer(cfg)
        visualizer.dataset_meta = self.model.dataset_meta
        return visualizer

    def _dispatch_kwargs(self,
                         out_dir: str = '',
                         cam_type: str = '',
                         **kwargs) -> Tuple[Dict, Dict, Dict, Dict]:
        """Dispatch kwargs to preprocess(), forward(), visualize() and
        postprocess() according to the actual demands.

        Args:
            out_dir (str): Dir to save the inference results.
            cam_type (str): Camera type. Defaults to ''.
            **kwargs (dict): Key words arguments passed to :meth:`preprocess`,
                :meth:`forward`, :meth:`visualize` and :meth:`postprocess`.
                Each key in kwargs should be in the corresponding set of
                ``preprocess_kwargs``, ``forward_kwargs``,
                ``visualize_kwargs`` and ``postprocess_kwargs``.

        Returns:
            Tuple[Dict, Dict, Dict, Dict]: kwargs passed to preprocess,
            forward, visualize and postprocess respectively.
        """
        kwargs['img_out_dir'] = out_dir
        kwargs['pred_out_dir'] = out_dir
        if cam_type != '':
            kwargs['cam_type_dir'] = cam_type
        return super()._dispatch_kwargs(**kwargs)

    def __call__(self,
                 inputs: InputsType,
                 batch_size: int = 1,
                 return_datasamples: bool = False,
                 **kwargs) -> Optional[dict]:
        """Call the inferencer.

        Args:
            inputs (InputsType): Inputs for the inferencer.
            batch_size (int): Batch size. Defaults to 1.
            return_datasamples (bool): Whether to return results as
                :obj:`BaseDataElement`. Defaults to False.
            **kwargs: Key words arguments passed to :meth:`preprocess`,
                :meth:`forward`, :meth:`visualize` and :meth:`postprocess`.
                Each key in kwargs should be in the corresponding set of
                ``preprocess_kwargs``, ``forward_kwargs``,
                ``visualize_kwargs`` and ``postprocess_kwargs``.

        Returns:
            dict: Inference and visualization results.
        """
        (
            preprocess_kwargs,
            forward_kwargs,
            visualize_kwargs,
            postprocess_kwargs,
        ) = self._dispatch_kwargs(**kwargs)

        cam_type = preprocess_kwargs.pop('cam_type', 'CAM2')
        ori_inputs = self._inputs_to_list(inputs, cam_type=cam_type)
        inputs = self.preprocess(
            ori_inputs, batch_size=batch_size, **preprocess_kwargs)
        preds = []

        results_dict = {'predictions': [], 'visualization': []}
        for data in (track(inputs, description='Inference')
                     if self.show_progress else inputs):
            preds.extend(self.forward(data, **forward_kwargs))
            visualization = self.visualize(ori_inputs, preds,
                                           **visualize_kwargs)
            results = self.postprocess(preds, visualization,
                                       return_datasamples,
                                       **postprocess_kwargs)
            results_dict['predictions'].extend(results['predictions'])
            if results['visualization'] is not None:
                results_dict['visualization'].extend(
                    results['visualization'])
        return results_dict

    def postprocess(
        self,
        preds: PredType,
        visualization: Optional[List[np.ndarray]] = None,
        return_datasample: bool = False,
        print_result: bool = False,
        no_save_pred: bool = False,
        pred_out_dir: str = '',
    ) -> Union[ResType, Tuple[ResType, np.ndarray]]:
        """Process the predictions and visualization results from ``forward``
        and ``visualize``.

        This method should be responsible for the following tasks:

        1. Convert datasamples into a json-serializable dict if needed.
        2. Pack the predictions and visualization results and return them.
        3. Dump or log the predictions.

        Args:
            preds (List[Dict]): Predictions of the model.
            visualization (np.ndarray, optional): Visualized predictions.
                Defaults to None.
            return_datasample (bool): Whether to use Datasample to store
                inference results. If False, dict will be used.
                Defaults to False.
            print_result (bool): Whether to print the inference result w/o
                visualization to the console. Defaults to False.
            no_save_pred (bool): Whether to force not to save prediction
                results. Defaults to False.
            pred_out_dir (str): Directory to save the inference results w/o
                visualization. If left as empty, no file will be saved.
                Defaults to ''.

        Returns:
            dict: Inference and visualization results with key
            ``predictions`` and ``visualization``.

            - ``visualization`` (Any): Returned by :meth:`visualize`.
            - ``predictions`` (dict or DataSample): Returned by
              :meth:`forward` and processed in :meth:`postprocess`.
              If ``return_datasample=False``, it usually should be a
              json-serializable dict containing only basic data elements
              such as strings and numbers.
        """
        if no_save_pred is True:
            pred_out_dir = ''

        result_dict = {}
        results = preds
        if not return_datasample:
            results = []
            for pred in preds:
                result = self.pred2dict(pred, pred_out_dir)
                results.append(result)
        elif pred_out_dir != '':
            print_log(
                'Currently does not support saving datasample '
                'when return_datasample is set to True. '
                'Prediction results are not saved!',
                level=logging.WARNING)
        # Add img to the results after printing and dumping
        result_dict['predictions'] = results
        if print_result:
            print(result_dict)
        result_dict['visualization'] = visualization
        return result_dict

    # TODO: The data format and fields saved in json need further discussion.
    # Maybe should include model name, timestamp, filename, image info etc.
    def pred2dict(self,
                  data_sample: Det3DDataSample,
                  pred_out_dir: str = '') -> Dict:
        """Extract elements necessary to represent a prediction into a
        dictionary.

        It's better to contain only basic data elements such as strings and
        numbers in order to guarantee it's json-serializable.

        Args:
            data_sample (:obj:`DetDataSample`): Predictions of the model.
            pred_out_dir (str): Dir to save the inference results w/o
                visualization. If left as empty, no file will be saved.
                Defaults to ''.

        Returns:
            dict: Prediction results.
        """
        result = {}
        if 'pred_instances_3d' in data_sample:
            pred_instances_3d = data_sample.pred_instances_3d.numpy()
            result = {
                'labels_3d': pred_instances_3d.labels_3d.tolist(),
                'scores_3d': pred_instances_3d.scores_3d.tolist(),
                'bboxes_3d': pred_instances_3d.bboxes_3d.tensor.cpu().tolist()
            }

        if 'pred_pts_seg' in data_sample:
            pred_pts_seg = data_sample.pred_pts_seg.numpy()
            result['pts_semantic_mask'] = \
                pred_pts_seg.pts_semantic_mask.tolist()

        if data_sample.box_mode_3d == Box3DMode.LIDAR:
            result['box_type_3d'] = 'LiDAR'
        elif data_sample.box_mode_3d == Box3DMode.CAM:
            result['box_type_3d'] = 'Camera'
        elif data_sample.box_mode_3d == Box3DMode.DEPTH:
            result['box_type_3d'] = 'Depth'

        if pred_out_dir != '':
            if 'lidar_path' in data_sample:
                lidar_path = osp.basename(data_sample.lidar_path)
                lidar_path = osp.splitext(lidar_path)[0]
                out_json_path = osp.join(pred_out_dir, 'preds',
                                         lidar_path + '.json')
            elif 'img_path' in data_sample:
                img_path = osp.basename(data_sample.img_path)
                img_path = osp.splitext(img_path)[0]
                out_json_path = osp.join(pred_out_dir, 'preds',
                                         img_path + '.json')
            else:
                out_json_path = osp.join(
                    pred_out_dir, 'preds',
                    f'{str(self.num_visualized_imgs).zfill(8)}.json')
            dump(result, out_json_path)

        return result
```
mmdetection3d/mmdet3d/apis/inferencers/lidar_det3d_inferencer.py (new file, mode 100644)
```python
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
from typing import Dict, List, Optional, Sequence, Union

import mmengine
import numpy as np
import torch
from mmengine.dataset import Compose
from mmengine.fileio import (get_file_backend, isdir, join_path,
                             list_dir_or_file)
from mmengine.infer.infer import ModelType
from mmengine.structures import InstanceData

from mmdet3d.registry import INFERENCERS
from mmdet3d.structures import (CameraInstance3DBoxes, DepthInstance3DBoxes,
                                Det3DDataSample, LiDARInstance3DBoxes)
from mmdet3d.utils import ConfigType
from .base_3d_inferencer import Base3DInferencer

InstanceList = List[InstanceData]
InputType = Union[str, np.ndarray]
InputsType = Union[InputType, Sequence[InputType]]
PredType = Union[InstanceData, InstanceList]
ImgType = Union[np.ndarray, Sequence[np.ndarray]]
ResType = Union[Dict, List[Dict], InstanceData, List[InstanceData]]


@INFERENCERS.register_module(name='det3d-lidar')
@INFERENCERS.register_module()
class LidarDet3DInferencer(Base3DInferencer):
    """The inferencer of LiDAR-based detection.

    Args:
        model (str, optional): Path to the config file or the model name
            defined in metafile. For example, it could be
            "pointpillars_kitti-3class" or
            "configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py". # noqa: E501
            If model is not specified, user must provide the
            `weights` saved by MMEngine which contains the config string.
            Defaults to None.
        weights (str, optional): Path to the checkpoint. If it is not
            specified and model is a model name of metafile, the weights
            will be loaded from metafile. Defaults to None.
        device (str, optional): Device to run inference. If None, the
            available device will be automatically used. Defaults to None.
        scope (str): The scope of the model. Defaults to 'mmdet3d'.
        palette (str): Color palette used for visualization. The order of
            priority is palette -> config -> checkpoint. Defaults to 'none'.
    """

    def __init__(self,
                 model: Union[ModelType, str, None] = None,
                 weights: Optional[str] = None,
                 device: Optional[str] = None,
                 scope: str = 'mmdet3d',
                 palette: str = 'none') -> None:
        # A global counter tracking the number of frames processed, for
        # naming of the output results
        self.num_visualized_frames = 0
        super(LidarDet3DInferencer, self).__init__(
            model=model,
            weights=weights,
            device=device,
            scope=scope,
            palette=palette)

    def _inputs_to_list(self, inputs: Union[dict, list], **kwargs) -> list:
        """Preprocess the inputs to a list.

        Preprocess inputs to a list according to its type:

        - list or tuple: return inputs
        - dict: the value with key 'points' is
            - Directory path: return all files in the directory
            - other cases: return a list containing the string. The string
              could be a path to file, a url or other types of string
              according to the task.

        Args:
            inputs (Union[dict, list]): Inputs for the inferencer.

        Returns:
            list: List of input for the :meth:`preprocess`.
        """
        if isinstance(inputs, dict) and isinstance(inputs['points'], str):
            pcd = inputs['points']
            backend = get_file_backend(pcd)
            if hasattr(backend, 'isdir') and isdir(pcd):
                # Backends like HttpsBackend do not implement `isdir`, so
                # only those backends that implement `isdir` could accept
                # the inputs as a directory
                filename_list = list_dir_or_file(pcd, list_dir=False)
                inputs = [{
                    'points': join_path(pcd, filename)
                } for filename in filename_list]

        if not isinstance(inputs, (list, tuple)):
            inputs = [inputs]
        return list(inputs)

    def _init_pipeline(self, cfg: ConfigType) -> Compose:
        """Initialize the test pipeline."""
        pipeline_cfg = cfg.test_dataloader.dataset.pipeline

        load_point_idx = self._get_transform_idx(pipeline_cfg,
                                                 'LoadPointsFromFile')
        if load_point_idx == -1:
            raise ValueError(
                'LoadPointsFromFile is not found in the test pipeline')

        load_cfg = pipeline_cfg[load_point_idx]
        self.coord_type, self.load_dim = load_cfg['coord_type'], load_cfg[
            'load_dim']
        self.use_dim = list(range(load_cfg['use_dim'])) if isinstance(
            load_cfg['use_dim'], int) else load_cfg['use_dim']

        pipeline_cfg[load_point_idx]['type'] = 'LidarDet3DInferencerLoader'
        return Compose(pipeline_cfg)

    def visualize(self,
                  inputs: InputsType,
                  preds: PredType,
                  return_vis: bool = False,
                  show: bool = False,
                  wait_time: int = -1,
                  draw_pred: bool = True,
                  pred_score_thr: float = 0.3,
                  no_save_vis: bool = False,
                  img_out_dir: str = '') -> Union[List[np.ndarray], None]:
        """Visualize predictions.

        Args:
            inputs (InputsType): Inputs for the inferencer.
            preds (PredType): Predictions of the model.
            return_vis (bool): Whether to return the visualization result.
                Defaults to False.
            show (bool): Whether to display the image in a popup window.
                Defaults to False.
            wait_time (float): The interval of show (s). Defaults to -1.
            draw_pred (bool): Whether to draw predicted bounding boxes.
                Defaults to True.
            pred_score_thr (float): Minimum score of bboxes to draw.
                Defaults to 0.3.
            no_save_vis (bool): Whether to force not to save prediction
                vis results. Defaults to False.
            img_out_dir (str): Output directory of visualization results.
                If left as empty, no file will be saved. Defaults to ''.

        Returns:
            List[np.ndarray] or None: Returns visualization results only if
            applicable.
        """
        if no_save_vis is True:
            img_out_dir = ''

        if not show and img_out_dir == '' and not return_vis:
            return None

        if getattr(self, 'visualizer') is None:
            raise ValueError('Visualization needs the "visualizer" term '
                             'defined in the config, but got None.')

        results = []

        for single_input, pred in zip(inputs, preds):
            single_input = single_input['points']
            if isinstance(single_input, str):
                pts_bytes = mmengine.fileio.get(single_input)
                points = np.frombuffer(pts_bytes, dtype=np.float32)
                points = points.reshape(-1, self.load_dim)
                points = points[:, self.use_dim]
                pc_name = osp.basename(single_input).split('.bin')[0]
                pc_name = f'{pc_name}.png'
            elif isinstance(single_input, np.ndarray):
                points = single_input.copy()
                pc_num = str(self.num_visualized_frames).zfill(8)
                pc_name = f'{pc_num}.png'
            else:
                raise ValueError('Unsupported input type: '
                                 f'{type(single_input)}')

            if img_out_dir != '' and show:
                o3d_save_path = osp.join(img_out_dir, 'vis_lidar', pc_name)
                mmengine.mkdir_or_exist(osp.dirname(o3d_save_path))
            else:
                o3d_save_path = None

            data_input = dict(points=points)
            self.visualizer.add_datasample(
                pc_name,
                data_input,
                pred,
                show=show,
                wait_time=wait_time,
                draw_gt=False,
                draw_pred=draw_pred,
                pred_score_thr=pred_score_thr,
                o3d_save_path=o3d_save_path,
                vis_task='lidar_det',
            )
            results.append(points)
            self.num_visualized_frames += 1

        return results

    def visualize_preds_fromfile(self, inputs: InputsType, preds: PredType,
                                 **kwargs) -> Union[List[np.ndarray], None]:
        """Visualize predictions from `*.json` files.

        Args:
            inputs (InputsType): Inputs for the inferencer.
            preds (PredType): Predictions of the model.

        Returns:
            List[np.ndarray] or None: Returns visualization results only if
            applicable.
        """
        data_samples = []
        for pred in preds:
            pred = mmengine.load(pred)
            data_sample = Det3DDataSample()
            data_sample.pred_instances_3d = InstanceData()
            data_sample.pred_instances_3d.labels_3d = torch.tensor(
                pred['labels_3d'])
            data_sample.pred_instances_3d.scores_3d = torch.tensor(
                pred['scores_3d'])
            if pred['box_type_3d'] == 'LiDAR':
                data_sample.pred_instances_3d.bboxes_3d = \
                    LiDARInstance3DBoxes(pred['bboxes_3d'])
            elif pred['box_type_3d'] == 'Camera':
                data_sample.pred_instances_3d.bboxes_3d = \
                    CameraInstance3DBoxes(pred['bboxes_3d'])
            elif pred['box_type_3d'] == 'Depth':
                data_sample.pred_instances_3d.bboxes_3d = \
                    DepthInstance3DBoxes(pred['bboxes_3d'])
            else:
                raise ValueError('Unsupported box type: '
                                 f'{pred["box_type_3d"]}')
            data_samples.append(data_sample)
        return self.visualize(inputs=inputs, preds=data_samples, **kwargs)
```
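A usage sketch for `LidarDet3DInferencer`, assuming the metafile alias `pointpillars_kitti-3class` (named in the docstring above; still illustrative) resolves to downloadable weights:

```python
from mmdet3d.apis import LidarDet3DInferencer

# The alias and demo file are illustrative.
inferencer = LidarDet3DInferencer('pointpillars_kitti-3class')
results = inferencer(
    dict(points='demo/data/kitti/000008.bin'),
    out_dir='outputs',  # dispatched to both img_out_dir and pred_out_dir
    pred_score_thr=0.3)
# With return_datasamples=False (the default), predictions are plain dicts
# produced by pred2dict(), with 'bboxes_3d', 'scores_3d' and 'labels_3d'.
print(results['predictions'][0]['labels_3d'])
```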
mmdetection3d/mmdet3d/apis/inferencers/lidar_seg3d_inferencer.py (new file, mode 100644)
```python
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
from typing import Dict, List, Optional, Sequence, Union

import mmengine
import numpy as np
from mmengine.dataset import Compose
from mmengine.fileio import (get_file_backend, isdir, join_path,
                             list_dir_or_file)
from mmengine.infer.infer import ModelType
from mmengine.structures import InstanceData

from mmdet3d.registry import INFERENCERS
from mmdet3d.utils import ConfigType
from .base_3d_inferencer import Base3DInferencer

InstanceList = List[InstanceData]
InputType = Union[str, np.ndarray]
InputsType = Union[InputType, Sequence[InputType]]
PredType = Union[InstanceData, InstanceList]
ImgType = Union[np.ndarray, Sequence[np.ndarray]]
ResType = Union[Dict, List[Dict], InstanceData, List[InstanceData]]


@INFERENCERS.register_module(name='seg3d-lidar')
@INFERENCERS.register_module()
class LidarSeg3DInferencer(Base3DInferencer):
    """The inferencer of LiDAR-based segmentation.

    Args:
        model (str, optional): Path to the config file or the model name
            defined in metafile. For example, it could be
            "pointnet2-ssg_s3dis-seg" or
            "configs/pointnet2/pointnet2_ssg_2xb16-cosine-50e_s3dis-seg.py".
            If model is not specified, user must provide the
            `weights` saved by MMEngine which contains the config string.
            Defaults to None.
        weights (str, optional): Path to the checkpoint. If it is not
            specified and model is a model name of metafile, the weights
            will be loaded from metafile. Defaults to None.
        device (str, optional): Device to run inference. If None, the
            available device will be automatically used. Defaults to None.
        scope (str): The scope of the model. Defaults to 'mmdet3d'.
        palette (str): Color palette used for visualization. The order of
            priority is palette -> config -> checkpoint. Defaults to 'none'.
    """

    def __init__(self,
                 model: Union[ModelType, str, None] = None,
                 weights: Optional[str] = None,
                 device: Optional[str] = None,
                 scope: str = 'mmdet3d',
                 palette: str = 'none') -> None:
        # A global counter tracking the number of frames processed, for
        # naming of the output results
        self.num_visualized_frames = 0
        super(LidarSeg3DInferencer, self).__init__(
            model=model,
            weights=weights,
            device=device,
            scope=scope,
            palette=palette)

    def _inputs_to_list(self, inputs: Union[dict, list], **kwargs) -> list:
        """Preprocess the inputs to a list.

        Preprocess inputs to a list according to its type:

        - list or tuple: return inputs
        - dict: the value with key 'points' is
            - Directory path: return all files in the directory
            - other cases: return a list containing the string. The string
              could be a path to file, a url or other types of string
              according to the task.

        Args:
            inputs (Union[dict, list]): Inputs for the inferencer.

        Returns:
            list: List of input for the :meth:`preprocess`.
        """
        if isinstance(inputs, dict) and isinstance(inputs['points'], str):
            pcd = inputs['points']
            backend = get_file_backend(pcd)
            if hasattr(backend, 'isdir') and isdir(pcd):
                # Backends like HttpsBackend do not implement `isdir`, so
                # only those backends that implement `isdir` could accept
                # the inputs as a directory
                filename_list = list_dir_or_file(pcd, list_dir=False)
                inputs = [{
                    'points': join_path(pcd, filename)
                } for filename in filename_list]

        if not isinstance(inputs, (list, tuple)):
            inputs = [inputs]
        return list(inputs)

    def _init_pipeline(self, cfg: ConfigType) -> Compose:
        """Initialize the test pipeline."""
        pipeline_cfg = cfg.test_dataloader.dataset.pipeline

        # Load annotation is also not applicable
        idx = self._get_transform_idx(pipeline_cfg, 'LoadAnnotations3D')
        if idx != -1:
            del pipeline_cfg[idx]

        idx = self._get_transform_idx(pipeline_cfg, 'PointSegClassMapping')
        if idx != -1:
            del pipeline_cfg[idx]

        load_point_idx = self._get_transform_idx(pipeline_cfg,
                                                 'LoadPointsFromFile')
        if load_point_idx == -1:
            raise ValueError(
                'LoadPointsFromFile is not found in the test pipeline')

        load_cfg = pipeline_cfg[load_point_idx]
        self.coord_type, self.load_dim = load_cfg['coord_type'], load_cfg[
            'load_dim']
        self.use_dim = list(range(load_cfg['use_dim'])) if isinstance(
            load_cfg['use_dim'], int) else load_cfg['use_dim']

        pipeline_cfg[load_point_idx]['type'] = 'LidarDet3DInferencerLoader'
        return Compose(pipeline_cfg)

    def visualize(self,
                  inputs: InputsType,
                  preds: PredType,
                  return_vis: bool = False,
                  show: bool = False,
                  wait_time: int = 0,
                  draw_pred: bool = True,
                  pred_score_thr: float = 0.3,
                  no_save_vis: bool = False,
                  img_out_dir: str = '') -> Union[List[np.ndarray], None]:
        """Visualize predictions.

        Args:
            inputs (InputsType): Inputs for the inferencer.
            preds (PredType): Predictions of the model.
            return_vis (bool): Whether to return the visualization result.
                Defaults to False.
            show (bool): Whether to display the image in a popup window.
                Defaults to False.
            wait_time (float): The interval of show (s). Defaults to 0.
            draw_pred (bool): Whether to draw predicted bounding boxes.
                Defaults to True.
            pred_score_thr (float): Minimum score of bboxes to draw.
                Defaults to 0.3.
            no_save_vis (bool): Whether to force not to save visualization
                results. Defaults to False.
            img_out_dir (str): Output directory of visualization results.
                If left as empty, no file will be saved. Defaults to ''.

        Returns:
            List[np.ndarray] or None: Returns visualization results only if
            applicable.
        """
        if no_save_vis is True:
            img_out_dir = ''

        if not show and img_out_dir == '' and not return_vis:
            return None

        if getattr(self, 'visualizer') is None:
            raise ValueError('Visualization needs the "visualizer" term '
                             'defined in the config, but got None.')

        results = []

        for single_input, pred in zip(inputs, preds):
            single_input = single_input['points']
            if isinstance(single_input, str):
                pts_bytes = mmengine.fileio.get(single_input)
                points = np.frombuffer(pts_bytes, dtype=np.float32)
                points = points.reshape(-1, self.load_dim)
                points = points[:, self.use_dim]
                pc_name = osp.basename(single_input).split('.bin')[0]
                pc_name = f'{pc_name}.png'
            elif isinstance(single_input, np.ndarray):
                points = single_input.copy()
                pc_num = str(self.num_visualized_frames).zfill(8)
                pc_name = f'{pc_num}.png'
            else:
                raise ValueError('Unsupported input type: '
                                 f'{type(single_input)}')

            if img_out_dir != '' and show:
                o3d_save_path = osp.join(img_out_dir, 'vis_lidar', pc_name)
                mmengine.mkdir_or_exist(osp.dirname(o3d_save_path))
            else:
                o3d_save_path = None

            data_input = dict(points=points)
            self.visualizer.add_datasample(
                pc_name,
                data_input,
                pred,
                show=show,
                wait_time=wait_time,
                draw_gt=False,
                draw_pred=draw_pred,
                pred_score_thr=pred_score_thr,
                o3d_save_path=o3d_save_path,
                vis_task='lidar_seg',
            )
            results.append(points)
            self.num_visualized_frames += 1

        return results
```
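And a parallel sketch for `LidarSeg3DInferencer`; the alias comes from the docstring above and the demo path is illustrative. Segmentation predictions carry a per-point `pts_semantic_mask` rather than boxes:

```python
from mmdet3d.apis import LidarSeg3DInferencer

# Illustrative alias and path; any registered seg config/checkpoint works.
inferencer = LidarSeg3DInferencer('pointnet2-ssg_s3dis-seg')
results = inferencer(dict(points='demo/data/s3dis/Area_1_office_2.bin'))
# pred2dict() exposes the per-point labels as 'pts_semantic_mask'.
mask = results['predictions'][0]['pts_semantic_mask']
```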
mmdetection3d/mmdet3d/apis/inferencers/mono_det3d_inferencer.py (new file, mode 100644)
# Copyright (c) OpenMMLab. All rights reserved.
import
os.path
as
osp
from
typing
import
Dict
,
List
,
Optional
,
Sequence
,
Union
import
mmcv
import
mmengine
import
numpy
as
np
from
mmengine.dataset
import
Compose
from
mmengine.fileio
import
(
get_file_backend
,
isdir
,
join_path
,
list_dir_or_file
)
from
mmengine.infer.infer
import
ModelType
from
mmengine.structures
import
InstanceData
from
mmdet3d.registry
import
INFERENCERS
from
mmdet3d.utils
import
ConfigType
from
.base_3d_inferencer
import
Base3DInferencer
InstanceList
=
List
[
InstanceData
]
InputType
=
Union
[
str
,
np
.
ndarray
]
InputsType
=
Union
[
InputType
,
Sequence
[
InputType
]]
PredType
=
Union
[
InstanceData
,
InstanceList
]
ImgType
=
Union
[
np
.
ndarray
,
Sequence
[
np
.
ndarray
]]
ResType
=
Union
[
Dict
,
List
[
Dict
],
InstanceData
,
List
[
InstanceData
]]
@
INFERENCERS
.
register_module
(
name
=
'det3d-mono'
)
@
INFERENCERS
.
register_module
()
class
MonoDet3DInferencer
(
Base3DInferencer
):
"""MMDet3D Monocular 3D object detection inferencer.
Args:
model (str, optional): Path to the config file or the model name
defined in metafile. For example, it could be
"pgd_kitti" or
"configs/pgd/pgd_r101-caffe_fpn_head-gn_4xb3-4x_kitti-mono3d.py".
If model is not specified, user must provide the
            `weights` saved by MMEngine which contains the config string.
            Defaults to None.
        weights (str, optional): Path to the checkpoint. If it is not
            specified and model is a model name of metafile, the weights
            will be loaded from metafile. Defaults to None.
        device (str, optional): Device to run inference. If None, the
            available device will be automatically used. Defaults to None.
        scope (str): The scope of the model. Defaults to 'mmdet3d'.
        palette (str): Color palette used for visualization. The order of
            priority is palette -> config -> checkpoint. Defaults to 'none'.
    """

    def __init__(self,
                 model: Union[ModelType, str, None] = None,
                 weights: Optional[str] = None,
                 device: Optional[str] = None,
                 scope: str = 'mmdet3d',
                 palette: str = 'none') -> None:
        # A global counter tracking the number of images processed, for
        # naming of the output images
        self.num_visualized_imgs = 0
        super(MonoDet3DInferencer, self).__init__(
            model=model,
            weights=weights,
            device=device,
            scope=scope,
            palette=palette)

    def _inputs_to_list(self,
                        inputs: Union[dict, list],
                        cam_type='CAM2',
                        **kwargs) -> list:
        """Preprocess the inputs to a list.

        Preprocess inputs to a list according to its type:

        - list or tuple: return inputs
        - dict: the value with key 'img' is
            - Directory path: return all files in the directory
            - other cases: return a list containing the string. The string
              could be a path to file, a url or other types of string
              according to the task.

        Args:
            inputs (Union[dict, list]): Inputs for the inferencer.

        Returns:
            list: List of input for the :meth:`preprocess`.
        """
        if isinstance(inputs, dict):
            assert 'infos' in inputs
            infos = inputs.pop('infos')

            if isinstance(inputs['img'], str):
                img = inputs['img']
                backend = get_file_backend(img)
                if hasattr(backend, 'isdir') and isdir(img):
                    # Backends like HttpsBackend do not implement `isdir`, so
                    # only those backends that implement `isdir` could accept
                    # the inputs as a directory
                    filename_list = list_dir_or_file(img, list_dir=False)
                    inputs = [{
                        'img': join_path(img, filename)
                    } for filename in filename_list]

            if not isinstance(inputs, (list, tuple)):
                inputs = [inputs]

            # get cam2img, lidar2cam and lidar2img from infos
            info_list = mmengine.load(infos)['data_list']
            assert len(info_list) == len(inputs)
            for index, input in enumerate(inputs):
                data_info = info_list[index]
                img_path = data_info['images'][cam_type]['img_path']
                if isinstance(input['img'], str) and \
                        osp.basename(img_path) != osp.basename(input['img']):
                    raise ValueError(
                        f'the info file of {img_path} is not provided.')
                cam2img = np.asarray(
                    data_info['images'][cam_type]['cam2img'],
                    dtype=np.float32)
                lidar2cam = np.asarray(
                    data_info['images'][cam_type]['lidar2cam'],
                    dtype=np.float32)
                if 'lidar2img' in data_info['images'][cam_type]:
                    lidar2img = np.asarray(
                        data_info['images'][cam_type]['lidar2img'],
                        dtype=np.float32)
                else:
                    lidar2img = cam2img @ lidar2cam
                input['cam2img'] = cam2img
                input['lidar2cam'] = lidar2cam
                input['lidar2img'] = lidar2img
        elif isinstance(inputs, (list, tuple)):
            # get cam2img, lidar2cam and lidar2img from infos
            for input in inputs:
                assert 'infos' in input
                infos = input.pop('infos')
                info_list = mmengine.load(infos)['data_list']
                assert len(info_list) == 1, \
                    'Only support single sample info in `.pkl`, ' \
                    'when inputs is a list.'
                data_info = info_list[0]
                img_path = data_info['images'][cam_type]['img_path']
                if isinstance(input['img'], str) and \
                        osp.basename(img_path) != osp.basename(input['img']):
                    raise ValueError(
                        f'the info file of {img_path} is not provided.')
                cam2img = np.asarray(
                    data_info['images'][cam_type]['cam2img'],
                    dtype=np.float32)
                lidar2cam = np.asarray(
                    data_info['images'][cam_type]['lidar2cam'],
                    dtype=np.float32)
                if 'lidar2img' in data_info['images'][cam_type]:
                    lidar2img = np.asarray(
                        data_info['images'][cam_type]['lidar2img'],
                        dtype=np.float32)
                else:
                    lidar2img = cam2img @ lidar2cam
                input['cam2img'] = cam2img
                input['lidar2cam'] = lidar2cam
                input['lidar2img'] = lidar2img

        return list(inputs)

    def _init_pipeline(self, cfg: ConfigType) -> Compose:
        """Initialize the test pipeline."""
        pipeline_cfg = cfg.test_dataloader.dataset.pipeline

        load_img_idx = self._get_transform_idx(pipeline_cfg,
                                               'LoadImageFromFileMono3D')
        if load_img_idx == -1:
            raise ValueError(
                'LoadImageFromFileMono3D is not found in the test pipeline')
        pipeline_cfg[load_img_idx]['type'] = 'MonoDet3DInferencerLoader'
        return Compose(pipeline_cfg)

    def visualize(self,
                  inputs: InputsType,
                  preds: PredType,
                  return_vis: bool = False,
                  show: bool = False,
                  wait_time: int = 0,
                  draw_pred: bool = True,
                  pred_score_thr: float = 0.3,
                  no_save_vis: bool = False,
                  img_out_dir: str = '',
                  cam_type_dir: str = 'CAM2'
                  ) -> Union[List[np.ndarray], None]:
        """Visualize predictions.

        Args:
            inputs (List[Dict]): Inputs for the inferencer.
            preds (List[Dict]): Predictions of the model.
            return_vis (bool): Whether to return the visualization result.
                Defaults to False.
            show (bool): Whether to display the image in a popup window.
                Defaults to False.
            wait_time (float): The interval of show (s). Defaults to 0.
            draw_pred (bool): Whether to draw predicted bounding boxes.
                Defaults to True.
            pred_score_thr (float): Minimum score of bboxes to draw.
                Defaults to 0.3.
            no_save_vis (bool): Whether to skip saving visualization
                results. Defaults to False.
            img_out_dir (str): Output directory of visualization results.
                If left as empty, no file will be saved. Defaults to ''.
            cam_type_dir (str): Camera type directory. Defaults to 'CAM2'.

        Returns:
            List[np.ndarray] or None: Returns visualization results only if
            applicable.
        """
        if no_save_vis is True:
            img_out_dir = ''

        if not show and img_out_dir == '' and not return_vis:
            return None

        if getattr(self, 'visualizer') is None:
            raise ValueError('Visualization needs the "visualizer" term '
                             'defined in the config, but got None.')

        results = []

        for single_input, pred in zip(inputs, preds):
            if isinstance(single_input['img'], str):
                img_bytes = mmengine.fileio.get(single_input['img'])
                img = mmcv.imfrombytes(img_bytes)
                img = img[:, :, ::-1]
                img_name = osp.basename(single_input['img'])
            elif isinstance(single_input['img'], np.ndarray):
                img = single_input['img'].copy()
                img_num = str(self.num_visualized_imgs).zfill(8)
                img_name = f'{img_num}.jpg'
            else:
                raise ValueError('Unsupported input type: '
                                 f"{type(single_input['img'])}")

            out_file = osp.join(img_out_dir, 'vis_camera', cam_type_dir,
                                img_name) if img_out_dir != '' else None

            data_input = dict(img=img)
            self.visualizer.add_datasample(
                img_name,
                data_input,
                pred,
                show=show,
                wait_time=wait_time,
                draw_gt=False,
                draw_pred=draw_pred,
                pred_score_thr=pred_score_thr,
                out_file=out_file,
                vis_task='mono_det',
            )
            results.append(img)
            self.num_visualized_imgs += 1

        return results
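
For orientation, here is a minimal usage sketch of the inferencer above. The model alias and both file paths are hypothetical placeholders, not values from this commit; any metafile model name or config path accepted by `Base3DInferencer`, together with an info `.pkl` produced by the mmdetection3d data converters, would work the same way.

from mmdet3d.apis import MonoDet3DInferencer

# 'pgd_kitti' and the demo paths below are placeholders for illustration.
inferencer = MonoDet3DInferencer(model='pgd_kitti')
inputs = dict(
    img='demo/data/kitti/000008.png',    # input image
    infos='demo/data/kitti/000008.pkl')  # info file carrying cam2img etc.
results = inferencer(inputs)
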
mmdetection3d/mmdet3d/apis/inferencers/multi_modality_det3d_inferencer.py
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import warnings
from typing import Dict, List, Optional, Sequence, Union

import mmcv
import mmengine
import numpy as np
from mmengine.dataset import Compose
from mmengine.fileio import (get_file_backend, isdir, join_path,
                             list_dir_or_file)
from mmengine.infer.infer import ModelType
from mmengine.structures import InstanceData

from mmdet3d.registry import INFERENCERS
from mmdet3d.utils import ConfigType
from .base_3d_inferencer import Base3DInferencer

InstanceList = List[InstanceData]
InputType = Union[str, np.ndarray]
InputsType = Union[InputType, Sequence[InputType]]
PredType = Union[InstanceData, InstanceList]
ImgType = Union[np.ndarray, Sequence[np.ndarray]]
ResType = Union[Dict, List[Dict], InstanceData, List[InstanceData]]


@INFERENCERS.register_module(name='det3d-multi_modality')
@INFERENCERS.register_module()
class MultiModalityDet3DInferencer(Base3DInferencer):
    """The inferencer of multi-modality detection.

    Args:
        model (str, optional): Path to the config file or the model name
            defined in metafile. For example, it could be
            "pointpillars_kitti-3class" or
            "configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py".  # noqa: E501
            If model is not specified, user must provide the
            `weights` saved by MMEngine which contains the config string.
            Defaults to None.
        weights (str, optional): Path to the checkpoint. If it is not
            specified and model is a model name of metafile, the weights
            will be loaded from metafile. Defaults to None.
        device (str, optional): Device to run inference. If None, the
            available device will be automatically used. Defaults to None.
        scope (str): The scope of registry. Defaults to 'mmdet3d'.
        palette (str): The palette of visualization. Defaults to 'none'.
    """

    def __init__(self,
                 model: Union[ModelType, str, None] = None,
                 weights: Optional[str] = None,
                 device: Optional[str] = None,
                 scope: str = 'mmdet3d',
                 palette: str = 'none') -> None:
        # A global counter tracking the number of frames processed, for
        # naming of the output results
        self.num_visualized_frames = 0
        super(MultiModalityDet3DInferencer, self).__init__(
            model=model,
            weights=weights,
            device=device,
            scope=scope,
            palette=palette)

    def _inputs_to_list(self,
                        inputs: Union[dict, list],
                        cam_type: str = 'CAM2',
                        **kwargs) -> list:
        """Preprocess the inputs to a list.

        Preprocess inputs to a list according to its type:

        - list or tuple: return inputs
        - dict: the value with key 'points' is
            - Directory path: return all files in the directory
            - other cases: return a list containing the string. The string
              could be a path to file, a url or other types of string
              according to the task.

        Args:
            inputs (Union[dict, list]): Inputs for the inferencer.

        Returns:
            list: List of input for the :meth:`preprocess`.
        """
        if isinstance(inputs, dict):
            assert 'infos' in inputs
            infos = inputs.pop('infos')

            if isinstance(inputs['img'], str):
                img, pcd = inputs['img'], inputs['points']
                backend = get_file_backend(img)
                if hasattr(backend, 'isdir') and isdir(img) and isdir(pcd):
                    # Backends like HttpsBackend do not implement `isdir`, so
                    # only those backends that implement `isdir` could accept
                    # the inputs as a directory
                    img_filename_list = list_dir_or_file(
                        img, list_dir=False, suffix=['.png', '.jpg'])
                    pcd_filename_list = list_dir_or_file(
                        pcd, list_dir=False, suffix='.bin')
                    assert len(img_filename_list) == len(pcd_filename_list)
                    inputs = [{
                        'img': join_path(img, img_filename),
                        'points': join_path(pcd, pcd_filename)
                    } for pcd_filename, img_filename in zip(
                        pcd_filename_list, img_filename_list)]

            if not isinstance(inputs, (list, tuple)):
                inputs = [inputs]

            # get cam2img, lidar2cam and lidar2img from infos
            info_list = mmengine.load(infos)['data_list']
            assert len(info_list) == len(inputs)
            for index, input in enumerate(inputs):
                data_info = info_list[index]
                img_path = data_info['images'][cam_type]['img_path']
                if isinstance(input['img'], str) and \
                        osp.basename(img_path) != osp.basename(input['img']):
                    raise ValueError(
                        f'the info file of {img_path} is not provided.')
                cam2img = np.asarray(
                    data_info['images'][cam_type]['cam2img'],
                    dtype=np.float32)
                lidar2cam = np.asarray(
                    data_info['images'][cam_type]['lidar2cam'],
                    dtype=np.float32)
                if 'lidar2img' in data_info['images'][cam_type]:
                    lidar2img = np.asarray(
                        data_info['images'][cam_type]['lidar2img'],
                        dtype=np.float32)
                else:
                    lidar2img = cam2img @ lidar2cam
                input['cam2img'] = cam2img
                input['lidar2cam'] = lidar2cam
                input['lidar2img'] = lidar2img
        elif isinstance(inputs, (list, tuple)):
            # get cam2img, lidar2cam and lidar2img from infos
            for input in inputs:
                assert 'infos' in input
                infos = input.pop('infos')
                info_list = mmengine.load(infos)['data_list']
                assert len(info_list) == 1, \
                    'Only support single sample info in `.pkl`, ' \
                    'when input is a list.'
                data_info = info_list[0]
                img_path = data_info['images'][cam_type]['img_path']
                if isinstance(input['img'], str) and \
                        osp.basename(img_path) != osp.basename(input['img']):
                    raise ValueError(
                        f'the info file of {img_path} is not provided.')
                cam2img = np.asarray(
                    data_info['images'][cam_type]['cam2img'],
                    dtype=np.float32)
                lidar2cam = np.asarray(
                    data_info['images'][cam_type]['lidar2cam'],
                    dtype=np.float32)
                if 'lidar2img' in data_info['images'][cam_type]:
                    lidar2img = np.asarray(
                        data_info['images'][cam_type]['lidar2img'],
                        dtype=np.float32)
                else:
                    lidar2img = cam2img @ lidar2cam
                input['cam2img'] = cam2img
                input['lidar2cam'] = lidar2cam
                input['lidar2img'] = lidar2img

        return list(inputs)

    def _init_pipeline(self, cfg: ConfigType) -> Compose:
        """Initialize the test pipeline."""
        pipeline_cfg = cfg.test_dataloader.dataset.pipeline

        load_point_idx = self._get_transform_idx(pipeline_cfg,
                                                 'LoadPointsFromFile')
        load_mv_img_idx = self._get_transform_idx(
            pipeline_cfg, 'LoadMultiViewImageFromFiles')
        if load_mv_img_idx != -1:
            warnings.warn(
                'LoadMultiViewImageFromFiles is not supported yet in the '
                'multi-modality inferencer. Please remove it.')
        # Now, we only support ``LoadImageFromFile`` as the image loader in
        # the original pipeline. `LoadMultiViewImageFromFiles` is not
        # supported yet.
        load_img_idx = self._get_transform_idx(pipeline_cfg,
                                               'LoadImageFromFile')

        if load_point_idx == -1 or load_img_idx == -1:
            raise ValueError(
                'Both LoadPointsFromFile and LoadImageFromFile must '
                'be specified in the pipeline, but LoadPointsFromFile is '
                f'{load_point_idx == -1} and LoadImageFromFile is '
                f'{load_img_idx}')

        load_cfg = pipeline_cfg[load_point_idx]
        self.coord_type, self.load_dim = load_cfg['coord_type'], load_cfg[
            'load_dim']
        self.use_dim = list(range(load_cfg['use_dim'])) if isinstance(
            load_cfg['use_dim'], int) else load_cfg['use_dim']

        load_point_args = pipeline_cfg[load_point_idx]
        load_point_args.pop('type')
        load_img_args = pipeline_cfg[load_img_idx]
        load_img_args.pop('type')

        load_idx = min(load_point_idx, load_img_idx)
        pipeline_cfg.pop(max(load_point_idx, load_img_idx))

        pipeline_cfg[load_idx] = dict(
            type='MultiModalityDet3DInferencerLoader',
            load_point_args=load_point_args,
            load_img_args=load_img_args)
        return Compose(pipeline_cfg)

    def visualize(self,
                  inputs: InputsType,
                  preds: PredType,
                  return_vis: bool = False,
                  show: bool = False,
                  wait_time: int = 0,
                  draw_pred: bool = True,
                  pred_score_thr: float = 0.3,
                  no_save_vis: bool = False,
                  img_out_dir: str = '',
                  cam_type_dir: str = 'CAM2'
                  ) -> Union[List[np.ndarray], None]:
        """Visualize predictions.

        Args:
            inputs (InputsType): Inputs for the inferencer.
            preds (PredType): Predictions of the model.
            return_vis (bool): Whether to return the visualization result.
                Defaults to False.
            show (bool): Whether to display the image in a popup window.
                Defaults to False.
            wait_time (float): The interval of show (s). Defaults to 0.
            draw_pred (bool): Whether to draw predicted bounding boxes.
                Defaults to True.
            no_save_vis (bool): Whether to skip saving visualization
                results. Defaults to False.
            pred_score_thr (float): Minimum score of bboxes to draw.
                Defaults to 0.3.
            img_out_dir (str): Output directory of visualization results.
                If left as empty, no file will be saved. Defaults to ''.
            cam_type_dir (str): Camera type directory. Defaults to 'CAM2'.

        Returns:
            List[np.ndarray] or None: Returns visualization results only if
            applicable.
        """
        if no_save_vis is True:
            img_out_dir = ''

        if not show and img_out_dir == '' and not return_vis:
            return None

        if getattr(self, 'visualizer') is None:
            raise ValueError('Visualization needs the "visualizer" term '
                             'defined in the config, but got None.')

        results = []

        for single_input, pred in zip(inputs, preds):
            points_input = single_input['points']
            if isinstance(points_input, str):
                pts_bytes = mmengine.fileio.get(points_input)
                points = np.frombuffer(pts_bytes, dtype=np.float32)
                points = points.reshape(-1, self.load_dim)
                points = points[:, self.use_dim]
                pc_name = osp.basename(points_input).split('.bin')[0]
                pc_name = f'{pc_name}.png'
            elif isinstance(points_input, np.ndarray):
                points = points_input.copy()
                pc_num = str(self.num_visualized_frames).zfill(8)
                pc_name = f'{pc_num}.png'
            else:
                raise ValueError('Unsupported input type: '
                                 f'{type(points_input)}')

            if img_out_dir != '' and show:
                o3d_save_path = osp.join(img_out_dir, 'vis_lidar', pc_name)
                mmengine.mkdir_or_exist(osp.dirname(o3d_save_path))
            else:
                o3d_save_path = None

            img_input = single_input['img']
            if isinstance(single_input['img'], str):
                img_bytes = mmengine.fileio.get(img_input)
                img = mmcv.imfrombytes(img_bytes)
                img = img[:, :, ::-1]
                img_name = osp.basename(img_input)
            elif isinstance(img_input, np.ndarray):
                img = img_input.copy()
                img_num = str(self.num_visualized_frames).zfill(8)
                img_name = f'{img_num}.jpg'
            else:
                raise ValueError('Unsupported input type: '
                                 f'{type(img_input)}')

            out_file = osp.join(img_out_dir, 'vis_camera', cam_type_dir,
                                img_name) if img_out_dir != '' else None

            data_input = dict(points=points, img=img)
            self.visualizer.add_datasample(
                pc_name,
                data_input,
                pred,
                show=show,
                wait_time=wait_time,
                draw_gt=False,
                draw_pred=draw_pred,
                pred_score_thr=pred_score_thr,
                o3d_save_path=o3d_save_path,
                out_file=out_file,
                vis_task='multi-modality_det',
            )
            results.append(points)
            self.num_visualized_frames += 1

        return results
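
A matching usage sketch for the multi-modality inferencer; again the model alias and all paths are placeholders, and the `.pkl` must contain the calibration entries (`cam2img`, `lidar2cam`) that `_inputs_to_list` reads.

from mmdet3d.apis import MultiModalityDet3DInferencer

# Placeholder alias and paths for illustration only.
inferencer = MultiModalityDet3DInferencer(model='mvxnet_kitti-3class')
inputs = dict(
    points='demo/data/kitti/000008.bin',  # LiDAR point cloud
    img='demo/data/kitti/000008.png',     # paired camera image
    infos='demo/data/kitti/000008.pkl')   # info file with calibration
results = inferencer(inputs)
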
mmdetection3d/mmdet3d/configs/_base_/datasets/kitti_3d_3class.py
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.dataset_wrapper import RepeatDataset
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend

from mmdet3d.datasets.kitti_dataset import KittiDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
                                                 LoadPointsFromFile)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import (  # noqa
    GlobalRotScaleTrans, ObjectNoise, ObjectRangeFilter, ObjectSample,
    PointShuffle, PointsRangeFilter, RandomFlip3D)
from mmdet3d.evaluation.metrics.kitti_metric import KittiMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer

# dataset settings
dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
point_cloud_range = [0, -40, -3, 70.4, 40, 1]
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(classes=class_names)

# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)

# data_root = 's3://openmmlab/datasets/detection3d/kitti/'

# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None

db_sampler = dict(
    data_root=data_root,
    info_path=data_root + 'kitti_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(
        filter_by_difficulty=[-1],
        filter_by_min_points=dict(Car=5, Pedestrian=10, Cyclist=10)),
    classes=class_names,
    sample_groups=dict(Car=12, Pedestrian=6, Cyclist=6),
    points_loader=dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        backend_args=backend_args),
    backend_args=backend_args)

train_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=4,  # x, y, z, intensity
        use_dim=4,
        backend_args=backend_args),
    dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
    dict(type=ObjectSample, db_sampler=db_sampler),
    dict(
        type=ObjectNoise,
        num_try=100,
        translation_std=[1.0, 1.0, 0.5],
        global_rot_range=[0.0, 0.0],
        rot_range=[-0.78539816, 0.78539816]),
    dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
    dict(
        type=GlobalRotScaleTrans,
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
    dict(type=ObjectRangeFilter, point_cloud_range=point_cloud_range),
    dict(type=PointShuffle),
    dict(type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        backend_args=backend_args),
    dict(
        type=MultiScaleFlipAug3D,
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type=GlobalRotScaleTrans,
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type=RandomFlip3D),
            dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range)
        ]),
    dict(type=Pack3DDetInputs, keys=['points'])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        backend_args=backend_args),
    dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
    batch_size=6,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type=DefaultSampler, shuffle=True),
    dataset=dict(
        type=RepeatDataset,
        times=2,
        dataset=dict(
            type=KittiDataset,
            data_root=data_root,
            ann_file='kitti_infos_train.pkl',
            data_prefix=dict(pts='training/velodyne_reduced'),
            pipeline=train_pipeline,
            modality=input_modality,
            test_mode=False,
            metainfo=metainfo,
            # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
            # and box_type_3d='Depth' in sunrgbd and scannet dataset.
            box_type_3d='LiDAR',
            backend_args=backend_args)))
val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=KittiDataset,
        data_root=data_root,
        data_prefix=dict(pts='training/velodyne_reduced'),
        ann_file='kitti_infos_val.pkl',
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        box_type_3d='LiDAR',
        backend_args=backend_args))
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=KittiDataset,
        data_root=data_root,
        data_prefix=dict(pts='training/velodyne_reduced'),
        ann_file='kitti_infos_val.pkl',
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        box_type_3d='LiDAR',
        backend_args=backend_args))
val_evaluator = dict(
    type=KittiMetric,
    ann_file=data_root + 'kitti_infos_val.pkl',
    metric='bbox',
    backend_args=backend_args)
test_evaluator = val_evaluator

vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
    type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
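
Since this is one of the pure-Python `_base_` configs, a downstream model config would normally pull it in with `mmengine.config.read_base` instead of the string-based `_base_` mechanism. A minimal sketch, assuming the derived file lives two package levels below `_base_`; the override at the end is illustrative only.

from mmengine.config import read_base

with read_base():
    from .._base_.datasets.kitti_3d_3class import *

# Imported names behave like ordinary module-level variables and can be
# tweaked in place, e.g. a smaller training batch size (illustrative):
train_dataloader.update(batch_size=4)
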
mmdetection3d/mmdet3d/configs/_base_/datasets/kitti_3d_car.py
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.dataset_wrapper import RepeatDataset
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend

from mmdet3d.datasets.kitti_dataset import KittiDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
                                                 LoadPointsFromFile)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import (  # noqa
    GlobalRotScaleTrans, ObjectNoise, ObjectRangeFilter, ObjectSample,
    PointShuffle, PointsRangeFilter, RandomFlip3D)
from mmdet3d.evaluation.metrics.kitti_metric import KittiMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer

# dataset settings
dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Car']
point_cloud_range = [0, -40, -3, 70.4, 40, 1]
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(classes=class_names)

# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)

# data_root = 's3://openmmlab/datasets/detection3d/kitti/'

# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None

db_sampler = dict(
    data_root=data_root,
    info_path=data_root + 'kitti_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(
        filter_by_difficulty=[-1], filter_by_min_points=dict(Car=5)),
    classes=class_names,
    sample_groups=dict(Car=15),
    points_loader=dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        backend_args=backend_args),
    backend_args=backend_args)

train_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=4,  # x, y, z, intensity
        use_dim=4,
        backend_args=backend_args),
    dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
    dict(type=ObjectSample, db_sampler=db_sampler),
    dict(
        type=ObjectNoise,
        num_try=100,
        translation_std=[1.0, 1.0, 0.5],
        global_rot_range=[0.0, 0.0],
        rot_range=[-0.78539816, 0.78539816]),
    dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
    dict(
        type=GlobalRotScaleTrans,
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
    dict(type=ObjectRangeFilter, point_cloud_range=point_cloud_range),
    dict(type=PointShuffle),
    dict(type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        backend_args=backend_args),
    dict(
        type=MultiScaleFlipAug3D,
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type=GlobalRotScaleTrans,
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type=RandomFlip3D),
            dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range)
        ]),
    dict(type=Pack3DDetInputs, keys=['points'])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        backend_args=backend_args),
    dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
    batch_size=6,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type=DefaultSampler, shuffle=True),
    dataset=dict(
        type=RepeatDataset,
        times=2,
        dataset=dict(
            type=KittiDataset,
            data_root=data_root,
            ann_file='kitti_infos_train.pkl',
            data_prefix=dict(pts='training/velodyne_reduced'),
            pipeline=train_pipeline,
            modality=input_modality,
            test_mode=False,
            metainfo=metainfo,
            # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
            # and box_type_3d='Depth' in sunrgbd and scannet dataset.
            box_type_3d='LiDAR',
            backend_args=backend_args)))
val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=KittiDataset,
        data_root=data_root,
        data_prefix=dict(pts='training/velodyne_reduced'),
        ann_file='kitti_infos_val.pkl',
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        box_type_3d='LiDAR',
        backend_args=backend_args))
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=KittiDataset,
        data_root=data_root,
        data_prefix=dict(pts='training/velodyne_reduced'),
        ann_file='kitti_infos_val.pkl',
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        box_type_3d='LiDAR',
        backend_args=backend_args))
val_evaluator = dict(
    type=KittiMetric,
    ann_file=data_root + 'kitti_infos_val.pkl',
    metric='bbox',
    backend_args=backend_args)
test_evaluator = val_evaluator

vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
    type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
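
The `LoadPointsFromFile(load_dim=4, use_dim=4)` entries in the two KITTI configs above boil down to a plain NumPy decode of the velodyne `.bin` format. A sketch with a placeholder path:

import numpy as np

load_dim, use_dim = 4, 4  # matches the pipeline settings above
raw = np.fromfile('data/kitti/training/velodyne_reduced/000008.bin',
                  dtype=np.float32)
points = raw.reshape(-1, load_dim)  # columns: x, y, z, intensity
points = points[:, :use_dim]        # keep the first `use_dim` columns
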
mmdetection3d/mmdet3d/configs/_base_/datasets/kitti_mono3d.py
# Copyright (c) OpenMMLab. All rights reserved.
from mmcv.transforms.processing import Resize
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend

from mmdet3d.datasets.kitti_dataset import KittiDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
                                                 LoadImageFromFileMono3D)
from mmdet3d.datasets.transforms.transforms_3d import RandomFlip3D
from mmdet3d.evaluation.metrics.kitti_metric import KittiMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer

dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
input_modality = dict(use_lidar=False, use_camera=True)
metainfo = dict(classes=class_names)

# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)

# data_root = 's3://openmmlab/datasets/detection3d/kitti/'

# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None

train_pipeline = [
    dict(type=LoadImageFromFileMono3D, backend_args=backend_args),
    dict(
        type=LoadAnnotations3D,
        with_bbox=True,
        with_label=True,
        with_attr_label=False,
        with_bbox_3d=True,
        with_label_3d=True,
        with_bbox_depth=True),
    dict(type=Resize, scale=(1242, 375), keep_ratio=True),
    dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
    dict(
        type=Pack3DDetInputs,
        keys=[
            'img', 'gt_bboxes', 'gt_bboxes_labels', 'gt_bboxes_3d',
            'gt_labels_3d', 'centers_2d', 'depths'
        ]),
]
test_pipeline = [
    dict(type=LoadImageFromFileMono3D, backend_args=backend_args),
    dict(type=Resize, scale=(1242, 375), keep_ratio=True),
    dict(type=Pack3DDetInputs, keys=['img'])
]
eval_pipeline = [
    dict(type=LoadImageFromFileMono3D, backend_args=backend_args),
    dict(type=Pack3DDetInputs, keys=['img'])
]

train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type=DefaultSampler, shuffle=True),
    dataset=dict(
        type=KittiDataset,
        data_root=data_root,
        ann_file='kitti_infos_train.pkl',
        data_prefix=dict(img='training/image_2'),
        pipeline=train_pipeline,
        modality=input_modality,
        load_type='fov_image_based',
        test_mode=False,
        metainfo=metainfo,
        # we use box_type_3d='Camera' in monocular 3d
        # detection task
        box_type_3d='Camera',
        backend_args=backend_args))
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=KittiDataset,
        data_root=data_root,
        data_prefix=dict(img='training/image_2'),
        ann_file='kitti_infos_val.pkl',
        pipeline=test_pipeline,
        modality=input_modality,
        load_type='fov_image_based',
        metainfo=metainfo,
        test_mode=True,
        box_type_3d='Camera',
        backend_args=backend_args))
test_dataloader = val_dataloader

val_evaluator = dict(
    type=KittiMetric,
    ann_file=data_root + 'kitti_infos_val.pkl',
    metric='bbox',
    backend_args=backend_args)
test_evaluator = val_evaluator

vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
    type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
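
The monocular setting above relies on the `cam2img` intrinsic to relate 3D boxes (and the `centers_2d`/`depths` targets) to image pixels. Projecting a camera-frame point works as sketched below; the intrinsic values are made up for illustration:

import numpy as np

cam2img = np.array([[721.5, 0.0, 609.6],
                    [0.0, 721.5, 172.9],
                    [0.0, 0.0, 1.0]], dtype=np.float32)  # illustrative only
xyz = np.array([2.0, 1.0, 10.0], dtype=np.float32)  # point in camera frame
uvw = cam2img @ xyz
u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]  # pixel coordinates
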
mmdetection3d/mmdet3d/configs/_base_/datasets/lyft_3d.py
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend

from mmdet3d.datasets.lyft_dataset import LyftDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
                                                 LoadPointsFromFile,
                                                 LoadPointsFromMultiSweeps)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import (GlobalRotScaleTrans,
                                                       ObjectRangeFilter,
                                                       PointShuffle,
                                                       PointsRangeFilter,
                                                       RandomFlip3D)
from mmdet3d.evaluation.metrics.lyft_metric import LyftMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer

# If point cloud range is changed, the models should also change their point
# cloud range accordingly
point_cloud_range = [-80, -80, -5, 80, 80, 3]
# For Lyft we usually do 9-class detection
class_names = [
    'car', 'truck', 'bus', 'emergency_vehicle', 'other_vehicle', 'motorcycle',
    'bicycle', 'pedestrian', 'animal'
]
dataset_type = 'LyftDataset'
data_root = 'data/lyft/'
# Input modality for Lyft dataset, this is consistent with the submission
# format which requires the information in input_modality.
input_modality = dict(use_lidar=True, use_camera=False)
data_prefix = dict(
    pts='v1.01-train/lidar', img='', sweeps='v1.01-train/lidar')

# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)

# data_root = 's3://openmmlab/datasets/detection3d/lyft/'

# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None

train_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        backend_args=backend_args),
    dict(
        type=LoadPointsFromMultiSweeps,
        sweeps_num=10,
        backend_args=backend_args),
    dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
    dict(
        type=GlobalRotScaleTrans,
        rot_range=[-0.3925, 0.3925],
        scale_ratio_range=[0.95, 1.05],
        translation_std=[0, 0, 0]),
    dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
    dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
    dict(type=ObjectRangeFilter, point_cloud_range=point_cloud_range),
    dict(type=PointShuffle),
    dict(type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        backend_args=backend_args),
    dict(
        type=LoadPointsFromMultiSweeps,
        sweeps_num=10,
        backend_args=backend_args),
    dict(
        type=MultiScaleFlipAug3D,
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type=GlobalRotScaleTrans,
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type=RandomFlip3D),
            dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range)
        ]),
    dict(type=Pack3DDetInputs, keys=['points'])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        backend_args=backend_args),
    dict(
        type=LoadPointsFromMultiSweeps,
        sweeps_num=10,
        backend_args=backend_args),
    dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type=DefaultSampler, shuffle=True),
    dataset=dict(
        type=LyftDataset,
        data_root=data_root,
        ann_file='lyft_infos_train.pkl',
        pipeline=train_pipeline,
        metainfo=dict(classes=class_names),
        modality=input_modality,
        data_prefix=data_prefix,
        test_mode=False,
        box_type_3d='LiDAR',
        backend_args=backend_args))
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=LyftDataset,
        data_root=data_root,
        ann_file='lyft_infos_val.pkl',
        pipeline=test_pipeline,
        metainfo=dict(classes=class_names),
        modality=input_modality,
        data_prefix=data_prefix,
        test_mode=True,
        box_type_3d='LiDAR',
        backend_args=backend_args))
val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=LyftDataset,
        data_root=data_root,
        ann_file='lyft_infos_val.pkl',
        pipeline=test_pipeline,
        metainfo=dict(classes=class_names),
        modality=input_modality,
        test_mode=True,
        data_prefix=data_prefix,
        box_type_3d='LiDAR',
        backend_args=backend_args))
val_evaluator = dict(
    type=LyftMetric,
    data_root=data_root,
    ann_file='lyft_infos_val.pkl',
    metric='bbox',
    backend_args=backend_args)
test_evaluator = val_evaluator

vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
    type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
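
`LoadPointsFromMultiSweeps(sweeps_num=10)` densifies the keyframe cloud with preceding sweeps. Conceptually that amounts to the sketch below; the function, its inputs, and the use of the fifth channel as a relative timestamp are stand-ins for illustration, not the transform's actual API.

import numpy as np

def merge_sweeps(key_points, sweeps):
    # key_points: (N, 5) array; sweeps: list of
    # (points (M, 5), 4x4 sweep-to-key transform, time offset) tuples.
    merged = [key_points]
    for pts, sweep2key, dt in sweeps:
        xyz1 = np.hstack([pts[:, :3], np.ones((len(pts), 1), pts.dtype)])
        pts = pts.copy()
        pts[:, :3] = (xyz1 @ sweep2key.T)[:, :3]  # into keyframe coordinates
        pts[:, 4] = dt                            # relative timestamp channel
        merged.append(pts)
    return np.concatenate(merged, axis=0)
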
mmdetection3d/mmdet3d/configs/_base_/datasets/lyft_3d_range100.py
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend

from mmdet3d.datasets.lyft_dataset import LyftDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
                                                 LoadPointsFromFile,
                                                 LoadPointsFromMultiSweeps)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import (GlobalRotScaleTrans,
                                                       ObjectRangeFilter,
                                                       PointShuffle,
                                                       PointsRangeFilter,
                                                       RandomFlip3D)
from mmdet3d.evaluation.metrics.lyft_metric import LyftMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer

# If point cloud range is changed, the models should also change their point
# cloud range accordingly
point_cloud_range = [-100, -100, -5, 100, 100, 3]
# For Lyft we usually do 9-class detection
class_names = [
    'car', 'truck', 'bus', 'emergency_vehicle', 'other_vehicle', 'motorcycle',
    'bicycle', 'pedestrian', 'animal'
]
dataset_type = 'LyftDataset'
data_root = 'data/lyft/'
data_prefix = dict(
    pts='v1.01-train/lidar', img='', sweeps='v1.01-train/lidar')
# Input modality for Lyft dataset, this is consistent with the submission
# format which requires the information in input_modality.
input_modality = dict(
    use_lidar=True,
    use_camera=False,
    use_radar=False,
    use_map=False,
    use_external=False)

# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)

# data_root = 's3://openmmlab/datasets/detection3d/lyft/'

# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None

train_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        backend_args=backend_args),
    dict(
        type=LoadPointsFromMultiSweeps,
        sweeps_num=10,
        backend_args=backend_args),
    dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
    dict(
        type=GlobalRotScaleTrans,
        rot_range=[-0.3925, 0.3925],
        scale_ratio_range=[0.95, 1.05],
        translation_std=[0, 0, 0]),
    dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
    dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
    dict(type=ObjectRangeFilter, point_cloud_range=point_cloud_range),
    dict(type=PointShuffle),
    dict(type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        backend_args=backend_args),
    dict(
        type=LoadPointsFromMultiSweeps,
        sweeps_num=10,
        backend_args=backend_args),
    dict(
        type=MultiScaleFlipAug3D,
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type=GlobalRotScaleTrans,
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type=RandomFlip3D),
            dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
        ]),
    dict(type=Pack3DDetInputs, keys=['points'])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        backend_args=backend_args),
    dict(
        type=LoadPointsFromMultiSweeps,
        sweeps_num=10,
        backend_args=backend_args),
    dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type=DefaultSampler, shuffle=True),
    dataset=dict(
        type=LyftDataset,
        data_root=data_root,
        ann_file='lyft_infos_train.pkl',
        pipeline=train_pipeline,
        metainfo=dict(classes=class_names),
        modality=input_modality,
        data_prefix=data_prefix,
        test_mode=False,
        box_type_3d='LiDAR',
        backend_args=backend_args))
val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=LyftDataset,
        data_root=data_root,
        ann_file='lyft_infos_val.pkl',
        pipeline=test_pipeline,
        metainfo=dict(classes=class_names),
        modality=input_modality,
        test_mode=True,
        data_prefix=data_prefix,
        box_type_3d='LiDAR',
        backend_args=backend_args))
test_dataloader = val_dataloader

val_evaluator = dict(
    type=LyftMetric,
    data_root=data_root,
    ann_file='lyft_infos_val.pkl',
    metric='bbox',
    backend_args=backend_args)
test_evaluator = val_evaluator

vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
    type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
mmdetection3d/mmdet3d/configs/_base_/datasets/nuim_instance.py
# Copyright (c) OpenMMLab. All rights reserved.
from mmcv.transforms.loading import LoadAnnotations, LoadImageFromFile
from mmcv.transforms.processing import MultiScaleFlipAug, RandomFlip, Resize

dataset_type = 'CocoDataset'
data_root = 'data/nuimages/'
class_names = [
    'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle',
    'motorcycle', 'pedestrian', 'traffic_cone', 'barrier'
]

# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)

# data_root = 's3://openmmlab/datasets/detection3d/nuimages/'

# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None

train_pipeline = [
    dict(type=LoadImageFromFile, backend_args=backend_args),
    dict(type=LoadAnnotations, with_bbox=True, with_mask=True),
    dict(
        type=Resize,
        img_scale=[(1280, 720), (1920, 1080)],
        multiscale_mode='range',
        keep_ratio=True),
    dict(type=RandomFlip, flip_ratio=0.5),
    dict(type='PackDetInputs'),
]
test_pipeline = [
    dict(type=LoadImageFromFile, backend_args=backend_args),
    dict(
        type=MultiScaleFlipAug,
        img_scale=(1600, 900),
        flip=False,
        transforms=[
            dict(type=Resize, keep_ratio=True),
            dict(type=RandomFlip),
        ]),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor')),
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/nuimages_v1.0-train.json',
        img_prefix=data_root,
        classes=class_names,
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/nuimages_v1.0-val.json',
        img_prefix=data_root,
        classes=class_names,
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/nuimages_v1.0-val.json',
        img_prefix=data_root,
        classes=class_names,
        pipeline=test_pipeline))
evaluation = dict(metric=['bbox', 'segm'])
mmdetection3d/mmdet3d/configs/_base_/datasets/nus_3d.py
# Copyright (c) OpenMMLab. All rights reserved.
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend

from mmdet3d.datasets.nuscenes_dataset import NuScenesDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
                                                 LoadPointsFromFile,
                                                 LoadPointsFromMultiSweeps)
from mmdet3d.datasets.transforms.test_time_aug import MultiScaleFlipAug3D
from mmdet3d.datasets.transforms.transforms_3d import (  # noqa
    GlobalRotScaleTrans, ObjectNameFilter, ObjectRangeFilter, PointShuffle,
    PointsRangeFilter, RandomFlip3D)
from mmdet3d.evaluation.metrics.nuscenes_metric import NuScenesMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer

# If point cloud range is changed, the models should also change their point
# cloud range accordingly
point_cloud_range = [-50, -50, -5, 50, 50, 3]
# Using calibration info convert the Lidar-coordinate point cloud range to the
# ego-coordinate point cloud range could bring a little promotion in nuScenes.
# point_cloud_range = [-50, -50.8, -5, 50, 49.2, 3]
# For nuScenes we usually do 10-class detection
class_names = [
    'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle',
    'motorcycle', 'pedestrian', 'traffic_cone', 'barrier'
]
metainfo = dict(classes=class_names)
dataset_type = 'NuScenesDataset'
data_root = 'data/nuscenes/'
# Input modality for nuScenes dataset, this is consistent with the submission
# format which requires the information in input_modality.
input_modality = dict(use_lidar=True, use_camera=False)
data_prefix = dict(pts='samples/LIDAR_TOP', img='', sweeps='sweeps/LIDAR_TOP')

# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)

# data_root = 's3://openmmlab/datasets/detection3d/nuscenes/'

# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None

train_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        backend_args=backend_args),
    dict(
        type=LoadPointsFromMultiSweeps,
        sweeps_num=10,
        backend_args=backend_args),
    dict(type=LoadAnnotations3D, with_bbox_3d=True, with_label_3d=True),
    dict(
        type=GlobalRotScaleTrans,
        rot_range=[-0.3925, 0.3925],
        scale_ratio_range=[0.95, 1.05],
        translation_std=[0, 0, 0]),
    dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
    dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range),
    dict(type=ObjectRangeFilter, point_cloud_range=point_cloud_range),
    dict(type=ObjectNameFilter, classes=class_names),
    dict(type=PointShuffle),
    dict(type=Pack3DDetInputs, keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        backend_args=backend_args),
    dict(
        type=LoadPointsFromMultiSweeps,
        sweeps_num=10,
        test_mode=True,
        backend_args=backend_args),
    dict(
        type=MultiScaleFlipAug3D,
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type=GlobalRotScaleTrans,
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type=RandomFlip3D),
            dict(type=PointsRangeFilter, point_cloud_range=point_cloud_range)
        ]),
    dict(type=Pack3DDetInputs, keys=['points'])
]
# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5,
        backend_args=backend_args),
    dict(
        type=LoadPointsFromMultiSweeps,
        sweeps_num=10,
        test_mode=True,
        backend_args=backend_args),
    dict(type=Pack3DDetInputs, keys=['points'])
]
train_dataloader = dict(
    batch_size=4,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type=DefaultSampler, shuffle=True),
    dataset=dict(
        type=NuScenesDataset,
        data_root=data_root,
        ann_file='nuscenes_infos_train.pkl',
        pipeline=train_pipeline,
        metainfo=metainfo,
        modality=input_modality,
        test_mode=False,
        data_prefix=data_prefix,
        # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
        # and box_type_3d='Depth' in sunrgbd and scannet dataset.
        box_type_3d='LiDAR',
        backend_args=backend_args))
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=NuScenesDataset,
        data_root=data_root,
        ann_file='nuscenes_infos_val.pkl',
        pipeline=test_pipeline,
        metainfo=metainfo,
        modality=input_modality,
        data_prefix=data_prefix,
        test_mode=True,
        box_type_3d='LiDAR',
        backend_args=backend_args))
val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=NuScenesDataset,
        data_root=data_root,
        ann_file='nuscenes_infos_val.pkl',
        pipeline=test_pipeline,
        metainfo=metainfo,
        modality=input_modality,
        test_mode=True,
        data_prefix=data_prefix,
        box_type_3d='LiDAR',
        backend_args=backend_args))
val_evaluator = dict(
    type=NuScenesMetric,
    data_root=data_root,
    ann_file=data_root + 'nuscenes_infos_val.pkl',
    metric='bbox',
    backend_args=backend_args)
test_evaluator = val_evaluator

vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
    type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
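
As the comment at the top of this file warns, a new `point_cloud_range` must be mirrored wherever the old value was captured, on both the data and the model side. A hypothetical derived config illustrating the data-side half; the relative import and the new range are illustrative, and transforms nested inside `MultiScaleFlipAug3D` (as well as model-side anchor or voxel ranges) would need the same treatment:

from mmengine.config import read_base

with read_base():
    from .nus_3d import *  # assumed to sit next to this file

point_cloud_range = [-54, -54, -5, 54, 54, 3]  # hypothetical new range
for pipeline in (train_pipeline, test_pipeline, eval_pipeline):
    for transform in pipeline:
        if 'point_cloud_range' in transform:
            transform['point_cloud_range'] = point_cloud_range
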
mmdetection3d/mmdet3d/configs/_base_/datasets/nus_mono3d.py
# Copyright (c) OpenMMLab. All rights reserved.
from mmcv.transforms.processing import Resize
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend

from mmdet3d.datasets.nuscenes_dataset import NuScenesDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
                                                 LoadImageFromFileMono3D)
from mmdet3d.datasets.transforms.transforms_3d import RandomFlip3D
from mmdet3d.evaluation.metrics.nuscenes_metric import NuScenesMetric
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer

dataset_type = 'NuScenesDataset'
data_root = 'data/nuscenes/'
class_names = [
    'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle',
    'motorcycle', 'pedestrian', 'traffic_cone', 'barrier'
]
metainfo = dict(classes=class_names)
# Input modality for nuScenes dataset, this is consistent with the submission
# format which requires the information in input_modality.
input_modality = dict(use_lidar=False, use_camera=True)

# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)

# data_root = 's3://openmmlab/datasets/detection3d/nuscenes/'

# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None

train_pipeline = [
    dict(type=LoadImageFromFileMono3D, backend_args=backend_args),
    dict(
        type=LoadAnnotations3D,
        with_bbox=True,
        with_label=True,
        with_attr_label=True,
        with_bbox_3d=True,
        with_label_3d=True,
        with_bbox_depth=True),
    dict(type=Resize, scale=(1600, 900), keep_ratio=True),
    dict(type=RandomFlip3D, flip_ratio_bev_horizontal=0.5),
    dict(
        type=Pack3DDetInputs,
        keys=[
            'img', 'gt_bboxes', 'gt_bboxes_labels', 'attr_labels',
            'gt_bboxes_3d', 'gt_labels_3d', 'centers_2d', 'depths'
        ]),
]
test_pipeline = [
    dict(type=LoadImageFromFileMono3D, backend_args=backend_args),
    dict(type=Resize, scale=(1600, 900), keep_ratio=True),
    dict(type=Pack3DDetInputs, keys=['img'])
]
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type=DefaultSampler, shuffle=True),
    dataset=dict(
        type=NuScenesDataset,
        data_root=data_root,
        data_prefix=dict(
            pts='',
            CAM_FRONT='samples/CAM_FRONT',
            CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT',
            CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT',
            CAM_BACK='samples/CAM_BACK',
            CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT',
            CAM_BACK_LEFT='samples/CAM_BACK_LEFT'),
        ann_file='nuscenes_infos_train.pkl',
        load_type='mv_image_based',
        pipeline=train_pipeline,
        metainfo=metainfo,
        modality=input_modality,
        test_mode=False,
        # we use box_type_3d='Camera' in monocular 3d
        # detection task
        box_type_3d='Camera',
        use_valid_flag=True,
        backend_args=backend_args))
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=NuScenesDataset,
        data_root=data_root,
        data_prefix=dict(
            pts='',
            CAM_FRONT='samples/CAM_FRONT',
            CAM_FRONT_LEFT='samples/CAM_FRONT_LEFT',
            CAM_FRONT_RIGHT='samples/CAM_FRONT_RIGHT',
            CAM_BACK='samples/CAM_BACK',
            CAM_BACK_RIGHT='samples/CAM_BACK_RIGHT',
            CAM_BACK_LEFT='samples/CAM_BACK_LEFT'),
        ann_file='nuscenes_infos_val.pkl',
        load_type='mv_image_based',
        pipeline=test_pipeline,
        modality=input_modality,
        metainfo=metainfo,
        test_mode=True,
        box_type_3d='Camera',
        use_valid_flag=True,
        backend_args=backend_args))
test_dataloader = val_dataloader

val_evaluator = dict(
    type=NuScenesMetric,
    data_root=data_root,
    ann_file=data_root + 'nuscenes_infos_val.pkl',
    metric='bbox',
    backend_args=backend_args)
test_evaluator = val_evaluator

vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
    type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
mmdetection3d/mmdet3d/configs/_base_/datasets/s3dis_3d.py
# Copyright (c) OpenMMLab. All rights reserved.
from
mmengine.dataset.dataset_wrapper
import
ConcatDataset
,
RepeatDataset
from
mmengine.dataset.sampler
import
DefaultSampler
from
mmengine.visualization.vis_backend
import
LocalVisBackend
from
mmdet3d.datasets.s3dis_dataset
import
S3DISDataset
from
mmdet3d.datasets.transforms.formating
import
Pack3DDetInputs
from
mmdet3d.datasets.transforms.loading
import
(
LoadAnnotations3D
,
LoadPointsFromFile
,
NormalizePointsColor
)
from
mmdet3d.datasets.transforms.test_time_aug
import
MultiScaleFlipAug3D
from
mmdet3d.datasets.transforms.transforms_3d
import
(
GlobalRotScaleTrans
,
PointSample
,
RandomFlip3D
)
from
mmdet3d.evaluation.metrics.indoor_metric
import
IndoorMetric
from
mmdet3d.visualization.local_visualizer
import
Det3DLocalVisualizer
# dataset settings
dataset_type
=
'S3DISDataset'
data_root
=
'data/s3dis/'
# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)
# data_root = 's3://openmmlab/datasets/detection3d/s3dis/'
# Method 2: Use backend_args, file_client_args in versions before 1.1.0
# backend_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/': 's3://openmmlab/datasets/detection3d/',
# 'data/': 's3://openmmlab/datasets/detection3d/'
# }))
backend_args
=
None
metainfo
=
dict
(
classes
=
(
'table'
,
'chair'
,
'sofa'
,
'bookcase'
,
'board'
))
train_area
=
[
1
,
2
,
3
,
4
,
6
]
test_area
=
5
train_pipeline
=
[
dict
(
type
=
LoadPointsFromFile
,
coord_type
=
'DEPTH'
,
shift_height
=
False
,
use_color
=
True
,
load_dim
=
6
,
use_dim
=
[
0
,
1
,
2
,
3
,
4
,
5
],
backend_args
=
backend_args
),
dict
(
type
=
LoadAnnotations3D
,
with_bbox_3d
=
True
,
with_label_3d
=
True
),
dict
(
type
=
PointSample
,
num_points
=
100000
),
dict
(
type
=
RandomFlip3D
,
sync_2d
=
False
,
flip_ratio_bev_horizontal
=
0.5
,
flip_ratio_bev_vertical
=
0.5
),
dict
(
type
=
GlobalRotScaleTrans
,
rot_range
=
[
-
0.087266
,
0.087266
],
scale_ratio_range
=
[
0.9
,
1.1
],
translation_std
=
[.
1
,
.
1
,
.
1
],
shift_height
=
False
),
dict
(
type
=
NormalizePointsColor
,
color_mean
=
None
),
dict
(
type
=
Pack3DDetInputs
,
keys
=
[
'points'
,
'gt_bboxes_3d'
,
'gt_labels_3d'
])
]
test_pipeline
=
[
dict
(
type
=
LoadPointsFromFile
,
coord_type
=
'DEPTH'
,
shift_height
=
False
,
use_color
=
True
,
load_dim
=
6
,
use_dim
=
[
0
,
1
,
2
,
3
,
4
,
5
],
backend_args
=
backend_args
),
dict
(
type
=
MultiScaleFlipAug3D
,
img_scale
=
(
1333
,
800
),
pts_scale_ratio
=
1
,
flip
=
False
,
transforms
=
[
dict
(
type
=
GlobalRotScaleTrans
,
rot_range
=
[
0
,
0
],
scale_ratio_range
=
[
1.
,
1.
],
translation_std
=
[
0
,
0
,
0
]),
dict
(
type
=
RandomFlip3D
,
sync_2d
=
False
,
flip_ratio_bev_horizontal
=
0.5
,
flip_ratio_bev_vertical
=
0.5
),
dict
(
type
=
PointSample
,
num_points
=
100000
),
dict
(
type
=
NormalizePointsColor
,
color_mean
=
None
),
]),
dict
(
type
=
Pack3DDetInputs
,
keys
=
[
'points'
])
]
train_dataloader = dict(
    batch_size=8,
    num_workers=4,
    sampler=dict(type=DefaultSampler, shuffle=True),
    dataset=dict(
        type=RepeatDataset,
        times=13,
        dataset=dict(
            type=ConcatDataset,
            datasets=[
                dict(
                    type=S3DISDataset,
                    data_root=data_root,
                    ann_file=f's3dis_infos_Area_{i}.pkl',
                    pipeline=train_pipeline,
                    filter_empty_gt=True,
                    metainfo=metainfo,
                    box_type_3d='Depth',
                    backend_args=backend_args) for i in train_area
            ])))
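# RepeatDataset iterates the concatenated five training areas 13 times within
# one epoch, which avoids repeatedly restarting dataloader workers between
# otherwise very short epochs (see mmengine's dataset wrappers).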
val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=S3DISDataset,
        data_root=data_root,
        ann_file=f's3dis_infos_Area_{test_area}.pkl',
        pipeline=test_pipeline,
        metainfo=metainfo,
        test_mode=True,
        box_type_3d='Depth',
        backend_args=backend_args))
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=S3DISDataset,
        data_root=data_root,
        ann_file=f's3dis_infos_Area_{test_area}.pkl',
        pipeline=test_pipeline,
        metainfo=metainfo,
        test_mode=True,
        box_type_3d='Depth',
        backend_args=backend_args))
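# Area 5 serves as both the validation and the test set here, hence the
# identical val/test dataloaders.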
val_evaluator = dict(type=IndoorMetric)
test_evaluator = val_evaluator

vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
    type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')
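# Usage sketch (illustrative, not part of this file): configs in the
# pure-Python style typically inherit this base via mmengine's read_base,
# e.g.
#   from mmengine.config import read_base
#   with read_base():
#       from .._base_.datasets.s3dis_3d import *
# after which any symbol above (train_dataloader, visualizer, ...) can be
# overridden in the child config.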
mmdetection3d/mmdet3d/configs/_base_/datasets/s3dis_seg.py
new file mode 100644
# Copyright (c) OpenMMLab. All rights reserved.
from mmcv.transforms.processing import TestTimeAug
from mmengine.dataset.sampler import DefaultSampler
from mmengine.visualization.vis_backend import LocalVisBackend

from mmdet3d.datasets.s3dis_dataset import S3DISSegDataset
from mmdet3d.datasets.transforms.formating import Pack3DDetInputs
from mmdet3d.datasets.transforms.loading import (LoadAnnotations3D,
                                                 LoadPointsFromFile,
                                                 NormalizePointsColor,
                                                 PointSegClassMapping)
from mmdet3d.datasets.transforms.transforms_3d import (IndoorPatchPointSample,
                                                       RandomFlip3D)
from mmdet3d.evaluation.metrics.seg_metric import SegMetric
from mmdet3d.models.segmentors.seg3d_tta import Seg3DTTAModel
from mmdet3d.visualization.local_visualizer import Det3DLocalVisualizer

# For S3DIS seg we usually do 13-class segmentation
class_names = ('ceiling', 'floor', 'wall', 'beam', 'column', 'window', 'door',
               'table', 'chair', 'sofa', 'bookcase', 'board', 'clutter')
metainfo = dict(classes=class_names)
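# len(class_names) == 13, so the extra index 13 is reserved as the ignore
# label for points that should not contribute to the loss or metrics (see the
# ignore_index arguments below).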
dataset_type = 'S3DISSegDataset'
data_root = 'data/s3dis/'

input_modality = dict(use_lidar=True, use_camera=False)
data_prefix = dict(
    pts='points',
    pts_instance_mask='instance_mask',
    pts_semantic_mask='semantic_mask')

# Example of using a different file client.
# Method 1: simply set the data root and let the file I/O module
# infer the backend from the path prefix (LMDB and Memcached are
# not supported yet).
# data_root = 's3://openmmlab/datasets/detection3d/s3dis/'
# Method 2: use backend_args (file_client_args in versions before 1.1.0):
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/detection3d/',
#         'data/': 's3://openmmlab/datasets/detection3d/'
#     }))
backend_args = None

num_points = 4096
train_area = [1, 2, 3, 4, 6]
test_area = 5
train_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5],
        backend_args=backend_args),
    dict(
        type=LoadAnnotations3D,
        with_bbox_3d=False,
        with_label_3d=False,
        with_mask_3d=False,
        with_seg_3d=True,
        backend_args=backend_args),
    dict(type=PointSegClassMapping),
    dict(
        type=IndoorPatchPointSample,
        num_points=num_points,
        block_size=1.0,
        ignore_index=len(class_names),
        use_normalized_coord=True,
        enlarge_size=0.2,
        min_unique_num=None),
    dict(type=NormalizePointsColor, color_mean=None),
    dict(type=Pack3DDetInputs, keys=['points', 'pts_semantic_mask'])
]
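# IndoorPatchPointSample crops a block_size x block_size (1 m x 1 m) patch in
# the ground plane and samples num_points (4096) points from it; with
# use_normalized_coord=True the normalized xyz coordinates are appended as
# extra point features.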
test_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5],
        backend_args=backend_args),
    dict(
        type=LoadAnnotations3D,
        with_bbox_3d=False,
        with_label_3d=False,
        with_mask_3d=False,
        with_seg_3d=True,
        backend_args=backend_args),
    dict(type=NormalizePointsColor, color_mean=None),
    dict(type=Pack3DDetInputs, keys=['points'])
]
# Construct a pipeline for data and GT loading in the show function.
# Keep its loading behaviour consistent with test_pipeline (e.g. the same
# file client); note that the GT seg mask is needed there as well.
eval_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5],
        backend_args=backend_args),
    dict(type=NormalizePointsColor, color_mean=None),
    dict(type=Pack3DDetInputs, keys=['points'])
]
tta_pipeline = [
    dict(
        type=LoadPointsFromFile,
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5],
        backend_args=backend_args),
    dict(
        type=LoadAnnotations3D,
        with_bbox_3d=False,
        with_label_3d=False,
        with_mask_3d=False,
        with_seg_3d=True,
        backend_args=backend_args),
    dict(type=NormalizePointsColor, color_mean=None),
    dict(
        type=TestTimeAug,
        transforms=[[
            dict(
                type=RandomFlip3D,
                sync_2d=False,
                flip_ratio_bev_horizontal=0.,
                flip_ratio_bev_vertical=0.)
        ], [dict(type=Pack3DDetInputs, keys=['points'])]])
]
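# TestTimeAug yields one output per combination of the nested transform lists;
# with both flip ratios set to 0. this base config effectively performs a
# single un-flipped pass, and task-specific configs may override the list to
# add real augmentations.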
# train on areas 1, 2, 3, 4, 6; test on area 5
train_dataloader = dict(
    batch_size=8,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type=DefaultSampler, shuffle=True),
    dataset=dict(
        type=S3DISSegDataset,
        data_root=data_root,
        ann_files=[f's3dis_infos_Area_{i}.pkl' for i in train_area],
        metainfo=metainfo,
        data_prefix=data_prefix,
        pipeline=train_pipeline,
        modality=input_modality,
        ignore_index=len(class_names),
        scene_idxs=[
            f'seg_info/Area_{i}_resampled_scene_idxs.npy' for i in train_area
        ],
        test_mode=False,
        backend_args=backend_args))
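# The scene_idxs files, produced by the S3DIS data-preparation scripts,
# resample scenes (roughly in proportion to their point counts) so that patch
# sampling visits large rooms more often.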
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type=DefaultSampler, shuffle=False),
    dataset=dict(
        type=S3DISSegDataset,
        data_root=data_root,
        ann_files=f's3dis_infos_Area_{test_area}.pkl',
        metainfo=metainfo,
        data_prefix=data_prefix,
        pipeline=test_pipeline,
        modality=input_modality,
        ignore_index=len(class_names),
        scene_idxs=f'seg_info/Area_{test_area}_resampled_scene_idxs.npy',
        test_mode=True,
        backend_args=backend_args))
val_dataloader = test_dataloader
val_evaluator = dict(type=SegMetric)
test_evaluator = val_evaluator

vis_backends = [dict(type=LocalVisBackend)]
visualizer = dict(
    type=Det3DLocalVisualizer, vis_backends=vis_backends, name='visualizer')

tta_model = dict(type=Seg3DTTAModel)
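# Usage sketch (illustrative): with tta_model and tta_pipeline defined,
# test-time augmentation is typically switched on from the command line,
# e.g.
#   python tools/test.py ${CONFIG} ${CHECKPOINT} --tta
# (flag name as in recent MMDetection3D releases; check your version's
# tools/test.py if it differs).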