OpenDAS / mmdetection3d · Commit 32a4328b (unverified)

Bump version to V1.0.0rc0

Authored Feb 24, 2022 by Wenwei Zhang; committed via GitHub on Feb 24, 2022.
Parents: 86cc487c, a8817998
Changes: 414 in total. Showing 20 changed files with 1014 additions and 260 deletions (+1014, −260).
- docs/zh_cn/faq.md (+0, −6)
- docs/zh_cn/getting_started.md (+1, −0)
- docs/zh_cn/model_zoo.md (+28, −0)
- docs/zh_cn/stat.py (+2, −1)
- docs/zh_cn/tutorials/index.rst (+1, −0)
- docs/zh_cn/useful_tools.md (+65, −1)
- mmdet3d/apis/__init__.py (+3, −2)
- mmdet3d/apis/inference.py (+60, −31)
- mmdet3d/apis/test.py (+4, −3)
- mmdet3d/apis/train.py (+36, −0)
- mmdet3d/core/anchor/anchor_3d_generator.py (+55, −40)
- mmdet3d/core/bbox/__init__.py (+3, −2)
- mmdet3d/core/bbox/box_np_ops.py (+82, −151)
- mmdet3d/core/bbox/coders/__init__.py (+8, −1)
- mmdet3d/core/bbox/coders/anchor_free_bbox_coder.py (+1, −1)
- mmdet3d/core/bbox/coders/centerpoint_bbox_coders.py (+11, −10)
- mmdet3d/core/bbox/coders/delta_xyzwhlr_bbox_coder.py (+6, −6)
- mmdet3d/core/bbox/coders/fcos3d_bbox_coder.py (+127, −0, new file)
- mmdet3d/core/bbox/coders/groupfree3d_bbox_coder.py (+6, −5)
- mmdet3d/core/bbox/coders/monoflex_bbox_coder.py (+515, −0)
docs/zh_cn/faq.md (view file @ 32a4328b)

```diff
@@ -19,12 +19,6 @@
 **Note**: We have fully supported pycocotools in versions 0.13.0 and later.

-If you meet the issue below and your environment contains numba == 0.48.0 and numpy >= 1.20.0:
-
-``TypeError: expected dtype object, got 'numpy.dtype[bool_]'``
-
-please downgrade numpy to < 1.20.0, or install numba == 0.48 from source, because numpy == 1.20.0 changed its API such that calling `np.dtype` produces a subclass. Please refer to [here](https://github.com/numba/numba/issues/6041) for more details.
-
 If you face the error below when importing pycocotools-related packages:

 ``ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject``
...
```
docs/zh_cn/getting_started.md (view file @ 32a4328b)

```diff
@@ -10,6 +10,7 @@
 | MMDetection3D version | MMDetection version | MMSegmentation version | MMCV version |
 |:-------------------:|:-------------------:|:-------------------:|:-------------------:|
 | master | mmdet>=2.19.0, <=3.0.0 | mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.3.8, <=1.5.0 |
+| v1.0.0rc0 | mmdet>=2.19.0, <=3.0.0 | mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.3.8, <=1.5.0 |
 | 0.18.1 | mmdet>=2.19.0, <=3.0.0 | mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.3.8, <=1.5.0 |
 | 0.18.0 | mmdet>=2.19.0, <=3.0.0 | mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.3.8, <=1.5.0 |
 | 0.17.3 | mmdet>=2.14.0, <=3.0.0 | mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0 |
...
```
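As a quick sanity check against the new v1.0.0rc0 row above, a minimal sketch (assuming `mmdet`, `mmseg` and `mmcv-full` are installed and the `packaging` library is available) could be:

```python
# Minimal sketch: check that the installed versions fall inside the
# v1.0.0rc0 row of the compatibility table above. The bounds are copied
# from the table; all imports are real packages.
import mmcv
import mmdet
import mmseg
from packaging import version

v = version.parse
assert v('2.19.0') <= v(mmdet.__version__) <= v('3.0.0'), mmdet.__version__
assert v('0.20.0') <= v(mmseg.__version__) <= v('1.0.0'), mmseg.__version__
assert v('1.3.8') <= v(mmcv.__version__) <= v('1.5.0'), mmcv.__version__
print('Installed versions match the MMDetection3D v1.0.0rc0 requirements.')
```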
docs/zh_cn/model_zoo.md (view file @ 32a4328b)

```diff
@@ -75,3 +75,31 @@
 ### ImVoxelNet

 Please refer to [ImVoxelNet](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/imvoxelnet) for more details. We provide results on the KITTI dataset.
+
+### PAConv
+
+Please refer to [PAConv](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/paconv) for more details. We provide results on the S3DIS dataset.
+
+### DGCNN
+
+Please refer to [DGCNN](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/dgcnn) for more details. We provide results on the S3DIS dataset.
+
+### SMOKE
+
+Please refer to [SMOKE](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/smoke) for more details. We provide results on the KITTI dataset.
+
+### PGD
+
+Please refer to [PGD](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/pgd) for more details. We provide results on the KITTI and nuScenes datasets.
+
+### PointRCNN
+
+Please refer to [PointRCNN](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/point_rcnn) for more details. We provide results on the KITTI dataset.
+
+### MonoFlex
+
+Please refer to [MonoFlex](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/monoflex) for more details. We provide results on the KITTI dataset.
+
+### Mixed Precision (FP16) Training
+
+Please refer to the [Mixed Precision (FP16) Training on PointPillars example](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/pointpillars/hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d.py) for more details.
```
docs/zh_cn/stat.py (view file @ 32a4328b)

```diff
 #!/usr/bin/env python
 import functools as func
 import glob
-import numpy as np
 import re
 from os import path as osp

+import numpy as np
+
 url_prefix = 'https://github.com/open-mmlab/mmdetection3d/blob/master/'

 files = sorted(glob.glob('../configs/*/README.md'))
...
```
docs/zh_cn/tutorials/index.rst (view file @ 32a4328b)

```diff
@@ -6,3 +6,4 @@
    data_pipeline.md
    customize_models.md
    customize_runtime.md
+   coord_sys_tutorial.md
```
docs/zh_cn/useful_tools.md (view file @ 32a4328b)

````diff
@@ -71,7 +71,7 @@ python tools/test.py ${CONFIG_FILE} ${CKPT_PATH} --show --show-dir ${SHOW_DIR}
 python tools/test.py ${CONFIG_FILE} ${CKPT_PATH} --eval 'mAP' --eval-options 'show=True' 'out_dir=${SHOW_DIR}'
 ```
-After running this command, you will find the input data, the network outputs and the ground-truth labels visualized on the input (e.g. `***_points.obj`, `***_pred.obj`, `***_gt.obj`, `***_img.png` and `***_pred.png` in multi-modality detection tasks) in `${SHOW_DIR}`. When `show` is enabled, [Open3D](http://www.open3d.org/) is used to visualize the results online. You need to set `show=False` when running tests on a remote server without a GUI.
+After running this command, you will find the input data, the network outputs and the ground-truth labels visualized on the input (e.g. `***_points.obj`, `***_pred.obj`, `***_gt.obj`, `***_img.png` and `***_pred.png` in multi-modality detection tasks) in `${SHOW_DIR}`. When `show` is enabled, [Open3D](http://www.open3d.org/) is used to visualize the results online. Online visualization is not possible when you run tests on a remote server without a GUI; in that case you can set `show=False` to save the output results in `{SHOW_DIR}`.

 As for offline visualization, you have two options.
 To visualize the results with the `Open3D` backend, you can run the following command
...
@@ -97,6 +97,12 @@ python tools/misc/browse_dataset.py configs/_base_/datasets/kitti-3d-3class.py -
 **Note**: Once `--output-dir` is specified, images of the user-specified views will be saved when pressing `_ESC_` in the open3d window. If you don't have a monitor, you can remove the `--online` flag to only save the visualization results and browse them offline.

+To verify the consistency of the data and the effect of data augmentation, you can also add the `--aug` flag to visualize the augmented data with the following command:
+
+```shell
+python tools/misc/browse_dataset.py configs/_base_/datasets/kitti-3d-3class.py --task det --aug --output-dir ${OUTPUT_DIR} --online
+```
+
 If you also want to show 2D images with the projected 3D bounding boxes, you need to find a config file that supports multi-modality data loading, and then change the `--task` argument to `multi_modality-det`. An example is shown below

 ```shell
...
@@ -123,6 +129,64 @@ python tools/misc/browse_dataset.py configs/_base_/datasets/nus-mono3d.py --task

+# Model Deployment
+
+**Note**: This tool is still experimental. For now only SECOND is supported for deployment with [`TorchServe`](https://pytorch.org/serve/); we will support more models in the future.
+
+To deploy an `MMDetection3D` model with [`TorchServe`](https://pytorch.org/serve/), you can follow the steps below:
+
+## 1. Convert the model from MMDetection3D to TorchServe
+
+```shell
+python tools/deployment/mmdet3d2torchserve.py ${CONFIG_FILE} ${CHECKPOINT_FILE} \
+    --output-folder ${MODEL_STORE} \
+    --model-name ${MODEL_NAME}
+```
+
+**Note**: ${MODEL_STORE} needs to be an absolute path to a folder.
+
+## 2. Build the `mmdet3d-serve` docker image
+
+```shell
+docker build -t mmdet3d-serve:latest docker/serve/
+```
+
+## 3. Run `mmdet3d-serve`
+
+Check the official docs on [running TorchServe with docker](https://github.com/pytorch/serve/blob/master/docker/README.md#running-torchserve-in-a-production-docker-environment).
+
+In order to run on GPUs, you need to install [nvidia-docker](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). You can omit the `--gpus` argument in order to run on CPU.
+
+Example:
+
+```shell
+docker run --rm \
+  --cpus 8 \
+  --gpus device=0 \
+  -p8080:8080 -p8081:8081 -p8082:8082 \
+  --mount type=bind,source=$MODEL_STORE,target=/home/model-server/model-store \
+  mmdet3d-serve:latest
+```
+
+[Read the docs](https://github.com/pytorch/serve/blob/072f5d088cce9bb64b2a18af065886c9b01b317b/docs/rest_api.md/) about the Inference (8080), Management (8081) and Metrics (8082) APIs.
+
+## 4. Test deployment
+
+You can use `test_torchserver.py` to test the deployment and compare the results of torchserver and pytorch.
+
+```shell
+python tools/deployment/test_torchserver.py ${IMAGE_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} ${MODEL_NAME} [--inference-addr ${INFERENCE_ADDR}] [--device ${DEVICE}] [--score-thr ${SCORE_THR}]
+```
+
+Example:
+
+```shell
+python tools/deployment/test_torchserver.py demo/data/kitti/kitti_000008.bin configs/second/hv_second_secfpn_6x8_80e_kitti-3d-car.py checkpoints/hv_second_secfpn_6x8_80e_kitti-3d-car_20200620_230238-393f000c.pth second
+```
+
 # Model Complexity

 You can use the script `tools/analysis_tools/get_flops.py` in MMDetection, based on [flops-counter.pytorch](https://github.com/sovrasov/flops-counter.pytorch), to compute the FLOPS and params of a given model.
...
````
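For illustration, the deployed model can also be queried directly from Python over TorchServe's standard Inference API. A minimal sketch (assuming the container from step 3 is running locally and `second` is the `${MODEL_NAME}` registered in step 1; `predictions/{model_name}` is TorchServe's standard REST route):

```python
# Minimal sketch: POST a KITTI point-cloud .bin file to the TorchServe
# Inference API (port 8080) and print the handler's JSON response.
import requests

url = 'http://127.0.0.1:8080/predictions/second'  # {model_name} from step 1
with open('demo/data/kitti/kitti_000008.bin', 'rb') as f:
    response = requests.post(url, data=f)
print(response.json())  # predicted 3D boxes and scores
```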
mmdet3d/apis/__init__.py (view file @ 32a4328b)

```diff
@@ -4,10 +4,11 @@ from .inference import (convert_SyncBN, inference_detector,
                         inference_multi_modality_detector, inference_segmentor,
                         init_model, show_result_meshlab)
 from .test import single_gpu_test
-from .train import train_model
+from .train import init_random_seed, train_model

 __all__ = [
     'inference_detector', 'init_model', 'single_gpu_test',
     'inference_mono_3d_detector', 'show_result_meshlab', 'convert_SyncBN',
-    'train_model', 'inference_multi_modality_detector', 'inference_segmentor'
+    'train_model', 'inference_multi_modality_detector', 'inference_segmentor',
+    'init_random_seed'
 ]
```
mmdet3d/apis/inference.py (view file @ 32a4328b)

```diff
 # Copyright (c) OpenMMLab. All rights reserved.
+import re
+from copy import deepcopy
+from os import path as osp
+
 import mmcv
 import numpy as np
-import re
 import torch
-from copy import deepcopy
 from mmcv.parallel import collate, scatter
 from mmcv.runner import load_checkpoint
-from os import path as osp

 from mmdet3d.core import (Box3DMode, CameraInstance3DBoxes, Coord3DMode,
                           DepthInstance3DBoxes, LiDARInstance3DBoxes,
                           show_multi_modality_result, show_result,
                           show_seg_result)
...
@@ -83,26 +84,53 @@ def inference_detector(model, pcd):
     """
     cfg = model.cfg
     device = next(model.parameters()).device  # model device

+    if not isinstance(pcd, str):
+        cfg = cfg.copy()
+        # set loading pipeline type
+        cfg.data.test.pipeline[0].type = 'LoadPointsFromDict'
+
     # build the data pipeline
     test_pipeline = deepcopy(cfg.data.test.pipeline)
     test_pipeline = Compose(test_pipeline)
     box_type_3d, box_mode_3d = get_box_type(cfg.data.test.box_type_3d)
-    data = dict(
-        pts_filename=pcd,
-        box_type_3d=box_type_3d,
-        box_mode_3d=box_mode_3d,
-        # for ScanNet demo we need axis_align_matrix
-        ann_info=dict(axis_align_matrix=np.eye(4)),
-        sweeps=[],
-        # set timestamp = 0
-        timestamp=[0],
-        img_fields=[],
-        bbox3d_fields=[],
-        pts_mask_fields=[],
-        pts_seg_fields=[],
-        bbox_fields=[],
-        mask_fields=[],
-        seg_fields=[])
+
+    if isinstance(pcd, str):
+        # load from point clouds file
+        data = dict(
+            pts_filename=pcd,
+            box_type_3d=box_type_3d,
+            box_mode_3d=box_mode_3d,
+            # for ScanNet demo we need axis_align_matrix
+            ann_info=dict(axis_align_matrix=np.eye(4)),
+            sweeps=[],
+            # set timestamp = 0
+            timestamp=[0],
+            img_fields=[],
+            bbox3d_fields=[],
+            pts_mask_fields=[],
+            pts_seg_fields=[],
+            bbox_fields=[],
+            mask_fields=[],
+            seg_fields=[])
+    else:
+        # load from http
+        data = dict(
+            points=pcd,
+            box_type_3d=box_type_3d,
+            box_mode_3d=box_mode_3d,
+            # for ScanNet demo we need axis_align_matrix
+            ann_info=dict(axis_align_matrix=np.eye(4)),
+            sweeps=[],
+            # set timestamp = 0
+            timestamp=[0],
+            img_fields=[],
+            bbox3d_fields=[],
+            pts_mask_fields=[],
+            pts_seg_fields=[],
+            bbox_fields=[],
+            mask_fields=[],
+            seg_fields=[])
     data = test_pipeline(data)
     data = collate([data], samples_per_gpu=1)
     if next(model.parameters()).is_cuda:
...
@@ -317,8 +345,7 @@ def show_det_result_meshlab(data,
     # for now we convert points into depth mode
     box_mode = data['img_metas'][0][0]['box_mode_3d']
     if box_mode != Box3DMode.DEPTH:
-        points = points[..., [1, 0, 2]]
-        points[..., 0] *= -1
+        points = Coord3DMode.convert(points, box_mode, Coord3DMode.DEPTH)
         show_bboxes = Box3DMode.convert(pred_bboxes, box_mode, Box3DMode.DEPTH)
     else:
         show_bboxes = deepcopy(pred_bboxes)
...
@@ -462,15 +489,17 @@ def show_result_meshlab(data,
         data (dict): Contain data from pipeline.
         result (dict): Predicted result from model.
         out_dir (str): Directory to save visualized result.
-        score_thr (float): Minimum score of bboxes to be shown. Default: 0.0
-        show (bool): Visualize the results online. Defaults to False.
-        snapshot (bool): Whether to save the online results. Defaults to False.
-        task (str): Distinguish which task result to visualize. Currently we
-            support 3D detection, multi-modality detection and 3D segmentation.
-            Defaults to 'det'.
-        palette (list[list[int]]] | np.ndarray | None): The palette of
-            segmentation map. If None is given, random palette will be
-            generated. Defaults to None.
+        score_thr (float, optional): Minimum score of bboxes to be shown.
+            Default: 0.0
+        show (bool, optional): Visualize the results online. Defaults to False.
+        snapshot (bool, optional): Whether to save the online results.
+            Defaults to False.
+        task (str, optional): Distinguish which task result to visualize.
+            Currently we support 3D detection, multi-modality detection and
+            3D segmentation. Defaults to 'det'.
+        palette (list[list[int]]] | np.ndarray, optional): The palette
+            of segmentation map. If None is given, random palette will be
+            generated. Defaults to None.
     """
     assert task in ['det', 'multi_modality-det', 'seg', 'mono-det'], \
         f'unsupported visualization task {task}'
...
```
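The practical effect of the `inference_detector` change above is that the function now accepts either a point-cloud file path or an already-loaded points array. A usage sketch (the config and checkpoint paths are placeholders, not files shipped with this commit):

```python
# Sketch of the two input modes of inference_detector after this change.
import numpy as np
from mmdet3d.apis import inference_detector, init_model

config = 'configs/second/hv_second_secfpn_6x8_80e_kitti-3d-car.py'
model = init_model(config, 'checkpoints/second_kitti.pth', device='cuda:0')

# Mode 1: a file path -> the test pipeline starts with LoadPointsFromFile.
result, data = inference_detector(model, 'demo/data/kitti/kitti_000008.bin')

# Mode 2: an in-memory array -> the first pipeline step is swapped to
# 'LoadPointsFromDict' automatically (the new `not isinstance(pcd, str)` branch).
points = np.fromfile('demo/data/kitti/kitti_000008.bin',
                     dtype=np.float32).reshape(-1, 4)
result, data = inference_detector(model, points)
```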
mmdet3d/apis/test.py (view file @ 32a4328b)

```diff
 # Copyright (c) OpenMMLab. All rights reserved.
+from os import path as osp
+
 import mmcv
 import torch
 from mmcv.image import tensor2imgs
-from os import path as osp

 from mmdet3d.models import (Base3DDetector, Base3DSegmentor,
                             SingleStageMono3DDetector)
...
@@ -22,9 +23,9 @@ def single_gpu_test(model,
     Args:
         model (nn.Module): Model to be tested.
         data_loader (nn.Dataloader): Pytorch data loader.
-        show (bool): Whether to save viualization results.
+        show (bool, optional): Whether to save visualization results.
             Default: True.
-        out_dir (str): The path to save visualization results.
+        out_dir (str, optional): The path to save visualization results.
             Default: None.

     Returns:
...
```
mmdet3d/apis/train.py (view file @ 32a4328b)

```diff
 # Copyright (c) OpenMMLab. All rights reserved.
+import numpy as np
+import torch
+from mmcv.runner import get_dist_info
+from torch import distributed as dist
+
 from mmdet.apis import train_detector
 from mmseg.apis import train_segmentor


+def init_random_seed(seed=None, device='cuda'):
+    """Initialize random seed.
+
+    If the seed is not set, the seed will be automatically randomized,
+    and then broadcast to all processes to prevent some potential bugs.
+
+    Args:
+        seed (int, optional): The seed. Default to None.
+        device (str, optional): The device where the seed will be put on.
+            Default to 'cuda'.
+
+    Returns:
+        int: Seed to be used.
+    """
+    if seed is not None:
+        return seed
+
+    # Make sure all ranks share the same random seed to prevent
+    # some potential bugs. Please refer to
+    # https://github.com/open-mmlab/mmdetection/issues/6339
+    rank, world_size = get_dist_info()
+    seed = np.random.randint(2**31)
+    if world_size == 1:
+        return seed
+
+    if rank == 0:
+        random_num = torch.tensor(seed, dtype=torch.int32, device=device)
+    else:
+        random_num = torch.tensor(0, dtype=torch.int32, device=device)
+    dist.broadcast(random_num, src=0)
+    return random_num.item()
+
+
 def train_model(model,
                 dataset,
                 cfg,
...
```
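A brief sketch of how the new helper is typically used in a training entry point (the `args` namespace here is hypothetical; `set_random_seed` is the existing mmdet helper):

```python
# Sketch: derive one shared seed for all ranks, then seed every RNG with it.
from mmdet.apis import set_random_seed
from mmdet3d.apis import init_random_seed

def setup_seed(args):
    # With --seed unset, rank 0 draws a random seed and broadcasts it, so
    # every process samples identical augmentations where that matters.
    seed = init_random_seed(args.seed)
    set_random_seed(seed, deterministic=args.deterministic)
    return seed
```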
mmdet3d/core/anchor/anchor_3d_generator.py (view file @ 32a4328b)

```diff
@@ -19,20 +19,26 @@ class Anchor3DRangeGenerator(object):
         ranges (list[list[float]]): Ranges of different anchors.
             The ranges are the same across different feature levels. But may
             vary for different anchor sizes if size_per_range is True.
-        sizes (list[list[float]]): 3D sizes of anchors.
-        scales (list[int]): Scales of anchors in different feature levels.
-        rotations (list[float]): Rotations of anchors in a feature grid.
-        custom_values (tuple[float]): Customized values of that anchor. For
-            example, in nuScenes the anchors have velocities.
-        reshape_out (bool): Whether to reshape the output into (N x 4).
-        size_per_range: Whether to use separate ranges for different sizes.
-            If size_per_range is True, the ranges should have the same length
-            as the sizes, if not, it will be duplicated.
+        sizes (list[list[float]], optional): 3D sizes of anchors.
+            Defaults to [[3.9, 1.6, 1.56]].
+        scales (list[int], optional): Scales of anchors in different feature
+            levels. Defaults to [1].
+        rotations (list[float], optional): Rotations of anchors in a feature
+            grid. Defaults to [0, 1.5707963].
+        custom_values (tuple[float], optional): Customized values of that
+            anchor. For example, in nuScenes the anchors have velocities.
+            Defaults to ().
+        reshape_out (bool, optional): Whether to reshape the output into
+            (N x 4). Defaults to True.
+        size_per_range (bool, optional): Whether to use separate ranges for
+            different sizes. If size_per_range is True, the ranges should have
+            the same length as the sizes, if not, it will be duplicated.
+            Defaults to True.
     """

     def __init__(self,
                  ranges,
-                 sizes=[[1.6, 3.9, 1.56]],
+                 sizes=[[3.9, 1.6, 1.56]],
                  scales=[1],
                  rotations=[0, 1.5707963],
                  custom_values=(),
...
@@ -86,13 +92,14 @@ class Anchor3DRangeGenerator(object):
         Args:
             featmap_sizes (list[tuple]): List of feature map sizes in
                 multiple feature levels.
-            device (str): Device where the anchors will be put on.
+            device (str, optional): Device where the anchors will be put on.
+                Defaults to 'cuda'.

         Returns:
-            list[torch.Tensor]: Anchors in multiple feature levels. \
-                The sizes of each tensor should be [N, 4], where \
-                N = width * height * num_base_anchors, width and height \
-                are the sizes of the corresponding feature lavel, \
+            list[torch.Tensor]: Anchors in multiple feature levels.
+                The sizes of each tensor should be [N, 4], where
+                N = width * height * num_base_anchors, width and height
+                are the sizes of the corresponding feature level,
                 num_base_anchors is the number of anchors for that level.
         """
         assert self.num_levels == len(featmap_sizes)
...
@@ -149,7 +156,7 @@ class Anchor3DRangeGenerator(object):
                              feature_size,
                              anchor_range,
                              scale=1,
-                             sizes=[[1.6, 3.9, 1.56]],
+                             sizes=[[3.9, 1.6, 1.56]],
                              rotations=[0, 1.5707963],
                              device='cuda'):
         """Generate anchors in a single range.
...
@@ -161,14 +168,18 @@ class Anchor3DRangeGenerator(object):
                 shape [6]. The order is consistent with that of anchors, i.e.,
                 (x_min, y_min, z_min, x_max, y_max, z_max).
             scale (float | int, optional): The scale factor of anchors.
-            sizes (list[list] | np.ndarray | torch.Tensor): Anchor size with
-                shape [N, 3], in order of x, y, z.
-            rotations (list[float] | np.ndarray | torch.Tensor): Rotations of
-                anchors in a single feature grid.
-            device (str): Devices that the anchors will be put on.
+                Defaults to 1.
+            sizes (list[list] | np.ndarray | torch.Tensor, optional):
+                Anchor size with shape [N, 3], in order of x, y, z.
+                Defaults to [[3.9, 1.6, 1.56]].
+            rotations (list[float] | np.ndarray | torch.Tensor, optional):
+                Rotations of anchors in a single feature grid.
+                Defaults to [0, 1.5707963].
+            device (str, optional): Devices that the anchors will be put on.
+                Defaults to 'cuda'.

         Returns:
-            torch.Tensor: Anchors with shape \
+            torch.Tensor: Anchors with shape
                 [*feature_size, num_sizes, num_rots, 7].
         """
         if len(feature_size) == 2:
...
@@ -231,10 +242,10 @@ class AlignedAnchor3DRangeGenerator(Anchor3DRangeGenerator):
     up corner to distribute anchors.

     Args:
-        anchor_corner (bool): Whether to align with the corner of the voxel
-            grid. By default it is False and the anchor's center will be
-            the same as the corresponding voxel's center, which is also the
-            center of the corresponding greature grid.
+        anchor_corner (bool, optional): Whether to align with the corner of
+            the voxel grid. By default it is False and the anchor's center
+            will be the same as the corresponding voxel's center, which is
+            also the center of the corresponding feature grid.
+            Defaults to False.
     """

     def __init__(self, align_corner=False, **kwargs):
...
@@ -245,7 +256,7 @@ class AlignedAnchor3DRangeGenerator(Anchor3DRangeGenerator):
                              feature_size,
                              anchor_range,
                              scale,
-                             sizes=[[1.6, 3.9, 1.56]],
+                             sizes=[[3.9, 1.6, 1.56]],
                              rotations=[0, 1.5707963],
                              device='cuda'):
         """Generate anchors in a single range.
...
@@ -256,15 +267,18 @@ class AlignedAnchor3DRangeGenerator(Anchor3DRangeGenerator):
             anchor_range (torch.Tensor | list[float]): Range of anchors with
                 shape [6]. The order is consistent with that of anchors, i.e.,
                 (x_min, y_min, z_min, x_max, y_max, z_max).
-            scale (float | int, optional): The scale factor of anchors.
-            sizes (list[list] | np.ndarray | torch.Tensor): Anchor size with
-                shape [N, 3], in order of x, y, z.
-            rotations (list[float] | np.ndarray | torch.Tensor): Rotations of
-                anchors in a single feature grid.
-            device (str): Devices that the anchors will be put on.
+            scale (float | int): The scale factor of anchors.
+            sizes (list[list] | np.ndarray | torch.Tensor, optional):
+                Anchor size with shape [N, 3], in order of x, y, z.
+                Defaults to [[3.9, 1.6, 1.56]].
+            rotations (list[float] | np.ndarray | torch.Tensor, optional):
+                Rotations of anchors in a single feature grid.
+                Defaults to [0, 1.5707963].
+            device (str, optional): Devices that the anchors will be put on.
+                Defaults to 'cuda'.

         Returns:
-            torch.Tensor: Anchors with shape \
+            torch.Tensor: Anchors with shape
                 [*feature_size, num_sizes, num_rots, 7].
         """
         if len(feature_size) == 2:
...
@@ -334,7 +348,7 @@ class AlignedAnchor3DRangeGeneratorPerCls(AlignedAnchor3DRangeGenerator):
     Note that feature maps of different classes may be different.

     Args:
-        kwargs (dict): Arguments are the same as those in \
+        kwargs (dict): Arguments are the same as those in
             :class:`AlignedAnchor3DRangeGenerator`.
     """
...
@@ -347,15 +361,16 @@ class AlignedAnchor3DRangeGeneratorPerCls(AlignedAnchor3DRangeGenerator):
         """Generate grid anchors in multiple feature levels.

         Args:
-            featmap_sizes (list[tuple]): List of feature map sizes for \
+            featmap_sizes (list[tuple]): List of feature map sizes for
                 different classes in a single feature level.
-            device (str): Device where the anchors will be put on.
+            device (str, optional): Device where the anchors will be put on.
+                Defaults to 'cuda'.

         Returns:
-            list[list[torch.Tensor]]: Anchors in multiple feature levels. \
-                Note that in this anchor generator, we currently only \
-                support single feature level. The sizes of each tensor \
-                should be [num_sizes/ranges*num_rots*featmap_size, \
+            list[list[torch.Tensor]]: Anchors in multiple feature levels.
+                Note that in this anchor generator, we currently only
+                support single feature level. The sizes of each tensor
+                should be [num_sizes/ranges*num_rots*featmap_size,
                 box_code_size].
         """
         multi_level_anchors = []
...
@@ -371,7 +386,7 @@ class AlignedAnchor3DRangeGeneratorPerCls(AlignedAnchor3DRangeGenerator):
         This function is usually called by method ``self.grid_anchors``.

         Args:
-            featmap_sizes (list[tuple]): List of feature map sizes for \
+            featmap_sizes (list[tuple]): List of feature map sizes for
                 different classes in a single feature level.
             scale (float): Scale factor of the anchors in the current level.
             device (str, optional): Device the tensor will be put on.
...
```
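The recurring `[[1.6, 3.9, 1.56]] → [[3.9, 1.6, 1.56]]` edits above are part of the v1.0.0rc0 coordinate refactor: default anchor sizes are now listed in (x_size, y_size, z_size) order rather than (w, l, h). A small sketch of the generator with the new default (the shape expectation is our reading of the docstring, not something asserted by this commit):

```python
# Sketch: generate single-level anchors with the new (x, y, z) size order.
from mmdet3d.core.anchor import Anchor3DRangeGenerator

gen = Anchor3DRangeGenerator(
    ranges=[[0, -40.0, -1.78, 70.4, 40.0, -1.78]],
    sizes=[[3.9, 1.6, 1.56]],        # x_size, y_size, z_size for a car
    rotations=[0, 1.5707963])
anchors = gen.grid_anchors([(2, 4)], device='cpu')
# one level; 2 * 4 grid cells * 1 size * 2 rotations = 16 anchors of 7 params
print(anchors[0].shape)
```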
mmdet3d/core/bbox/__init__.py (view file @ 32a4328b)

```diff
@@ -12,7 +12,8 @@ from .samplers import (BaseSampler, CombinedSampler,
 from .structures import (BaseInstance3DBoxes, Box3DMode, CameraInstance3DBoxes,
                          Coord3DMode, DepthInstance3DBoxes,
                          LiDARInstance3DBoxes, get_box_type, limit_period,
-                         mono_cam_box2vis, points_cam2img, xywhr2xyxyr)
+                         mono_cam_box2vis, points_cam2img, points_img2cam,
+                         xywhr2xyxyr)
 from .transforms import bbox3d2result, bbox3d2roi, bbox3d_mapping_back

 __all__ = [
...
@@ -25,5 +26,5 @@ __all__ = [
     'LiDARInstance3DBoxes', 'CameraInstance3DBoxes', 'bbox3d2roi',
     'bbox3d2result', 'DepthInstance3DBoxes', 'BaseInstance3DBoxes',
     'bbox3d_mapping_back', 'xywhr2xyxyr', 'limit_period', 'points_cam2img',
-    'get_box_type', 'Coord3DMode', 'mono_cam_box2vis'
+    'points_img2cam', 'get_box_type', 'Coord3DMode', 'mono_cam_box2vis'
 ]
```
mmdet3d/core/bbox/box_np_ops.py (view file @ 32a4328b)

```diff
 # Copyright (c) OpenMMLab. All rights reserved.
 # TODO: clean the functions in this file and move the APIs into box structures
 # in the future
+# NOTICE: All functions in this file are valid for LiDAR or depth boxes only
+# if we use default parameters.
 import numba
 import numpy as np

+from .structures.utils import limit_period, points_cam2img, rotation_3d_in_axis
+

 def camera_to_lidar(points, r_rect, velo2cam):
     """Convert points in camera coordinate to lidar coordinate.

+    Note:
+        This function is for KITTI only.
+
     Args:
         points (np.ndarray, shape=[N, 3]): Points in camera coordinate.
         r_rect (np.ndarray, shape=[4, 4]): Matrix to project points in
...
@@ -27,7 +34,10 @@ def camera_to_lidar(points, r_rect, velo2cam):

 def box_camera_to_lidar(data, r_rect, velo2cam):
-    """Covert boxes in camera coordinate to lidar coordinate.
+    """Convert boxes in camera coordinate to lidar coordinate.
+
+    Note:
+        This function is for KITTI only.

     Args:
         data (np.ndarray, shape=[N, 7]): Boxes in camera coordinate.
...
@@ -40,10 +50,13 @@ def box_camera_to_lidar(data, r_rect, velo2cam):
         np.ndarray, shape=[N, 3]: Boxes in lidar coordinate.
     """
     xyz = data[:, 0:3]
-    l, h, w = data[:, 3:4], data[:, 4:5], data[:, 5:6]
+    x_size, y_size, z_size = data[:, 3:4], data[:, 4:5], data[:, 5:6]
     r = data[:, 6:7]
     xyz_lidar = camera_to_lidar(xyz, r_rect, velo2cam)
-    return np.concatenate([xyz_lidar, w, l, h, r], axis=1)
+    # yaw and dims also needs to be converted
+    r_new = -r - np.pi / 2
+    r_new = limit_period(r_new, period=np.pi * 2)
+    return np.concatenate([xyz_lidar, x_size, z_size, y_size, r_new], axis=1)
```
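The new `box_camera_to_lidar` no longer only permutes the size dimensions; it also converts the yaw angle between the camera and LiDAR conventions. A self-contained check of that arithmetic (plain numpy, with `limit_period` inlined from its definition elsewhere in this diff):

```python
# Sketch of the added yaw conversion: r_lidar = limit_period(-r_cam - pi/2).
import numpy as np

def limit_period(val, offset=0.5, period=np.pi):
    # maps val into [-offset * period, (1 - offset) * period)
    return val - np.floor(val / period + offset) * period

r_cam = np.array([0.0, np.pi / 2, -np.pi / 2])
r_lidar = limit_period(-r_cam - np.pi / 2, period=2 * np.pi)
print(r_lidar)  # [-pi/2, -pi, 0]: a 0-yaw camera box has yaw -pi/2 in LiDAR
```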
```diff
@@ -80,26 +93,9 @@ def corners_nd(dims, origin=0.5):
     return corners


-def rotation_2d(points, angles):
-    """Rotation 2d points based on origin point clockwise when angle positive.
-
-    Args:
-        points (np.ndarray): Points to be rotated with shape \
-            (N, point_size, 2).
-        angles (np.ndarray): Rotation angle with shape (N).
-
-    Returns:
-        np.ndarray: Same shape as points.
-    """
-    rot_sin = np.sin(angles)
-    rot_cos = np.cos(angles)
-    rot_mat_T = np.stack([[rot_cos, -rot_sin], [rot_sin, rot_cos]])
-    return np.einsum('aij,jka->aik', points, rot_mat_T)
-
-
 def center_to_corner_box2d(centers, dims, angles=None, origin=0.5):
     """Convert kitti locations, dimensions and angles to corners.
-    format: center(xy), dims(xy), angles(clockwise when positive)
+    format: center(xy), dims(xy), angles(counterclockwise when positive)

     Args:
         centers (np.ndarray): Locations in kitti label file with shape (N, 2).
...
@@ -118,7 +114,7 @@ def center_to_corner_box2d(centers, dims, angles=None, origin=0.5):
     corners = corners_nd(dims, origin=origin)
     # corners: [N, 4, 2]
     if angles is not None:
-        corners = rotation_2d(corners, angles)
+        corners = rotation_3d_in_axis(corners, angles)
     corners += centers.reshape([-1, 1, 2])
     return corners
```
```diff
@@ -172,37 +168,6 @@ def depth_to_lidar_points(depth, trunc_pixel, P2, r_rect, velo2cam):
     return lidar_points


-def rotation_3d_in_axis(points, angles, axis=0):
-    """Rotate points in specific axis.
-
-    Args:
-        points (np.ndarray, shape=[N, point_size, 3]]):
-        angles (np.ndarray, shape=[N]]):
-        axis (int, optional): Axis to rotate at. Defaults to 0.
-
-    Returns:
-        np.ndarray: Rotated points.
-    """
-    # points: [N, point_size, 3]
-    rot_sin = np.sin(angles)
-    rot_cos = np.cos(angles)
-    ones = np.ones_like(rot_cos)
-    zeros = np.zeros_like(rot_cos)
-    if axis == 1:
-        rot_mat_T = np.stack([[rot_cos, zeros, -rot_sin],
-                              [zeros, ones, zeros],
-                              [rot_sin, zeros, rot_cos]])
-    elif axis == 2 or axis == -1:
-        rot_mat_T = np.stack([[rot_cos, -rot_sin, zeros],
-                              [rot_sin, rot_cos, zeros],
-                              [zeros, zeros, ones]])
-    elif axis == 0:
-        rot_mat_T = np.stack([[zeros, rot_cos, -rot_sin],
-                              [zeros, rot_sin, rot_cos],
-                              [ones, zeros, zeros]])
-    else:
-        raise ValueError('axis should in range')
-
-    return np.einsum('aij,jka->aik', points, rot_mat_T)
-
-
 def center_to_corner_box3d(centers,
                            dims,
                            angles=None,
...
@@ -225,7 +190,7 @@ def center_to_corner_box3d(centers,
         np.ndarray: Corners with the shape of (N, 8, 3).
     """
     # 'length' in kitti format is in x axis.
-    # yzx(hwl)(kitti label file)<->xyz(lhw)(camera)<->z(-x)(-y)(wlh)(lidar)
+    # yzx(hwl)(kitti label file)<->xyz(lhw)(camera)<->z(-x)(-y)(lwh)(lidar)
     # center in kitti format is [0.5, 1.0, 0.5] in xyz.
     corners = corners_nd(dims, origin=origin)
     # corners: [N, 8, 3]
...
@@ -259,8 +224,8 @@ def box2d_to_corner_jit(boxes):
         rot_sin = np.sin(boxes[i, -1])
         rot_cos = np.cos(boxes[i, -1])
         rot_mat_T[0, 0] = rot_cos
-        rot_mat_T[0, 1] = -rot_sin
-        rot_mat_T[1, 0] = rot_sin
+        rot_mat_T[0, 1] = rot_sin
+        rot_mat_T[1, 0] = -rot_sin
         rot_mat_T[1, 1] = rot_cos
         box_corners[i] = corners[i] @ rot_mat_T + boxes[i, :2]
     return box_corners
```
```diff
@@ -327,15 +292,15 @@ def rotation_points_single_angle(points, angle, axis=0):
     rot_cos = np.cos(angle)
     if axis == 1:
         rot_mat_T = np.array(
-            [[rot_cos, 0, -rot_sin], [0, 1, 0], [rot_sin, 0, rot_cos]],
+            [[rot_cos, 0, rot_sin], [0, 1, 0], [-rot_sin, 0, rot_cos]],
             dtype=points.dtype)
     elif axis == 2 or axis == -1:
         rot_mat_T = np.array(
-            [[rot_cos, -rot_sin, 0], [rot_sin, rot_cos, 0], [0, 0, 1]],
+            [[rot_cos, rot_sin, 0], [-rot_sin, rot_cos, 0], [0, 0, 1]],
             dtype=points.dtype)
     elif axis == 0:
         rot_mat_T = np.array(
-            [[1, 0, 0], [0, rot_cos, -rot_sin], [0, rot_sin, rot_cos]],
+            [[1, 0, 0], [0, rot_cos, rot_sin], [0, -rot_sin, rot_cos]],
             dtype=points.dtype)
     else:
         raise ValueError('axis should in range')
...
@@ -343,44 +308,6 @@ def rotation_points_single_angle(points, angle, axis=0):
     return points @ rot_mat_T, rot_mat_T


-def points_cam2img(points_3d, proj_mat, with_depth=False):
-    """Project points in camera coordinates to image coordinates.
-
-    Args:
-        points_3d (np.ndarray): Points in shape (N, 3)
-        proj_mat (np.ndarray): Transformation matrix between coordinates.
-        with_depth (bool, optional): Whether to keep depth in the output.
-            Defaults to False.
-
-    Returns:
-        np.ndarray: Points in image coordinates with shape [N, 2].
-    """
-    points_shape = list(points_3d.shape)
-    points_shape[-1] = 1
-
-    assert len(proj_mat.shape) == 2, 'The dimension of the projection' \
-        f' matrix should be 2 instead of {len(proj_mat.shape)}.'
-    d1, d2 = proj_mat.shape[:2]
-    assert (d1 == 3 and d2 == 3) or (d1 == 3 and d2 == 4) or (
-        d1 == 4 and d2 == 4), 'The shape of the projection matrix' \
-        f' ({d1}*{d2}) is not supported.'
-    if d1 == 3:
-        proj_mat_expanded = np.eye(4, dtype=proj_mat.dtype)
-        proj_mat_expanded[:d1, :d2] = proj_mat
-        proj_mat = proj_mat_expanded
-
-    points_4 = np.concatenate([points_3d, np.ones(points_shape)], axis=-1)
-    point_2d = points_4 @ proj_mat.T
-    point_2d_res = point_2d[..., :2] / point_2d[..., 2:3]
-
-    if with_depth:
-        points_2d_depth = np.concatenate([point_2d_res, point_2d[..., 2:3]],
-                                         axis=-1)
-        return points_2d_depth
-
-    return point_2d_res
-
-
 def box3d_to_bbox(box3d, P2):
     """Convert box3d in camera coordinates to bbox in image coordinates.
```
```diff
@@ -424,7 +351,10 @@ def corner_to_surfaces_3d(corners):

 def points_in_rbbox(points, rbbox, z_axis=2, origin=(0.5, 0.5, 0)):
-    """Check points in rotated bbox and return indicces.
+    """Check points in rotated bbox and return indices.
+
+    Note:
+        This function is for counterclockwise boxes.

     Args:
         points (np.ndarray, shape=[N, 3+dim]): Points to query.
...
@@ -461,25 +391,9 @@ def minmax_to_corner_2d(minmax_box):
     return center_to_corner_box2d(center, dims, origin=0.0)


-def limit_period(val, offset=0.5, period=np.pi):
-    """Limit the value into a period for periodic function.
-
-    Args:
-        val (np.ndarray): The value to be converted.
-        offset (float, optional): Offset to set the value range. \
-            Defaults to 0.5.
-        period (float, optional): Period of the value. Defaults to np.pi.
-
-    Returns:
-        torch.Tensor: Value in the range of \
-            [-offset * period, (1-offset) * period]
-    """
-    return val - np.floor(val / period + offset) * period
-
-
 def create_anchors_3d_range(feature_size,
                             anchor_range,
-                            sizes=((1.6, 3.9, 1.56), ),
+                            sizes=((3.9, 1.6, 1.56), ),
                             rotations=(0, np.pi / 2),
                             dtype=np.float32):
     """Create anchors 3d by range.
...
@@ -492,14 +406,14 @@ def create_anchors_3d_range(feature_size,
             (x_min, y_min, z_min, x_max, y_max, z_max).
         sizes (list[list] | np.ndarray | torch.Tensor, optional):
             Anchor size with shape [N, 3], in order of x, y, z.
-            Defaults to ((1.6, 3.9, 1.56), ).
+            Defaults to ((3.9, 1.6, 1.56), ).
         rotations (list[float] | np.ndarray | torch.Tensor, optional):
             Rotations of anchors in a single feature grid.
             Defaults to (0, np.pi / 2).
-        dtype (type, optional): Data type. Default to np.float32.
+        dtype (type, optional): Data type. Defaults to np.float32.

     Returns:
-        np.ndarray: Range based anchors with shape of \
+        np.ndarray: Range based anchors with shape of
             (*feature_size, num_sizes, num_rots, 7).
     """
     anchor_range = np.array(anchor_range, dtype)
```
```diff
@@ -550,11 +464,11 @@ def rbbox2d_to_near_bbox(rbboxes):
     """convert rotated bbox to nearest 'standing' or 'lying' bbox.

     Args:
-        rbboxes (np.ndarray): Rotated bboxes with shape of \
+        rbboxes (np.ndarray): Rotated bboxes with shape of
             (N, 5(x, y, xdim, ydim, rad)).

     Returns:
-        np.ndarray: Bounding boxes with the shpae of
+        np.ndarray: Bounding boxes with the shape of
             (N, 4(xmin, ymin, xmax, ymax)).
     """
     rots = rbboxes[..., -1]
...
@@ -570,6 +484,9 @@ def iou_jit(boxes, query_boxes, mode='iou', eps=0.0):
     """Calculate box iou. Note that jit version runs ~10x faster than the
     box_overlaps function in mmdet3d.core.evaluation.

+    Note:
+        This function is for counterclockwise boxes.
+
     Args:
         boxes (np.ndarray): Input bounding boxes with shape of (N, 4).
         query_boxes (np.ndarray): Query boxes with shape of (K, 4).
...
@@ -607,7 +524,10 @@ def iou_jit(boxes, query_boxes, mode='iou', eps=0.0):

 def projection_matrix_to_CRT_kitti(proj):
-    """Split projection matrix of kitti.
+    """Split projection matrix of KITTI.
+
+    Note:
+        This function is for KITTI only.

     P = C @ [R|T]
     C is upper triangular matrix, so we need to inverse CR and use QR
...
@@ -633,6 +553,9 @@ def projection_matrix_to_CRT_kitti(proj):

 def remove_outside_points(points, rect, Trv2c, P2, image_shape):
     """Remove points which are outside of image.

+    Note:
+        This function is for KITTI only.
+
     Args:
         points (np.ndarray, shape=[N, 3+dims]): Total points.
         rect (np.ndarray, shape=[4, 4]): Matrix to project points in
```
```diff
@@ -782,8 +705,8 @@ def points_in_convex_polygon_3d_jit(points,
                                                 normal_vec, d, num_surfaces)


-@numba.jit
-def points_in_convex_polygon_jit(points, polygon, clockwise=True):
+@numba.njit
+def points_in_convex_polygon_jit(points, polygon, clockwise=False):
     """Check points is in 2d convex polygons. True when point in polygon.

     Args:
...
@@ -800,14 +723,16 @@ def points_in_convex_polygon_jit(points, polygon, clockwise=True):
     num_points_of_polygon = polygon.shape[1]
     num_points = points.shape[0]
     num_polygons = polygon.shape[0]
-    if clockwise:
-        vec1 = polygon - polygon[:,
-                                 np.array([num_points_of_polygon - 1] + list(
-                                     range(num_points_of_polygon - 1))), :]
-    else:
-        vec1 = polygon[:,
-                       np.array([num_points_of_polygon - 1] + list(
-                           range(num_points_of_polygon - 1))), :] - polygon
+    # vec for all the polygons
+    # if clockwise:
+    #     vec1 = polygon - polygon[:, [num_points_of_polygon - 1] +
+    #         list(range(num_points_of_polygon - 1)), :]
+    # else:
+    #     vec1 = polygon[:, [num_points_of_polygon - 1] +
+    #         list(range(num_points_of_polygon - 1)), :] - polygon
+    # vec1: [num_polygon, num_points_of_polygon, 2]
+    vec1 = np.zeros((2), dtype=polygon.dtype)
     ret = np.zeros((num_points, num_polygons), dtype=np.bool_)
     success = True
     cross = 0.0
...
@@ -815,12 +740,9 @@ def points_in_convex_polygon_jit(points, polygon, clockwise=True):
         for j in range(num_polygons):
             success = True
             for k in range(num_points_of_polygon):
-                vec = vec1[j, k]
-                cross = vec[1] * (polygon[j, k, 0] - points[i, 0])
-                cross -= vec[0] * (polygon[j, k, 1] - points[i, 1])
+                if clockwise:
+                    vec1 = polygon[j, k] - polygon[j, k - 1]
+                else:
+                    vec1 = polygon[j, k - 1] - polygon[j, k]
+                cross = vec1[1] * (polygon[j, k, 0] - points[i, 0])
+                cross -= vec1[0] * (polygon[j, k, 1] - points[i, 1])
                 if cross >= 0:
                     success = False
                     break
```
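A quick way to sanity-check the rewritten loop is a unit square with counterclockwise vertex order, which matches the new `clockwise=False` default:

```python
# Check points_in_convex_polygon_jit after the njit rewrite: one point
# inside a counterclockwise unit square, one point outside.
import numpy as np
from mmdet3d.core.bbox.box_np_ops import points_in_convex_polygon_jit

square = np.array([[[0., 0.], [1., 0.], [1., 1.], [0., 1.]]])  # (1, 4, 2)
points = np.array([[0.5, 0.5], [2.0, 2.0]])                    # (2, 2)
print(points_in_convex_polygon_jit(points, square))  # [[ True], [False]]
```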
```diff
@@ -839,10 +761,13 @@ def boxes3d_to_corners3d_lidar(boxes3d, bottom_center=True):
         |/         |/
         2 -------- 1

+    Note:
+        This function is for LiDAR boxes only.
+
     Args:
         boxes3d (np.ndarray): Boxes with shape of (N, 7)
-            [x, y, z, w, l, h, ry] in LiDAR coords, see the definition of ry
-            in KITTI dataset.
+            [x, y, z, x_size, y_size, z_size, ry] in LiDAR coords,
+            see the definition of ry in KITTI dataset.
         bottom_center (bool, optional): Whether z is on the bottom center
             of object. Defaults to True.
...
@@ -850,19 +775,25 @@ def boxes3d_to_corners3d_lidar(boxes3d, bottom_center=True):
         np.ndarray: Box corners with the shape of [N, 8, 3].
     """
     boxes_num = boxes3d.shape[0]
-    w, l, h = boxes3d[:, 3], boxes3d[:, 4], boxes3d[:, 5]
-    x_corners = np.array(
-        [w / 2., -w / 2., -w / 2., w / 2., w / 2., -w / 2., -w / 2., w / 2.],
-        dtype=np.float32).T
-    y_corners = np.array(
-        [-l / 2., -l / 2., l / 2., l / 2., -l / 2., -l / 2., l / 2., l / 2.],
-        dtype=np.float32).T
+    x_size, y_size, z_size = boxes3d[:, 3], boxes3d[:, 4], boxes3d[:, 5]
+    x_corners = np.array([
+        x_size / 2., -x_size / 2., -x_size / 2., x_size / 2.,
+        x_size / 2., -x_size / 2., -x_size / 2., x_size / 2.
+    ], dtype=np.float32).T
+    y_corners = np.array([
+        -y_size / 2., -y_size / 2., y_size / 2., y_size / 2.,
+        -y_size / 2., -y_size / 2., y_size / 2., y_size / 2.
+    ], dtype=np.float32).T
     if bottom_center:
         z_corners = np.zeros((boxes_num, 8), dtype=np.float32)
-        z_corners[:, 4:8] = h.reshape(boxes_num, 1).repeat(4, axis=1)  # (N, 8)
+        z_corners[:, 4:8] = z_size.reshape(boxes_num, 1).repeat(
+            4, axis=1)  # (N, 8)
     else:
         z_corners = np.array([
-            -h / 2., -h / 2., -h / 2., -h / 2., h / 2., h / 2., h / 2., h / 2.
+            -z_size / 2., -z_size / 2., -z_size / 2., -z_size / 2.,
+            z_size / 2., z_size / 2., z_size / 2., z_size / 2.
         ],
                              dtype=np.float32).T
...
@@ -870,9 +801,9 @@ def boxes3d_to_corners3d_lidar(boxes3d, bottom_center=True):
     zeros, ones = np.zeros(
         ry.size, dtype=np.float32), np.ones(
             ry.size, dtype=np.float32)
-    rot_list = np.array([[np.cos(ry), -np.sin(ry), zeros],
-                         [np.sin(ry), np.cos(ry), zeros],
-                         [zeros, zeros, ones]])  # (3, 3, N)
+    rot_list = np.array([[np.cos(ry), np.sin(ry), zeros],
+                         [-np.sin(ry), np.cos(ry), zeros],
+                         [zeros, zeros, ones]])  # (3, 3, N)
     R_list = np.transpose(rot_list, (2, 0, 1))  # (N, 3, 3)
     temp_corners = np.concatenate((x_corners.reshape(
...
```
mmdet3d/core/bbox/coders/__init__.py (view file @ 32a4328b)

```diff
@@ -3,10 +3,17 @@ from mmdet.core.bbox import build_bbox_coder
 from .anchor_free_bbox_coder import AnchorFreeBBoxCoder
 from .centerpoint_bbox_coders import CenterPointBBoxCoder
 from .delta_xyzwhlr_bbox_coder import DeltaXYZWLHRBBoxCoder
+from .fcos3d_bbox_coder import FCOS3DBBoxCoder
 from .groupfree3d_bbox_coder import GroupFree3DBBoxCoder
+from .monoflex_bbox_coder import MonoFlexCoder
 from .partial_bin_based_bbox_coder import PartialBinBasedBBoxCoder
+from .pgd_bbox_coder import PGDBBoxCoder
+from .point_xyzwhlr_bbox_coder import PointXYZWHLRBBoxCoder
+from .smoke_bbox_coder import SMOKECoder

 __all__ = [
     'build_bbox_coder', 'DeltaXYZWLHRBBoxCoder', 'PartialBinBasedBBoxCoder',
-    'CenterPointBBoxCoder', 'AnchorFreeBBoxCoder', 'GroupFree3DBBoxCoder'
+    'CenterPointBBoxCoder', 'AnchorFreeBBoxCoder', 'GroupFree3DBBoxCoder',
+    'PointXYZWHLRBBoxCoder', 'FCOS3DBBoxCoder', 'PGDBBoxCoder', 'SMOKECoder',
+    'MonoFlexCoder'
 ]
```
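All of these coders are registered in mmdet's `BBOX_CODERS` registry, so they are normally constructed from config dicts. A minimal sketch (the `code_size=9` value for FCOS3D follows its nuScenes config and should be treated as an assumption here):

```python
# Sketch: building the newly exported coders through mmdet's registry.
from mmdet.core.bbox import build_bbox_coder

delta_coder = build_bbox_coder(dict(type='DeltaXYZWLHRBBoxCoder'))
fcos3d_coder = build_bbox_coder(dict(type='FCOS3DBBoxCoder', code_size=9))
```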
mmdet3d/core/bbox/coders/anchor_free_bbox_coder.py (view file @ 32a4328b)

```diff
@@ -25,7 +25,7 @@ class AnchorFreeBBoxCoder(PartialBinBasedBBoxCoder):
         """Encode ground truth to prediction targets.

         Args:
-            gt_bboxes_3d (BaseInstance3DBoxes): Ground truth bboxes \
+            gt_bboxes_3d (BaseInstance3DBoxes): Ground truth bboxes
                 with shape (n, 7).
             gt_labels_3d (torch.Tensor): Ground truth classes.
...
```
mmdet3d/core/bbox/coders/centerpoint_bbox_coders.py  View file @ 32a4328b

```diff
@@ -13,12 +13,12 @@ class CenterPointBBoxCoder(BaseBBoxCoder):
         pc_range (list[float]): Range of point cloud.
         out_size_factor (int): Downsample factor of the model.
         voxel_size (list[float]): Size of voxel.
-        post_center_range (list[float]): Limit of the center.
+        post_center_range (list[float], optional): Limit of the center.
             Default: None.
-        max_num (int): Max number to be kept. Default: 100.
-        score_threshold (float): Threshold to filter boxes based on score.
-            Default: None.
-        code_size (int): Code size of bboxes. Default: 9
+        max_num (int, optional): Max number to be kept. Default: 100.
+        score_threshold (float, optional): Threshold to filter boxes
+            based on score. Default: None.
+        code_size (int, optional): Code size of bboxes. Default: 9
     """
 
     def __init__(self,
@@ -45,7 +45,8 @@ class CenterPointBBoxCoder(BaseBBoxCoder):
             feats (torch.Tensor): Features to be transposed and gathered
                 with the shape of [B, 2, W, H].
             inds (torch.Tensor): Indexes with the shape of [B, N].
-            feat_masks (torch.Tensor): Mask of the feats. Default: None.
+            feat_masks (torch.Tensor, optional): Mask of the feats.
+                Default: None.
 
         Returns:
             torch.Tensor: Gathered feats.
@@ -64,7 +65,7 @@ class CenterPointBBoxCoder(BaseBBoxCoder):
         Args:
             scores (torch.Tensor): scores with the shape of [B, N, W, H].
-            K (int): Number to be kept. Defaults to 80.
+            K (int, optional): Number to be kept. Defaults to 80.
 
         Returns:
             tuple[torch.Tensor]
@@ -135,9 +136,9 @@ class CenterPointBBoxCoder(BaseBBoxCoder):
             dim (torch.Tensor): Dim of the boxes with the shape of
                 [B, 1, W, H].
             vel (torch.Tensor): Velocity with the shape of [B, 1, W, H].
-            reg (torch.Tensor): Regression value of the boxes in 2D with
-                the shape of [B, 2, W, H]. Default: None.
-            task_id (int): Index of task. Default: -1.
+            reg (torch.Tensor, optional): Regression value of the boxes in
+                2D with the shape of [B, 2, W, H]. Default: None.
+            task_id (int, optional): Index of task. Default: -1.
 
         Returns:
             list[dict]: Decoded boxes.
```
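The gather described in the `@@ -45,7 +45,8 @@` hunk pulls per-pixel regression values out at the top-K heatmap indices. A standalone sketch of that indexing pattern, with illustrative shapes rather than the coder's real API:

```python
import torch

B, C, H, W = 2, 2, 4, 4
feats = torch.randn(B, C, H, W)          # per-pixel regression map
inds = torch.randint(0, H * W, (B, 50))  # top-50 flattened indices, [B, N]

# transpose to [B, H*W, C], then gather one C-vector per index
feats = feats.permute(0, 2, 3, 1).reshape(B, H * W, C)
gathered = feats.gather(1, inds.unsqueeze(-1).expand(-1, -1, C))
print(gathered.shape)  # torch.Size([2, 50, 2])
```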
mmdet3d/core/bbox/coders/delta_xyzwhlr_bbox_coder.py  View file @ 32a4328b

```diff
@@ -19,9 +19,9 @@ class DeltaXYZWLHRBBoxCoder(BaseBBoxCoder):
     @staticmethod
     def encode(src_boxes, dst_boxes):
-        """Get box regression transformation deltas (dx, dy, dz, dw, dh, dl,
-        dr, dv*) that can be used to transform the `src_boxes` into the
-        `target_boxes`.
+        """Get box regression transformation deltas (dx, dy, dz, dx_size,
+        dy_size, dz_size, dr, dv*) that can be used to transform the
+        `src_boxes` into the `target_boxes`.
 
         Args:
             src_boxes (torch.Tensor): source boxes, e.g., object proposals.
@@ -56,13 +56,13 @@ class DeltaXYZWLHRBBoxCoder(BaseBBoxCoder):
     @staticmethod
     def decode(anchors, deltas):
-        """Apply transformation `deltas` (dx, dy, dz, dw, dh, dl, dr, dv*) to
-        `boxes`.
+        """Apply transformation `deltas` (dx, dy, dz, dx_size, dy_size,
+        dz_size, dr, dv*) to `boxes`.
 
         Args:
             anchors (torch.Tensor): Parameters of anchors with shape (N, 7).
             deltas (torch.Tensor): Encoded boxes with shape
-                (N, 7+n) [x, y, z, w, l, h, r, velo*].
+                (N, 7+n) [x, y, z, x_size, y_size, z_size, r, velo*].
 
         Returns:
             torch.Tensor: Decoded boxes.
```
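The renamed deltas follow the SECOND-style residual transform. A minimal sketch of the encode direction under that assumption, written as an illustrative re-implementation rather than the coder's actual code:

```python
import torch

def encode_deltas(anchors, targets):
    """(dx, dy, dz, dx_size, dy_size, dz_size, dr) residuals, SECOND-style."""
    xa, ya, za, wa, la, ha, ra = torch.split(anchors, 1, dim=-1)
    xg, yg, zg, wg, lg, hg, rg = torch.split(targets, 1, dim=-1)
    # shift z from the box bottom to its geometric center
    za = za + ha / 2
    zg = zg + hg / 2
    # centers are normalized by the anchor's ground-plane diagonal
    diagonal = torch.sqrt(wa**2 + la**2)
    xt = (xg - xa) / diagonal
    yt = (yg - ya) / diagonal
    zt = (zg - za) / ha
    # sizes are log-ratios, yaw is a plain residual
    wt, lt, ht = torch.log(wg / wa), torch.log(lg / la), torch.log(hg / ha)
    rt = rg - ra
    return torch.cat([xt, yt, zt, wt, lt, ht, rt], dim=-1)
```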
mmdet3d/core/bbox/coders/fcos3d_bbox_coder.py  (new file, 0 → 100644)  View file @ 32a4328b

```python
# Copyright (c) OpenMMLab. All rights reserved.
import numpy as np
import torch
from mmdet.core.bbox import BaseBBoxCoder
from mmdet.core.bbox.builder import BBOX_CODERS

from ..structures import limit_period


@BBOX_CODERS.register_module()
class FCOS3DBBoxCoder(BaseBBoxCoder):
    """Bounding box coder for FCOS3D.

    Args:
        base_depths (tuple[tuple[float]]): Depth references for decode box
            depth. Defaults to None.
        base_dims (tuple[tuple[float]]): Dimension references for decode box
            dimension. Defaults to None.
        code_size (int): The dimension of boxes to be encoded. Defaults to 7.
        norm_on_bbox (bool): Whether to apply normalization on the bounding
            box 2D attributes. Defaults to True.
    """

    def __init__(self,
                 base_depths=None,
                 base_dims=None,
                 code_size=7,
                 norm_on_bbox=True):
        super(FCOS3DBBoxCoder, self).__init__()
        self.base_depths = base_depths
        self.base_dims = base_dims
        self.bbox_code_size = code_size
        self.norm_on_bbox = norm_on_bbox

    def encode(self, gt_bboxes_3d, gt_labels_3d, gt_bboxes, gt_labels):
        # TODO: refactor the encoder in the FCOS3D and PGD head
        pass

    def decode(self, bbox, scale, stride, training, cls_score=None):
        """Decode regressed results into 3D predictions.

        Note that offsets are not transformed to the projected 3D centers.

        Args:
            bbox (torch.Tensor): Raw bounding box predictions in shape
                [N, C, H, W].
            scale (tuple[`Scale`]): Learnable scale parameters.
            stride (int): Stride for a specific feature level.
            training (bool): Whether the decoding is in the training
                procedure.
            cls_score (torch.Tensor): Classification score map for deciding
                which base depth or dim is used. Defaults to None.

        Returns:
            torch.Tensor: Decoded boxes.
        """
        # scale the bbox of different level
        # only apply to offset, depth and size prediction
        scale_offset, scale_depth, scale_size = scale[0:3]

        clone_bbox = bbox.clone()
        bbox[:, :2] = scale_offset(clone_bbox[:, :2]).float()
        bbox[:, 2] = scale_depth(clone_bbox[:, 2]).float()
        bbox[:, 3:6] = scale_size(clone_bbox[:, 3:6]).float()

        if self.base_depths is None:
            bbox[:, 2] = bbox[:, 2].exp()
        elif len(self.base_depths) == 1:  # only single prior
            mean = self.base_depths[0][0]
            std = self.base_depths[0][1]
            bbox[:, 2] = mean + bbox.clone()[:, 2] * std
        else:  # multi-class priors
            assert len(self.base_depths) == cls_score.shape[1], \
                'The number of multi-class depth priors should be equal to ' \
                'the number of categories.'
            indices = cls_score.max(dim=1)[1]
            depth_priors = cls_score.new_tensor(
                self.base_depths)[indices, :].permute(0, 3, 1, 2)
            mean = depth_priors[:, 0]
            std = depth_priors[:, 1]
            bbox[:, 2] = mean + bbox.clone()[:, 2] * std

        bbox[:, 3:6] = bbox[:, 3:6].exp()
        if self.base_dims is not None:
            assert len(self.base_dims) == cls_score.shape[1], \
                'The number of anchor sizes should be equal to the number ' \
                'of categories.'
            indices = cls_score.max(dim=1)[1]
            size_priors = cls_score.new_tensor(
                self.base_dims)[indices, :].permute(0, 3, 1, 2)
            bbox[:, 3:6] = size_priors * bbox.clone()[:, 3:6]

        assert self.norm_on_bbox is True, 'Setting norm_on_bbox to False ' \
            'has not been thoroughly tested for FCOS3D.'
        if self.norm_on_bbox:
            if not training:
                # Note that this line is conducted only when testing
                bbox[:, :2] *= stride

        return bbox

    @staticmethod
    def decode_yaw(bbox, centers2d, dir_cls, dir_offset, cam2img):
        """Decode yaw angle and change it from local to global.

        Args:
            bbox (torch.Tensor): Bounding box predictions in shape
                [N, C] with yaws to be decoded.
            centers2d (torch.Tensor): Projected 3D-center on the image planes
                corresponding to the box predictions.
            dir_cls (torch.Tensor): Predicted direction classes.
            dir_offset (float): Direction offset before dividing all the
                directions into several classes.
            cam2img (torch.Tensor): Camera intrinsic matrix in shape [4, 4].

        Returns:
            torch.Tensor: Bounding boxes with decoded yaws.
        """
        if bbox.shape[0] > 0:
            dir_rot = limit_period(bbox[..., 6] - dir_offset, 0, np.pi)
            bbox[..., 6] = \
                dir_rot + dir_offset + np.pi * dir_cls.to(bbox.dtype)

        bbox[:, 6] = torch.atan2(centers2d[:, 0] - cam2img[0, 2],
                                 cam2img[0, 0]) + bbox[:, 6]

        return bbox
```
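A quick smoke test of the decode path above; the shapes, the `Scale` modules from mmcv, and the stride value are illustrative only:

```python
import torch
from mmcv.cnn import Scale
from mmdet3d.core.bbox.coders import FCOS3DBBoxCoder

coder = FCOS3DBBoxCoder(base_depths=None, base_dims=None, code_size=7)
bbox_pred = torch.randn(2, 7, 4, 4)           # [N, C, H, W] raw predictions
scales = tuple(Scale(1.0) for _ in range(3))  # offset / depth / size scales
decoded = coder.decode(bbox_pred, scales, stride=8, training=False)
# with base_depths=None, depth (channel 2) and sizes (channels 3:6) go
# through exp(); at test time the 2D offsets (channels 0:2) are scaled
# back to input-image pixels by the stride.
```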
mmdet3d/core/bbox/coders/groupfree3d_bbox_coder.py  View file @ 32a4328b

```diff
@@ -14,9 +14,10 @@ class GroupFree3DBBoxCoder(PartialBinBasedBBoxCoder):
         num_dir_bins (int): Number of bins to encode direction angle.
         num_sizes (int): Number of size clusters.
         mean_sizes (list[list[int]]): Mean size of bboxes in each class.
-        with_rot (bool): Whether the bbox is with rotation. Defaults to True.
-        size_cls_agnostic (bool): Whether the predicted size is class-agnostic.
-            Defaults to True.
+        with_rot (bool, optional): Whether the bbox is with rotation.
+            Defaults to True.
+        size_cls_agnostic (bool, optional): Whether the predicted size is
+            class-agnostic. Defaults to True.
     """
 
     def __init__(self,
@@ -36,7 +37,7 @@ class GroupFree3DBBoxCoder(PartialBinBasedBBoxCoder):
         """Encode ground truth to prediction targets.
 
         Args:
-            gt_bboxes_3d (BaseInstance3DBoxes): Ground truth bboxes \
+            gt_bboxes_3d (BaseInstance3DBoxes): Ground truth bboxes
                 with shape (n, 7).
             gt_labels_3d (torch.Tensor): Ground truth classes.
@@ -76,7 +77,7 @@ class GroupFree3DBBoxCoder(PartialBinBasedBBoxCoder):
             - size_class: predicted bbox size class.
             - size_res: predicted bbox size residual.
             - size: predicted class-agnostic bbox size
-            prefix (str): Decode predictions with specific prefix.
+            prefix (str, optional): Decode predictions with specific prefix.
                 Defaults to ''.
 
         Returns:
@@ -122,7 +123,7 @@ class GroupFree3DBBoxCoder(PartialBinBasedBBoxCoder):
             cls_preds (torch.Tensor): Class predicted features to split.
             reg_preds (torch.Tensor): Regression predicted features to split.
             base_xyz (torch.Tensor): Coordinates of points.
-            prefix (str): Decode predictions with specific prefix.
+            prefix (str, optional): Decode predictions with specific prefix.
                 Defaults to ''.
 
         Returns:
```
mmdet3d/core/bbox/coders/monoflex_bbox_coder.py  (new file, 0 → 100644)  View file @ 32a4328b

```python
# Copyright (c) OpenMMLab. All rights reserved.
import numpy as np
import torch
from torch.nn import functional as F

from mmdet.core.bbox import BaseBBoxCoder
from mmdet.core.bbox.builder import BBOX_CODERS


@BBOX_CODERS.register_module()
class MonoFlexCoder(BaseBBoxCoder):
    """Bbox Coder for MonoFlex.

    Args:
        depth_mode (str): The mode for depth calculation.
            Available options are "linear", "inv_sigmoid", and "exp".
        base_depth (tuple[float]): References for decoding box depth.
        depth_range (list): Depth range of predicted depth.
        combine_depth (bool): Whether to use combined depth (direct depth
            and depth from keypoints) or use direct depth only.
        uncertainty_range (list): Uncertainty range of predicted depth.
        base_dims (tuple[tuple[float]]): Dimensions mean and std of decode
            bbox dimensions [l, h, w] for each category.
        dims_mode (str): The mode for dimension calculation.
            Available options are "linear" and "exp".
        multibin (bool): Whether to use multibin representation.
        num_dir_bins (int): Number of bins to encode direction angle.
        bin_centers (list[float]): Local yaw centers while using multibin
            representations.
        bin_margin (float): Margin of multibin representations.
        code_size (int): The dimension of boxes to be encoded.
        eps (float, optional): A value added to the denominator for numerical
            stability. Default 1e-3.
    """

    def __init__(self,
                 depth_mode,
                 base_depth,
                 depth_range,
                 combine_depth,
                 uncertainty_range,
                 base_dims,
                 dims_mode,
                 multibin,
                 num_dir_bins,
                 bin_centers,
                 bin_margin,
                 code_size,
                 eps=1e-3):
        super(MonoFlexCoder, self).__init__()

        # depth related
        self.depth_mode = depth_mode
        self.base_depth = base_depth
        self.depth_range = depth_range
        self.combine_depth = combine_depth
        self.uncertainty_range = uncertainty_range

        # dimensions related
        self.base_dims = base_dims
        self.dims_mode = dims_mode

        # orientation related
        self.multibin = multibin
        self.num_dir_bins = num_dir_bins
        self.bin_centers = bin_centers
        self.bin_margin = bin_margin

        # output related
        self.bbox_code_size = code_size
        self.eps = eps
    def encode(self, gt_bboxes_3d):
        """Encode ground truth to prediction targets.

        Args:
            gt_bboxes_3d (`BaseInstance3DBoxes`): Ground truth 3D bboxes.
                shape: (N, 7).

        Returns:
            torch.Tensor: Targets of orientations.
        """
        local_yaw = gt_bboxes_3d.local_yaw

        # encode local yaw (-pi ~ pi) to multibin format
        encode_local_yaw = local_yaw.new_zeros(
            [local_yaw.shape[0], self.num_dir_bins * 2])
        bin_size = 2 * np.pi / self.num_dir_bins
        margin_size = bin_size * self.bin_margin

        bin_centers = local_yaw.new_tensor(self.bin_centers)
        range_size = bin_size / 2 + margin_size

        offsets = local_yaw.unsqueeze(1) - bin_centers.unsqueeze(0)
        offsets[offsets > np.pi] = offsets[offsets > np.pi] - 2 * np.pi
        offsets[offsets < -np.pi] = offsets[offsets < -np.pi] + 2 * np.pi

        for i in range(self.num_dir_bins):
            offset = offsets[:, i]
            inds = abs(offset) < range_size
            encode_local_yaw[inds, i] = 1
            encode_local_yaw[inds, i + self.num_dir_bins] = offset[inds]

        orientation_target = encode_local_yaw

        return orientation_target
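    # Editor's note, not part of the original file: a worked example of the
    # multibin target above. Assume num_dir_bins=4,
    # bin_centers=[0, pi/2, pi, -pi/2] and bin_margin=1/6. Then
    # bin_size = pi/2, margin_size = pi/12, range_size = pi/3 ~ 1.047,
    # so neighbouring bins overlap slightly. A local yaw of 0.3 rad falls
    # within range_size of bin 0 only (its offset to bin 1 is ~1.27), so
    # the target is [1, 0, 0, 0, 0.3, 0, 0, 0]: one-hot bin indicators in
    # the first num_dir_bins slots, per-bin offsets in the last num_dir_bins.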
    def decode(self, bbox, base_centers2d, labels, downsample_ratio,
               cam2imgs):
        """Decode bounding box regression into 3D predictions.

        Args:
            bbox (Tensor): Raw bounding box predictions for each
                predict center2d point.
                shape: (N, C)
            base_centers2d (torch.Tensor): Base centers2d for 3D bboxes.
                shape: (N, 2).
            labels (Tensor): Batch predict class label for each predict
                center2d point.
                shape: (N, )
            downsample_ratio (int): The stride of feature map.
            cam2imgs (Tensor): Batch images' camera intrinsic matrix.
                shape: kitti (N, 4, 4)  nuscenes (N, 3, 3)

        Return:
            dict: The 3D prediction dict decoded from regression map.
            The dict has components below:

            - bboxes2d (torch.Tensor): Decoded [x1, y1, x2, y2] format
              2D bboxes.
            - dimensions (torch.Tensor): Decoded dimensions for each
              object.
            - offsets2d (torch.Tensor): Offsets between base centers2d
              and real centers2d.
            - direct_depth (torch.Tensor): Decoded directly regressed
              depth.
            - keypoints2d (torch.Tensor): Keypoints of each projected
              3D box on image.
            - keypoints_depth (torch.Tensor): Decoded depth from keypoints.
            - combined_depth (torch.Tensor): Combined depth using direct
              depth and keypoints depth with depth uncertainty.
            - orientations (torch.Tensor): Multibin format orientations
              (local yaw) for each object.
        """
        # 4 dimensions for FCOS style regression
        pred_bboxes2d = bbox[:, 0:4]

        # change FCOS style to [x1, y1, x2, y2] format for IOU Loss
        pred_bboxes2d = self.decode_bboxes2d(pred_bboxes2d, base_centers2d)

        # 2 dimensions for projected centers2d offsets
        pred_offsets2d = bbox[:, 4:6]

        # 3 dimensions for 3D bbox dimensions offsets
        pred_dimensions_offsets3d = bbox[:, 29:32]

        # the first 8 dimensions are for orientation bin classification
        # and the second 8 dimensions are for orientation offsets.
        pred_orientations = torch.cat((bbox[:, 32:40], bbox[:, 40:48]),
                                      dim=1)

        # 3 dimensions for the uncertainties of the solved depths from
        # groups of keypoints
        pred_keypoints_depth_uncertainty = bbox[:, 26:29]

        # 1 dimension for the uncertainty of directly regressed depth
        pred_direct_depth_uncertainty = bbox[:, 49:50].squeeze(-1)

        # 2 dimension of offsets x keypoints (8 corners + top/bottom center)
        pred_keypoints2d = bbox[:, 6:26].reshape(-1, 10, 2)

        # 1 dimension for depth offsets
        pred_direct_depth_offsets = bbox[:, 48:49].squeeze(-1)

        # decode the pred residual dimensions to real dimensions
        pred_dimensions = self.decode_dims(labels, pred_dimensions_offsets3d)
        pred_direct_depth = self.decode_direct_depth(pred_direct_depth_offsets)
        pred_keypoints_depth = self.keypoints2depth(pred_keypoints2d,
                                                    pred_dimensions, cam2imgs,
                                                    downsample_ratio)

        pred_direct_depth_uncertainty = torch.clamp(
            pred_direct_depth_uncertainty, self.uncertainty_range[0],
            self.uncertainty_range[1])
        pred_keypoints_depth_uncertainty = torch.clamp(
            pred_keypoints_depth_uncertainty, self.uncertainty_range[0],
            self.uncertainty_range[1])

        if self.combine_depth:
            pred_depth_uncertainty = torch.cat(
                (pred_direct_depth_uncertainty.unsqueeze(-1),
                 pred_keypoints_depth_uncertainty),
                dim=1).exp()
            pred_depth = torch.cat(
                (pred_direct_depth.unsqueeze(-1), pred_keypoints_depth),
                dim=1)
            pred_combined_depth = \
                self.combine_depths(pred_depth, pred_depth_uncertainty)
        else:
            pred_combined_depth = None

        preds = dict(
            bboxes2d=pred_bboxes2d,
            dimensions=pred_dimensions,
            offsets2d=pred_offsets2d,
            keypoints2d=pred_keypoints2d,
            orientations=pred_orientations,
            direct_depth=pred_direct_depth,
            keypoints_depth=pred_keypoints_depth,
            combined_depth=pred_combined_depth,
            direct_depth_uncertainty=pred_direct_depth_uncertainty,
            keypoints_depth_uncertainty=pred_keypoints_depth_uncertainty,
        )

        return preds
    def decode_direct_depth(self, depth_offsets):
        """Transform depth offset to directly regressed depth.

        Args:
            depth_offsets (torch.Tensor): Predicted depth offsets.
                shape: (N, )

        Return:
            torch.Tensor: Directly regressed depth.
                shape: (N, )
        """
        if self.depth_mode == 'exp':
            direct_depth = depth_offsets.exp()
        elif self.depth_mode == 'linear':
            base_depth = depth_offsets.new_tensor(self.base_depth)
            direct_depth = depth_offsets * base_depth[1] + base_depth[0]
        elif self.depth_mode == 'inv_sigmoid':
            direct_depth = 1 / torch.sigmoid(depth_offsets) - 1
        else:
            raise ValueError

        if self.depth_range is not None:
            direct_depth = torch.clamp(
                direct_depth,
                min=self.depth_range[0],
                max=self.depth_range[1])

        return direct_depth

    def decode_location(self,
                        base_centers2d,
                        offsets2d,
                        depths,
                        cam2imgs,
                        downsample_ratio,
                        pad_mode='default'):
        """Retrieve object location.

        Args:
            base_centers2d (torch.Tensor): predicted base centers2d.
                shape: (N, 2)
            offsets2d (torch.Tensor): The offsets between real centers2d
                and base centers2d.
                shape: (N, 2)
            depths (torch.Tensor): Depths of objects.
                shape: (N, )
            cam2imgs (torch.Tensor): Batch images' camera intrinsic matrix.
                shape: kitti (N, 4, 4)  nuscenes (N, 3, 3)
            downsample_ratio (int): The stride of feature map.
            pad_mode (str, optional): Padding mode used in
                training data augmentation. Defaults to 'default'.

        Return:
            torch.Tensor: Centers of 3D boxes.
                shape: (N, 3)
        """
        N = cam2imgs.shape[0]
        # (N, 4, 4)
        cam2imgs_inv = cam2imgs.inverse()
        if pad_mode == 'default':
            centers2d_img = (base_centers2d + offsets2d) * downsample_ratio
        else:
            raise NotImplementedError
        # (N, 3)
        centers2d_img = \
            torch.cat((centers2d_img, depths.unsqueeze(-1)), dim=1)
        # (N, 4, 1)
        centers2d_extend = \
            torch.cat((centers2d_img, centers2d_img.new_ones(N, 1)),
                      dim=1).unsqueeze(-1)
        locations = torch.matmul(cam2imgs_inv, centers2d_extend).squeeze(-1)

        return locations[:, :3]
    def keypoints2depth(self,
                        keypoints2d,
                        dimensions,
                        cam2imgs,
                        downsample_ratio=4,
                        group0_index=[(7, 3), (0, 4)],
                        group1_index=[(2, 6), (1, 5)]):
        """Decode depth from three groups of keypoints and the geometry
        projection model. 2D keypoints including 8 corners and top/bottom
        centers will be divided into three groups which will be used to
        calculate three depths of the object.

        .. code-block:: none

                Group center keypoints:

                             + --------------- +
                            /|   top center   /|
                           / |      .        / |
                          /  |      |       /  |
                         + ---------|----- +   +
                         |  /       |      |  /
                         | /        .      | /
                         |/  bottom center |/
                         + --------------- +

                Group 0 keypoints:

                          0
                         + -------------- +
                        /|               /|
                       / |              / |
                      /  |            5/  |
                     + -------------- +   +
                     |  /3            |  /
                     | /              | /
                     |/               |/
                     + -------------- + 6

                Group 1 keypoints:

                                            4
                         + -------------- +
                        /|               /|
                       / |              / |
                      /  |             /  |
                   1 + -------------- +   + 7
                     |  /              |  /
                     | /               | /
                     |/                |/
                   2 + -------------- +

        Args:
            keypoints2d (torch.Tensor): Keypoints of objects.
                8 vertices + top/bottom center.
                shape: (N, 10, 2)
            dimensions (torch.Tensor): Dimensions of objects.
                shape: (N, 3)
            cam2imgs (torch.Tensor): Batch images' camera intrinsic matrix.
                shape: kitti (N, 4, 4)  nuscenes (N, 3, 3)
            downsample_ratio (int, optional): The stride of feature map.
                Defaults: 4.
            group0_index (list[tuple[int]], optional): Keypoint index pairs
                of group 0 used to calculate the depth.
                Defaults: [(7, 3), (0, 4)].
            group1_index (list[tuple[int]], optional): Keypoint index pairs
                of group 1 used to calculate the depth.
                Defaults: [(2, 6), (1, 5)].

        Return:
            torch.Tensor: Depth computed from three groups of
                keypoints (top/bottom, group0, group1)
                shape: (N, 3)
        """
        pred_height_3d = dimensions[:, 1].clone()
        f_u = cam2imgs[:, 0, 0]
        center_height = keypoints2d[:, -2, 1] - keypoints2d[:, -1, 1]
        corner_group0_height = keypoints2d[:, group0_index[0], 1] \
            - keypoints2d[:, group0_index[1], 1]
        corner_group1_height = keypoints2d[:, group1_index[0], 1] \
            - keypoints2d[:, group1_index[1], 1]
        center_depth = f_u * pred_height_3d / (
            F.relu(center_height) * downsample_ratio + self.eps)
        corner_group0_depth = (f_u * pred_height_3d).unsqueeze(-1) / (
            F.relu(corner_group0_height) * downsample_ratio + self.eps)
        corner_group1_depth = (f_u * pred_height_3d).unsqueeze(-1) / (
            F.relu(corner_group1_height) * downsample_ratio + self.eps)

        corner_group0_depth = corner_group0_depth.mean(dim=1)
        corner_group1_depth = corner_group1_depth.mean(dim=1)

        keypoints_depth = torch.stack(
            (center_depth, corner_group0_depth, corner_group1_depth), dim=1)
        keypoints_depth = torch.clamp(
            keypoints_depth, min=self.depth_range[0], max=self.depth_range[1])

        return keypoints_depth
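    # Editor's note, not part of the original file: each depth above is the
    # pinhole relation  depth = f_u * H_3d / h_2d,  where h_2d is a projected
    # edge height rescaled to input-image pixels via downsample_ratio. For
    # example, with f_u = 720 px, a 1.5 m tall box whose top/bottom center
    # keypoints are 13.5 px apart on a stride-4 feature map gives
    # depth ~ 720 * 1.5 / (13.5 * 4) = 20 m.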
    def decode_dims(self, labels, dims_offset):
        """Retrieve object dimensions.

        Args:
            labels (torch.Tensor): Each points' category id.
                shape: (N, K)
            dims_offset (torch.Tensor): Dimension offsets.
                shape: (N, 3)

        Returns:
            torch.Tensor: Shape (N, 3)
        """
        if self.dims_mode == 'exp':
            dims_offset = dims_offset.exp()
        elif self.dims_mode == 'linear':
            labels = labels.long()
            base_dims = dims_offset.new_tensor(self.base_dims)
            dims_mean = base_dims[:, :3]
            dims_std = base_dims[:, 3:6]
            cls_dimension_mean = dims_mean[labels, :]
            cls_dimension_std = dims_std[labels, :]
            dimensions = dims_offset * cls_dimension_mean + cls_dimension_std
        else:
            raise ValueError

        return dimensions

    def decode_orientation(self, ori_vector, locations):
        """Retrieve object orientation.

        Args:
            ori_vector (torch.Tensor): Local orientation vector
                in [axis_cls, head_cls, sin, cos] format.
                shape: (N, num_dir_bins * 4)
            locations (torch.Tensor): Object location.
                shape: (N, 3)

        Returns:
            tuple[torch.Tensor]: yaws and local yaws of 3d bboxes.
        """
        if self.multibin:
            pred_bin_cls = ori_vector[:, :self.num_dir_bins * 2].view(
                -1, self.num_dir_bins, 2)
            pred_bin_cls = pred_bin_cls.softmax(dim=2)[..., 1]
            orientations = ori_vector.new_zeros(ori_vector.shape[0])
            for i in range(self.num_dir_bins):
                mask_i = (pred_bin_cls.argmax(dim=1) == i)
                start_bin = self.num_dir_bins * 2 + i * 2
                end_bin = start_bin + 2
                pred_bin_offset = ori_vector[mask_i, start_bin:end_bin]
                orientations[mask_i] = pred_bin_offset[:, 0].atan2(
                    pred_bin_offset[:, 1]) + self.bin_centers[i]
        else:
            axis_cls = ori_vector[:, :2].softmax(dim=1)
            axis_cls = axis_cls[:, 0] < axis_cls[:, 1]
            head_cls = ori_vector[:, 2:4].softmax(dim=1)
            head_cls = head_cls[:, 0] < head_cls[:, 1]
            # cls axis
            orientations = self.bin_centers[axis_cls + head_cls * 2]
            sin_cos_offset = F.normalize(ori_vector[:, 4:])
            # use atan2 to recover the offset angle from (sin, cos)
            orientations += sin_cos_offset[:, 0].atan2(sin_cos_offset[:, 1])

        locations = locations.view(-1, 3)
        rays = locations[:, 0].atan2(locations[:, 2])
        local_yaws = orientations
        yaws = local_yaws + rays

        larger_idx = (yaws > np.pi).nonzero(as_tuple=False)
        small_idx = (yaws < -np.pi).nonzero(as_tuple=False)
        if len(larger_idx) != 0:
            yaws[larger_idx] -= 2 * np.pi
        if len(small_idx) != 0:
            yaws[small_idx] += 2 * np.pi

        larger_idx = (local_yaws > np.pi).nonzero(as_tuple=False)
        small_idx = (local_yaws < -np.pi).nonzero(as_tuple=False)
        if len(larger_idx) != 0:
            local_yaws[larger_idx] -= 2 * np.pi
        if len(small_idx) != 0:
            local_yaws[small_idx] += 2 * np.pi

        return yaws, local_yaws
    def decode_bboxes2d(self, reg_bboxes2d, base_centers2d):
        """Retrieve [x1, y1, x2, y2] format 2D bboxes.

        Args:
            reg_bboxes2d (torch.Tensor): Predicted FCOS style
                2D bboxes.
                shape: (N, 4)
            base_centers2d (torch.Tensor): predicted base centers2d.
                shape: (N, 2)

        Returns:
            torch.Tensor: [x1, y1, x2, y2] format 2D bboxes.
        """
        centers_x = base_centers2d[:, 0]
        centers_y = base_centers2d[:, 1]

        xs_min = centers_x - reg_bboxes2d[..., 0]
        ys_min = centers_y - reg_bboxes2d[..., 1]
        xs_max = centers_x + reg_bboxes2d[..., 2]
        ys_max = centers_y + reg_bboxes2d[..., 3]

        bboxes2d = torch.stack([xs_min, ys_min, xs_max, ys_max], dim=-1)

        return bboxes2d

    def combine_depths(self, depth, depth_uncertainty):
        """Combine all the predicted depths with depth uncertainty.

        Args:
            depth (torch.Tensor): Predicted depths of each object.
                shape: (N, 4)
            depth_uncertainty (torch.Tensor): Depth uncertainty for
                each depth of each object.
                shape: (N, 4)

        Returns:
            torch.Tensor: combined depth.
        """
        uncertainty_weights = 1 / depth_uncertainty
        uncertainty_weights = \
            uncertainty_weights / \
            uncertainty_weights.sum(dim=1, keepdim=True)
        combined_depth = torch.sum(depth * uncertainty_weights, dim=1)

        return combined_depth
```
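The inverse-uncertainty weighting in `combine_depths` is easy to check numerically; a small self-contained example with made-up values:

```python
import torch

depth = torch.tensor([[20.0, 22.0, 19.0, 21.0]])    # direct + 3 keypoint depths
uncertainty = torch.tensor([[0.5, 2.0, 1.0, 4.0]])  # already exp()'ed in decode()

weights = 1 / uncertainty
weights = weights / weights.sum(dim=1, keepdim=True)  # [0.533, 0.133, 0.267, 0.067]
combined = torch.sum(depth * weights, dim=1)
print(combined)  # tensor([20.0667]), the most certain estimate dominates
```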