"vscode:/vscode.git/clone" did not exist on "7f55738e7170e15d36b5113f33ed6afa4c229096"
Commit 17ac0691 authored by Xiangxu-0103, committed by ZwwWayne

[Docs] Refine the documentation (#1994)



* refine doc

* refine docs

* replace `CLASSES` with `classes`

* update doc

* Minor fix
Co-authored-by: Tai-Wang <tab_wang@outlook.com>
parent 116d9f23
@@ -236,7 +236,7 @@ Please refer to [getting_started.md](docs/en/getting_started.md) for installatio

## Get Started

Please see [getting_started.md](docs/en/getting_started.md) for the basic usage of MMDetection3D. We provide guidance for quick run [with existing dataset](docs/en/user_guides/train_test.md) and [with new dataset](docs/en/user_guides/2_new_data_model.md) for beginners. There are also tutorials for [learning configuration systems](docs/en/user_guides/config.md), [customizing dataset](docs/en/advanced_guides/customize_dataset.md), [designing data pipeline](docs/en/user_guides/data_pipeline.md), [customizing models](docs/en/advanced_guides/customize_models.md), [customizing runtime settings](docs/en/advanced_guides/customize_runtime.md) and [Waymo dataset](docs/en/advanced_guides/datasets/waymo_det.md).

Please refer to [FAQ](docs/en/notes/faq.md) for frequently asked questions. When updating the version of MMDetection3D, please also check the [compatibility doc](docs/en/notes/compatibility.md) to be aware of the BC-breaking updates introduced in each version.
......
@@ -19,7 +19,7 @@

  <div>&nbsp;</div>
</div>

[![docs](https://img.shields.io/badge/docs-latest-blue)](https://mmdetection3d.readthedocs.io/zh_CN/1.1/)
[![badge](https://github.com/open-mmlab/mmdetection3d/workflows/build/badge.svg)](https://github.com/open-mmlab/mmdetection3d/actions)
[![codecov](https://codecov.io/gh/open-mmlab/mmdetection3d/branch/master/graph/badge.svg)](https://codecov.io/gh/open-mmlab/mmdetection3d)
[![license](https://img.shields.io/github/license/open-mmlab/mmdetection3d.svg)](https://github.com/open-mmlab/mmdetection3d/blob/master/LICENSE)
@@ -28,7 +28,7 @@

**v1.1.0rc1** was released on 2022.10.11.

Due to the unification and simplification of coordinate systems, the compatibility of models is affected. Currently, most models have been aligned in accuracy with similar performance, but a few models are still being benchmarked. We will update all model weights and benchmarks in upcoming releases. You can find more details in the [changelog](docs/zh_cn/notes/changelog.md) and the [v1.0.x changelog](docs/zh_cn/notes/changelog_v1.0.x.md).

Documentation: https://mmdetection3d.readthedocs.io/
@@ -50,8 +50,7 @@ MMDetection3D is an open-source, next-generation object detection toolbox based on PyTorch

- **Support for indoor/outdoor datasets**

  Support for indoor/outdoor 3D detection datasets, including ScanNet, SUNRGB-D, Waymo, nuScenes, Lyft, and KITTI.

  For the nuScenes dataset, we also support the [nuImages dataset](https://github.com/open-mmlab/mmdetection3d/tree/1.1/configs/nuimages).

- **Natural integration with 2D detectors**
@@ -78,7 +77,7 @@ MMDetection3D is an open-source, next-generation object detection toolbox based on PyTorch

## Changelog

We released version **1.1.0rc1** on 2022.10.11.

For more details and the release history, please refer to [changelog.md](docs/zh_cn/notes/changelog.md).
@@ -236,7 +235,7 @@ MMDetection3D is an open-source, next-generation object detection toolbox based on PyTorch

## Get Started

Please refer to the [getting started documentation](docs/zh_cn/getting_started.md) to learn the basic usage of MMDetection3D. For beginners, we provide guides for [existing datasets](docs/zh_cn/user_guides/train_test.md) and [new datasets](docs/zh_cn/user_guides/2_new_data_model.md). We also provide advanced tutorials covering [learning about configs](docs/zh_cn/user_guides/config.md), [customizing datasets](docs/zh_cn/advanced_guides/customize_dataset.md), [designing new data pipelines](docs/zh_cn/user_guides/data_pipeline.md), [customizing models](docs/zh_cn/advanced_guides/customize_models.md), [customizing runtime settings](docs/zh_cn/advanced_guides/customize_runtime.md), and the [Waymo dataset](docs/zh_cn/advanced_guides/datasets/waymo_det.md).

Please refer to the [FAQ](docs/zh_cn/notes/faq.md) for frequently asked questions. When upgrading MMDetection3D, please also check the [compatibility documentation](docs/zh_cn/notes/compatibility.md) to be aware of the backward-incompatible updates introduced in each version.
......
@@ -16,39 +16,39 @@ The ideal situation is that we can reorganize the customized raw data and conver

#### Point cloud Format
Currently, we only support `.bin` format point clouds for training and inference. Before training on your own datasets, you need to convert point cloud files in other formats to `.bin` files. The common point cloud data formats include `.pcd` and `.las`; we list some open-source tools for reference.

1. Convert `.pcd` to `.bin`: https://github.com/DanielPollithy/pypcd

- You can install `pypcd` with the following command:

```bash
pip install git+https://github.com/DanielPollithy/pypcd.git
```

- You can use the following script to read the `.pcd` file, convert it to `.bin` format and save it:
```python
import numpy as np
from pypcd import pypcd

pcd_data = pypcd.PointCloud.from_path('point_cloud_data.pcd')
points = np.zeros([pcd_data.width, 4], dtype=np.float32)
points[:, 0] = pcd_data.pc_data['x'].copy()
points[:, 1] = pcd_data.pc_data['y'].copy()
points[:, 2] = pcd_data.pc_data['z'].copy()
points[:, 3] = pcd_data.pc_data['intensity'].copy().astype(np.float32)
with open('point_cloud_data.bin', 'wb') as f:
    f.write(points.tobytes())
```

2. Convert `.las` to `.bin`: The common conversion path is `.las -> .pcd -> .bin`, and the conversion from `.las -> .pcd` can be achieved through [this tool](https://github.com/Hitachi-Automotive-And-Industry-Lab/semantic-segmentation-editor).
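If your data is already in `.las` format, a direct `.las -> .bin` conversion is also possible. The sketch below is an illustration only: it relies on the third-party `laspy` package, which is not part of the workflow described above, and the field names (`x`, `y`, `z`, `intensity`) are assumptions about your specific `.las` files.

```python
import numpy as np
import laspy  # assumption: laspy >= 2.0 is installed (pip install laspy)

# Read the .las file and pack (x, y, z, intensity) into a float32 .bin file,
# mirroring the 4-dim layout produced by the .pcd conversion script above.
las = laspy.read('point_cloud_data.las')
points = np.stack(
    [np.asarray(las.x), np.asarray(las.y), np.asarray(las.z),
     np.asarray(las.intensity, dtype=np.float32)],
    axis=-1).astype(np.float32)
with open('point_cloud_data.bin', 'wb') as f:
    f.write(points.tobytes())
```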
#### Label Format

The most basic information, i.e. the 3D bounding box and category label of each object in the scene, needs to be contained in the annotation `.txt` file. Each line represents a 3D box in a certain scene as follows:
```
# format: [x, y, z, dx, dy, dz, yaw, category_name]
1.23 1.42 0.23 3.96 1.65 1.55 1.56 Car
3.51 2.15 0.42 1.05 0.87 1.86 1.23 Pedestrian
......
@@ -32,7 +32,7 @@ mmdetection3d

### Create KITTI dataset

To create KITTI point cloud data, we load the raw point cloud data and generate the relevant annotations including object labels and bounding boxes. We also extract the point cloud of each single training object in the KITTI dataset and save them as `.bin` files in `data/kitti/kitti_gt_database`. Meanwhile, `.pkl` info files are also generated for training or validation. Subsequently, create KITTI data by running:

```bash
mkdir ./data/kitti/ && mkdir ./data/kitti/ImageSets
@@ -98,7 +98,7 @@ kitti
- info\['lidar_points'\]\['Tr_imu_to_velo'\]: Transformation from IMU coordinate to Velodyne coordinate with shape (4, 4).
- info\['instances'\]: It is a list of dict. Each dict contains all annotation information of a single instance. For the i-th instance:
- info\['instances'\]\[i\]\['bbox'\]: List of 4 numbers representing the 2D bounding box of the instance, in (x1, y1, x2, y2) order.
- info\['instances'\]\[i\]\['bbox_3d'\]: List of 7 numbers representing the 3D bounding box of the instance, in (x, y, z, l, h, w, yaw) order.
- info\['instances'\]\[i\]\['bbox_label'\]: An int indicating the 2D label of the instance, where -1 means ignore.
- info\['instances'\]\[i\]\['bbox_label_3d'\]: An int indicating the 3D label of the instance, where -1 means ignore.
- info\['instances'\]\[i\]\['depth'\]: Projected center depth of the 3D bounding box with respect to the image plane.
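As a quick sanity check of the generated info file, you can load it and print a few of the fields listed above. This is only a sketch: it assumes the v1.1-style info file whose samples are stored under a top-level `data_list` key, and the path is an example.

```python
import pickle

# Example path; adjust to your own data root.
with open('data/kitti/kitti_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)

# Assumption: the v1.1 info format stores per-sample dicts under 'data_list'.
for info in infos['data_list'][:2]:
    print(info['lidar_points']['lidar_path'])
    for instance in info['instances']:
        # (x, y, z, l, h, w, yaw) box and its 3D label (-1 means ignore).
        print(instance['bbox_3d'], instance['bbox_label_3d'])
```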
@@ -114,14 +114,15 @@ Please refer to [kitti_converter.py](https://github.com/open-mmlab/mmdetection3d

## Train pipeline

A typical train pipeline of 3D detection on KITTI is as below:

```python
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,  # x, y, z, intensity
        use_dim=4),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(type='ObjectSample', db_sampler=db_sampler),
    dict(
@@ -180,32 +181,26 @@ aos AP:97.70, 89.11, 87.38

An example to test PointPillars on KITTI with 8 GPUs and generate a submission to the leaderboard is as follows:

- First, you need to modify the `test_dataloader` and `test_evaluator` dict in your config file, just like:
```python
data_root = 'data/kitti/'
test_dataloader = dict(
    dataset=dict(
        ann_file='kitti_infos_test.pkl',
        load_eval_anns=False,
        data_prefix=dict(pts='testing/velodyne_reduced')))
test_evaluator = dict(
    ann_file=data_root + 'kitti_infos_test.pkl',
    format_only=True,
    pklfile_prefix='results/kitti-3class/kitti_results',
    submission_prefix='results/kitti-3class/kitti_results')
```
- And then, you can run the test script.

```shell
./tools/dist_test.sh configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py work_dirs/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class/latest.pth 8
```
- Or you can use `--cfg-options "test_evaluator.pklfile_prefix=results/kitti-3class/kitti_results" "test_evaluator.submission_prefix=results/kitti-3class/kitti_results"` after the test command and run the test script directly.
```shell
mkdir -p results/kitti-3class
./tools/dist_test.sh configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py work_dirs/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class/latest.pth 8 --cfg-options 'test_evaluator.pklfile_prefix=results/kitti-3class/kitti_results' 'test_evaluator.submission_prefix=results/kitti-3class/kitti_results'
```
After generating `results/kitti-3class/kitti_results/xxxxx.txt` files, you can submit these files to the KITTI benchmark. Please refer to the [KITTI official website](http://www.cvlibs.net/datasets/kitti/index.php) for more details.

@@ -40,8 +40,8 @@ Note that we follow the original folder names for clear organization. Please ren

## Dataset Preparation

The way to organize the Lyft dataset is similar to nuScenes. We also generate the `.pkl` files which share almost the same structure.

Next, we will mainly focus on the difference between these two datasets. For a more detailed explanation of the info structure, please refer to the [nuScenes tutorial](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/docs/en/advanced_guides/datasets/nuscenes_det.md).

To prepare info files for Lyft, run the following commands:

@@ -90,7 +90,7 @@ mmdetection3d
- info\['lidar_points'\]\['num_pts_feats'\]: The feature dimension of point.
- info\['lidar_points'\]\['lidar2ego'\]: The transformation matrix from this lidar sensor to ego vehicle. (4x4 list)
- info\['lidar_points'\]\['ego2global'\]: The transformation matrix from the ego vehicle to global coordinates. (4x4 list)
- info\['lidar_sweeps'\]: A list containing sweeps information (the intermediate lidar frames without annotations).
- info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['data_path'\]: The lidar data path of the i-th sweep.
- info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['lidar2ego'\]: The transformation matrix from this lidar sensor to ego vehicle in the i-th sweep timestamp.
- info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['ego2global'\]: The transformation matrix from the ego vehicle in the i-th sweep timestamp to global coordinates. (4x4 list)
@@ -111,11 +111,11 @@ mmdetection3d

Next, we will elaborate on the differences compared to nuScenes in terms of the details recorded in these info files.

- Without `lyft_database/xxxxx.bin`: This folder and the `.bin` files are not extracted on the Lyft dataset due to the negligible effect of ground-truth sampling in the experiments.
- `lyft_infos_train.pkl`:
- Without info\['instances'\]\[i\]\['velocity'\]: There is no velocity measurement on Lyft.
- Without info\['instances'\]\[i\]\['num_lidar_pts'\] and info\['instances'\]\[i\]\['num_radar_pts'\].

Here we only explain the data recorded in the training info files. The same applies to the validation set and the test set (without instances).
@@ -160,7 +160,7 @@ where the first 3 dimensions refer to point coordinates, and the last refers to

## Evaluation

An example to evaluate PointPillars with 8 GPUs with Lyft metrics is as follows:

```shell
bash ./tools/dist_test.sh configs/pointpillars/pointpillars_hv_fpn_sbn-all_8xb2-2x_lyft-3d.py checkpoints/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d_20210517_202818-fc6904c3.pth 8
......
@@ -26,7 +26,7 @@ mmdetection3d

## Dataset Preparation

We typically need to organize the useful data information with a `.pkl` file in a specific style.

To prepare these files for nuScenes, run the following command:

```bash
@@ -80,13 +80,13 @@ mmdetection3d
- info\['images'\]\['CAM_XXX'\]\['cam2ego'\]: The transformation matrix from this camera sensor to ego vehicle. (4x4 list)
- info\['images'\]\['CAM_XXX'\]\['lidar2cam'\]: The transformation matrix from lidar sensor to this camera. (4x4 list)
- info\['instances'\]: It is a list of dict. Each dict contains all annotation information of a single instance. For the i-th instance:
- info\['instances'\]\[i\]\['bbox_3d'\]: List of 7 numbers representing the 3D bounding box of the instance, in (x, y, z, l, w, h, yaw) order.
- info\['instances'\]\[i\]\['bbox_label_3d'\]: An int indicating the label of the instance, where -1 means ignore.
- info\['instances'\]\[i\]\['velocity'\]: Velocities of 3D bounding boxes (no vertical measurements due to inaccuracy), a list of shape (2, ).
- info\['instances'\]\[i\]\['num_lidar_pts'\]: Number of lidar points included in each 3D bounding box.
- info\['instances'\]\[i\]\['num_radar_pts'\]: Number of radar points included in each 3D bounding box.
- info\['instances'\]\[i\]\['bbox_3d_isvalid'\]: Whether each bounding box is valid. In general, we only take the 3D boxes that include at least one lidar or radar point as valid boxes.
- info\['cam_instances'\]: It is a dict containing keys `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`. For vision-based 3D object detection tasks, we split the 3D annotations of the whole scene according to the camera they belong to. For the i-th instance:
- info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox_label'\]: Label of the instance.
- info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox_label_3d'\]: Label of the instance.
- info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox'\]: 2D bounding box annotation (exterior rectangle of the projected 3D box), a list arranged as \[x1, y1, x2, y2\].
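The snippet below is a minimal sketch of how these per-camera annotations can be inspected once the info file has been generated; it assumes the v1.1-style info file with a top-level `data_list` key, and the path is an example.

```python
import pickle

with open('data/nuscenes/nuscenes_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)

sample = infos['data_list'][0]  # assumption: samples live under 'data_list'
for cam_name, instances in sample['cam_instances'].items():
    for inst in instances:
        # 2D box as [x1, y1, x2, y2] plus the per-camera 3D label.
        x1, y1, x2, y2 = inst['bbox']
        print(cam_name, inst['bbox_label_3d'], (x1, y1, x2, y2))
```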
@@ -101,9 +101,9 @@ Note:

1. The differences between `bbox_3d` in `instances` and that in `cam_instances`.
   Both `bbox_3d` have been converted to the MMDet3D coordinate system, but `bboxes_3d` in `instances` is in the LiDAR coordinate format and `bboxes_3d` in `cam_instances` is in the Camera coordinate format. Mind the difference between them in the 3D box representation ('l, w, h' vs. 'l, h, w').
2. Here we only explain the data recorded in the training info files. The same applies to the validation and testing sets (the `.pkl` file of the test set does not contain `instances` and `cam_instances`).

The core function to get `nuscenes_infos_xxx.pkl` is [\_fill_trainval_infos](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/nuscenes_converter.py#L146).
Please refer to [nuscenes_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/nuscenes_converter.py) for more details.

## Training pipeline
......
@@ -2,7 +2,7 @@

## Dataset preparation

For the overall process, please refer to the [README](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/data/scannet/README.md) page for ScanNet.

### Export ScanNet point cloud data

@@ -188,7 +188,7 @@ def process_single_scene(sample_idx):

    return info
```

The directory structure after processing should be as below:

```
scannet
@@ -226,12 +226,12 @@ scannet
- `semantic_mask/xxxxx.bin`: The semantic label for each point, value range: \[1, 40\], i.e. the `nyu40id` standard. Note: the `nyu40id` ID will be mapped to the train ID in the train pipeline `PointSegClassMapping`.
- `posed_images/scenexxxx_xx`: The set of `.jpg` images with `.txt` 4x4 poses and the single `.txt` file with the camera intrinsic matrix.
- `scannet_infos_train.pkl`: The train data infos, the detailed info of each scan is as follows:
- info\['lidar_points'\]: A dict containing all information related to the lidar points.
- info\['lidar_points'\]\['lidar_path'\]: The filename of the lidar point cloud data.
- info\['lidar_points'\]\['num_pts_feats'\]: The feature dimension of point.
- info\['lidar_points'\]\['axis_align_matrix'\]: The transformation matrix to align the axis (see the sketch after this list).
- info\['pts_semantic_mask_path'\]: The filename of the semantic mask annotation.
- info\['pts_instance_mask_path'\]: The filename of the instance mask annotation.
- info\['instances'\]: A list of dicts containing all annotations; each dict contains all annotation information of a single instance. For the i-th instance:
- info\['instances'\]\[i\]\['bbox_3d'\]: List of 6 numbers representing the axis-aligned 3D bounding box of the instance in the depth coordinate system, in (x, y, z, l, w, h) order.
- info\['instances'\]\[i\]\['bbox_label_3d'\]: The label of each 3D bounding box.
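The following is a minimal sketch (with random stand-in data) of how an `axis_align_matrix` of the kind listed above is typically applied: it is a 4x4 homogeneous transform acting on the raw point coordinates, which is what the `GlobalAlignment` step in the train pipeline does.

```python
import numpy as np

# Stand-in data: 40000 points with (x, y, z, r, g, b) features and an identity
# alignment matrix; in practice both come from the exported ScanNet files.
points = np.random.rand(40000, 6).astype(np.float32)
axis_align_matrix = np.eye(4, dtype=np.float32)

# Apply the homogeneous transform to the xyz coordinates only.
homo = np.concatenate(
    [points[:, :3], np.ones((points.shape[0], 1), dtype=np.float32)], axis=1)
points[:, :3] = (homo @ axis_align_matrix.T)[:, :3]
```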
@@ -257,8 +257,7 @@ train_pipeline = [

        with_mask_3d=True,
        with_seg_3d=True),
    dict(type='GlobalAlignment', rotation_axis=2),
    dict(type='PointSegClassMapping'),
    dict(type='PointSample', num_points=40000),
    dict(
        type='RandomFlip3D',
@@ -288,6 +287,6 @@ train_pipeline = [

## Metrics

Typically mean Average Precision (mAP) is used for evaluation on ScanNet, e.g. `mAP@0.25` and `mAP@0.5`. In detail, a generic function to compute precision and recall for 3D object detection for multiple classes is called. Please refer to [indoor_eval](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/indoor_eval.py) for more details.

As introduced in section `Export ScanNet data`, all ground truth 3D bounding boxes are axis-aligned, i.e. the yaw is zero. So the yaw target of the network-predicted 3D bounding boxes is also zero, and axis-aligned 3D Non-Maximum Suppression (NMS), which disregards rotation, is adopted during post-processing, as sketched below.
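The sketch below illustrates the idea of axis-aligned 3D NMS on boxes given as (x1, y1, z1, x2, y2, z2) corners; it is a plain NumPy illustration of the principle rather than the exact implementation used in the codebase.

```python
import numpy as np

def aligned_3d_nms(boxes, scores, iou_thr=0.25):
    """Greedy NMS for axis-aligned 3D boxes given as (x1, y1, z1, x2, y2, z2)."""
    x1, y1, z1, x2, y2, z2 = boxes.T
    volume = (x2 - x1) * (y2 - y1) * (z2 - z1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Overlap of the current box with all remaining boxes.
        inter = (
            np.clip(np.minimum(x2[i], x2[order[1:]]) - np.maximum(x1[i], x1[order[1:]]), 0, None)
            * np.clip(np.minimum(y2[i], y2[order[1:]]) - np.maximum(y1[i], y1[order[1:]]), 0, None)
            * np.clip(np.minimum(z2[i], z2[order[1:]]) - np.maximum(z1[i], z1[order[1:]]), 0, None))
        iou = inter / (volume[i] + volume[order[1:]] - inter)
        order = order[1:][iou <= iou_thr]
    return np.array(keep, dtype=np.int64)

# Example: two heavily overlapping boxes and one separate box.
boxes = np.array([[0, 0, 0, 1, 1, 1], [0.1, 0, 0, 1.1, 1, 1], [5, 5, 5, 6, 6, 6]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
print(aligned_3d_nms(boxes, scores))  # -> [0 2]
```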
@@ -2,7 +2,7 @@

## Dataset preparation

For the overall process, please refer to the [README](https://github.com/open-mmlab/mmdetection3d/blob/master/data/sunrgbd/README.md) page for SUN RGB-D.

### Download SUN RGB-D data and toolbox

@@ -153,9 +153,9 @@ sunrgbd

- `points/xxxxxx.bin`: The point cloud data after downsampling.
- `sunrgbd_infos_train.pkl`: The train data infos, the detailed info of each scene is as follows:
- info\['lidar_points'\]: A dict containing all information related to the lidar points.
- info\['lidar_points'\]\['num_pts_feats'\]: The feature dimension of point.
- info\['lidar_points'\]\['lidar_path'\]: The filename of the lidar point cloud data.
- info\['images'\]: A dict containing all information related to the image data.
- info\['images'\]\['CAM0'\]\['img_path'\]: The filename of the image.
- info\['images'\]\['CAM0'\]\['depth2img'\]: Transformation matrix from depth to image with shape (4, 4).
@@ -245,6 +245,6 @@ The image augmentation functions are implemented in [MMDetection](https://github

## Metrics

Same as ScanNet, typically mean Average Precision (mAP) is used for evaluation on SUN RGB-D, e.g. `mAP@0.25` and `mAP@0.5`. In detail, a generic function to compute precision and recall for 3D object detection for multiple classes is called. Please refer to [indoor_eval](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/indoor_eval.py) for more details.

Since SUN RGB-D consists of image data, detection on image data is also feasible. For instance, in ImVoteNet, we first train an image detector, and we also use mAP for evaluation, e.g. `mAP@0.5`. We use the `eval_map` function from [MMDetection](https://github.com/open-mmlab/mmdetection) to calculate mAP.
@@ -7,8 +7,8 @@ MMDetection3D works on Linux, Windows (experimental support) and macOS and require

- PyTorch 1.6+
- CUDA 9.2+ (If you build PyTorch from source, CUDA 9.0 is also compatible)
- GCC 5+
- [MMEngine](https://mmengine.readthedocs.io/en/latest/#installation)
- [MMCV](https://mmcv.readthedocs.io/en/latest/#installation)

```{note}
If you are experienced with PyTorch and have already installed it, just skip this part and jump to the [next section](#installation). Otherwise, you can follow these steps for the preparation.
@@ -118,20 +118,20 @@ Note:

4. Some dependencies are optional. Simply running `pip install -v -e .` will only install the minimum runtime requirements. To use optional dependencies like `albumentations` and `imagecorruptions` either install them manually with `pip install -r requirements/optional.txt` or specify desired extras when calling `pip` (e.g. `pip install -v -e .[optional]`). Valid keys for the extras field are: `all`, `tests`, `build`, and `optional`.

We have supported `spconv 2.0`. If the user has installed `spconv 2.0`, the code will use `spconv 2.0` first, which will take up less GPU memory than the default `mmcv spconv`. Users can use the following commands to install `spconv 2.0`:

```bash
pip install cumm-cuxxx
pip install spconv-cuxxx
```

Where `xxx` is the CUDA version in the environment.

For example, using CUDA 10.2, the command will be `pip install cumm-cu102 && pip install spconv-cu102`.

Supported CUDA versions include 10.2, 11.1, 11.3, and 11.4. Users can also install it by building from source. For more details please refer to [spconv v2.x](https://github.com/traveller59/spconv).
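A quick way to verify which sparse convolution backend will be picked up is to try importing the `spconv.pytorch` module that spconv 2.x provides; this check is only a convenience sketch and not part of the official installation steps.

```python
# Hypothetical sanity check: spconv 2.x exposes `spconv.pytorch`,
# while older spconv / the mmcv fallback does not.
try:
    import spconv.pytorch  # noqa: F401
    print('spconv 2.x detected; it will be used first')
except ImportError:
    print('spconv 2.x not found; the default mmcv spconv ops will be used')
```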
We also support `Minkowski Engine` as a sparse convolution backend. If necessary, please follow the original [installation guide](https://github.com/NVIDIA/MinkowskiEngine#installation) or use `pip` to install it:

```shell
conda install openblas-devel -c anaconda
@@ -156,7 +156,7 @@ Examples:

python demo/pcd_demo.py demo/data/kitti/000008.bin configs/second/second_hv-secfpn_8xb6-80e_kitti-3d-car.py checkpoints/second_hv-secfpn_8xb6-80e_kitti-3d-car_20200620_230238-393f000c.pth
```

If you want to input a `.ply` file, you can use the following function to convert it to `.bin` format. Then you can use the converted `.bin` file to generate a demo.
Note that you need to install `pandas` and `plyfile` before using this script. This function can also be used for data preprocessing for training with `.ply` data.

```python
@@ -182,7 +182,7 @@ Examples:

convert_ply('./test.ply', './test.bin')
```

If you have point clouds in other formats (`.off`, `.obj`, etc.), you can use `trimesh` to convert them into `.ply`.

```python
import trimesh
......
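# A hedged sketch (assuming the `trimesh` load/export API) of a helper that
# converts other formats to .ply; not necessarily the exact helper referred
# to in the text above.
def to_ply(input_path, output_path, original_type):
    mesh = trimesh.load(input_path, file_type=original_type)
    mesh.export(output_path, file_type='ply')

# Example usage: convert an .obj file to .ply, then reuse convert_ply above.
# to_ply('./test.obj', './test.ply', 'obj')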
# FAQ

We list some potential troubles encountered by users and developers, along with their corresponding solutions. Feel free to enrich the list if you find any frequent issues and contribute your solutions to solve them. If you have any trouble with environment configuration, model training, etc., please create an issue using the [provided templates](https://github.com/open-mmlab/mmdetection3d/blob/master/.github/ISSUE_TEMPLATE/error-report.md) and fill in all required information in the template.

## MMEngine/MMCV/MMDet/MMDet3D Installation

@@ -8,13 +8,13 @@ We list some potential troubles encountered by users and developers, along with

- The required versions of MMEngine, MMCV and MMDetection for different versions of MMDetection3D are as below. Please install the correct version of MMEngine, MMCV and MMDetection to avoid installation issues.
| MMDetection3D version | MMEngine version         | MMCV version            | MMDetection version      |
| --------------------- | :----------------------: | :---------------------: | :----------------------: |
| dev-1.x               | mmengine>=0.1.0, \<1.0.0 | mmcv>=2.0.0rc0, \<2.1.0 | mmdet>=3.0.0rc0, \<3.1.0 |
| v1.1.0rc1             | mmengine>=0.1.0, \<1.0.0 | mmcv>=2.0.0rc0, \<2.1.0 | mmdet>=3.0.0rc0, \<3.1.0 |
| v1.1.0rc0             | mmengine>=0.1.0, \<1.0.0 | mmcv>=2.0.0rc0, \<2.1.0 | mmdet>=3.0.0rc0, \<3.1.0 |
**Note:** If you want to install mmdet3d-v1.0.0rcx, the compatible MMDetection, MMSegmentation and MMCV version table can be found [here](https://mmdetection3d.readthedocs.io/en/latest/faq.html#mmcv-mmdet-mmdet3d-installation). Please choose the correct version of MMCV, MMDetection and MMSegmentation to avoid installation issues.

- If you face the error shown below when importing open3d:
......
@@ -431,7 +431,7 @@ default_scope = 'mmdet3d'  # The default registry scope to find modules. Refer t

env_cfg = dict(
    cudnn_benchmark=False,  # Whether to enable cudnn benchmark
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),  # Use fork to start multi-processing threads. 'fork' is usually faster than 'spawn' but may be unsafe. See the discussion in https://github.com/pytorch/pytorch/issues/1355
    dist_cfg=dict(backend='nccl'))  # Distribution configs
vis_backends = [dict(type='LocalVisBackend')]  # Visualization backends.
visualizer = dict(
......
@@ -78,7 +78,7 @@ mmdetection3d

### KITTI

Download KITTI 3D detection data [HERE](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d). Prepare KITTI data splits by running:

```bash
mkdir ./data/kitti/ && mkdir ./data/kitti/ImageSets
@@ -90,21 +90,21 @@ wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/sec

wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/trainval.txt --no-check-certificate --content-disposition -O ./data/kitti/ImageSets/trainval.txt
```

Then generate info files by running:

```bash
python tools/create_data.py kitti --root-path ./data/kitti --out-dir ./data/kitti --extra-tag kitti
```

In an environment using slurm, users may run the following command instead:

```bash
sh tools/create_data.sh <partition> kitti
```
### Waymo

Download Waymo open dataset V1.2 [HERE](https://waymo.com/open/download/) and its data split [HERE](https://drive.google.com/drive/folders/18BVuF_RYJF0NjZpt8SnfzANiakoRMf0o?usp=sharing). Then put `.tfrecord` files into corresponding folders in `data/waymo/waymo_format/` and put the data split `.txt` files into `data/waymo/kitti_format/ImageSets`. Download the ground truth `.bin` file for the validation set [HERE](https://console.cloud.google.com/storage/browser/waymo_open_dataset_v_1_2_0/validation/ground_truth_objects) and put it into `data/waymo/waymo_format/`. A tip is that you can use `gsutil` to download the large-scale dataset with commands. You can take this [tool](https://github.com/RalphMao/Waymo-Dataset-Tool) as an example for more details. Subsequently, prepare Waymo data by running:

```bash
python tools/create_data.py waymo --root-path ./data/waymo/ --out-dir ./data/waymo/ --workers 128 --extra-tag waymo
@@ -112,7 +112,7 @@ python tools/create_data.py waymo --root-path ./data/waymo/ --out-dir ./data/way

Note that:

- If your local disk does not have enough space for saving converted data, you can change the `--out-dir` to anywhere else. Just remember to create folders and prepare data there in advance and link them back to `data/waymo/kitti_format` after the data conversion.
- If you want faster evaluation on Waymo, you can download the preprocessed [metainfo](https://download.openmmlab.com/mmdetection3d/data/waymo/idx2metainfo.pkl) containing `contextname` and `timestamp` to the directory `data/waymo/waymo_format/`. Then, the dataset config is modified like the following:
@@ -132,7 +132,7 @@ Note that:

### NuScenes

Download nuScenes V1.0 full dataset data [HERE](https://www.nuscenes.org/download). Prepare nuscenes data by running:

```bash
python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes
@@ -140,22 +140,22 @@ python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./dat

### Lyft

Download Lyft 3D detection data [HERE](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/data). Prepare Lyft data by running:

```bash
python tools/create_data.py lyft --root-path ./data/lyft --out-dir ./data/lyft --extra-tag lyft --version v1.01
python tools/dataset_converters/lyft_data_fixer.py --version v1.01 --root-folder ./data/lyft
```

Note that we follow the original folder names for clear organization. Please rename the raw folders as shown above. Also note that the second command serves the purpose of fixing a corrupted lidar data file. Please refer to the [discussion](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/discussion/110000) for more details.

### S3DIS, ScanNet and SUN RGB-D

To prepare S3DIS data, please see its [README](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/data/s3dis/README.md).

To prepare ScanNet data, please see its [README](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/data/scannet/README.md).

To prepare SUN RGB-D data, please see its [README](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/data/sunrgbd/README.md).
### Customized Datasets

@@ -166,15 +166,15 @@ For using custom datasets, please refer to [Customize Datasets](https://github.c

If you have used v1.0.0rc1-v1.0.0rc4 mmdetection3d to create data infos before, and now you want to use the newest v1.1.0 mmdetection3d, you need to update the data infos file:

```bash
python tools/dataset_converters/update_infos_to_v2.py --dataset ${DATA_SET} --pkl-path ${PKL_PATH} --out-dir ${OUT_DIR}
```

- `--dataset`: Name of dataset.
- `--pkl-path`: Specify the data infos pkl file path.
- `--out-dir`: Output directory of the data infos pkl file.

Example:

```bash
python tools/dataset_converters/update_infos_to_v2.py --dataset kitti --pkl-path ./data/kitti/kitti_infos_trainval.pkl --out-dir ./data/kitti
```
...@@ -16,17 +16,40 @@ ...@@ -16,17 +16,40 @@
#### 点云格式 #### 点云格式
目前,我们只支持 '.bin' 格式的点云用于训练和推理。在训练自己的数据集之前,需要将其他格式的点云文件转换成 '.bin' 文件。常见的点云数据格式包括 `.pcd``.las`,我们列出一些开源工具作为参考。 目前,我们只支持 `.bin` 格式的点云用于训练和推理。在训练自己的数据集之前,需要将其他格式的点云文件转换成 `.bin` 文件。常见的点云数据格式包括 `.pcd``.las`,我们列出一些开源工具作为参考。
1. pcd 转换成 bin:https://github.com/leofansq/Tools_RosBag2KITTI 1. `.pcd` 转换成 `.bin`:https://github.com/DanielPollithy/pypcd
- You can install `pypcd` with the following command:
```bash
pip install git+https://github.com/DanielPollithy/pypcd.git
```
- You can use the following script to read a `.pcd` file, convert it to `.bin` format and save it:
```python
import numpy as np
from pypcd import pypcd
pcd_data = pypcd.PointCloud.from_path('point_cloud_data.pcd')
points = np.zeros([pcd_data.width, 4], dtype=np.float32)
points[:, 0] = pcd_data.pc_data['x'].copy()
points[:, 1] = pcd_data.pc_data['y'].copy()
points[:, 2] = pcd_data.pc_data['z'].copy()
points[:, 3] = pcd_data.pc_data['intensity'].copy().astype(np.float32)
with open('point_cloud_data.bin', 'wb') as f:
f.write(points.tobytes())
```
2. Convert `.las` to `.bin`: the common conversion path is `.las -> .pcd -> .bin`, and the `.las -> .pcd` conversion can be done with this [tool](https://github.com/Hitachi-Automotive-And-Industry-Lab/semantic-segmentation-editor). A sketch of a direct `.las -> .bin` route is shown below.
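If you prefer to skip the intermediate `.pcd` step, a direct `.las` to `.bin` conversion is also possible. The following is only a minimal sketch based on the generic `laspy` API; the file names are placeholders and `laspy` is an assumed extra dependency, not something MMDetection3D requires.

```python
import laspy
import numpy as np

las = laspy.read('point_cloud_data.las')
# Build an Nx4 float32 array matching the (x, y, z, intensity) layout used above.
points = np.stack(
    [np.asarray(las.x),
     np.asarray(las.y),
     np.asarray(las.z),
     np.asarray(las.intensity, dtype=np.float32)],
    axis=-1).astype(np.float32)
points.tofile('point_cloud_data.bin')
```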
#### Label Format

The most basic information, i.e., the 3D bounding box and category label of each scene, should be contained in the per-scene annotation `.txt` file. Each line represents a 3D box in a given scene, as follows:

```
# format: [x, y, z, dx, dy, dz, yaw, category_name]
1.23 1.42 0.23 3.96 1.65 1.55 1.56 Car
3.51 2.15 0.42 1.05 0.87 1.86 1.23 Pedestrian
...
```
@@ -55,7 +78,7 @@ lidar2cam4
...
```

### Raw Data Structure

#### LiDAR-Based 3D Detection
@@ -197,7 +220,7 @@ class MyDataset(Det3DDataset):

    # replace with all the classes in your customized pkl info file
    METAINFO = {
        'classes': ('Pedestrian', 'Cyclist', 'Car')
    }

    def parse_ann_info(self, info):
@@ -245,7 +268,7 @@ data_root = 'data/custom/'

class_names = ['Pedestrian', 'Cyclist', 'Car']  # replace with your dataset classes
point_cloud_range = [0, -40, -3, 70.4, 40, 1]  # adjust according to your dataset
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(classes=class_names)

train_pipeline = [
    dict(
@@ -330,8 +353,7 @@ val_evaluator = dict(

#### Prepare a Model Config

For voxel-based detectors such as SECOND, PointPillars and CenterPoint, the point cloud range and voxel size should be adjusted according to your dataset. In theory, `voxel_size` is linked to the setting of `point_cloud_range`: a smaller `voxel_size` increases the number of voxels and the corresponding memory consumption. In addition, the following issue needs attention:

If you set `point_cloud_range` and `voxel_size` to `[0, -40, -3, 70.4, 40, 1]` and `[0.05, 0.05, 0.1]` respectively, the shape of the intermediate feature map will be `[(1-(-3))/0.1+1, (40-(-40))/0.05, (70.4-0)/0.05]=[41, 1600, 1408]`. When changing `point_cloud_range`, remember to change the shape of the intermediate feature map in `middle_encoder` according to `voxel_size`.
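For a quick sanity check, you can recompute that shape from the two settings before editing the config; this is plain arithmetic and does not involve any MMDetection3D API.

```python
point_cloud_range = [0, -40, -3, 70.4, 40, 1]  # [x_min, y_min, z_min, x_max, y_max, z_max]
voxel_size = [0.05, 0.05, 0.1]

grid_x = round((point_cloud_range[3] - point_cloud_range[0]) / voxel_size[0])      # 1408
grid_y = round((point_cloud_range[4] - point_cloud_range[1]) / voxel_size[1])      # 1600
grid_z = round((point_cloud_range[5] - point_cloud_range[2]) / voxel_size[2]) + 1  # 41
print([grid_z, grid_y, grid_x])  # [41, 1600, 1408] -> shape expected by middle_encoder
```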
@@ -469,6 +491,6 @@ _base_ = [

```python
val_evaluator = dict(
    type='KittiMetric',
    ann_file=data_root + 'custom_infos_val.pkl',  # specify your validation pkl info
    metric='bbox')
```
@@ -2,12 +2,12 @@

## Customize Optimization Settings

Optimizer-related configs are managed by `optim_wrapper`, which usually has three fields: `optimizer`, `paramwise_cfg` and `clip_grad`. Please refer to [OptimWrapper](https://mmengine.readthedocs.io/zh_CN/latest/tutorials/optim_wrapper.html) for more details. The following example uses `AdamW` as the `optimizer`, sets the learning rate of the backbone to one tenth of the base value, and adds gradient clipping.
```python
optim_wrapper = dict(
    type='OptimWrapper',
    # optimizer
    optimizer=dict(
        type='AdamW',
        lr=0.0001,
@@ -15,21 +15,20 @@ optim_wrapper = dict(
        eps=1e-8,
        betas=(0.9, 0.999)),
    # parameter-level learning rate and weight decay settings
    paramwise_cfg=dict(
        custom_keys={
            'backbone': dict(lr_mult=0.1, decay_mult=1.0),
        },
        norm_decay_mult=0.0),
    # gradient clipping
    clip_grad=dict(max_norm=0.01, norm_type=2))
```
### Customize Optimizers Supported by PyTorch

We already support all the optimizers implemented by PyTorch; the only modification needed is to change the `optimizer` field in the `optim_wrapper` field of the config file. For example, if you want to use `ADAM` (note that the performance may drop a lot), the modification could be as follows:

```python
optim_wrapper = dict(
@@ -37,7 +36,7 @@ optim_wrapper = dict(
    optimizer=dict(type='Adam', lr=0.0003, weight_decay=0.0001))
```
To modify the learning rate of the model, users only need to modify the `lr` field in `optimizer`. Users can set parameters directly following the [API documentation](https://pytorch.org/docs/stable/optim.html?highlight=optim#module-torch.optim) of PyTorch.

### Customize and Implement Your Own Optimizer

@@ -45,8 +44,7 @@ optim_wrapper = dict(

A customized optimizer can be defined as follows.

Assume you want to add an optimizer named `MyOptimizer`, which has arguments `a`, `b` and `c`. You need to create a new directory named `mmdet3d/engine/optimizers`, and then implement the new optimizer in a file, e.g., `mmdet3d/engine/optimizers/my_optimizer.py`:

```python
from mmdet3d.registry import OPTIMIZERS
from torch.optim import Optimizer


@OPTIMIZERS.register_module()
class MyOptimizer(Optimizer):

    def __init__(self, a, b, c):
        ...
```
#### 2. Add the Optimizer to the Registry

To find the optimizer module defined above, it first needs to be imported into the main namespace. There are two ways to achieve this:

- Modify `mmdet3d/engine/optimizers/__init__.py` to import it.

  The newly defined module should be imported in `mmdet3d/engine/optimizers/__init__.py` so that the registry can find the new module and add it:

  ```python
  from .my_optimizer import MyOptimizer
  ```
- Use `custom_imports` in the config to manually import it:

  ```python
  custom_imports = dict(imports=['mmdet3d.engine.optimizers.my_optimizer'], allow_failed_imports=False)
  ```
The module `mmdet3d.engine.optimizers.my_optimizer` will be imported at the beginning of the program, and the class `MyOptimizer` is then automatically registered. Note that only the package containing the class `MyOptimizer` should be imported; `mmdet3d.engine.optimizers.my_optimizer.MyOptimizer` **cannot** be imported directly.

Actually users can use a completely different file directory structure with this importing method, as long as the module root is located in `PYTHONPATH`.

#### 3. Specify the Optimizer in the Config File

Then you can use `MyOptimizer` in the `optimizer` field of the config file. In the config, the optimizer is defined by the `optimizer` field as follows:

```python
optim_wrapper = dict(
@@ -109,10 +98,9 @@ optim_wrapper = dict(
    optimizer=dict(type='MyOptimizer', a=a_value, b=b_value, c=c_value))
```
### Customize the Optimizer Wrapper Constructor

Some models may have parameter-specific optimization settings, e.g., weight decay for BatchNorm layers. Users can tune those fine-grained parameters through customizing the optimizer wrapper constructor.

```python
from mmengine.optim import DefaultOptimWrapperConstructor
@@ -134,11 +122,11 @@ class MyOptimizerWrapperConstructor(DefaultOptimWrapperConstructor):
```
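The body of the class is elided in the hunk above. As a rough, non-authoritative sketch of what such a constructor could look like: the `OPTIM_WRAPPER_CONSTRUCTORS` registry name follows the MMEngine convention, and the delegation to the parent class is only an illustrative choice, not the original example.

```python
from mmengine.optim import DefaultOptimWrapperConstructor

from mmdet3d.registry import OPTIM_WRAPPER_CONSTRUCTORS


@OPTIM_WRAPPER_CONSTRUCTORS.register_module()
class MyOptimizerWrapperConstructor(DefaultOptimWrapperConstructor):
    """Illustrative constructor that reuses the default grouping logic."""

    def __call__(self, model):
        # Build the optimizer wrapper with the default parameter grouping first;
        # fine-grained, parameter-specific adjustments would go here afterwards.
        optim_wrapper = super().__call__(model)
        return optim_wrapper
```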
The default optimizer wrapper constructor is implemented [here](https://github.com/open-mmlab/mmengine/blob/main/mmengine/optim/optimizer/default_constructor.py#L18), which can also serve as a template for new optimizer wrapper constructors.

### Additional Settings

Tricks that are not implemented by the optimizer should be implemented through the optimizer wrapper constructor or hooks, e.g., parameter-wise learning rates. We list some common settings that can stabilize or accelerate training. Feel free to create PRs and issues for more settings.

- __Use gradient clip to stabilize training__:

@@ -149,19 +137,17 @@ class MyOptimizerWrapperConstructor(DefaultOptimWrapperConstructor):
      _delete_=True, clip_grad=dict(max_norm=35, norm_type=2))
  ```

  If your config inherits a base config that already sets `optim_wrapper`, you may need `_delete_=True` to override the unnecessary settings in the base config. See the config [documentation](https://mmdetection3d.readthedocs.io/en/latest/tutorials/config.html) for more details.

- __Use a momentum scheduler to accelerate model convergence__:

  We support a momentum scheduler to modify the model's momentum according to the learning rate, which could make the model converge faster. The momentum scheduler is usually used together with the learning rate scheduler; for example, the following config is used in [3D detection](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/configs/_base_/schedules/cyclic_20e.py) to accelerate convergence. For more details, please refer to the implementations of [CosineAnnealingLR](https://github.com/open-mmlab/mmengine/blob/main/mmengine/optim/scheduler/lr_scheduler.py#L43) and [CosineAnnealingMomentum](https://github.com/open-mmlab/mmengine/blob/main/mmengine/optim/scheduler/momentum_scheduler.py#L71).
```python
param_scheduler = [
    # learning rate scheduler
    # During the first 8 epochs, learning rate increases from 0 to lr * 10
    # During the next 12 epochs, learning rate decreases from lr * 10 to lr * 1e-4
    dict(
        type='CosineAnnealingLR',
        T_max=8,
@@ -178,9 +164,9 @@ class MyOptimizerWrapperConstructor(DefaultOptimWrapperConstructor):
        end=20,
        by_epoch=True,
        convert_to_iter_based=True),
    # momentum scheduler
    # During the first 8 epochs, momentum increases from 0 to 0.85 / 0.95
    # During the next 12 epochs, momentum increases from 0.85 / 0.95 to 1
    dict(
        type='CosineAnnealingMomentum',
        T_max=8,
@@ -200,12 +186,11 @@ class MyOptimizerWrapperConstructor(DefaultOptimWrapperConstructor):
]
```
## Customize Training Schedules

By default, we use a step learning rate schedule with the 1x schedule, which calls [`MultiStepLR`](https://github.com/open-mmlab/mmengine/blob/main/mmengine/optim/scheduler/lr_scheduler.py#L139) in MMEngine. We support many other learning rate schedules [here](https://github.com/open-mmlab/mmengine/blob/main/mmengine/optim/scheduler/lr_scheduler.py), such as the `CosineAnnealing` and `Poly` schedules. Here are some examples:

- Poly schedule:

  ```python
  param_scheduler = [
@@ -218,7 +203,7 @@ class MyOptimizerWrapperConstructor(DefaultOptimWrapperConstructor):
      by_epoch=True)]
  ```

- CosineAnnealing schedule:

  ```python
  param_scheduler = [
@@ -231,9 +216,9 @@ class MyOptimizerWrapperConstructor(DefaultOptimWrapperConstructor):
      by_epoch=True)]
  ```
## Customize Train Loop

By default, `EpochBasedTrainLoop` is used in `train_cfg`, and validation is done after every training epoch, as follows:

```python
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=12, val_begin=1, val_interval=1)
```

Actually, both [`IterBasedTrainLoop`](https://github.com/open-mmlab/mmengine/blob/main/mmengine/runner/loops.py#L183%5D) and [`EpochBasedTrainLoop`](https://github.com/open-mmlab/mmengine/blob/main/mmengine/runner/loops.py#L18) support dynamic intervals for validation, as shown below:
```python
# Before the 365001st iteration, we do evaluation every 5000 iterations.
# After the 365000th iteration, we do evaluation every 368750 iterations,
# which means we do evaluation at the end of training.
interval = 5000
max_iters = 368750
dynamic_intervals = [(max_iters // interval * interval + 1, max_iters)]
train_cfg = dict(
    type='IterBasedTrainLoop',
    max_iters=max_iters,
    val_interval=interval,
    dynamic_intervals=dynamic_intervals)
```
#### 1. Implement a New Hook

MMEngine provides some useful [hooks](https://github.com/open-mmlab/mmengine/blob/main/docs/en/tutorials/hook.md), but there are occasions when users may need to implement a new hook. After v1.1.0rc0, MMDetection3D supports customized hooks in training based on MMEngine, so users can implement a hook directly in mmdet3d or in their mmdet3d-based codebase and use it by simply modifying the training config. Here we give an example of creating and using a new hook in mmdet3d.

```python
from mmengine.hooks import Hook
@@ -300,25 +284,25 @@ class MyHook(Hook):
                         outputs: Optional[dict] = None) -> None:
```

Depending on the functionality of the hook, users need to specify what the hook will do at each stage of training, i.e., `before_run`, `after_run`, `before_train`, `after_train`, `before_train_epoch`, `after_train_epoch`, `before_train_iter` and `after_train_iter`. There are more points where hooks can be inserted; please refer to the [base hook class](https://github.com/open-mmlab/mmengine/blob/main/mmengine/hooks/hook.py#L9) for more details.
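Since the diff only keeps fragments of the class above, here is a minimal self-contained sketch of such a hook; the iteration-logging behavior and the `interval` argument are purely illustrative, not part of the original example.

```python
from mmengine.hooks import Hook

from mmdet3d.registry import HOOKS


@HOOKS.register_module()
class MyHook(Hook):
    """Toy hook that logs a message every `interval` training iterations."""

    def __init__(self, interval: int = 50):
        self.interval = interval

    def after_train_iter(self, runner, batch_idx, data_batch=None, outputs=None) -> None:
        # `runner.iter` is the global iteration counter maintained by MMEngine.
        if (runner.iter + 1) % self.interval == 0:
            runner.logger.info(f'MyHook: finished iteration {runner.iter + 1}')
```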
#### 2. Register the New Hook

Then we need to register `MyHook`. Assuming the new hook is located in `mmdet3d/engine/hooks/my_hook.py`, there are two ways to do it:

- Modify `mmdet3d/engine/hooks/__init__.py` to import it.

  The newly defined module should be imported in `mmdet3d/engine/hooks/__init__.py` so that the registry can find the new module and add it:

  ```python
  from .my_hook import MyHook
  ```

- Use `custom_imports` in the config to manually import it:

  ```python
  custom_imports = dict(imports=['mmdet3d.engine.hooks.my_hook'], allow_failed_imports=False)
  ```
#### 3. Modify the Config

```python
custom_hooks = [
    dict(type='MyHook')
]
```
By default, the hook's priority is set to `NORMAL` during registration.
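If the hook needs to run before or after other hooks, you can also override the priority in the same config entry; `'ABOVE_NORMAL'` below is just one of MMEngine's predefined priority levels, chosen for illustration.

```python
custom_hooks = [
    dict(type='MyHook', priority='ABOVE_NORMAL')
]
```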
### Use Hooks Implemented in MMEngine
@@ -4,7 +4,7 @@

## Data Preparation

You can download KITTI 3D detection data [here](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) and unzip all zip files. Besides, you can download the road plane information [here](https://download.openmmlab.com/mmdetection3d/data/train_planes.zip); it is optional during training and can be used to improve model performance. The road planes are generated by [AVOD](https://github.com/kujason/avod); see [here](https://github.com/kujason/avod/issues/19) for more details.

Like the general way to prepare datasets, it is recommended to symlink the dataset root to `$MMDETECTION3D/data`.

@@ -97,7 +97,7 @@ kitti

- info\['lidar_points'\]\['Tr_imu_to_velo'\]: transformation matrix from IMU coordinates to Velodyne coordinates, a 4x4 array.
- info\['instances'\]: a list of dicts, where each dict contains all annotation information of a single instance. For the i-th instance, we have:
  - info\['instances'\]\[i\]\['bbox'\]: a list of 4 numbers representing the 2D bounding box of the instance, in (x1, y1, x2, y2) order.
  - info\['instances'\]\[i\]\['bbox_3d'\]: a list of 7 numbers representing the 3D bounding box of the instance, in (x, y, z, l, h, w, yaw) order.
  - info\['instances'\]\[i\]\['bbox_label'\]: an int indicating the 2D label of the instance; -1 means the instance is ignored.
  - info\['instances'\]\[i\]\['bbox_label_3d'\]: an int indicating the 3D label of the instance; -1 means the instance is ignored.
  - info\['instances'\]\[i\]\['depth'\]: depth of the center of the 3D bounding box projected onto the related image plane.
@@ -109,7 +109,7 @@ kitti

  - info\['instances'\]\[i\]\['group_ids'\]: used for multi-part objects.
- info\['plane'\] (optional): ground plane information.

Please refer to [kitti_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/kitti_converter.py) and [update_infos_to_v2.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/update_infos_to_v2.py) for more details.
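To double-check the generated annotation file against the field descriptions above, you can load it directly. This is only an illustrative inspection snippet and assumes the info file follows the v2 layout with `metainfo`/`data_list` keys.

```python
import pickle

with open('data/kitti/kitti_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)

print(infos.keys())                        # typically dict_keys(['metainfo', 'data_list'])
sample = infos['data_list'][0]
print(sample['lidar_points']['lidar_path'])
print(sample['instances'][0]['bbox_3d'])   # [x, y, z, l, h, w, yaw]
```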
## Train Pipeline

@@ -160,7 +160,7 @@ bash tools/dist_test.sh configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_ki

## Metrics

KITTI officially uses mean average precision (mAP) and average orientation similarity (AOS) to evaluate 3D object detection performance. Please refer to the [official website](http://www.cvlibs.net/datasets/kitti/eval_3dobject.php) and the [original paper](http://www.cvlibs.net/publications/Geiger2012CVPR.pdf) for more details.

MMDetection3D adopts the same approach for evaluation on the KITTI dataset. An example of the printed evaluation results is as follows:

@@ -181,32 +181,26 @@ aos  AP:97.70, 89.11, 87.38

An example of testing PointPillars on KITTI with 8 GPUs and generating a submission to the leaderboard is as follows:

- First, you need to modify the `test_dataloader` and `test_evaluator` dicts in your config file, as shown below:
  ```python
  data_root = 'data/kitti/'
  test_dataloader = dict(
      dataset=dict(
          ann_file='kitti_infos_test.pkl',
          load_eval_anns=False,
          data_prefix=dict(pts='testing/velodyne_reduced')))
  test_evaluator = dict(
      ann_file=data_root + 'kitti_infos_test.pkl',
      format_only=True,
      pklfile_prefix='results/kitti-3class/kitti_results',
      submission_prefix='results/kitti-3class/kitti_results')
  ```
- Then, you can run the test script as follows.

  ```shell
  ./tools/dist_test.sh configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py work_dirs/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class/latest.pth 8
  ```
After generating the `results/kitti-3class/kitti_results/xxxxx.txt` files, you can submit them to the KITTI benchmark. Please refer to the [KITTI official website](http://www.cvlibs.net/datasets/kitti/index.php) for more details.
@@ -33,14 +33,11 @@ mmdetection3d
│   │   ├── sample_submission.csv
```

Here `v1.01-train` and `v1.01-test` contain the same meta files as the nuScenes dataset, and the `.txt` files contain the data split information. Lyft does not provide an official split for training and validation, so MMDetection3D analyzes the number of objects of different categories in different scenes and provides its own split. `sample_submission.csv` is the base file for submission to the Kaggle evaluation server. Note that we follow the original folder names for clear organization; please rename the downloaded raw folders as shown above.
## Data Preparation

The way to organize the Lyft dataset is the same as that for nuScenes: `.pkl` files with almost the same structure are generated first, and then we focus on the differences between the two datasets. Please refer to the [nuScenes tutorial](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/docs/zh_cn/advanced_guides/datasets/nuscenes_det.md) for more detailed explanations of the structure of the dataset info files.

Please run the following commands to generate the Lyft dataset info files:

@@ -49,7 +46,7 @@ python tools/create_data.py lyft --root-path ./data/lyft --out-dir ./data/lyft -
python tools/data_converter/lyft_data_fixer.py --version v1.01 --root-folder ./data/lyft
```

Note that the second command above fixes a corrupted lidar data file; please refer to this [discussion](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/discussion/110000) for more details.

The directory structure after processing should be as follows:
@@ -84,11 +81,11 @@ mmdetection3d

- info\['token'\]: sample data token.
- info\['timestamp'\]: timestamp of the sample data.
- info\['lidar_points'\]: a dict containing all information related to the lidar points.
  - info\['lidar_points'\]\['lidar_path'\]: filename of the lidar point cloud data.
  - info\['lidar_points'\]\['num_pts_feats'\]: feature dimension of the points.
  - info\['lidar_points'\]\['lidar2ego'\]: transformation from this lidar sensor to the ego vehicle. (4x4 list)
  - info\['lidar_points'\]\['ego2global'\]: transformation from the ego vehicle to global coordinates. (4x4 list)
- info\['lidar_sweeps'\]: a list containing sweep information (intermediate frames without annotations).
  - info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['data_path'\]: file path of the lidar data of the i-th sweep.
  - info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['lidar2ego'\]: transformation from this lidar sensor to the ego vehicle at the i-th sweep. (4x4 list)
  - info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['ego2global'\]: transformation from the ego vehicle to global coordinates at the i-th sweep. (4x4 list)
@@ -97,7 +94,7 @@ mmdetection3d

  - info\['lidar_sweeps'\]\[i\]\['sample_data_token'\]: sweep sample data token.
- info\['images'\]: a dict containing six keys corresponding to each camera: `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`. Each dict contains all data information related to the corresponding camera.
  - info\['images'\]\['CAM_XXX'\]\['img_path'\]: filename of the image.
  - info\['images'\]\['CAM_XXX'\]\['cam2img'\]: intrinsic matrix needed when projecting 3D points onto the image plane. (3x3 list)
  - info\['images'\]\['CAM_XXX'\]\['sample_data_token'\]: image sample data token.
  - info\['images'\]\['CAM_XXX'\]\['timestamp'\]: timestamp of the image.
  - info\['images'\]\['CAM_XXX'\]\['cam2ego'\]: transformation from this camera sensor to the ego vehicle. (4x4 list)
@@ -113,12 +110,12 @@ mmdetection3d

- `lyft_infos_train.pkl`:
  - info\['instances'\]\[i\]\['velocity'\] does not exist: there is no velocity measurement on Lyft.
  - info\['instances'\]\[i\]\['num_lidar_pts'\] and info\['instances'\]\[i\]\['num_radar_pts'\] do not exist.

Here we only explain the data recorded in the training info files. The same also applies to the validation and test sets (without instances).

Please refer to [lyft_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/lyft_converter.py) for more details about the structure of `lyft_infos_xxx.pkl`.
## Train Pipeline

@@ -152,8 +149,7 @@ train_pipeline = [
]
```
Similar to nuScenes, models trained on Lyft also need the `LoadPointsFromMultiSweeps` step to load point clouds from consecutive frames (see the sketch below). In addition, since the intensity of the lidar points collected on Lyft is invalid, the default `use_dim` in `LoadPointsFromMultiSweeps` is set to `[0, 1, 2, 4]`, where the first three dimensions are the point coordinates and the last one is the timestamp difference.
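For reference, the corresponding pipeline step would look roughly like the fragment below; `sweeps_num=10` and the two boolean flags are assumed values for illustration rather than settings taken from a specific Lyft config.

```python
dict(
    type='LoadPointsFromMultiSweeps',
    sweeps_num=10,
    use_dim=[0, 1, 2, 4],  # x, y, z and timestamp difference; intensity (dim 3) is dropped
    pad_empty_sweeps=True,
    remove_close=True),
```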
## Evaluation

@@ -165,11 +161,7 @@ bash ./tools/dist_test.sh configs/pointpillars/pointpillars_hv_fpn_sbn-all_8xb2-

## Metrics

Lyft proposes a more strict metric for evaluating the predicted 3D bounding boxes. The basic criterion to judge whether a predicted box is a true positive is the same as for KITTI, i.e., 3D IoU-based evaluation. However, Lyft adopts a way similar to COCO to compute mean average precision (mAP): it computes the average precision under different thresholds of 3D IoU from 0.5 to 0.95. Actually, an overlap larger than 0.7 in 3D IoU is already a strict criterion for 3D detection methods, so the overall performance seems a bit low. The imbalance of annotations across categories is another important reason for the lower final results compared with other datasets. Please refer to the [official website](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/overview/evaluation) for more details about the definition of this metric.
We employ this official method for evaluation on Lyft. An example of the printed evaluation results is as follows:

@@ -200,4 +192,4 @@

After generating `work_dirs/pp-lyft/results_challenge.csv`, you can submit it to the Kaggle evaluation server; please refer to the [official website](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles) for more details.

We can also visualize the prediction results with our developed visualization tools; please refer to the [visualization doc](https://mmdetection3d.readthedocs.io/zh_CN/latest/useful_tools.html#visualization) for more details.
@@ -26,14 +26,13 @@ mmdetection3d

## Data Preparation

We usually need to organize the useful data information with `.pkl` files in a specific style. To prepare these files for nuScenes, run the following command:

```bash
python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes
```
The folder structure after processing should be as follows.

```
mmdetection3d
@@ -61,7 +60,7 @@ mmdetection3d
```

- info\['token'\]: sample data token.
- info\['timestamp'\]: timestamp of the sample data.
- info\['lidar_points'\]: a dict containing all information related to the lidar points.
  - info\['lidar_points'\]\['lidar_path'\]: filename of the lidar point cloud data.
  - info\['lidar_points'\]\['num_pts_feats'\]: feature dimension of the points.
  - info\['lidar_points'\]\['lidar2ego'\]: transformation from this lidar sensor to the ego vehicle. (4x4 list)
  - info\['lidar_points'\]\['ego2global'\]: transformation from the ego vehicle to global coordinates. (4x4 list)
@@ -74,7 +73,7 @@ mmdetection3d

  - info\['lidar_sweeps'\]\[i\]\['sample_data_token'\]: sweep sample data token.
- info\['images'\]: a dict containing six keys corresponding to each camera: `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`. Each dict contains all data information related to the corresponding camera.
  - info\['images'\]\['CAM_XXX'\]\['img_path'\]: filename of the image.
  - info\['images'\]\['CAM_XXX'\]\['cam2img'\]: intrinsic matrix needed when projecting 3D points onto the image plane. (3x3 list)
  - info\['images'\]\['CAM_XXX'\]\['sample_data_token'\]: image sample data token.
  - info\['images'\]\['CAM_XXX'\]\['timestamp'\]: timestamp of the image.
  - info\['images'\]\['CAM_XXX'\]\['cam2ego'\]: transformation from this camera sensor to the ego vehicle. (4x4 list)
@@ -86,24 +85,23 @@ mmdetection3d

  - info\['instances'\]\[i\]\['num_lidar_pts'\]: number of lidar points included in each 3D bounding box.
  - info\['instances'\]\[i\]\['num_radar_pts'\]: number of radar points included in each 3D bounding box.
  - info\['instances'\]\[i\]\['bbox_3d_isvalid'\]: whether each bounding box is valid. In general, we only take 3D boxes that include at least one lidar or radar point as valid boxes.
- info\['cam_instances'\]: a dict containing the keys `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`. For vision-based 3D object detection tasks, we split the 3D annotations of the whole scene according to the camera they belong to. For the i-th instance, we have:
  - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox_label'\]: label of the instance.
  - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox_label_3d'\]: label of the instance.
  - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox'\]: 2D bounding box annotation (the exterior rectangle of the projected 3D box), a list of \[x1, y1, x2, y2\].
  - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['center_2d'\]: projected center of the 3D box onto the image, a list of size (2, ).
  - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['depth'\]: depth of the projected center of the 3D box.
  - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['velocity'\]: velocity of the 3D bounding box (no vertical measurement due to inaccuracy), a list of size (2, ).
  - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['attr_label'\]: attribute label of the instance. We maintain a default attribute collection and mapping for attribute classification.
  - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox_3d'\]: a list of 7 numbers representing the 3D bounding box of the instance, in (x, y, z, l, h, w, yaw) order.
Note:

1. The difference between `bbox_3d` in `instances` and in `cam_instances`: both have been converted to the coordinate systems defined in MMDet3D, but `bbox_3d` in `instances` is in the lidar coordinate system while that in `cam_instances` is in the camera coordinate system. Note the different box dimension orders ('l, w, h' vs. 'l, h, w').

2. Here we only explain the data recorded in the training info files. The same also applies to the validation and test sets (the `.pkl` files of the test set do not contain `instances` or `cam_instances`).

The core function to get `nuscenes_infos_xxx.pkl` is [\_fill_trainval_infos](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/nuscenes_converter.py#L146). Please refer to [nuscenes_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/nuscenes_converter.py) for more details.
## Train Pipeline

@@ -138,10 +136,7 @@ train_pipeline = [
]
```
Compared with the general case, nuScenes has a specific `LoadPointsFromMultiSweeps` pipeline to load point clouds from consecutive frames, which is a common practice in this setting. Please refer to the nuScenes [original paper](https://arxiv.org/abs/1903.11027) for more details. The default `use_dim` in `LoadPointsFromMultiSweeps` is `[0, 1, 2, 4]`, where the first 3 dimensions refer to the point coordinates and the last one refers to the timestamp difference. The intensity is not used by default, because the intensity of points concatenated from different frames is noisy.
### Vision-Based Methods

@@ -158,7 +153,7 @@ train_pipeline = [
        with_bbox_3d=True,
        with_label_3d=True,
        with_bbox_depth=True),
    dict(type='mmdet.Resize', img_scale=(1600, 900), keep_ratio=True),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(
        type='Pack3DDetInputs',
@@ -173,8 +168,7 @@ train_pipeline = [
- It uses a monocular pipeline to load images, which includes additional required information such as the camera intrinsic matrix.
- It needs to load 3D annotations.
- Some data augmentation techniques need to be adjusted, such as `RandomFlip3D`. Currently we do not support more augmentation methods, because how to transfer and apply other techniques is still under exploration.
## Evaluation

@@ -186,9 +180,7 @@ bash ./tools/dist_test.sh configs/pointpillars/pointpillars_hv_fpn_sbn-all_8xb4-

## Metrics

NuScenes proposes a comprehensive metric, namely the nuScenes detection score (NDS), to evaluate different methods and set up a benchmark. It consists of mean average precision (mAP), average translation error (ATE), average scale error (ASE), average orientation error (AOE), average velocity error (AVE) and average attribute error (AAE). Please refer to its [official website](https://www.nuscenes.org/object-detection?externalData=all&mapData=all&modalities=Any) for more details.

We also adopt this approach for evaluation on nuScenes. An example of the printed evaluation results is as follows:
# ScanNet Dataset for 3D Object Detection

## Dataset Preparation

For the overall process, please refer to the [README](https://github.com/open-mmlab/mmdetection3d/blob/master/data/scannet/README.md) page for ScanNet.
### Extract ScanNet Point Cloud Data

@@ -32,10 +32,10 @@ mmdetection3d

Under the `scans` folder there are in total 1201 training-sample folders and 312 validation-sample folders, which hold the raw point cloud data and the relevant annotations. For instance, the files under the folder `scene0001_01` are organized as follows:

- `scene0001_01_vh_clean_2.ply`: a mesh file storing the coordinates and colors of each vertex. The mesh vertices are directly used as the raw point cloud data.
- `scene0001_01.aggregation.json`: an annotation file containing object IDs, segment IDs and labels.
- `scene0001_01_vh_clean_2.0.010000.segs.json`: a segmentation annotation file containing segment IDs and vertices.
- `scene0001_01.txt`: a meta file including the axis-alignment matrix, etc.
- `scene0001_01_vh_clean_2.labels.ply`: an annotation file containing the category of each vertex.

Run `python batch_load_scannet_data.py` to extract the ScanNet data. The main steps include:
@@ -228,11 +228,11 @@ scannet

- `posed_images/scenexxxx_xx`: a collection of `.jpg` images, together with their 4x4 camera poses in `.txt` format and a single `.txt` file with the camera intrinsic matrix.
- `scannet_infos_train.pkl`: data information of the training set; the detailed info of each scene is as follows:
  - info\['lidar_points'\]: a dict containing information related to the lidar points.
    - info\['lidar_points'\]\['lidar_path'\]: filename of the lidar point cloud data.
    - info\['lidar_points'\]\['num_pts_feats'\]: feature dimension of the points.
    - info\['lidar_points'\]\['axis_align_matrix'\]: transformation matrix used to align the axes.
  - info\['pts_semantic_mask_path'\]: filename of the semantic segmentation annotations.
  - info\['pts_instance_mask_path'\]: filename of the instance segmentation annotations.
  - info\['instances'\]: a list of dicts, each of which contains all annotation information of one instance. For the i-th instance, we have:
    - info\['instances'\]\[i\]\['bbox_3d'\]: a list of 6 numbers representing the axis-aligned 3D bounding box in depth coordinates, in (x, y, z, l, w, h) order.
    - info\[instances\]\[i\]\['bbox_label_3d'\]: label of the 3D bounding box.
@@ -258,8 +258,7 @@ train_pipeline = [
        with_mask_3d=True,
        with_seg_3d=True),
    dict(type='GlobalAlignment', rotation_axis=2),
    dict(type='PointSegClassMapping'),
    dict(type='PointSample', num_points=40000),
    dict(
        type='RandomFlip3D',
@@ -285,10 +284,10 @@ train_pipeline = [
- Data augmentation:
  - `PointSample`: downsample the input point cloud.
  - `RandomFlip3D`: randomly flip the input point cloud horizontally or vertically.
  - `GlobalRotScaleTrans`: rotate the input point cloud, usually in the range of \[-5, 5\] degrees for ScanNet; then scale the input point cloud, usually with a ratio of 1.0 for ScanNet (i.e., no scaling); finally translate the input point cloud, usually by 0 for ScanNet (i.e., no translation). A config fragment with these values is sketched below.
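For reference, a `GlobalRotScaleTrans` step configured with those ScanNet-style values could look like the fragment below; the numbers simply restate the ranges above (roughly ±5 degrees converted to radians) and are illustrative rather than copied from a specific config.

```python
dict(
    type='GlobalRotScaleTrans',
    rot_range=[-0.087266, 0.087266],  # about ±5 degrees, in radians
    scale_ratio_range=[1.0, 1.0],     # no scaling
    translation_std=[0, 0, 0]),       # no translation
```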
## Metrics

Typically mAP (mean average precision) is used to evaluate the detection task on ScanNet, e.g., `mAP@0.25` and `mAP@0.5`. In detail, a generic function computing precision and recall for multiple classes of 3D object detection is called during evaluation; please refer to [indoor_eval](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/indoor_eval.py) for more details.

As introduced in the section `Extract ScanNet Data`, all ground-truth 3D bounding boxes are axis-aligned, i.e., the yaw is zero. Therefore the yaw target of the box-predicting network is also zero, and axis-aligned non-maximum suppression (NMS), which ignores box rotation, is used during post-processing.
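As a rough illustration of how this evaluation is wired into a config, a fragment like the one below is typical; the metric class name and the `iou_thr` argument here are assumptions based on the indoor evaluation code referenced above, not taken from a specific config.

```python
val_evaluator = dict(type='IndoorMetric', iou_thr=[0.25, 0.5])
test_evaluator = val_evaluator
```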
@@ -2,7 +2,7 @@

## Dataset Preparation

For the overall process, please refer to the [README](https://github.com/open-mmlab/mmdetection3d/blob/master/data/sunrgbd/README.md) page for SUN RGB-D.

### Download SUN RGB-D Data and Toolbox
@@ -155,7 +155,7 @@ sunrgbd

- `sunrgbd_infos_train.pkl`: data information (annotations and metadata) of the training set; the detailed info of each scene is as follows:
  - info\['lidar_points'\]: a dict containing information related to the lidar points.
    - info\['lidar_points'\]\['num_pts_feats'\]: feature dimension of the points.
    - info\['lidar_points'\]\['lidar_path'\]: filename of the lidar point cloud data.
  - info\['images'\]: a dict containing information related to the image data.
    - info\['images'\]\['CAM0'\]\['img_path'\]: filename of the image.
    - info\['images'\]\['CAM0'\]\['depth2img'\]: transformation matrix from depth to image, with shape (4, 4).
@@ -201,7 +201,7 @@ train_pipeline = [

Data augmentation for point clouds:

- `RandomFlip3D`: randomly flip the input point cloud horizontally or vertically.
- `GlobalRotScaleTrans`: rotate the input point cloud, usually in the range of \[-30, 30\] degrees for SUN RGB-D; then scale the input point cloud, usually with a ratio in the range \[0.85, 1.15\] for SUN RGB-D; finally translate the input point cloud, usually by 0 for SUN RGB-D (i.e., no translation).
- `PointSample`: downsample the input point cloud.

A typical pipeline for multi-modality (point cloud and image) 3D object detection on SUN RGB-D is as follows:
@@ -238,13 +238,13 @@ train_pipeline = [

Data augmentation for images:

- `Resize`: resize the input image; `keep_ratio=True` means the aspect ratio of the image is kept.
- `RandomFlip`: randomly flip the image.

The image augmentation functions are implemented in [MMDetection](https://github.com/open-mmlab/mmdetection/tree/dev-3.x/mmdet/datasets/transforms).

## Metrics

Same as ScanNet, mAP (mean average precision) is typically used to evaluate the detection task on SUN RGB-D, e.g., `mAP@0.25` and `mAP@0.5`. In detail, a generic function computing precision and recall for multiple classes of 3D object detection is called during evaluation; please refer to [`indoor_eval.py`](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/indoor_eval.py) for more details.

Since SUN RGB-D contains image data, object detection on images is also feasible. For example, in ImVoteNet we first train an image detector, and we also use mAP, e.g., `mAP@0.5`, to evaluate its performance. We use the `eval_map` function from [MMDetection](https://github.com/open-mmlab/mmdetection) to compute mAP.
# Prerequisites

In this section we demonstrate how to prepare an environment with PyTorch. MMDetection3D works on Linux and macOS (Windows is experimentally supported) and requires the following packages:

- Python 3.6+
- PyTorch 1.6+
- CUDA 9.2+ (if you build PyTorch from source, CUDA 9.0 is also compatible)
- GCC 5+
- [MMEngine](https://mmengine.readthedocs.io/zh_CN/latest/#installation)
- [MMCV](https://mmcv.readthedocs.io/zh_CN/latest/#installation)

```{note}
If you are already familiar with PyTorch and have it installed, just skip this part and move to the [next section](#installation). Otherwise, you can follow these steps for preparation.
```

**Step 0.** Download and install Miniconda from the [official website](https://docs.conda.io/en/latest/miniconda.html).
@@ -19,8 +18,8 @@

**Step 1.** Create a conda virtual environment and activate it.

```shell
# As waymo-open-dataset-tf-2-6-0 requires python>=3.7, we recommend installing python 3.8
# If you want to install python<3.7, make sure to install waymo-open-dataset-tf-2-x-0 (x<=4) afterwards
conda create --name openmmlab python=3.8 -y
conda activate openmmlab
```

@@ -41,7 +40,7 @@ conda install pytorch torchvision cpuonly -c pytorch

# Installation

We recommend that users follow our best practices to install MMDetection3D. However, the whole process is highly customizable; see the [customize installation](#customize-installation) section for more information.
## Best Practices

@@ -75,26 +74,26 @@ mim install 'mmdet>=3.0.0rc0'

```shell
git clone https://github.com/open-mmlab/mmdetection.git -b dev-3.x
# "-b dev-3.x" means checkout to the `dev-3.x` branch.
cd mmdetection
pip install -v -e .
# "-v" means verbose, or more output
# "-e" means installing a project in editable mode,
# thus any local modifications made to the code will take effect without reinstallation.
```
**Step 2.** Clone the MMDetection3D repository.

```shell
git clone https://github.com/open-mmlab/mmdetection3d.git -b dev-1.x
# "-b dev-1.x" means checkout to the `dev-1.x` branch.
cd mmdetection3d
```

**Step 4.** Install build requirements and then install MMDetection3D.

```shell
pip install -v -e .  # or "python setup.py develop"
```

Note:

@@ -111,24 +110,24 @@ pip install -v -e .  # or "python setup.py develop"

2. Following the above instructions, MMDetection3D is installed in `dev` mode; any local modifications made to the code will take effect without reinstalling it.

3. If you would like to use `opencv-python-headless` instead of `opencv-python`, you can install it before installing MMCV.

4. Some dependencies are optional. Simply running `pip install -v -e .` will only install the minimum runtime requirements. To use optional dependencies like `albumentations` and `imagecorruptions`, either install them manually with `pip install -r requirements/optional.txt`, or specify the desired extras when calling `pip` (e.g. `pip install -v -e .[optional]`); valid keys for the extras field are: `all`, `tests`, `build` and `optional`.

We have supported `spconv 2.0`. If the user has installed `spconv 2.0`, the code will use it by default, and it uses less GPU memory than the native `mmcv spconv`. Users can install `spconv 2.0` with the following commands:
```bash
pip install cumm-cuxxx
pip install spconv-cuxxx
```

Where `xxx` is the CUDA version.

For example, with CUDA 10.2, the corresponding command is `pip install cumm-cu102 && pip install spconv-cu102`.

Supported CUDA versions include 10.2, 11.1, 11.3 and 11.4. Users can also install it by building from source. For more details please refer to [spconv v2.x](https://github.com/traveller59/spconv).

We also support `Minkowski Engine` as a sparse convolution backend. If necessary, follow the [installation guide](https://github.com/NVIDIA/MinkowskiEngine#installation) or install it with `pip`:

```shell
conda install openblas-devel -c anaconda
@@ -153,8 +152,8 @@ python demo/pcd_demo.py ${PCD_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device
python demo/pcd_demo.py demo/data/kitti/000008.bin configs/second/second_hv-secfpn_8xb6-80e_kitti-3d-car.py checkpoints/second_hv-secfpn_8xb6-80e_kitti-3d-car_20200620_230238-393f000c.pth
```

If you want to input a `.ply` file, you can use the following function to convert it to `.bin` format and then run the demo with the converted `.bin` file. Note that you need to install `pandas` and `plyfile` before using this script. This function can also be used for data preprocessing so that you can train directly on `ply` data.
```python
import numpy as np
import pandas as pd
from plyfile import PlyData


def convert_ply(input_path, output_path):
    plydata = PlyData.read(input_path)  # read file
    data = plydata.elements[0].data  # read data
    data_pd = pd.DataFrame(data)  # convert to DataFrame
    data_np = np.zeros(data_pd.shape, dtype=np.float64)  # initialize array to store data
    property_names = data[0].dtype.names  # read names of properties
    for i, name in enumerate(
            property_names):  # read data by property
        data_np[:, i] = data_pd[name]
    data_np.astype(np.float32).tofile(output_path)
```
@@ -179,14 +178,14 @@ def convert_ply(input_path, output_path):

convert_ply('./test.ply', './test.bin')
```

If you have point cloud files in other formats (e.g., `.off`, `.obj`), you can use `trimesh` to convert them into `.ply`:
```python
import trimesh


def to_ply(input_path, output_path, original_type):
    mesh = trimesh.load(input_path, file_type=original_type)  # read file
    mesh.export(output_path, file_type='ply')  # convert to ply
```
For example:

@@ -201,45 +200,20 @@ to_ply('./test.obj', './test.ply', 'obj')
## Customize Installation

### CUDA Versions

When installing PyTorch, you need to specify the version of CUDA. If you are not sure which one to choose, follow our recommendations:

- For Ampere-based NVIDIA GPUs, such as the GeForce 30 series and NVIDIA A100, CUDA 11 is a must.
- For older NVIDIA GPUs, CUDA 11 is backward compatible, but CUDA 10.2 offers better compatibility and is more lightweight.

Please make sure the GPU driver satisfies the minimum version requirement. See this [table](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-major-component-versions__table-cuda-toolkit-driver-versions) for more information.
```{note}
If you follow the best practices, installing the CUDA runtime libraries is enough, because no CUDA code will be compiled locally. However, if you want to compile MMCV from source or develop other CUDA code, you need to install the complete CUDA toolkit from the NVIDIA [website](https://developer.nvidia.com/cuda-downloads), and its version should match the CUDA version of PyTorch, i.e., the cudatoolkit version specified in the `conda install` command.
```

### Install MMEngine without MIM

If you want to install MMEngine with pip instead of MIM, please follow the [MMEngine installation guide](https://mmengine.readthedocs.io/zh_CN/latest/get_started/installation.html).

For example, you can install MMEngine with the following command:
@@ -279,23 +253,23 @@ docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/mmdetection3d/data mmdete

The following is a script for setting up MMDetection3D with conda.
```shell
# As waymo-open-dataset-tf-2-6-0 requires python>=3.7, we recommend installing python 3.8
# If you want to install python<3.7, make sure to install waymo-open-dataset-tf-2-x-0 (x<=4) afterwards
conda create -n open-mmlab python=3.8 -y
conda activate open-mmlab

# install the latest PyTorch prebuilt with the default prebuilt CUDA version (usually the latest)
conda install -c pytorch pytorch torchvision -y

# install mmengine and mmcv
pip install openmim
mim install mmengine
mim install 'mmcv>=2.0.0rc0'

# install mmdetection
mim install 'mmdet>=3.0.0rc0'

# install mmdetection3d
git clone https://github.com/open-mmlab/mmdetection3d.git -b dev-1.x
cd mmdetection3d
pip install -e .
```
## Troubleshooting

If you encounter problems during installation, please first check the [FAQ](notes/faq.md) page. If no solution is found, you may [open an issue](https://github.com/open-mmlab/mmdetection3d/issues/new/choose) on GitHub.