Unverified commit 5de5d264 authored by Xiangxu-0103, committed by GitHub

[Docs] Update `customize_dataset` documentation (#2153)

* Update customize_dataset.md

* Update customize_dataset.md

parent ef6e0aa2
@@ -6,7 +6,7 @@ The basic steps are as below:

1. Prepare data
2. Prepare a config
3. Train, test, and run inference with models on the customized dataset

## Data Preparation
@@ -26,7 +26,7 @@ Currently, we only support `.bin` format point cloud for training and inference.

```bash
pip install git+https://github.com/DanielPollithy/pypcd.git
```
- You can use the following script to read the `.pcd` file, convert it to `.bin` format, and save it (a complete sketch is given after this list):
```python
import numpy as np
...
```

@@ -42,11 +42,11 @@ Currently, we only support `.bin` format point cloud for training and inference.

```python
    ...
    f.write(points.tobytes())
```
2. Convert `.las` to `.bin`: The common conversion path is `.las -> .pcd -> .bin`, and the `.las -> .pcd` conversion can be achieved through [this tool](https://github.com/Hitachi-Automotive-And-Industry-Lab/semantic-segmentation-editor).
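Since the `.pcd` -> `.bin` script of step 1 is only partially visible here, the following is a minimal end-to-end sketch of the conversion, assuming the `.pcd` file provides `x`, `y`, `z` and `intensity` fields (file names are illustrative):

```python
import numpy as np
from pypcd import pypcd

# Read the .pcd file and pack x, y, z, intensity into an N x 4 float32 array.
pcd_data = pypcd.PointCloud.from_path('point_cloud.pcd')
points = np.zeros([pcd_data.width, 4], dtype=np.float32)
points[:, 0] = pcd_data.pc_data['x'].copy()
points[:, 1] = pcd_data.pc_data['y'].copy()
points[:, 2] = pcd_data.pc_data['z'].copy()
points[:, 3] = pcd_data.pc_data['intensity'].copy().astype(np.float32)

# The .bin file is just the raw bytes of the float32 array (native byte order).
with open('point_cloud.bin', 'wb') as f:
    f.write(points.tobytes())
```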
#### Label Format

The most basic information, i.e. the 3D bounding box and category label of each scene, needs to be contained in the `.txt` annotation file. Each line represents a 3D box in a certain scene as follows:
```
# format: [x, y, z, dx, dy, dz, yaw, category_name]
...
```

@@ -61,7 +61,7 @@ The 3D Box should be stored in unified 3D coordinates.
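A minimal parser for this annotation format might look like the following sketch (the helper name is hypothetical):

```python
import numpy as np

def load_labels(txt_path):
    """Parse one `[x, y, z, dx, dy, dz, yaw, category_name]` line per box."""
    boxes, names = [], []
    with open(txt_path, 'r') as f:
        for line in f:
            parts = line.strip().split()
            if len(parts) < 8:
                continue
            boxes.append([float(v) for v in parts[:7]])
            names.append(parts[7])
    return np.array(boxes, dtype=np.float32), names
```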
#### Calibration Format

The point cloud data collected by each LiDAR are usually fused and converted to a certain LiDAR coordinate frame. So typically the `.txt` calibration file should contain the intrinsic matrix of each camera and the extrinsic transformation matrix from the LiDAR to each camera, where `Px` represents the intrinsic matrix of `camera_x` and `lidar2camx` represents the extrinsic transformation matrix from the `lidar` to `camera_x`.
```
P0
...
```

@@ -106,7 +106,7 @@ mmdetection3d
#### Vision-Based 3D Detection

The raw data for vision-based 3D object detection are typically organized as follows, where `ImageSets` contains split files indicating which files belong to the training/validation set, `images` contains images from different cameras (for example, images from `camera_x` need to be placed in `images/images_x`), `calibs` contains calibration information files which store the intrinsic matrix of each camera, and `labels` includes label files for 3D detection.
```
mmdetection3d
...
```

@@ -138,7 +138,7 @@ mmdetection3d
#### Multi-Modality 3D Detection

The raw data for multi-modality 3D object detection are typically organized as follows. Different from vision-based 3D object detection, the calibration information files in `calibs` store the intrinsic matrix of each camera as well as the extrinsic matrices.
```
mmdetection3d
...
```

@@ -174,7 +174,7 @@ mmdetection3d
#### LiDAR-Based 3D Semantic Segmentation

The raw data for LiDAR-based 3D semantic segmentation are typically organized as follows, where `ImageSets` contains split files indicating which files belong to the training/validation set, `points` includes the point cloud data, and `semantic_mask` includes the point-level labels.
```
mmdetection3d
...
```

@@ -200,8 +200,8 @@ mmdetection3d
Once you have prepared the raw data following our instructions, you can directly use the following command to generate training/validation information files.
```bash
python tools/create_data.py custom --root-path ./data/custom --out-dir ./data/custom --extra-tag custom
```
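If the command finishes successfully, it should produce info files such as `custom_infos_train.pkl` and `custom_infos_val.pkl` under `./data/custom` (the file-name prefix follows `--extra-tag`); the configs below reference these files.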
## An example of customized dataset

@@ -211,8 +211,8 @@ Once we finish data preparation, we can create a new dataset in `mmdet3d/dataset
```python
import mmengine

from mmdet3d.registry import DATASETS
from .det3d_dataset import Det3DDataset


@DATASETS.register_module()
class MyDataset(Det3DDataset):

    # replace with all the classes in customized pkl info file
    METAINFO = {
        'classes': ('Pedestrian', 'Cyclist', 'Car')
    }

    def parse_ann_info(self, info):
        """Process the `instances` in data info to `ann_info`.

        Args:
            info (dict): Data information of single data sample.

        Returns:
            dict: Annotation information consists of the following keys:

                - gt_bboxes_3d (:obj:`LiDARInstance3DBoxes`):
                  3D ground truth bboxes.
                - gt_labels_3d (np.ndarray): Labels of ground truths.
        """
        ann_info = super().parse_ann_info(info)
        if ann_info is None:
            ...
```
@@ -255,7 +259,7 @@ Here we take training PointPillars on customized dataset as an example:
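For the `@DATASETS.register_module()` registration to take effect at runtime, the module defining `MyDataset` must actually be imported. Besides adding it to `mmdet3d/datasets/__init__.py`, a common MMEngine pattern is to import it from the config; a sketch, with the module path assumed:

```python
# In your config file: make MMEngine import the module that defines
# `MyDataset` so its register_module() call runs (module path assumed).
custom_imports = dict(
    imports=['mmdet3d.datasets.my_dataset'],
    allow_failed_imports=False)
```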
### Prepare a config

Here we demonstrate a config sample for pure point cloud training.

#### Prepare dataset config
@@ -322,7 +326,7 @@ train_dataloader = dict(

```python
train_dataloader = dict(
    ...
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='custom_infos_train.pkl',  # specify your training pkl info
        data_prefix=dict(pts='points'),
        pipeline=train_pipeline,
        modality=input_modality,
        ...))

val_dataloader = dict(
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_prefix=dict(pts='points'),
        ann_file='custom_infos_val.pkl',  # specify your validation pkl info
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        ...
        box_type_3d='LiDAR'))

val_evaluator = dict(
    type='KittiMetric',
    ann_file=data_root + 'custom_infos_val.pkl',  # specify your validation pkl info
    metric='bbox')
```
@@ -356,7 +360,7 @@ val_evaluator = dict(

For voxel-based detectors such as SECOND, PointPillars and CenterPoint, the point cloud range and voxel size should be adjusted according to your dataset.

Theoretically, `voxel_size` is linked to the setting of `point_cloud_range`: setting a smaller `voxel_size` will increase the number of voxels and the corresponding memory consumption. In addition, the following issues need to be noted:

If the `point_cloud_range` and `voxel_size` are set to `[0, -40, -3, 70.4, 40, 1]` and `[0.05, 0.05, 0.1]` respectively, then the shape of the intermediate feature map should be `[(1-(-3))/0.1+1, (40-(-40))/0.05, (70.4-0)/0.05]=[41, 1600, 1408]`. When changing `point_cloud_range`, remember to change the shape of the intermediate feature map in `middle_encoder` according to the `voxel_size`.
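You can sanity-check this arithmetic with a few lines of NumPy (a hand-rolled check, not an mmdet3d utility):

```python
import numpy as np

point_cloud_range = [0, -40, -3, 70.4, 40, 1]
voxel_size = [0.05, 0.05, 0.1]

# Number of voxels along x, y, z: (max - min) / voxel_size.
extent = np.array(point_cloud_range[3:]) - np.array(point_cloud_range[:3])
nx, ny, nz = np.round(extent / np.array(voxel_size)).astype(int).tolist()
# SECOND-style sparse encoders use a [D, H, W] dense shape with one
# extra cell along z, matching the formula in the text.
print([nz + 1, ny, nx])  # -> [41, 1600, 1408]
```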
Regarding the setting of `anchor_range`, it is generally adjusted according to the dataset. Note that the `z` value needs to be adjusted according to the position of the point cloud; please refer to this [issue](https://github.com/open-mmlab/mmdetection3d/issues/986).
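For illustration, the KITTI PointPillars config places the anchor `z` centers near the ground for each class; the values below are the KITTI ones, not a recipe for your data:

```python
# KITTI-style anchor ranges: the z value places anchor centers near
# where each category actually sits relative to the LiDAR origin.
anchor_generator = dict(
    type='AlignedAnchor3DRangeGenerator',
    ranges=[
        [0, -39.68, -0.6, 69.12, 39.68, -0.6],    # Pedestrian
        [0, -39.68, -0.6, 69.12, 39.68, -0.6],    # Cyclist
        [0, -39.68, -1.78, 69.12, 39.68, -1.78],  # Car
    ],
    sizes=[[0.8, 0.6, 1.73], [1.76, 0.6, 1.73], [3.9, 1.6, 1.56]],
    rotations=[0, 1.57],
    reshape_out=False)
```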
@@ -435,21 +439,21 @@ model = dict(

```python
        assigner=[
            dict(  # for Pedestrian
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1),
            dict(  # for Cyclist
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1),
            dict(  # for Car
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.6,
                neg_iou_thr=0.45,
                min_pos_iou=0.45,
                ...
```

@@ -482,18 +486,18 @@ _base_ = [
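The `_base_` list referenced in the marker above is elided in this view; it combines the dataset, model, schedule, and runtime configs prepared in the previous steps. A hypothetical sketch (file names assumed; check your local `configs/_base_/` tree):

```python
_base_ = [
    '../_base_/models/pointpillars_hv_secfpn_custom.py',  # model config above
    '../_base_/datasets/custom.py',  # dataset config above (name assumed)
    '../_base_/schedules/cyclic-40e.py',  # standard schedule (name assumed)
    '../_base_/default_runtime.py'
]
```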
#### Visualize your dataset (optional)

To validate whether your prepared data and config are correct, it's highly recommended to use the `tools/misc/browse_dataset.py` script to visualize your dataset and annotations before training and validation. Please refer to the [visualization doc](https://mmdetection3d.readthedocs.io/en/dev-1.x/user_guides/visualization.html) for more details.
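A typical invocation might look like the following sketch (flags vary across versions; check `python tools/misc/browse_dataset.py -h` first):

```bash
# Visualize samples produced by the training pipeline of your config
# (--output-dir is assumed; consult -h for the exact flag set).
python tools/misc/browse_dataset.py \
    configs/pointpillars/pointpillars_hv_secfpn_8xb6_custom.py \
    --output-dir ./browse_results
```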
## Evaluation

Once the data and config have been prepared, you can directly run the training/testing script following our doc.

**Note**: We only provide an implementation for KITTI-style evaluation for the customized dataset. It should be included in the dataset config:
```python
val_evaluator = dict(
    type='KittiMetric',
    ann_file=data_root + 'custom_infos_val.pkl',  # specify your validation pkl info
    metric='bbox')
```
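With the evaluator configured, training and testing go through the standard entry points, e.g. (checkpoint path illustrative):

```bash
# train on the customized dataset
python tools/train.py configs/pointpillars/pointpillars_hv_secfpn_8xb6_custom.py

# test a trained checkpoint (path illustrative)
python tools/test.py configs/pointpillars/pointpillars_hv_secfpn_8xb6_custom.py \
    work_dirs/pointpillars_hv_secfpn_8xb6_custom/epoch_80.pth
```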
# Customized Dataset

In this section, you will learn how to train and test predefined models with customized datasets.

The basic steps are as follows:
@@ -10,13 +10,13 @@

## Data Preparation

Ideally we can reorganize the customized raw data and convert the annotation format into KITTI style. However, considering that calibration files and 3D annotations in KITTI format are difficult to obtain for customized datasets, we introduce the basic data format in this doc.

### Basic Data Format

#### Point Cloud Format

Currently, we only support `.bin` format point clouds for training and inference. Before training on your own datasets, you need to convert point cloud files of other formats into `.bin` files. Common point cloud data formats include `.pcd` and `.las`; we list some open-source tools for reference.

1. Convert `.pcd` to `.bin`: https://github.com/DanielPollithy/pypcd
@@ -26,7 +26,7 @@

```bash
pip install git+https://github.com/DanielPollithy/pypcd.git
```
- You can use the following script to read the `.pcd` file, convert it to `.bin` format, and save it:
```python
import numpy as np
...
```

@@ -42,11 +42,11 @@

```python
    ...
    f.write(points.tobytes())
```
2. Convert `.las` to `.bin`: The common conversion path is `.las -> .pcd -> .bin`, and the `.las -> .pcd` conversion can be achieved through [this tool](https://github.com/Hitachi-Automotive-And-Industry-Lab/semantic-segmentation-editor).
#### Label Format

The most basic information, i.e. the 3D bounding box and category label of each scene, should be contained in the `.txt` annotation file. Each line represents a 3D box in a certain scene as follows:
```
# format: [x, y, z, dx, dy, dz, yaw, category_name]
```

@@ -55,13 +55,13 @@

```
...
```
**Note**: Currently we only support KITTI-style evaluation for customized datasets.

The 3D boxes should be stored in a unified 3D coordinate frame.

#### Calibration Format

The point cloud data collected by each LiDAR are usually fused and converted to a certain LiDAR coordinate frame, so typically the `.txt` calibration file should contain the intrinsic matrix of each camera and the extrinsic transformation matrix from the LiDAR to each camera, where `Px` represents the intrinsic matrix of `camera_x` and `lidar2camx` represents the extrinsic transformation matrix from the `lidar` to `camera_x`.
```
P0
...
```

@@ -82,7 +82,7 @@ lidar2cam4
#### LiDAR-Based 3D Detection

The raw data for LiDAR-based 3D object detection are typically organized as follows, where `ImageSets` contains split files indicating which files belong to the training/validation set, `points` contains the point cloud data stored in `.bin` format, and `labels` contains the label files for 3D detection.
```
mmdetection3d
...
```

@@ -104,9 +104,9 @@ mmdetection3d

```
│   │   │   ├── ...
```
#### Vision-Based 3D Detection

The raw data for vision-based 3D object detection are typically organized as follows, where `ImageSets` contains split files indicating which files belong to the training/validation set, `images` contains images from different cameras (for example, images from `camera_x` should be placed under `images/images_x`), `calibs` contains calibration information files storing the intrinsic matrix of each camera, and `labels` contains the label files for 3D detection.
```
mmdetection3d
...
```

@@ -174,7 +174,7 @@ mmdetection3d
#### LiDAR-Based 3D Semantic Segmentation

The raw data for LiDAR-based 3D semantic segmentation are typically organized as follows, where `ImageSets` contains split files indicating which files belong to the training/validation set, `points` contains the point cloud data, and `semantic_mask` contains the point-level labels.
```
mmdetection3d
...
```

@@ -200,8 +200,8 @@ mmdetection3d
Once you have prepared the raw data following our instructions, you can directly use the following command to generate training/validation information files.
```bash
python tools/create_data.py custom --root-path ./data/custom --out-dir ./data/custom --extra-tag custom
```
## An example of customized dataset

@@ -211,8 +211,8 @@ python tools/create_data.py base --root-path ./data/custom --out-dir ./data/cust
```python
import mmengine

from mmdet3d.registry import DATASETS
from .det3d_dataset import Det3DDataset


@DATASETS.register_module()
class MyDataset(Det3DDataset):

    # replace with all the classes in your customized pkl info file
    METAINFO = {
        'classes': ('Pedestrian', 'Cyclist', 'Car')
    }

    def parse_ann_info(self, info):
        """Process the `instances` in data info to `ann_info`.

        Args:
            info (dict): Data information of single data sample.

        Returns:
            dict: Annotation information consists of the following keys:

                - gt_bboxes_3d (:obj:`LiDARInstance3DBoxes`):
                  3D ground truth bboxes.
                - gt_labels_3d (np.ndarray): Labels of ground truths.
        """
        ann_info = super().parse_ann_info(info)
        if ann_info is None:
            ...
        return ann_info
```
After the data preprocessing, users can train the customized dataset with the following two steps:

1. Modify the config file to use the customized dataset.
2. Check the correctness of the annotations of the customized dataset.

Here we take training PointPillars on a customized dataset as an example:
@@ -265,8 +269,8 @@

```python
# dataset settings
dataset_type = 'MyDataset'
data_root = 'data/custom/'
class_names = ['Pedestrian', 'Cyclist', 'Car']  # replace with your dataset classes
point_cloud_range = [0, -40, -3, 70.4, 40, 1]  # adjust according to your dataset
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(classes=class_names)

train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,  # replace with your point cloud data dimension
        use_dim=4),  # replace with the actual dimensions used in training and inference
    dict(
        type='LoadAnnotations3D',
        ...),
    ...
]

test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,  # replace with your point cloud data dimension
        use_dim=4),
    dict(type='Pack3DDetInputs', keys=['points'])
]

train_dataloader = dict(
    ...
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='custom_infos_train.pkl',  # specify your training pkl info
        data_prefix=dict(pts='points'),
        pipeline=train_pipeline,
        modality=input_modality,
        ...))

val_dataloader = dict(
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_prefix=dict(pts='points'),
        ann_file='custom_infos_val.pkl',  # specify your validation pkl info
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        ...
        box_type_3d='LiDAR'))

val_evaluator = dict(
    type='KittiMetric',
    ann_file=data_root + 'custom_infos_val.pkl',  # specify your validation pkl info
    metric='bbox')
```
#### Prepare model config

For voxel-based detectors such as SECOND, PointPillars, and CenterPoint, the point cloud range and voxel size should be adjusted according to your dataset. Theoretically, `voxel_size` is linked to the setting of `point_cloud_range`: setting a smaller `voxel_size` will increase the number of voxels and the corresponding memory consumption. In addition, the following issues need to be noted:

If the `point_cloud_range` and `voxel_size` are set to `[0, -40, -3, 70.4, 40, 1]` and `[0.05, 0.05, 0.1]` respectively, then the shape of the intermediate feature map should be `[(1-(-3))/0.1+1, (40-(-40))/0.05, (70.4-0)/0.05]=[41, 1600, 1408]`. When changing `point_cloud_range`, remember to change the shape of the intermediate feature map in `middle_encoder` according to the `voxel_size`.

Regarding the setting of `anchor_range`, it is generally adjusted according to the dataset. Note that the `z` value needs to be adjusted according to the position of the point cloud; please refer to this [issue](https://github.com/open-mmlab/mmdetection3d/issues/986).

Regarding the setting of `anchor_size`, it is usually necessary to compute the mean length, width, and height of the objects in the whole training set as the `anchor_size` to obtain the best results; a small sketch for this is given below.
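A minimal sketch, reusing the `[x, y, z, dx, dy, dz, yaw, category_name]` label layout from the Label Format section (the helper name is hypothetical):

```python
import numpy as np
from collections import defaultdict

def mean_anchor_sizes(label_files, classes):
    """Average dx, dy, dz per category over all training label files."""
    sizes = defaultdict(list)
    for path in label_files:
        with open(path, 'r') as f:
            for line in f:
                parts = line.strip().split()
                if len(parts) >= 8:
                    # columns 3:6 hold dx, dy, dz in the label format above
                    sizes[parts[7]].append([float(v) for v in parts[3:6]])
    return {c: np.mean(sizes[c], axis=0).tolist() for c in classes if sizes[c]}
```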
`configs/_base_/models/pointpillars_hv_secfpn_custom.py`:
```python
voxel_size = [0.16, 0.16, 4]  # adjust according to your dataset
point_cloud_range = [0, -39.68, -3, 69.12, 39.68, 1]  # adjust according to your dataset
model = dict(
    type='VoxelNet',
    data_preprocessor=dict(
        ...
```

@@ -404,7 +408,7 @@ model = dict(

```python
        feat_channels=384,
        use_direction_classifier=True,
        assign_per_class=True,
        # adjust `ranges` and `sizes` according to your dataset
        anchor_generator=dict(
            type='AlignedAnchor3DRangeGenerator',
            ranges=[
                ...
```

@@ -433,21 +437,21 @@ model = dict(

```python
        assigner=[
            dict(  # for Pedestrian
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1),
            dict(  # for Cyclist
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1),
            dict(  # for Car
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.6,
                neg_iou_thr=0.45,
                min_pos_iou=0.45,
                ...
```

@@ -468,7 +472,7 @@ model = dict(
#### Prepare overall config

We combine all the configs above in the file `configs/pointpillars/pointpillars_hv_secfpn_8xb6_custom.py`:

```python
_base_ = [
    ...
]
```
#### Visualize your dataset (optional)

To validate whether your prepared data and config are correct, it's highly recommended to use the `tools/misc/browse_dataset.py` script to visualize your dataset and annotations before training and validation. Please refer to the [visualization doc](https://mmdetection3d.readthedocs.io/zh_CN/dev-1.x/user_guides/visualization.html) for more details.
## Evaluation

Once the data and config have been prepared, you can directly run the training/testing script following our doc.

**Note**: We only provide a KITTI-style evaluation implementation for customized datasets. It should be included in the dataset config:
```python
val_evaluator = dict(
    type='KittiMetric',
    ann_file=data_root + 'custom_infos_val.pkl',  # specify your validation pkl info
    metric='bbox')
```