[Docs] Add SemanticKITTI docs (#2515)

* add md
* fix table
* fix dataset_prepare

Unverified Commit 2e70a38c authored May 12, 2023 by Sun Jiahao Committed by GitHub May 12, 2023
7 changed files
--- a/docs/en/advanced_guides/datasets/index.rst
+++ b/docs/en/advanced_guides/datasets/index.rst
@@ -8,3 +8,4 @@
   sunrgbd.md
   scannet.md
   s3dis.md
+   semantickitti.md
--- a/docs/en/advanced_guides/datasets/semantickitti.md
+++ b/docs/en/advanced_guides/datasets/semantickitti.md
+# SemanticKITTI Dataset
+
+This page provides a tutorial on using MMDetection3D with the SemanticKITTI dataset.
+
+## Prepare dataset
+
+You can download the SemanticKITTI dataset [HERE](http://semantic-kitti.org/dataset.html#download) and unzip all zip files.
+
+As with other datasets, it is recommended to symlink the dataset root to `$MMDETECTION3D/data`.
+
+Before processing, the folder structure should be organized as follows:
+
+```
+mmdetection3d
+├── mmdet3d
+├── tools
+├── configs
+├── data
+│   ├── semantickitti
+│   │   ├── sequences
+│   │   │   ├── 00
+│   │   │   │   ├── labels
+│   │   │   │   ├── velodyne
+│   │   │   ├── 01
+│   │   │   ├── ..
+│   │   │   ├── 22
+```
+
+The SemanticKITTI dataset contains 23 sequences: sequences \[0-7\] and \[9-10\] form the training set (about 19k training samples), sequence 8 is the validation set (about 4k validation samples), and sequences \[11-22\] form the test set (about 20k test samples). Each sequence contains a `velodyne` folder with the LiDAR point cloud data and a `labels` folder with the segmentation annotations. In each label, the high 16 bits store the instance segmentation annotation and the low 16 bits store the semantic segmentation annotation.
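
The bit layout described above can be illustrated with a short NumPy sketch. It assumes each label is stored as one unsigned 32-bit integer per point; the function name `split_semantickitti_label` is hypothetical and not part of MMDetection3D.

```python
import numpy as np


def split_semantickitti_label(raw):
    """Split packed 32-bit labels into (semantic, instance) arrays.

    Assumption: one uint32 per point, low 16 bits = semantic id,
    high 16 bits = instance id, as described in the paragraph above.
    """
    raw = np.asarray(raw, dtype=np.uint32)
    semantic = raw & 0xFFFF  # keep the low 16 bits
    instance = raw >> 16     # shift the high 16 bits down
    return semantic, instance


# Synthetic labels: (instance 3, semantic 10) and (instance 0, semantic 40).
packed = np.array([(3 << 16) | 10, 40], dtype=np.uint32)
semantic, instance = split_semantickitti_label(packed)
print(semantic.tolist(), instance.tolist())  # [10, 40] [3, 0]
```

For a real `.label` file, the packed array could presumably be read with `np.fromfile(path, dtype=np.uint32)`. Taking the value modulo `2**16` yields the same semantic ids as the bit mask used here.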
+
+### Create SemanticKITTI Dataset
+
+We provide a script that generates the dataset information used for training and testing. Create the `.pkl` info files by running:
+
+```bash
+python ./tools/create_data.py semantickitti --root-path ./data/semantickitti --out-dir ./data/semantickitti --extra-tag semantickitti
+```
+
+After processing, the folder structure should be as follows:
+
+```
+mmdetection3d
+├── mmdet3d
+├── tools
+├── configs
+├── data
+│   ├── semantickitti
+│   │   ├── sequences
+│   │   │   ├── 00
+│   │   │   │   ├── labels
+│   │   │   │   ├── velodyne
+│   │   │   ├── 01
+│   │   │   ├── ..
+│   │   │   ├── 22
+│   │   ├── semantickitti_infos_test.pkl
+│   │   ├── semantickitti_infos_train.pkl
+│   │   ├── semantickitti_infos_val.pkl
+```
+
+- `semantickitti_infos_train.pkl`: training dataset info, a dict containing two keys: `metainfo` and `data_list`.
+  `metainfo` contains the basic information about the dataset itself, while `data_list` is a list of dicts. Each dict (hereinafter referred to as `info`) contains all the detailed information of a single sample, as follows:
+  - info\['sample_id'\]: The index of this sample in the whole dataset.
+  - info\['lidar_points'\]: A dict containing all the information related to the lidar points.
+    - info\['lidar_points'\]\['lidar_path'\]: The filename of the lidar point cloud data.
+    - info\['lidar_points'\]\['num_pts_feats'\]: The feature dimension of each point.
+  - info\['pts_semantic_mask_pth'\]: The path of the 3D semantic segmentation annotation file.
+
+Please refer to [semantickitti_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/semantickitti_converter.py) and [update_infos_to_v2.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/update_infos_to_v2.py) for more details.
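
To make the info layout above concrete, here is a minimal sketch that writes a tiny stand-in `.pkl` file and reads it back. The key names are copied from the list above, but every value (paths, ids, `metainfo` content) is made up for illustration, and `summarize_infos` is a hypothetical helper. Check the keys against a file actually generated by the converter.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in content with made-up values; only the key names follow the docs.
fake_infos = {
    'metainfo': {'dataset': 'semantickitti'},
    'data_list': [{
        'sample_id': '0',
        'lidar_points': {
            'lidar_path': 'sequences/00/velodyne/000000.bin',
            'num_pts_feats': 4,
        },
        'pts_semantic_mask_pth': 'sequences/00/labels/000000.label',
    }],
}


def summarize_infos(pkl_path):
    """Return (number of samples, sorted keys of the first sample)."""
    with open(pkl_path, 'rb') as f:
        infos = pickle.load(f)
    data_list = infos['data_list']
    return len(data_list), sorted(data_list[0].keys())


with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = Path(tmp_dir) / 'fake_infos.pkl'
    with open(pkl_path, 'wb') as f:
        pickle.dump(fake_infos, f)
    num_samples, sample_keys = summarize_infos(pkl_path)

print(num_samples, sample_keys)
```

Pointing `summarize_infos` at `data/semantickitti/semantickitti_infos_train.pkl` is a quick way to confirm which keys the generated file really contains.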
+
+## Train pipeline
+
+A typical training pipeline for 3D segmentation on SemanticKITTI is shown below:
+
+```python
+train_pipeline = [
+    dict(
+        type='LoadPointsFromFile',
+        coord_type='LIDAR',
+        load_dim=4,
+        use_dim=4,
+        backend_args=backend_args),
+    dict(
+        type='LoadAnnotations3D',
+        with_bbox_3d=False,
+        with_label_3d=False,
+        with_seg_3d=True,
+        seg_3d_dtype='np.int32',
+        seg_offset=2**16,
+        dataset_type='semantickitti',
+        backend_args=backend_args),
+    dict(type='PointSegClassMapping'),
+    dict(
+        type='RandomFlip3D',
+        sync_2d=False,
+        flip_ratio_bev_horizontal=0.5,
+        flip_ratio_bev_vertical=0.5),
+    dict(
+        type='GlobalRotScaleTrans',
+        rot_range=[-0.78539816, 0.78539816],
+        scale_ratio_range=[0.95, 1.05],
+        translation_std=[0.1, 0.1, 0.1],
+    ),
+    dict(type='Pack3DDetInputs', keys=['points', 'pts_semantic_mask'])
+]
+```
+
+- Data augmentation:
+  - `RandomFlip3D`: randomly flip the input point cloud horizontally or vertically.
+  - `GlobalRotScaleTrans`: rotate, scale, and translate the input point cloud.
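
The two augmentations listed above can be pictured with a simplified NumPy sketch. This is not the MMDetection3D implementation: the function `augment_points`, the choice of which axis each flip negates, and the order rotate, scale, translate are all assumptions made for illustration. The default ranges mirror the values in the config above.

```python
import numpy as np


def augment_points(points, rng, flip_ratio=0.5,
                   rot_range=(-0.78539816, 0.78539816),
                   scale_range=(0.95, 1.05), translation_std=0.1):
    """Apply random flips and a global rotation/scale/translation.

    points: (N, 4) array of x, y, z, intensity. Only xyz is modified.
    """
    pts = points.copy()
    xyz = pts[:, :3]  # view into pts, so in-place edits update pts

    # Random flips in bird's-eye view (axis choice is an assumption).
    if rng.random() < flip_ratio:
        xyz[:, 1] = -xyz[:, 1]
    if rng.random() < flip_ratio:
        xyz[:, 0] = -xyz[:, 0]

    # Global rotation around the z axis.
    angle = rng.uniform(*rot_range)
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    xyz[:] = xyz @ rot.T

    # Global scaling, then one shared random translation.
    xyz *= rng.uniform(*scale_range)
    xyz += rng.normal(0.0, translation_std, size=3)
    return pts


rng = np.random.default_rng(0)
cloud = np.array([[1.0, 2.0, 3.0, 0.5], [4.0, -1.0, 0.0, 0.9]])
augmented = augment_points(cloud, rng)
print(augmented.shape)  # (2, 4)
```

Because every point receives the same transform, the per-point labels in `pts_semantic_mask` stay valid without any change.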
+
+## Evaluation
+
+An example of evaluating MinkUNet with 8 GPUs using the SemanticKITTI metrics is as follows:
+
+```shell
+bash tools/dist_test.sh configs/minkunet/minkunet_w32_8xb2-15e_semantickitti.py checkpoints/minkunet_w32_8xb2-15e_semantickitti_20230309_160710-7fa0a6f1.pth 8
+```
+
+## Metrics
+
+Typically, mean intersection over union (mIoU) is used for evaluation on SemanticKITTI. Specifically, we first compute the IoU for each class and then average them to get the mIoU. For details, please refer to [seg_eval.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/seg_eval.py).
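
The per-class IoU and its average can be sketched with a confusion matrix as follows. This is a simplified illustration rather than the code in `seg_eval.py`: the function `per_class_iou` is hypothetical, ignored labels are not handled, and treating classes that never occur as NaN is an assumption.

```python
import numpy as np


def per_class_iou(pred, gt, num_classes):
    """Compute IoU per class from integer prediction and ground-truth arrays."""
    # Confusion matrix: rows index ground truth, columns index predictions.
    conf = np.bincount(
        gt * num_classes + pred,
        minlength=num_classes ** 2).reshape(num_classes, num_classes)
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    # Classes absent from both pred and gt get NaN and are skipped by nanmean.
    return np.where(union > 0, tp / np.maximum(union, 1), np.nan)


gt = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 1, 1, 1, 2, 0])
iou = per_class_iou(pred, gt, num_classes=3)
miou = np.nanmean(iou)
print(iou, miou)  # per-class IoU of 1/3, 2/3, 1/2 and an mIoU of 0.5
```

In this toy case, class 0 has one true positive out of a union of three points, class 1 has two out of three, and class 2 has one out of two, which averages to 0.5.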
+
+An example of printed evaluation results is as follows:
+
+| classes | car    | bicycle | motorcycle | truck  | bus    | person | bicyclist | motorcyclist | road   | parking | sidewalk | other-ground | building | fence  | vegetation | trunck | terrian | pole   | traffic-sign | miou   | acc    | acc_cls |
+| ------- | ------ | ------- | ---------- | ------ | ------ | ------ | --------- | ------------ | ------ | ------- | -------- | ------------ | -------- | ------ | ---------- | ------ | ------- | ------ | ------------ | ------ | ------ | ------- |
+| results | 0.9687 | 0.1908  | 0.6313     | 0.8580 | 0.6359 | 0.6818 | 0.8444    | 0.0002       | 0.9353 | 0.4854  | 0.8106   | 0.0024       | 0.9050   | 0.6111 | 0.8822     | 0.6605 | 0.7493  | 0.6442 | 0.4837       | 0.6306 | 0.9202 | 0.6924  |
--- a/docs/en/user_guides/dataset_prepare.md
+++ b/docs/en/user_guides/dataset_prepare.md
@@ -71,6 +71,14 @@ mmdetection3d
 │   │   ├── sunrgbd_data.py
 │   │   ├── sunrgbd_utils.py
 │   │   ├── README.md
+│   ├── semantickitti
+│   │   ├── sequences
+│   │   │   ├── 00
+│   │   │   │   ├── labels
+│   │   │   │   ├── velodyne
+│   │   │   ├── 01
+│   │   │   ├── ..
+│   │   │   ├── 22

 ```

@@ -177,6 +185,20 @@ python tools/dataset_converters/lyft_data_fixer.py --version v1.01 --root-folder

 Note that we follow the original folder names for clear organization. Please rename the raw folders as shown above. Also note that the second command serves the purpose of fixing a corrupted lidar data file. Please refer to the [discussion](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/discussion/110000) for more details.

+### SemanticKITTI
+
+Download the SemanticKITTI dataset [HERE](http://semantic-kitti.org/dataset.html#download) and unzip all zip files.
+
+Then generate info files by running:
+
+```bash
+python ./tools/create_data.py semantickitti --root-path ./data/semantickitti --out-dir ./data/semantickitti --extra-tag semantickitti
+```
+
+**Tips**:
+
+- **Ready-made Annotations**. We also provide SemanticKITTI annotation files generated offline [here](#summary-of-annotation-files). You can download them and place them under `data/semantickitti/`.
+
 ### S3DIS, ScanNet and SUN RGB-D

 To prepare S3DIS data, please see its [README](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/data/s3dis/README.md).

--- a/docs/zh_cn/advanced_guides/datasets/index.rst
+++ b/docs/zh_cn/advanced_guides/datasets/index.rst
@@ -8,3 +8,4 @@
   sunrgbd.md
   scannet.md
   s3dis.md
+   semantickitti.md
--- a/docs/zh_cn/advanced_guides/datasets/s3dis.md
+++ b/docs/zh_cn/advanced_guides/datasets/s3dis.md
@@ -214,7 +214,7 @@ train_pipeline = [

 ## 度量指标

-通常我们使用平均交并比 (mean Intersection over Union, mIoU) 作为 ScanNet 语义分割任务的度量指标。
+通常我们使用平均交并比 (mean Intersection over Union, mIoU) 作为 S3DIS 语义分割任务的度量指标。
 具体而言，我们先计算所有类别的 IoU，然后取平均值作为 mIoU。
 更多实现细节请参考 [seg_eval.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/seg_eval.py)。


--- a/docs/zh_cn/advanced_guides/datasets/semantickitti.md
+++ b/docs/zh_cn/advanced_guides/datasets/semantickitti.md
+# SemanticKITTI 数据集
+
+本页提供了有关在 MMDetection3D 中使用 SemanticKITTI 数据集的具体教程。
+
+## 数据集准备
+
+您可以在[这里](http://semantic-kitti.org/dataset.html#download)下载 SemanticKITTI 数据集并解压缩所有 zip 文件。
+
+像准备数据集的一般方法一样，建议将数据集根目录链接到 `$MMDETECTION3D/data`。
+
+在我们处理之前，文件夹结构应按如下方式组织：
+
+```
+mmdetection3d
+├── mmdet3d
+├── tools
+├── configs
+├── data
+│   ├── semantickitti
+│   │   ├── sequences
+│   │   │   ├── 00
+│   │   │   │   ├── labels
+│   │   │   │   ├── velodyne
+│   │   │   ├── 01
+│   │   │   ├── ..
+│   │   │   ├── 22
+```
+
+SemanticKITTI 数据集包含 23 个序列，其中序列 \[0-7\] , \[9-10\] 作为训练集（约 19k 训练样本），序列 8 作为验证集（约 4k 验证样本），\[11-22\] 作为测试集 （约20k测试样本）。其中每个序列分别包含 velodyne 和 labels 两个文件夹分别存放激光雷达点云数据和分割标注 (其中高16位存放实例分割标注，低16位存放语义分割标注)。
+
+### 创建 SemanticKITTI 数据集
+
+我们提供了生成数据集信息的脚本，用于测试和训练。通过以下命令生成 `.pkl` 文件：
+
+```bash
+python ./tools/create_data.py semantickitti --root-path ./data/semantickitti --out-dir ./data/semantickitti --extra-tag semantickitti
+```
+
+处理后的文件夹结构应该如下：
+
+```
+mmdetection3d
+├── mmdet3d
+├── tools
+├── configs
+├── data
+│   ├── semantickitti
+│   │   ├── sequences
+│   │   │   ├── 00
+│   │   │   │   ├── labels
+│   │   │   │   ├── velodyne
+│   │   │   ├── 01
+│   │   │   ├── ..
+│   │   │   ├── 22
+│   │   ├── semantickitti_infos_test.pkl
+│   │   ├── semantickitti_infos_train.pkl
+│   │   ├── semantickitti_infos_val.pkl
+```
+
+- `semantickitti_infos_train.pkl`: 训练数据集, 该字典包含两个键值: `metainfo` 和 `data_list`.
+  `metainfo` 包含该数据集的基本信息。 `data_list` 是由字典组成的列表，每个字典（以下简称 `info`）包含了单个样本的所有详细信息。
+  - info\['sample_id'\]：该样本在整个数据集的索引。
+  - info\['lidar_points'\]：是一个字典，包含了激光雷达点相关的信息。
+    - info\['lidar_points'\]\['lidar_path'\]：激光雷达点云数据的文件名。
+    - info\['lidar_points'\]\['num_pts_feats'\]：点的特征维度
+  - info\['pts_semantic_mask_pth'\]：三维语义分割的标注文件的文件路径
+
+更多细节请参考 [semantickitti_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/semantickitti_converter.py) 和 [update_infos_to_v2.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/update_infos_to_v2.py) 。
+
+## Train pipeline
+
+下面展示了一个使用 SemanticKITTI 数据集进行 3D 语义分割的典型流程：
+
+```python
+train_pipeline = [
+    dict(
+        type='LoadPointsFromFile',
+        coord_type='LIDAR',
+        load_dim=4,
+        use_dim=4,
+        backend_args=backend_args),
+    dict(
+        type='LoadAnnotations3D',
+        with_bbox_3d=False,
+        with_label_3d=False,
+        with_seg_3d=True,
+        seg_3d_dtype='np.int32',
+        seg_offset=2**16,
+        dataset_type='semantickitti',
+        backend_args=backend_args),
+    dict(type='PointSegClassMapping'),
+    dict(
+        type='RandomFlip3D',
+        sync_2d=False,
+        flip_ratio_bev_horizontal=0.5,
+        flip_ratio_bev_vertical=0.5),
+    dict(
+        type='GlobalRotScaleTrans',
+        rot_range=[-0.78539816, 0.78539816],
+        scale_ratio_range=[0.95, 1.05],
+        translation_std=[0.1, 0.1, 0.1],
+    ),
+    dict(type='Pack3DDetInputs', keys=['points', 'pts_semantic_mask'])
+]
+```
+
+- 数据增强:
+  - `RandomFlip3D`：对输入点云数据进行随机地水平翻转或者垂直翻转。
+  - `GlobalRotScaleTrans`：对输入点云数据进行旋转、缩放、平移。
+
+## 评估
+
+使用 8 个 GPU 以及 SemanticKITTI 指标评估的 MinkUNet 的示例如下：
+
+```shell
+bash tools/dist_test.sh configs/minkunet/minkunet_w32_8xb2-15e_semantickitti.py checkpoints/minkunet_w32_8xb2-15e_semantickitti_20230309_160710-7fa0a6f1.pth 8
+```
+
+## 度量指标
+
+通常我们使用平均交并比 (mean Intersection over Union, mIoU) 作为 SemanticKITTI 语义分割任务的度量指标。
+具体而言，我们先计算所有类别的 IoU，然后取平均值作为 mIoU。
+更多实现细节请参考 [seg_eval.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/seg_eval.py)。
+
+以下是一个评估结果的样例:
+
+| classes | car    | bicycle | motorcycle | truck  | bus    | person | bicyclist | motorcyclist | road   | parking | sidewalk | other-ground | building | fence  | vegetation | trunck | terrian | pole   | traffic-sign | miou   | acc    | acc_cls |
+| ------- | ------ | ------- | ---------- | ------ | ------ | ------ | --------- | ------------ | ------ | ------- | -------- | ------------ | -------- | ------ | ---------- | ------ | ------- | ------ | ------------ | ------ | ------ | ------- |
+| results | 0.9687 | 0.1908  | 0.6313     | 0.8580 | 0.6359 | 0.6818 | 0.8444    | 0.0002       | 0.9353 | 0.4854  | 0.8106   | 0.0024       | 0.9050   | 0.6111 | 0.8822     | 0.6605 | 0.7493  | 0.6442 | 0.4837       | 0.6306 | 0.9202 | 0.6924  |
--- a/docs/zh_cn/user_guides/dataset_prepare.md
+++ b/docs/zh_cn/user_guides/dataset_prepare.md
@@ -174,6 +174,18 @@ python tools/data_converter/lyft_data_fixer.py --version v1.01 --root-folder ./d

 注意，为了文件结构的清晰性，我们遵从了 Lyft 数据原先的文件夹名称。请按照上面展示出的文件结构对原始文件夹进行重命名。同样值得注意的是，第二行命令的目的是为了修复一个损坏的激光雷达数据文件。更多细节请参考[该讨论](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/discussion/110000)。

+### SemanticKITTI
+
+在[这里](http://semantic-kitti.org/dataset.html#download)下载 SemanticKITTI 数据集并解压所有文件。通过运行以下指令对 SemanticKITTI 数据进行预处理：
+
+```bash
+python ./tools/create_data.py semantickitti --root-path ./data/semantickitti --out-dir ./data/semantickitti --extra-tag semantickitti
+```
+
+**小贴士**：
+
+- **现成的标注文件**. 我们已经提供了离线处理好的 [SemanticKITTI 标注文件](#数据集标注文件列表)。您直接下载他们并放到 `data/semantickitti` 目录下。
+
 ### S3DIS、ScanNet 和 SUN RGB-D

 请参考 S3DIS [README](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/data/s3dis/README.md) 文件以对其进行数据预处理。