Commit e585f0d5 authored by Xiangxu-0103, committed by ZwwWayne

[Docs] Update Chinese documentation (#1891)

parent 0be27ffb
@@ -60,7 +60,7 @@ a part of the OpenMMLab project developed by [MMLab](http://mmlab.ie.cuhk.edu.hk
- **High efficiency**

  It trains faster than other codebases. The main results are as below. Details can be found in [benchmark.md](./docs/en/notes/benchmarks.md). We compare the number of samples trained per second (the higher, the better). The models that are not supported by other codebases are marked by `×`.

| Methods | MMDetection3D | [OpenPCDet](https://github.com/open-mmlab/OpenPCDet) | [votenet](https://github.com/facebookresearch/votenet) | [Det3D](https://github.com/poodarchu/Det3D) |
| :-----------------: | :-----------: | :--------------------------------------------------: | :----------------------------------------------------: | :-----------------------------------------: |
@@ -226,7 +226,7 @@ Results and models are available in the [model zoo](docs/en/model_zoo.md).
| MonoFlex | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| SA-SSD | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗ |

**Note:** All the **300+ models and methods from 40+ papers** in 2D detection supported by [MMDetection](https://github.com/open-mmlab/mmdetection/blob/3.x/docs/en/model_zoo.md) can be trained or used in this codebase.

## Installation

@@ -234,7 +234,7 @@ Please refer to [getting_started.md](docs/en/getting_started.md) for installatio
## Get Started

Please see [getting_started.md](docs/en/getting_started.md) for the basic usage of MMDetection3D. We provide guidance for a quick run [with existing datasets](docs/en/user_guides/train_test.md) and [with customized datasets](docs/en/user_guides/2_new_data_model.md) for beginners. There are also tutorials on [learning configuration systems](docs/en/user_guides/config.md), [adding new datasets](docs/en/advanced_guides/customize_dataset.md), [designing data pipelines](docs/en/user_guides/data_pipeline.md), [customizing models](docs/en/advanced_guides/customize_models.md), [customizing runtime settings](docs/en/advanced_guides/customize_runtime.md) and the [Waymo dataset](docs/en/advanced_guides/datasets/waymo_det.md).

Please refer to [FAQ](docs/en/notes/faq.md) for frequently asked questions. When updating the version of MMDetection3D, please also check the [compatibility doc](docs/en/notes/compatibility.md) to be aware of the BC-breaking updates introduced in each version.
...
@@ -24,13 +24,13 @@
[![codecov](https://codecov.io/gh/open-mmlab/mmdetection3d/branch/master/graph/badge.svg)](https://codecov.io/gh/open-mmlab/mmdetection3d)
[![license](https://img.shields.io/github/license/open-mmlab/mmdetection3d.svg)](https://github.com/open-mmlab/mmdetection3d/blob/master/LICENSE)

**News**

Version **v1.1.0rc1** was released on 2022.10.11.

Due to the unification and simplification of the coordinate systems, model compatibility is affected. Currently, most models have been aligned in accuracy with similar performance, while a few models are still being benchmarked. We will update all model weights and benchmarks in upcoming versions. You can find more details in the [changelog](docs/zh_cn/notes/changelog.md) and the [v1.0.x changelog](docs/zh_cn/notes/changelog_v1.0.x.md).

Documentation: https://mmdetection3d.readthedocs.io/

## Introduction

@@ -38,7 +38,7 @@
The code on the main branch currently supports PyTorch 1.6 and above.

MMDetection3D is an open-source, PyTorch-based toolbox for object detection and the next-generation platform for 3D detection. It is part of the OpenMMLab project jointly launched by the Multimedia Laboratory of The Chinese University of Hong Kong and SenseTime.

![demo image](resources/mmdet3d_outdoor_demo.gif)

@@ -50,17 +50,17 @@ MMDetection3D is an open-source, PyTorch-based toolbox for object detection and the next-generation
- **Support for indoor/outdoor datasets**

  Supports indoor/outdoor 3D detection datasets, including ScanNet, SUNRGB-D, Waymo, nuScenes, Lyft, and KITTI.

  For the nuScenes dataset, we also support the [nuImages dataset](https://github.com/open-mmlab/mmdetection3d/tree/1.1/configs/nuimages).

- **Natural integration with 2D detectors**

  All the **300+ models and algorithms from 40+ papers** supported by [MMDetection](https://github.com/open-mmlab/mmdetection/blob/3.x/docs/zh_cn/model_zoo.md), as well as the related modules, can be trained or used in this codebase.

- **High efficiency**

  It trains faster than other codebases. The table below shows the main comparison results. More details can be found in the [benchmark documentation](./docs/zh_cn/notes/benchmarks.md). We compare the number of samples trained per second (the higher, the better). Models not supported by other codebases are marked with `×`.

| Methods | MMDetection3D | [OpenPCDet](https://github.com/open-mmlab/OpenPCDet) | [votenet](https://github.com/facebookresearch/votenet) | [Det3D](https://github.com/poodarchu/Det3D) |
| :-----------------: | :-----------: | :--------------------------------------------------: | :----------------------------------------------------: | :-----------------------------------------: |
@@ -70,7 +70,7 @@ MMDetection3D is an open-source, PyTorch-based toolbox for object detection and the next-generation
| SECOND | 40 | 30 | × | × |
| Part-A2 | 17 | 14 | × | × |

Like [MMDetection](https://github.com/open-mmlab/mmdetection) and [MMCV](https://github.com/open-mmlab/mmcv), MMDetection3D can also be used as a library to support various projects.

## License

@@ -80,7 +80,7 @@ MMDetection3D is an open-source, PyTorch-based toolbox for object detection and the next-generation
We released version **1.1.0rc1** on 2022.10.11.

More details and release history can be found in [changelog.md](docs/zh_cn/notes/changelog.md).

## Benchmark and Model Zoo

@@ -226,7 +226,7 @@ MMDetection3D is an open-source, PyTorch-based toolbox for object detection and the next-generation
| MonoFlex | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| SA-SSD | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗ |

**Note:** All the **300+ models and algorithms from 40+ papers** in 2D detection supported by [MMDetection](https://github.com/open-mmlab/mmdetection/blob/3.x/docs/zh_cn/model_zoo.md) can be trained or used in MMDetection3D.

## Installation

@@ -234,7 +234,7 @@ MMDetection3D is an open-source, PyTorch-based toolbox for object detection and the next-generation
## Get Started

Please refer to [getting_started.md](docs/zh_cn/getting_started.md) to learn the basic usage of MMDetection3D. For beginners, we provide guides for using [existing datasets](docs/zh_cn/user_guides/train_test.md) and [new datasets](docs/zh_cn/user_guides/2_new_data_model.md). We also provide advanced tutorials covering [learning about configs](docs/zh_cn/user_guides/config.md), [adding new datasets](docs/zh_cn/advanced_guides/customize_dataset.md), [designing new data pipelines](docs/zh_cn/user_guides/data_pipeline.md), [customizing models](docs/zh_cn/advanced_guides/customize_models.md), [customizing runtime settings](docs/zh_cn/advanced_guides/customize_runtime.md) and the [Waymo dataset](docs/zh_cn/advanced_guides/datasets/waymo_det.md).

Please refer to the [FAQ](docs/zh_cn/notes/faq.md) for frequently asked questions. When upgrading MMDetection3D, please also check the [compatibility doc](docs/zh_cn/notes/compatibility.md) to be aware of the backward-incompatible updates introduced in each version.
...
@@ -83,7 +83,7 @@ mmdetection3d
#### Vision-Based 3D Detection

The raw data for vision-based 3D object detection are typically organized as follows, where `ImageSets` contains split files indicating which files belong to the training/validation set, `images` contains the images from different cameras (for example, images from `camera_x` need to be placed in `images/images_x`), `calibs` contains calibration information files which store the camera intrinsic matrix of each camera, and `labels` includes label files for 3D detection.

```
mmdetection3d
@@ -201,18 +201,13 @@ class MyDataset(Det3DDataset):
    }

    def parse_ann_info(self, info):
        """Process the `instances` in data info to `ann_info`.

        Args:
            info (dict): Info dict.

        Returns:
            dict | None: Processed `ann_info`.
        """
        ann_info = super().parse_ann_info(info)
        if ann_info is None:
@@ -464,9 +459,8 @@ _base_ = [
#### Visualize your dataset (optional)

To validate whether your prepared data and config are correct, it's highly recommended to use the `tools/misc/browse_dataset.py` script to visualize your dataset and annotations before training and validation. For more details, refer to the [visualization](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/docs/en/user_guides/visualization.md) doc.

## Evaluation
...
@@ -128,7 +128,7 @@ Create a new file `mmdet3d/models/necks/second_fpn.py`.

```python
from mmengine.model import BaseModule

from mmdet3d.registry import MODELS


@MODELS.register_module()
class SECONDFPN(BaseModule):

    def __init__(self,
@@ -571,8 +571,8 @@ The decorator `weighted_loss` enable the loss to be weighted for each element.
import torch
import torch.nn as nn

from mmdet3d.registry import MODELS
from mmdet.models.losses.utils import weighted_loss


@weighted_loss
def my_loss(pred, target):
@@ -580,7 +580,7 @@ def my_loss(pred, target):
    loss = torch.abs(pred - target)
    return loss


@MODELS.register_module()
class MyLoss(nn.Module):

    def __init__(self, reduction='mean', loss_weight=1.0):
...
@@ -26,7 +26,7 @@ optim_wrapper = dict(
    clip_grad=dict(max_norm=0.01, norm_type=2))
```

### Customize optimizer supported by PyTorch

We already support using all the optimizers implemented by PyTorch, and the only modification is to change the `optimizer` field in the `optim_wrapper` field of the config files. For example, if you want to use `Adam` (note that the performance could drop a lot), the modification could be as follows.
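For reference, a minimal sketch of such a modification is shown below; the learning rate and weight decay values are placeholders, not recommended settings.

```python
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='Adam', lr=0.0003, weight_decay=0.0001))
```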
@@ -192,7 +192,7 @@ Tricks not implemented by the optimizer should be implemented through optimizer
## Customize training schedules

By default we use a step learning rate with the 1x schedule; this calls [`MultiStepLR`](https://github.com/open-mmlab/mmengine/blob/main/mmengine/optim/scheduler/lr_scheduler.py#L139) in MMEngine.
We support many other learning rate schedules [here](https://github.com/open-mmlab/mmengine/blob/main/mmengine/optim/scheduler/lr_scheduler.py), such as the `CosineAnnealingLR` and `PolyLR` schedules. Here are some examples:

- Poly schedule:

@@ -219,7 +219,6 @@ We support many other learning rate schedule [here](https://github.com/open-mmla
        begin=0,
        end=8,
        by_epoch=True)]
```
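- Cosine annealing schedule (a minimal sketch assuming MMEngine's `CosineAnnealingLR` parameters; the `eta_min` value is a placeholder):

```python
param_scheduler = [
    dict(
        type='CosineAnnealingLR',
        T_max=8,
        eta_min=1e-5,
        begin=0,
        end=8,
        by_epoch=True)]
```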
## Customize train loop

@@ -257,7 +256,9 @@ MMEngine provides many useful [hooks](https://github.com/open-mmlab/mmengine/blo
Here we give an example of creating a new hook in mmdet3d and using it in training.

```python
from mmengine.hooks import Hook

from mmdet3d.registry import HOOKS


@HOOKS.register_module()
@@ -341,7 +342,7 @@ There are some common hooks that are registered through `default_hooks`, they ar
- `CheckpointHook`: A hook that saves checkpoints periodically.
- `DistSamplerSeedHook`: A hook that sets the seed for sampler and batch_sampler.

`IterTimerHook`, `ParamSchedulerHook` and `DistSamplerSeedHook` are simple and usually do not need to be modified, so here we show what we can do with `LoggerHook`, `CheckpointHook` and `Det3DVisualizationHook`.
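For reference, a minimal sketch of how these hooks are typically wired up through `default_hooks` in a config is shown below; the hook types follow MMEngine/MMDetection3D naming, while the interval values are illustrative placeholders.

```python
default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=50),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=1),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    visualization=dict(type='Det3DVisualizationHook'))
```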
#### CheckpointHook
...
@@ -43,11 +43,10 @@ wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/sec
wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/val.txt --no-check-certificate --content-disposition -O ./data/kitti/ImageSets/val.txt
wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/trainval.txt --no-check-certificate --content-disposition -O ./data/kitti/ImageSets/trainval.txt

python tools/create_data.py kitti --root-path ./data/kitti --out-dir ./data/kitti --extra-tag kitti --with-plane
```

Note that if your local disk does not have enough space for saving the converted data, you can change `--out-dir` to point anywhere else, and you need to remove the `--with-plane` flag if `planes` are not prepared.

The folder structure after processing should be as below:

@@ -79,37 +78,36 @@ kitti
├── kitti_infos_trainval.pkl
```
- `kitti_gt_database/xxxxx.bin`: point cloud data included in each 3D bounding box of the training dataset.
- `kitti_infos_train.pkl`: training dataset, a dict containing two keys: `metainfo` and `data_list`.
  `metainfo` contains the basic information of the dataset itself, such as `categories`, `dataset` and `info_version`, while `data_list` is a list of dicts, and each dict (hereinafter referred to as `info`) contains all the detailed information of a single sample as follows:
  - info\['sample_idx'\]: The index of this sample in the whole dataset.
  - info\['images'\]: Information of images captured by multiple cameras. A dict containing five keys: `CAM0`, `CAM1`, `CAM2`, `CAM3`, `R0_rect`.
    - info\['images'\]\['R0_rect'\]: Rectifying rotation matrix with shape (4, 4).
    - info\['images'\]\['CAM2'\]: Information about the `CAM2` camera sensor.
      - info\['images'\]\['CAM2'\]\['img_path'\]: The filename of the image.
      - info\['images'\]\['CAM2'\]\['height'\]: The height of the image.
      - info\['images'\]\['CAM2'\]\['width'\]: The width of the image.
      - info\['images'\]\['CAM2'\]\['cam2img'\]: Transformation matrix from camera to image with shape (4, 4).
      - info\['images'\]\['CAM2'\]\['lidar2cam'\]: Transformation matrix from lidar to camera with shape (4, 4).
      - info\['images'\]\['CAM2'\]\['lidar2img'\]: Transformation matrix from lidar to image with shape (4, 4).
  - info\['lidar_points'\]: A dict containing all the information related to the lidar points.
    - info\['lidar_points'\]\['lidar_path'\]: The filename of the lidar point cloud data.
    - info\['lidar_points'\]\['num_pts_feats'\]: The feature dimension of each point.
    - info\['lidar_points'\]\['Tr_velo_to_cam'\]: Transformation from Velodyne coordinate to camera coordinate with shape (4, 4).
    - info\['lidar_points'\]\['Tr_imu_to_velo'\]: Transformation from IMU coordinate to Velodyne coordinate with shape (4, 4).
  - info\['instances'\]: A list of dicts. Each dict contains all the annotation information of a single instance. For the i-th instance:
    - info\['instances'\]\[i\]\['bbox'\]: List of 4 numbers representing the 2D bounding box of the instance, in (x1, y1, x2, y2) order.
    - info\['instances'\]\[i\]\['bbox_3d'\]: List of 7 numbers representing the 3D bounding box of the instance, in (x, y, z, w, h, l, yaw) order.
    - info\['instances'\]\[i\]\['bbox_label'\]: An int indicating the 2D label of the instance, where -1 means ignore.
    - info\['instances'\]\[i\]\['bbox_label_3d'\]: An int indicating the 3D label of the instance, where -1 means ignore.
    - info\['instances'\]\[i\]\['depth'\]: Projected center depth of the 3D bounding box with respect to the image plane.
    - info\['instances'\]\[i\]\['num_lidar_pts'\]: The number of LiDAR points in the 3D bounding box.
    - info\['instances'\]\[i\]\['center_2d'\]: Projected 2D center of the 3D bounding box.
    - info\['instances'\]\[i\]\['difficulty'\]: KITTI difficulty: 'Easy', 'Moderate' or 'Hard'.
    - info\['instances'\]\[i\]\['truncated'\]: Float from 0 (non-truncated) to 1 (truncated), where truncated refers to the object leaving image boundaries.
    - info\['instances'\]\[i\]\['occluded'\]: Integer (0, 1, 2, 3) indicating occlusion state: 0 = fully visible, 1 = partly occluded, 2 = largely occluded, 3 = unknown.
    - info\['instances'\]\[i\]\['group_ids'\]: Used for multi-part objects.
  - info\['plane'\] (optional): Road level information.

Please refer to [kitti_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/kitti_converter.py) and [update_infos_to_v2.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/update_infos_to_v2.py) for more details.
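To get a feel for this structure, you can load and inspect the generated info file directly. The snippet below is a minimal sketch (not part of the conversion tools); the path assumes the default `--out-dir` used above.

```python
from mmengine.fileio import load

# Load the generated info file (path assumes the default --out-dir above).
infos = load('data/kitti/kitti_infos_train.pkl')
print(infos['metainfo'])  # e.g. categories, dataset, info_version

info = infos['data_list'][0]  # `info` dict of the first sample
print(info['sample_idx'])
print(info['lidar_points']['lidar_path'], info['lidar_points']['num_pts_feats'])
print(info['images']['CAM2']['img_path'])
for instance in info['instances']:
    print(instance['bbox_3d'], instance['bbox_label_3d'])
```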
@@ -156,7 +154,7 @@ train_pipeline = [

An example to evaluate PointPillars with 8 GPUs with KITTI metrics is as follows:

```shell
bash tools/dist_test.sh configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py work_dirs/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class/latest.pth 8
```

## Metrics

@@ -180,7 +178,7 @@ aos AP:97.70, 89.11, 87.38
## Testing and making a submission

An example to test PointPillars on KITTI with 8 GPUs and generate a submission to the leaderboard is as follows:

- First, you need to modify the `test_evaluator` dict in your config file to add `pklfile_prefix` and `submission_prefix`, just like:
@@ -188,7 +186,7 @@ An example to test PointPillars on KITTI with 8 GPUs and generate a submission t
data_root = 'data/kitti/'
test_evaluator = dict(
    type='KittiMetric',
    ann_file=data_root + 'kitti_infos_test.pkl',
    metric='bbox',
    pklfile_prefix='results/kitti-3class/kitti_results',
    submission_prefix='results/kitti-3class/kitti_results')
@@ -199,15 +197,15 @@ test_evaluator = dict(

```shell
mkdir -p results/kitti-3class
./tools/dist_test.sh configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py work_dirs/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class/latest.pth 8
```

- Or you can add `--cfg-options 'test_evaluator.pklfile_prefix=results/kitti-3class/kitti_results' 'test_evaluator.submission_prefix=results/kitti-3class/kitti_results'` to the test command and run the test script directly.

```shell
mkdir -p results/kitti-3class
./tools/dist_test.sh configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py work_dirs/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class/latest.pth 8 --cfg-options 'test_evaluator.pklfile_prefix=results/kitti-3class/kitti_results' 'test_evaluator.submission_prefix=results/kitti-3class/kitti_results'
```

After generating `results/kitti-3class/kitti_results/xxxxx.txt` files, you can submit these files to the KITTI benchmark. Please refer to the [KITTI official website](http://www.cvlibs.net/datasets/kitti/index.php) for more details.
@@ -81,32 +81,33 @@ mmdetection3d
```

- `lyft_infos_train.pkl`: training dataset, a dict containing two keys: `metainfo` and `data_list`.
  `metainfo` contains the basic information of the dataset itself, such as `categories`, `dataset` and `info_version`, while `data_list` is a list of dicts, and each dict (hereinafter referred to as `info`) contains all the detailed information of a single sample as follows:
  - info\['sample_idx'\]: The index of this sample in the whole dataset.
  - info\['token'\]: Sample data token.
  - info\['timestamp'\]: Timestamp of the sample data.
  - info\['lidar_points'\]: A dict containing all the information related to the lidar points.
    - info\['lidar_points'\]\['lidar_path'\]: The filename of the lidar point cloud data.
    - info\['lidar_points'\]\['num_pts_feats'\]: The feature dimension of each point.
    - info\['lidar_points'\]\['lidar2ego'\]: The transformation matrix from this lidar sensor to ego vehicle. (4x4 list)
    - info\['lidar_points'\]\['ego2global'\]: The transformation matrix from the ego vehicle to global coordinates. (4x4 list)
  - info\['lidar_sweeps'\]: A list containing sweeps information (the intermediate lidar frames without annotations).
    - info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['data_path'\]: The lidar data path of the i-th sweep.
    - info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['lidar2ego'\]: The transformation matrix from this lidar sensor to ego vehicle at the i-th sweep timestamp. (4x4 list)
    - info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['ego2global'\]: The transformation matrix from the ego vehicle at the i-th sweep timestamp to global coordinates. (4x4 list)
    - info\['lidar_sweeps'\]\[i\]\['lidar2sensor'\]: The transformation matrix from the keyframe lidar to the i-th frame lidar. (4x4 list)
    - info\['lidar_sweeps'\]\[i\]\['timestamp'\]: Timestamp of the sweep data.
    - info\['lidar_sweeps'\]\[i\]\['sample_data_token'\]: The sweep sample data token.
  - info\['images'\]: A dict containing six keys corresponding to each camera: `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`. Each dict contains all the data information related to the corresponding camera.
    - info\['images'\]\['CAM_XXX'\]\['img_path'\]: The filename of the image.
    - info\['images'\]\['CAM_XXX'\]\['cam2img'\]: The transformation matrix recording the intrinsic parameters when projecting 3D points to each image plane. (3x3 list)
    - info\['images'\]\['CAM_XXX'\]\['sample_data_token'\]: Sample data token of the image.
    - info\['images'\]\['CAM_XXX'\]\['timestamp'\]: Timestamp of the image.
    - info\['images'\]\['CAM_XXX'\]\['cam2ego'\]: The transformation matrix from this camera sensor to ego vehicle. (4x4 list)
    - info\['images'\]\['CAM_XXX'\]\['lidar2cam'\]: The transformation matrix from lidar sensor to this camera. (4x4 list)
  - info\['instances'\]: A list of dicts. Each dict contains all the annotation information of a single instance. For the i-th instance:
    - info\['instances'\]\[i\]\['bbox_3d'\]: List of 7 numbers representing the 3D bounding box of the instance in the lidar coordinate system, in (x, y, z, l, w, h, yaw) order.
    - info\['instances'\]\[i\]\['bbox_label_3d'\]: An int starting from 0 indicating the label of the instance, where -1 means ignore.
    - info\['instances'\]\[i\]\['bbox_3d_isvalid'\]: Whether each bounding box is valid. In general, we only take the 3D boxes that include at least one lidar or radar point as valid boxes.

Next, we will elaborate on the differences compared to nuScenes in terms of the details recorded in these info files.

@@ -114,10 +115,10 @@ Next, we will elaborate on the difference compared to nuScenes in terms of the d
- `lyft_infos_train.pkl`:

  - Without info\['instances'\]\[i\]\['velocity'\]: there is no velocity measurement on Lyft.
  - Without info\['instances'\]\[i\]\['num_lidar_pts'\] and info\['instances'\]\[i\]\['num_radar_pts'\].

Here we only explain the data recorded in the training info files. The same applies to the validation set and test set (without instances).

Please refer to [lyft_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/tools/dataset_converters/lyft_converter.py) for more details about the structure of `lyft_infos_xxx.pkl`.
@@ -133,12 +134,10 @@ train_pipeline = [
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5),
    dict(
        type='LoadPointsFromMultiSweeps',
        sweeps_num=10),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(
        type='GlobalRotScaleTrans',
...
@@ -56,29 +56,30 @@ mmdetection3d
- `nuscenes_database/xxxxx.bin`: point cloud data included in each 3D bounding box of the training dataset.
- `nuscenes_infos_train.pkl`: training dataset, a dict containing two keys: `metainfo` and `data_list`.
  `metainfo` contains the basic information of the dataset itself, such as `categories`, `dataset` and `info_version`, while `data_list` is a list of dicts, and each dict (hereinafter referred to as `info`) contains all the detailed information of a single sample as follows:
  - info\['sample_idx'\]: The index of this sample in the whole dataset.
  - info\['token'\]: Sample data token.
  - info\['timestamp'\]: Timestamp of the sample data.
  - info\['lidar_points'\]: A dict containing all the information related to the lidar points.
    - info\['lidar_points'\]\['lidar_path'\]: The filename of the lidar point cloud data.
    - info\['lidar_points'\]\['num_pts_feats'\]: The feature dimension of each point.
    - info\['lidar_points'\]\['lidar2ego'\]: The transformation matrix from this lidar sensor to ego vehicle. (4x4 list)
    - info\['lidar_points'\]\['ego2global'\]: The transformation matrix from the ego vehicle to global coordinates. (4x4 list)
  - info\['lidar_sweeps'\]: A list containing sweeps information (the intermediate lidar frames without annotations).
    - info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['data_path'\]: The lidar data path of the i-th sweep.
    - info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['lidar2ego'\]: The transformation matrix from this lidar sensor to ego vehicle. (4x4 list)
    - info\['lidar_sweeps'\]\[i\]\['lidar_points'\]\['ego2global'\]: The transformation matrix from the ego vehicle to global coordinates. (4x4 list)
    - info\['lidar_sweeps'\]\[i\]\['lidar2sensor'\]: The transformation matrix from the main lidar sensor to the current sensor (for collecting the sweep data). (4x4 list)
    - info\['lidar_sweeps'\]\[i\]\['timestamp'\]: Timestamp of the sweep data.
    - info\['lidar_sweeps'\]\[i\]\['sample_data_token'\]: The sweep sample data token.
  - info\['images'\]: A dict containing six keys corresponding to each camera: `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`. Each dict contains all the data information related to the corresponding camera.
    - info\['images'\]\['CAM_XXX'\]\['img_path'\]: The filename of the image.
    - info\['images'\]\['CAM_XXX'\]\['cam2img'\]: The transformation matrix recording the intrinsic parameters when projecting 3D points to each image plane. (3x3 list)
    - info\['images'\]\['CAM_XXX'\]\['sample_data_token'\]: Sample data token of the image.
    - info\['images'\]\['CAM_XXX'\]\['timestamp'\]: Timestamp of the image.
    - info\['images'\]\['CAM_XXX'\]\['cam2ego'\]: The transformation matrix from this camera sensor to ego vehicle. (4x4 list)
    - info\['images'\]\['CAM_XXX'\]\['lidar2cam'\]: The transformation matrix from lidar sensor to this camera. (4x4 list)
  - info\['instances'\]: A list of dicts. Each dict contains all the annotation information of a single instance. For the i-th instance:
    - info\['instances'\]\[i\]\['bbox_3d'\]: List of 7 numbers representing the 3D bounding box of the instance, in (x, y, z, l, w, h, yaw) order.
    - info\['instances'\]\[i\]\['bbox_label_3d'\]: An int indicating the label of the instance, where -1 means ignore.
    - info\['instances'\]\[i\]\['velocity'\]: Velocities of 3D bounding boxes (no vertical measurements due to inaccuracy), a list of shape (2,).
@@ -86,19 +87,19 @@ mmdetection3d
    - info\['instances'\]\[i\]\['num_radar_pts'\]: Number of radar points included in each 3D bounding box.
    - info\['instances'\]\[i\]\['bbox_3d_isvalid'\]: Whether each bounding box is valid. In general, we only take the 3D boxes that include at least one lidar or radar point as valid boxes.
  - info\['cam_instances'\]: A dict containing keys `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`. For the vision-based 3D object detection task, we split the 3D annotations of the whole scene according to the camera they belong to. For the i-th instance:
    - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox_label'\]: Label of the instance.
    - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox_label_3d'\]: Label of the instance.
    - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox'\]: 2D bounding box annotation (exterior rectangle of the projected 3D box), a list arranged as \[x1, y1, x2, y2\].
    - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['center_2d'\]: Projected center location on the image, a list of shape (2,).
    - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['depth'\]: The depth of the projected center.
    - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['velocity'\]: Velocities of 3D bounding boxes (no vertical measurements due to inaccuracy), a list of shape (2,).
    - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['attr_label'\]: The attribute label of the instance. We maintain a default attribute collection and mapping for attribute classification.
    - info\['cam_instances'\]\['CAM_XXX'\]\[i\]\['bbox_3d'\]: List of 7 numbers representing the 3D bounding box of the instance, in (x, y, z, l, h, w, yaw) order.

Note:

1. The differences between `bbox_3d` in `instances` and that in `cam_instances`.
   Both `bbox_3d` fields have been converted to the MMDet3D coordinate system, but `bbox_3d` in `instances` is in the LiDAR coordinate format while `bbox_3d` in `cam_instances` is in the camera coordinate format. Mind the difference between them in the 3D box representation ('l, w, h' vs. 'l, h, w').

2. Here we only explain the data recorded in the training info files. The same applies to the validation and test sets (the pkl of the test set does not contain `instances` and `cam_instances`).
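To make the note above concrete, the snippet below is a small sketch (not taken from the official docs) that wraps both annotation formats with the corresponding MMDet3D box structures; the literal box values are made up for illustration.

```python
import torch

from mmdet3d.structures import CameraInstance3DBoxes, LiDARInstance3DBoxes

# `instances` boxes are in the LiDAR coordinate system: (x, y, z, l, w, h, yaw).
lidar_boxes = LiDARInstance3DBoxes(
    torch.tensor([[10.0, 2.0, -1.5, 4.2, 1.8, 1.6, 0.3]]))

# `cam_instances` boxes are in the camera coordinate system: (x, y, z, l, h, w, yaw).
cam_boxes = CameraInstance3DBoxes(
    torch.tensor([[2.0, 1.5, 10.0, 4.2, 1.6, 1.8, 0.3]]))

print(lidar_boxes.dims)  # (l, w, h) per box, following the LiDAR-box convention
print(cam_boxes.dims)    # (l, h, w) per box, following the camera-box convention
```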
@@ -117,12 +118,10 @@ train_pipeline = [
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=5,
        use_dim=5),
    dict(
        type='LoadPointsFromMultiSweeps',
        sweeps_num=10),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(
        type='GlobalRotScaleTrans',
...
@@ -195,8 +195,7 @@ train_pipeline = [
        jitter_std=[0.01, 0.01, 0.01],
        clip_range=[-0.05, 0.05]),
    dict(type='RandomDropPointsColor', drop_ratio=0.2),
    dict(type='Pack3DDetInputs', keys=['points', 'pts_semantic_mask'])
]
```

@@ -210,7 +209,7 @@ train_pipeline = [
## Metrics

Typically mean intersection over union (mIoU) is used for evaluation on S3DIS. In detail, we first compute the IoU for each class and then average them to get the mIoU; please refer to [seg_eval.py](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/seg_eval.py).
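For reference, the per-class IoU and mIoU described above can be sketched in a few lines of NumPy; this is an illustration only, not the library implementation.

```python
import numpy as np


def mean_iou(pred, gt, num_classes):
    """Compute per-class IoU and mIoU from flat label arrays."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, g in zip(pred, gt):
        conf[g, p] += 1  # rows: ground truth, columns: prediction
    ious = []
    for c in range(num_classes):
        tp = conf[c, c]
        fp = conf[:, c].sum() - tp
        fn = conf[c, :].sum() - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom > 0 else np.nan)
    return np.nanmean(ious), ious


miou, per_class_iou = mean_iou(
    pred=np.array([0, 1, 1, 2]), gt=np.array([0, 1, 2, 2]), num_classes=3)
```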
As introduced in the section `Export S3DIS data`, S3DIS trains on 5 areas and evaluates on the remaining 1 area, but there are also other area split schemes in different papers.
To enable flexible combinations of train-val splits, we use a sub-dataset to represent one area and concatenate them to form a larger training set. An example of training on areas 1, 2, 3, 4, 6 and evaluating on area 5 is shown below:

@@ -222,31 +221,42 @@ class_names = ('ceiling', 'floor', 'wall', 'beam', 'column', 'window', 'door',
               'table', 'chair', 'sofa', 'bookcase', 'board', 'clutter')
train_area = [1, 2, 3, 4, 6]
test_area = 5
train_dataloader = dict(
    batch_size=8,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_files=[f's3dis_infos_Area_{i}.pkl' for i in train_area],
        metainfo=metainfo,
        data_prefix=data_prefix,
        pipeline=train_pipeline,
        modality=input_modality,
        ignore_index=len(class_names),
        scene_idxs=[
            f'seg_info/Area_{i}_resampled_scene_idxs.npy' for i in train_area
        ],
        test_mode=False))
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_files=f's3dis_infos_Area_{test_area}.pkl',
        metainfo=metainfo,
        data_prefix=data_prefix,
        pipeline=test_pipeline,
        modality=input_modality,
        ignore_index=len(class_names),
        scene_idxs=f'seg_info/Area_{test_area}_resampled_scene_idxs.npy',
        test_mode=True))
val_dataloader = test_dataloader
``` ```
where we specify the areas used for training/validation by setting `ann_files` and `scene_idxs` to lists of the corresponding paths. The train-val split can be modified simply by changing the `train_area` and `test_area` variables.
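For example, to train on areas 2-6 and hold out area 1 instead (a hypothetical split, not one used in the original doc), only these two variables need to change; the dataloaders above pick up the new file lists automatically:
```python
train_area = [2, 3, 4, 5, 6]
test_area = 1
```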
@@ -226,13 +226,13 @@ scannet
- `semantic_mask/xxxxx.bin`: The semantic label for each point, value range: \[1, 40\], i.e. the `nyu40id` standard. Note: the `nyu40id` IDs will be mapped to train IDs by `PointSegClassMapping` in the train pipeline.
- `posed_images/scenexxxx_xx`: The set of `.jpg` images with `.txt` 4x4 poses and a single `.txt` file with the camera intrinsic matrix.
- `scannet_infos_train.pkl`: The train data infos; the detailed info of each scan is as follows (a small loading sketch follows this list):
  - info\['lidar_points'\]: A dict containing all information related to the lidar points.
    - info\['lidar_points'\]\['lidar_path'\]: The filename of the `xxx.bin` file of lidar points.
    - info\['lidar_points'\]\['num_pts_feats'\]: The feature dimension of each point.
    - info\['lidar_points'\]\['axis_align_matrix'\]: The transformation matrix to align the axes.
  - info\['pts_semantic_mask_path'\]: The filename of the `xxx.bin` file containing the semantic mask annotation.
  - info\['pts_instance_mask_path'\]: The filename of the `xxx.bin` file containing the instance mask annotation.
  - info\['instances'\]: A list of dicts containing all annotations; each dict contains all annotation information of a single instance. For the i-th instance:
    - info\['instances'\]\[i\]\['bbox_3d'\]: List of 6 numbers representing the axis-aligned 3D bounding box of the instance in the depth coordinate system, in (x, y, z, l, w, h) order.
    - info\['instances'\]\[i\]\['bbox_label_3d'\]: The label of the 3D bounding box.
- `scannet_infos_val.pkl`: The val data infos, which share the same format as `scannet_infos_train.pkl`.
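As a quick sanity check, the info files can be inspected directly with `pickle`. This is a minimal sketch, assuming the refactored info format that stores per-scan entries under a `data_list` key; all field names follow the list above:
```python
import pickle

with open('scannet_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)

# per-scan dicts are assumed to live under 'data_list' in the refactored format
data_list = infos['data_list'] if isinstance(infos, dict) else infos
info = data_list[0]

print(info['lidar_points']['lidar_path'])         # e.g. a 'xxx.bin' filename
print(info['lidar_points']['axis_align_matrix'])  # 4x4 axis-alignment matrix
for instance in info['instances']:
    # label and axis-aligned box in (x, y, z, l, w, h) order
    print(instance['bbox_label_3d'], instance['bbox_3d'])
```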
......
@@ -2,7 +2,7 @@
## Dataset preparation
The overall process is similar to the ScanNet 3D detection task. Please refer to this [section](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/docs/en/advanced_guides/datasets/scannet_det.md#dataset-preparation). Only a few differences and additional information about the 3D semantic segmentation data are listed below.
### Export ScanNet data
@@ -100,8 +100,7 @@ train_pipeline = [
        enlarge_size=0.2,
        min_unique_num=None),
    dict(type='NormalizePointsColor', color_mean=None),
    dict(type='Pack3DDetInputs', keys=['points', 'pts_semantic_mask'])
]
```
@@ -111,7 +110,7 @@ train_pipeline = [
## Metrics
Typically, mean Intersection over Union (mIoU) is used for evaluation on ScanNet. In detail, we first compute the IoU for each class and then average them to get the mIoU; please refer to [seg_eval](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/seg_eval.py).
## Testing and Making a Submission
......
@@ -151,21 +151,21 @@ sunrgbd
├── sunrgbd_infos_val.pkl
```
- `points/xxxxxx.bin`: The point cloud data after downsampling.
- `sunrgbd_infos_train.pkl`: The train data infos; the detailed info of each scene is as follows:
  - info\['lidar_points'\]: A dict containing all information related to the lidar points.
    - info\['lidar_points'\]\['num_pts_feats'\]: The feature dimension of each point.
    - info\['lidar_points'\]\['lidar_path'\]: The filename of the `xxx.bin` file of lidar points.
  - info\['images'\]: A dict containing all information related to the image data.
    - info\['images'\]\['CAM0'\]\['img_path'\]: The filename of the image.
    - info\['images'\]\['CAM0'\]\['depth2img'\]: Transformation matrix from depth to image with shape (4, 4) (see the projection sketch after this list).
    - info\['images'\]\['CAM0'\]\['height'\]: The height of the image.
    - info\['images'\]\['CAM0'\]\['width'\]: The width of the image.
  - info\['instances'\]: A list of dicts containing all the annotations of this frame; each dict corresponds to the annotations of a single instance. For the i-th instance:
    - info\['instances'\]\[i\]\['bbox_3d'\]: List of 7 numbers representing the 3D bounding box in the depth coordinate system.
    - info\['instances'\]\[i\]\['bbox'\]: List of 4 numbers representing the 2D bounding box of the instance, in (x1, y1, x2, y2) order.
    - info\['instances'\]\[i\]\['bbox_label_3d'\]: An int indicating the 3D label of the instance; -1 indicates the ignore class.
    - info\['instances'\]\[i\]\['bbox_label'\]: An int indicating the 2D label of the instance; -1 indicates the ignore class.
- `sunrgbd_infos_val.pkl`: The val data infos, which share the same format as `sunrgbd_infos_train.pkl`.
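To illustrate how the 4x4 `depth2img` matrix is used, here is a minimal NumPy sketch that projects points from the depth coordinate system onto the image plane (the function name and the perspective-division step are illustrative assumptions, not part of the original doc):
```python
import numpy as np

def project_to_image(pts_depth, depth2img):
    """pts_depth: (N, 3) points in depth coordinates; depth2img: (4, 4) matrix."""
    pts_hom = np.concatenate([pts_depth, np.ones((len(pts_depth), 1))], axis=1)
    pts_img = pts_hom @ depth2img.T        # (N, 4) homogeneous image-space points
    uv = pts_img[:, :2] / pts_img[:, 2:3]  # perspective division -> pixel (u, v)
    return uv
```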
## Train pipeline
@@ -232,16 +232,16 @@ train_pipeline = [
        shift_height=True),
    dict(
        type='Pack3DDetInputs',
        keys=['points', 'gt_bboxes_3d', 'gt_labels_3d', 'img', 'gt_bboxes', 'gt_bboxes_labels'])
]
```
Data augmentation for images:
- `Resize`: resize the input image; `keep_ratio=True` means the aspect ratio of the image is kept unchanged.
- `RandomFlip`: randomly flip the input image.
The image augmentation functions are implemented in [MMDetection](https://github.com/open-mmlab/mmdetection/tree/dev-3.x/mmdet/datasets/transforms).
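For reference, these two transforms would typically appear in the pipeline roughly as follows; the list name `img_aug`, the target scale and the flip probability below are illustrative assumptions, not values taken from the original config:
```python
img_aug = [
    dict(type='Resize', scale=(1333, 600), keep_ratio=True),  # keep the aspect ratio
    dict(type='RandomFlip', prob=0.5),                        # flip with probability 0.5
]
```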
## Metrics
......
@@ -101,7 +101,7 @@ Considering there are many similar frames in the original dataset, we can basica
## Evaluation
For evaluation on Waymo, please follow the [instruction](https://github.com/waymo-research/waymo-open-dataset/blob/master/docs/quick_start.md) to build the binary file `compute_detection_metrics_main` for metrics computation and put it into `mmdet3d/core/evaluation/waymo_utils/`. Basically, you can follow the commands below to install `bazel` and build the file.
```shell
# download the code and enter the base directory
@@ -145,7 +145,7 @@ Then you can evaluate your models on Waymo. An example to evaluate PointPillars
The following is an example to test PointPillars on Waymo with 8 GPUs, generate the bin files, and make a submission to the leaderboard.
`submission_prefix` should be set in the `test_evaluator` of the configuration before you run the test command if you want to generate the bin files and make a submission to the leaderboard.
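As a rough sketch of what that looks like in the config (the evaluator type and the commented-out fields are assumptions based on typical Waymo configs in this codebase; only `submission_prefix` is confirmed by the text above, and the output path is hypothetical):
```python
test_evaluator = dict(
    type='WaymoMetric',  # assumed evaluator type; keep whatever your base config already uses
    # ... other evaluator fields inherited from the base config ...
    submission_prefix='./results/waymo_submission')  # prefix under which the bin files are dumped
```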
After generating the bin file, you can simply build the binary file `create_submission` and use it to create a submission file by following the [instruction](https://github.com/waymo-research/waymo-open-dataset/blob/master/docs/quick_start.md). Basically, here are some example commands.
......
@@ -7,7 +7,8 @@ MMDection3D works on Linux, Windows (experimental support) and macOS and require
- PyTorch 1.6+
- CUDA 9.2+ (If you build PyTorch from source, CUDA 9.0 is also compatible)
- GCC 5+
- [MMEngine](https://mmengine.readthedocs.io/zh_CN/latest/#installation)
- [MMCV](https://mmcv.readthedocs.io/zh_CN/latest/#installation)
```{note}
If you are experienced with PyTorch and have already installed it, just skip this part and jump to the [next section](#installation). Otherwise, you can follow these steps for the preparation.
@@ -99,7 +100,7 @@ pip install -v -e . # or "python setup.py develop"
Note:
1. The git commit id will be written to the version number with step 4, e.g. 0.6.0+2e7045c. The version will also be saved in trained models.
   It is recommended that you run step 4 each time you pull some updates from GitHub. If C++/CUDA code is modified, then this step is compulsory.
> Important: Be sure to remove the `./build` folder if you reinstall mmdet with a different CUDA/PyTorch version.
@@ -197,7 +198,7 @@ Examples:
to_ply('./test.obj', './test.ply', 'obj')
```
More demos about single/multi-modality and indoor/outdoor 3D detection can be found in [demo](user_guides/inference.md).
## Customize Installation
@@ -216,7 +217,7 @@ Installing CUDA runtime libraries is enough if you follow our best practices, be
### Install MMEngine without MIM
To install MMEngine with pip instead of MIM, please follow the [MMEngine installation guide](https://mmengine.readthedocs.io/en/latest/get_started/installation.html).
For example, you can install MMEngine with the following command.
@@ -280,5 +281,5 @@ pip install -e .
## Troubleshooting
If you have some issues during the installation, please first view the [FAQ](notes/faq.md) page.
You may [open an issue](https://github.com/open-mmlab/mmdetection3d/issues/new/choose) on GitHub if no solution is found.
@@ -2,7 +2,7 @@
We list some potential troubles encountered by users and developers, along with their corresponding solutions. Feel free to enrich the list if you find any frequent issues and contribute your solutions to solve them. If you have any trouble with environment configuration, model training, etc., please create an issue using the [provided templates](https://github.com/open-mmlab/mmdetection3d/blob/master/.github/ISSUE_TEMPLATE/error-report.md/) and fill in all required information in the template.
## MMEngine/MMCV/MMDet/MMDet3D Installation
- Compatibility issue between MMEngine, MMCV, MMDetection and MMDetection3D; "ConvWS is already registered in conv layer"; "AssertionError: MMCV==xxx is used but incompatible. Please install mmcv>=xxx, \<=xxx."
@@ -14,7 +14,7 @@ We list some potential troubles encountered by users and developers, along with
| v1.1.0rc1 | mmengine>=0.1.0, \<1.0.0 | mmcv>=2.0.0rc0, \<2.1.0 | mmdet>=3.0.0rc0, \<3.1.0 |
| v1.1.0rc0 | mmengine>=0.1.0, \<1.0.0 | mmcv>=2.0.0rc0, \<2.1.0 | mmdet>=3.0.0rc0, \<3.1.0 |
**Note:** If you want to install mmdet3d-v1.0.0rcx, the compatible MMDetection, MMSegmentation and MMCV versions table can be found [here](https://mmdetection3d.readthedocs.io/en/latest/faq.html#mmcv-mmdet-mmdet3d-installation). Please choose the correct versions of MMCV, MMDetection and MMSegmentation to avoid installation issues.
- If you face the error shown below when importing open3d:
......
@@ -2,6 +2,7 @@
:maxdepth: 1
benchmarks.md
changelog_v1.0.x.md
changelog.md
compatibility.md
faq.md
@@ -149,6 +149,6 @@ You can also delete the local training log after backing up to the specified Cep
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook', out_dir='s3://openmmlab/mmdetection3d', keep_local=False),
    ])
```
# Learn about Configs
MMDetection3D and other OpenMMLab repositories use [MMEngine's config system](https://mmengine.readthedocs.io/en/latest/tutorials/config.html). It has a modular and inheritance design, which is convenient for conducting various experiments.
If you wish to inspect the config file, you may run `python tools/misc/print_config.py /PATH/TO/CONFIG` to see the complete config.
## Config File Content
MMDetection3D uses a modular design, and all modules with different functions can be configured through the config. Taking PointPillars as an example, we will introduce each field in the config according to the different functional modules.
### Model config
@@ -446,7 +446,7 @@ resume = False
## Config file inheritance
There are 4 basic component types under `configs/_base_`: dataset, model, schedule, and default_runtime.
Many methods, such as SECOND, PointPillars, PartA2, and VoteNet, can be easily constructed with one of each.
The configs that are composed of components from `_base_` are called _primitive_.
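For instance, a primitive config usually just lists one file of each component type from `_base_`; the exact file names below are illustrative and may differ between branches:
```python
_base_ = [
    '../_base_/models/pointpillars_hv_secfpn_kitti.py',  # model
    '../_base_/datasets/kitti-3d-3class.py',             # dataset
    '../_base_/schedules/cyclic-40e.py',                 # schedule
    '../_base_/default_runtime.py'                       # runtime defaults
]
```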
......
@@ -7,6 +7,7 @@ MMDetection3D uses three different coordinate systems. The existence of differen
Despite the variety of datasets and equipment, by summarizing the line of work on 3D object detection we can roughly categorize coordinate systems into three types:
- Camera coordinate system -- the coordinate system of most cameras, in which the positive direction of the y-axis points to the ground, the positive direction of the x-axis points to the right, and the positive direction of the z-axis points to the front.
```
      up  z    front
       |    ^
@@ -22,7 +23,9 @@ Despite the variety of datasets and equipment, by summarizing the line of works
       v
  y down
```
- LiDAR coordinate system -- the coordinate system of many LiDARs, in which the negative direction of the z-axis points to the ground, the positive direction of the x-axis points to the front, and the positive direction of the y-axis points to the left.
```
      z up  x front
       ^    ^
@@ -32,7 +35,9 @@ Despite the variety of datasets and equipment, by summarizing the line of works
       |/
y left <------ 0 ------ right
```
- Depth coordinate system -- the coordinate system used by VoteNet, H3DNet, etc., in which the negative direction of the z-axis points to the ground, the positive direction of the x-axis points to the right, and the positive direction of the y-axis points to the front.
```
      z up  y front
       ^    ^
......
@@ -15,6 +15,7 @@ defines how to process the annotations and a data pipeline defines all the steps
A pipeline consists of a sequence of operations. Each operation takes a dict as input and outputs a dict for the next transform.
We present a classical pipeline in the following figure. The blue blocks are pipeline operations. As the pipeline proceeds, each operator can add new keys (marked in green) to the result dict or update the existing keys (marked in orange).
![](../../../resources/data_pipeline.png)
The operations are categorized into data loading, pre-processing, formatting and test-time augmentation.
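A minimal sketch of how these categories show up in a pipeline definition (the transform names are existing transforms in this codebase, but the specific parameter values here are illustrative assumptions):
```python
train_pipeline = [
    # data loading
    dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
    # pre-processing
    dict(type='PointsRangeFilter', point_cloud_range=[-50, -50, -5, 50, 50, 3]),
    # formatting
    dict(type='Pack3DDetInputs', keys=['points'])
]
```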
......