Unverified commit 087ac108 authored by VVsssssk, committed by GitHub

[Docs]Fix docs about cfg and faq (#1787)



* Update faq.md

* Update getting_started.md

* Update the pytorch version

* Update faq.md

* fix cfg docs

* fix docs

* Update mmcv installation scripts

* Update mmcv names

* fix comments

* fix docs

* fix faq
Co-authored-by: Tai-Wang <tab_wang@outlook.com>
parent 32f197e5
@@ -4,7 +4,7 @@ In this section we demonstrate how to prepare an environment with PyTorch.
MMDetection3D works on Linux, Windows (experimental support) and macOS and requires the following packages:
- Python 3.6+
- PyTorch 1.6+
- CUDA 9.2+ (If you build PyTorch from source, CUDA 9.0 is also compatible)
- GCC 5+
- [MMCV](https://mmcv.readthedocs.io/en/latest/#installation)
@@ -47,55 +47,49 @@ Otherwise, you should refer to the step-by-step installation instructions in the
```shell
pip install openmim
mim install mmengine
mim install "mmcv>=2.0.0rc0"
mim install "mmdet>=3.0.0rc0"
git clone https://github.com/open-mmlab/mmdetection3d.git -b dev-1.x
cd mmdetection3d
pip install -e .
```
**Step 0.** Install [MMEngine](https://github.com/open-mmlab/mmengine) and [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).

```shell
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0rc0"
```

**Step 1.** Install [MMDetection](https://github.com/open-mmlab/mmdetection).

```shell
mim install "mmdet>=3.0.0rc0"
```

Optionally, you could also build MMDetection from source in case you want to modify the code:

```shell
git clone https://github.com/open-mmlab/mmdetection.git -b dev-3.x
# "-b dev-3.x" means checkout to the `dev-3.x` branch.
cd mmdetection
pip install -v -e .
# "-v" means verbose, or more output
# "-e" means installing a project in editable mode,
# thus any local modifications made to the code will take effect without reinstallation.
```

**Step 2.** Clone the MMDetection3D repository.

```shell
git clone https://github.com/open-mmlab/mmdetection3d.git -b dev-1.x
# "-b dev-1.x" means checkout to the `dev-1.x` branch.
cd mmdetection3d
```

**Step 3.** Install build requirements and then install MMDetection3D.

```shell
pip install -v -e .  # or "python setup.py develop"
```
@@ -156,7 +150,7 @@ python demo/pcd_demo.py ${PCD_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device
Examples:

```shell
python demo/pcd_demo.py demo/data/kitti/000008.bin configs/second/second_hv-secfpn_8xb6-80e_kitti-3d-car.py checkpoints/second_hv-secfpn_8xb6-80e_kitti-3d-car_20200620_230238-393f000c.pth
```
If you want to input a `ply` file, you can use the following function to convert it to `bin` format. Then you can use the converted `bin` file to run the demo.
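One way to implement that conversion is sketched below (assuming the `plyfile`, `numpy`, and `pandas` packages are installed; the function name `convert_ply` and the file paths are illustrative):

```python
import numpy as np
import pandas as pd
from plyfile import PlyData


def convert_ply(input_path, output_path):
    """Convert a .ply point cloud to the flat float32 .bin layout the demo expects."""
    plydata = PlyData.read(input_path)  # read the ply file
    data = pd.DataFrame(plydata.elements[0].data)  # one column per vertex property
    data_np = np.zeros(data.shape, dtype=np.float32)
    for i, name in enumerate(plydata.elements[0].data.dtype.names):
        data_np[:, i] = data[name]
    data_np.tofile(output_path)  # raw little-endian float32, no header


convert_ply('./test.ply', './test.bin')
```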
@@ -218,16 +212,26 @@ Please make sure the GPU driver satisfies the minimum version requirements. See
Installing CUDA runtime libraries is enough if you follow our best practices, because no CUDA code will be compiled locally. However, if you hope to compile MMCV from source or develop other CUDA operators, you need to install the complete CUDA toolkit from NVIDIA's [website](https://developer.nvidia.com/cuda-downloads), and its version should match the CUDA version of PyTorch, i.e., the specified version of cudatoolkit in the `conda install` command.
```
### Install MMEngine without MIM

To install MMEngine with pip instead of MIM, please follow the [MMEngine installation guide](https://mmengine.readthedocs.io/en/latest/get_started/installation.html).

For example, you can install MMEngine with the following command:
```shell
pip install mmengine
```
### Install MMCV without MIM
MMCV contains C++ and CUDA extensions, thus depending on PyTorch in a complex way. MIM solves such dependencies automatically and makes the installation easier. However, it is not a must.

To install MMCV with pip instead of MIM, please follow the [MMCV installation guide](https://mmcv.readthedocs.io/en/latest/get_started/installation.html). This requires manually specifying a find-url based on the PyTorch version and its CUDA version.

For example, the following command installs mmcv built for PyTorch 1.10.x and CUDA 11.3.
```shell
pip install mmcv -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html
```
### Using MMDetection3D with Docker
@@ -250,25 +254,24 @@ docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/mmdetection3d/data mmdete
Here is a full script for setting up MMDetection3D with conda.
```shell
conda create -n open-mmlab python=3.8 -y
conda activate open-mmlab

# install latest PyTorch prebuilt with the default prebuilt CUDA version (usually the latest)
conda install -c pytorch pytorch torchvision -y

# install mmengine and mmcv
pip install openmim
mim install mmengine
mim install "mmcv>=2.0.0rc0"

# install mmdetection
mim install "mmdet>=3.0.0rc0"

# install mmdetection3d
git clone https://github.com/open-mmlab/mmdetection3d.git -b dev-1.x
cd mmdetection3d
pip install -e .
```
## Troubleshooting
@@ -4,35 +4,16 @@ We list some potential troubles encountered by users and developers, along with
## MMCV/MMDet/MMDet3D Installation
- Compatibility issue between MMEngine, MMCV, MMDetection and MMDetection3D; "ConvWS is already registered in conv layer"; "AssertionError: MMCV==xxx is used but incompatible. Please install mmcv>=xxx, \<=xxx."

  The required versions of MMEngine, MMCV and MMDetection for different versions of MMDetection3D are listed below. Please install the correct versions of MMEngine, MMCV and MMDetection to avoid installation issues.
  | MMDetection3D version | MMEngine version         | MMCV version            | MMDetection version      |
  | :-------------------: | :----------------------: | :---------------------: | :----------------------: |
  | dev-1.x               | mmengine>=0.1.0, \<0.2.0 | mmcv>=2.0.0rc0, \<2.1.0 | mmdet>=3.0.0rc0, \<3.1.0 |
  | v1.1.0rc0             | mmengine>=0.1.0, \<0.2.0 | mmcv>=2.0.0rc0, \<2.1.0 | mmdet>=3.0.0rc0, \<3.1.0 |

  **Note:** If you want to install mmdet3d-v1.0.0x, the compatible MMDetection, MMSegmentation and MMCV version table can be found [here](https://mmdetection3d.readthedocs.io/en/latest/faq.html#mmcv-mmdet-mmdet3d-installation). Please choose the correct version of MMCV to avoid installation issues.
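  To compare an installed environment against this table, a quick check like the following may help (a minimal sketch; it assumes all four packages import cleanly):

  ```python
  # Print the installed versions of the OpenMMLab packages listed in the table.
  import mmcv
  import mmdet
  import mmdet3d
  import mmengine

  for pkg in (mmengine, mmcv, mmdet, mmdet3d):
      print(f'{pkg.__name__}=={pkg.__version__}')
  ```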
- If you face the error shown below when importing open3d:
# Learn about Configs
MMDetection3D and other OpenMMLab repositories use [MMEngine's config system](https://mmengine.readthedocs.io/en/latest/tutorials/config.md). It has a modular and inheritance design, which is convenient for conducting various experiments.

If you wish to inspect the config file, you may run `python tools/misc/print_config.py /PATH/TO/CONFIG` to see the complete config.
## Config File Content

MMDetection3D uses a modular design; all modules with different functions can be configured through the config. Taking PointPillars as an example, we will introduce each field in the config according to different functional modules:

### Model config

In MMDetection3D's config, we use `model` to set up detection algorithm components. In addition to neural network components such as `voxel_encoder`, `backbone`, etc., it also requires `data_preprocessor`, `train_cfg`, and `test_cfg`. `data_preprocessor` is responsible for processing a batch of data output by the dataloader. `train_cfg` and `test_cfg` in the model config are for the training and testing hyperparameters of the components.

```python
model = dict(
    type='VoxelNet',
    data_preprocessor=dict(
        type='Det3DDataPreprocessor',
        voxel=True,
        voxel_layer=dict(
            max_num_points=32,
            point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1],
            voxel_size=[0.16, 0.16, 4],
            max_voxels=(16000, 40000))),
    voxel_encoder=dict(
        type='PillarFeatureNet',
        in_channels=4,
        feat_channels=[64],
        with_distance=False,
        voxel_size=[0.16, 0.16, 4],
        point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
    middle_encoder=dict(
        type='PointPillarsScatter', in_channels=64, output_shape=[496, 432]),
    backbone=dict(
        type='SECOND',
        in_channels=64,
        layer_nums=[3, 5, 5],
        layer_strides=[2, 2, 2],
        out_channels=[64, 128, 256]),
    neck=dict(
        type='SECONDFPN',
        in_channels=[64, 128, 256],
        upsample_strides=[1, 2, 4],
        out_channels=[128, 128, 128]),
    bbox_head=dict(
        type='Anchor3DHead',
        num_classes=3,
        in_channels=384,
        feat_channels=384,
        use_direction_classifier=True,
        assign_per_class=True,
        anchor_generator=dict(
            type='AlignedAnchor3DRangeGenerator',
            ranges=[[0, -39.68, -0.6, 69.12, 39.68, -0.6],
                    [0, -39.68, -0.6, 69.12, 39.68, -0.6],
                    [0, -39.68, -1.78, 69.12, 39.68, -1.78]],
            sizes=[[0.8, 0.6, 1.73], [1.76, 0.6, 1.73], [3.9, 1.6, 1.56]],
            rotations=[0, 1.57],
            reshape_out=False),
        diff_rad_by_sin=True,
        bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder'),
        loss_cls=dict(
            type='mmdet.FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(
            type='mmdet.SmoothL1Loss',
            beta=0.1111111111111111,
            loss_weight=2.0),
        loss_dir=dict(
            type='mmdet.CrossEntropyLoss', use_sigmoid=False,
            loss_weight=0.2)),
    train_cfg=dict(
        assigner=[
            dict(
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='mmdet3d.BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1),
            dict(
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='mmdet3d.BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1),
            dict(
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='mmdet3d.BboxOverlapsNearest3D'),
                pos_iou_thr=0.6,
                neg_iou_thr=0.45,
                min_pos_iou=0.45,
                ignore_iof_thr=-1)
        ],
        allowed_border=0,
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        use_rotate_nms=True,
        nms_across_levels=False,
        nms_thr=0.01,
        score_thr=0.1,
        min_bbox_size=0,
        nms_pre=100,
        max_num=50))
```
### Dataset and evaluator config

[Dataloaders](https://pytorch.org/docs/stable/data.html?highlight=data%20loader#torch.utils.data.DataLoader) are required for the training, validation, and testing of the [runner](https://mmengine.readthedocs.io/en/latest/tutorials/runner.html). Dataset and data pipeline need to be set to build the dataloader. Due to the complexity of this part, we use intermediate variables to simplify the writing of dataloader configs.

```python
dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
point_cloud_range = [0, -39.68, -3, 69.12, 39.68, 1]
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(CLASSES=['Pedestrian', 'Cyclist', 'Car'])
db_sampler = dict(
    data_root='data/kitti/',
    info_path='data/kitti/kitti_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(
        filter_by_difficulty=[-1],
        filter_by_min_points=dict(Car=5, Pedestrian=5, Cyclist=5)),
    classes=['Pedestrian', 'Cyclist', 'Car'],
    sample_groups=dict(Car=15, Pedestrian=15, Cyclist=15),
    points_loader=dict(
        type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4))
train_pipeline = [
    dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(
        type='ObjectSample',
        db_sampler=dict(
            data_root='data/kitti/',
            info_path='data/kitti/kitti_dbinfos_train.pkl',
            rate=1.0,
            prepare=dict(
                filter_by_difficulty=[-1],
                filter_by_min_points=dict(Car=5, Pedestrian=5, Cyclist=5)),
            classes=['Pedestrian', 'Cyclist', 'Car'],
            sample_groups=dict(Car=15, Pedestrian=15, Cyclist=15),
            points_loader=dict(
                type='LoadPointsFromFile',
                coord_type='LIDAR',
                load_dim=4,
                use_dim=4)),
        use_ground_plane=True),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(
        type='PointsRangeFilter',
        point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
    dict(
        type='ObjectRangeFilter',
        point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
    dict(type='PointShuffle'),
    dict(
        type='Pack3DDetInputs',
        keys=['points', 'gt_labels_3d', 'gt_bboxes_3d'])
]
test_pipeline = [
    dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1.0, 1.0],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter',
                point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1])
        ]),
    dict(type='Pack3DDetInputs', keys=['points'])
]
eval_pipeline = [
    dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
    dict(type='Pack3DDetInputs', keys=['points'])
]
train_dataloader = dict(
    batch_size=6,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='RepeatDataset',
        times=2,
        dataset=dict(
            type='KittiDataset',
            data_root='data/kitti/',
            ann_file='kitti_infos_train.pkl',
            data_prefix=dict(pts='training/velodyne_reduced'),
            pipeline=[
                dict(
                    type='LoadPointsFromFile',
                    coord_type='LIDAR',
                    load_dim=4,
                    use_dim=4),
                dict(
                    type='LoadAnnotations3D',
                    with_bbox_3d=True,
                    with_label_3d=True),
                dict(
                    type='ObjectSample',
                    db_sampler=dict(
                        data_root='data/kitti/',
                        info_path='data/kitti/kitti_dbinfos_train.pkl',
                        rate=1.0,
                        prepare=dict(
                            filter_by_difficulty=[-1],
                            filter_by_min_points=dict(
                                Car=5, Pedestrian=5, Cyclist=5)),
                        classes=['Pedestrian', 'Cyclist', 'Car'],
                        sample_groups=dict(Car=15, Pedestrian=15, Cyclist=15),
                        points_loader=dict(
                            type='LoadPointsFromFile',
                            coord_type='LIDAR',
                            load_dim=4,
                            use_dim=4)),
                    use_ground_plane=True),
                dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
                dict(
                    type='GlobalRotScaleTrans',
                    rot_range=[-0.78539816, 0.78539816],
                    scale_ratio_range=[0.95, 1.05]),
                dict(
                    type='PointsRangeFilter',
                    point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
                dict(
                    type='ObjectRangeFilter',
                    point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
                dict(type='PointShuffle'),
                dict(
                    type='Pack3DDetInputs',
                    keys=['points', 'gt_labels_3d', 'gt_bboxes_3d'])
            ],
            modality=dict(use_lidar=True, use_camera=False),
            test_mode=False,
            metainfo=dict(CLASSES=['Pedestrian', 'Cyclist', 'Car']),
            box_type_3d='LiDAR')))
val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='KittiDataset',
        data_root='data/kitti/',
        data_prefix=dict(pts='training/velodyne_reduced'),
        ann_file='kitti_infos_val.pkl',
        pipeline=[
            dict(
                type='LoadPointsFromFile',
                coord_type='LIDAR',
                load_dim=4,
                use_dim=4),
            dict(
                type='MultiScaleFlipAug3D',
                img_scale=(1333, 800),
                pts_scale_ratio=1,
                flip=False,
                transforms=[
                    dict(
                        type='GlobalRotScaleTrans',
                        rot_range=[0, 0],
                        scale_ratio_range=[1.0, 1.0],
                        translation_std=[0, 0, 0]),
                    dict(type='RandomFlip3D'),
                    dict(
                        type='PointsRangeFilter',
                        point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1])
                ]),
            dict(type='Pack3DDetInputs', keys=['points'])
        ],
        modality=dict(use_lidar=True, use_camera=False),
        test_mode=True,
        metainfo=dict(CLASSES=['Pedestrian', 'Cyclist', 'Car']),
        box_type_3d='LiDAR'))
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='KittiDataset',
        data_root='data/kitti/',
        data_prefix=dict(pts='training/velodyne_reduced'),
        ann_file='kitti_infos_val.pkl',
        pipeline=[
            dict(
                type='LoadPointsFromFile',
                coord_type='LIDAR',
                load_dim=4,
                use_dim=4),
            dict(
                type='MultiScaleFlipAug3D',
                img_scale=(1333, 800),
                pts_scale_ratio=1,
                flip=False,
                transforms=[
                    dict(
                        type='GlobalRotScaleTrans',
                        rot_range=[0, 0],
                        scale_ratio_range=[1.0, 1.0],
                        translation_std=[0, 0, 0]),
                    dict(type='RandomFlip3D'),
                    dict(
                        type='PointsRangeFilter',
                        point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1])
                ]),
            dict(type='Pack3DDetInputs', keys=['points'])
        ],
        modality=dict(use_lidar=True, use_camera=False),
        test_mode=True,
        metainfo=dict(CLASSES=['Pedestrian', 'Cyclist', 'Car']),
        box_type_3d='LiDAR'))
```
[Evaluators](https://mmengine.readthedocs.io/en/latest/tutorials/metric_and_evaluator.html) are used to compute the metrics of the trained model on the validation and testing datasets. The config of evaluators consists of one or a list of metric configs:
```python
val_evaluator = dict(
    type='KittiMetric',
    ann_file='data/kitti/kitti_infos_val.pkl',
    metric='bbox')
test_evaluator = dict(
    type='KittiMetric',
    ann_file='data/kitti/kitti_infos_val.pkl',
    metric='bbox')
```
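Since the field also accepts a list, several metric configs can be combined; a sketch of the list form (the commented entry is a placeholder, not part of the original config):

```python
# The evaluator may be a list of metric configs.
val_evaluator = [
    dict(
        type='KittiMetric',
        ann_file='data/kitti/kitti_infos_val.pkl',
        metric='bbox'),
    # dict(type=..., ...)  # further metric configs would be appended here
]
```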
### Training and testing config
MMEngine's runner uses Loop to control the training, validation, and testing processes.
Users can set the maximum training epochs and validation intervals with these fields.
```python
train_cfg = dict(
    type='EpochBasedTrainLoop',
    max_epochs=80,
    val_interval=2)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
```
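For iteration-based training, MMEngine provides an `IterBasedTrainLoop` as well; a minimal sketch (the iteration counts here are illustrative, not from the original config):

```python
# Hypothetical iteration-based counterpart of the epoch-based loop above.
train_cfg = dict(
    type='IterBasedTrainLoop',
    max_iters=80000,
    val_interval=2000)
```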
### Optimization config
`optim_wrapper` is the field to configure optimization-related settings. The optimizer wrapper not only provides the functions of the optimizer, but also supports features such as gradient clipping and mixed precision training. Find more in the [optimizer wrapper tutorial](https://mmengine.readthedocs.io/en/latest/tutorials/optimizer.html).
```python
optim_wrapper = dict(  # Optimizer wrapper config
    type='OptimWrapper',  # Optimizer wrapper type, switch to AmpOptimWrapper to enable mixed precision training.
    optimizer=dict(  # Optimizer config. Support all kinds of optimizers in PyTorch. Refer to https://pytorch.org/docs/stable/optim.html#algorithms
        type='AdamW', lr=0.001, betas=(0.95, 0.99), weight_decay=0.01),
    clip_grad=dict(max_norm=35, norm_type=2))  # Gradient clip option. Set None to disable gradient clip. Find usage in https://mmengine.readthedocs.io/en/latest/tutorials
```
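As the comment above notes, mixed precision training is enabled by switching the wrapper type to `AmpOptimWrapper`; a minimal sketch with the same optimizer settings:

```python
# Hypothetical AMP variant: only the wrapper type changes.
optim_wrapper = dict(
    type='AmpOptimWrapper',
    optimizer=dict(
        type='AdamW', lr=0.001, betas=(0.95, 0.99), weight_decay=0.01),
    clip_grad=dict(max_norm=35, norm_type=2))
```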
`param_scheduler` is a field that configures methods of adjusting optimization hyperparameters such as learning rate and momentum. Users can combine multiple schedulers to create a desired parameter adjustment strategy. Find more in the [parameter scheduler tutorial](https://mmengine.readthedocs.io/en/latest/tutorials/param_scheduler.html) and the [parameter scheduler API documents](TODO).
```python
param_scheduler = [
    dict(
        type='CosineAnnealingLR',
        T_max=32.0,
        eta_min=0.01,
        begin=0,
        end=32.0,
        by_epoch=True,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingLR',
        T_max=48.0,
        eta_min=1.0000000000000001e-07,
        begin=32.0,
        end=80,
        by_epoch=True,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingMomentum',
        T_max=32.0,
        eta_min=0.8947368421052632,
        begin=0,
        end=32.0,
        by_epoch=True,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingMomentum',
        T_max=48.0,
        eta_min=1,
        begin=32.0,
        end=80,
        convert_to_iter_based=True)
]
```
### Hook config

Users can attach hooks to the training, validation, and testing loops to insert operations during running. There are two different hook fields: one is `default_hooks` and the other is `custom_hooks`.

`default_hooks` is a dict of hook configs for the hooks that must be registered at runtime. They have default priorities which should not be modified. If not set, the runner will use the default values. To disable a default hook, users can set its config to `None`.
```python
default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=50),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=1),
    sampler_seed=dict(type='DistSamplerSeedHook'))
```
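`custom_hooks` is a list for user-registered hooks; a minimal sketch using one of MMEngine's built-in hooks (the choice of hook and priority here is illustrative):

```python
# Hypothetical custom hook: free cached GPU memory after every iteration.
custom_hooks = [
    dict(type='EmptyCacheHook', after_iter=True, priority='NORMAL')
]
```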
### Runtime config
```python
default_scope = 'mmdet3d'  # The default registry scope to find modules. Refer to https://mmengine.readthedocs.io/en/latest/tutorials/registry.html
env_cfg = dict(
    cudnn_benchmark=False,  # Whether to enable cudnn benchmark
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),  # Use fork to start multi-processing threads. 'fork' is usually faster than 'spawn' but may be unsafe. See discussion in https://github.com/pytorch/pytorch/issues/1355
    dist_cfg=dict(backend='nccl'))  # Distribution configs
vis_backends = [dict(type='LocalVisBackend')]  # Visualization backends.
visualizer = dict(
    type='Det3DLocalVisualizer',
    vis_backends=[dict(type='LocalVisBackend')],
    name='visualizer')
log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)
log_level = 'INFO'  # The level of logging.
load_from = None  # Load a model checkpoint as a pre-trained model from a given path. This will not resume training.
resume = False  # Whether to resume training from the checkpoint in `load_from`.
```
## Config file inheritance

There are 4 basic component types under `config/_base_`: dataset, model, schedule, default_runtime.
Many methods could be easily constructed with one of each, like SECOND, PointPillars, PartA2, and VoteNet.
The configs that are composed of components from `_base_` are called _primitive_.

For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum inheritance level is 3.

For easy understanding, we recommend contributors to inherit from existing methods.
For example, if some modification is made based on PointPillars, users may first inherit the basic PointPillars structure by specifying `_base_ = ../pointpillars/pointpillars_hv_fpn_sbn-all_8xb4_2x_nus-3d.py`, and then modify the necessary fields in the config files.
If you are building an entirely new method that does not share the structure with any of the existing methods, you may create a folder `xxx_rcnn` under `configs`.

Please refer to the [mmengine config tutorial](https://mmengine.readthedocs.io/en/latest/tutorials/config.html) for detailed documentation.
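To make the inheritance pattern concrete, here is a minimal sketch of such a child config (the base path is the one mentioned above; the overridden field is hypothetical):

```python
# Inherit the PointPillars primitive config and override only what changes.
_base_ = '../pointpillars/pointpillars_hv_fpn_sbn-all_8xb4_2x_nus-3d.py'

# Hypothetical tweak: train for fewer epochs than the base schedule.
train_cfg = dict(max_epochs=12)
```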
### Ignore some fields in the base configs
@@ -483,8 +530,9 @@ train_pipeline = [
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectNameFilter', classes=class_names),
    dict(type='PointShuffle'),
    dict(
        type='Pack3DDetInputs',
        keys=['points', 'gt_labels_3d', 'gt_bboxes_3d'])
]
test_pipeline = [
    dict(
@@ -509,18 +557,63 @@ test_pipeline = [
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range)
        ]),
    dict(type='Pack3DDetInputs', keys=['points'])
]
train_dataloader = dict(dataset=dict(pipeline=train_pipeline))
test_dataloader = dict(dataset=dict(pipeline=test_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
```
### Reuse variables in \_base\_ file
If users want to reuse variables defined in the base file, they can get a copy of the corresponding variable by using `{{_base_.xxx}}`. For example:
```python
_base_ = './pointpillars_hv_secfpn_8xb6_160e_kitti-3d-3class.py'
a = {{_base_.model}} # variable `a` is equal to the `model` defined in `_base_`
```
## Modify Config Through Script Arguments
When submitting jobs using "tools/train.py" or "tools/test.py", you may specify `--cfg-options` to modify the config in place; a programmatic sketch of the same mechanism follows the list below.

- Update config keys of dict chains.

  The config options can be specified following the order of the dict keys in the original config. For example, `--cfg-options model.backbone.norm_eval=False` changes all the BN modules in model backbones to `train` mode.

- Update keys inside a list of configs.

  Some config dicts are composed as a list in your config. For example, the training pipeline `train_dataloader.dataset.pipeline` is normally a list, e.g. `[dict(type='LoadImageFromFile'), ...]`. If you want to change `'LoadImageFromFile'` to `'LoadImageFromNDArray'` in the pipeline, you may specify `--cfg-options train_dataloader.dataset.pipeline.0.type=LoadImageFromNDArray`.

- Update values of lists/tuples.

  Sometimes the value to be updated is a list or a tuple. For example, the config file normally sets `model.data_preprocessor.mean=[123.675, 116.28, 103.53]`. If you want to change the mean values, you may specify `--cfg-options model.data_preprocessor.mean="[127,127,127]"`. Note that the quotation mark `"` is necessary to support list/tuple data types, and that **NO** white space is allowed inside the quotation marks in the specified value.
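The same overrides can be applied programmatically; a minimal sketch using MMEngine's `Config.merge_from_dict` (the config path and override value are illustrative):

```python
from mmengine import Config

# Load a config and apply the same kind of override as --cfg-options.
cfg = Config.fromfile(
    'configs/pointpillars/pointpillars_hv_secfpn_8xb6_160e_kitti-3d-3class.py')
cfg.merge_from_dict({'train_cfg.val_interval': 1})  # hypothetical override
print(cfg.train_cfg)
```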
## Config Name Style
We follow the below style to name config files. Contributors are advised to follow the same style.
```
{algorithm name}_{model component names [component1]_[component2]_[...]}_{training settings}_{training dataset information}_{testing dataset information}.py
```
The file name is divided into five parts. All parts and components are connected with `_`, and words within each part or component are connected with `-`. A worked example follows the list.

- `{algorithm name}`: The name of the algorithm. It can be a detector name such as `pointpillars`, `fcos3d`, etc.
- `{model component names}`: Names of the components used in the algorithm such as voxel_encoder, backbone, neck, etc. For example, `second_secfpn_head-dcn-circlenms` means using SECOND's SparseEncoder, SECONDFPN, and a detection head with DCN and circle NMS.
- `{training settings}`: Information of training settings such as batch size, augmentations, loss trick, scheduler, and epochs/iterations. For example, `8xb4-tta-cyclic-20e` means using 8 GPUs x 4 samples per GPU, test time augmentation, cyclic annealing learning rate, and training for 20 epochs.
  Some abbreviations:
  - `{gpu x batch_per_gpu}`: GPUs and samples per GPU. `bN` indicates N batch size per GPU. E.g. `4xb4` is the short term of 4-gpus x 4-samples-per-gpu.
  - `{schedule}`: training schedule, options are `schedule-2x`, `schedule-3x`, `cyclic-20e`, etc.
    `schedule-2x` and `schedule-3x` mean 24 epochs and 36 epochs respectively.
    `cyclic-20e` means 20 epochs.
- `{training dataset information}`: Training dataset names like `kitti-3d-3class`, `nus-3d`, `s3dis-seg`, `scannet-seg`, `waymoD5-3d-car`. Here `3d` means the dataset is used for 3D object detection, and `seg` means the dataset is used for point cloud segmentation.
- `{testing dataset information}` (optional): Testing dataset name for models trained on one dataset but tested on another. If not mentioned, the model was trained and tested on the same dataset type.
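As a worked example, the config name `pointpillars_hv_secfpn_8xb6_160e_kitti-3d-3class.py` (used earlier in this document) breaks down into the algorithm name `pointpillars`, the model components `hv_secfpn` (hard voxelization and SECONDFPN), the training settings `8xb6_160e` (8 GPUs x 6 samples per GPU, 160 epochs), and the training dataset information `kitti-3d-3class`.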
# Copyright (c) OpenMMLab. All rights reserved.
import argparse

from mmengine import Config, DictAction


def parse_args():