Unverified Commit 087ac108 authored by VVsssssk's avatar VVsssssk Committed by GitHub
Browse files

[Docs]Fix docs about cfg and faq (#1787)



* Update faq.md

* Update getting_started.md

* Update the pytorch version

* Update faq.md

* fix cfg docs

* fix docs

* Update mmcv installation scripts

* Update mmcv names

* fix comments

* fix docs

* fix faq
Co-authored-by: default avatarTai-Wang <tab_wang@outlook.com>
parent 32f197e5
......@@ -4,7 +4,7 @@ In this section we demonstrate how to prepare an environment with PyTorch.
MMDection3D works on Linux, Windows (experimental support) and macOS and requires the following packages:
- Python 3.6+
- PyTorch 1.3+
- PyTorch 1.6+
- CUDA 9.2+ (If you build PyTorch from source, CUDA 9.0 is also compatible)
- GCC 5+
- [MMCV](https://mmcv.readthedocs.io/en/latest/#installation)
......@@ -47,55 +47,49 @@ Otherwise, you should refer to the step-by-step installation instructions in the
```shell
pip install openmim
mim install mmcv-full
mim install mmdet
mim install mmsegmentation
git clone https://github.com/open-mmlab/mmdetection3d.git
mim install mmengine
mim install mmcv>=2.0.0rc0
mim install mmdet>=3.0.0rc0
git clone https://github.com/open-mmlab/mmdetection3d.git -b dev-1.x
cd mmdetection3d
pip install -e .
```
**Step 0.** Install [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
**Step 1.** Install [MMDetection](https://github.com/open-mmlab/mmdetection).
```shell
pip install mmdet
```
Optionally, you could also build MMDetection from source in case you want to modify the code:
**Step 0.** Install [MMEngine](https://github.com/open-mmlab/mmengine) and [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
```shell
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
git checkout v2.24.0 # switch to v2.24.0 branch
pip install -r requirements/build.txt
pip install -v -e . # or "python setup.py develop"
pip install -U openmim
mim install mmengine
mim install mmcv>=2.0.0rc0
```
**Step 2.** Install [MMSegmentation](https://github.com/open-mmlab/mmsegmentation).
**Step 1.** Install [MMDetection](https://github.com/open-mmlab/mmdetection).
```shell
pip install mmsegmentation
mim install mmdet>=3.0.0rc0
```
Optionally, you could also build MMSegmentation from source in case you want to modify the code:
Optionally, you could also build MMDetection from source in case you want to modify the code:
```shell
git clone https://github.com/open-mmlab/mmsegmentation.git
cd mmsegmentation
git checkout v0.20.0 # switch to v0.20.0 branch
pip install -e . # or "python setup.py develop"
git clone https://github.com/open-mmlab/mmdetection.git -b dev-3.x
# "-b dev-3.x" means checkout to the `dev-3.x` branch.
cd mmdetection
pip install -v -e .
# "-v" means verbose, or more output
# "-e" means installing a project in editable mode,
# thus any local modifications made to the code will take effect without reinstallation.
```
**Step 3.** Clone the MMDetection3D repository.
**Step 2.** Clone the MMDetection3D repository.
```shell
git clone https://github.com/open-mmlab/mmdetection3d.git
git clone https://github.com/open-mmlab/mmdetection3d.git -b dev-1.x
# "-b dev-1.x" means checkout to the `dev-1.x` branch.
cd mmdetection3d
```
**Step 4.** Install build requirements and then install MMDetection3D.
**Step 3.** Install build requirements and then install MMDetection3D.
```shell
pip install -v -e . # or "python setup.py develop"
......@@ -156,7 +150,7 @@ python demo/pcd_demo.py ${PCD_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device
Examples:
```shell
python demo/pcd_demo.py demo/data/kitti/kitti_000008.bin configs/second/hv_second_secfpn_6x8_80e_kitti-3d-car.py checkpoints/hv_second_secfpn_6x8_80e_kitti-3d-car_20200620_230238-393f000c.pth
python demo/pcd_demo.py demo/data/kitti/000008.bin configs/second/second_hv-secfpn_8xb6-80e_kitti-3d-car.py checkpoints/second_hv-secfpn_8xb6-80e_kitti-3d-car_20200620_230238-393f000c.pth
```
If you want to input a `ply` file, you can use the following function and convert it to `bin` format. Then you can use the converted `bin` file to generate demo.
......@@ -218,16 +212,26 @@ Please make sure the GPU driver satisfies the minimum version requirements. See
Installing CUDA runtime libraries is enough if you follow our best practices, because no CUDA code will be compiled locally. However if you hope to compile MMCV from source or develop other CUDA operators, you need to install the complete CUDA toolkit from NVIDIA's [website](https://developer.nvidia.com/cuda-downloads), and its version should match the CUDA version of PyTorch. i.e., the specified version of cudatoolkit in `conda install` command.
```
### Install MMEngine without MIM
To install MMEngine with pip instead of MIM, please follow [MMEngine installation guides](https://mmcv.readthedocs.io/en/latest/get_started/installation.html).
For example, you can install MMEngine by the following command.
```shell
pip install mmengine
```
### Install MMCV without MIM
MMCV contains C++ and CUDA extensions, thus depending on PyTorch in a complex way. MIM solves such dependencies automatically and makes the installation easier. However, it is not a must.
To install MMCV with pip instead of MIM, please follow [MMCV installation guides](https://mmcv.readthedocs.io/en/latest/get_started/installation.html). This requires manually specifying a find-url based on PyTorch version and its CUDA version.
For example, the following command install mmcv-full built for PyTorch 1.10.x and CUDA 11.3.
For example, the following command install mmcv built for PyTorch 1.10.x and CUDA 11.3.
```shell
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html
pip install mmcv -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html
```
### Using MMDetection3D with Docker
......@@ -250,25 +254,24 @@ docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/mmdetection3d/data mmdete
Here is a full script for setting up MMdetection3D with conda.
```shell
conda create -n open-mmlab python=3.7 -y
conda create -n open-mmlab python=3.8 -y
conda activate open-mmlab
# install latest PyTorch prebuilt with the default prebuilt CUDA version (usually the latest)
conda install -c pytorch pytorch torchvision -y
# install mmcv
pip install mmcv-full
# install mmengine and mmcv
pip install openmim
mim install mmengine
mim install mmcv>=2.0.0rc0
# install mmdetection
pip install git+https://github.com/open-mmlab/mmdetection.git
# install mmsegmentation
pip install git+https://github.com/open-mmlab/mmsegmentation.git
mim install mmdet>=3.0.0rc0
# install mmdetection3d
git clone https://github.com/open-mmlab/mmdetection3d.git
git clone https://github.com/open-mmlab/mmdetection3d.git -b dev-1.x
cd mmdetection3d
pip install -v -e .
pip install -e .
```
## Trouble shooting
......
......@@ -4,35 +4,16 @@ We list some potential troubles encountered by users and developers, along with
## MMCV/MMDet/MMDet3D Installation
- Compatibility issue between MMCV, MMDetection, MMSegmentation and MMDection3D; "ConvWS is already registered in conv layer"; "AssertionError: MMCV==xxx is used but incompatible. Please install mmcv>=xxx, \<=xxx."
The required versions of MMCV, MMDetection and MMSegmentation for different versions of MMDetection3D are as below. Please install the correct version of MMCV, MMDetection and MMSegmentation to avoid installation issues.
| MMDetection3D version | MMDetection version | MMSegmentation version | MMCV version |
| :-------------------: | :----------------------: | :---------------------: | :-------------------------: |
| master | mmdet>=2.24.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.4.8, \<=1.6.0 |
| v1.0.0rc3 | mmdet>=2.24.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.4.8, \<=1.6.0 |
| v1.0.0rc2 | mmdet>=2.24.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.4.8, \<=1.6.0 |
| v1.0.0rc1 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.4.8, \<=1.5.0 |
| v1.0.0rc0 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.3.17, \<=1.5.0 |
| 0.18.1 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.3.17, \<=1.5.0 |
| 0.18.0 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.3.17, \<=1.5.0 |
| 0.17.3 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
| 0.17.2 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
| 0.17.1 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
| 0.17.0 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
| 0.16.0 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
| 0.15.0 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
| 0.14.0 | mmdet>=2.10.0, \<=2.11.0 | mmseg==0.14.0 | mmcv-full>=1.3.1, \<=1.4.0 |
| 0.13.0 | mmdet>=2.10.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.4.0 |
| 0.12.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.4.0 |
| 0.11.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.3.0 |
| 0.10.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.3.0 |
| 0.9.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.3.0 |
| 0.8.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.1.5, \<=1.3.0 |
| 0.7.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.1.5, \<=1.3.0 |
| 0.6.0 | mmdet>=2.4.0, \<=2.11.0 | Not required | mmcv-full>=1.1.3, \<=1.2.0 |
| 0.5.0 | 2.3.0 | Not required | mmcv-full==1.0.5 |
- Compatibility issue between MMEngine, MMCV, MMDetection and MMDetection3D; "ConvWS is already registered in conv layer"; "AssertionError: MMCV==xxx is used but incompatible. Please install mmcv>=xxx, \<=xxx."
The required versions of MMEngine, MMCV and MMDetection for different versions of MMDetection3D are as below. Please install the correct version of MMEngine, MMCV and MMDetection to avoid installation issues.
| MMDetection3D version | MMEngine version | MMCV version | MMDetection version |
| :-------------------: | :----------------------: | :---------------------: | :----------------------: |
| dev-1.x | mmengine>=0.1.0, \<0.2.0 | mmcv>=2.0.0rc0, \<2.1.0 | mmdet>=3.0.0rc0, \<3.1.0 |
| v1.1.0rc0 | mmengine>=0.1.0, \<0.2.0 | mmcv>=2.0.0rc0, \<2.1.0 | mmdet>=3.0.0rc0, \<3.1.0 |
**Note:** If you want to install mmdet3d-v1.0.0x, the compatible MMDetection, MMSegmentation and MMCV versions table can be found at [here](https://mmdetection3d.readthedocs.io/en/latest/faq.html#mmcv-mmdet-mmdet3d-installation). Please choose the correct version of MMCV to avoid installation issues.
- If you faced the error shown below when importing open3d:
......
# Learn about Configs
We incorporate modular and inheritance design into our config system, which is convenient to conduct various experiments.
MMDetection3D and other OpenMMLab repositories use [MMEngine's config system](https://mmengine.readthedocs.io/en/latest/tutorials/config.md). It has a modular and inheritance design, which is convenient to conduct various experiments.
If you wish to inspect the config file, you may run `python tools/misc/print_config.py /PATH/TO/CONFIG` to see the complete config.
You may also pass `--options xxx.yyy=zzz` to see updated config.
## Config File Structure
## Config File Content
There are 4 basic component types under `config/_base_`, dataset, model, schedule, default_runtime.
Many methods could be easily constructed with one of each like SECOND, PointPillars, PartA2, and VoteNet.
The configs that are composed by components from `_base_` are called _primitive_.
For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum of inheritance level is 3.
For easy understanding, we recommend contributors to inherit from exiting methods.
For example, if some modification is made based on PointPillars, user may first inherit the basic PointPillars structure by specifying `_base_ = ../pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py`, then modify the necessary fields in the config files.
If you are building an entirely new method that does not share the structure with any of the existing methods, you may create a folder `xxx_rcnn` under `configs`,
Please refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html) for detailed documentation.
## Config Name Style
We follow the below style to name config files. Contributors are advised to follow the same style.
```
{model}_[model setting]_{backbone}_[neck]_[norm setting]_[misc]_[batch_per_gpu x gpu]_{schedule}_{dataset}
```
`{xxx}` is required field and `[yyy]` is optional.
MMDetection3D uses a modular design, all modules with different functions can be configured through the config. Taking PointPillars as an example, we will introduce each field in the config according to different function modules:
- `{model}`: model type like `hv_pointpillars` (Hard Voxelization PointPillars), `VoteNet`, etc.
- `[model setting]`: specific setting for some model.
- `{backbone}`: backbone type like `regnet-400mf`, `regnet-1.6gf`.
- `[neck]`: neck type like `fpn`, `secfpn`.
- `[norm_setting]`: `bn` (Batch Normalization) is used unless specified, other norm layer type could be `gn` (Group Normalization), `sbn` (Synchronized Batch Normalization).
`gn-head`/`gn-neck` indicates GN is applied in head/neck only, while `gn-all` means GN is applied in the entire model, e.g. backbone, neck, head.
- `[misc]`: miscellaneous setting/plugins of model, e.g. `strong-aug` means using stronger augmentation strategies for training.
- `[batch_per_gpu x gpu]`: samples per GPU and GPUs, `4x8` is used by default.
- `{schedule}`: training schedule, options are `1x`, `2x`, `20e`, etc.
`1x` and `2x` means 12 epochs and 24 epochs respectively.
`20e` is adopted in cascade models, which denotes 20 epochs.
For `1x`/`2x`, initial learning rate decays by a factor of 10 at the 8/16th and 11/22th epochs.
For `20e`, initial learning rate decays by a factor of 10 at the 16th and 19th epochs.
- `{dataset}`: dataset like `nus-3d`, `kitti-3d`, `lyft-3d`, `scannet-3d`, `sunrgbd-3d`. We also indicate the number of classes we are using if there exist multiple settings, e.g., `kitti-3d-3class` and `kitti-3d-car` means training on KITTI dataset with 3 classes and single class, respectively.
### Model config
## Deprecated train_cfg/test_cfg
Following MMDetection, the `train_cfg` and `test_cfg` are deprecated in config file, please specify them in the model config. The original config structure is as below.
In mmdetection3d's config, we use `model` to setup detection algorithm components. In addition to neural network components such as `voxel_encoder`, `backbone` etc, it also requires `data_preprocessor`, `train_cfg`, and `test_cfg`. `data_preprocessor` is responsible for processing a batch of data output by dataloader. `train_cfg`, and `test_cfg` in the model config are for training and testing hyperparameters of the components.
```python
# deprecated
model = dict(
type=...,
...
)
train_cfg=dict(...)
test_cfg=dict(...)
type='VoxelNet',
data_preprocessor=dict(
type='Det3DDataPreprocessor',
voxel=True,
voxel_layer=dict(
max_num_points=32,
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1],
voxel_size=[0.16, 0.16, 4],
max_voxels=(16000, 40000))),
voxel_encoder=dict(
type='PillarFeatureNet',
in_channels=4,
feat_channels=[64],
with_distance=False,
voxel_size=[0.16, 0.16, 4],
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
middle_encoder=dict(
type='PointPillarsScatter', in_channels=64, output_shape=[496, 432]),
backbone=dict(
type='SECOND',
in_channels=64,
layer_nums=[3, 5, 5],
layer_strides=[2, 2, 2],
out_channels=[64, 128, 256]),
neck=dict(
type='SECONDFPN',
in_channels=[64, 128, 256],
upsample_strides=[1, 2, 4],
out_channels=[128, 128, 128]),
bbox_head=dict(
type='Anchor3DHead',
num_classes=3,
in_channels=384,
feat_channels=384,
use_direction_classifier=True,
assign_per_class=True,
anchor_generator=dict(
type='AlignedAnchor3DRangeGenerator',
ranges=[[0, -39.68, -0.6, 69.12, 39.68, -0.6],
[0, -39.68, -0.6, 69.12, 39.68, -0.6],
[0, -39.68, -1.78, 69.12, 39.68, -1.78]],
sizes=[[0.8, 0.6, 1.73], [1.76, 0.6, 1.73], [3.9, 1.6, 1.56]],
rotations=[0, 1.57],
reshape_out=False),
diff_rad_by_sin=True,
bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder'),
loss_cls=dict(
type='mmdet.FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_bbox=dict(
type='mmdet.SmoothL1Loss',
beta=0.1111111111111111,
loss_weight=2.0),
loss_dir=dict(
type='mmdet.CrossEntropyLoss', use_sigmoid=False,
loss_weight=0.2)),
train_cfg=dict(
assigner=[
dict(
type='Max3DIoUAssigner',
iou_calculator=dict(type='mmdet3d.BboxOverlapsNearest3D'),
pos_iou_thr=0.5,
neg_iou_thr=0.35,
min_pos_iou=0.35,
ignore_iof_thr=-1),
dict(
type='Max3DIoUAssigner',
iou_calculator=dict(type='mmdet3d.BboxOverlapsNearest3D'),
pos_iou_thr=0.5,
neg_iou_thr=0.35,
min_pos_iou=0.35,
ignore_iof_thr=-1),
dict(
type='Max3DIoUAssigner',
iou_calculator=dict(type='mmdet3d.BboxOverlapsNearest3D'),
pos_iou_thr=0.6,
neg_iou_thr=0.45,
min_pos_iou=0.45,
ignore_iof_thr=-1)
],
allowed_border=0,
pos_weight=-1,
debug=False),
test_cfg=dict(
use_rotate_nms=True,
nms_across_levels=False,
nms_thr=0.01,
score_thr=0.1,
min_bbox_size=0,
nms_pre=100,
max_num=50))
```
The migration example is as below.
### Dataset and evaluator config
```python
# recommended
model = dict(
type=...,
...
train_cfg=dict(...),
test_cfg=dict(...)
)
```
## An example of VoteNet
[Dataloaders](https://pytorch.org/docs/stable/data.html?highlight=data%20loader#torch.utils.data.DataLoader) are required for the training, validation, and testing of the [runner](https://mmengine.readthedocs.io/en/latest/tutorials/runner.html). Dataset and data pipeline need to be set to build the dataloader. Due to the complexity of this part, we use intermediate variables to simplify the writing of dataloader configs.
```python
model = dict(
type='VoteNet', # The type of detector, refer to mmdet3d.models.detectors for more details
backbone=dict(
type='PointNet2SASSG', # The type of the backbone, refer to mmdet3d.models.backbones for more details
in_channels=4, # Input channels of point cloud
num_points=(2048, 1024, 512, 256), # The number of points which each SA module samples
radius=(0.2, 0.4, 0.8, 1.2), # Radius for each set abstraction layer
num_samples=(64, 32, 16, 16), # Number of samples for each set abstraction layer
sa_channels=((64, 64, 128), (128, 128, 256), (128, 128, 256),
(128, 128, 256)), # Out channels of each mlp in SA module
fp_channels=((256, 256), (256, 256)), # Out channels of each mlp in FP module
norm_cfg=dict(type='BN2d'), # Config of normalization layer
sa_cfg=dict( # Config of point set abstraction (SA) module
type='PointSAModule', # type of SA module
pool_mod='max', # Pool method ('max' or 'avg') for SA modules
use_xyz=True, # Whether to use xyz as features during feature gathering
normalize_xyz=True)), # Whether to use normalized xyz as feature during feature gathering
bbox_head=dict(
type='VoteHead', # The type of bbox head, refer to mmdet3d.models.dense_heads for more details
num_classes=18, # Number of classes for classification
bbox_coder=dict(
type='PartialBinBasedBBoxCoder', # The type of bbox_coder, refer to mmdet3d.core.bbox.coders for more details
num_sizes=18, # Number of size clusters
num_dir_bins=1, # Number of bins to encode direction angle
with_rot=False, # Whether the bbox is with rotation
mean_sizes=[[0.76966727, 0.8116021, 0.92573744],
[1.876858, 1.8425595, 1.1931566],
[0.61328, 0.6148609, 0.7182701],
[1.3955007, 1.5121545, 0.83443564],
[0.97949594, 1.0675149, 0.6329687],
[0.531663, 0.5955577, 1.7500148],
[0.9624706, 0.72462326, 1.1481868],
[0.83221924, 1.0490936, 1.6875663],
[0.21132214, 0.4206159, 0.5372846],
[1.4440073, 1.8970833, 0.26985747],
[1.0294262, 1.4040797, 0.87554324],
[1.3766412, 0.65521795, 1.6813129],
[0.6650819, 0.71111923, 1.298853],
[0.41999173, 0.37906948, 1.7513971],
[0.59359556, 0.5912492, 0.73919016],
[0.50867593, 0.50656086, 0.30136237],
[1.1511526, 1.0546296, 0.49706793],
[0.47535285, 0.49249494, 0.5802117]]), # Mean sizes for each class, the order is consistent with class_names.
vote_moudule_cfg=dict( # Config of vote module branch, refer to mmdet3d.models.layers for more details
in_channels=256, # Input channels for vote_module
vote_per_seed=1, # Number of votes to generate for each seed
gt_per_seed=3, # Number of gts for each seed
conv_channels=(256, 256), # Channels for convolution
conv_cfg=dict(type='Conv1d'), # Config of convolution
norm_cfg=dict(type='BN1d'), # Config of normalization
norm_feats=True, # Whether to normalize features
vote_loss=dict( # Config of the loss function for voting branch
type='ChamferDistance', # Type of loss for voting branch
mode='l1', # Loss mode of voting branch
reduction='none', # Specifies the reduction to apply to the output
loss_dst_weight=10.0)), # Destination loss weight of the voting branch
vote_aggregation_cfg=dict( # Config of vote aggregation branch
type='PointSAModule', # type of vote aggregation module
num_point=256, # Number of points for the set abstraction layer in vote aggregation branch
radius=0.3, # Radius for the set abstraction layer in vote aggregation branch
num_sample=16, # Number of samples for the set abstraction layer in vote aggregation branch
mlp_channels=[256, 128, 128, 128], # Mlp channels for the set abstraction layer in vote aggregation branch
use_xyz=True, # Whether to use xyz
normalize_xyz=True), # Whether to normalize xyz
feat_channels=(128, 128), # Channels for feature convolution
conv_cfg=dict(type='Conv1d'), # Config of convolution
norm_cfg=dict(type='BN1d'), # Config of normalization
objectness_loss=dict( # Config of objectness loss
type='CrossEntropyLoss', # Type of loss
class_weight=[0.2, 0.8], # Class weight of the objectness loss
reduction='sum', # Specifies the reduction to apply to the output
loss_weight=5.0), # Loss weight of the objectness loss
center_loss=dict( # Config of center loss
type='ChamferDistance', # Type of loss
mode='l2', # Loss mode of center loss
reduction='sum', # Specifies the reduction to apply to the output
loss_src_weight=10.0, # Source loss weight of the voting branch.
loss_dst_weight=10.0), # Destination loss weight of the voting branch.
dir_class_loss=dict( # Config of direction classification loss
type='CrossEntropyLoss', # Type of loss
reduction='sum', # Specifies the reduction to apply to the output
loss_weight=1.0), # Loss weight of the direction classification loss
dir_res_loss=dict( # Config of direction residual loss
type='SmoothL1Loss', # Type of loss
reduction='sum', # Specifies the reduction to apply to the output
loss_weight=10.0), # Loss weight of the direction residual loss
size_class_loss=dict( # Config of size classification loss
type='CrossEntropyLoss', # Type of loss
reduction='sum', # Specifies the reduction to apply to the output
loss_weight=1.0), # Loss weight of the size classification loss
size_res_loss=dict( # Config of size residual loss
type='SmoothL1Loss', # Type of loss
reduction='sum', # Specifies the reduction to apply to the output
loss_weight=3.3333333333333335), # Loss weight of the size residual loss
semantic_loss=dict( # Config of semantic loss
type='CrossEntropyLoss', # Type of loss
reduction='sum', # Specifies the reduction to apply to the output
loss_weight=1.0)), # Loss weight of the semantic loss
train_cfg = dict( # Config of training hyperparameters for VoteNet
pos_distance_thr=0.3, # distance >= threshold 0.3 will be taken as positive samples
neg_distance_thr=0.6, # distance < threshold 0.6 will be taken as negative samples
sample_mod='vote'), # Mode of the sampling method
test_cfg = dict( # Config of testing hyperparameters for VoteNet
sample_mod='seed', # Mode of the sampling method
nms_thr=0.25, # The threshold to be used during NMS
score_thr=0.8, # Threshold to filter out boxes
per_class_proposal=False)) # Whether to use per_class_proposal
dataset_type = 'ScanNetDataset' # Type of the dataset
data_root = './data/scannet/' # Root path of the data
class_names = ('cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window',
'bookshelf', 'picture', 'counter', 'desk', 'curtain',
'refrigerator', 'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin') # Names of classes
train_pipeline = [ # Training pipeline, refer to mmdet3d.datasets.transforms for more details
dict(
type='LoadPointsFromFile', # First pipeline to load points, refer to mmdet3d.datasets.transforms.indoor_loading for more details
shift_height=True, # Whether to use shifted height
load_dim=6, # The dimension of the loaded points
use_dim=[0, 1, 2]), # Which dimensions of the points to be used
dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
point_cloud_range = [0, -39.68, -3, 69.12, 39.68, 1]
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(CLASSES=['Pedestrian', 'Cyclist', 'Car'])
db_sampler = dict(
data_root='data/kitti/',
info_path='data/kitti/kitti_dbinfos_train.pkl',
rate=1.0,
prepare=dict(
filter_by_difficulty=[-1],
filter_by_min_points=dict(Car=5, Pedestrian=5, Cyclist=5)),
classes=['Pedestrian', 'Cyclist', 'Car'],
sample_groups=dict(Car=15, Pedestrian=15, Cyclist=15),
points_loader=dict(
type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4))
train_pipeline = [
dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
dict(
type='LoadAnnotations3D', # Second pipeline to load annotations, refer to mmdet3d.datasets.transforms.indoor_loading for more details
with_bbox_3d=True, # Whether to load 3D boxes
with_label_3d=True, # Whether to load 3D labels corresponding to each 3D box
with_mask_3d=True, # Whether to load 3D instance masks
with_seg_3d=True), # Whether to load 3D semantic masks
type='ObjectSample',
db_sampler=dict(
data_root='data/kitti/',
info_path='data/kitti/kitti_dbinfos_train.pkl',
rate=1.0,
prepare=dict(
filter_by_difficulty=[-1],
filter_by_min_points=dict(Car=5, Pedestrian=5, Cyclist=5)),
classes=['Pedestrian', 'Cyclist', 'Car'],
sample_groups=dict(Car=15, Pedestrian=15, Cyclist=15),
points_loader=dict(
type='LoadPointsFromFile',
coord_type='LIDAR',
load_dim=4,
use_dim=4)),
use_ground_plane=True),
dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
dict(
type='PointSegClassMapping', # Declare valid categories, refer to mmdet3d.datasets.transforms.point_seg_class_mapping for more details
valid_cat_ids=(3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28, 33, 34,
36, 39), # all valid categories ids
max_cat_id=40), # max possible category id in input segmentation mask
dict(type='PointSample', # Sample points, refer to mmdet3d.datasets.transforms.transforms_3d for more details
num_points=40000), # Number of points to be sampled
dict(type='IndoorFlipData', # Augmentation pipeline that flip points and 3d boxes
flip_ratio_yz=0.5, # Probability of being flipped along yz plane
flip_ratio_xz=0.5), # Probability of being flipped along xz plane
type='GlobalRotScaleTrans',
rot_range=[-0.78539816, 0.78539816],
scale_ratio_range=[0.95, 1.05]),
dict(
type='IndoorGlobalRotScale', # Augmentation pipeline that rotate and scale points and 3d boxes, refer to mmdet3d.datasets.transforms.indoor_augment for more details
shift_height=True, # Whether the loaded points use `shift_height` attribute
rot_range=[-0.027777777777777776, 0.027777777777777776], # Range of rotation
scale_range=None), # Range of scale
type='PointsRangeFilter',
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
dict(
type='DefaultFormatBundle3D', # Default format bundle to gather data in the pipeline, refer to mmdet3d.datasets.transforms.formatting for more details
class_names=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door',
'window', 'bookshelf', 'picture', 'counter', 'desk',
'curtain', 'refrigerator', 'showercurtrain', 'toilet',
'sink', 'bathtub', 'garbagebin')),
type='ObjectRangeFilter',
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
dict(type='PointShuffle'),
dict(
type='Collect3D', # Pipeline that decides which keys in the data should be passed to the detector, refer to mmdet3d.datasets.transforms.formatting for more details
keys=[
'points', 'gt_bboxes_3d', 'gt_labels_3d', 'pts_semantic_mask',
'pts_instance_mask'
])
type='Pack3DDetInputs',
keys=['points', 'gt_labels_3d', 'gt_bboxes_3d'])
]
test_pipeline = [ # Testing pipeline, refer to mmdet3d.datasets.transforms for more details
dict(
type='LoadPointsFromFile', # First pipeline to load points, refer to mmdet3d.datasets.transforms.indoor_loading for more details
shift_height=True, # Whether to use shifted height
load_dim=6, # The dimension of the loaded points
use_dim=[0, 1, 2]), # Which dimensions of the points to be used
dict(type='PointSample', # Sample points, refer to mmdet3d.datasets.transforms.transforms_3d for more details
num_points=40000), # Number of points to be sampled
test_pipeline = [
dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
dict(
type='DefaultFormatBundle3D', # Default format bundle to gather data in the pipeline, refer to mmdet3d.datasets.transforms.formatting for more details
class_names=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door',
'window', 'bookshelf', 'picture', 'counter', 'desk',
'curtain', 'refrigerator', 'showercurtrain', 'toilet',
'sink', 'bathtub', 'garbagebin')),
dict(type='Collect3D', # Pipeline that decides which keys in the data should be passed to the detector, refer to mmdet3d.datasets.transforms.formatting for more details
keys=['points'])
type='MultiScaleFlipAug3D',
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type='GlobalRotScaleTrans',
rot_range=[0, 0],
scale_ratio_range=[1.0, 1.0],
translation_std=[0, 0, 0]),
dict(type='RandomFlip3D'),
dict(
type='PointsRangeFilter',
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1])
]),
dict(type='Pack3DDetInputs', keys=['points'])
]
eval_pipeline = [ # Pipeline used for evaluation or visualization, refer to mmdet3d.datasets.transforms for more details
dict(
type='LoadPointsFromFile', # First pipeline to load points, refer to mmdet3d.datasets.transforms.indoor_loading for more details
shift_height=True, # Whether to use shifted height
load_dim=6, # The dimension of the loaded points
use_dim=[0, 1, 2]), # Which dimensions of the points to be used
dict(
type='DefaultFormatBundle3D', # Default format bundle to gather data in the pipeline, refer to mmdet3d.datasets.transforms.formatting for more details
class_names=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door',
'window', 'bookshelf', 'picture', 'counter', 'desk',
'curtain', 'refrigerator', 'showercurtrain', 'toilet',
'sink', 'bathtub', 'garbagebin')),
with_label=False),
dict(type='Collect3D', # Pipeline that decides which keys in the data should be passed to the detector, refer to mmdet3d.datasets.transforms.formatting for more details
keys=['points'])
eval_pipeline = [
dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
dict(type='Pack3DDetInputs', keys=['points'])
]
data = dict(
samples_per_gpu=8, # Batch size of a single GPU
workers_per_gpu=4, # Number of workers to pre-fetch data for each single GPU
train=dict( # Train dataset config
type='RepeatDataset', # Wrapper of dataset, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/dataset_wrappers.py for details.
times=5, # Repeat times
train_dataloader = dict(
batch_size=6,
num_workers=4,
persistent_workers=True,
sampler=dict(type='DefaultSampler', shuffle=True),
dataset=dict(
type='RepeatDataset',
times=2,
dataset=dict(
type='ScanNetDataset', # Type of dataset
data_root='./data/scannet/', # Root path of the data
ann_file='./data/scannet/scannet_infos_train.pkl', # Ann path of the data
pipeline=[ # pipeline, this is passed by the train_pipeline created before.
type='KittiDataset',
data_root='data/kitti/',
ann_file='kitti_infos_train.pkl',
data_prefix=dict(pts='training/velodyne_reduced'),
pipeline=[
dict(
type='LoadPointsFromFile',
shift_height=True,
load_dim=6,
use_dim=[0, 1, 2]),
coord_type='LIDAR',
load_dim=4,
use_dim=4),
dict(
type='LoadAnnotations3D',
with_bbox_3d=True,
with_label_3d=True,
with_mask_3d=True,
with_seg_3d=True),
with_label_3d=True),
dict(
type='PointSegClassMapping',
valid_cat_ids=(3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24,
28, 33, 34, 36, 39),
max_cat_id=40),
dict(type='PointSample', num_points=40000),
type='ObjectSample',
db_sampler=dict(
data_root='data/kitti/',
info_path='data/kitti/kitti_dbinfos_train.pkl',
rate=1.0,
prepare=dict(
filter_by_difficulty=[-1],
filter_by_min_points=dict(
Car=5, Pedestrian=5, Cyclist=5)),
classes=['Pedestrian', 'Cyclist', 'Car'],
sample_groups=dict(Car=15, Pedestrian=15, Cyclist=15),
points_loader=dict(
type='LoadPointsFromFile',
coord_type='LIDAR',
load_dim=4,
use_dim=4)),
use_ground_plane=True),
dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
dict(
type='IndoorFlipData',
flip_ratio_yz=0.5,
flip_ratio_xz=0.5),
type='GlobalRotScaleTrans',
rot_range=[-0.78539816, 0.78539816],
scale_ratio_range=[0.95, 1.05]),
dict(
type='IndoorGlobalRotScale',
shift_height=True,
rot_range=[-0.027777777777777776, 0.027777777777777776],
scale_range=None),
type='PointsRangeFilter',
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
dict(
type='DefaultFormatBundle3D',
class_names=('cabinet', 'bed', 'chair', 'sofa', 'table',
'door', 'window', 'bookshelf', 'picture',
'counter', 'desk', 'curtain', 'refrigerator',
'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin')),
type='ObjectRangeFilter',
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
dict(type='PointShuffle'),
dict(
type='Collect3D',
keys=[
'points', 'gt_bboxes_3d', 'gt_labels_3d',
'pts_semantic_mask', 'pts_instance_mask'
])
type='Pack3DDetInputs',
keys=['points', 'gt_labels_3d', 'gt_bboxes_3d'])
],
filter_empty_gt=False, # Whether to filter empty ground truth boxes
classes=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door',
'window', 'bookshelf', 'picture', 'counter', 'desk',
'curtain', 'refrigerator', 'showercurtrain', 'toilet',
'sink', 'bathtub', 'garbagebin'))), # Names of classes
val=dict( # Validation dataset config
type='ScanNetDataset', # Type of dataset
data_root='./data/scannet/', # Root path of the data
ann_file='./data/scannet/scannet_infos_val.pkl', # Ann path of the data
pipeline=[ # Pipeline is passed by test_pipeline created before
modality=dict(use_lidar=True, use_camera=False),
test_mode=False,
metainfo=dict(CLASSES=['Pedestrian', 'Cyclist', 'Car']),
box_type_3d='LiDAR')))
val_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type='DefaultSampler', shuffle=False),
dataset=dict(
type='KittiDataset',
data_root='data/kitti/',
data_prefix=dict(pts='training/velodyne_reduced'),
ann_file='kitti_infos_val.pkl',
pipeline=[
dict(
type='LoadPointsFromFile',
shift_height=True,
load_dim=6,
use_dim=[0, 1, 2]),
dict(type='PointSample', num_points=40000),
coord_type='LIDAR',
load_dim=4,
use_dim=4),
dict(
type='DefaultFormatBundle3D',
class_names=('cabinet', 'bed', 'chair', 'sofa', 'table',
'door', 'window', 'bookshelf', 'picture',
'counter', 'desk', 'curtain', 'refrigerator',
'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin')),
dict(type='Collect3D', keys=['points'])
type='MultiScaleFlipAug3D',
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type='GlobalRotScaleTrans',
rot_range=[0, 0],
scale_ratio_range=[1.0, 1.0],
translation_std=[0, 0, 0]),
dict(type='RandomFlip3D'),
dict(
type='PointsRangeFilter',
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1])
]),
dict(type='Pack3DDetInputs', keys=['points'])
],
classes=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window',
'bookshelf', 'picture', 'counter', 'desk', 'curtain',
'refrigerator', 'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin'), # Names of classes
test_mode=True), # Whether to use test mode
test=dict( # Test dataset config
type='ScanNetDataset', # Type of dataset
data_root='./data/scannet/', # Root path of the data
ann_file='./data/scannet/scannet_infos_val.pkl', # Ann path of the data
pipeline=[ # Pipeline is passed by test_pipeline created before
modality=dict(use_lidar=True, use_camera=False),
test_mode=True,
metainfo=dict(CLASSES=['Pedestrian', 'Cyclist', 'Car']),
box_type_3d='LiDAR'))
test_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type='DefaultSampler', shuffle=False),
dataset=dict(
type='KittiDataset',
data_root='data/kitti/',
data_prefix=dict(pts='training/velodyne_reduced'),
ann_file='kitti_infos_val.pkl',
pipeline=[
dict(
type='LoadPointsFromFile',
shift_height=True,
load_dim=6,
use_dim=[0, 1, 2]),
dict(type='PointSample', num_points=40000),
coord_type='LIDAR',
load_dim=4,
use_dim=4),
dict(
type='DefaultFormatBundle3D',
class_names=('cabinet', 'bed', 'chair', 'sofa', 'table',
'door', 'window', 'bookshelf', 'picture',
'counter', 'desk', 'curtain', 'refrigerator',
'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin')),
dict(type='Collect3D', keys=['points'])
type='MultiScaleFlipAug3D',
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type='GlobalRotScaleTrans',
rot_range=[0, 0],
scale_ratio_range=[1.0, 1.0],
translation_std=[0, 0, 0]),
dict(type='RandomFlip3D'),
dict(
type='PointsRangeFilter',
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1])
]),
dict(type='Pack3DDetInputs', keys=['points'])
],
classes=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window',
'bookshelf', 'picture', 'counter', 'desk', 'curtain',
'refrigerator', 'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin'), # Names of classes
test_mode=True)) # Whether to use test mode
evaluation = dict(pipeline=[ # Pipeline is passed by eval_pipeline created before
modality=dict(use_lidar=True, use_camera=False),
test_mode=True,
metainfo=dict(CLASSES=['Pedestrian', 'Cyclist', 'Car']),
box_type_3d='LiDAR'))
```
[Evaluators](https://mmengine.readthedocs.io/en/latest/tutorials/metric_and_evaluator.html) are used to compute the metrics of the trained model on the validation and testing datasets. The config of evaluators consists of one or a list of metric configs:
```python
val_evaluator = dict(
type='KittiMetric',
ann_file='data/kitti/kitti_infos_val.pkl',
metric='bbox')
test_evaluator = dict(
type='KittiMetric',
ann_file='data/kitti/kitti_infos_val.pkl',
metric='bbox')
```
### Training and testing config
MMEngine's runner uses Loop to control the training, validation, and testing processes.
Users can set the maximum training epochs and validation intervals with these fields.
```python
train_cfg = dict(
type='EpochBasedTrainLoop',
max_epochs=80,
val_interval=2)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
```
### Optimization config
`optim_wrapper` is the field to configure optimization related settings. The optimizer wrapper not only provides the functions of the optimizer, but also supports functions such as gradient clipping, mixed precision training, etc. Find more in [optimizer wrapper tutorial](https://mmengine.readthedocs.io/en/latest/tutorials/optimizer.html).
```python
optim_wrapper = dict( # Optimizer wrapper config
type='OptimWrapper', # Optimizer wrapper type, switch to AmpOptimWrapper to enable mixed precision training.
optimizer=dict( # Optimizer config. Support all kinds of optimizers in PyTorch. Refer to https://pytorch.org/docs/stable/optim.html#algorithms
type='AdamW', lr=0.001, betas=(0.95, 0.99), weight_decay=0.01),
clip_grad=dict(max_norm=35, norm_type=2)) # Gradient clip option. Set None to disable gradient clip. Find usage in https://mmengine.readthedocs.io/en/latest/tutorials
```
`param_scheduler` is a field that configures methods of adjusting optimization hyperparameters such as learning rate and momentum. Users can combine multiple schedulers to create a desired parameter adjustment strategy. Find more in [parameter scheduler tutorial](https://mmengine.readthedocs.io/en/latest/tutorials/param_scheduler.html) and [parameter scheduler API documents](TODO)
```python
param_scheduler = [
dict(
type='LoadPointsFromFile',
coord_type='DEPTH',
shift_height=False,
load_dim=6,
use_dim=[0, 1, 2]),
type='CosineAnnealingLR',
T_max=32.0,
eta_min=0.01,
begin=0,
end=32.0,
by_epoch=True,
convert_to_iter_based=True),
dict(
type='CosineAnnealingLR',
T_max=48.0,
eta_min=1.0000000000000001e-07,
begin=32.0,
end=80,
by_epoch=True,
convert_to_iter_based=True),
dict(
type='DefaultFormatBundle3D',
class_names=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door',
'window', 'bookshelf', 'picture', 'counter', 'desk',
'curtain', 'refrigerator', 'showercurtrain', 'toilet',
'sink', 'bathtub', 'garbagebin'),
with_label=False),
dict(type='Collect3D', keys=['points'])
])
lr = 0.008 # Learning rate of optimizers
optimizer = dict( # Config used to build optimizer, support all the optimizers in PyTorch whose arguments are also the same as those in PyTorch
type='Adam', # Type of optimizers, refer to https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/optimizer/default_constructor.py#L12 for more details
lr=0.008) # Learning rate of optimizers, see detail usages of the parameters in the documentation of PyTorch
optimizer_config = dict( # Config used to build the optimizer hook, refer to https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/hooks/optimizer.py#L22 for implementation details.
grad_clip=dict( # Config used to grad_clip
max_norm=10, # max norm of the gradients
norm_type=2)) # Type of the used p-norm. Can be 'inf' for infinity norm.
lr_config = dict( # Learning rate scheduler config used to register LrUpdater hook
policy='step', # The policy of scheduler, also support CosineAnnealing, Cyclic, etc. Refer to details of supported LrUpdater from https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/hooks/lr_updater.py#L9.
warmup=None, # The warmup policy, also support `exp` and `constant`.
step=[24, 32]) # Steps to decay the learning rate
checkpoint_config = dict( # Config of set the checkpoint hook, Refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/checkpoint.py for implementation.
interval=1) # The save interval is 1
log_config = dict( # config of register logger hook
interval=50, # Interval to print the log
hooks=[dict(type='TextLoggerHook'),
dict(type='TensorboardLoggerHook')]) # The logger used to record the training process.
runner = dict(type='EpochBasedRunner', max_epochs=36) # Runner that runs the `workflow` in total `max_epochs`
dist_params = dict(backend='nccl') # Parameters to setup distributed training, the port can also be set.
log_level = 'INFO' # The level of logging.
find_unused_parameters = True # Whether to find unused parameters
work_dir = None # Directory to save the model checkpoints and logs for the current experiments.
load_from = None # load models as a pre-trained model from a given path. This will not resume training.
resume_from = None # Resume checkpoints from a given path, the training will be resumed from the epoch when the checkpoint's is saved. The training state such as the epoch number and optimizer state will be restored.
workflow = [('train', 1)] # Workflow for runner. [('train', 1)] means there is only one workflow and the workflow named 'train' is executed once. The workflow trains the model by 36 epochs according to the max_epochs.
gpu_ids = range(0, 1) # ids of gpus
type='CosineAnnealingMomentum',
T_max=32.0,
eta_min=0.8947368421052632,
begin=0,
end=32.0,
by_epoch=True,
convert_to_iter_based=True),
dict(
type='CosineAnnealingMomentum',
T_max=48.0,
eta_min=1,
begin=32.0,
end=80,
convert_to_iter_based=True)
]
```
## FAQ
### Hook config
Users can attach hooks to training, validation, and testing loops to insert some oprations during running. There are two different hook fields, one is `default_hooks` and the other is `custom_hooks`.
`default_hooks` is a dict of hook configs. `default_hooks` are the hooks must required at runtime. They have default priority which should not be modified. If not set, runner will use the default values. To disable a default hook, users can set its config to `None`.
```python
default_hooks = dict(
timer=dict(type='IterTimerHook'),
logger=dict(type='LoggerHook', interval=50),
param_scheduler=dict(type='ParamSchedulerHook'),
checkpoint=dict(type='CheckpointHook', interval=1),
sampler_seed=dict(type='DistSamplerSeedHook'))
```
### Runtime config
```python
default_scope = 'mmdet3d' # The default registry scope to find modules. Refer to https://mmengine.readthedocs.io/en/latest/tutorials/registry.html
env_cfg = dict(
cudnn_benchmark=False, # Whether to enable cudnn benchmark
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0), # Use fork to start multi-processing threads. 'fork' usually faster than 'spawn' but maybe unsafe. See discussion in https://github.com/pytorch/pytorch/issues/1355
dist_cfg=dict(backend='nccl')) # Distribution configs
vis_backends = [dict(type='LocalVisBackend')] # Visualization backends.
visualizer = dict(
type='Det3DLocalVisualizer',
vis_backends=[dict(type='LocalVisBackend')],
name='visualizer')
log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)
log_level = 'INFO'
load_from = None
resume = False
```
## Config file inheritance
There are 4 basic component types under `config/_base_`, dataset, model, schedule, default_runtime.
Many methods could be easily constructed with one of each like SECOND, PointPillars, PartA2, and VoteNet.
The configs that are composed by components from `_base_` are called _primitive_.
For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum of inheritance level is 3.
For easy understanding, we recommend contributors to inherit from exiting methods.
For example, if some modification is made based on PointPillars, user may first inherit the basic PointPillars structure by specifying `_base_ = ../pointpillars/pointpillars_hv_fpn_sbn-all_8xb4_2x_nus-3d.py`, and then modify the necessary fields in the config files.
If you are building an entirely new method that does not share the structure with any of the existing methods, you may create a folder `xxx_rcnn` under `configs`,
Please refer to [mmengine config tutorial](https://mmengine.readthedocs.io/en/latest/tutorials/config.html) for detailed documentation.
### Ignore some fields in the base configs
......@@ -483,8 +530,9 @@ train_pipeline = [
dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
dict(type='ObjectNameFilter', classes=class_names),
dict(type='PointShuffle'),
dict(type='DefaultFormatBundle3D', class_names=class_names),
dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
dict(
type='Pack3DDetInputs',
keys=['points', 'gt_labels_3d', 'gt_bboxes_3d'])
]
test_pipeline = [
dict(
......@@ -509,18 +557,63 @@ test_pipeline = [
translation_std=[0, 0, 0]),
dict(type='RandomFlip3D'),
dict(
type='PointsRangeFilter', point_cloud_range=point_cloud_range),
dict(
type='DefaultFormatBundle3D',
class_names=class_names,
with_label=False),
dict(type='Collect3D', keys=['points'])
])
type='PointsRangeFilter', point_cloud_range=point_cloud_range)
]),
dict(type='Pack3DDetInputs', keys=['points'])
]
data = dict(
train=dict(pipeline=train_pipeline),
val=dict(pipeline=test_pipeline),
test=dict(pipeline=test_pipeline))
train_dataloader = dict(dataset=dict(pipeline=train_pipeline))
test_dataloader = dict(dataset=dict(pipeline=test_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
```
### Reuse variables in \_base\_ file
If the users want to reuse the variables in the base file, they can get a copy of the corresponding variable by using `{{_base_.xxx}}`. E.g:
```python
_base_ = './pointpillars_hv_secfpn_8xb6_160e_kitti-3d-3class.py'
a = {{_base_.model}} # variable `a` is equal to the `model` defined in `_base_`
```
We first define the new `train_pipeline`/`test_pipeline` and pass them into `data`.
## Modify Config Through Script Arguments
When submitting jobs using "tools/train.py" or "tools/test.py", you may specify `--cfg-options` to in-place modify the config.
- Update config keys of dict chains.
The config options can be specified following the order of the dict keys in the original config.
For example, `--cfg-options model.backbone.norm_eval=False` changes the all BN modules in model backbones to `train` mode.
- Update keys inside a list of configs.
Some config dicts are composed as a list in your config. For example, the training pipeline `train_dataloader.dataset.pipeline` is normally a list
e.g. `[dict(type='LoadImageFromFile'), ...]`. If you want to change `'LoadImageFromFile'` to `'LoadImageFromNDArray'` in the pipeline,
you may specify `--cfg-options data.train.pipeline.0.type=LoadImageFromNDArray`.
- Update values of list/tuples.
If the value to be updated is a list or a tuple. For example, the config file normally sets `model.data_preprocessor.mean=[123.675, 116.28, 103.53]`. If you want to
change the mean values, you may specify `--cfg-options model.data_preprocessor.mean="[127,127,127]"`. Note that the quotation mark `"` is necessary to
support list/tuple data types, and that **NO** white space is allowed inside the quotation marks in the specified value.
## Config Name Style
We follow the below style to name config files. Contributors are advised to follow the same style.
```
{algorithm name}_{model component names [component1]_[component2]_[...]}_{training settings}_{training dataset information}_{testing dataset information}.py
```
The file name is divided to five parts. All parts and components are connected with `_` and words of each part or component should be connected with `-`.
- `{algorithm name}`: The name of the algorithm. It can be a detector name such as `pointpillars`, `fcos3d`, etc.
- `{model component names}`: Names of the components used in the algorithm such as voxel_encoder, backbone, neck, etc. For example, `second_secfpn_head-dcn-circlenms` means using SECOND's SparseEncoder, SECONDFPN and a detection head with DCN and circle NMS.
- `{training settings}`: Information of training settings such as batch size, augmentations, loss trick, scheduler, and epochs/iterations. For example: `8xb4-tta-cyclic-20e` means using 8-gpus x 4-samples-per-gpu, test time augmentation, cyclic annealing learning rate, and train 20 epochs.
Some abbreviations:
- `{gpu x batch_per_gpu}`: GPUs and samples per GPU. `bN` indicates N batch size per GPU. E.g. `4xb4` is the short term of 4-gpus x 4-samples-per-gpu.
- `{schedule}`: training schedule, options are `schedule-2x`, `schedule-3x`, `cyclic-20e`, etc.
`schedule-2x` and `schedule-3x` mean 24 epochs and 36 epochs respectively.
`cyclic-20e` means 20 epochs respectively.
- `{training dataset information}`: Training dataset names like `kitti-3d-3class`, `nus-3d`, `s3dis-seg`, `scannet-seg`, `waymoD5-3d-car`. Here `3d` means dataset used for 3d object detection, and `seg` means dataset used for point cloud segmentation.
- `{testing dataset information}` (optional): Testing dataset name for models trained on one dataset but tested on another. If not mentioned, it means the model was trained and tested on the same dataset type.
# Copyright (c) OpenMMLab. All rights reserved.
import argparse
from mmcv import Config, DictAction
from mmengine import Config, DictAction
def parse_args():
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment