# Learn about Configs
MMDetection3D and other OpenMMLab repositories use [MMEngine's config system](https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html). It has a modular and inheritance design, which is convenient to conduct various experiments.
## Config file content
MMDetection3D uses a modular design, all modules with different functions can be configured through the config. Taking PointPillars as an example, we will introduce each field in the config according to different function modules.
### Model config
In MMDetection3D's config, we use `model` to set up detection algorithm components. In addition to neural network components such as `voxel_encoder`, `backbone`, etc., it also requires `data_preprocessor`, `train_cfg`, and `test_cfg`. `data_preprocessor` is responsible for processing a batch of data output by the dataloader. `train_cfg` and `test_cfg` in the model config are training and testing hyperparameters of the components.
```python
model = dict(
    ...
        assigner=[
            dict(
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1),
            dict(
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.5,
                neg_iou_thr=0.35,
                min_pos_iou=0.35,
                ignore_iof_thr=-1),
            dict(
                type='Max3DIoUAssigner',
                iou_calculator=dict(type='BboxOverlapsNearest3D'),
                pos_iou_thr=0.6,
                neg_iou_thr=0.45,
                min_pos_iou=0.45,
                ...
```
### Dataset and evaluator config
[Dataloaders](https://pytorch.org/docs/stable/data.html?highlight=data%20loader#torch.utils.data.DataLoader) are required for the training, validation, and testing of the [runner](https://mmengine.readthedocs.io/en/latest/tutorials/runner.html). Dataset and data pipeline need to be set to build the dataloader. Due to the complexity of this part, we use intermediate variables to simplify the writing of data pipeline configs.
```python
dataset_type = 'KittiDataset'
data_root = 'data/kitti/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
point_cloud_range = [0, -39.68, -3, 69.12, 39.68, 1]
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(classes=class_names)
db_sampler = dict(
    data_root=data_root,
    info_path=data_root + 'kitti_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(
        filter_by_difficulty=[-1],
        filter_by_min_points=dict(Car=5, Pedestrian=5, Cyclist=5)),
    classes=class_names,
    sample_groups=dict(Car=15, Pedestrian=15, Cyclist=15),
    points_loader=dict(
        type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4))
train_pipeline = [
    dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(type='ObjectSample', db_sampler=db_sampler, use_ground_plane=True),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(
        type='Pack3DDetInputs',
        keys=['points', 'gt_labels_3d', 'gt_bboxes_3d'])
]
test_pipeline = [
    dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range)
        ]),
    dict(type='Pack3DDetInputs', keys=['points'])
]
train_dataloader = dict(
    ...
    dataset=dict(
        type='RepeatDataset',
        times=2,
        dataset=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file='kitti_infos_train.pkl',
            data_prefix=dict(pts='training/velodyne_reduced'),
            pipeline=train_pipeline,
            modality=input_modality,
            test_mode=False,
            metainfo=metainfo,
            box_type_3d='LiDAR')))
val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_prefix=dict(pts='training/velodyne_reduced'),
        ann_file='kitti_infos_val.pkl',
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        box_type_3d='LiDAR'))
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_prefix=dict(pts='training/velodyne_reduced'),
        ann_file='kitti_infos_val.pkl',
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        box_type_3d='LiDAR'))
```
[Evaluators](https://mmengine.readthedocs.io/en/latest/tutorials/evaluation.html) are used to compute the metrics of the trained model on the validation and testing datasets. The config of evaluators consists of one or a list of metric configs:
```python
val_evaluator = dict(
    type='KittiMetric',
    ann_file=data_root + 'kitti_infos_val.pkl',
    metric='bbox')
test_evaluator = val_evaluator
```
Since the test dataset does not have annotation files, the `test_dataloader` and `test_evaluator` configs in MMDetection3D are generally the same as the val's. If you want to save the detection results on the test dataset, you can write the config like this:
```python
# inference on test dataset and
# format the output results for submission.
test_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_prefix=dict(pts='testing/velodyne_reduced'),
        ann_file='kitti_infos_test.pkl',
        load_eval_anns=False,
        pipeline=test_pipeline,
        modality=input_modality,
        test_mode=True,
        metainfo=metainfo,
        box_type_3d='LiDAR'))
test_evaluator = dict(
    type='KittiMetric',
    ann_file=data_root + 'kitti_infos_test.pkl',
    metric='bbox',
    format_only=True,
    submission_prefix='results/kitti-3class/kitti_results')
```
### Training and testing config
MMEngine's runner uses Loop to control the training, validation, and testing processes.
Users can set the maximum training epochs and validation intervals with these fields:
```python
train_cfg = dict(
    type='EpochBasedTrainLoop', max_epochs=80, val_interval=2)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
```
### Optimization config
`optim_wrapper` is the field to configure optimization-related settings. The optimizer wrapper not only provides the functions of the optimizer, but also supports functions such as gradient clipping, mixed precision training, etc. Find more in [optimizer wrapper tutorial](https://mmengine.readthedocs.io/en/latest/tutorials/optim_wrapper.html).
```python
optim_wrapper = dict(  # Optimizer wrapper config
    type='OptimWrapper',  # Optimizer wrapper type, switch to AmpOptimWrapper to enable mixed precision training.
    optimizer=dict(  # Optimizer config. Support all kinds of optimizers in PyTorch. Refer to https://pytorch.org/docs/stable/optim.html#algorithms
        type='AdamW', lr=0.001, betas=(0.95, 0.99), weight_decay=0.01),
    clip_grad=dict(max_norm=35, norm_type=2))  # Gradient clip option. Set None to disable gradient clip. Find usage in https://mmengine.readthedocs.io/en/latest/tutorials/optim_wrapper.html
```
`param_scheduler` is a field that configures methods of adjusting optimization hyperparameters such as learning rate and momentum. Users can combine multiple schedulers to create a desired parameter adjustment strategy. Find more in [parameter scheduler tutorial](https://mmengine.readthedocs.io/en/latest/tutorials/param_scheduler.html) and [parameter scheduler API documents](https://mmengine.readthedocs.io/en/latest/api/optim.html#scheduler).
```python
param_scheduler = [
    dict(
        type='CosineAnnealingLR',
        T_max=32,
        eta_min=0.01,
        begin=0,
        end=32,
        by_epoch=True,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingLR',
        T_max=48,
        eta_min=1.0000000000000001e-07,
        begin=32,
        end=80,
        by_epoch=True,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingMomentum',
        T_max=32,
        eta_min=0.8947368421052632,
        begin=0,
        end=32,
        by_epoch=True,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingMomentum',
        T_max=48,
        eta_min=1,
        begin=32,
        end=80,
        by_epoch=True,
        convert_to_iter_based=True),
]
```
### Hook config
Users can attach hooks to training, validation, and testing loops to insert some operations during running. There are two different hook fields, one is `default_hooks` and the other is `custom_hooks`.
`default_hooks` is a dict of hook configs for the hooks that are required at runtime. They have default priority, which should not be modified. If not set, the runner will use the default values. To disable a default hook, users can set its config to `None`.
```python
default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=50),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=-1),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    visualization=dict(type='Det3DVisualizationHook'))
```
`custom_hooks` is a list of all other hook configs. Users can develop their own hooks and insert them in this field.
```python
custom_hooks = []
```
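Conceptually, a hook is just an object whose callback methods the runner invokes at fixed points of the loop. The following plain-Python sketch illustrates only the mechanism; the class, method, and variable names are made up and do not match MMEngine's actual `Hook` API:

```python
# Toy hook mechanism (illustrative only, not MMEngine's API): the "runner"
# calls every registered hook at a fixed point of each training iteration.
class Hook:
    def after_train_iter(self, runner):
        pass  # default: do nothing


class LossRecorderHook(Hook):
    """Hypothetical hook that records the loss of every iteration."""

    def __init__(self):
        self.losses = []

    def after_train_iter(self, runner):
        self.losses.append(runner['loss'])


def run_loop(hooks, num_iters):
    runner = {'loss': 0.0}
    for it in range(num_iters):
        runner['loss'] = 1.0 / (it + 1)  # stand-in for a real training step
        for hook in hooks:               # hook point: after each iteration
            hook.after_train_iter(runner)


recorder = LossRecorderHook()
run_loop([recorder], num_iters=3)
print(recorder.losses)
```

Real hooks work the same way: they are registered once and the runner calls them at the named points (`before_train_epoch`, `after_train_iter`, etc.) without the training loop knowing what each hook does.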
### Runtime config
```python
default_scope = 'mmdet3d'  # The default registry scope to find modules. Refer to https://mmengine.readthedocs.io/en/latest/advanced_tutorials/registry.html
env_cfg = dict(
    cudnn_benchmark=False,  # Whether to enable cudnn benchmark
    mp_cfg=dict(  # Multi-processing config
        mp_start_method='fork',  # Use fork to start multi-processing threads. 'fork' is usually faster than 'spawn' but may be unsafe. See discussion in https://github.com/pytorch/pytorch/issues/1355
        opencv_num_threads=0),  # Disable opencv multi-threads to avoid system being overloaded
    dist_cfg=dict(backend='nccl'))  # Distribution configs
vis_backends = [dict(type='LocalVisBackend')]  # Visualization backends. Refer to https://mmengine.readthedocs.io/en/latest/advanced_tutorials/visualization.html
visualizer = dict(
    type='Det3DLocalVisualizer', vis_backends=vis_backends, name='visualizer')
log_processor = dict(
    type='LogProcessor',  # Log processor to process runtime logs
    window_size=50,  # Smooth interval of log values
    by_epoch=True)  # Whether to format logs with epoch type. Should be consistent with the train loop's type.
log_level = 'INFO'  # The level of logging
load_from = None  # Load model checkpoint as a pre-trained model from a given path. This will not resume training.
resume = False  # Whether to resume from the checkpoint defined in `load_from`. If `load_from` is None, it will resume the latest checkpoint in the `work_dir`.
```
## Config file inheritance
There are 4 basic component types under `configs/_base_`: dataset, model, schedule, default_runtime.
Many methods could be easily constructed with one of these, like SECOND, PointPillars, PartA2, and VoteNet.
The configs that are composed of components from `_base_` are called _primitive_.
For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum inheritance level is 3.
For easy understanding, we recommend contributors inherit from existing methods.
For example, if some modification is made based on PointPillars, users may first inherit the basic PointPillars structure by specifying `_base_ = '../pointpillars/pointpillars_hv_fpn_sbn-all_8xb4-2x_nus-3d.py'`, then modify the necessary fields in the config files.
If you are building an entirely new method that does not share the structure with any of the existing methods, you may create a folder `xxx_rcnn` under `configs`.
Please refer to [MMEngine config tutorial](https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html) for detailed documentation.
By setting the `_base_` field, we can set which files the current configuration file inherits from.
When `_base_` is a string of a file path, it means inheriting the contents from one config file.
```python
_base_ = './pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py'
```
When `_base_` is a list of multiple file paths, it means inheriting from multiple files.
```python
_base_ = [
    '../_base_/models/pointpillars_hv_secfpn_kitti.py',
    '../_base_/datasets/kitti-3d-3class.py',
    '../_base_/schedules/cyclic-40e.py', '../_base_/default_runtime.py'
]
```
If you wish to inspect the config file, you may run `python tools/misc/print_config.py /PATH/TO/CONFIG` to see the complete config.
### Ignore some fields in the base configs
Sometimes, you may set `_delete_=True` to ignore some of the fields in base configs.
You may refer to [MMEngine config tutorial](https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html) for a simple illustration.
In MMDetection3D, for example, to change the neck of PointPillars with the following config:
```python
model = dict(
    ...
    pts_bbox_head=dict(...))
```
`FPN` and `SECONDFPN` use different keywords to construct:
```python
_base_ = '../_base_/models/pointpillars_hv_fpn_nus.py'
model = dict(
    pts_neck=dict(
        _delete_=True,
        type='SECONDFPN',
        ...),
    pts_bbox_head=dict(...))
```
The `_delete_=True` would replace all old keys in the `pts_neck` field with the new keys.
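The merge semantics can be sketched in a few lines of plain Python. This is a conceptual illustration only, not MMEngine's actual implementation:

```python
# Conceptual sketch of config inheritance: child dicts are merged into
# base dicts recursively, unless the child dict carries `_delete_=True`,
# in which case it replaces the base dict wholesale.
def merge_cfg(base: dict, child: dict) -> dict:
    merged = dict(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            value = dict(value)  # avoid mutating the caller's dict
            if value.pop('_delete_', False):
                merged[key] = value                          # replace wholesale
            else:
                merged[key] = merge_cfg(merged[key], value)  # deep merge
        else:
            merged[key] = value
    return merged


base = dict(pts_neck=dict(type='FPN', num_outs=3))
child = dict(pts_neck=dict(_delete_=True, type='SECONDFPN'))
print(merge_cfg(base, child)['pts_neck'])  # {'type': 'SECONDFPN'}
# Without `_delete_`, `num_outs=3` from the base would survive the merge.
```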
### Use intermediate variables in configs
Some intermediate variables are used in the config files, like `train_pipeline`/`test_pipeline` in datasets.
It's worth noting that when modifying intermediate variables in the children configs, users need to pass the intermediate variables into the corresponding fields again.
For example, we would like to use a multi-scale strategy to train and test PointPillars. `train_pipeline`/`test_pipeline` are the intermediate variables we would like to modify.
```python
_base_ = './nus-3d.py'
...
test_pipeline = [
    ...
    dict(
        type='MultiScaleFlipAug3D',
        ...
        transforms=[
            ...
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range)
        ]),
    dict(type='Pack3DDetInputs', keys=['points'])
]
train_dataloader = dict(dataset=dict(pipeline=train_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
test_dataloader = dict(dataset=dict(pipeline=test_pipeline))
```
We first define the new `train_pipeline`/`test_pipeline` and then pass them into the dataloader fields.
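The reason re-passing is required is ordinary Python semantics: a config file is plain Python, so a dict captures the value an intermediate variable had at assignment time. A small self-contained demonstration (variable names are illustrative):

```python
# A config is plain Python: assigning a new list to `pipeline` re-binds
# the name; it does not update dicts built from the old value.
pipeline = [dict(type='LoadPointsFromFile')]
dataloader = dict(dataset=dict(pipeline=pipeline))

pipeline = pipeline + [dict(type='RandomFlip3D')]   # new list object
assert len(dataloader['dataset']['pipeline']) == 1  # old value kept

dataloader = dict(dataset=dict(pipeline=pipeline))  # pass it in again
assert dataloader['dataset']['pipeline'][-1]['type'] == 'RandomFlip3D'
```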
### Reuse variables in \_base\_ file
If the users want to reuse the variables in the base file, they can get a copy of the corresponding variable by using `{{_base_.xxx}}`. E.g:
```python
_base_ = './pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py'
a = {{_base_.model}}  # variable `a` is equal to the `model` defined in `_base_`
```
## Modify config through script arguments
When submitting jobs using `tools/train.py` or `tools/test.py`, you may specify `--cfg-options` to in-place modify the config.
- Update config keys of dict chains
The config options can be specified following the order of the dict keys in the original config.
For example, `--cfg-options model.backbone.norm_eval=False` changes all the BN modules in the model backbone to `train` mode.
- Update keys inside a list of configs
Some config dicts are composed as a list in your config. For example, the training pipeline `train_dataloader.dataset.pipeline` is normally a list,
e.g. `[dict(type='LoadPointsFromFile'), ...]`. If you want to change `'LoadPointsFromFile'` to `'LoadPointsFromDict'` in the pipeline,
you may specify `--cfg-options data.train.pipeline.0.type=LoadPointsFromDict`.
- Update values of list/tuple
If the value to be updated is a list or a tuple, e.g., the config file normally sets `model.data_preprocessor.mean=[123.675, 116.28, 103.53]`. If you want to
change the mean values, you may specify `--cfg-options model.data_preprocessor.mean="[127,127,127]"`. Note that the quotation mark `"` is necessary to
support list/tuple data types, and that no white space is allowed inside the quotation marks in the specified value.
## Config name style
We follow the below style to name config files. Contributors are advised to follow the same style.
```
{algorithm name}_{model component names [in order of data flow]}_{training settings}_{training dataset information}_{testing dataset information}.py
```
The file name is divided into five parts. All parts and components are connected with `_`, and words within each part are connected with `-`.
- `{algorithm name}`: The name of the algorithm, such as `pointpillars`, `fcos3d`, etc.
- `{model component names}`: Names of the components used in the algorithm such as voxel_encoder, backbone, neck, etc. For example, `second_secfpn_head-dcn-circlenms` means using SECOND's SparseEncoder, SECONDFPN and a detection head with DCN and circle NMS.
- `{training settings}`: Information of training settings such as batch size, augmentations, loss trick, scheduler, and epochs/iterations. For example: `8xb4-tta-cyclic-20e` means using 8-gpus x 4-samples-per-gpu, test time augmentation, cyclic annealing learning rate, and train 20 epochs.
Some abbreviations:
- `{gpu x batch_per_gpu}`: GPUs and samples per GPU. `bN` indicates N batch size per GPU. E.g. `4xb4` is the short term of 4-GPUs x 4-samples-per-GPU.
- `{schedule}`: training schedule, options are `schedule-2x`, `schedule-3x`, `cyclic-20e`, etc.
`schedule-2x` and `schedule-3x` mean 24 epochs and 36 epochs respectively.
`cyclic-20e` means 20 epochs.
- `{training dataset information}`: Training dataset names like `kitti-3d-3class`, `nus-3d`, `s3dis-seg`, `scannet-seg`, `waymoD5-3d-car`. Here `3d` means dataset used for 3D object detection, and `seg` means dataset used for point cloud segmentation.
- `{testing dataset information}` (optional): Testing dataset name for models trained on one dataset but tested on another. If not mentioned, it means the model was trained and tested on the same dataset type.
# 学习配置文件
MMDetection3D 和其他 OpenMMLab 仓库使用[MMEngine 配置系统](https://mmengine.readthedocs.io/zh_CN/latest/tutorials/config.html)。它具有模块化和继承性设计,便于进行各种实验。如果希望检查配置文件,可以通过运行 `python tools/misc/print_config.py /PATH/TO/CONFIG` 来查看完整的配置。
MMDetection3D 和其他 OpenMMLab 仓库使用 [MMEngine 配置文件系统](https://mmengine.readthedocs.io/zh_CN/latest/advanced_tutorials/config.html)。它具有模块化和继承性设计,便于进行各种实验。
## 配置文件内容
## 配置文件内容
MMDetection3D 使用模块化设计,所有具有不同功能的模块可以通过配置文件进行配置。以 PointPillars 为例,我们将根据不同的功能模块介绍配置文件的每一个字段。
MMDetection3D 用模块化设计,所有功能的模块可以通过配置文件进行配置。以 PointPillars 为例,我们将根据不同的功能模块介绍配置文件的个字段。
### 模型配置
mmdetection3d 配置中,我们使用 `model` 置检测算法组件。除了神经网络组件,如 `voxel_encoder``backbone` 等,还需要 `data_preprocessor``train_cfg``test_cfg``data_preprocessor` 负责处理 dataloader 输出的一个批次的数据。模型配置中的 `train_cfg``test_cfg` 用于设置训练和测试组件的超参数。
MMDetection3D 的配置中,我们使用 `model` 字段来配置检测算法组件。除了 `voxel_encoder``backbone`神经网络组件外,还需要 `data_preprocessor``train_cfg``test_cfg``data_preprocessor` 负责对数据加载器(dataloader输出的每一批数据进行预处理。模型配置中的 `train_cfg``test_cfg` 用于设置训练和测试组件的超参数。
```python
model = dict(
......@@ -75,21 +75,21 @@ model = dict(
assigner=[
dict(
type='Max3DIoUAssigner',
iou_calculator=dict(type='mmdet3d.BboxOverlapsNearest3D'),
iou_calculator=dict(type='BboxOverlapsNearest3D'),
pos_iou_thr=0.5,
neg_iou_thr=0.35,
min_pos_iou=0.35,
ignore_iof_thr=-1),
dict(
type='Max3DIoUAssigner',
iou_calculator=dict(type='mmdet3d.BboxOverlapsNearest3D'),
iou_calculator=dict(type='BboxOverlapsNearest3D'),
pos_iou_thr=0.5,
neg_iou_thr=0.35,
min_pos_iou=0.35,
ignore_iof_thr=-1),
dict(
type='Max3DIoUAssigner',
iou_calculator=dict(type='mmdet3d.BboxOverlapsNearest3D'),
iou_calculator=dict(type='BboxOverlapsNearest3D'),
pos_iou_thr=0.6,
neg_iou_thr=0.45,
min_pos_iou=0.45,
......@@ -108,9 +108,9 @@ model = dict(
max_num=50))
```
### 数据集和评器配置
### 数据集和评器配置
[执行器(Runner)](https://mmengine.readthedocs.io/zh_CN/latest/tutorials/runner.html)的训练,验证和测试需要[数据加载器(dataloaders)](https://pytorch.org/docs/stable/data.html?highlight=data%20loader#torch.utils.data.DataLoader)。构建数据加载器需要数据集和数据流水线。由于这部分的复杂,我们使用中间变量简化数据加载配置文件的编写。
在使用[执行器(Runner)](https://mmengine.readthedocs.io/zh_CN/latest/tutorials/runner.html)进行训练、测试和验证时,我们需要配置[数据加载器](https://pytorch.org/docs/stable/data.html?highlight=data%20loader#torch.utils.data.DataLoader)。构建数据加载器需要设置数据集和数据处理流程。由于这部分的配置较为复杂,我们使用中间变量简化数据加载配置的编写。
```python
dataset_type = 'KittiDataset'
......@@ -118,49 +118,31 @@ data_root = 'data/kitti/'
class_names = ['Pedestrian', 'Cyclist', 'Car']
point_cloud_range = [0, -39.68, -3, 69.12, 39.68, 1]
input_modality = dict(use_lidar=True, use_camera=False)
metainfo = dict(classes=['Pedestrian', 'Cyclist', 'Car'])
metainfo = dict(classes=class_names)
db_sampler = dict(
data_root='data/kitti/',
info_path='data/kitti/kitti_dbinfos_train.pkl',
data_root=data_root,
info_path=data_root + 'kitti_dbinfos_train.pkl',
rate=1.0,
prepare=dict(
filter_by_difficulty=[-1],
filter_by_min_points=dict(Car=5, Pedestrian=5, Cyclist=5)),
classes=['Pedestrian', 'Cyclist', 'Car'],
classes=class_names,
sample_groups=dict(Car=15, Pedestrian=15, Cyclist=15),
points_loader=dict(
type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4))
train_pipeline = [
dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
dict(
type='ObjectSample',
db_sampler=dict(
data_root='data/kitti/',
info_path='data/kitti/kitti_dbinfos_train.pkl',
rate=1.0,
prepare=dict(
filter_by_difficulty=[-1],
filter_by_min_points=dict(Car=5, Pedestrian=5, Cyclist=5)),
classes=['Pedestrian', 'Cyclist', 'Car'],
sample_groups=dict(Car=15, Pedestrian=15, Cyclist=15),
points_loader=dict(
type='LoadPointsFromFile',
coord_type='LIDAR',
load_dim=4,
use_dim=4)),
use_ground_plane=True),
dict(type='ObjectSample', db_sampler=db_sampler, use_ground_plane=True),
dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
dict(
type='GlobalRotScaleTrans',
rot_range=[-0.78539816, 0.78539816],
scale_ratio_range=[0.95, 1.05]),
dict(
type='PointsRangeFilter',
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
dict(
type='ObjectRangeFilter',
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1]),
dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
dict(type='PointShuffle'),
dict(
type='Pack3DDetInputs',
......@@ -177,12 +159,11 @@ test_pipeline = [
dict(
type='GlobalRotScaleTrans',
rot_range=[0, 0],
scale_ratio_range=[1.0, 1.0],
scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]),
dict(type='RandomFlip3D'),
dict(
type='PointsRangeFilter',
point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1])
type='PointsRangeFilter', point_cloud_range=point_cloud_range)
]),
dict(type='Pack3DDetInputs', keys=['points'])
]
......@@ -199,57 +180,14 @@ train_dataloader = dict(
type='RepeatDataset',
times=2,
dataset=dict(
type='KittiDataset',
data_root='data/kitti/',
type=dataset_type,
data_root=data_root,
ann_file='kitti_infos_train.pkl',
data_prefix=dict(pts='training/velodyne_reduced'),
pipeline=train_pipeline,
modality=input_modality,
test_mode=False,
metainfo=metainfo,
box_type_3d='LiDAR')))
val_dataloader = dict(
batch_size=1,
    num_workers=1,
    persistent_workers=True,
drop_last=False,
sampler=dict(type='DefaultSampler', shuffle=False),
dataset=dict(
type=dataset_type,
data_root=data_root,
data_prefix=dict(pts='training/velodyne_reduced'),
ann_file='kitti_infos_val.pkl',
pipeline=test_pipeline,
modality=input_modality,
test_mode=True,
metainfo=metainfo,
box_type_3d='LiDAR'))
test_dataloader = dict(
batch_size=1,
    num_workers=1,
    persistent_workers=True,
drop_last=False,
sampler=dict(type='DefaultSampler', shuffle=False),
dataset=dict(
type=dataset_type,
data_root=data_root,
data_prefix=dict(pts='training/velodyne_reduced'),
ann_file='kitti_infos_val.pkl',
pipeline=test_pipeline,
modality=input_modality,
test_mode=True,
metainfo=metainfo,
box_type_3d='LiDAR'))
```
[Evaluators](https://mmengine.readthedocs.io/zh_CN/latest/tutorials/evaluation.html) are used to compute the metrics of the trained model on the validation and testing datasets. The config of evaluators consists of one or a list of metric configs:
```python
val_evaluator = dict(
type='KittiMetric',
ann_file=data_root + 'kitti_infos_val.pkl',
metric='bbox')
test_evaluator = val_evaluator
```
Since the test dataset has no annotation files, the `test_dataloader` and `test_evaluator` configs in MMDetection3D are generally equal to the val's. If you want to save the detection results on the test dataset, you can write the config like this:
```python
# inference on the test dataset and
# format the output results for submission
test_dataloader = dict(
batch_size=1,
num_workers=1,
persistent_workers=True,
drop_last=False,
sampler=dict(type='DefaultSampler', shuffle=False),
dataset=dict(
type=dataset_type,
data_root=data_root,
data_prefix=dict(pts='testing/velodyne_reduced'),
ann_file='kitti_infos_test.pkl',
load_eval_anns=False,
pipeline=test_pipeline,
modality=input_modality,
test_mode=True,
metainfo=metainfo,
box_type_3d='LiDAR'))
test_evaluator = dict(
type='KittiMetric',
ann_file=data_root + 'kitti_infos_test.pkl',
metric='bbox',
format_only=True,
submission_prefix='results/kitti-3class/kitti_results')
```
### Training and testing config
MMEngine's runner uses `Loop` to control the training, validation, and testing processes. Users can set the maximum training epochs and the validation interval with these fields:
```python
train_cfg = dict(
    type='EpochBasedTrainLoop', max_epochs=80, val_interval=2)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
```
### Optimization config
`optim_wrapper` is the field to configure optimization-related settings. The optimizer wrapper not only provides the functions of the optimizer, but also supports functions such as gradient clipping and mixed precision training. Find out more in the [optimizer wrapper tutorial](https://mmengine.readthedocs.io/zh_CN/latest/tutorials/optim_wrapper.html).
```python
optim_wrapper = dict(  # optimizer wrapper config
    type='OptimWrapper',  # optimizer wrapper type, switch to AmpOptimWrapper to enable mixed precision training
    optimizer=dict(  # optimizer config. Support all kinds of optimizers in PyTorch. Refer to https://pytorch.org/docs/stable/optim.html#algorithms
        type='AdamW', lr=0.001, betas=(0.95, 0.99), weight_decay=0.01),
    clip_grad=dict(max_norm=35, norm_type=2))  # gradient clipping option. Set None to disable gradient clipping. Usage can be found at https://mmengine.readthedocs.io/zh_CN/latest/tutorials/optim_wrapper.html
```
`param_scheduler` is the field that configures methods of adjusting optimization hyperparameters such as learning rate and momentum. Users can combine multiple schedulers to create a desired parameter adjustment strategy. Find out more in the [parameter scheduler tutorial](https://mmengine.readthedocs.io/zh_CN/latest/tutorials/param_scheduler.html) and the [parameter scheduler API documents](https://mmengine.readthedocs.io/zh_CN/latest/api/optim.html#scheduler).
```python
param_scheduler = [
dict(
type='CosineAnnealingLR',
T_max=32,
eta_min=0.01,
begin=0,
end=32,
by_epoch=True,
convert_to_iter_based=True),
dict(
type='CosineAnnealingLR',
T_max=48,
eta_min=1.0000000000000001e-07,
begin=32,
end=80,
by_epoch=True,
convert_to_iter_based=True),
dict(
type='CosineAnnealingMomentum',
T_max=32,
eta_min=0.8947368421052632,
begin=0,
end=32,
by_epoch=True,
convert_to_iter_based=True),
dict(
type='CosineAnnealingMomentum',
T_max=48,
eta_min=1,
begin=32,
end=80,
by_epoch=True,
convert_to_iter_based=True),
]
```
### Hook config
Users can attach hooks to the training, validation, and testing loops to insert some operations during running. There are two different hook fields: one is `default_hooks` and the other is `custom_hooks`.
`default_hooks` 是一个钩子配置字典`default_hooks` 里的钩子是运行时所需要的。它们有默认优先级,是不需要修改的。如果没有设置,执行器使用默认值。如果要禁用默认钩子,用户可以将其配置设置 `None`
`default_hooks` 是一个钩子配置字典,并且这些钩子是运行时所需要的。它们有默认优先级,是不需要修改的。如果设置,执行器使用默认值。如果要禁用默认钩子,用户可以将其配置设置 `None`
```python
default_hooks = dict(
timer=dict(type='IterTimerHook'),
logger=dict(type='LoggerHook', interval=50),
param_scheduler=dict(type='ParamSchedulerHook'),
checkpoint=dict(type='CheckpointHook', interval=-1),
sampler_seed=dict(type='DistSamplerSeedHook'),
visualization=dict(type='Det3DVisualizationHook'))
```
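For example, to switch off one of the defaults in a config that inherits the block above (choosing `visualization` here purely for illustration), override just that entry with `None`:

```python
default_hooks = dict(
    visualization=None)  # disable only the visualization hook; the other inherited defaults remain
```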
`custom_hooks` is a list of configs of other hooks. Users can develop their own hooks and insert them in this field.
```python
custom_hooks = []
```
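For instance, a user-defined hook would be plugged in like this (`MyCustomHook` below is a hypothetical name; the hook class must be implemented and registered by the user first):

```python
custom_hooks = [
    dict(type='MyCustomHook', priority='NORMAL')  # 'MyCustomHook' is a hypothetical user-registered hook
]
```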
### Runtime config
```python
default_scope = 'mmdet3d'  # the default registry scope to find modules. Refer to https://mmengine.readthedocs.io/zh_CN/latest/advanced_tutorials/registry.html
env_cfg = dict(
    cudnn_benchmark=False,  # whether to enable cudnn benchmark
    mp_cfg=dict(  # multi-processing config
        mp_start_method='fork',  # use fork to start multi-processing. 'fork' is usually faster than 'spawn' but may be unsafe. See https://github.com/pytorch/pytorch/issues/1355 for discussion
        opencv_num_threads=0),  # disable opencv multi-threads to avoid overloading the system
    dist_cfg=dict(backend='nccl'))  # distribution config
vis_backends = [dict(type='LocalVisBackend')]  # visualization backends. Refer to https://mmengine.readthedocs.io/zh_CN/latest/advanced_tutorials/visualization.html
visualizer = dict(
type='Det3DLocalVisualizer', vis_backends=vis_backends, name='visualizer')
log_processor = dict(
    type='LogProcessor',  # log processor used to format runtime logs
    window_size=50,  # smoothing window for log values
    by_epoch=True)  # whether to format logs in epoch style. Should be consistent with the type of the train loop.
log_level = 'INFO'  # the level of logging
load_from = None  # load a model checkpoint as a pre-trained model from a given path. This will not resume training.
resume = False  # whether to resume from the checkpoint defined in `load_from`. If `load_from` is None, it will resume the latest checkpoint in `work_dir`.
```
## Config inheritance
There are 4 basic component types under `configs/_base_`: dataset, model, schedule, and default runtime. Many methods such as SECOND, PointPillars, PartA2, and VoteNet can be easily constructed with one of each. The configs that are composed of components from `_base_` are called _primitive_.
For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum inheritance depth is 3.
For easy understanding, we recommend contributors to inherit from existing methods. For example, if some modification is made based on PointPillars, users may first inherit the basic PointPillars structure by specifying `_base_ = '../pointpillars/pointpillars_hv_fpn_sbn-all_8xb4-2x_nus-3d.py'`, then modify the necessary fields in the config file.
If you are building an entirely new method that does not share structure with any existing methods, you may create a new folder such as `xxx_rcnn` under `configs`.
Please refer to the [MMEngine config tutorial](https://mmengine.readthedocs.io/zh_CN/latest/advanced_tutorials/config.html) for detailed documentation.
By setting the `_base_` field, we can set which files the current config file inherits from.

When `_base_` is a string of a file path, it means inheriting the contents of one config file.
```python
_base_ = './pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py'
```
`_base_` 是多个文件路径组成的列表式,表示继承多个文件。
```python
_base_ = [
'../_base_/models/pointpillars_hv_secfpn_kitti.py',
'../_base_/datasets/kitti-3d-3class.py',
'../_base_/schedules/cyclic-40e.py', '../_base_/default_runtime.py'
]
```
If you wish to inspect the config file, you may run `python tools/misc/print_config.py /PATH/TO/CONFIG` to see the complete config.
### Ignore some fields in the base configs

Sometimes, you may set `_delete_=True` to ignore some of the fields in base configs. You may refer to the [MMEngine config tutorial](https://mmengine.readthedocs.io/zh_CN/latest/advanced_tutorials/config.html) for a simple illustration.

In MMDetection3D, for example, to change the neck of the following PointPillars config:
```python
model = dict(
    ...
    pts_neck=dict(
        type='FPN',
        ...),
pts_bbox_head=dict(...))
```
`FPN` and `SECONDFPN` use different keywords to construct:
```python
_base_ = '../_base_/models/pointpillars_hv_fpn_nus.py'
model = dict(
    pts_neck=dict(
        _delete_=True,
        type='SECONDFPN',
        ...),
pts_bbox_head=dict(...))
```
`_delete_=True` replaces all old keys in the `pts_neck` field with the new keys.
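The effect of `_delete_` on the inheritance merge can be sketched in plain Python (a simplified illustration only; MMEngine's real merging logic lives in its `Config` class and handles more cases):

```python
# A simplified, illustrative re-implementation of the inheritance merge
# rule (NOT MMEngine's actual code) to show what `_delete_=True` does.
def merge(base, child):
    """Recursively merge `child` into `base`."""
    if not isinstance(base, dict) or not isinstance(child, dict):
        return child  # scalars and lists are simply overwritten
    if child.get('_delete_', False):
        # Discard everything inherited from `base`; keep only the child's keys.
        return {k: v for k, v in child.items() if k != '_delete_'}
    merged = dict(base)
    for key, value in child.items():
        merged[key] = merge(base.get(key), value)
    return merged

base = dict(pts_neck=dict(type='FPN', num_outs=3))
child = dict(pts_neck=dict(_delete_=True, type='SECONDFPN'))

# With `_delete_=True`, the stale `num_outs` key from the FPN config is gone:
print(merge(base, child)['pts_neck'])  # {'type': 'SECONDFPN'}
```

Without `_delete_`, the same merge would keep `num_outs=3` from the base, which is exactly the stale-key problem described above.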
### Use intermediate variables in configs
Some intermediate variables are used in the config files, like `train_pipeline`/`test_pipeline` in datasets. It's worth noting that when modifying intermediate variables in children configs, users need to pass the intermediate variables into the corresponding fields again. For example, we would like to use a multi-scale strategy to train and test a PointPillars, so `train_pipeline`/`test_pipeline` are the intermediate variables we would like to modify.
```python
_base_ = './nus-3d.py'
train_pipeline = [
    ...
]
test_pipeline = [
    ...
    dict(
        type='MultiScaleFlipAug3D',
        ...
        transforms=[
dict(
type='PointsRangeFilter', point_cloud_range=point_cloud_range)
]),
dict(type='Pack3DDetInputs', keys=['points'])
]
train_dataloader = dict(dataset=dict(pipeline=train_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
test_dataloader = dict(dataset=dict(pipeline=test_pipeline))
```
We first define the new `train_pipeline`/`test_pipeline` and then pass them into the data loader fields.

### Reuse variables in \_base\_ file
If the users want to reuse the variables in the base file, they can get a copy of the corresponding variable by using `{{_base_.xxx}}`. E.g:
```python
_base_ = './pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py'
a = {{_base_.model}}  # variable `a` is equal to the `model` defined in `_base_`
```
## Modify config through script arguments
When submitting jobs using `tools/train.py` or `tools/test.py`, you may specify `--cfg-options` to in-place modify the config.
- Update config keys of dict chains.

  The config options can be specified following the order of the dict keys in the original config. For example, `--cfg-options model.backbone.norm_eval=False` changes all BN modules in the model backbone to `train` mode.
- Update keys inside a list of configs.

  Some config dicts are composed as a list in your config. For example, the training pipeline `train_dataloader.dataset.pipeline` is normally a list, e.g. `[dict(type='LoadPointsFromFile'), ...]`. If you want to change `'LoadPointsFromFile'` to `'LoadPointsFromDict'` in the pipeline, you may specify `--cfg-options data.train.pipeline.0.type=LoadPointsFromDict`.
- Update values of list/tuples.

  Sometimes the value to update is a list or a tuple. For example, the config file normally sets `model.data_preprocessor.mean=[123.675, 116.28, 103.53]`. If you want to change the mean values, you may specify `--cfg-options model.data_preprocessor.mean="[127,127,127]"`. Note that the quotation mark `"` is necessary to support list/tuple data types, and **NO** white space is allowed inside the quotation marks in the specified value.
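The dotted-key syntax above can be sketched in plain Python (a simplified illustration of the idea, not MMEngine's actual `--cfg-options` parser, which additionally handles type conversion and quoting):

```python
# Walk a nested config along a dotted key and set the final entry.
# Numeric path segments index into lists, e.g. `pipeline.0.type`.
def set_by_dotted_key(cfg, dotted_key, value):
    *parents, last = dotted_key.split('.')
    node = cfg
    for key in parents:
        node = node[int(key)] if key.isdigit() else node[key]
    if last.isdigit():
        node[int(last)] = value
    else:
        node[last] = value

cfg = dict(
    model=dict(backbone=dict(norm_eval=True)),
    train_dataloader=dict(
        dataset=dict(pipeline=[dict(type='LoadPointsFromFile')])))

set_by_dotted_key(cfg, 'model.backbone.norm_eval', False)
set_by_dotted_key(cfg, 'train_dataloader.dataset.pipeline.0.type', 'LoadPointsFromDict')
print(cfg['model']['backbone']['norm_eval'])  # False
```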
## Config name style

We follow the below style to name config files. Contributors are advised to follow the same style.
```
{algorithm name}_{model component names [component1]_[component2]_[...]}_{training settings}_{training dataset information}_{testing dataset information}.py
```
The file name is divided into five parts. All parts and components are connected with `_`, and words within each part or component should be connected with `-`.
- `{algorithm name}`: the name of the algorithm. It can be a detector name such as `pointpillars`, `fcos3d`, etc.
- `{model component names}`: names of the components used in the algorithm such as voxel_encoder, backbone, neck, etc. For example, `second_secfpn_head-dcn-circlenms` means using SECOND's SparseEncoder, SECONDFPN, and a detection head with DCN and circle NMS.
- `{training settings}`: information of training settings such as batch size, augmentations, loss trick, scheduler, and epochs/iterations. For example, `8xb4-tta-cyclic-20e` means using 8 GPUs, 4 data samples per GPU, test time augmentation, cosine annealing learning rate, and training for 20 epochs. Some abbreviations:
  - `{gpu x batch_per_gpu}`: GPUs and samples per GPU. `bN` indicates N batch size per GPU. E.g. `4xb4` is the short term of 4 GPUs x 4 samples per GPU.
  - `{schedule}`: training schedule, options are `schedule-2x`, `schedule-3x`, `cyclic-20e`, etc. `schedule-2x` and `schedule-3x` represent 24 epochs and 36 epochs respectively. `cyclic-20e` represents 20 epochs.
- `{training dataset information}`: training dataset names like `kitti-3d-3class`, `nus-3d`, `s3dis-seg`, `scannet-seg`, `waymoD5-3d-car`. Here `3d` means the dataset is used for 3D object detection, and `seg` means the dataset is used for point cloud segmentation.
- `{testing dataset information}` (optional): testing dataset name for models trained on one dataset but tested on another. If not mentioned, it means the model was trained and tested on the same dataset type.
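As a quick illustration of the scheme (assuming a simple name without the optional testing-dataset suffix; the first `_`-separated part is the algorithm and the last two are the training settings and dataset):

```python
# Split a config file stem into the naming-scheme parts described above.
stem = 'pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class'
parts = stem.split('_')

algorithm = parts[0]           # 'pointpillars'
components = parts[1:-2]       # ['hv', 'secfpn']
training_settings = parts[-2]  # '8xb6-160e': 8 GPUs x 6 samples per GPU, 160 epochs
dataset = parts[-1]            # 'kitti-3d-3class'
print(algorithm, components, training_settings, dataset)
```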