"examples/vscode:/vscode.git/clone" did not exist on "6fedb29f1113e734c47360a04a2d52312a1dd7bc"
Unverified Commit 4fb54412 authored by twang, committed by GitHub

[Enhance] Refactor doc (#242)

* Refactor get_started.md primarily

* Create data_preparation.md

* Refactor model_zoo.md

* Remove the title

* Put Data Preparation under Get Started

* Adjust the level of Model Zoo

* Create 1_exist_data_model.md

* Create 2_new_data_model.md

* Update 1_exist_data_model.md

* Create useful_tools.md

* Update index.rst

* Delete install.md

* Adjust the order of get started subsections

* First complete version for customized datasets

* Update Tutorial 1

* Moved into Tutorials

* Update titles of tutorial 3

* Update 2_new_data_model.md

* Update 1_exist_data_model.md

* Update links to relative paths

* Tutorial 2 revised

* Delete finetune.md

* Create customize_models.md

* Create customize_runtime.md

* Update index.rst

* Update waymo.md

* Delete new_modules.md

* Preliminary update

* Update customize_models.md

* Update useful_tools.md

* Update title

* Fix a typo of tutorial 4

* Update level of titles

* Merge verification and demo

* Update 1_exist_data_model.md

* Update [Dataset Conversion]

* Update 2_new_data_model.md

* Update customize_dataset.md

* Update the example of customized dataset

* Update customize_dataset.md

* Enhance the doc structure

* Update README.md

* Change the abs path to relative path
parent dd74422e
@@ -28,8 +28,11 @@ scannet
 ├── scans
 ├── scannet_train_instance_data
 ├── points
+│   ├── xxxxx.bin
 ├── instance_mask
+│   ├── xxxxx.bin
 ├── semantic_mask
+│   ├── xxxxx.bin
 ├── scannet_infos_train.pkl
 ├── scannet_infos_val.pkl
......
# 1: Inference and train with existing models and standard datasets
## Inference with existing models
Here we provide testing scripts to evaluate a whole dataset (SUNRGBD, ScanNet, KITTI, etc.).
For high-level APIs that are easier to integrate into other projects, as well as basic demos, please refer to Verification/Demo under [Get Started](./getting_started.md).
### Test existing models on standard datasets
- single GPU
- single node with multiple GPUs
- multiple nodes
You can use the following commands to test a dataset.
```shell
# single-gpu testing
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]
# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
```
Optional arguments:
- `RESULT_FILE`: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.
- `EVAL_METRICS`: Items to be evaluated on the results. Allowed values depend on the dataset, e.g., `proposal_fast`, `proposal`, `bbox`, `segm` are available for COCO, `mAP`, `recall` for PASCAL VOC. Cityscapes could be evaluated by `cityscapes` as well as all COCO metrics.
- `--show`: If specified, detection results will be plotted in silent mode. It is only applicable to single GPU testing and is used for debugging and visualization. This should be used together with `--show-dir`.
- `--show-dir`: If specified, detection results will be saved as `***_points.obj` and `***_pred.ply` files in the specified directory. It is only applicable to single GPU testing and is used for debugging and visualization. You do NOT need a GUI available in your environment to use this option.
Examples:
Assume that you have already downloaded the checkpoints to the directory `checkpoints/`.
1. Test votenet on ScanNet and save the points and prediction visualization results.
```shell
python tools/test.py configs/votenet/votenet_8x8_scannet-3d-18class.py \
checkpoints/votenet_8x8_scannet-3d-18class_20200620_230238-2cea9c3a.pth \
--show --show-dir ./data/scannet/show_results
```
2. Test votenet on ScanNet, save the points, prediction and ground truth visualization results, and evaluate the mAP.
```shell
python tools/test.py configs/votenet/votenet_8x8_scannet-3d-18class.py \
checkpoints/votenet_8x8_scannet-3d-18class_20200620_230238-2cea9c3a.pth \
--eval mAP \
--options 'show=True' 'out_dir=./data/scannet/show_results'
```
3. Test votenet on ScanNet (without saving the test results) and evaluate the mAP.
```shell
python tools/test.py configs/votenet/votenet_8x8_scannet-3d-18class.py \
checkpoints/votenet_8x8_scannet-3d-18class_20200620_230238-2cea9c3a.pth \
--eval mAP
```
4. Test SECOND with 8 GPUs, and evaluate the mAP.
```shell
./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/second/hv_second_secfpn_6x8_80e_kitti-3d-3class.py \
checkpoints/hv_second_secfpn_6x8_80e_kitti-3d-3class_20200620_230238-9208083a.pth \
--out results.pkl --eval mAP
```
5. Test PointPillars on nuScenes with 8 GPUs, and generate the json file to be submitted to the official evaluation server.
```shell
./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py \
checkpoints/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20200620_230405-2fa62f3d.pth \
--format-only --options 'jsonfile_prefix=./pointpillars_nuscenes_results'
```
The generated results will be under the `./pointpillars_nuscenes_results` directory.
6. Test SECOND on KITTI with 8 GPUs, and generate the pkl files and submission data to be submitted to the official evaluation server.
```shell
./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/second/hv_second_secfpn_6x8_80e_kitti-3d-3class.py \
checkpoints/hv_second_secfpn_6x8_80e_kitti-3d-3class_20200620_230238-9208083a.pth \
--format-only --options 'pklfile_prefix=./second_kitti_results' 'submission_prefix=./second_kitti_results'
```
The generated results will be under the `./second_kitti_results` directory.
7. Test PointPillars on Lyft with 8 GPUs, generate the pkl files and make a submission to the leaderboard.
```shell
./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_fpn_sbn-2x8_2x_lyft-3d.py \
checkpoints/hv_pointpillars_fpn_sbn-2x8_2x_lyft-3d_latest.pth --out results/pp_lyft/results_challenge.pkl \
--format-only --options 'jsonfile_prefix=results/pp_lyft/results_challenge' \
'csv_path=results/pp_lyft/results_challenge.csv'
```
**Notice**: To generate submissions on Lyft, `csv_path` must be given in the options. After generating the csv file, you can make a submission with kaggle commands given on the [website](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/submit).
8. Test PointPillars on Waymo with 8 GPUs, and evaluate the mAP with Waymo metrics.
```shell
./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car.py \
checkpoints/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car_latest.pth --out results/waymo-car/results_eval.pkl \
--eval waymo --options 'pklfile_prefix=results/waymo-car/kitti_results' \
'submission_prefix=results/waymo-car/kitti_results'
```
**Notice**: For evaluation on Waymo, please follow the [instruction](https://github.com/waymo-research/waymo-open-dataset/blob/master/docs/quick_start.md) to build the binary file `compute_detection_metrics_main` for metrics computation and put it into `mmdet3d/core/evaluation/waymo_utils/`. (Sometimes when using bazel to build `compute_detection_metrics_main`, an error `'round' is not a member of 'std'` may appear; in that case, simply remove the `std::` before `round` in that file.) `pklfile_prefix` should be given in the options for the bin file generation. For metrics, `waymo` is the recommended official evaluation protocol. Currently, evaluating with the `kitti` choice is adapted from KITTI, and the results for each difficulty are not exactly the same as defined by KITTI; instead, most objects are currently marked with difficulty 0, which will be fixed in the future. The reasons for its instability include the large computation required for evaluation, the lack of occlusion and truncation in the converted data, the different definition of difficulty, and the different methods of computing average precision.
9. Test PointPillars on Waymo with 8 GPUs, generate the bin files and make a submission to the leaderboard.
```shell
./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car.py \
checkpoints/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car_latest.pth --out results/waymo-car/results_eval.pkl \
--format-only --options 'pklfile_prefix=results/waymo-car/kitti_results' \
'submission_prefix=results/waymo-car/kitti_results'
```
**Notice**: After generating the bin file, you can simply build the binary file `create_submission` and use it to create a submission file by following the [instruction](https://github.com/waymo-research/waymo-open-dataset/blob/master/docs/quick_start.md). For evaluation on the validation set with the eval server, you can generate a submission in the same way.
## Train predefined models on standard datasets
MMDetection implements distributed training and non-distributed training,
which uses `MMDistributedDataParallel` and `MMDataParallel` respectively.
All outputs (log files and checkpoints) will be saved to the working directory,
which is specified by `work_dir` in the config file.
By default, we evaluate the model on the validation set after each epoch; you can change the evaluation interval by adding the `interval` argument in the training config.
```python
evaluation = dict(interval=12)  # This evaluates the model every 12 epochs.
```
**Important**: The default learning rate in config files is for 8 GPUs and the exact batch size is marked by the config's file name, e.g. '2x8' means 2 samples per GPU using 8 GPUs.
According to the [Linear Scaling Rule](https://arxiv.org/abs/1706.02677), you need to set the learning rate proportional to the batch size if you use different GPUs or images per GPU, e.g., lr=0.01 for 4 GPUs * 2 img/gpu and lr=0.08 for 16 GPUs * 4 img/gpu. However, since most of the models in this repo use Adam rather than SGD for optimization, the rule may not hold and users need to tune the learning rate by themselves.
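As an illustration of the rule, here is a hedged sketch assuming a base config that uses SGD with `lr=0.02` for 8 GPUs * 2 samples per GPU (the actual optimizer and values in any given config may differ):
```python
# 4 GPUs * 2 samples/GPU is half the assumed base batch size (8 * 2),
# so the linearly scaled learning rate is halved as well.
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
```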
### Train with a single GPU
```shell
python tools/train.py ${CONFIG_FILE} [optional arguments]
```
If you want to specify the working directory in the command, you can add the argument `--work-dir ${YOUR_WORK_DIR}`.
### Train with multiple GPUs
```shell
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
```
Optional arguments are:
- `--no-validate` (**not suggested**): By default, the codebase will perform evaluation every k epochs during training (the default value of k is 1 and can be modified like [this](https://github.com/open-mmlab/mmdetection/blob/master/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py#L174)). To disable this behavior, use `--no-validate`.
- `--work-dir ${WORK_DIR}`: Override the working directory specified in the config file.
- `--resume-from ${CHECKPOINT_FILE}`: Resume from a previous checkpoint file.
- `--options 'Key=value'`: Override some settings in the used config.
Difference between `resume-from` and `load-from`:
`resume-from` loads both the model weights and optimizer status, and the epoch is also inherited from the specified checkpoint. It is usually used for resuming the training process that is interrupted accidentally.
`load-from` only loads the model weights and the training epoch starts from 0. It is usually used for finetuning.
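For example, a hedged sketch of both cases (the checkpoint paths below are placeholders):
```shell
# Resume an interrupted run: weights, optimizer state and epoch are restored.
python tools/train.py ${CONFIG_FILE} --resume-from work_dirs/my_exp/latest.pth
```
For finetuning, `load_from` is set in the config rather than on the command line, e.g. `load_from = 'checkpoints/pretrained.pth'` (placeholder path).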
### Train with multiple machines
If you run MMDetection on a cluster managed with [slurm](https://slurm.schedmd.com/), you can use the script `slurm_train.sh`. (This script also supports single machine training.)
```shell
[GPUS=${GPUS}] ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}
```
Here is an example of using 16 GPUs to train Mask R-CNN on the dev partition.
```shell
GPUS=16 ./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x
```
You can check [slurm_train.sh](https://github.com/open-mmlab/mmdetection/blob/master/tools/slurm_train.sh) for full arguments and environment variables.
If you have multiple machines connected only with ethernet, you can refer to the
PyTorch [launch utility](https://pytorch.org/docs/stable/distributed_deprecated.html#launch-utility).
Usually it is slow if you do not have high speed networking like InfiniBand.
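For reference, a hedged sketch of launching on two ethernet-connected machines with 8 GPUs each via PyTorch's launch utility (the addresses, ports and exact invocation are placeholders to adapt):
```shell
# On the first machine (rank 0); 10.0.0.1 is a placeholder master address.
python -m torch.distributed.launch --nnodes=2 --node_rank=0 --master_addr=10.0.0.1 \
    --master_port=29500 --nproc_per_node=8 tools/train.py ${CONFIG_FILE} --launcher pytorch
# On the second machine (rank 1), using the same master address and port.
python -m torch.distributed.launch --nnodes=2 --node_rank=1 --master_addr=10.0.0.1 \
    --master_port=29500 --nproc_per_node=8 tools/train.py ${CONFIG_FILE} --launcher pytorch
```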
### Launch multiple jobs on a single machine
If you launch multiple jobs on a single machine, e.g., 2 jobs of 4-GPU training on a machine with 8 GPUs,
you need to specify different ports (29500 by default) for each job to avoid communication conflict.
If you use `dist_train.sh` to launch training jobs, you can set the port in commands.
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4
```
If you launch training jobs with Slurm, there are two ways to specify the ports.
1. Set the port through `--options`. This is recommended since it does not change the original configs.
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR} --options 'dist_params.port=29500'
CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR} --options 'dist_params.port=29501'
```
2. Modify the config files (usually the 6th line from the bottom in config files) to set different communication ports.
In `config1.py`,
```python
dist_params = dict(backend='nccl', port=29500)
```
In `config2.py`,
```python
dist_params = dict(backend='nccl', port=29501)
```
Then you can launch two jobs with `config1.py` and `config2.py`.
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}
```
# 2: Train with customized datasets
In this note, you will learn how to train and test predefined models on customized datasets. We use the Waymo dataset as an example to describe the whole process.
The basic steps are as below:
1. Prepare the customized dataset
2. Prepare a config
3. Train, test, and run inference with models on the customized dataset.
## Prepare the customized dataset
There are three ways to support a new dataset in MMDetection3D:
1. Reorganize the dataset into an existing format.
2. Reorganize the dataset into a middle format.
3. Implement a new dataset.
We usually recommend the first two methods, which are easier than the third.
In this note, we give an example for converting the data into KITTI format.
**Note**: We take Waymo as the example here because its format is totally different from other existing formats. For other datasets that use similar methods to organize data, like Lyft compared to nuScenes, it would be easier to directly implement a new data converter (the second approach above) instead of converting the data to another format (the first approach above).
### KITTI dataset format
Firstly, the raw data for 3D object detection from KITTI are typically organized as follows, where `ImageSets` contains split files indicating which files belong to training/validation/testing set, `calib` contains calibration information files, `image_2` and `velodyne` include image data and point cloud data, and `label_2` includes label files for 3D detection.
```
mmdetection3d
├── mmdet3d
├── tools
├── configs
├── data
│ ├── kitti
│ │ ├── ImageSets
│ │ ├── testing
│ │ │ ├── calib
│ │ │ ├── image_2
│ │ │ ├── velodyne
│ │ ├── training
│ │ │ ├── calib
│ │ │ ├── image_2
│ │ │ ├── label_2
│ │ │ ├── velodyne
```
Specific annotation format is described in the official object development [kit](https://s3.eu-central-1.amazonaws.com/avg-kitti/devkit_object.zip). For example, it consists of the following labels:
```
#Values Name Description
----------------------------------------------------------------------------
1 type Describes the type of object: 'Car', 'Van', 'Truck',
'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram',
'Misc' or 'DontCare'
1 truncated Float from 0 (non-truncated) to 1 (truncated), where
truncated refers to the object leaving image boundaries
1 occluded Integer (0,1,2,3) indicating occlusion state:
0 = fully visible, 1 = partly occluded
2 = largely occluded, 3 = unknown
1 alpha Observation angle of object, ranging [-pi..pi]
4 bbox 2D bounding box of object in the image (0-based index):
contains left, top, right, bottom pixel coordinates
3 dimensions 3D object dimensions: height, width, length (in meters)
3 location 3D object location x,y,z in camera coordinates (in meters)
1 rotation_y Rotation ry around Y-axis in camera coordinates [-pi..pi]
1 score Only for results: Float, indicating confidence in
detection, needed for p/r curves, higher is better.
```
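As an illustration (a made-up line, not taken from any real KITTI label file), a single object annotation then looks like:
```
Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59
```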
Assume we use the Waymo dataset.
After downloading the data, we need to implement a function to convert both the input data and annotation format into the KITTI style. Then we can implement `WaymoDataset`, inherited from `KittiDataset`, to load the data and perform training and evaluation.
Specifically, we implement a Waymo [converter](../tools/data_converter/waymo_converter.py) to convert Waymo data into KITTI format and a Waymo dataset [class](../mmdet3d/datasets/waymo_dataset.py) to process it. Because we preprocess the raw data and reorganize it like KITTI, the dataset class can be implemented more easily by inheriting from `KittiDataset`. The last thing to note is the evaluation protocol you would like to use. Because Waymo has its own evaluation approach, we further incorporate it into our dataset class. Afterwards, users can successfully convert the data format and use `WaymoDataset` to train and evaluate the model.
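As a minimal sketch (not the actual implementation; the class name and evaluation hook below are illustrative), such a dataset class can reuse the KITTI loading logic like this:
```python
from mmdet.datasets import DATASETS
from .kitti_dataset import KittiDataset


@DATASETS.register_module()
class MyWaymoStyleDataset(KittiDataset):
    """Hypothetical dataset that reuses the KITTI-style loading logic."""

    CLASSES = ('Car', 'Cyclist', 'Pedestrian')

    def evaluate(self, results, metric='waymo', **kwargs):
        # Hook the dataset-specific evaluation protocol in here; the real
        # WaymoDataset implements Waymo's official metrics.
        raise NotImplementedError
```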
For more details about the intermediate results of preprocessing of Waymo dataset, please refer to its [tutorial](./tutorials/waymo.md).
## Prepare a config
The second step is to prepare configs such that the dataset could be successfully loaded. In addition, adjusting hyperparameters is usually necessary to obtain decent performance in 3D detection.
Suppose we would like to train PointPillars on Waymo to achieve 3D detection for 3 classes (vehicle, cyclist and pedestrian). We need to prepare a dataset config like [this](../mmdet3d/datasets/waymo_dataset.py), a model config like [this](../configs/_base_/models/hv_pointpillars_secfpn_waymo.py), and combine them like [this](../configs/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class.py), compared to the KITTI [dataset config](../configs/_base_/datasets/kitti-3d-3class.py), [model config](../configs/_base_/models/hv_pointpillars_secfpn_kitti.py) and [overall config](../configs/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class.py).
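For a sense of what the combined config looks like, here is a hedged sketch of a top-level config inheriting base configs (treat the exact base file names as assumptions and check `configs/_base_/` for the real ones):
```python
# Hypothetical top-level config assembling dataset, model, schedule and runtime bases.
_base_ = [
    '../_base_/models/hv_pointpillars_secfpn_waymo.py',  # model config referenced above
    '../_base_/datasets/waymoD5-3d-3class.py',           # assumed dataset base config
    '../_base_/schedules/schedule_2x.py',                # assumed schedule config
    '../_base_/default_runtime.py',
]
```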
## Train a new model
To train a model with the new config, you can simply run
```shell
python tools/train.py configs/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class.py
```
For more detailed usages, please refer to the [Case 1](1_exist_data_model.md).
## Test and inference
To test the trained model, you can simply run
```shell
python tools/test.py configs/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class.py work_dirs/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class/latest.pth --eval waymo
```
**Note**: To use the Waymo evaluation protocol, you need to follow the [tutorial](tutorials/waymo.md) and prepare the files related to metrics computation as the official instructions describe.
For more detailed usages for test and inference, please refer to the [Case 1](1_exist_data_model.md).
# Dataset Preparation
## Before Preparation
It is recommended to symlink the dataset root to `$MMDETECTION3D/data`.
If your folder structure is different from the following, you may need to change the corresponding paths in config files.
```
mmdetection3d
├── mmdet3d
├── tools
├── configs
├── data
│ ├── nuscenes
│ │ ├── maps
│ │ ├── samples
│ │ ├── sweeps
│ │ ├── v1.0-test
│ │ ├── v1.0-trainval
│ ├── kitti
│ │ ├── ImageSets
│ │ ├── testing
│ │ │ ├── calib
│ │ │ ├── image_2
│ │ │ ├── velodyne
│ │ ├── training
│ │ │ ├── calib
│ │ │ ├── image_2
│ │ │ ├── label_2
│ │ │ ├── velodyne
│ ├── waymo
│ │ ├── waymo_format
│ │ │ ├── training
│ │ │ ├── validation
│ │ │ ├── testing
│ │ │ ├── gt.bin
│ │ ├── kitti_format
│ │ │ ├── ImageSets
│ ├── lyft
│ │ ├── v1.01-train
│ │ │ ├── v1.01-train (train_data)
│ │ │ ├── lidar (train_lidar)
│ │ │ ├── images (train_images)
│ │ │ ├── maps (train_maps)
│ │ ├── v1.01-test
│ │ │ ├── v1.01-test (test_data)
│ │ │ ├── lidar (test_lidar)
│ │ │ ├── images (test_images)
│ │ │ ├── maps (test_maps)
│ │ ├── train.txt
│ │ ├── val.txt
│ │ ├── test.txt
│ │ ├── sample_submission.csv
│ ├── scannet
│ │ ├── meta_data
│ │ ├── scans
│ │ ├── batch_load_scannet_data.py
│ │ ├── load_scannet_data.py
│ │ ├── scannet_utils.py
│ │ ├── README.md
│ ├── sunrgbd
│ │ ├── OFFICIAL_SUNRGBD
│ │ ├── matlab
│ │ ├── sunrgbd_data.py
│ │ ├── sunrgbd_utils.py
│ │ ├── README.md
```
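For example, a hedged sketch of symlinking already-downloaded datasets into the repo (the source paths are placeholders):
```bash
mkdir -p data
# Link raw dataset folders from wherever they are stored into ./data.
ln -s /path/to/datasets/kitti ./data/kitti
ln -s /path/to/datasets/nuscenes ./data/nuscenes
```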
## Download and Data Preparation
### KITTI
Download KITTI 3D detection data [HERE](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d). Prepare kitti data by running
```bash
mkdir ./data/kitti/ && mkdir ./data/kitti/ImageSets
# Download data split
wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/test.txt --no-check-certificate --content-disposition -O ./data/kitti/ImageSets/test.txt
wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/train.txt --no-check-certificate --content-disposition -O ./data/kitti/ImageSets/train.txt
wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/val.txt --no-check-certificate --content-disposition -O ./data/kitti/ImageSets/val.txt
wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/trainval.txt --no-check-certificate --content-disposition -O ./data/kitti/ImageSets/trainval.txt
python tools/create_data.py kitti --root-path ./data/kitti --out-dir ./data/kitti --extra-tag kitti
```
### Waymo
Download Waymo open dataset V1.2 [HERE](https://waymo.com/open/download/) and its data split [HERE](https://drive.google.com/drive/folders/18BVuF_RYJF0NjZpt8SnfzANiakoRMf0o?usp=sharing). Then put the tfrecord files into the corresponding folders in `data/waymo/waymo_format/` and put the data split txt files into `data/waymo/kitti_format/ImageSets`. Download the ground truth bin file for the validation set [HERE](https://console.cloud.google.com/storage/browser/waymo_open_dataset_v_1_2_0/validation/ground_truth_objects) and put it into `data/waymo/waymo_format/`. A tip: you can use `gsutil` to download this large-scale dataset from the command line; you can take this [tool](https://github.com/RalphMao/Waymo-Dataset-Tool) as an example for more details. Subsequently, prepare Waymo data by running
```bash
python tools/create_data.py waymo --root-path ./data/waymo/ --out-dir ./data/waymo/ --workers 128 --extra-tag waymo
```
Note that if your local disk does not have enough space for saving converted data, you can change the `out-dir` to anywhere else. Just remember to create folders and prepare data there in advance and link them back to `data/waymo/kitti_format` after the data conversion.
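For example, a hedged sketch of converting onto a larger disk and linking the result back (the target path is a placeholder):
```bash
python tools/create_data.py waymo --root-path ./data/waymo/ --out-dir /path/to/big_disk/waymo/ --workers 128 --extra-tag waymo
# Link the converted data back so configs keep working with the default path.
ln -s /path/to/big_disk/waymo/kitti_format ./data/waymo/kitti_format
```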
### NuScenes
Download the nuScenes V1.0 full dataset [HERE](https://www.nuscenes.org/download). Prepare nuScenes data by running
```bash
python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes
```
### Lyft
Download Lyft 3D detection data [HERE](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/data). Prepare Lyft data by running
```bash
python tools/create_data.py lyft --root-path ./data/lyft --out-dir ./data/lyft --extra-tag lyft --version v1.01
```
Note that we follow the original folder names for clear organization. Please rename the raw folders as shown above.
### ScanNet and SUN RGB-D
To prepare ScanNet data, please see [scannet](../data/scannet/README.md).
To prepare SUN RGB-D data, please see [sunrgbd](../data/sunrgbd/README.md).
### Customized Datasets
For using custom datasets, please refer to [Tutorials 2: Customize Datasets](tutorials/new_dataset.md).
@@ -3,17 +3,18 @@ Welcome to MMDetection3D's documentation!
 .. toctree::
    :maxdepth: 2
+   :caption: Get Started
 
-   install.md
    getting_started.md
    model_zoo.md
+   data_preparation.md
 
 .. toctree::
    :maxdepth: 2
-   :caption: Notes
+   :caption: Quick Run
 
-   benchmarks.md
-   config.md
+   1_exist_data_model.md
+   2_new_data_model.md
 
 .. toctree::
    :maxdepth: 2
@@ -21,6 +22,18 @@ Welcome to MMDetection3D's documentation!
    tutorials/index.rst
 
+.. toctree::
+   :maxdepth: 2
+   :caption: Useful Tools and Scripts
+
+   useful_tools.md
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Notes
+
+   benchmarks.md
+
 .. toctree::
    :caption: API Reference
......
## Installation
### Requirements
- Linux or macOS (Windows is not currently officially supported)
- Python 3.6+
- PyTorch 1.3+
- CUDA 9.2+ (If you build PyTorch from source, CUDA 9.0 is also compatible)
- GCC 5+
- [mmcv](https://github.com/open-mmlab/mmcv)
### Install MMDetection3D
a. Create a conda virtual environment and activate it.
```shell
conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab
```
b. Install PyTorch and torchvision following the [official instructions](https://pytorch.org/), e.g.,
```shell
conda install pytorch torchvision -c pytorch
```
Note: Make sure that your compilation CUDA version and runtime CUDA version match.
You can check the supported CUDA version for precompiled packages on the [PyTorch website](https://pytorch.org/).
`E.g.1` If you have CUDA 10.1 installed under `/usr/local/cuda` and would like to install
PyTorch 1.5, you need to install the prebuilt PyTorch with CUDA 10.1.
```shell
conda install pytorch cudatoolkit=10.1 torchvision -c pytorch
```
`E.g. 2` If you have CUDA 9.2 installed under `/usr/local/cuda` and would like to install
PyTorch 1.3.1, you need to install the prebuilt PyTorch with CUDA 9.2.
```shell
conda install pytorch=1.3.1 cudatoolkit=9.2 torchvision=0.4.2 -c pytorch
```
If you build PyTorch from source instead of installing the prebuilt package,
you can use more CUDA versions such as 9.0.
c. Install [MMCV](https://mmcv.readthedocs.io/en/latest/).
*mmcv-full* is necessary since MMDetection3D relies on MMDetection, and the CUDA ops in *mmcv-full* are required.
The pre-built *mmcv-full* can be installed by running (available versions can be found [here](https://mmcv.readthedocs.io/en/latest/#install-with-pip)):
```shell
pip install mmcv-full==latest+torch1.5.0+cu101 -f https://download.openmmlab.com/mmcv/dist/index.html
```
Optionally, you could also build the full version from source:
```shell
pip install mmcv-full
```
d. Install [MMDetection](https://github.com/open-mmlab/mmdetection).
```shell
pip install git+https://github.com/open-mmlab/mmdetection.git
```
Optionally, you could also build MMDetection from source in case you want to modify the code:
```shell
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e . # or "python setup.py develop"
```
**Important**:
1. The required versions of MMCV and MMDetection for different versions of MMDetection3D are as below. Please install the correct version of MMCV and MMDetection to avoid installation issues.
| MMDetection3D version | MMDetection version | MMCV version |
|:-------------------:|:-------------------:|:-------------------:|
| master | mmdet>=2.5.0 | mmcv-full>=1.1.5, <=1.3|
| 0.8.0 | mmdet>=2.5.0 | mmcv-full>=1.1.5, <=1.3|
| 0.7.0 | mmdet>=2.5.0 | mmcv-full>=1.1.5, <=1.3|
| 0.6.0 | mmdet>=2.4.0 | mmcv-full>=1.1.3, <=1.2|
| 0.5.0 | 2.3.0 | mmcv-full==1.0.5|
e. Clone the MMDetection3D repository.
```shell
git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
```
f. Install build requirements and then install MMDetection3D.
```shell
pip install -v -e . # or "python setup.py develop"
```
Note:
1. The git commit id will be written to the version number in step f, e.g. 0.6.0+2e7045c. The version will also be saved in trained models.
It is recommended that you run step f each time you pull some updates from GitHub. If C++/CUDA code is modified, then this step is compulsory.
> Important: Be sure to remove the `./build` folder if you reinstall mmdet3d with a different CUDA/PyTorch version.
```shell
pip uninstall mmdet3d
rm -rf ./build
find . -name "*.so" | xargs rm
```
2. Following the above instructions, MMDetection3D is installed in `dev` mode, so any local modifications made to the code will take effect without the need to reinstall it (unless you submit some commits and want to update the version number).
3. If you would like to use `opencv-python-headless` instead of `opencv-python`,
you can install it before installing MMCV.
4. Some dependencies are optional. Simply running `pip install -v -e .` will only install the minimum runtime requirements. To use optional dependencies like `albumentations` and `imagecorruptions` either install them manually with `pip install -r requirements/optional.txt` or specify desired extras when calling `pip` (e.g. `pip install -v -e .[optional]`). Valid keys for the extras field are: `all`, `tests`, `build`, and `optional`.
5. The code cannot be built in a CPU-only environment (where CUDA isn't available) for now.
### Another option: Docker Image
We provide a [Dockerfile](https://github.com/open-mmlab/mmdetection3d/blob/master/docker/Dockerfile) to build an image.
```shell
# build an image with PyTorch 1.6, CUDA 10.1
docker build -t mmdetection3d docker/
```
Run it with
```shell
docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/mmdetection3d/data mmdetection3d
```
### A from-scratch setup script
Here is a full script for setting up MMDetection3D with conda.
```shell
conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab
# install latest pytorch prebuilt with the default prebuilt CUDA version (usually the latest)
conda install -c pytorch pytorch torchvision -y
# install mmcv
pip install mmcv-full
# install mmdetection
pip install git+https://github.com/open-mmlab/mmdetection.git
# install mmdetection3d
git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
pip install -v -e .
```
### Using multiple MMDetection3D versions
The train and test scripts already modify the `PYTHONPATH` to ensure the scripts use the MMDetection3D in the current directory.
To use the default MMDetection3D installed in the environment rather than the one you are working with, you can remove the following line from those scripts:
```shell
PYTHONPATH="$(dirname $0)/..":$PYTHONPATH
```
-# Config System
+# Tutorial 1: Learn about Configs
 We incorporate modular and inheritance design into our config system, which is convenient to conduct various experiments.
 If you wish to inspect the config file, you may run `python tools/print_config.py /PATH/TO/CONFIG` to see the complete config.
 You may also pass `--options xxx.yyy=zzz` to see updated config.
......
# Tutorial 2: Customize Datasets
## Support new data format
To support a new data format, you can either convert it to an existing format or directly convert it to the middle format. You could also choose to convert it offline (before training, by a script) or online (by implementing a new dataset that does the conversion during training). In MMDetection3D, for data that is inconvenient to read directly online, we recommend converting it into KITTI format and doing the conversion offline, so that you only need to modify the config's data annotation paths and classes after the conversion.
For data that shares a similar format with existing datasets, like Lyft compared to nuScenes, we recommend directly implementing a data converter and a dataset class. During the procedure, inheritance can be used to reduce the implementation workload.
### Reorganize new data formats to existing format
For data that is inconvenient to read directly online, the simplest way is to convert your dataset to existing dataset formats.
Typically we need a data converter to reorganize the raw data and convert the annotation format into KITTI style. Then a new dataset class inherited from existing ones is sometimes necessary for dealing with specific differences between datasets. Finally, users need to further modify the config files to use the dataset. An [example](../2_new_data_model.md) of training predefined models on the Waymo dataset by converting it into KITTI style can be taken as a reference.
### Reorganize new data format to middle format
It is also fine if you do not want to convert the annotation format to existing formats.
Actually, we convert all the supported datasets into pickle files, which summarize useful information for model training and inference.
The annotation of a dataset is a list of dicts, where each dict corresponds to a frame.
A basic example (used in KITTI) is as follows. A frame consists of several keys, like `image`, `point_cloud`, `calib` and `annos`.
As long as we can directly read data according to this information, the organization of the raw data can also be different from existing ones.
With this design, we provide an alternative choice for customizing datasets.
```python
[
{'image': {'image_idx': 0, 'image_path': 'training/image_2/000000.png', 'image_shape': array([ 370, 1224], dtype=int32)},
'point_cloud': {'num_features': 4, 'velodyne_path': 'training/velodyne/000000.bin'},
'calib': {'P0': array([[707.0493, 0. , 604.0814, 0. ],
[ 0. , 707.0493, 180.5066, 0. ],
[ 0. , 0. , 1. , 0. ],
[ 0. , 0. , 0. , 1. ]]),
'P1': array([[ 707.0493, 0. , 604.0814, -379.7842],
[ 0. , 707.0493, 180.5066, 0. ],
[ 0. , 0. , 1. , 0. ],
[ 0. , 0. , 0. , 1. ]]),
'P2': array([[ 7.070493e+02, 0.000000e+00, 6.040814e+02, 4.575831e+01],
[ 0.000000e+00, 7.070493e+02, 1.805066e+02, -3.454157e-01],
[ 0.000000e+00, 0.000000e+00, 1.000000e+00, 4.981016e-03],
[ 0.000000e+00, 0.000000e+00, 0.000000e+00, 1.000000e+00]]),
'P3': array([[ 7.070493e+02, 0.000000e+00, 6.040814e+02, -3.341081e+02],
[ 0.000000e+00, 7.070493e+02, 1.805066e+02, 2.330660e+00],
[ 0.000000e+00, 0.000000e+00, 1.000000e+00, 3.201153e-03],
[ 0.000000e+00, 0.000000e+00, 0.000000e+00, 1.000000e+00]]),
'R0_rect': array([[ 0.9999128 , 0.01009263, -0.00851193, 0. ],
[-0.01012729, 0.9999406 , -0.00403767, 0. ],
[ 0.00847068, 0.00412352, 0.9999556 , 0. ],
[ 0. , 0. , 0. , 1. ]]),
'Tr_velo_to_cam': array([[ 0.00692796, -0.9999722 , -0.00275783, -0.02457729],
[-0.00116298, 0.00274984, -0.9999955 , -0.06127237],
[ 0.9999753 , 0.00693114, -0.0011439 , -0.3321029 ],
[ 0. , 0. , 0. , 1. ]]),
'Tr_imu_to_velo': array([[ 9.999976e-01, 7.553071e-04, -2.035826e-03, -8.086759e-01],
[-7.854027e-04, 9.998898e-01, -1.482298e-02, 3.195559e-01],
[ 2.024406e-03, 1.482454e-02, 9.998881e-01, -7.997231e-01],
[ 0.000000e+00, 0.000000e+00, 0.000000e+00, 1.000000e+00]])},
'annos': {'name': array(['Pedestrian'], dtype='<U10'), 'truncated': array([0.]), 'occluded': array([0]), 'alpha': array([-0.2]), 'bbox': array([[712.4 , 143. , 810.73, 307.92]]), 'dimensions': array([[1.2 , 1.89, 0.48]]), 'location': array([[1.84, 1.47, 8.41]]), 'rotation_y': array([0.01]), 'score': array([0.]), 'index': array([0], dtype=int32), 'group_ids': array([0], dtype=int32), 'difficulty': array([0], dtype=int32), 'num_points_in_gt': array([377], dtype=int32)}}
...
]
```
On top of this you can write a new Dataset class inherited from `Custom3DDataset`, and overwrite related methods,
like [KittiDataset](../../mmdet3d/datasets/kitti_dataset.py) and [ScanNetDataset](../../mmdet3d/datasets/scannet_dataset.py).
### An example of customized dataset
Here we provide an example of a customized dataset.
Assume the annotations have been reorganized into a list of dicts in pickle files, like ScanNet.
The bounding box annotations are stored in `annotation.pkl` as follows:
```
{'point_cloud': {'num_features': 6, 'lidar_idx': 'scene0000_00'}, 'pts_path': 'points/scene0000_00.bin',
'pts_instance_mask_path': 'instance_mask/scene0000_00.bin', 'pts_semantic_mask_path': 'semantic_mask/scene0000_00.bin',
'annos': {'gt_num': 27, 'name': array(['window', 'window', 'table', 'counter', 'curtain', 'curtain',
'desk', 'cabinet', 'sink', 'garbagebin', 'garbagebin',
'garbagebin', 'sofa', 'refrigerator', 'table', 'table', 'toilet',
'bed', 'cabinet', 'cabinet', 'cabinet', 'cabinet', 'cabinet',
'cabinet', 'door', 'door', 'door'], dtype='<U12'),
'location': array([[ 1.48129511, 3.52074146, 1.85652947],
[ 2.90395617, -3.48033905, 1.52682471]]),
'dimensions': array([[1.74445975, 0.23195696, 0.57235193],
[0.66077662, 0.17072392, 0.67153597]]),
'gt_boxes_upright_depth': array([
[ 1.48129511, 3.52074146, 1.85652947, 1.74445975, 0.23195696,
0.57235193],
[ 2.90395617, -3.48033905, 1.52682471, 0.66077662, 0.17072392,
0.67153597]]),
'index': array([ 0, 1 ], dtype=int32),
'class': array([ 6, 6 ])}}
```
We can create a new dataset in `mmdet3d/datasets/my_dataset.py` to load the data.
```python
import numpy as np
from os import path as osp
from mmdet3d.core import show_result
from mmdet3d.core.bbox import DepthInstance3DBoxes
from mmdet.datasets import DATASETS
from .custom_3d import Custom3DDataset
@DATASETS.register_module()
class MyDataset(Custom3DDataset):
CLASSES = ('cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window',
'bookshelf', 'picture', 'counter', 'desk', 'curtain',
'refrigerator', 'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin')
def __init__(self,
data_root,
ann_file,
pipeline=None,
classes=None,
modality=None,
box_type_3d='Depth',
filter_empty_gt=True,
test_mode=False):
super().__init__(
data_root=data_root,
ann_file=ann_file,
pipeline=pipeline,
classes=classes,
modality=modality,
box_type_3d=box_type_3d,
filter_empty_gt=filter_empty_gt,
test_mode=test_mode)
def get_ann_info(self, index):
# Use index to get the annos, thus the evalhook could also use this api
info = self.data_infos[index]
if info['annos']['gt_num'] != 0:
gt_bboxes_3d = info['annos']['gt_boxes_upright_depth'].astype(
np.float32) # k, 6
gt_labels_3d = info['annos']['class'].astype(np.long)
else:
gt_bboxes_3d = np.zeros((0, 6), dtype=np.float32)
gt_labels_3d = np.zeros((0, ), dtype=np.long)
# to target box structure
gt_bboxes_3d = DepthInstance3DBoxes(
gt_bboxes_3d,
box_dim=gt_bboxes_3d.shape[-1],
with_yaw=False,
origin=(0.5, 0.5, 0.5)).convert_to(self.box_mode_3d)
pts_instance_mask_path = osp.join(self.data_root,
info['pts_instance_mask_path'])
pts_semantic_mask_path = osp.join(self.data_root,
info['pts_semantic_mask_path'])
anns_results = dict(
gt_bboxes_3d=gt_bboxes_3d,
gt_labels_3d=gt_labels_3d,
pts_instance_mask_path=pts_instance_mask_path,
pts_semantic_mask_path=pts_semantic_mask_path)
return anns_results
```
Then in the config, to use `MyDataset`, you can modify the config as follows:
```python
dataset_A_train = dict(
type='MyDataset',
ann_file = 'annotation.pkl',
pipeline=train_pipeline
)
```
## Customize datasets by dataset wrappers
Like MMDetection, MMDetection3D also supports many dataset wrappers to mix datasets or modify the dataset distribution for training.
Currently it supports three dataset wrappers, as below:
- `RepeatDataset`: simply repeat the whole dataset.
- `ClassBalancedDataset`: repeat dataset in a class balanced manner.
- `ConcatDataset`: concat datasets.
### Repeat dataset
We use `RepeatDataset` as a wrapper to repeat the dataset. For example, suppose the original dataset is `Dataset_A`; to repeat it, the config looks like the following:
```python
dataset_A_train = dict(
type='RepeatDataset',
times=N,
dataset=dict( # This is the original config of Dataset_A
type='Dataset_A',
...
pipeline=train_pipeline
)
)
```
### Class balanced dataset
We use `ClassBalancedDataset` as a wrapper to repeat the dataset based on category
frequency. The dataset to be repeated needs to implement the method `self.get_cat_ids(idx)`
to support `ClassBalancedDataset` (a sketch of such a method is shown below).
For example, to repeat `Dataset_A` with `oversample_thr=1e-3`, the config looks like the following
```python
dataset_A_train = dict(
type='ClassBalancedDataset',
oversample_thr=1e-3,
dataset=dict( # This is the original config of Dataset_A
type='Dataset_A',
...
pipeline=train_pipeline
)
)
```
You may refer to [source code](https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/dataset_wrappers.py) for details.
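A hedged sketch of what a `get_cat_ids` implementation could look like for a dataset whose middle-format infos store per-box class indices (the field names `annos`, `gt_num` and `class` follow the customized-dataset example above and are assumptions for any other dataset):
```python
import numpy as np

def get_cat_ids(self, idx):
    """Return the category ids annotated in the idx-th sample."""
    info = self.data_infos[idx]
    if info['annos']['gt_num'] == 0:
        return []
    # Collect the unique integer class labels present in this frame.
    return np.unique(info['annos']['class']).astype(int).tolist()
```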
### Concatenate dataset
There are three ways to concatenate the dataset.
1. If the datasets you want to concatenate are of the same type but have different annotation files, you can concatenate the dataset configs like the following.
```python
dataset_A_train = dict(
type='Dataset_A',
ann_file = ['anno_file_1', 'anno_file_2'],
pipeline=train_pipeline
)
```
If the concatenated dataset is used for test or evaluation, this manner supports evaluating each dataset separately. To test the concatenated dataset as a whole, you can set `separate_eval=False` as below.
```python
dataset_A_train = dict(
type='Dataset_A',
ann_file = ['anno_file_1', 'anno_file_2'],
separate_eval=False,
pipeline=train_pipeline
)
```
2. In case the datasets you want to concatenate are of different types, you can concatenate the dataset configs like the following.
```python
dataset_A_train = dict()
dataset_B_train = dict()
data = dict(
imgs_per_gpu=2,
workers_per_gpu=2,
train = [
dataset_A_train,
dataset_B_train
],
val = dataset_A_val,
test = dataset_A_test
)
```
If the concatenated dataset is used for test or evaluation, this manner also supports evaluating each dataset separately.
3. We also support defining `ConcatDataset` explicitly as follows.
```python
dataset_A_val = dict()
dataset_B_val = dict()
data = dict(
imgs_per_gpu=2,
workers_per_gpu=2,
train=dataset_A_train,
val=dict(
type='ConcatDataset',
datasets=[dataset_A_val, dataset_B_val],
separate_eval=False))
```
This manner allows users to evaluate all the datasets as a single one by setting `separate_eval=False`.
**Note:**
1. The option `separate_eval=False` assumes the datasets use `self.data_infos` during evaluation. Therefore, COCO datasets do not support this behavior since COCO datasets do not fully rely on `self.data_infos` for evaluation. Combining different types of datasets and evaluating them as a whole is not tested, thus it is not suggested.
2. Evaluating `ClassBalancedDataset` and `RepeatDataset` is not supported, thus evaluating concatenated datasets of these types is also not supported.
A more complex example that repeats `Dataset_A` and `Dataset_B` by N and M times, respectively, and then concatenates the repeated datasets is as follows.
```python
dataset_A_train = dict(
type='RepeatDataset',
times=N,
dataset=dict(
type='Dataset_A',
...
pipeline=train_pipeline
)
)
dataset_A_val = dict(
...
pipeline=test_pipeline
)
dataset_A_test = dict(
...
pipeline=test_pipeline
)
dataset_B_train = dict(
type='RepeatDataset',
times=M,
dataset=dict(
type='Dataset_B',
...
pipeline=train_pipeline
)
)
data = dict(
imgs_per_gpu=2,
workers_per_gpu=2,
train = [
dataset_A_train,
dataset_B_train
],
val = dataset_A_val,
test = dataset_A_test
)
```
## Modify Dataset Classes
With existing dataset types, we can modify their class names to train on a subset of the annotations.
For example, if you want to train on only three classes of the current dataset,
you can modify the classes of the dataset.
The dataset will filter out the ground truth boxes of other classes automatically.
```python
classes = ('person', 'bicycle', 'car')
data = dict(
train=dict(classes=classes),
val=dict(classes=classes),
test=dict(classes=classes))
```
MMDetection V2.0 also supports reading the classes from a file, which is common in real applications.
For example, assume `classes.txt` contains the class names as follows.
```
person
bicycle
car
```
Users can set the classes as a file path; the dataset will load it and convert it to a list automatically.
```python
classes = 'path/to/classes.txt'
data = dict(
train=dict(classes=classes),
val=dict(classes=classes),
test=dict(classes=classes))
```
**Note** (related to MMDetection):
- Before MMDetection v2.5.0, the dataset will filter out the empty GT images automatically if the classes are set and there is no way to disable that through config. This is an undesirable behavior and introduces confusion because if the classes are not set, the dataset only filters the empty GT images when `filter_empty_gt=True` and `test_mode=False`. After MMDetection v2.5.0, we decouple the image filtering process and the classes modification, i.e., the dataset will only filter empty GT images when `filter_empty_gt=True` and `test_mode=False`, no matter whether the classes are set. Thus, setting the classes only influences the annotations of classes used for training, and users can decide whether to filter empty GT images by themselves.
- Since the middle format only has box labels and does not contain the class names, when using `CustomDataset`, users cannot filter out the empty GT images through configs but only do this offline.
- The features for setting dataset classes and dataset filtering will be refactored to be more user-friendly in v2.8.0 or v2.9.0 (depends on the progress).
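For instance, a hedged sketch combining both knobs in a data config; whether your dataset class exposes `filter_empty_gt` is something to verify (the `Custom3DDataset`-based example earlier in this tutorial does accept it):
```python
classes = ('person', 'bicycle', 'car')
data = dict(
    # Filter samples without any GT box only during training.
    train=dict(classes=classes, filter_empty_gt=True),
    val=dict(classes=classes),
    test=dict(classes=classes))
```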
# Tutorial 4: Customize Models
We basically categorize model components into 6 types.
- encoder: including voxel layer, voxel encoder and middle encoder used in voxel-based methods before backbone, e.g., HardVFE and PointPillarsScatter.
- backbone: usually an FCN network to extract feature maps, e.g., ResNet, SECOND.
- neck: the component between backbones and heads, e.g., FPN, SECONDFPN.
- head: the component for specific tasks, e.g., bbox prediction and mask prediction.
- roi extractor: the part for extracting RoI features from feature maps, e.g., H3DRoIHead and PartAggregationROIHead.
- loss: the component in head for calculating losses, e.g., FocalLoss, L1Loss, and GHMLoss.
## Develop new components
### Add a new encoder
Here we show how to develop new components with an example of HardVFE.
#### 1. Define a new voxel encoder (e.g. HardVFE)
Create a new file `mmdet3d/models/voxel_encoders/voxel_encoder.py`.
```python
import torch.nn as nn
from ..builder import VOXEL_ENCODERS
@VOXEL_ENCODERS.register_module()
class HardVFE(nn.Module):
def __init__(self, arg1, arg2):
pass
def forward(self, x): # should return a tuple
pass
def init_weights(self, pretrained=None):
pass
```
#### 2. Import the module
You can either add the following line to `mmdet3d/models/voxel_encoders/__init__.py`
```python
from .voxel_encoder import HardVFE
```
or alternatively add
```python
custom_imports = dict(
    imports=['mmdet3d.models.voxel_encoders.voxel_encoder'],
allow_failed_imports=False)
```
to the config file to avoid modifying the original code.
#### 3. Use the voxel encoder in your config file
```python
model = dict(
...
voxel_encoder=dict(
type='HardVFE',
arg1=xxx,
arg2=xxx),
...
```
### Add a new backbone
Here we show how to develop new components with an example of SECOND (Sparsely Embedded Convolutional Detection).
#### 1. Define a new backbone (e.g. SECOND)
Create a new file `mmdet3d/models/backbones/second.py`.
```python
import torch.nn as nn
from ..builder import BACKBONES
@BACKBONES.register_module()
class SECOND(nn.Module):
def __init__(self, arg1, arg2):
pass
def forward(self, x): # should return a tuple
pass
def init_weights(self, pretrained=None):
pass
```
#### 2. Import the module
You can either add the following line to `mmdet3d/models/backbones/__init__.py`
```python
from .second import SECOND
```
or alternatively add
```python
custom_imports = dict(
imports=['mmdet3d.models.backbones.second'],
allow_failed_imports=False)
```
to the config file to avoid modifying the original code.
#### 3. Use the backbone in your config file
```python
model = dict(
...
backbone=dict(
type='SECOND',
arg1=xxx,
arg2=xxx),
...
```
### Add new necks
#### 1. Define a neck (e.g. SECONDFPN)
Create a new file `mmdet3d/models/necks/second_fpn.py`.
```python
from torch import nn as nn
from ..builder import NECKS
@NECKS.register_module()
class SECONDFPN(nn.Module):
def __init__(self,
in_channels=[128, 128, 256],
out_channels=[256, 256, 256],
upsample_strides=[1, 2, 4],
norm_cfg=dict(type='BN', eps=1e-3, momentum=0.01),
upsample_cfg=dict(type='deconv', bias=False),
conv_cfg=dict(type='Conv2d', bias=False),
use_conv_for_no_stride=False):
pass
def forward(self, X):
# implementation is ignored
pass
```
#### 2. Import the module
You can either add the following line to `mmdet3d/models/necks/__init__.py`,
```python
from .second_fpn import SECONDFPN
```
or alternatively add
```python
custom_imports = dict(
imports=['mmdet3d.models.necks.second_fpn'],
allow_failed_imports=False)
```
to the config file and avoid modifying the original code.
#### 3. Modify the config file
```python
neck=dict(
type='SECONDFPN',
in_channels=[64, 128, 256],
upsample_strides=[1, 2, 4],
out_channels=[128, 128, 128])
```
### Add new heads
Here we show how to develop a new head with the example of [PartA2 Head](https://arxiv.org/abs/1907.03670) as the following.
**Note**: Here the example of PartA2 RoI Head is used in the second stage. For one-stage heads, please refer to examples in `mmdet3d/models/dense_heads/`. They are more commonly used in 3D detection for autonomous driving due to their simplicity and high efficiency.
First, add a new bbox head in `mmdet3d/models/roi_heads/bbox_heads/parta2_bbox_head.py`.
PartA2 RoI Head implements a new bbox head for object detection.
To implement a bbox head, we basically need to implement three functions of the new module as follows. Sometimes other related functions like `loss` and `get_targets` are also required.
```python
from torch import nn as nn
from mmdet.models.builder import HEADS
from .bbox_head import BBoxHead
@HEADS.register_module()
class PartA2BboxHead(nn.Module):
"""PartA2 RoI head."""
def __init__(self,
num_classes,
seg_in_channels,
part_in_channels,
seg_conv_channels=None,
part_conv_channels=None,
merge_conv_channels=None,
down_conv_channels=None,
shared_fc_channels=None,
cls_channels=None,
reg_channels=None,
dropout_ratio=0.1,
roi_feat_size=14,
with_corner_loss=True,
bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder'),
conv_cfg=dict(type='Conv1d'),
norm_cfg=dict(type='BN1d', eps=1e-3, momentum=0.01),
loss_bbox=dict(
type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=2.0),
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=True,
reduction='none',
loss_weight=1.0)):
super(PartA2BboxHead, self).__init__()
def init_weights(self):
# conv layers are already initialized by ConvModule
def forward(self, seg_feats, part_feats):
```
Second, implement a new RoI Head if necessary. We plan to inherit the new `PartAggregationROIHead` from `Base3DRoIHead`. We can find that `Base3DRoIHead` already implements the following functions.
```python
from abc import ABCMeta, abstractmethod
from torch import nn as nn
from mmdet.models import HEADS
@HEADS.register_module()
class Base3DRoIHead(nn.Module, metaclass=ABCMeta):
"""Base class for 3d RoIHeads."""
def __init__(self,
bbox_head=None,
mask_roi_extractor=None,
mask_head=None,
train_cfg=None,
test_cfg=None):
@property
def with_bbox(self):
@property
def with_mask(self):
@abstractmethod
def init_weights(self, pretrained):
@abstractmethod
def init_bbox_head(self):
@abstractmethod
def init_mask_head(self):
@abstractmethod
def init_assigner_sampler(self):
@abstractmethod
def forward_train(self,
x,
img_metas,
proposal_list,
gt_bboxes,
gt_labels,
gt_bboxes_ignore=None,
**kwargs):
def simple_test(self,
x,
proposal_list,
img_metas,
proposals=None,
rescale=False,
**kwargs):
"""Test without augmentation."""
pass
def aug_test(self, x, proposal_list, img_metas, rescale=False, **kwargs):
"""Test with augmentations.
If rescale is False, then returned bboxes and masks will fit the scale
of imgs[0].
"""
pass
```
`PartAggregationROIHead`'s modification is mainly in the `_bbox_forward` logic, and it inherits other logic from `Base3DRoIHead`.
In the `mmdet3d/models/roi_heads/part_aggregation_roi_head.py`, we implement the new RoI Head as the following:
```python
from torch.nn import functional as F
from mmdet3d.core import AssignResult
from mmdet3d.core.bbox import bbox3d2result, bbox3d2roi
from mmdet.core import build_assigner, build_sampler
from mmdet.models import HEADS
from ..builder import build_head, build_roi_extractor
from .base_3droi_head import Base3DRoIHead
@HEADS.register_module()
class PartAggregationROIHead(Base3DRoIHead):
"""Part aggregation roi head for PartA2.
Args:
semantic_head (ConfigDict): Config of semantic head.
num_classes (int): The number of classes.
seg_roi_extractor (ConfigDict): Config of seg_roi_extractor.
part_roi_extractor (ConfigDict): Config of part_roi_extractor.
bbox_head (ConfigDict): Config of bbox_head.
train_cfg (ConfigDict): Training config.
test_cfg (ConfigDict): Testing config.
"""
def __init__(self,
semantic_head,
num_classes=3,
seg_roi_extractor=None,
part_roi_extractor=None,
bbox_head=None,
train_cfg=None,
test_cfg=None):
super(PartAggregationROIHead, self).__init__(
bbox_head=bbox_head, train_cfg=train_cfg, test_cfg=test_cfg)
self.num_classes = num_classes
assert semantic_head is not None
self.semantic_head = build_head(semantic_head)
if seg_roi_extractor is not None:
self.seg_roi_extractor = build_roi_extractor(seg_roi_extractor)
if part_roi_extractor is not None:
self.part_roi_extractor = build_roi_extractor(part_roi_extractor)
self.init_assigner_sampler()
def _bbox_forward(self, seg_feats, part_feats, voxels_dict, rois):
"""Forward function of roi_extractor and bbox_head used in both
training and testing.
Args:
seg_feats (torch.Tensor): Point-wise semantic features.
part_feats (torch.Tensor): Point-wise part prediction features.
voxels_dict (dict): Contains information of voxels.
rois (Tensor): Roi boxes.
Returns:
dict: Contains predictions of bbox_head and
features of roi_extractor.
"""
pooled_seg_feats = self.seg_roi_extractor(seg_feats,
voxels_dict['voxel_centers'],
voxels_dict['coors'][..., 0],
rois)
pooled_part_feats = self.part_roi_extractor(
part_feats, voxels_dict['voxel_centers'],
voxels_dict['coors'][..., 0], rois)
cls_score, bbox_pred = self.bbox_head(pooled_seg_feats,
pooled_part_feats)
bbox_results = dict(
cls_score=cls_score,
bbox_pred=bbox_pred,
pooled_seg_feats=pooled_seg_feats,
pooled_part_feats=pooled_part_feats)
return bbox_results
```
Here we omit more details related to other functions. Please see the [code](../../mmdet3d/models/roi_heads/part_aggregation_roi_head.py) for more details.
Last, the users need to add the modules in
`mmdet3d/models/roi_heads/bbox_heads/__init__.py` and `mmdet3d/models/roi_heads/__init__.py` so that the corresponding registry can find and load them.
Alternatively, the users can add
```python
custom_imports=dict(
    imports=['mmdet3d.models.roi_heads.part_aggregation_roi_head', 'mmdet3d.models.roi_heads.bbox_heads.parta2_bbox_head'])
```
to the config file and achieve the same goal.
The config file of `PartAggregationROIHead` is as follows:
```python
model = dict(
...
roi_head=dict(
type='PartAggregationROIHead',
num_classes=3,
semantic_head=dict(
type='PointwiseSemanticHead',
in_channels=16,
extra_width=0.2,
seg_score_thr=0.3,
num_classes=3,
loss_seg=dict(
type='FocalLoss',
use_sigmoid=True,
reduction='sum',
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_part=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
seg_roi_extractor=dict(
type='Single3DRoIAwareExtractor',
roi_layer=dict(
type='RoIAwarePool3d',
out_size=14,
max_pts_per_voxel=128,
mode='max')),
part_roi_extractor=dict(
type='Single3DRoIAwareExtractor',
roi_layer=dict(
type='RoIAwarePool3d',
out_size=14,
max_pts_per_voxel=128,
mode='avg')),
bbox_head=dict(
type='PartA2BboxHead',
num_classes=3,
seg_in_channels=16,
part_in_channels=4,
seg_conv_channels=[64, 64],
part_conv_channels=[64, 64],
merge_conv_channels=[128, 128],
down_conv_channels=[128, 256],
bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder'),
shared_fc_channels=[256, 512, 512, 512],
cls_channels=[256, 256],
reg_channels=[256, 256],
dropout_ratio=0.1,
roi_feat_size=14,
with_corner_loss=True,
loss_bbox=dict(
type='SmoothL1Loss',
beta=1.0 / 9.0,
reduction='sum',
loss_weight=1.0),
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=True,
reduction='sum',
loss_weight=1.0)))
...
)
```
Since MMDetection 2.0, the config system supports inheriting configs such that users can focus on the modification.
The second stage of PartA2 Head mainly uses a new `PartAggregationROIHead` and a new
`PartA2BboxHead`, whose arguments are set according to the `__init__` function of each module.
### Add new loss
Assume you want to add a new loss `MyLoss` for bounding box regression.
To add a new loss function, the users need to implement it in `mmdet3d/models/losses/my_loss.py`.
The decorator `weighted_loss` enables the loss to be weighted for each element.
```python
import torch
import torch.nn as nn
from ..builder import LOSSES
from .utils import weighted_loss
@weighted_loss
def my_loss(pred, target):
assert pred.size() == target.size() and target.numel() > 0
loss = torch.abs(pred - target)
return loss
@LOSSES.register_module()
class MyLoss(nn.Module):
def __init__(self, reduction='mean', loss_weight=1.0):
super(MyLoss, self).__init__()
self.reduction = reduction
self.loss_weight = loss_weight
def forward(self,
pred,
target,
weight=None,
avg_factor=None,
reduction_override=None):
assert reduction_override in (None, 'none', 'mean', 'sum')
reduction = (
reduction_override if reduction_override else self.reduction)
loss_bbox = self.loss_weight * my_loss(
pred, target, weight, reduction=reduction, avg_factor=avg_factor)
return loss_bbox
```
Then the users need to add it in the `mmdet3d/models/losses/__init__.py`.
```python
from .my_loss import MyLoss, my_loss
```
Alternatively, you can add
```python
custom_imports=dict(
imports=['mmdet3d.models.losses.my_loss'])
```
to the config file and achieve the same goal.
To use it, modify the `loss_xxx` field.
Since `MyLoss` is for regression, you need to modify the `loss_bbox` field in the head.
```python
loss_bbox=dict(type='MyLoss', loss_weight=1.0))
```
# Tutorial 5: Customize Runtime Settings
## Customize optimization settings
### Customize optimizer supported by Pytorch
We already support using all the optimizers implemented by PyTorch, and the only modification needed is to change the `optimizer` field of config files.
For example, if you want to use `Adam` (note that the performance could drop a lot), the modification could be as follows.
```python
optimizer = dict(type='Adam', lr=0.0003, weight_decay=0.0001)
```
To modify the learning rate of the model, the users only need to modify the `lr` in the config of optimizer. The users can directly set arguments following the [API doc](https://pytorch.org/docs/stable/optim.html?highlight=optim#module-torch.optim) of PyTorch.
### Customize self-implemented optimizer
#### 1. Define a new optimizer
A customized optimizer could be defined as follows.
Assume you want to add an optimizer named `MyOptimizer`, which has arguments `a`, `b`, and `c`.
You need to create a new directory named `mmdet/core/optimizer`,
and then implement the new optimizer in a file, e.g., `mmdet/core/optimizer/my_optimizer.py`:
```python
from .registry import OPTIMIZERS
from torch.optim import Optimizer
@OPTIMIZERS.register_module()
class MyOptimizer(Optimizer):
    def __init__(self, a, b, c):
        pass
```
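Going one step further, the sketch below shows what a runnable (if simplistic) optimizer body could look like. Purely for illustration we assume that `a` plays the role of a learning rate, `b` a momentum factor, and `c` a weight decay factor; the actual meaning of these arguments is entirely up to your implementation.
```python
import torch
from torch.optim import Optimizer

from .registry import OPTIMIZERS


@OPTIMIZERS.register_module()
class MyOptimizer(Optimizer):
    """A toy SGD-like optimizer used only to illustrate the interface."""

    def __init__(self, params, a=0.01, b=0.9, c=0.0001):
        # `params` is filled in by the optimizer constructor from the model
        defaults = dict(a=a, b=b, c=c)
        super(MyOptimizer, self).__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            for p in group['params']:
                if p.grad is None:
                    continue
                d_p = p.grad
                if group['c'] != 0:
                    d_p = d_p.add(p, alpha=group['c'])  # weight decay
                buf = self.state[p].setdefault('momentum_buffer',
                                               torch.zeros_like(p))
                buf.mul_(group['b']).add_(d_p)  # momentum update
                p.add_(buf, alpha=-group['a'])  # parameter update
        return loss
```
Note that the optimizer constructor passes the model parameters as the first argument when building the optimizer, so only `a`, `b`, and `c` need to appear in the config.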
#### 2. Add the optimizer to registry
To find the module defined above, it should first be imported into the main namespace. There are two options to achieve this.
- Modify `mmdet/core/optimizer/__init__.py` to import it.
The newly defined module should be imported in `mmdet/core/optimizer/__init__.py` so that the registry will
find the new module and add it:
```python
from .my_optimizer import MyOptimizer
```
- Use `custom_imports` in the config to manually import it
```python
custom_imports = dict(imports=['mmdet.core.optimizer.my_optimizer'], allow_failed_imports=False)
```
The module `mmdet.core.optimizer.my_optimizer` will be imported at the beginning of the program and the class `MyOptimizer` is then automatically registered.
Note that only the package containing the class `MyOptimizer` should be imported;
`mmdet.core.optimizer.my_optimizer.MyOptimizer` **cannot** be imported directly.
Actually, users can use a completely different file directory structure with this importing method, as long as the module root can be located in `PYTHONPATH`.
#### 3. Specify the optimizer in the config file
Then you can use `MyOptimizer` in `optimizer` field of config files.
In the configs, the optimizers are defined by the field `optimizer` like the following:
```python
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
```
To use your own optimizer, the field can be changed to
```python
optimizer = dict(type='MyOptimizer', a=a_value, b=b_value, c=c_value)
```
### Customize optimizer constructor
Some models may have some parameter-specific settings for optimization, e.g. weight decay for BatchNorm layers.
The users can do those fine-grained parameter tuning through customizing optimizer constructor.
```python
from mmcv.utils import build_from_cfg
from mmcv.runner.optimizer import OPTIMIZER_BUILDERS, OPTIMIZERS
from mmdet.utils import get_root_logger
from .my_optimizer import MyOptimizer
@OPTIMIZER_BUILDERS.register_module()
class MyOptimizerConstructor(object):
    def __init__(self, optimizer_cfg, paramwise_cfg=None):
        pass
def __call__(self, model):
return my_optimizer
```
The default optimizer constructor is implemented [here](https://github.com/open-mmlab/mmcv/blob/9ecd6b0d5ff9d2172c49a182eaa669e9f27bb8e7/mmcv/runner/optimizer/default_constructor.py#L11), which could also serve as a template for new optimizer constructor.
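As a concrete illustration, the sketch below builds the optimizer with weight decay disabled for all normalization layers. It is only a minimal example of the constructor interface; the default constructor linked above handles many more cases (e.g., the various `paramwise_cfg` options).
```python
import torch.nn as nn
from mmcv.utils import build_from_cfg
from mmcv.runner.optimizer import OPTIMIZER_BUILDERS, OPTIMIZERS


@OPTIMIZER_BUILDERS.register_module()
class MyOptimizerConstructor(object):
    """Build an optimizer whose normalization layers use no weight decay."""

    def __init__(self, optimizer_cfg, paramwise_cfg=None):
        self.optimizer_cfg = optimizer_cfg
        self.paramwise_cfg = paramwise_cfg or {}

    def __call__(self, model):
        norm_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d,
                      nn.GroupNorm, nn.LayerNorm)
        norm_params, other_params = [], []
        for module in model.modules():
            # only collect parameters owned directly by this module
            params = list(module.parameters(recurse=False))
            if isinstance(module, norm_types):
                norm_params.extend(params)
            else:
                other_params.extend(params)
        optimizer_cfg = self.optimizer_cfg.copy()
        optimizer_cfg['params'] = [
            dict(params=other_params),
            dict(params=norm_params, weight_decay=0.)
        ]
        return build_from_cfg(optimizer_cfg, OPTIMIZERS)
```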
### Additional settings
Tricks not implemented by the optimizer should be implemented through the optimizer constructor (e.g., setting parameter-wise learning rates) or hooks. We list some common settings that could stabilize or accelerate training. Feel free to create a PR or an issue for more settings.
- __Use gradient clip to stabilize training__:
Some models need gradient clipping to stabilize the training process. An example is as below:
```python
optimizer_config = dict(
_delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
```
If your config inherits a base config which already sets `optimizer_config`, you might need `_delete_=True` to override the unnecessary settings. See the [config documentation](https://mmdetection.readthedocs.io/en/latest/config.html) for more details.
- __Use momentum schedule to accelerate model convergence__:
We support a momentum scheduler that modifies the model's momentum according to the learning rate, which could make the model converge faster.
The momentum scheduler is usually used together with the LR scheduler; for example, the following config is used in 3D detection to accelerate convergence.
For more details, please refer to the implementation of [CyclicLrUpdater](https://github.com/open-mmlab/mmcv/blob/f48241a65aebfe07db122e9db320c31b685dc674/mmcv/runner/hooks/lr_updater.py#L327) and [CyclicMomentumUpdater](https://github.com/open-mmlab/mmcv/blob/f48241a65aebfe07db122e9db320c31b685dc674/mmcv/runner/hooks/momentum_updater.py#L130).
```python
lr_config = dict(
policy='cyclic',
target_ratio=(10, 1e-4),
cyclic_times=1,
step_ratio_up=0.4,
)
momentum_config = dict(
policy='cyclic',
target_ratio=(0.85 / 0.95, 1),
cyclic_times=1,
step_ratio_up=0.4,
)
```
## Customize training schedules
By default we use a step learning rate with the 1x schedule; this calls [`StepLRHook`](https://github.com/open-mmlab/mmcv/blob/f48241a65aebfe07db122e9db320c31b685dc674/mmcv/runner/hooks/lr_updater.py#L153) in MMCV.
We support many other learning rate schedules [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py), such as the `CosineAnnealing` and `Poly` schedules. Here are some examples:
- Poly schedule:
```python
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
```
- CosineAnnealing schedule:
```python
lr_config = dict(
policy='CosineAnnealing',
warmup='linear',
warmup_iters=1000,
warmup_ratio=1.0 / 10,
min_lr_ratio=1e-5)
```
## Customize workflow
Workflow is a list of (phase, epochs) pairs that specifies the running order and the number of epochs.
By default it is set to
```python
workflow = [('train', 1)]
```
which means running 1 epoch for training.
Sometimes the user may want to check some metrics (e.g., loss, accuracy) of the model on the validation set.
In such a case, we can set the workflow as
```python
[('train', 1), ('val', 1)]
```
so that 1 epoch for training and 1 epoch for validation will be run iteratively.
**Note**:
1. The parameters of the model will not be updated during the val epoch.
2. The keyword `total_epochs` in the config only controls the number of training epochs and will not affect the validation workflow.
3. Workflows `[('train', 1), ('val', 1)]` and `[('train', 1)]` will not change the behavior of `EvalHook` because `EvalHook` is called by `after_train_epoch`, and the validation workflow only affects hooks called through `after_val_epoch`. Therefore, the only difference between `[('train', 1), ('val', 1)]` and `[('train', 1)]` is that the runner will calculate losses on the validation set after each training epoch.
## Customize hooks
### Customize self-implemented hooks
#### 1. Implement a new hook
There are some occasions when the users might need to implement a new hook. MMDetection has supported customized hooks in training (#3395) since v2.3.0. Thus the users can implement a hook directly in mmdet or their mmdet-based codebases and use the hook by only modifying the training config.
Before v2.3.0, the users needed to modify the code to get the hook registered before training starts.
Here we give an example of creating a new hook in mmdet and using it in training.
```python
from mmcv.runner import HOOKS, Hook
@HOOKS.register_module()
class MyHook(Hook):
def __init__(self, a, b):
pass
def before_run(self, runner):
pass
def after_run(self, runner):
pass
def before_epoch(self, runner):
pass
def after_epoch(self, runner):
pass
def before_iter(self, runner):
pass
def after_iter(self, runner):
pass
```
Depending on the functionality of the hook, the users need to specify what the hook will do at each stage of the training in `before_run`, `after_run`, `before_epoch`, `after_epoch`, `before_iter`, and `after_iter`.
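For instance, the sketch below shows a small hook that aborts training when the loss becomes non-finite. It assumes the loss is available as `runner.outputs['loss']` after each training iteration (which holds for the standard epoch-based runner); treat it as an illustration rather than a drop-in utility.
```python
import torch
from mmcv.runner import HOOKS, Hook


@HOOKS.register_module()
class CheckInvalidLossHook(Hook):
    """Abort training early if the loss becomes NaN or Inf."""

    def __init__(self, interval=50):
        self.interval = interval

    def after_train_iter(self, runner):
        # only check every `interval` iterations to keep the overhead low
        if self.every_n_iters(runner, self.interval):
            loss = runner.outputs['loss']
            if not torch.isfinite(loss):
                raise ValueError(
                    f'Loss became non-finite at iteration {runner.iter}')
```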
#### 2. Register the new hook
Then we need to make `MyHook` imported. Assuming the file is `mmdet/core/utils/my_hook.py`, there are two ways to do that:
- Modify `mmdet/core/utils/__init__.py` to import it.
The newly defined module should be imported in `mmdet/core/utils/__init__.py` so that the registry will
find the new module and add it:
```python
from .my_hook import MyHook
```
- Use `custom_imports` in the config to manually import it
```python
custom_imports = dict(imports=['mmdet.core.utils.my_hook'], allow_failed_imports=False)
```
#### 3. Modify the config
```python
custom_hooks = [
dict(type='MyHook', a=a_value, b=b_value)
]
```
You can also set the priority of the hook by setting the key `priority` to `'NORMAL'` or `'HIGHEST'` as below:
```python
custom_hooks = [
dict(type='MyHook', a=a_value, b=b_value, priority='NORMAL')
]
```
By default the hook's priority is set as `NORMAL` during registration.
### Use hooks implemented in MMCV
If the hook is already implemented in MMCV, you can directly modify the config to use the hook as below
```python
custom_hooks = [
dict(type='MyHook', a=a_value, b=b_value, priority='NORMAL')
]
```
### Modify default runtime hooks
There are some common hooks that are not registered through `custom_hooks`; they are:
- log_config
- checkpoint_config
- evaluation
- lr_config
- optimizer_config
- momentum_config
Among those hooks, only the logger hook has the `VERY_LOW` priority; the others' priority is `NORMAL`.
The above-mentioned tutorials already cover how to modify `optimizer_config`, `momentum_config`, and `lr_config`.
Here we show what we can do with `log_config`, `checkpoint_config`, and `evaluation`.
#### Checkpoint config
The MMCV runner will use `checkpoint_config` to initialize [`CheckpointHook`](https://github.com/open-mmlab/mmcv/blob/9ecd6b0d5ff9d2172c49a182eaa669e9f27bb8e7/mmcv/runner/hooks/checkpoint.py#L9).
```python
checkpoint_config = dict(interval=1)
```
The users could set `max_keep_ckpts` to save only a small number of checkpoints, or decide whether to store the state dict of the optimizer by `save_optimizer`. More details of the arguments are [here](https://mmcv.readthedocs.io/en/latest/api.html#mmcv.runner.CheckpointHook).
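For example, a config that saves a checkpoint every epoch, keeps only the three most recent checkpoints, and does not store the optimizer state could look like the following (values are illustrative):
```python
checkpoint_config = dict(interval=1, max_keep_ckpts=3, save_optimizer=False)
```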
#### Log config
The `log_config` wraps multiple logger hooks and enables setting intervals. Currently MMCV supports `WandbLoggerHook`, `MlflowLoggerHook`, and `TensorboardLoggerHook`.
The detailed usage can be found in the [doc](https://mmcv.readthedocs.io/en/latest/api.html#mmcv.runner.LoggerHook).
```python
log_config = dict(
interval=50,
hooks=[
dict(type='TextLoggerHook'),
dict(type='TensorboardLoggerHook')
])
```
#### Evaluation config
The config of `evaluation` will be used to initialize the [`EvalHook`](https://github.com/open-mmlab/mmdetection/blob/7a404a2c000620d52156774a5025070d9e00d918/mmdet/core/evaluation/eval_hooks.py#L8).
Except for the key `interval`, other arguments such as `metric` will be passed to `dataset.evaluate()`.
```python
evaluation = dict(interval=1, metric='bbox')
```
# Tutorial 3: Customize Data Pipelines
## Design of Data pipelines
......
# Tutorial 1: Finetuning Models
Detectors pre-trained on the COCO dataset can serve as good pre-trained models for other datasets, e.g., Cityscapes and KITTI.
This tutorial provides instructions for users to use the models provided in the [Model Zoo](../model_zoo.md) on other datasets to obtain better performance.
There are two steps to finetune a model on a new dataset.
- Add support for the new dataset following [Tutorial 2: Adding New Dataset](new_dataset.md).
- Modify the configs as will be discussed in this tutorial.
Taking the finetuning process on the Cityscapes dataset as an example, the users need to modify five parts of the config.
## Inherit base configs
To release the burden and reduce bugs in writing whole configs, MMDetection V2.0 supports inheriting configs from multiple existing configs. To finetune a Mask R-CNN model, the new config needs to inherit
`_base_/models/mask_rcnn_r50_fpn.py` to build the basic structure of the model. To use the Cityscapes dataset, the new config can also simply inherit `_base_/datasets/cityscapes_instance.py`. For runtime settings such as training schedules, the new config needs to inherit `_base_/default_runtime.py`. These configs are in the `configs` directory, and the users can also choose to write the whole contents rather than use inheritance.
```python
_base_ = [
'../_base_/models/mask_rcnn_r50_fpn.py',
'../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py'
]
```
## Modify head
Then the new config needs to modify the head according to the number of classes of the new dataset. By only changing `num_classes` in the roi_head, the weights of the pre-trained model are mostly reused, except for the final prediction head.
```python
model = dict(
pretrained=None,
roi_head=dict(
bbox_head=dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=8,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
mask_head=dict(
type='FCNMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=8,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))))
```
## Modify dataset
The users may also need to prepare the dataset and write the configs about the dataset. MMDetection V2.0 already supports the VOC, WIDER FACE, COCO and Cityscapes datasets. A sketch of a typical override is given below.
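Since the dataset settings are inherited from `_base_/datasets/cityscapes_instance.py`, usually only a few fields need to be overridden. The snippet below is only a sketch with hypothetical annotation paths; replace them with wherever your converted annotations actually live.
```python
# hypothetical paths for illustration only
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(ann_file='data/cityscapes/annotations/train.json'),
    val=dict(ann_file='data/cityscapes/annotations/val.json'),
    test=dict(ann_file='data/cityscapes/annotations/val.json'))
```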
## Modify training schedule
The finetuning hyperparameters differ from the default schedule. Finetuning usually requires a smaller learning rate and fewer training epochs.
```python
# optimizer
# lr is set for a batch size of 8
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001,
# [7] yields higher performance than [6]
step=[7])
total_epochs = 8 # actual epoch = 8 * 8 = 64
log_config = dict(interval=100)
```
## Use pre-trained model
To use the pre-trained model, the new config adds the link of the pre-trained model in `load_from`. The users might need to download the model weights before training to avoid the download time during training.
```python
load_from = 'https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r50_fpn_2x_20181010-41d35c05.pth' # noqa
```
.. toctree::
   :maxdepth: 2

   config.md
   customize_dataset.md
   data_pipeline.md
   customize_models.md
   customize_runtime.md
   waymo.md
# Tutorial 2: Adding New Dataset
## Customize datasets by reorganizing data
### Reorganize dataset to existing format
The simplest way is to convert your dataset to existing dataset formats (COCO or PASCAL VOC).
The annotation json files in COCO format have the following necessary keys:
```python
'images': [
{
'file_name': 'COCO_val2014_000000001268.jpg',
'height': 427,
'width': 640,
'id': 1268
},
...
],
'annotations': [
{
'segmentation': [[192.81,
247.09,
...
219.03,
249.06]], # if you have mask labels
'area': 1035.749,
'iscrowd': 0,
'image_id': 1268,
'bbox': [192.81, 224.8, 74.73, 33.43],
'category_id': 16,
'id': 42986
},
...
],
'categories': [
{'id': 0, 'name': 'car'},
]
```
There are three necessary keys in the json file:
- `images`: contains a list of images with their information, such as `file_name`, `height`, `width`, and `id`.
- `annotations`: contains the list of instance annotations.
- `categories`: contains the list of category names and their IDs.
After the data pre-processing, the users need to further modify the config files to use the dataset.
Here we show an example of using a custom dataset of 5 classes, assuming it is also in COCO format.
In `configs/my_custom_config.py`:
```python
...
# dataset settings
dataset_type = 'CocoDataset'
classes = ('a', 'b', 'c', 'd', 'e')
...
data = dict(
samples_per_gpu=2,
workers_per_gpu=2,
train=dict(
type=dataset_type,
classes=classes,
ann_file='path/to/your/train/data',
...),
val=dict(
type=dataset_type,
classes=classes,
ann_file='path/to/your/val/data',
...),
test=dict(
type=dataset_type,
classes=classes,
ann_file='path/to/your/test/data',
...))
...
```
We use this approach to support the Cityscapes dataset. The conversion script is in [cityscapes.py](https://github.com/open-mmlab/mmdetection/blob/master/tools/convert_datasets/cityscapes.py) and we also provide the finetuning [configs](https://github.com/open-mmlab/mmdetection/blob/master/configs/cityscapes).
### Reorganize dataset to middle format
It is also fine if you do not want to convert the annotation format to COCO or PASCAL format.
Actually, we define a simple annotation format and all existing datasets are
processed to be compatible with it, either online or offline.
The annotation of a dataset is a list of dicts, and each dict corresponds to an image.
There are 3 fields, `filename` (relative path), `width`, and `height`, for testing,
and an additional field `ann` for training. `ann` is also a dict containing at least 2 fields:
`bboxes` and `labels`, both of which are numpy arrays. Some datasets may provide
annotations like crowd/difficult/ignored bboxes; we use `bboxes_ignore` and `labels_ignore`
to cover them.
Here is an example.
```
[
{
'filename': 'a.jpg',
'width': 1280,
'height': 720,
'ann': {
'bboxes': <np.ndarray, float32> (n, 4),
'labels': <np.ndarray, int64> (n, ),
'bboxes_ignore': <np.ndarray, float32> (k, 4),
'labels_ignore': <np.ndarray, int64> (k, ) (optional field)
}
},
...
]
```
There are two ways to work with custom datasets.
- online conversion
You can write a new Dataset class inherited from `CustomDataset`, and overwrite two methods
`load_annotations(self, ann_file)` and `get_ann_info(self, idx)`,
like [CocoDataset](https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/coco.py) and [VOCDataset](https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/voc.py).
- offline conversion
You can convert the annotation format to the expected format above and save it to
a pickle or json file, like [pascal_voc.py](https://github.com/open-mmlab/mmdetection/blob/master/tools/convert_datasets/pascal_voc.py).
Then you can simply use `CustomDataset`.
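For the offline route, a minimal sketch could look like the following; `convert_my_annotations` is a hypothetical helper standing in for your own parsing logic.
```python
import mmcv


def convert_my_annotations(ann_file):
    """Parse `ann_file` into the middle format described above.

    This is a hypothetical placeholder; fill in your own parsing logic.
    """
    data_infos = []
    # parse the file and append dicts with 'filename', 'width',
    # 'height' and 'ann' keys
    return data_infos


data_infos = convert_my_annotations('annotation.txt')
mmcv.dump(data_infos, 'my_annotations.pkl')
# `CustomDataset` can then load 'my_annotations.pkl' through its `ann_file` argument
```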
### An example of customized dataset
Assume the annotations are in a new format stored in text files.
The bounding box annotations are stored in the text file `annotation.txt` as follows:
```
#
000001.jpg
1280 720
2
10 20 40 60 1
20 40 50 60 2
#
000002.jpg
1280 720
3
50 20 40 60 2
20 40 30 45 2
30 40 50 60 3
```
We can create a new dataset in `mmdet/datasets/my_dataset.py` to load the data.
```python
import mmcv
import numpy as np
from .builder import DATASETS
from .custom import CustomDataset
@DATASETS.register_module()
class MyDataset(CustomDataset):
CLASSES = ('person', 'bicycle', 'car', 'motorcycle')
def load_annotations(self, ann_file):
ann_list = mmcv.list_from_file(ann_file)
data_infos = []
for i, ann_line in enumerate(ann_list):
if ann_line != '#':
continue
img_shape = ann_list[i + 2].split(' ')
width = int(img_shape[0])
height = int(img_shape[1])
bbox_number = int(ann_list[i + 3])
            bboxes = []
            labels = []
            for anns in ann_list[i + 4:i + 4 + bbox_number]:
                # each annotation line is 'x1 y1 x2 y2 label'
                anns = anns.split(' ')
                bboxes.append([float(ann) for ann in anns[:4]])
                labels.append(int(anns[4]))
data_infos.append(
dict(
filename=ann_list[i + 1],
width=width,
height=height,
ann=dict(
bboxes=np.array(bboxes).astype(np.float32),
labels=np.array(labels).astype(np.int64))
))
return data_infos
def get_ann_info(self, idx):
return self.data_infos[idx]['ann']
```
Then in the config, to use `MyDataset`, you can modify the config as follows:
```python
dataset_A_train = dict(
type='MyDataset',
ann_file = 'image_list.txt',
pipeline=train_pipeline
)
```
## Customize datasets by mixing dataset
MMDetection also supports to mix dataset for training.
Currently it supports to concat and repeat datasets.
### Repeat dataset
We use `RepeatDataset` as a wrapper to repeat the dataset. For example, suppose the original dataset is `Dataset_A`; to repeat it, the config looks like the following:
```python
dataset_A_train = dict(
type='RepeatDataset',
times=N,
dataset=dict( # This is the original config of Dataset_A
type='Dataset_A',
...
pipeline=train_pipeline
)
)
```
### Class balanced dataset
We use `ClassBalancedDataset` as a wrapper to repeat the dataset based on category
frequency. The dataset to repeat needs to implement the method `self.get_cat_ids(idx)`
to support `ClassBalancedDataset` (a sketch of such a method is given after the config below).
For example, to repeat `Dataset_A` with `oversample_thr=1e-3`, the config looks like the following:
```python
dataset_A_train = dict(
type='ClassBalancedDataset',
oversample_thr=1e-3,
dataset=dict( # This is the original config of Dataset_A
type='Dataset_A',
...
pipeline=train_pipeline
)
)
```
You may refer to [source code](../../mmdet/datasets/dataset_wrappers.py) for details.
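As referenced above, here is a minimal sketch of `get_cat_ids` that extends the `MyDataset` example from the previous section; it assumes the labels of each image are stored in `self.data_infos[idx]['ann']['labels']`, as in that example.
```python
import numpy as np

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class MyDataset(CustomDataset):

    # load_annotations / get_ann_info as shown in the previous section

    def get_cat_ids(self, idx):
        """Return the category ids of all instances annotated in image `idx`."""
        labels = self.data_infos[idx]['ann']['labels']
        return np.unique(labels).astype(np.int64).tolist()
```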
### Concatenate dataset
There are two ways to concatenate datasets.
1. If the datasets you want to concatenate are of the same type with different annotation files, you can concatenate the dataset configs like the following.
```python
dataset_A_train = dict(
type='Dataset_A',
ann_file = ['anno_file_1', 'anno_file_2'],
pipeline=train_pipeline
)
```
2. In case the datasets you want to concatenate are different, you can concatenate the dataset configs like the following.
```python
dataset_A_train = dict()
dataset_B_train = dict()
data = dict(
imgs_per_gpu=2,
workers_per_gpu=2,
train = [
dataset_A_train,
dataset_B_train
],
val = dataset_A_val,
test = dataset_A_test
)
```
A more complex example that repeats `Dataset_A` and `Dataset_B` by N and M times, respectively, and then concatenates the repeated datasets is as follows.
```python
dataset_A_train = dict(
type='RepeatDataset',
times=N,
dataset=dict(
type='Dataset_A',
...
pipeline=train_pipeline
)
)
dataset_A_val = dict(
...
pipeline=test_pipeline
)
dataset_A_test = dict(
...
pipeline=test_pipeline
)
dataset_B_train = dict(
type='RepeatDataset',
times=M,
dataset=dict(
type='Dataset_B',
...
pipeline=train_pipeline
)
)
data = dict(
imgs_per_gpu=2,
workers_per_gpu=2,
train = [
dataset_A_train,
dataset_B_train
],
val = dataset_A_val,
test = dataset_A_test
)
```
### Modify classes of existing dataset
With existing dataset types, we can modify their classes to train on a subset of the annotations.
For example, if you want to train only three classes of the current dataset,
you can modify the classes of the dataset.
The dataset will then automatically use the subset of the data that contains at least one class in `classes`.
```python
classes = ('person', 'bicycle', 'car')
data = dict(
train=dict(classes=classes),
val=dict(classes=classes),
test=dict(classes=classes))
```
MMDetection V2.0 also supports reading the classes from a file, which is common in real applications.
For example, assume the `classes.txt` contains the name of classes as the following.
```
person
bicycle
car
```
Users can set the classes as a file path; the dataset will load it and convert it to a list automatically.
```python
classes = 'path/to/classes.txt'
data = dict(
train=dict(classes=classes),
val=dict(classes=classes),
test=dict(classes=classes))
```
# Tutorial 4: Adding New Modules
## Customize optimizer
A customized optimizer could be defined as follows.
Assume you want to add an optimizer named `MyOptimizer`, which has arguments `a`, `b`, and `c`.
You need to create a new directory named `mmdet/core/optimizer`,
and then implement the new optimizer in a file, e.g., `mmdet/core/optimizer/my_optimizer.py`:
```python
from .registry import OPTIMIZERS
from torch.optim import Optimizer
@OPTIMIZERS.register_module()
class MyOptimizer(Optimizer):
    def __init__(self, a, b, c):
        pass
```
Then add this module in `mmdet/core/optimizer/__init__.py` so that the registry will
find the new module and add it:
```python
from .my_optimizer import MyOptimizer
```
Then you can use `MyOptimizer` in `optimizer` field of config files.
In the configs, the optimizers are defined by the field `optimizer` like the following:
```python
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
```
To use your own optimizer, the field can be changed to
```python
optimizer = dict(type='MyOptimizer', a=a_value, b=b_value, c=c_value)
```
We already support using all the optimizers implemented by PyTorch, and the only modification needed is to change the `optimizer` field of config files.
For example, if you want to use `Adam` (though the performance will drop a lot), the modification could be as follows.
```python
optimizer = dict(type='Adam', lr=0.0003, weight_decay=0.0001)
```
The users can directly set arguments following the [API doc](https://pytorch.org/docs/stable/optim.html?highlight=optim#module-torch.optim) of PyTorch.
## Customize optimizer constructor
Some models may have parameter-specific settings for optimization, e.g., weight decay for BatchNorm layers.
The users can do such fine-grained parameter tuning by customizing the optimizer constructor.
```python
from mmcv.utils import build_from_cfg
from mmcv.runner.optimizer import OPTIMIZER_BUILDERS, OPTIMIZERS
from mmdet.utils import get_root_logger
from .my_optimizer import MyOptimizer
@OPTIMIZER_BUILDERS.register_module()
class MyOptimizerConstructor(object):
    def __init__(self, optimizer_cfg, paramwise_cfg=None):
        pass
def __call__(self, model):
return my_optimizer
```
## Develop new components
We basically categorize model components into 4 types.
- backbone: usually an FCN network to extract feature maps, e.g., ResNet, MobileNet.
- neck: the component between backbones and heads, e.g., FPN, PAFPN.
- head: the component for specific tasks, e.g., bbox prediction and mask prediction.
- roi extractor: the part for extracting RoI features from feature maps, e.g., RoI Align.
### Add new backbones
Here we show how to develop new components with an example of MobileNet.
1. Create a new file `mmdet/models/backbones/mobilenet.py`.
```python
import torch.nn as nn
from ..registry import BACKBONES
@BACKBONES.register_module()
class MobileNet(nn.Module):
def __init__(self, arg1, arg2):
pass
def forward(self, x): # should return a tuple
pass
def init_weights(self, pretrained=None):
pass
```
2. Import the module in `mmdet/models/backbones/__init__.py`.
```python
from .mobilenet import MobileNet
```
3. Use it in your config file.
```python
model = dict(
...
backbone=dict(
type='MobileNet',
arg1=xxx,
arg2=xxx),
...
```
### Add new necks
Here we take PAFPN as an example.
1. Create a new file in `mmdet/models/necks/pafpn.py`.
```python
import torch.nn as nn

from ..registry import NECKS


@NECKS.register_module()
class PAFPN(nn.Module):
def __init__(self,
in_channels,
out_channels,
num_outs,
start_level=0,
end_level=-1,
add_extra_convs=False):
pass
def forward(self, inputs):
# implementation is ignored
pass
```
2. Import the module in `mmdet/models/necks/__init__.py`.
```python
from .pafpn import PAFPN
```
3. Modify the config file.
```python
neck=dict(
type='PAFPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5)
```
### Add new heads
Here we show how to develop a new head with the example of [Double Head R-CNN](https://arxiv.org/abs/1904.06493).
First, add a new bbox head in `mmdet/models/bbox_heads/double_bbox_head.py`.
Double Head R-CNN implements a new bbox head for object detection.
To implement a bbox head, we basically need to implement three functions of the new module, as follows.
```python
@HEADS.register_module()
class DoubleConvFCBBoxHead(BBoxHead):
r"""Bbox head used in Double-Head R-CNN
                                      /-> cls
                  /-> shared convs ->
                                      \-> reg
    roi features
                                      /-> cls
                  \-> shared fc    ->
                                      \-> reg
""" # noqa: W605
def __init__(self,
num_convs=0,
num_fcs=0,
conv_out_channels=1024,
fc_out_channels=1024,
conv_cfg=None,
norm_cfg=dict(type='BN'),
**kwargs):
kwargs.setdefault('with_avg_pool', True)
super(DoubleConvFCBBoxHead, self).__init__(**kwargs)
def init_weights(self):
# conv layers are already initialized by ConvModule
def forward(self, x_cls, x_reg):
```
Second, implement a new RoI Head if necessary. We plan to inherit the new `DoubleHeadRoIHead` from `StandardRoIHead`. We can find that `StandardRoIHead` already implements the following functions.
```python
import torch
from mmdet.core import bbox2result, bbox2roi, build_assigner, build_sampler
from ..builder import HEADS, build_head, build_roi_extractor
from .base_roi_head import BaseRoIHead
from .test_mixins import BBoxTestMixin, MaskTestMixin
@HEADS.register_module()
class StandardRoIHead(BaseRoIHead, BBoxTestMixin, MaskTestMixin):
"""Simplest base roi head including one bbox head and one mask head.
"""
def init_assigner_sampler(self):
def init_bbox_head(self, bbox_roi_extractor, bbox_head):
def init_mask_head(self, mask_roi_extractor, mask_head):
def init_weights(self, pretrained):
def forward_dummy(self, x, proposals):
def forward_train(self,
x,
img_metas,
proposal_list,
gt_bboxes,
gt_labels,
gt_bboxes_ignore=None,
gt_masks=None):
def _bbox_forward(self, x, rois):
def _bbox_forward_train(self, x, sampling_results, gt_bboxes, gt_labels,
img_metas):
def _mask_forward_train(self, x, sampling_results, bbox_feats, gt_masks,
img_metas):
def _mask_forward(self, x, rois=None, pos_inds=None, bbox_feats=None):
def simple_test(self,
x,
proposal_list,
img_metas,
proposals=None,
rescale=False):
"""Test without augmentation."""
```
Double Head's modification is mainly in the bbox_forward logic, and it inherits the rest of the logic from `StandardRoIHead`.
In `mmdet/models/roi_heads/double_roi_head.py`, we implement the new RoI Head as follows:
```python
from ..builder import HEADS
from .standard_roi_head import StandardRoIHead
@HEADS.register_module()
class DoubleHeadRoIHead(StandardRoIHead):
"""RoI head for Double Head RCNN
https://arxiv.org/abs/1904.06493
"""
def __init__(self, reg_roi_scale_factor, **kwargs):
super(DoubleHeadRoIHead, self).__init__(**kwargs)
self.reg_roi_scale_factor = reg_roi_scale_factor
def _bbox_forward(self, x, rois):
bbox_cls_feats = self.bbox_roi_extractor(
x[:self.bbox_roi_extractor.num_inputs], rois)
bbox_reg_feats = self.bbox_roi_extractor(
x[:self.bbox_roi_extractor.num_inputs],
rois,
roi_scale_factor=self.reg_roi_scale_factor)
if self.with_shared_head:
bbox_cls_feats = self.shared_head(bbox_cls_feats)
bbox_reg_feats = self.shared_head(bbox_reg_feats)
cls_score, bbox_pred = self.bbox_head(bbox_cls_feats, bbox_reg_feats)
bbox_results = dict(
cls_score=cls_score,
bbox_pred=bbox_pred,
bbox_feats=bbox_cls_feats)
return bbox_results
```
Last, the users need to add the module in `mmdet/models/bbox_heads/__init__.py` and `mmdet/models/roi_heads/__init__.py` so that the corresponding registries can find and load them.
The config file of Double Head R-CNN is as follows:
```python
_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
model = dict(
roi_head=dict(
type='DoubleHeadRoIHead',
reg_roi_scale_factor=1.3,
bbox_head=dict(
_delete_=True,
type='DoubleConvFCBBoxHead',
num_convs=4,
num_fcs=2,
in_channels=256,
conv_out_channels=1024,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=80,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=2.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=2.0))))
```
Since MMDetection 2.0, the config system supports inheriting configs so that users can focus on their modifications.
The Double Head R-CNN mainly uses a new `DoubleHeadRoIHead` and a new
`DoubleConvFCBBoxHead`; their arguments are set according to the `__init__` function of each module.
### Add new loss
Assume you want to add a new loss `MyLoss` for bounding box regression.
To add a new loss function, the users need to implement it in `mmdet/models/losses/my_loss.py`.
The decorator `weighted_loss` enables the loss to be weighted for each element.
```python
import torch
import torch.nn as nn
from ..builder import LOSSES
from .utils import weighted_loss
@weighted_loss
def my_loss(pred, target):
assert pred.size() == target.size() and target.numel() > 0
loss = torch.abs(pred - target)
return loss
@LOSSES.register_module()
class MyLoss(nn.Module):
def __init__(self, reduction='mean', loss_weight=1.0):
super(MyLoss, self).__init__()
self.reduction = reduction
self.loss_weight = loss_weight
def forward(self,
pred,
target,
weight=None,
avg_factor=None,
reduction_override=None):
assert reduction_override in (None, 'none', 'mean', 'sum')
reduction = (
reduction_override if reduction_override else self.reduction)
loss_bbox = self.loss_weight * my_loss(
pred, target, weight, reduction=reduction, avg_factor=avg_factor)
return loss_bbox
```
Then the users need to add it in the `mmdet/models/losses/__init__.py`.
```python
from .my_loss import MyLoss, my_loss
```
To use it, modify the `loss_xxx` field.
Since `MyLoss` is for regression, you need to modify the `loss_bbox` field in the head.
```python
loss_bbox=dict(type='MyLoss', loss_weight=1.0))
```
# Tutorial 6: Waymo Dataset
This page provides specific tutorials about the usage of MMDetection3D for the Waymo dataset.
......
We provide many useful tools under the `tools/` directory.
# Log Analysis
You can plot loss/mAP curves given a training log file. Run `pip install seaborn` first to install the dependency.
![loss curve image](../resources/loss_curve.png)
```shell
python tools/analyze_logs.py plot_curve [--keys ${KEYS}] [--title ${TITLE}] [--legend ${LEGEND}] [--backend ${BACKEND}] [--style ${STYLE}] [--out ${OUT_FILE}]
```
Examples:
- Plot the classification loss of some run.
```shell
python tools/analyze_logs.py plot_curve log.json --keys loss_cls --legend loss_cls
```
- Plot the classification and regression loss of some run, and save the figure to a pdf.
```shell
python tools/analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --out losses.pdf
```
- Compare the bbox mAP of two runs in the same figure.
```shell
python tools/analyze_logs.py plot_curve log1.json log2.json --keys bbox_mAP --legend run1 run2
```
You can also compute the average training speed.
```shell
python tools/analyze_logs.py cal_train_time log.json [--include-outliers]
```
The output is expected to be like the following.
```
-----Analyze train time of work_dirs/some_exp/20190611_192040.log.json-----
slowest epoch 11, average time is 1.2024
fastest epoch 1, average time is 1.1909
time std over epochs is 0.0028
average iter time: 1.1959 s/iter
```
# Visualization
To see the SUNRGBD, ScanNet or KITTI points and detection results, you can run the following command
```bash
python tools/test.py ${CONFIG_FILE} ${CKPT_PATH} --show --show-dir ${SHOW_DIR}
```
After running this command, the plotted results, i.e., the `***_points.obj` and `***_pred.ply` files, will be saved in `${SHOW_DIR}`.
To see the points, detection results and ground truth of SUNRGBD, ScanNet or KITTI during evaluation time, you can run the following command
```bash
python tools/test.py ${CONFIG_FILE} ${CKPT_PATH} --eval 'mAP' --options 'show=True' 'out_dir=${SHOW_DIR}'
```
After running this command, you will obtain the `***_points.obj`, `***_pred.ply` and `***_gt.ply` files in `${SHOW_DIR}`.
You can use 3D visualization software such as [MeshLab](http://www.meshlab.net/) to open these files under `${SHOW_DIR}` to see the 3D detection output. Specifically, open `***_points.obj` to see the input point cloud and open `***_pred.ply` to see the predicted 3D bounding boxes. This allows the inference and results generation to be done on a remote server, and the users can open them on their host machine with a GUI.
**Notice**: The visualization API is a little unstable since we plan to refactor these parts together with MMDetection in the future.
# Model Complexity
You can use `tools/get_flops.py` in MMDetection, a script adapted from [flops-counter.pytorch](https://github.com/sovrasov/flops-counter.pytorch), to compute the FLOPs and params of a given model.
```shell
python tools/get_flops.py ${CONFIG_FILE} [--shape ${INPUT_SHAPE}]
```
You will get the results like this.
```text
==============================
Input shape: (3, 1280, 800)
Flops: 239.32 GFLOPs
Params: 37.74 M
==============================
```
**Note**: This tool is still experimental and we do not guarantee that the
number is absolutely correct. You may use the result for simple
comparisons, but double-check it before you adopt it in technical reports or papers.
1. FLOPs are related to the input shape while parameters are not. The default
input shape is (1, 3, 1280, 800).
2. Some operators are not counted into FLOPs like GN and custom operators. Refer to [`mmcv.cnn.get_model_complexity_info()`](https://github.com/open-mmlab/mmcv/blob/master/mmcv/cnn/utils/flops_counter.py) for details.
3. The FLOPs of two-stage detectors are dependent on the number of proposals.
# Model Conversion
## RegNet model to MMDetection
`tools/regnet2mmdet.py` converts keys in pycls pretrained RegNet models to
MMDetection style.
```shell
python tools/regnet2mmdet.py ${SRC} ${DST} [-h]
```
## Detectron ResNet to Pytorch
`tools/detectron2pytorch.py` in MMDetection could convert keys in the original detectron pretrained
ResNet models to PyTorch style.
```shell
python tools/detectron2pytorch.py ${SRC} ${DST} ${DEPTH} [-h]
```
## Prepare a model for publishing
`tools/publish_model.py` helps users to prepare their model for publishing.
Before you upload a model to AWS, you may want to
1. convert model weights to CPU tensors
2. delete the optimizer states and
3. compute the hash of the checkpoint file and append the hash id to the
filename.
```shell
python tools/publish_model.py ${INPUT_FILENAME} ${OUTPUT_FILENAME}
```
E.g.,
```shell
python tools/publish_model.py work_dirs/faster_rcnn/latest.pth faster_rcnn_r50_fpn_1x_20190801.pth
```
The final output filename will be `faster_rcnn_r50_fpn_1x_20190801-{hash id}.pth`.
# Dataset Conversion
`tools/data_converter/` contains tools to convert datasets to other formats. Most of them convert datasets to pickle-based info files, e.g., for KITTI, nuScenes and Lyft. The Waymo converter is used to reorganize the Waymo raw data in KITTI style. Users could refer to them for our approach to converting data formats. It is also convenient to modify them and use them as scripts, like the nuImages converter.
To convert the nuImages dataset into COCO format, please use the command below:
```shell
python -u tools/data_converter/nuimage_converter.py --data-root ${DATA_ROOT} --version ${VERSIONS} \
--out-dir ${OUT_DIR} --nproc ${NUM_WORKERS} --extra-tag ${TAG}
```
- `--data-root`: the root of the dataset, defaults to `./data/nuimages`.
- `--version`: the version of the dataset, defaults to `v1.0-mini`. To get the full dataset, please use `--version v1.0-train v1.0-val v1.0-mini`
- `--out-dir`: the output directory of annotations and semantic masks, defaults to `./data/nuimages/annotations/`.
- `--nproc`: number of workers for data preparation, defaults to `4`. Larger number could reduce the preparation time as images are processed in parallel.
- `--extra-tag`: extra tag of the annotations, defaults to `nuimages`. This can be used to separate annotations processed at different times for study.
For more details, please refer to the [doc](data_preparation.md) for dataset preparation and the [README](../configs/nuimages/README.md) for the nuImages dataset.
# Miscellaneous
## Print the entire config
`tools/print_config.py` prints the whole config verbatim, expanding all its
imports.
```shell
python tools/print_config.py ${CONFIG} [-h] [--options ${OPTIONS [OPTIONS...]}]
```