Commit cd8d4079 authored by zhangwenwei

Refactor docs

parent b107238d
All kinds of contributions are welcome, including but not limited to the following.
- Fixes (typo, bugs)
- New features and components
Note
- If you plan to add some new features that involve large changes, it is encouraged to open an issue for discussion first.
- If you are the author of some papers and would like to include your method in mmdetection,
please contact Kai Chen (chenkaidev[at]gmail[dot]com) and Wenwei Zhang (zwwdev[at]gmail[dot]com). We would much appreciate your contribution.
## Code style
We use the following tools for linting and formatting:
- [flake8](http://flake8.pycqa.org/en/latest/): linter
- [yapf](https://github.com/google/yapf): formatter
- [isort](https://github.com/timothycrosley/isort): sort imports
Style configurations of yapf and isort can be found in [.style.yapf](../.style.yapf) and [.isort.cfg](../.isort.cfg).
We use [pre-commit hooks](https://pre-commit.com/) that run `flake8`, `yapf`, and `isort`, trim trailing whitespace,
fix end-of-file newlines, and sort `requirements.txt` automatically on every commit.
The config for the pre-commit hooks is stored in [.pre-commit-config](../.pre-commit-config.yaml).
After you clone the repository, you will need to install and initialize the pre-commit hook.
```
pip install -U pre-commit
```
From the repository folder, run
```
pre-commit install
```
After this, the code linters and formatter will be enforced on every commit.
>Before you create a PR, make sure that your code lints and is formatted by yapf.
blank_issues_enabled: false
---
name: Error report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
Thanks for your error report and we appreciate it a lot.
**Checklist**
1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
**Describe the bug**
A clear and concise description of what the bug is.
**Reproduction**
1. What command or script did you run?
```
A placeholder for the command.
```
2. Did you make any modifications on the code or config? Did you understand what you have modified?
3. What dataset did you use?
**Environment**
1. Please run `python mmdet/utils/collect_env.py` to collect the necessary environment information and paste it here.
2. You may add additional information that may be helpful for locating the problem, such as
- How you installed PyTorch [e.g., pip, conda, source]
- Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
**Error traceback**
If applicable, paste the error traceback here.
```
A placeholder for the traceback.
```
**Bug fix**
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
**Describe the feature**
**Motivation**
A clear and concise description of the motivation of the feature.
Ex1. It is inconvenient when [....].
Ex2. There is a recent paper [....], which is very helpful for [....].
**Related resources**
If there is an official code release or third-party implementations, please also provide the information here, which would be very helpful.
**Additional context**
Add any other context or screenshots about the feature request here.
If you would like to implement the feature and create a PR, please leave a comment here and that would be much appreciated.
---
name: General questions
about: Ask general questions to get help
title: ''
labels: ''
assignees: ''
---
---
name: Reimplementation Questions
about: Ask about questions during model reimplementation
title: ''
labels: 'reimplementation'
assignees: ''
---
**Notice**
There are several common situations in the reimplementation issues, as listed below.
1. Reimplement a model in the model zoo using the provided configs
2. Reimplement a model in the model zoo on other datasets (e.g., custom datasets)
3. Reimplement a custom model but all the components are implemented in MMDetection
4. Reimplement a custom model with new modules implemented by yourself
There are several things to do for the different cases, as below.
- For cases 1 & 3, please follow the steps in the following sections so that we can quickly identify the issue.
- For cases 2 & 4, please understand that we cannot offer much help here because we usually do not know the full code, and the users should be responsible for the code they write.
- One suggestion for cases 2 & 4 is to first check whether the bug lies in the self-implemented code or in the original code. For example, users can first make sure that the same model runs well on supported datasets. If you still need help, please describe what you have done and what you obtained in the issue, follow the steps in the following sections, and be as clear as possible so that we can better help you.
**Checklist**
1. I have searched related issues but cannot get the expected help.
2. The issue has not been fixed in the latest version.
**Describe the issue**
A clear and concise description of the problem you met and what you have done.
**Reproduction**
1. What command or script did you run?
```
A placeholder for the command.
```
2. Which config did you run?
```
A placeholder for the config.
```
3. Did you make any modifications on the code or config? Did you understand what you have modified?
4. What dataset did you use?
**Environment**
1. Please run `python mmdet/utils/collect_env.py` to collect the necessary environment information and paste it here.
2. You may add additional information that may be helpful for locating the problem, such as
- How you installed PyTorch [e.g., pip, conda, source]
- Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
**Results**
If applicable, paste the related results here, e.g., what you expect and what you get.
```
A placeholder for results comparison
```
**Issue fix**
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
# MMDetection3D
**News**: We released the codebase v0.1.0.
Documentation: https://mmdetection3d.readthedocs.io/
## Introduction
The master branch works with **PyTorch 1.3 to 1.5**.
MMDetection3D is an open source 3D object detection toolbox based on PyTorch. It is
a part of the OpenMMLab project developed by [MMLab](http://mmlab.ie.cuhk.edu.hk/).
![demo image](demo/coco_test_12510.jpg)
### Major features
- **Modular Design**
We decompose the detection framework into different components and one can easily construct a customized object detection framework by combining different modules.
- **Support of multiple frameworks out of the box**
The toolbox directly supports popular and contemporary detection frameworks, *e.g.* Faster RCNN, Mask RCNN, RetinaNet, etc.
- **High efficiency**
The training speed is [faster than that of other codebases](./docs/benchmarks.md).
- **State of the art**
The accuracy of the models is [higher than that of other codebases](./docs/benchmarks.md).
Apart from MMDetection3D, we also released [MMDetection](https://github.com/open-mmlab/mmdetection) and [mmcv](https://github.com/open-mmlab/mmcv) for computer vision research, on which this toolbox heavily depends.
## License
This project is released under the [Apache 2.0 license](LICENSE).
## Changelog
v0.1.0 was released on 24/6/2020.
Please refer to [changelog.md](docs/changelog.md) for details and release history.
## Benchmark and model zoo
Supported methods and backbones are shown in the table below.
Results and models are available in the [model zoo](docs/model_zoo.md).
| | ResNet | ResNeXt | SENet |PointNet++ | HRNet | RegNetX | Res2Net |
|--------------------|:--------:|:--------:|:--------:|:---------:|:-----:|:--------:|:-----:|
| SECOND | ☐ | ☐ | ☐ | ✗ | ✓ | ✓ | ☐ |
| PointPillars | ☐ | ☐ | ☐ | ✗ | ✓ | ✓ | ☐ |
| VoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Part-A2 | ☐ | ☐ | ☐ | ✗ | ✓ | ✓ | ☐ |
| MVXNet | ☐ | ☐ | ☐ | ✗ | ✓ | ✓ | ☐ |
Other features
- [x] Dynamic Voxelization
**Notice**: All the models or modules supported in [MMDetection's model zoo](https://github.com/open-mmlab/mmdetection/blob/master/docs/model_zoo.md) can be trained or used in this codebase.
## Installation
Please refer to [install.md](docs/install.md) for installation and dataset preparation.
## Get Started
Please see [getting_started.md](docs/getting_started.md) for the basic usage of MMDetection3D. There are also tutorials for [finetuning models](docs/tutorials/finetune.md), [adding new datasets](docs/tutorials/new_dataset.md), [designing data pipelines](docs/tutorials/data_pipeline.md), and [adding new modules](docs/tutorials/new_modules.md).
## Contributing
We appreciate all contributions to improve MMDetection3D. Please refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for the contributing guideline.
## Acknowledgement
MMDetection3D is an open source project that is contributed to by researchers and engineers from various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as users who give valuable feedback.
We wish that the toolbox and benchmark could serve the growing research community by providing a flexible toolkit to reimplement existing methods and develop their own new 3D detectors.
## Citation
If you use this toolbox or benchmark in your research, please cite this project.
```
@misc{mmdetection3d_2020,
title = {{MMDetection3D}},
  author = {Zhang, Wenwei and Wu, Yuefeng and Li, Yinhao and Lin, Kwan-Yee and
            Qian, Chen and Shi, Jianping and Chen, Kai and Li, Hongsheng and
            Lin, Dahua and Loy, Chen Change},
howpublished = {\url{https://github.com/open-mmlab/mmdetection3d}},
year = {2020}
}
```
## Contact
This repo is currently maintained by Wenwei Zhang ([@ZwwWayne](https://github.com/ZwwWayne)).
## Changelog
### v2.0.0 (6/5/2020)
In this release, we made lots of major refactoring and modifications.
1. **Faster speed**. We optimize the training and inference speed for common models, achieving up to 30% speedup for training and 25% for inference. Please refer to [model zoo](model_zoo.md#comparison-with-detectron2) for details.
2. **Higher performance**. We change some default hyperparameters with no additional cost, which leads to a gain of performance for most models. Please refer to [compatibility](compatibility.md#training-hyperparameters) for details.
3. **More documentation and tutorials**. We add a bunch of documentation and tutorials to help users get started more smoothly. Read it [here](https://mmdetection.readthedocs.io/en/latest/).
4. **Support PyTorch 1.5**. The support for 1.1 and 1.2 is dropped, and we switch to some new APIs.
5. **Better configuration system**. Inheritance is supported to reduce the redundancy of configs; see the sketch after this list.
6. **Better modular design**. Towards the goal of simplicity and flexibility, we simplify some encapsulation while adding more configurable modules like BBoxCoder, IoUCalculator, OptimizerConstructor, and RoIHead. Target computation is also included in heads and the call hierarchy is simpler.
7. Support new methods: [FSAF](https://arxiv.org/abs/1903.00621) and PAFPN (part of [PANet](https://arxiv.org/abs/1803.01534)).
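As a quick illustration of the new config inheritance in item 5, a child config can start from a `_base_` file and override only the fields that differ. A minimal sketch (the file names are hypothetical; real configs live under `configs/`):
```python
# faster_rcnn_r50_fpn_2x_coco.py -- hypothetical child config.
# Everything is inherited from the 1x baseline via `_base_`;
# only the schedule-related fields are overridden.
_base_ = './faster_rcnn_r50_fpn_1x_coco.py'

lr_config = dict(step=[16, 22])  # decay steps for the 2x schedule
total_epochs = 24                # 2x = 24 epochs
```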
**Breaking Changes**
Models trained with mmdetection 1.x are not fully compatible with 2.0; please refer to the [compatibility doc](compatibility.md) for the details and how to migrate to the new version.
**Improvements**
- Unify cuda and cpp API for custom ops. (#2277)
- New config files with inheritance. (#2216)
- Encapsulate the second stage into RoI heads. (#1999)
- Refactor GCNet/EmpiricalAttention into plugins. (#2345)
- Set low quality match as an option in IoU-based bbox assigners. (#2375)
- Change the codebase's coordinate system. (#2380)
- Refactor the category order in heads. 0 means the first positive class instead of background now. (#2374)
- Add bbox sampler and assigner registry. (#2419)
- Speed up the inference of RPN. (#2420)
- Add `train_cfg` and `test_cfg` as class members in all anchor heads. (#2422)
- Merge target computation methods into heads. (#2429)
- Add bbox coder to support different bbox encoding and losses. (#2480)
- Unify the API for regression loss. (#2156)
- Refactor Anchor Generator. (#2474)
- Make `lr` an optional argument for optimizers. (#2509)
- Migrate to modules and methods in MMCV. (#2502, #2511, #2569, #2572)
- Support PyTorch 1.5. (#2524)
- Drop the support for Python 3.5 and use F-string in the codebase. (#2531)
**Bug Fixes**
- Fix the scale factors for resized images without keeping the aspect ratio. (#2039)
- Check if max_num > 0 before slicing in NMS. (#2486)
- Fix Deformable RoIPool when there is no instance. (#2490)
- Fix the default value of assigned labels. (#2536)
- Fix the evaluation of Cityscapes. (#2578)
**New Features**
- Add deep_stem and avg_down option to ResNet, i.e., support ResNetV1d. (#2252)
- Add L1 loss. (#2376)
- Support both polygon and bitmap for instance masks. (#2353, #2540)
- Support CPU mode for inference. (#2385)
- Add optimizer constructor for complicated configuration of optimizers. (#2397, #2488)
- Implement PAFPN. (#2392)
- Support empty tensor input for some modules. (#2280)
- Support for custom dataset classes without overriding it. (#2408, #2443)
- Support training on subsets of the COCO dataset. (#2340)
- Add iou_calculator to potentially support more IoU calculation methods. (#2405)
- Support class wise mean AP (was removed in the last version). (#2459)
- Add option to save the testing result images. (#2414)
- Support MomentumUpdaterHook. (#2571)
- Add a demo to inference a single image. (#2605)
### v1.1.0 (24/2/2020)
**Highlights**
- Dataset evaluation is rewritten with a unified api, which is used by both evaluation hooks and test scripts.
- Support new methods: [CARAFE](https://arxiv.org/abs/1905.02188).
**Breaking Changes**
- The new MMDDP inherits from the official DDP, thus the `__init__` api is changed to be the same as official DDP.
- The `mask_head` field in HTC config files is modified.
- The evaluation and testing script is updated.
- In all transforms, instance masks are stored as a numpy array shaped (n, h, w) instead of a list of (h, w) arrays, where n is the number of instances.
**Bug Fixes**
- Fix IoU assigners when ignore_iof_thr > 0 and there are no pred boxes. (#2135)
- Fix mAP evaluation when there are no ignored boxes. (#2116)
- Fix the empty RoI input for Deformable RoI Pooling. (#2099)
- Fix the dataset settings for multiple workflows. (#2103)
- Fix the warning related to `torch.uint8` in PyTorch 1.4. (#2105)
- Fix the inference demo on devices other than gpu:0. (#2098)
- Fix Dockerfile. (#2097)
- Fix the bug that `pad_val` is unused in Pad transform. (#2093)
- Fix the albumentation transform when there is no ground truth bbox. (#2032)
**Improvements**
- Use torch instead of numpy for random sampling. (#2094)
- Migrate to the new MMDDP implementation in MMCV v0.3. (#2090)
- Add meta information in logs. (#2086)
- Rewrite Soft NMS with pytorch extension and remove cython as a dependency. (#2056)
- Rewrite dataset evaluation. (#2042, #2087, #2114, #2128)
- Use numpy array for masks in transforms. (#2030)
**New Features**
- Implement "CARAFE: Content-Aware ReAssembly of FEatures". (#1583)
- Add `worker_init_fn()` in data_loader when seed is set. (#2066, #2111)
- Add logging utils. (#2035)
### v1.0.0 (30/1/2020)
This release mainly improves the code quality and adds more docstrings.
**Highlights**
- Documentation is online now: https://mmdetection.readthedocs.io.
- Support new models: [ATSS](https://arxiv.org/abs/1912.02424).
- DCN is now available with the api `build_conv_layer` and `ConvModule` like the normal conv layer.
- A tool to collect environment information is available for troubleshooting.
**Bug Fixes**
- Fix the incompatibility of the latest numpy and pycocotools. (#2024)
- Fix the case when distributed package is unavailable, e.g., on Windows. (#1985)
- Fix the dimension issue for `refine_bboxes()`. (#1962)
- Fix the typo when `seg_prefix` is a list. (#1906)
- Add segmentation map cropping to RandomCrop. (#1880)
- Fix the return value of `ga_shape_target_single()`. (#1853)
- Fix the loaded shape of empty proposals. (#1819)
- Fix the mask data type when using albumentation. (#1818)
**Improvements**
- Enhance AssignResult and SamplingResult. (#1995)
- Add ability to overwrite existing module in Registry. (#1982)
- Reorganize requirements and make albumentations and imagecorruptions optional. (#1969)
- Check NaN in `SSDHead`. (#1935)
- Encapsulate the DCN in ResNe(X)t into a ConvModule & Conv_layers. (#1894)
- Refactoring for mAP evaluation and support multiprocessing and logging. (#1889)
- Init the root logger before constructing Runner to log more information. (#1865)
- Split `SegResizeFlipPadRescale` into different existing transforms. (#1852)
- Move `init_dist()` to MMCV. (#1851)
- Documentation and docstring improvements. (#1971, #1938, #1869, #1838)
- Fix the color of the same class for mask visualization. (#1834)
- Remove the option `keep_all_stages` in HTC and Cascade R-CNN. (#1806)
**New Features**
- Add two test-time options `crop_mask` and `rle_mask_encode` for mask heads. (#2013)
- Support loading grayscale images as single channel. (#1975)
- Implement "Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection". (#1872)
- Add sphinx generated docs. (#1859, #1864)
- Add GN support for flops computation. (#1850)
- Collect env info for troubleshooting. (#1812)
### v1.0rc1 (13/12/2019)
The RC1 release mainly focuses on improving the user experience, and fixing bugs.
**Highlights**
- Support new models: [FoveaBox](https://arxiv.org/abs/1904.03797), [RepPoints](https://arxiv.org/abs/1904.11490) and [FreeAnchor](https://arxiv.org/abs/1909.02466).
- Add a Dockerfile.
- Add a jupyter notebook demo and a webcam demo.
- Setup the code style and CI.
- Add lots of docstrings and unit tests.
- Fix lots of bugs.
**Breaking Changes**
- There was a bug for computing COCO-style mAP w.r.t. different scales (AP_s, AP_m, AP_l), introduced by #621. (#1679)
**Bug Fixes**
- Fix a sampling interval bug in Libra R-CNN. (#1800)
- Fix the learning rate in SSD300 WIDER FACE. (#1781)
- Fix the scaling issue when `keep_ratio=False`. (#1730)
- Fix typos. (#1721, #1492, #1242, #1108, #1107)
- Fix the shuffle argument in `build_dataloader`. (#1693)
- Clip the proposal when computing mask targets. (#1688)
- Fix the "index out of range" bug for samplers in some corner cases. (#1610, #1404)
- Fix the NMS issue on devices other than GPU:0. (#1603)
- Fix SSD Head and GHM Loss on CPU. (#1578)
- Fix the OOM error when there are too many gt bboxes. (#1575)
- Fix the wrong keyword argument `nms_cfg` in HTC. (#1573)
- Process masks and semantic segmentation in Expand and MinIoUCrop transforms. (#1550, #1361)
- Fix a scale bug in the Non Local op. (#1528)
- Fix a bug in transforms when `gt_bboxes_ignore` is None. (#1498)
- Fix a bug when `img_prefix` is None. (#1497)
- Pass the device argument to `grid_anchors` and `valid_flags`. (#1478)
- Fix the data pipeline for test_robustness. (#1476)
- Fix the argument type of deformable pooling. (#1390)
- Fix the coco_eval when there are only two classes. (#1376)
- Fix a bug in Modulated DeformableConv when deformable_group>1. (#1359)
- Fix the mask cropping in RandomCrop. (#1333)
- Fix zero outputs in DeformConv when not running on cuda:0. (#1326)
- Fix the type issue in Expand. (#1288)
- Fix the inference API. (#1255)
- Fix the inplace operation in Expand. (#1249)
- Fix the from-scratch training config. (#1196)
- Fix inplace add in RoIExtractor which caused an error in PyTorch 1.2. (#1160)
- Fix FCOS when input images have no positive samples. (#1136)
- Fix recursive imports. (#1099)
**Improvements**
- Print the config file and mmdet version in the log. (#1721)
- Lint the code before compiling in travis CI. (#1715)
- Add a probability argument for the `Expand` transform. (#1651)
- Update the PyTorch and CUDA version in the docker file. (#1615)
- Raise a warning when specifying `--validate` in non-distributed training. (#1624, #1651)
- Beautify the mAP printing. (#1614)
- Add pre-commit hook. (#1536)
- Add the argument `in_channels` to backbones. (#1475)
- Add lots of docstrings and unit tests, thanks to [@Erotemic](https://github.com/Erotemic). (#1603, #1517, #1506, #1505, #1491, #1479, #1477, #1475, #1474)
- Add support for multi-node distributed test when there is no shared storage. (#1399)
- Optimize Dockerfile to reduce the image size. (#1306)
- Update new results of HRNet. (#1284, #1182)
- Add an argument `no_norm_on_lateral` in FPN. (#1240)
- Test the compiling in CI. (#1235)
- Move docs to a separate folder. (#1233)
- Add a jupyter notebook demo. (#1158)
- Support different types of datasets for training. (#1133)
- Use int64_t instead of long in cuda kernels. (#1131)
- Support unsquare RoIs for bbox and mask heads. (#1128)
- Manually add type promotion to make it compatible with PyTorch 1.2. (#1114)
- Allow a validation dataset for computing validation loss. (#1093)
- Use `.scalar_type()` instead of `.type()` to suppress some warnings. (#1070)
**New Features**
- Add an option `--with_ap` to compute the AP for each class. (#1549)
- Implement "FreeAnchor: Learning to Match Anchors for Visual Object Detection". (#1391)
- Support [Albumentations](https://github.com/albumentations-team/albumentations) for augmentations in the data pipeline. (#1354)
- Implement "FoveaBox: Beyond Anchor-based Object Detector". (#1339)
- Support horizontal and vertical flipping. (#1273, #1115)
- Implement "RepPoints: Point Set Representation for Object Detection". (#1265)
- Add test-time augmentation to HTC and Cascade R-CNN. (#1251)
- Add a COCO result analysis tool. (#1228)
- Add Dockerfile. (#1168)
- Add a webcam demo. (#1155, #1150)
- Add FLOPs counter. (#1127)
- Allow arbitrary layer order for ConvModule. (#1078)
### v1.0rc0 (27/07/2019)
- Implement lots of new methods and components (Mixed Precision Training, HTC, Libra R-CNN, Guided Anchoring, Empirical Attention, Mask Scoring R-CNN, Grid R-CNN (Plus), GHM, GCNet, FCOS, HRNet, Weight Standardization, etc.). Thank all collaborators!
- Support two additional datasets: WIDER FACE and Cityscapes.
- Refactor loss APIs to make it more flexible to adopt different losses and related hyper-parameters.
- Speed up multi-gpu testing.
- Integrate all compiling and installing in a single script.
### v0.6.0 (14/04/2019)
- Up to 30% speedup compared to the model zoo.
- Support both PyTorch stable and nightly version.
- Replace NMS and SigmoidFocalLoss with Pytorch CUDA extensions.
### v0.6rc0 (06/02/2019)
- Migrate to PyTorch 1.0.
### v0.5.7 (06/02/2019)
- Add support for Deformable ConvNet v2. (Many thanks to the authors and [@chengdazhi](https://github.com/chengdazhi))
- This is the last release based on PyTorch 0.4.1.
### v0.5.6 (17/01/2019)
- Add support for Group Normalization.
- Unify RPNHead and single stage heads (RetinaHead, SSDHead) with AnchorHead.
### v0.5.5 (22/12/2018)
- Add SSD for COCO and PASCAL VOC.
- Add ResNeXt backbones and detection models.
- Refactoring for Samplers/Assigners and add OHEM.
- Add VOC dataset and evaluation scripts.
### v0.5.4 (27/11/2018)
- Add SingleStageDetector and RetinaNet.
### v0.5.3 (26/11/2018)
- Add Cascade R-CNN and Cascade Mask R-CNN.
- Add support for Soft-NMS in config files.
### v0.5.2 (21/10/2018)
- Add support for custom datasets.
- Add a script to convert PASCAL VOC annotations to the expected format.
### v0.5.1 (20/10/2018)
- Add BBoxAssigner and BBoxSampler, the `train_cfg` field in config files is restructured.
- `ConvFCRoIHead` / `SharedFCRoIHead` are renamed to `ConvFCBBoxHead` / `SharedFCBBoxHead` for consistency.
# Benchmark and Model Zoo
## Mirror sites
We use AWS as the main site to host our model zoo, and maintain a mirror on aliyun.
You can replace `https://s3.ap-northeast-2.amazonaws.com/open-mmlab` with `https://open-mmlab.oss-cn-beijing.aliyuncs.com` in model urls.
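For example, a checkpoint URL can be rewritten to the mirror with a simple string replacement; a minimal sketch (the checkpoint path below is a placeholder):
```python
# Swap the AWS host for the aliyun mirror in a model-zoo URL.
AWS_HOST = 'https://s3.ap-northeast-2.amazonaws.com/open-mmlab'
ALIYUN_HOST = 'https://open-mmlab.oss-cn-beijing.aliyuncs.com'

def to_mirror(url: str) -> str:
    return url.replace(AWS_HOST, ALIYUN_HOST)

# Placeholder checkpoint path; only the host part is rewritten.
print(to_mirror(AWS_HOST + '/mmdetection/models/some_model.pth'))
```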
## Common settings
- All FPN baselines and RPN-C4 baselines were trained using 8 GPUs with a batch size of 16 (2 images per GPU). Other C4 baselines were trained using 8 GPUs with a batch size of 8 (1 image per GPU).
- All models were trained on `coco_2017_train`, and tested on the `coco_2017_val`.
- We use distributed training and BN layer stats are fixed.
- We adopt the same training schedules as Detectron. 1x indicates 12 epochs and 2x indicates 24 epochs, which corresponds to slightly fewer iterations than Detectron, and the difference can be ignored.
- All pytorch-style pretrained backbones on ImageNet are from PyTorch model zoo.
- For fair comparison with other codebases, we report the GPU memory as the maximum value of `torch.cuda.max_memory_allocated()` for all 8 GPUs. Note that this value is usually less than what `nvidia-smi` shows.
- We report the inference time as the overall time including data loading, network forwarding and post processing.
## Baselines
### RPN
Please refer to [RPN](https://github.com/open-mmlab/mmdetection/blob/master/configs/rpn) for details.
### Faster R-CNN
Please refer to [Faster R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/faster_rcnn) for details.
### Mask R-CNN
Please refer to [Mask R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/mask_rcnn) for details.
### Fast R-CNN (with pre-computed proposals)
Please refer to [Fast R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/fast_rcnn) for details.
### RetinaNet
Please refer to [RetinaNet](https://github.com/open-mmlab/mmdetection/blob/master/configs/retinanet) for details.
### Cascade R-CNN and Cascade Mask R-CNN
Please refer to [Cascade R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/cascade_rcnn) for details.
### Hybrid Task Cascade (HTC)
Please refer to [HTC](https://github.com/open-mmlab/mmdetection/blob/master/configs/htc) for details.
### SSD
Please refer to [SSD](https://github.com/open-mmlab/mmdetection/blob/master/configs/ssd) for details.
### Group Normalization (GN)
Please refer to [Group Normalization](https://github.com/open-mmlab/mmdetection/blob/master/configs/gn) for details.
### Weight Standardization
Please refer to [Weight Standardization](https://github.com/open-mmlab/mmdetection/blob/master/configs/gn+ws) for details.
### Deformable Convolution v2
Please refer to [Deformable Convolutional Networks](https://github.com/open-mmlab/mmdetection/blob/master/configs/dcn) for details.
### CARAFE: Content-Aware ReAssembly of FEatures
Please refer to [CARAFE](https://github.com/open-mmlab/mmdetection/blob/master/configs/carafe) for details.
### Instaboost
Please refer to [Instaboost](https://github.com/open-mmlab/mmdetection/blob/master/configs/instaboost) for details.
### Libra R-CNN
Please refer to [Libra R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/libra_rcnn) for details.
### Guided Anchoring
Please refer to [Guided Anchoring](https://github.com/open-mmlab/mmdetection/blob/master/configs/guided_anchoring) for details.
### FCOS
Please refer to [FCOS](https://github.com/open-mmlab/mmdetection/blob/master/configs/fcos) for details.
### FoveaBox
Please refer to [FoveaBox](https://github.com/open-mmlab/mmdetection/blob/master/configs/foveabox) for details.
### RepPoints
Please refer to [RepPoints](https://github.com/open-mmlab/mmdetection/blob/master/configs/reppoints) for details.
### FreeAnchor
Please refer to [FreeAnchor](https://github.com/open-mmlab/mmdetection/blob/master/configs/free_anchor) for details.
### Grid R-CNN (plus)
Please refer to [Grid R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/grid_rcnn) for details.
### GHM
Please refer to [GHM](https://github.com/open-mmlab/mmdetection/blob/master/configs/ghm) for details.
### GCNet
Please refer to [GCNet](https://github.com/open-mmlab/mmdetection/blob/master/configs/gcnet) for details.
### HRNet
Please refer to [HRNet](https://github.com/open-mmlab/mmdetection/blob/master/configs/hrnet) for details.
### Mask Scoring R-CNN
Please refer to [Mask Scoring R-CNN](https://github.com/open-mmlab/mmdetection/blob/master/configs/ms_rcnn) for details.
### Train from Scratch
Please refer to [Rethinking ImageNet Pre-training](https://github.com/open-mmlab/mmdetection/blob/master/configs/scratch) for details.
### NAS-FPN
Please refer to [NAS-FPN](https://github.com/open-mmlab/mmdetection/blob/master/configs/nas_fpn) for details.
### ATSS
Please refer to [ATSS](https://github.com/open-mmlab/mmdetection/blob/master/configs/atss) for details.
### FSAF
Please refer to [FSAF](https://github.com/open-mmlab/mmdetection/blob/master/configs/fsaf) for details.
### Other datasets
We also benchmark some methods on [PASCAL VOC](https://github.com/open-mmlab/mmdetection/blob/master/configs/pascal_voc), [Cityscapes](https://github.com/open-mmlab/mmdetection/blob/master/configs/cityscapes) and [WIDER FACE](https://github.com/open-mmlab/mmdetection/blob/master/configs/wider_face).
## Speed benchmark
We compare the training speed of Mask R-CNN with some other popular frameworks (The data is copied from [detectron2](https://github.com/facebookresearch/detectron2/blob/master/docs/notes/benchmarks.md)).
| Implementation | Throughput (img/s) |
|----------------------|--------------------|
| [Detectron2](https://github.com/facebookresearch/detectron2) | 61 |
| [MMDetection](https://github.com/open-mmlab/mmdetection) | 60 |
| [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark/) | 51 |
| [tensorpack](https://github.com/tensorpack/tensorpack/tree/master/examples/FasterRCNN) | 50 |
| [simpledet](https://github.com/TuSimple/simpledet/) | 39 |
| [Detectron](https://github.com/facebookresearch/Detectron) | 19 |
| [matterport/Mask_RCNN](https://github.com/matterport/Mask_RCNN/) | 14 |
## Comparison with Detectron2
We compare mmdetection with [Detectron2](https://github.com/facebookresearch/detectron2.git) in terms of speed and performance.
We use the commit id [185c27e](https://github.com/facebookresearch/detectron2/tree/185c27e4b4d2d4c68b5627b3765420c6d7f5a659) (30/4/2020) of Detectron2.
For fair comparison, we install and run both frameworks on the same machine.
### Hardware
- 8 NVIDIA Tesla V100 (32G) GPUs
- Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
### Software environment
- Python 3.7
- PyTorch 1.4
- CUDA 10.1
- CUDNN 7.6.03
- NCCL 2.4.08
### Performance
<table border="1">
<tr>
<th>Type</th>
<th>Lr schd</th>
<th>Detectron2</th>
<th>mmdetection</th>
</tr>
<tr>
<td rowspan="2">Faster R-CNN</td>
<td>1x</td>
<td>37.9</td>
<td>38.0</td>
</tr>
<tr>
<td>3x</td>
<td>40.2</td>
<td>-</td>
</tr>
<tr>
<td rowspan="2">Mask R-CNN</td>
<td>1x</td>
<td>38.6 &amp; 35.2</td>
<td>38.8 &amp; 35.4</td>
</tr>
<tr>
<td>3x</td>
<td>41.0 &amp; 37.2 </td>
<td>-</td>
</tr>
<tr>
<td rowspan="2">Retinanet</td>
<td>1x</td>
<td>36.5</td>
<td>37.0</td>
</tr>
<tr>
<td>3x</td>
<td>37.9</td>
<td>-</td>
</tr>
</table>
### Training Speed
The training speed is measured in s/iter. The lower, the better.
<table border="1">
<tr>
<th>Type</th>
<th>Detectron2</th>
<th>mmdetection</th>
</tr>
<tr>
<td>Faster R-CNN</td>
<td>0.210</td>
<td>0.216</td>
</tr>
<tr>
<td>Mask R-CNN</td>
<td>0.261</td>
<td>0.265</td>
</tr>
<tr>
<td>Retinanet</td>
<td>0.200</td>
<td>0.205</td>
</tr>
</table>
### Inference Speed
The inference speed is measured in fps (img/s) on a single GPU. The higher, the better.
To be consistent with Detectron2, we report the pure inference speed (without the time of data loading).
For Mask R-CNN, we exclude the time of RLE encoding in post-processing.
We also include the officially reported speed in parentheses, which is slightly higher
than the results tested on our server due to differences in hardware.
<table border="1">
<tr>
<th>Type</th>
<th>Detectron2</th>
<th>mmdetection</th>
</tr>
<tr>
<td>Faster R-CNN</td>
<td>25.6 (26.3)</td>
<td>22.2</td>
</tr>
<tr>
<td>Mask R-CNN</td>
<td>22.5 (23.3)</td>
<td>19.6</td>
</tr>
<tr>
<td>Retinanet</td>
<td>17.8 (18.2)</td>
<td>20.6</td>
</tr>
</table>
### Training memory (GB)
<table border="1">
<tr>
<th>Type</th>
<th>Detectron2</th>
<th>mmdetection</th>
</tr>
<tr>
<td>Faster R-CNN</td>
<td>3.0</td>
<td>3.8</td>
</tr>
<tr>
<td>Mask R-CNN</td>
<td>3.4</td>
<td>3.9</td>
</tr>
<tr>
<td>Retinanet</td>
<td>3.9</td>
<td>3.4</td>
</tr>
</table>
# Corruption Benchmarking
## Introduction
We provide tools to test object detection and instance segmentation models on the image corruption benchmark defined in [Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming](https://arxiv.org/abs/1907.07484).
This page provides basic tutorials on how to use the benchmark.
```
@article{michaelis2019winter,
title={Benchmarking Robustness in Object Detection:
Autonomous Driving when Winter is Coming},
author={Michaelis, Claudio and Mitzkus, Benjamin and
Geirhos, Robert and Rusak, Evgenia and
Bringmann, Oliver and Ecker, Alexander S. and
Bethge, Matthias and Brendel, Wieland},
journal={arXiv:1907.07484},
year={2019}
}
```
![image corruption example](../demo/corruptions_sev_3.png)
## About the benchmark
To submit results to the benchmark, please visit the [benchmark homepage](https://github.com/bethgelab/robust-detection-benchmark).
The benchmark is modelled after the [imagenet-c benchmark](https://github.com/hendrycks/robustness) which was originally
published in [Benchmarking Neural Network Robustness to Common Corruptions and Perturbations](https://arxiv.org/abs/1903.12261) (ICLR 2019) by Dan Hendrycks and Thomas Dietterich.
The image corruption functions are included in this library but can be installed separately using:
```shell
pip install imagecorruptions
```
Compared to imagenet-c, a few changes had to be made to handle images of arbitrary size and greyscale images.
We also modified the 'motion blur' and 'snow' corruptions to remove the dependency on a Linux-specific library,
which would otherwise have to be installed separately. For details please refer to the [imagecorruptions repository](https://github.com/bethgelab/imagecorruptions).
## Inference with pretrained models
We provide a testing script to evaluate a model's performance on any combination of the corruptions provided in the benchmark.
### Test a dataset
- [x] single GPU testing
- [ ] multiple GPU testing
- [ ] visualize detection results
You can use the following commands to test a model's performance under the 15 corruptions used in the benchmark.
```shell
# single-gpu testing
python tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
```
Alternatively, different groups of corruptions can be selected.
```shell
# noise
python tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions noise
# blur
python tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions blur
# weather
python tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions weather
# digital
python tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions digital
```
Or a custom set of corruptions, e.g.:
```shell
# gaussian noise, zoom blur and snow
python tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions gaussian_noise zoom_blur snow
```
Finally, the corruption severities to evaluate can be chosen.
Severity 0 corresponds to clean data and the effect increases from 1 to 5.
```shell
# severity 1
python tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --severities 1
# severities 0,2,4
python tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --severities 0 2 4
```
## Results for model zoo models
The results on COCO 2017val are shown in the table below.
Model | Backbone | Style | Lr schd | box AP clean | box AP corr. | box % | mask AP clean | mask AP corr. | mask % |
:-----:|:---------:|:-------:|:-------:|:------------:|:------------:|:-----:|:-------------:|:-------------:|:------:|
Faster R-CNN | R-50-FPN | pytorch | 1x | 36.3 | 18.2 | 50.2 | - | - | - |
Faster R-CNN | R-101-FPN | pytorch | 1x | 38.5 | 20.9 | 54.2 | - | - | - |
Faster R-CNN | X-101-32x4d-FPN | pytorch |1x | 40.1 | 22.3 | 55.5 | - | - | - |
Faster R-CNN | X-101-64x4d-FPN | pytorch |1x | 41.3 | 23.4 | 56.6 | - | - | - |
Faster R-CNN | R-50-FPN-DCN | pytorch | 1x | 40.0 | 22.4 | 56.1 | - | - | - |
Faster R-CNN | X-101-32x4d-FPN-DCN | pytorch | 1x | 43.4 | 26.7 | 61.6 | - | - | - |
Mask R-CNN | R-50-FPN | pytorch | 1x | 37.3 | 18.7 | 50.1 | 34.2 | 16.8 | 49.1 |
Mask R-CNN | R-50-FPN-DCN | pytorch | 1x | 41.1 | 23.3 | 56.7 | 37.2 | 20.7 | 55.7 |
Cascade R-CNN | R-50-FPN | pytorch | 1x | 40.4 | 20.1 | 49.7 | - | - | - |
Cascade Mask R-CNN | R-50-FPN | pytorch | 1x| 41.2 | 20.7 | 50.2 | 35.7 | 17.6 | 49.3 |
RetinaNet | R-50-FPN | pytorch | 1x | 35.6 | 17.8 | 50.1 | - | - | - |
Hybrid Task Cascade | X-101-64x4d-FPN-DCN | pytorch | 1x | 50.6 | 32.7 | 64.7 | 43.8 | 28.1 | 64.0 |
Results may vary slightly due to the stochastic application of the corruptions.
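The "box %" and "mask %" columns are the corrupted AP expressed as a percentage of the clean AP. A minimal sketch of that computation, checked against the Faster R-CNN R-50-FPN row above:
```python
# Relative performance under corruption, as reported in the table above.
def relative_performance(ap_clean: float, ap_corrupted: float) -> float:
    return ap_corrupted / ap_clean * 100.0

# Faster R-CNN R-50-FPN: clean box AP 36.3, corrupted box AP 18.2.
print(round(relative_performance(36.3, 18.2), 1))  # ~50.1, matching the table's 50.2 up to rounding
```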
post_processing
^^^^^^^^^^^^^^^
.. automodule:: mmdet3d.core.post_processing
:members:
utils
^^^^^^^^^^
.. automodule:: mmdet3d.core.utils
:members:
backbones
^^^^^^^^^^
.. automodule:: mmdet3d.models.backbones
:members:
necks
^^^^^^^^^^
.. automodule:: mmdet3d.models.necks
:members:
dense_heads
^^^^^^^^^^^^
.. automodule:: mmdet3d.models.dense_heads
:members:
roi_heads
^^^^^^^^^^
.. automodule:: mmdet3d.models.roi_heads
:members:
roi_heads.bbox_heads
^^^^^^^^^^^^^^^^^^^^
.. automodule:: mmdet3d.models.roi_heads.bbox_heads
:members:
roi_heads.mask_heads
^^^^^^^^^^^^^^^^^^^^
.. automodule:: mmdet3d.models.roi_heads.mask_heads
:members:
roi_heads.roi_extractors
^^^^^^^^^^^^^^^^^^^^^^^^
.. automodule:: mmdet3d.models.roi_heads.roi_extractors
:members:
fusion_layers
^^^^^^^^^^^^^
.. automodule:: mmdet3d.models.fusion_layers
:members:
losses
^^^^^^^^^^
.. automodule:: mmdet3d.models.losses
:members:
middle_encoders
^^^^^^^^^^^^^^^
.. automodule:: mmdet3d.models.middle_encoders
:members:
model_utils
^^^^^^^^^^^
.. automodule:: mmdet3d.models.model_utils
:members:
# Benchmarks
Here we benchmark the training and testing speed of models in MMDetection3D
against some other popular open source 3D detection codebases.
## Settings
* Hardware: 8 NVIDIA Tesla V100 (32G) GPUs, Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
* Software: Python 3.7, CUDA 10.1, cuDNN 7.6.5, PyTorch 1.3, numba 0.48.0.
* Model: Since all the other codebases implement different models, we compare the corresponding models with them separately. We try to use settings as similar as possible to those of the other codebases, using the [benchmark configs](https://github.com/open-mmlab/MMDetection3D/blob/master/configs/benchmark).
* Metrics: We use the average throughput over the iterations of the entire training run and skip the first 50 iterations of each epoch to exclude GPU warm-up time.
Note that the throughput of a detector typically changes during training, because it depends on the predictions of the model.
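As a rough sketch of this metric (the names and data layout are ours for illustration, not from the codebase):
```python
# Average training throughput (samples/s) over a whole run, skipping the
# first `warmup_iters` iterations of each epoch to exclude GPU warm-up.
# `epochs` is a list of per-epoch lists of (num_samples, seconds) pairs.
def average_throughput(epochs, warmup_iters=50):
    total_samples, total_time = 0.0, 0.0
    for iterations in epochs:
        for num_samples, seconds in iterations[warmup_iters:]:
            total_samples += num_samples
            total_time += seconds
    return total_samples / total_time
```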
## Main Results
### VoteNet
We compare our implementation with the original VoteNet codebase and report the performance of VoteNet on the SUNRGB-D v2 dataset under the AP@0.5 metric.
```eval_rst
+----------------+---------------------+--------------------+-------------------+--------+
| Implementation | Training (sample/s) | Testing (sample/s) | Training Time (h) | AP@0.5 |
+================+=====================+====================+===================+========+
| MMDetection3D | | | | |
+----------------+---------------------+--------------------+-------------------+--------+
| VoteNet | | | | |
+----------------+---------------------+--------------------+-------------------+--------+
```
### PointPillars
Since Det3D provides PointPillars only on the car class while PCDet provides PointPillars only
on 3 classes, we compare with them separately. For performance on the single class, we report the AP under the moderate
condition following the KITTI benchmark, and we compare the average AP over all classes under the moderate condition for
performance on 3 classes.
```eval_rst
+----------------+---------------------+--------------------+-------------------+-------------+
| Implementation | Training (sample/s) | Testing (sample/s) | Training Time (h) | Moderate AP |
+================+=====================+====================+===================+=============+
| MMDetection3D | | | | |
+----------------+---------------------+--------------------+-------------------+-------------+
| PCDet | | | | |
+----------------+---------------------+--------------------+-------------------+-------------+
```
```eval_rst
+----------------+---------------------+--------------------+-------------------+-------------+
| Implementation | Training (sample/s) | Testing (sample/s) | Training Time (h) | Moderate AP |
+================+=====================+====================+===================+=============+
| MMDetection3D | | | | |
+----------------+---------------------+--------------------+-------------------+-------------+
| Det3D | | | | |
+----------------+---------------------+--------------------+-------------------+-------------+
```
### SECOND
Det3D provides a different SECOND on the car class, and we cannot train the original SECOND by modifying its config.
So we only compare with PCDet, which provides a SECOND model on 3 classes. We report the AP under the moderate
condition following the KITTI benchmark and compare the average AP over all classes under the moderate condition for
performance on 3 classes.
```eval_rst
+----------------+---------------------+--------------------+-------------------+-------------+
| Implementation | Training (sample/s) | Testing (sample/s) | Training Time (h) | Moderate AP |
+================+=====================+====================+===================+=============+
| MMDetection3D | | | | |
+----------------+---------------------+--------------------+-------------------+-------------+
| PCDet | | | | |
+----------------+---------------------+--------------------+-------------------+-------------+
```
### Part-A2
We benchmark Part-A2 against the implementation in PCDet. We report the AP under the moderate condition following the KITTI benchmark
and compare the average AP over all classes under the moderate condition for performance on 3 classes.
```eval_rst
+----------------+---------------------+--------------------+-------------------+-------------+
| Implementation | Training (sample/s) | Testing (sample/s) | Training Time (h) | Moderate AP |
+================+=====================+====================+===================+=============+
| MMDetection3D | | | | |
+----------------+---------------------+--------------------+-------------------+-------------+
| PCDet | | | | |
+----------------+---------------------+--------------------+-------------------+-------------+
```
## Details of Comparison
### VoteNet
* __MMDetection3D__: With release v0.1.0, run
```
./tools/dist_train.sh configs/votenet/${CONFIG_FILE} 8
```
* __votenet__:
### PointPillars
* __MMDetection3D__: With release v0.1.0, run
```
```
* __PCDet__: At commit xxxx
### SECOND
* __MMDetection3D__: With release v0.1.0, run
```
```
* __PCDet__:
### Part-A2
* __MMDetection3D__: With release v0.1.0, run
```
```
* __PCDet__: At commit xxxx
### Modification for Calculating Training Speed
## Changelog
### v0.1.0 (24/5/2020)
MMDetection3D is released.
# Compatibility with MMDetection 1.x
MMDetection 2.0 goes through a big refactoring and addresses many legacy issues. It is not compatible with the 1.x version, i.e., running inference with the same model weights in these two versions will produce different results. Thus, MMDetection 2.0 re-benchmarks all the models and provides their links and logs in the model zoo.
The major differences lie in four aspects: coordinate system, codebase conventions, training hyperparameters, and modular design.
## Coordinate System
The new coordinate system is consistent with [Detectron2](https://github.com/facebookresearch/detectron2/) and treats the center of the top-left pixel as (0, 0) rather than the top-left corner of that pixel.
Accordingly, the system interprets the coordinates in COCO bounding box and segmentation annotations as coordinates in range `[0, width]` or `[0, height]`.
This modification affects all the computation related to bbox and pixel selection,
and is more natural and accurate.
- The height and width of a box with corners (x1, y1) and (x2, y2) in the new coordinate system are computed as `width = x2 - x1` and `height = y2 - y1`.
In MMDetection 1.x and previous versions, a "+ 1" was added to both the height and the width (see the sketch at the end of this section).
This modification affects three parts:
1. Box transformation and encoding/decoding in regression.
2. IoU calculation. This affects the matching process between ground truth and bounding boxes and the NMS process. The effect on compatibility is negligible, though.
3. The corners of bounding boxes are in float type and no longer quantized. This should provide more accurate bounding box results. It also means that bounding boxes and RoIs are no longer required to have a minimum size of 1, whose effect is small, though.
- The anchors are center-aligned to feature grid points and are in float type.
In MMDetection 1.x and previous versions, the anchors are in int type and not center-aligned.
This affects the anchor generation in RPN and all the anchor-based methods.
- RoIAlign is better aligned with the image coordinate system. The new implementation is adopted from [Detectron2](https://github.com/facebookresearch/detectron2/tree/master/detectron2/layers/csrc/ROIAlign).
The RoIs are shifted by half a pixel by default when they are used to crop RoI features, compared to MMDetection 1.x.
The old behavior is still available by setting `aligned=False` instead of `aligned=True`.
- Mask cropping and pasting are more accurate.
1. We use the new RoIAlign to crop mask targets. In MMDetection 1.x, the bounding box is quantized before it is used to crop the mask target, and the crop process is implemented in numpy. In the new implementation, the bounding box for the crop is not quantized and is sent to RoIAlign directly. This implementation accelerates the training speed by a large margin (~0.1s per iter, ~2 hours when training Mask R50 for the 1x schedule) and should be more accurate.
2. In MMDetection 2.0, the "paste_mask()" function is different and should be more accurate than those in previous versions. This change follows the modification in [Detectron2](https://github.com/facebookresearch/detectron2/blob/master/detectron2/structures/masks.py) and can improve mask AP on COCO by ~0.5% absolute.
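To make the "+ 1" change concrete, here is a small sketch of the two conventions (plain Python for illustration, not code from the repo):
```python
# Box size under the two conventions, for corners (x1, y1) and (x2, y2).
x1, y1, x2, y2 = 10.0, 10.0, 20.0, 25.0

# MMDetection 2.0: coordinates live in [0, width] x [0, height].
width_new, height_new = x2 - x1, y2 - y1          # 10.0, 15.0

# MMDetection 1.x: a "+ 1" was added to both width and height.
width_old, height_old = x2 - x1 + 1, y2 - y1 + 1  # 11.0, 16.0

# The same "+ 1" propagated into box areas, and hence into IoU,
# which is why box encoding/decoding, matching, and NMS are all affected.
area_new = width_new * height_new
area_old = width_old * height_old
```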
## Codebase Conventions
- MMDetection 2.0 changes the order of class labels to reduce unused parameters in the regression and mask branches more naturally (without +1 and -1).
This affects all the classification layers of the model, which now have a different ordering of class labels. The final layers of the regression branch and the mask head no longer keep K+1 channels for K categories, and their class orders are consistent with the classification branch (see the sketch at the end of this section).
- In MMDetection 2.0, label "K" means background, and labels [0, K-1] correspond to the K = num_categories object categories.
- In MMDetection 1.x and previous versions, label "0" means background, and labels [1, K] correspond to the K categories.
- Low quality matching in R-CNN is not used. In MMDetection 1.x and previous versions, the `max_iou_assigner` will match low quality boxes for each ground truth box in both RPN and R-CNN training. We observe that this sometimes does not assign the best GT box to some bounding boxes,
thus MMDetection 2.0 does not allow low quality matching by default in R-CNN training. This sometimes may slightly improve the box AP (~0.1% absolute).
- Separate scale factors for width and height. In MMDetection 1.x and previous versions, the scale factor is a single float in `keep_ratio=True` mode. This is slightly inaccurate because the scale factors for width and height differ slightly. MMDetection 2.0 adopts separate scale factors for width and height, which improves AP by ~0.1% absolute.
- Config name conventions are changed. MMDetection V2.0 adopts the new naming convention to maintain the gradually growing model zoo as follows:
```
[model]_(model setting)_[backbone]_[neck]_(norm setting)_(misc)_(gpu x batch)_[schedule]_[dataset].py,
```
where (misc) includes DCN, GCBlock, etc. More details are illustrated in the [documentation for config](config.md).
- MMDetection V2.0 uses new ResNet Caffe backbones to reduce warnings when loading pre-trained models. Most of the new backbones' weights are the same as the former ones, but they do not have `conv.bias` and use a different `img_norm_cfg`. Thus, the new backbones will not cause warnings about unexpected keys.
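The label reordering can be summarized with a small sketch (illustrative only, not code from the codebase):
```python
K = 80  # number of object categories, e.g. COCO

# MMDetection 2.0: labels [0, K-1] are objects, label K is background.
def is_background_v2(label: int) -> bool:
    return label == K

# MMDetection 1.x: label 0 was background, labels [1, K] were objects.
def is_background_v1(label: int) -> bool:
    return label == 0

# Converting a 1.x label to a 2.0 label shifts foreground labels by -1.
def v1_to_v2(label: int) -> int:
    return K if label == 0 else label - 1
```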
## Training Hyperparameters
The change in training hyperparameters does not affect
model-level compatibility but slightly improves the performance. The major ones are:
- The number of proposals after NMS is changed from 2000 to 1000 by setting `nms_post=1000` and `max_num=1000`.
This slightly improves both mask AP and bbox AP by ~0.2% absolute.
- The default box regression losses for Mask R-CNN, Faster R-CNN and RetinaNet are changed from smooth L1 Loss to L1 loss. This leads to an overall improvement in box AP (~0.6% absolute). However, using L1-loss for other methods such as Cascade R-CNN and HTC does not improve the performance, so we keep the original settings for these methods.
- The sample num of the RoIAlign layer is set to 0 for simplicity. This leads to a slight improvement in mask AP (~0.2% absolute).
- The default setting does not use gradient clipping anymore during training, for faster training speed. This does not degrade the performance of most models. For some models such as RepPoints, we keep using gradient clipping to stabilize the training process and obtain better performance.
- The default warmup ratio is changed from 1/3 to 0.001 for a smoother warm-up process, since gradient clipping is usually not used. The effect was found to be negligible during our re-benchmarking, though.
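For reference, the defaults described above correspond roughly to config fragments like the following sketch (field names follow the text above; the exact layout varies across configs):
```python
# Sketch of the new defaults; not a complete config.
test_cfg = dict(
    rpn=dict(nms_post=1000, max_num=1000))  # 2000 -> 1000 proposals after NMS

# L1 loss replaces smooth L1 for box regression in e.g. Faster R-CNN.
loss_bbox = dict(type='L1Loss', loss_weight=1.0)

# Gradient clipping is off by default, with a smaller warmup ratio.
optimizer_config = dict(grad_clip=None)
lr_config = dict(policy='step', warmup='linear', warmup_ratio=0.001, step=[8, 11])
```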
## Upgrade Models from 1.x to 2.0
To convert models trained with MMDetection V1.x to MMDetection V2.0, users can use the script `tools/upgrade_model_version.py` to convert
their models. The converted models can be run in MMDetection V2.0 with slightly dropped performance (less than 1% AP absolute).
Details can be found in `configs/legacy`.
For installation instructions, please see [install.md](install.md).
## Prepare datasets
It is recommended to symlink the dataset root to `$MMDETECTION3D/data`.
If your folder structure is different, you may need to change the corresponding paths in config files.
To prepare scannet data, please see [scannet](../data/scannet/README.md).
To prepare sunrgbd data, please see [sunrgbd](../data/sunrgbd/README.md).
For using custom datasets, please refer to [Tutorials 2: Adding New Dataset](tutorials/new_dataset.md).
## Inference with pretrained models
You will get two json files `mask_rcnn_test-dev_results.bbox.json` and `mask_rcnn_test-dev_results.segm.json`.
The generated png and txt would be under the `./mask_rcnn_cityscapes_test_results` directory.
### Image demo
We provide a demo script to test a single image.
```shell
python demo/image_demo.py ${IMAGE_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device ${GPU_ID}] [--score-thr ${SCORE_THR}]
```
Examples:
```shell
python demo/image_demo.py demo/demo.jpg configs/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth
```
### Webcam demo
We provide a webcam demo to illustrate the results.
```shell
python demo/webcam_demo.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device ${GPU_ID}] [--camera-id ${CAMERA_ID}] [--score-thr ${SCORE_THR}]
```
Examples:
```shell
python demo/webcam_demo.py configs/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth
```
### High-level APIs for testing images
#### Synchronous interface
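A minimal sketch of the synchronous interface, based on `mmdet.apis` (exact names may differ across versions; the config and checkpoint paths are the ones used in the examples above):
```python
from mmdet.apis import inference_detector, init_detector, show_result_pyplot

config_file = 'configs/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth'

# Build the model from a config file and a checkpoint file.
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Test a single image and visualize the result.
result = inference_detector(model, 'demo/demo.jpg')
show_result_pyplot(model, 'demo/demo.jpg', result)
```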
...@@ -318,8 +321,8 @@ GPUS=16 ./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x_coco
You can check [slurm_train.sh](https://github.com/open-mmlab/mmdetection/blob/master/tools/slurm_train.sh) for full arguments and environment variables.
If you only have multiple machines connected with Ethernet, you can refer to the
PyTorch [launch utility](https://pytorch.org/docs/stable/distributed_deprecated.html#launch-utility).
It is usually slow if you do not have high-speed networking like InfiniBand.
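For instance, a two-node job with 4 GPUs per node could be launched roughly as follows (a sketch; `${NODE_0_IP}` is a placeholder, and passing `--launcher pytorch` to `tools/train.py` is an assumption to verify against your version):

```shell
# run on node 0; repeat on node 1 with --node_rank=1
python -m torch.distributed.launch --nnodes=2 --node_rank=0 \
    --master_addr=${NODE_0_IP} --master_port=29500 --nproc_per_node=4 \
    tools/train.py ${CONFIG_FILE} --launcher pytorch
```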
### Launch multiple jobs on a single machine
...@@ -333,7 +336,7 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4
```
If you launch training jobs with Slurm, you need to modify the config files (usually the 6th line from the bottom) to set different communication ports.
In `config1.py`,
```python
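# a sketch: give this job its own communication port (the field name
# dist_params is an assumption; check your config for the exact line)
dist_params = dict(backend='nccl', port=29500)
```

In `config2.py`, set a different port in the same way, e.g. `dist_params = dict(backend='nccl', port=29501)`, and then launch the two jobs with `config1.py` and `config2.py` respectively.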
...@@ -377,7 +380,7 @@ python tools/analyze_logs.py plot_curve log.json --keys loss_cls --legend loss_c
- Plot the classification and regression loss of some run, and save the figure to a pdf.
```shell
python tools/analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --out losses.pdf
```
- Compare the bbox mAP of two runs in the same figure.
...@@ -389,7 +392,7 @@ python tools/analyze_logs.py plot_curve log1.json log2.json --keys bbox_mAP --le
You can also compute the average training speed.
```shell
python tools/analyze_logs.py cal_train_time log.json [--include-outliers]
```
The output is expected to be like the following.
...@@ -462,4 +465,5 @@ python tools/pytorch2onnx.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --out ${ONNX_FILE
## Tutorials
Currently, we provide four tutorials for users to [finetune models](tutorials/finetune.md), [add new dataset](tutorials/new_dataset.md), [design data pipeline](tutorials/data_pipeline.md) and [add new modules](tutorials/new_modules.md).
We also provide a full description of the [config system](config.md).
Welcome to MMDetection3D's documentation!
==========================================
.. toctree::
   :maxdepth: 2
...
...@@ -6,7 +6,7 @@
- Python 3.6+
- PyTorch 1.3+
- CUDA 9.2+ (If you build PyTorch from source, CUDA 9.0 is also compatible)
- GCC 5+
- [mmcv](https://github.com/open-mmlab/mmcv)
...@@ -53,11 +53,12 @@ cd mmdetection
```
d. Install build requirements and then install mmdetection.
(We install our forked version of pycocotools via the GitHub repo instead of PyPI for better compatibility with our repo.)
```shell
pip install -r requirements/build.txt
pip install "git+https://github.com/open-mmlab/cocoapi.git#subdirectory=pycocotools"
pip install -v -e .  # or "python setup.py develop"
```
...@@ -72,6 +73,14 @@ Note:
1. The git commit id will be written to the version number with step d, e.g. 0.6.0+2e7045c. The version will also be saved in trained models.
It is recommended that you run step d each time you pull some updates from GitHub. If C++/CUDA code is modified, this step is compulsory.
> Important: Be sure to remove the `./build` folder if you reinstall mmdet with a different CUDA/PyTorch version.
```
pip uninstall mmdet
rm -rf ./build  # remove build artifacts from the previous CUDA/PyTorch version
find . -name "*.so" | xargs rm  # remove previously compiled extension libraries
```
2. Following the above instructions, mmdetection is installed in `dev` mode; any local modifications made to the code will take effect without reinstalling (unless you submit some commits and want to update the version number).
3. If you would like to use `opencv-python-headless` instead of `opencv-python`,
...@@ -122,7 +131,7 @@ conda install -c pytorch pytorch torchvision -y
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install "git+https://github.com/open-mmlab/cocoapi.git#subdirectory=pycocotools"
pip install -v -e .
```
...
# Benchmark and Model Zoo
## Mirror sites
We use AWS as the main site to host our model zoo, and maintain a mirror on Aliyun.
You can replace `https://s3.ap-northeast-2.amazonaws.com/open-mmlab` with `https://open-mmlab.oss-cn-beijing.aliyuncs.com` in model URLs.
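For example, a URL can be rewritten with `sed` like this (the model path below is a placeholder for illustration):

```shell
# only the host changes between the main site and the mirror
URL=https://s3.ap-northeast-2.amazonaws.com/open-mmlab/some/model.pth
echo ${URL} | sed 's#https://s3.ap-northeast-2.amazonaws.com/open-mmlab#https://open-mmlab.oss-cn-beijing.aliyuncs.com#'
```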
## Common settings
- We use distributed training.
- For fair comparison with other codebases, we report the GPU memory as the maximum value of `torch.cuda.max_memory_allocated()` for all 8 GPUs. Note that this value is usually less than what `nvidia-smi` shows.
- We report the inference time as the total time of network forwarding and post-processing, excluding the data loading time. Results are obtained with the script [benchmark.py](https://github.com/open-mmlab/mmdetection/blob/master/tools/benchmark.py) which computes the average time on 2000 images; a typical invocation is sketched below.
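A sketch of how the benchmark script is invoked (the argument list is an assumption; check the script for the exact options):

```shell
python tools/benchmark.py ${CONFIG_FILE} ${CHECKPOINT_FILE}
```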
## Baselines
### SECOND
Please refer to [SECOND](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/second) for details.
### PointPillars
Please refer to [PointPillars](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/pointpillars) for details.
### Part-A2
Please refer to [Part-A2](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/parta2) for details.
### VoteNet
Please refer to [VoteNet](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/votenet) for details.
### Dynamic Voxelization
Please refer to [Dynamic Voxelization](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/dynamic_voxelization) for details.
### MVXNet
Please refer to [MVXNet](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/mvxnet) for details.
### RegNetX
Please refer to [RegNet](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/regnet) for details.