[Feature] FCOS3D benchmark on nuScenes (#482)

Commit e9d84fe5, authored Apr 30, 2021 by twang, committed by GitHub Apr 30, 2021
6 changed files
--- a/README.md
+++ b/README.md
@@ -10,7 +10,9 @@
 **News**: We released the codebase v0.12.0.
-In the recent [nuScenes 3D detection challenge](https://www.nuscenes.org/object-detection?externalData=all&mapData=all&modalities=Any) of the 5th AI Driving Olympics in NeurIPS 2020, we obtained the best PKL award and the second runner-up by multi-modality entry, and the best vision-only results. Code and models will be released soon!
+In the recent [nuScenes 3D detection challenge](https://www.nuscenes.org/object-detection?externalData=all&mapData=all&modalities=Any) of the 5th AI Driving Olympics in NeurIPS 2020, we obtained the best PKL award and the second runner-up by multi-modality entry, and the best vision-only results.
+Code and models for the best vision-only method, [FCOS3D](https://arxiv.org/abs/2104.10956), have been released. Please stay tuned for [MoCa](https://arxiv.org/abs/2012.12741).
 Documentation: https://mmdetection3d.readthedocs.io/
@@ -87,6 +89,7 @@ Support methods
 - [x] [CenterPoint (CVPR'2021)](configs/centerpoint/README.md)
 - [x] [SSN (ECCV'2020)](configs/ssn/README.md)
 - [x] [ImVoteNet (CVPR'2020)](configs/imvotenet/README.md)
+- [x] [FCOS3D (Arxiv'2021)](configs/fcos3d/README.md)
 |                    | ResNet   | ResNeXt  | SENet    |PointNet++ | HRNet | RegNetX | Res2Net |
 |--------------------|:--------:|:--------:|:--------:|:---------:|:-----:|:--------:|:-----:|
@@ -101,6 +104,7 @@ Support methods
 | CenterPoint        | ☐        | ☐        | ☐        | ✗         | ☐     | ✓        | ☐     |
 | SSN                | ☐        | ☐        | ☐        | ✗         | ☐     | ✓        | ☐     |
 | ImVoteNet            | ✗        | ✗        | ✗        | ✓         | ✗     | ✗        | ✗     |
+| FCOS3D               | ✓        | ☐        | ☐        | ✗         | ☐     | ☐        | ☐     |
 Other features
 - [x] [Dynamic Voxelization](configs/dynamic_voxelization/README.md)

--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -12,6 +12,8 @@
 在第三届[ nuScenes 3D 检测挑战赛](https://www.nuscenes.org/object-detection?externalData=all&mapData=all&modalities=Any)（第五届 AI Driving Olympics, NeurIPS 2020）中，我们获得了最佳 PKL 奖、第三名和最好的纯视觉的结果，相关的代码和模型将会在不久后发布。
+最好的纯视觉方法[FCOS3D](https://arxiv.org/abs/2104.10956)的代码和模型已经发布。请继续关注我们的多模态检测器[MoCa](https://arxiv.org/abs/2012.12741)。
 文档: https://mmdetection3d.readthedocs.io/
 ## 简介
@@ -86,6 +88,7 @@ MMDetection3D 是一个基于 PyTorch 的目标检测开源工具箱, 下一代
 - [x] [CenterPoint (CVPR'2021)](configs/centerpoint/README.md)
 - [x] [SSN (ECCV'2020)](configs/ssn/README.md)
 - [x] [ImVoteNet (CVPR'2020)](configs/imvotenet/README.md)
+- [x] [FCOS3D (Arxiv'2021)](configs/fcos3d/README.md)
 |                    | ResNet   | ResNeXt  | SENet    |PointNet++ | HRNet | RegNetX | Res2Net |
 |--------------------|:--------:|:--------:|:--------:|:---------:|:-----:|:--------:|:-----:|
@@ -100,6 +103,7 @@ MMDetection3D 是一个基于 PyTorch 的目标检测开源工具箱, 下一代
 | CenterPoint        | ☐        | ☐        | ☐        | ✗         | ☐     | ✓        | ☐     |
 | SSN                | ☐        | ☐        | ☐        | ✗         | ☐     | ✓        | ☐     |
 | ImVoteNet            | ✗        | ✗        | ✗        | ✓         | ✗     | ✗        | ✗     |
+| FCOS3D               | ✓        | ☐        | ☐        | ✗         | ☐     | ☐        | ☐     |
 其他特性
 - [x] [Dynamic Voxelization](configs/dynamic_voxelization/README.md)

--- a/configs/fcos3d/README.md
+++ b/configs/fcos3d/README.md
+# FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection
+## Introduction
+<!-- [ALGORITHM] -->
+FCOS3D is a general anchor-free, one-stage monocular 3D object detector adapted from the original 2D version FCOS.
+It serves as a baseline built on top of mmdetection and mmdetection3d for 3D detection based on monocular vision.
+Currently we support the benchmark on the large-scale nuScenes dataset, where FCOS3D achieved 1st place among all vision-only methods in the [nuScenes 3D detection challenge](https://www.nuscenes.org/object-detection?externalData=all&mapData=all&modalities=Camera) of NeurIPS 2020.
+```
+@article{wang2021fcos3d,
+  title={{FCOS3D}: Fully Convolutional One-Stage Monocular 3D Object Detection},
+  author={Wang, Tai and Zhu, Xinge and Pang, Jiangmiao and Lin, Dahua},
+  journal={arXiv preprint arXiv:2104.10956},
+  year={2021}
+}
+# For the original 2D version
+@inproceedings{tian2019fcos,
+  title     =  {{FCOS}: Fully Convolutional One-Stage Object Detection},
+  author    =  {Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
+  booktitle =  {Proc. Int. Conf. Computer Vision (ICCV)},
+  year      =  {2019}
+}
+```
+## Usage
+### Data Preparation
+Starting from v0.13.0, which adds support for FCOS3D and monocular 3D object detection, the COCO-style 2D JSON info files include the related annotations by default
+(see [here](https://github.com/open-mmlab/mmdetection3d/blob/master/tools/data_converter/nuscenes_converter.py#L333) if you would like to change the parameter).
+So you can just follow the data preparation steps given in the documentation, and all the required info files will be generated together.
+### Training and Inference
+Training and inference for a monocular 3D object detector work the same way as for other models in mmdetection and mmdetection3d. You can basically follow the [documentation](https://mmdetection3d.readthedocs.io/en/latest/1_exist_data_model.html#train-predefined-models-on-standard-datasets) and change the `config`, `work_dirs`, etc. accordingly.
+### Test time augmentation
+We implement test time augmentation on the dense outputs of the detection heads, which is more effective than merging the predicted boxes at the end.
+You can turn it on by setting `flip=True` in the `test_pipeline`.
+### Training with finetune
+Because the scale and unit of depth differ from those of the other regression targets, we first train the model with the depth weight set to 0.2 for a more stable training procedure. For a stronger detector with better performance, please finetune the model with the depth weight changed to 1.0, as shown in the [config](./fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune.py). Note that the `load_from` path needs to be changed to point to your own checkpoint.
+## Results
+### NuScenes
+|  Backbone   | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
+| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
+|[ResNet101 w/ DCN](./fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d.py)|1x|8.69||29.9|37.3|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_20210425_181341-8d5a21fe.pth) &#124; [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_20210425_181341.log.json)|
+|[above w/ finetune](./fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune.py)|1x|8.69||32.1|39.3|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune_20210427_091419-35aaaad0.pth) &#124; [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune_20210427_091419.log.json)|
+|above w/ tta|1x|8.69||33.1|40.0||
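The test time augmentation described in `configs/fcos3d/README.md` fuses the dense head outputs of the original and the horizontally flipped image before any boxes are decoded. A minimal pure-Python sketch of that idea follows; it is not the repository's implementation. The helper names `hflip` and `flip_tta_average` are hypothetical, and the sketch only covers scalar maps such as classification scores (flip-sensitive targets like horizontal offsets or orientation would need extra handling).

```python
def hflip(feature_map):
    """Mirror a 2D map (a list of rows) along the width axis."""
    return [list(reversed(row)) for row in feature_map]


def flip_tta_average(scores, scores_from_flipped):
    """Average a dense score map with the re-aligned map of the flipped image.

    `scores_from_flipped` is the head output for the mirrored input, so it is
    mirrored back first to line up with `scores` location by location.
    """
    realigned = hflip(scores_from_flipped)
    return [
        [(a + b) / 2.0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(scores, realigned)
    ]


# Toy 2x2 score maps (hypothetical values, not real model outputs).
scores = [[0.0, 1.0], [0.5, 0.25]]
scores_from_flipped = [[0.5, 0.5], [0.75, 0.0]]
fused = flip_tta_average(scores, scores_from_flipped)
print(fused)
```

Decoding boxes once from the fused map avoids having to match and merge two separate box sets afterwards.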
--- a/configs/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune.py
+++ b/configs/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune.py
+_base_ = './fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d.py'
+# model settings
+model = dict(
+    train_cfg=dict(
+        code_weight=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.05, 0.05]))
+# optimizer
+optimizer = dict(lr=0.001)
+load_from = 'work_dirs/fcos3d_nus/latest.pth'
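The `code_weight` list in the finetune config above scales each regression dimension of the box code. The sketch below illustrates the effect with a plain weighted L1 loss, which stands in for the loss actually used in the repository. The dimension names in `CODE_NAMES`, the position of depth at index 2, and the first-stage weights `STAGE1_WEIGHT` are assumptions inferred from the README's description (depth weight 0.2, then 1.0); only `FINETUNE_WEIGHT` is copied from the config.

```python
# Assumed layout of the 9-dim box code; the diff does not spell it out.
CODE_NAMES = ["dx", "dy", "depth", "w", "l", "h", "rot", "vx", "vy"]

# Hypothetical first-stage weights: depth down-weighted to 0.2 per the README.
STAGE1_WEIGHT = [1.0, 1.0, 0.2, 1.0, 1.0, 1.0, 1.0, 0.05, 0.05]
# Weights copied from the finetune config in this commit.
FINETUNE_WEIGHT = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.05, 0.05]


def weighted_l1(pred, target, code_weight):
    """Sum of per-dimension absolute errors, each scaled by its code weight."""
    return sum(w * abs(p - t) for p, t, w in zip(pred, target, code_weight))


def changed_dims(old, new):
    """Names of the dimensions whose weight differs between two stages."""
    return [CODE_NAMES[i] for i, (a, b) in enumerate(zip(old, new)) if a != b]


# Toy prediction that is off by 4 units in depth only (hypothetical values).
pred = [0.0, 0.0, 24.0, 1.5, 4.0, 1.5, 0.0, 0.0, 0.0]
target = [0.0, 0.0, 20.0, 1.5, 4.0, 1.5, 0.0, 0.0, 0.0]
stage1_loss = weighted_l1(pred, target, STAGE1_WEIGHT)
finetune_loss = weighted_l1(pred, target, FINETUNE_WEIGHT)
print(changed_dims(STAGE1_WEIGHT, FINETUNE_WEIGHT), stage1_loss, finetune_loss)
```

Under these assumptions the same depth error contributes five times more to the loss during finetuning, which is why the first stage with the smaller weight trains more stably.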
--- a/docs/model_zoo.md
+++ b/docs/model_zoo.md
@@ -57,3 +57,7 @@ Please refer to [SSN](https://github.com/open-mmlab/mmdetection3d/blob/master/co
 ### ImVoteNet
 Please refer to [ImVoteNet](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/imvotenet) for details. We provide VoteNet baselines on SUNRGBD dataset.
+### FCOS3D
+Please refer to [FCOS3D](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/fcos3d) for details. We currently provide FCOS3D baselines on the nuScenes dataset.
--- a/mmdet3d/models/detectors/fcos_mono3d.py
+++ b/mmdet3d/models/detectors/fcos_mono3d.py
@@ -4,7 +4,7 @@ from .single_stage_mono3d import SingleStageMono3DDetector
 @DETECTORS.register_module()
 class FCOSMono3D(SingleStageMono3DDetector):
-    """Implementation of FCOS3D. The technical report will be released soon.
+    r"""`FCOS3D <https://arxiv.org/abs/2104.10956>`_ for monocular 3D object detection.
    Currently please refer to our entry on the
    `leaderboard <https://www.nuscenes.org/object-detection?externalData=all&mapData=all&modalities=Camera>` # noqa