"models/git@developer.sourcefind.cn:renzhc/diffusers_dcu.git" did not exist on "9fc2b6c5290eb4a1123f60c8490b4051bbaa9e1b"
Unverified commit 71c79968 authored by Tai-Wang, committed by GitHub

[Feature] Support PGD nuScenes benchmark (#1065)

* Support PGD nuscenes benchmark

* Update nuScenes benchmark

* Update metafile.yml

* Update model_zoo.md

* Update model_zoo.md

* Add 2x schedule configs

* Update 2x schedule config links

* Update 2x schedule meta infos
parent 6b73a226
@@ -6,17 +6,26 @@
PGD, which can also be regarded as FCOS3D++, is a simple yet effective monocular 3D detector. It enhances the FCOS3D baseline by involving local geometric constraints and improving instance depth estimation.
We release the code and models for both the KITTI and nuScenes benchmarks, which is a good supplement to the original FCOS3D baseline (only supported on nuScenes).
For a clean implementation, our preliminary release supports base models with the proposed local geometric constraints and the probabilistic depth representation. We will involve the geometric graph part in the future.
A more extensive study based on FCOS3D and PGD is ongoing. Please stay tuned.
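The probabilistic depth representation mentioned above can be sketched minimally: the head predicts both a directly regressed depth and a distribution over discretized depth bins, and the final estimate blends the regressed value with the distribution's expectation. The function, bin layout, and fusion weight below are illustrative assumptions, not the exact mmdet3d implementation:

```python
import numpy as np

def expected_depth(bin_logits, bin_centers):
    """Soft-argmax over discretized depth bins: E[d] = sum_i softmax(l)_i * c_i."""
    p = np.exp(bin_logits - bin_logits.max())
    p /= p.sum()
    return float(np.dot(p, bin_centers))

# 8 evenly spaced candidate depths in metres (made-up values for the sketch)
centers = np.linspace(5.0, 75.0, 8)
logits = np.array([0.1, 0.2, 2.5, 0.4, 0.1, 0.0, 0.0, 0.0])
d_prob = expected_depth(logits, centers)

d_reg = 24.3    # directly regressed depth (made up)
alpha = 0.7     # fusion weight; assumed learned, value made up
d_final = alpha * d_reg + (1 - alpha) * d_prob
```

The expectation over bins gives a smoother, better-calibrated depth signal than direct regression alone, which is the motivation for the probabilistic representation.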
```
@inproceedings{wang2021pgd,
title={{Probabilistic and Geometric Depth: Detecting} Objects in Perspective},
author={Wang, Tai and Zhu, Xinge and Pang, Jiangmiao and Lin, Dahua},
booktitle={Conference on Robot Learning (CoRL) 2021},
year={2021}
}
# For the baseline version
@inproceedings{wang2021fcos3d,
title={{FCOS3D: Fully} Convolutional One-Stage Monocular 3D Object Detection},
author={Wang, Tai and Zhu, Xinge and Pang, Jiangmiao and Lin, Dahua},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
year={2021}
}
```
## Results
@@ -25,7 +34,7 @@ For clean implementation, our preliminary release supports base models with prop
| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP_11 / mAP_40 | Download |
| :---------: | :-----: | :------: | :------------: | :----: | :------: |
|[ResNet101](./pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d.py)|4x|9.07||18.33 / 13.23|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d_20211022_102608-8a97533b.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d_20211022_102608.log.json)|

Detailed performance on KITTI 3D detection (3D/BEV) is as follows, evaluated by the AP11 and AP40 metrics:
@@ -35,3 +44,14 @@ Detailed performance on KITTI 3D detection (3D/BEV) is as follows, evaluated by
| Car (AP40) | 19.27 / 26.60 | 13.23 / 18.23 | 10.65 / 15.00 |

Note: mAP represents Car moderate 3D strict AP11 / AP40 results. Because of the limited data for pedestrians and cyclists, the detection performance for these two classes is usually unstable. Therefore, we only list car detection results here. In addition, AP40 is a more recommended metric for reference due to its much better stability.
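For reference, AP11 and AP40 differ only in how many recall positions the precision curve is sampled at: 11 points including recall 0, versus 40 points excluding it, which is why AP40 is more stable. A minimal sketch of the interpolated metric (simplified; the real KITTI evaluation builds the precision-recall curve from matched detections):

```python
import numpy as np

def interpolated_ap(recall, precision, num_points):
    """KITTI-style interpolated AP: average of the max precision at
    num_points recall thresholds."""
    # AP11 samples recall at 0.0, 0.1, ..., 1.0;
    # AP40 samples at 1/40, 2/40, ..., 1.0 (the r=0 point is dropped).
    if num_points == 11:
        thresholds = np.linspace(0.0, 1.0, 11)
    else:
        thresholds = np.linspace(1.0 / num_points, 1.0, num_points)
    ap = 0.0
    for t in thresholds:
        prec = precision[recall >= t]
        ap += (prec.max() if prec.size else 0.0) / num_points
    return ap

# toy precision-recall points (made up)
recall = np.array([0.12, 0.34, 0.52, 0.71])
precision = np.array([0.9, 0.8, 0.6, 0.4])
ap11 = interpolated_ap(recall, precision, 11)
ap40 = interpolated_ap(recall, precision, 40)
```

Because AP11 always awards the precision at recall 0 (trivially high) and averages over few points, small score changes move it a lot; AP40's denser sampling damps this.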
### nuScenes
| Backbone | Lr schd | Mem (GB) | mAP | NDS | Download |
| :---------: | :-----: | :------: | :----: |:----: | :------: |
|[ResNet101 w/ DCN](./pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d.py)|1x|9.20|31.7|39.3|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_20211116_195350-f4b5eec2.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_20211116_195350.log.json)|
|[above w/ finetune](./pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune.py)|1x|9.20|34.6|41.1|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune_20211118_093245-fd419681.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune_20211118_093245.log.json)|
|above w/ tta|1x|9.20|35.5|41.8||
|[ResNet101 w/ DCN](./pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d.py)|2x|9.20|33.6|40.9|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_20211112_125314-cb677266.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_20211112_125314.log.json)|
|[above w/ finetune](./pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune.py)|2x|9.20|35.8|42.5|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune_20211114_162135-5ec7c1cd.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune_20211114_162135.log.json)|
|above w/ tta|2x|9.20|36.8|43.1||
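NDS in the table above is the standard nuScenes Detection Score, which combines mAP with five true-positive error terms (translation, scale, orientation, velocity, attribute). A sketch of the formula with illustrative error values (not this model's actual numbers):

```python
def nds(map_score, tp_errors):
    """nuScenes Detection Score:
    NDS = (5 * mAP + sum_k (1 - min(1, err_k))) / 10,
    where tp_errors are the five mean TP errors (ATE, ASE, AOE, AVE, AAE)."""
    assert len(tp_errors) == 5
    return (5 * map_score + sum(1 - min(1.0, e) for e in tp_errors)) / 10

# Illustrative error values only, chosen for the sketch:
score = nds(0.317, [0.60, 0.27, 0.43, 1.26, 0.18])
```

Each error term is clipped at 1 before being converted to a score, so a very large velocity error (common for monocular methods) contributes 0 rather than going negative.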
@@ -27,3 +27,55 @@ Models:
        Metrics:
          mAP: 18.33
    Weights: https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d_20211022_102608-8a97533b.pth
  - Name: pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d
    In Collection: PGD
    Config: configs/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d.py
    Metadata:
      Training Memory (GB): 9.2
    Results:
      - Task: 3D Object Detection
        Dataset: nuScenes
        Metrics:
          mAP: 31.7
          NDS: 39.3
    Weights: https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_20211116_195350-f4b5eec2.pth
  - Name: pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune
    In Collection: PGD
    Config: configs/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune.py
    Metadata:
      Training Memory (GB): 9.2
    Results:
      - Task: 3D Object Detection
        Dataset: nuScenes
        Metrics:
          mAP: 34.6
          NDS: 41.1
    Weights: https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune_20211118_093245-fd419681.pth
  - Name: pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d
    In Collection: PGD
    Config: configs/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d.py
    Metadata:
      Training Memory (GB): 9.2
    Results:
      - Task: 3D Object Detection
        Dataset: nuScenes
        Metrics:
          mAP: 33.6
          NDS: 40.9
    Weights: https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_20211112_125314-cb677266.pth
  - Name: pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune
    In Collection: PGD
    Config: configs/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune.py
    Metadata:
      Training Memory (GB): 9.2
    Results:
      - Task: 3D Object Detection
        Dataset: nuScenes
        Metrics:
          mAP: 35.8
          NDS: 42.5
    Weights: https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune_20211114_162135-5ec7c1cd.pth
configs/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d.py

_base_ = [
    '../_base_/datasets/nus-mono3d.py', '../_base_/models/pgd.py',
    '../_base_/schedules/mmdet_schedule_1x.py', '../_base_/default_runtime.py'
]
# model settings
model = dict(
    backbone=dict(
        dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
        stage_with_dcn=(False, False, True, True)),
    bbox_head=dict(
        pred_bbox2d=True,
        group_reg_dims=(2, 1, 3, 1, 2,
                        4),  # offset, depth, size, rot, velo, bbox2d
        reg_branch=(
            (256, ),  # offset
            (256, ),  # depth
            (256, ),  # size
            (256, ),  # rot
            (),  # velo
            (256, )  # bbox2d
        ),
        loss_depth=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0),
        bbox_coder=dict(
            type='PGDBBoxCoder',
            base_depths=((31.99, 21.12), (37.15, 24.63), (39.69, 23.97),
                         (40.91, 26.34), (34.16, 20.11), (22.35, 13.70),
                         (24.28, 16.05), (27.26, 15.50), (20.61, 13.68),
                         (22.74, 15.01)),
            base_dims=((4.62, 1.73, 1.96), (6.93, 2.83, 2.51),
                       (12.56, 3.89, 2.94), (11.22, 3.50, 2.95),
                       (6.68, 3.21, 2.85), (6.68, 3.21, 2.85),
                       (2.11, 1.46, 0.78), (0.73, 1.77, 0.67),
                       (0.41, 1.08, 0.41), (0.50, 0.99, 2.52)),
            code_size=9)),
    # set weight 1.0 for the offset, size and rot dims, 0.2 for depth here
    # (raised to 1.0 when finetuning), 0.05 for the 2-dim velocity and
    # 0.2 for the 4-dim 2D distance targets
    train_cfg=dict(code_weight=[
        1.0, 1.0, 0.2, 1.0, 1.0, 1.0, 1.0, 0.05, 0.05, 0.2, 0.2, 0.2, 0.2
    ]),
    test_cfg=dict(nms_pre=1000, nms_thr=0.8, score_thr=0.01, max_per_img=200))
class_names = [
    'car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle',
    'motorcycle', 'pedestrian', 'traffic_cone', 'barrier'
]
img_norm_cfg = dict(
    mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
train_pipeline = [
    dict(type='LoadImageFromFileMono3D'),
    dict(
        type='LoadAnnotations3D',
        with_bbox=True,
        with_label=True,
        with_attr_label=True,
        with_bbox_3d=True,
        with_label_3d=True,
        with_bbox_depth=True),
    dict(type='Resize', img_scale=(1600, 900), keep_ratio=True),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(
        type='Collect3D',
        keys=[
            'img', 'gt_bboxes', 'gt_labels', 'attr_labels', 'gt_bboxes_3d',
            'gt_labels_3d', 'centers2d', 'depths'
        ]),
]
test_pipeline = [
    dict(type='LoadImageFromFileMono3D'),
    dict(
        type='MultiScaleFlipAug',
        scale_factor=1.0,
        flip=False,
        transforms=[
            dict(type='RandomFlip3D'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(pipeline=train_pipeline),
    val=dict(pipeline=test_pipeline),
    test=dict(pipeline=test_pipeline))
# optimizer
optimizer = dict(
    lr=0.004, paramwise_cfg=dict(bias_lr_mult=2., bias_decay_mult=0.))
optimizer_config = dict(
    _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    step=[8, 11])
total_epochs = 12
evaluation = dict(interval=4)
runner = dict(max_epochs=total_epochs)
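As the comment in the config above notes, `code_weight` assigns one weight per regression target dimension. A small sketch of how such per-dimension weights would scale the per-sample loss weights (the tensor shapes and sample count are illustrative assumptions, not the exact mmdet3d code):

```python
import numpy as np

# 13 regression dims: offset(2) + depth(1) + size(3) + rot(1) + velo(2) + bbox2d(4)
code_weight = np.array([
    1.0, 1.0, 0.2, 1.0, 1.0, 1.0, 1.0, 0.05, 0.05, 0.2, 0.2, 0.2, 0.2
])

num_pos = 4                          # number of positive samples (made up)
equal_weights = np.ones(num_pos)     # base weight per positive sample
# broadcast to per-sample, per-dimension weights of shape (num_pos, 13)
bbox_weights = equal_weights[:, None] * code_weight
```

Down-weighting the velocity columns (0.05) keeps the hard-to-estimate velocity targets from dominating the regression loss early in training.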
configs/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune.py

_base_ = './pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d.py'
# model settings
model = dict(
    train_cfg=dict(code_weight=[
        1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.05, 0.05, 0.2, 0.2, 0.2, 0.2
    ]))
# optimizer
optimizer = dict(lr=0.002)
load_from = 'work_dirs/pgd_nus_benchmark_1x/latest.pth'
configs/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d.py

_base_ = './pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d.py'
# learning policy
lr_config = dict(step=[16, 22])
total_epochs = 24
runner = dict(max_epochs=total_epochs)
configs/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune.py

_base_ = './pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d.py'
# model settings
model = dict(
    train_cfg=dict(code_weight=[
        1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.05, 0.05, 0.2, 0.2, 0.2, 0.2
    ]))
# optimizer
optimizer = dict(lr=0.002)
load_from = 'work_dirs/pgd_nus_benchmark_2x/latest.pth'
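The finetune and 2x configs above stay short because mmcv-style configs inherit from their `_base_` files, with child dicts merged recursively into the base. A simplified stand-in for that merge rule (the real mmcv implementation also supports `_delete_` and list handling):

```python
def merge_cfg(base, override):
    """Recursively merge an override dict into a base dict (simplified
    sketch of mmcv config inheritance; `_delete_` handling omitted)."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge_cfg(out[key], value)
        else:
            out[key] = value
    return out

# toy base/child pair mirroring the configs above
base = {
    'optimizer': {'lr': 0.004, 'paramwise_cfg': {'bias_lr_mult': 2.0}},
    'total_epochs': 12,
}
child = {
    'optimizer': {'lr': 0.002},
    'load_from': 'work_dirs/pgd_nus_benchmark_2x/latest.pth',
}
cfg = merge_cfg(base, child)
# lr is overridden, paramwise_cfg survives, load_from is added
```

This is why the finetune config only needs to state the new learning rate, the changed `code_weight`, and `load_from`; everything else comes from the 1x/2x base.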
@@ -60,7 +60,7 @@ Please refer to [ImVoteNet](https://github.com/open-mmlab/mmdetection3d/blob/mas

### FCOS3D

Please refer to [FCOS3D](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/fcos3d) for details. We provide FCOS3D baselines on the nuScenes dataset.

### PointNet++
@@ -88,4 +88,4 @@ Please refer to [SMOKE](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.

### PGD

Please refer to [PGD](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/pgd) for details. We provide PGD baselines on the KITTI and nuScenes datasets.
@@ -90,4 +90,4 @@

### PGD

Please refer to [PGD](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/pgd) for more details. We provide the corresponding results on the KITTI and nuScenes datasets.
@@ -47,8 +47,6 @@ class FCOSMono3DHead(AnchorFreeMono3DHead):
    """  # noqa: E501

    def __init__(self,
                 regress_ranges=((-1, 48), (48, 96), (96, 192), (192, 384),
                                 (384, INF)),
                 center_sampling=True,
@@ -89,8 +87,6 @@ class FCOSMono3DHead(AnchorFreeMono3DHead):
        self.centerness_alpha = centerness_alpha
        self.centerness_branch = centerness_branch
        super().__init__(
            loss_cls=loss_cls,
            loss_bbox=loss_bbox,
            loss_dir=loss_dir,
...
@@ -69,7 +69,6 @@ class PGDHead(FCOSMono3DHead):
                 loss_bbox2d=dict(
                     type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0),
                 loss_consistency=dict(type='GIoULoss', loss_weight=1.0),
                 pred_bbox2d=True,
                 pred_keypoints=False,
                 bbox_coder=dict(
@@ -77,11 +76,7 @@ class PGDHead(FCOSMono3DHead):
                     base_depths=((28.01, 16.32), ),
                     base_dims=((0.8, 1.73, 0.6), (1.76, 1.73, 0.6),
                                (3.9, 1.56, 1.6)),
                     code_size=7),
                 **kwargs):
        self.use_depth_classifier = use_depth_classifier
        self.use_onlyreg_proj = use_onlyreg_proj
@@ -622,7 +617,9 @@ class PGDHead(FCOSMono3DHead):
                    & (flatten_labels_3d < bg_class_ind)).nonzero().reshape(-1)
        num_pos = len(pos_inds)

        loss_dict = dict()
        loss_dict['loss_cls'] = self.loss_cls(
            flatten_cls_scores,
            flatten_labels_3d,
            avg_factor=num_pos + num_imgs)  # avoid num_pos is 0
@@ -632,8 +629,6 @@
            bbox_preds, dir_cls_preds, depth_cls_preds, weights,
            attr_preds, centernesses, pos_inds, img_metas)
        if num_pos > 0:
            pos_bbox_targets_3d = flatten_bbox_targets_3d[pos_inds]
            pos_centerness_targets = flatten_centerness_targets[pos_inds]
@@ -658,17 +653,17 @@
            pos_bbox_preds, pos_bbox_targets_3d = self.add_sin_difference(
                pos_bbox_preds, pos_bbox_targets_3d)
            loss_dict['loss_offset'] = self.loss_bbox(
                pos_bbox_preds[:, :2],
                pos_bbox_targets_3d[:, :2],
                weight=bbox_weights[:, :2],
                avg_factor=equal_weights.sum())
            loss_dict['loss_size'] = self.loss_bbox(
                pos_bbox_preds[:, 3:6],
                pos_bbox_targets_3d[:, 3:6],
                weight=bbox_weights[:, 3:6],
                avg_factor=equal_weights.sum())
            loss_dict['loss_rotsin'] = self.loss_bbox(
                pos_bbox_preds[:, 6],
                pos_bbox_targets_3d[:, 6],
                weight=bbox_weights[:, 6],
@@ -751,8 +746,8 @@
                weight=bbox_weights[:, -4:],
                avg_factor=equal_weights.sum())
            loss_dict['loss_centerness'] = self.loss_centerness(
                pos_centerness, pos_centerness_targets)
            # attribute classification loss
            if self.pred_attrs:
@@ -764,9 +759,9 @@
        else:
            # need absolute due to possible negative delta x/y
            loss_dict['loss_offset'] = pos_bbox_preds[:, :2].sum()
            loss_dict['loss_size'] = pos_bbox_preds[:, 3:6].sum()
            loss_dict['loss_rotsin'] = pos_bbox_preds[:, 6].sum()
            loss_dict['loss_depth'] = pos_bbox_preds[:, 2].sum()
            if self.pred_velo:
                loss_dict['loss_velo'] = pos_bbox_preds[:, 7:9].sum()
@@ -777,7 +772,7 @@
            if self.pred_bbox2d:
                loss_dict['loss_bbox2d'] = pos_bbox_preds[:, -4:].sum()
                loss_dict['loss_consistency'] = pos_bbox_preds[:, -4:].sum()
            loss_dict['loss_centerness'] = pos_centerness.sum()
            if self.use_direction_classifier:
                loss_dict['loss_dir'] = pos_dir_cls_preds.sum()
            if self.use_depth_classifier:
@@ -792,14 +787,6 @@
            if self.pred_attrs:
                loss_dict['loss_attr'] = pos_attr_preds.sum()

        return loss_dict

    @force_fp32(
...