Commit 4eebf2c6 authored by zhangwenwei

Merge branch 'freeanchor' into 'master'

Freeanchor

See merge request open-mmlab/mmdet.3d!101
parents 37f317e6 140af75d
......@@ -51,6 +51,7 @@ Results and models are available in the [model zoo](docs/model_zoo.md).
|--------------------|:--------:|:--------:|:--------:|:---------:|:-----:|:--------:|:-----:|
| SECOND | ☐ | ☐ | ☐ | ✗ | ✓ | ✓ | ☐ |
| PointPillars | ☐ | ☐ | ☐ | ✗ | ✓ | ✓ | ☐ |
| FreeAnchor | ☐ | ☐ | ☐ | ✗ | ✓ | ✓ | ☐ |
| VoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Part-A2 | ☐ | ☐ | ☐ | ✗ | ✓ | ✓ | ☐ |
| MVXNet | ☐ | ☐ | ☐ | ✗ | ✓ | ✓ | ☐ |
......
......@@ -16,16 +16,16 @@ input_modality = dict(
use_radar=False,
use_map=False,
use_external=False)
# file_client_args = dict(backend='disk')
file_client_args = dict(backend='disk')
# Uncomment the following if you use ceph or other file clients.
# See https://mmcv.readthedocs.io/en/latest/api.html#mmcv.fileio.FileClient
# for more details.
file_client_args = dict(
backend='petrel',
path_mapping=dict({
'./data/nuscenes/': 's3://nuscenes/nuscenes/',
'data/nuscenes/': 's3://nuscenes/nuscenes/'
}))
# file_client_args = dict(
# backend='petrel',
# path_mapping=dict({
# './data/nuscenes/': 's3://nuscenes/nuscenes/',
# 'data/nuscenes/': 's3://nuscenes/nuscenes/'
# }))
train_pipeline = [
dict(
type='LoadPointsFromFile',
......@@ -45,6 +45,7 @@ train_pipeline = [
dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
dict(type='ObjectNameFilter', classes=class_names),
dict(type='PointShuffle'),
dict(type='DefaultFormatBundle3D', class_names=class_names),
dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
......
# FreeAnchor for 3D Object Detection
## Introduction
We implement FreeAnchor in 3D detection systems and provide the first results of FreeAnchor with PointPillars on the nuScenes dataset.
With the implemented `FreeAnchor3DHead`, a PointPillars detector with a large backbone (e.g., RegNet-3.2GF) achieves top performance
on the nuScenes benchmark.
```
@inproceedings{zhang2019freeanchor,
title = {{FreeAnchor}: Learning to Match Anchors for Visual Object Detection},
author = {Zhang, Xiaosong and Wan, Fang and Liu, Chang and Ji, Rongrong and Ye, Qixiang},
booktitle = {Neural Information Processing Systems},
year = {2019}
}
```
## Usage
### Modify config
As in the [baseline config](hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py), we only need to replace the head of an existing one-stage detector to use the FreeAnchor head.
Since the config inherits a common detector head, `_delete_=True` is necessary to avoid conflicts.
The hyperparameters are tuned following the original paper.
```python
_base_ = [
'../_base_/models/pointpillars_second_fpn.py',
'../_base_/datasets/nus-3d.py', '../_base_/schedules/schedule_2x.py',
'../_base_/default_runtime.py'
]
model = dict(
pts_bbox_head=dict(
_delete_=True,
type='FreeAnchor3DHead',
num_classes=10,
in_channels=256,
feat_channels=256,
use_direction_classifier=True,
pre_anchor_topk=25,
bbox_thr=0.5,
gamma=2.0,
alpha=0.5,
anchor_generator=dict(
type='AlignedAnchor3DRangeGenerator',
ranges=[[-50, -50, -1.8, 50, 50, -1.8]],
scales=[1, 2, 4],
sizes=[
[0.8660, 2.5981, 1.], # 1.5/sqrt(3)
[0.5774, 1.7321, 1.], # 1/sqrt(3)
[1., 1., 1.],
[0.4, 0.4, 1],
],
custom_values=[0, 0],
rotations=[0, 1.57],
reshape_out=True),
assigner_per_size=False,
diff_rad_by_sin=True,
dir_offset=0.7854, # pi/4
dir_limit_offset=0,
bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder', code_size=9),
loss_cls=dict(
type='FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=0.8),
loss_dir=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.2)))
# model training and testing settings
train_cfg = dict(
pts=dict(code_weight=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.25, 0.25]))
```
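For intuition, `bbox_thr` above is the `t1` of the saturated linear function FreeAnchor uses to turn IoU into a matching probability, with `t2` being each object's maximum IoU over anchors. Below is a minimal sketch of that function, assuming only PyTorch; the helper name is ours and not part of the codebase:

```python
import torch


def saturated_linear_prob(iou, t1):
    """P{a_j -> b_i}: 0 below t1, 1 at the per-object max IoU, linear between."""
    t2 = iou.max(dim=1, keepdim=True).values.clamp(min=t1 + 1e-12)
    return ((iou - t1) / (t2 - t1)).clamp(min=0, max=1)


# 2 ground truths x 3 anchors; with bbox_thr=0.5 only IoUs above 0.5 get a
# non-zero probability, and the best anchor of each object gets 1.0.
iou = torch.tensor([[0.2, 0.6, 0.8], [0.3, 0.4, 0.7]])
print(saturated_linear_prob(iou, t1=0.5))
```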
## Results
### PointPillars
| Backbone |FreeAnchor|Lr schd | Mem (GB) | Inf time (fps) | mAP |NDS| Download |
| :---------: |:-----: |:-----: | :------: | :------------: | :----: |:----: | :------: |
|[FPN](../pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py)|✗|2x|17.1||40.0|53.3||
|[FPN](./hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py)|✓|2x|||43.7|55.1||
|[RegNetX-400MF-FPN](../regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d.py)|✗|2x|17.3||44.8|56.4||
|[RegNetX-400MF-FPN](./hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py)|✓|2x||||||
|[RegNetX-1.6GF-FPN](./hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py)|✓|2x||||||
|[RegNetX-3.2GF-FPN](./hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py)|✓|2x||||||
_base_ = [
'../_base_/models/pointpillars_second_fpn.py',
'../_base_/datasets/nus-3d.py', '../_base_/schedules/schedule_2x.py',
'../_base_/default_runtime.py'
]
model = dict(
pts_bbox_head=dict(
_delete_=True,
type='FreeAnchor3DHead',
num_classes=10,
in_channels=256,
feat_channels=256,
use_direction_classifier=True,
pre_anchor_topk=25,
bbox_thr=0.5,
gamma=2.0,
alpha=0.5,
anchor_generator=dict(
type='AlignedAnchor3DRangeGenerator',
ranges=[[-50, -50, -1.8, 50, 50, -1.8]],
scales=[1, 2, 4],
sizes=[
[0.8660, 2.5981, 1.], # 1.5/sqrt(3)
[0.5774, 1.7321, 1.], # 1/sqrt(3)
[1., 1., 1.],
[0.4, 0.4, 1],
],
custom_values=[0, 0],
rotations=[0, 1.57],
reshape_out=True),
assigner_per_size=False,
diff_rad_by_sin=True,
dir_offset=0.7854, # pi/4
dir_limit_offset=0,
bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder', code_size=9),
loss_cls=dict(
type='FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=0.8),
loss_dir=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.2)))
# model training and testing settings
train_cfg = dict(
pts=dict(code_weight=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.25, 0.25]))
_base_ = './hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py'
model = dict(
pretrained=dict(pts='open-mmlab://regnetx_1.6gf'),
pts_backbone=dict(
_delete_=True,
type='NoStemRegNet',
arch='regnetx_1.6gf',
out_indices=(1, 2, 3),
frozen_stages=-1,
strides=(1, 2, 2, 2),
base_channels=64,
stem_channels=64,
norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01),
norm_eval=False,
style='pytorch'),
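    # RegNetX-1.6GF stage widths are (72, 168, 408, 912); out_indices=(1, 2, 3)
    # selects the last three stages, hence the neck input channels below.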
pts_neck=dict(in_channels=[168, 408, 912]))
_base_ = [
'../_base_/models/pointpillars_second_fpn.py',
'../_base_/datasets/nus-3d.py',
'../_base_/schedules/schedule_2x.py',
'../_base_/default_runtime.py',
]
# model settings
model = dict(
type='MVXFasterRCNN',
pretrained=dict(pts='open-mmlab://regnetx_1.6gf'),
pts_backbone=dict(
_delete_=True,
type='NoStemRegNet',
arch='regnetx_1.6gf',
out_indices=(1, 2, 3),
frozen_stages=-1,
strides=(1, 2, 2, 2),
base_channels=64,
stem_channels=64,
norm_cfg=dict(type='naiveSyncBN2d', eps=1e-3, momentum=0.01),
norm_eval=False,
style='pytorch'),
pts_neck=dict(in_channels=[168, 408, 912]))
......@@ -10,7 +10,7 @@ def merge_aug_bboxes_3d(aug_results, img_metas, test_cfg):
Args:
aug_results (list[dict]): The dict of detection results.
The dict contains the following keys
- boxes_3d (:obj:BaseInstance3DBoxes): detection bbox
- boxes_3d (:obj:`BaseInstance3DBoxes`): detection bbox
- scores_3d (torch.Tensor): detection scores
- labels_3d (torch.Tensor): predicted box labels
img_metas (list[dict]): Meta information of each sample
......@@ -18,7 +18,7 @@ def merge_aug_bboxes_3d(aug_results, img_metas, test_cfg):
Returns:
dict: bbox results in cpu mode, containing the merged results
- boxes_3d (:obj:BaseInstance3DBoxes): merged detection bbox
- boxes_3d (:obj:`BaseInstance3DBoxes`): merged detection bbox
- scores_3d (torch.Tensor): merged detection scores
- labels_3d (torch.Tensor): merged predicted box labels
"""
......
from .anchor3d_head import Anchor3DHead
from .free_anchor3d_head import FreeAnchor3DHead
from .parta2_rpn_head import PartA2RPNHead
from .vote_head import VoteHead
__all__ = ['Anchor3DHead', 'PartA2RPNHead', 'VoteHead']
__all__ = ['Anchor3DHead', 'FreeAnchor3DHead', 'PartA2RPNHead', 'VoteHead']
......@@ -281,7 +281,7 @@ class Anchor3DHead(nn.Module, AnchorTrainMixin):
bbox_preds (list[torch.Tensor]): Multi-level bbox predictions.
dir_cls_preds (list[torch.Tensor]): Multi-level direction
class predictions.
gt_bboxes (list[:obj:BaseInstance3DBoxes]): Gt bboxes
gt_bboxes (list[:obj:`BaseInstance3DBoxes`]): Gt bboxes
of each sample.
gt_labels (list[torch.Tensor]): Gt labels of each sample.
input_metas (list[dict]): Contain pcd and img's meta info.
......@@ -405,7 +405,7 @@ class Anchor3DHead(nn.Module, AnchorTrainMixin):
Returns:
tuple: Contain predictions of single batch.
- bboxes (:obj:BaseInstance3DBoxes): Predicted 3d bboxes.
- bboxes (:obj:`BaseInstance3DBoxes`): Predicted 3d bboxes.
- scores (torch.Tensor): Class score of each bbox.
- labels (torch.Tensor): Label of each bbox.
"""
......
import torch
import torch.nn.functional as F
from mmdet3d.core.bbox import bbox_overlaps_nearest_3d
from mmdet.models import HEADS
from .anchor3d_head import Anchor3DHead
from .train_mixins import get_direction_target
@HEADS.register_module()
class FreeAnchor3DHead(Anchor3DHead):
"""`FreeAnchor <https://arxiv.org/abs/1909.02466>`_ head for 3D detection
Note:
This implementation is directly modified from the `mmdet implementation
<https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/dense_heads/free_anchor_retina_head.py>`_ # noqa
We find it also works on 3D detection with minor modifications, i.e.,
different hyper-parameters and an additional direction classifier.
Args:
pre_anchor_topk (int): Number of anchors taken into each bag.
bbox_thr (float): The threshold of the saturated linear function. It is
usually the same as the IoU threshold used in NMS.
gamma (float): Gamma parameter in focal loss.
alpha (float): Alpha parameter in focal loss.
kwargs (dict): Other arguments are the same as those in :class:`Anchor3DHead`.
"""
def __init__(self,
pre_anchor_topk=50,
bbox_thr=0.6,
gamma=2.0,
alpha=0.5,
**kwargs):
super().__init__(**kwargs)
self.pre_anchor_topk = pre_anchor_topk
self.bbox_thr = bbox_thr
self.gamma = gamma
self.alpha = alpha
def loss(self,
cls_scores,
bbox_preds,
dir_cls_preds,
gt_bboxes,
gt_labels,
input_metas,
gt_bboxes_ignore=None):
"""Calculate loss of FreeAnchor head.
Args:
cls_scores (list[torch.Tensor]): Classification scores of
different samples.
bbox_preds (list[torch.Tensor]): Box predictions of
different samples.
dir_cls_preds (list[torch.Tensor]): Direction predictions of
different samples.
gt_bboxes (list[:obj:`BaseInstance3DBoxes`]): Ground truth boxes.
gt_labels (list[torch.Tensor]): Ground truth labels.
input_metas (list[dict]): List of input meta information.
gt_bboxes_ignore (list[:obj:`BaseInstance3DBoxes`], optional):
Ground truth boxes that should be ignored. Defaults to None.
Returns:
dict: Loss items.
"""
featmap_sizes = [featmap.size()[-2:] for featmap in cls_scores]
assert len(featmap_sizes) == self.anchor_generator.num_levels
anchor_list = self.get_anchors(featmap_sizes, input_metas)
anchors = [torch.cat(anchor) for anchor in anchor_list]
# concatenate each level
cls_scores = [
cls_score.permute(0, 2, 3, 1).reshape(
cls_score.size(0), -1, self.num_classes)
for cls_score in cls_scores
]
bbox_preds = [
bbox_pred.permute(0, 2, 3, 1).reshape(
bbox_pred.size(0), -1, self.box_code_size)
for bbox_pred in bbox_preds
]
dir_cls_preds = [
dir_cls_pred.permute(0, 2, 3,
1).reshape(dir_cls_pred.size(0), -1, 2)
for dir_cls_pred in dir_cls_preds
]
cls_scores = torch.cat(cls_scores, dim=1)
bbox_preds = torch.cat(bbox_preds, dim=1)
dir_cls_preds = torch.cat(dir_cls_preds, dim=1)
cls_prob = torch.sigmoid(cls_scores)
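# after flattening and concatenating all levels:
# cls_prob: [batch, num_anchors, num_classes]
# bbox_preds: [batch, num_anchors, box_code_size]
# dir_cls_preds: [batch, num_anchors, 2]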
box_prob = []
num_pos = 0
positive_losses = []
for _, (anchors_, gt_labels_, gt_bboxes_, cls_prob_, bbox_preds_,
dir_cls_preds_) in enumerate(
zip(anchors, gt_labels, gt_bboxes, cls_prob, bbox_preds,
dir_cls_preds)):
gt_bboxes_ = gt_bboxes_.tensor.to(anchors_.device)
with torch.no_grad():
# box_localization: a_{j}^{loc}, shape: [j, box_code_size]
pred_boxes = self.bbox_coder.decode(anchors_, bbox_preds_)
# object_box_iou: IoU_{ij}^{loc}, shape: [i, j]
object_box_iou = bbox_overlaps_nearest_3d(
gt_bboxes_, pred_boxes)
# object_box_prob: P{a_{j} -> b_{i}}, shape: [i, j]
t1 = self.bbox_thr
t2 = object_box_iou.max(
dim=1, keepdim=True).values.clamp(min=t1 + 1e-12)
object_box_prob = ((object_box_iou - t1) / (t2 - t1)).clamp(
min=0, max=1)
# object_cls_box_prob: P{a_{j} -> b_{i}}, shape: [i, c, j]
num_obj = gt_labels_.size(0)
indices = torch.stack(
[torch.arange(num_obj).type_as(gt_labels_), gt_labels_],
dim=0)
object_cls_box_prob = torch.sparse_coo_tensor(
indices, object_box_prob)
# image_box_prob: P{a_{j} \in A_{+}}, shape: [j, c]
"""
from "start" to "end" implement:
image_box_prob = torch.sparse.max(object_cls_box_prob,
dim=0).t()
"""
# start
box_cls_prob = torch.sparse.sum(
object_cls_box_prob, dim=0).to_dense()
indices = torch.nonzero(box_cls_prob).t_()
if indices.numel() == 0:
image_box_prob = torch.zeros(
anchors_.size(0),
self.num_classes).type_as(object_box_prob)
else:
nonzero_box_prob = torch.where(
(gt_labels_.unsqueeze(dim=-1) == indices[0]),
object_box_prob[:, indices[1]],
torch.tensor(
[0]).type_as(object_box_prob)).max(dim=0).values
# unmap back to dense shape [j, c]
image_box_prob = torch.sparse_coo_tensor(
indices.flip([0]),
nonzero_box_prob,
size=(anchors_.size(0), self.num_classes)).to_dense()
# end
box_prob.append(image_box_prob)
# construct bags for objects
match_quality_matrix = bbox_overlaps_nearest_3d(
gt_bboxes_, anchors_)
_, matched = torch.topk(
match_quality_matrix,
self.pre_anchor_topk,
dim=1,
sorted=False)
del match_quality_matrix
# matched_cls_prob: P_{ij}^{cls}
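# cls_prob_[matched] has shape [num_gts, topk, num_classes]; for each
# object and each of its top-k anchors, gather the probability of the
# object's class, giving shape [num_gts, topk]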
matched_cls_prob = torch.gather(
cls_prob_[matched], 2,
gt_labels_.view(-1, 1, 1).repeat(1, self.pre_anchor_topk,
1)).squeeze(2)
# matched_box_prob: P_{ij}^{loc}
matched_anchors = anchors_[matched]
matched_object_targets = self.bbox_coder.encode(
matched_anchors,
gt_bboxes_.unsqueeze(dim=1).expand_as(matched_anchors))
# direction classification loss
loss_dir = None
if self.use_direction_classifier:
# also calculate direction prob: P_{ij}^{dir}
matched_dir_targets = get_direction_target(
matched_anchors,
matched_object_targets,
self.dir_offset,
one_hot=False)
loss_dir = self.loss_dir(
dir_cls_preds_[matched].transpose(-2, -1),
matched_dir_targets,
reduction_override='none')
# generate bbox weights
if self.diff_rad_by_sin:
bbox_preds_[matched], matched_object_targets = \
self.add_sin_difference(
bbox_preds_[matched], matched_object_targets)
bbox_weights = matched_anchors.new_ones(matched_anchors.size())
# Using `pop` here is not right; check performance.
code_weight = self.train_cfg.get('code_weight', None)
if code_weight:
bbox_weights = bbox_weights * bbox_weights.new_tensor(
code_weight)
loss_bbox = self.loss_bbox(
bbox_preds_[matched],
matched_object_targets,
bbox_weights,
reduction_override='none').sum(-1)
if loss_dir is not None:
loss_bbox += loss_dir
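# interpret the (direction-augmented) regression loss as a negative
# log-likelihood, so that P_{ij}^{loc} = exp(-loss_bbox)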
matched_box_prob = torch.exp(-loss_bbox)
# positive_losses: {-log( Mean-max(P_{ij}^{cls} * P_{ij}^{loc}) )}
num_pos += len(gt_bboxes_)
positive_losses.append(
self.positive_bag_loss(matched_cls_prob, matched_box_prob))
positive_loss = torch.cat(positive_losses).sum() / max(1, num_pos)
# box_prob: P{a_{j} \in A_{+}}
box_prob = torch.stack(box_prob, dim=0)
# negative_loss:
# \sum_{j}{ FL((1 - P{a_{j} \in A_{+}}) * (1 - P_{j}^{bg})) } / n||B||
negative_loss = self.negative_bag_loss(cls_prob, box_prob).sum() / max(
1, num_pos * self.pre_anchor_topk)
losses = {
'positive_bag_loss': positive_loss,
'negative_bag_loss': negative_loss
}
return losses
def positive_bag_loss(self, matched_cls_prob, matched_box_prob):
"""Generate positive bag loss
Args:
matched_cls_prob (torch.Tensor): Classification probability
of matched positive samples.
matched_box_prob (torch.Tensor): Bounding box probability
of matched positive samples.
Returns:
torch.Tensor: Loss of positive samples.
"""
# bag_prob = Mean-max(matched_prob)
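# the normalized 1 / (1 - p) weights below interpolate between a mean
# (all probabilities small) and a max (one probability near 1), i.e. the
# Mean-max function of the FreeAnchor paper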
matched_prob = matched_cls_prob * matched_box_prob
weight = 1 / torch.clamp(1 - matched_prob, 1e-12, None)
weight /= weight.sum(dim=1).unsqueeze(dim=-1)
bag_prob = (weight * matched_prob).sum(dim=1)
# positive_bag_loss = -self.alpha * log(bag_prob)
bag_prob = bag_prob.clamp(0, 1)  # clamp to avoid numerical issues in BCE
return self.alpha * F.binary_cross_entropy(
bag_prob, torch.ones_like(bag_prob), reduction='none')
def negative_bag_loss(self, cls_prob, box_prob):
"""Generate negative bag loss
Args:
cls_prob (torch.Tensor): Classification probability
of negative samples.
box_prob (torch.Tensor): Bounding box probability
of negative samples.
Returns:
torch.Tensor: Loss of negative samples.
"""
prob = cls_prob * (1 - box_prob)
prob = prob.clamp(0, 1)  # clamp to avoid numerical issues in BCE
negative_bag_loss = prob**self.gamma * F.binary_cross_entropy(
prob, torch.zeros_like(prob), reduction='none')
return (1 - self.alpha) * negative_bag_loss
......@@ -121,7 +121,7 @@ class PartA2RPNHead(Anchor3DHead):
Returns:
dict: Predictions of single batch. Contain the keys:
- boxes_3d (:obj:BaseInstance3DBoxes): Predicted 3d bboxes.
- boxes_3d (:obj:`BaseInstance3DBoxes`): Predicted 3d bboxes.
- scores_3d (torch.Tensor): Score of each bbox.
- labels_3d (torch.Tensor): Label of each bbox.
- cls_preds (torch.Tensor): Class score of each bbox.
......@@ -217,7 +217,7 @@ class PartA2RPNHead(Anchor3DHead):
Returns:
dict: Predictions of single batch. Contain the keys:
- boxes_3d (:obj:BaseInstance3DBoxes): Predicted 3d bboxes.
- boxes_3d (:obj:`BaseInstance3DBoxes`): Predicted 3d bboxes.
- scores_3d (torch.Tensor): Score of each bbox.
- labels_3d (torch.Tensor): Label of each bbox.
- cls_preds (torch.Tensor): Class score of each bbox.
......
......@@ -20,7 +20,7 @@ class AnchorTrainMixin(object):
Args:
anchor_list (list[list]): Multi level anchors of each image.
gt_bboxes_list (list[:obj:BaseInstance3DBoxes]): Ground truth
gt_bboxes_list (list[:obj:`BaseInstance3DBoxes`]): Ground truth
bboxes of each image.
input_metas (list[dict]): Meta info of each image.
gt_bboxes_ignore_list (None | list): Ignore list of gt bboxes.
......@@ -96,7 +96,7 @@ class AnchorTrainMixin(object):
Args:
anchors (torch.Tensor): Concatenated multi-level anchor.
gt_bboxes (:obj:BaseInstance3DBoxes): Gt bboxes.
gt_bboxes (:obj:`BaseInstance3DBoxes`): Gt bboxes.
gt_bboxes_ignore (torch.Tensor): Ignored gt bboxes.
gt_labels (torch.Tensor): Gt class labels.
input_meta (dict): Meta info of each image.
......@@ -185,7 +185,7 @@ class AnchorTrainMixin(object):
Args:
bbox_assigner (BaseAssigner): assign positive and negative boxes.
anchors (torch.Tensor): Concatenated multi-level anchor.
gt_bboxes (:obj:BaseInstance3DBoxes): Gt bboxes.
gt_bboxes (:obj:`BaseInstance3DBoxes`): Gt bboxes.
gt_bboxes_ignore (torch.Tensor): Ignored gt bboxes.
gt_labels (torch.Tensor): Gt class labels.
input_meta (dict): Meta info of each image.
......
......@@ -189,7 +189,7 @@ class VoteHead(nn.Module):
Args:
bbox_preds (dict): Predictions from forward of vote head.
points (list[torch.Tensor]): Input points.
gt_bboxes_3d (list[:obj:BaseInstance3DBoxes]): Gt bboxes
gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Gt bboxes
of each sample.
gt_labels_3d (list[torch.Tensor]): Gt labels of each sample.
pts_semantic_mask (None | list[torch.Tensor]): Point-wise
......@@ -296,7 +296,7 @@ class VoteHead(nn.Module):
Args:
points (list[torch.Tensor]): Points of each batch.
gt_bboxes_3d (list[:obj:BaseInstance3DBoxes]): gt bboxes of
gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): gt bboxes of
each batch.
gt_labels_3d (list[torch.Tensor]): gt class labels of each batch.
pts_semantic_mask (None | list[torch.Tensor]): point-wise semantic
......@@ -382,7 +382,7 @@ class VoteHead(nn.Module):
Args:
points (torch.Tensor): Points of each batch.
gt_bboxes_3d (:obj:BaseInstance3DBoxes): gt bboxes of each batch.
gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): gt bboxes of each batch.
gt_labels_3d (torch.Tensor): gt class labels of each batch.
pts_semantic_mask (None | torch.Tensor): point-wise semantic
label of each batch.
......
......@@ -38,7 +38,7 @@ class VoteNet(SingleStage3DDetector):
Args:
points (list[torch.Tensor]): Points of each batch.
img_metas (list): Image metas.
gt_bboxes_3d (:obj:BaseInstance3DBoxes): gt bboxes of each batch.
gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): gt bboxes of each batch.
gt_labels_3d (list[torch.Tensor]): gt class labels of each batch.
pts_semantic_mask (None | list[torch.Tensor]): point-wise semantic
label of each batch.
......
......@@ -63,7 +63,7 @@ class Base3DRoIHead(nn.Module, metaclass=ABCMeta):
x (dict): Contains features from the first stage.
img_metas (list[dict]): Meta info of each image.
proposal_list (list[dict]): Proposal information from rpn.
gt_bboxes (list[:obj:BaseInstance3DBoxes]):
gt_bboxes (list[:obj:`BaseInstance3DBoxes`]):
GT bboxes of each sample. The bboxes are encapsulated
by 3D box structures.
gt_labels (list[LongTensor]): GT labels of each sample.
......
......@@ -78,7 +78,7 @@ class PointwiseSemanticHead(nn.Module):
Args:
voxel_centers (torch.Tensor): shape [voxel_num, 3],
the center of voxels
gt_bboxes_3d (:obj:BaseInstance3DBoxes): gt boxes containing tensor
gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): gt boxes with tensor
of shape [box_num, 7].
gt_labels_3d (torch.Tensor): shape [box_num], class label of gt
......@@ -125,7 +125,7 @@ class PointwiseSemanticHead(nn.Module):
Args:
voxel_centers (torch.Tensor): shape [voxel_num, 3],
the center of voxels
gt_bboxes_3d (list[:obj:BaseInstance3DBoxes]): list of gt boxes
gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): list of gt boxes
containing tensor of shape [box_num, 7].
gt_labels_3d (list[torch.Tensor]): list of GT labels.
......
......@@ -79,10 +79,10 @@ class PartAggregationROIHead(Base3DRoIHead):
img_metas (list[dict]): Meta info of each image.
proposal_list (list[dict]): Proposal information from rpn.
The dictionary should contain the following keys:
- boxes_3d (:obj:BaseInstance3DBoxes): Proposal bboxes
- boxes_3d (:obj:`BaseInstance3DBoxes`): Proposal bboxes
- labels_3d (torch.Tensor): Labels of proposals
- cls_preds (torch.Tensor): Original scores of proposals
gt_bboxes_3d (list[:obj:BaseInstance3DBoxes]):
gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]):
GT bboxes of each sample. The bboxes are encapsulated
by 3D box structures.
gt_labels_3d (list[LongTensor]): GT labels of each sample.
......