Commit 53435c62 authored by Yezhen Cong, committed by Tai-Wang

[Refactor] Refactor code structure and docstrings (#803)

* refactor points_in_boxes

* Merge same functions of three boxes

* More docstring fixes and unify x/y/z size

* Add "optional" and fix "Default"

* Add "optional" and fix "Default"

* Add "optional" and fix "Default"

* Add "optional" and fix "Default"

* Add "optional" and fix "Default"

* Remove None in function param type

* Fix unittest

* Add comments for NMS functions

* Merge methods of Points

* Add unittest

* Add optional and default value

* Fix box conversion and add unittest

* Fix comments

* Add unit test

* Indent

* Fix CI

* Remove useless \\

* Remove useless \\

* Remove useless \\

* Remove useless \\

* Remove useless \\

* Add unit test for box bev

* More unit tests and refine docstrings in box_np_ops

* Fix comment

* Add deprecation warning
parent 4f36084f
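The repeated "Add optional and fix Default" bullets refer to a docstring convention applied throughout the diff below: optional parameters are marked `optional` in their type and state their default as `Default: <value>`. A minimal, hypothetical illustration (the function and parameter names are made up; only the convention is from this commit):

```python
def example_head(in_channels, head_conv=64, final_kernel=1):
    """Illustration of the docstring convention this commit enforces.

    Args:
        in_channels (int): Input channels for the conv layer.
        head_conv (int, optional): Output channels. Default: 64.
        final_kernel (int, optional): Kernel size for the last conv
            layer. Default: 1.
    """
    return in_channels, head_conv, final_kernel
```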
@@ -21,16 +21,16 @@ class SeparateHead(BaseModule):
    Args:
        in_channels (int): Input channels for conv_layer.
        heads (dict): Conv information.
        head_conv (int, optional): Output channels.
            Default: 64.
        final_kernel (int, optional): Kernel size for the last conv layer.
            Default: 1.
        init_bias (float, optional): Initial bias. Default: -2.19.
        conv_cfg (dict, optional): Config of conv layer.
            Default: dict(type='Conv2d')
        norm_cfg (dict, optional): Config of norm layer.
            Default: dict(type='BN2d').
        bias (str, optional): Type of bias. Default: 'auto'.
    """

    def __init__(self,
@@ -100,17 +100,17 @@ class SeparateHead(BaseModule):
        Returns:
            dict[str: torch.Tensor]: contains the following keys:
                -reg (torch.Tensor): 2D regression value with the
                    shape of [B, 2, H, W].
                -height (torch.Tensor): Height value with the
                    shape of [B, 1, H, W].
                -dim (torch.Tensor): Size value with the shape
                    of [B, 3, H, W].
                -rot (torch.Tensor): Rotation value with the
                    shape of [B, 2, H, W].
                -vel (torch.Tensor): Velocity value with the
                    shape of [B, 2, H, W].
                -heatmap (torch.Tensor): Heatmap with the shape of
                    [B, N, H, W].
        """
        ret_dict = dict()
@@ -131,18 +131,19 @@ class DCNSeparateHead(BaseModule):
    Args:
        in_channels (int): Input channels for conv_layer.
        num_cls (int): Number of classes.
        heads (dict): Conv information.
        dcn_config (dict): Config of dcn layer.
        head_conv (int, optional): Output channels.
            Default: 64.
        final_kernel (int, optional): Kernel size for the last conv
            layer. Default: 1.
        init_bias (float, optional): Initial bias. Default: -2.19.
        conv_cfg (dict, optional): Config of conv layer.
            Default: dict(type='Conv2d')
        norm_cfg (dict, optional): Config of norm layer.
            Default: dict(type='BN2d').
        bias (str, optional): Type of bias. Default: 'auto'.
    """  # noqa: W605

    def __init__(self,
@@ -215,17 +216,17 @@ class DCNSeparateHead(BaseModule):
        Returns:
            dict[str: torch.Tensor]: contains the following keys:
                -reg (torch.Tensor): 2D regression value with the
                    shape of [B, 2, H, W].
                -height (torch.Tensor): Height value with the
                    shape of [B, 1, H, W].
                -dim (torch.Tensor): Size value with the shape
                    of [B, 3, H, W].
                -rot (torch.Tensor): Rotation value with the
                    shape of [B, 2, H, W].
                -vel (torch.Tensor): Velocity value with the
                    shape of [B, 2, H, W].
                -heatmap (torch.Tensor): Heatmap with the shape of
                    [B, N, H, W].
        """
        center_feat = self.feature_adapt_cls(x)
@@ -243,31 +244,30 @@ class CenterHead(BaseModule):
    """CenterHead for CenterPoint.

    Args:
        in_channels (list[int] | int, optional): Channels of the input
            feature map. Default: [128].
        tasks (list[dict], optional): Task information including class number
            and class names. Default: None.
        train_cfg (dict, optional): Train-time configs. Default: None.
        test_cfg (dict, optional): Test-time configs. Default: None.
        bbox_coder (dict, optional): Bbox coder configs. Default: None.
        common_heads (dict, optional): Conv information for common heads.
            Default: dict().
        loss_cls (dict, optional): Config of classification loss function.
            Default: dict(type='GaussianFocalLoss', reduction='mean').
        loss_bbox (dict, optional): Config of regression loss function.
            Default: dict(type='L1Loss', reduction='none').
        separate_head (dict, optional): Config of separate head. Default: dict(
            type='SeparateHead', init_bias=-2.19, final_kernel=3)
        share_conv_channel (int, optional): Output channels for share_conv
            layer. Default: 64.
        num_heatmap_convs (int, optional): Number of conv layers for heatmap
            conv layer. Default: 2.
        conv_cfg (dict, optional): Config of conv layer.
            Default: dict(type='Conv2d')
        norm_cfg (dict, optional): Config of norm layer.
            Default: dict(type='BN2d').
        bias (str, optional): Type of bias. Default: 'auto'.
    """

    def __init__(self,
@@ -366,8 +366,8 @@ class CenterHead(BaseModule):
            feat (torch.Tensor): Feature map with the shape of [B, H*W, 10].
            ind (torch.Tensor): Index of the ground truth boxes with the
                shape of [B, max_obj].
            mask (torch.Tensor, optional): Mask of the feature map with the
                shape of [B, max_obj]. Default: None.

        Returns:
            torch.Tensor: Feature map after gathering with the shape
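A plain-Python sketch of what this gather does (the real method uses `torch.gather` on `[B, H*W, C]` tensors; the names here are illustrative, not the library code):

```python
def gather_feat(feat, ind, mask=None):
    # feat: list of per-location feature vectors (stand-in for [H*W, C])
    # ind: indices of ground-truth locations to collect
    # mask: optional per-object validity flags applied after gathering
    gathered = [feat[i] for i in ind]
    if mask is not None:
        gathered = [f for f, keep in zip(gathered, mask) if keep]
    return gathered
```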
@@ -403,14 +403,14 @@ class CenterHead(BaseModule):
        Returns:
            tuple[list[torch.Tensor]]: Tuple of target including
                the following results in order.

                - list[torch.Tensor]: Heatmap scores.
                - list[torch.Tensor]: Ground truth boxes.
                - list[torch.Tensor]: Indexes indicating the
                    position of the valid boxes.
                - list[torch.Tensor]: Masks indicating which
                    boxes are valid.
        """
        heatmaps, anno_boxes, inds, masks = multi_apply(
@@ -437,14 +437,14 @@ class CenterHead(BaseModule):
            gt_labels_3d (torch.Tensor): Labels of boxes.

        Returns:
            tuple[list[torch.Tensor]]: Tuple of target including
                the following results in order.

                - list[torch.Tensor]: Heatmap scores.
                - list[torch.Tensor]: Ground truth boxes.
                - list[torch.Tensor]: Indexes indicating the position
                    of the valid boxes.
                - list[torch.Tensor]: Masks indicating which boxes
                    are valid.
        """
        device = gt_labels_3d.device
@@ -728,11 +728,11 @@ class CenterHead(BaseModule):
        Returns:
            list[dict[str: torch.Tensor]]: contains the following keys:
                -bboxes (torch.Tensor): Prediction bboxes after nms with the
                    shape of [N, 9].
                -scores (torch.Tensor): Prediction scores after nms with the
                    shape of [N].
                -labels (torch.Tensor): Prediction labels after nms with the
                    shape of [N].
        """
        predictions_dicts = []
@@ -781,7 +781,7 @@ class CenterHead(BaseModule):
                    boxes_for_nms,
                    top_scores,
                    thresh=self.test_cfg['nms_thr'],
                    pre_max_size=self.test_cfg['pre_max_size'],
                    post_max_size=self.test_cfg['post_max_size'])
            else:
                selected = []
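The `pre_max_size`/`post_max_size` arguments whose keyword is fixed here bound the number of boxes kept before and after suppression. A rough pure-Python sketch of that behavior, using 2D axis-aligned IoU as a stand-in for the BEV overlap the real GPU op computes:

```python
def iou(a, b):
    # a, b: axis-aligned boxes (x1, y1, x2, y2)
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, thresh, pre_max_size=None, post_max_size=None):
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    if pre_max_size is not None:
        order = order[:pre_max_size]   # keep only top-k boxes before NMS
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    if post_max_size is not None:
        keep = keep[:post_max_size]    # cap the number of results after NMS
    return keep
```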
......
@@ -21,25 +21,25 @@ class FCOSMono3DHead(AnchorFreeMono3DHead):
        num_classes (int): Number of categories excluding the background
            category.
        in_channels (int): Number of channels in the input feature map.
        regress_ranges (tuple[tuple[int, int]], optional): Regress range of
            multiple level points.
        center_sampling (bool, optional): If true, use center sampling.
            Default: True.
        center_sample_radius (float, optional): Radius of center sampling.
            Default: 1.5.
        norm_on_bbox (bool, optional): If true, normalize the regression
            targets with FPN strides. Default: True.
        centerness_on_reg (bool, optional): If true, position centerness on the
            regress branch. Please refer to https://github.com/tianzhi0549/FCOS/issues/89#issuecomment-516877042.
            Default: True.
        centerness_alpha (int, optional): Parameter used to adjust the
            intensity attenuation from the center to the periphery.
            Default: 2.5.
        loss_cls (dict, optional): Config of classification loss.
        loss_bbox (dict, optional): Config of localization loss.
        loss_dir (dict, optional): Config of direction classification loss.
        loss_attr (dict, optional): Config of attribute classification loss.
        loss_centerness (dict, optional): Config of centerness loss.
        norm_cfg (dict, optional): dictionary to construct and config norm
            layer. Default: norm_cfg=dict(type='GN', num_groups=32,
            requires_grad=True).
        centerness_branch (tuple[int], optional): Channels for centerness
            branch. Default: (64, ).
    """  # noqa: E501
@@ -153,7 +153,7 @@ class FCOSMono3DHead(AnchorFreeMono3DHead):
                is True.

        Returns:
            tuple: scores for each class, bbox and direction class
                predictions, centerness predictions of input feature maps.
        """
        cls_score, bbox_pred, dir_cls_pred, attr_pred, cls_feat, reg_feat = \
@@ -201,7 +201,7 @@ class FCOSMono3DHead(AnchorFreeMono3DHead):
                the 7th dimension is rotation dimension.

        Returns:
            tuple[torch.Tensor]: ``boxes1`` and ``boxes2`` whose 7th
                dimensions are changed.
        """
        rad_pred_encoding = torch.sin(boxes1[..., 6:7]) * torch.cos(
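The identity behind this encoding is sin(a − b) = sin(a)cos(b) − cos(a)sin(b): each box's raw yaw is replaced so that a regression loss on the difference of the two encoded values penalizes the angular difference. A sketch with scalar angles in place of tensors:

```python
import math

def encode_rot_pair(a, b):
    # replace raw angles a, b with sin(a)*cos(b) and cos(a)*sin(b);
    # the difference of the two encodings equals sin(a - b)
    return math.sin(a) * math.cos(b), math.cos(a) * math.sin(b)
```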
@@ -295,7 +295,7 @@ class FCOSMono3DHead(AnchorFreeMono3DHead):
            attr_labels (list[Tensor]): Attributes indices of each box.
            img_metas (list[dict]): Meta information of each image, e.g.,
                image size, scaling factor, etc.
            gt_bboxes_ignore (list[Tensor]): specify which bounding
                boxes can be ignored when computing the loss.

        Returns:
@@ -507,11 +507,11 @@ class FCOSMono3DHead(AnchorFreeMono3DHead):
            rescale (bool): If True, return boxes in original image space.

        Returns:
            list[tuple[Tensor, Tensor]]: Each item in result_list is 2-tuple.
                The first item is an (n, 5) tensor, where the first 4 columns
                are bounding box positions (tl_x, tl_y, br_x, br_y) and the
                5-th column is a score between 0 and 1. The second item is a
                (n,) tensor where each item is the predicted class label of
                the corresponding box.
        """
        assert len(cls_scores) == len(bbox_preds) == len(dir_cls_preds) == \
@@ -580,7 +580,7 @@ class FCOSMono3DHead(AnchorFreeMono3DHead):
            bbox_preds (list[Tensor]): Box energies / deltas for a single scale
                level with shape (num_points * bbox_code_size, H, W).
            dir_cls_preds (list[Tensor]): Box scores for direction class
                predictions on a single scale level with shape
                (num_points * 2, H, W).
            attr_preds (list[Tensor]): Attribute scores for each scale level
                with shape (N, num_points * num_attrs, H, W).
@@ -700,12 +700,12 @@ class FCOSMono3DHead(AnchorFreeMono3DHead):
    def pts2Dto3D(points, view):
        """
        Args:
            points (torch.Tensor): points in 2D images, [N, 3],
                3 corresponds with x, y in the image and depth.
            view (np.ndarray): camera intrinsic, [3, 3]

        Returns:
            torch.Tensor: points in 3D space. [N, 3],
                3 corresponds with x, y, z in 3D space.
        """
        assert view.shape[0] <= 4
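For a simple pinhole intrinsic K = [[f, 0, cx], [0, f, cy], [0, 0, 1]], the inverse mapping has a closed form. A sketch of the geometry only (not the library implementation, which handles a general, possibly padded view matrix):

```python
def pts_2d_to_3d(points, f, cx, cy):
    # (u, v, depth) image points -> (x, y, z) in the camera frame,
    # assuming a pinhole camera with focal length f and principal
    # point (cx, cy): x = (u - cx) * depth / f, y = (v - cy) * depth / f
    return [((u - cx) * d / f, (v - cy) * d / f, d) for u, v, d in points]
```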
@@ -767,8 +767,8 @@ class FCOSMono3DHead(AnchorFreeMono3DHead):
        Returns:
            tuple:
                concat_lvl_labels (list[Tensor]): Labels of each level.
                concat_lvl_bbox_targets (list[Tensor]): BBox targets of each
                    level.
        """
        assert len(points) == len(self.regress_ranges)
......
@@ -25,13 +25,13 @@ class PointsObjClsModule(BaseModule):
    Args:
        in_channel (int): number of channels of seed point features.
        num_convs (int, optional): number of conv layers.
            Default: 3.
        conv_cfg (dict, optional): Config of convolution.
            Default: dict(type='Conv1d').
        norm_cfg (dict, optional): Config of normalization.
            Default: dict(type='BN1d').
        act_cfg (dict, optional): Config of activation.
            Default: dict(type='ReLU').
    """
@@ -405,15 +405,15 @@ class GroupFree3DHead(BaseModule):
        Args:
            bbox_preds (dict): Predictions from forward of vote head.
            points (list[torch.Tensor]): Input points.
            gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Ground truth
                bboxes of each sample.
            gt_labels_3d (list[torch.Tensor]): Labels of each sample.
            pts_semantic_mask (list[torch.Tensor]): Point-wise
                semantic mask.
            pts_instance_mask (list[torch.Tensor]): Point-wise
                instance mask.
            img_metas (list[dict]): Contain pcd and img's meta info.
            gt_bboxes_ignore (list[torch.Tensor]): Specify
                which bounding boxes can be ignored.
            ret_target (bool): Return targets or not.
@@ -545,12 +545,12 @@ class GroupFree3DHead(BaseModule):
        Args:
            points (list[torch.Tensor]): Points of each batch.
            gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Ground truth
                bboxes of each batch.
            gt_labels_3d (list[torch.Tensor]): Labels of each batch.
            pts_semantic_mask (list[torch.Tensor]): Point-wise semantic
                label of each batch.
            pts_instance_mask (list[torch.Tensor]): Point-wise instance
                label of each batch.
            bbox_preds (torch.Tensor): Bounding box predictions of vote head.
            max_gt_num (int): Max number of GTs for single batch.
@@ -657,12 +657,12 @@ class GroupFree3DHead(BaseModule):
        Args:
            points (torch.Tensor): Points of each batch.
            gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): Ground truth
                boxes of each batch.
            gt_labels_3d (torch.Tensor): Labels of each batch.
            pts_semantic_mask (torch.Tensor): Point-wise semantic
                label of each batch.
            pts_instance_mask (torch.Tensor): Point-wise instance
                label of each batch.
            max_gt_nums (int): Max number of GTs for single batch.
            seed_points (torch.Tensor): Coordinates of seed points.
@@ -710,7 +710,7 @@ class GroupFree3DHead(BaseModule):
        if self.bbox_coder.with_rot:
            vote_targets = points.new_zeros([num_points, 4 * self.gt_per_seed])
            vote_target_idx = points.new_zeros([num_points], dtype=torch.long)
            box_indices_all = gt_bboxes_3d.points_in_boxes_part(points)
            for i in range(gt_labels_3d.shape[0]):
                box_indices = box_indices_all[:, i]
                indices = torch.nonzero(
@@ -951,7 +951,7 @@ class GroupFree3DHead(BaseModule):
            box_dim=bbox.shape[-1],
            with_yaw=self.bbox_coder.with_rot,
            origin=(0.5, 0.5, 0.5))
        box_indices = bbox.points_in_boxes_all(points)
        corner3d = bbox.corners
        minmax_box3d = corner3d.new(torch.Size((corner3d.shape[0], 6)))
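The renames in these two hunks distinguish two query modes: `points_in_boxes_part` (formerly `points_in_boxes`) returns one box index per point, while `points_in_boxes_all` (formerly `points_in_boxes_batch`) returns a per-point, per-box membership matrix. A plain-Python sketch with axis-aligned boxes, assuming -1 marks points inside no box (the real CUDA ops also handle rotated boxes):

```python
def points_in_boxes_all(points, boxes):
    # points: (x, y, z) tuples; boxes: ((x_lo, x_hi), (y_lo, y_hi), (z_lo, z_hi))
    # returns a boolean membership matrix [num_points][num_boxes]
    return [[all(lo <= p[k] <= hi for k, (lo, hi) in enumerate(box))
             for box in boxes] for p in points]

def points_in_boxes_part(points, boxes):
    # returns the index of one containing box per point, -1 if none
    idx = []
    for row in points_in_boxes_all(points, boxes):
        idx.append(row.index(True) if True in row else -1)
    return idx
```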
......
@@ -100,20 +100,20 @@ class PartA2RPNHead(Anchor3DHead):
            bbox_preds (list[torch.Tensor]): Multi-level bbox predictions.
            dir_cls_preds (list[torch.Tensor]): Multi-level direction
                class predictions.
            gt_bboxes (list[:obj:`BaseInstance3DBoxes`]): Ground truth boxes
                of each sample.
            gt_labels (list[torch.Tensor]): Labels of each sample.
            input_metas (list[dict]): Point cloud and image's meta info.
            gt_bboxes_ignore (list[torch.Tensor]): Specify
                which bounding boxes can be ignored.

        Returns:
            dict[str, list[torch.Tensor]]: Classification, bbox, and
                direction losses of each level.

                - loss_rpn_cls (list[torch.Tensor]): Classification losses.
                - loss_rpn_bbox (list[torch.Tensor]): Box regression losses.
                - loss_rpn_dir (list[torch.Tensor]): Direction classification
                    losses.
        """
        loss_dict = super().loss(cls_scores, bbox_preds, dir_cls_preds,
@@ -143,7 +143,7 @@ class PartA2RPNHead(Anchor3DHead):
            mlvl_anchors (List[torch.Tensor]): Multi-level anchors
                in single batch.
            input_meta (list[dict]): Contain pcd and img's meta info.
            cfg (:obj:`ConfigDict`): Training or testing config.
            rescale (list[torch.Tensor]): Whether to rescale bbox.

        Returns:
@@ -240,7 +240,7 @@ class PartA2RPNHead(Anchor3DHead):
                Multi-level bbox.
            score_thr (int): Score threshold.
            max_num (int): Max number of bboxes after nms.
            cfg (:obj:`ConfigDict`): Training or testing config.
            input_meta (dict): Contain pcd and img's meta info.

        Returns:
......
@@ -30,15 +30,17 @@ class BaseShapeHead(BaseModule):
        num_base_anchors (int): Number of anchors per location.
        box_code_size (int): The dimension of boxes to be encoded.
        in_channels (int): Input channels for convolutional layers.
        shared_conv_channels (tuple, optional): Channels for shared
            convolutional layers. Default: (64, 64).
        shared_conv_strides (tuple, optional): Strides for shared
            convolutional layers. Default: (1, 1).
        use_direction_classifier (bool, optional): Whether to use direction
            classifier. Default: True.
        conv_cfg (dict, optional): Config of conv layer.
            Default: dict(type='Conv2d')
        norm_cfg (dict, optional): Config of norm layer.
            Default: dict(type='BN2d').
        bias (bool | str, optional): Type of bias. Default: False.
    """

    def __init__(self,
@@ -127,11 +129,11 @@ class BaseShapeHead(BaseModule):
                [B, C, H, W].

        Returns:
            dict[torch.Tensor]: Contain score of each class, bbox
                regression and direction classification predictions.
                Note that all the returned tensors are reshaped as
                [bs*num_base_anchors*H*W, num_cls/box_code_size/dir_bins].
                It is more convenient to concat anchors for different
                classes even though they have different feature map sizes.
        """
        x = self.shared_conv(x)
@@ -168,9 +170,9 @@ class ShapeAwareHead(Anchor3DHead):
    Args:
        tasks (dict): Shape-aware groups of multi-class objects.
        assign_per_class (bool, optional): Whether to do assignment for each
            class. Default: True.
        kwargs (dict): Other arguments are the same as those in
            :class:`Anchor3DHead`.
    """
@@ -217,7 +219,7 @@ class ShapeAwareHead(Anchor3DHead):
        Args:
            x (torch.Tensor): Input features.

        Returns:
            tuple[torch.Tensor]: Contain score of each class, bbox
                regression and direction classification predictions.
        """
        results = []
@@ -263,7 +265,7 @@ class ShapeAwareHead(Anchor3DHead):
            num_total_samples (int): The number of valid samples.

        Returns:
            tuple[torch.Tensor]: Losses of class, bbox
                and direction, respectively.
        """
        # classification loss
...@@ -325,16 +327,16 @@ class ShapeAwareHead(Anchor3DHead): ...@@ -325,16 +327,16 @@ class ShapeAwareHead(Anchor3DHead):
of each sample. of each sample.
gt_labels (list[torch.Tensor]): Gt labels of each sample. gt_labels (list[torch.Tensor]): Gt labels of each sample.
input_metas (list[dict]): Contain pcd and img's meta info. input_metas (list[dict]): Contain pcd and img's meta info.
gt_bboxes_ignore (None | list[torch.Tensor]): Specify gt_bboxes_ignore (list[torch.Tensor]): Specify
which bounding boxes can be ignored. which bounding boxes can be ignored.
Returns: Returns:
dict[str, list[torch.Tensor]]: Classification, bbox, and \ dict[str, list[torch.Tensor]]: Classification, bbox, and
direction losses of each level. direction losses of each level.
- loss_cls (list[torch.Tensor]): Classification losses. - loss_cls (list[torch.Tensor]): Classification losses.
- loss_bbox (list[torch.Tensor]): Box regression losses. - loss_bbox (list[torch.Tensor]): Box regression losses.
- loss_dir (list[torch.Tensor]): Direction classification \ - loss_dir (list[torch.Tensor]): Direction classification
losses. losses.
""" """
device = cls_scores[0].device device = cls_scores[0].device
...@@ -388,7 +390,7 @@ class ShapeAwareHead(Anchor3DHead): ...@@ -388,7 +390,7 @@ class ShapeAwareHead(Anchor3DHead):
dir_cls_preds (list[torch.Tensor]): Multi-level direction dir_cls_preds (list[torch.Tensor]): Multi-level direction
class predictions. class predictions.
input_metas (list[dict]): Contain pcd and img's meta info. input_metas (list[dict]): Contain pcd and img's meta info.
cfg (None | :obj:`ConfigDict`): Training or testing config. cfg (:obj:`ConfigDict`, optional): Training or testing config.
Default: None. Default: None.
rescale (bool, optional): Whether to rescale bbox. rescale (bool, optional): Whether to rescale bbox.
Default: False. Default: False.
...@@ -443,8 +445,8 @@ class ShapeAwareHead(Anchor3DHead): ...@@ -443,8 +445,8 @@ class ShapeAwareHead(Anchor3DHead):
mlvl_anchors (List[torch.Tensor]): Multi-level anchors mlvl_anchors (List[torch.Tensor]): Multi-level anchors
in single batch. in single batch.
input_meta (list[dict]): Contain pcd and img's meta info. input_meta (list[dict]): Contain pcd and img's meta info.
cfg (None | :obj:`ConfigDict`): Training or testing config. cfg (:obj:`ConfigDict`): Training or testing config.
rescale (bool, optional): Whether to rescale bbox. \ rescale (bool, optional): Whether to rescale bbox.
Default: False. Default: False.
Returns: Returns:
......
...@@ -128,15 +128,15 @@ class SSD3DHead(VoteHead): ...@@ -128,15 +128,15 @@ class SSD3DHead(VoteHead):
Args: Args:
bbox_preds (dict): Predictions from forward of SSD3DHead. bbox_preds (dict): Predictions from forward of SSD3DHead.
points (list[torch.Tensor]): Input points. points (list[torch.Tensor]): Input points.
gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Ground truth \ gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Ground truth
bboxes of each sample. bboxes of each sample.
gt_labels_3d (list[torch.Tensor]): Labels of each sample. gt_labels_3d (list[torch.Tensor]): Labels of each sample.
pts_semantic_mask (None | list[torch.Tensor]): Point-wise pts_semantic_mask (list[torch.Tensor]): Point-wise
semantic mask. semantic mask.
pts_instance_mask (None | list[torch.Tensor]): Point-wise pts_instance_mask (list[torch.Tensor]): Point-wise
instance mask. instance mask.
img_metas (list[dict]): Contain pcd and img's meta info. img_metas (list[dict]): Contain pcd and img's meta info.
gt_bboxes_ignore (None | list[torch.Tensor]): Specify gt_bboxes_ignore (list[torch.Tensor]): Specify
which bounding boxes can be ignored. which bounding boxes can be ignored.
Returns: Returns:
...@@ -231,12 +231,12 @@ class SSD3DHead(VoteHead): ...@@ -231,12 +231,12 @@ class SSD3DHead(VoteHead):
Args: Args:
points (list[torch.Tensor]): Points of each batch. points (list[torch.Tensor]): Points of each batch.
gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Ground truth \ gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Ground truth
bboxes of each batch. bboxes of each batch.
gt_labels_3d (list[torch.Tensor]): Labels of each batch. gt_labels_3d (list[torch.Tensor]): Labels of each batch.
pts_semantic_mask (None | list[torch.Tensor]): Point-wise semantic pts_semantic_mask (list[torch.Tensor]): Point-wise semantic
label of each batch. label of each batch.
pts_instance_mask (None | list[torch.Tensor]): Point-wise instance pts_instance_mask (list[torch.Tensor]): Point-wise instance
label of each batch. label of each batch.
bbox_preds (torch.Tensor): Bounding box predictions of ssd3d head. bbox_preds (torch.Tensor): Bounding box predictions of ssd3d head.
...@@ -320,12 +320,12 @@ class SSD3DHead(VoteHead): ...@@ -320,12 +320,12 @@ class SSD3DHead(VoteHead):
Args: Args:
points (torch.Tensor): Points of each batch. points (torch.Tensor): Points of each batch.
gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): Ground truth \ gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): Ground truth
boxes of each batch. boxes of each batch.
gt_labels_3d (torch.Tensor): Labels of each batch. gt_labels_3d (torch.Tensor): Labels of each batch.
pts_semantic_mask (None | torch.Tensor): Point-wise semantic pts_semantic_mask (torch.Tensor): Point-wise semantic
label of each batch. label of each batch.
pts_instance_mask (None | torch.Tensor): Point-wise instance pts_instance_mask (torch.Tensor): Point-wise instance
label of each batch. label of each batch.
aggregated_points (torch.Tensor): Aggregated points from aggregated_points (torch.Tensor): Aggregated points from
candidate points layer. candidate points layer.
...@@ -494,7 +494,7 @@ class SSD3DHead(VoteHead): ...@@ -494,7 +494,7 @@ class SSD3DHead(VoteHead):
origin=(0.5, 0.5, 0.5)) origin=(0.5, 0.5, 0.5))
if isinstance(bbox, (LiDARInstance3DBoxes, DepthInstance3DBoxes)): if isinstance(bbox, (LiDARInstance3DBoxes, DepthInstance3DBoxes)):
box_indices = bbox.points_in_boxes_batch(points) box_indices = bbox.points_in_boxes_all(points)
nonempty_box_mask = box_indices.T.sum(1) >= 0 nonempty_box_mask = box_indices.T.sum(1) >= 0
else: else:
raise NotImplementedError('Unsupported bbox type!') raise NotImplementedError('Unsupported bbox type!')
...@@ -550,7 +550,7 @@ class SSD3DHead(VoteHead): ...@@ -550,7 +550,7 @@ class SSD3DHead(VoteHead):
inside the bbox and the index of the box each point is in. inside the bbox and the index of the box each point is in.
""" """
if isinstance(bboxes_3d, (LiDARInstance3DBoxes, DepthInstance3DBoxes)): if isinstance(bboxes_3d, (LiDARInstance3DBoxes, DepthInstance3DBoxes)):
points_mask = bboxes_3d.points_in_boxes_batch(points) points_mask = bboxes_3d.points_in_boxes_all(points)
assignment = points_mask.argmax(dim=-1) assignment = points_mask.argmax(dim=-1)
else: else:
raise NotImplementedError('Unsupported bbox type!') raise NotImplementedError('Unsupported bbox type!')
......
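The hunks above rename `points_in_boxes_batch` to `points_in_boxes_all` and then take `argmax` over the per-box mask to assign each point to a box. A minimal pure-Python sketch of that pattern for axis-aligned boxes (names are illustrative, not the mmdet3d API; real boxes also carry a yaw rotation):

```python
def points_in_boxes_all(points, boxes):
    """Return mask[i][j] = 1 if points[i] lies inside boxes[j].

    points: list of (x, y, z); boxes: list of (cx, cy, cz, dx, dy, dz).
    """
    mask = []
    for (px, py, pz) in points:
        row = []
        for (cx, cy, cz, dx, dy, dz) in boxes:
            inside = (abs(px - cx) <= dx / 2
                      and abs(py - cy) <= dy / 2
                      and abs(pz - cz) <= dz / 2)
            row.append(1 if inside else 0)
        mask.append(row)
    return mask


def assign_points(points, boxes):
    """Assign each point to a containing box via argmax over the mask."""
    mask = points_in_boxes_all(points, boxes)
    return [row.index(max(row)) for row in mask]
```

Note that, as in the real code's `argmax(dim=-1)`, a point inside no box gets index 0 (its mask row is all zeros), so callers must filter by the mask when that matters.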
...@@ -25,7 +25,7 @@ class AnchorTrainMixin(object): ...@@ -25,7 +25,7 @@ class AnchorTrainMixin(object):
gt_bboxes_list (list[:obj:`BaseInstance3DBoxes`]): Ground truth gt_bboxes_list (list[:obj:`BaseInstance3DBoxes`]): Ground truth
bboxes of each image. bboxes of each image.
input_metas (list[dict]): Meta info of each image. input_metas (list[dict]): Meta info of each image.
gt_bboxes_ignore_list (None | list): Ignore list of gt bboxes. gt_bboxes_ignore_list (list): Ignore list of gt bboxes.
gt_labels_list (list[torch.Tensor]): Gt labels of batches. gt_labels_list (list[torch.Tensor]): Gt labels of batches.
label_channels (int): The channel of labels. label_channels (int): The channel of labels.
num_classes (int): The number of classes. num_classes (int): The number of classes.
......
...@@ -234,15 +234,15 @@ class VoteHead(BaseModule): ...@@ -234,15 +234,15 @@ class VoteHead(BaseModule):
Args: Args:
bbox_preds (dict): Predictions from forward of vote head. bbox_preds (dict): Predictions from forward of vote head.
points (list[torch.Tensor]): Input points. points (list[torch.Tensor]): Input points.
gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Ground truth \ gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Ground truth
bboxes of each sample. bboxes of each sample.
gt_labels_3d (list[torch.Tensor]): Labels of each sample. gt_labels_3d (list[torch.Tensor]): Labels of each sample.
pts_semantic_mask (None | list[torch.Tensor]): Point-wise pts_semantic_mask (list[torch.Tensor]): Point-wise
semantic mask. semantic mask.
pts_instance_mask (None | list[torch.Tensor]): Point-wise pts_instance_mask (list[torch.Tensor]): Point-wise
instance mask. instance mask.
img_metas (list[dict]): Contain pcd and img's meta info. img_metas (list[dict]): Contain pcd and img's meta info.
gt_bboxes_ignore (None | list[torch.Tensor]): Specify gt_bboxes_ignore (list[torch.Tensor]): Specify
which bounding boxes can be ignored. which bounding boxes can be ignored.
ret_target (bool): Whether to return targets. ret_target (bool): Whether to return targets.
...@@ -358,12 +358,12 @@ class VoteHead(BaseModule): ...@@ -358,12 +358,12 @@ class VoteHead(BaseModule):
Args: Args:
points (list[torch.Tensor]): Points of each batch. points (list[torch.Tensor]): Points of each batch.
gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Ground truth \ gt_bboxes_3d (list[:obj:`BaseInstance3DBoxes`]): Ground truth
bboxes of each batch. bboxes of each batch.
gt_labels_3d (list[torch.Tensor]): Labels of each batch. gt_labels_3d (list[torch.Tensor]): Labels of each batch.
pts_semantic_mask (None | list[torch.Tensor]): Point-wise semantic pts_semantic_mask (list[torch.Tensor]): Point-wise semantic
label of each batch. label of each batch.
pts_instance_mask (None | list[torch.Tensor]): Point-wise instance pts_instance_mask (list[torch.Tensor]): Point-wise instance
label of each batch. label of each batch.
bbox_preds (torch.Tensor): Bounding box predictions of vote head. bbox_preds (torch.Tensor): Bounding box predictions of vote head.
...@@ -447,12 +447,12 @@ class VoteHead(BaseModule): ...@@ -447,12 +447,12 @@ class VoteHead(BaseModule):
Args: Args:
points (torch.Tensor): Points of each batch. points (torch.Tensor): Points of each batch.
gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): Ground truth \ gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): Ground truth
boxes of each batch. boxes of each batch.
gt_labels_3d (torch.Tensor): Labels of each batch. gt_labels_3d (torch.Tensor): Labels of each batch.
pts_semantic_mask (None | torch.Tensor): Point-wise semantic pts_semantic_mask (torch.Tensor): Point-wise semantic
label of each batch. label of each batch.
pts_instance_mask (None | torch.Tensor): Point-wise instance pts_instance_mask (torch.Tensor): Point-wise instance
label of each batch. label of each batch.
aggregated_points (torch.Tensor): Aggregated points from aggregated_points (torch.Tensor): Aggregated points from
vote aggregation layer. vote aggregation layer.
...@@ -471,7 +471,7 @@ class VoteHead(BaseModule): ...@@ -471,7 +471,7 @@ class VoteHead(BaseModule):
vote_target_masks = points.new_zeros([num_points], vote_target_masks = points.new_zeros([num_points],
dtype=torch.long) dtype=torch.long)
vote_target_idx = points.new_zeros([num_points], dtype=torch.long) vote_target_idx = points.new_zeros([num_points], dtype=torch.long)
box_indices_all = gt_bboxes_3d.points_in_boxes_batch(points) box_indices_all = gt_bboxes_3d.points_in_boxes_all(points)
for i in range(gt_labels_3d.shape[0]): for i in range(gt_labels_3d.shape[0]):
box_indices = box_indices_all[:, i] box_indices = box_indices_all[:, i]
indices = torch.nonzero( indices = torch.nonzero(
...@@ -621,7 +621,7 @@ class VoteHead(BaseModule): ...@@ -621,7 +621,7 @@ class VoteHead(BaseModule):
box_dim=bbox.shape[-1], box_dim=bbox.shape[-1],
with_yaw=self.bbox_coder.with_rot, with_yaw=self.bbox_coder.with_rot,
origin=(0.5, 0.5, 0.5)) origin=(0.5, 0.5, 0.5))
box_indices = bbox.points_in_boxes_batch(points) box_indices = bbox.points_in_boxes_all(points)
corner3d = bbox.corners corner3d = bbox.corners
minmax_box3d = corner3d.new(torch.Size((corner3d.shape[0], 6))) minmax_box3d = corner3d.new(torch.Size((corner3d.shape[0], 6)))
......
...@@ -97,7 +97,8 @@ class CenterPoint(MVXTwoStageDetector): ...@@ -97,7 +97,8 @@ class CenterPoint(MVXTwoStageDetector):
Args: Args:
feats (list[torch.Tensor]): Feature of point cloud. feats (list[torch.Tensor]): Feature of point cloud.
img_metas (list[dict]): Meta information of samples. img_metas (list[dict]): Meta information of samples.
rescale (bool): Whether to rescale bboxes. Default: False. rescale (bool, optional): Whether to rescale bboxes.
Default: False.
Returns: Returns:
dict: Returned bboxes consists of the following keys: dict: Returned bboxes consists of the following keys:
......
...@@ -38,11 +38,11 @@ class GroupFree3DNet(SingleStage3DDetector): ...@@ -38,11 +38,11 @@ class GroupFree3DNet(SingleStage3DDetector):
img_metas (list): Image metas. img_metas (list): Image metas.
gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): gt bboxes of each batch. gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): gt bboxes of each batch.
gt_labels_3d (list[torch.Tensor]): gt class labels of each batch. gt_labels_3d (list[torch.Tensor]): gt class labels of each batch.
pts_semantic_mask (None | list[torch.Tensor]): point-wise semantic pts_semantic_mask (list[torch.Tensor]): point-wise semantic
label of each batch. label of each batch.
pts_instance_mask (None | list[torch.Tensor]): point-wise instance pts_instance_mask (list[torch.Tensor]): point-wise instance
label of each batch. label of each batch.
gt_bboxes_ignore (None | list[torch.Tensor]): Specify gt_bboxes_ignore (list[torch.Tensor]): Specify
which bounding boxes can be ignored. which bounding boxes can be ignored.
Returns: Returns:
......
...@@ -47,11 +47,11 @@ class H3DNet(TwoStage3DDetector): ...@@ -47,11 +47,11 @@ class H3DNet(TwoStage3DDetector):
img_metas (list): Image metas. img_metas (list): Image metas.
gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): gt bboxes of each batch. gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): gt bboxes of each batch.
gt_labels_3d (list[torch.Tensor]): gt class labels of each batch. gt_labels_3d (list[torch.Tensor]): gt class labels of each batch.
pts_semantic_mask (None | list[torch.Tensor]): point-wise semantic pts_semantic_mask (list[torch.Tensor]): point-wise semantic
label of each batch. label of each batch.
pts_instance_mask (None | list[torch.Tensor]): point-wise instance pts_instance_mask (list[torch.Tensor]): point-wise instance
label of each batch. label of each batch.
gt_bboxes_ignore (None | list[torch.Tensor]): Specify gt_bboxes_ignore (list[torch.Tensor]): Specify
which bounding boxes can be ignored. which bounding boxes can be ignored.
Returns: Returns:
......
...@@ -149,21 +149,21 @@ class ImVoteNet(Base3DDetector): ...@@ -149,21 +149,21 @@ class ImVoteNet(Base3DDetector):
if self.with_img_backbone: if self.with_img_backbone:
if img_pretrained is not None: if img_pretrained is not None:
warnings.warn('DeprecationWarning: pretrained is a deprecated \ warnings.warn('DeprecationWarning: pretrained is a deprecated '
key, please consider using init_cfg') 'key, please consider using init_cfg.')
self.img_backbone.init_cfg = dict( self.img_backbone.init_cfg = dict(
type='Pretrained', checkpoint=img_pretrained) type='Pretrained', checkpoint=img_pretrained)
if self.with_img_roi_head: if self.with_img_roi_head:
if img_pretrained is not None: if img_pretrained is not None:
warnings.warn('DeprecationWarning: pretrained is a deprecated \ warnings.warn('DeprecationWarning: pretrained is a deprecated '
key, please consider using init_cfg') 'key, please consider using init_cfg.')
self.img_roi_head.init_cfg = dict( self.img_roi_head.init_cfg = dict(
type='Pretrained', checkpoint=img_pretrained) type='Pretrained', checkpoint=img_pretrained)
if self.with_pts_backbone: if self.with_pts_backbone:
if img_pretrained is not None: if img_pretrained is not None:
warnings.warn('DeprecationWarning: pretrained is a deprecated \ warnings.warn('DeprecationWarning: pretrained is a deprecated '
key, please consider using init_cfg') 'key, please consider using init_cfg.')
self.pts_backbone.init_cfg = dict( self.pts_backbone.init_cfg = dict(
type='Pretrained', checkpoint=pts_pretrained) type='Pretrained', checkpoint=pts_pretrained)
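The warning-string fix above is worth spelling out: a backslash continuation inside a string literal keeps the next line's indentation as part of the message, while adjacent string literals concatenate cleanly. A small sketch of the difference (message text mirrors the diff; variable names are illustrative):

```python
# Backslash continuation inside the literal: the escaped newline is
# dropped, but the following line's leading spaces leak into the string.
broken = 'pretrained is a deprecated \
            key, please consider using init_cfg'

# Implicit concatenation of adjacent literals: no stray whitespace.
fixed = ('pretrained is a deprecated '
         'key, please consider using init_cfg.')
```

This is why the commit rewrites each `warnings.warn('... \` call as two adjacent literals.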
...@@ -393,9 +393,9 @@ class ImVoteNet(Base3DDetector): ...@@ -393,9 +393,9 @@ class ImVoteNet(Base3DDetector):
with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format. with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
gt_labels (list[torch.Tensor]): class indices for each gt_labels (list[torch.Tensor]): class indices for each
2d bounding box. 2d bounding box.
gt_bboxes_ignore (None | list[torch.Tensor]): specify which gt_bboxes_ignore (list[torch.Tensor]): specify which
2d bounding boxes can be ignored when computing the loss. 2d bounding boxes can be ignored when computing the loss.
gt_masks (None | torch.Tensor): true segmentation masks for each gt_masks (torch.Tensor): true segmentation masks for each
2d bbox, used if the architecture supports a segmentation task. 2d bbox, used if the architecture supports a segmentation task.
proposals: override rpn proposals (2d) with custom proposals. proposals: override rpn proposals (2d) with custom proposals.
Use when `with_rpn` is False. Use when `with_rpn` is False.
...@@ -403,9 +403,9 @@ class ImVoteNet(Base3DDetector): ...@@ -403,9 +403,9 @@ class ImVoteNet(Base3DDetector):
not supported yet. not supported yet.
gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): 3d gt bboxes. gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): 3d gt bboxes.
gt_labels_3d (list[torch.Tensor]): gt class labels for 3d bboxes. gt_labels_3d (list[torch.Tensor]): gt class labels for 3d bboxes.
pts_semantic_mask (None | list[torch.Tensor]): point-wise semantic pts_semantic_mask (list[torch.Tensor]): point-wise semantic
label of each batch. label of each batch.
pts_instance_mask (None | list[torch.Tensor]): point-wise instance pts_instance_mask (list[torch.Tensor]): point-wise instance
label of each batch. label of each batch.
Returns: Returns:
......
...@@ -84,21 +84,21 @@ class MVXTwoStageDetector(Base3DDetector): ...@@ -84,21 +84,21 @@ class MVXTwoStageDetector(Base3DDetector):
if self.with_img_backbone: if self.with_img_backbone:
if img_pretrained is not None: if img_pretrained is not None:
warnings.warn('DeprecationWarning: pretrained is a deprecated \ warnings.warn('DeprecationWarning: pretrained is a deprecated '
key, please consider using init_cfg') 'key, please consider using init_cfg.')
self.img_backbone.init_cfg = dict( self.img_backbone.init_cfg = dict(
type='Pretrained', checkpoint=img_pretrained) type='Pretrained', checkpoint=img_pretrained)
if self.with_img_roi_head: if self.with_img_roi_head:
if img_pretrained is not None: if img_pretrained is not None:
warnings.warn('DeprecationWarning: pretrained is a deprecated \ warnings.warn('DeprecationWarning: pretrained is a deprecated '
key, please consider using init_cfg') 'key, please consider using init_cfg.')
self.img_roi_head.init_cfg = dict( self.img_roi_head.init_cfg = dict(
type='Pretrained', checkpoint=img_pretrained) type='Pretrained', checkpoint=img_pretrained)
if self.with_pts_backbone: if self.with_pts_backbone:
if pts_pretrained is not None: if pts_pretrained is not None:
warnings.warn('DeprecationWarning: pretrained is a deprecated \ warnings.warn('DeprecationWarning: pretrained is a deprecated '
key, please consider using init_cfg') 'key, please consider using init_cfg.')
self.pts_backbone.init_cfg = dict( self.pts_backbone.init_cfg = dict(
type='Pretrained', checkpoint=pts_pretrained) type='Pretrained', checkpoint=pts_pretrained)
...@@ -260,7 +260,7 @@ class MVXTwoStageDetector(Base3DDetector): ...@@ -260,7 +260,7 @@ class MVXTwoStageDetector(Base3DDetector):
of 2D boxes in images. Defaults to None. of 2D boxes in images. Defaults to None.
gt_bboxes (list[torch.Tensor], optional): Ground truth 2D boxes in gt_bboxes (list[torch.Tensor], optional): Ground truth 2D boxes in
images. Defaults to None. images. Defaults to None.
img (torch.Tensor optional): Images of each sample with shape img (torch.Tensor, optional): Images of each sample with shape
(N, C, H, W). Defaults to None. (N, C, H, W). Defaults to None.
proposals (list[torch.Tensor], optional): Predicted proposals proposals (list[torch.Tensor], optional): Predicted proposals
used for training Fast RCNN. Defaults to None. used for training Fast RCNN. Defaults to None.
......
...@@ -48,14 +48,15 @@ class SingleStageMono3DDetector(SingleStageDetector): ...@@ -48,14 +48,15 @@ class SingleStageMono3DDetector(SingleStageDetector):
image in [tl_x, tl_y, br_x, br_y] format. image in [tl_x, tl_y, br_x, br_y] format.
gt_labels (list[Tensor]): Class indices corresponding to each box gt_labels (list[Tensor]): Class indices corresponding to each box
gt_bboxes_3d (list[Tensor]): Each item are the 3D truth boxes for gt_bboxes_3d (list[Tensor]): Each item are the 3D truth boxes for
each image in [x, y, z, w, l, h, theta, vx, vy] format. each image in [x, y, z, x_size, y_size, z_size, yaw, vx, vy]
format.
gt_labels_3d (list[Tensor]): 3D class indices corresponding to gt_labels_3d (list[Tensor]): 3D class indices corresponding to
each box. each box.
centers2d (list[Tensor]): Projected 3D centers onto 2D images. centers2d (list[Tensor]): Projected 3D centers onto 2D images.
depths (list[Tensor]): Depth of projected centers on 2D images. depths (list[Tensor]): Depth of projected centers on 2D images.
attr_labels (list[Tensor], optional): Attribute indices attr_labels (list[Tensor], optional): Attribute indices
corresponding to each box corresponding to each box
gt_bboxes_ignore (None | list[Tensor]): Specify which bounding gt_bboxes_ignore (list[Tensor]): Specify which bounding
boxes can be ignored when computing the loss. boxes can be ignored when computing the loss.
Returns: Returns:
......
...@@ -40,11 +40,11 @@ class VoteNet(SingleStage3DDetector): ...@@ -40,11 +40,11 @@ class VoteNet(SingleStage3DDetector):
img_metas (list): Image metas. img_metas (list): Image metas.
gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): gt bboxes of each batch. gt_bboxes_3d (:obj:`BaseInstance3DBoxes`): gt bboxes of each batch.
gt_labels_3d (list[torch.Tensor]): gt class labels of each batch. gt_labels_3d (list[torch.Tensor]): gt class labels of each batch.
pts_semantic_mask (None | list[torch.Tensor]): point-wise semantic pts_semantic_mask (list[torch.Tensor]): point-wise semantic
label of each batch. label of each batch.
pts_instance_mask (None | list[torch.Tensor]): point-wise instance pts_instance_mask (list[torch.Tensor]): point-wise instance
label of each batch. label of each batch.
gt_bboxes_ignore (None | list[torch.Tensor]): Specify gt_bboxes_ignore (list[torch.Tensor]): Specify
which bounding boxes can be ignored. which bounding boxes can be ignored.
Returns: Returns:
......
...@@ -34,7 +34,7 @@ def point_sample(img_meta, ...@@ -34,7 +34,7 @@ def point_sample(img_meta,
coord_type (str): 'DEPTH' or 'CAMERA' or 'LIDAR'. coord_type (str): 'DEPTH' or 'CAMERA' or 'LIDAR'.
img_scale_factor (torch.Tensor): Scale factor with shape of \ img_scale_factor (torch.Tensor): Scale factor with shape of
(w_scale, h_scale). (w_scale, h_scale).
img_crop_offset (torch.Tensor): Crop offset used to crop \ img_crop_offset (torch.Tensor): Crop offset used to crop
image during data augmentation with shape of (w_offset, h_offset). image during data augmentation with shape of (w_offset, h_offset).
img_flip (bool): Whether the image is flipped. img_flip (bool): Whether the image is flipped.
img_pad_shape (tuple[int]): int tuple indicates the h & w after img_pad_shape (tuple[int]): int tuple indicates the h & w after
......
...@@ -54,7 +54,7 @@ class AxisAlignedIoULoss(nn.Module): ...@@ -54,7 +54,7 @@ class AxisAlignedIoULoss(nn.Module):
Args: Args:
pred (torch.Tensor): Bbox predictions with shape [..., 3]. pred (torch.Tensor): Bbox predictions with shape [..., 3].
target (torch.Tensor): Bbox targets (gt) with shape [..., 3]. target (torch.Tensor): Bbox targets (gt) with shape [..., 3].
weight (torch.Tensor|float, optional): Weight of loss. \ weight (torch.Tensor | float, optional): Weight of loss.
Defaults to None. Defaults to None.
avg_factor (int, optional): Average factor that is used to average avg_factor (int, optional): Average factor that is used to average
the loss. Defaults to None. the loss. Defaults to None.
......
...@@ -29,13 +29,13 @@ def chamfer_distance(src, ...@@ -29,13 +29,13 @@ def chamfer_distance(src,
Returns: Returns:
tuple: Source and Destination loss with the corresponding indices. tuple: Source and Destination loss with the corresponding indices.
- loss_src (torch.Tensor): The min distance \ - loss_src (torch.Tensor): The min distance
from source to destination. from source to destination.
- loss_dst (torch.Tensor): The min distance \ - loss_dst (torch.Tensor): The min distance
from destination to source. from destination to source.
- indices1 (torch.Tensor): Index the min distance point \ - indices1 (torch.Tensor): Index the min distance point
for each point in source to destination. for each point in source to destination.
- indices2 (torch.Tensor): Index the min distance point \ - indices2 (torch.Tensor): Index the min distance point
for each point in destination to source. for each point in destination to source.
""" """
...@@ -125,10 +125,10 @@ class ChamferDistance(nn.Module): ...@@ -125,10 +125,10 @@ class ChamferDistance(nn.Module):
Defaults to False. Defaults to False.
Returns: Returns:
tuple[torch.Tensor]: If ``return_indices=True``, return losses of \ tuple[torch.Tensor]: If ``return_indices=True``, return losses of
source and target with their corresponding indices in the \ source and target with their corresponding indices in the
order of ``(loss_source, loss_target, indices1, indices2)``. \ order of ``(loss_source, loss_target, indices1, indices2)``.
If ``return_indices=False``, return \ If ``return_indices=False``, return
``(loss_source, loss_target)``. ``(loss_source, loss_target)``.
""" """
assert reduction_override in (None, 'none', 'mean', 'sum') assert reduction_override in (None, 'none', 'mean', 'sum')
......
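The `chamfer_distance` docstring above documents a four-tuple return: min distances in each direction plus the index of the nearest point. A hedged pure-Python stand-in for the tensor implementation, just to make the contract concrete (squared Euclidean distance assumed; names are illustrative):

```python
def chamfer_distance(src, dst):
    """Return (loss_src, loss_dst, indices1, indices2) as documented."""
    def nearest(a, b):
        dists, idxs = [], []
        for p in a:
            # Index of the closest point in b, by squared distance.
            best_i = min(range(len(b)),
                         key=lambda j: sum((p[k] - b[j][k]) ** 2
                                           for k in range(len(p))))
            idxs.append(best_i)
            dists.append(sum((p[k] - b[best_i][k]) ** 2
                             for k in range(len(p))))
        return dists, idxs

    loss_src, indices1 = nearest(src, dst)  # source -> destination
    loss_dst, indices2 = nearest(dst, src)  # destination -> source
    return loss_src, loss_dst, indices1, indices2
```

With `return_indices=False` in the module form, only the first two elements are returned.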
...@@ -14,19 +14,21 @@ class SparseEncoder(nn.Module): ...@@ -14,19 +14,21 @@ class SparseEncoder(nn.Module):
Args: Args:
in_channels (int): The number of input channels. in_channels (int): The number of input channels.
sparse_shape (list[int]): The sparse shape of input tensor. sparse_shape (list[int]): The sparse shape of input tensor.
order (list[str]): Order of conv module. Defaults to ('conv', order (list[str], optional): Order of conv module.
'norm', 'act'). Defaults to ('conv', 'norm', 'act').
norm_cfg (dict): Config of normalization layer. Defaults to norm_cfg (dict, optional): Config of normalization layer. Defaults to
dict(type='BN1d', eps=1e-3, momentum=0.01). dict(type='BN1d', eps=1e-3, momentum=0.01).
base_channels (int): Out channels for conv_input layer. base_channels (int, optional): Out channels for conv_input layer.
Defaults to 16. Defaults to 16.
output_channels (int): Out channels for conv_out layer. output_channels (int, optional): Out channels for conv_out layer.
Defaults to 128. Defaults to 128.
encoder_channels (tuple[tuple[int]]): encoder_channels (tuple[tuple[int]], optional):
Convolutional channels of each encode block. Convolutional channels of each encode block.
encoder_paddings (tuple[tuple[int]]): Paddings of each encode block. encoder_paddings (tuple[tuple[int]], optional):
Paddings of each encode block.
Defaults to ((16, ), (32, 32, 32), (64, 64, 64), (64, 64, 64)). Defaults to ((16, ), (32, 32, 32), (64, 64, 64), (64, 64, 64)).
block_type (str): Type of the block to use. Defaults to 'conv_module'. block_type (str, optional): Type of the block to use.
Defaults to 'conv_module'.
""" """
def __init__(self, def __init__(self,
...@@ -99,7 +101,7 @@ class SparseEncoder(nn.Module): ...@@ -99,7 +101,7 @@ class SparseEncoder(nn.Module):
Args: Args:
voxel_features (torch.float32): Voxel features in shape (N, C). voxel_features (torch.float32): Voxel features in shape (N, C).
coors (torch.int32): Coordinates in shape (N, 4), \ coors (torch.int32): Coordinates in shape (N, 4),
the columns in the order of (batch_idx, z_idx, y_idx, x_idx). the columns in the order of (batch_idx, z_idx, y_idx, x_idx).
batch_size (int): Batch size. batch_size (int): Batch size.
...@@ -139,9 +141,9 @@ class SparseEncoder(nn.Module): ...@@ -139,9 +141,9 @@ class SparseEncoder(nn.Module):
make_block (method): A bound function to build blocks. make_block (method): A bound function to build blocks.
norm_cfg (dict[str]): Config of normalization layer. norm_cfg (dict[str]): Config of normalization layer.
in_channels (int): The number of encoder input channels. in_channels (int): The number of encoder input channels.
block_type (str): Type of the block to use. Defaults to block_type (str, optional): Type of the block to use.
'conv_module'. Defaults to 'conv_module'.
conv_cfg (dict): Config of conv layer. Defaults to conv_cfg (dict, optional): Config of conv layer. Defaults to
dict(type='SubMConv3d'). dict(type='SubMConv3d').
Returns: Returns:
......
...@@ -15,15 +15,16 @@ class GroupFree3DMHA(MultiheadAttention): ...@@ -15,15 +15,16 @@ class GroupFree3DMHA(MultiheadAttention):
embed_dims (int): The embedding dimension. embed_dims (int): The embedding dimension.
num_heads (int): Parallel attention heads. Same as num_heads (int): Parallel attention heads. Same as
`nn.MultiheadAttention`. `nn.MultiheadAttention`.
attn_drop (float): A Dropout layer on attn_output_weights. Default 0.0. attn_drop (float, optional): A Dropout layer on attn_output_weights.
proj_drop (float): A Dropout layer. Default 0.0. Defaults to 0.0.
dropout_layer (obj:`ConfigDict`): The dropout_layer used proj_drop (float, optional): A Dropout layer. Defaults to 0.0.
dropout_layer (obj:`ConfigDict`, optional): The dropout_layer used
when adding the shortcut. when adding the shortcut.
init_cfg (obj:`mmcv.ConfigDict`): The Config for initialization. init_cfg (obj:`mmcv.ConfigDict`, optional): The Config for
Default: None. initialization. Default: None.
batch_first (bool): Key, Query and Value are shape of batch_first (bool, optional): Key, Query and Value are shape of
(batch, n, embed_dim) (batch, n, embed_dim)
or (n, batch, embed_dim). Default to False. or (n, batch, embed_dim). Defaults to False.
""" """
def __init__(self, def __init__(self,
...@@ -58,26 +59,26 @@ class GroupFree3DMHA(MultiheadAttention): ...@@ -58,26 +59,26 @@ class GroupFree3DMHA(MultiheadAttention):
embed_dims]. Same in `nn.MultiheadAttention.forward`. embed_dims]. Same in `nn.MultiheadAttention.forward`.
key (Tensor): The key tensor with shape [num_keys, bs, key (Tensor): The key tensor with shape [num_keys, bs,
embed_dims]. Same in `nn.MultiheadAttention.forward`. embed_dims]. Same in `nn.MultiheadAttention.forward`.
If None, the ``query`` will be used. Defaults to None. If None, the ``query`` will be used.
value (Tensor): The value tensor with same shape as `key`. value (Tensor): The value tensor with same shape as `key`.
Same in `nn.MultiheadAttention.forward`. Defaults to None. Same in `nn.MultiheadAttention.forward`.
If None, the `key` will be used. If None, the `key` will be used.
identity (Tensor): This tensor, with the same shape as x, identity (Tensor): This tensor, with the same shape as x,
will be used for the identity link. will be used for the identity link. If None, `x` will be used.
If None, `x` will be used. Defaults to None. query_pos (Tensor, optional): The positional encoding for query,
query_pos (Tensor): The positional encoding for query, with with the same shape as `x`. Defaults to None.
the same shape as `x`. If not None, it will If not None, it will be added to `x` before forward function.
be added to `x` before forward function. Defaults to None. key_pos (Tensor, optional): The positional encoding for `key`,
key_pos (Tensor): The positional encoding for `key`, with the with the same shape as `key`. Defaults to None. If not None,
same shape as `key`. Defaults to None. If not None, it will it will be added to `key` before forward function. If None,
be added to `key` before forward function. If None, and and `query_pos` has the same shape as `key`, then `query_pos`
`query_pos` has the same shape as `key`, then `query_pos`
will be used for `key_pos`. Defaults to None. will be used for `key_pos`. Defaults to None.
attn_mask (Tensor): ByteTensor mask with shape [num_queries, attn_mask (Tensor, optional): ByteTensor mask with shape
num_keys]. Same in `nn.MultiheadAttention.forward`. [num_queries, num_keys].
Defaults to None.
key_padding_mask (Tensor): ByteTensor with shape [bs, num_keys].
Same in `nn.MultiheadAttention.forward`. Defaults to None. Same in `nn.MultiheadAttention.forward`. Defaults to None.
key_padding_mask (Tensor, optional): ByteTensor with shape
[bs, num_keys]. Same in `nn.MultiheadAttention.forward`.
Defaults to None.
Returns: Returns:
Tensor: forwarded results with shape [num_queries, bs, embed_dims]. Tensor: forwarded results with shape [num_queries, bs, embed_dims].
...@@ -113,7 +114,7 @@ class ConvBNPositionalEncoding(nn.Module): ...@@ -113,7 +114,7 @@ class ConvBNPositionalEncoding(nn.Module):
Args: Args:
input_channel (int): input features dim. input_channel (int): input features dim.
num_pos_feats (int): output position features dim. num_pos_feats (int, optional): output position features dim.
Defaults to 288 to be consistent with seed features dim. Defaults to 288 to be consistent with seed features dim.
""" """
......
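The reflowed `GroupFree3DMHA.forward` docstring above describes a fallback: when `key_pos` is None and `query_pos` has the same shape as `key`, `query_pos` is reused for `key_pos`, and the positional encodings are added before attention. A minimal sketch of that logic with lists standing in for tensors (function name and shapes are illustrative, not the mmdet3d API):

```python
def resolve_pos(query, key, query_pos=None, key_pos=None):
    """Return (query + query_pos, key + key_pos) with the documented fallback."""
    # Fallback: reuse query_pos for key_pos when shapes match.
    if key_pos is None and query_pos is not None and len(query_pos) == len(key):
        key_pos = query_pos
    q = ([x + p for x, p in zip(query, query_pos)]
         if query_pos is not None else query)
    k = ([x + p for x, p in zip(key, key_pos)]
         if key_pos is not None else key)
    return q, k
```

The real module then hands `q` and `k` to `nn.MultiheadAttention`; this sketch only covers the encoding bookkeeping the docstring specifies.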