# Customize Models

We basically categorize model components into 6 types:

- encoder: including voxel layer, voxel encoder and middle encoder used in voxel-based methods before the backbone, e.g., `HardVFE` and `PointPillarsScatter`.
- backbone: usually an FCN network to extract feature maps, e.g., `ResNet` and `SECOND`.
- neck: the component between backbones and heads, e.g., `FPN` and `SECONDFPN`.
- head: the component for specific tasks, e.g., bbox prediction and mask prediction.
- RoI extractor: the part for extracting RoI features from feature maps, e.g., `H3DRoIHead` and `PartAggregationROIHead`.
- loss: the component in heads for calculating losses, e.g., `FocalLoss`, `L1Loss` and `GHMLoss`.

## Develop new components

### Add a new encoder

Here we show how to develop a new component, using HardVFE as an example.

#### 1. Define a new voxel encoder (e.g. HardVFE: the voxel feature encoder proposed in DV-SECOND)

Create a new file `mmdet3d/models/voxel_encoders/voxel_encoder.py`:

```python
import torch.nn as nn

from mmdet3d.registry import MODELS


@MODELS.register_module()
class HardVFE(nn.Module):

    def __init__(self, arg1, arg2):
        # nn.Module must be initialized before submodules are attached
        super().__init__()

    def forward(self, x):  # should return a tuple
        pass
```

#### 2. Import the module

You can either add the following line to `mmdet3d/models/voxel_encoders/__init__.py`:

```python
from .voxel_encoder import HardVFE
```

or add the following to the config file, which imports the new module without modifying the source code. Note that `imports` takes module paths, not class names:

```python
custom_imports = dict(
    imports=['mmdet3d.models.voxel_encoders.voxel_encoder'],
    allow_failed_imports=False)
```

#### 3. Use the voxel encoder in your config file

```python
model = dict(
    ...
    voxel_encoder=dict(
        type='HardVFE',
        arg1=xxx,
        arg2=xxx),
    ...
)
```

### Add a new backbone

Here we show how to develop a new component, using [SECOND](https://www.mdpi.com/1424-8220/18/10/3337) (Sparsely Embedded Convolutional Detection) as an example.

#### 1. Define a new backbone (e.g. SECOND)

Create a new file `mmdet3d/models/backbones/second.py`:

```python
from mmengine.model import BaseModule

from mmdet3d.registry import MODELS


@MODELS.register_module()
class SECOND(BaseModule):

    def __init__(self, arg1, arg2):
        pass

    def forward(self, x):  # should return a tuple
        pass
```

#### 2. Import the module

You can either add the following line to `mmdet3d/models/backbones/__init__.py`:

```python
from .second import SECOND
```

or add the following to the config file, which imports the new module without modifying the source code:

```python
custom_imports = dict(
    imports=['mmdet3d.models.backbones.second'],
    allow_failed_imports=False)
```

#### 3. Use the backbone in your config file

```python
model = dict(
    ...
    backbone=dict(
        type='SECOND',
        arg1=xxx,
        arg2=xxx),
    ...
)
```

### Add a new neck

#### 1. Define a new neck (e.g. SECONDFPN)

Create a new file `mmdet3d/models/necks/second_fpn.py`:

```python
from mmengine.model import BaseModule

from mmdet3d.registry import MODELS


@MODELS.register_module()
class SECONDFPN(BaseModule):

    def __init__(self,
                 in_channels=[128, 128, 256],
                 out_channels=[256, 256, 256],
                 upsample_strides=[1, 2, 4],
                 norm_cfg=dict(type='BN', eps=1e-3, momentum=0.01),
                 upsample_cfg=dict(type='deconv', bias=False),
                 conv_cfg=dict(type='Conv2d', bias=False),
                 use_conv_for_no_stride=False,
                 init_cfg=None):
        pass

    def forward(self, X):
        # implementation is ignored
        pass
```

#### 2. Import the module

You can either add the following line to `mmdet3d/models/necks/__init__.py`:

```python
from .second_fpn import SECONDFPN
```

or add the following to the config file, which imports the new module without modifying the source code:

```python
custom_imports = dict(
    imports=['mmdet3d.models.necks.second_fpn'],
    allow_failed_imports=False)
```

#### 3. Use the neck in your config file

```python
model = dict(
    ...
    neck=dict(
        type='SECONDFPN',
        in_channels=[64, 128, 256],
        upsample_strides=[1, 2, 4],
        out_channels=[128, 128, 128]),
    ...
)
```

### Add a new head

Here we show how to develop a new head, using [PartA2 Head](https://arxiv.org/abs/1907.03670) as an example.

**Note**: The PartA2 RoI Head shown here is used in a two-stage detector. For one-stage detectors, please refer to the examples in `mmdet3d/models/dense_heads/`. Those heads are simple and efficient, so they are widely used for 3D detection in autonomous driving.

First, add a new bbox head in `mmdet3d/models/roi_heads/bbox_heads/parta2_bbox_head.py`.
PartA2 RoI Head implements a new bbox head for object detection.
To implement a bbox head, we basically need to implement three functions of the new module, as shown below. Sometimes other related functions such as `loss` and `get_targets` are also required.

```python
from mmengine.model import BaseModule

from mmdet3d.registry import MODELS


@MODELS.register_module()
class PartA2BboxHead(BaseModule):
    """PartA2 RoI head."""

    def __init__(self,
                 num_classes,
                 seg_in_channels,
                 part_in_channels,
                 seg_conv_channels=None,
                 part_conv_channels=None,
                 merge_conv_channels=None,
                 down_conv_channels=None,
                 shared_fc_channels=None,
                 cls_channels=None,
                 reg_channels=None,
                 dropout_ratio=0.1,
                 roi_feat_size=14,
                 with_corner_loss=True,
                 bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder'),
                 conv_cfg=dict(type='Conv1d'),
                 norm_cfg=dict(type='BN1d', eps=1e-3, momentum=0.01),
                 loss_bbox=dict(
                     type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=2.0),
                 loss_cls=dict(
                     type='CrossEntropyLoss',
                     use_sigmoid=True,
                     reduction='none',
                     loss_weight=1.0),
                 init_cfg=None):
        super(PartA2BboxHead, self).__init__(init_cfg=init_cfg)

    def forward(self, seg_feats, part_feats):
        # implementation is omitted here
        ...
```

Second, implement a new RoI Head if necessary. Here we inherit the new `PartAggregationROIHead` from `Base3DRoIHead`. We can see that `Base3DRoIHead` already implements the following functions:

```python
from mmdet.models.roi_heads import BaseRoIHead

from mmdet3d.registry import MODELS, TASK_UTILS


class Base3DRoIHead(BaseRoIHead):
    """Base class for 3d RoIHeads."""

    def __init__(self,
                 bbox_head=None,
                 bbox_roi_extractor=None,
                 mask_head=None,
                 mask_roi_extractor=None,
                 train_cfg=None,
                 test_cfg=None,
                 init_cfg=None):
        super(Base3DRoIHead, self).__init__(
            bbox_head=bbox_head,
            bbox_roi_extractor=bbox_roi_extractor,
            mask_head=mask_head,
            mask_roi_extractor=mask_roi_extractor,
            train_cfg=train_cfg,
            test_cfg=test_cfg,
            init_cfg=init_cfg)

    def init_bbox_head(self, bbox_roi_extractor: dict,
                       bbox_head: dict) -> None:
        """Initialize box head and box roi extractor.

        Args:
            bbox_roi_extractor (dict or ConfigDict): Config of box
                roi extractor.
            bbox_head (dict or ConfigDict): Config of box in box head.
        """
        self.bbox_roi_extractor = MODELS.build(bbox_roi_extractor)
        self.bbox_head = MODELS.build(bbox_head)

    def init_assigner_sampler(self):
        """Initialize assigner and sampler."""
        self.bbox_assigner = None
        self.bbox_sampler = None
        if self.train_cfg:
            if isinstance(self.train_cfg.assigner, dict):
                self.bbox_assigner = TASK_UTILS.build(self.train_cfg.assigner)
            elif isinstance(self.train_cfg.assigner, list):
                self.bbox_assigner = [
                    TASK_UTILS.build(res) for res in self.train_cfg.assigner
                ]
            self.bbox_sampler = TASK_UTILS.build(self.train_cfg.sampler)

    def init_mask_head(self):
        """Initialize mask head, skip since ``PartAggregationROIHead`` does not
        have one."""
        pass
```

Next, the `bbox_forward` logic is modified, while the remaining logic is inherited from `Base3DRoIHead`. In `mmdet3d/models/roi_heads/part_aggregation_roi_head.py`, we implement the new RoI Head as follows:

```python
from typing import Dict, List, Tuple

from mmdet.models.task_modules import AssignResult, SamplingResult
from mmengine import ConfigDict
from torch import Tensor
from torch.nn import functional as F

from mmdet3d.registry import MODELS
from mmdet3d.structures import bbox3d2roi
from mmdet3d.utils import InstanceList
from ...structures.det3d_data_sample import SampleList
from .base_3droi_head import Base3DRoIHead


@MODELS.register_module()
class PartAggregationROIHead(Base3DRoIHead):
    """Part aggregation roi head for PartA2.

    Args:
        semantic_head (ConfigDict): Config of semantic head.
        num_classes (int): The number of classes.
        seg_roi_extractor (ConfigDict): Config of seg_roi_extractor.
        bbox_roi_extractor (ConfigDict): Config of part_roi_extractor.
        bbox_head (ConfigDict): Config of bbox_head.
        train_cfg (ConfigDict): Training config.
        test_cfg (ConfigDict): Testing config.
    """

    def __init__(self,
                 semantic_head: dict,
                 num_classes: int = 3,
                 seg_roi_extractor: dict = None,
                 bbox_head: dict = None,
                 bbox_roi_extractor: dict = None,
                 train_cfg: dict = None,
                 test_cfg: dict = None,
                 init_cfg: dict = None) -> None:
        super(PartAggregationROIHead, self).__init__(
            bbox_head=bbox_head,
            bbox_roi_extractor=bbox_roi_extractor,
            train_cfg=train_cfg,
            test_cfg=test_cfg,
            init_cfg=init_cfg)
        self.num_classes = num_classes
        assert semantic_head is not None
        self.init_seg_head(seg_roi_extractor, semantic_head)

    def init_seg_head(self, seg_roi_extractor: dict,
                      semantic_head: dict) -> None:
        """Initialize semantic head and seg roi extractor.

        Args:
            seg_roi_extractor (dict): Config of seg roi extractor.
            semantic_head (dict): Config of semantic head.
        """
        self.semantic_head = MODELS.build(semantic_head)
        self.seg_roi_extractor = MODELS.build(seg_roi_extractor)

    @property
    def with_semantic(self):
        """bool: whether the head has semantic branch"""
        return hasattr(self,
                       'semantic_head') and self.semantic_head is not None

    def predict(self,
                feats_dict: Dict,
                rpn_results_list: InstanceList,
                batch_data_samples: SampleList,
                rescale: bool = False,
                **kwargs) -> InstanceList:
        """Perform forward propagation of the roi head and predict detection
        results on the features of the upstream network.

        Args:
            feats_dict (dict): Contains features from the first stage.
            rpn_results_list (List[:obj:`InstanceData`]): Detection results
                of rpn head.
            batch_data_samples (List[:obj:`Det3DDataSample`]): The Data
                samples. It usually includes information such as
                `gt_instances_3d`, `gt_panoptic_seg_3d` and `gt_sem_seg_3d`.
            rescale (bool): If True, return boxes in original image space.
                Defaults to False.

        Returns:
            list[:obj:`InstanceData`]: Detection results of each sample
            after the post process.
            Each item usually contains following keys.

            - scores_3d (Tensor): Classification scores, has a shape
              (num_instances, )
            - labels_3d (Tensor): Labels of bboxes, has a shape
              (num_instances, ).
            - bboxes_3d (BaseInstance3DBoxes): Prediction of bboxes,
              contains a tensor with shape (num_instances, C), where
              C >= 7.
        """
        assert self.with_bbox, 'Bbox head must be implemented in PartA2.'
        assert self.with_semantic, 'Semantic head must be implemented' \
                                   ' in PartA2.'

        batch_input_metas = [
            data_samples.metainfo for data_samples in batch_data_samples
        ]
        voxels_dict = feats_dict.pop('voxels_dict')
        # TODO: Split predict semantic and bbox
        results_list = self.predict_bbox(feats_dict, voxels_dict,
                                         batch_input_metas, rpn_results_list,
                                         self.test_cfg)
        return results_list

    def predict_bbox(self, feats_dict: Dict, voxel_dict: Dict,
                     batch_input_metas: List[dict],
                     rpn_results_list: InstanceList,
                     test_cfg: ConfigDict) -> InstanceList:
        """Perform forward propagation of the bbox head and predict detection
        results on the features of the upstream network.

        Args:
            feats_dict (dict): Contains features from the first stage.
            voxel_dict (dict): Contains information of voxels.
            batch_input_metas (list[dict], Optional): Batch image meta info.
                Defaults to None.
            rpn_results_list (List[:obj:`InstanceData`]): Detection results
                of rpn head.
            test_cfg (Config): Test config.

        Returns:
            list[:obj:`InstanceData`]: Detection results of each sample
            after the post process.
            Each item usually contains following keys.

            - scores_3d (Tensor): Classification scores, has a shape
              (num_instances, )
            - labels_3d (Tensor): Labels of bboxes, has a shape
              (num_instances, ).
            - bboxes_3d (BaseInstance3DBoxes): Prediction of bboxes,
              contains a tensor with shape (num_instances, C), where
              C >= 7.
        """
        ...

    def loss(self, feats_dict: Dict, rpn_results_list: InstanceList,
             batch_data_samples: SampleList, **kwargs) -> dict:
        """Perform forward propagation and loss calculation of the detection
        roi on the features of the upstream network.

        Args:
            feats_dict (dict): Contains features from the first stage.
            rpn_results_list (List[:obj:`InstanceData`]): Detection results
                of rpn head.
            batch_data_samples (List[:obj:`Det3DDataSample`]): The Data
                samples. It usually includes information such as
                `gt_instances_3d`, `gt_panoptic_seg_3d` and `gt_sem_seg_3d`.

        Returns:
            dict[str, Tensor]: A dictionary of loss components
        """
        assert len(rpn_results_list) == len(batch_data_samples)
        losses = dict()
        batch_gt_instances_3d = []
        batch_gt_instances_ignore = []
        voxels_dict = feats_dict.pop('voxels_dict')
        for data_sample in batch_data_samples:
            batch_gt_instances_3d.append(data_sample.gt_instances_3d)
            if 'ignored_instances' in data_sample:
                batch_gt_instances_ignore.append(data_sample.ignored_instances)
            else:
                batch_gt_instances_ignore.append(None)
        if self.with_semantic:
            semantic_results = self._semantic_forward_train(
                feats_dict, voxels_dict, batch_gt_instances_3d)
            losses.update(semantic_results.pop('loss_semantic'))

        sample_results = self._assign_and_sample(rpn_results_list,
                                                 batch_gt_instances_3d)
        if self.with_bbox:
            feats_dict.update(semantic_results)
            bbox_results = self._bbox_forward_train(feats_dict, voxels_dict,
                                                    sample_results)
            losses.update(bbox_results['loss_bbox'])

        return losses
```

Further details of the related functions are omitted here. Please see the [code](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/models/roi_heads/part_aggregation_roi_head.py) for more details.

Last, add the modules in `mmdet3d/models/roi_heads/bbox_heads/__init__.py` and `mmdet3d/models/roi_heads/__init__.py` so that the corresponding registry can find and load them.

Alternatively, add the following to the config file to achieve the same goal:

```python
custom_imports=dict(
    imports=['mmdet3d.models.roi_heads.part_aggregation_roi_head',
             'mmdet3d.models.roi_heads.bbox_heads.parta2_bbox_head'])
```

The config of `PartAggregationROIHead` is as follows.

```python
model = dict(
    ...
    roi_head=dict(
        type='PartAggregationROIHead',
        num_classes=3,
        semantic_head=dict(
            type='PointwiseSemanticHead',
            in_channels=16,
            extra_width=0.2,
            seg_score_thr=0.3,
            num_classes=3,
            loss_seg=dict(
                type='mmdet.FocalLoss',
                use_sigmoid=True,
                reduction='sum',
                gamma=2.0,
                alpha=0.25,
                loss_weight=1.0),
            loss_part=dict(
                type='mmdet.CrossEntropyLoss',
                use_sigmoid=True,
                loss_weight=1.0)),
        seg_roi_extractor=dict(
            type='Single3DRoIAwareExtractor',
            roi_layer=dict(
                type='RoIAwarePool3d',
                out_size=14,
                max_pts_per_voxel=128,
                mode='max')),
        bbox_roi_extractor=dict(
            type='Single3DRoIAwareExtractor',
            roi_layer=dict(
                type='RoIAwarePool3d',
                out_size=14,
                max_pts_per_voxel=128,
                mode='avg')),
        bbox_head=dict(
            type='PartA2BboxHead',
            num_classes=3,
            seg_in_channels=16,
            part_in_channels=4,
            seg_conv_channels=[64, 64],
            part_conv_channels=[64, 64],
            merge_conv_channels=[128, 128],
            down_conv_channels=[128, 256],
            bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder'),
            shared_fc_channels=[256, 512, 512, 512],
            cls_channels=[256, 256],
            reg_channels=[256, 256],
            dropout_ratio=0.1,
            roi_feat_size=14,
            with_corner_loss=True,
            loss_bbox=dict(
                type='mmdet.SmoothL1Loss',
                beta=1.0 / 9.0,
                reduction='sum',
                loss_weight=1.0),
            loss_cls=dict(
                type='mmdet.CrossEntropyLoss',
                use_sigmoid=True,
                reduction='sum',
                loss_weight=1.0))),
    ...
)
```

Since MMDetection 2.0, the config system supports inheriting configs, so that users can focus on their own modifications.
The second stage of PartA2 Head mainly uses the new `PartAggregationROIHead` and `PartA2BboxHead`; their arguments are set according to the `__init__` function of each module.

### Add a new loss

Assume you want to add a new loss named `MyLoss` for bounding box regression.
To add a new loss function, you need to implement it in `mmdet3d/models/losses/my_loss.py`.
The decorator `weighted_loss` enables the loss to be weighted for each element and then reduced over the batch.

```python
import torch
import torch.nn as nn
from mmdet.models.losses.utils import weighted_loss

from mmdet3d.registry import MODELS


@weighted_loss
def my_loss(pred, target):
    # element-wise L1 distance; weighting and reduction are handled
    # by the `weighted_loss` decorator
    assert pred.size() == target.size() and target.numel() > 0
    loss = torch.abs(pred - target)
    return loss


@MODELS.register_module()
class MyLoss(nn.Module):

    def __init__(self, reduction='mean', loss_weight=1.0):
        super(MyLoss, self).__init__()
        self.reduction = reduction
        self.loss_weight = loss_weight

    def forward(self,
                pred,
                target,
                weight=None,
                avg_factor=None,
                reduction_override=None):
        assert reduction_override in (None, 'none', 'mean', 'sum')
        reduction = (
            reduction_override if reduction_override else self.reduction)
        loss_bbox = self.loss_weight * my_loss(
            pred, target, weight, reduction=reduction, avg_factor=avg_factor)
        return loss_bbox
```

Then you need to add the loss to `mmdet3d/models/losses/__init__.py`:

```python
from .my_loss import MyLoss, my_loss
```

Alternatively, add the following to the config file to achieve the same goal:

```python
custom_imports=dict(
    imports=['mmdet3d.models.losses.my_loss'])
```

To use the loss, modify the `loss_xxx` field.
Since `MyLoss` is for bounding box regression, you need to modify the `loss_bbox` field in the corresponding head.

```python
loss_bbox=dict(type='MyLoss', loss_weight=1.0)
```
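
Every section on this page follows the same pattern: register a class, then select it in a config through its `type` key. The sketch below illustrates that pattern in plain Python. It is only a toy model of the idea: `ToyRegistry` and `TOY_MODELS` are hypothetical names, not the actual `MODELS` registry, and this `MyLoss` works on lists of floats instead of tensors.

```python
class ToyRegistry:
    """Hypothetical stand-in for a registry such as MODELS."""

    def __init__(self):
        self._modules = {}

    def register_module(self):
        # Used as a decorator: stores the class under its own name.
        def decorator(cls):
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        # Copy first so the caller's config dict is left untouched.
        args = dict(cfg)
        cls = self._modules[args.pop('type')]
        # All remaining keys become constructor arguments.
        return cls(**args)


TOY_MODELS = ToyRegistry()


@TOY_MODELS.register_module()
class MyLoss:
    """Plain-Python L1 loss over lists of floats (illustration only)."""

    def __init__(self, reduction='mean', loss_weight=1.0):
        self.reduction = reduction
        self.loss_weight = loss_weight

    def __call__(self, pred, target):
        assert len(pred) == len(target) and len(target) > 0
        total = sum(abs(p - t) for p, t in zip(pred, target))
        # 'mean' divides by the element count; anything else keeps the sum.
        if self.reduction == 'mean':
            total = total / len(target)
        return self.loss_weight * total


# The config names the class via `type`; other keys map to `__init__` args.
loss_cfg = dict(type='MyLoss', loss_weight=2.0)
loss_fn = TOY_MODELS.build(loss_cfg)
loss_value = loss_fn([1.0, 2.0, 6.0], [1.0, 1.0, 1.0])
```

In this toy version a class must be defined (and therefore registered) before `build` is called. That is the same reason the new module has to be imported, either in the package `__init__.py` or through `custom_imports`, before a config can refer to it.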