Unverified Commit f268ba42 authored by ChaimZhu, committed by GitHub

[Feature] Add SMOKE Head and Coder (v1.0.0.dev0) (#959)



* [Refactor] Main code modification for coordinate system refactor (#677)

* [Enhance] Add script for data update (#774)

* Fixed wrong config paths and fixed a bug in test

* Fixed metafile

* Coord sys refactor (main code)

* Update test_waymo_dataset.py

* Manually resolve conflict

* Removed unused lines and fixed imports

* remove coord2box and box2coord

* update dir_limit_offset

* Some minor improvements

* Removed some \s in comments

* Revert a change

* Change Box3DMode to Coord3DMode where points are converted

* Fix points_in_bbox function

* Fix Imvoxelnet config

* Revert adding a line

* Fix rotation bug when batch size is 0

* Keep sign of dir_scores as before

* Fix several comments

* Add a comment

* Fix docstring

* Add data update scripts

* Fix comments

* fix import (#839)

* [Enhance]  refactor  iou_neg_piecewise_sampler.py (#842)

* refactor iou_neg_piecewise_sampler.py

* add docstring

* modify docstring
Co-authored-by: Yezhen Cong <52420115+THU17cyz@users.noreply.github.com>
Co-authored-by: THU17cyz <congyezhen71@hotmail.com>

* [Feature] Add roipooling cuda ops (#843)

* add roipooling cuda ops

* add roi extractor

* add test_roi_extractor unittest

* Modify setup.py to install roipooling ops

* modify docstring

* remove enlarge bbox in roipoint pooling

* add_roipooling_ops

* modify docstring
Co-authored-by: Yezhen Cong <52420115+THU17cyz@users.noreply.github.com>
Co-authored-by: THU17cyz <congyezhen71@hotmail.com>

* [Refactor] Refactor code structure and docstrings (#803)

* refactor points_in_boxes

* Merge same functions of three boxes

* More docstring fixes and unify x/y/z size

* Add "optional" and fix "Default"

* Add "optional" and fix "Default"

* Add "optional" and fix "Default"

* Add "optional" and fix "Default"

* Add "optional" and fix "Default"

* Remove None in function param type

* Fix unittest

* Add comments for NMS functions

* Merge methods of Points

* Add unittest

* Add optional and default value

* Fix box conversion and add unittest

* Fix comments

* Add unit test

* Indent

* Fix CI

* Remove useless \\

* Remove useless \\

* Remove useless \\

* Remove useless \\

* Remove useless \\

* Add unit test for box bev

* More unit tests and refine docstrings in box_np_ops

* Fix comment

* Add deprecation warning

* [Feature] PointXYZWHLRBBoxCoder (#856)

* support PointBasedBoxCoder

* fix unittest bug

* support unittest in gpu

* support unittest in gpu

* modified docstring

* add args

* add args

* [Enhance] Change Groupfree3D config (#855)

* All mods

* PointSample

* PointSample

* [Doc] Add tutorials/data_pipeline Chinese version (#827)

* [Doc] Add tutorials/data_pipeline Chinese version

* refine doc

* Use the absolute link

* Use the absolute link
Co-authored-by: Tai-Wang <tab_wang@outlook.com>

* [Doc] Add Chinese doc for `scannet_det.md` (#836)

* Part

* Complete

* Fix comments

* Fix comments

* [Doc] Add Chinese doc for `waymo_det.md` (#859)

* Add complete translation

* Refinements

* Fix comments

* Fix a minor typo
Co-authored-by: Tai-Wang <tab_wang@outlook.com>

* Remove 2D annotations on Lyft (#867)

* Add header for files (#869)

* Add header for files

* Add header for files

* Add header for files

* Add header for files

* [fix] fix typos (#872)

* Fix 3 unworking configs (#882)

* [Fix] Fix `index.rst` for Chinese docs (#873)

* Fix index.rst for zh docs

* Change switch language

* [Fix] Centerpoint head nested list transpose  (#879)

* FIX Transpose nested lists without Numpy

* Removed unused Numpy import

* [Enhance] Update PointFusion (#791)

* update point fusion

* remove LIDAR hardcode

* move get_proj_mat_by_coord_type to utils

* fix lint

* remove todo

* fix lint

* [Doc] Add nuscenes_det.md Chinese version (#854)

* add nus chinese doc

* add nuScenes Chinese doc

* fix typo

* fix typo

* fix typo

* fix typo

* fix typo

* [Fix] Fix RegNet pretrained weight loading (#889)

* Fix regnet pretrained weight loading

* Remove unused file

* Fix centerpoint tta (#892)

* [Enhance] Add benchmark regression script (#808)

* Initial commit

* [Feature] Support DGCNN (v1.0.0.dev0) (#896)

* support dgcnn

* support dgcnn

* support dgcnn

* support dgcnn

* support dgcnn

* support dgcnn

* support dgcnn

* support dgcnn

* support dgcnn

* support dgcnn

* fix typo

* fix typo

* fix typo

* del gf&fa registry (wo reuse pointnet module)

* fix typo

* add benchmark and add copyright header (for DGCNN only)

* fix typo

* fix typo

* fix typo

* fix typo

* fix typo

* support dgcnn

* Change cam rot_3d_in_axis (#906)

* [Doc] Add coord sys tutorial pic and change links to dev branch (#912)

* Modify link branch and add pic

* Fix pic

* [Feature] add kitti AP40 evaluation metric (v1.0.0.dev0) (#927)

* Add citation (#901)

* [Feature] Add python3.9 in CI (#900)

* Add python3.9 in CI

* Add python3.9 in CI

* Bump to v0.17.0 (#898)

* Update README.md

* Update README_zh-CN.md

* Update version.py

* Update getting_started.md

* Update getting_started.md

* Update changelog.md

* Remove "recent" in the news

* Remove "recent" in the news

* Fix comments

* [Docs] Fix the version of sphinx (#902)

* Fix sphinx version

* Fix sphinx version

* Fix sphinx version

* Fix sphinx version

* Fix sphinx version

* Fix sphinx version

* Fix sphinx version

* Fix sphinx version

* Fix sphinx version

* Fix sphinx version

* Fix sphinx version

* Fix sphinx version

* add AP40

* add unit test

* add unit test

* separate AP11 and AP40

* fix some typos
Co-authored-by: dingchang <hudingchang.vendor@sensetime.com>
Co-authored-by: Tai-Wang <tab_wang@outlook.com>

* [Feature] add smoke backbone neck (#939)

* add smoke detector and its backbone and neck

* typo fix

* fix typo

* add docstring

* fix typo

* fix comments

* fix comments

* fix comments

* fix typo

* fix typo

* fix

* fix typo

* fix docstring

* refine feature

* fix typo

* use Basemodule in Neck

* [Refactor] Refactor the transformation from image to camera coordinates (#938)

* Refactor points_img2cam

* Refine docstring

* Support array converter and add unit tests

* [Feature] FCOS3D BBox Coder (#940)

* FCOS3D BBox Coder

* Add unit tests

* Change the value from long to float/double

* Rename bbox_out as bbox

* Add comments to forward returns

* [Feature] PGD BBox Coder (#948)

* Support PGD BBox Coder

* Refine docstring

* [Feature] Support Uncertain L1 Loss (#950)

* Add uncertain l1 loss and its unit tests

* Remove mmcv.jit and refine docstrings

* [Fix] Fix visualization in KITTI dataset (#956)

* fix bug to support kitti vis

* fix

* add smoke head

* fix typo

* fix code

* fix

* fix typo

* add detector

* delete detector

* fix test_heads

* add coder test

* fix head

* fix bugs in smoke head

* refine kitti_mono3d data config

* remove cuda is available

* fix docstring

* fix unit test

* fix typo

* refine

* add affine aug

* fix rebase typo

* fix docs

* change cam_intrinsics to cam2imgs

* fix typos

* fix lint

* fix bugs

* fix las typos

* fix typos

* add dosctrings for trans_mats
Co-authored-by: Yezhen Cong <52420115+THU17cyz@users.noreply.github.com>
Co-authored-by: Xi Liu <75658786+xiliu8006@users.noreply.github.com>
Co-authored-by: THU17cyz <congyezhen71@hotmail.com>
Co-authored-by: Wenhao Wu <79644370+wHao-Wu@users.noreply.github.com>
Co-authored-by: Tai-Wang <tab_wang@outlook.com>
Co-authored-by: dingchang <hudingchang.vendor@sensetime.com>
Co-authored-by: 谢恩泽 <Johnny_ez@163.com>
Co-authored-by: Robin Karlsson <34254153+robin-karlsson0@users.noreply.github.com>
Co-authored-by: Danila Rukhovich <danrukh@gmail.com>
parent e93a77f0
......@@ -8,9 +8,10 @@ from .groupfree3d_bbox_coder import GroupFree3DBBoxCoder
from .partial_bin_based_bbox_coder import PartialBinBasedBBoxCoder
from .pgd_bbox_coder import PGDBBoxCoder
from .point_xyzwhlr_bbox_coder import PointXYZWHLRBBoxCoder
from .smoke_bbox_coder import SMOKECoder
__all__ = [
'build_bbox_coder', 'DeltaXYZWLHRBBoxCoder', 'PartialBinBasedBBoxCoder',
'CenterPointBBoxCoder', 'AnchorFreeBBoxCoder', 'GroupFree3DBBoxCoder',
'PointXYZWHLRBBoxCoder', 'FCOS3DBBoxCoder', 'PGDBBoxCoder'
'PointXYZWHLRBBoxCoder', 'FCOS3DBBoxCoder', 'PGDBBoxCoder', 'SMOKECoder'
]
import numpy as np
import torch
from mmdet.core.bbox import BaseBBoxCoder
from mmdet.core.bbox.builder import BBOX_CODERS
@BBOX_CODERS.register_module()
class SMOKECoder(BaseBBoxCoder):
"""Bbox Coder for SMOKE.
Args:
base_depth (tuple[float]): Depth references for decode box depth.
base_dims (tuple[tuple[float]]): Dimension references [l, h, w]
for decode box dimension for each category.
code_size (int): The dimension of boxes to be encoded.
"""
def __init__(self, base_depth, base_dims, code_size):
super(SMOKECoder, self).__init__()
self.base_depth = base_depth
self.base_dims = base_dims
self.bbox_code_size = code_size
def encode(self, locations, dimensions, orientations, input_metas):
"""Encode CameraInstance3DBoxes by locations, dimemsions, orientations.
Args:
locations (Tensor): Center location for 3D boxes.
(N, 3)
dimensions (Tensor): Dimensions for 3D boxes.
shape (N, 3)
orientations (Tensor): Orientations for 3D boxes.
shape (N, 1)
input_metas (list[dict]): Meta information of each image, e.g.,
image size, scaling factor, etc.
Return:
:obj:`CameraInstance3DBoxes`: 3D bboxes of batch images,
shape (N, bbox_code_size).
"""
bboxes = torch.cat((locations, dimensions, orientations), dim=1)
        assert bboxes.shape[1] == self.bbox_code_size, \
            'bboxes shape does not match the bbox_code_size.'
batch_bboxes = input_metas[0]['box_type_3d'](
bboxes, box_dim=self.bbox_code_size)
return batch_bboxes
def decode(self,
reg,
points,
labels,
cam2imgs,
trans_mats,
locations=None):
"""Decode regression into locations, dimemsions, orientations.
Args:
reg (Tensor): Batch regression for each predict center2d point.
shape: (batch * K (max_objs), C)
points(Tensor): Batch projected bbox centers on image plane.
shape: (batch * K (max_objs) , 2)
labels (Tensor): Batch predict class label for each predict
center2d point.
shape: (batch, K (max_objs))
cam2imgs (Tensor): Batch images' camera intrinsic matrix.
shape: (batch, 4, 4)
trans_mats (Tensor): transformation matrix from original image
to feature map.
shape: (batch, 3, 3)
locations (None | Tensor): if locations is None, this function
is used to decode while inference, otherwise, it's used while
training using the ground truth 3d bbox locations.
shape: (batch * K (max_objs), 3)
Return:
tuple(Tensor): The tuple has components below:
- locations (Tensor): Centers of 3D boxes.
shape: (batch * K (max_objs), 3)
- dimensions (Tensor): Dimensions of 3D boxes.
shpae: (batch * K (max_objs), 3)
- orientations (Tensor): Orientations of 3D
boxes.
shape: (batch * K (max_objs), 1)
"""
depth_offsets = reg[:, 0]
centers2d_offsets = reg[:, 1:3]
dimensions_offsets = reg[:, 3:6]
orientations = reg[:, 6:8]
depths = self._decode_depth(depth_offsets)
        # get the 3D bounding box's center location
pred_locations = self._decode_location(points, centers2d_offsets,
depths, cam2imgs, trans_mats)
pred_dimensions = self._decode_dimension(labels, dimensions_offsets)
if locations is None:
pred_orientations = self._decode_orientation(
orientations, pred_locations)
else:
pred_orientations = self._decode_orientation(
orientations, locations)
return pred_locations, pred_dimensions, pred_orientations
def _decode_depth(self, depth_offsets):
"""Transform depth offset to depth."""
base_depth = depth_offsets.new_tensor(self.base_depth)
depths = depth_offsets * base_depth[1] + base_depth[0]
return depths
def _decode_location(self, points, centers2d_offsets, depths, cam2imgs,
trans_mats):
"""Retrieve objects location in camera coordinate based on projected
points.
Args:
points (Tensor): Projected points on feature map in (x, y)
shape: (batch * K, 2)
centers2d_offset (Tensor): Project points offset in
(delta_x, delta_y). shape: (batch * K, 2)
depths (Tensor): Object depth z.
shape: (batch * K)
cam2imgs (Tensor): Batch camera intrinsics matrix.
shape: (batch, 4, 4)
trans_mats (Tensor): transformation matrix from original image
to feature map.
shape: (batch, 3, 3)
"""
# number of points
N = centers2d_offsets.shape[0]
# batch_size
N_batch = cam2imgs.shape[0]
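        # map each flattened object index (batch * K) back to its batch
        # index so per-image inverse matrices can be gathered per object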
batch_id = torch.arange(N_batch).unsqueeze(1)
obj_id = batch_id.repeat(1, N // N_batch).flatten()
trans_mats_inv = trans_mats.inverse()[obj_id]
cam2imgs_inv = cam2imgs.inverse()[obj_id]
centers2d = points + centers2d_offsets
centers2d_extend = torch.cat((centers2d, centers2d.new_ones(N, 1)),
dim=1)
        # expand the projected points to shape [N, 3, 1]
        centers2d_extend = centers2d_extend.unsqueeze(-1)
        # transform the projected points back onto the original image
centers2d_img = torch.matmul(trans_mats_inv, centers2d_extend)
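        # scale the homogeneous pixel coordinates by depth so that applying
        # the inverse intrinsics recovers metric camera coordinates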
centers2d_img = centers2d_img * depths.view(N, -1, 1)
centers2d_img_extend = torch.cat(
(centers2d_img, centers2d.new_ones(N, 1, 1)), dim=1)
locations = torch.matmul(cam2imgs_inv, centers2d_img_extend).squeeze(2)
return locations[:, :3]
def _decode_dimension(self, labels, dims_offset):
"""Transform dimension offsets to dimension according to its category.
Args:
labels (Tensor): Each points' category id.
shape: (N, K)
dims_offset (Tensor): Dimension offsets.
shape: (N, 3)
"""
labels = labels.flatten().long()
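        # select the per-class [l, h, w] reference dimensions; the predicted
        # offsets act multiplicatively via exp()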
base_dims = dims_offset.new_tensor(self.base_dims)
dims_select = base_dims[labels, :]
dimensions = dims_offset.exp() * dims_select
return dimensions
def _decode_orientation(self, ori_vector, locations):
"""Retrieve object orientation.
Args:
ori_vector (Tensor): Local orientation in [sin, cos] format.
shape: (N, 2)
locations (Tensor): Object location.
shape: (N, 3)
Return:
Tensor: yaw(Orientation). Notice that the yaw's
range is [-np.pi, np.pi].
shape:(N, 1)
"""
assert len(ori_vector) == len(locations)
locations = locations.view(-1, 3)
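        # rays: viewing angle towards each object center in the camera x-z
        # plane; the global yaw is recovered as alpha (local angle) + ray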
rays = torch.atan(locations[:, 0] / (locations[:, 2] + 1e-7))
alphas = torch.atan(ori_vector[:, 0] / (ori_vector[:, 1] + 1e-7))
        # get the indices of positive and negative cosine values
cos_pos_inds = (ori_vector[:, 1] >= 0).nonzero()
cos_neg_inds = (ori_vector[:, 1] < 0).nonzero()
alphas[cos_pos_inds] -= np.pi / 2
alphas[cos_neg_inds] += np.pi / 2
        # retrieve the object rotation_y angle
yaws = alphas + rays
larger_inds = (yaws > np.pi).nonzero()
small_inds = (yaws < -np.pi).nonzero()
if len(larger_inds) != 0:
yaws[larger_inds] -= 2 * np.pi
if len(small_inds) != 0:
yaws[small_inds] += 2 * np.pi
yaws = yaws.unsqueeze(-1)
return yaws
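# Usage sketch (illustrative only; shapes and reference values mirror the
# unit test in test_smoke_bbox_coder rather than any canonical config, and
# `img_metas` is a hypothetical meta list providing `box_type_3d`):
#
#   coder = SMOKECoder(base_depth=(28.01, 16.32),
#                      base_dims=((3.88, 1.63, 1.53),),
#                      code_size=7)
#   reg = torch.rand(100, 8)       # (batch * K, C) gathered regressions
#   points = torch.rand(100, 2)    # projected centers on the feature map
#   labels = torch.zeros(1, 100)   # per-point category ids
#   cam2imgs = torch.eye(4).unsqueeze(0)    # (batch, 4, 4) intrinsics
#   trans_mats = torch.eye(3).unsqueeze(0)  # (batch, 3, 3) img -> feat
#   locs, dims, oris = coder.decode(reg, points, labels, cam2imgs,
#                                   trans_mats)
#   bboxes = coder.encode(locs, dims, oris, img_metas)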
......@@ -9,6 +9,7 @@ from .free_anchor3d_head import FreeAnchor3DHead
from .groupfree3d_head import GroupFree3DHead
from .parta2_rpn_head import PartA2RPNHead
from .shape_aware_head import ShapeAwareHead
from .smoke_mono3d_head import SMOKEMono3DHead
from .ssd_3d_head import SSD3DHead
from .vote_head import VoteHead
......@@ -16,5 +17,5 @@ __all__ = [
'Anchor3DHead', 'FreeAnchor3DHead', 'PartA2RPNHead', 'VoteHead',
'SSD3DHead', 'BaseConvBboxHead', 'CenterHead', 'ShapeAwareHead',
'BaseMono3DDenseHead', 'AnchorFreeMono3DHead', 'FCOSMono3DHead',
'GroupFree3DHead'
'GroupFree3DHead', 'SMOKEMono3DHead'
]
import torch
from torch.nn import functional as F
from mmdet.core import multi_apply
from mmdet.core.bbox.builder import build_bbox_coder
from mmdet.models.builder import HEADS
from mmdet.models.utils import gaussian_radius, gen_gaussian_target
from mmdet.models.utils.gaussian_target import (get_local_maximum,
get_topk_from_heatmap,
transpose_and_gather_feat)
from .anchor_free_mono3d_head import AnchorFreeMono3DHead
@HEADS.register_module()
class SMOKEMono3DHead(AnchorFreeMono3DHead):
r"""Anchor-free head used in `SMOKE <https://arxiv.org/abs/2002.10111>`_
.. code-block:: none
/-----> 3*3 conv -----> 1*1 conv -----> cls
feature
\-----> 3*3 conv -----> 1*1 conv -----> reg
    Args:
        num_classes (int): Number of categories excluding the background
            category.
        in_channels (int): Number of channels in the input feature map.
        dim_channel (list[int]): Indices of dimension offset predictions
            in the regression heatmap channels.
        ori_channel (list[int]): Indices of orientation offset predictions
            in the regression heatmap channels.
        bbox_coder (dict): Config of the bbox coder for encoding and
            decoding boxes.
        loss_cls (dict, optional): Config of classification loss.
            Default: loss_cls=dict(type='GaussianFocalLoss', loss_weight=1.0).
        loss_bbox (dict, optional): Config of localization loss.
            Default: loss_bbox=dict(type='L1Loss', loss_weight=0.1).
        loss_dir (dict, optional): Config of direction classification loss.
            Default: None for SMOKE.
        loss_attr (dict, optional): Config of attribute classification loss.
            Default: None for SMOKE.
        norm_cfg (dict): Dictionary to construct and config the norm layer.
            Default: norm_cfg=dict(type='GN', num_groups=32,
            requires_grad=True).
        init_cfg (dict): Initialization config dict. Default: None.
    """  # noqa: E501
def __init__(self,
num_classes,
in_channels,
dim_channel,
ori_channel,
bbox_coder,
                 loss_cls=dict(type='GaussianFocalLoss', loss_weight=1.0),
loss_bbox=dict(type='L1Loss', loss_weight=0.1),
loss_dir=None,
loss_attr=None,
norm_cfg=dict(type='GN', num_groups=32, requires_grad=True),
init_cfg=None,
**kwargs):
super().__init__(
num_classes,
in_channels,
loss_cls=loss_cls,
loss_bbox=loss_bbox,
loss_dir=loss_dir,
loss_attr=loss_attr,
norm_cfg=norm_cfg,
init_cfg=init_cfg,
**kwargs)
self.dim_channel = dim_channel
self.ori_channel = ori_channel
self.bbox_coder = build_bbox_coder(bbox_coder)
def forward(self, feats):
"""Forward features from the upstream network.
Args:
feats (tuple[Tensor]): Features from the upstream network, each is
a 4D-tensor.
Returns:
tuple:
cls_scores (list[Tensor]): Box scores for each scale level,
each is a 4D-tensor, the channel number is
num_points * num_classes.
bbox_preds (list[Tensor]): Box energies / deltas for each scale
level, each is a 4D-tensor, the channel number is
num_points * bbox_code_size.
"""
return multi_apply(self.forward_single, feats)
def forward_single(self, x):
"""Forward features of a single scale level.
Args:
x (Tensor): Input feature map.
Returns:
            tuple: Scores for each class and bbox predictions of the input
                feature map.
"""
cls_score, bbox_pred, dir_cls_pred, attr_pred, cls_feat, reg_feat = \
super().forward_single(x)
cls_score = cls_score.sigmoid() # turn to 0-1
cls_score = cls_score.clamp(min=1e-4, max=1 - 1e-4)
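        # constrain the dimension offsets to (-0.5, 0.5) and normalize the
        # orientation channels to a unit [sin, cos] vector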
# (N, C, H, W)
offset_dims = bbox_pred[:, self.dim_channel, ...]
bbox_pred[:, self.dim_channel, ...] = offset_dims.sigmoid() - 0.5
# (N, C, H, W)
vector_ori = bbox_pred[:, self.ori_channel, ...]
bbox_pred[:, self.ori_channel, ...] = F.normalize(vector_ori)
return cls_score, bbox_pred
def get_bboxes(self, cls_scores, bbox_preds, img_metas, rescale=None):
"""Generate bboxes from bbox head predictions.
Args:
cls_scores (list[Tensor]): Box scores for each scale level.
bbox_preds (list[Tensor]): Box regression for each scale.
img_metas (list[dict]): Meta information of each image, e.g.,
image size, scaling factor, etc.
rescale (bool): If True, return boxes in original image space.
Returns:
list[tuple[:obj:`CameraInstance3DBoxes`, Tensor, Tensor, None]]:
                Each item in result_list is a 4-tuple of
                (bboxes, scores, labels, attrs).
"""
assert len(cls_scores) == len(bbox_preds) == 1
cam2imgs = torch.stack([
cls_scores[0].new_tensor(img_meta['cam_intrinsic'])
for img_meta in img_metas
])
trans_mats = torch.stack([
cls_scores[0].new_tensor(img_meta['trans_mat'])
for img_meta in img_metas
])
batch_bboxes, batch_scores, batch_topk_labels = self.decode_heatmap(
cls_scores[0],
bbox_preds[0],
img_metas,
cam2imgs=cam2imgs,
trans_mats=trans_mats,
topk=100,
kernel=3)
result_list = []
for img_id in range(len(img_metas)):
bboxes = batch_bboxes[img_id]
scores = batch_scores[img_id]
labels = batch_topk_labels[img_id]
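            # keep detections above a fixed score threshold; no NMS is
            # applied in this head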
keep_idx = scores > 0.25
bboxes = bboxes[keep_idx]
scores = scores[keep_idx]
labels = labels[keep_idx]
bboxes = img_metas[img_id]['box_type_3d'](
bboxes, box_dim=self.bbox_code_size, origin=(0.5, 0.5, 0.5))
attrs = None
result_list.append((bboxes, scores, labels, attrs))
return result_list
def decode_heatmap(self,
cls_score,
reg_pred,
img_metas,
cam2imgs,
trans_mats,
topk=100,
kernel=3):
"""Transform outputs into detections raw bbox predictions.
Args:
class_score (Tensor): Center predict heatmap,
shape (B, num_classes, H, W).
reg_pred (Tensor): Box regression map.
shape (B, channel, H , W).
img_metas (List[dict]): Meta information of each image, e.g.,
image size, scaling factor, etc.
cam2imgs (Tensor): Camera intrinsic matrixs.
shape (B, 4, 4)
trans_mats (Tensor): Transformation matrix from original image
to feature map.
shape: (batch, 3, 3)
topk (int): Get top k center keypoints from heatmap. Default 100.
kernel (int): Max pooling kernel for extract local maximum pixels.
Default 3.
Returns:
tuple[torch.Tensor]: Decoded output of SMOKEHead, containing
the following Tensors:
- batch_bboxes (Tensor): Coords of each 3D box.
shape (B, k, 7)
- batch_scores (Tensor): Scores of each 3D box.
shape (B, k)
- batch_topk_labels (Tensor): Categories of each 3D box.
shape (B, k)
"""
img_h, img_w = img_metas[0]['pad_shape'][:2]
bs, _, feat_h, feat_w = cls_score.shape
center_heatmap_pred = get_local_maximum(cls_score, kernel=kernel)
*batch_dets, topk_ys, topk_xs = get_topk_from_heatmap(
center_heatmap_pred, k=topk)
batch_scores, batch_index, batch_topk_labels = batch_dets
regression = transpose_and_gather_feat(reg_pred, batch_index)
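        # the 8 regression channels are laid out as: depth offset (1),
        # center2d offset (2), dimension offsets (3), orientation
        # [sin, cos] (2); see SMOKECoder.decode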
regression = regression.view(-1, 8)
points = torch.cat([topk_xs.view(-1, 1),
topk_ys.view(-1, 1).float()],
dim=1)
locations, dimensions, orientations = self.bbox_coder.decode(
regression, points, batch_topk_labels, cam2imgs, trans_mats)
batch_bboxes = torch.cat((locations, dimensions, orientations), dim=1)
batch_bboxes = batch_bboxes.view(bs, -1, self.bbox_code_size)
return batch_bboxes, batch_scores, batch_topk_labels
def get_predictions(self, labels3d, centers2d, gt_locations, gt_dimensions,
gt_orientations, indexs, img_metas, pred_reg):
"""Prepare predictions for computing loss.
Args:
labels3d (Tensor): Labels of each 3D box.
shpae (B, max_objs, )
centers2d (Tensor): Coords of each projected 3D box
center on image. shape (B * max_objs, 2)
gt_locations (Tensor): Coords of each 3D box's location.
shape (B * max_objs, 3)
gt_dimensions (Tensor): Dimensions of each 3D box.
shape (N, 3)
gt_orientations (Tensor): Orientation(yaw) of each 3D box.
shape (N, 1)
indexs (Tensor): Indexs of the existence of the 3D box.
shape (B * max_objs, )
img_metas (list[dict]): Meta information of each image,
e.g., image size, scaling factor, etc.
pre_reg (Tensor): Box regression map.
shape (B, channel, H , W).
Returns:
dict: the dict has components below:
- bbox3d_yaws (:obj:`CameraInstance3DBoxes`):
bbox calculated using pred orientations.
- bbox3d_dims (:obj:`CameraInstance3DBoxes`):
bbox calculated using pred dimentions.
- bbox3d_locs (:obj:`CameraInstance3DBoxes`):
bbox calculated using pred locations.
"""
batch, channel = pred_reg.shape[0], pred_reg.shape[1]
w = pred_reg.shape[3]
cam2imgs = torch.stack([
gt_locations.new_tensor(img_meta['cam_intrinsic'])
for img_meta in img_metas
])
trans_mats = torch.stack([
gt_locations.new_tensor(img_meta['trans_mat'])
for img_meta in img_metas
])
centers2d_inds = centers2d[:, 1] * w + centers2d[:, 0]
centers2d_inds = centers2d_inds.view(batch, -1)
pred_regression = transpose_and_gather_feat(pred_reg, centers2d_inds)
pred_regression_pois = pred_regression.view(-1, channel)
locations, dimensions, orientations = self.bbox_coder.decode(
pred_regression_pois, centers2d, labels3d, cam2imgs, trans_mats,
gt_locations)
locations, dimensions, orientations = locations[indexs], dimensions[
indexs], orientations[indexs]
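        # shift the decoded gravity center down by h / 2 to the bottom
        # center expected by CameraInstance3DBoxes (origin (0.5, 1.0, 0.5))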
locations[:, 1] += dimensions[:, 1] / 2
gt_locations = gt_locations[indexs]
assert len(locations) == len(gt_locations)
assert len(dimensions) == len(gt_dimensions)
assert len(orientations) == len(gt_orientations)
bbox3d_yaws = self.bbox_coder.encode(gt_locations, gt_dimensions,
orientations, img_metas)
bbox3d_dims = self.bbox_coder.encode(gt_locations, dimensions,
gt_orientations, img_metas)
bbox3d_locs = self.bbox_coder.encode(locations, gt_dimensions,
gt_orientations, img_metas)
pred_bboxes = dict(ori=bbox3d_yaws, dim=bbox3d_dims, loc=bbox3d_locs)
return pred_bboxes
def get_targets(self, gt_bboxes, gt_labels, gt_bboxes_3d, gt_labels_3d,
centers2d, feat_shape, img_shape, img_metas):
"""Get training targets for batch images.
``
Args:
gt_bboxes (list[Tensor]): Ground truth bboxes of each image,
shape (num_gt, 4).
gt_labels (list[Tensor]): Ground truth labels of each box,
shape (num_gt,).
gt_bboxes_3d (list[:obj:`CameraInstance3DBoxes`]): 3D Ground
truth bboxes of each image,
shape (num_gt, bbox_code_size).
gt_labels_3d (list[Tensor]): 3D Ground truth labels of each
box, shape (num_gt,).
centers2d (list[Tensor]): Projected 3D centers onto 2D image,
shape (num_gt, 2).
feat_shape (tuple[int]): Feature map shape with value,
shape (B, _, H, W).
img_shape (tuple[int]): Image shape in [h, w] format.
img_metas (list[dict]): Meta information of each image, e.g.,
image size, scaling factor, etc.
Returns:
tuple[Tensor, dict]: The Tensor value is the targets of
center heatmap, the dict has components below:
- gt_centers2d (Tensor): Coords of each projected 3D box
center on image. shape (B * max_objs, 2)
- gt_labels3d (Tensor): Labels of each 3D box.
shpae (B, max_objs, )
- indexs (Tensor): Indexs of the existence of the 3D box.
shape (B * max_objs, )
- affine_indexs (Tensor): Indexs of the affine of the 3D box.
shape (N, )
- gt_locs (Tensor): Coords of each 3D box's location.
shape (N, 3)
- gt_dims (Tensor): Dimensions of each 3D box.
shape (N, 3)
- gt_yaws (Tensor): Orientation(yaw) of each 3D box.
shape (N, 1)
- gt_cors (Tensor): Coords of the corners of each 3D box.
shape (N, 8, 3)
"""
reg_mask = torch.stack([
gt_bboxes[0].new_tensor(
not img_meta['affine_aug'], dtype=torch.bool)
for img_meta in img_metas
])
img_h, img_w = img_shape[:2]
bs, _, feat_h, feat_w = feat_shape
width_ratio = float(feat_w / img_w) # 1/4
height_ratio = float(feat_h / img_h) # 1/4
assert width_ratio == height_ratio
center_heatmap_target = gt_bboxes[-1].new_zeros(
[bs, self.num_classes, feat_h, feat_w])
gt_centers2d = centers2d.copy()
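        # CenterNet-style targets: splat a 2D gaussian at each projected 3D
        # center, with radius derived from the scaled 2D box size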
for batch_id in range(bs):
gt_bbox = gt_bboxes[batch_id]
gt_label = gt_labels[batch_id]
# project centers2d from input image to feat map
gt_center2d = gt_centers2d[batch_id] * width_ratio
for j, center in enumerate(gt_center2d):
center_x_int, center_y_int = center.int()
scale_box_h = (gt_bbox[j][3] - gt_bbox[j][1]) * height_ratio
scale_box_w = (gt_bbox[j][2] - gt_bbox[j][0]) * width_ratio
radius = gaussian_radius([scale_box_h, scale_box_w],
min_overlap=0.7)
radius = max(0, int(radius))
ind = gt_label[j]
gen_gaussian_target(center_heatmap_target[batch_id, ind],
[center_x_int, center_y_int], radius)
avg_factor = max(1, center_heatmap_target.eq(1).sum())
num_ctrs = [center2d.shape[0] for center2d in centers2d]
max_objs = max(num_ctrs)
reg_inds = torch.cat(
[reg_mask[i].repeat(num_ctrs[i]) for i in range(bs)])
inds = torch.zeros((bs, max_objs),
dtype=torch.bool).to(centers2d[0].device)
# put gt 3d bboxes to gpu
gt_bboxes_3d = [
gt_bbox_3d.to(centers2d[0].device) for gt_bbox_3d in gt_bboxes_3d
]
batch_centers2d = centers2d[0].new_zeros((bs, max_objs, 2))
batch_labels_3d = gt_labels_3d[0].new_zeros((bs, max_objs))
batch_gt_locations = \
gt_bboxes_3d[0].tensor.new_zeros((bs, max_objs, 3))
for i in range(bs):
inds[i, :num_ctrs[i]] = 1
batch_centers2d[i, :num_ctrs[i]] = centers2d[i]
batch_labels_3d[i, :num_ctrs[i]] = gt_labels_3d[i]
batch_gt_locations[i, :num_ctrs[i]] = \
gt_bboxes_3d[i].tensor[:, :3]
inds = inds.flatten()
batch_centers2d = batch_centers2d.view(-1, 2) * width_ratio
batch_gt_locations = batch_gt_locations.view(-1, 3)
        # filter out empty images that have no gt_bboxes_3d
gt_bboxes_3d = [
gt_bbox_3d for gt_bbox_3d in gt_bboxes_3d
if gt_bbox_3d.tensor.shape[0] > 0
]
gt_dimensions = torch.cat(
[gt_bbox_3d.tensor[:, 3:6] for gt_bbox_3d in gt_bboxes_3d])
gt_orientations = torch.cat([
gt_bbox_3d.tensor[:, 6].unsqueeze(-1)
for gt_bbox_3d in gt_bboxes_3d
])
gt_corners = torch.cat(
[gt_bbox_3d.corners for gt_bbox_3d in gt_bboxes_3d])
target_labels = dict(
gt_centers2d=batch_centers2d.long(),
gt_labels3d=batch_labels_3d,
indexs=inds,
reg_indexs=reg_inds,
gt_locs=batch_gt_locations,
gt_dims=gt_dimensions,
gt_yaws=gt_orientations,
gt_cors=gt_corners)
return center_heatmap_target, avg_factor, target_labels
def loss(self,
cls_scores,
bbox_preds,
gt_bboxes,
gt_labels,
gt_bboxes_3d,
gt_labels_3d,
centers2d,
depths,
attr_labels,
img_metas,
gt_bboxes_ignore=None):
"""Compute loss of the head.
Args:
cls_scores (list[Tensor]): Box scores for each scale level.
shape (num_gt, 4).
bbox_preds (list[Tensor]): Box dims is a 4D-tensor, the channel
number is bbox_code_size.
shape (B, 7, H, W).
gt_bboxes (list[Tensor]): Ground truth bboxes for each image.
shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
gt_labels (list[Tensor]): Class indices corresponding to each box.
shape (num_gts, ).
gt_bboxes_3d (list[:obj:`CameraInstance3DBoxes`]): 3D boxes ground
truth. it is the flipped gt_bboxes
gt_labels_3d (list[Tensor]): Same as gt_labels.
centers2d (list[Tensor]): 2D centers on the image.
shape (num_gts, 2).
depths (list[Tensor]): Depth ground truth.
shape (num_gts, ).
attr_labels (list[Tensor]): Attributes indices of each box.
In kitti it's None.
img_metas (list[dict]): Meta information of each image, e.g.,
image size, scaling factor, etc.
gt_bboxes_ignore (None | list[Tensor]): Specify which bounding
boxes can be ignored when computing the loss.
Default: None.
Returns:
dict[str, Tensor]: A dictionary of loss components.
"""
assert len(cls_scores) == len(bbox_preds) == 1
assert attr_labels is None
assert gt_bboxes_ignore is None
center2d_heatmap = cls_scores[0]
pred_reg = bbox_preds[0]
center2d_heatmap_target, avg_factor, target_labels = \
self.get_targets(gt_bboxes, gt_labels, gt_bboxes_3d,
gt_labels_3d, centers2d,
center2d_heatmap.shape,
img_metas[0]['pad_shape'],
img_metas)
pred_bboxes = self.get_predictions(
labels3d=target_labels['gt_labels3d'],
centers2d=target_labels['gt_centers2d'],
gt_locations=target_labels['gt_locs'],
gt_dimensions=target_labels['gt_dims'],
gt_orientations=target_labels['gt_yaws'],
indexs=target_labels['indexs'],
img_metas=img_metas,
pred_reg=pred_reg)
loss_cls = self.loss_cls(
center2d_heatmap, center2d_heatmap_target, avg_factor=avg_factor)
reg_inds = target_labels['reg_indexs']
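        # disentangled corner losses: each term swaps in one predicted
        # component (orientation / dimensions / location) while keeping the
        # other two at ground truth, and compares the resulting 3D box
        # corners with the ground truth corners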
loss_bbox_oris = self.loss_bbox(
pred_bboxes['ori'].corners[reg_inds, ...],
target_labels['gt_cors'][reg_inds, ...])
loss_bbox_dims = self.loss_bbox(
pred_bboxes['dim'].corners[reg_inds, ...],
target_labels['gt_cors'][reg_inds, ...])
loss_bbox_locs = self.loss_bbox(
pred_bboxes['loc'].corners[reg_inds, ...],
target_labels['gt_cors'][reg_inds, ...])
loss_bbox = loss_bbox_dims + loss_bbox_locs + loss_bbox_oris
loss_dict = dict(loss_cls=loss_cls, loss_bbox=loss_bbox)
return loss_dict
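# Usage sketch (illustrative only; the argument values mirror the unit test
# below rather than the canonical KITTI SMOKE config, and `feat` /
# `img_metas` are hypothetical inputs):
#
#   head = SMOKEMono3DHead(
#       num_classes=3, in_channels=64, dim_channel=[3, 4, 5],
#       ori_channel=[6, 7],
#       bbox_coder=dict(type='SMOKECoder', base_depth=(28.01, 16.32),
#                       base_dims=((0.88, 1.73, 0.67),), code_size=7),
#       ...)
#   cls_scores, bbox_preds = head([feat])  # SMOKE uses one feature level
#   results = head.get_bboxes(cls_scores, bbox_preds, img_metas)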
......@@ -358,6 +358,103 @@ def test_vote_head():
assert results[0][2].shape[0] >= 0
def test_smoke_mono3d_head():
head_cfg = dict(
type='SMOKEMono3DHead',
num_classes=3,
in_channels=64,
dim_channel=[3, 4, 5],
ori_channel=[6, 7],
stacked_convs=0,
feat_channels=64,
use_direction_classifier=False,
diff_rad_by_sin=False,
pred_attrs=False,
pred_velo=False,
dir_offset=0,
strides=None,
group_reg_dims=(8, ),
cls_branch=(256, ),
reg_branch=((256, ), ),
num_attrs=0,
bbox_code_size=7,
dir_branch=(),
attr_branch=(),
bbox_coder=dict(
type='SMOKECoder',
base_depth=(28.01, 16.32),
base_dims=((0.88, 1.73, 0.67), (1.78, 1.70, 0.58), (3.88, 1.63,
1.53)),
code_size=7),
loss_cls=dict(type='GaussianFocalLoss', loss_weight=1.0),
loss_bbox=dict(type='L1Loss', reduction='sum', loss_weight=1 / 300),
loss_dir=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
loss_attr=None,
conv_bias=True,
dcn_on_last_conv=False)
self = build_head(head_cfg)
feats = [torch.rand([2, 64, 32, 32], dtype=torch.float32)]
# test forward
ret_dict = self(feats)
assert len(ret_dict) == 2
assert len(ret_dict[0]) == 1
assert ret_dict[0][0].shape == torch.Size([2, 3, 32, 32])
assert ret_dict[1][0].shape == torch.Size([2, 8, 32, 32])
# test loss
gt_bboxes = [
torch.Tensor([[1.0, 2.0, 20.0, 40.0], [45.0, 50.0, 80.0, 70.1],
[34.0, 39.0, 65.0, 64.0]]),
torch.Tensor([[11.0, 22.0, 29.0, 31.0], [41.0, 55.0, 60.0, 99.0],
[29.0, 29.0, 65.0, 56.0]])
]
gt_bboxes_3d = [
CameraInstance3DBoxes(torch.rand([3, 7]), box_dim=7),
CameraInstance3DBoxes(torch.rand([3, 7]), box_dim=7)
]
gt_labels = [torch.randint(0, 3, [3]) for i in range(2)]
gt_labels_3d = gt_labels
centers2d = [torch.randint(0, 60, (3, 2)), torch.randint(0, 40, (3, 2))]
depths = [
torch.rand([3], dtype=torch.float32),
torch.rand([3], dtype=torch.float32)
]
attr_labels = None
img_metas = [
dict(
cam_intrinsic=[[
1260.8474446004698, 0.0, 807.968244525554, 40.1111
], [0.0, 1260.8474446004698, 495.3344268742088, 2.34422],
[0.0, 0.0, 1.0, 0.00333333], [0.0, 0.0, 0.0, 1.0]],
scale_factor=np.array([1., 1., 1., 1.], dtype=np.float32),
pad_shape=[128, 128],
trans_mat=np.array([[0.25, 0., 0.], [0., 0.25, 0], [0., 0., 1.]],
dtype=np.float32),
affine_aug=False,
box_type_3d=CameraInstance3DBoxes) for i in range(2)
]
losses = self.loss(*ret_dict, gt_bboxes, gt_labels, gt_bboxes_3d,
gt_labels_3d, centers2d, depths, attr_labels, img_metas)
assert losses['loss_cls'] >= 0
assert losses['loss_bbox'] >= 0
    # test get_bboxes
results = self.get_bboxes(*ret_dict, img_metas)
assert len(results) == 2
assert len(results[0]) == 4
assert results[0][0].tensor.shape == torch.Size([100, 7])
assert results[0][1].shape == torch.Size([100])
assert results[0][2].shape == torch.Size([100])
assert results[0][3] is None
def test_parta2_bbox_head():
parta2_bbox_head_cfg = _get_parta2_bbox_head_cfg(
'./parta2/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class.py')
......
......@@ -3,7 +3,8 @@ import torch
from mmcv.cnn import Scale
from torch import nn as nn
from mmdet3d.core.bbox import DepthInstance3DBoxes, LiDARInstance3DBoxes
from mmdet3d.core.bbox import (CameraInstance3DBoxes, DepthInstance3DBoxes,
LiDARInstance3DBoxes)
from mmdet.core import build_bbox_coder
......@@ -564,3 +565,34 @@ def test_pgd_bbox_coder():
])
assert torch.allclose(
loguniform_prob_depth_preds, expected_preds, atol=1e-3)
def test_smoke_bbox_coder():
bbox_coder_cfg = dict(
type='SMOKECoder',
base_depth=(28.01, 16.32),
base_dims=((3.88, 1.63, 1.53), (1.78, 1.70, 0.58), (0.88, 1.73, 0.67)),
code_size=7)
bbox_coder = build_bbox_coder(bbox_coder_cfg)
regression = torch.rand([200, 8])
points = torch.rand([200, 2])
labels = torch.ones([2, 100])
cam2imgs = torch.rand([2, 4, 4])
trans_mats = torch.rand([2, 3, 3])
img_metas = [dict(box_type_3d=CameraInstance3DBoxes) for i in range(2)]
locations, dimensions, orientations = bbox_coder.decode(
regression, points, labels, cam2imgs, trans_mats)
assert locations.shape == torch.Size([200, 3])
assert dimensions.shape == torch.Size([200, 3])
assert orientations.shape == torch.Size([200, 1])
bboxes = bbox_coder.encode(locations, dimensions, orientations, img_metas)
assert bboxes.tensor.shape == torch.Size([200, 7])
    # specifically designed to test the orientation decoding function's
    # special cases
ori_vector = torch.tensor([[-0.9, -0.01], [-0.9, 0.01]])
locations = torch.tensor([[15., 2., 1.], [15., 2., -1.]])
orientations = bbox_coder._decode_orientation(ori_vector, locations)
assert orientations.shape == torch.Size([2, 1])