Unverified commit f84a17b8 authored by MilkClouds, committed by GitHub

Label visualization (#1050)

* argument show, score_thr added to single_gpu_test

* Implemented label color visualization for show_result function

* Added show, score_thr arguments for the 3 base models (mmdetection3d)

* Fixed typo (color < 1) in show_result function

* Applied pre-commit run --all-files

* Revised documentation of show_result and renamed a variable

* Updated documentation and set default value of score_thr to None
parent 5f1366ce
@@ -5,4 +5,4 @@ authors:
title: "OpenMMLab's Next-generation Platform for General 3D Object Detection"
date-released: 2020-07-23
url: "https://github.com/open-mmlab/mmdetection3d"
license: Apache-2.0
@@ -5,7 +5,7 @@
<!-- [ALGORITHM] -->
We implement the monocular 3D detector ImVoxelNet and provide its results and checkpoints on the KITTI dataset.
Results for SUN RGB-D, ScanNet and nuScenes are currently available in the ImVoxelNet authors'
[repo](https://github.com/saic-vul/imvoxelnet) (based on mmdetection3d).
@@ -18,7 +18,7 @@ The overall process could be achieved through the following script
```bash
python batch_load_scannet_data.py
python extract_posed_images.py
cd ../..
python tools/create_data.py scannet --root-path ./data/scannet --out-dir ./data/scannet --extra-tag scannet
```
@@ -111,7 +111,7 @@ Next, we will elaborate on the details recorded in these info files.
- info['annotations'][i]['bbox']: 2D bounding box annotation (exterior rectangle of the projected 3D box), a 1x4 list following [x1, y1, x2-x1, y2-y1],
  where x1/y1 are the minimum coordinates along the horizontal/vertical directions of the image.
- info['annotations'][i]['iscrowd']: Whether the region is crowded. Defaults to 0.
- info['annotations'][i]['bbox_cam3d']: 3D bounding box (gravity) center location (3), size (3), (global) yaw angle (1), 1x7 list.
- info['annotations'][i]['velo_cam3d']: Velocities of 3D bounding boxes (no vertical measurements due to inaccuracy), an Nx2 array.
- info['annotations'][i]['center2d']: Projected 3D-center containing 2.5D information: projected center location on the image (2) and depth (1), 1x3 list.
- info['annotations'][i]['attribute_name']: Attribute name.
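As a quick illustration of consuming the fields documented above, here is a minimal sketch; the info file name is only a placeholder for whatever `tools/create_data.py` produced in your setup, and `mmcv.load` is used simply as a JSON reader.

```python
# Minimal sketch: iterate over the annotation fields documented above.
# The file name is a placeholder for your generated info file.
import mmcv

info = mmcv.load('./data/nuscenes/nuscenes_infos_train_mono3d.coco.json')
for ann in info['annotations']:
    x1, y1, w, h = ann['bbox']        # 2D box: [x1, y1, x2 - x1, y2 - y1]
    cx, cy, depth = ann['center2d']   # projected center (2) + depth (1)
    box_cam3d = ann['bbox_cam3d']     # center (3), size (3), yaw (1)
    print(ann['category_name'], depth, box_cam3d)
```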
# Benchmarks
# Changelog
# LiDAR-Based 3D Detection
# LiDAR-Based 3D Semantic Segmentation
# Vision-Based 3D Detection
@@ -44,7 +44,12 @@ def single_gpu_test(model,
    models_3d = (Base3DDetector, Base3DSegmentor,
                 SingleStageMono3DDetector)
    if isinstance(model.module, models_3d):
        model.module.show_results(
            data,
            result,
            out_dir=out_dir,
            show=show,
            score_thr=show_score_thr)
    # Visualize the results of the MMDetection model
    # ('show_result' is the MMDetection visualization API)
    else:
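For context, a hedged sketch of invoking the extended test loop; it assumes the `single_gpu_test(model, data_loader, show, out_dir, show_score_thr)` signature implied by this hunk, and `model`/`data_loader` are placeholders built elsewhere with the usual mmdet3d builders.

```python
# Hedged usage sketch: model and data_loader are assumed to be built
# beforehand; the output directory and the 0.3 threshold are illustrative.
from mmdet3d.apis import single_gpu_test

outputs = single_gpu_test(
    model,                  # MMDataParallel-wrapped detector/segmentor
    data_loader,
    show=True,              # open an Open3D window per sample
    out_dir='./show_dir',   # where visualization files are written
    show_score_thr=0.3)     # hide boxes scored below 0.3
```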
@@ -78,7 +78,8 @@ def show_result(points,
                out_dir,
                filename,
                show=False,
                snapshot=False,
                pred_labels=None):
    """Convert results into a format that is directly readable by MeshLab.

    Args:
@@ -87,8 +88,11 @@ def show_result(points,
        pred_bboxes (np.ndarray): Predicted boxes.
        out_dir (str): Path of the output directory.
        filename (str): Filename of the current frame.
        show (bool, optional): Visualize the results online.
            Defaults to False.
        snapshot (bool, optional): Whether to save the online results.
            Defaults to False.
        pred_labels (np.ndarray, optional): Predicted labels of boxes.
            Defaults to None.
    """
    result_path = osp.join(out_dir, filename)
    mmcv.mkdir_or_exist(result_path)
@@ -98,7 +102,23 @@
    vis = Visualizer(points)
    if pred_bboxes is not None:
        if pred_labels is None:
            vis.add_bboxes(bbox3d=pred_bboxes)
        else:
            # One random RGB color per label id, scaled into [0, 1).
            palette = np.random.randint(
                0, 255, size=(pred_labels.max() + 1, 3)) / 256
            labelDict = {}
            # Group the predicted boxes by label id.
            for j in range(len(pred_labels)):
                i = int(pred_labels[j].numpy())
                if labelDict.get(i) is None:
                    labelDict[i] = []
                labelDict[i].append(pred_bboxes[j])
            # Draw each label group in its own color.
            for i in labelDict:
                vis.add_bboxes(
                    bbox3d=np.array(labelDict[i]),
                    bbox_color=palette[i],
                    points_in_box_color=palette[i])
    if gt_bboxes is not None:
        vis.add_bboxes(bbox3d=gt_bboxes, bbox_color=(0, 0, 1))
    show_path = osp.join(result_path,
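To exercise the new `pred_labels` path, a hedged sketch with synthetic inputs; the import path and all values are assumptions, and `pred_labels` is a torch tensor because the grouping loop above calls `.numpy()` on its elements.

```python
# Synthetic stand-ins for points, boxes and labels; with show=False the
# results are only written under out_dir, so no display is required.
# Set show=True on a machine with a display to see the per-label colors.
import numpy as np
import torch
from mmdet3d.core import show_result

points = np.random.rand(1024, 3).astype(np.float32)
pred_bboxes = np.array(
    [[0.0, 0.0, 0.0, 4.0, 1.6, 1.5, 0.0],   # a car-sized box
     [3.0, 1.0, 0.0, 0.6, 0.8, 1.7, 0.0]])  # a pedestrian-sized box
pred_labels = torch.tensor([2, 0])
show_result(points, None, pred_bboxes, './show_dir', 'demo_frame',
            show=False, snapshot=False, pred_labels=pred_labels)
```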
@@ -60,13 +60,18 @@ class Base3DDetector(BaseDetector):
        else:
            return self.forward_test(**kwargs)

    def show_results(self, data, result, out_dir, show=False, score_thr=None):
        """Results visualization.

        Args:
            data (list[dict]): Input points and the information of the
                sample.
            result (list[dict]): Prediction results.
            out_dir (str): Output directory of visualization result.
            show (bool, optional): Whether to visualize the results online
                with Open3D. Defaults to False.
            score_thr (float, optional): Score threshold of bounding boxes.
                Defaults to None.
        """
        for batch_id in range(len(result)):
            if isinstance(data['points'][0], DC):
@@ -93,6 +98,12 @@ class Base3DDetector(BaseDetector):
            assert out_dir is not None, 'Expect out_dir, got none.'

            pred_bboxes = result[batch_id]['boxes_3d']
            pred_labels = result[batch_id]['labels_3d']

            if score_thr is not None:
                # Keep only boxes whose score exceeds the threshold.
                mask = result[batch_id]['scores_3d'] > score_thr
                pred_bboxes = pred_bboxes[mask]
                pred_labels = pred_labels[mask]

            # for now we convert points and bbox into depth mode
            if (box_mode_3d == Box3DMode.CAM) or (box_mode_3d
@@ -105,4 +116,11 @@
                raise ValueError(
                    f'Unsupported box_mode_3d {box_mode_3d} for conversion!')

            pred_bboxes = pred_bboxes.tensor.cpu().numpy()
            show_result(
                points,
                None,
                pred_bboxes,
                out_dir,
                file_name,
                show=show,
                pred_labels=pred_labels)
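To make the thresholding step above concrete in isolation, a standalone sketch with synthetic tensors; the `scores_3d`/`labels_3d` keys mirror the result dict this method indexes, while the values are made up.

```python
# Standalone sketch of the score_thr filter: build a boolean mask from
# the scores and index the labels (or boxes) with it.
import torch

result = {
    'scores_3d': torch.tensor([0.9, 0.2, 0.6]),
    'labels_3d': torch.tensor([0, 1, 2]),
}
mask = result['scores_3d'] > 0.5     # keep confident predictions only
print(result['labels_3d'][mask])     # tensor([0, 2])
```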
@@ -178,13 +178,20 @@ class SingleStageMono3DDetector(SingleStageDetector):
        return [bbox_list]

    def show_results(self, data, result, out_dir, show=False, score_thr=None):
        """Results visualization.

        Args:
            data (list[dict]): Input images and the information of the
                sample.
            result (list[dict]): Prediction results.
            out_dir (str): Output directory of visualization result.
            show (bool, optional): Whether to visualize the results online
                with Open3D. Defaults to False.
            score_thr (float, optional): Score threshold of bounding boxes.
                Defaults to None. Not implemented for this detector yet;
                kept for interface unification.
                TODO: implement score_thr for single_stage_mono3d.
        """
        for batch_id in range(len(result)):
            if isinstance(data['img_metas'][0], DC):
@@ -215,4 +222,4 @@ class SingleStageMono3DDetector(SingleStageDetector):
                out_dir,
                file_name,
                'camera',
                show=show)
@@ -72,7 +72,9 @@ class Base3DSegmentor(BaseSegmentor):
                     result,
                     palette=None,
                     out_dir=None,
                     ignore_index=None,
                     show=False,
                     score_thr=None):
        """Results visualization.

        Args:
@@ -85,6 +87,13 @@ class Base3DSegmentor(BaseSegmentor):
            ignore_index (int, optional): The label index to be ignored, e.g.
                unannotated points. If None is given, set to len(self.CLASSES).
                Defaults to None.
            show (bool, optional): Whether to visualize the results online
                with Open3D. Defaults to False.
            score_thr (float, optional): Score threshold of bounding boxes.
                Defaults to None. Not implemented for this segmentor yet;
                kept for interface unification.
                TODO: implement score_thr for Base3DSegmentor.
        """
        assert out_dir is not None, 'Expect out_dir, got none.'
        if palette is None:
@@ -123,4 +132,4 @@
                file_name,
                palette,
                ignore_index,
                show=show)
{"images": [{"file_name": "training/image_2/000007.png", "id": 7, "Tri2v": [[0.9999976, 0.0007553071, -0.002035826, -0.8086759], [-0.0007854027, 0.9998898, -0.01482298, 0.3195559], [0.002024406, 0.01482454, 0.9998881, -0.7997231], [0.0, 0.0, 0.0, 1.0]], "Trv2c": [[0.007533745, -0.9999714, -0.000616602, -0.004069766], [0.01480249, 0.0007280733, -0.9998902, -0.07631618], [0.9998621, 0.00752379, 0.01480755, -0.2717806], [0.0, 0.0, 0.0, 1.0]], "rect": [[0.9999239, 0.00983776, -0.007445048, 0.0], [-0.009869795, 0.9999421, -0.004278459, 0.0], [0.007402527, 0.004351614, 0.9999631, 0.0], [0.0, 0.0, 0.0, 1.0]], "cam_intrinsic": [[721.5377, 0.0, 609.5593, 44.85728], [0.0, 721.5377, 172.854, 0.2163791], [0.0, 0.0, 1.0, 0.002745884], [0.0, 0.0, 0.0, 1.0]], "width": 1242, "height": 375}], "annotations": [{"file_name": "training/image_2/000007.png", "image_id": 7, "area": 2556.023616260146, "category_name": "Car", "category_id": 2, "bbox": [565.4822720402807, 175.01202566042497, 51.17323679197273, 49.94844525177848], "iscrowd": 0, "bbox_cam3d": [-0.627830982208252, 0.8849999904632568, 25.010000228881836, 3.200000047683716, 1.6100000143051147, 1.659999966621399, -1.590000033378601], "velo_cam3d": -1, "center2d": [591.3814672167642, 198.3730937263457, 25.012745884], "attribute_name": -1, "attribute_id": -1, "segmentation": [], "id": 2}, {"file_name": "training/image_2/000007.png", "image_id": 7, "area": 693.1538564468428, "category_name": "Car", "category_id": 2, "bbox": [481.8496708488522, 179.85710612050596, 30.55976691329198, 22.681909139344754], "iscrowd": 0, "bbox_cam3d": [-7.367831230163574, 1.1799999475479126, 47.54999923706055, 3.700000047683716, 1.399999976158142, 1.5099999904632568, 1.5499999523162842], "velo_cam3d": -1, "center2d": [497.72892067550754, 190.75320250122618, 47.552745884], "attribute_name": -1, "attribute_id": -1, "segmentation": [], "id": 3}, {"file_name": "training/image_2/000007.png", "image_id": 7, "area": 419.21693566410073, "category_name": "Car", "category_id": 2, "bbox": [542.2247151650495, 175.73341152322814, 23.019633917835904, 18.211277258379255], "iscrowd": 0, "bbox_cam3d": [-4.647830963134766, 0.9800000190734863, 60.52000045776367, 4.050000190734863, 1.4600000381469727, 1.659999966621399, 1.559999942779541], "velo_cam3d": -1, "center2d": [554.1213152040074, 184.53305847203026, 60.522745884], "attribute_name": -1, "attribute_id": -1, "segmentation": [], "id": 4}, {"file_name": "training/image_2/000007.png", "image_id": 7, "area": 928.9555081918186, "category_name": "Cyclist", "category_id": 1, "bbox": [330.84191493374504, 176.13804311926262, 24.65593879860404, 37.67674456769879], "iscrowd": 0, "bbox_cam3d": [-12.567831039428711, 1.0199999809265137, 34.09000015258789, 1.9500000476837158, 1.7200000286102295, 0.5, 1.5399999618530273], "velo_cam3d": -1, "center2d": [343.52506265845847, 194.43366972124528, 34.092745884], "attribute_name": -1, "attribute_id": -1, "segmentation": [], "id": 5}], "categories": [{"id": 0, "name": "Pedestrian"}, {"id": 1, "name": "Cyclist"}, {"id": 2, "name": "Car"}]}