Unverified Commit 6ce7e7e9 authored by Shaoshuai Shi's avatar Shaoshuai Shi Committed by GitHub

Merge pull request #1091 from Cedarch/master

add readme and update yaml of MPPNet
parents ef7da7dd 85f1d0d5
......@@ -22,7 +22,7 @@ It is also the official code release of [`[PointRCNN]`](https://arxiv.org/abs/18
## Changelog
[2022-09-02] **NEW:** Update `OpenPCDet` to v0.6.0:
* Official code release of [MPPNet](https://arxiv.org/abs/2205.05979) for temporal 3D object detection, which supports long-term multi-frame 3D object detection and ranks 1st place on the 3D detection leaderboard of Waymo Open Dataset (see the [guideline](docs/guidelines_of_approaches/mppnet.md) on how to train/test with MPPNet).
* Official code release of [MPPNet](https://arxiv.org/abs/2205.05979) for temporal 3D object detection, which supports long-term multi-frame 3D object detection and ranks 1st place on the [3D detection leaderboard](https://waymo.com/open/challenges/2020/3d-detection) of Waymo Open Dataset as of Sept. 2, 2022 (see the [guideline](docs/guidelines_of_approaches/mppnet.md) on how to train/test with MPPNet).
* Support multi-frame training/testing on Waymo Open Dataset (see the [change log](docs/changelog.md) for more details on how to process data).
* Support saving dynamic training details (e.g., loss, iter, epoch) to file (the previous tqdm progress bar is still supported via `--use_tqdm_to_record`). Please run `pip install gpustat` if you also want to log GPU-related information.
* Support saving the latest model every 5 minutes, so you can resume training from the latest status instead of the previous epoch.
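
The time-based "save latest" behavior can be sketched as follows; `train_with_periodic_save` and `save_fn` are hypothetical stand-ins for illustration, not OpenPCDet's actual trainer API:

```python
import time

def train_with_periodic_save(num_iters, save_fn, interval_sec=300):
    """Run a training loop, saving a 'latest' checkpoint every interval_sec seconds."""
    last_save = time.time()
    for it in range(num_iters):
        # ... one training step would go here ...
        if time.time() - last_save >= interval_sec:
            save_fn(it)          # overwrite the latest checkpoint with the current state
            last_save = time.time()
    save_fn(num_iters - 1)       # always save once more at the end

# usage: record which iterations triggered a save (interval 0 saves every step)
saved = []
train_with_periodic_save(10, saved.append, interval_sec=0.0)
```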
......
# The guideline of MPPNet will be available soon
\ No newline at end of file
## Installation
Please refer to [INSTALL.md](docs/INSTALL.md) for the installation of `OpenPCDet`.
## Data Preparation
Please refer to [GETTING_STARTED.md](docs/GETTING_STARTED.md) to process the Waymo Open Dataset.
## Training
1. Train the RPN model for MPPNet (centerpoint_4frames is employed in the paper)
```shell
bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file cfgs/waymo_models/centerpoint_4frames.yaml
```
The checkpoints will be saved in `../output/waymo_models/centerpoint_4frames/default/ckpt`.
2. Save the RPN model's prediction results on the train and val splits
```shell
# training
bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/waymo_models/mppnet_4frames.yaml \
--ckpt ../output/waymo_models/centerpoint_4frames/default/ckpt/checkpoint_epoch_36.pth \
--set DATA_CONFIG.DATA_SPLIT.test train
# val
bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/waymo_models/mppnet_4frames.yaml \
--ckpt ../output/waymo_models/centerpoint_4frames/default/ckpt/checkpoint_epoch_36.pth \
--set DATA_CONFIG.DATA_SPLIT.test val
```
The prediction results of the train and val splits will be saved in
`../output/waymo_models/centerpoint_4frames/default/eval/epoch_36/train/default/result.pkl` and
`../output/waymo_models/centerpoint_4frames/default/eval/epoch_36/val/default/result.pkl`, respectively.
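
These `result.pkl` files are plain pickle files; below is a minimal round-trip sketch for inspecting one. The per-entry fields shown are illustrative assumptions, not OpenPCDet's exact schema:

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for one frame's detections; the real file written by
# test.py is a list of per-frame prediction dicts.
fake_results = [{"frame_id": "000000", "boxes_lidar": [[0.0] * 7], "score": [0.9]}]

path = os.path.join(tempfile.gettempdir(), "result.pkl")
with open(path, "wb") as f:
    pickle.dump(fake_results, f)

with open(path, "rb") as f:
    results = pickle.load(f)

print(len(results), results[0]["frame_id"])  # → 1 000000
```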
3. Train MPPNet (using mppnet_4frames as an example)
```shell
bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file cfgs/waymo_models/mppnet_4frames.yaml --batch_size 2 \
--set DATA_CONFIG.ROI_BOXES_PATH.train ../output/waymo_models/centerpoint_4frames/default/eval/epoch_36/train/default/result.pkl \
DATA_CONFIG.ROI_BOXES_PATH.test ../output/waymo_models/centerpoint_4frames/default/eval/epoch_36/val/default/result.pkl
```
When using 16 frames, simply change the `cfg_file` to mppnet_16frames.yaml; `DATA_CONFIG.ROI_BOXES_PATH` is the same as in the 4-frame setting.
You can also write the paths of the train and val results into `ROI_BOXES_PATH` in mppnet_4frames.yaml and mppnet_16frames.yaml to avoid passing the `--set` flags.
For each GPU, BATCH_SIZE should be at least 2. When using 16 frames, the reference GPU memory consumption is 29G with BATCH_SIZE=2.
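
For the config-file alternative, the fragment below is a sketch of what the entry could look like in mppnet_4frames.yaml; the key layout is inferred from the `--set` options above, so check it against the shipped config:

```yaml
DATA_CONFIG:
    ROI_BOXES_PATH: {
        'train': ../output/waymo_models/centerpoint_4frames/default/eval/epoch_36/train/default/result.pkl,
        'test': ../output/waymo_models/centerpoint_4frames/default/eval/epoch_36/val/default/result.pkl
    }
```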
## Evaluation
* Test with a pretrained model:
```shell
# Single GPU
python test.py --cfg_file cfgs/waymo_models/mppnet_4frames.yaml --batch_size 1 \
--ckpt ../output/waymo_models/mppnet_4frames/default/ckpt/checkpoint_epoch_6.pth
# Multiple GPUs
bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/waymo_models/mppnet_4frames.yaml --batch_size 1 \
--ckpt ../output/waymo_models/mppnet_4frames/default/ckpt/checkpoint_epoch_6.pth
```
To avoid OOM, set BATCH_SIZE=1.
* Test with a memory bank to improve efficiency:
```shell
# Currently, only 1 GPU with batch_size 1 is supported
python test.py --cfg_file cfgs/waymo_models/mppnet_e2e_memorybank_inference.yaml --batch_size 1 \
--ckpt ../output/waymo_models/mppnet_4frames/default/ckpt/checkpoint_epoch_6.pth \
--pretrained_model ../output/waymo_models/centerpoint_4frames/default/ckpt/checkpoint_epoch_36.pth
```
The default parameters in mppnet_e2e_memorybank_inference.yaml are for the 4-frame setting; change them to the values in mppnet_16frames.yaml when using 16 frames.
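
The 4-frame and 16-frame settings differ mainly in how many past frames are aggregated (`SAMPLE_OFFSET: [-3, 0]` vs. `[-15, 0]` in the dataset config); assuming Waymo's 10 Hz frame rate, a small sketch of the relative time stamps each setting implies:

```python
def frame_time_stamps(sample_offset):
    """Relative time stamps (seconds) for an aggregated frame sequence.

    Waymo frames arrive at 10 Hz, so consecutive frames are 0.1 s apart;
    sample_offset follows the SAMPLE_OFFSET convention, e.g. [-3, 0].
    """
    num_frames = sample_offset[1] - sample_offset[0] + 1
    return [round(i * 0.1, 1) for i in range(num_frames)]

print(frame_time_stamps([-3, 0]))   # 4-frame setting: [0.0, 0.1, 0.2, 0.3]
print(frame_time_stamps([-15, 0]))  # 16-frame setting: 16 stamps, 0.0 up to 1.5
```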
......@@ -726,11 +726,11 @@ class MPPNetHead(RoIHeadTemplate):
point_cls_list = []
point_reg_list = []
for i in range(3):
for i in range(self.num_enc_layer):
point_cls_list.append(self.class_embed[0](tokens[i][0]))
for i in range(hs.shape[0]):
for j in range(3):
for j in range(self.num_enc_layer):
point_reg_list.append(self.bbox_embed[i](tokens[j][i]))
point_cls = torch.cat(point_cls_list,0)
......
......@@ -225,11 +225,7 @@ class MPPNetHeadE2E(RoIHeadTemplate):
time_stamp[:,i,:] = i*0.1
box_seq = torch.cat([trajectory_rois[:,:,:,:7],time_stamp],-1)
# box_seq_time = box_seq
if self.model_cfg.USE_BOX_ENCODING.NORM_T0:
# canonical transformation
box_seq[:, :, :,0:3] = box_seq[:, :, :,0:3] - box_seq[:, 0:1, :, 0:3]
box_seq[:, :, :,0:3] = box_seq[:, :, :,0:3] - box_seq[:, 0:1, :, 0:3]
roi_ry = box_seq[:,:,:,6] % (2 * np.pi)
......@@ -241,17 +237,10 @@ class MPPNetHeadE2E(RoIHeadTemplate):
points=box_seq.view(-1, 1, box_seq.shape[-1]), angle=-roi_ry_t0.view(-1)
).view(box_seq.shape[0],box_seq.shape[1], -1, box_seq.shape[-1])
if self.model_cfg.USE_BOX_ENCODING.ALL_YAW_T0:
box_seq[:, :, :, 6] = 0
else:
box_seq[:, 0:1, :, 6] = 0
box_seq[:, 1:, :, 6] = roi_ry[:, 1:, ] - roi_ry[:,0:1]
box_seq[:,:,:,6] = 0
batch_rcnn = box_seq.shape[0]*box_seq.shape[2]
box_reg, box_feat, _ = self.seqboxembed(box_seq.permute(0,2,3,1).contiguous().view(batch_rcnn,box_seq.shape[-1],box_seq.shape[1]))
return box_reg, box_feat
......@@ -387,7 +376,6 @@ class MPPNetHeadE2E(RoIHeadTemplate):
rois_list.append(rois)
batch_rois = self.reorder_rois_for_refining(rois_list)
batch_dict['roi_scores'] = batch_rois[None,:,:,9]
batch_dict['roi_labels'] = batch_rois[None,:,:,10]
......@@ -493,12 +481,11 @@ class MPPNetHeadE2E(RoIHeadTemplate):
hs, tokens = self.transformer(src,pos=pos)
point_cls_list = []
for i in range(3):
for i in range(self.num_enc_layer):
point_cls_list.append(self.class_embed[0](tokens[i][0]))
point_cls = torch.cat(point_cls_list,0)
hs = hs.permute(1,0,2).reshape(hs.shape[1],-1)
_, feat_box = self.trajectories_auxiliary_branch(trajectory_rois)
......
......@@ -77,7 +77,7 @@ MODEL:
OPTIMIZATION:
BATCH_SIZE_PER_GPU: 4
NUM_EPOCHS: 30
NUM_EPOCHS: 36
OPTIMIZER: adam_onecycle
LR: 0.003
......
......@@ -63,11 +63,8 @@ MODEL:
AVG_STAGE1_SCORE: True
USE_TRAJ_EMPTY_MASK: True
USE_AUX_LOSS: True
USE_MLP_JOINTEMB: False
IOU_WEIGHT: [0.5,0.4]
ROI_GRID_POOL:
GRID_SIZE: 4
MLPS: [[64,64]]
......@@ -75,7 +72,6 @@ MODEL:
NSAMPLE: [16]
POOL_METHOD: max_pool
Transformer:
num_lidar_points: 128
num_proxy_points: 64 # GRID_SIZE*GRID_SIZE*GRID_SIZE
......@@ -92,7 +88,6 @@ MODEL:
use_grid_pos:
enabled: True
init_type: index
use_mlp_mixer:
enabled: True
hidden_dim: 16
......@@ -145,10 +140,9 @@ MODEL:
NMS_PRE_MAXSIZE: 4096
NMS_POST_MAXSIZE: 500
OPTIMIZATION:
BATCH_SIZE_PER_GPU: 4
NUM_EPOCHS: 3
BATCH_SIZE_PER_GPU: 2
NUM_EPOCHS: 6
OPTIMIZER: adam_onecycle
LR: 0.003
......
......@@ -63,7 +63,6 @@ MODEL:
AVG_STAGE1_SCORE: True
USE_TRAJ_EMPTY_MASK: True
USE_AUX_LOSS: True
USE_MLP_JOINTEMB: True
IOU_WEIGHT: [0.5,0.4]
......@@ -146,8 +145,8 @@ MODEL:
OPTIMIZATION:
BATCH_SIZE_PER_GPU: 4
NUM_EPOCHS: 3
BATCH_SIZE_PER_GPU: 2
NUM_EPOCHS: 6
OPTIMIZER: adam_onecycle
LR: 0.003
......
......@@ -2,59 +2,19 @@ CLASS_NAMES: ['Vehicle', 'Pedestrian', 'Cyclist']
DATA_CONFIG:
_BASE_CONFIG_: cfgs/dataset_configs/waymo_dataset.yaml
_BASE_CONFIG_: cfgs/dataset_configs/waymo_dataset_multiframe.yaml
PROCESSED_DATA_TAG: 'waymo_processed_data_v0_5_0'
SAMPLED_INTERVAL: {
'train': 1,
'test': 1
}
FILTER_EMPTY_BOXES_FOR_TRAIN: True
DISABLE_NLZ_FLAG_ON_POINTS: True
SEQUENCE_CONFIG:
ENABLED: True
USE_SPEED: True
SAMPLE_OFFSET: [-3, 0] # 16-frame uses [-15, 0]
POINT_FEATURE_ENCODING: {
encoding_type: absolute_coordinates_encoding,
used_feature_list: ['x', 'y', 'z', 'intensity', 'elongation','time'],
src_feature_list: ['x', 'y', 'z', 'intensity', 'elongation','time'],
src_feature_list: ['x', 'y', 'z', 'intensity', 'elongation','time'],
}
DATA_AUGMENTOR:
DISABLE_AUG_LIST: [ 'placeholder' ]
AUG_CONFIG_LIST:
- NAME: random_world_flip
ALONG_AXIS_LIST: [ 'x', 'y' ]
- NAME: random_world_rotation
WORLD_ROT_ANGLE: [ -0.78539816, 0.78539816 ]
- NAME: random_world_scaling
WORLD_SCALE_RANGE: [ 0.95, 1.05 ]
DATA_PROCESSOR:
- NAME: mask_points_and_boxes_outside_range
REMOVE_OUTSIDE_BOXES: True
- NAME: shuffle_points
SHUFFLE_ENABLED: {
'train': True,
'test': True
}
- NAME: transform_points_to_voxels
VOXEL_SIZE: [ 0.1, 0.1, 0.15 ]
MAX_POINTS_PER_VOXEL: 5
MAX_NUMBER_OF_VOXELS: {
'train': 150000,
'test': 150000
}
MODEL:
NAME: MPPNetE2E
......@@ -131,13 +91,12 @@ MODEL:
ENABLED: True
NORM_T0: True
ALL_YAW_T0: True
AVG_STAGE_1: True
AVG_STAGE1_SCORE: True
USE_TRAJ_EMPTY_MASK: True
USE_AUX_LOSS: True
USE_MLP_JOINTEMB: True
IOU_WEIGHT: [0.5,0.4]
ROI_GRID_POOL:
ROI_GRID_POOL: # if using 16-frame, change to the corresponding setting
GRID_SIZE: 4
MLPS: [[128,128], [128,128]]
POOL_RADIUS: [0.8, 1.6]
......@@ -214,8 +173,8 @@ MODEL:
OPTIMIZATION:
BATCH_SIZE_PER_GPU: 4
NUM_EPOCHS: 36
BATCH_SIZE_PER_GPU: 2
NUM_EPOCHS: 6
OPTIMIZER: adam_onecycle
LR: 0.003
......