Unverified commit aa753ec0 authored by Shaoshuai Shi, committed by GitHub

Update to OpenPCDet v0.6 #1087

 Merge pull request #1087 from sshaoshuai/dev_v0.6
parents 7e8bbe26 beb249e5
@@ -4,10 +4,10 @@
`OpenPCDet` is a clear, simple, self-contained open source project for LiDAR-based 3D object detection.

It is also the official code release of [`[PointRCNN]`](https://arxiv.org/abs/1812.04244), [`[Part-A2-Net]`](https://arxiv.org/abs/1907.03670), [`[PV-RCNN]`](https://arxiv.org/abs/1912.13192), [`[Voxel R-CNN]`](https://arxiv.org/abs/2012.15712), [`[PV-RCNN++]`](https://arxiv.org/abs/2102.00463) and [`[MPPNet]`](https://arxiv.org/abs/2205.05979).

**Highlights**:
* `OpenPCDet` has been updated to `v0.6.0` (Sep. 2022).
* The code of PV-RCNN++ is supported.

## Overview
@@ -21,6 +21,12 @@ It is also the official code release of [`[PointRCNN]`](https://arxiv.org/abs/18

## Changelog
[2022-09-02] **NEW:** Update `OpenPCDet` to v0.6.0:
* Official code release of [MPPNet](https://arxiv.org/abs/2205.05979) for temporal 3D object detection, which supports long-term multi-frame 3D object detection and ranks 1st on the 3D detection leaderboard of the Waymo Open Dataset (see the [guideline](docs/guidelines_of_approaches/mppnet.md) on how to train/test with MPPNet).
* Support multi-frame training/testing on the Waymo Open Dataset (see the [changelog](docs/changelog.md) for details on how to process the data).
* Support saving training details (e.g., loss, iter, epoch) to a file (the previous tqdm progress bar is still available via `--use_tqdm_to_record`). Run `pip install gpustat` if you also want to log GPU-related information.
* Support saving the latest model every 5 minutes, so training can be restored from the latest state instead of the previous epoch.
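The periodic-checkpoint behavior described above boils down to a time-gated save; a minimal sketch (illustrative only, not the actual OpenPCDet implementation — `save_fn` and the helper name are hypothetical):

```python
import time

CKPT_INTERVAL_SEC = 5 * 60  # save the latest model at most once every 5 minutes

def maybe_save_latest(save_fn, last_save_time, now=None):
    """Call save_fn() only if CKPT_INTERVAL_SEC has elapsed; return the new timestamp."""
    now = time.time() if now is None else now
    if now - last_save_time >= CKPT_INTERVAL_SEC:
        save_fn()
        return now
    return last_save_time
```

Called once per training iteration, this keeps a `latest_model.pth`-style checkpoint fresh without paying the save cost every step.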
[2022-08-22] Added support for [custom dataset tutorial and template](docs/CUSTOM_DATASET_TUTORIAL.md)

[2022-07-05] Added support for the 3D object detection backbone network [`Focals Conv`](https://openaccess.thecvf.com/content/CVPR2022/papers/Chen_Focal_Sparse_Convolutional_Networks_for_3D_Object_Detection_CVPR_2022_paper.pdf).
@@ -74,11 +74,15 @@ OpenPCDet
| | |── waymo_processed_data_v0_5_0
│ │ │ │── segment-xxxxxxxx/
| | | |── ...
│ │ │── waymo_processed_data_v0_5_0_gt_database_train_sampled_1/ (old, for single-frame)
│ │ │── waymo_processed_data_v0_5_0_waymo_dbinfos_train_sampled_1.pkl (old, for single-frame)
│ │ │── waymo_processed_data_v0_5_0_gt_database_train_sampled_1_global.npy (optional, old, for single-frame)
│ │ │── waymo_processed_data_v0_5_0_infos_train.pkl (optional)
│ │ │── waymo_processed_data_v0_5_0_infos_val.pkl (optional)
| | |── waymo_processed_data_v0_5_0_gt_database_train_sampled_1_multiframe_-4_to_0 (new, for single/multi-frame)
│ │ │── waymo_processed_data_v0_5_0_waymo_dbinfos_train_sampled_1_multiframe_-4_to_0.pkl (new, for single/multi-frame)
│ │ │── waymo_processed_data_v0_5_0_gt_database_train_sampled_1_multiframe_-4_to_0_global.npy (new, for single/multi-frame)
├── pcdet
├── tools
```
@@ -92,8 +96,13 @@ pip3 install waymo-open-dataset-tf-2-5-0 --user

* Extract point cloud data from the tfrecord files and generate data infos by running the following command (it takes several hours; you can check `data/waymo/waymo_processed_data_v0_5_0` to see how many records have been processed):
```python
# only for single-frame setting
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos \
    --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml
# for single-frame or multi-frame setting
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos \
--cfg_file tools/cfgs/dataset_configs/waymo_dataset_multiframe.yaml
# Ignore 'CUDA_ERROR_NO_DEVICE' error as this process does not require GPU.
```
# Changelog and Guidelines
### [2022-09-02] Update to v0.6.0:
* How to process data to support multi-frame training/testing on Waymo Open Dataset?
  * If you have never used OpenPCDet, you can directly follow the [GETTING_STARTED.md](GETTING_STARTED.md)
  * If you have been using a previous version of OpenPCDet (`v0.5`), follow these steps to update your data:
* Update your waymo infos (the `*.pkl` files for each sequence) by adding argument `--update_info_only`:
```
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml --update_info_only
```
    * Generate the multi-frame GT database for copy-paste augmentation of multi-frame training. A faster version with parallel data generation is available by adding `--use_parallel`, but you need to read the code and rename the files after getting the results.
```
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_gt_database --cfg_file tools/cfgs/dataset_configs/waymo_dataset_multiframe.yaml
```
    This will generate new files like the following (the last three lines under `data/waymo`):
```
OpenPCDet
├── data
│ ├── waymo
│ │ │── ImageSets
│ │ │── raw_data
│ │ │ │── segment-xxxxxxxx.tfrecord
| | | |── ...
| | |── waymo_processed_data_v0_5_0
│ │ │ │── segment-xxxxxxxx/
| | | |── ...
│ │ │── waymo_processed_data_v0_5_0_gt_database_train_sampled_1/
│ │ │── waymo_processed_data_v0_5_0_waymo_dbinfos_train_sampled_1.pkl
│ │ │── waymo_processed_data_v0_5_0_gt_database_train_sampled_1_global.npy (optional)
│ │ │── waymo_processed_data_v0_5_0_infos_train.pkl (optional)
│ │ │── waymo_processed_data_v0_5_0_infos_val.pkl (optional)
| | |── waymo_processed_data_v0_5_0_gt_database_train_sampled_1_multiframe_-4_to_0 (new)
│ │ │── waymo_processed_data_v0_5_0_waymo_dbinfos_train_sampled_1_multiframe_-4_to_0.pkl (new)
│ │ │── waymo_processed_data_v0_5_0_gt_database_train_sampled_1_multiframe_-4_to_0_global.npy (new, optional)
├── pcdet
├── tools
```
# Will be available soon
@@ -126,60 +126,6 @@ def random_image_flip_horizontal(image, depth_map, gt_boxes, calib):
    return aug_image, aug_depth_map, aug_gt_boxes
def random_translation_along_x(gt_boxes, points, offset_std):
    """
    Args:
        gt_boxes: (N, 7), [x, y, z, dx, dy, dz, heading, [vx], [vy]]
        points: (M, 3 + C)
        offset_std: float
    Returns:
    """
    offset = np.random.normal(0, offset_std, 1)
    points[:, 0] += offset
    gt_boxes[:, 0] += offset

    # if gt_boxes.shape[1] > 7:
    #     gt_boxes[:, 7] += offset

    return gt_boxes, points


def random_translation_along_y(gt_boxes, points, offset_std):
    """
    Args:
        gt_boxes: (N, 7), [x, y, z, dx, dy, dz, heading, [vx], [vy]]
        points: (M, 3 + C)
        offset_std: float
    Returns:
    """
    offset = np.random.normal(0, offset_std, 1)
    points[:, 1] += offset
    gt_boxes[:, 1] += offset

    # if gt_boxes.shape[1] > 8:
    #     gt_boxes[:, 8] += offset

    return gt_boxes, points


def random_translation_along_z(gt_boxes, points, offset_std):
    """
    Args:
        gt_boxes: (N, 7), [x, y, z, dx, dy, dz, heading, [vx], [vy]]
        points: (M, 3 + C)
        offset_std: float
    Returns:
    """
    offset = np.random.normal(0, offset_std, 1)
    points[:, 2] += offset
    gt_boxes[:, 2] += offset

    return gt_boxes, points
def random_local_translation_along_x(gt_boxes, points, offset_range):
    """
    Args:
@@ -105,15 +105,16 @@ class DataAugmentor(object):
        if data_dict is None:
            return partial(self.random_world_translation, config=config)

        noise_translate_std = config['NOISE_TRANSLATE_STD']
        assert len(noise_translate_std) == 3
        noise_translate = np.array([
            np.random.normal(0, noise_translate_std[0], 1),
            np.random.normal(0, noise_translate_std[1], 1),
            np.random.normal(0, noise_translate_std[2], 1),
        ], dtype=np.float32).T

        gt_boxes, points = data_dict['gt_boxes'], data_dict['points']
        points[:, :3] += noise_translate
        gt_boxes[:, :3] += noise_translate

        data_dict['gt_boxes'] = gt_boxes
        data_dict['points'] = points
        return data_dict
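The new world translation draws one Gaussian offset per axis and shifts every point and box center by it; a minimal standalone numpy sketch under the same conventions (the `rng` parameter is added here for reproducibility and is not part of the library code):

```python
import numpy as np

def random_world_translation(gt_boxes, points, noise_translate_std, rng=None):
    """Shift the whole scene by one Gaussian offset per axis (x, y, z)."""
    assert len(noise_translate_std) == 3
    rng = np.random.default_rng() if rng is None else rng
    offset = np.array([rng.normal(0, s) for s in noise_translate_std], dtype=np.float32)
    points = points.copy()
    gt_boxes = gt_boxes.copy()
    points[:, :3] += offset      # translate every point
    gt_boxes[:, :3] += offset    # translate every box center by the same offset
    return gt_boxes, points
```

Because points and boxes move together, box-point assignments are unchanged; only the scene's position in the world frame is perturbed.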
@@ -30,6 +30,13 @@ class DataBaseSampler(object):
        for db_info_path in sampler_cfg.DB_INFO_PATH:
            db_info_path = self.root_path.resolve() / db_info_path
            if not db_info_path.exists():
                assert len(sampler_cfg.DB_INFO_PATH) == 1
                sampler_cfg.DB_INFO_PATH[0] = sampler_cfg.BACKUP_DB_INFO['DB_INFO_PATH']
                sampler_cfg.DB_DATA_PATH[0] = sampler_cfg.BACKUP_DB_INFO['DB_DATA_PATH']
                db_info_path = self.root_path.resolve() / sampler_cfg.DB_INFO_PATH[0]
                sampler_cfg.NUM_POINT_FEATURES = sampler_cfg.BACKUP_DB_INFO['NUM_POINT_FEATURES']

            with open(str(db_info_path), 'rb') as f:
                infos = pickle.load(f)
                [self.db_infos[cur_class].extend(infos[cur_class]) for cur_class in class_names]
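Stripped of the sampler config plumbing, the fallback above is a simple "prefer primary, else backup" path resolution; a minimal sketch with hypothetical names:

```python
from pathlib import Path

def resolve_db_info_path(root, primary_name, backup_name):
    """Prefer the primary dbinfos file; fall back to the backup when it is missing."""
    primary = Path(root) / primary_name
    return primary if primary.exists() else Path(root) / backup_name
```

In the real sampler the fallback also rewrites `DB_DATA_PATH` and `NUM_POINT_FEATURES`, since the multi-frame backup database stores one extra per-point feature (the timestamp).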
@@ -403,11 +410,23 @@ class DataBaseSampler(object):
        obj_points = np.concatenate(obj_points_list, axis=0)
        sampled_gt_names = np.array([x['name'] for x in total_valid_sampled_dict])

        if self.sampler_cfg.get('FILTER_OBJ_POINTS_BY_TIMESTAMP', False) or obj_points.shape[-1] != points.shape[-1]:
            if self.sampler_cfg.get('FILTER_OBJ_POINTS_BY_TIMESTAMP', False):
                min_time = min(self.sampler_cfg.TIME_RANGE[0], self.sampler_cfg.TIME_RANGE[1])
                max_time = max(self.sampler_cfg.TIME_RANGE[0], self.sampler_cfg.TIME_RANGE[1])
            else:
                assert obj_points.shape[-1] == points.shape[-1] + 1
                # transform multi-frame GT points to single-frame GT points
                min_time = max_time = 0.0
            time_mask = np.logical_and(obj_points[:, -1] < max_time + 1e-6, obj_points[:, -1] > min_time - 1e-6)
            obj_points = obj_points[time_mask]
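The timestamp filter above keeps only GT-database points whose trailing feature (the per-point timestamp) lies inside the normalized `[min_time, max_time]` window; setting both bounds to `0.0` collapses a multi-frame object to its current-frame points. In isolation:

```python
import numpy as np

def filter_points_by_timestamp(obj_points, min_time, max_time, eps=1e-6):
    """Keep points whose last feature (timestamp) lies in [min_time, max_time]."""
    t = obj_points[:, -1]
    mask = (t < max_time + eps) & (t > min_time - eps)
    return obj_points[mask]
```

This is a sketch of the masking step only; the real sampler also truncates the extra timestamp column before concatenating the object points back into the scene.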
        large_sampled_gt_boxes = box_utils.enlarge_box3d(
            sampled_gt_boxes[:, 0:7], extra_width=self.sampler_cfg.REMOVE_EXTRA_WIDTH
        )
        points = box_utils.remove_points_in_boxes3d(points, large_sampled_gt_boxes)
        points = np.concatenate([obj_points[:, :points.shape[-1]], points], axis=0)
        gt_names = np.concatenate([gt_names, sampled_gt_names], axis=0)
        gt_boxes = np.concatenate([gt_boxes, sampled_gt_boxes], axis=0)
        data_dict['gt_boxes'] = gt_boxes
@@ -462,7 +481,7 @@ class DataBaseSampler(object):
            valid_sampled_dict = [sampled_dict[x] for x in valid_mask]
            valid_sampled_boxes = sampled_boxes[valid_mask]

            existed_boxes = np.concatenate((existed_boxes, valid_sampled_boxes[:, :existed_boxes.shape[-1]]), axis=0)
            total_valid_sampled_dict.extend(valid_sampled_dict)

        sampled_gt_boxes = existed_boxes[gt_boxes.shape[0]:, :]
@@ -57,14 +57,13 @@ class DatasetTemplate(torch_data.Dataset):
    def __setstate__(self, d):
        self.__dict__.update(d)

    def generate_prediction_dicts(self, batch_dict, pred_dicts, class_names, output_path=None):
        """
        Args:
            batch_dict:
                frame_id:
            pred_dicts: list of pred_dicts
                pred_boxes: (N, 7 or 9), Tensor
                pred_scores: (N), Tensor
                pred_labels: (N), Tensor
            class_names:
@@ -75,9 +74,10 @@ class DatasetTemplate(torch_data.Dataset):
        """
        def get_template_prediction(num_samples):
            box_dim = 9 if self.dataset_cfg.get('TRAIN_WITH_SPEED', False) else 7
            ret_dict = {
                'name': np.zeros(num_samples), 'score': np.zeros(num_samples),
                'boxes_lidar': np.zeros([num_samples, box_dim]), 'pred_labels': np.zeros(num_samples)
            }
            return ret_dict
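The dimension switch above widens empty prediction templates from 7 to 9 box parameters when velocities are trained; as a self-contained sketch, with `train_with_speed` standing in for `self.dataset_cfg.get('TRAIN_WITH_SPEED', False)`:

```python
import numpy as np

def get_template_prediction(num_samples, train_with_speed=False):
    # boxes carry (x, y, z, dx, dy, dz, heading), plus (vx, vy) when speed is trained
    box_dim = 9 if train_with_speed else 7
    return {
        'name': np.zeros(num_samples), 'score': np.zeros(num_samples),
        'boxes_lidar': np.zeros([num_samples, box_dim]), 'pred_labels': np.zeros(num_samples),
    }
```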
@@ -85,7 +85,8 @@ class DataProcessor(object):
        if data_dict.get('gt_boxes', None) is not None and config.REMOVE_OUTSIDE_BOXES and self.training:
            mask = box_utils.mask_boxes_outside_range_numpy(
                data_dict['gt_boxes'], self.point_cloud_range, min_num_corners=config.get('min_num_corners', 1),
                use_center_to_filter=config.get('USE_CENTER_TO_FILTER', True)
            )
            data_dict['gt_boxes'] = data_dict['gt_boxes'][mask]
        return data_dict
@@ -45,6 +45,7 @@ class PointFeatureEncoder(object):
            num_output_features = len(self.used_feature_list)
            return num_output_features

        assert points.shape[-1] == len(self.src_feature_list)
        point_feature_list = [points[:, 0:3]]
        for x in self.used_feature_list:
            if x in ['x', 'y', 'z']:
@@ -60,6 +60,9 @@ class OpenPCDetWaymoDetectionMetricsEstimator(tf.test.TestCase):
                if fake_gt_infos:
                    info['gt_boxes_lidar'] = boxes3d_kitti_fakelidar_to_lidar(info['gt_boxes_lidar'])

                if info['gt_boxes_lidar'].shape[-1] == 9:
                    boxes3d.append(info['gt_boxes_lidar'][box_mask][:, 0:7])
                else:
                    boxes3d.append(info['gt_boxes_lidar'][box_mask])
            else:
                num_boxes = len(info['boxes_lidar'])
@@ -67,6 +70,8 @@ class OpenPCDetWaymoDetectionMetricsEstimator(tf.test.TestCase):
                score.append(info['score'])
                boxes3d.append(np.array(info['boxes_lidar']))
                box_name = info['name']
                if boxes3d[-1].shape[-1] == 9:
                    boxes3d[-1] = boxes3d[-1][:, 0:7]

            obj_type += [self.WAYMO_CLASSES.index(name) for i, name in enumerate(box_name)]
            frame_id.append(np.array([frame_index] * num_boxes))
@@ -20,11 +20,12 @@ except:
WAYMO_CLASSES = ['unknown', 'Vehicle', 'Pedestrian', 'Sign', 'Cyclist']


def generate_labels(frame, pose):
    obj_name, difficulty, dimensions, locations, heading_angles = [], [], [], [], []
    tracking_difficulty, speeds, accelerations, obj_ids = [], [], [], []
    num_points_in_gt = []
    laser_labels = frame.laser_labels

    for i in range(len(laser_labels)):
        box = laser_labels[i].box
        class_ind = laser_labels[i].type
@@ -37,6 +38,8 @@ def generate_labels(frame):
        locations.append(loc)
        obj_ids.append(laser_labels[i].id)
        num_points_in_gt.append(laser_labels[i].num_lidar_points_in_box)
        speeds.append([laser_labels[i].metadata.speed_x, laser_labels[i].metadata.speed_y])
        accelerations.append([laser_labels[i].metadata.accel_x, laser_labels[i].metadata.accel_y])

    annotations = {}
    annotations['name'] = np.array(obj_name)
@@ -48,15 +51,21 @@ def generate_labels(frame):
    annotations['obj_ids'] = np.array(obj_ids)
    annotations['tracking_difficulty'] = np.array(tracking_difficulty)
    annotations['num_points_in_gt'] = np.array(num_points_in_gt)
    annotations['speed_global'] = np.array(speeds)
    annotations['accel_global'] = np.array(accelerations)

    annotations = common_utils.drop_info_with_name(annotations, name='unknown')
    if annotations['name'].__len__() > 0:
        global_speed = np.pad(annotations['speed_global'], ((0, 0), (0, 1)), mode='constant', constant_values=0)  # (N, 3)
        speed = np.dot(global_speed, np.linalg.inv(pose[:3, :3].T))
        speed = speed[:, :2]

        gt_boxes_lidar = np.concatenate([
            annotations['location'], annotations['dimensions'], annotations['heading_angles'][..., np.newaxis], speed],
            axis=1
        )
    else:
        gt_boxes_lidar = np.zeros((0, 9))
    annotations['gt_boxes_lidar'] = gt_boxes_lidar

    return annotations
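The speed transform above rotates the global-frame object velocity into the LiDAR frame: with `pose` mapping LiDAR to global coordinates, a row-vector velocity satisfies `v_global = v_lidar @ R.T`, so the code right-multiplies by `inv(R.T)`. A standalone numpy sketch of just that step:

```python
import numpy as np

def global_speed_to_lidar(speed_xy_global, pose):
    """Rotate (N, 2) global-frame velocities into the sensor frame.

    pose: (4, 4) LiDAR-to-global transform; only its rotation affects velocity.
    """
    speed3 = np.pad(speed_xy_global, ((0, 0), (0, 1)), mode='constant')  # (N, 3), vz = 0
    speed_lidar = speed3 @ np.linalg.inv(pose[:3, :3].T)  # rows: v_g = v_l @ R.T
    return speed_lidar[:, :2]
```

Since the rotation is orthonormal, `inv(R.T)` equals `R`; the explicit inverse just mirrors the diff's formulation.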
@@ -158,8 +167,12 @@ def convert_range_image_to_point_cloud(frame, range_images, camera_projections,

def save_lidar_points(frame, cur_save_path, use_two_returns=True):
    ret_outputs = frame_utils.parse_range_image_and_camera_projection(frame)
    if len(ret_outputs) == 4:
        range_images, camera_projections, seg_labels, range_image_top_pose = ret_outputs
    else:
        assert len(ret_outputs) == 3
        range_images, camera_projections, range_image_top_pose = ret_outputs

    points, cp_points, points_in_NLZ_flag, points_intensity, points_elongation = convert_range_image_to_point_cloud(
        frame, range_images, camera_projections, range_image_top_pose, ri_index=(0, 1) if use_two_returns else (0,)
@@ -181,7 +194,7 @@ def save_lidar_points(frame, cur_save_path, use_two_returns=True):
    return num_points_of_each_lidar


def process_single_sequence(sequence_file, save_path, sampled_interval, has_label=True, use_two_returns=True, update_info_only=False):
    sequence_name = os.path.splitext(os.path.basename(sequence_file))[0]

    # print('Load record (sampled_interval=%d): %s' % (sampled_interval, sequence_name))
@@ -197,8 +210,13 @@ def process_single_sequence(sequence_file, save_path, sampled_interval, has_labe
    sequence_infos = []
    if pkl_file.exists():
        sequence_infos = pickle.load(open(pkl_file, 'rb'))
        sequence_infos_old = None
        if not update_info_only:
            print('Skip sequence since it has been processed before: %s' % pkl_file)
            return sequence_infos
        else:
            sequence_infos_old = sequence_infos
            sequence_infos = []

    for cnt, data in enumerate(dataset):
        if cnt % sampled_interval != 0:
@@ -227,9 +245,13 @@ def process_single_sequence(sequence_file, save_path, sampled_interval, has_labe
        info['pose'] = pose

        if has_label:
            annotations = generate_labels(frame, pose=pose)
            info['annos'] = annotations

        if update_info_only and sequence_infos_old is not None:
            assert info['frame_id'] == sequence_infos_old[cnt]['frame_id']
            num_points_of_each_lidar = sequence_infos_old[cnt]['num_points_of_each_lidar']
        else:
            num_points_of_each_lidar = save_lidar_points(
                frame, cur_save_dir / ('%04d.npy' % cnt), use_two_returns=use_two_returns
            )
@@ -90,7 +90,7 @@ def corners_rect_to_camera(corners):
    return camera_rect


def mask_boxes_outside_range_numpy(boxes, limit_range, min_num_corners=1, use_center_to_filter=True):
    """
    Args:
        boxes: (N, 7) [x, y, z, dx, dy, dz, heading, ...], (x, y, z) is the box center
@@ -102,8 +102,13 @@ def mask_boxes_outside_range_numpy(boxes, limit_range, min_num_corners=1):
    """
    if boxes.shape[1] > 7:
        boxes = boxes[:, 0:7]

    if use_center_to_filter:
        box_centers = boxes[:, 0:3]
        mask = ((box_centers >= limit_range[0:3]) & (box_centers <= limit_range[3:6])).all(axis=-1)
    else:
        corners = boxes_to_corners_3d(boxes)  # (N, 8, 3)
        corners = corners[:, :, 0:2]
        mask = ((corners >= limit_range[0:2]) & (corners <= limit_range[3:5])).all(axis=2)
        mask = mask.sum(axis=1) >= min_num_corners  # (N)

    return mask
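The two filtering modes can be exercised standalone. This sketch mirrors the updated function, but for brevity uses axis-aligned BEV corners instead of the library's `boxes_to_corners_3d` (which rotates corners by the heading angle); note the corner mode now compares only x/y, ignoring z:

```python
import numpy as np

def mask_boxes_outside_range(boxes, limit_range, min_num_corners=1, use_center_to_filter=True):
    """boxes: (N, 7) [x, y, z, dx, dy, dz, heading]; limit_range: [x0, y0, z0, x1, y1, z1]."""
    limit_range = np.asarray(limit_range, dtype=np.float64)
    if use_center_to_filter:
        centers = boxes[:, 0:3]
        return ((centers >= limit_range[0:3]) & (centers <= limit_range[3:6])).all(axis=-1)
    # BEV-corner mode: keep a box if at least min_num_corners of its 4 BEV corners
    # fall inside the x/y range (axis-aligned corners here; heading is ignored)
    x, y, dx, dy = boxes[:, 0], boxes[:, 1], boxes[:, 3], boxes[:, 4]
    corners = np.stack([
        np.stack([x - dx / 2, y - dy / 2], axis=-1),
        np.stack([x - dx / 2, y + dy / 2], axis=-1),
        np.stack([x + dx / 2, y - dy / 2], axis=-1),
        np.stack([x + dx / 2, y + dy / 2], axis=-1),
    ], axis=1)  # (N, 4, 2)
    inside = ((corners >= limit_range[0:2]) & (corners <= limit_range[3:5])).all(axis=2)
    return inside.sum(axis=1) >= min_num_corners
```

Center-based filtering is the cheaper default; the corner mode is more permissive, keeping boxes that merely overlap the range boundary.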
@@ -28,7 +28,7 @@ def write_version_to_file(version, target_file):

if __name__ == '__main__':
    version = '0.6.0+%s' % get_git_commit_number()
    write_version_to_file(version, 'pcdet/version.py')

    setup(
@@ -32,6 +32,12 @@ DATA_AUGMENTOR:
          DB_DATA_PATH:
              - waymo_processed_data_v0_5_0_gt_database_train_sampled_1_global.npy

          BACKUP_DB_INFO:
              # if the above DB_INFO cannot be found, this backup will be used instead
              DB_INFO_PATH: waymo_processed_data_v0_5_0_waymo_dbinfos_train_sampled_1_multiframe_-4_to_0.pkl
              DB_DATA_PATH: waymo_processed_data_v0_5_0_gt_database_train_sampled_1_multiframe_-4_to_0_global.npy
              NUM_POINT_FEATURES: 6

          PREPARE: {
             filter_by_min_points: ['Vehicle:5', 'Pedestrian:5', 'Cyclist:5'],
             filter_by_difficulty: [-1],
          }
DATASET: 'WaymoDataset'
DATA_PATH: '../data/waymo'
PROCESSED_DATA_TAG: 'waymo_processed_data_v0_5_0'

POINT_CLOUD_RANGE: [-75.2, -75.2, -2, 75.2, 75.2, 4]

DATA_SPLIT: {
    'train': train,
    'test': val
}

SAMPLED_INTERVAL: {
    'train': 5,
    'test': 1
}

FILTER_EMPTY_BOXES_FOR_TRAIN: True
DISABLE_NLZ_FLAG_ON_POINTS: True

USE_SHARED_MEMORY: False  # it will load the data to shared memory to speed up (DO NOT USE IT IF YOU DO NOT FULLY UNDERSTAND WHAT WILL HAPPEN)
SHARED_MEMORY_FILE_LIMIT: 35000  # set it based on the size of your shared memory

SEQUENCE_CONFIG:
    ENABLED: True
    SAMPLE_OFFSET: [-3, 0]

TRAIN_WITH_SPEED: True

DATA_AUGMENTOR:
    DISABLE_AUG_LIST: ['placeholder']
    AUG_CONFIG_LIST:
        - NAME: gt_sampling
          USE_ROAD_PLANE: False
          DB_INFO_PATH:
              - waymo_processed_data_v0_5_0_waymo_dbinfos_train_sampled_1_multiframe_-4_to_0.pkl

          USE_SHARED_MEMORY: False  # set it to True to speed up (it costs about 50GB of shared memory)
          DB_DATA_PATH:
              - waymo_processed_data_v0_5_0_gt_database_train_sampled_1_multiframe_-4_to_0_global.npy

          PREPARE: {
             filter_by_min_points: ['Vehicle:5', 'Pedestrian:5', 'Cyclist:5'],
             filter_by_difficulty: [-1],
          }

          SAMPLE_GROUPS: ['Vehicle:15', 'Pedestrian:10', 'Cyclist:10']
          NUM_POINT_FEATURES: 6
          REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
          LIMIT_WHOLE_SCENE: True
          FILTER_OBJ_POINTS_BY_TIMESTAMP: True
          TIME_RANGE: [0.3, 0.0]  # 0.3s-0.0s indicates 4 frames

        - NAME: random_world_flip
          ALONG_AXIS_LIST: ['x', 'y']

        - NAME: random_world_rotation
          WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

        - NAME: random_world_scaling
          WORLD_SCALE_RANGE: [0.95, 1.05]

POINT_FEATURE_ENCODING: {
    encoding_type: absolute_coordinates_encoding,
    used_feature_list: ['x', 'y', 'z', 'intensity', 'elongation', 'timestamp'],
    src_feature_list: ['x', 'y', 'z', 'intensity', 'elongation', 'timestamp'],
}

DATA_PROCESSOR:
    - NAME: mask_points_and_boxes_outside_range
      REMOVE_OUTSIDE_BOXES: True
      USE_CENTER_TO_FILTER: True

    - NAME: shuffle_points
      SHUFFLE_ENABLED: {
        'train': True,
        'test': True
      }

    - NAME: transform_points_to_voxels
      VOXEL_SIZE: [0.1, 0.1, 0.15]
      MAX_POINTS_PER_VOXEL: 5
      MAX_NUMBER_OF_VOXELS: {
        'train': 180000,
        'test': 400000
      }
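With `SAMPLE_OFFSET: [-3, 0]`, each sample stacks the current frame plus the three preceding frames; at Waymo's 10 Hz frame rate that spans the `TIME_RANGE: [0.3, 0.0]` window used by the GT sampler. A small sketch of this bookkeeping (the 0.1 s frame period is an assumption stated here, not read from the config):

```python
def multiframe_offsets(sample_offset=(-3, 0), frame_period=0.1):
    """Return (frame_offsets, timestamps) covered by a SAMPLE_OFFSET window."""
    lo, hi = sample_offset
    offsets = list(range(lo, hi + 1))                  # e.g. [-3, -2, -1, 0]
    timestamps = [-o * frame_period for o in offsets]  # e.g. [0.3, 0.2, 0.1, 0.0]
    return offsets, timestamps
```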
@@ -54,8 +54,7 @@ DATA_CONFIG:
          WORLD_SCALE_RANGE: [0.95, 1.05]

        - NAME: random_world_translation
          NOISE_TRANSLATE_STD: [0.5, 0.5, 0.5]

        - NAME: random_local_translation
          LOCAL_TRANSLATION_RANGE: [0.95, 1.05]
@@ -38,8 +38,7 @@ DATA_CONFIG:
          WORLD_SCALE_RANGE: [0.9, 1.1]

        - NAME: random_world_translation
          NOISE_TRANSLATE_STD: [0.5, 0.5, 0.5]

DATA_PROCESSOR:
CLASS_NAMES: ['Vehicle', 'Pedestrian', 'Cyclist']

DATA_CONFIG:
    _BASE_CONFIG_: cfgs/dataset_configs/waymo_dataset_multiframe.yaml

MODEL:
    NAME: CenterPoint

    VFE:
        NAME: MeanVFE

    BACKBONE_3D:
        NAME: VoxelResBackBone8x

    MAP_TO_BEV:
        NAME: HeightCompression
        NUM_BEV_FEATURES: 256

    BACKBONE_2D:
        NAME: BaseBEVBackbone
        LAYER_NUMS: [5, 5]
        LAYER_STRIDES: [1, 2]
        NUM_FILTERS: [128, 256]
        UPSAMPLE_STRIDES: [1, 2]
        NUM_UPSAMPLE_FILTERS: [256, 256]

    DENSE_HEAD:
        NAME: CenterHead
        CLASS_AGNOSTIC: False

        CLASS_NAMES_EACH_HEAD: [
            ['Vehicle', 'Pedestrian', 'Cyclist']
        ]

        SHARED_CONV_CHANNEL: 64
        USE_BIAS_BEFORE_NORM: True
        NUM_HM_CONV: 2
        SEPARATE_HEAD_CFG:
            HEAD_ORDER: ['center', 'center_z', 'dim', 'rot', 'vel']
            HEAD_DICT: {
                'center': {'out_channels': 2, 'num_conv': 2},
                'center_z': {'out_channels': 1, 'num_conv': 2},
                'dim': {'out_channels': 3, 'num_conv': 2},
                'rot': {'out_channels': 2, 'num_conv': 2},
                'vel': {'out_channels': 2, 'num_conv': 2},
            }

        TARGET_ASSIGNER_CONFIG:
            FEATURE_MAP_STRIDE: 8
            NUM_MAX_OBJS: 500
            GAUSSIAN_OVERLAP: 0.1
            MIN_RADIUS: 2

        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2]
            }

        POST_PROCESSING:
            SCORE_THRESH: 0.1
            POST_CENTER_LIMIT_RANGE: [-75.2, -75.2, -2, 75.2, 75.2, 4]
            MAX_OBJ_PER_SAMPLE: 500
            NMS_CONFIG:
                NMS_TYPE: nms_gpu
                NMS_THRESH: 0.7
                NMS_PRE_MAXSIZE: 4096
                NMS_POST_MAXSIZE: 500

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        EVAL_METRIC: waymo

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 4
    NUM_EPOCHS: 30

    OPTIMIZER: adam_onecycle
    LR: 0.003
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10