[Enhance] Parallel waymo dataset converter and ground truth database generator...

[Enhance] Parallel waymo dataset converter and ground truth database generator & Fix unprecise timestamps saving (#1327) * enable parallel waymo dataset converter and fix timestamp overflow bug * update waymo db_sampler configs * add docstrings and do the formatting * fix docstring typos * fix wrong docstrings, and refactor conversion code * add information into the compatibility doc Co-authored-by: Tai-Wang <tab_wang@outlook.com>

[Enhance] Parallel waymo dataset converter and ground truth database generator...
[Enhance] Parallel waymo dataset converter and ground truth database generator & Fix unprecise timestamps saving (#1327) * enable parallel waymo dataset converter and fix timestamp overflow bug * update waymo db_sampler configs * add docstrings and do the formatting * fix docstring typos * fix wrong docstrings, and refactor conversion code * add information into the compatibility doc Co-authored-by: Tai-Wang <tab_wang@outlook.com>
8a8da91b · zhanggefan · GitHub · d85f58ab · 8a8da91b · 8a8da91b
Unverified Commit 8a8da91b authored Apr 01, 2022 by zhanggefan Committed by GitHub Apr 01, 2022
10 changed files
--- a/configs/_base_/datasets/waymoD5-3d-3class.py
+++ b/configs/_base_/datasets/waymoD5-3d-3class.py
@@ -25,7 +25,7 @@ db_sampler = dict(
    points_loader=dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
-        load_dim=5,
+        load_dim=6,
        use_dim=[0, 1, 2, 3, 4],
        file_client_args=file_client_args))


--- a/configs/_base_/datasets/waymoD5-3d-car.py
+++ b/configs/_base_/datasets/waymoD5-3d-car.py
@@ -23,7 +23,7 @@ db_sampler = dict(
    points_loader=dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
-        load_dim=5,
+        load_dim=6,
        use_dim=[0, 1, 2, 3, 4],
        file_client_args=file_client_args))


--- a/configs/second/hv_second_secfpn_sbn_2x16_2x_waymoD5-3d-3class.py
+++ b/configs/second/hv_second_secfpn_sbn_2x16_2x_waymoD5-3d-3class.py
@@ -23,7 +23,7 @@ db_sampler = dict(
    points_loader=dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
-        load_dim=5,
+        load_dim=6,
        use_dim=[0, 1, 2, 3, 4]))

 train_pipeline = [

--- a/docs/en/compatibility.md
+++ b/docs/en/compatibility.md
@@ -4,6 +4,52 @@

 We have adopted CUDA operators compiled from [mmcv](https://github.com/open-mmlab/mmcv/blob/master/mmcv/ops/__init__.py) and removed all the CUDA operators in mmdet3d. We now do not need to compile the CUDA operators in mmdet3d anymore.

+### Waymo dataset converter refactoring
+
+In this version we did a major code refactoring that boosted the performance of waymo dataset conversion by multiprocessing.
+Meanwhile, we also fixed the imprecise timestamps saving issue in waymo dataset conversion. This change introduces following backward compatibility breaks:
+
+- The point cloud .bin files of waymo dataset need to be regenerated.
+In the .bin files each point occupies 6 `float32` and the meaning of the last `float32` now changed from **imprecise timestamps** to **range frame offset**.
+The **range frame offset** for each point is calculated as`ri * h * w + row * w + col` if the point is from the **TOP** lidar or `-1` otherwise.
+The `h`, `w` denote the height and width of the TOP lidar's range frame.
+The `ri`, `row`, `col` denote the return index, the row and the column of the range frame where each point locates.
+Following tables show the difference across the change:
+
+Before
+
+| Element offset (float32) |  0  |  1  |  2  |     3     |     4      |            5            |
+|--------------------------|:---:|:---:|:---:|:---------:|:----------:|:-----------------------:|
+| Bytes offset             |  0  |  4  |  8  |    12     |     16     |           20            |
+| Meaning                  |  x  |  y  |  z  | intensity | elongation | **imprecise timestamp** |
+
+After
+
+| Element offset (float32) |  0  |  1  |  2  |     3     |     4      |           5            |
+|--------------------------|:---:|:---:|:---:|:---------:|:----------:|:----------------------:|
+| Bytes offset             |  0  |  4  |  8  |    12     |     16     |           20           |
+| Meaning                  |  x  |  y  |  z  | intensity | elongation | **range frame offset** |
+
+- The objects' point cloud .bin files in the GT-database of waymo dataset need to be regenerated because we also dumped the range frame offset for each point into it.
+Following tables show the difference across the change:
+
+Before
+
+| Element offset (float32) |  0  |  1  |  2  |     3     |     4      |
+|--------------------------|:---:|:---:|:---:|:---------:|:----------:|
+| Bytes offset             |  0  |  4  |  8  |    12     |     16     |
+| Meaning                  |  x  |  y  |  z  | intensity | elongation |
+
+After
+
+| Element offset (float32) |  0  |  1  |  2  |     3     |     4      |           5            |
+|--------------------------|:---:|:---:|:---:|:---------:|:----------:|:----------------------:|
+| Bytes offset             |  0  |  4  |  8  |    12     |     16     |           20           |
+| Meaning                  |  x  |  y  |  z  | intensity | elongation | **range frame offset** |
+
+- Any configuration that uses waymo dataset with GT Augmentation should change the `db_sampler.points_loader.load_dim` from `5` to `6`.
+
+
 ## v1.0.0rc0

 ### Coordinate system refactoring

--- a/tests/test_data/test_datasets/test_waymo_dataset.py
+++ b/tests/test_data/test_datasets/test_waymo_dataset.py
@@ -27,7 +27,7 @@ def _generate_waymo_train_dataset_config():
        points_loader=dict(
            type='LoadPointsFromFile',
            coord_type='LIDAR',
-            load_dim=5,
+            load_dim=6,
            use_dim=[0, 1, 2, 3, 4],
            file_client_args=file_client_args))
    pipeline = [

--- a/tools/create_data.py
+++ b/tools/create_data.py
@@ -6,7 +6,8 @@ from tools.data_converter import indoor_converter as indoor
 from tools.data_converter import kitti_converter as kitti
 from tools.data_converter import lyft_converter as lyft_converter
 from tools.data_converter import nuscenes_converter as nuscenes_converter
-from tools.data_converter.create_gt_database import create_groundtruth_database
+from tools.data_converter.create_gt_database import (
+    create_groundtruth_database, GTDatabaseCreater)


 def kitti_data_prep(root_path,
@@ -181,14 +182,16 @@ def waymo_data_prep(root_path,
        converter.convert()
    # Generate waymo infos
    out_dir = osp.join(out_dir, 'kitti_format')
-    kitti.create_waymo_info_file(out_dir, info_prefix, max_sweeps=max_sweeps)
-    create_groundtruth_database(
+    kitti.create_waymo_info_file(
+        out_dir, info_prefix, max_sweeps=max_sweeps, workers=workers)
+    GTDatabaseCreater(
        'WaymoDataset',
        out_dir,
        info_prefix,
        f'{out_dir}/{info_prefix}_infos_train.pkl',
        relative_path=False,
-        with_mask=False)
+        with_mask=False,
+        num_worker=workers).create()


 parser = argparse.ArgumentParser(description='Data converter arg parser')

--- a/tools/data_converter/create_gt_database.py
+++ b/tools/data_converter/create_gt_database.py
@@ -124,7 +124,7 @@ def create_groundtruth_database(dataset_class_name,
    """Given the raw data, generate the ground truth database.

    Args:
-        dataset_class_name （str): Name of the input dataset.
+        dataset_class_name (str): Name of the input dataset.
        data_path (str): Path of the data.
        info_prefix (str): Prefix of the info file.
        info_path (str, optional): Path of the info file.
@@ -207,7 +207,7 @@ def create_groundtruth_database(dataset_class_name,
                    type='LoadPointsFromFile',
                    coord_type='LIDAR',
                    load_dim=6,
-                    use_dim=5,
+                    use_dim=6,
                    file_client_args=file_client_args),
                dict(
                    type='LoadAnnotations3D',
@@ -337,3 +337,288 @@ def create_groundtruth_database(dataset_class_name,

    with open(db_info_save_path, 'wb') as f:
        pickle.dump(all_db_infos, f)
+
+
+class GTDatabaseCreater:
+    """Given the raw data, generate the ground truth database. This is the
+    parallel version. For serialized version, please refer to
+    `create_groundtruth_database`
+
+    Args:
+        dataset_class_name (str): Name of the input dataset.
+        data_path (str): Path of the data.
+        info_prefix (str): Prefix of the info file.
+        info_path (str, optional): Path of the info file.
+            Default: None.
+        mask_anno_path (str, optional): Path of the mask_anno.
+            Default: None.
+        used_classes (list[str], optional): Classes have been used.
+            Default: None.
+        database_save_path (str, optional): Path to save database.
+            Default: None.
+        db_info_save_path (str, optional): Path to save db_info.
+            Default: None.
+        relative_path (bool, optional): Whether to use relative path.
+            Default: True.
+        with_mask (bool, optional): Whether to use mask.
+            Default: False.
+        num_worker (int, optional): the number of parallel workers to use.
+            Default: 8.
+    """
+
+    def __init__(self,
+                 dataset_class_name,
+                 data_path,
+                 info_prefix,
+                 info_path=None,
+                 mask_anno_path=None,
+                 used_classes=None,
+                 database_save_path=None,
+                 db_info_save_path=None,
+                 relative_path=True,
+                 add_rgb=False,
+                 lidar_only=False,
+                 bev_only=False,
+                 coors_range=None,
+                 with_mask=False,
+                 num_worker=8) -> None:
+        self.dataset_class_name = dataset_class_name
+        self.data_path = data_path
+        self.info_prefix = info_prefix
+        self.info_path = info_path
+        self.mask_anno_path = mask_anno_path
+        self.used_classes = used_classes
+        self.database_save_path = database_save_path
+        self.db_info_save_path = db_info_save_path
+        self.relative_path = relative_path
+        self.add_rgb = add_rgb
+        self.lidar_only = lidar_only
+        self.bev_only = bev_only
+        self.coors_range = coors_range
+        self.with_mask = with_mask
+        self.num_worker = num_worker
+        self.pipeline = None
+
+    def create_single(self, input_dict):
+        group_counter = 0
+        single_db_infos = dict()
+        example = self.pipeline(input_dict)
+        annos = example['ann_info']
+        image_idx = example['sample_idx']
+        points = example['points'].tensor.numpy()
+        gt_boxes_3d = annos['gt_bboxes_3d'].tensor.numpy()
+        names = annos['gt_names']
+        group_dict = dict()
+        if 'group_ids' in annos:
+            group_ids = annos['group_ids']
+        else:
+            group_ids = np.arange(gt_boxes_3d.shape[0], dtype=np.int64)
+        difficulty = np.zeros(gt_boxes_3d.shape[0], dtype=np.int32)
+        if 'difficulty' in annos:
+            difficulty = annos['difficulty']
+
+        num_obj = gt_boxes_3d.shape[0]
+        point_indices = box_np_ops.points_in_rbbox(points, gt_boxes_3d)
+
+        if self.with_mask:
+            # prepare masks
+            gt_boxes = annos['gt_bboxes']
+            img_path = osp.split(example['img_info']['filename'])[-1]
+            if img_path not in self.file2id.keys():
+                print(f'skip image {img_path} for empty mask')
+                return single_db_infos
+            img_id = self.file2id[img_path]
+            kins_annIds = self.coco.getAnnIds(imgIds=img_id)
+            kins_raw_info = self.coco.loadAnns(kins_annIds)
+            kins_ann_info = _parse_coco_ann_info(kins_raw_info)
+            h, w = annos['img_shape'][:2]
+            gt_masks = [
+                _poly2mask(mask, h, w) for mask in kins_ann_info['masks']
+            ]
+            # get mask inds based on iou mapping
+            bbox_iou = bbox_overlaps(kins_ann_info['bboxes'], gt_boxes)
+            mask_inds = bbox_iou.argmax(axis=0)
+            valid_inds = (bbox_iou.max(axis=0) > 0.5)
+
+            # mask the image
+            # use more precise crop when it is ready
+            # object_img_patches = np.ascontiguousarray(
+            #     np.stack(object_img_patches, axis=0).transpose(0, 3, 1, 2))
+            # crop image patches using roi_align
+            # object_img_patches = crop_image_patch_v2(
+            #     torch.Tensor(gt_boxes),
+            #     torch.Tensor(mask_inds).long(), object_img_patches)
+            object_img_patches, object_masks = crop_image_patch(
+                gt_boxes, gt_masks, mask_inds, annos['img'])
+
+        for i in range(num_obj):
+            filename = f'{image_idx}_{names[i]}_{i}.bin'
+            abs_filepath = osp.join(self.database_save_path, filename)
+            rel_filepath = osp.join(f'{self.info_prefix}_gt_database',
+                                    filename)
+
+            # save point clouds and image patches for each object
+            gt_points = points[point_indices[:, i]]
+            gt_points[:, :3] -= gt_boxes_3d[i, :3]
+
+            if self.with_mask:
+                if object_masks[i].sum() == 0 or not valid_inds[i]:
+                    # Skip object for empty or invalid mask
+                    continue
+                img_patch_path = abs_filepath + '.png'
+                mask_patch_path = abs_filepath + '.mask.png'
+                mmcv.imwrite(object_img_patches[i], img_patch_path)
+                mmcv.imwrite(object_masks[i], mask_patch_path)
+
+            with open(abs_filepath, 'w') as f:
+                gt_points.tofile(f)
+
+            if (self.used_classes is None) or names[i] in self.used_classes:
+                db_info = {
+                    'name': names[i],
+                    'path': rel_filepath,
+                    'image_idx': image_idx,
+                    'gt_idx': i,
+                    'box3d_lidar': gt_boxes_3d[i],
+                    'num_points_in_gt': gt_points.shape[0],
+                    'difficulty': difficulty[i],
+                }
+                local_group_id = group_ids[i]
+                # if local_group_id >= 0:
+                if local_group_id not in group_dict:
+                    group_dict[local_group_id] = group_counter
+                    group_counter += 1
+                db_info['group_id'] = group_dict[local_group_id]
+                if 'score' in annos:
+                    db_info['score'] = annos['score'][i]
+                if self.with_mask:
+                    db_info.update({'box2d_camera': gt_boxes[i]})
+                if names[i] in single_db_infos:
+                    single_db_infos[names[i]].append(db_info)
+                else:
+                    single_db_infos[names[i]] = [db_info]
+
+        return single_db_infos
+
+    def create(self):
+        print(f'Create GT Database of {self.dataset_class_name}')
+        dataset_cfg = dict(
+            type=self.dataset_class_name,
+            data_root=self.data_path,
+            ann_file=self.info_path)
+        if self.dataset_class_name == 'KittiDataset':
+            file_client_args = dict(backend='disk')
+            dataset_cfg.update(
+                test_mode=False,
+                split='training',
+                modality=dict(
+                    use_lidar=True,
+                    use_depth=False,
+                    use_lidar_intensity=True,
+                    use_camera=self.with_mask,
+                ),
+                pipeline=[
+                    dict(
+                        type='LoadPointsFromFile',
+                        coord_type='LIDAR',
+                        load_dim=4,
+                        use_dim=4,
+                        file_client_args=file_client_args),
+                    dict(
+                        type='LoadAnnotations3D',
+                        with_bbox_3d=True,
+                        with_label_3d=True,
+                        file_client_args=file_client_args)
+                ])
+
+        elif self.dataset_class_name == 'NuScenesDataset':
+            dataset_cfg.update(
+                use_valid_flag=True,
+                pipeline=[
+                    dict(
+                        type='LoadPointsFromFile',
+                        coord_type='LIDAR',
+                        load_dim=5,
+                        use_dim=5),
+                    dict(
+                        type='LoadPointsFromMultiSweeps',
+                        sweeps_num=10,
+                        use_dim=[0, 1, 2, 3, 4],
+                        pad_empty_sweeps=True,
+                        remove_close=True),
+                    dict(
+                        type='LoadAnnotations3D',
+                        with_bbox_3d=True,
+                        with_label_3d=True)
+                ])
+
+        elif self.dataset_class_name == 'WaymoDataset':
+            file_client_args = dict(backend='disk')
+            dataset_cfg.update(
+                test_mode=False,
+                split='training',
+                modality=dict(
+                    use_lidar=True,
+                    use_depth=False,
+                    use_lidar_intensity=True,
+                    use_camera=False,
+                ),
+                pipeline=[
+                    dict(
+                        type='LoadPointsFromFile',
+                        coord_type='LIDAR',
+                        load_dim=6,
+                        use_dim=6,
+                        file_client_args=file_client_args),
+                    dict(
+                        type='LoadAnnotations3D',
+                        with_bbox_3d=True,
+                        with_label_3d=True,
+                        file_client_args=file_client_args)
+                ])
+
+        dataset = build_dataset(dataset_cfg)
+        self.pipeline = dataset.pipeline
+        if self.database_save_path is None:
+            self.database_save_path = osp.join(
+                self.data_path, f'{self.info_prefix}_gt_database')
+        if self.db_info_save_path is None:
+            self.db_info_save_path = osp.join(
+                self.data_path, f'{self.info_prefix}_dbinfos_train.pkl')
+        mmcv.mkdir_or_exist(self.database_save_path)
+        if self.with_mask:
+            self.coco = COCO(osp.join(self.data_path, self.mask_anno_path))
+            imgIds = self.coco.getImgIds()
+            self.file2id = dict()
+            for i in imgIds:
+                info = self.coco.loadImgs([i])[0]
+                self.file2id.update({info['file_name']: i})
+
+        def loop_dataset(i):
+            input_dict = dataset.get_data_info(i)
+            dataset.pre_pipeline(input_dict)
+            return input_dict
+
+        multi_db_infos = mmcv.track_parallel_progress(
+            self.create_single, ((loop_dataset(i)
+                                  for i in range(len(dataset))), len(dataset)),
+            self.num_worker)
+        print('Make global unique group id')
+        group_counter_offset = 0
+        all_db_infos = dict()
+        for single_db_infos in track_iter_progress(multi_db_infos):
+            group_id = -1
+            for name, name_db_infos in single_db_infos.items():
+                for db_info in name_db_infos:
+                    group_id = max(group_id, db_info['group_id'])
+                    db_info['group_id'] += group_counter_offset
+                if name not in all_db_infos:
+                    all_db_infos[name] = []
+                all_db_infos[name].extend(name_db_infos)
+            group_counter_offset += (group_id + 1)
+
+        for k, v in all_db_infos.items():
+            print(f'load {len(v)} {k} database infos')
+
+        with open(self.db_info_save_path, 'wb') as f:
+            pickle.dump(all_db_infos, f)
--- a/tools/data_converter/kitti_converter.py
+++ b/tools/data_converter/kitti_converter.py
@@ -7,7 +7,7 @@ import numpy as np
 from nuscenes.utils.geometry_utils import view_points

 from mmdet3d.core.bbox import box_np_ops, points_cam2img
-from .kitti_data_utils import get_kitti_image_info, get_waymo_image_info
+from .kitti_data_utils import WaymoInfoGatherer, get_kitti_image_info
 from .nuscenes_converter import post_process_coords

 kitti_categories = ('Pedestrian', 'Cyclist', 'Car')
@@ -44,6 +44,75 @@ def _read_imageset_file(path):
    return [int(line) for line in lines]


+class _NumPointsInGTCalculater:
+    """Calculate the number of points inside the ground truth box. This is the
+    parallel version. For the serialized version, please refer to
+    `_calculate_num_points_in_gt`.
+
+    Args:
+        data_path (str): Path of the data.
+        relative_path (bool): Whether to use relative path.
+        remove_outside (bool, optional): Whether to remove points which are
+            outside of image. Default: True.
+        num_features (int, optional): Number of features per point.
+            Default: False.
+        num_worker (int, optional): the number of parallel workers to use.
+            Default: 8.
+    """
+
+    def __init__(self,
+                 data_path,
+                 relative_path,
+                 remove_outside=True,
+                 num_features=4,
+                 num_worker=8) -> None:
+        self.data_path = data_path
+        self.relative_path = relative_path
+        self.remove_outside = remove_outside
+        self.num_features = num_features
+        self.num_worker = num_worker
+
+    def calculate_single(self, info):
+        pc_info = info['point_cloud']
+        image_info = info['image']
+        calib = info['calib']
+        if self.relative_path:
+            v_path = str(Path(self.data_path) / pc_info['velodyne_path'])
+        else:
+            v_path = pc_info['velodyne_path']
+        points_v = np.fromfile(
+            v_path, dtype=np.float32,
+            count=-1).reshape([-1, self.num_features])
+        rect = calib['R0_rect']
+        Trv2c = calib['Tr_velo_to_cam']
+        P2 = calib['P2']
+        if self.remove_outside:
+            points_v = box_np_ops.remove_outside_points(
+                points_v, rect, Trv2c, P2, image_info['image_shape'])
+        annos = info['annos']
+        num_obj = len([n for n in annos['name'] if n != 'DontCare'])
+        dims = annos['dimensions'][:num_obj]
+        loc = annos['location'][:num_obj]
+        rots = annos['rotation_y'][:num_obj]
+        gt_boxes_camera = np.concatenate([loc, dims, rots[..., np.newaxis]],
+                                         axis=1)
+        gt_boxes_lidar = box_np_ops.box_camera_to_lidar(
+            gt_boxes_camera, rect, Trv2c)
+        indices = box_np_ops.points_in_rbbox(points_v[:, :3], gt_boxes_lidar)
+        num_points_in_gt = indices.sum(0)
+        num_ignored = len(annos['dimensions']) - num_obj
+        num_points_in_gt = np.concatenate(
+            [num_points_in_gt, -np.ones([num_ignored])])
+        annos['num_points_in_gt'] = num_points_in_gt.astype(np.int32)
+        return info
+
+    def calculate(self, infos):
+        ret_infos = mmcv.track_parallel_progress(self.calculate_single, infos,
+                                                 self.num_worker)
+        for i, ret_info in enumerate(ret_infos):
+            infos[i] = ret_info
+
+
 def _calculate_num_points_in_gt(data_path,
                                infos,
                                relative_path,
@@ -161,7 +230,8 @@ def create_waymo_info_file(data_path,
                           pkl_prefix='waymo',
                           save_path=None,
                           relative_path=True,
-                           max_sweeps=5):
+                           max_sweeps=5,
+                           workers=8):
    """Create info file of waymo dataset.

    Given the raw data, generate its related info file in pkl format.
@@ -187,55 +257,46 @@ def create_waymo_info_file(data_path,
        save_path = Path(data_path)
    else:
        save_path = Path(save_path)
-    waymo_infos_train = get_waymo_image_info(
+    waymo_infos_gatherer_trainval = WaymoInfoGatherer(
        data_path,
        training=True,
        velodyne=True,
        calib=True,
        pose=True,
-        image_ids=train_img_ids,
        relative_path=relative_path,
-        max_sweeps=max_sweeps)
-    _calculate_num_points_in_gt(
-        data_path,
-        waymo_infos_train,
-        relative_path,
-        num_features=6,
-        remove_outside=False)
-    filename = save_path / f'{pkl_prefix}_infos_train.pkl'
-    print(f'Waymo info train file is saved to {filename}')
-    mmcv.dump(waymo_infos_train, filename)
-    waymo_infos_val = get_waymo_image_info(
+        max_sweeps=max_sweeps,
+        num_worker=workers)
+    waymo_infos_gatherer_test = WaymoInfoGatherer(
        data_path,
-        training=True,
+        training=False,
+        label_info=False,
        velodyne=True,
        calib=True,
        pose=True,
-        image_ids=val_img_ids,
        relative_path=relative_path,
-        max_sweeps=max_sweeps)
-    _calculate_num_points_in_gt(
+        max_sweeps=max_sweeps,
+        num_worker=workers)
+    num_points_in_gt_calculater = _NumPointsInGTCalculater(
        data_path,
-        waymo_infos_val,
        relative_path,
        num_features=6,
-        remove_outside=False)
+        remove_outside=False,
+        num_worker=workers)
+
+    waymo_infos_train = waymo_infos_gatherer_trainval.gather(train_img_ids)
+    num_points_in_gt_calculater.calculate(waymo_infos_train)
+    filename = save_path / f'{pkl_prefix}_infos_train.pkl'
+    print(f'Waymo info train file is saved to {filename}')
+    mmcv.dump(waymo_infos_train, filename)
+    waymo_infos_val = waymo_infos_gatherer_trainval.gather(val_img_ids)
+    num_points_in_gt_calculater.calculate(waymo_infos_val)
    filename = save_path / f'{pkl_prefix}_infos_val.pkl'
    print(f'Waymo info val file is saved to {filename}')
    mmcv.dump(waymo_infos_val, filename)
    filename = save_path / f'{pkl_prefix}_infos_trainval.pkl'
    print(f'Waymo info trainval file is saved to {filename}')
    mmcv.dump(waymo_infos_train + waymo_infos_val, filename)
-    waymo_infos_test = get_waymo_image_info(
-        data_path,
-        training=False,
-        label_info=False,
-        velodyne=True,
-        calib=True,
-        pose=True,
-        image_ids=test_img_ids,
-        relative_path=relative_path,
-        max_sweeps=max_sweeps)
+    waymo_infos_test = waymo_infos_gatherer_test.gather(test_img_ids)
    filename = save_path / f'{pkl_prefix}_infos_test.pkl'
    print(f'Waymo info test file is saved to {filename}')
    mmcv.dump(waymo_infos_test, filename)

--- a/tools/data_converter/kitti_data_utils.py
+++ b/tools/data_converter/kitti_data_utils.py
@@ -6,6 +6,7 @@ from pathlib import Path

 import mmcv
 import numpy as np
+from PIL import Image
 from skimage import io


@@ -102,6 +103,16 @@ def get_pose_path(idx,
                               relative_path, exist_check, use_prefix_id)


+def get_timestamp_path(idx,
+                       prefix,
+                       training=True,
+                       relative_path=True,
+                       exist_check=True,
+                       use_prefix_id=False):
+    return get_kitti_info_path(idx, prefix, 'timestamp', '.txt', training,
+                               relative_path, exist_check, use_prefix_id)
+
+
 def get_label_anno(label_path):
    annotations = {}
    annotations.update({
@@ -283,19 +294,9 @@ def get_kitti_image_info(path,
    return list(image_infos)


-def get_waymo_image_info(path,
-                         training=True,
-                         label_info=True,
-                         velodyne=False,
-                         calib=False,
-                         pose=False,
-                         image_ids=7481,
-                         extend_matrix=True,
-                         num_worker=8,
-                         relative_path=True,
-                         with_imageshape=True,
-                         max_sweeps=5):
+class WaymoInfoGatherer:
    """
+    Parallel version of waymo dataset information gathering.
    Waymo annotation format version like KITTI:
    {
        [optional]points: [N, 3+] point cloud
@@ -323,54 +324,88 @@ def get_waymo_image_info(path,
        }
    }
    """
-    root_path = Path(path)
-    if not isinstance(image_ids, list):
-        image_ids = list(range(image_ids))

-    def map_func(idx):
+    def __init__(self,
+                 path,
+                 training=True,
+                 label_info=True,
+                 velodyne=False,
+                 calib=False,
+                 pose=False,
+                 extend_matrix=True,
+                 num_worker=8,
+                 relative_path=True,
+                 with_imageshape=True,
+                 max_sweeps=5) -> None:
+        self.path = path
+        self.training = training
+        self.label_info = label_info
+        self.velodyne = velodyne
+        self.calib = calib
+        self.pose = pose
+        self.extend_matrix = extend_matrix
+        self.num_worker = num_worker
+        self.relative_path = relative_path
+        self.with_imageshape = with_imageshape
+        self.max_sweeps = max_sweeps
+
+    def gather_single(self, idx):
+        root_path = Path(self.path)
        info = {}
        pc_info = {'num_features': 6}
        calib_info = {}

        image_info = {'image_idx': idx}
        annotations = None
-        if velodyne:
+        if self.velodyne:
            pc_info['velodyne_path'] = get_velodyne_path(
-                idx, path, training, relative_path, use_prefix_id=True)
-            points = np.fromfile(
-                Path(path) / pc_info['velodyne_path'], dtype=np.float32)
-            points = np.copy(points).reshape(-1, pc_info['num_features'])
-            info['timestamp'] = np.int64(points[0, -1])
-            # values of the last dim are all the timestamp
+                idx,
+                self.path,
+                self.training,
+                self.relative_path,
+                use_prefix_id=True)
+            with open(
+                    get_timestamp_path(
+                        idx,
+                        self.path,
+                        self.training,
+                        relative_path=False,
+                        use_prefix_id=True)) as f:
+                info['timestamp'] = np.int64(f.read())
        image_info['image_path'] = get_image_path(
            idx,
-            path,
-            training,
-            relative_path,
+            self.path,
+            self.training,
+            self.relative_path,
            info_type='image_0',
            use_prefix_id=True)
-        if with_imageshape:
+        if self.with_imageshape:
            img_path = image_info['image_path']
-            if relative_path:
+            if self.relative_path:
                img_path = str(root_path / img_path)
-            image_info['image_shape'] = np.array(
-                io.imread(img_path).shape[:2], dtype=np.int32)
-        if label_info:
+            # io using PIL is significantly faster than skimage
+            w, h = Image.open(img_path).size
+            image_info['image_shape'] = np.array((h, w), dtype=np.int32)
+        if self.label_info:
            label_path = get_label_path(
                idx,
-                path,
-                training,
-                relative_path,
+                self.path,
+                self.training,
+                self.relative_path,
                info_type='label_all',
                use_prefix_id=True)
-            if relative_path:
+            if self.relative_path:
                label_path = str(root_path / label_path)
            annotations = get_label_anno(label_path)
        info['image'] = image_info
        info['point_cloud'] = pc_info
-        if calib:
+        if self.calib:
            calib_path = get_calib_path(
-                idx, path, training, relative_path=False, use_prefix_id=True)
+                idx,
+                self.path,
+                self.training,
+                relative_path=False,
+                use_prefix_id=True)
            with open(calib_path, 'r') as f:
                lines = f.readlines()
            P0 = np.array([float(info) for info in lines[0].split(' ')[1:13]
@@ -383,7 +418,7 @@ def get_waymo_image_info(path,
                           ]).reshape([3, 4])
            P4 = np.array([float(info) for info in lines[4].split(' ')[1:13]
                           ]).reshape([3, 4])
-            if extend_matrix:
+            if self.extend_matrix:
                P0 = _extend_matrix(P0)
                P1 = _extend_matrix(P1)
                P2 = _extend_matrix(P2)
@@ -392,7 +427,7 @@ def get_waymo_image_info(path,
            R0_rect = np.array([
                float(info) for info in lines[5].split(' ')[1:10]
            ]).reshape([3, 3])
-            if extend_matrix:
+            if self.extend_matrix:
                rect_4x4 = np.zeros([4, 4], dtype=R0_rect.dtype)
                rect_4x4[3, 3] = 1.
                rect_4x4[:3, :3] = R0_rect
@@ -402,7 +437,7 @@ def get_waymo_image_info(path,
            Tr_velo_to_cam = np.array([
                float(info) for info in lines[6].split(' ')[1:13]
            ]).reshape([3, 4])
-            if extend_matrix:
+            if self.extend_matrix:
                Tr_velo_to_cam = _extend_matrix(Tr_velo_to_cam)
            calib_info['P0'] = P0
            calib_info['P1'] = P1
@@ -412,9 +447,13 @@ def get_waymo_image_info(path,
            calib_info['R0_rect'] = rect_4x4
            calib_info['Tr_velo_to_cam'] = Tr_velo_to_cam
            info['calib'] = calib_info
-        if pose:
+        if self.pose:
            pose_path = get_pose_path(
-                idx, path, training, relative_path=False, use_prefix_id=True)
+                idx,
+                self.path,
+                self.training,
+                relative_path=False,
+                use_prefix_id=True)
            info['pose'] = np.loadtxt(pose_path)

        if annotations is not None:
@@ -424,28 +463,31 @@ def get_waymo_image_info(path,

        sweeps = []
        prev_idx = idx
-        while len(sweeps) < max_sweeps:
+        while len(sweeps) < self.max_sweeps:
            prev_info = {}
            prev_idx -= 1
            prev_info['velodyne_path'] = get_velodyne_path(
                prev_idx,
-                path,
-                training,
-                relative_path,
+                self.path,
+                self.training,
+                self.relative_path,
                exist_check=False,
                use_prefix_id=True)
            if_prev_exists = osp.exists(
-                Path(path) / prev_info['velodyne_path'])
+                Path(self.path) / prev_info['velodyne_path'])
            if if_prev_exists:
-                prev_points = np.fromfile(
-                    Path(path) / prev_info['velodyne_path'], dtype=np.float32)
-                prev_points = np.copy(prev_points).reshape(
-                    -1, pc_info['num_features'])
-                prev_info['timestamp'] = np.int64(prev_points[0, -1])
+                with open(
+                        get_timestamp_path(
+                            prev_idx,
+                            self.path,
+                            self.training,
+                            relative_path=False,
+                            use_prefix_id=True)) as f:
+                    prev_info['timestamp'] = np.int64(f.read())
                prev_pose_path = get_pose_path(
                    prev_idx,
-                    path,
-                    training,
+                    self.path,
+                    self.training,
                    relative_path=False,
                    use_prefix_id=True)
                prev_info['pose'] = np.loadtxt(prev_pose_path)
@@ -456,10 +498,12 @@ def get_waymo_image_info(path,

        return info

-    with futures.ThreadPoolExecutor(num_worker) as executor:
-        image_infos = executor.map(map_func, image_ids)
-
-    return list(image_infos)
+    def gather(self, image_ids):
+        if not isinstance(image_ids, list):
+            image_ids = list(range(image_ids))
+        image_infos = mmcv.track_parallel_progress(self.gather_single,
+                                                   image_ids, self.num_worker)
+        return list(image_infos)


 def kitti_anno_to_label_file(annos, folder):

--- a/tools/data_converter/waymo_converter.py
+++ b/tools/data_converter/waymo_converter.py
@@ -87,6 +87,7 @@ class Waymo2KITTI(object):
        self.calib_save_dir = f'{self.save_dir}/calib'
        self.point_cloud_save_dir = f'{self.save_dir}/velodyne'
        self.pose_save_dir = f'{self.save_dir}/pose'
+        self.timestamp_save_dir = f'{self.save_dir}/timestamp'

        self.create_folder()

@@ -119,6 +120,7 @@ class Waymo2KITTI(object):
            self.save_calib(frame, file_idx, frame_idx)
            self.save_lidar(frame, file_idx, frame_idx)
            self.save_pose(frame, file_idx, frame_idx)
+            self.save_timestamp(frame, file_idx, frame_idx)

            if not self.test_mode:
                self.save_label(frame, file_idx, frame_idx)
@@ -210,7 +212,7 @@ class Waymo2KITTI(object):
            parse_range_image_and_camera_projection(frame)

        # First return
-        points_0, cp_points_0, intensity_0, elongation_0 = \
+        points_0, cp_points_0, intensity_0, elongation_0, mask_indices_0 = \
            self.convert_range_image_to_point_cloud(
                frame,
                range_images,
@@ -221,9 +223,10 @@ class Waymo2KITTI(object):
        points_0 = np.concatenate(points_0, axis=0)
        intensity_0 = np.concatenate(intensity_0, axis=0)
        elongation_0 = np.concatenate(elongation_0, axis=0)
+        mask_indices_0 = np.concatenate(mask_indices_0, axis=0)

        # Second return
-        points_1, cp_points_1, intensity_1, elongation_1 = \
+        points_1, cp_points_1, intensity_1, elongation_1, mask_indices_1 = \
            self.convert_range_image_to_point_cloud(
                frame,
                range_images,
@@ -234,15 +237,18 @@ class Waymo2KITTI(object):
        points_1 = np.concatenate(points_1, axis=0)
        intensity_1 = np.concatenate(intensity_1, axis=0)
        elongation_1 = np.concatenate(elongation_1, axis=0)
+        mask_indices_1 = np.concatenate(mask_indices_1, axis=0)

        points = np.concatenate([points_0, points_1], axis=0)
        intensity = np.concatenate([intensity_0, intensity_1], axis=0)
        elongation = np.concatenate([elongation_0, elongation_1], axis=0)
-        timestamp = frame.timestamp_micros * np.ones_like(intensity)
+        mask_indices = np.concatenate([mask_indices_0, mask_indices_1], axis=0)
+
+        # timestamp = frame.timestamp_micros * np.ones_like(intensity)

        # concatenate x,y,z, intensity, elongation, timestamp (6-dim)
        point_cloud = np.column_stack(
-            (points, intensity, elongation, timestamp))
+            (points, intensity, elongation, mask_indices))

        pc_path = f'{self.point_cloud_save_dir}/{self.prefix}' + \
            f'{str(file_idx).zfill(3)}{str(frame_idx).zfill(3)}.bin'
@@ -367,18 +373,39 @@ class Waymo2KITTI(object):
                 f'{str(file_idx).zfill(3)}{str(frame_idx).zfill(3)}.txt'),
            pose)

+    def save_timestamp(self, frame, file_idx, frame_idx):
+        """Save the timestamp data in a separate file instead of the
+        pointcloud.
+
+        Note that SDC's own pose is not included in the regular training
+        of KITTI dataset. KITTI raw dataset contains ego motion files
+        but are not often used. Pose is important for algorithms that
+        take advantage of the temporal information.
+
+        Args:
+            frame (:obj:`Frame`): Open dataset frame proto.
+            file_idx (int): Current file index.
+            frame_idx (int): Current frame index.
+        """
+        with open(
+                join(f'{self.timestamp_save_dir}/{self.prefix}' +
+                     f'{str(file_idx).zfill(3)}{str(frame_idx).zfill(3)}.txt'),
+                'w') as f:
+            f.write(str(frame.timestamp_micros))
+
    def create_folder(self):
        """Create folder for data preprocessing."""
        if not self.test_mode:
            dir_list1 = [
                self.label_all_save_dir, self.calib_save_dir,
-                self.point_cloud_save_dir, self.pose_save_dir
+                self.point_cloud_save_dir, self.pose_save_dir,
+                self.timestamp_save_dir
            ]
            dir_list2 = [self.label_save_dir, self.image_save_dir]
        else:
            dir_list1 = [
                self.calib_save_dir, self.point_cloud_save_dir,
-                self.pose_save_dir
+                self.pose_save_dir, self.timestamp_save_dir
            ]
            dir_list2 = [self.image_save_dir]
        for d in dir_list1:
@@ -409,7 +436,9 @@ class Waymo2KITTI(object):
        Returns:
            tuple[list[np.ndarray]]: (List of points with shape [N, 3],
                camera projections of points with shape [N, 6], intensity
-                with shape [N, 1], elongation with shape [N, 1]). All the
+                with shape [N, 1], elongation with shape [N, 1], points'
+                position in the depth map (element offset if points come from
+                the main lidar otherwise -1) with shape[N, 1]). All the
                lists have the length of lidar numbers (5).
        """
        calibrations = sorted(
@@ -418,6 +447,7 @@ class Waymo2KITTI(object):
        cp_points = []
        intensity = []
        elongation = []
+        mask_indices = []

        frame_pose = tf.convert_to_tensor(
            value=np.reshape(np.array(frame.pose.transform), [4, 4]))
@@ -473,27 +503,36 @@ class Waymo2KITTI(object):
                    pixel_pose=pixel_pose_local,
                    frame_pose=frame_pose_local)

+            mask_index = tf.where(range_image_mask)
+
            range_image_cartesian = tf.squeeze(range_image_cartesian, axis=0)
-            points_tensor = tf.gather_nd(range_image_cartesian,
-                                         tf.compat.v1.where(range_image_mask))
+            points_tensor = tf.gather_nd(range_image_cartesian, mask_index)

            cp = camera_projections[c.name][ri_index]
            cp_tensor = tf.reshape(
                tf.convert_to_tensor(value=cp.data), cp.shape.dims)
-            cp_points_tensor = tf.gather_nd(
-                cp_tensor, tf.compat.v1.where(range_image_mask))
+            cp_points_tensor = tf.gather_nd(cp_tensor, mask_index)
            points.append(points_tensor.numpy())
            cp_points.append(cp_points_tensor.numpy())

            intensity_tensor = tf.gather_nd(range_image_tensor[..., 1],
-                                            tf.where(range_image_mask))
+                                            mask_index)
            intensity.append(intensity_tensor.numpy())

            elongation_tensor = tf.gather_nd(range_image_tensor[..., 2],
-                                             tf.where(range_image_mask))
+                                             mask_index)
            elongation.append(elongation_tensor.numpy())
+            if c.name == 1:
+                mask_index = (ri_index * range_image_mask.shape[0] +
+                              mask_index[:, 0]
+                              ) * range_image_mask.shape[1] + mask_index[:, 1]
+                mask_index = mask_index.numpy().astype(elongation[-1].dtype)
+            else:
+                mask_index = np.full_like(elongation[-1], -1)
+
+            mask_indices.append(mask_index)

-        return points, cp_points, intensity, elongation
+        return points, cp_points, intensity, elongation, mask_indices

    def cart_to_homo(self, mat):
        """Convert transformation matrix in Cartesian coordinates to