Unverified Commit 32567b04 authored by Shaoshuai Shi's avatar Shaoshuai Shi Committed by GitHub

Merge pull request #192 from sshaoshuai/master

Release OpenPCDet v0.3.0
parents 853b759b 04e0d4f0
...@@ -2,11 +2,35 @@
# OpenPCDet

`OpenPCDet` is a clear, simple, self-contained open source project for LiDAR-based 3D object detection.

It is also the official code release of [`[PointRCNN]`](https://arxiv.org/abs/1812.04244), [`[Part-A^2 net]`](https://arxiv.org/abs/1907.03670) and [`[PV-RCNN]`](https://arxiv.org/abs/1912.13192).
## Overview
- [Changelog](#changelog)
- [Design Pattern](#openpcdet-design-pattern)
- [Model Zoo](#model-zoo)
- [Installation](docs/INSTALL.md)
- [Quick Demo](docs/DEMO.md)
- [Getting Started](docs/GETTING_STARTED.md)
- [Citation](#citation)
## Changelog
[2020-07-30] **NEW:** `OpenPCDet` v0.3.0 is released with the following features:
* The Point-based and Anchor-Free models ([`PointRCNN`](#KITTI-3D-Object-Detection-Baselines), [`PartA2-Free`](#KITTI-3D-Object-Detection-Baselines)) are now supported.
* The NuScenes dataset is supported with strong baseline results ([`SECOND-MultiHead (CBGS)`](#NuScenes-3D-Object-Detection-Baselines) and [`PointPillar-MultiHead`](#NuScenes-3D-Object-Detection-Baselines)).
* Higher efficiency than the last version; `PyTorch 1.1~1.5` and `spconv 1.0~1.2` are supported simultaneously.
[2020-07-17] Add simple visualization codes and a quick demo to test with custom data.
[2020-06-24] `OpenPCDet` v0.2.0 is released with pretty new structures to support more models and datasets.
[2020-03-16] `OpenPCDet` v0.1.0 is released.
## Introduction
### What does `OpenPCDet` toolbox do?
...@@ -54,44 +78,45 @@ Contributions are also welcomed.
- [x] Support GPU version 3D IoU calculation and rotated NMS
## Model Zoo

### KITTI 3D Object Detection Baselines
Selected supported methods are shown in the table below. The results are the 3D detection performance of moderate difficulty on the *val* set of the KITTI dataset.
* All models are trained with 8 GTX 1080Ti GPUs and are available for download.
* The training time is measured with 8 TITAN XP GPUs and PyTorch 1.5.
| | training time | Car | Pedestrian | Cyclist | download |
|---------------------------------------------|----------:|:-------:|:-------:|:-------:|:---------:|
| [PointPillar](tools/cfgs/kitti_models/pointpillar.yaml) |~1.2 hours| 77.28 | 52.29 | 62.68 | [model-18M](https://drive.google.com/file/d/1wMxWTpU1qUoY3DsCH31WJmvJxcjFXKlm/view?usp=sharing) |
| [SECOND](tools/cfgs/kitti_models/second.yaml) | ~1.7 hours | 78.62 | 52.98 | 67.15 | [model-20M](https://drive.google.com/file/d/1-01zsPOsqanZQqIIyy7FpNXStL3y4jdR/view?usp=sharing) |
| [PointRCNN](tools/cfgs/kitti_models/pointrcnn.yaml) | ~3 hours | 78.70 | 54.41 | 72.11 | [model-16M](https://drive.google.com/file/d/1BCX9wMn-GYAfSOPpyxf6Iv6fc0qKLSiU/view?usp=sharing)|
| [PointRCNN-IoU](tools/cfgs/kitti_models/pointrcnn_iou.yaml) | ~3 hours | 78.75 | 58.32 | 71.34 | [model-16M](https://drive.google.com/file/d/1V0vNZ3lAHpEEt0MlT80eL2f41K2tHm_D/view?usp=sharing)|
| [Part-A^2-Free](tools/cfgs/kitti_models/PartA2_free.yaml) | ~3.8 hours| 78.72 | 65.99 | 74.29 | [model-226M](https://drive.google.com/file/d/1lcUUxF8mJgZ_e-tZhP1XNQtTBuC-R0zr/view?usp=sharing) |
| [Part-A^2-Anchor](tools/cfgs/kitti_models/PartA2.yaml) | ~4.3 hours| 79.40 | 60.05 | 69.90 | [model-244M](https://drive.google.com/file/d/10GK1aCkLqxGNeX3lVu8cLZyE0G8002hY/view?usp=sharing) |
| [PV-RCNN](tools/cfgs/kitti_models/pv_rcnn.yaml) | ~5 hours| 83.61 | 57.90 | 70.47 | [model-50M](https://drive.google.com/file/d/1lIOq4Hxr0W3qsX83ilQv0nk1Cls6KAr-/view?usp=sharing) |
### NuScenes 3D Object Detection Baselines
All models are trained with 8 GTX 1080Ti GPUs and are available for download.
|                                             | mATE | mASE | mAOE | mAVE | mAAE | mAP | NDS | download |
|---------------------------------------------|----------:|:-------:|:-------:|:-------:|:---------:|:-------:|:-------:|:---------:|
| [PointPillar-MultiHead](tools/cfgs/nuscenes_models/cbgs_pp_multihead.yaml) | 33.87 | 26.00 | 32.07 | 28.74 | 20.15 | 44.63 | 58.23 | [model-23M](https://drive.google.com/file/d/1fnxKTUi79dSARhsREXR_UKnWs-83bgEV/view?usp=sharing) |
| [SECOND-MultiHead (CBGS)](tools/cfgs/nuscenes_models/cbgs_second_multihead.yaml) | 31.15 | 25.51 | 26.64 | 26.26 | 20.46 | 50.59 | 62.29 | [model-35M](https://drive.google.com/file/d/1s34D8g-h65qDyoYbgCraxcZQwinbxhaY/view?usp=sharing) |
### Other datasets
More datasets are on the way.
## Installation

Please refer to [INSTALL.md](docs/INSTALL.md) for the installation of `OpenPCDet`.
## Quick Demo
Please refer to [DEMO.md](docs/DEMO.md) for a quick demo to test with a pretrained model and
visualize the predicted results on your custom data or the original KITTI data.
## Getting Started

Please refer to [GETTING_STARTED.md](docs/GETTING_STARTED.md) to learn more about the usage of this project.
......
# Quick Demo
Here we provide a quick demo to test a pretrained model on custom point cloud data and visualize the predicted results.
......
# Getting Started
The dataset configs are located within [tools/cfgs/dataset_configs](../tools/cfgs/dataset_configs),
and the model configs are located within [tools/cfgs](../tools/cfgs) for different datasets.
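If you want to poke at a config programmatically, here is a minimal sketch. It assumes the `cfg` / `cfg_from_yaml_file` helpers that the `tools` scripts import from `pcdet.config`, and the exact field names shown are illustrative:

```python
# Minimal sketch: load and inspect a model config from the repo root.
# Assumes `pcdet` has been installed via `python setup.py develop` and that
# pcdet.config exposes `cfg` and `cfg_from_yaml_file` as used by the tools scripts.
from pcdet.config import cfg, cfg_from_yaml_file

cfg_from_yaml_file('tools/cfgs/kitti_models/pv_rcnn.yaml', cfg)
print(cfg.MODEL.NAME)           # detector name, e.g. PVRCNN
print(cfg.DATA_CONFIG.DATASET)  # dataset class the config trains on
```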
## Dataset Preparation
Currently we provide dataloaders for the KITTI and NuScenes datasets, and support for more datasets is on the way.
### KITTI Dataset
* Please download the official [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset and organize the downloaded files as follows (the road planes can be downloaded from [[road plane]](https://drive.google.com/file/d/1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp/view?usp=sharing); they are optional and only used for data augmentation during training):
* NOTE: if you already have the data infos from `pcdet v0.1`, you can either keep the old infos and set the `DATABASE_WITH_FAKELIDAR` option in `tools/cfgs/dataset_configs/kitti_dataset.yaml` to `True`, or create the infos and gt database again and leave the config unchanged.
```
OpenPCDet
├── data
│ ├── kitti
│ │ │── ImageSets
│ │ │── training
│ │ │ ├──calib & velodyne & label_2 & image_2 & (optional: planes)
│ │ │── testing
│ │ │ ├──calib & velodyne & image_2
├── pcdet
├── tools
```
* Generate the data infos by running the following command:
```shell script
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml
```
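The command writes pickled info files plus a gt database for augmentation under `data/kitti`. The infos are plain pickled lists of dicts, so a quick sanity check is to load one back; the file name below is assumed from the default output of `create_kitti_infos`:

```python
# Hedged sanity check: load the generated KITTI infos back.
# The path and file name assume the default output location of create_kitti_infos.
import pickle

with open('data/kitti/kitti_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)

print('num samples:', len(infos))
print('keys of first sample:', list(infos[0].keys()))
```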
### NuScenes Dataset
* Please download the official [NuScenes 3D object detection dataset](https://www.nuscenes.org/download) and
organize the downloaded files as follows:
```
OpenPCDet
├── data
│ ├── nuscenes
│ │ │── v1.0-trainval (or v1.0-mini if you use mini)
│ │ │ │── samples
│ │ │ │── sweeps
│ │ │ │── maps
│ │ │ │── v1.0-trainval
├── pcdet
├── tools
```
* Install the `nuscenes-devkit` with version `1.0.5` by running the following command:
```shell script
pip install nuscenes-devkit==1.0.5
```
* Generate the data infos by running the following command (it may take several hours):
```shell script
python -m pcdet.datasets.nuscenes.nuscenes_dataset --func create_nuscenes_infos \
--cfg_file tools/cfgs/dataset_configs/nuscenes_dataset.yaml \
--version v1.0-trainval
```
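As with KITTI, the generated infos are pickled lists of per-sample dicts; the dataset code in this release writes them as `nuscenes_infos_{max_sweeps}sweeps_train.pkl` / `..._val.pkl` under the version directory. A quick check, assuming the default `MAX_SWEEPS: 10`:

```python
# Hedged sanity check for the generated NuScenes infos.
# The file name assumes MAX_SWEEPS=10 in tools/cfgs/dataset_configs/nuscenes_dataset.yaml.
import pickle

with open('data/nuscenes/v1.0-trainval/nuscenes_infos_10sweeps_train.pkl', 'rb') as f:
    infos = pickle.load(f)

print('num samples:', len(infos))
print('sweeps attached to first sample:', len(infos[0]['sweeps']))
```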
## Training & Testing
### Test and evaluate the pretrained models
...@@ -17,34 +73,30 @@ python test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --eval_all
* To test with multiple GPUs:
```shell script
sh scripts/dist_test.sh ${NUM_GPUS} \
    --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE}

# or

sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_GPUS} \
    --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE}
```
### Train a model
You can optionally add extra command-line parameters `--batch_size ${BATCH_SIZE}` and `--epochs ${EPOCHS}` to specify your preferred parameters.

* Train with multiple GPUs or multiple machines:
```shell script
sh scripts/dist_train.sh ${NUM_GPUS} --cfg_file ${CONFIG_FILE}

# or

sh scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} ${NUM_GPUS} --cfg_file ${CONFIG_FILE}
```

* Train with a single GPU:
```shell script
python train.py --cfg_file ${CONFIG_FILE}
```
# Installation

### Requirements
All the code has been tested in the following environment:
* Linux (tested on Ubuntu 14.04/16.04)
* Python 3.6+
* PyTorch 1.1 or higher (tested on PyTorch 1.1, 1.3, 1.5)
* CUDA 9.0 or higher (PyTorch 1.3+ needs CUDA 9.2+)
* [`spconv v1.0 (commit 8da6f96)`](https://github.com/traveller59/spconv/tree/8da6f967fb9a054d8870c3515b1b44eca2103634) or [`spconv v1.2`](https://github.com/traveller59/spconv)
### Install `pcdet v0.3`
NOTE: Please re-install `pcdet v0.3` by running `python setup.py develop` even if you have already installed a previous version.
a. Clone this repository.
```shell
...@@ -24,36 +24,11 @@ b. Install the dependent libraries as follows:
pip install -r requirements.txt
```
* Install the SparseConv library; we use the implementation from [`[spconv]`](https://github.com/traveller59/spconv).
    * If you use PyTorch 1.1, make sure you install `spconv v1.0` ([commit 8da6f96](https://github.com/traveller59/spconv/tree/8da6f967fb9a054d8870c3515b1b44eca2103634)) instead of the latest one.
    * If you use PyTorch 1.3+, you need to install `spconv v1.2`. As mentioned by the author of [`spconv`](https://github.com/traveller59/spconv), you need to use their docker image if you use PyTorch 1.4+ (a quick import check follows this list).
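A quick way to confirm which `spconv` flavor you ended up with is the same import fallback that `pcdet`'s data processor uses in this release (see the `data_processor.py` change further down in this diff):

```python
# Check the installed spconv flavor: v1.2 renamed the voxel generator class.
try:
    from spconv.utils import VoxelGeneratorV2 as VoxelGenerator  # spconv v1.2
    print('spconv v1.2 detected')
except ImportError:
    from spconv.utils import VoxelGenerator  # spconv v1.0
    print('spconv v1.0 detected')
```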
c. Install this `pcdet` library by running the following command:
```shell
python setup.py develop
```
import subprocess
from pathlib import Path

from .version import __version__
...@@ -22,4 +22,3 @@ script_version = get_git_commit_number()
if script_version not in __version__:
__version__ = __version__ + '+py%s' % script_version
from pathlib import Path

import yaml
from easydict import EasyDict


def log_config_to_file(cfg, pre='cfg', logger=None):
......
import torch
from torch.utils.data import DataLoader
from torch.utils.data import DistributedSampler as _DistributedSampler

from pcdet.utils import common_utils

from .dataset import DatasetTemplate
from .kitti.kitti_dataset import KittiDataset
from .nuscenes.nuscenes_dataset import NuScenesDataset

__all__ = {
'DatasetTemplate': DatasetTemplate,
'KittiDataset': KittiDataset,
'NuScenesDataset': NuScenesDataset
}


class DistributedSampler(_DistributedSampler):
def __init__(self, dataset, num_replicas=None, rank=None, shuffle=True):
......
import numpy as np

from ...utils import common_utils


def random_flip_along_x(gt_boxes, points):
"""
Args:
gt_boxes: (N, 7 + C), [x, y, z, dx, dy, dz, heading, [vx], [vy]]
points: (M, 3 + C)
Returns:
"""
...@@ -14,13 +15,17 @@ def random_flip_along_x(gt_boxes, points):
gt_boxes[:, 1] = -gt_boxes[:, 1]
gt_boxes[:, 6] = -gt_boxes[:, 6]
points[:, 1] = -points[:, 1]
if gt_boxes.shape[1] > 7:  # boxes carry extra (vx, vy) velocity channels
gt_boxes[:, 8] = -gt_boxes[:, 8]  # flip vy together with the y-axis
return gt_boxes, points


def random_flip_along_y(gt_boxes, points):
"""
Args:
gt_boxes: (N, 7 + C), [x, y, z, dx, dy, dz, heading, [vx], [vy]]
points: (M, 3 + C)
Returns:
"""
...@@ -29,13 +34,17 @@ def random_flip_along_y(gt_boxes, points):
gt_boxes[:, 0] = -gt_boxes[:, 0]
gt_boxes[:, 6] = -(gt_boxes[:, 6] + np.pi)
points[:, 0] = -points[:, 0]
if gt_boxes.shape[1] > 7:  # boxes carry extra (vx, vy) velocity channels
gt_boxes[:, 7] = -gt_boxes[:, 7]  # flip vx together with the x-axis
return gt_boxes, points


def global_rotation(gt_boxes, points, rot_range):
"""
Args:
gt_boxes: (N, 7 + C), [x, y, z, dx, dy, dz, heading, [vx], [vy]]
points: (M, 3 + C),
rot_range: [min, max]
Returns:
...@@ -44,6 +53,12 @@ def global_rotation(gt_boxes, points, rot_range):
points = common_utils.rotate_points_along_z(points[np.newaxis, :, :], np.array([noise_rotation]))[0]
gt_boxes[:, 0:3] = common_utils.rotate_points_along_z(gt_boxes[np.newaxis, :, 0:3], np.array([noise_rotation]))[0]
gt_boxes[:, 6] += noise_rotation
if gt_boxes.shape[1] > 7:  # rotate the (vx, vy) velocity vector as well
gt_boxes[:, 7:9] = common_utils.rotate_points_along_z(
np.hstack((gt_boxes[:, 7:9], np.zeros((gt_boxes.shape[0], 1))))[np.newaxis, :, :],  # pad a zero z to reuse the 3D rotation helper
np.array([noise_rotation])
)[0][:, 0:2]
return gt_boxes, points
......
from functools import partial

import numpy as np

from ...utils import common_utils
from . import augmentor_utils, database_sampler


class DataAugmentor(object):
......
import pickle

import numpy as np

from ...ops.iou3d_nms import iou3d_nms_utils
from ...utils import box_utils


class DataBaseSampler(object):
......
from collections import defaultdict
from pathlib import Path

import numpy as np
import torch.utils.data as torch_data

from ..utils import common_utils
from .augmentor.data_augmentor import DataAugmentor
from .processor.data_processor import DataProcessor
from .processor.point_feature_encoder import PointFeatureEncoder


class DatasetTemplate(torch_data.Dataset):
......
import copy
import pickle

import numpy as np
from skimage import io

from ...ops.roiaware_pool3d import roiaware_pool3d_utils
from ...utils import box_utils, calibration_kitti, common_utils, object3d_kitti
from ..dataset import DatasetTemplate


class KittiDataset(DatasetTemplate):
...@@ -435,4 +437,3 @@ if __name__ == '__main__':
data_path=ROOT_DIR / 'data' / 'kitti',
save_path=ROOT_DIR / 'data' / 'kitti'
)
import io as sysio

import numba
import numpy as np

from .rotate_iou import rotate_iou_gpu_eval
......
import time

import fire

from . import kitti_common as kitti
from .eval import get_coco_eval_result, get_official_eval_result


def _read_imageset_file(path):
......
...@@ -7,6 +7,7 @@ from collections import OrderedDict
import numpy as np
from skimage import io


def get_image_index_str(img_idx):
return "{:06d}".format(img_idx)
......
...@@ -9,6 +9,7 @@ import numba
import numpy as np
from numba import cuda


@numba.jit(nopython=True)
def div_up(m, n):
return m // n + (m % n > 0)
......
import copy
import pickle
from pathlib import Path
import numpy as np
from tqdm import tqdm
from ...ops.roiaware_pool3d import roiaware_pool3d_utils
from ...utils import common_utils
from ..dataset import DatasetTemplate
class NuScenesDataset(DatasetTemplate):
def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None):
root_path = (root_path if root_path is not None else Path(dataset_cfg.DATA_PATH)) / dataset_cfg.VERSION
super().__init__(
dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
)
self.infos = []
self.include_nuscenes_data(self.mode)
if self.training and self.dataset_cfg.get('BALANCED_RESAMPLING', False):
self.infos = self.balanced_infos_resampling(self.infos)
def include_nuscenes_data(self, mode):
self.logger.info('Loading NuScenes dataset')
nuscenes_infos = []
for info_path in self.dataset_cfg.INFO_PATH[mode]:
info_path = self.root_path / info_path
if not info_path.exists():
continue
with open(info_path, 'rb') as f:
infos = pickle.load(f)
nuscenes_infos.extend(infos)
self.infos.extend(nuscenes_infos)
self.logger.info('Total samples for NuScenes dataset: %d' % (len(nuscenes_infos)))
def balanced_infos_resampling(self, infos):
"""
Class-balanced sampling of nuScenes dataset from https://arxiv.org/abs/1908.09492 (CBGS):
each class is re-sampled so that it appears in roughly 1/num_classes of the resulting infos.
"""
if self.class_names is None:
return infos
cls_infos = {name: [] for name in self.class_names}
for info in infos:
for name in set(info['gt_names']):
if name in self.class_names:
cls_infos[name].append(info)
duplicated_samples = sum([len(v) for _, v in cls_infos.items()])
cls_dist = {k: len(v) / duplicated_samples for k, v in cls_infos.items()}
sampled_infos = []
frac = 1.0 / len(self.class_names)
ratios = [frac / v for v in cls_dist.values()]
for cur_cls_infos, ratio in zip(list(cls_infos.values()), ratios):
sampled_infos += np.random.choice(
cur_cls_infos, int(len(cur_cls_infos) * ratio)
).tolist()
self.logger.info('Total samples after balanced resampling: %s' % (len(sampled_infos)))
cls_infos_new = {name: [] for name in self.class_names}
for info in sampled_infos:
for name in set(info['gt_names']):
if name in self.class_names:
cls_infos_new[name].append(info)
cls_dist_new = {k: len(v) / len(sampled_infos) for k, v in cls_infos_new.items()}  # resampled class distribution, kept for inspection
return sampled_infos
def get_sweep(self, sweep_info):
def remove_ego_points(points, center_radius=1.0):
mask = ~((np.abs(points[:, 0]) < center_radius) & (np.abs(points[:, 1]) < center_radius))
return points[mask]
lidar_path = self.root_path / sweep_info['lidar_path']
points_sweep = np.fromfile(str(lidar_path), dtype=np.float32, count=-1).reshape([-1, 5])[:, :4]
points_sweep = remove_ego_points(points_sweep).T
if sweep_info['transform_matrix'] is not None:
num_points = points_sweep.shape[1]
points_sweep[:3, :] = sweep_info['transform_matrix'].dot(
np.vstack((points_sweep[:3, :], np.ones(num_points))))[:3, :]
cur_times = sweep_info['time_lag'] * np.ones((1, points_sweep.shape[1]))
return points_sweep.T, cur_times.T
def get_lidar_with_sweeps(self, index, max_sweeps=1):
info = self.infos[index]
lidar_path = self.root_path / info['lidar_path']
points = np.fromfile(str(lidar_path), dtype=np.float32, count=-1).reshape([-1, 5])[:, :4]
sweep_points_list = [points]
sweep_times_list = [np.zeros((points.shape[0], 1))]
for k in np.random.choice(len(info['sweeps']), max_sweeps - 1, replace=False):
points_sweep, times_sweep = self.get_sweep(info['sweeps'][k])
sweep_points_list.append(points_sweep)
sweep_times_list.append(times_sweep)
points = np.concatenate(sweep_points_list, axis=0)
times = np.concatenate(sweep_times_list, axis=0).astype(points.dtype)
points = np.concatenate((points, times), axis=1)
return points
def __len__(self):
if self._merge_all_iters_to_one_epoch:
return len(self.infos) * self.total_epochs
return len(self.infos)
def __getitem__(self, index):
if self._merge_all_iters_to_one_epoch:
index = index % len(self.infos)
info = copy.deepcopy(self.infos[index])
points = self.get_lidar_with_sweeps(index, max_sweeps=self.dataset_cfg.MAX_SWEEPS)
input_dict = {
'points': points,
'frame_id': Path(info['lidar_path']).stem,
'metadata': {'token': info['token']}
}
if 'gt_boxes' in info:
if self.dataset_cfg.get('FILTER_MIN_POINTS_IN_GT', False):
mask = (info['num_lidar_pts'] > self.dataset_cfg.FILTER_MIN_POINTS_IN_GT - 1)
else:
mask = None
input_dict.update({
'gt_names': info['gt_names'] if mask is None else info['gt_names'][mask],
'gt_boxes': info['gt_boxes'] if mask is None else info['gt_boxes'][mask]
})
data_dict = self.prepare_data(data_dict=input_dict)
if self.dataset_cfg.get('SET_NAN_VELOCITY_TO_ZEROS', False):
gt_boxes = data_dict['gt_boxes']
gt_boxes[np.isnan(gt_boxes)] = 0
data_dict['gt_boxes'] = gt_boxes
if not self.dataset_cfg.PRED_VELOCITY and 'gt_boxes' in data_dict:
data_dict['gt_boxes'] = data_dict['gt_boxes'][:, [0, 1, 2, 3, 4, 5, 6, -1]]
return data_dict
@staticmethod
def generate_prediction_dicts(batch_dict, pred_dicts, class_names, output_path=None):
"""
Args:
batch_dict:
frame_id:
pred_dicts: list of pred_dicts
pred_boxes: (N, 7), Tensor
pred_scores: (N), Tensor
pred_labels: (N), Tensor
class_names:
output_path:
Returns:
"""
def get_template_prediction(num_samples):
ret_dict = {
'name': np.zeros(num_samples), 'score': np.zeros(num_samples),
'boxes_lidar': np.zeros([num_samples, 7]), 'pred_labels': np.zeros(num_samples)
}
return ret_dict
def generate_single_sample_dict(box_dict):
pred_scores = box_dict['pred_scores'].cpu().numpy()
pred_boxes = box_dict['pred_boxes'].cpu().numpy()
pred_labels = box_dict['pred_labels'].cpu().numpy()
pred_dict = get_template_prediction(pred_scores.shape[0])
if pred_scores.shape[0] == 0:
return pred_dict
pred_dict['name'] = np.array(class_names)[pred_labels - 1]
pred_dict['score'] = pred_scores
pred_dict['boxes_lidar'] = pred_boxes
pred_dict['pred_labels'] = pred_labels
return pred_dict
annos = []
for index, box_dict in enumerate(pred_dicts):
single_pred_dict = generate_single_sample_dict(box_dict)
single_pred_dict['frame_id'] = batch_dict['frame_id'][index]
single_pred_dict['metadata'] = batch_dict['metadata'][index]
annos.append(single_pred_dict)
return annos
def evaluation(self, det_annos, class_names, **kwargs):
import json
from nuscenes.nuscenes import NuScenes
from . import nuscenes_utils
nusc = NuScenes(version=self.dataset_cfg.VERSION, dataroot=str(self.root_path), verbose=True)
nusc_annos = nuscenes_utils.transform_det_annos_to_nusc_annos(det_annos, nusc)
nusc_annos['meta'] = {
'use_camera': False,
'use_lidar': True,
'use_radar': False,
'use_map': False,
'use_external': False,
}
output_path = Path(kwargs['output_path'])
output_path.mkdir(exist_ok=True, parents=True)
res_path = str(output_path / 'results_nusc.json')
with open(res_path, 'w') as f:
json.dump(nusc_annos, f)
self.logger.info(f'The predictions of NuScenes have been saved to {res_path}')
if self.dataset_cfg.VERSION == 'v1.0-test':
return 'No ground-truth annotations for evaluation', {}
from nuscenes.eval.detection.config import config_factory
from nuscenes.eval.detection.evaluate import NuScenesEval
eval_set_map = {
'v1.0-mini': 'mini_val',
'v1.0-trainval': 'val',
'v1.0-test': 'test'
}
try:
eval_version = 'detection_cvpr_2019'
eval_config = config_factory(eval_version)
except:
eval_version = 'cvpr_2019'
eval_config = config_factory(eval_version)
nusc_eval = NuScenesEval(
nusc,
config=eval_config,
result_path=res_path,
eval_set=eval_set_map[self.dataset_cfg.VERSION],
output_dir=str(output_path),
verbose=True,
)
metrics_summary = nusc_eval.main(plot_examples=0, render_curves=False)
with open(output_path / 'metrics_summary.json', 'r') as f:
metrics = json.load(f)
result_str, result_dict = nuscenes_utils.format_nuscene_results(metrics, self.class_names, version=eval_version)
return result_str, result_dict
def create_groundtruth_database(self, used_classes=None, max_sweeps=10):
import torch
database_save_path = self.root_path / f'gt_database_{max_sweeps}sweeps_withvelo'
db_info_save_path = self.root_path / f'nuscenes_dbinfos_{max_sweeps}sweeps_withvelo.pkl'
database_save_path.mkdir(parents=True, exist_ok=True)
all_db_infos = {}
for idx in tqdm(range(len(self.infos))):
sample_idx = idx
info = self.infos[idx]
points = self.get_lidar_with_sweeps(idx, max_sweeps=max_sweeps)
gt_boxes = info['gt_boxes']
gt_names = info['gt_names']
box_idxs_of_pts = roiaware_pool3d_utils.points_in_boxes_gpu(
torch.from_numpy(points[:, 0:3]).unsqueeze(dim=0).float().cuda(),
torch.from_numpy(gt_boxes[:, 0:7]).unsqueeze(dim=0).float().cuda()
).long().squeeze(dim=0).cpu().numpy()
for i in range(gt_boxes.shape[0]):
filename = '%s_%s_%d.bin' % (sample_idx, gt_names[i], i)
filepath = database_save_path / filename
gt_points = points[box_idxs_of_pts == i]
gt_points[:, :3] -= gt_boxes[i, :3]
with open(filepath, 'w') as f:
gt_points.tofile(f)
if (used_classes is None) or gt_names[i] in used_classes:
db_path = str(filepath.relative_to(self.root_path)) # gt_database/xxxxx.bin
db_info = {'name': gt_names[i], 'path': db_path, 'image_idx': sample_idx, 'gt_idx': i,
'box3d_lidar': gt_boxes[i], 'num_points_in_gt': gt_points.shape[0]}
if gt_names[i] in all_db_infos:
all_db_infos[gt_names[i]].append(db_info)
else:
all_db_infos[gt_names[i]] = [db_info]
for k, v in all_db_infos.items():
print('Database %s: %d' % (k, len(v)))
with open(db_info_save_path, 'wb') as f:
pickle.dump(all_db_infos, f)
def create_nuscenes_info(version, data_path, save_path, max_sweeps=10):
from nuscenes.nuscenes import NuScenes
from nuscenes.utils import splits
from . import nuscenes_utils
data_path = data_path / version
save_path = save_path / version
assert version in ['v1.0-trainval', 'v1.0-test', 'v1.0-mini']
if version == 'v1.0-trainval':
train_scenes = splits.train
val_scenes = splits.val
elif version == 'v1.0-test':
train_scenes = splits.test
val_scenes = []
elif version == 'v1.0-mini':
train_scenes = splits.mini_train
val_scenes = splits.mini_val
else:
raise NotImplementedError
nusc = NuScenes(version=version, dataroot=data_path, verbose=True)
available_scenes = nuscenes_utils.get_available_scenes(nusc)
available_scene_names = [s['name'] for s in available_scenes]
train_scenes = list(filter(lambda x: x in available_scene_names, train_scenes))
val_scenes = list(filter(lambda x: x in available_scene_names, val_scenes))
train_scenes = set([available_scenes[available_scene_names.index(s)]['token'] for s in train_scenes])
val_scenes = set([available_scenes[available_scene_names.index(s)]['token'] for s in val_scenes])
print('%s: train scene(%d), val scene(%d)' % (version, len(train_scenes), len(val_scenes)))
train_nusc_infos, val_nusc_infos = nuscenes_utils.fill_trainval_infos(
data_path=data_path, nusc=nusc, train_scenes=train_scenes, val_scenes=val_scenes,
test='test' in version, max_sweeps=max_sweeps
)
if version == 'v1.0-test':
print('test sample: %d' % len(train_nusc_infos))
with open(save_path / f'nuscenes_infos_{max_sweeps}sweeps_test.pkl', 'wb') as f:
pickle.dump(train_nusc_infos, f)
else:
print('train sample: %d, val sample: %d' % (len(train_nusc_infos), len(val_nusc_infos)))
with open(save_path / f'nuscenes_infos_{max_sweeps}sweeps_train.pkl', 'wb') as f:
pickle.dump(train_nusc_infos, f)
with open(save_path / f'nuscenes_infos_{max_sweeps}sweeps_val.pkl', 'wb') as f:
pickle.dump(val_nusc_infos, f)
if __name__ == '__main__':
import yaml
import argparse
from pathlib import Path
from easydict import EasyDict
parser = argparse.ArgumentParser(description='arg parser')
parser.add_argument('--cfg_file', type=str, default=None, help='specify the config of dataset')
parser.add_argument('--func', type=str, default='create_nuscenes_infos', help='')
parser.add_argument('--version', type=str, default='v1.0-trainval', help='')
args = parser.parse_args()
if args.func == 'create_nuscenes_infos':
dataset_cfg = EasyDict(yaml.load(open(args.cfg_file)))
ROOT_DIR = (Path(__file__).resolve().parent / '../../../').resolve()
dataset_cfg.VERSION = args.version
create_nuscenes_info(
version=dataset_cfg.VERSION,
data_path=ROOT_DIR / 'data' / 'nuscenes',
save_path=ROOT_DIR / 'data' / 'nuscenes',
max_sweeps=dataset_cfg.MAX_SWEEPS,
)
nuscenes_dataset = NuScenesDataset(
dataset_cfg=dataset_cfg, class_names=None,
root_path=ROOT_DIR / 'data' / 'nuscenes',
logger=common_utils.create_logger(), training=True
)
nuscenes_dataset.create_groundtruth_database(max_sweeps=dataset_cfg.MAX_SWEEPS)
"""
The NuScenes data pre-processing and evaluation is modified from
https://github.com/traveller59/second.pytorch and https://github.com/poodarchu/Det3D
"""
import operator
from functools import reduce
from pathlib import Path
import numpy as np
import tqdm
from nuscenes.utils.data_classes import Box
from nuscenes.utils.geometry_utils import transform_matrix
from pyquaternion import Quaternion
map_name_from_general_to_detection = {
'human.pedestrian.adult': 'pedestrian',
'human.pedestrian.child': 'pedestrian',
'human.pedestrian.wheelchair': 'ignore',
'human.pedestrian.stroller': 'ignore',
'human.pedestrian.personal_mobility': 'ignore',
'human.pedestrian.police_officer': 'pedestrian',
'human.pedestrian.construction_worker': 'pedestrian',
'animal': 'ignore',
'vehicle.car': 'car',
'vehicle.motorcycle': 'motorcycle',
'vehicle.bicycle': 'bicycle',
'vehicle.bus.bendy': 'bus',
'vehicle.bus.rigid': 'bus',
'vehicle.truck': 'truck',
'vehicle.construction': 'construction_vehicle',
'vehicle.emergency.ambulance': 'ignore',
'vehicle.emergency.police': 'ignore',
'vehicle.trailer': 'trailer',
'movable_object.barrier': 'barrier',
'movable_object.trafficcone': 'traffic_cone',
'movable_object.pushable_pullable': 'ignore',
'movable_object.debris': 'ignore',
'static_object.bicycle_rack': 'ignore',
}
cls_attr_dist = {
'barrier': {
'cycle.with_rider': 0,
'cycle.without_rider': 0,
'pedestrian.moving': 0,
'pedestrian.sitting_lying_down': 0,
'pedestrian.standing': 0,
'vehicle.moving': 0,
'vehicle.parked': 0,
'vehicle.stopped': 0,
},
'bicycle': {
'cycle.with_rider': 2791,
'cycle.without_rider': 8946,
'pedestrian.moving': 0,
'pedestrian.sitting_lying_down': 0,
'pedestrian.standing': 0,
'vehicle.moving': 0,
'vehicle.parked': 0,
'vehicle.stopped': 0,
},
'bus': {
'cycle.with_rider': 0,
'cycle.without_rider': 0,
'pedestrian.moving': 0,
'pedestrian.sitting_lying_down': 0,
'pedestrian.standing': 0,
'vehicle.moving': 9092,
'vehicle.parked': 3294,
'vehicle.stopped': 3881,
},
'car': {
'cycle.with_rider': 0,
'cycle.without_rider': 0,
'pedestrian.moving': 0,
'pedestrian.sitting_lying_down': 0,
'pedestrian.standing': 0,
'vehicle.moving': 114304,
'vehicle.parked': 330133,
'vehicle.stopped': 46898,
},
'construction_vehicle': {
'cycle.with_rider': 0,
'cycle.without_rider': 0,
'pedestrian.moving': 0,
'pedestrian.sitting_lying_down': 0,
'pedestrian.standing': 0,
'vehicle.moving': 882,
'vehicle.parked': 11549,
'vehicle.stopped': 2102,
},
'ignore': {
'cycle.with_rider': 307,
'cycle.without_rider': 73,
'pedestrian.moving': 0,
'pedestrian.sitting_lying_down': 0,
'pedestrian.standing': 0,
'vehicle.moving': 165,
'vehicle.parked': 400,
'vehicle.stopped': 102,
},
'motorcycle': {
'cycle.with_rider': 4233,
'cycle.without_rider': 8326,
'pedestrian.moving': 0,
'pedestrian.sitting_lying_down': 0,
'pedestrian.standing': 0,
'vehicle.moving': 0,
'vehicle.parked': 0,
'vehicle.stopped': 0,
},
'pedestrian': {
'cycle.with_rider': 0,
'cycle.without_rider': 0,
'pedestrian.moving': 157444,
'pedestrian.sitting_lying_down': 13939,
'pedestrian.standing': 46530,
'vehicle.moving': 0,
'vehicle.parked': 0,
'vehicle.stopped': 0,
},
'traffic_cone': {
'cycle.with_rider': 0,
'cycle.without_rider': 0,
'pedestrian.moving': 0,
'pedestrian.sitting_lying_down': 0,
'pedestrian.standing': 0,
'vehicle.moving': 0,
'vehicle.parked': 0,
'vehicle.stopped': 0,
},
'trailer': {
'cycle.with_rider': 0,
'cycle.without_rider': 0,
'pedestrian.moving': 0,
'pedestrian.sitting_lying_down': 0,
'pedestrian.standing': 0,
'vehicle.moving': 3421,
'vehicle.parked': 19224,
'vehicle.stopped': 1895,
},
'truck': {
'cycle.with_rider': 0,
'cycle.without_rider': 0,
'pedestrian.moving': 0,
'pedestrian.sitting_lying_down': 0,
'pedestrian.standing': 0,
'vehicle.moving': 21339,
'vehicle.parked': 55626,
'vehicle.stopped': 11097,
},
}
def get_available_scenes(nusc):
available_scenes = []
print('total scene num:', len(nusc.scene))
for scene in nusc.scene:
scene_token = scene['token']
scene_rec = nusc.get('scene', scene_token)
sample_rec = nusc.get('sample', scene_rec['first_sample_token'])
sd_rec = nusc.get('sample_data', sample_rec['data']['LIDAR_TOP'])
has_more_frames = True
scene_not_exist = False
while has_more_frames:
lidar_path, boxes, _ = nusc.get_sample_data(sd_rec['token'])
if not Path(lidar_path).exists():
scene_not_exist = True
break
else:
break
# if not sd_rec['next'] == '':
# sd_rec = nusc.get('sample_data', sd_rec['next'])
# else:
# has_more_frames = False
if scene_not_exist:
continue
available_scenes.append(scene)
print('exist scene num:', len(available_scenes))
return available_scenes
def get_sample_data(nusc, sample_data_token, selected_anntokens=None):
"""
Returns the data path as well as all annotations related to that sample_data.
Note that the boxes are transformed into the current sensor's coordinate frame.
Args:
nusc:
sample_data_token: Sample_data token.
selected_anntokens: If provided only return the selected annotation.
Returns:
"""
# Retrieve sensor & pose records
sd_record = nusc.get('sample_data', sample_data_token)
cs_record = nusc.get('calibrated_sensor', sd_record['calibrated_sensor_token'])
sensor_record = nusc.get('sensor', cs_record['sensor_token'])
pose_record = nusc.get('ego_pose', sd_record['ego_pose_token'])
data_path = nusc.get_sample_data_path(sample_data_token)
if sensor_record['modality'] == 'camera':
cam_intrinsic = np.array(cs_record['camera_intrinsic'])
imsize = (sd_record['width'], sd_record['height'])
else:
cam_intrinsic = imsize = None
# Retrieve all sample annotations and map to sensor coordinate system.
if selected_anntokens is not None:
boxes = list(map(nusc.get_box, selected_anntokens))
else:
boxes = nusc.get_boxes(sample_data_token)
# Make list of Box objects including coord system transforms.
box_list = []
for box in boxes:
box.velocity = nusc.box_velocity(box.token)
# Move box to ego vehicle coord system
box.translate(-np.array(pose_record['translation']))
box.rotate(Quaternion(pose_record['rotation']).inverse)
# Move box to sensor coord system
box.translate(-np.array(cs_record['translation']))
box.rotate(Quaternion(cs_record['rotation']).inverse)
box_list.append(box)
return data_path, box_list, cam_intrinsic
def quaternion_yaw(q: Quaternion) -> float:
"""
Calculate the yaw angle from a quaternion.
Note that this only works for a quaternion that represents a box in lidar or global coordinate frame.
It does not work for a box in the camera frame.
:param q: Quaternion of interest.
:return: Yaw angle in radians.
"""
# Project into xy plane.
v = np.dot(q.rotation_matrix, np.array([1, 0, 0]))
# Measure yaw using arctan.
yaw = np.arctan2(v[1], v[0])
return yaw
def fill_trainval_infos(data_path, nusc, train_scenes, val_scenes, test=False, max_sweeps=10):
train_nusc_infos = []
val_nusc_infos = []
progress_bar = tqdm.tqdm(total=len(nusc.sample), desc='create_info', dynamic_ncols=True)
ref_chan = 'LIDAR_TOP'  # The reference channel of the current sample_rec that the point clouds are mapped to.
chan = 'LIDAR_TOP'  # The channel from which we track back n sweeps to aggregate the point cloud.
for index, sample in enumerate(nusc.sample):
progress_bar.update()
ref_sd_token = sample['data'][ref_chan]
ref_sd_rec = nusc.get('sample_data', ref_sd_token)
ref_cs_rec = nusc.get('calibrated_sensor', ref_sd_rec['calibrated_sensor_token'])
ref_pose_rec = nusc.get('ego_pose', ref_sd_rec['ego_pose_token'])
ref_time = 1e-6 * ref_sd_rec['timestamp']
ref_lidar_path, ref_boxes, _ = get_sample_data(nusc, ref_sd_token)
ref_cam_front_token = sample['data']['CAM_FRONT']
ref_cam_path, _, ref_cam_intrinsic = nusc.get_sample_data(ref_cam_front_token)
# Homogeneous transform from ego car frame to reference frame
ref_from_car = transform_matrix(
ref_cs_rec['translation'], Quaternion(ref_cs_rec['rotation']), inverse=True
)
# Homogeneous transformation matrix from global to _current_ ego car frame
car_from_global = transform_matrix(
ref_pose_rec['translation'], Quaternion(ref_pose_rec['rotation']), inverse=True,
)
info = {
'lidar_path': Path(ref_lidar_path).relative_to(data_path).__str__(),
'cam_front_path': Path(ref_cam_path).relative_to(data_path).__str__(),
'cam_intrinsic': ref_cam_intrinsic,
'token': sample['token'],
'sweeps': [],
'ref_from_car': ref_from_car,
'car_from_global': car_from_global,
'timestamp': ref_time,
}
sample_data_token = sample['data'][chan]
curr_sd_rec = nusc.get('sample_data', sample_data_token)
sweeps = []
while len(sweeps) < max_sweeps - 1:
if curr_sd_rec['prev'] == '':
if len(sweeps) == 0:
sweep = {
'lidar_path': Path(ref_lidar_path).relative_to(data_path).__str__(),
'sample_data_token': curr_sd_rec['token'],
'transform_matrix': None,
'time_lag': curr_sd_rec['timestamp'] * 0,
}
sweeps.append(sweep)
else:
sweeps.append(sweeps[-1])
else:
curr_sd_rec = nusc.get('sample_data', curr_sd_rec['prev'])
# Get past pose
current_pose_rec = nusc.get('ego_pose', curr_sd_rec['ego_pose_token'])
global_from_car = transform_matrix(
current_pose_rec['translation'], Quaternion(current_pose_rec['rotation']), inverse=False,
)
# Homogeneous transformation matrix from sensor coordinate frame to ego car frame.
current_cs_rec = nusc.get(
'calibrated_sensor', curr_sd_rec['calibrated_sensor_token']
)
car_from_current = transform_matrix(
current_cs_rec['translation'], Quaternion(current_cs_rec['rotation']), inverse=False,
)
tm = reduce(np.dot, [ref_from_car, car_from_global, global_from_car, car_from_current])
lidar_path = nusc.get_sample_data_path(curr_sd_rec['token'])
time_lag = ref_time - 1e-6 * curr_sd_rec['timestamp']
sweep = {
'lidar_path': Path(lidar_path).relative_to(data_path).__str__(),
'sample_data_token': curr_sd_rec['token'],
'transform_matrix': tm,
'global_from_car': global_from_car,
'car_from_current': car_from_current,
'time_lag': time_lag,
}
sweeps.append(sweep)
info['sweeps'] = sweeps
assert len(info['sweeps']) == max_sweeps - 1, \
f"sweep {curr_sd_rec['token']} only has {len(info['sweeps'])} sweeps, " \
f"you should duplicate to sweep num {max_sweeps - 1}"
if not test:
annotations = [nusc.get('sample_annotation', token) for token in sample['anns']]
# the filtering gives 0.5~1 map improvement
num_lidar_pts = np.array([anno['num_lidar_pts'] for anno in annotations])
num_radar_pts = np.array([anno['num_radar_pts'] for anno in annotations])
mask = (num_lidar_pts + num_radar_pts > 0)
locs = np.array([b.center for b in ref_boxes]).reshape(-1, 3)
dims = np.array([b.wlh for b in ref_boxes]).reshape(-1, 3)[:, [1, 0, 2]]  # wlh ==> dxdydz (lwh)
velocity = np.array([b.velocity for b in ref_boxes]).reshape(-1, 3)
rots = np.array([quaternion_yaw(b.orientation) for b in ref_boxes]).reshape(-1, 1)
names = np.array([b.name for b in ref_boxes])
tokens = np.array([b.token for b in ref_boxes])
gt_boxes = np.concatenate([locs, dims, rots, velocity[:, :2]], axis=1)
assert len(annotations) == len(gt_boxes) == len(velocity)
info['gt_boxes'] = gt_boxes[mask, :]
info['gt_boxes_velocity'] = velocity[mask, :]
info['gt_names'] = np.array([map_name_from_general_to_detection[name] for name in names])[mask]
info['gt_boxes_token'] = tokens[mask]
info['num_lidar_pts'] = num_lidar_pts[mask]
info['num_radar_pts'] = num_radar_pts[mask]
if sample['scene_token'] in train_scenes:
train_nusc_infos.append(info)
else:
val_nusc_infos.append(info)
progress_bar.close()
return train_nusc_infos, val_nusc_infos
def boxes_lidar_to_nusenes(det_info):
boxes3d = det_info['boxes_lidar']
scores = det_info['score']
labels = det_info['pred_labels']
box_list = []
for k in range(boxes3d.shape[0]):
quat = Quaternion(axis=[0, 0, 1], radians=boxes3d[k, 6])
velocity = (*boxes3d[k, 7:9], 0.0) if boxes3d.shape[1] == 9 else (0.0, 0.0, 0.0)
box = Box(
boxes3d[k, :3],
boxes3d[k, [4, 3, 5]], # wlh
quat, label=labels[k], score=scores[k], velocity=velocity,
)
box_list.append(box)
return box_list
def lidar_nusc_box_to_global(nusc, boxes, sample_token):
s_record = nusc.get('sample', sample_token)
sample_data_token = s_record['data']['LIDAR_TOP']
sd_record = nusc.get('sample_data', sample_data_token)
cs_record = nusc.get('calibrated_sensor', sd_record['calibrated_sensor_token'])
sensor_record = nusc.get('sensor', cs_record['sensor_token'])
pose_record = nusc.get('ego_pose', sd_record['ego_pose_token'])
data_path = nusc.get_sample_data_path(sample_data_token)
box_list = []
for box in boxes:
# Move box to ego vehicle coord system
box.rotate(Quaternion(cs_record['rotation']))
box.translate(np.array(cs_record['translation']))
# Move box to global coord system
box.rotate(Quaternion(pose_record['rotation']))
box.translate(np.array(pose_record['translation']))
box_list.append(box)
return box_list
def transform_det_annos_to_nusc_annos(det_annos, nusc):
nusc_annos = {
'results': {},
'meta': None,
}
for det in det_annos:
annos = []
box_list = boxes_lidar_to_nusenes(det)
box_list = lidar_nusc_box_to_global(
nusc=nusc, boxes=box_list, sample_token=det['metadata']['token']
)
for k, box in enumerate(box_list):
name = det['name'][k]
if np.sqrt(box.velocity[0] ** 2 + box.velocity[1] ** 2) > 0.2:
if name in ['car', 'construction_vehicle', 'bus', 'truck', 'trailer']:
attr = 'vehicle.moving'
elif name in ['bicycle', 'motorcycle']:
attr = 'cycle.with_rider'
else:
attr = None
else:
if name in ['pedestrian']:
attr = 'pedestrian.standing'
elif name in ['bus']:
attr = 'vehicle.stopped'
else:
attr = None
attr = attr if attr is not None else max(
cls_attr_dist[name].items(), key=operator.itemgetter(1))[0]
nusc_anno = {
'sample_token': det['metadata']['token'],
'translation': box.center.tolist(),
'size': box.wlh.tolist(),
'rotation': box.orientation.elements.tolist(),
'velocity': box.velocity[:2].tolist(),
'detection_name': name,
'detection_score': box.score,
'attribute_name': attr
}
annos.append(nusc_anno)
nusc_annos['results'].update({det["metadata"]["token"]: annos})
return nusc_annos
def format_nuscene_results(metrics, class_names, version='default'):
result = '----------------Nuscene %s results-----------------\n' % version
for name in class_names:
threshs = ', '.join(list(metrics['label_aps'][name].keys()))
ap_list = list(metrics['label_aps'][name].values())
err_name = ', '.join([x.split('_')[0] for x in list(metrics['label_tp_errors'][name].keys())])
error_list = list(metrics['label_tp_errors'][name].values())
result += f'***{name} error@{err_name} | AP@{threshs}\n'
result += ', '.join(['%.2f' % x for x in error_list]) + ' | '
result += ', '.join(['%.2f' % (x * 100) for x in ap_list])
result += f" | mean AP: {metrics['mean_dist_aps'][name]}"
result += '\n'
result += '--------------average performance-------------\n'
details = {}
for key, val in metrics['tp_errors'].items():
result += '%s:\t %.4f\n' % (key, val)
details[key] = val
result += 'mAP:\t %.4f\n' % metrics['mean_ap']
result += 'NDS:\t %.4f\n' % metrics['nd_score']
details.update({
'mAP': metrics['mean_ap'],
'NDS': metrics['nd_score'],
})
return result, details
from functools import partial

import numpy as np

from ...utils import box_utils, common_utils
...@@ -40,7 +42,10 @@ class DataProcessor(object):
def transform_points_to_voxels(self, data_dict=None, config=None, voxel_generator=None):
if data_dict is None:
try:
from spconv.utils import VoxelGeneratorV2 as VoxelGenerator  # spconv v1.2
except:
from spconv.utils import VoxelGenerator  # spconv v1.0
voxel_generator = VoxelGenerator(
voxel_size=config.VOXEL_SIZE,
...@@ -52,8 +57,15 @@ class DataProcessor(object):
self.grid_size = np.round(grid_size).astype(np.int64)
self.voxel_size = config.VOXEL_SIZE
return partial(self.transform_points_to_voxels, voxel_generator=voxel_generator)

points = data_dict['points']
voxel_output = voxel_generator.generate(points)
if isinstance(voxel_output, dict):  # spconv v1.2's VoxelGeneratorV2 returns a dict
voxels, coordinates, num_points = \
voxel_output['voxels'], voxel_output['coordinates'], voxel_output['num_points_per_voxel']
else:  # spconv v1.0 returns a tuple
voxels, coordinates, num_points = voxel_output
if not data_dict['use_lead_xyz']:
voxels = voxels[..., 3:]  # remove xyz in voxels(N, 3)
...@@ -62,6 +74,34 @@ class DataProcessor(object):
data_dict['voxel_num_points'] = num_points
return data_dict
def sample_points(self, data_dict=None, config=None):
if data_dict is None:
return partial(self.sample_points, config=config)
num_points = config.NUM_POINTS[self.mode]
if num_points == -1:
return data_dict
points = data_dict['points']
if num_points < len(points):
pts_depth = np.linalg.norm(points[:, 0:3], axis=1)
pts_near_flag = pts_depth < 40.0  # points within 40m count as "near"; far points are always kept
far_idxs_choice = np.where(pts_near_flag == 0)[0]
near_idxs = np.where(pts_near_flag == 1)[0]
near_idxs_choice = np.random.choice(near_idxs, num_points - len(far_idxs_choice), replace=False)
choice = np.concatenate((near_idxs_choice, far_idxs_choice), axis=0) \
if len(far_idxs_choice) > 0 else near_idxs_choice
np.random.shuffle(choice)
else:
choice = np.arange(0, len(points), dtype=np.int32)
if num_points > len(points):
extra_choice = np.random.choice(choice, num_points - len(points), replace=False)
choice = np.concatenate((choice, extra_choice), axis=0)
np.random.shuffle(choice)
data_dict['points'] = points[choice]
return data_dict
def forward(self, data_dict):
"""
Args:
......
from collections import namedtuple

import numpy as np
import torch

from .detectors import build_detector
......