Unverified Commit 83954d03 authored by yukang's avatar yukang Committed by GitHub

Add support for VoxelNeXt (#1309)

* VoxelNeXt
parent 31f6758a
......@@ -22,6 +22,8 @@ It is also the official code release of [`[PointRCNN]`](https://arxiv.org/abs/18
## Changelog
[2023-04-02] Added support for [`VoxelNeXt`](https://github.com/dvlab-research/VoxelNeXt) on the nuScenes, Waymo, and Argoverse2 datasets. VoxelNeXt is a fully sparse 3D object detection network that uses a plain sparse-CNN backbone and predicts 3D objects directly from voxel features.
[2022-09-02] Update `OpenPCDet` to v0.6.0:
* Official code release of [MPPNet](https://arxiv.org/abs/2205.05979) for temporal 3D object detection, which supports long-term multi-frame 3D object detection and ranked 1st on the [3D detection leaderboard](https://waymo.com/open/challenges/2020/3d-detection) of the Waymo Open Dataset as of Sept. 2, 2022. On the validation set, MPPNet achieves 74.96%, 75.06% and 74.52% mAPH@Level_2 for the vehicle, pedestrian and cyclist classes, respectively (see the [guideline](docs/guidelines_of_approaches/mppnet.md) on how to train/test with MPPNet).
* Support multi-frame training/testing on Waymo Open Dataset (see the [change log](docs/changelog.md) for more details on how to process data).
......@@ -172,7 +174,6 @@ By default, all models are trained with **a single frame** of **20% data (~32k f
| [PV-RCNN++](tools/cfgs/waymo_models/pv_rcnn_plusplus.yaml) | 77.82/77.32| 69.07/68.62| 77.99/71.36| 69.92/63.74| 71.80/70.71| 69.31/68.26|
| [PV-RCNN++ (ResNet)](tools/cfgs/waymo_models/pv_rcnn_plusplus_resnet.yaml) |77.61/77.14| 69.18/68.75| 79.42/73.31| 70.88/65.21| 72.50/71.39| 69.84/68.77|
Here we also provide the performance of several models trained on the full training set (refer to the [PV-RCNN++](https://arxiv.org/abs/2102.00463) paper):
| Performance@(train with 100\% Data) | Vec_L1 | Vec_L2 | Ped_L1 | Ped_L2 | Cyc_L1 | Cyc_L2 |
......@@ -180,6 +181,7 @@ Here we also provide the performance of several models trained on the full train
| [SECOND](tools/cfgs/waymo_models/second.yaml) | 72.27/71.69 | 63.85/63.33 | 68.70/58.18 | 60.72/51.31 | 60.62/59.28 | 58.34/57.05 |
| [CenterPoint-Pillar](tools/cfgs/waymo_models/centerpoint_pillar_1x.yaml)| 73.37/72.86 | 65.09/64.62 | 75.35/65.11 | 67.61/58.25 | 67.76/66.22 | 65.25/63.77 |
| [Part-A2-Anchor](tools/cfgs/waymo_models/PartA2.yaml) | 77.05/76.51 | 68.47/67.97 | 75.24/66.87 | 66.18/58.62 | 68.60/67.36 | 66.13/64.93 |
| [VoxelNeXt-2D](tools/cfgs/waymo_models/voxelnext2d_ioubranch.yaml) | 77.94/77.47 |69.68/69.25 |80.24/73.47 |72.23/65.88 |73.33/72.20 |70.66/69.56 |
| [PV-RCNN (CenterHead)](tools/cfgs/waymo_models/pv_rcnn_with_centerhead_rpn.yaml) | 78.00/77.50 | 69.43/68.98 | 79.21/73.03 | 70.42/64.72 | 71.46/70.27 | 68.95/67.79 |
| [PV-RCNN++](tools/cfgs/waymo_models/pv_rcnn_plusplus.yaml) | 79.10/78.63 | 70.34/69.91 | 80.62/74.62 | 71.86/66.30 | 73.49/72.38 | 70.70/69.62 |
| [PV-RCNN++ (ResNet)](tools/cfgs/waymo_models/pv_rcnn_plusplus_resnet.yaml) | 79.25/78.78 | 70.61/70.18 | 81.83/76.28 | 73.17/68.00 | 73.72/72.66 | 71.21/70.19 |
......@@ -198,13 +200,14 @@ but you could easily achieve similar performance by training with the default co
### NuScenes 3D Object Detection Baselines
All models are trained with 8 GTX 1080Ti GPUs and are available for download.
| | mATE | mASE | mAOE | mAVE | mAAE | mAP | NDS | download |
|----------------------------------------------------------------------------------------------------|-------:|:------:|:------:|:-----:|:-----:|:-----:|:------:|:--------------------------------------------------------------------------------------------------:|
| [PointPillar-MultiHead](tools/cfgs/nuscenes_models/cbgs_pp_multihead.yaml) | 33.87 | 26.00 | 32.07 | 28.74 | 20.15 | 44.63 | 58.23 | [model-23M](https://drive.google.com/file/d/1p-501mTWsq0G9RzroTWSXreIMyTUUpBM/view?usp=sharing) |
| [SECOND-MultiHead (CBGS)](tools/cfgs/nuscenes_models/cbgs_second_multihead.yaml) | 31.15 | 25.51 | 26.64 | 26.26 | 20.46 | 50.59 | 62.29 | [model-35M](https://drive.google.com/file/d/1bNzcOnE3u9iooBFMk2xK7HqhdeQ_nwTq/view?usp=sharing) |
| [CenterPoint-PointPillar](tools/cfgs/nuscenes_models/cbgs_dyn_pp_centerpoint.yaml) | 31.13 | 26.04 | 42.92 | 23.90 | 19.14 | 50.03 | 60.70 | [model-23M](https://drive.google.com/file/d/1UvGm6mROMyJzeSRu7OD1leU_YWoAZG7v/view?usp=sharing) |
| [CenterPoint (voxel_size=0.1)](tools/cfgs/nuscenes_models/cbgs_voxel01_res3d_centerpoint.yaml) | 30.11 | 25.55 | 38.28 | 21.94 | 18.87 | 56.03 | 64.54 | [model-34M](https://drive.google.com/file/d/1Cz-J1c3dw7JAWc25KRG1XQj8yCaOlexQ/view?usp=sharing) |
| [CenterPoint (voxel_size=0.075)](tools/cfgs/nuscenes_models/cbgs_voxel0075_res3d_centerpoint.yaml) | 28.80 | 25.43 | 37.27 | 21.55 | 18.24 | 59.22 | 66.48 | [model-34M](https://drive.google.com/file/d/1XOHAWm1MPkCKr1gqmc3TWi5AYZgPsgxU/view?usp=sharing) |
| [VoxelNeXt (voxel_size=0.075)](tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml) | 30.11 | 25.23 | 40.57 | 21.69 | 18.56 | 60.53 | 66.65 | [model-31M](https://drive.google.com/file/d/1IV7e7G9X-61KXSjMGtQo579pzDNbhwvf/view?usp=share_link) |
### ONCE 3D Object Detection Baselines
......@@ -218,6 +221,14 @@ All models are trained with 8 GPUs.
| [PV-RCNN](tools/cfgs/once_models/pv_rcnn.yaml) | 77.77 | 23.50 | 59.37 | 53.55 |
| [CenterPoint](tools/cfgs/once_models/centerpoint.yaml) | 78.02 | 49.74 | 67.22 | 64.99 |
### Argoverse2 3D Object Detection Baselines
All models are trained with 4 GPUs.
| | mAP | download |
|---------------------------------------------------------|:----:|:--------------------------------------------------------------------------------------------------:|
| [VoxelNeXt](tools/cfgs/argo2_models/cbgs_voxel01_voxelnext.yaml) | 30.0 | [model-30M](https://drive.google.com/file/d/1zr-it1ERJzLQ3a3hP060z_EQqS_RkNaC/view?usp=share_link) |
| [VoxelNeXt-K3](tools/cfgs/argo2_models/cbgs_voxel01_voxelnext_headkernel3.yaml) | 30.7 | [model-45M](https://drive.google.com/file/d/1NrYRsiKbuWyL8jE4SY27IHpFMY9K0o__/view?usp=share_link) |
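These configs should follow the usual OpenPCDet workflow like the other benchmarks; for example, training would typically be launched from the `tools` directory with something like `python train.py --cfg_file cfgs/argo2_models/cbgs_voxel01_voxelnext.yaml` (adjust paths, batch size, and GPU flags to your setup).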
### Other datasets
Contributions that add support for other datasets are welcome; please submit a pull request.
......
This diff is collapsed.
This diff is collapsed.
......@@ -12,6 +12,7 @@ from .waymo.waymo_dataset import WaymoDataset
from .pandaset.pandaset_dataset import PandasetDataset
from .lyft.lyft_dataset import LyftDataset
from .once.once_dataset import ONCEDataset
from .argo2.argo2_dataset import Argo2Dataset
from .custom.custom_dataset import CustomDataset
__all__ = {
......@@ -22,7 +23,8 @@ __all__ = {
'PandasetDataset': PandasetDataset,
'LyftDataset': LyftDataset,
'ONCEDataset': ONCEDataset,
'CustomDataset': CustomDataset,
'Argo2Dataset': Argo2Dataset
}
......
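For reference, OpenPCDet builds datasets by looking the configured name up in this registry; a minimal sketch of the lookup (names follow `pcdet.datasets.build_dataloader`, shown here for illustration only):

```python
# Hypothetical snippet: resolve 'DATASET: Argo2Dataset' from a config to a class.
dataset_cls = __all__[dataset_cfg.DATASET]
dataset = dataset_cls(
    dataset_cfg=dataset_cfg, class_names=class_names,
    root_path=root_path, training=training, logger=logger,
)
```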
import copy
import pickle
import torch
import numpy as np
from ..dataset import DatasetTemplate
from .argo2_utils.so3 import yaw_to_quat
from .argo2_utils.constants import LABEL_ATTR
from os import path as osp
from pathlib import Path
class Argo2Dataset(DatasetTemplate):
def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None):
"""
Args:
root_path:
dataset_cfg:
class_names:
training:
logger:
"""
super().__init__(
dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
)
self.split = self.dataset_cfg.DATA_SPLIT[self.mode]
self.root_split_path = self.root_path / ('training' if self.split != 'test' else 'testing')
split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')
self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if split_dir.exists() else None
self.kitti_infos = []
self.include_kitti_data(self.mode)
def include_kitti_data(self, mode):
if self.logger is not None:
self.logger.info('Loading Argoverse2 dataset')
kitti_infos = []
for info_path in self.dataset_cfg.INFO_PATH[mode]:
info_path = self.root_path / info_path
if not info_path.exists():
continue
with open(info_path, 'rb') as f:
infos = pickle.load(f)
kitti_infos.extend(infos)
self.kitti_infos.extend(kitti_infos)
if self.logger is not None:
self.logger.info('Total samples for Argo2 dataset: %d' % (len(kitti_infos)))
def set_split(self, split):
super().__init__(
dataset_cfg=self.dataset_cfg, class_names=self.class_names, training=self.training, root_path=self.root_path, logger=self.logger
)
self.split = split
self.root_split_path = self.root_path / ('training' if self.split != 'test' else 'testing')
split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')
self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if split_dir.exists() else None
def get_lidar(self, idx):
lidar_file = self.root_split_path / 'velodyne' / ('%s.bin' % idx)
assert lidar_file.exists()
return np.fromfile(str(lidar_file), dtype=np.float32).reshape(-1, 4)
@staticmethod
def generate_prediction_dicts(batch_dict, pred_dicts, class_names, output_path=None):
"""
Args:
batch_dict:
frame_id:
pred_dicts: list of pred_dicts
pred_boxes: (N, 7), Tensor
pred_scores: (N), Tensor
pred_labels: (N), Tensor
class_names:
output_path:
Returns:
"""
def get_template_prediction(num_samples):
ret_dict = {
'name': np.zeros(num_samples), 'truncated': np.zeros(num_samples),
'occluded': np.zeros(num_samples), 'alpha': np.zeros(num_samples),
'bbox': np.zeros([num_samples, 4]), 'dimensions': np.zeros([num_samples, 3]),
'location': np.zeros([num_samples, 3]), 'rotation_y': np.zeros(num_samples),
'score': np.zeros(num_samples), 'boxes_lidar': np.zeros([num_samples, 7])
}
return ret_dict
def generate_single_sample_dict(batch_index, box_dict):
pred_scores = box_dict['pred_scores'].cpu().numpy()
pred_boxes = box_dict['pred_boxes'].cpu().numpy()
pred_labels = box_dict['pred_labels'].cpu().numpy()
pred_dict = get_template_prediction(pred_scores.shape[0])
if pred_scores.shape[0] == 0:
return pred_dict
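            # Argoverse 2 provides no camera-frame boxes; reuse the LiDAR boxes
            # to fill the KITTI-style template fields below.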
pred_boxes_img = pred_boxes
pred_boxes_camera = pred_boxes
pred_dict['name'] = np.array(class_names)[pred_labels - 1]
pred_dict['alpha'] = -np.arctan2(-pred_boxes[:, 1], pred_boxes[:, 0]) + pred_boxes_camera[:, 6]
pred_dict['bbox'] = pred_boxes_img
pred_dict['dimensions'] = pred_boxes_camera[:, 3:6]
pred_dict['location'] = pred_boxes_camera[:, 0:3]
pred_dict['rotation_y'] = pred_boxes_camera[:, 6]
pred_dict['score'] = pred_scores
pred_dict['boxes_lidar'] = pred_boxes
return pred_dict
annos = []
for index, box_dict in enumerate(pred_dicts):
frame_id = batch_dict['frame_id'][index]
single_pred_dict = generate_single_sample_dict(index, box_dict)
single_pred_dict['frame_id'] = frame_id
annos.append(single_pred_dict)
if output_path is not None:
cur_det_file = output_path / ('%s.txt' % frame_id)
with open(cur_det_file, 'w') as f:
bbox = single_pred_dict['bbox']
loc = single_pred_dict['location']
dims = single_pred_dict['dimensions'] # lhw -> hwl
for idx in range(len(bbox)):
print('%s -1 -1 %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f'
% (single_pred_dict['name'][idx], single_pred_dict['alpha'][idx],
bbox[idx][0], bbox[idx][1], bbox[idx][2], bbox[idx][3],
dims[idx][1], dims[idx][2], dims[idx][0], loc[idx][0],
loc[idx][1], loc[idx][2], single_pred_dict['rotation_y'][idx],
single_pred_dict['score'][idx]), file=f)
return annos
def __len__(self):
if self._merge_all_iters_to_one_epoch:
return len(self.kitti_infos) * self.total_epochs
return len(self.kitti_infos)
def __getitem__(self, index):
if self._merge_all_iters_to_one_epoch:
index = index % len(self.kitti_infos)
info = copy.deepcopy(self.kitti_infos[index])
        sample_idx = Path(info['point_cloud']['velodyne_path']).stem  # Path.stem strips the '.bin' suffix safely; str.rstrip('.bin') would also drop trailing 'b'/'i'/'n' characters
calib = None
get_item_list = self.dataset_cfg.get('GET_ITEM_LIST', ['points'])
input_dict = {
'frame_id': sample_idx,
'calib': calib,
}
if 'annos' in info:
annos = info['annos']
loc, dims, rots = annos['location'], annos['dimensions'], annos['rotation_y']
gt_names = annos['name']
gt_bboxes_3d = np.concatenate([loc, dims, rots[..., np.newaxis]], axis=1).astype(np.float32)
input_dict.update({
'gt_names': gt_names,
'gt_boxes': gt_bboxes_3d
})
if "points" in get_item_list:
points = self.get_lidar(sample_idx)
input_dict['points'] = points
input_dict['calib'] = calib
data_dict = self.prepare_data(data_dict=input_dict)
return data_dict
def format_results(self,
outputs,
class_names,
pklfile_prefix=None,
submission_prefix=None,
):
"""Format the results to .feather file with argo2 format.
Args:
outputs (list[dict]): Testing results of the dataset.
            pklfile_prefix (str | None): Output path prefix; if set, the
                detections are also saved to '<pklfile_prefix>.feather'.
                Default: None.
            submission_prefix (str | None): Unused here; kept for interface
                compatibility. Default: None.
        Returns:
            pandas.DataFrame: Detections in Argoverse 2 format, indexed by
                (log_id, timestamp_ns).
        """
import pandas as pd
assert len(self.kitti_infos) == len(outputs)
num_samples = len(outputs)
print('\nGot {} samples'.format(num_samples))
serialized_dts_list = []
print('\nConvert predictions to Argoverse 2 format')
for i in range(num_samples):
out_i = outputs[i]
log_id, ts = self.kitti_infos[i]['uuid'].split('/')
track_uuid = None
category = [class_name.upper() for class_name in out_i['name']]
serialized_dts = pd.DataFrame(
self.lidar_box_to_argo2(out_i['bbox']).numpy(), columns=list(LABEL_ATTR)
)
serialized_dts["score"] = out_i['score']
serialized_dts["log_id"] = log_id
serialized_dts["timestamp_ns"] = int(ts)
serialized_dts["category"] = category
serialized_dts_list.append(serialized_dts)
dts = (
pd.concat(serialized_dts_list)
.set_index(["log_id", "timestamp_ns"])
.sort_index()
)
dts = dts.sort_values("score", ascending=False).reset_index()
if pklfile_prefix is not None:
            if not pklfile_prefix.endswith('.feather'):
pklfile_prefix = f'{pklfile_prefix}.feather'
dts.to_feather(pklfile_prefix)
print(f'Result is saved to {pklfile_prefix}.')
dts = dts.set_index(["log_id", "timestamp_ns"]).sort_index()
return dts
def lidar_box_to_argo2(self, boxes):
boxes = torch.Tensor(boxes)
cnt_xyz = boxes[:, :3]
        lwh = boxes[:, [4, 3, 5]]
        # convert the OpenPCDet heading to the Argoverse 2 yaw convention
        yaw = -boxes[:, 6] - 0.5 * np.pi
        # wrap yaw into [-pi, pi]
while (yaw < -np.pi).any():
yaw[yaw < -np.pi] += 2 * np.pi
while (yaw > np.pi).any():
yaw[yaw > np.pi] -= 2 * np.pi
quat = yaw_to_quat(yaw)
argo_cuboid = torch.cat([cnt_xyz, lwh, quat], dim=1)
return argo_cuboid
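    # Illustrative usage (hypothetical values): one LiDAR box
    # (x, y, z, dx, dy, dz, heading) maps to a 10-column cuboid row ordered as
    # LABEL_ATTR (tx_m, ty_m, tz_m, length_m, width_m, height_m, qw, qx, qy, qz):
    #   boxes = np.array([[10.0, -2.0, 0.5, 4.5, 1.9, 1.6, 0.3]], dtype=np.float32)
    #   cuboid = self.lidar_box_to_argo2(boxes)  # -> torch.Tensor of shape (1, 10)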
def evaluation(self,
results,
class_names,
eval_metric='waymo',
logger=None,
pklfile_prefix=None,
submission_prefix=None,
show=False,
output_path=None,
pipeline=None):
"""Evaluation in KITTI protocol.
Args:
results (list[dict]): Testing results of the dataset.
metric (str | list[str]): Metrics to be evaluated.
Default: 'waymo'. Another supported metric is 'kitti'.
logger (logging.Logger | str | None): Logger used for printing
related information during evaluation. Default: None.
pklfile_prefix (str | None): The prefix of pkl files. It includes
the file path and the prefix of filename, e.g., "a/b/prefix".
If not specified, a temp file will be created. Default: None.
submission_prefix (str | None): The prefix of submission datas.
If not specified, the submission data will not be generated.
show (bool): Whether to visualize.
Default: False.
out_dir (str): Path to save the visualization results.
Default: None.
pipeline (list[dict], optional): raw data loading for showing.
Default: None.
Returns:
dict[str: float]: results of each evaluation metric
"""
from av2.evaluation.detection.constants import CompetitionCategories
from av2.evaluation.detection.utils import DetectionCfg
from av2.evaluation.detection.eval import evaluate
from av2.utils.io import read_feather
dts = self.format_results(results, class_names, pklfile_prefix, submission_prefix)
argo2_root = "../data/argo2/"
val_anno_path = osp.join(argo2_root, 'val_anno.feather')
gts = read_feather(val_anno_path)
gts = gts.set_index(["log_id", "timestamp_ns"]).sort_values("category")
valid_uuids_gts = gts.index.tolist()
valid_uuids_dts = dts.index.tolist()
valid_uuids = set(valid_uuids_gts) & set(valid_uuids_dts)
gts = gts.loc[list(valid_uuids)].sort_index()
categories = set(x.value for x in CompetitionCategories)
categories &= set(gts["category"].unique().tolist())
split = 'val'
dataset_dir = Path(argo2_root) / 'sensor' / split
cfg = DetectionCfg(
dataset_dir=dataset_dir,
categories=tuple(sorted(categories)),
#split=split,
max_range_m=200.0,
eval_only_roi_instances=True,
)
# Evaluate using Argoverse detection API.
eval_dts, eval_gts, metrics = evaluate(
dts.reset_index(), gts.reset_index(), cfg
)
valid_categories = sorted(categories) + ["AVERAGE_METRICS"]
ap_dict = {}
for index, row in metrics.iterrows():
ap_dict[index] = row.to_json()
return metrics.loc[valid_categories], ap_dict
LABEL_ATTR = (
"tx_m",
"ty_m",
"tz_m",
"length_m",
"width_m",
"height_m",
"qw",
"qx",
"qy",
"qz",
)
\ No newline at end of file
"""SO(3) group transformations."""
import kornia.geometry.conversions as C
import torch
from torch import Tensor
from math import pi as PI
@torch.jit.script
def quat_to_mat(quat_wxyz: Tensor) -> Tensor:
"""Convert scalar first quaternion to rotation matrix.
Args:
quat_wxyz: (...,4) Scalar first quaternions.
Returns:
(...,3,3) 3D rotation matrices.
"""
return C.quaternion_to_rotation_matrix(
quat_wxyz, order=C.QuaternionCoeffOrder.WXYZ
)
# @torch.jit.script
def mat_to_quat(mat: Tensor) -> Tensor:
"""Convert rotation matrix to scalar first quaternion.
Args:
mat: (...,3,3) 3D rotation matrices.
Returns:
(...,4) Scalar first quaternions.
"""
return C.rotation_matrix_to_quaternion(
mat, order=C.QuaternionCoeffOrder.WXYZ
)
@torch.jit.script
def quat_to_xyz(
quat_wxyz: Tensor, singularity_value: float = PI / 2
) -> Tensor:
"""Convert scalar first quaternion to Tait-Bryan angles.
Reference:
https://en.wikipedia.org/wiki/Conversion_between_quaternions_and_Euler_angles#Source_code_2
Args:
quat_wxyz: (...,4) Scalar first quaternions.
singularity_value: Value that's set at the singularities.
Returns:
(...,3) The Tait-Bryan angles --- roll, pitch, and yaw.
"""
qw = quat_wxyz[..., 0]
qx = quat_wxyz[..., 1]
qy = quat_wxyz[..., 2]
qz = quat_wxyz[..., 3]
# roll (x-axis rotation)
sinr_cosp = 2 * (qw * qx + qy * qz)
cosr_cosp = 1 - 2 * (qx * qx + qy * qy)
roll = torch.atan2(sinr_cosp, cosr_cosp)
# pitch (y-axis rotation)
pitch = 2 * (qw * qy - qz * qx)
is_out_of_range = torch.abs(pitch) >= 1
pitch[is_out_of_range] = torch.copysign(
torch.as_tensor(singularity_value), pitch[is_out_of_range]
)
pitch[~is_out_of_range] = torch.asin(pitch[~is_out_of_range])
# yaw (z-axis rotation)
siny_cosp = 2 * (qw * qz + qx * qy)
cosy_cosp = 1 - 2 * (qy * qy + qz * qz)
yaw = torch.atan2(siny_cosp, cosy_cosp)
xyz = torch.stack([roll, pitch, yaw], dim=-1)
return xyz
@torch.jit.script
def quat_to_yaw(quat_wxyz: Tensor) -> Tensor:
"""Convert scalar first quaternion to yaw (rotation about vertical axis).
Reference:
https://en.wikipedia.org/wiki/Conversion_between_quaternions_and_Euler_angles#Source_code_2
Args:
quat_wxyz: (...,4) Scalar first quaternions.
Returns:
(...,) The rotation about the z-axis in radians.
"""
xyz = quat_to_xyz(quat_wxyz)
yaw_rad: Tensor = xyz[..., -1]
return yaw_rad
@torch.jit.script
def xyz_to_quat(xyz_rad: Tensor) -> Tensor:
"""Convert euler angles (xyz - pitch, roll, yaw) to scalar first quaternions.
Args:
xyz_rad: (...,3) Tensor of roll, pitch, and yaw in radians.
Returns:
(...,4) Scalar first quaternions (wxyz).
"""
x_rad = xyz_rad[..., 0]
y_rad = xyz_rad[..., 1]
z_rad = xyz_rad[..., 2]
cy = torch.cos(z_rad * 0.5)
sy = torch.sin(z_rad * 0.5)
cp = torch.cos(y_rad * 0.5)
sp = torch.sin(y_rad * 0.5)
cr = torch.cos(x_rad * 0.5)
sr = torch.sin(x_rad * 0.5)
qw = cr * cp * cy + sr * sp * sy
qx = sr * cp * cy - cr * sp * sy
qy = cr * sp * cy + sr * cp * sy
qz = cr * cp * sy - sr * sp * cy
quat_wxyz = torch.stack([qw, qx, qy, qz], dim=-1)
return quat_wxyz
@torch.jit.script
def yaw_to_quat(yaw_rad: Tensor) -> Tensor:
"""Convert yaw (rotation about the vertical axis) to scalar first quaternions.
Args:
        yaw_rad: (...,) Rotations about the z-axis in radians.
Returns:
(...,4) scalar first quaternions (wxyz).
"""
xyz_rad = torch.zeros_like(yaw_rad)[..., None].repeat_interleave(3, dim=-1)
xyz_rad[..., -1] = yaw_rad
quat_wxyz: Tensor = xyz_to_quat(xyz_rad)
return quat_wxyz
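if __name__ == "__main__":
    # Illustrative sanity check (example usage): a 90 degree yaw should map to
    # (cos(pi/4), 0, 0, sin(pi/4)) in scalar-first (wxyz) order.
    demo_quat = yaw_to_quat(torch.tensor([PI / 2]))
    print(demo_quat)  # tensor([[0.7071, 0.0000, 0.0000, 0.7071]])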
......@@ -199,13 +199,19 @@ class DatasetTemplate(torch_data.Dataset):
data_dict[key].append(val)
batch_size = len(batch_list)
ret = {}
batch_size_ratio = 1
for key, val in data_dict.items():
try:
if key in ['voxels', 'voxel_num_points']:
if isinstance(val[0], list):
batch_size_ratio = len(val[0])
val = [i for item in val for i in item]
ret[key] = np.concatenate(val, axis=0)
elif key in ['points', 'voxel_coords']:
coors = []
if isinstance(val[0], list):
val = [i for item in val for i in item]
for i, coor in enumerate(val):
coor_pad = np.pad(coor, ((0, 0), (1, 0)), mode='constant', constant_values=i)
coors.append(coor_pad)
......@@ -287,5 +293,5 @@ class DatasetTemplate(torch_data.Dataset):
print('Error in collate_batch: key=%s' % key)
raise TypeError
ret['batch_size'] = batch_size * batch_size_ratio
return ret
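With double-flip test-time augmentation each sample contributes four voxelizations (original plus y-, x-, and xy-flips), so `val` arrives as a list of lists. A minimal, self-contained sketch of the flattening this collate function performs (illustrative shapes only):

```python
import numpy as np

# Two samples, each carrying 4 voxel-coordinate sets from double-flip TTA.
val = [[np.zeros((5, 3))] * 4, [np.zeros((7, 3))] * 4]
batch_size_ratio = len(val[0])                # 4
val = [v for item in val for v in item]       # flatten to 8 arrays
merged = np.concatenate(val, axis=0)          # (48, 3), as in collate_batch
effective_batch_size = 2 * batch_size_ratio   # ret['batch_size'] = 8
```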
......@@ -112,7 +112,23 @@ class DataProcessor(object):
return partial(self.transform_points_to_voxels_placeholder, config=config)
return data_dict
    def double_flip(self, points):
        """Return y-, x-, and xy-flipped copies of the points for double-flip test-time augmentation."""
        # y flip
points_yflip = points.copy()
points_yflip[:, 1] = -points_yflip[:, 1]
# x flip
points_xflip = points.copy()
points_xflip[:, 0] = -points_xflip[:, 0]
# x y flip
points_xyflip = points.copy()
points_xyflip[:, 0] = -points_xyflip[:, 0]
points_xyflip[:, 1] = -points_xyflip[:, 1]
return points_yflip, points_xflip, points_xyflip
def transform_points_to_voxels(self, data_dict=None, config=None):
if data_dict is None:
grid_size = (self.point_cloud_range[3:6] - self.point_cloud_range[0:3]) / np.array(config.VOXEL_SIZE)
......@@ -138,9 +154,28 @@ class DataProcessor(object):
if not data_dict['use_lead_xyz']:
voxels = voxels[..., 3:] # remove xyz in voxels(N, 3)
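        # With DOUBLE_FLIP enabled, voxelize the original points plus their y-,
        # x-, and xy-flipped copies; the four results are stored as lists, which
        # DatasetTemplate.collate_batch flattens while scaling batch_size.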
if config.get('DOUBLE_FLIP', False):
voxels_list, voxel_coords_list, voxel_num_points_list = [voxels], [coordinates], [num_points]
points_yflip, points_xflip, points_xyflip = self.double_flip(points)
points_list = [points_yflip, points_xflip, points_xyflip]
keys = ['yflip', 'xflip', 'xyflip']
for i, key in enumerate(keys):
voxel_output = self.voxel_generator.generate(points_list[i])
voxels, coordinates, num_points = voxel_output
if not data_dict['use_lead_xyz']:
voxels = voxels[..., 3:]
voxels_list.append(voxels)
voxel_coords_list.append(coordinates)
voxel_num_points_list.append(num_points)
data_dict['voxels'] = voxels_list
data_dict['voxel_coords'] = voxel_coords_list
data_dict['voxel_num_points'] = voxel_num_points_list
else:
data_dict['voxels'] = voxels
data_dict['voxel_coords'] = coordinates
data_dict['voxel_num_points'] = num_points
return data_dict
def sample_points(self, data_dict=None, config=None):
......
......@@ -2,6 +2,7 @@ from .pointnet2_backbone import PointNet2Backbone, PointNet2MSG
from .spconv_backbone import VoxelBackBone8x, VoxelResBackBone8x
from .spconv_backbone_2d import PillarBackBone8x, PillarRes18BackBone8x
from .spconv_backbone_focal import VoxelBackBone8xFocal
from .spconv_backbone_voxelnext import VoxelResBackBone8xVoxelNeXt
from .spconv_unet import UNetV2
__all__ = {
......@@ -11,6 +12,7 @@ __all__ = {
'PointNet2MSG': PointNet2MSG,
'VoxelResBackBone8x': VoxelResBackBone8x,
'VoxelBackBone8xFocal': VoxelBackBone8xFocal,
'VoxelResBackBone8xVoxelNeXt': VoxelResBackBone8xVoxelNeXt,
'PillarBackBone8x': PillarBackBone8x,
'PillarRes18BackBone8x': PillarRes18BackBone8x
}
from functools import partial
import torch
import torch.nn as nn
from ...utils.spconv_utils import replace_feature, spconv
def post_act_block(in_channels, out_channels, kernel_size, indice_key=None, stride=1, padding=0,
conv_type='subm', norm_fn=None):
if conv_type == 'subm':
conv = spconv.SubMConv3d(in_channels, out_channels, kernel_size, bias=False, indice_key=indice_key)
elif conv_type == 'spconv':
conv = spconv.SparseConv3d(in_channels, out_channels, kernel_size, stride=stride, padding=padding,
bias=False, indice_key=indice_key)
elif conv_type == 'inverseconv':
conv = spconv.SparseInverseConv3d(in_channels, out_channels, kernel_size, indice_key=indice_key, bias=False)
else:
raise NotImplementedError
m = spconv.SparseSequential(
conv,
norm_fn(out_channels),
nn.ReLU(),
)
return m
class SparseBasicBlock(spconv.SparseModule):
expansion = 1
def __init__(self, inplanes, planes, stride=1, norm_fn=None, downsample=None, indice_key=None):
super(SparseBasicBlock, self).__init__()
assert norm_fn is not None
bias = norm_fn is not None
self.conv1 = spconv.SubMConv3d(
inplanes, planes, kernel_size=3, stride=stride, padding=1, bias=bias, indice_key=indice_key
)
self.bn1 = norm_fn(planes)
self.relu = nn.ReLU()
self.conv2 = spconv.SubMConv3d(
planes, planes, kernel_size=3, stride=stride, padding=1, bias=bias, indice_key=indice_key
)
self.bn2 = norm_fn(planes)
self.downsample = downsample
self.stride = stride
def forward(self, x):
identity = x
out = self.conv1(x)
out = replace_feature(out, self.bn1(out.features))
out = replace_feature(out, self.relu(out.features))
out = self.conv2(out)
out = replace_feature(out, self.bn2(out.features))
if self.downsample is not None:
identity = self.downsample(x)
out = replace_feature(out, out.features + identity.features)
out = replace_feature(out, self.relu(out.features))
return out
class VoxelResBackBone8xVoxelNeXt(nn.Module):
def __init__(self, model_cfg, input_channels, grid_size, **kwargs):
super().__init__()
self.model_cfg = model_cfg
norm_fn = partial(nn.BatchNorm1d, eps=1e-3, momentum=0.01)
spconv_kernel_sizes = model_cfg.get('SPCONV_KERNEL_SIZES', [3, 3, 3, 3])
channels = model_cfg.get('CHANNELS', [16, 32, 64, 128, 128])
out_channel = model_cfg.get('OUT_CHANNEL', 128)
self.sparse_shape = grid_size[::-1] + [1, 0, 0]
self.conv_input = spconv.SparseSequential(
spconv.SubMConv3d(input_channels, channels[0], 3, padding=1, bias=False, indice_key='subm1'),
norm_fn(channels[0]),
nn.ReLU(),
)
block = post_act_block
self.conv1 = spconv.SparseSequential(
SparseBasicBlock(channels[0], channels[0], norm_fn=norm_fn, indice_key='res1'),
SparseBasicBlock(channels[0], channels[0], norm_fn=norm_fn, indice_key='res1'),
)
self.conv2 = spconv.SparseSequential(
# [1600, 1408, 41] <- [800, 704, 21]
block(channels[0], channels[1], spconv_kernel_sizes[0], norm_fn=norm_fn, stride=2, padding=int(spconv_kernel_sizes[0]//2), indice_key='spconv2', conv_type='spconv'),
SparseBasicBlock(channels[1], channels[1], norm_fn=norm_fn, indice_key='res2'),
SparseBasicBlock(channels[1], channels[1], norm_fn=norm_fn, indice_key='res2'),
)
self.conv3 = spconv.SparseSequential(
# [800, 704, 21] <- [400, 352, 11]
block(channels[1], channels[2], spconv_kernel_sizes[1], norm_fn=norm_fn, stride=2, padding=int(spconv_kernel_sizes[1]//2), indice_key='spconv3', conv_type='spconv'),
SparseBasicBlock(channels[2], channels[2], norm_fn=norm_fn, indice_key='res3'),
SparseBasicBlock(channels[2], channels[2], norm_fn=norm_fn, indice_key='res3'),
)
self.conv4 = spconv.SparseSequential(
# [400, 352, 11] <- [200, 176, 6]
block(channels[2], channels[3], spconv_kernel_sizes[2], norm_fn=norm_fn, stride=2, padding=int(spconv_kernel_sizes[2]//2), indice_key='spconv4', conv_type='spconv'),
SparseBasicBlock(channels[3], channels[3], norm_fn=norm_fn, indice_key='res4'),
SparseBasicBlock(channels[3], channels[3], norm_fn=norm_fn, indice_key='res4'),
)
self.conv5 = spconv.SparseSequential(
# [200, 176, 6] <- [100, 88, 3]
block(channels[3], channels[4], spconv_kernel_sizes[3], norm_fn=norm_fn, stride=2, padding=int(spconv_kernel_sizes[3]//2), indice_key='spconv5', conv_type='spconv'),
SparseBasicBlock(channels[4], channels[4], norm_fn=norm_fn, indice_key='res5'),
SparseBasicBlock(channels[4], channels[4], norm_fn=norm_fn, indice_key='res5'),
)
self.conv6 = spconv.SparseSequential(
            # [100, 88, 3] <- [50, 44, 2]
block(channels[4], channels[4], spconv_kernel_sizes[3], norm_fn=norm_fn, stride=2, padding=int(spconv_kernel_sizes[3]//2), indice_key='spconv6', conv_type='spconv'),
SparseBasicBlock(channels[4], channels[4], norm_fn=norm_fn, indice_key='res6'),
SparseBasicBlock(channels[4], channels[4], norm_fn=norm_fn, indice_key='res6'),
)
self.conv_out = spconv.SparseSequential(
            # 3x3, stride-1 2D sparse conv on the BEV tensor produced by bev_out
spconv.SparseConv2d(channels[3], out_channel, 3, stride=1, padding=1, bias=False, indice_key='spconv_down2'),
norm_fn(out_channel),
nn.ReLU(),
)
self.shared_conv = spconv.SparseSequential(
spconv.SubMConv2d(out_channel, out_channel, 3, stride=1, padding=1, bias=True),
nn.BatchNorm1d(out_channel),
nn.ReLU(True),
)
self.forward_ret_dict = {}
self.num_point_features = out_channel
self.backbone_channels = {
'x_conv1': channels[0],
'x_conv2': channels[1],
'x_conv3': channels[2],
'x_conv4': channels[3]
}
def bev_out(self, x_conv):
features_cat = x_conv.features
indices_cat = x_conv.indices[:, [0, 2, 3]]
spatial_shape = x_conv.spatial_shape[1:]
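        # Dropping the z index means voxels from different heights can share a
        # (batch, y, x) cell; duplicates are merged below by summing features.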
indices_unique, _inv = torch.unique(indices_cat, dim=0, return_inverse=True)
features_unique = features_cat.new_zeros((indices_unique.shape[0], features_cat.shape[1]))
features_unique.index_add_(0, _inv, features_cat)
x_out = spconv.SparseConvTensor(
features=features_unique,
indices=indices_unique,
spatial_shape=spatial_shape,
batch_size=x_conv.batch_size
)
return x_out
def forward(self, batch_dict):
"""
Args:
batch_dict:
batch_size: int
vfe_features: (num_voxels, C)
voxel_coords: (num_voxels, 4), [batch_idx, z_idx, y_idx, x_idx]
Returns:
batch_dict:
encoded_spconv_tensor: sparse tensor
"""
voxel_features, voxel_coords = batch_dict['voxel_features'], batch_dict['voxel_coords']
batch_size = batch_dict['batch_size']
input_sp_tensor = spconv.SparseConvTensor(
features=voxel_features,
indices=voxel_coords.int(),
spatial_shape=self.sparse_shape,
batch_size=batch_size
)
x = self.conv_input(input_sp_tensor)
x_conv1 = self.conv1(x)
x_conv2 = self.conv2(x_conv1)
x_conv3 = self.conv3(x_conv2)
x_conv4 = self.conv4(x_conv3)
x_conv5 = self.conv5(x_conv4)
x_conv6 = self.conv6(x_conv5)
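        # Align the stride-16/32 indices of conv5/conv6 to conv4's stride-8 grid,
        # then merge all three scales into one sparse tensor by concatenating
        # features and indices along the voxel dimension.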
x_conv5.indices[:, 1:] *= 2
x_conv6.indices[:, 1:] *= 4
x_conv4 = x_conv4.replace_feature(torch.cat([x_conv4.features, x_conv5.features, x_conv6.features]))
x_conv4.indices = torch.cat([x_conv4.indices, x_conv5.indices, x_conv6.indices])
out = self.bev_out(x_conv4)
out = self.conv_out(out)
out = self.shared_conv(out)
batch_dict.update({
'encoded_spconv_tensor': out,
'encoded_spconv_tensor_stride': 8
})
batch_dict.update({
'multi_scale_3d_features': {
'x_conv1': x_conv1,
'x_conv2': x_conv2,
'x_conv3': x_conv3,
'x_conv4': x_conv4,
}
})
batch_dict.update({
'multi_scale_3d_strides': {
'x_conv1': 1,
'x_conv2': 2,
'x_conv3': 4,
'x_conv4': 8,
}
})
return batch_dict
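The `bev_out` height compression above relies on `torch.unique(..., return_inverse=True)` plus `index_add_` to sum the features of voxels that share a BEV cell. A small self-contained demonstration of that pattern (made-up values):

```python
import torch

# Three voxels; the first two share the same (batch, y, x) BEV cell.
indices_cat = torch.tensor([[0, 3, 5], [0, 3, 5], [0, 1, 2]])
features_cat = torch.tensor([[1.0], [2.0], [4.0]])

indices_unique, inv = torch.unique(indices_cat, dim=0, return_inverse=True)
pooled = features_cat.new_zeros((indices_unique.shape[0], features_cat.shape[1]))
pooled.index_add_(0, inv, features_cat)

print(indices_unique.tolist())      # [[0, 1, 2], [0, 3, 5]]
print(pooled.squeeze(-1).tolist())  # [4.0, 3.0]
```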
......@@ -5,6 +5,7 @@ from .point_head_box import PointHeadBox
from .point_head_simple import PointHeadSimple
from .point_intra_part_head import PointIntraPartOffsetHead
from .center_head import CenterHead
from .voxelnext_head import VoxelNeXtHead
__all__ = {
'AnchorHeadTemplate': AnchorHeadTemplate,
......@@ -13,5 +14,6 @@ __all__ = {
'PointHeadSimple': PointHeadSimple,
'PointHeadBox': PointHeadBox,
'AnchorHeadMulti': AnchorHeadMulti,
'CenterHead': CenterHead,
'VoxelNeXtHead': VoxelNeXtHead,
}
This diff is collapsed.
......@@ -12,6 +12,7 @@ from .pv_rcnn_plusplus import PVRCNNPlusPlus
from .mppnet import MPPNet
from .mppnet_e2e import MPPNetE2E
from .pillarnet import PillarNet
from .voxelnext import VoxelNeXt
__all__ = {
'Detector3DTemplate': Detector3DTemplate,
......@@ -28,7 +29,8 @@ __all__ = {
'PVRCNNPlusPlus': PVRCNNPlusPlus,
'MPPNet': MPPNet,
'MPPNetE2E': MPPNetE2E,
'PillarNet': PillarNet,
'VoxelNeXt': VoxelNeXt
}
......
......@@ -127,7 +127,7 @@ class Detector3DTemplate(nn.Module):
return None, model_info_dict
dense_head_module = dense_heads.__all__[self.model_cfg.DENSE_HEAD.NAME](
model_cfg=self.model_cfg.DENSE_HEAD,
input_channels=model_info_dict['num_bev_features'] if 'num_bev_features' in model_info_dict else self.model_cfg.DENSE_HEAD.INPUT_FEATURES,
num_class=self.num_class if not self.model_cfg.DENSE_HEAD.CLASS_AGNOSTIC else 1,
class_names=self.class_names,
grid_size=model_info_dict['grid_size'],
......
from .detector3d_template import Detector3DTemplate
class VoxelNeXt(Detector3DTemplate):
def __init__(self, model_cfg, num_class, dataset):
super().__init__(model_cfg=model_cfg, num_class=num_class, dataset=dataset)
self.module_list = self.build_networks()
def forward(self, batch_dict):
for cur_module in self.module_list:
batch_dict = cur_module(batch_dict)
if self.training:
loss, tb_dict, disp_dict = self.get_training_loss()
ret_dict = {
'loss': loss
}
return ret_dict, tb_dict, disp_dict
else:
pred_dicts, recall_dicts = self.post_processing(batch_dict)
return pred_dicts, recall_dicts
def get_training_loss(self):
disp_dict = {}
loss, tb_dict = self.dense_head.get_loss()
return loss, tb_dict, disp_dict
def post_processing(self, batch_dict):
post_process_cfg = self.model_cfg.POST_PROCESSING
batch_size = batch_dict['batch_size']
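        # The VoxelNeXt head already produced final boxes during its forward
        # pass; post-processing here only gathers them and records recall.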
final_pred_dict = batch_dict['final_box_dicts']
recall_dict = {}
for index in range(batch_size):
pred_boxes = final_pred_dict[index]['pred_boxes']
recall_dict = self.generate_recall_record(
box_preds=pred_boxes,
recall_dict=recall_dict, batch_index=index, data_dict=batch_dict,
thresh_list=post_process_cfg.RECALL_THRESH_LIST
)
return final_pred_dict, recall_dict
......@@ -77,6 +77,25 @@ def _nms(heat, kernel=3):
return heat * keep
def gaussian3D(shape, sigma=1):
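    # Note: despite the name, this builds a 2D Gaussian kernel over a (y, x) grid.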
m, n = [(ss - 1.) / 2. for ss in shape]
y, x = np.ogrid[-m:m + 1, -n:n + 1]
h = np.exp(-(x * x + y * y) / (2 * sigma * sigma))
h[h < np.finfo(h.dtype).eps * h.max()] = 0
return h
def draw_gaussian_to_heatmap_voxels(heatmap, distances, radius, k=1):
diameter = 2 * radius + 1
sigma = diameter / 6
masked_gaussian = torch.exp(- distances / (2 * sigma * sigma))
torch.max(heatmap, masked_gaussian, out=heatmap)
return heatmap
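# Illustrative usage (made-up values): 'distances' is expected to hold squared
# distances from candidate voxels to a target center, so the Gaussian peaks at 1
# where the distance is 0 and decays with sigma = diameter / 6:
#   heatmap = torch.zeros(5)
#   distances = torch.tensor([0.0, 1.0, 4.0, 9.0, 16.0])
#   heatmap = draw_gaussian_to_heatmap_voxels(heatmap, distances, radius=2)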
@numba.jit(nopython=True)
def circle_nms(dets, thresh):
x1 = dets[:, 0]
......@@ -214,3 +233,116 @@ def decode_bbox_from_heatmap(heatmap, rot_cos, rot_sin, center, center_z, dim,
'pred_labels': cur_labels
})
return ret_pred_dicts
def _topk_1d(scores, batch_size, batch_idx, obj, K=40, nuscenes=False):
# scores: (N, num_classes)
topk_score_list = []
topk_inds_list = []
topk_classes_list = []
for bs_idx in range(batch_size):
batch_inds = batch_idx==bs_idx
if obj.shape[-1] == 1 and not nuscenes:
score = scores[batch_inds].permute(1, 0)
topk_scores, topk_inds = torch.topk(score, K)
topk_score, topk_ind = torch.topk(obj[topk_inds.view(-1)].squeeze(-1), K) #torch.topk(topk_scores.view(-1), K)
else:
score = obj[batch_inds].permute(1, 0)
topk_scores, topk_inds = torch.topk(score, min(K, score.shape[-1]))
topk_score, topk_ind = torch.topk(topk_scores.view(-1), min(K, topk_scores.view(-1).shape[-1]))
        topk_classes = (topk_ind // K).int()
        topk_inds = topk_inds.view(-1).gather(0, topk_ind)
        if obj is not None and obj.shape[-1] == 1:
topk_score_list.append(obj[batch_inds][topk_inds])
else:
topk_score_list.append(topk_score)
topk_inds_list.append(topk_inds)
topk_classes_list.append(topk_classes)
topk_score = torch.stack(topk_score_list)
topk_inds = torch.stack(topk_inds_list)
topk_classes = torch.stack(topk_classes_list)
return topk_score, topk_inds, topk_classes
def gather_feat_idx(feats, inds, batch_size, batch_idx):
feats_list = []
dim = feats.size(-1)
_inds = inds.unsqueeze(-1).expand(inds.size(0), inds.size(1), dim)
for bs_idx in range(batch_size):
batch_inds = batch_idx==bs_idx
feat = feats[batch_inds]
feats_list.append(feat.gather(0, _inds[bs_idx]))
feats = torch.stack(feats_list)
return feats
def decode_bbox_from_voxels_nuscenes(batch_size, indices, obj, rot_cos, rot_sin,
center, center_z, dim, vel=None, iou=None, point_cloud_range=None, voxel_size=None, voxels_3d=None,
feature_map_stride=None, K=100, score_thresh=None, post_center_limit_range=None, add_features=None):
batch_idx = indices[:, 0]
spatial_indices = indices[:, 1:]
scores, inds, class_ids = _topk_1d(None, batch_size, batch_idx, obj, K=K, nuscenes=True)
center = gather_feat_idx(center, inds, batch_size, batch_idx)
rot_sin = gather_feat_idx(rot_sin, inds, batch_size, batch_idx)
rot_cos = gather_feat_idx(rot_cos, inds, batch_size, batch_idx)
center_z = gather_feat_idx(center_z, inds, batch_size, batch_idx)
dim = gather_feat_idx(dim, inds, batch_size, batch_idx)
spatial_indices = gather_feat_idx(spatial_indices, inds, batch_size, batch_idx)
    if add_features is not None:
add_features = [gather_feat_idx(add_feature, inds, batch_size, batch_idx) for add_feature in add_features]
if not isinstance(feature_map_stride, int):
feature_map_stride = gather_feat_idx(feature_map_stride.unsqueeze(-1), inds, batch_size, batch_idx)
angle = torch.atan2(rot_sin, rot_cos)
xs = (spatial_indices[:, :, -1:] + center[:, :, 0:1]) * feature_map_stride * voxel_size[0] + point_cloud_range[0]
ys = (spatial_indices[:, :, -2:-1] + center[:, :, 1:2]) * feature_map_stride * voxel_size[1] + point_cloud_range[1]
    # z is taken directly from the predicted center_z, not reconstructed from the voxel z index
box_part_list = [xs, ys, center_z, dim, angle]
    if vel is not None:
vel = gather_feat_idx(vel, inds, batch_size, batch_idx)
box_part_list.append(vel)
    if iou is not None:
iou = gather_feat_idx(iou, inds, batch_size, batch_idx)
iou = torch.clamp(iou, min=0, max=1.)
    final_box_preds = torch.cat(box_part_list, dim=-1)
final_scores = scores.view(batch_size, K)
final_class_ids = class_ids.view(batch_size, K)
    if add_features is not None:
add_features = [add_feature.view(batch_size, K, add_feature.shape[-1]) for add_feature in add_features]
assert post_center_limit_range is not None
mask = (final_box_preds[..., :3] >= post_center_limit_range[:3]).all(2)
mask &= (final_box_preds[..., :3] <= post_center_limit_range[3:]).all(2)
if score_thresh is not None:
mask &= (final_scores > score_thresh)
ret_pred_dicts = []
for k in range(batch_size):
cur_mask = mask[k]
cur_boxes = final_box_preds[k, cur_mask]
cur_scores = final_scores[k, cur_mask]
cur_labels = final_class_ids[k, cur_mask]
        cur_add_features = [add_feature[k, cur_mask] for add_feature in add_features] if add_features is not None else None
        cur_iou = iou[k, cur_mask] if iou is not None else None
ret_pred_dicts.append({
'pred_boxes': cur_boxes,
'pred_scores': cur_scores,
'pred_labels': cur_labels,
'pred_ious': cur_iou,
'add_features': cur_add_features,
})
return ret_pred_dicts
......@@ -80,6 +80,42 @@ def boxes_iou3d_gpu(boxes_a, boxes_b):
return iou3d
def boxes_aligned_iou3d_gpu(boxes_a, boxes_b):
"""
Args:
boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]
boxes_b: (N, 7) [x, y, z, dx, dy, dz, heading]
Returns:
        ans_iou: (N, 1)
"""
assert boxes_a.shape[0] == boxes_b.shape[0]
assert boxes_a.shape[1] == boxes_b.shape[1] == 7
# height overlap
boxes_a_height_max = (boxes_a[:, 2] + boxes_a[:, 5] / 2).view(-1, 1)
boxes_a_height_min = (boxes_a[:, 2] - boxes_a[:, 5] / 2).view(-1, 1)
boxes_b_height_max = (boxes_b[:, 2] + boxes_b[:, 5] / 2).view(-1, 1)
boxes_b_height_min = (boxes_b[:, 2] - boxes_b[:, 5] / 2).view(-1, 1)
# bev overlap
    overlaps_bev = torch.cuda.FloatTensor(torch.Size((boxes_a.shape[0], 1))).zero_()  # (N, 1)
iou3d_nms_cuda.boxes_aligned_overlap_bev_gpu(boxes_a.contiguous(), boxes_b.contiguous(), overlaps_bev)
max_of_min = torch.max(boxes_a_height_min, boxes_b_height_min)
min_of_max = torch.min(boxes_a_height_max, boxes_b_height_max)
overlaps_h = torch.clamp(min_of_max - max_of_min, min=0)
# 3d iou
overlaps_3d = overlaps_bev * overlaps_h
vol_a = (boxes_a[:, 3] * boxes_a[:, 4] * boxes_a[:, 5]).view(-1, 1)
vol_b = (boxes_b[:, 3] * boxes_b[:, 4] * boxes_b[:, 5]).view(-1, 1)
iou3d = overlaps_3d / torch.clamp(vol_a + vol_b - overlaps_3d, min=1e-6)
return iou3d
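# Illustrative usage (hypothetical tensors; requires CUDA): element-wise IoU of
# paired boxes, e.g. each prediction against its matched ground truth:
#   ious = boxes_aligned_iou3d_gpu(pred_boxes, gt_boxes)  # -> (N, 1)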
def nms_gpu(boxes, scores, thresh, pre_maxsize=None, **kwargs):
"""
......
......@@ -250,3 +250,24 @@ int boxes_iou_bev_cpu(at::Tensor boxes_a_tensor, at::Tensor boxes_b_tensor, at::
}
return 1;
}
int boxes_aligned_iou_bev_cpu(at::Tensor boxes_a_tensor, at::Tensor boxes_b_tensor, at::Tensor ans_iou_tensor){
// params boxes_a_tensor: (N, 7) [x, y, z, dx, dy, dz, heading]
// params boxes_b_tensor: (N, 7) [x, y, z, dx, dy, dz, heading]
// params ans_iou_tensor: (N, 1)
CHECK_CONTIGUOUS(boxes_a_tensor);
CHECK_CONTIGUOUS(boxes_b_tensor);
int num_boxes = boxes_a_tensor.size(0);
int num_boxes_b = boxes_b_tensor.size(0);
assert(num_boxes == num_boxes_b);
const float *boxes_a = boxes_a_tensor.data<float>();
const float *boxes_b = boxes_b_tensor.data<float>();
float *ans_iou = ans_iou_tensor.data<float>();
for (int i = 0; i < num_boxes; i++){
ans_iou[i] = iou_bev(boxes_a + i * 7, boxes_b + i * 7);
}
return 1;
}