Commit 00306509 authored by chenych
MIT License
Copyright (c) 2023 ZhuLifa
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# [PointPillars: Fast Encoders for Object Detection from Point Clouds](https://arxiv.org/abs/1812.05784)
A simple PointPillars PyTorch implementation for 3D LiDAR (KITTI) detection. [[Zhihu](https://zhuanlan.zhihu.com/p/521277176)]
- It can be run without installing [Spconv](https://github.com/traveller59/spconv), [mmdet](https://github.com/open-mmlab/mmdetection) or [mmdet3d](https://github.com/open-mmlab/mmdetection3d).
- Only one detection network (PointPillars) is implemented in this repo, so the code should be easier to read.
- Sincere thanks to the great open-source projects [mmcv](https://github.com/open-mmlab/mmcv), [mmdet](https://github.com/open-mmlab/mmdetection) and [mmdet3d](https://github.com/open-mmlab/mmdetection3d), which helped me learn 3D detection and implement this repo.
## News
- **2025-02** PointPillars can now be built and installed as a Python package.
- **2024-04** Exporting PointPillars to ONNX & TensorRT is supported on branch [feature/deployment](https://github.com/zhulf0804/PointPillars/tree/feature/deployment).
![](./figures/pytorch_trt.png)
## mAP on KITTI validation set (Easy, Moderate, Hard)
| Repo | Metric | Overall | Pedestrian | Cyclist | Car |
| :---: | :---: | :---: | :---: | :---: | :---: |
| this repo | 3D-BBox | 73.3259, 62.7834, 59.6278 | 51.4642, 47.9446, 43.8040 | 81.8677, 63.6617, 60.9126 | 86.6456, 76.7439, 74.1668 |
| [mmdet3d v0.18.1](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1) | 3D-BBox | 72.0537, 60.1114, 55.8320 | 52.0263, 46.4037, 42.4841 | 78.7231, 59.9526, 57.2489 | 85.4118, 73.9780, 67.7630 |
| this repo | BEV | 77.8540, 69.8003, 66.6699 | 59.1687, 54.3456, 50.5023 | 84.4268, 67.1409, 63.7409 | 89.9664, 87.9145, 85.7664 |
| [mmdet3d v0.18.1](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1) | BEV | 76.6485, 67.7609, 64.5605 | 59.0778, 53.3638, 48.4230 | 80.9328, 63.3447, 60.0618 | 89.9348, 86.5743, 85.1967 |
| this repo | 2D-BBox | 80.5097, 74.6120, 71.4758 | 64.6249, 61.4201, 57.5965 | 86.2569, 73.0828, 70.1726 | 90.6471, 89.3330, 86.6583 |
| [mmdet3d v0.18.1](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1) | 2D-BBox | 78.4938, 73.4781, 70.3613 | 62.2413, 58.9157, 55.3660 | 82.6460, 72.3547, 68.4669 | 90.5939, 89.1638, 87.2511 |
| this repo | AOS | 74.9647, 68.1712, 65.2817 | 49.3777, 46.7284, 43.8352 | 85.0412, 69.1024, 66.2801 | 90.4752, 88.6828, 85.7298 |
| [mmdet3d v0.18.1](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1) | AOS | 72.41, 66.23, 63.55 | 46.00, 43.22, 40.94 | 80.85, 67.20, 63.63 | 90.37, 88.27, 86.07 |
- **Note: Here, we report the [mmdet3d v0.18.1](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1) (2022/02/09-2022/03/01) performance based on the officially provided [checkpoint](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1/configs/pointpillars#kitti). Many improvements were made in [mmdet3d v1.0.0rc1](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0rc1)**.
## Detection Visualization
![](./figures/pc_pred_000134.png)
![](./figures/img_3dbbox_000134.png)
## [Install]
Install PointPillars as a python package and all its dependencies as follows:
```
cd PointPillars/
pip install -r requirements.txt
python setup.py build_ext --inplace
pip install .
```
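Once installed, the package can be imported directly from Python. Below is a minimal usage sketch based on the test script shipped in this repo; it assumes a trained checkpoint such as `pretrained/epoch_160.pth`, a KITTI-format `.bin` point cloud, and a CUDA-capable GPU (see `test.py` for the full version, including the point-range crop and the `--no_cuda` path):

```python
import torch
from pointpillars.model import PointPillars
from pointpillars.utils import read_points

# Three classes: Pedestrian, Cyclist, Car (same as in test.py).
model = PointPillars(nclasses=3).cuda()
model.load_state_dict(torch.load('pretrained/epoch_160.pth'))
model.eval()

pc = read_points('pointpillars/dataset/demo_data/val/000134.bin')  # (N, 4): x, y, z, reflectance
with torch.no_grad():
    result = model(batched_pts=[torch.from_numpy(pc).cuda()], mode='test')[0]
# result contains 'lidar_bboxes', 'labels' and 'scores' for the detected objects.
```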
## [Datasets]
1. Download
Download the [point cloud](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip) (29 GB), [images](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip) (12 GB), [calibration files](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip) (16 MB) and [labels](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip) (5 MB). Organize the dataset as follows:
```
kitti
|- training
|- calib (#7481 .txt)
|- image_2 (#7481 .png)
|- label_2 (#7481 .txt)
|- velodyne (#7481 .bin)
|- testing
|- calib (#7518 .txt)
|- image_2 (#7518 .png)
|- velodyne (#7518 .bin)
```
2. Pre-process the KITTI dataset
```
cd PointPillars/
python pre_process_kitti.py --data_root your_path_to_kitti
```
After pre-processing, the dataset looks as follows:
```
kitti
|- training
|- calib (#7481 .txt)
|- image_2 (#7481 .png)
|- label_2 (#7481 .txt)
|- velodyne (#7481 .bin)
|- velodyne_reduced (#7481 .bin)
|- testing
|- calib (#7518 .txt)
|- image_2 (#7518 .png)
|- velodyne (#7518 .bin)
|- velodyne_reduced (#7518 .bin)
|- kitti_gt_database (# 19700 .bin)
|- kitti_infos_train.pkl
|- kitti_infos_val.pkl
|- kitti_infos_trainval.pkl
|- kitti_infos_test.pkl
|- kitti_dbinfos_train.pkl
```
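The generated `.pkl` files store per-frame metadata and can be inspected with a few lines of Python. A minimal sketch (the path is a placeholder; the keys follow the `create_data_info_pkl` function in the preprocessing script below):

```python
import pickle

with open('your_path_to_kitti/kitti_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)           # dict: frame id (int) -> info dict

info = infos[sorted(infos)[0]]       # first training frame
print(info.keys())                   # 'velodyne_path', 'image', 'calib', 'annos'
print(info['image']['image_shape'])  # (H, W) of the corresponding image
print(len(info['annos']['name']))    # number of annotated objects in this frame
```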
## [Training]
```
cd PointPillars/
python train.py --data_root your_path_to_kitti
```
## [Evaluation]
```
cd PointPillars/
python evaluate.py --ckpt pretrained/epoch_160.pth --data_root your_path_to_kitti
```
## [Test]
```
cd PointPillars/
# 1. infer and visualize point cloud detection
python test.py --ckpt pretrained/epoch_160.pth --pc_path your_pc_path
# 2. infer and visualize point cloud detection and ground truth.
python test.py --ckpt pretrained/epoch_160.pth --pc_path your_pc_path --calib_path your_calib_path --gt_path your_gt_path
# 3. infer and visualize point cloud & image detection
python test.py --ckpt pretrained/epoch_160.pth --pc_path your_pc_path --calib_path your_calib_path --img_path your_img_path
# e.g.
# a. infer on val set 000134
python test.py --ckpt pretrained/epoch_160.pth --pc_path pointpillars/dataset/demo_data/val/000134.bin
# or
python test.py --ckpt pretrained/epoch_160.pth --pc_path pointpillars/dataset/demo_data/val/000134.bin \
--calib_path pointpillars/dataset/demo_data/val/000134.txt \
--img_path pointpillars/dataset/demo_data/val/000134.png \
--gt_path pointpillars/dataset/demo_data/val/000134_gt.txt
# b. infer on test set 000002
python test.py --ckpt pretrained/epoch_160.pth --pc_path pointpillars/dataset/demo_data/test/000002.bin
# or
python test.py --ckpt pretrained/epoch_160.pth --pc_path pointpillars/dataset/demo_data/test/000002.bin \
--calib_path pointpillars/dataset/demo_data/test/000002.txt \
--img_path pointpillars/dataset/demo_data/test/000002.png
```
## Acknowledgements
Thanks to the open-source code of [mmcv](https://github.com/open-mmlab/mmcv), [mmdet](https://github.com/open-mmlab/mmdetection) and [mmdet3d](https://github.com/open-mmlab/mmdetection3d).
# PointPillars
## Paper
[PointPillars: Fast Encoders for Object Detection from Point Clouds](https://arxiv.org/abs/1812.05784)
## Model Architecture
<div align=center>
<img src="./asserts/model.png"/>
</div>
## Algorithm Overview
PointPillars is a novel point-cloud encoder and network architecture for 3D object detection. It uses PointNets to learn a representation of the point cloud organized in vertical columns (pillars), so that the point-cloud information is captured more effectively, and then feeds this representation into a standard 2D convolutional detection architecture. The design allows the trade-off between speed and accuracy to be tuned as needed: the network can run at more than 100 Hz while maintaining high accuracy.
Specifically, PointPillars builds its 2D backbone from three blocks; the output of each block is upsampled and concatenated, so the feature maps progressively gain channels before being passed to the detection head. In addition, several loss functions are used to optimize the network parameters, such as the Smooth L1 loss for box regression and the Focal loss for classification.
<div align=center>
<img src="./asserts/image.png"/>
</div>
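For intuition, here is a minimal, self-contained sketch of the pillar-encoding step. It is an illustration only, not this repo's implementation: the actual code uses a CUDA voxelization op (`pointpillars/ops/voxelization`) and a learned PointNet layer, and the function name and simplified 7-dim point features below are made up for this example.

```python
import torch

def encode_pillars(points, voxel_size=(0.16, 0.16),
                   pc_range=(0, -39.68, -3, 69.12, 39.68, 1), max_pts=32):
    """points: (N, 4) float tensor of (x, y, z, reflectance), already cropped to pc_range."""
    x0, y0 = pc_range[0], pc_range[1]
    # 1. Assign every point to a pillar: a voxel that spans the whole z range.
    ix = ((points[:, 0] - x0) / voxel_size[0]).long()
    iy = ((points[:, 1] - y0) / voxel_size[1]).long()
    pillar_id = ix * 100000 + iy                      # hash of the (ix, iy) cell
    uniq, inverse = torch.unique(pillar_id, return_inverse=True)
    pillar_feats, pillar_coords = [], []
    for p in range(len(uniq)):
        pts = points[inverse == p][:max_pts]          # keep at most max_pts points per pillar
        # 2. Decorate each point with its offset from the pillar's point mean
        #    (the paper also adds offsets from the pillar center, omitted here).
        offsets = pts[:, :3] - pts[:, :3].mean(dim=0, keepdim=True)
        feat = torch.cat([pts, offsets], dim=-1)      # (n, 7) per-point features
        # 3. A PointNet-style max pooling summarizes the pillar in one vector;
        #    the real model first applies a learned Linear + BatchNorm + ReLU.
        pillar_feats.append(feat.max(dim=0).values)
        pillar_coords.append((ix[inverse == p][0].item(), iy[inverse == p][0].item()))
    return torch.stack(pillar_feats), pillar_coords   # (P, 7) features and their grid cells
```

The resulting pillar features are scattered back onto their (ix, iy) grid cells to form a pseudo-image, which the 2D backbone and detection head shown above then process.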
## Environment Setup
Adjust the `-v` mount paths, `{docker_name}` and `{imageID}` below to match your actual setup.
### Docker (Option 1)
```bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.4.1-ubuntu22.04-dtk25.04-py3.10
docker run -it --shm-size 200g --network=host --name {docker_name} --privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro {imageID} bash
cd /your_code_path/pointpillars-pytorch
pip install -r requirements.txt
```
### Dockerfile (Option 2)
```bash
cd docker
docker build --no-cache -t pointpillars:latest .
docker run -it --shm-size 200g --network=host --name {docker_name} --privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro {imageID} bash
cd /your_code_path/pointpillars-pytorch
pip install -r requirements.txt
```
### Anaconda (Option 3)
The DCU-specific deep-learning libraries required by this project can be downloaded from the [光合 developer community](https://developer.sourcefind.cn/tool/).
```bash
DTK: 25.04
python: 3.10
vllm: 0.8.5
torch: 2.4.1+das.opt2.dtk2504
deepspeed: 0.14.2+das.opt2.dtk2504
```
`Tip: the DTK driver, Python, Torch and other DCU-related tool versions listed above must correspond to each other exactly.`
Other (non-deep-learning) dependencies can be installed as follows:
```bash
pip install -r requirements.txt
```
## Dataset
Download the [point cloud](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip) (29 GB), [images](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip) (12 GB), [calibration files](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip) (16 MB) and [labels](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip) (5 MB). The data layout is as follows:
```
kitti
|- training
|- calib (#7481 .txt)
|- image_2 (#7481 .png)
|- label_2 (#7481 .txt)
|- velodyne (#7481 .bin)
|- velodyne_reduced (#7481 .bin)
|- testing
|- calib (#7518 .txt)
|- image_2 (#7518 .png)
|- velodyne (#7518 .bin)
|- velodyne_reduced (#7518 .bin)
|- kitti_gt_database (# 19700 .bin)
|- kitti_infos_train.pkl
|- kitti_infos_val.pkl
|- kitti_infos_trainval.pkl
|- kitti_infos_test.pkl
|- kitti_dbinfos_train.pkl
```
## Training
```
cd PointPillars/
python train.py --data_root your_path_to_kitti
```
## Inference & Evaluation
```
cd PointPillars/
python evaluate.py --ckpt pretrained/epoch_160.pth --data_root your_path_to_kitti
```
## Results
<div align=center>
<img src="./asserts/result.png"/>
</div>
### Accuracy
The accuracy is essentially consistent with the NVIDIA reference results; some metrics are higher:
```
==========BBOX_2D==========
Pedestrian AP@0.5: 65.2719 61.7278 58.0838
Cyclist AP@0.5: 87.5932 75.1039 71.6308
Car AP@0.7: 90.7070 89.4133 86.6231
==========AOS==========
Pedestrian AOS@0.5: 45.5449 43.3262 40.6216
Cyclist AOS@0.5: 87.2105 72.7019 69.1393
Car AOS@0.7: 90.6199 88.9817 85.9057
==========BBOX_BEV==========
Pedestrian AP@0.5: 59.7484 55.0461 50.7703
Cyclist AP@0.5: 84.4216 68.7483 64.2096
Car AP@0.7: 90.1521 87.5884 85.7974
==========BBOX_3D==========
Pedestrian AP@0.5: 51.5879 46.5993 43.3320
Cyclist AP@0.5: 80.9012 62.8649 60.4813
Car AP@0.7: 86.0721 76.7700 74.3576
==========Overall==========
bbox_2d AP: 81.1907 75.4150 72.1126
AOS AP: 74.4584 68.3366 65.2222
bbox_bev AP: 78.1073 70.4609 66.9258
bbox_3d AP: 72.8538 62.0781 59.3903
```
## Application Scenarios
### Algorithm Category
3D object detection
### Key Application Industries
Manufacturing, autonomous driving, 3D point clouds
## Source Repository & Issue Feedback
- https://developer.sourcefind.cn/codes/modelzoo/pointpillars-pytorch
## References
- https://github.com/zhulf0804/PointPillars
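# ---- Evaluation script (invoked as `python evaluate.py` in the README above): evaluates a trained checkpoint on the KITTI val split ----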
import argparse
import numpy as np
import os
import torch
import pdb
from tqdm import tqdm
from pointpillars.utils import setup_seed, keep_bbox_from_image_range, \
keep_bbox_from_lidar_range, write_pickle, write_label, \
iou2d, iou3d_camera, iou_bev
from pointpillars.dataset import Kitti, get_dataloader
from pointpillars.model import PointPillars
def get_score_thresholds(tp_scores, total_num_valid_gt, num_sample_pts=41):
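    # KITTI-style evaluation: choose score thresholds so that the PR curve is
    # sampled at approximately num_sample_pts (41) evenly spaced recall values.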
score_thresholds = []
tp_scores = sorted(tp_scores)[::-1]
cur_recall, pts_ind = 0, 0
for i, score in enumerate(tp_scores):
lrecall = (i + 1) / total_num_valid_gt
rrecall = (i + 2) / total_num_valid_gt
if i == len(tp_scores) - 1:
score_thresholds.append(score)
break
if (lrecall + rrecall) / 2 < cur_recall:
continue
score_thresholds.append(score)
pts_ind += 1
cur_recall = pts_ind / (num_sample_pts - 1)
return score_thresholds
def do_eval(det_results, gt_results, CLASSES, saved_path):
'''
    det_results: dict(frame id -> formatted detection results)
    gt_results: dict(frame id -> ground-truth info with an 'annos' field)
CLASSES: dict
'''
assert len(det_results) == len(gt_results)
f = open(os.path.join(saved_path, 'eval_results.txt'), 'w')
# 1. calculate iou
ious = {
'bbox_2d': [],
'bbox_bev': [],
'bbox_3d': []
}
ids = list(sorted(gt_results.keys()))
for id in ids:
gt_result = gt_results[id]['annos']
det_result = det_results[id]
# 1.1, 2d bboxes iou
gt_bboxes2d = gt_result['bbox'].astype(np.float32)
det_bboxes2d = det_result['bbox'].astype(np.float32)
iou2d_v = iou2d(torch.from_numpy(gt_bboxes2d).cuda(), torch.from_numpy(det_bboxes2d).cuda())
ious['bbox_2d'].append(iou2d_v.cpu().numpy())
# 1.2, bev iou
gt_location = gt_result['location'].astype(np.float32)
gt_dimensions = gt_result['dimensions'].astype(np.float32)
gt_rotation_y = gt_result['rotation_y'].astype(np.float32)
det_location = det_result['location'].astype(np.float32)
det_dimensions = det_result['dimensions'].astype(np.float32)
det_rotation_y = det_result['rotation_y'].astype(np.float32)
gt_bev = np.concatenate([gt_location[:, [0, 2]], gt_dimensions[:, [0, 2]], gt_rotation_y[:, None]], axis=-1)
det_bev = np.concatenate([det_location[:, [0, 2]], det_dimensions[:, [0, 2]], det_rotation_y[:, None]], axis=-1)
iou_bev_v = iou_bev(torch.from_numpy(gt_bev).cuda(), torch.from_numpy(det_bev).cuda())
ious['bbox_bev'].append(iou_bev_v.cpu().numpy())
# 1.3, 3dbboxes iou
gt_bboxes3d = np.concatenate([gt_location, gt_dimensions, gt_rotation_y[:, None]], axis=-1)
det_bboxes3d = np.concatenate([det_location, det_dimensions, det_rotation_y[:, None]], axis=-1)
iou3d_v = iou3d_camera(torch.from_numpy(gt_bboxes3d).cuda(), torch.from_numpy(det_bboxes3d).cuda())
ious['bbox_3d'].append(iou3d_v.cpu().numpy())
MIN_IOUS = {
'Pedestrian': [0.5, 0.5, 0.5],
'Cyclist': [0.5, 0.5, 0.5],
'Car': [0.7, 0.7, 0.7]
}
MIN_HEIGHT = [40, 25, 25]
overall_results = {}
for e_ind, eval_type in enumerate(['bbox_2d', 'bbox_bev', 'bbox_3d']):
eval_ious = ious[eval_type]
eval_ap_results, eval_aos_results = {}, {}
for cls in CLASSES:
eval_ap_results[cls] = []
eval_aos_results[cls] = []
CLS_MIN_IOU = MIN_IOUS[cls][e_ind]
for difficulty in [0, 1, 2]:
# 1. bbox property
total_gt_ignores, total_det_ignores, total_dc_bboxes, total_scores = [], [], [], []
total_gt_alpha, total_det_alpha = [], []
for id in ids:
gt_result = gt_results[id]['annos']
det_result = det_results[id]
# 1.1 gt bbox property
cur_gt_names = gt_result['name']
cur_difficulty = gt_result['difficulty']
gt_ignores, dc_bboxes = [], []
for j, cur_gt_name in enumerate(cur_gt_names):
ignore = cur_difficulty[j] < 0 or cur_difficulty[j] > difficulty
if cur_gt_name == cls:
valid_class = 1
elif cls == 'Pedestrian' and cur_gt_name == 'Person_sitting':
valid_class = 0
elif cls == 'Car' and cur_gt_name == 'Van':
valid_class = 0
else:
valid_class = -1
if valid_class == 1 and not ignore:
gt_ignores.append(0)
elif valid_class == 0 or (valid_class == 1 and ignore):
gt_ignores.append(1)
else:
gt_ignores.append(-1)
if cur_gt_name == 'DontCare':
dc_bboxes.append(gt_result['bbox'][j])
total_gt_ignores.append(gt_ignores)
total_dc_bboxes.append(np.array(dc_bboxes))
total_gt_alpha.append(gt_result['alpha'])
# 1.2 det bbox property
cur_det_names = det_result['name']
cur_det_heights = det_result['bbox'][:, 3] - det_result['bbox'][:, 1]
det_ignores = []
for j, cur_det_name in enumerate(cur_det_names):
if cur_det_heights[j] < MIN_HEIGHT[difficulty]:
det_ignores.append(1)
elif cur_det_name == cls:
det_ignores.append(0)
else:
det_ignores.append(-1)
total_det_ignores.append(det_ignores)
total_scores.append(det_result['score'])
total_det_alpha.append(det_result['alpha'])
# 2. calculate scores thresholds for PR curve
tp_scores = []
for i, id in enumerate(ids):
cur_eval_ious = eval_ious[i]
gt_ignores, det_ignores = total_gt_ignores[i], total_det_ignores[i]
scores = total_scores[i]
nn, mm = cur_eval_ious.shape
assigned = np.zeros((mm, ), dtype=np.bool_)
for j in range(nn):
if gt_ignores[j] == -1:
continue
match_id, match_score = -1, -1
for k in range(mm):
if not assigned[k] and det_ignores[k] >= 0 and cur_eval_ious[j, k] > CLS_MIN_IOU and scores[k] > match_score:
match_id = k
match_score = scores[k]
if match_id != -1:
assigned[match_id] = True
if det_ignores[match_id] == 0 and gt_ignores[j] == 0:
tp_scores.append(match_score)
total_num_valid_gt = np.sum([np.sum(np.array(gt_ignores) == 0) for gt_ignores in total_gt_ignores])
score_thresholds = get_score_thresholds(tp_scores, total_num_valid_gt)
# 3. draw PR curve and calculate mAP
tps, fns, fps, total_aos = [], [], [], []
for score_threshold in score_thresholds:
tp, fn, fp = 0, 0, 0
aos = 0
for i, id in enumerate(ids):
cur_eval_ious = eval_ious[i]
gt_ignores, det_ignores = total_gt_ignores[i], total_det_ignores[i]
gt_alpha, det_alpha = total_gt_alpha[i], total_det_alpha[i]
scores = total_scores[i]
nn, mm = cur_eval_ious.shape
assigned = np.zeros((mm, ), dtype=np.bool_)
for j in range(nn):
if gt_ignores[j] == -1:
continue
match_id, match_iou = -1, -1
for k in range(mm):
if not assigned[k] and det_ignores[k] >= 0 and scores[k] >= score_threshold and cur_eval_ious[j, k] > CLS_MIN_IOU:
if det_ignores[k] == 0 and cur_eval_ious[j, k] > match_iou:
match_iou = cur_eval_ious[j, k]
match_id = k
elif det_ignores[k] == 1 and match_iou == -1:
match_id = k
if match_id != -1:
assigned[match_id] = True
if det_ignores[match_id] == 0 and gt_ignores[j] == 0:
tp += 1
if eval_type == 'bbox_2d':
aos += (1 + np.cos(gt_alpha[j] - det_alpha[match_id])) / 2
else:
if gt_ignores[j] == 0:
fn += 1
for k in range(mm):
if det_ignores[k] == 0 and scores[k] >= score_threshold and not assigned[k]:
fp += 1
# In case 2d bbox evaluation, we should consider dontcare bboxes
if eval_type == 'bbox_2d':
dc_bboxes = total_dc_bboxes[i]
det_bboxes = det_results[id]['bbox']
if len(dc_bboxes) > 0:
ious_dc_det = iou2d(torch.from_numpy(det_bboxes), torch.from_numpy(dc_bboxes), metric=1).numpy().T
for j in range(len(dc_bboxes)):
for k in range(len(det_bboxes)):
if det_ignores[k] == 0 and scores[k] >= score_threshold and not assigned[k]:
if ious_dc_det[j, k] > CLS_MIN_IOU:
fp -= 1
assigned[k] = True
tps.append(tp)
fns.append(fn)
fps.append(fp)
if eval_type == 'bbox_2d':
total_aos.append(aos)
tps, fns, fps = np.array(tps), np.array(fns), np.array(fps)
recalls = tps / (tps + fns)
precisions = tps / (tps + fps)
for i in range(len(score_thresholds)):
precisions[i] = np.max(precisions[i:])
sums_AP = 0
for i in range(0, len(score_thresholds), 4):
sums_AP += precisions[i]
mAP = sums_AP / 11 * 100
eval_ap_results[cls].append(mAP)
if eval_type == 'bbox_2d':
total_aos = np.array(total_aos)
similarity = total_aos / (tps + fps)
for i in range(len(score_thresholds)):
similarity[i] = np.max(similarity[i:])
sums_similarity = 0
for i in range(0, len(score_thresholds), 4):
sums_similarity += similarity[i]
mSimilarity = sums_similarity / 11 * 100
eval_aos_results[cls].append(mSimilarity)
print(f'=========={eval_type.upper()}==========')
print(f'=========={eval_type.upper()}==========', file=f)
for k, v in eval_ap_results.items():
print(f'{k} AP@{MIN_IOUS[k][e_ind]}: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}')
print(f'{k} AP@{MIN_IOUS[k][e_ind]}: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}', file=f)
if eval_type == 'bbox_2d':
print(f'==========AOS==========')
print(f'==========AOS==========', file=f)
for k, v in eval_aos_results.items():
print(f'{k} AOS@{MIN_IOUS[k][e_ind]}: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}')
print(f'{k} AOS@{MIN_IOUS[k][e_ind]}: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}', file=f)
overall_results[eval_type] = np.mean(list(eval_ap_results.values()), 0)
if eval_type == 'bbox_2d':
overall_results['AOS'] = np.mean(list(eval_aos_results.values()), 0)
print(f'\n==========Overall==========')
print(f'\n==========Overall==========', file=f)
for k, v in overall_results.items():
print(f'{k} AP: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}')
print(f'{k} AP: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}', file=f)
f.close()
def main(args):
val_dataset = Kitti(data_root=args.data_root,
split='val')
val_dataloader = get_dataloader(dataset=val_dataset,
batch_size=args.batch_size,
num_workers=args.num_workers,
shuffle=False)
CLASSES = Kitti.CLASSES
LABEL2CLASSES = {v:k for k, v in CLASSES.items()}
if not args.no_cuda:
model = PointPillars(nclasses=args.nclasses).cuda()
model.load_state_dict(torch.load(args.ckpt))
else:
model = PointPillars(nclasses=args.nclasses)
model.load_state_dict(
torch.load(args.ckpt, map_location=torch.device('cpu')))
saved_path = args.saved_path
os.makedirs(saved_path, exist_ok=True)
saved_submit_path = os.path.join(saved_path, 'submit')
os.makedirs(saved_submit_path, exist_ok=True)
pcd_limit_range = np.array([0, -40, -3, 70.4, 40, 0.0], dtype=np.float32)
model.eval()
with torch.no_grad():
format_results = {}
print('Predicting and Formatting the results.')
for i, data_dict in enumerate(tqdm(val_dataloader)):
if not args.no_cuda:
# move the tensors to the cuda
for key in data_dict:
for j, item in enumerate(data_dict[key]):
if torch.is_tensor(item):
data_dict[key][j] = data_dict[key][j].cuda()
batched_pts = data_dict['batched_pts']
batched_gt_bboxes = data_dict['batched_gt_bboxes']
batched_labels = data_dict['batched_labels']
batched_difficulty = data_dict['batched_difficulty']
batch_results = model(batched_pts=batched_pts,
mode='val',
batched_gt_bboxes=batched_gt_bboxes,
batched_gt_labels=batched_labels)
# pdb.set_trace()
for j, result in enumerate(batch_results):
format_result = {
'name': [],
'truncated': [],
'occluded': [],
'alpha': [],
'bbox': [],
'dimensions': [],
'location': [],
'rotation_y': [],
'score': []
}
calib_info = data_dict['batched_calib_info'][j]
tr_velo_to_cam = calib_info['Tr_velo_to_cam'].astype(np.float32)
r0_rect = calib_info['R0_rect'].astype(np.float32)
P2 = calib_info['P2'].astype(np.float32)
image_shape = data_dict['batched_img_info'][j]['image_shape']
idx = data_dict['batched_img_info'][j]['image_idx']
result_filter = keep_bbox_from_image_range(result, tr_velo_to_cam, r0_rect, P2, image_shape)
result_filter = keep_bbox_from_lidar_range(result_filter, pcd_limit_range)
lidar_bboxes = result_filter['lidar_bboxes']
labels, scores = result_filter['labels'], result_filter['scores']
bboxes2d, camera_bboxes = result_filter['bboxes2d'], result_filter['camera_bboxes']
for lidar_bbox, label, score, bbox2d, camera_bbox in \
zip(lidar_bboxes, labels, scores, bboxes2d, camera_bboxes):
format_result['name'].append(LABEL2CLASSES[label])
format_result['truncated'].append(0.0)
format_result['occluded'].append(0)
alpha = camera_bbox[6] - np.arctan2(camera_bbox[0], camera_bbox[2])
format_result['alpha'].append(alpha)
format_result['bbox'].append(bbox2d)
format_result['dimensions'].append(camera_bbox[3:6])
format_result['location'].append(camera_bbox[:3])
format_result['rotation_y'].append(camera_bbox[6])
format_result['score'].append(score)
write_label(format_result, os.path.join(saved_submit_path, f'{idx:06d}.txt'))
format_results[idx] = {k:np.array(v) for k, v in format_result.items()}
write_pickle(format_results, os.path.join(saved_path, 'results.pkl'))
print('Evaluating.. Please wait several seconds.')
do_eval(format_results, val_dataset.data_infos, CLASSES, saved_path)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Configuration Parameters')
parser.add_argument('--data_root', default='/mnt/ssd1/lifa_rdata/det/kitti',
help='your data root for kitti')
parser.add_argument('--ckpt', default='pretrained/epoch_160.pth', help='your checkpoint for kitti')
parser.add_argument('--saved_path', default='results', help='your saved path for predicted results')
parser.add_argument('--batch_size', type=int, default=1)
parser.add_argument('--num_workers', type=int, default=4)
parser.add_argument('--nclasses', type=int, default=3)
parser.add_argument('--no_cuda', action='store_true',
help='whether to use cuda')
args = parser.parse_args()
main(args)
# Model unique identifier
modelCode=1621
# Model name
modelName=pointpillars-pytorch
# Model description
modelDescription=This work proposes PointPillars, a novel point-cloud encoder and network architecture designed for 3D object detection, particularly suited to lidar data processing in autonomous-driving applications.
# Application scenarios
appScenario=Inference, training, manufacturing, autonomous driving, 3D point clouds
# Framework type
frameType=pytorch
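# ---- Preprocessing script (invoked as `python pre_process_kitti.py` in the README above): builds reduced point clouds, info pkls and the GT database ----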
import argparse
import pdb
import cv2
import numpy as np
import os
from tqdm import tqdm
import sys
CUR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(CUR)
from pointpillars.utils import read_points, write_points, read_calib, read_label, \
write_pickle, remove_outside_points, get_points_num_in_bbox, \
points_in_bboxes_v2
def judge_difficulty(annotation_dict):
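    # KITTI difficulty levels (0 = easy, 1 = moderate, 2 = hard), decided by
    # 2D box height, occlusion level and truncation; -1 means none applies.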
truncated = annotation_dict['truncated']
occluded = annotation_dict['occluded']
bbox = annotation_dict['bbox']
height = bbox[:, 3] - bbox[:, 1]
MIN_HEIGHTS = [40, 25, 25]
MAX_OCCLUSION = [0, 1, 2]
MAX_TRUNCATION = [0.15, 0.30, 0.50]
difficultys = []
for h, o, t in zip(height, occluded, truncated):
difficulty = -1
for i in range(2, -1, -1):
if h > MIN_HEIGHTS[i] and o <= MAX_OCCLUSION[i] and t <= MAX_TRUNCATION[i]:
difficulty = i
difficultys.append(difficulty)
return np.array(difficultys, dtype=np.int32)
def create_data_info_pkl(data_root, data_type, prefix, label=True, db=False):
sep = os.path.sep
print(f"Processing {data_type} data..")
ids_file = os.path.join(CUR, 'pointpillars', 'dataset', 'ImageSets', f'{data_type}.txt')
with open(ids_file, 'r') as f:
ids = [id.strip() for id in f.readlines()]
split = 'training' if label else 'testing'
kitti_infos_dict = {}
if db:
kitti_dbinfos_train = {}
db_points_saved_path = os.path.join(data_root, f'{prefix}_gt_database')
os.makedirs(db_points_saved_path, exist_ok=True)
for id in tqdm(ids):
cur_info_dict={}
img_path = os.path.join(data_root, split, 'image_2', f'{id}.png')
lidar_path = os.path.join(data_root, split, 'velodyne', f'{id}.bin')
calib_path = os.path.join(data_root, split, 'calib', f'{id}.txt')
cur_info_dict['velodyne_path'] = sep.join(lidar_path.split(sep)[-3:])
img = cv2.imread(img_path)
image_shape = img.shape[:2]
cur_info_dict['image'] = {
'image_shape': image_shape,
'image_path': sep.join(img_path.split(sep)[-3:]),
'image_idx': int(id),
}
calib_dict = read_calib(calib_path)
cur_info_dict['calib'] = calib_dict
lidar_points = read_points(lidar_path)
reduced_lidar_points = remove_outside_points(
points=lidar_points,
r0_rect=calib_dict['R0_rect'],
tr_velo_to_cam=calib_dict['Tr_velo_to_cam'],
P2=calib_dict['P2'],
image_shape=image_shape)
saved_reduced_path = os.path.join(data_root, split, 'velodyne_reduced')
os.makedirs(saved_reduced_path, exist_ok=True)
saved_reduced_points_name = os.path.join(saved_reduced_path, f'{id}.bin')
write_points(reduced_lidar_points, saved_reduced_points_name)
if label:
label_path = os.path.join(data_root, split, 'label_2', f'{id}.txt')
annotation_dict = read_label(label_path)
annotation_dict['difficulty'] = judge_difficulty(annotation_dict)
annotation_dict['num_points_in_gt'] = get_points_num_in_bbox(
points=reduced_lidar_points,
r0_rect=calib_dict['R0_rect'],
tr_velo_to_cam=calib_dict['Tr_velo_to_cam'],
dimensions=annotation_dict['dimensions'],
location=annotation_dict['location'],
rotation_y=annotation_dict['rotation_y'],
name=annotation_dict['name'])
cur_info_dict['annos'] = annotation_dict
if db:
indices, n_total_bbox, n_valid_bbox, bboxes_lidar, name = \
points_in_bboxes_v2(
points=lidar_points,
r0_rect=calib_dict['R0_rect'].astype(np.float32),
tr_velo_to_cam=calib_dict['Tr_velo_to_cam'].astype(np.float32),
dimensions=annotation_dict['dimensions'].astype(np.float32),
location=annotation_dict['location'].astype(np.float32),
rotation_y=annotation_dict['rotation_y'].astype(np.float32),
name=annotation_dict['name']
)
for j in range(n_valid_bbox):
db_points = lidar_points[indices[:, j]]
db_points[:, :3] -= bboxes_lidar[j, :3]
db_points_saved_name = os.path.join(db_points_saved_path, f'{int(id)}_{name[j]}_{j}.bin')
write_points(db_points, db_points_saved_name)
db_info={
'name': name[j],
'path': os.path.join(os.path.basename(db_points_saved_path), f'{int(id)}_{name[j]}_{j}.bin'),
'box3d_lidar': bboxes_lidar[j],
'difficulty': annotation_dict['difficulty'][j],
'num_points_in_gt': len(db_points),
}
if name[j] not in kitti_dbinfos_train:
kitti_dbinfos_train[name[j]] = [db_info]
else:
kitti_dbinfos_train[name[j]].append(db_info)
kitti_infos_dict[int(id)] = cur_info_dict
saved_path = os.path.join(data_root, f'{prefix}_infos_{data_type}.pkl')
write_pickle(kitti_infos_dict, saved_path)
if db:
saved_db_path = os.path.join(data_root, f'{prefix}_dbinfos_train.pkl')
write_pickle(kitti_dbinfos_train, saved_db_path)
return kitti_infos_dict
def main(args):
data_root = args.data_root
prefix = args.prefix
    ## 1. train: create data information pkl file && create reduced point clouds
    ## && create database (points in gt bbox) for data augmentation
kitti_train_infos_dict = create_data_info_pkl(data_root, 'train', prefix, db=True)
    ## 2. val: create data information pkl file && create reduced point clouds
kitti_val_infos_dict = create_data_info_pkl(data_root, 'val', prefix)
    ## 3. trainval: create data information pkl file
kitti_trainval_infos_dict = {**kitti_train_infos_dict, **kitti_val_infos_dict}
saved_path = os.path.join(data_root, f'{prefix}_infos_trainval.pkl')
write_pickle(kitti_trainval_infos_dict, saved_path)
    ## 4. test: create data information pkl file && create reduced point clouds
kitti_test_infos_dict = create_data_info_pkl(data_root, 'test', prefix, label=False)
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Dataset information')
parser.add_argument('--data_root', default='/mnt/ssd1/lifa_rdata/det/kitti',
help='your data root for kitti')
parser.add_argument('--prefix', default='kitti',
help='the prefix name for the saved .pkl file')
args = parser.parse_args()
main(args)
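# ---- requirements.txt (installed via `pip install -r requirements.txt` in the README above) ----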
#numba
numpy
open3d
opencv_python==4.5.5.62
PyYAML
setuptools
tensorboard
#--extra-index-url https://download.pytorch.org/whl/cu111
#torch==1.8.1+cu111
tqdm
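# ---- setup.py (invoked as `python setup.py build_ext --inplace` in the README above): builds the CUDA voxelization and IoU3D extensions ----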
from setuptools import setup, find_packages
from torch.utils.cpp_extension import BuildExtension, CUDAExtension
setup(
name='pointpillars',
version='0.1',
packages=find_packages(),
ext_modules=[
CUDAExtension(
name='pointpillars.ops.voxel_op',
sources=[
'pointpillars/ops/voxelization/voxelization.cpp',
'pointpillars/ops/voxelization/voxelization_cpu.cpp',
'pointpillars/ops/voxelization/voxelization_cuda.cu',
],
define_macros=[('WITH_CUDA', None)]
),
CUDAExtension(
name='pointpillars.ops.iou3d_op',
sources=[
'pointpillars/ops/iou3d/iou3d.cpp',
'pointpillars/ops/iou3d/iou3d_kernel.cu',
],
define_macros=[('WITH_CUDA', None)]
)
],
cmdclass={'build_ext': BuildExtension},
zip_safe=False
)
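# ---- Test script (invoked as `python test.py` in the README above): single-sample inference and visualization ----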
import argparse
import cv2
import numpy as np
import os
import torch
import pdb
from pointpillars.utils import setup_seed, read_points, read_calib, read_label, \
keep_bbox_from_image_range, keep_bbox_from_lidar_range, vis_pc, \
vis_img_3d, bbox3d2corners_camera, points_camera2image, \
bbox_camera2lidar
from pointpillars.model import PointPillars
def point_range_filter(pts, point_range=[0, -39.68, -3, 69.12, 39.68, 1]):
'''
    pts: (N, 4) point cloud in lidar coordinates
point_range: [x1, y1, z1, x2, y2, z2]
'''
flag_x_low = pts[:, 0] > point_range[0]
flag_y_low = pts[:, 1] > point_range[1]
flag_z_low = pts[:, 2] > point_range[2]
flag_x_high = pts[:, 0] < point_range[3]
flag_y_high = pts[:, 1] < point_range[4]
flag_z_high = pts[:, 2] < point_range[5]
keep_mask = flag_x_low & flag_y_low & flag_z_low & flag_x_high & flag_y_high & flag_z_high
pts = pts[keep_mask]
return pts
def main(args):
CLASSES = {
'Pedestrian': 0,
'Cyclist': 1,
'Car': 2
}
LABEL2CLASSES = {v:k for k, v in CLASSES.items()}
pcd_limit_range = np.array([0, -40, -3, 70.4, 40, 0.0], dtype=np.float32)
if not args.no_cuda:
model = PointPillars(nclasses=len(CLASSES)).cuda()
model.load_state_dict(torch.load(args.ckpt))
else:
model = PointPillars(nclasses=len(CLASSES))
model.load_state_dict(
torch.load(args.ckpt, map_location=torch.device('cpu')))
if not os.path.exists(args.pc_path):
raise FileNotFoundError
pc = read_points(args.pc_path)
pc = point_range_filter(pc)
pc_torch = torch.from_numpy(pc)
if os.path.exists(args.calib_path):
calib_info = read_calib(args.calib_path)
else:
calib_info = None
if os.path.exists(args.gt_path):
gt_label = read_label(args.gt_path)
else:
gt_label = None
if os.path.exists(args.img_path):
img = cv2.imread(args.img_path, 1)
else:
img = None
model.eval()
with torch.no_grad():
if not args.no_cuda:
pc_torch = pc_torch.cuda()
result_filter = model(batched_pts=[pc_torch],
mode='test')[0]
if calib_info is not None and img is not None:
tr_velo_to_cam = calib_info['Tr_velo_to_cam'].astype(np.float32)
r0_rect = calib_info['R0_rect'].astype(np.float32)
P2 = calib_info['P2'].astype(np.float32)
image_shape = img.shape[:2]
result_filter = keep_bbox_from_image_range(result_filter, tr_velo_to_cam, r0_rect, P2, image_shape)
result_filter = keep_bbox_from_lidar_range(result_filter, pcd_limit_range)
lidar_bboxes = result_filter['lidar_bboxes']
labels, scores = result_filter['labels'], result_filter['scores']
vis_pc(pc, bboxes=lidar_bboxes, labels=labels)
if calib_info is not None and img is not None:
bboxes2d, camera_bboxes = result_filter['bboxes2d'], result_filter['camera_bboxes']
bboxes_corners = bbox3d2corners_camera(camera_bboxes)
image_points = points_camera2image(bboxes_corners, P2)
img = vis_img_3d(img, image_points, labels, rt=True)
if calib_info is not None and gt_label is not None:
tr_velo_to_cam = calib_info['Tr_velo_to_cam'].astype(np.float32)
r0_rect = calib_info['R0_rect'].astype(np.float32)
dimensions = gt_label['dimensions']
location = gt_label['location']
rotation_y = gt_label['rotation_y']
gt_labels = np.array([CLASSES.get(item, -1) for item in gt_label['name']])
sel = gt_labels != -1
gt_labels = gt_labels[sel]
bboxes_camera = np.concatenate([location, dimensions, rotation_y[:, None]], axis=-1)
gt_lidar_bboxes = bbox_camera2lidar(bboxes_camera, tr_velo_to_cam, r0_rect)
bboxes_camera = bboxes_camera[sel]
gt_lidar_bboxes = gt_lidar_bboxes[sel]
gt_labels = [-1] * len(gt_label['name']) # to distinguish between the ground truth and the predictions
pred_gt_lidar_bboxes = np.concatenate([lidar_bboxes, gt_lidar_bboxes], axis=0)
pred_gt_labels = np.concatenate([labels, gt_labels])
vis_pc(pc, pred_gt_lidar_bboxes, labels=pred_gt_labels)
if img is not None:
bboxes_corners = bbox3d2corners_camera(bboxes_camera)
image_points = points_camera2image(bboxes_corners, P2)
gt_labels = [-1] * len(gt_label['name'])
img = vis_img_3d(img, image_points, gt_labels, rt=True)
if calib_info is not None and img is not None:
cv2.imshow(f'{os.path.basename(args.img_path)}-3d bbox', img)
cv2.waitKey(0)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Configuration Parameters')
parser.add_argument('--ckpt', default='pretrained/epoch_160.pth', help='your checkpoint for kitti')
parser.add_argument('--pc_path', help='your point cloud path')
parser.add_argument('--calib_path', default='', help='your calib file path')
parser.add_argument('--gt_path', default='', help='your ground truth path')
parser.add_argument('--img_path', default='', help='your image path')
parser.add_argument('--no_cuda', action='store_true',
help='whether to use cuda')
args = parser.parse_args()
main(args)
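# ---- Training script (invoked as `python train.py` in the README above): trains PointPillars on KITTI ----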
import argparse
import os
import torch
from tqdm import tqdm
import pdb
from pointpillars.utils import setup_seed
from pointpillars.dataset import Kitti, get_dataloader
from pointpillars.model import PointPillars
from pointpillars.loss import Loss
from torch.utils.tensorboard import SummaryWriter
def save_summary(writer, loss_dict, global_step, tag, lr=None, momentum=None):
for k, v in loss_dict.items():
writer.add_scalar(f'{tag}/{k}', v, global_step)
if lr is not None:
writer.add_scalar('lr', lr, global_step)
if momentum is not None:
writer.add_scalar('momentum', momentum, global_step)
def main(args):
setup_seed()
train_dataset = Kitti(data_root=args.data_root,
split='train')
val_dataset = Kitti(data_root=args.data_root,
split='val')
train_dataloader = get_dataloader(dataset=train_dataset,
batch_size=args.batch_size,
num_workers=args.num_workers,
shuffle=True)
val_dataloader = get_dataloader(dataset=val_dataset,
batch_size=args.batch_size,
num_workers=args.num_workers,
shuffle=False)
if not args.no_cuda:
pointpillars = PointPillars(nclasses=args.nclasses).cuda()
else:
pointpillars = PointPillars(nclasses=args.nclasses)
loss_func = Loss()
max_iters = len(train_dataloader) * args.max_epoch
init_lr = args.init_lr
optimizer = torch.optim.AdamW(params=pointpillars.parameters(),
lr=init_lr,
betas=(0.95, 0.99),
weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer,
max_lr=init_lr*10,
total_steps=max_iters,
pct_start=0.4,
anneal_strategy='cos',
cycle_momentum=True,
base_momentum=0.95*0.895,
max_momentum=0.95,
div_factor=10)
saved_logs_path = os.path.join(args.saved_path, 'summary')
os.makedirs(saved_logs_path, exist_ok=True)
writer = SummaryWriter(saved_logs_path)
saved_ckpt_path = os.path.join(args.saved_path, 'checkpoints')
os.makedirs(saved_ckpt_path, exist_ok=True)
for epoch in range(args.max_epoch):
print('=' * 20, epoch, '=' * 20)
train_step, val_step = 0, 0
for i, data_dict in enumerate(tqdm(train_dataloader)):
if not args.no_cuda:
# move the tensors to the cuda
for key in data_dict:
for j, item in enumerate(data_dict[key]):
if torch.is_tensor(item):
data_dict[key][j] = data_dict[key][j].cuda()
optimizer.zero_grad()
batched_pts = data_dict['batched_pts']
batched_gt_bboxes = data_dict['batched_gt_bboxes']
batched_labels = data_dict['batched_labels']
batched_difficulty = data_dict['batched_difficulty']
bbox_cls_pred, bbox_pred, bbox_dir_cls_pred, anchor_target_dict = \
pointpillars(batched_pts=batched_pts,
mode='train',
batched_gt_bboxes=batched_gt_bboxes,
batched_gt_labels=batched_labels)
bbox_cls_pred = bbox_cls_pred.permute(0, 2, 3, 1).reshape(-1, args.nclasses)
bbox_pred = bbox_pred.permute(0, 2, 3, 1).reshape(-1, 7)
bbox_dir_cls_pred = bbox_dir_cls_pred.permute(0, 2, 3, 1).reshape(-1, 2)
batched_bbox_labels = anchor_target_dict['batched_labels'].reshape(-1)
batched_label_weights = anchor_target_dict['batched_label_weights'].reshape(-1)
batched_bbox_reg = anchor_target_dict['batched_bbox_reg'].reshape(-1, 7)
# batched_bbox_reg_weights = anchor_target_dict['batched_bbox_reg_weights'].reshape(-1)
batched_dir_labels = anchor_target_dict['batched_dir_labels'].reshape(-1)
# batched_dir_labels_weights = anchor_target_dict['batched_dir_labels_weights'].reshape(-1)
pos_idx = (batched_bbox_labels >= 0) & (batched_bbox_labels < args.nclasses)
bbox_pred = bbox_pred[pos_idx]
batched_bbox_reg = batched_bbox_reg[pos_idx]
# sin(a - b) = sin(a)*cos(b) - cos(a)*sin(b)
bbox_pred[:, -1] = torch.sin(bbox_pred[:, -1].clone()) * torch.cos(batched_bbox_reg[:, -1].clone())
batched_bbox_reg[:, -1] = torch.cos(bbox_pred[:, -1].clone()) * torch.sin(batched_bbox_reg[:, -1].clone())
bbox_dir_cls_pred = bbox_dir_cls_pred[pos_idx]
batched_dir_labels = batched_dir_labels[pos_idx]
num_cls_pos = (batched_bbox_labels < args.nclasses).sum()
bbox_cls_pred = bbox_cls_pred[batched_label_weights > 0]
batched_bbox_labels[batched_bbox_labels < 0] = args.nclasses
batched_bbox_labels = batched_bbox_labels[batched_label_weights > 0]
loss_dict = loss_func(bbox_cls_pred=bbox_cls_pred,
bbox_pred=bbox_pred,
bbox_dir_cls_pred=bbox_dir_cls_pred,
batched_labels=batched_bbox_labels,
num_cls_pos=num_cls_pos,
batched_bbox_reg=batched_bbox_reg,
batched_dir_labels=batched_dir_labels)
loss = loss_dict['total_loss']
loss.backward()
# torch.nn.utils.clip_grad_norm_(pointpillars.parameters(), max_norm=35)
optimizer.step()
scheduler.step()
global_step = epoch * len(train_dataloader) + train_step + 1
if global_step % args.log_freq == 0:
save_summary(writer, loss_dict, global_step, 'train',
lr=optimizer.param_groups[0]['lr'],
momentum=optimizer.param_groups[0]['betas'][0])
train_step += 1
if (epoch + 1) % args.ckpt_freq_epoch == 0:
torch.save(pointpillars.state_dict(), os.path.join(saved_ckpt_path, f'epoch_{epoch+1}.pth'))
if epoch % 2 == 0:
continue
pointpillars.eval()
with torch.no_grad():
for i, data_dict in enumerate(tqdm(val_dataloader)):
if not args.no_cuda:
# move the tensors to the cuda
for key in data_dict:
for j, item in enumerate(data_dict[key]):
if torch.is_tensor(item):
data_dict[key][j] = data_dict[key][j].cuda()
batched_pts = data_dict['batched_pts']
batched_gt_bboxes = data_dict['batched_gt_bboxes']
batched_labels = data_dict['batched_labels']
batched_difficulty = data_dict['batched_difficulty']
bbox_cls_pred, bbox_pred, bbox_dir_cls_pred, anchor_target_dict = \
pointpillars(batched_pts=batched_pts,
mode='train',
batched_gt_bboxes=batched_gt_bboxes,
batched_gt_labels=batched_labels)
bbox_cls_pred = bbox_cls_pred.permute(0, 2, 3, 1).reshape(-1, args.nclasses)
bbox_pred = bbox_pred.permute(0, 2, 3, 1).reshape(-1, 7)
bbox_dir_cls_pred = bbox_dir_cls_pred.permute(0, 2, 3, 1).reshape(-1, 2)
batched_bbox_labels = anchor_target_dict['batched_labels'].reshape(-1)
batched_label_weights = anchor_target_dict['batched_label_weights'].reshape(-1)
batched_bbox_reg = anchor_target_dict['batched_bbox_reg'].reshape(-1, 7)
# batched_bbox_reg_weights = anchor_target_dict['batched_bbox_reg_weights'].reshape(-1)
batched_dir_labels = anchor_target_dict['batched_dir_labels'].reshape(-1)
# batched_dir_labels_weights = anchor_target_dict['batched_dir_labels_weights'].reshape(-1)
pos_idx = (batched_bbox_labels >= 0) & (batched_bbox_labels < args.nclasses)
bbox_pred = bbox_pred[pos_idx]
batched_bbox_reg = batched_bbox_reg[pos_idx]
# sin(a - b) = sin(a)*cos(b) - cos(a)*sin(b)
bbox_pred[:, -1] = torch.sin(bbox_pred[:, -1]) * torch.cos(batched_bbox_reg[:, -1])
batched_bbox_reg[:, -1] = torch.cos(bbox_pred[:, -1]) * torch.sin(batched_bbox_reg[:, -1])
bbox_dir_cls_pred = bbox_dir_cls_pred[pos_idx]
batched_dir_labels = batched_dir_labels[pos_idx]
num_cls_pos = (batched_bbox_labels < args.nclasses).sum()
bbox_cls_pred = bbox_cls_pred[batched_label_weights > 0]
batched_bbox_labels[batched_bbox_labels < 0] = args.nclasses
batched_bbox_labels = batched_bbox_labels[batched_label_weights > 0]
loss_dict = loss_func(bbox_cls_pred=bbox_cls_pred,
bbox_pred=bbox_pred,
bbox_dir_cls_pred=bbox_dir_cls_pred,
batched_labels=batched_bbox_labels,
num_cls_pos=num_cls_pos,
batched_bbox_reg=batched_bbox_reg,
batched_dir_labels=batched_dir_labels)
global_step = epoch * len(val_dataloader) + val_step + 1
if global_step % args.log_freq == 0:
save_summary(writer, loss_dict, global_step, 'val')
val_step += 1
pointpillars.train()
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Configuration Parameters')
parser.add_argument('--data_root', default='/mnt/ssd1/lifa_rdata/det/kitti',
help='your data root for kitti')
parser.add_argument('--saved_path', default='pillar_logs')
parser.add_argument('--batch_size', type=int, default=6)
parser.add_argument('--num_workers', type=int, default=4)
parser.add_argument('--nclasses', type=int, default=3)
parser.add_argument('--init_lr', type=float, default=0.00025)
parser.add_argument('--max_epoch', type=int, default=160)
parser.add_argument('--log_freq', type=int, default=8)
parser.add_argument('--ckpt_freq_epoch', type=int, default=20)
parser.add_argument('--no_cuda', action='store_true',
help='whether to use cuda')
args = parser.parse_args()
main(args)