Commit 00306509 authored by chenych
MIT License
Copyright (c) 2023 ZhuLifa
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# [PointPillars: Fast Encoders for Object Detection from Point Clouds](https://arxiv.org/abs/1812.05784)
A simple PointPillars PyTorch implementation for 3D LiDAR (KITTI) detection. [[Zhihu](https://zhuanlan.zhihu.com/p/521277176)]
- It can be run without installing [Spconv](https://github.com/traveller59/spconv), [mmdet](https://github.com/open-mmlab/mmdetection) or [mmdet3d](https://github.com/open-mmlab/mmdetection3d).
- Only one detection network (PointPillars) is implemented in this repo, so the code should be easier to read.
- Sincere thanks to the great open-source projects [mmcv](https://github.com/open-mmlab/mmcv), [mmdet](https://github.com/open-mmlab/mmdetection) and [mmdet3d](https://github.com/open-mmlab/mmdetection3d), which helped me learn 3D detection and implement this repo.
## News
- **2025-02** PointPillars can now be built and installed as a Python package.
- **2024-04** Exporting PointPillars to ONNX & TensorRT is supported on branch [feature/deployment](https://github.com/zhulf0804/PointPillars/tree/feature/deployment).
![](./figures/pytorch_trt.png)
## mAP on KITTI validation set (Easy, Moderate, Hard)
| Repo | Metric | Overall | Pedestrian | Cyclist | Car |
| :---: | :---: | :---: | :---: | :---: | :---: |
| this repo | 3D-BBox | 73.3259, 62.7834, 59.6278 | 51.4642, 47.9446, 43.8040 | 81.8677, 63.6617, 60.9126 | 86.6456, 76.7439, 74.1668 |
| [mmdet3d v0.18.1](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1) | 3D-BBox | 72.0537, 60.1114, 55.8320 | 52.0263, 46.4037, 42.4841 | 78.7231, 59.9526, 57.2489 | 85.4118, 73.9780, 67.7630 |
| this repo | BEV | 77.8540, 69.8003, 66.6699 | 59.1687, 54.3456, 50.5023 | 84.4268, 67.1409, 63.7409 | 89.9664, 87.9145, 85.7664 |
| [mmdet3d v0.18.1](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1) | BEV | 76.6485, 67.7609, 64.5605 | 59.0778, 53.3638, 48.4230 | 80.9328, 63.3447, 60.0618 | 89.9348, 86.5743, 85.1967 |
| this repo | 2D-BBox | 80.5097, 74.6120, 71.4758 | 64.6249, 61.4201, 57.5965 | 86.2569, 73.0828, 70.1726 | 90.6471, 89.3330, 86.6583 |
| [mmdet3d v0.18.1](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1) | 2D-BBox | 78.4938, 73.4781, 70.3613 | 62.2413, 58.9157, 55.3660 | 82.6460, 72.3547, 68.4669 | 90.5939, 89.1638, 87.2511 |
| this repo | AOS | 74.9647, 68.1712, 65.2817 | 49.3777, 46.7284, 43.8352 | 85.0412, 69.1024, 66.2801 | 90.4752, 88.6828, 85.7298 |
| [mmdet3d v0.18.1](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1) | AOS | 72.41, 66.23, 63.55 | 46.00, 43.22, 40.94 | 80.85, 67.20, 63.63 | 90.37, 88.27, 86.07 |
- **Note: Here, we report the [mmdet3d v0.18.1](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1) (2022/02/09-2022/03/01) performance based on the officially provided [checkpoint](https://github.com/open-mmlab/mmdetection3d/tree/v0.18.1/configs/pointpillars#kitti). Many improvements were made in [mmdet3d v1.0.0rc1](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0rc1)**.
## Detection Visualization
![](./figures/pc_pred_000134.png)
![](./figures/img_3dbbox_000134.png)
## [Install]
Install PointPillars as a python package and all its dependencies as follows:
```
cd PointPillars/
pip install -r requirements.txt
python setup.py build_ext --inplace
pip install .
```
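Once installed, the package can be imported directly from Python. Below is a minimal usage sketch based on the test script shipped in this repo; it assumes a trained checkpoint such as `pretrained/epoch_160.pth`, a KITTI-format `.bin` point cloud, and a CUDA-capable GPU (see `test.py` for the full version, including the point-range crop and the `--no_cuda` path):

```python
import torch
from pointpillars.model import PointPillars
from pointpillars.utils import read_points

# Three classes: Pedestrian, Cyclist, Car (same as in test.py).
model = PointPillars(nclasses=3).cuda()
model.load_state_dict(torch.load('pretrained/epoch_160.pth'))
model.eval()

pc = read_points('pointpillars/dataset/demo_data/val/000134.bin')  # (N, 4): x, y, z, reflectance
with torch.no_grad():
    result = model(batched_pts=[torch.from_numpy(pc).cuda()], mode='test')[0]
# result contains 'lidar_bboxes', 'labels' and 'scores' for the detected objects.
```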
## [Datasets]
1. Download
Download the [point cloud](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip) (29 GB), [images](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip) (12 GB), [calibration files](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip) (16 MB) and [labels](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip) (5 MB). Organize the dataset as follows:
```
kitti
|- training
|- calib (#7481 .txt)
|- image_2 (#7481 .png)
|- label_2 (#7481 .txt)
|- velodyne (#7481 .bin)
|- testing
|- calib (#7518 .txt)
|- image_2 (#7518 .png)
|- velodyne (#7518 .bin)
```
2. Pre-process the KITTI dataset
```
cd PointPillars/
python pre_process_kitti.py --data_root your_path_to_kitti
```
After pre-processing, the dataset looks as follows:
```
kitti
|- training
|- calib (#7481 .txt)
|- image_2 (#7481 .png)
|- label_2 (#7481 .txt)
|- velodyne (#7481 .bin)
|- velodyne_reduced (#7481 .bin)
|- testing
|- calib (#7518 .txt)
|- image_2 (#7518 .png)
|- velodyne (#7518 .bin)
|- velodyne_reduced (#7518 .bin)
|- kitti_gt_database (# 19700 .bin)
|- kitti_infos_train.pkl
|- kitti_infos_val.pkl
|- kitti_infos_trainval.pkl
|- kitti_infos_test.pkl
|- kitti_dbinfos_train.pkl
```
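The generated `.pkl` files store per-frame metadata and can be inspected with a few lines of Python. A minimal sketch (the path is a placeholder; the keys follow the `create_data_info_pkl` function in the preprocessing script below):

```python
import pickle

with open('your_path_to_kitti/kitti_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)           # dict: frame id (int) -> info dict

info = infos[sorted(infos)[0]]       # first training frame
print(info.keys())                   # 'velodyne_path', 'image', 'calib', 'annos'
print(info['image']['image_shape'])  # (H, W) of the corresponding image
print(len(info['annos']['name']))    # number of annotated objects in this frame
```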
## [Training]
```
cd PointPillars/
python train.py --data_root your_path_to_kitti
```
## [Evaluation]
```
cd PointPillars/
python evaluate.py --ckpt pretrained/epoch_160.pth --data_root your_path_to_kitti
```
## [Test]
```
cd PointPillars/
# 1. infer and visualize point cloud detection
python test.py --ckpt pretrained/epoch_160.pth --pc_path your_pc_path
# 2. infer and visualize point cloud detection and ground truth.
python test.py --ckpt pretrained/epoch_160.pth --pc_path your_pc_path --calib_path your_calib_path --gt_path your_gt_path
# 3. infer and visualize point cloud & image detection
python test.py --ckpt pretrained/epoch_160.pth --pc_path your_pc_path --calib_path your_calib_path --img_path your_img_path
# e.g.
# a. infer on val set 000134
python test.py --ckpt pretrained/epoch_160.pth --pc_path pointpillars/dataset/demo_data/val/000134.bin
# or
python test.py --ckpt pretrained/epoch_160.pth --pc_path pointpillars/dataset/demo_data/val/000134.bin \
--calib_path pointpillars/dataset/demo_data/val/000134.txt \
--img_path pointpillars/dataset/demo_data/val/000134.png \
--gt_path pointpillars/dataset/demo_data/val/000134_gt.txt
# b. infer on test set 000002
python test.py --ckpt pretrained/epoch_160.pth --pc_path pointpillars/dataset/demo_data/test/000002.bin
# or
python test.py --ckpt pretrained/epoch_160.pth --pc_path pointpillars/dataset/demo_data/test/000002.bin \
--calib_path pointpillars/dataset/demo_data/test/000002.txt \
--img_path pointpillars/dataset/demo_data/test/000002.png
```
## Acknowledgements
Thanks to the open-source code of [mmcv](https://github.com/open-mmlab/mmcv), [mmdet](https://github.com/open-mmlab/mmdetection) and [mmdet3d](https://github.com/open-mmlab/mmdetection3d).
# PointPillars
## Paper
[PointPillars: Fast Encoders for Object Detection from Point Clouds](https://arxiv.org/abs/1812.05784)
## Model Architecture
<div align=center>
<img src="./asserts/model.png"/>
</div>
## Algorithm Overview
PointPillars is a novel point-cloud encoder and network architecture for 3D object detection. It uses PointNets to learn a representation of the point cloud organized in vertical columns (pillars), so that the point-cloud information is captured more effectively, and then feeds this representation into a standard 2D convolutional detection architecture. The design allows the trade-off between speed and accuracy to be tuned as needed: the network can run at more than 100 Hz while maintaining high accuracy.
Specifically, PointPillars builds its 2D backbone from three blocks; the output of each block is upsampled and concatenated, so the feature maps progressively gain channels before being passed to the detection head. In addition, several loss functions are used to optimize the network parameters, such as the Smooth L1 loss for box regression and the Focal loss for classification.
<div align=center>
<img src="./asserts/image.png"/>
</div>
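For intuition, here is a minimal, self-contained sketch of the pillar-encoding step. It is an illustration only, not this repo's implementation: the actual code uses a CUDA voxelization op (`pointpillars/ops/voxelization`) and a learned PointNet layer, and the function name and simplified 7-dim point features below are made up for this example.

```python
import torch

def encode_pillars(points, voxel_size=(0.16, 0.16),
                   pc_range=(0, -39.68, -3, 69.12, 39.68, 1), max_pts=32):
    """points: (N, 4) float tensor of (x, y, z, reflectance), already cropped to pc_range."""
    x0, y0 = pc_range[0], pc_range[1]
    # 1. Assign every point to a pillar: a voxel that spans the whole z range.
    ix = ((points[:, 0] - x0) / voxel_size[0]).long()
    iy = ((points[:, 1] - y0) / voxel_size[1]).long()
    pillar_id = ix * 100000 + iy                      # hash of the (ix, iy) cell
    uniq, inverse = torch.unique(pillar_id, return_inverse=True)
    pillar_feats, pillar_coords = [], []
    for p in range(len(uniq)):
        pts = points[inverse == p][:max_pts]          # keep at most max_pts points per pillar
        # 2. Decorate each point with its offset from the pillar's point mean
        #    (the paper also adds offsets from the pillar center, omitted here).
        offsets = pts[:, :3] - pts[:, :3].mean(dim=0, keepdim=True)
        feat = torch.cat([pts, offsets], dim=-1)      # (n, 7) per-point features
        # 3. A PointNet-style max pooling summarizes the pillar in one vector;
        #    the real model first applies a learned Linear + BatchNorm + ReLU.
        pillar_feats.append(feat.max(dim=0).values)
        pillar_coords.append((ix[inverse == p][0].item(), iy[inverse == p][0].item()))
    return torch.stack(pillar_feats), pillar_coords   # (P, 7) features and their grid cells
```

The resulting pillar features are scattered back onto their (ix, iy) grid cells to form a pseudo-image, which the 2D backbone and detection head shown above then process.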
## Environment Setup
Adjust the `-v` mount paths, `{docker_name}` and `{imageID}` below to match your actual setup.
### Docker (Option 1)
```bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.4.1-ubuntu22.04-dtk25.04-py3.10
docker run -it --shm-size 200g --network=host --name {docker_name} --privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro {imageID} bash
cd /your_code_path/pointpillars-pytorch
pip install -r requirements.txt
```
### Dockerfile (Option 2)
```bash
cd docker
docker build --no-cache -t pointpillars:latest .
docker run -it --shm-size 200g --network=host --name {docker_name} --privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro {imageID} bash
cd /your_code_path/pointpillars-pytorch
pip install -r requirements.txt
```
### Anaconda (Option 3)
The DCU-specific deep-learning libraries required by this project can be downloaded from the [光合 developer community](https://developer.sourcefind.cn/tool/).
```bash
DTK: 25.04
python: 3.10
vllm: 0.8.5
torch: 2.4.1+das.opt2.dtk2504
deepspeed: 0.14.2+das.opt2.dtk2504
```
`Tip: the DTK driver, Python, Torch and other DCU-related tool versions listed above must correspond to each other exactly.`
Other (non-deep-learning) dependencies can be installed as follows:
```bash
pip install -r requirements.txt
```
## Dataset
Download the [point cloud](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip) (29 GB), [images](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip) (12 GB), [calibration files](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip) (16 MB) and [labels](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip) (5 MB). The data layout is as follows:
```
kitti
|- training
|- calib (#7481 .txt)
|- image_2 (#7481 .png)
|- label_2 (#7481 .txt)
|- velodyne (#7481 .bin)
|- velodyne_reduced (#7481 .bin)
|- testing
|- calib (#7518 .txt)
|- image_2 (#7518 .png)
|- velodyne (#7518 .bin)
|- velodyne_reduced (#7518 .bin)
|- kitti_gt_database (# 19700 .bin)
|- kitti_infos_train.pkl
|- kitti_infos_val.pkl
|- kitti_infos_trainval.pkl
|- kitti_infos_test.pkl
|- kitti_dbinfos_train.pkl
```
## Training
```
cd PointPillars/
python train.py --data_root your_path_to_kitti
```
## Inference & Evaluation
```
cd PointPillars/
python evaluate.py --ckpt pretrained/epoch_160.pth --data_root your_path_to_kitti
```
## Results
<div align=center>
<img src="./asserts/result.png"/>
</div>
### Accuracy
The accuracy is essentially consistent with the NVIDIA reference results; some metrics are higher:
```
==========BBOX_2D==========
Pedestrian AP@0.5: 65.2719 61.7278 58.0838
Cyclist AP@0.5: 87.5932 75.1039 71.6308
Car AP@0.7: 90.7070 89.4133 86.6231
==========AOS==========
Pedestrian AOS@0.5: 45.5449 43.3262 40.6216
Cyclist AOS@0.5: 87.2105 72.7019 69.1393
Car AOS@0.7: 90.6199 88.9817 85.9057
==========BBOX_BEV==========
Pedestrian AP@0.5: 59.7484 55.0461 50.7703
Cyclist AP@0.5: 84.4216 68.7483 64.2096
Car AP@0.7: 90.1521 87.5884 85.7974
==========BBOX_3D==========
Pedestrian AP@0.5: 51.5879 46.5993 43.3320
Cyclist AP@0.5: 80.9012 62.8649 60.4813
Car AP@0.7: 86.0721 76.7700 74.3576
==========Overall==========
bbox_2d AP: 81.1907 75.4150 72.1126
AOS AP: 74.4584 68.3366 65.2222
bbox_bev AP: 78.1073 70.4609 66.9258
bbox_3d AP: 72.8538 62.0781 59.3903
```
## Application Scenarios
### Algorithm Category
3D object detection
### Key Application Industries
Manufacturing, autonomous driving, 3D point clouds
## Source Repository & Issue Feedback
- https://developer.sourcefind.cn/codes/modelzoo/pointpillars-pytorch
## References
- https://github.com/zhulf0804/PointPillars
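# ---- Evaluation script (invoked as `python evaluate.py` in the README above): evaluates a trained checkpoint on the KITTI val split ----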
import argparse
import numpy as np
import os
import torch
import pdb
from tqdm import tqdm
from pointpillars.utils import setup_seed, keep_bbox_from_image_range, \
keep_bbox_from_lidar_range, write_pickle, write_label, \
iou2d, iou3d_camera, iou_bev
from pointpillars.dataset import Kitti, get_dataloader
from pointpillars.model import PointPillars
def get_score_thresholds(tp_scores, total_num_valid_gt, num_sample_pts=41):
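    # KITTI-style evaluation: choose score thresholds so that the PR curve is
    # sampled at approximately num_sample_pts (41) evenly spaced recall values.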
score_thresholds = []
tp_scores = sorted(tp_scores)[::-1]
cur_recall, pts_ind = 0, 0
for i, score in enumerate(tp_scores):
lrecall = (i + 1) / total_num_valid_gt
rrecall = (i + 2) / total_num_valid_gt
if i == len(tp_scores) - 1:
score_thresholds.append(score)
break
if (lrecall + rrecall) / 2 < cur_recall:
continue
score_thresholds.append(score)
pts_ind += 1
cur_recall = pts_ind / (num_sample_pts - 1)
return score_thresholds
def do_eval(det_results, gt_results, CLASSES, saved_path):
'''
    det_results: dict(frame id -> formatted detection results)
    gt_results: dict(frame id -> ground-truth info with an 'annos' field)
CLASSES: dict
'''
assert len(det_results) == len(gt_results)
f = open(os.path.join(saved_path, 'eval_results.txt'), 'w')
# 1. calculate iou
ious = {
'bbox_2d': [],
'bbox_bev': [],
'bbox_3d': []
}
ids = list(sorted(gt_results.keys()))
for id in ids:
gt_result = gt_results[id]['annos']
det_result = det_results[id]
# 1.1, 2d bboxes iou
gt_bboxes2d = gt_result['bbox'].astype(np.float32)
det_bboxes2d = det_result['bbox'].astype(np.float32)
iou2d_v = iou2d(torch.from_numpy(gt_bboxes2d).cuda(), torch.from_numpy(det_bboxes2d).cuda())
ious['bbox_2d'].append(iou2d_v.cpu().numpy())
# 1.2, bev iou
gt_location = gt_result['location'].astype(np.float32)
gt_dimensions = gt_result['dimensions'].astype(np.float32)
gt_rotation_y = gt_result['rotation_y'].astype(np.float32)
det_location = det_result['location'].astype(np.float32)
det_dimensions = det_result['dimensions'].astype(np.float32)
det_rotation_y = det_result['rotation_y'].astype(np.float32)
gt_bev = np.concatenate([gt_location[:, [0, 2]], gt_dimensions[:, [0, 2]], gt_rotation_y[:, None]], axis=-1)
det_bev = np.concatenate([det_location[:, [0, 2]], det_dimensions[:, [0, 2]], det_rotation_y[:, None]], axis=-1)
iou_bev_v = iou_bev(torch.from_numpy(gt_bev).cuda(), torch.from_numpy(det_bev).cuda())
ious['bbox_bev'].append(iou_bev_v.cpu().numpy())
# 1.3, 3dbboxes iou
gt_bboxes3d = np.concatenate([gt_location, gt_dimensions, gt_rotation_y[:, None]], axis=-1)
det_bboxes3d = np.concatenate([det_location, det_dimensions, det_rotation_y[:, None]], axis=-1)
iou3d_v = iou3d_camera(torch.from_numpy(gt_bboxes3d).cuda(), torch.from_numpy(det_bboxes3d).cuda())
ious['bbox_3d'].append(iou3d_v.cpu().numpy())
MIN_IOUS = {
'Pedestrian': [0.5, 0.5, 0.5],
'Cyclist': [0.5, 0.5, 0.5],
'Car': [0.7, 0.7, 0.7]
}
MIN_HEIGHT = [40, 25, 25]
overall_results = {}
for e_ind, eval_type in enumerate(['bbox_2d', 'bbox_bev', 'bbox_3d']):
eval_ious = ious[eval_type]
eval_ap_results, eval_aos_results = {}, {}
for cls in CLASSES:
eval_ap_results[cls] = []
eval_aos_results[cls] = []
CLS_MIN_IOU = MIN_IOUS[cls][e_ind]
for difficulty in [0, 1, 2]:
# 1. bbox property
total_gt_ignores, total_det_ignores, total_dc_bboxes, total_scores = [], [], [], []
total_gt_alpha, total_det_alpha = [], []
for id in ids:
gt_result = gt_results[id]['annos']
det_result = det_results[id]
# 1.1 gt bbox property
cur_gt_names = gt_result['name']
cur_difficulty = gt_result['difficulty']
gt_ignores, dc_bboxes = [], []
for j, cur_gt_name in enumerate(cur_gt_names):
ignore = cur_difficulty[j] < 0 or cur_difficulty[j] > difficulty
if cur_gt_name == cls:
valid_class = 1
elif cls == 'Pedestrian' and cur_gt_name == 'Person_sitting':
valid_class = 0
elif cls == 'Car' and cur_gt_name == 'Van':
valid_class = 0
else:
valid_class = -1
if valid_class == 1 and not ignore:
gt_ignores.append(0)
elif valid_class == 0 or (valid_class == 1 and ignore):
gt_ignores.append(1)
else:
gt_ignores.append(-1)
if cur_gt_name == 'DontCare':
dc_bboxes.append(gt_result['bbox'][j])
total_gt_ignores.append(gt_ignores)
total_dc_bboxes.append(np.array(dc_bboxes))
total_gt_alpha.append(gt_result['alpha'])
# 1.2 det bbox property
cur_det_names = det_result['name']
cur_det_heights = det_result['bbox'][:, 3] - det_result['bbox'][:, 1]
det_ignores = []
for j, cur_det_name in enumerate(cur_det_names):
if cur_det_heights[j] < MIN_HEIGHT[difficulty]:
det_ignores.append(1)
elif cur_det_name == cls:
det_ignores.append(0)
else:
det_ignores.append(-1)
total_det_ignores.append(det_ignores)
total_scores.append(det_result['score'])
total_det_alpha.append(det_result['alpha'])
# 2. calculate scores thresholds for PR curve
tp_scores = []
for i, id in enumerate(ids):
cur_eval_ious = eval_ious[i]
gt_ignores, det_ignores = total_gt_ignores[i], total_det_ignores[i]
scores = total_scores[i]
nn, mm = cur_eval_ious.shape
assigned = np.zeros((mm, ), dtype=np.bool_)
for j in range(nn):
if gt_ignores[j] == -1:
continue
match_id, match_score = -1, -1
for k in range(mm):
if not assigned[k] and det_ignores[k] >= 0 and cur_eval_ious[j, k] > CLS_MIN_IOU and scores[k] > match_score:
match_id = k
match_score = scores[k]
if match_id != -1:
assigned[match_id] = True
if det_ignores[match_id] == 0 and gt_ignores[j] == 0:
tp_scores.append(match_score)
total_num_valid_gt = np.sum([np.sum(np.array(gt_ignores) == 0) for gt_ignores in total_gt_ignores])
score_thresholds = get_score_thresholds(tp_scores, total_num_valid_gt)
# 3. draw PR curve and calculate mAP
tps, fns, fps, total_aos = [], [], [], []
for score_threshold in score_thresholds:
tp, fn, fp = 0, 0, 0
aos = 0
for i, id in enumerate(ids):
cur_eval_ious = eval_ious[i]
gt_ignores, det_ignores = total_gt_ignores[i], total_det_ignores[i]
gt_alpha, det_alpha = total_gt_alpha[i], total_det_alpha[i]
scores = total_scores[i]
nn, mm = cur_eval_ious.shape
assigned = np.zeros((mm, ), dtype=np.bool_)
for j in range(nn):
if gt_ignores[j] == -1:
continue
match_id, match_iou = -1, -1
for k in range(mm):
if not assigned[k] and det_ignores[k] >= 0 and scores[k] >= score_threshold and cur_eval_ious[j, k] > CLS_MIN_IOU:
if det_ignores[k] == 0 and cur_eval_ious[j, k] > match_iou:
match_iou = cur_eval_ious[j, k]
match_id = k
elif det_ignores[k] == 1 and match_iou == -1:
match_id = k
if match_id != -1:
assigned[match_id] = True
if det_ignores[match_id] == 0 and gt_ignores[j] == 0:
tp += 1
if eval_type == 'bbox_2d':
aos += (1 + np.cos(gt_alpha[j] - det_alpha[match_id])) / 2
else:
if gt_ignores[j] == 0:
fn += 1
for k in range(mm):
if det_ignores[k] == 0 and scores[k] >= score_threshold and not assigned[k]:
fp += 1
# In case 2d bbox evaluation, we should consider dontcare bboxes
if eval_type == 'bbox_2d':
dc_bboxes = total_dc_bboxes[i]
det_bboxes = det_results[id]['bbox']
if len(dc_bboxes) > 0:
ious_dc_det = iou2d(torch.from_numpy(det_bboxes), torch.from_numpy(dc_bboxes), metric=1).numpy().T
for j in range(len(dc_bboxes)):
for k in range(len(det_bboxes)):
if det_ignores[k] == 0 and scores[k] >= score_threshold and not assigned[k]:
if ious_dc_det[j, k] > CLS_MIN_IOU:
fp -= 1
assigned[k] = True
tps.append(tp)
fns.append(fn)
fps.append(fp)
if eval_type == 'bbox_2d':
total_aos.append(aos)
tps, fns, fps = np.array(tps), np.array(fns), np.array(fps)
recalls = tps / (tps + fns)
precisions = tps / (tps + fps)
for i in range(len(score_thresholds)):
precisions[i] = np.max(precisions[i:])
sums_AP = 0
for i in range(0, len(score_thresholds), 4):
sums_AP += precisions[i]
mAP = sums_AP / 11 * 100
eval_ap_results[cls].append(mAP)
if eval_type == 'bbox_2d':
total_aos = np.array(total_aos)
similarity = total_aos / (tps + fps)
for i in range(len(score_thresholds)):
similarity[i] = np.max(similarity[i:])
sums_similarity = 0
for i in range(0, len(score_thresholds), 4):
sums_similarity += similarity[i]
mSimilarity = sums_similarity / 11 * 100
eval_aos_results[cls].append(mSimilarity)
print(f'=========={eval_type.upper()}==========')
print(f'=========={eval_type.upper()}==========', file=f)
for k, v in eval_ap_results.items():
print(f'{k} AP@{MIN_IOUS[k][e_ind]}: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}')
print(f'{k} AP@{MIN_IOUS[k][e_ind]}: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}', file=f)
if eval_type == 'bbox_2d':
print(f'==========AOS==========')
print(f'==========AOS==========', file=f)
for k, v in eval_aos_results.items():
print(f'{k} AOS@{MIN_IOUS[k][e_ind]}: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}')
print(f'{k} AOS@{MIN_IOUS[k][e_ind]}: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}', file=f)
overall_results[eval_type] = np.mean(list(eval_ap_results.values()), 0)
if eval_type == 'bbox_2d':
overall_results['AOS'] = np.mean(list(eval_aos_results.values()), 0)
print(f'\n==========Overall==========')
print(f'\n==========Overall==========', file=f)
for k, v in overall_results.items():
print(f'{k} AP: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}')
print(f'{k} AP: {v[0]:.4f} {v[1]:.4f} {v[2]:.4f}', file=f)
f.close()
def main(args):
val_dataset = Kitti(data_root=args.data_root,
split='val')
val_dataloader = get_dataloader(dataset=val_dataset,
batch_size=args.batch_size,
num_workers=args.num_workers,
shuffle=False)
CLASSES = Kitti.CLASSES
LABEL2CLASSES = {v:k for k, v in CLASSES.items()}
if not args.no_cuda:
model = PointPillars(nclasses=args.nclasses).cuda()
model.load_state_dict(torch.load(args.ckpt))
else:
model = PointPillars(nclasses=args.nclasses)
model.load_state_dict(
torch.load(args.ckpt, map_location=torch.device('cpu')))
saved_path = args.saved_path
os.makedirs(saved_path, exist_ok=True)
saved_submit_path = os.path.join(saved_path, 'submit')
os.makedirs(saved_submit_path, exist_ok=True)
pcd_limit_range = np.array([0, -40, -3, 70.4, 40, 0.0], dtype=np.float32)
model.eval()
with torch.no_grad():
format_results = {}
print('Predicting and Formatting the results.')
for i, data_dict in enumerate(tqdm(val_dataloader)):
if not args.no_cuda:
# move the tensors to the cuda
for key in data_dict:
for j, item in enumerate(data_dict[key]):
if torch.is_tensor(item):
data_dict[key][j] = data_dict[key][j].cuda()
batched_pts = data_dict['batched_pts']
batched_gt_bboxes = data_dict['batched_gt_bboxes']
batched_labels = data_dict['batched_labels']
batched_difficulty = data_dict['batched_difficulty']
batch_results = model(batched_pts=batched_pts,
mode='val',
batched_gt_bboxes=batched_gt_bboxes,
batched_gt_labels=batched_labels)
# pdb.set_trace()
for j, result in enumerate(batch_results):
format_result = {
'name': [],
'truncated': [],
'occluded': [],
'alpha': [],
'bbox': [],
'dimensions': [],
'location': [],
'rotation_y': [],
'score': []
}
calib_info = data_dict['batched_calib_info'][j]
tr_velo_to_cam = calib_info['Tr_velo_to_cam'].astype(np.float32)
r0_rect = calib_info['R0_rect'].astype(np.float32)
P2 = calib_info['P2'].astype(np.float32)
image_shape = data_dict['batched_img_info'][j]['image_shape']
idx = data_dict['batched_img_info'][j]['image_idx']
result_filter = keep_bbox_from_image_range(result, tr_velo_to_cam, r0_rect, P2, image_shape)
result_filter = keep_bbox_from_lidar_range(result_filter, pcd_limit_range)
lidar_bboxes = result_filter['lidar_bboxes']
labels, scores = result_filter['labels'], result_filter['scores']
bboxes2d, camera_bboxes = result_filter['bboxes2d'], result_filter['camera_bboxes']
for lidar_bbox, label, score, bbox2d, camera_bbox in \
zip(lidar_bboxes, labels, scores, bboxes2d, camera_bboxes):
format_result['name'].append(LABEL2CLASSES[label])
format_result['truncated'].append(0.0)
format_result['occluded'].append(0)
alpha = camera_bbox[6] - np.arctan2(camera_bbox[0], camera_bbox[2])
format_result['alpha'].append(alpha)
format_result['bbox'].append(bbox2d)
format_result['dimensions'].append(camera_bbox[3:6])
format_result['location'].append(camera_bbox[:3])
format_result['rotation_y'].append(camera_bbox[6])
format_result['score'].append(score)
write_label(format_result, os.path.join(saved_submit_path, f'{idx:06d}.txt'))
format_results[idx] = {k:np.array(v) for k, v in format_result.items()}
write_pickle(format_results, os.path.join(saved_path, 'results.pkl'))
print('Evaluating.. Please wait several seconds.')
do_eval(format_results, val_dataset.data_infos, CLASSES, saved_path)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Configuration Parameters')
parser.add_argument('--data_root', default='/mnt/ssd1/lifa_rdata/det/kitti',
help='your data root for kitti')
parser.add_argument('--ckpt', default='pretrained/epoch_160.pth', help='your checkpoint for kitti')
parser.add_argument('--saved_path', default='results', help='your saved path for predicted results')
parser.add_argument('--batch_size', type=int, default=1)
parser.add_argument('--num_workers', type=int, default=4)
parser.add_argument('--nclasses', type=int, default=3)
parser.add_argument('--no_cuda', action='store_true',
help='whether to use cuda')
args = parser.parse_args()
main(args)
# Model unique identifier
modelCode=1621
# Model name
modelName=pointpillars-pytorch
# Model description
modelDescription=This work proposes PointPillars, a novel point-cloud encoder and network architecture designed for 3D object detection, particularly suited to lidar data processing in autonomous-driving applications.
# Application scenarios
appScenario=Inference, training, manufacturing, autonomous driving, 3D point clouds
# Framework type
frameType=pytorch
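# ---- Preprocessing script (invoked as `python pre_process_kitti.py` in the README above): builds reduced point clouds, info pkls and the GT database ----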
import argparse
import pdb
import cv2
import numpy as np
import os
from tqdm import tqdm
import sys
CUR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(CUR)
from pointpillars.utils import read_points, write_points, read_calib, read_label, \
write_pickle, remove_outside_points, get_points_num_in_bbox, \
points_in_bboxes_v2
def judge_difficulty(annotation_dict):
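    # KITTI difficulty levels (0 = easy, 1 = moderate, 2 = hard), decided by
    # 2D box height, occlusion level and truncation; -1 means none applies.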
truncated = annotation_dict['truncated']
occluded = annotation_dict['occluded']
bbox = annotation_dict['bbox']
height = bbox[:, 3] - bbox[:, 1]
MIN_HEIGHTS = [40, 25, 25]
MAX_OCCLUSION = [0, 1, 2]
MAX_TRUNCATION = [0.15, 0.30, 0.50]
difficultys = []
for h, o, t in zip(height, occluded, truncated):
difficulty = -1
for i in range(2, -1, -1):
if h > MIN_HEIGHTS[i] and o <= MAX_OCCLUSION[i] and t <= MAX_TRUNCATION[i]:
difficulty = i
difficultys.append(difficulty)
return np.array(difficultys, dtype=np.int32)
def create_data_info_pkl(data_root, data_type, prefix, label=True, db=False):
sep = os.path.sep
print(f"Processing {data_type} data..")
ids_file = os.path.join(CUR, 'pointpillars', 'dataset', 'ImageSets', f'{data_type}.txt')
with open(ids_file, 'r') as f:
ids = [id.strip() for id in f.readlines()]
split = 'training' if label else 'testing'
kitti_infos_dict = {}
if db:
kitti_dbinfos_train = {}
db_points_saved_path = os.path.join(data_root, f'{prefix}_gt_database')
os.makedirs(db_points_saved_path, exist_ok=True)
for id in tqdm(ids):
cur_info_dict={}
img_path = os.path.join(data_root, split, 'image_2', f'{id}.png')
lidar_path = os.path.join(data_root, split, 'velodyne', f'{id}.bin')
calib_path = os.path.join(data_root, split, 'calib', f'{id}.txt')
cur_info_dict['velodyne_path'] = sep.join(lidar_path.split(sep)[-3:])
img = cv2.imread(img_path)
image_shape = img.shape[:2]
cur_info_dict['image'] = {
'image_shape': image_shape,
'image_path': sep.join(img_path.split(sep)[-3:]),
'image_idx': int(id),
}
calib_dict = read_calib(calib_path)
cur_info_dict['calib'] = calib_dict
lidar_points = read_points(lidar_path)
reduced_lidar_points = remove_outside_points(
points=lidar_points,
r0_rect=calib_dict['R0_rect'],
tr_velo_to_cam=calib_dict['Tr_velo_to_cam'],
P2=calib_dict['P2'],
image_shape=image_shape)
saved_reduced_path = os.path.join(data_root, split, 'velodyne_reduced')
os.makedirs(saved_reduced_path, exist_ok=True)
saved_reduced_points_name = os.path.join(saved_reduced_path, f'{id}.bin')
write_points(reduced_lidar_points, saved_reduced_points_name)
if label:
label_path = os.path.join(data_root, split, 'label_2', f'{id}.txt')
annotation_dict = read_label(label_path)
annotation_dict['difficulty'] = judge_difficulty(annotation_dict)
annotation_dict['num_points_in_gt'] = get_points_num_in_bbox(
points=reduced_lidar_points,
r0_rect=calib_dict['R0_rect'],
tr_velo_to_cam=calib_dict['Tr_velo_to_cam'],
dimensions=annotation_dict['dimensions'],
location=annotation_dict['location'],
rotation_y=annotation_dict['rotation_y'],
name=annotation_dict['name'])
cur_info_dict['annos'] = annotation_dict
if db:
indices, n_total_bbox, n_valid_bbox, bboxes_lidar, name = \
points_in_bboxes_v2(
points=lidar_points,
r0_rect=calib_dict['R0_rect'].astype(np.float32),
tr_velo_to_cam=calib_dict['Tr_velo_to_cam'].astype(np.float32),
dimensions=annotation_dict['dimensions'].astype(np.float32),
location=annotation_dict['location'].astype(np.float32),
rotation_y=annotation_dict['rotation_y'].astype(np.float32),
name=annotation_dict['name']
)
for j in range(n_valid_bbox):
db_points = lidar_points[indices[:, j]]
db_points[:, :3] -= bboxes_lidar[j, :3]
db_points_saved_name = os.path.join(db_points_saved_path, f'{int(id)}_{name[j]}_{j}.bin')
write_points(db_points, db_points_saved_name)
db_info={
'name': name[j],
'path': os.path.join(os.path.basename(db_points_saved_path), f'{int(id)}_{name[j]}_{j}.bin'),
'box3d_lidar': bboxes_lidar[j],
'difficulty': annotation_dict['difficulty'][j],
'num_points_in_gt': len(db_points),
}
if name[j] not in kitti_dbinfos_train:
kitti_dbinfos_train[name[j]] = [db_info]
else:
kitti_dbinfos_train[name[j]].append(db_info)
kitti_infos_dict[int(id)] = cur_info_dict
saved_path = os.path.join(data_root, f'{prefix}_infos_{data_type}.pkl')
write_pickle(kitti_infos_dict, saved_path)
if db:
saved_db_path = os.path.join(data_root, f'{prefix}_dbinfos_train.pkl')
write_pickle(kitti_dbinfos_train, saved_db_path)
return kitti_infos_dict
def main(args):
data_root = args.data_root
prefix = args.prefix
    ## 1. train: create data information pkl file && create reduced point clouds
    ## && create database (points in gt bbox) for data augmentation
kitti_train_infos_dict = create_data_info_pkl(data_root, 'train', prefix, db=True)
    ## 2. val: create data information pkl file && create reduced point clouds
kitti_val_infos_dict = create_data_info_pkl(data_root, 'val', prefix)
    ## 3. trainval: create data information pkl file
kitti_trainval_infos_dict = {**kitti_train_infos_dict, **kitti_val_infos_dict}
saved_path = os.path.join(data_root, f'{prefix}_infos_trainval.pkl')
write_pickle(kitti_trainval_infos_dict, saved_path)
    ## 4. test: create data information pkl file && create reduced point clouds
kitti_test_infos_dict = create_data_info_pkl(data_root, 'test', prefix, label=False)
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Dataset information')
parser.add_argument('--data_root', default='/mnt/ssd1/lifa_rdata/det/kitti',
help='your data root for kitti')
parser.add_argument('--prefix', default='kitti',
help='the prefix name for the saved .pkl file')
args = parser.parse_args()
main(args)
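# ---- requirements.txt (installed via `pip install -r requirements.txt` in the README above) ----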
#numba
numpy
open3d
opencv_python==4.5.5.62
PyYAML
setuptools
tensorboard
#--extra-index-url https://download.pytorch.org/whl/cu111
#torch==1.8.1+cu111
tqdm
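# ---- setup.py (invoked as `python setup.py build_ext --inplace` in the README above): builds the CUDA voxelization and IoU3D extensions ----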
from setuptools import setup, find_packages
from torch.utils.cpp_extension import BuildExtension, CUDAExtension
setup(
name='pointpillars',
version='0.1',
packages=find_packages(),
ext_modules=[
CUDAExtension(
name='pointpillars.ops.voxel_op',
sources=[
'pointpillars/ops/voxelization/voxelization.cpp',
'pointpillars/ops/voxelization/voxelization_cpu.cpp',
'pointpillars/ops/voxelization/voxelization_cuda.cu',
],
define_macros=[('WITH_CUDA', None)]
),
CUDAExtension(
name='pointpillars.ops.iou3d_op',
sources=[
'pointpillars/ops/iou3d/iou3d.cpp',
'pointpillars/ops/iou3d/iou3d_kernel.cu',
],
define_macros=[('WITH_CUDA', None)]
)
],
cmdclass={'build_ext': BuildExtension},
zip_safe=False
)
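# ---- Test script (invoked as `python test.py` in the README above): single-sample inference and visualization ----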
import argparse
import cv2
import numpy as np
import os
import torch
import pdb
from pointpillars.utils import setup_seed, read_points, read_calib, read_label, \
keep_bbox_from_image_range, keep_bbox_from_lidar_range, vis_pc, \
vis_img_3d, bbox3d2corners_camera, points_camera2image, \
bbox_camera2lidar
from pointpillars.model import PointPillars
def point_range_filter(pts, point_range=[0, -39.68, -3, 69.12, 39.68, 1]):
'''
    pts: (N, 4) point cloud in lidar coordinates
point_range: [x1, y1, z1, x2, y2, z2]
'''
flag_x_low = pts[:, 0] > point_range[0]
flag_y_low = pts[:, 1] > point_range[1]
flag_z_low = pts[:, 2] > point_range[2]
flag_x_high = pts[:, 0] < point_range[3]
flag_y_high = pts[:, 1] < point_range[4]
flag_z_high = pts[:, 2] < point_range[5]
keep_mask = flag_x_low & flag_y_low & flag_z_low & flag_x_high & flag_y_high & flag_z_high
pts = pts[keep_mask]
return pts
def main(args):
CLASSES = {
'Pedestrian': 0,
'Cyclist': 1,
'Car': 2
}
LABEL2CLASSES = {v:k for k, v in CLASSES.items()}
pcd_limit_range = np.array([0, -40, -3, 70.4, 40, 0.0], dtype=np.float32)
if not args.no_cuda:
model = PointPillars(nclasses=len(CLASSES)).cuda()
model.load_state_dict(torch.load(args.ckpt))
else:
model = PointPillars(nclasses=len(CLASSES))
model.load_state_dict(
torch.load(args.ckpt, map_location=torch.device('cpu')))
if not os.path.exists(args.pc_path):
raise FileNotFoundError
pc = read_points(args.pc_path)
pc = point_range_filter(pc)
pc_torch = torch.from_numpy(pc)
if os.path.exists(args.calib_path):
calib_info = read_calib(args.calib_path)
else:
calib_info = None
if os.path.exists(args.gt_path):
gt_label = read_label(args.gt_path)
else:
gt_label = None
if os.path.exists(args.img_path):
img = cv2.imread(args.img_path, 1)
else:
img = None
model.eval()
with torch.no_grad():
if not args.no_cuda:
pc_torch = pc_torch.cuda()
result_filter = model(batched_pts=[pc_torch],
mode='test')[0]
if calib_info is not None and img is not None:
tr_velo_to_cam = calib_info['Tr_velo_to_cam'].astype(np.float32)
r0_rect = calib_info['R0_rect'].astype(np.float32)
P2 = calib_info['P2'].astype(np.float32)
image_shape = img.shape[:2]
result_filter = keep_bbox_from_image_range(result_filter, tr_velo_to_cam, r0_rect, P2, image_shape)
result_filter = keep_bbox_from_lidar_range(result_filter, pcd_limit_range)
lidar_bboxes = result_filter['lidar_bboxes']
labels, scores = result_filter['labels'], result_filter['scores']
vis_pc(pc, bboxes=lidar_bboxes, labels=labels)
if calib_info is not None and img is not None:
bboxes2d, camera_bboxes = result_filter['bboxes2d'], result_filter['camera_bboxes']
bboxes_corners = bbox3d2corners_camera(camera_bboxes)
image_points = points_camera2image(bboxes_corners, P2)
img = vis_img_3d(img, image_points, labels, rt=True)
if calib_info is not None and gt_label is not None:
tr_velo_to_cam = calib_info['Tr_velo_to_cam'].astype(np.float32)
r0_rect = calib_info['R0_rect'].astype(np.float32)
dimensions = gt_label['dimensions']
location = gt_label['location']
rotation_y = gt_label['rotation_y']
gt_labels = np.array([CLASSES.get(item, -1) for item in gt_label['name']])
sel = gt_labels != -1
gt_labels = gt_labels[sel]
bboxes_camera = np.concatenate([location, dimensions, rotation_y[:, None]], axis=-1)
gt_lidar_bboxes = bbox_camera2lidar(bboxes_camera, tr_velo_to_cam, r0_rect)
bboxes_camera = bboxes_camera[sel]
gt_lidar_bboxes = gt_lidar_bboxes[sel]
gt_labels = [-1] * len(gt_label['name']) # to distinguish between the ground truth and the predictions
pred_gt_lidar_bboxes = np.concatenate([lidar_bboxes, gt_lidar_bboxes], axis=0)
pred_gt_labels = np.concatenate([labels, gt_labels])
vis_pc(pc, pred_gt_lidar_bboxes, labels=pred_gt_labels)
if img is not None:
bboxes_corners = bbox3d2corners_camera(bboxes_camera)
image_points = points_camera2image(bboxes_corners, P2)
gt_labels = [-1] * len(gt_label['name'])
img = vis_img_3d(img, image_points, gt_labels, rt=True)
if calib_info is not None and img is not None:
cv2.imshow(f'{os.path.basename(args.img_path)}-3d bbox', img)
cv2.waitKey(0)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Configuration Parameters')
parser.add_argument('--ckpt', default='pretrained/epoch_160.pth', help='your checkpoint for kitti')
parser.add_argument('--pc_path', help='your point cloud path')
parser.add_argument('--calib_path', default='', help='your calib file path')
parser.add_argument('--gt_path', default='', help='your ground truth path')
parser.add_argument('--img_path', default='', help='your image path')
parser.add_argument('--no_cuda', action='store_true',
help='whether to use cuda')
args = parser.parse_args()
main(args)
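# ---- Training script (invoked as `python train.py` in the README above): trains PointPillars on KITTI ----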
import argparse
import os
import torch
from tqdm import tqdm
import pdb
from pointpillars.utils import setup_seed
from pointpillars.dataset import Kitti, get_dataloader
from pointpillars.model import PointPillars
from pointpillars.loss import Loss
from torch.utils.tensorboard import SummaryWriter
def save_summary(writer, loss_dict, global_step, tag, lr=None, momentum=None):
for k, v in loss_dict.items():
writer.add_scalar(f'{tag}/{k}', v, global_step)
if lr is not None:
writer.add_scalar('lr', lr, global_step)
if momentum is not None:
writer.add_scalar('momentum', momentum, global_step)
def main(args):
setup_seed()
train_dataset = Kitti(data_root=args.data_root,
split='train')
val_dataset = Kitti(data_root=args.data_root,
split='val')
train_dataloader = get_dataloader(dataset=train_dataset,
batch_size=args.batch_size,
num_workers=args.num_workers,
shuffle=True)
val_dataloader = get_dataloader(dataset=val_dataset,
batch_size=args.batch_size,
num_workers=args.num_workers,
shuffle=False)
if not args.no_cuda:
pointpillars = PointPillars(nclasses=args.nclasses).cuda()
else:
pointpillars = PointPillars(nclasses=args.nclasses)
loss_func = Loss()
max_iters = len(train_dataloader) * args.max_epoch
init_lr = args.init_lr
optimizer = torch.optim.AdamW(params=pointpillars.parameters(),
lr=init_lr,
betas=(0.95, 0.99),
weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer,
max_lr=init_lr*10,
total_steps=max_iters,
pct_start=0.4,
anneal_strategy='cos',
cycle_momentum=True,
base_momentum=0.95*0.895,
max_momentum=0.95,
div_factor=10)
saved_logs_path = os.path.join(args.saved_path, 'summary')
os.makedirs(saved_logs_path, exist_ok=True)
writer = SummaryWriter(saved_logs_path)
saved_ckpt_path = os.path.join(args.saved_path, 'checkpoints')
os.makedirs(saved_ckpt_path, exist_ok=True)
for epoch in range(args.max_epoch):
print('=' * 20, epoch, '=' * 20)
train_step, val_step = 0, 0
for i, data_dict in enumerate(tqdm(train_dataloader)):
if not args.no_cuda:
# move the tensors to the cuda
for key in data_dict:
for j, item in enumerate(data_dict[key]):
if torch.is_tensor(item):
data_dict[key][j] = data_dict[key][j].cuda()
optimizer.zero_grad()
batched_pts = data_dict['batched_pts']
batched_gt_bboxes = data_dict['batched_gt_bboxes']
batched_labels = data_dict['batched_labels']
batched_difficulty = data_dict['batched_difficulty']
bbox_cls_pred, bbox_pred, bbox_dir_cls_pred, anchor_target_dict = \
pointpillars(batched_pts=batched_pts,
mode='train',
batched_gt_bboxes=batched_gt_bboxes,
batched_gt_labels=batched_labels)
bbox_cls_pred = bbox_cls_pred.permute(0, 2, 3, 1).reshape(-1, args.nclasses)
bbox_pred = bbox_pred.permute(0, 2, 3, 1).reshape(-1, 7)
bbox_dir_cls_pred = bbox_dir_cls_pred.permute(0, 2, 3, 1).reshape(-1, 2)
batched_bbox_labels = anchor_target_dict['batched_labels'].reshape(-1)
batched_label_weights = anchor_target_dict['batched_label_weights'].reshape(-1)
batched_bbox_reg = anchor_target_dict['batched_bbox_reg'].reshape(-1, 7)
# batched_bbox_reg_weights = anchor_target_dict['batched_bbox_reg_weights'].reshape(-1)
batched_dir_labels = anchor_target_dict['batched_dir_labels'].reshape(-1)
# batched_dir_labels_weights = anchor_target_dict['batched_dir_labels_weights'].reshape(-1)
pos_idx = (batched_bbox_labels >= 0) & (batched_bbox_labels < args.nclasses)
bbox_pred = bbox_pred[pos_idx]
batched_bbox_reg = batched_bbox_reg[pos_idx]
# sin(a - b) = sin(a)*cos(b) - cos(a)*sin(b)
bbox_pred[:, -1] = torch.sin(bbox_pred[:, -1].clone()) * torch.cos(batched_bbox_reg[:, -1].clone())
batched_bbox_reg[:, -1] = torch.cos(bbox_pred[:, -1].clone()) * torch.sin(batched_bbox_reg[:, -1].clone())
bbox_dir_cls_pred = bbox_dir_cls_pred[pos_idx]
batched_dir_labels = batched_dir_labels[pos_idx]
num_cls_pos = (batched_bbox_labels < args.nclasses).sum()
bbox_cls_pred = bbox_cls_pred[batched_label_weights > 0]
batched_bbox_labels[batched_bbox_labels < 0] = args.nclasses
batched_bbox_labels = batched_bbox_labels[batched_label_weights > 0]
loss_dict = loss_func(bbox_cls_pred=bbox_cls_pred,
bbox_pred=bbox_pred,
bbox_dir_cls_pred=bbox_dir_cls_pred,
batched_labels=batched_bbox_labels,
num_cls_pos=num_cls_pos,
batched_bbox_reg=batched_bbox_reg,
batched_dir_labels=batched_dir_labels)
loss = loss_dict['total_loss']
loss.backward()
# torch.nn.utils.clip_grad_norm_(pointpillars.parameters(), max_norm=35)
optimizer.step()
scheduler.step()
global_step = epoch * len(train_dataloader) + train_step + 1
if global_step % args.log_freq == 0:
save_summary(writer, loss_dict, global_step, 'train',
lr=optimizer.param_groups[0]['lr'],
momentum=optimizer.param_groups[0]['betas'][0])
train_step += 1
if (epoch + 1) % args.ckpt_freq_epoch == 0:
torch.save(pointpillars.state_dict(), os.path.join(saved_ckpt_path, f'epoch_{epoch+1}.pth'))
if epoch % 2 == 0:
continue
pointpillars.eval()
with torch.no_grad():
for i, data_dict in enumerate(tqdm(val_dataloader)):
if not args.no_cuda:
# move the tensors to the cuda
for key in data_dict:
for j, item in enumerate(data_dict[key]):
if torch.is_tensor(item):
data_dict[key][j] = data_dict[key][j].cuda()
batched_pts = data_dict['batched_pts']
batched_gt_bboxes = data_dict['batched_gt_bboxes']
batched_labels = data_dict['batched_labels']
batched_difficulty = data_dict['batched_difficulty']
bbox_cls_pred, bbox_pred, bbox_dir_cls_pred, anchor_target_dict = \
pointpillars(batched_pts=batched_pts,
mode='train',
batched_gt_bboxes=batched_gt_bboxes,
batched_gt_labels=batched_labels)
bbox_cls_pred = bbox_cls_pred.permute(0, 2, 3, 1).reshape(-1, args.nclasses)
bbox_pred = bbox_pred.permute(0, 2, 3, 1).reshape(-1, 7)
bbox_dir_cls_pred = bbox_dir_cls_pred.permute(0, 2, 3, 1).reshape(-1, 2)
batched_bbox_labels = anchor_target_dict['batched_labels'].reshape(-1)
batched_label_weights = anchor_target_dict['batched_label_weights'].reshape(-1)
batched_bbox_reg = anchor_target_dict['batched_bbox_reg'].reshape(-1, 7)
# batched_bbox_reg_weights = anchor_target_dict['batched_bbox_reg_weights'].reshape(-1)
batched_dir_labels = anchor_target_dict['batched_dir_labels'].reshape(-1)
# batched_dir_labels_weights = anchor_target_dict['batched_dir_labels_weights'].reshape(-1)
pos_idx = (batched_bbox_labels >= 0) & (batched_bbox_labels < args.nclasses)
bbox_pred = bbox_pred[pos_idx]
batched_bbox_reg = batched_bbox_reg[pos_idx]
# sin(a - b) = sin(a)*cos(b) - cos(a)*sin(b)
bbox_pred[:, -1] = torch.sin(bbox_pred[:, -1]) * torch.cos(batched_bbox_reg[:, -1])
batched_bbox_reg[:, -1] = torch.cos(bbox_pred[:, -1]) * torch.sin(batched_bbox_reg[:, -1])
bbox_dir_cls_pred = bbox_dir_cls_pred[pos_idx]
batched_dir_labels = batched_dir_labels[pos_idx]
num_cls_pos = (batched_bbox_labels < args.nclasses).sum()
bbox_cls_pred = bbox_cls_pred[batched_label_weights > 0]
batched_bbox_labels[batched_bbox_labels < 0] = args.nclasses
batched_bbox_labels = batched_bbox_labels[batched_label_weights > 0]
loss_dict = loss_func(bbox_cls_pred=bbox_cls_pred,
bbox_pred=bbox_pred,
bbox_dir_cls_pred=bbox_dir_cls_pred,
batched_labels=batched_bbox_labels,
num_cls_pos=num_cls_pos,
batched_bbox_reg=batched_bbox_reg,
batched_dir_labels=batched_dir_labels)
global_step = epoch * len(val_dataloader) + val_step + 1
if global_step % args.log_freq == 0:
save_summary(writer, loss_dict, global_step, 'val')
val_step += 1
pointpillars.train()
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Configuration Parameters')
parser.add_argument('--data_root', default='/mnt/ssd1/lifa_rdata/det/kitti',
help='your data root for kitti')
parser.add_argument('--saved_path', default='pillar_logs')
parser.add_argument('--batch_size', type=int, default=6)
parser.add_argument('--num_workers', type=int, default=4)
parser.add_argument('--nclasses', type=int, default=3)
parser.add_argument('--init_lr', type=float, default=0.00025)
parser.add_argument('--max_epoch', type=int, default=160)
parser.add_argument('--log_freq', type=int, default=8)
parser.add_argument('--ckpt_freq_epoch', type=int, default=20)
parser.add_argument('--no_cuda', action='store_true',
help='whether to use cuda')
args = parser.parse_args()
main(args)