Support Waymo Open Dataset with SoTA results #349

Unverified Commit a6bb3580 authored Nov 10, 2020 by Shaoshuai Shi Committed by GitHub Nov 10, 2020
13 changed files
--- a/README.md
+++ b/README.md
@@ -18,12 +18,15 @@ It is also the official code release of [`[PointRCNN]`](https://arxiv.org/abs/18
 ## Changelog
-[2020-08-10] **NEW:** Bugfixed: The provided NuScenes models have been updated to fix the loading bugs. Please redownload it if you need to use the pretrained NuScenes models.
+[2020-11-10] **NEW:** The [Waymo Open Dataset](#waymo-open-dataset-baselines) is now supported with state-of-the-art results. Currently we provide the 
+configs and results of `SECOND`, `PartA2` and `PV-RCNN` on the Waymo Open Dataset, and more models can be easily supported by modifying their dataset configs. 
-[2020-07-30] **NEW:** `OpenPCDet` v0.3.0 is released with the following features:
+[2020-08-10] Bugfix: The provided NuScenes models have been updated to fix the loading bugs. Please redownload them if you need to use the pretrained NuScenes models.
+[2020-07-30] `OpenPCDet` v0.3.0 is released with the following features:
   * The Point-based and Anchor-Free models ([`PointRCNN`](#KITTI-3D-Object-Detection-Baselines), [`PartA2-Free`](#KITTI-3D-Object-Detection-Baselines)) are supported now.
   * The NuScenes dataset is supported with strong baseline results ([`SECOND-MultiHead (CBGS)`](#NuScenes-3D-Object-Detection-Baselines) and [`PointPillar-MultiHead`](#NuScenes-3D-Object-Detection-Baselines)).
-   * High efficiency than last version, support `PyTorch 1.1~1.5` and `spconv 1.0~1.2` simultaneously.
+   * Higher efficiency than the last version, supporting **PyTorch 1.1~1.7** and **spconv 1.0~1.2** simultaneously.
 [2020-07-17]  Add simple visualization codes and a quick demo to test with custom data. 
@@ -87,7 +90,7 @@ Selected supported methods are shown in the below table. The results are the 3D
 * All models are trained with 8 GTX 1080Ti GPUs and are available for download. 
 * The training time is measured with 8 TITAN XP GPUs and PyTorch 1.5.
-|                                             | training time | Car | Pedestrian | Cyclist  | download | 
+|                                             | training time | Car@R11 | Pedestrian@R11 | Cyclist@R11  | download | 
 |---------------------------------------------|----------:|:-------:|:-------:|:-------:|:---------:|
 | [PointPillar](tools/cfgs/kitti_models/pointpillar.yaml) |~1.2 hours| 77.28 | 52.29 | 62.68 | [model-18M](https://drive.google.com/file/d/1wMxWTpU1qUoY3DsCH31WJmvJxcjFXKlm/view?usp=sharing) | 
 | [SECOND](tools/cfgs/kitti_models/second.yaml)       |  ~1.7 hours  | 78.62 | 52.98 | 67.15 | [model-20M](https://drive.google.com/file/d/1-01zsPOsqanZQqIIyy7FpNXStL3y4jdR/view?usp=sharing) |
@@ -105,6 +108,22 @@ All models are trained with 8 GTX 1080Ti GPUs and are available for download.
 | [PointPillar-MultiHead](tools/cfgs/nuscenes_models/cbgs_pp_multihead.yaml) | 33.87	| 26.00 | 32.07	| 28.74 | 20.15 | 44.63 | 58.23	 | [model-23M](https://drive.google.com/file/d/1p-501mTWsq0G9RzroTWSXreIMyTUUpBM/view?usp=sharing) | 
 | [SECOND-MultiHead (CBGS)](tools/cfgs/nuscenes_models/cbgs_second_multihead.yaml) | 31.15 |	25.51 |	26.64 | 26.26 | 20.46 | 50.59 | 62.29 | [model-35M](https://drive.google.com/file/d/1bNzcOnE3u9iooBFMk2xK7HqhdeQ_nwTq/view?usp=sharing) |
+### Waymo Open Dataset Baselines
+We provide the [`DATA_CONFIG.SAMPLED_INTERVAL`](tools/cfgs/dataset_configs/waymo_dataset.yaml) setting on the Waymo Open Dataset (WOD) to subsample frames for training and evaluation, 
+so you can also experiment with WOD by setting a larger `DATA_CONFIG.SAMPLED_INTERVAL` (which keeps fewer frames) even if you only have limited GPU resources. 
+By default, all models are trained with **20% data (~32k frames)** of all the training samples on 8 GTX 1080Ti GPUs, and each cell reports mAP/mAPH calculated by the official Waymo evaluation metrics on the **whole** validation set (version 1.2).    
+|                                             | Vec_L1 | Vec_L2 | Ped_L1 | Ped_L2 | Cyc_L1 | Cyc_L2 |  
+|---------------------------------------------|----------:|:-------:|:-------:|:-------:|:-------:|:-------:|
+| [SECOND](tools/cfgs/waymo_models/second.yaml) | 68.03/67.44 | 59.57/59.04 | 61.14/50.33 | 53.00/43.56 | 54.66/53.31 | 52.67/51.37 |
+| [Part-A^2-Anchor](tools/cfgs/waymo_models/PartA2.yaml) | 71.82/71.29 | 64.33/63.82 | 63.15/54.96 | 54.24/47.11 | 65.23/63.92 | 62.61/61.35 |
+| [PV-RCNN](tools/cfgs/waymo_models/pv_rcnn.yaml) | 74.06/73.38 | 64.99/64.38 | 62.66/52.68 | 53.80/45.14 | 63.32/61.71 | 60.72/59.18 |
+We cannot provide the above pretrained models due to the [Waymo Dataset License Agreement](https://waymo.com/open/terms/), 
+but you should be able to achieve similar performance by training with the default configs.
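The effect of `SAMPLED_INTERVAL` can be illustrated with a minimal sketch. It assumes the setting keeps every k-th frame of the loaded info list, which is what the dataset loader added in this commit does; `subsample_infos` is a hypothetical helper name used only for illustration and is not part of OpenPCDet.

```python
def subsample_infos(infos, sampled_interval):
    """Keep every `sampled_interval`-th entry, starting from index 0.

    An interval of 1 (or less) keeps everything; a larger interval keeps
    fewer frames, so an interval of 5 keeps roughly 20% of the data.
    """
    infos = list(infos)
    if sampled_interval <= 1:
        return infos
    return infos[::sampled_interval]


# 20 dummy frames subsampled with an interval of 5.
kept = subsample_infos(range(20), 5)  # -> [0, 5, 10, 15]
```

Under this reading, a larger interval reduces the amount of training data, which is the direction to move in when GPU resources are limited.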
 ### Other datasets
 More datasets are on the way. 
@@ -140,31 +159,15 @@ If you find this project useful in your research, please consider cite:
 ```
-@inproceedings{shi2020pv,
+@misc{openpcdet2020,
-  title={Pv-rcnn: Point-voxel feature set abstraction for 3d object detection},
+    title={OpenPCDet: An Open-source Toolbox for 3D Object Detection from Point Clouds},
-  author={Shi, Shaoshuai and Guo, Chaoxu and Jiang, Li and Wang, Zhe and Shi, Jianping and Wang, Xiaogang and Li, Hongsheng},
+    author={OpenPCDet Development Team},
-  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
+    howpublished = {\url{https://github.com/open-mmlab/OpenPCDet}},
-  pages={10529--10538},
    year={2020}
 }
+```
+## Contribution
+You are welcome to join the OpenPCDet development team by contributing to this repo. Feel free to contact us about any potential contributions. 
-@article{shi2020points,
-  title={From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network},
-  author={Shi, Shaoshuai and Wang, Zhe and Shi, Jianping and Wang, Xiaogang and Li, Hongsheng},
-  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
-  year={2020},
-  publisher={IEEE}
-}
-@inproceedings{shi2019pointrcnn,
-  title={PointRCNN: 3d Object Progposal Generation and Detection from Point Cloud},
-  author={Shi, Shaoshuai and Wang, Xiaogang and Li, Hongsheng},
-  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
-  pages={770--779},
-  year={2019}
-}
-```
-## Contact
-This project is currently maintained by Shaoshuai Shi ([@sshaoshuai](http://github.com/sshaoshuai)) and Chaoxu Guo ([@Gus-Guo](https://github.com/Gus-Guo)).
--- a/data/waymo/ImageSets/train.txt
+++ b/data/waymo/ImageSets/train.txt
--- a/data/waymo/ImageSets/val.txt
+++ b/data/waymo/ImageSets/val.txt
+segment-10203656353524179475_7625_000_7645_000_with_camera_labels.tfrecord
+segment-1024360143612057520_3580_000_3600_000_with_camera_labels.tfrecord
+segment-10247954040621004675_2180_000_2200_000_with_camera_labels.tfrecord
+segment-10289507859301986274_4200_000_4220_000_with_camera_labels.tfrecord
+segment-10335539493577748957_1372_870_1392_870_with_camera_labels.tfrecord
+segment-10359308928573410754_720_000_740_000_with_camera_labels.tfrecord
+segment-10448102132863604198_472_000_492_000_with_camera_labels.tfrecord
+segment-10689101165701914459_2072_300_2092_300_with_camera_labels.tfrecord
+segment-1071392229495085036_1844_790_1864_790_with_camera_labels.tfrecord
+segment-10837554759555844344_6525_000_6545_000_with_camera_labels.tfrecord
+segment-10868756386479184868_3000_000_3020_000_with_camera_labels.tfrecord
+segment-11037651371539287009_77_670_97_670_with_camera_labels.tfrecord
+segment-11048712972908676520_545_000_565_000_with_camera_labels.tfrecord
+segment-1105338229944737854_1280_000_1300_000_with_camera_labels.tfrecord
+segment-11356601648124485814_409_000_429_000_with_camera_labels.tfrecord
+segment-11387395026864348975_3820_000_3840_000_with_camera_labels.tfrecord
+segment-11406166561185637285_1753_750_1773_750_with_camera_labels.tfrecord
+segment-11434627589960744626_4829_660_4849_660_with_camera_labels.tfrecord
+segment-11450298750351730790_1431_750_1451_750_with_camera_labels.tfrecord
+segment-11616035176233595745_3548_820_3568_820_with_camera_labels.tfrecord
+segment-11660186733224028707_420_000_440_000_with_camera_labels.tfrecord
+segment-11901761444769610243_556_000_576_000_with_camera_labels.tfrecord
+segment-12102100359426069856_3931_470_3951_470_with_camera_labels.tfrecord
+segment-12134738431513647889_3118_000_3138_000_with_camera_labels.tfrecord
+segment-12306251798468767010_560_000_580_000_with_camera_labels.tfrecord
+segment-12358364923781697038_2232_990_2252_990_with_camera_labels.tfrecord
+segment-12374656037744638388_1412_711_1432_711_with_camera_labels.tfrecord
+segment-12496433400137459534_120_000_140_000_with_camera_labels.tfrecord
+segment-12657584952502228282_3940_000_3960_000_with_camera_labels.tfrecord
+segment-12820461091157089924_5202_916_5222_916_with_camera_labels.tfrecord
+segment-12831741023324393102_2673_230_2693_230_with_camera_labels.tfrecord
+segment-12866817684252793621_480_000_500_000_with_camera_labels.tfrecord
+segment-12940710315541930162_2660_000_2680_000_with_camera_labels.tfrecord
+segment-13178092897340078601_5118_604_5138_604_with_camera_labels.tfrecord
+segment-13184115878756336167_1354_000_1374_000_with_camera_labels.tfrecord
+segment-13299463771883949918_4240_000_4260_000_with_camera_labels.tfrecord
+segment-1331771191699435763_440_000_460_000_with_camera_labels.tfrecord
+segment-13336883034283882790_7100_000_7120_000_with_camera_labels.tfrecord
+segment-13356997604177841771_3360_000_3380_000_with_camera_labels.tfrecord
+segment-13415985003725220451_6163_000_6183_000_with_camera_labels.tfrecord
+segment-13469905891836363794_4429_660_4449_660_with_camera_labels.tfrecord
+segment-13573359675885893802_1985_970_2005_970_with_camera_labels.tfrecord
+segment-13694146168933185611_800_000_820_000_with_camera_labels.tfrecord
+segment-13941626351027979229_3363_930_3383_930_with_camera_labels.tfrecord
+segment-13982731384839979987_1680_000_1700_000_with_camera_labels.tfrecord
+segment-1405149198253600237_160_000_180_000_with_camera_labels.tfrecord
+segment-14081240615915270380_4399_000_4419_000_with_camera_labels.tfrecord
+segment-14107757919671295130_3546_370_3566_370_with_camera_labels.tfrecord
+segment-14127943473592757944_2068_000_2088_000_with_camera_labels.tfrecord
+segment-14165166478774180053_1786_000_1806_000_with_camera_labels.tfrecord
+segment-14244512075981557183_1226_840_1246_840_with_camera_labels.tfrecord
+segment-14262448332225315249_1280_000_1300_000_with_camera_labels.tfrecord
+segment-14300007604205869133_1160_000_1180_000_with_camera_labels.tfrecord
+segment-14333744981238305769_5658_260_5678_260_with_camera_labels.tfrecord
+segment-14383152291533557785_240_000_260_000_with_camera_labels.tfrecord
+segment-14486517341017504003_3406_349_3426_349_with_camera_labels.tfrecord
+segment-1457696187335927618_595_027_615_027_with_camera_labels.tfrecord
+segment-14624061243736004421_1840_000_1860_000_with_camera_labels.tfrecord
+segment-1464917900451858484_1960_000_1980_000_with_camera_labels.tfrecord
+segment-14663356589561275673_935_195_955_195_with_camera_labels.tfrecord
+segment-14687328292438466674_892_000_912_000_with_camera_labels.tfrecord
+segment-14739149465358076158_4740_000_4760_000_with_camera_labels.tfrecord
+segment-14811410906788672189_373_113_393_113_with_camera_labels.tfrecord
+segment-14931160836268555821_5778_870_5798_870_with_camera_labels.tfrecord
+segment-14956919859981065721_1759_980_1779_980_with_camera_labels.tfrecord
+segment-15021599536622641101_556_150_576_150_with_camera_labels.tfrecord
+segment-15028688279822984888_1560_000_1580_000_with_camera_labels.tfrecord
+segment-1505698981571943321_1186_773_1206_773_with_camera_labels.tfrecord
+segment-15096340672898807711_3765_000_3785_000_with_camera_labels.tfrecord
+segment-15224741240438106736_960_000_980_000_with_camera_labels.tfrecord
+segment-15396462829361334065_4265_000_4285_000_with_camera_labels.tfrecord
+segment-15488266120477489949_3162_920_3182_920_with_camera_labels.tfrecord
+segment-15496233046893489569_4551_550_4571_550_with_camera_labels.tfrecord
+segment-15611747084548773814_3740_000_3760_000_with_camera_labels.tfrecord
+segment-15724298772299989727_5386_410_5406_410_with_camera_labels.tfrecord
+segment-15948509588157321530_7187_290_7207_290_with_camera_labels.tfrecord
+segment-15959580576639476066_5087_580_5107_580_with_camera_labels.tfrecord
+segment-16204463896543764114_5340_000_5360_000_with_camera_labels.tfrecord
+segment-16213317953898915772_1597_170_1617_170_with_camera_labels.tfrecord
+segment-16229547658178627464_380_000_400_000_with_camera_labels.tfrecord
+segment-16751706457322889693_4475_240_4495_240_with_camera_labels.tfrecord
+segment-16767575238225610271_5185_000_5205_000_with_camera_labels.tfrecord
+segment-16979882728032305374_2719_000_2739_000_with_camera_labels.tfrecord
+segment-17065833287841703_2980_000_3000_000_with_camera_labels.tfrecord
+segment-17135518413411879545_1480_000_1500_000_with_camera_labels.tfrecord
+segment-17136314889476348164_979_560_999_560_with_camera_labels.tfrecord
+segment-17152649515605309595_3440_000_3460_000_with_camera_labels.tfrecord
+segment-17244566492658384963_2540_000_2560_000_with_camera_labels.tfrecord
+segment-17344036177686610008_7852_160_7872_160_with_camera_labels.tfrecord
+segment-17539775446039009812_440_000_460_000_with_camera_labels.tfrecord
+segment-17612470202990834368_2800_000_2820_000_with_camera_labels.tfrecord
+segment-17626999143001784258_2760_000_2780_000_with_camera_labels.tfrecord
+segment-17694030326265859208_2340_000_2360_000_with_camera_labels.tfrecord
+segment-17703234244970638241_220_000_240_000_with_camera_labels.tfrecord
+segment-17763730878219536361_3144_635_3164_635_with_camera_labels.tfrecord
+segment-17791493328130181905_1480_000_1500_000_with_camera_labels.tfrecord
+segment-17860546506509760757_6040_000_6060_000_with_camera_labels.tfrecord
+segment-17962792089966876718_2210_933_2230_933_with_camera_labels.tfrecord
+segment-18024188333634186656_1566_600_1586_600_with_camera_labels.tfrecord
+segment-18045724074935084846_6615_900_6635_900_with_camera_labels.tfrecord
+segment-18252111882875503115_378_471_398_471_with_camera_labels.tfrecord
+segment-18305329035161925340_4466_730_4486_730_with_camera_labels.tfrecord
+segment-18331704533904883545_1560_000_1580_000_with_camera_labels.tfrecord
+segment-18333922070582247333_320_280_340_280_with_camera_labels.tfrecord
+segment-18446264979321894359_3700_000_3720_000_with_camera_labels.tfrecord
+segment-1906113358876584689_1359_560_1379_560_with_camera_labels.tfrecord
+segment-191862526745161106_1400_000_1420_000_with_camera_labels.tfrecord
+segment-1943605865180232897_680_000_700_000_with_camera_labels.tfrecord
+segment-2094681306939952000_2972_300_2992_300_with_camera_labels.tfrecord
+segment-2105808889850693535_2295_720_2315_720_with_camera_labels.tfrecord
+segment-2308204418431899833_3575_000_3595_000_with_camera_labels.tfrecord
+segment-2335854536382166371_2709_426_2729_426_with_camera_labels.tfrecord
+segment-2367305900055174138_1881_827_1901_827_with_camera_labels.tfrecord
+segment-2506799708748258165_6455_000_6475_000_with_camera_labels.tfrecord
+segment-2551868399007287341_3100_000_3120_000_with_camera_labels.tfrecord
+segment-260994483494315994_2797_545_2817_545_with_camera_labels.tfrecord
+segment-2624187140172428292_73_000_93_000_with_camera_labels.tfrecord
+segment-271338158136329280_2541_070_2561_070_with_camera_labels.tfrecord
+segment-272435602399417322_2884_130_2904_130_with_camera_labels.tfrecord
+segment-2736377008667623133_2676_410_2696_410_with_camera_labels.tfrecord
+segment-2834723872140855871_1615_000_1635_000_with_camera_labels.tfrecord
+segment-3015436519694987712_1300_000_1320_000_with_camera_labels.tfrecord
+segment-3039251927598134881_1240_610_1260_610_with_camera_labels.tfrecord
+segment-3077229433993844199_1080_000_1100_000_with_camera_labels.tfrecord
+segment-30779396576054160_1880_000_1900_000_with_camera_labels.tfrecord
+segment-3126522626440597519_806_440_826_440_with_camera_labels.tfrecord
+segment-346889320598157350_798_187_818_187_with_camera_labels.tfrecord
+segment-3577352947946244999_3980_000_4000_000_with_camera_labels.tfrecord
+segment-3651243243762122041_3920_000_3940_000_with_camera_labels.tfrecord
+segment-366934253670232570_2229_530_2249_530_with_camera_labels.tfrecord
+segment-3731719923709458059_1540_000_1560_000_with_camera_labels.tfrecord
+segment-3915587593663172342_10_000_30_000_with_camera_labels.tfrecord
+segment-4013125682946523088_3540_000_3560_000_with_camera_labels.tfrecord
+segment-4195774665746097799_7300_960_7320_960_with_camera_labels.tfrecord
+segment-4246537812751004276_1560_000_1580_000_with_camera_labels.tfrecord
+segment-4409585400955983988_3500_470_3520_470_with_camera_labels.tfrecord
+segment-4423389401016162461_4235_900_4255_900_with_camera_labels.tfrecord
+segment-4426410228514970291_1620_000_1640_000_with_camera_labels.tfrecord
+segment-447576862407975570_4360_000_4380_000_with_camera_labels.tfrecord
+segment-4490196167747784364_616_569_636_569_with_camera_labels.tfrecord
+segment-4575389405178805994_4900_000_4920_000_with_camera_labels.tfrecord
+segment-4612525129938501780_340_000_360_000_with_camera_labels.tfrecord
+segment-4690718861228194910_1980_000_2000_000_with_camera_labels.tfrecord
+segment-4759225533437988401_800_000_820_000_with_camera_labels.tfrecord
+segment-4764167778917495793_860_000_880_000_with_camera_labels.tfrecord
+segment-4816728784073043251_5273_410_5293_410_with_camera_labels.tfrecord
+segment-4854173791890687260_2880_000_2900_000_with_camera_labels.tfrecord
+segment-5183174891274719570_3464_030_3484_030_with_camera_labels.tfrecord
+segment-5289247502039512990_2640_000_2660_000_with_camera_labels.tfrecord
+segment-5302885587058866068_320_000_340_000_with_camera_labels.tfrecord
+segment-5372281728627437618_2005_000_2025_000_with_camera_labels.tfrecord
+segment-5373876050695013404_3817_170_3837_170_with_camera_labels.tfrecord
+segment-5574146396199253121_6759_360_6779_360_with_camera_labels.tfrecord
+segment-5772016415301528777_1400_000_1420_000_with_camera_labels.tfrecord
+segment-5832416115092350434_60_000_80_000_with_camera_labels.tfrecord
+segment-5847910688643719375_180_000_200_000_with_camera_labels.tfrecord
+segment-5990032395956045002_6600_000_6620_000_with_camera_labels.tfrecord
+segment-6001094526418694294_4609_470_4629_470_with_camera_labels.tfrecord
+segment-6074871217133456543_1000_000_1020_000_with_camera_labels.tfrecord
+segment-6161542573106757148_585_030_605_030_with_camera_labels.tfrecord
+segment-6183008573786657189_5414_000_5434_000_with_camera_labels.tfrecord
+segment-6324079979569135086_2372_300_2392_300_with_camera_labels.tfrecord
+segment-6491418762940479413_6520_000_6540_000_with_camera_labels.tfrecord
+segment-662188686397364823_3248_800_3268_800_with_camera_labels.tfrecord
+segment-6637600600814023975_2235_000_2255_000_with_camera_labels.tfrecord
+segment-6680764940003341232_2260_000_2280_000_with_camera_labels.tfrecord
+segment-6707256092020422936_2352_392_2372_392_with_camera_labels.tfrecord
+segment-7119831293178745002_1094_720_1114_720_with_camera_labels.tfrecord
+segment-7163140554846378423_2717_820_2737_820_with_camera_labels.tfrecord
+segment-7253952751374634065_1100_000_1120_000_with_camera_labels.tfrecord
+segment-7493781117404461396_2140_000_2160_000_with_camera_labels.tfrecord
+segment-7650923902987369309_2380_000_2400_000_with_camera_labels.tfrecord
+segment-7732779227944176527_2120_000_2140_000_with_camera_labels.tfrecord
+segment-7799643635310185714_680_000_700_000_with_camera_labels.tfrecord
+segment-7932945205197754811_780_000_800_000_with_camera_labels.tfrecord
+segment-7988627150403732100_1487_540_1507_540_with_camera_labels.tfrecord
+segment-8079607115087394458_1240_000_1260_000_with_camera_labels.tfrecord
+segment-8133434654699693993_1162_020_1182_020_with_camera_labels.tfrecord
+segment-8137195482049459160_3100_000_3120_000_with_camera_labels.tfrecord
+segment-8302000153252334863_6020_000_6040_000_with_camera_labels.tfrecord
+segment-8331804655557290264_4351_740_4371_740_with_camera_labels.tfrecord
+segment-8398516118967750070_3958_000_3978_000_with_camera_labels.tfrecord
+segment-8506432817378693815_4860_000_4880_000_with_camera_labels.tfrecord
+segment-8679184381783013073_7740_000_7760_000_with_camera_labels.tfrecord
+segment-8845277173853189216_3828_530_3848_530_with_camera_labels.tfrecord
+segment-8888517708810165484_1549_770_1569_770_with_camera_labels.tfrecord
+segment-8907419590259234067_1960_000_1980_000_with_camera_labels.tfrecord
+segment-89454214745557131_3160_000_3180_000_with_camera_labels.tfrecord
+segment-8956556778987472864_3404_790_3424_790_with_camera_labels.tfrecord
+segment-902001779062034993_2880_000_2900_000_with_camera_labels.tfrecord
+segment-9024872035982010942_2578_810_2598_810_with_camera_labels.tfrecord
+segment-9041488218266405018_6454_030_6474_030_with_camera_labels.tfrecord
+segment-9114112687541091312_1100_000_1120_000_with_camera_labels.tfrecord
+segment-9164052963393400298_4692_970_4712_970_with_camera_labels.tfrecord
+segment-9231652062943496183_1740_000_1760_000_with_camera_labels.tfrecord
+segment-9243656068381062947_1297_428_1317_428_with_camera_labels.tfrecord
+segment-9265793588137545201_2981_960_3001_960_with_camera_labels.tfrecord
+segment-933621182106051783_4160_000_4180_000_with_camera_labels.tfrecord
+segment-9443948810903981522_6538_870_6558_870_with_camera_labels.tfrecord
+segment-9472420603764812147_850_000_870_000_with_camera_labels.tfrecord
+segment-9579041874842301407_1300_000_1320_000_with_camera_labels.tfrecord
+segment-967082162553397800_5102_900_5122_900_with_camera_labels.tfrecord
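The dataset class in this commit turns each line of these split files into a sequence name by stripping whitespace and dropping the file extension. A minimal sketch of that step, assuming one tfrecord filename per line; `sequence_names_from_split` is a hypothetical helper name, not a function in the repository:

```python
import os


def sequence_names_from_split(lines):
    """Map split-file lines to sequence names (filename without extension)."""
    names = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        names.append(os.path.splitext(entry)[0])
    return names


example = ["segment-967082162553397800_5102_900_5122_900_with_camera_labels.tfrecord\n"]
names = sequence_names_from_split(example)
# -> ['segment-967082162553397800_5102_900_5122_900_with_camera_labels']
```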
--- a/docs/DEMO.md
+++ b/docs/DEMO.md
@@ -23,7 +23,7 @@ y-axis points towards to the left direction, and z-axis points towards to the to
   ...
   # Save it to the file. 
-   # The shape of points should be (num_points, 4), that is [x, y, z, intensity],  
+   # The shape of points should be (num_points, 4), that is [x, y, z, intensity] (Only for KITTI dataset).  
   # If you don't have the intensity information, just set them to zeros. 
   # If you have the intensity information, you should normalize them to [0, 1].
   points[:, 3] = 0 
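The point format described in this hunk can be sketched as a small helper. This is an illustration only: `make_demo_points` and its `max_intensity` parameter are hypothetical names, and dividing by a known maximum is just one possible way to normalize intensity to [0, 1].

```python
import numpy as np


def make_demo_points(xyz, intensity=None, max_intensity=1.0):
    """Build a (num_points, 4) float32 array laid out as [x, y, z, intensity].

    If no intensity is given, the fourth column stays zero. Otherwise the raw
    values are divided by `max_intensity` (an assumed sensor maximum) and
    clipped to [0, 1].
    """
    xyz = np.asarray(xyz, dtype=np.float32).reshape(-1, 3)
    points = np.zeros((xyz.shape[0], 4), dtype=np.float32)
    points[:, :3] = xyz
    if intensity is not None:
        scaled = np.asarray(intensity, dtype=np.float32).reshape(-1) / max_intensity
        points[:, 3] = np.clip(scaled, 0.0, 1.0)
    return points
```

The resulting array could then be written to disk with `numpy.save` before running the demo.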

--- a/docs/GETTING_STARTED.md
+++ b/docs/GETTING_STARTED.md
@@ -57,6 +57,43 @@ python -m pcdet.datasets.nuscenes.nuscenes_dataset --func create_nuscenes_infos
    --version v1.0-trainval
 ```
+### Waymo Open Dataset
+* Please download the official [Waymo Open Dataset](https://waymo.com/open/download/), 
+including the training data `training_0000.tar~training_0031.tar` and the validation 
+data `validation_0000.tar~validation_0007.tar`.
+* Extract all the above `xxxx.tar` files to the directory `data/waymo/raw_data` as follows (you should get 798 *train* tfrecords and 202 *val* tfrecords):  
+```
+OpenPCDet
+├── data
+│   ├── waymo
+│   │   │── ImageSets
+│   │   │── raw_data
+│   │   │   │── segment-xxxxxxxx.tfrecord
+│   │   │   │── ...
+│   │   │── waymo_processed_data
+│   │   │   │── segment-xxxxxxxx/
+│   │   │   │── ...
+│   │   │── pcdet_gt_database_train_sampled_xx/
+│   │   │── pcdet_waymo_dbinfos_train_sampled_xx.pkl   
+├── pcdet
+├── tools
+```
+* Install the official `waymo-open-dataset` by running the following command: 
+```shell script
+pip3 install --upgrade pip
+# tf 2.0.0
+pip3 install waymo-open-dataset-tf-2-0-0==1.2.0 --user
+```
+* Extract point cloud data from tfrecord and generate data infos by running the following command (it takes several hours, 
+and you can check `data/waymo/waymo_processed_data` to see how many records have been processed): 
+```shell script
+python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos \
+    --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml
+```
+Note that you do not need to install `waymo-open-dataset` if you have already processed the data before and do not need to evaluate with official Waymo Metrics. 
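Before starting the multi-hour preprocessing step, it may help to confirm that the extracted tfrecords match the split files. The following is a minimal sketch, assuming the directory layout shown above and exact filename matches; `count_split_records` is a hypothetical helper, and it does not handle the `_with_camera_labels` name variants that the dataset class tolerates.

```python
from pathlib import Path


def count_split_records(raw_data_dir, split_file):
    """Return (found, expected) tfrecord counts for one split file."""
    lines = Path(split_file).read_text().splitlines()
    wanted = [ln.strip() for ln in lines if ln.strip()]
    present = {p.name for p in Path(raw_data_dir).glob('*.tfrecord')}
    found = sum(1 for name in wanted if name in present)
    return found, len(wanted)
```

Called as `count_split_records('data/waymo/raw_data', 'data/waymo/ImageSets/val.txt')`, it would be expected to return `(202, 202)` once all validation archives are extracted, given the counts stated above.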
 ## Training & Testing

--- a/pcdet/datasets/__init__.py
+++ b/pcdet/datasets/__init__.py
@@ -7,11 +7,13 @@ from pcdet.utils import common_utils
 from .dataset import DatasetTemplate
 from .kitti.kitti_dataset import KittiDataset
 from .nuscenes.nuscenes_dataset import NuScenesDataset
+from .waymo.waymo_dataset import WaymoDataset
 __all__ = {
    'DatasetTemplate': DatasetTemplate,
    'KittiDataset': KittiDataset,
-    'NuScenesDataset': NuScenesDataset
+    'NuScenesDataset': NuScenesDataset,
+    'WaymoDataset': WaymoDataset
 }
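The `__all__` dictionary above acts as a name-to-class registry, so registering `WaymoDataset` presumably lets a config select it by name. A minimal sketch of that pattern follows; the classes here are empty stand-ins and `build_dataset_by_name` is a hypothetical helper, not the actual OpenPCDet builder.

```python
class DatasetTemplate:
    """Stand-in base class for illustration only."""


class KittiDataset(DatasetTemplate):
    pass


class WaymoDataset(DatasetTemplate):
    pass


# Name-to-class registry, mirroring the shape of `__all__` in this hunk.
DATASET_REGISTRY = {
    'DatasetTemplate': DatasetTemplate,
    'KittiDataset': KittiDataset,
    'WaymoDataset': WaymoDataset,
}


def build_dataset_by_name(name, **kwargs):
    """Instantiate the dataset class registered under `name`."""
    try:
        dataset_cls = DATASET_REGISTRY[name]
    except KeyError:
        raise ValueError('Unknown dataset: %s' % name)
    return dataset_cls(**kwargs)


dataset = build_dataset_by_name('WaymoDataset')
```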

--- a/pcdet/datasets/waymo/waymo_dataset.py
+++ b/pcdet/datasets/waymo/waymo_dataset.py
+# OpenPCDet PyTorch Dataloader and Evaluation Tools for Waymo Open Dataset
+# Reference https://github.com/open-mmlab/OpenPCDet
+# Written by Shaoshuai Shi, Chaoxu Guo
+# All Rights Reserved 2019-2020.
+import os
+import pickle
+import copy
+import numpy as np
+import torch
+from pathlib import Path
+from ...ops.roiaware_pool3d import roiaware_pool3d_utils
+from ...utils import box_utils, common_utils
+from ..dataset import DatasetTemplate
+class WaymoDataset(DatasetTemplate):
+    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None):
+        super().__init__(
+            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
+        )
+        self.data_path = self.root_path / self.dataset_cfg.PROCESSED_DATA_TAG
+        self.split = self.dataset_cfg.DATA_SPLIT[self.mode]
+        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')
+        with open(split_dir) as f:
+            self.sample_sequence_list = [x.strip() for x in f.readlines()]
+        self.infos = []
+        self.include_waymo_data(self.mode)
+    def set_split(self, split):
+        super().__init__(
+            dataset_cfg=self.dataset_cfg, class_names=self.class_names, training=self.training,
+            root_path=self.root_path, logger=self.logger
+        )
+        self.split = split
+        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')
+        with open(split_dir) as f:
+            self.sample_sequence_list = [x.strip() for x in f.readlines()]
+        self.infos = []
+        self.include_waymo_data(self.mode)
+    def include_waymo_data(self, mode):
+        self.logger.info('Loading Waymo dataset')
+        waymo_infos = []
+        num_skipped_infos = 0
+        for k in range(len(self.sample_sequence_list)):
+            sequence_name = os.path.splitext(self.sample_sequence_list[k])[0]
+            info_path = self.data_path / sequence_name / ('%s.pkl' % sequence_name)
+            info_path = self.check_sequence_name_with_all_version(info_path)
+            if not info_path.exists():
+                num_skipped_infos += 1
+                continue
+            with open(info_path, 'rb') as f:
+                infos = pickle.load(f)
+                waymo_infos.extend(infos)
+        self.infos.extend(waymo_infos[:])
+        self.logger.info('Total skipped info %s' % num_skipped_infos)
+        self.logger.info('Total samples for Waymo dataset: %d' % (len(waymo_infos)))
+        if self.dataset_cfg.SAMPLED_INTERVAL[mode] > 1:
+            sampled_waymo_infos = []
+            for k in range(0, len(self.infos), self.dataset_cfg.SAMPLED_INTERVAL[mode]):
+                sampled_waymo_infos.append(self.infos[k])
+            self.infos = sampled_waymo_infos
+            self.logger.info('Total sampled samples for Waymo dataset: %d' % len(self.infos))
+    @staticmethod
+    def check_sequence_name_with_all_version(sequence_file):
+        if '_with_camera_labels' not in str(sequence_file) and not sequence_file.exists():
+            sequence_file = Path(str(sequence_file)[:-9] + '_with_camera_labels.tfrecord')
+        if '_with_camera_labels' in str(sequence_file) and not sequence_file.exists():
+            sequence_file = Path(str(sequence_file).replace('_with_camera_labels', ''))
+        return sequence_file
+    def get_infos(self, raw_data_path, save_path, num_workers=4, has_label=True, sampled_interval=1):
+        import concurrent.futures as futures
+        from functools import partial
+        from . import waymo_utils
+        print('---------------The waymo sample interval is %d, total sequences is %d-----------------'
+              % (sampled_interval, len(self.sample_sequence_list)))
+        process_single_sequence = partial(
+            waymo_utils.process_single_sequence,
+            save_path=save_path, sampled_interval=sampled_interval, has_label=has_label
+        )
+        sample_sequence_file_list = [
+            self.check_sequence_name_with_all_version(raw_data_path / sequence_file)
+            for sequence_file in self.sample_sequence_list
+        ]
+        # process_single_sequence(sample_sequence_file_list[0])
+        with futures.ThreadPoolExecutor(num_workers) as executor:
+            sequence_infos = executor.map(process_single_sequence, sample_sequence_file_list)
+        sequence_infos = list(sequence_infos)
+        all_sequences_infos = [item for infos in sequence_infos for item in infos]
+        return all_sequences_infos
+    def get_lidar(self, sequence_name, sample_idx):
+        lidar_file = self.data_path / sequence_name / ('%04d.npy' % sample_idx)
+        point_features = np.load(lidar_file)  # (N, 6): [x, y, z, intensity, elongation, NLZ_flag]
+        points_all, NLZ_flag = point_features[:, 0:5], point_features[:, 5]
+        points_all = points_all[NLZ_flag == -1]
+        points_all[:, 3] = np.tanh(points_all[:, 3])
+        return points_all
+    def __len__(self):
+        if self._merge_all_iters_to_one_epoch:
+            return len(self.infos) * self.total_epochs
+        return len(self.infos)
+    def __getitem__(self, index):
+        if self._merge_all_iters_to_one_epoch:
+            index = index % len(self.infos)
+        info = copy.deepcopy(self.infos[index])
+        pc_info = info['point_cloud']
+        sequence_name = pc_info['lidar_sequence']
+        sample_idx = pc_info['sample_idx']
+        points = self.get_lidar(sequence_name, sample_idx)
+        input_dict = {
+            'points': points,
+            'frame_id': info['frame_id'],
+        }
+        if 'annos' in info:
+            annos = info['annos']
+            annos = common_utils.drop_info_with_name(annos, name='unknown')
+            if self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False):
+                gt_boxes_lidar = box_utils.boxes3d_kitti_fakelidar_to_lidar(annos['gt_boxes_lidar'])
+            else:
+                gt_boxes_lidar = annos['gt_boxes_lidar']
+            input_dict.update({
+                'gt_names': annos['name'],
+                'gt_boxes': gt_boxes_lidar,
+                'num_points_in_gt': annos.get('num_points_in_gt', None)
+            })
+        data_dict = self.prepare_data(data_dict=input_dict)
+        data_dict['metadata'] = info.get('metadata', info['frame_id'])
+        data_dict.pop('num_points_in_gt', None)
+        return data_dict
+    @staticmethod
+    def generate_prediction_dicts(batch_dict, pred_dicts, class_names, output_path=None):
+        """
+        Args:
+            batch_dict:
+                frame_id:
+            pred_dicts: list of pred_dicts
+                pred_boxes: (N, 7), Tensor
+                pred_scores: (N), Tensor
+                pred_labels: (N), Tensor
+            class_names:
+            output_path:
+        Returns:
+        """
+        def get_template_prediction(num_samples):
+            ret_dict = {
+                'name': np.zeros(num_samples), 'score': np.zeros(num_samples),
+                'boxes_lidar': np.zeros([num_samples, 7])
+            }
+            return ret_dict
+        def generate_single_sample_dict(box_dict):
+            pred_scores = box_dict['pred_scores'].cpu().numpy()
+            pred_boxes = box_dict['pred_boxes'].cpu().numpy()
+            pred_labels = box_dict['pred_labels'].cpu().numpy()
+            pred_dict = get_template_prediction(pred_scores.shape[0])
+            if pred_scores.shape[0] == 0:
+                return pred_dict
+            pred_dict['name'] = np.array(class_names)[pred_labels - 1]
+            pred_dict['score'] = pred_scores
+            pred_dict['boxes_lidar'] = pred_boxes
+            return pred_dict
+        annos = []
+        for index, box_dict in enumerate(pred_dicts):
+            single_pred_dict = generate_single_sample_dict(box_dict)
+            single_pred_dict['frame_id'] = batch_dict['frame_id'][index]
+            single_pred_dict['metadata'] = batch_dict['metadata'][index]
+            annos.append(single_pred_dict)
+        return annos
+    def evaluation(self, det_annos, class_names, **kwargs):
+        if 'annos' not in self.infos[0].keys():
+            return 'No ground-truth boxes for evaluation', {}
+        def kitti_eval(eval_det_annos, eval_gt_annos):
+            from ..kitti.kitti_object_eval_python import eval as kitti_eval
+            from ..kitti import kitti_utils
+            map_name_to_kitti = {
+                'Vehicle': 'Car',
+                'Pedestrian': 'Pedestrian',
+                'Cyclist': 'Cyclist',
+                'Sign': 'Sign',
+                'Car': 'Car'
+            }
+            kitti_utils.transform_annotations_to_kitti_format(eval_det_annos, map_name_to_kitti=map_name_to_kitti)
+            kitti_utils.transform_annotations_to_kitti_format(
+                eval_gt_annos, map_name_to_kitti=map_name_to_kitti,
+                info_with_fakelidar=self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False)
+            )
+            kitti_class_names = [map_name_to_kitti[x] for x in class_names]
+            ap_result_str, ap_dict = kitti_eval.get_official_eval_result(
+                gt_annos=eval_gt_annos, dt_annos=eval_det_annos, current_classes=kitti_class_names
+            )
+            return ap_result_str, ap_dict
+        def waymo_eval(eval_det_annos, eval_gt_annos):
+            from .waymo_eval import OpenPCDetWaymoDetectionMetricsEstimator
+            evaluator = OpenPCDetWaymoDetectionMetricsEstimator()
+            ap_dict = evaluator.waymo_evaluation(
+                eval_det_annos, eval_gt_annos, class_name=class_names,
+                distance_thresh=1000, fake_gt_infos=self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False)
+            )
+            ap_result_str = '\n'
+            for key in ap_dict:
+                ap_dict[key] = ap_dict[key][0]
+                ap_result_str += '%s: %.4f \n' % (key, ap_dict[key])
+            return ap_result_str, ap_dict
+        eval_det_annos = copy.deepcopy(det_annos)
+        eval_gt_annos = [copy.deepcopy(info['annos']) for info in self.infos]
+        if kwargs['eval_metric'] == 'kitti':
+            ap_result_str, ap_dict = kitti_eval(eval_det_annos, eval_gt_annos)
+        elif kwargs['eval_metric'] == 'waymo':
+            ap_result_str, ap_dict = waymo_eval(eval_det_annos, eval_gt_annos)
+        else:
+            raise NotImplementedError('Unsupported eval_metric: %s' % kwargs['eval_metric'])
+        return ap_result_str, ap_dict
+    def create_groundtruth_database(self, info_path, save_path, used_classes=None, split='train', sampled_interval=10,
+                                    processed_data_tag=None):
+        database_save_path = save_path / ('pcdet_gt_database_%s_sampled_%d' % (split, sampled_interval))
+        db_info_save_path = save_path / ('pcdet_waymo_dbinfos_%s_sampled_%d.pkl' % (split, sampled_interval))
+        database_save_path.mkdir(parents=True, exist_ok=True)
+        all_db_infos = {}
+        with open(info_path, 'rb') as f:
+            infos = pickle.load(f)
+        for k in range(0, len(infos), sampled_interval):
+            print('gt_database sample: %d/%d' % (k + 1, len(infos)))
+            info = infos[k]
+            pc_info = info['point_cloud']
+            sequence_name = pc_info['lidar_sequence']
+            sample_idx = pc_info['sample_idx']
+            points = self.get_lidar(sequence_name, sample_idx)
+            annos = info['annos']
+            names = annos['name']
+            difficulty = annos['difficulty']
+            gt_boxes = annos['gt_boxes_lidar']
+            num_obj = gt_boxes.shape[0]
+            box_idxs_of_pts = roiaware_pool3d_utils.points_in_boxes_gpu(
+                torch.from_numpy(points[:, 0:3]).unsqueeze(dim=0).float().cuda(),
+                torch.from_numpy(gt_boxes[:, 0:7]).unsqueeze(dim=0).float().cuda()
+            ).long().squeeze(dim=0).cpu().numpy()
+            for i in range(num_obj):
+                filename = '%s_%04d_%s_%d.bin' % (sequence_name, sample_idx, names[i], i)
+                filepath = database_save_path / filename
+                gt_points = points[box_idxs_of_pts == i]
+                gt_points[:, :3] -= gt_boxes[i, :3]
+                if (used_classes is None) or names[i] in used_classes:
+                    with open(filepath, 'wb') as f:
+                        gt_points.tofile(f)
+                    db_path = str(filepath.relative_to(self.root_path))  # gt_database/xxxxx.bin
+                    db_info = {'name': names[i], 'path': db_path, 'sequence_name': sequence_name,
+                               'sample_idx': sample_idx, 'gt_idx': i, 'box3d_lidar': gt_boxes[i],
+                               'num_points_in_gt': gt_points.shape[0], 'difficulty': difficulty[i]}
+                    if names[i] in all_db_infos:
+                        all_db_infos[names[i]].append(db_info)
+                    else:
+                        all_db_infos[names[i]] = [db_info]
+        for k, v in all_db_infos.items():
+            print('Database %s: %d' % (k, len(v)))
+        with open(db_info_save_path, 'wb') as f:
+            pickle.dump(all_db_infos, f)
+def create_waymo_infos(dataset_cfg, class_names, data_path, save_path,
+                       raw_data_tag='raw_data', processed_data_tag='waymo_processed_data', workers=4):
+    dataset = WaymoDataset(
+        dataset_cfg=dataset_cfg, class_names=class_names, root_path=data_path,
+        training=False, logger=common_utils.create_logger()
+    )
+    train_split, val_split = 'train', 'val'
+    train_filename = save_path / ('waymo_infos_%s.pkl' % train_split)
+    val_filename = save_path / ('waymo_infos_%s.pkl' % val_split)
+    print('---------------Start to generate data infos---------------')
+    dataset.set_split(train_split)
+    waymo_infos_train = dataset.get_infos(
+        raw_data_path=data_path / raw_data_tag,
+        save_path=save_path / processed_data_tag, num_workers=workers, has_label=True,
+        sampled_interval=1
+    )
+    with open(train_filename, 'wb') as f:
+        pickle.dump(waymo_infos_train, f)
+    print('----------------Waymo info train file is saved to %s----------------' % train_filename)
+    dataset.set_split(val_split)
+    waymo_infos_val = dataset.get_infos(
+        raw_data_path=data_path / raw_data_tag,
+        save_path=save_path / processed_data_tag, num_workers=workers, has_label=True,
+        sampled_interval=1
+    )
+    with open(val_filename, 'wb') as f:
+        pickle.dump(waymo_infos_val, f)
+    print('----------------Waymo info val file is saved to %s----------------' % val_filename)
+    print('---------------Start create groundtruth database for data augmentation---------------')
+    dataset.set_split(train_split)
+    dataset.create_groundtruth_database(
+        info_path=train_filename, save_path=save_path, split='train', sampled_interval=10,
+        used_classes=['Vehicle', 'Pedestrian', 'Cyclist']
+    )
+    print('---------------Data preparation Done---------------')
+if __name__ == '__main__':
+    import argparse
+    parser = argparse.ArgumentParser(description='arg parser')
+    parser.add_argument('--cfg_file', type=str, default=None, help='specify the config of dataset')
+    parser.add_argument('--func', type=str, default='create_waymo_infos', help='')
+    args = parser.parse_args()
+    if args.func == 'create_waymo_infos':
+        import yaml
+        from easydict import EasyDict
+        with open(args.cfg_file) as f:
+            dataset_cfg = EasyDict(yaml.safe_load(f))
+        ROOT_DIR = (Path(__file__).resolve().parent / '../../../').resolve()
+        create_waymo_infos(
+            dataset_cfg=dataset_cfg,
+            class_names=['Vehicle', 'Pedestrian', 'Cyclist'],
+            data_path=ROOT_DIR / 'data' / 'waymo',
+            save_path=ROOT_DIR / 'data' / 'waymo',
+            raw_data_tag='raw_data',
+            processed_data_tag=dataset_cfg.PROCESSED_DATA_TAG
+        )
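Reviewer note on `generate_prediction_dicts` above: predicted labels are 1-based, so `np.array(class_names)[pred_labels - 1]` turns them into class names. A minimal NumPy sketch of that mapping follows; the helper name `labels_to_names` and the sample label values are made up for illustration and are not part of this commit.

```python
import numpy as np


def labels_to_names(pred_labels, class_names):
    """Map 1-based integer labels to class-name strings (hypothetical helper)."""
    pred_labels = np.asarray(pred_labels, dtype=np.int64)
    if pred_labels.size == 0:
        # Mirror the early return for frames without predictions.
        return np.array([], dtype=str)
    return np.array(class_names)[pred_labels - 1]


class_names = ['Vehicle', 'Pedestrian', 'Cyclist']
names = labels_to_names([1, 3, 2, 1], class_names)
print(names.tolist())
```

One thing to watch with this indexing: a label of 0 becomes index -1 and silently selects the last class, so the model is expected to emit labels starting at 1.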
--- a/pcdet/datasets/waymo/waymo_eval.py
+++ b/pcdet/datasets/waymo/waymo_eval.py
+# OpenPCDet PyTorch Dataloader and Evaluation Tools for Waymo Open Dataset
+# Reference https://github.com/open-mmlab/OpenPCDet
+# Written by Shaoshuai Shi, Chaoxu Guo
+# All Rights Reserved 2019-2020.
+import numpy as np
+import pickle
+import tensorflow as tf
+from google.protobuf import text_format
+from waymo_open_dataset.metrics.python import detection_metrics
+from waymo_open_dataset.protos import metrics_pb2
+import argparse
+tf.get_logger().setLevel('INFO')
+def limit_period(val, offset=0.5, period=np.pi):
+    return val - np.floor(val / period + offset) * period
+class OpenPCDetWaymoDetectionMetricsEstimator(tf.test.TestCase):
+    WAYMO_CLASSES = ['unknown', 'Vehicle', 'Pedestrian', 'Sign', 'Cyclist']
+    def generate_waymo_type_results(self, infos, class_names, is_gt=False, fake_gt_infos=True):
+        def boxes3d_kitti_fakelidar_to_lidar(boxes3d_lidar):
+            """
+            Args:
+                boxes3d_lidar: (N, 7) [x, y, z, w, l, h, r] in old (fake) LiDAR coordinates, z is bottom center; modified in place
+            Returns:
+                boxes3d_lidar: [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center
+            """
+            w, l, h, r = boxes3d_lidar[:, 3:4], boxes3d_lidar[:, 4:5], boxes3d_lidar[:, 5:6], boxes3d_lidar[:, 6:7]
+            boxes3d_lidar[:, 2] += h[:, 0] / 2
+            return np.concatenate([boxes3d_lidar[:, 0:3], l, w, h, -(r + np.pi / 2)], axis=-1)
+        frame_id, boxes3d, obj_type, score, overlap_nlz, difficulty = [], [], [], [], [], []
+        for frame_index, info in enumerate(infos):
+            if is_gt:
+                box_mask = np.array([n in class_names for n in info['name']], dtype=np.bool_)
+                if 'num_points_in_gt' in info:
+                    zero_difficulty_mask = info['difficulty'] == 0
+                    info['difficulty'][(info['num_points_in_gt'] > 5) & zero_difficulty_mask] = 1
+                    info['difficulty'][(info['num_points_in_gt'] <= 5) & zero_difficulty_mask] = 2
+                    nonzero_mask = info['num_points_in_gt'] > 0
+                    box_mask = box_mask & nonzero_mask
+                num_boxes = box_mask.sum()
+                box_name = info['name'][box_mask]
+                difficulty.append(info['difficulty'][box_mask])
+                score.append(np.ones(num_boxes))
+                if fake_gt_infos:
+                    info['gt_boxes_lidar'] = boxes3d_kitti_fakelidar_to_lidar(info['gt_boxes_lidar'])
+                boxes3d.append(info['gt_boxes_lidar'][box_mask])
+            else:
+                num_boxes = len(info['boxes_lidar'])
+                difficulty.append([0] * num_boxes)
+                score.append(info['score'])
+                boxes3d.append(np.array(info['boxes_lidar']))
+                box_name = info['name']
+            obj_type += [self.WAYMO_CLASSES.index(name) for name in box_name]
+            frame_id.append(np.array([frame_index] * num_boxes))
+            overlap_nlz.append(np.zeros(num_boxes))  # set zero currently
+        frame_id = np.concatenate(frame_id).reshape(-1).astype(np.int64)
+        boxes3d = np.concatenate(boxes3d, axis=0)
+        obj_type = np.array(obj_type).reshape(-1)
+        score = np.concatenate(score).reshape(-1)
+        overlap_nlz = np.concatenate(overlap_nlz).reshape(-1)
+        difficulty = np.concatenate(difficulty).reshape(-1).astype(np.int8)
+        boxes3d[:, -1] = limit_period(boxes3d[:, -1], offset=0.5, period=np.pi * 2)
+        return frame_id, boxes3d, obj_type, score, overlap_nlz, difficulty
+    def build_config(self):
+        config = metrics_pb2.Config()
+        config_text = """
+        breakdown_generator_ids: OBJECT_TYPE
+        difficulties {
+        levels:1
+        levels:2
+        }
+        matcher_type: TYPE_HUNGARIAN
+        iou_thresholds: 0.0
+        iou_thresholds: 0.7
+        iou_thresholds: 0.5
+        iou_thresholds: 0.5
+        iou_thresholds: 0.5
+        box_type: TYPE_3D
+        """
+        for x in range(0, 100):
+            config.score_cutoffs.append(x * 0.01)
+        config.score_cutoffs.append(1.0)
+        text_format.Merge(config_text, config)
+        return config
+    def build_graph(self, graph):
+        with graph.as_default():
+            self._pd_frame_id = tf.compat.v1.placeholder(dtype=tf.int64)
+            self._pd_bbox = tf.compat.v1.placeholder(dtype=tf.float32)
+            self._pd_type = tf.compat.v1.placeholder(dtype=tf.uint8)
+            self._pd_score = tf.compat.v1.placeholder(dtype=tf.float32)
+            self._pd_overlap_nlz = tf.compat.v1.placeholder(dtype=tf.bool)
+            self._gt_frame_id = tf.compat.v1.placeholder(dtype=tf.int64)
+            self._gt_bbox = tf.compat.v1.placeholder(dtype=tf.float32)
+            self._gt_type = tf.compat.v1.placeholder(dtype=tf.uint8)
+            self._gt_difficulty = tf.compat.v1.placeholder(dtype=tf.uint8)
+            metrics = detection_metrics.get_detection_metric_ops(
+                config=self.build_config(),
+                prediction_frame_id=self._pd_frame_id,
+                prediction_bbox=self._pd_bbox,
+                prediction_type=self._pd_type,
+                prediction_score=self._pd_score,
+                prediction_overlap_nlz=self._pd_overlap_nlz,
+                ground_truth_bbox=self._gt_bbox,
+                ground_truth_type=self._gt_type,
+                ground_truth_frame_id=self._gt_frame_id,
+                ground_truth_difficulty=self._gt_difficulty,
+            )
+            return metrics
+    def run_eval_ops(
+        self,
+        sess,
+        graph,
+        metrics,
+        prediction_frame_id,
+        prediction_bbox,
+        prediction_type,
+        prediction_score,
+        prediction_overlap_nlz,
+        ground_truth_frame_id,
+        ground_truth_bbox,
+        ground_truth_type,
+        ground_truth_difficulty,
+    ):
+        sess.run(
+            [tf.group([value[1] for value in metrics.values()])],
+            feed_dict={
+                self._pd_bbox: prediction_bbox,
+                self._pd_frame_id: prediction_frame_id,
+                self._pd_type: prediction_type,
+                self._pd_score: prediction_score,
+                self._pd_overlap_nlz: prediction_overlap_nlz,
+                self._gt_bbox: ground_truth_bbox,
+                self._gt_type: ground_truth_type,
+                self._gt_frame_id: ground_truth_frame_id,
+                self._gt_difficulty: ground_truth_difficulty,
+            },
+        )
+    def eval_value_ops(self, sess, graph, metrics):
+        return {item[0]: sess.run([item[1][0]]) for item in metrics.items()}
+    def mask_by_distance(self, distance_thresh, boxes_3d, *args):
+        mask = np.linalg.norm(boxes_3d[:, 0:2], axis=1) < distance_thresh + 0.5
+        boxes_3d = boxes_3d[mask]
+        ret_ans = [boxes_3d]
+        for arg in args:
+            ret_ans.append(arg[mask])
+        return tuple(ret_ans)
+    def waymo_evaluation(self, prediction_infos, gt_infos, class_name, distance_thresh=100, fake_gt_infos=True):
+        print('Start the waymo evaluation...')
+        assert len(prediction_infos) == len(gt_infos), '%d vs %d' % (len(prediction_infos), len(gt_infos))
+        tf.compat.v1.disable_eager_execution()
+        pd_frameid, pd_boxes3d, pd_type, pd_score, pd_overlap_nlz, _ = self.generate_waymo_type_results(
+            prediction_infos, class_name, is_gt=False
+        )
+        gt_frameid, gt_boxes3d, gt_type, gt_score, gt_overlap_nlz, gt_difficulty = self.generate_waymo_type_results(
+            gt_infos, class_name, is_gt=True, fake_gt_infos=fake_gt_infos
+        )
+        pd_boxes3d, pd_frameid, pd_type, pd_score, pd_overlap_nlz = self.mask_by_distance(
+            distance_thresh, pd_boxes3d, pd_frameid, pd_type, pd_score, pd_overlap_nlz
+        )
+        gt_boxes3d, gt_frameid, gt_type, gt_score, gt_difficulty = self.mask_by_distance(
+            distance_thresh, gt_boxes3d, gt_frameid, gt_type, gt_score, gt_difficulty
+        )
+        print('Number: (pd, %d) VS. (gt, %d)' % (len(pd_boxes3d), len(gt_boxes3d)))
+        print('Level 1: %d, Level 2: %d' % ((gt_difficulty == 1).sum(), (gt_difficulty == 2).sum()))
+        if pd_score.max() > 1:
+            # Scores above 1 are treated as raw logits and squashed with a sigmoid.
+            pd_score = 1 / (1 + np.exp(-pd_score))
+            print('Warning: Waymo evaluation only supports normalized scores; applied sigmoid to pred scores')
+        graph = tf.Graph()
+        metrics = self.build_graph(graph)
+        with self.test_session(graph=graph) as sess:
+            sess.run(tf.compat.v1.initializers.local_variables())
+            self.run_eval_ops(
+                sess, graph, metrics, pd_frameid, pd_boxes3d, pd_type, pd_score, pd_overlap_nlz,
+                gt_frameid, gt_boxes3d, gt_type, gt_difficulty,
+            )
+            with tf.compat.v1.variable_scope('detection_metrics', reuse=True):
+                aps = self.eval_value_ops(sess, graph, metrics)
+        return aps
+def main():
+    parser = argparse.ArgumentParser(description='arg parser')
+    parser.add_argument('--pred_infos', type=str, default=None, help='pickle file')
+    parser.add_argument('--gt_infos', type=str, default=None, help='pickle file')
+    parser.add_argument('--class_names', type=str, nargs='+', default=['Vehicle', 'Pedestrian', 'Cyclist'], help='')
+    parser.add_argument('--sampled_interval', type=int, default=5, help='sampled interval for GT sequences')
+    args = parser.parse_args()
+    with open(args.pred_infos, 'rb') as f:
+        pred_infos = pickle.load(f)
+    with open(args.gt_infos, 'rb') as f:
+        gt_infos = pickle.load(f)
+    print('Start to evaluate the waymo format results...')
+    evaluator = OpenPCDetWaymoDetectionMetricsEstimator()
+    gt_infos_dst = []
+    for idx in range(0, len(gt_infos), args.sampled_interval):
+        cur_info = gt_infos[idx]['annos']
+        cur_info['frame_id'] = gt_infos[idx]['frame_id']
+        gt_infos_dst.append(cur_info)
+    waymo_AP = evaluator.waymo_evaluation(
+        pred_infos, gt_infos_dst, class_name=args.class_names, distance_thresh=1000, fake_gt_infos=True
+    )
+    print(waymo_AP)
+if __name__ == '__main__':
+    main()
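Reviewer note on `limit_period` in the file above: with `offset=0.5` and `period=2 * np.pi`, as used at the end of `generate_waymo_type_results`, the formula wraps headings into the half-open interval [-pi, pi). The sketch below repeats the same formula on a few sample angles chosen for illustration.

```python
import numpy as np


def limit_period(val, offset=0.5, period=np.pi):
    # Same formula as in the diff: subtract whole periods so the result
    # lies in [-offset * period, (1 - offset) * period).
    return val - np.floor(val / period + offset) * period


# Sample headings in radians (illustrative values only).
headings = np.array([0.0, 1.5 * np.pi, -1.5 * np.pi, 2.0 * np.pi])
wrapped = limit_period(headings, offset=0.5, period=2 * np.pi)
print(wrapped)
```

Here 1.5π maps to -0.5π, -1.5π maps to 0.5π, and 2π maps back to 0.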
--- a/pcdet/datasets/waymo/waymo_utils.py
+++ b/pcdet/datasets/waymo/waymo_utils.py
+# OpenPCDet PyTorch Dataloader and Evaluation Tools for Waymo Open Dataset
+# Reference https://github.com/open-mmlab/OpenPCDet
+# Written by Shaoshuai Shi, Chaoxu Guo
+# All Rights Reserved 2019-2020.
+import os
+import pickle
+import numpy as np
+from ...utils import common_utils
+import tensorflow as tf
+from waymo_open_dataset.utils import frame_utils, transform_utils, range_image_utils
+from waymo_open_dataset import dataset_pb2
+try:
+    tf.enable_eager_execution()
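+except Exception:
+    # e.g. TF 2.x, where tf.enable_eager_execution is absent and eager mode is already the default
+    pass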
+WAYMO_CLASSES = ['unknown', 'Vehicle', 'Pedestrian', 'Sign', 'Cyclist']
+def generate_labels(frame):
+    obj_name, difficulty, dimensions, locations, heading_angles = [], [], [], [], []
+    tracking_difficulty, speeds, accelerations, obj_ids = [], [], [], []
+    laser_labels = frame.laser_labels
+    for i in range(len(laser_labels)):
+        box = laser_labels[i].box
+        class_ind = laser_labels[i].type
+        loc = [box.center_x, box.center_y, box.center_z]
+        heading_angles.append(box.heading)
+        obj_name.append(WAYMO_CLASSES[class_ind])
+        difficulty.append(laser_labels[i].detection_difficulty_level)
+        tracking_difficulty.append(laser_labels[i].tracking_difficulty_level)
+        dimensions.append([box.length, box.width, box.height])  # lwh in unified coordinate of OpenPCDet
+        locations.append(loc)
+        obj_ids.append(laser_labels[i].id)
+    annotations = {}
+    annotations['name'] = np.array(obj_name)
+    annotations['difficulty'] = np.array(difficulty)
+    annotations['dimensions'] = np.array(dimensions)
+    annotations['location'] = np.array(locations)
+    annotations['heading_angles'] = np.array(heading_angles)
+    annotations['obj_ids'] = np.array(obj_ids)
+    annotations['tracking_difficulty'] = np.array(tracking_difficulty)
+    annotations = common_utils.drop_info_with_name(annotations, name='unknown')
+    if len(annotations['name']) > 0:
+        gt_boxes_lidar = np.concatenate([
+            annotations['location'], annotations['dimensions'], annotations['heading_angles'][..., np.newaxis]],
+            axis=1
+        )
+    else:
+        gt_boxes_lidar = np.zeros((0, 7))
+    annotations['gt_boxes_lidar'] = gt_boxes_lidar
+    return annotations
+def convert_range_image_to_point_cloud(frame, range_images, camera_projections, range_image_top_pose, ri_index=0):
+    """
+    Modified from the codes of Waymo Open Dataset.
+    Convert range images to point cloud.
+    Args:
+        frame: open dataset frame
+        range_images: A dict of {laser_name, [range_image_first_return, range_image_second_return]}.
+        camera_projections: A dict of {laser_name,
+            [camera_projection_from_first_return, camera_projection_from_second_return]}.
+        range_image_top_pose: range image pixel pose for top lidar.
+        ri_index: 0 for the first return, 1 for the second return.
+    Returns:
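+        points: {[N, 3]} list of 3d lidar points of length 5 (number of lidars).
+        cp_points: {[N, 6]} list of camera projections of length 5 (number of lidars).
+        points_NLZ: {[N]} list of per-point no-label-zone flags of length 5 (number of lidars).
+        points_intensity: {[N]} list of per-point intensities of length 5 (number of lidars).
+        points_elongation: {[N]} list of per-point elongations of length 5 (number of lidars).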
+    """
+    calibrations = sorted(frame.context.laser_calibrations, key=lambda c: c.name)
+    points = []
+    cp_points = []
+    points_NLZ = []
+    points_intensity = []
+    points_elongation = []
+    frame_pose = tf.convert_to_tensor(np.reshape(np.array(frame.pose.transform), [4, 4]))
+    # [H, W, 6]
+    range_image_top_pose_tensor = tf.reshape(
+        tf.convert_to_tensor(range_image_top_pose.data), range_image_top_pose.shape.dims
+    )
+    # [H, W, 3, 3]
+    range_image_top_pose_tensor_rotation = transform_utils.get_rotation_matrix(
+        range_image_top_pose_tensor[..., 0], range_image_top_pose_tensor[..., 1],
+        range_image_top_pose_tensor[..., 2])
+    range_image_top_pose_tensor_translation = range_image_top_pose_tensor[..., 3:]
+    range_image_top_pose_tensor = transform_utils.get_transform(
+        range_image_top_pose_tensor_rotation,
+        range_image_top_pose_tensor_translation)
+    for c in calibrations:
+        range_image = range_images[c.name][ri_index]
+        if len(c.beam_inclinations) == 0:  # pylint: disable=g-explicit-length-test
+            beam_inclinations = range_image_utils.compute_inclination(
+                tf.constant([c.beam_inclination_min, c.beam_inclination_max]),
+                height=range_image.shape.dims[0])
+        else:
+            beam_inclinations = tf.constant(c.beam_inclinations)
+        beam_inclinations = tf.reverse(beam_inclinations, axis=[-1])
+        extrinsic = np.reshape(np.array(c.extrinsic.transform), [4, 4])
+        range_image_tensor = tf.reshape(
+            tf.convert_to_tensor(range_image.data), range_image.shape.dims)
+        pixel_pose_local = None
+        frame_pose_local = None
+        if c.name == dataset_pb2.LaserName.TOP:
+            pixel_pose_local = range_image_top_pose_tensor
+            pixel_pose_local = tf.expand_dims(pixel_pose_local, axis=0)
+            frame_pose_local = tf.expand_dims(frame_pose, axis=0)
+        range_image_mask = range_image_tensor[..., 0] > 0
+        range_image_NLZ = range_image_tensor[..., 3]
+        range_image_intensity = range_image_tensor[..., 1]
+        range_image_elongation = range_image_tensor[..., 2]
+        range_image_cartesian = range_image_utils.extract_point_cloud_from_range_image(
+            tf.expand_dims(range_image_tensor[..., 0], axis=0),
+            tf.expand_dims(extrinsic, axis=0),
+            tf.expand_dims(tf.convert_to_tensor(beam_inclinations), axis=0),
+            pixel_pose=pixel_pose_local,
+            frame_pose=frame_pose_local)
+        range_image_cartesian = tf.squeeze(range_image_cartesian, axis=0)
+        points_tensor = tf.gather_nd(range_image_cartesian,
+                                     tf.where(range_image_mask))
+        points_NLZ_tensor = tf.gather_nd(range_image_NLZ, tf.compat.v1.where(range_image_mask))
+        points_intensity_tensor = tf.gather_nd(range_image_intensity, tf.compat.v1.where(range_image_mask))
+        points_elongation_tensor = tf.gather_nd(range_image_elongation, tf.compat.v1.where(range_image_mask))
+        cp = camera_projections[c.name][0]
+        cp_tensor = tf.reshape(tf.convert_to_tensor(cp.data), cp.shape.dims)
+        cp_points_tensor = tf.gather_nd(cp_tensor, tf.where(range_image_mask))
+        points.append(points_tensor.numpy())
+        cp_points.append(cp_points_tensor.numpy())
+        points_NLZ.append(points_NLZ_tensor.numpy())
+        points_intensity.append(points_intensity_tensor.numpy())
+        points_elongation.append(points_elongation_tensor.numpy())
+    return points, cp_points, points_NLZ, points_intensity, points_elongation
+def save_lidar_points(frame, cur_save_path):
+    range_images, camera_projections, range_image_top_pose = \
+        frame_utils.parse_range_image_and_camera_projection(frame)
+    points, cp_points, points_in_NLZ_flag, points_intensity, points_elongation = \
+        convert_range_image_to_point_cloud(frame, range_images, camera_projections, range_image_top_pose)
+    # 3d points in vehicle frame.
+    points_all = np.concatenate(points, axis=0)
+    points_in_NLZ_flag = np.concatenate(points_in_NLZ_flag, axis=0).reshape(-1, 1)
+    points_intensity = np.concatenate(points_intensity, axis=0).reshape(-1, 1)
+    points_elongation = np.concatenate(points_elongation, axis=0).reshape(-1, 1)
+    num_points_of_each_lidar = [point.shape[0] for point in points]
+    save_points = np.concatenate([
+        points_all, points_intensity, points_elongation, points_in_NLZ_flag
+    ], axis=-1).astype(np.float32)
+    np.save(cur_save_path, save_points)
+    # print('saving to ', cur_save_path)
+    return num_points_of_each_lidar
+def process_single_sequence(sequence_file, save_path, sampled_interval, has_label=True):
+    sequence_name = os.path.splitext(os.path.basename(sequence_file))[0]
+    # print('Load record (sampled_interval=%d): %s' % (sampled_interval, sequence_name))
+    if not sequence_file.exists():
+        print('NotFoundError: %s' % sequence_file)
+        return []
+    dataset = tf.data.TFRecordDataset(str(sequence_file), compression_type='')
+    cur_save_dir = save_path / sequence_name
+    cur_save_dir.mkdir(parents=True, exist_ok=True)
+    pkl_file = cur_save_dir / ('%s.pkl' % sequence_name)
+    sequence_infos = []
+    if pkl_file.exists():
+        with open(pkl_file, 'rb') as f:
+            sequence_infos = pickle.load(f)
+        print('Skip sequence since it has been processed before: %s' % pkl_file)
+        return sequence_infos
+    for cnt, data in enumerate(dataset):
+        if cnt % sampled_interval != 0:
+            continue
+        # print(sequence_name, cnt)
+        frame = dataset_pb2.Frame()
+        frame.ParseFromString(bytearray(data.numpy()))
+        info = {}
+        pc_info = {'num_features': 5, 'lidar_sequence': sequence_name, 'sample_idx': cnt}
+        info['point_cloud'] = pc_info
+        info['frame_id'] = sequence_name + ('_%03d' % cnt)
+        image_info = {}
+        for j in range(5):
+            width = frame.context.camera_calibrations[j].width
+            height = frame.context.camera_calibrations[j].height
+            image_info.update({'image_shape_%d' % j: (height, width)})
+        info['image'] = image_info
+        pose = np.array(frame.pose.transform, dtype=np.float32).reshape(4, 4)
+        info['pose'] = pose
+        if has_label:
+            annotations = generate_labels(frame)
+            info['annos'] = annotations
+        num_points_of_each_lidar = save_lidar_points(frame, cur_save_dir / ('%04d.npy' % cnt))
+        info['num_points_of_each_lidar'] = num_points_of_each_lidar
+        sequence_infos.append(info)
+    with open(pkl_file, 'wb') as f:
+        pickle.dump(sequence_infos, f)
+    print('Infos are saved to (sampled_interval=%d): %s' % (sampled_interval, pkl_file))
+    return sequence_infos
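Reviewer note on `save_lidar_points` above: each frame is stored as one `(N, 6)` float32 array with columns x, y, z, intensity, elongation and the NLZ flag. The sketch below shows one way such a file could be read back and split; the helper `load_frame_points` and the choice to keep rows whose flag equals -1 are assumptions for illustration, not code taken from this commit.

```python
import os
import tempfile

import numpy as np


def load_frame_points(npy_path, keep_flag=-1.0):
    """Load an (N, 6) array and keep rows whose last column equals keep_flag.

    Hypothetical helper; keep_flag=-1 assumes that value marks points
    outside no-label zones.
    """
    point_features = np.load(npy_path)
    points, nlz_flag = point_features[:, 0:5], point_features[:, 5]
    return points[nlz_flag == keep_flag]


# Synthetic frame: 4 points, the second one flagged differently from the rest.
fake = np.zeros((4, 6), dtype=np.float32)
fake[:, 5] = [-1, 1, -1, -1]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, '0000.npy')
    np.save(path, fake)
    kept = load_frame_points(path)
print(kept.shape)
```

The five remaining columns match the `used_feature_list` of the dataset config (x, y, z, intensity, elongation).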
--- a/tools/cfgs/dataset_configs/waymo_dataset.yaml
+++ b/tools/cfgs/dataset_configs/waymo_dataset.yaml
+DATASET: 'WaymoDataset'
+DATA_PATH: '../data/waymo'
+PROCESSED_DATA_TAG: 'waymo_processed_data'
+POINT_CLOUD_RANGE: [-75.2, -75.2, -2, 75.2, 75.2, 4]
+DATA_SPLIT: {
+    'train': train,
+    'test': val
+}
+SAMPLED_INTERVAL: {
+    'train': 5,
+    'test': 5
+}
+DATA_AUGMENTOR:
+    DISABLE_AUG_LIST: ['placeholder']
+    AUG_CONFIG_LIST:
+        - NAME: gt_sampling
+          USE_ROAD_PLANE: False
+          DB_INFO_PATH:
+              - pcdet_waymo_dbinfos_train_sampled_10.pkl
+          PREPARE: {
+             filter_by_min_points: ['Vehicle:5', 'Pedestrian:5', 'Cyclist:5'],
+             filter_by_difficulty: [-1],
+          }
+          SAMPLE_GROUPS: ['Vehicle:15', 'Pedestrian:10', 'Cyclist:10']
+          NUM_POINT_FEATURES: 5
+          REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
+          LIMIT_WHOLE_SCENE: True
+        - NAME: random_world_flip
+          ALONG_AXIS_LIST: ['x', 'y']
+        - NAME: random_world_rotation
+          WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]
+        - NAME: random_world_scaling
+          WORLD_SCALE_RANGE: [0.95, 1.05]
+POINT_FEATURE_ENCODING: {
+    encoding_type: absolute_coordinates_encoding,
+    used_feature_list: ['x', 'y', 'z', 'intensity', 'elongation'],
+    src_feature_list: ['x', 'y', 'z', 'intensity', 'elongation'],
+}
+DATA_PROCESSOR:
+    - NAME: mask_points_and_boxes_outside_range
+      REMOVE_OUTSIDE_BOXES: True
+    - NAME: shuffle_points
+      SHUFFLE_ENABLED: {
+        'train': True,
+        'test': True
+      }
+    - NAME: transform_points_to_voxels
+      VOXEL_SIZE: [0.1, 0.1, 0.15]
+      MAX_POINTS_PER_VOXEL: 5
+      MAX_NUMBER_OF_VOXELS: {
+        'train': 80000,
+        'test': 90000
+      }
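Reviewer note on the dataset config above: `POINT_CLOUD_RANGE` and `VOXEL_SIZE` together fix the size of the voxel grid. A minimal sketch of that arithmetic, assuming the grid size is simply the range extent divided by the voxel size along each axis (the variable names are illustrative):

```python
import numpy as np

# Values copied from waymo_dataset.yaml.
point_cloud_range = np.array([-75.2, -75.2, -2, 75.2, 75.2, 4], dtype=np.float64)
voxel_size = np.array([0.1, 0.1, 0.15], dtype=np.float64)

# Number of voxels along x, y, z; rounding guards against float error.
extent = point_cloud_range[3:6] - point_cloud_range[0:3]
grid_size = np.round(extent / voxel_size).astype(np.int64)
print(grid_size.tolist())
```

With these values the grid is 1504 x 1504 x 40 voxels, which is why `MAX_NUMBER_OF_VOXELS` caps only a small fraction of the possible cells.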
--- a/tools/cfgs/waymo_models/PartA2.yaml
+++ b/tools/cfgs/waymo_models/PartA2.yaml
+CLASS_NAMES: ['Vehicle', 'Pedestrian', 'Cyclist']
+DATA_CONFIG:
+    _BASE_CONFIG_: cfgs/dataset_configs/waymo_dataset.yaml
+MODEL:
+    NAME: PartA2Net
+    VFE:
+        NAME: MeanVFE
+    BACKBONE_3D:
+        NAME: UNetV2
+    MAP_TO_BEV:
+        NAME: HeightCompression
+        NUM_BEV_FEATURES: 256
+    BACKBONE_2D:
+        NAME: BaseBEVBackbone
+        LAYER_NUMS: [5, 5]
+        LAYER_STRIDES: [1, 2]
+        NUM_FILTERS: [128, 256]
+        UPSAMPLE_STRIDES: [1, 2]
+        NUM_UPSAMPLE_FILTERS: [256, 256]
+    DENSE_HEAD:
+        NAME: AnchorHeadSingle
+        CLASS_AGNOSTIC: False
+        USE_DIRECTION_CLASSIFIER: True
+        DIR_OFFSET: 0.78539
+        DIR_LIMIT_OFFSET: 0.0
+        NUM_DIR_BINS: 2
+        ANCHOR_GENERATOR_CONFIG: [
+            {
+                'class_name': 'Vehicle',
+                'anchor_sizes': [[4.7, 2.1, 1.7]],
+                'anchor_rotations': [0, 1.57],
+                'anchor_bottom_heights': [0],
+                'align_center': False,
+                'feature_map_stride': 8,
+                'matched_threshold': 0.55,
+                'unmatched_threshold': 0.4
+            },
+            {
+                'class_name': 'Pedestrian',
+                'anchor_sizes': [[0.91, 0.86, 1.73]],
+                'anchor_rotations': [0, 1.57],
+                'anchor_bottom_heights': [0],
+                'align_center': False,
+                'feature_map_stride': 8,
+                'matched_threshold': 0.5,
+                'unmatched_threshold': 0.35
+            },
+            {
+                'class_name': 'Cyclist',
+                'anchor_sizes': [[1.78, 0.84, 1.78]],
+                'anchor_rotations': [0, 1.57],
+                'anchor_bottom_heights': [0],
+                'align_center': False,
+                'feature_map_stride': 8,
+                'matched_threshold': 0.5,
+                'unmatched_threshold': 0.35
+            }
+        ]
+        TARGET_ASSIGNER_CONFIG:
+            NAME: AxisAlignedTargetAssigner
+            POS_FRACTION: -1.0
+            SAMPLE_SIZE: 512
+            NORM_BY_NUM_EXAMPLES: False
+            MATCH_HEIGHT: False
+            BOX_CODER: ResidualCoder
+        LOSS_CONFIG:
+            LOSS_WEIGHTS: {
+                'cls_weight': 1.0,
+                'loc_weight': 2.0,
+                'dir_weight': 0.2,
+                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
+            }
+    POINT_HEAD:
+        NAME: PointIntraPartOffsetHead
+        CLS_FC: []
+        PART_FC: []
+        CLASS_AGNOSTIC: True
+        TARGET_CONFIG:
+            GT_EXTRA_WIDTH: [0.2, 0.2, 0.2]
+        LOSS_CONFIG:
+            LOSS_REG: smooth-l1
+            LOSS_WEIGHTS: {
+                'point_cls_weight': 1.0,
+                'point_part_weight': 1.0
+            }
+    ROI_HEAD:
+        NAME: PartA2FCHead
+        CLASS_AGNOSTIC: True
+        SHARED_FC: [256, 256, 256]
+        CLS_FC: [256, 256]
+        REG_FC: [256, 256]
+        DP_RATIO: 0.3
+        SEG_MASK_SCORE_THRESH: 0.3
+        NMS_CONFIG:
+            TRAIN:
+                NMS_TYPE: nms_gpu
+                MULTI_CLASSES_NMS: False
+                NMS_PRE_MAXSIZE: 9000
+                NMS_POST_MAXSIZE: 512
+                NMS_THRESH: 0.8
+            TEST:
+                NMS_TYPE: nms_gpu
+                MULTI_CLASSES_NMS: False
+                NMS_PRE_MAXSIZE: 1024
+                NMS_POST_MAXSIZE: 100
+                NMS_THRESH: 0.7
+        ROI_AWARE_POOL:
+            POOL_SIZE: 10
+            NUM_FEATURES: 128
+            MAX_POINTS_PER_VOXEL: 128
+        TARGET_CONFIG:
+            BOX_CODER: ResidualCoder
+            ROI_PER_IMAGE: 128
+            FG_RATIO: 0.5
+            SAMPLE_ROI_BY_EACH_CLASS: True
+            CLS_SCORE_TYPE: roi_iou
+            CLS_FG_THRESH: 0.75
+            CLS_BG_THRESH: 0.25
+            CLS_BG_THRESH_LO: 0.1
+            HARD_BG_RATIO: 0.8
+            REG_FG_THRESH: 0.65
+        LOSS_CONFIG:
+            CLS_LOSS: BinaryCrossEntropy
+            REG_LOSS: smooth-l1
+            CORNER_LOSS_REGULARIZATION: True
+            LOSS_WEIGHTS: {
+                'rcnn_cls_weight': 1.0,
+                'rcnn_reg_weight': 1.0,
+                'rcnn_corner_weight': 1.0,
+                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
+            }
+    POST_PROCESSING:
+        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
+        SCORE_THRESH: 0.1
+        OUTPUT_RAW_SCORE: False
+        EVAL_METRIC: waymo
+        NMS_CONFIG:
+            MULTI_CLASSES_NMS: False
+            NMS_TYPE: nms_gpu
+            NMS_THRESH: 0.1
+            NMS_PRE_MAXSIZE: 4096
+            NMS_POST_MAXSIZE: 500
+OPTIMIZATION:
+    BATCH_SIZE_PER_GPU: 3
+    NUM_EPOCHS: 30
+    OPTIMIZER: adam_onecycle
+    LR: 0.01
+    WEIGHT_DECAY: 0.01
+    MOMENTUM: 0.9
+    MOMS: [0.95, 0.85]
+    PCT_START: 0.4
+    DIV_FACTOR: 10
+    DECAY_STEP_LIST: [35, 45]
+    LR_DECAY: 0.1
+    LR_CLIP: 0.0000001
+    LR_WARMUP: False
+    WARMUP_EPOCH: 1
+    GRAD_NORM_CLIP: 10
\ No newline at end of file
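Review note on `ANCHOR_GENERATOR_CONFIG`: the same three-class anchor block appears in all three Waymo model configs. The sketch below counts the anchors placed at each BEV location and shows how the two IoU thresholds could split anchors into positive, negative and ignored sets. The labelling rule is an assumption about how `matched_threshold` and `unmatched_threshold` are typically applied; `anchors_per_location` and `label_anchor` are hypothetical helpers.

```python
# Sketch: anchors per BEV cell and IoU-based labelling.
# Only the fields needed here are copied from the diff above.
ANCHOR_CONFIG = [
    {"class_name": "Vehicle", "anchor_sizes": [[4.7, 2.1, 1.7]],
     "anchor_rotations": [0, 1.57], "anchor_bottom_heights": [0],
     "matched_threshold": 0.55, "unmatched_threshold": 0.4},
    {"class_name": "Pedestrian", "anchor_sizes": [[0.91, 0.86, 1.73]],
     "anchor_rotations": [0, 1.57], "anchor_bottom_heights": [0],
     "matched_threshold": 0.5, "unmatched_threshold": 0.35},
    {"class_name": "Cyclist", "anchor_sizes": [[1.78, 0.84, 1.78]],
     "anchor_rotations": [0, 1.57], "anchor_bottom_heights": [0],
     "matched_threshold": 0.5, "unmatched_threshold": 0.35},
]


def anchors_per_location(configs):
    """Count size x rotation x height combinations, summed over classes."""
    return sum(
        len(c["anchor_sizes"]) * len(c["anchor_rotations"]) * len(c["anchor_bottom_heights"])
        for c in configs
    )


def label_anchor(iou, cfg):
    """ASSUMPTION: >= matched is positive, < unmatched is negative, the rest is ignored."""
    if iou >= cfg["matched_threshold"]:
        return "positive"
    if iou < cfg["unmatched_threshold"]:
        return "negative"
    return "ignore"


print(anchors_per_location(ANCHOR_CONFIG))
for cfg in ANCHOR_CONFIG:
    print(cfg["class_name"], label_anchor(0.5, cfg))
```

The stricter `Vehicle` thresholds mean that, under this rule, an anchor with IoU 0.5 counts as a match for `Pedestrian` or `Cyclist` but falls into the ignored band for `Vehicle`.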
--- a/tools/cfgs/waymo_models/pv_rcnn.yaml
+++ b/tools/cfgs/waymo_models/pv_rcnn.yaml
+CLASS_NAMES: ['Vehicle', 'Pedestrian', 'Cyclist']
+DATA_CONFIG:
+    _BASE_CONFIG_: cfgs/dataset_configs/waymo_dataset.yaml
+MODEL:
+    NAME: PVRCNN
+    VFE:
+        NAME: MeanVFE
+    BACKBONE_3D:
+        NAME: VoxelBackBone8x
+    MAP_TO_BEV:
+        NAME: HeightCompression
+        NUM_BEV_FEATURES: 256
+    BACKBONE_2D:
+        NAME: BaseBEVBackbone
+        LAYER_NUMS: [5, 5]
+        LAYER_STRIDES: [1, 2]
+        NUM_FILTERS: [128, 256]
+        UPSAMPLE_STRIDES: [1, 2]
+        NUM_UPSAMPLE_FILTERS: [256, 256]
+    DENSE_HEAD:
+        NAME: AnchorHeadSingle
+        CLASS_AGNOSTIC: False
+        USE_DIRECTION_CLASSIFIER: True
+        DIR_OFFSET: 0.78539
+        DIR_LIMIT_OFFSET: 0.0
+        NUM_DIR_BINS: 2
+        ANCHOR_GENERATOR_CONFIG: [
+            {
+                'class_name': 'Vehicle',
+                'anchor_sizes': [[4.7, 2.1, 1.7]],
+                'anchor_rotations': [0, 1.57],
+                'anchor_bottom_heights': [0],
+                'align_center': False,
+                'feature_map_stride': 8,
+                'matched_threshold': 0.55,
+                'unmatched_threshold': 0.4
+            },
+            {
+                'class_name': 'Pedestrian',
+                'anchor_sizes': [[0.91, 0.86, 1.73]],
+                'anchor_rotations': [0, 1.57],
+                'anchor_bottom_heights': [0],
+                'align_center': False,
+                'feature_map_stride': 8,
+                'matched_threshold': 0.5,
+                'unmatched_threshold': 0.35
+            },
+            {
+                'class_name': 'Cyclist',
+                'anchor_sizes': [[1.78, 0.84, 1.78]],
+                'anchor_rotations': [0, 1.57],
+                'anchor_bottom_heights': [0],
+                'align_center': False,
+                'feature_map_stride': 8,
+                'matched_threshold': 0.5,
+                'unmatched_threshold': 0.35
+            }
+        ]
+        TARGET_ASSIGNER_CONFIG:
+            NAME: AxisAlignedTargetAssigner
+            POS_FRACTION: -1.0
+            SAMPLE_SIZE: 512
+            NORM_BY_NUM_EXAMPLES: False
+            MATCH_HEIGHT: False
+            BOX_CODER: ResidualCoder
+        LOSS_CONFIG:
+            LOSS_WEIGHTS: {
+                'cls_weight': 1.0,
+                'loc_weight': 2.0,
+                'dir_weight': 0.2,
+                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
+            }
+    PFE:
+        NAME: VoxelSetAbstraction
+        POINT_SOURCE: raw_points
+        NUM_KEYPOINTS: 4096
+        NUM_OUTPUT_FEATURES: 128
+        SAMPLE_METHOD: FPS
+        FEATURES_SOURCE: ['bev', 'x_conv3', 'x_conv4', 'raw_points']
+        SA_LAYER:
+            raw_points:
+                MLPS: [[16, 16], [16, 16]]
+                POOL_RADIUS: [0.4, 0.8]
+                NSAMPLE: [16, 16]
+            x_conv1:
+                DOWNSAMPLE_FACTOR: 1
+                MLPS: [[16, 16], [16, 16]]
+                POOL_RADIUS: [0.4, 0.8]
+                NSAMPLE: [16, 16]
+            x_conv2:
+                DOWNSAMPLE_FACTOR: 2
+                MLPS: [[32, 32], [32, 32]]
+                POOL_RADIUS: [0.8, 1.2]
+                NSAMPLE: [16, 32]
+            x_conv3:
+                DOWNSAMPLE_FACTOR: 4
+                MLPS: [[64, 64], [64, 64]]
+                POOL_RADIUS: [1.2, 2.4]
+                NSAMPLE: [16, 32]
+            x_conv4:
+                DOWNSAMPLE_FACTOR: 8
+                MLPS: [[64, 64], [64, 64]]
+                POOL_RADIUS: [2.4, 4.8]
+                NSAMPLE: [16, 32]
+    POINT_HEAD:
+        NAME: PointHeadSimple
+        CLS_FC: [256, 256]
+        CLASS_AGNOSTIC: True
+        USE_POINT_FEATURES_BEFORE_FUSION: True
+        TARGET_CONFIG:
+            GT_EXTRA_WIDTH: [0.2, 0.2, 0.2]
+        LOSS_CONFIG:
+            LOSS_REG: smooth-l1
+            LOSS_WEIGHTS: {
+                'point_cls_weight': 1.0,
+            }
+    ROI_HEAD:
+        NAME: PVRCNNHead
+        CLASS_AGNOSTIC: True
+        SHARED_FC: [256, 256]
+        CLS_FC: [256, 256]
+        REG_FC: [256, 256]
+        DP_RATIO: 0.3
+        NMS_CONFIG:
+            TRAIN:
+                NMS_TYPE: nms_gpu
+                MULTI_CLASSES_NMS: False
+                NMS_PRE_MAXSIZE: 9000
+                NMS_POST_MAXSIZE: 512
+                NMS_THRESH: 0.8
+            TEST:
+                NMS_TYPE: nms_gpu
+                MULTI_CLASSES_NMS: False
+                NMS_PRE_MAXSIZE: 1024
+                NMS_POST_MAXSIZE: 100
+                NMS_THRESH: 0.7
+        ROI_GRID_POOL:
+            GRID_SIZE: 6
+            MLPS: [[64, 64], [64, 64]]
+            POOL_RADIUS: [0.8, 1.6]
+            NSAMPLE: [16, 16]
+            POOL_METHOD: max_pool
+        TARGET_CONFIG:
+            BOX_CODER: ResidualCoder
+            ROI_PER_IMAGE: 128
+            FG_RATIO: 0.5
+            SAMPLE_ROI_BY_EACH_CLASS: True
+            CLS_SCORE_TYPE: roi_iou
+            CLS_FG_THRESH: 0.75
+            CLS_BG_THRESH: 0.25
+            CLS_BG_THRESH_LO: 0.1
+            HARD_BG_RATIO: 0.8
+            REG_FG_THRESH: 0.55
+        LOSS_CONFIG:
+            CLS_LOSS: BinaryCrossEntropy
+            REG_LOSS: smooth-l1
+            CORNER_LOSS_REGULARIZATION: True
+            LOSS_WEIGHTS: {
+                'rcnn_cls_weight': 1.0,
+                'rcnn_reg_weight': 1.0,
+                'rcnn_corner_weight': 1.0,
+                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
+            }
+    POST_PROCESSING:
+        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
+        SCORE_THRESH: 0.1
+        OUTPUT_RAW_SCORE: False
+        EVAL_METRIC: waymo
+        NMS_CONFIG:
+            MULTI_CLASSES_NMS: False
+            NMS_TYPE: nms_gpu
+            NMS_THRESH: 0.1
+            NMS_PRE_MAXSIZE: 4096
+            NMS_POST_MAXSIZE: 500
+OPTIMIZATION:
+    BATCH_SIZE_PER_GPU: 2
+    NUM_EPOCHS: 30
+    OPTIMIZER: adam_onecycle
+    LR: 0.01
+    WEIGHT_DECAY: 0.001
+    MOMENTUM: 0.9
+    MOMS: [0.95, 0.85]
+    PCT_START: 0.4
+    DIV_FACTOR: 10
+    DECAY_STEP_LIST: [35, 45]
+    LR_DECAY: 0.1
+    LR_CLIP: 0.0000001
+    LR_WARMUP: False
+    WARMUP_EPOCH: 1
+    GRAD_NORM_CLIP: 10
\ No newline at end of file
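Review note on the `PFE` block: `SA_LAYER` defines settings for `x_conv1` and `x_conv2`, but `FEATURES_SOURCE` lists only `bev`, `x_conv3`, `x_conv4` and `raw_points`. The sketch below estimates the width of the per-keypoint feature before it is reduced to `NUM_OUTPUT_FEATURES`. It assumes that each listed source is concatenated, that every pooling radius contributes the last channel count of its MLP, and that the BEV source contributes `NUM_BEV_FEATURES` channels; `concat_width` is a hypothetical helper.

```python
# Sketch: width of the concatenated keypoint feature before fusion.
NUM_BEV_FEATURES = 256
FEATURES_SOURCE = ["bev", "x_conv3", "x_conv4", "raw_points"]
SA_LAYER_MLPS = {
    "raw_points": [[16, 16], [16, 16]],
    "x_conv1": [[16, 16], [16, 16]],
    "x_conv2": [[32, 32], [32, 32]],
    "x_conv3": [[64, 64], [64, 64]],
    "x_conv4": [[64, 64], [64, 64]],
}


def concat_width(sources, mlps, num_bev):
    """Sum the channel counts of every enabled feature source."""
    width = 0
    for src in sources:
        if src == "bev":
            width += num_bev
        else:
            # ASSUMPTION: each radius scale contributes its last MLP channel count
            width += sum(branch[-1] for branch in mlps[src])
    return width


width_before_fusion = concat_width(FEATURES_SOURCE, SA_LAYER_MLPS, NUM_BEV_FEATURES)
print(width_before_fusion)
```

Under these assumptions the `x_conv1` and `x_conv2` entries do not affect the result, since they are not listed in `FEATURES_SOURCE`.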
--- a/tools/cfgs/waymo_models/second.yaml
+++ b/tools/cfgs/waymo_models/second.yaml
+CLASS_NAMES: ['Vehicle', 'Pedestrian', 'Cyclist']
+DATA_CONFIG:
+    _BASE_CONFIG_: cfgs/dataset_configs/waymo_dataset.yaml
+MODEL:
+    NAME: SECONDNet
+    VFE:
+        NAME: MeanVFE
+    BACKBONE_3D:
+        NAME: VoxelBackBone8x
+    MAP_TO_BEV:
+        NAME: HeightCompression
+        NUM_BEV_FEATURES: 256
+    BACKBONE_2D:
+        NAME: BaseBEVBackbone
+        LAYER_NUMS: [5, 5]
+        LAYER_STRIDES: [1, 2]
+        NUM_FILTERS: [128, 256]
+        UPSAMPLE_STRIDES: [1, 2]
+        NUM_UPSAMPLE_FILTERS: [256, 256]
+    DENSE_HEAD:
+        NAME: AnchorHeadSingle
+        CLASS_AGNOSTIC: False
+        USE_DIRECTION_CLASSIFIER: True
+        DIR_OFFSET: 0.78539
+        DIR_LIMIT_OFFSET: 0.0
+        NUM_DIR_BINS: 2
+        ANCHOR_GENERATOR_CONFIG: [
+            {
+                'class_name': 'Vehicle',
+                'anchor_sizes': [[4.7, 2.1, 1.7]],
+                'anchor_rotations': [0, 1.57],
+                'anchor_bottom_heights': [0],
+                'align_center': False,
+                'feature_map_stride': 8,
+                'matched_threshold': 0.55,
+                'unmatched_threshold': 0.4
+            },
+            {
+                'class_name': 'Pedestrian',
+                'anchor_sizes': [[0.91, 0.86, 1.73]],
+                'anchor_rotations': [0, 1.57],
+                'anchor_bottom_heights': [0],
+                'align_center': False,
+                'feature_map_stride': 8,
+                'matched_threshold': 0.5,
+                'unmatched_threshold': 0.35
+            },
+            {
+                'class_name': 'Cyclist',
+                'anchor_sizes': [[1.78, 0.84, 1.78]],
+                'anchor_rotations': [0, 1.57],
+                'anchor_bottom_heights': [0],
+                'align_center': False,
+                'feature_map_stride': 8,
+                'matched_threshold': 0.5,
+                'unmatched_threshold': 0.35
+            }
+        ]
+        TARGET_ASSIGNER_CONFIG:
+            NAME: AxisAlignedTargetAssigner
+            POS_FRACTION: -1.0
+            SAMPLE_SIZE: 512
+            NORM_BY_NUM_EXAMPLES: False
+            MATCH_HEIGHT: False
+            BOX_CODER: ResidualCoder
+        LOSS_CONFIG:
+            LOSS_WEIGHTS: {
+                'cls_weight': 1.0,
+                'loc_weight': 2.0,
+                'dir_weight': 0.2,
+                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
+            }
+    POST_PROCESSING:
+        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
+        SCORE_THRESH: 0.1
+        OUTPUT_RAW_SCORE: False
+        EVAL_METRIC: waymo
+        NMS_CONFIG:
+            MULTI_CLASSES_NMS: False
+            NMS_TYPE: nms_gpu
+            NMS_THRESH: 0.01
+            NMS_PRE_MAXSIZE: 4096
+            NMS_POST_MAXSIZE: 500
+OPTIMIZATION:
+    BATCH_SIZE_PER_GPU: 4
+    NUM_EPOCHS: 30
+    OPTIMIZER: adam_onecycle
+    LR: 0.003
+    WEIGHT_DECAY: 0.01
+    MOMENTUM: 0.9
+    MOMS: [0.95, 0.85]
+    PCT_START: 0.4
+    DIV_FACTOR: 10
+    DECAY_STEP_LIST: [35, 45]
+    LR_DECAY: 0.1
+    LR_CLIP: 0.0000001
+    LR_WARMUP: False
+    WARMUP_EPOCH: 1
+    GRAD_NORM_CLIP: 10
\ No newline at end of file
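Review note on `OPTIMIZATION`: all three configs use `OPTIMIZER: adam_onecycle` with `NUM_EPOCHS: 30`, while `DECAY_STEP_LIST` names epochs 35 and 45, which are never reached in a 30-epoch run. The sketch below shows how `LR`, `DIV_FACTOR` and `PCT_START` could shape a one-cycle schedule, using the `second.yaml` values. It assumes a cosine warm-up from `LR / DIV_FACTOR` to `LR` over the first `PCT_START` of training, followed by a cosine decay; the final divisor `ASSUMED_FINAL_DIV` and the helper names are assumptions, not values from this diff.

```python
import math

LR = 0.003        # peak learning rate from second.yaml
DIV_FACTOR = 10   # start at LR / DIV_FACTOR
PCT_START = 0.4   # fraction of training spent warming up
ASSUMED_FINAL_DIV = 1e4  # ASSUMPTION: extra divisor applied to reach the final rate


def cosine_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)


def one_cycle_lr(progress):
    """Learning rate at training progress in [0, 1] (hypothetical helper)."""
    low = LR / DIV_FACTOR
    if progress < PCT_START:
        return cosine_anneal(low, LR, progress / PCT_START)
    decay_pct = (progress - PCT_START) / (1 - PCT_START)
    return cosine_anneal(LR, low / ASSUMED_FINAL_DIV, decay_pct)


for p in (0.0, 0.2, 0.4, 0.7, 1.0):
    print(p, one_cycle_lr(p))
```

With 30 epochs and `PCT_START: 0.4`, the peak under this sketch would fall at the end of epoch 12.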