# Tutorial 2: Customize Datasets

## Support new data format

To support a new data format, you can either convert it to an existing format or convert it directly to the middle format. You can also choose to do the conversion offline (before training, with a script) or online (by implementing a new dataset class that performs the conversion during training). In MMDetection3D, for data that is inconvenient to read directly online, we recommend converting it into the KITTI format and doing the conversion offline; after the conversion you only need to modify the config's data annotation paths and classes.
For data whose format is similar to an existing dataset, such as Lyft compared to nuScenes, we recommend directly implementing a data converter and a dataset class. During this procedure, inheritance can be used to reduce the implementation workload.

### Reorganize new data formats to existing format

For data that is inconvenient to read directly online, the simplest way is to convert your dataset to an existing dataset format. Typically we need a data converter to reorganize the raw data and convert the annotation format into the KITTI style. A new dataset class inherited from an existing one is then sometimes necessary to deal with specific differences between the datasets. Finally, users need to modify the config files to use the new dataset. An [example](https://mmdetection3d.readthedocs.io/en/latest/2_new_data_model.html) of training predefined models on the Waymo dataset by converting it into the KITTI style can be taken for reference.

### Reorganize new data format to middle format

It is also fine if you do not want to convert the annotation format to an existing one. Actually, we convert all the supported datasets into pickle files, which summarize useful information for model training and inference. The annotation of a dataset is a list of dicts; each dict corresponds to one frame. A basic example (used in KITTI) is as follows.
A frame consists of several keys, like `image`, `point_cloud`, `calib` and `annos`. As long as we can read the data directly according to this information, the organization of the raw data can differ from that of existing datasets. With this design, we provide an alternative choice for customizing datasets.

```python
[
    {'image': {'image_idx': 0,
               'image_path': 'training/image_2/000000.png',
               'image_shape': array([ 370, 1224], dtype=int32)},
     'point_cloud': {'num_features': 4,
                     'velodyne_path': 'training/velodyne/000000.bin'},
     'calib': {'P0': array([[707.0493,   0.    , 604.0814,   0.    ],
                            [  0.    , 707.0493, 180.5066,   0.    ],
                            [  0.    ,   0.    ,   1.    ,   0.    ],
                            [  0.    ,   0.    ,   0.    ,   1.    ]]),
               'P1': array([[ 707.0493,    0.    ,  604.0814, -379.7842],
                            [   0.    ,  707.0493,  180.5066,    0.    ],
                            [   0.    ,    0.    ,    1.    ,    0.    ],
                            [   0.    ,    0.    ,    0.    ,    1.    ]]),
               'P2': array([[ 7.070493e+02,  0.000000e+00,  6.040814e+02,  4.575831e+01],
                            [ 0.000000e+00,  7.070493e+02,  1.805066e+02, -3.454157e-01],
                            [ 0.000000e+00,  0.000000e+00,  1.000000e+00,  4.981016e-03],
                            [ 0.000000e+00,  0.000000e+00,  0.000000e+00,  1.000000e+00]]),
               'P3': array([[ 7.070493e+02,  0.000000e+00,  6.040814e+02, -3.341081e+02],
                            [ 0.000000e+00,  7.070493e+02,  1.805066e+02,  2.330660e+00],
                            [ 0.000000e+00,  0.000000e+00,  1.000000e+00,  3.201153e-03],
                            [ 0.000000e+00,  0.000000e+00,  0.000000e+00,  1.000000e+00]]),
               'R0_rect': array([[ 0.9999128 ,  0.01009263, -0.00851193,  0.        ],
                                 [-0.01012729,  0.9999406 , -0.00403767,  0.        ],
                                 [ 0.00847068,  0.00412352,  0.9999556 ,  0.        ],
                                 [ 0.        ,  0.        ,  0.        ,  1.        ]]),
               'Tr_velo_to_cam': array([[ 0.00692796, -0.9999722 , -0.00275783, -0.02457729],
                                        [-0.00116298,  0.00274984, -0.9999955 , -0.06127237],
                                        [ 0.9999753 ,  0.00693114, -0.0011439 , -0.3321029 ],
                                        [ 0.        ,  0.        ,  0.        ,  1.        ]]),
               'Tr_imu_to_velo': array([[ 9.999976e-01,  7.553071e-04, -2.035826e-03, -8.086759e-01],
                                        [-7.854027e-04,  9.998898e-01, -1.482298e-02,  3.195559e-01],
                                        [ 2.024406e-03,  1.482454e-02,  9.998881e-01, -7.997231e-01],
                                        [ 0.000000e+00,  0.000000e+00,  0.000000e+00,  1.000000e+00]])},
     'annos': {'name': array(['Pedestrian'], dtype='<U10'),
               ...}}
]
```
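To make the middle format concrete, the sketch below builds a tiny two-frame annotation list with the same top-level keys as the example above and round-trips it through a pickle file. The file names, calibration matrices, and annotation values here are placeholders for illustration only; in practice the converter scripts shipped with MMDetection3D populate these fields from the raw dataset.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np

# Build a minimal middle-format annotation list: one dict per frame.
# The keys mirror the KITTI-style example above; all values are dummies.
infos = []
for idx in range(2):
    info = {
        'image': {
            'image_idx': idx,
            'image_path': f'training/image_2/{idx:06d}.png',
            'image_shape': np.array([370, 1224], dtype=np.int32),
        },
        'point_cloud': {
            'num_features': 4,
            'velodyne_path': f'training/velodyne/{idx:06d}.bin',
        },
        'calib': {
            'P2': np.eye(4),             # placeholder projection matrix
            'R0_rect': np.eye(4),        # placeholder rectification matrix
            'Tr_velo_to_cam': np.eye(4), # placeholder lidar-to-camera transform
        },
        'annos': {
            'name': np.array(['Pedestrian']),
            'location': np.zeros((1, 3)),   # dummy box centers
            'dimensions': np.ones((1, 3)),  # dummy box sizes
            'rotation_y': np.zeros(1),      # dummy yaw angles
        },
    }
    infos.append(info)

# Serialize the list to a pickle file, the same container format the
# provided dataset converters write for training and inference.
pkl_path = Path(tempfile.gettempdir()) / 'demo_infos.pkl'
with open(pkl_path, 'wb') as f:
    pickle.dump(infos, f)

# A dataset class can later load the file and index frames directly.
with open(pkl_path, 'rb') as f:
    loaded = pickle.load(f)
```

Each frame is self-describing, so a dataset class only needs to look up paths and annotations in `loaded[i]` rather than assume any particular on-disk layout of the raw data.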