# Preparing the Datasets for Painter

The datasets required for training are: [COCO](https://cocodataset.org/), [ADE20K (official)](https://groups.csail.mit.edu/vision/datasets/ADE20K/), [NYUDepthV2](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html), [Synthetic Rain Datasets](https://paperswithcode.com/dataset/synthetic-rain-datasets), [SIDD](https://www.eecs.yorku.ca/~kamel/sidd/), and [LoL](https://daooshee.github.io/BMVC2018website/).


### Environment Setup for Dataset Preparation
#### COCO Panoptic Segmentation
```bash
pip install openmim  # version 0.3.9
mim install mmcv-full  # make sure the installed version is 1.7.1
pip install mmdet==2.26.0  # matches mmcv 1.7.1
```
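
Because the comments above pin specific versions, a quick check after installation can catch a mismatched environment early. The sketch below is an assumption rather than part of the original instructions: `installed_version` and `check_version` are hypothetical helper names, and the lookup relies on the output of `python -m pip show`.

```bash
# Hypothetical helpers (not part of the Painter repository).

# Print the installed version of a pip package, or nothing if it is missing.
installed_version() {
    python -m pip show "$1" 2>/dev/null | sed -n 's/^Version: //p'
}

# Compare an actual version string with the expected one and report the result.
# $1: package name, $2: expected version, $3: actual version
check_version() {
    if [ "$3" = "$2" ]; then
        echo "OK: $1 $3"
    else
        echo "MISMATCH: $1 expected $2, found ${3:-none}" >&2
        return 1
    fi
}
```

For example, `check_version mmdet 2.26.0 "$(installed_version mmdet)"` reports whether the installed mmdet matches the version pinned above.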

#### COCO Pose Estimation
```bash
pip install mmcv==1.3.9
pip install mmpose==0.29.0
```

Alternatively, install mmpose directly from source:
```bash
# Clone mmpose and pin it to commit `8c58a18b`
git clone https://github.com/open-mmlab/mmpose.git
cd mmpose
git checkout 8c58a18b
pip install -r requirements.txt
pip install -v -e .
```

### Downloading the Full Datasets
#### NYU Depth V2

First, download the dataset from [here](https://drive.google.com/file/d/1AysroWpfISmm-yRFGBgFTrLy6FjQwvwP/view?usp=sharing). Make sure the downloaded file is saved as `$Painter_ROOT/datasets/nyu_depth_v2/sync.zip`.
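
The paths in this document are written relative to `$Painter_ROOT`, and the relative commands appear to assume that they are run from that directory. A minimal sketch, assuming `$Painter_ROOT` is an environment variable that points at your checkout of the repository and that the checkout contains the `data/` directory used by the scripts; `require_painter_root` is a hypothetical helper name:

```bash
# Hypothetical helper: fail early if Painter_ROOT is unset or does not look
# like a checkout (assumed here to contain a data/ directory).
require_painter_root() {
    if [ -z "${Painter_ROOT:-}" ]; then
        echo "Painter_ROOT is not set" >&2
        return 1
    fi
    if [ ! -d "$Painter_ROOT/data" ]; then
        echo "$Painter_ROOT has no data/ directory" >&2
        return 1
    fi
    # Create the dataset root used throughout this document.
    mkdir -p "$Painter_ROOT/datasets"
}
```

A typical session would start with `export Painter_ROOT=/path/to/Painter`, then `require_painter_root && cd "$Painter_ROOT"`, where `/path/to/Painter` is a placeholder for your own path.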

Next, prepare the [NYU Depth V2 test](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html) set.

```bash
# Download the original NYU Depth V2 split file
wget -P datasets/nyu_depth_v2/ http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_labeled.mat
# Convert the .mat data into image files
python data/depth/extract_official_train_test_set_from_mat.py datasets/nyu_depth_v2/nyu_depth_v2_labeled.mat data/depth/splits.mat datasets/nyu_depth_v2/official_splits/
```

Finally, prepare the JSON files needed for training and validation. By default they are saved under `$Painter_ROOT/datasets/nyu_depth_v2/`.

```bash
python data/depth/gen_json_nyuv2_depth.py --split sync
python data/depth/gen_json_nyuv2_depth.py --split test
```
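
After a generator script finishes, it can be worth confirming that the files it wrote parse as JSON before starting a long training run. The sketch below is an assumption and not part of the repository: `check_json` is a hypothetical helper, the `PYTHON` variable is introduced here only to select the interpreter, and the check uses Python's standard `json.tool` module.

```bash
# Hypothetical helper: succeed only if every given file exists and parses as JSON.
check_json() {
    for f in "$@"; do
        if ! "${PYTHON:-python}" -m json.tool "$f" > /dev/null 2>&1; then
            echo "invalid or missing JSON: $f" >&2
            return 1
        fi
    done
}
```

For example, `check_json datasets/nyu_depth_v2/*.json` checks every JSON file in the output directory named above. The same helper can be reused for the JSON files produced in the later sections.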

#### ADE20k Semantic Segmentation

1. Download the dataset: [ADEChallengeData2016](https://aistudio.baidu.com/datasetdetail/54455).

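2. Save the downloaded archive under `$Painter_ROOT/datasets/`.

3. Extract the archive and rename the extracted folder to `ade20k`. The resulting ade20k directory structure should look like this: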

```bash
ade20k/
    images/
    annotations/
```
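
A short check can confirm that the extracted folder matches the layout shown above before any script is run. This is only a sketch under the assumption that the presence of the listed entries is a sufficient sanity check; `check_layout` is a hypothetical helper name.

```bash
# Hypothetical helper: check that each expected entry exists under a root directory.
# $1: root directory, remaining arguments: paths relative to the root
check_layout() {
    root=$1
    shift
    status=0
    for rel in "$@"; do
        if [ ! -e "$root/$rel" ]; then
            echo "missing: $root/$rel" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_layout "$Painter_ROOT/datasets/ade20k" images annotations` returns a non-zero status and names every missing entry. The same helper applies to the other directory trees shown in this document.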

4. Run the commands below to prepare the annotations needed for training and validation. By default they are saved under `$Painter_ROOT/datasets/ade20k/annotations_with_color/`.
```bash
python data/ade20k/gen_color_ade20k_sem.py --split training
python data/ade20k/gen_color_ade20k_sem.py --split validation
```

5. Prepare the JSON files needed for training and validation. By default they are saved under `$Painter_ROOT/datasets/ade20k/`.
```bash
python data/ade20k/gen_json_ade20k_sem.py --split training
python data/ade20k/gen_json_ade20k_sem.py --split validation
```

6. To enable evaluation with detectron2, create a symbolic link at `$Painter_ROOT/datasets/ADEChallengeData2016` that points to `$Painter_ROOT/datasets/ade20k`, then run the following:
```bash
# Important: create the symbolic link. ADEChallengeData2016 must be created directly under datasets/.
ln -s $Painter_ROOT/datasets/ade20k datasets/ADEChallengeData2016
# Run the preparation script
python data/prepare_ade20k_sem_seg.py
```
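
One pitfall with a plain `ln -s` is that running it a second time, when the link already exists and points to a directory, creates a nested link inside the target directory instead of replacing the link. The sketch below shows a re-runnable variant; `ensure_link` is a hypothetical helper name, and the use of `ln -sfn` is a substitution for the command above, not the original instruction.

```bash
# Hypothetical helper: create or refresh a symbolic link and verify it resolves.
# $1: link target, $2: link path
ensure_link() {
    target=$1
    link=$2
    # -n treats an existing link to a directory as a file, so re-running
    # replaces the link instead of creating a nested one inside the target.
    ln -sfn "$target" "$link" || return 1
    # -L: the path is a symbolic link; -e: its target exists.
    [ -L "$link" ] && [ -e "$link" ]
}
```

For example, `ensure_link "$Painter_ROOT/datasets/ade20k" "$Painter_ROOT/datasets/ADEChallengeData2016"` can be run repeatedly and fails if the target directory does not exist.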

#### COCO Panoptic Segmentation
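Download the COCO 2017 images and the corresponding panoptic segmentation annotations. The resulting directory structure should look like this: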
```
coco/
    train2017/
    val2017/
    annotations/
        instances_train2017.json
        instances_val2017.json
        panoptic_train2017.json
        panoptic_val2017.json
        panoptic_train2017/
        panoptic_val2017/
```
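
This document does not list download commands for COCO. The sketch below is one possible way to fetch the archives, assuming the archive URLs published on the COCO download page are still valid; the URLs, `coco_urls`, and `download_coco` are not taken from this document, so verify them before use. After downloading, the archives still have to be extracted and arranged to match the tree above.

```bash
# Assumption: these are the archive URLs listed on the COCO download page;
# they are not given in this document, so verify them before use.
coco_urls() {
    base=http://images.cocodataset.org
    echo "$base/zips/train2017.zip"
    echo "$base/zips/val2017.zip"
    echo "$base/annotations/annotations_trainval2017.zip"
    echo "$base/annotations/panoptic_annotations_trainval2017.zip"
}

# Hypothetical helper: download every archive into the given directory,
# resuming partially downloaded files (wget -c).
download_coco() {
    dest=$1
    mkdir -p "$dest" || return 1
    coco_urls | while read -r url; do
        wget -c -P "$dest" "$url" || exit 1
    done
}
```

For example, `download_coco "$Painter_ROOT/datasets/coco"` places the four archives in the dataset folder.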

1. Prepare COCO Semantic Segmentation
Prepare the annotations needed for training. By default they are saved under `$Painter_ROOT/datasets/coco/pano_sem_seg/`.
```bash
python data/coco_semseg/gen_color_coco_panoptic_segm.py --split train2017
python data/coco_semseg/gen_color_coco_panoptic_segm.py --split val2017
```

Prepare the JSON files needed for training and validation. By default they are saved under `$Painter_ROOT/datasets/coco/pano_sem_seg/`.
```bash
python data/coco_semseg/gen_json_coco_panoptic_segm.py --split train2017
python data/coco_semseg/gen_json_coco_panoptic_segm.py --split val2017
```

2. Prepare COCO Class-Agnostic Instance Segmentation

First, preprocess the data with the commands below. By default the painted ground truth is saved under `$Painter_ROOT/datasets/coco/pano_ca_inst`.

```bash
cd $Painter_ROOT/data/mmdet_custom

# Generate training data with common data augmentation for instance segmentation.
# Note: 30 copies are generated by alternating train_aug{idx} in configs/coco_panoptic_ca_inst_gen_aug.py
./tools/dist_train.sh configs/coco_panoptic_ca_inst_gen_aug.py 1
# Generate training data with horizontal-flip augmentation only
./tools/dist_train.sh configs/coco_panoptic_ca_inst_gen_orgflip.py 1
# Generate training data without data augmentation
./tools/dist_train.sh configs/coco_panoptic_ca_inst_gen_org.py 1
# Generate validation data (no data augmentation)
./tools/dist_test.sh configs/coco_panoptic_ca_inst_gen_org.py none 1 --eval segm
```

Then, prepare the JSON files needed for training and validation. By default they are saved under `$Painter_ROOT/datasets/coco/pano_ca_inst`.

```bash
cd $Painter_ROOT
python data/mmdet_custom/gen_json_coco_panoptic_inst.py --split train
python data/mmdet_custom/gen_json_coco_panoptic_inst.py --split val
```

Finally, to enable evaluation with detectron2, create a symbolic link at `$Painter_ROOT/datasets/coco/panoptic_val2017` that points to `$Painter_ROOT/datasets/coco/annotations/panoptic_val2017`, then run:
```bash
# Create the symbolic link
ln -s $Painter_ROOT/datasets/coco/annotations/panoptic_val2017 datasets/coco/panoptic_val2017
# Run the preparation script
python data/prepare_coco_semantic_annos_from_panoptic_annos.py
```

#### COCO Human Pose Estimation

1. Download the person detection results for COCO val2017 from [google drive](https://drive.google.com/drive/folders/1fRUDNUDxe9fjqcRZ2bnF_TKMlO0nB_dk) and put the downloaded data under `$Painter_ROOT/datasets/coco_pose/`.

2. Preprocess the data with the commands below. By default the painted ground truth is saved under `$Painter_ROOT/datasets/coco_pose/`.

```bash
cd $Painter_ROOT/data/mmpose_custom

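# Generate training data with common data augmentation for pose estimation.
# This project generates 20 copies for training: for each copy, change the aug_idx
# parameter on line 52 of coco_256x192_gendata.py accordingly (the current default is 0).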
./tools/dist_train.sh configs/coco_256x192_gendata.py 1
# Generate data for validation during training
./tools/dist_test.sh configs/coco_256x192_gendata.py none 1
# Generate data for testing (using offline detection boxes)
./tools/dist_test.sh configs/coco_256x192_gendata_test.py none 1
# Generate data for testing (using offline detection boxes + flip)
./tools/dist_test.sh configs/coco_256x192_gendata_testflip.py none 1
```
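
The first command in the block above has to be repeated once per augmented copy, with `aug_idx` edited in between. The loop below is a sketch of how that could be automated, not the project's own tooling. It assumes that the config contains an assignment of the form `aug_idx = <integer>` and that the 20 copies correspond to indices 0 to 19; `set_aug_idx` and `generate_pose_copies` are hypothetical helper names.

```bash
# Hypothetical helper. Assumption: the config contains an assignment of the
# form `aug_idx = <integer>`; adapt the pattern if line 52 looks different.
# $1: config file, $2: new index
set_aug_idx() {
    cfg=$1
    idx=$2
    sed -E "s/(aug_idx[[:space:]]*=[[:space:]]*)[0-9]+/\1${idx}/" "$cfg" > "$cfg.tmp" \
        && mv "$cfg.tmp" "$cfg"
}

# Generate the requested number of augmented copies, one training run per index.
# Run from $Painter_ROOT/data/mmpose_custom.
generate_pose_copies() {
    copies=$1
    i=0
    while [ "$i" -lt "$copies" ]; do
        set_aug_idx configs/coco_256x192_gendata.py "$i" || return 1
        ./tools/dist_train.sh configs/coco_256x192_gendata.py 1 || return 1
        i=$((i + 1))
    done
}
```

With these definitions, `generate_pose_copies 20` would run the generation step for indices 0 through 19. Check the edited config after the first iteration to confirm that the substitution touched the intended line.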

Next, prepare the JSON files needed for training and validation. By default they are saved under `datasets/pano_ca_inst/`.
```bash
cd $Painter_ROOT
python data/mmpose_custom/gen_json_coco_pose.py --split train
python data/mmpose_custom/gen_json_coco_pose.py --split val
```

#### Low-level Vision Tasks
##### Deraining
Data preparation for deraining follows [MPRNet](https://github.com/swz30/MPRNet).

Follow the instructions from [MPRNet](https://github.com/swz30/MPRNet/blob/main/Deraining/Datasets/README.md) to download the datasets, and save them under `$Painter_ROOT/datasets/derain/`. The resulting derain directory structure should look like this:
```
derain/
    train/
        input/
        target/
    test/
        Rain100H/
        Rain100L/
        Test100/
        Test1200/
        Test2800/
```
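
For paired restoration data such as `train/input` and `train/target`, a mismatch in the number of files is a common sign of an incomplete download. The sketch below assumes that each input image has exactly one target image stored directly in the sibling folder; `count_files` and `pairs_match` are hypothetical helper names.

```bash
# Hypothetical helpers for paired image folders.

# Count regular files directly inside a directory.
count_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Succeed only if both directories exist, are non-empty, and hold equally many files.
pairs_match() {
    [ -d "$1" ] && [ -d "$2" ] || return 1
    a=$(count_files "$1")
    b=$(count_files "$2")
    echo "$1: $a files, $2: $b files"
    [ "$a" -gt 0 ] && [ "$a" -eq "$b" ]
}
```

For example, `pairs_match datasets/derain/train/input datasets/derain/train/target` prints both counts and returns a non-zero status when they differ.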

Next, run the commands below to prepare the JSON files needed for training and validation. The generated JSON files are saved under `datasets/derain/`.
```bash
python data/derain/gen_json_rain.py --split train
python data/derain/gen_json_rain.py --split val
```

##### Denoising
Preparation of the SIDD denoising dataset follows [Uformer](https://github.com/ZhendongWang6/Uformer).

For training, download the SIDD-Medium dataset from the [official url](https://www.eecs.yorku.ca/~kamel/sidd/dataset.php).

For validation, download the SIDD data from [here](https://mailustceducn-my.sharepoint.com/:f:/g/personal/zhendongwang_mail_ustc_edu_cn/Ev832uKaw2JJhwROKqiXGfMBttyFko_zrDVzfSbFFDoi4Q?e=S3p5hQ).

Next, generate the image patches used for training with the following command:
```bash
python data/sidd/generate_patches_SIDD.py --src_dir datasets/denoise/SIDD_Medium_Srgb/Data --tar_dir datasets/denoise/train
```

Finally, prepare the JSON files needed for training and validation. The generated JSON files are saved under `datasets/denoise/`.
```bash
python data/sidd/gen_json_sidd.py --split train
python data/sidd/gen_json_sidd.py --split val
```

##### Low-Light Image Enhancement
First, download the LOL dataset from [google drive](https://drive.google.com/file/d/157bjO1_cFuSd0HWDUuAmcHRJDVyWpOxB/view) and save it under `$Painter_ROOT/datasets/light_enhance/`. The resulting LOL directory structure should look like this:

```
light_enhance/
    our485/
        low/
        high/
    eval15/
        low/
        high/
```

Prepare the JSON files needed for training and validation. The generated JSON files are saved under `$Painter_ROOT/datasets/light_enhance/`.
```bash
python data/lol/gen_json_lol.py --split train
python data/lol/gen_json_lol.py --split val
```
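
Most JSON generation steps in this document follow the same pattern: one script, called once per split. The helper below is a small convenience sketch, not part of the repository; `run_splits` and the `DRY_RUN` variable are hypothetical names introduced here.

```bash
# Hypothetical helper: run one generator script for several splits in turn.
# Set DRY_RUN=1 to print the commands instead of executing them.
# $1: script path, remaining arguments: split names
run_splits() {
    script=$1
    shift
    for split in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "python $script --split $split"
        else
            python "$script" --split "$split" || return 1
        fi
    done
}
```

For example, `DRY_RUN=1 run_splits data/lol/gen_json_lol.py train val` prints the two commands from the block above without running them, which is a cheap way to review a longer preparation script before launching it.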