# MaskedDenoising
## Paper
[Images Speak in Images: A Generalist Painter for In-Context Visual Learning](https://arxiv.org/abs/2212.02499)

## Model Architecture



<div align=center>
    <img src="./doc/method.jpg"/>
</div>

## Algorithm Overview



<div align=center>
    <img src="./doc/progress.png"/>
</div>

## Environment Setup

### Docker (Method 1)

Adjust the `-v` paths, `docker_name`, and `imageID` below to match your environment.

```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py38-latest
docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash

cd /your_code_path/maskeddenoising_pytorch
pip install --upgrade setuptools wheel
pip install -r requirement.txt
```

### Dockerfile (Method 2)

Adjust the `-v` paths, `docker_name`, and `imageID` below to match your environment.

```
cd ./docker
cp ../requirement.txt requirement.txt
docker build --no-cache -t maskeddenoising:latest .
docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash

cd /your_code_path/maskeddenoising_pytorch
pip install --upgrade setuptools wheel
pip install -r requirement.txt
```

### Anaconda (Method 3)

1. The DCU-specific deep learning libraries required by this project can be downloaded from the HPC developer community: https://developer.hpccube.com/tool/

```
DTK stack: dtk23.04.1
python: python3.8
torch: 1.13.1
torchvision: 0.14.1
```

Tips: the DTK stack, python, torch, and other DCU-related versions above must match one another exactly.

2. Install the remaining (non-DCU-specific) libraries from requirement.txt:

```bash
pip install --upgrade setuptools wheel
pip install -r requirement.txt
```

## Datasets

### Dataset Environment Setup
#### ADE20K Semantic Segmentation

```bash
git clone https://github.com/facebookresearch/detectron2
python -m pip install -e detectron2
```

#### COCO Panoptic Segmentation

```bash
pip install openmim  # tested with version 0.3.9
mim install mmcv-full  # verify that the installed version is 1.7.1
pip install mmdet==2.26.0  # matches mmcv 1.7.1
pip install yapf==0.40.1
```

#### COCO Pose Estimation

```bash
pip install mmcv==1.3.9
pip install mmpose==0.29.0
```

Alternatively, install mmpose from source:
```bash
# check out commit `8c58a18b`
git clone https://github.com/open-mmlab/mmpose.git
cd mmpose
pip install -r requirements.txt
pip install -v -e .
```

### Downloading the Datasets
This project requires many datasets. To verify functionality, you can use the provided [toy training dataset](https://huggingface.co/BAAI/Painter/blob/main/toy_datasets.tar), which contains 10 samples from each required dataset. Place it under `$Painter_ROOT/toy_datasets` and set `DATA_PATH=toy_datasets` in `$Painter_ROOT/train_painter_vit_large.sh`, e.g. as sketched below.
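A minimal sketch of fetching and unpacking the toy dataset; the direct `resolve` download URL is an assumption based on the Hugging Face link above.

```bash
wget https://huggingface.co/BAAI/Painter/resolve/main/toy_datasets.tar
mkdir -p $Painter_ROOT/toy_datasets
# adjust -C if the archive already contains a top-level toy_datasets/ directory
tar -xf toy_datasets.tar -C $Painter_ROOT/toy_datasets
```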

The full set of required datasets is listed below:

#### NYU Depth V2

First, download the dataset [here](https://drive.google.com/file/d/1AysroWpfISmm-yRFGBgFTrLy6FjQwvwP/view?usp=sharing) and store it at `$Painter_ROOT/datasets/nyu_depth_v2/sync.zip`.
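A minimal sketch of one way to fetch the file from Google Drive, assuming the `gdown` package (the file ID comes from the link above):

```bash
pip install gdown
gdown 1AysroWpfISmm-yRFGBgFTrLy6FjQwvwP -O datasets/nyu_depth_v2/sync.zip
```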

Next, prepare the [NYU Depth V2 test set](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html).

```bash
# download the original NYU Depth V2 split file
wget -P datasets/nyu_depth_v2/ http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_labeled.mat
# convert the mat data into image files
python data/depth/extract_official_train_test_set_from_mat.py datasets/nyu_depth_v2/nyu_depth_v2_labeled.mat data/depth/splits.mat datasets/nyu_depth_v2/official_splits/
```

Finally, prepare the json files needed for training and evaluation. The generated json files are saved under `$Painter_ROOT/datasets/nyu_depth_v2/` by default.

```bash
python data/depth/gen_json_nyuv2_depth.py --split sync
python data/depth/gen_json_nyuv2_depth.py --split test
```

#### ADE20k Semantic Segmentation

First, download the dataset from the [official website](https://groups.csail.mit.edu/vision/datasets/ADE20K/) and store it under `$Painter_ROOT/datasets/` (a download sketch follows the listing below).

Next, unzip the archive and rename the folder to `ade20k`. The resulting ade20k directory structure looks like this:
```
ade20k/
    images/
    annotations/
```
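A minimal download sketch, assuming the ADEChallengeData2016 release (verify the exact URL on the official site):

```bash
wget -P datasets/ http://data.csail.mit.edu/places/ADEchallenge/ADEChallengeData2016.zip
unzip datasets/ADEChallengeData2016.zip -d datasets/
mv datasets/ADEChallengeData2016 datasets/ade20k
```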

Then, run the commands below to prepare the annotations for training and evaluation. The generated annotations are saved under `$Painter_ROOT/datasets/ade20k/annotations_with_color/` by default.
```bash
python data/ade20k/gen_color_ade20k_sem.py --split training
python data/ade20k/gen_color_ade20k_sem.py --split validation
```

Next, prepare the json files for training and evaluation. The generated json files are saved under `$Painter_ROOT/datasets/ade20k/` by default.
```bash
python data/ade20k/gen_json_ade20k_sem.py --split training
python data/ade20k/gen_json_ade20k_sem.py --split validation
```

Finally, to enable evaluation with detectron2, create a symlink from `$Painter_ROOT/datasets/ade20k` to `$Painter_ROOT/datasets/ADEChallengeData2016`, then run:
```bash
# create the symlink
# ln -s $Painter_ROOT/datasets/ade20k datasets/ADEChallengeData2016
# run
python data/prepare_ade20k_sem_seg.py
```

#### COCO Panoptic Segmentation
Download the COCO2017 data and the corresponding panoptic segmentation annotations (a download sketch follows the listing below). The resulting COCO directory structure looks like this:
```
coco/
    train2017/
    val2017/
    annotations/
        instances_train2017.json
        instances_val2017.json
        panoptic_train2017.json
        panoptic_val2017.json
        panoptic_train2017/
        panoptic_val2017/
```
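A minimal download sketch using the official COCO mirrors; unpacking into this exact layout is an assumption matching the listing above:

```bash
mkdir -p datasets/coco && cd datasets/coco
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
wget http://images.cocodataset.org/annotations/panoptic_annotations_trainval2017.zip
unzip '*.zip'
```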

1. Prepare COCO Semantic Segmentation
Prepare the annotations needed for training. The generated annotations are saved under `$Painter_ROOT/datasets/coco/pano_sem_seg/` by default.
```bash
python data/coco_semseg/gen_color_coco_panoptic_segm.py --split train2017
python data/coco_semseg/gen_color_coco_panoptic_segm.py --split val2017
```

Prepare the json files needed for training and evaluation. The generated json files are saved under `$Painter_ROOT/datasets/coco/pano_sem_seg/` by default.
```bash
python data/coco_semseg/gen_json_coco_panoptic_segm.py --split train2017
python data/coco_semseg/gen_json_coco_panoptic_segm.py --split val2017
```

2. Prepare COCO Class-Agnostic Instance Segmentation

First, preprocess the data with the commands below. The generated painted ground truth is saved under `$Painter_ROOT/datasets/coco/pano_ca_inst` by default.

```bash
cd $Painter_ROOT/data/mmdet_custom

# generate training data with common data augmentation for instance segmentation; note that 30 copies train_aug{idx} are generated by iterating the aug settings in configs/coco_panoptic_ca_inst_gen_aug.py
./tools/dist_train.sh configs/coco_panoptic_ca_inst_gen_aug.py 1
# generate training data with horizontal-flip augmentation only
./tools/dist_train.sh configs/coco_panoptic_ca_inst_gen_orgflip.py 1
# generate training data without augmentation
./tools/dist_train.sh configs/coco_panoptic_ca_inst_gen_org.py 1
# generate evaluation data (no augmentation)
./tools/dist_test.sh configs/coco_panoptic_ca_inst_gen_org.py none 1 --eval segm
```

Then, prepare the json files needed for training and evaluation. The generated json files are saved under `$Painter_ROOT/datasets/coco/pano_ca_inst` by default.

```bash
cd $Painter_ROOT
python data/mmdet_custom/gen_json_coco_panoptic_inst.py --split train
python data/mmdet_custom/gen_json_coco_panoptic_inst.py --split val
```

Finally, to enable evaluation with detectron2, create a symlink from `$Painter_ROOT/datasets/coco/annotations/panoptic_val2017` to `$Painter_ROOT/datasets/coco/panoptic_val2017` and run:
```bash
# create the symlink
# ln -s $Painter_ROOT/datasets/coco/annotations/panoptic_val2017 datasets/coco/panoptic_val2017
# run
python data/prepare_coco_semantic_annos_from_panoptic_annos.py
```

#### COCO Human Pose Estimation

First, download the person detection results on COCO val2017 from [google drive](https://drive.google.com/drive/folders/1fRUDNUDxe9fjqcRZ2bnF_TKMlO0nB_dk) and place them under `$Painter_ROOT/datasets/coco_pose/`.

Then, preprocess the data with the commands below. The resulting painted ground truth is saved under `$Painter_ROOT/datasets/coco_pose/` by default.

```bash
cd $Painter_ROOT/data/mmpose_custom

# generate training data with common data augmentation for pose estimation; this project generates 20 copies for training, which requires updating the aug_idx parameter on line 52 of coco_256x192_gendata.py for each copy (default is 0)
./tools/dist_train.sh configs/coco_256x192_gendata.py 1
# generate data for evaluation during training
./tools/dist_test.sh configs/coco_256x192_gendata.py none 1
# generate data for testing (using offline boxes)
./tools/dist_test.sh configs/coco_256x192_gendata_test.py none 1
# generate data for testing (using offline boxes + flip)
./tools/dist_test.sh configs/coco_256x192_gendata_testflip.py none 1
```

Next, prepare the json files needed for training and evaluation. The generated json files are saved under `datasets/coco_pose/` by default.
```bash
cd $Painter_ROOT
python data/mmpose_custom/gen_json_coco_pose.py --split train
python data/mmpose_custom/gen_json_coco_pose.py --split val
```

#### Low-level Vision Tasks

##### Deraining
Data preparation for deraining follows [MPRNet](https://github.com/swz30/MPRNet).

Download the datasets following the [MPRNet instructions](https://github.com/swz30/MPRNet/blob/main/Deraining/Datasets/README.md) and store them under `$Painter_ROOT/datasets/derain/`. The resulting derain directory structure looks like this:
```
derain/
    train/
        input/
        target/
    test/
        Rain100H/
        Rain100L/
        Test100/
        Test1200/
        Test2800/
```

Next, prepare the json files needed for training and evaluation with the commands below. The generated json files are saved under `datasets/derain/`.
```bash
python data/derain/gen_json_rain.py --split train
python data/derain/gen_json_rain.py --split val
```

##### Denoising
The SIDD denoising dataset is prepared following [Uformer](https://github.com/ZhendongWang6/Uformer).

For training, download the SIDD-Medium dataset from the [official url](https://www.eecs.yorku.ca/~kamel/sidd/dataset.php).

For evaluation, the SIDD validation data can be downloaded from [here](https://mailustceducn-my.sharepoint.com/:f:/g/personal/zhendongwang_mail_ustc_edu_cn/Ev832uKaw2JJhwROKqiXGfMBttyFko_zrDVzfSbFFDoi4Q?e=S3p5hQ).

Next, generate the image patches used for training:
```bash
python data/sidd/generate_patches_SIDD.py --src_dir datasets/denoise/SIDD_Medium_Srgb/Data --tar_dir datasets/denoise/train
```

Finally, prepare the json files needed for training and evaluation. The generated json files are saved under `datasets/denoise/`.
```bash
python data/sidd/gen_json_sidd.py --split train
python data/sidd/gen_json_sidd.py --split val
```


##### Low-Light Image Enhancement

First, download the LOL dataset from [google drive](https://drive.google.com/file/d/157bjO1_cFuSd0HWDUuAmcHRJDVyWpOxB/view) and store it under `$Painter_ROOT/datasets/light_enhance/`. The resulting LOL directory structure looks like this:

```
light_enhance/
    our485/
        low/
        high/
    eval15/
        low/
        high/
```

Next, prepare json files for training and evaluation. The generated json files will be saved at `$Painter_ROOT/datasets/light_enhance/`.
```bash
python data/lol/gen_json_lol.py --split train
python data/lol/gen_json_lol.py --split val
```

The final dataset directory structure is as follows:

```
├── nyu_depth_v2/
│   ├── sync/
│   ├── official_splits/
│   ├── nyu_depth_v2_labeled.mat
│   ├── nyuv2_sync_image_depth.json  # generated
│   ├── nyuv2_test_image_depth.json  # generated
├── ade20k/
│   ├── images/
│   ├── annotations/
│   ├── annotations_detectron2/  # generated
│   ├── annotations_with_color/  # generated
│   ├── ade20k_training_image_semantic.json  # generated
│   ├── ade20k_validation_image_semantic.json  # generated
├── ADEChallengeData2016/  # symlink to $Painter_ROOT/datasets/ade20k
├── coco/
│   ├── train2017/
│   ├── val2017/
│   ├── annotations/
│       ├── instances_train2017.json
│       ├── instances_val2017.json
│       ├── person_keypoints_val2017.json
│       ├── panoptic_train2017.json
│       ├── panoptic_val2017.json
│       ├── panoptic_train2017/
│       ├── panoptic_val2017/
│   ├── panoptic_semseg_val2017/  # generated
│   ├── panoptic_val2017/  # symlink to $Painter_ROOT/datasets/coco/annotations/panoptic_val2017
│   ├── pano_sem_seg/  # generated
│       ├── panoptic_segm_train2017_with_color
│       ├── panoptic_segm_val2017_with_color
│       ├── coco_train2017_image_panoptic_sem_seg.json
│       ├── coco_val2017_image_panoptic_sem_seg.json
│   ├── pano_ca_inst/  # generated
│       ├── train_aug0/
│       ├── train_aug1/
│       ├── ...
│       ├── train_aug29/
│       ├── train_org/
│       ├── train_flip/
│       ├── val_org/
│       ├── coco_train_image_panoptic_inst.json
│       ├── coco_val_image_panoptic_inst.json
├── coco_pose/
│   ├── person_detection_results/
│       ├── COCO_val2017_detections_AP_H_56_person.json
│   ├── data_pair/  # generated
│       ├── train_256x192_aug0/
│       ├── train_256x192_aug1/
│       ├── ...
│       ├── train_256x192_aug19/
│       ├── val_256x192/
│       ├── test_256x192/
│       ├── test_256x192_flip/
│   ├── coco_pose_256x192_train.json  # generated
│   ├── coco_pose_256x192_val.json  # generated
├── derain/
│   ├── train/
│       ├── input/
│       ├── target/
│   ├── test/
│       ├── Rain100H/
│       ├── Rain100L/
│       ├── Test100/
│       ├── Test1200/
│       ├── Test2800/
│   ├── derain_train.json
│   ├── derain_test_rain100h.json
├── denoise/
│   ├── SIDD_Medium_Srgb/
│   ├── train/
│   ├── val/
│   ├── denoise_ssid_train.json  # generated
│   ├── denoise_ssid_val.json  # generated
├── light_enhance/
│   ├── our485/
│       ├── low/
│       ├── high/
│   ├── eval15/
│       ├── low/
│       ├── high/
│   ├── enhance_lol_train.json  # generated
│   ├── enhance_lol_val.json  # generated
```

## Training
Download the pretrained [MAE ViT-Large model](https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_large.pth) and update the `finetune` parameter path in `$Painter_ROOT/train_painter_vit_large.sh`.
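For example, a minimal sketch (the `models/` target directory is an assumption; point `finetune` at wherever you store the file):

```bash
mkdir -p models
wget -P models/ https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_large.pth
# then set the finetune parameter in train_painter_vit_large.sh to models/mae_pretrain_vit_large.pth
```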
### Single-Node Multi-GPU
#### Standard Training

```
bash train_painter_vit_large.sh
```

#### Distributed Training
```
bash train_multi.sh
```

## Inference
Download the inference model from [🤗 Hugging Face Models](https://huggingface.co/BAAI/Painter/blob/main/painter_vit_large.pth). The evaluation steps for the various tasks are summarized below.
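A minimal download sketch; the direct `resolve` URL is an assumption based on the Hugging Face link above, and the eval commands below expect the checkpoint at `models/painter_vit_large/painter_vit_large.pth`:

```bash
mkdir -p models/painter_vit_large
wget -P models/painter_vit_large/ https://huggingface.co/BAAI/Painter/resolve/main/painter_vit_large.pth
```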

### NYU Depth V2

To evaluate Painter on NYU Depth V2, you may first update the `$JOB_NAME` in `$Painter_ROOT/eval/nyuv2_depth/eval.sh`, then run:
```bash
bash eval/nyuv2_depth/eval.sh
```

### ADE20k Semantic Segmentation

To evaluate Painter on ADE20k semantic segmentation, you may first update the `$JOB_NAME` in `$Painter_ROOT/eval/ade20k_semantic/eval.sh`, then run:
```bash
bash eval/ade20k_semantic/eval.sh
```

### COCO Panoptic Segmentation

To evaluate Painter on COCO panoptic segmentation, you may first update the `$JOB_NAME` in `$Painter_ROOT/eval/coco_panoptic/eval.sh`, then run:
```bash
bash eval/coco_panoptic/eval.sh
```


### COCO Human Pose Estimation

To evaluate Painter on COCO pose estimation, first generate the painted images:
```bash
python -m torch.distributed.launch --nproc_per_node=8 --master_port=29500 --use_env eval/mmpose_custom/painter_inference_pose.py --ckpt_path models/painter_vit_large/painter_vit_large.pth
python -m torch.distributed.launch --nproc_per_node=8 --master_port=29500 --use_env eval/mmpose_custom/painter_inference_pose.py --ckpt_path models/painter_vit_large/painter_vit_large.pth --flip_test
```

Then, you may update the `job_name` and `ckpt_file` in `$Painter_ROOT/eval/mmpose_custom/configs/coco_256x192_test_offline.py`, and run:
```bash
cd $Painter_ROOT/eval/mmpose_custom
./tools/dist_test.sh configs/coco_256x192_test_offline.py none 1 --eval mAP
```

### Low-level Vision Tasks

#### Deraining

To evaluate Painter on deraining, first generate the derained images.
```bash
python eval/derain/painter_inference_derain.py --ckpt_path models/painter_vit_large/painter_vit_large.pth
```

Then, update the path to derained images and ground truth in `$Painter_ROOT/eval/derain/evaluate_PSNR_SSIM.m` and run the following script in MATLAB.
```bash
$Painter_ROOT/eval/derain/evaluate_PSNR_SSIM.m
```


#### Denoising

To evaluate Painter on SIDD denoising, first generate the denoised images.
```bash
python eval/sidd/painter_inference_sidd.py --ckpt_path models/painter_vit_large/painter_vit_large.pth
```

Then, update the path to denoising output and ground truth in `$Painter_ROOT/eval/sidd/eval_sidd.m` and run the following script in MATLAB.
```bash
$Painter_ROOT/eval/sidd/eval_sidd.m
```


#### Low-Light Image Enhancement

To evaluate Painter on LoL image enhancement:
```bash
python eval/lol/painter_inference_lol.py --ckpt_path models/painter_vit_large/painter_vit_large.pth
```

### Single-GPU Inference

```
bash test.sh
```

## Results

A sample result on the local test set:

<div align=center>
    <img src="./doc/origin.png"/>
</div>

<div align=center>
    <img src="./doc/results.png"/>
</div>

### Accuracy

Single-GPU test results on the test data provided with the project:

| Model | PSNR | SSIM | LPIPS |
| :------: | :------: | :------: | :------: |
| ours | 29.04 | 0.7615 | 0.1294 |
| paper | 30.13 | 0.7981 | 0.1031 |


## Application Scenarios
### Algorithm Category
Image denoising

### Key Application Industries
Transportation, public security, manufacturing

## Source Repository and Issue Reporting
http://developer.hpccube.com/codes/modelzoo/maskeddenoising_pytorch.git

## References
https://github.com/haoyuc/MaskedDenoising.git