# DiffBIR

## Paper

**DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior**

* https://arxiv.org/abs/2308.15070

## Model Architecture

The stage-one model uses 8 Swin Transformer blocks (RSTB). Each RSTB contains 6 Swin Transformer Layers (STL), with 6 attention heads and a window size of 8.
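
To make these hyperparameters concrete, the sketch below collects them in a small Python dataclass. The class name `StageOneConfig`, its field names, and the `padded_size` helper are hypothetical and not taken from the repository. Padding inputs to a multiple of the window size is a common convention for window attention and is assumed here, not confirmed by the source.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StageOneConfig:
    """Hypothetical container for the stage-one hyperparameters listed above."""

    num_rstb: int = 8        # number of Swin Transformer blocks (RSTB)
    stl_per_rstb: int = 6    # Swin Transformer Layers (STL) inside each RSTB
    num_heads: int = 6       # attention heads
    window_size: int = 8     # attention window size

    @property
    def total_stl(self) -> int:
        # Total number of Swin Transformer Layers across all blocks.
        return self.num_rstb * self.stl_per_rstb

    def padded_size(self, height: int, width: int) -> Tuple[int, int]:
        # Round each dimension up to the next multiple of the window size
        # (assumed convention, see the note above).
        ws = self.window_size
        return ((height + ws - 1) // ws * ws, (width + ws - 1) // ws * ws)


config = StageOneConfig()
print(config.total_stl)
print(config.padded_size(500, 333))
```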

The stage-two model is based on Stable Diffusion 2.1-base and creates a network identical to the encoder blocks and middle block of the UNet.
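
The idea of creating a parallel network from the UNet's encoder and middle blocks can be illustrated with a plain weight dictionary. This is only a sketch: the key prefixes `input_blocks.` and `middle_block.`, the `control.` prefix, and the function name `make_parallel_copy` are illustrative assumptions, and lists of floats stand in for real tensors.

```python
from typing import Dict, List, Tuple

Weights = Dict[str, List[float]]


def make_parallel_copy(
    unet: Weights,
    copied_prefixes: Tuple[str, ...] = ("input_blocks.", "middle_block."),
    new_prefix: str = "control.",
) -> Weights:
    """Copy the encoder and middle-block entries under a new prefix.

    Entries with any other prefix (for example decoder weights) are skipped.
    Values are copied so the two networks can be updated independently.
    """
    return {
        new_prefix + name: list(values)
        for name, values in unet.items()
        if name.startswith(copied_prefixes)
    }


# Toy weights; the names are placeholders, not real checkpoint keys.
unet = {
    "input_blocks.0.weight": [0.1, 0.2],
    "middle_block.0.weight": [0.3],
    "output_blocks.0.weight": [0.4],
}
control = make_parallel_copy(unet)
print(sorted(control))
```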

![Alt text](images/image.png)

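## Algorithm Principles

Purpose: DiffBIR is a two-stage algorithm that can increase image resolution.

Stage one uses a restoration module to recover a clean image from a low-quality (LQ) image with unknown and complex degradations.

Stage two uses a generation module to regenerate the lost information.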

## Environment Setup


### Docker (Option 1)

    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py39-latest
    docker run --shm-size 10g --network=host --name=diffbir --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to the project>:/home/ -it <your IMAGE ID> bash
    pip install -r requirements.txt


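### Docker (Option 2)

    # Run this from the corresponding directory (the one containing the Dockerfile)
    docker build -t <IMAGE_NAME>:<TAG> .
    # Replace <your IMAGE ID> with the ID of the image built above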
    docker run -it --shm-size 10g --network=host --name=diffbir --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined <your IMAGE ID> bash

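### Anaconda (Option 3)

Step 1. The DCU-specific deep learning libraries required by this project can be downloaded and installed from the Guanghe developer community (光合开发者社区):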
https://developer.hpccube.com/tool/

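    DTK driver: dtk23.04.1
    python: python3.9
    torch: 1.13.1
    torchvision: 0.14.1
    torchaudio: 0.13.1
    deepspeed: 0.9.2
    apex: 0.1

Tip: the versions of the DTK driver, Python, torch, and the other DCU-related tools listed above must match each other exactly.

Step 2. Install the remaining, non-DCU-specific libraries from requirements.txt: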

    pip install -r requirements.txt
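
Because the versions must match exactly, it can help to verify the environment after installation. The following is a minimal sketch using the standard-library `importlib.metadata`. The helper names `installed_versions` and `find_mismatches` are hypothetical, and tolerating a local build suffix such as `+something` is an assumption about how the DCU wheels may be versioned.

```python
from importlib import metadata
from typing import Dict, Iterable

# Versions pinned in the list above.
EXPECTED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "torchaudio": "0.13.1",
    "deepspeed": "0.9.2",
    "apex": "0.1",
}


def installed_versions(names: Iterable[str]) -> Dict[str, str]:
    """Return the installed version of each package that can be found."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # reported as "not installed" by find_mismatches
    return found


def find_mismatches(expected: Dict[str, str], installed: Dict[str, str]) -> Dict[str, str]:
    """Map each problematic package name to a short description."""
    problems = {}
    for name, want in expected.items():
        have = installed.get(name)
        if have is None:
            problems[name] = "not installed"
        elif have != want and not have.startswith(want + "+"):
            # Assumption: "1.13.1+<local tag>" still counts as 1.13.1.
            problems[name] = f"found {have}, expected {want}"
    return problems


if __name__ == "__main__":
    for name, problem in find_mismatches(EXPECTED, installed_versions(EXPECTED)).items():
        print(f"{name}: {problem}")
```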


## Dataset

Download (training + test sets): https://www.image-net.org/ (ImageNet-1k)

    datasets
        |- train
            |- n01440764
                |- xxx.JPEG
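
A quick way to confirm that the extracted data follows this layout is to count the images in each class folder. The sketch below assumes the `datasets/train/<class id>/<name>.JPEG` structure shown above; the function name `count_images_per_class` is hypothetical.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Union


def count_images_per_class(train_dir: Union[str, Path]) -> Dict[str, int]:
    """Count *.JPEG files in each class folder directly under train_dir."""
    counts: Counter = Counter()
    for image_path in Path(train_dir).glob("*/*.JPEG"):
        # The parent folder name (e.g. n01440764) is the class id.
        counts[image_path.parent.name] += 1
    return dict(counts)
```

For example, `count_images_per_class("datasets/train")` returns a dictionary mapping each class folder name to its image count; an empty dictionary suggests the path or the layout is wrong.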


## Training

### Stage 1

Step 1. Data preparation: this step generates the path lists for the training and validation data.

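    # --img_folder:  folder containing the images
    # --val_size:    size of the validation set
    # --save_folder: folder where the path lists are saved
    python scripts/make_file_list.py \
    --img_folder [hq_dir_path] \
    --val_size [validation_set_size] \
    --save_folder [save_dir_path] \
    --follow_links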
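
For readers who want to see what building such path lists involves, here is a minimal sketch of the general approach: shuffle the image paths, hold out `val_size` of them for validation, and write one path per line. It is not the code of `scripts/make_file_list.py`; the function names, the fixed seed, and the output file names in the usage note are assumptions.

```python
import random
from pathlib import Path
from typing import List, Sequence, Tuple


def split_file_list(paths: Sequence[str], val_size: int, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle paths reproducibly and return (train_paths, val_paths)."""
    if not 0 <= val_size <= len(paths):
        raise ValueError("val_size must be between 0 and the number of paths")
    shuffled = sorted(paths)               # sort first so the result is reproducible
    random.Random(seed).shuffle(shuffled)  # local RNG, leaves global state untouched
    return shuffled[val_size:], shuffled[:val_size]


def write_list(paths: Sequence[str], out_file: Path) -> None:
    """Write one path per line, creating the parent folder if needed."""
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text("".join(path + "\n" for path in paths))
```

A typical use would be to call `split_file_list` on the collected image paths and then `write_list` twice, for example to hypothetical files `train.list` and `val.list` inside the save folder.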

Step 2. Edit the configuration files

* Edit the corresponding YAML config file under `configs/dataset`.
* Edit the `configs/train_swinir.yaml` config file.

Step 3. Train

    python train.py --config [training_config_path]

Note: the model trained in this stage is used for the stage 2 training.

### Stage 2

Step 1. Prepare the model (Stable Diffusion v2.1): https://huggingface.co/stabilityai/stable-diffusion-2-1-base/resolve/main/v2-1_512-ema-pruned.ckpt

Step 2. Initialize the model weights

    # --swinir_weight: model obtained from stage 1 training
    # --output:        path where the initialized weights are saved
    python scripts/make_stage2_init_weight.py \
    --cldm_config configs/model/cldm.yaml \
    --sd_weight [sd_v2.1_ckpt_path] \
    --swinir_weight [swinir_ckpt_path] \
    --output [init_weight_output_path]

Step 3. Edit the configuration file

Edit the `configs/train_cldm.yaml` config file.

Step 4. Train

    python train.py --config [training_config_path]

## Inference

### General Image

Model downloads:
* https://huggingface.co/lxq007/DiffBIR/resolve/main/general_full_v1.ckpt

* https://huggingface.co/lxq007/DiffBIR/resolve/main/general_swinir_v1.ckpt

        python inference.py \
        --input inputs/demo/general \
        --config configs/model/cldm.yaml \
        --ckpt weights/general_full_v1.ckpt \
        --reload_swinir --swinir_ckpt weights/general_swinir_v1.ckpt \
        --steps 50 \
        --sr_scale 4 \
        --color_fix_type wavelet \
        --output results/demo/general \
        --device cuda [--tiled --tile_size 512 --tile_stride 256]

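Note: the arguments in square brackets are optional (omit the brackets when using them). The checkpoints can also be replaced with ones obtained from the training stages.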
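
To illustrate what the optional `--tile_size` and `--tile_stride` values imply, the sketch below computes overlapping sliding-window tiles over an image. It is a generic tiling scheme under stated assumptions, not the logic of `inference.py`: the function names are hypothetical, and whether the script tiles in pixel space or in another space is not specified here.

```python
from typing import List, Tuple


def tile_starts(length: int, tile_size: int, tile_stride: int) -> List[int]:
    """Start offsets of tiles that cover [0, length) along one axis."""
    if length <= tile_size:
        return [0]  # a single tile already covers the whole axis
    starts = list(range(0, length - tile_size + 1, tile_stride))
    if starts[-1] + tile_size < length:
        # Add a final tile flush with the border so nothing is left uncovered.
        starts.append(length - tile_size)
    return starts


def tile_boxes(height: int, width: int, tile_size: int, tile_stride: int) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) boxes, clipped to the image size."""
    return [
        (top, left, min(top + tile_size, height), min(left + tile_size, width))
        for top in tile_starts(height, tile_size, tile_stride)
        for left in tile_starts(width, tile_size, tile_stride)
    ]


# With tile size 512 and stride 256, neighbouring tiles overlap by 256.
print(tile_starts(1024, 512, 256))
print(len(tile_boxes(1024, 768, 512, 256)))
```

A stride smaller than the tile size makes adjacent tiles overlap, which gives a tiled pipeline room to blend results across tile borders.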

### Face Image

Model download:

* https://huggingface.co/lxq007/DiffBIR/resolve/main/face_full_v1.ckpt

For aligned face inputs:

    python inference_face.py \
    --input inputs/demo/face/aligned \
    --sr_scale 1 \
    --output results/demo/face/aligned \
    --has_aligned \
    --device cuda

For unaligned face inputs:

    python inference_face.py \
    --input inputs/demo/face/whole_img \
    --sr_scale 2 \
    --output results/demo/face/whole_img \
    --bg_upsampler DiffBIR \
    --device cuda

## Results

Restored image

![Alt text](images/samples_step-004900_e-000008_b-001203.png)

Low-quality image

![Alt text](images/lq_step-004900_e-000008_b-001203.png)

### Accuracy

N/A

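## Application Scenarios

### Algorithm Category

`Image super-resolution`

### Key Application Industries

`Media, scientific research, education`

## Source Repository and Issue Feedback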


## References

* https://github.com/XPixelGroup/DiffBIR
* https://github.com/XPixelGroup/DiffBIR/issues/55