codeformer

Commit 2136e796 authored Jan 08, 2024 by mashun1
20 changed files
--- a/.gitignore
+++ b/.gitignore
+.vscode
+datasets
+# ignored files
+version.py
+# ignored files with suffix
+*.html
+# *.png
+# *.jpeg
+# *.jpg
+*.pt
+*.gif
+*.pth
+*.dat
+*.zip
+# template
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+*pyc*
+*idea*
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+.hypothesis/
+.pytest_cache/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# pyenv
+.python-version
+# celery beat schedule file
+celerybeat-schedule
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+# project
+results/
+experiments/
+tb_logger/
+run.sh
+*debug*
+*_old*
+# nohup
+nohup*
--- a/Dockerfile
+++ b/Dockerfile
+FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py39-latest
\ No newline at end of file
--- a/LICENSE
+++ b/LICENSE
+S-Lab License 1.0
+Copyright 2022 S-Lab
+Redistribution and use for non-commercial purpose in source and 
+binary forms, with or without modification, are permitted provided 
+that the following conditions are met:
+1. Redistributions of source code must retain the above copyright 
+   notice, this list of conditions and the following disclaimer.
+2. Redistributions in binary form must reproduce the above copyright 
+   notice, this list of conditions and the following disclaimer in 
+   the documentation and/or other materials provided with the 
+   distribution.
+3. Neither the name of the copyright holder nor the names of its 
+   contributors may be used to endorse or promote products derived 
+   from this software without specific prior written permission.
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 
+HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+In the event that redistribution and/or use for commercial purpose in 
+source or binary forms, with or without modification is required, 
+please contact the contributor(s) of the work.
\ No newline at end of file
--- a/README.md
+++ b/README.md
+# CodeFormer
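+## Paper
+**Towards Robust Blind Face Restoration with Codebook Lookup Transformer**
+* https://arxiv.org/abs/2206.11253
+## Model Architecture
+As shown in the figure, the method has two stages, (a) and (b), and the model differs between them. Stage (a) comprises $`I_h`$ (the high-quality image), $`E_h`$ (the high-quality image encoder), $`Z_h`$ (the encoded features), $`C`$ (the codebook, which stores quantized features), $`S`$ (the indices obtained by nearest-neighbor matching), $`Z_c`$ (the quantized version of $`Z_h`$), $`D_H`$ (the decoder, which reconstructs the original high-quality image from the features), and $`I_{rec}`$ (the reconstructed high-quality image). Stage (b) comprises $`I_l`$ (the low-quality image to be restored), $`E_L`$ (the low-quality image encoder, fine-tuned from $`E_h`$), $`Z_l`$ (the encoded features), $`T`$ (the Transformer module, which models global feature relationships and predicts the codebook index for each feature), $`C`$ (the fixed, pretrained codebook), $`\hat{Z}_c`$ (the quantized features), $`D_H`$ (the fixed, pretrained decoder), and $`I_{res}`$ (the restored high-quality image). In addition, the figure shows $`F_e`$ (low-quality image features), $`CFT`$ (controllable feature transformation), and $`F_d`$ (decoder features).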
+![Alt text](readme_images/image-1.png)
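+## Algorithm
+Purpose: the algorithm restores low-quality face images to high-quality face images.
+How it works:
+1. Codebook
+A large amount of prior knowledge is stored in discretized form.
+![Alt text](readme_images/image-2.png)
+2. Transformer
+Under diverse degradations, low-quality image features may drift away from the correct codebook index and be assigned to a nearby cluster, which leads to poor restoration results. Modeling global relationships with a Transformer module remedies this problem.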
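The codebook lookup and the failure mode described above can be illustrated with a toy example. This is a minimal sketch in plain Python, not the repository's implementation: the function names `squared_distance` and `quantize`, the 2-D features, and the three-entry codebook are all made up for illustration.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Nearest-neighbor lookup: map each feature to its closest codebook entry.

    Returns the chosen indices and the corresponding quantized features.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for feature in features:
        idx = min(
            range(len(codebook)),
            key=lambda k: squared_distance(feature, codebook[k]),
        )
        indices.append(idx)
        quantized.append(list(codebook[idx]))
    return indices, quantized


# Hypothetical toy codebook with three 2-D entries.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

# Clean features sit close to their intended entries.
clean = [[0.1, -0.2], [0.9, 1.2], [3.0, 0.4]]
clean_indices, _ = quantize(clean, codebook)

# A degraded version of the second feature drifts toward entry 0,
# so plain nearest-neighbor matching picks the wrong index.
degraded_indices, _ = quantize([[0.4, 0.4]], codebook)

print(clean_indices, degraded_indices)
```

The second call shows why nearest-neighbor matching alone is fragile on degraded inputs: the feature that should map to entry 1 lands on entry 0. Predicting indices from global context, as the Transformer module is described as doing, is one way to avoid relying on local distance alone.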
+## Environment Setup
+### Docker
+    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py39-latest
+    docker run --shm-size 10g --network=host --name=codeformer --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to the project>:/home/ -it <your IMAGE ID> bash
+    pip install -r requirements.txt
+    python basicsr/setup.py develop
+    # Optional: the following is only needed for video enhancement
+    yum install epel-release -y
+    yum localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-7.noarch.rpm -y
+    yum install ffmpeg ffmpeg-devel -y
+### Dockerfile
+    # Run this from the directory that contains the Dockerfile
+    docker build -t <IMAGE_NAME>:<TAG> .
+    # Replace <your IMAGE ID> with the ID of the Docker image obtained above
+    docker run -it --shm-size 10g --network=host --name=codeformer --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined <your IMAGE ID> bash
+    pip install -r requirements.txt
+    python basicsr/setup.py develop
+    # Optional: the following is only needed for video enhancement
+    yum install epel-release -y
+    yum localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-7.noarch.rpm -y
+    yum install ffmpeg ffmpeg-devel -y
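+### Anaconda (Method 3)
+1. The DCU-specific deep learning libraries required by this project can be downloaded from the 光合 (Guanghe) developer community:
+https://developer.hpccube.com/tool/
+    DTK driver: dtk23.04.1
+    python: python3.9
+    torch: 1.13.1
+    torchvision: 0.14.1
+    torchaudio: 0.13.1
+    deepspeed: 0.9.2
+    apex: 0.1
+Tips: the versions of the DTK driver, Python, torch, and the other DCU-related tools listed above must match one another exactly.
+2. Install the remaining, non-DCU-specific libraries from requirements.txt
+    pip install -r requirements.txt
+3. Install basicsr
+    python basicsr/setup.py develop
+4. Install ffmpeg
+    conda install -c conda-forge ffmpeg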
+## Dataset
+Link: https://github.com/NVlabs/ffhq-dataset
+    dataset
+    |—— ffhq_512
+        |—— xxx.png
+Note: the original images are `1024x1024` and must be resized to `512x512`.
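One way to do the 1024x1024 to 512x512 conversion is sketched below. It assumes Pillow is installed and that the source images are PNG files in a flat folder; the helper names `output_path` and `resize_folder`, the bicubic filter, and the example paths are illustrative choices, not something the repository prescribes.

```python
from pathlib import Path


def output_path(src_file: Path, dst_dir: Path) -> Path:
    """Map a source image to a PNG with the same stem inside dst_dir."""
    return dst_dir / (src_file.stem + ".png")


def resize_folder(src_dir: str, dst_dir: str, size: int = 512) -> int:
    """Resize every PNG in src_dir to size x size and save it into dst_dir.

    Returns the number of images written.
    """
    # Pillow is imported lazily so output_path() stays usable without it.
    from PIL import Image

    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for src_file in sorted(src.glob("*.png")):
        with Image.open(src_file) as img:
            # Bicubic resampling is an assumption; any good downsampling filter works.
            img.resize((size, size), Image.BICUBIC).save(output_path(src_file, dst))
        count += 1
    return count


# Example call (hypothetical source path, target layout as shown above):
#   resize_folder("path/to/ffhq_1024", "dataset/ffhq_512")
```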
+## Inference
+### Model Download
+Baidu Netdisk link: https://pan.baidu.com/s/1CuntFQYr9gnkH7zopWdYtQ
+(extraction code: kwai)
+GitHub: https://github.com/sczhou/CodeFormer/releases/tag/v0.1.0
+    weights
+    ├── CodeFormer
+    │   ├── codeformer_colorization.pth
+    │   ├── codeformer_inpainting.pth
+    │   └── codeformer.pth
+    ├── dlib
+    │   ├── mmod_human_face_detector-4cb19393.dat
+    │   ├── shape_predictor_5_face_landmarks-c4b1e980.dat
+    │   └── shape_predictor_68_face_landmarks-fbdc2cb8.dat
+    ├── facelib
+    │   ├── detection_Resnet50_Final.pth
+    │   └── parsing_parsenet.pth
+    ├── README.md
+    └── realesrgan
+        └── RealESRGAN_x2plus.pth
+The models can also be downloaded with the provided script:
+    python scripts/download_pretrained_models.py facelib
+    python scripts/download_pretrained_models.py dlib  # only for the dlib face detector
+    python scripts/download_pretrained_models.py CodeFormer
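+### Face Restoration
+    # Crop and align the faces in the images
+    python scripts/crop_align_face.py -i [input folder] -o [output folder]
+    # Restore the cropped faces
+    python inference_codeformer.py -w 0.5 --has_aligned --input_path [image folder]|[image path]
+Note: `-w` is the fidelity weight, in the range 0 to 1. In general, a smaller `w` tends to produce a higher-quality result, while a larger `w` yields a higher-fidelity result.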
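The effect of `w` can be sketched with the controllable feature transformation (CFT) from the paper, which, as far as I can tell, adjusts the decoder features as F̂_d = F_d + (α ⊙ F_d + β) · w. The snippet below is a scalar toy version of that formula in plain Python. In the real model, α and β are predicted by learned layers from the encoder and decoder features; here they are passed in as given numbers, and the function name is hypothetical.

```python
from typing import List, Sequence


def controllable_feature_transform(
    f_d: Sequence[float],
    alpha: Sequence[float],
    beta: Sequence[float],
    w: float,
) -> List[float]:
    """Toy CFT: blend decoder features with an encoder-driven correction.

    w = 0 keeps the decoder features untouched (quality-oriented);
    w = 1 applies the full correction (fidelity-oriented).
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return [d + w * (a * d + b) for d, a, b in zip(f_d, alpha, beta)]


# Made-up decoder features and affine parameters.
f_d = [1.0, 2.0]
alpha = [0.5, -0.5]
beta = [0.2, 0.0]

for w in (0.0, 0.5, 1.0):
    print(w, controllable_feature_transform(f_d, alpha, beta, w))
```

With `w = 0` the output equals `f_d`, so the result depends only on the codebook path. Increasing `w` lets more information from the low-quality input through, which is consistent with the note that larger `w` favors fidelity.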
+### Whole Image Enhancement
+    # For whole image
+    # Add '--bg_upsampler realesrgan' to enhance the background regions with Real-ESRGAN
+    # Add '--face_upsample' to further upsample the restored face with Real-ESRGAN
+    python inference_codeformer.py -w 0.7 --input_path [image folder]|[image path]
+### Video Enhancement
+    # For video clips
+    # Video path should end with '.mp4'|'.mov'|'.avi'
+    python inference_codeformer.py --bg_upsampler realesrgan --face_upsample -w 1.0 --input_path [video path]
+### Face Colorization
+    # For cropped and aligned faces (512x512)
+    # Colorize black and white or faded photo
+    python inference_colorization.py --input_path [image folder]|[image path]
+### Face Inpainting
+    # For cropped and aligned faces (512x512)
+    # Inputs could be masked by white brush using an image editing app (e.g., Photoshop) 
+    # (check out the examples in inputs/masked_faces)
+    python inference_inpainting.py --input_path [image folder]|[image path]
+## Training
+### Stage 1 - Train the VQGAN
+    python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/VQGAN_512_ds32_nearest_stage1.yml --launcher pytorch
+Generate the codebook index sequences of the training data in advance; this speeds up the subsequent training:
+    python scripts/generate_latent_gt.py
+### Stage 2 - Train the Transformer (w = 0)
+    python -m torch.distributed.launch --nproc_per_node=8 --master_port=4322 basicsr/train.py -opt options/CodeFormer_stage2.yml --launcher pytorch
+### Stage 3 - Train the controllable feature transformation module (w = 1)
+    python -m torch.distributed.launch --nproc_per_node=8 --master_port=4323 basicsr/train.py -opt options/CodeFormer_stage3.yml --launcher pytorch
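The training stages run in a fixed order, with the latent-code generation step between stage 1 and stage 2. The sketch below only assembles the commands listed above into an ordered plan and prints them. The names `STAGES`, `launch_command`, and `training_plan` are hypothetical, and the sketch does not touch the YAML configs, which may need to point at checkpoints from the previous stage.

```python
import shlex
from typing import List

# (stage name, master port, config file), copied from the commands above.
STAGES = [
    ("stage1_vqgan", 4321, "options/VQGAN_512_ds32_nearest_stage1.yml"),
    ("stage2_transformer", 4322, "options/CodeFormer_stage2.yml"),
    ("stage3_cft", 4323, "options/CodeFormer_stage3.yml"),
]


def launch_command(port: int, config: str, nproc: int = 8) -> List[str]:
    """Build the distributed launch command for one training stage."""
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc}",
        f"--master_port={port}",
        "basicsr/train.py", "-opt", config,
        "--launcher", "pytorch",
    ]


def training_plan(nproc: int = 8) -> List[List[str]]:
    """Return all commands in execution order, including latent-code generation."""
    plan: List[List[str]] = []
    for name, port, config in STAGES:
        plan.append(launch_command(port, config, nproc))
        if name == "stage1_vqgan":
            # Precompute codebook indices once the VQGAN is trained.
            plan.append(["python", "scripts/generate_latent_gt.py"])
    return plan


for command in training_plan():
    print(shlex.join(command))
```

To actually run the plan, each command list could be passed to `subprocess.run(command, check=True)`, so that a failing stage stops the sequence.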
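+## Results
+![Alt text](readme_images/image-3.png)
+### Accuracy
+None
+## Application Scenarios
+### Algorithm Category
+`Image super-resolution`
+### Key Application Industries
+`Media, scientific research, education`
+## Source Repository and Issue Reporting
+* https://developer.hpccube.com/codes/aibear/codeformer_pytorch
+## References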
+* https://github.com/guoyww/AnimateDiff/tree/main
+* https://github.com/NVlabs/ffhq-dataset
--- a/README_official.md
+++ b/README_official.md
+<p align="center">
+  <img src="assets/CodeFormer_logo.png" height=110>
+</p>
+## Towards Robust Blind Face Restoration with Codebook Lookup Transformer (NeurIPS 2022)
+[Paper](https://arxiv.org/abs/2206.11253) | [Project Page](https://shangchenzhou.com/projects/CodeFormer/) | [Video](https://youtu.be/d3VDpkXlueI)
+<a href="https://colab.research.google.com/drive/1m52PNveE4PBhYrecj34cnpEeiHcC5LTb?usp=sharing"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="google colab logo"></a> [![Hugging Face](https://img.shields.io/badge/Demo-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/sczhou/CodeFormer) [![Replicate](https://img.shields.io/badge/Demo-%F0%9F%9A%80%20Replicate-blue)](https://replicate.com/sczhou/codeformer) [![OpenXLab](https://img.shields.io/badge/Demo-%F0%9F%90%BC%20OpenXLab-blue)](https://openxlab.org.cn/apps/detail/ShangchenZhou/CodeFormer) ![Visitors](https://api.infinitescript.com/badgen/count?name=sczhou/CodeFormer&ltext=Visitors)
+[Shangchen Zhou](https://shangchenzhou.com/), [Kelvin C.K. Chan](https://ckkelvinchan.github.io/), [Chongyi Li](https://li-chongyi.github.io/), [Chen Change Loy](https://www.mmlab-ntu.com/person/ccloy/) 
+S-Lab, Nanyang Technological University
+<img src="assets/network.jpg" width="800px"/>
+:star: If CodeFormer is helpful to your images or projects, please help star this repo. Thanks! :hugs: 
+### Update
+- **2023.07.20**: Integrated to :panda_face: [OpenXLab](https://openxlab.org.cn/apps). Try out online demo! [![OpenXLab](https://img.shields.io/badge/Demo-%F0%9F%90%BC%20OpenXLab-blue)](https://openxlab.org.cn/apps/detail/ShangchenZhou/CodeFormer)
+- **2023.04.19**: :whale: Training codes and config files are publicly available now.
+- **2023.04.09**: Add features of inpainting and colorization for cropped and aligned face images.
+- **2023.02.10**: Include `dlib` as a new face detector option, it produces more accurate face identity.
+- **2022.10.05**: Support video input `--input_path [YOUR_VIDEO.mp4]`. Try it to enhance your videos! :clapper: 
+- **2022.09.14**: Integrated to :hugs: [Hugging Face](https://huggingface.co/spaces). Try out online demo! [![Hugging Face](https://img.shields.io/badge/Demo-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/sczhou/CodeFormer)
+- **2022.09.09**: Integrated to :rocket: [Replicate](https://replicate.com/explore). Try out online demo! [![Replicate](https://img.shields.io/badge/Demo-%F0%9F%9A%80%20Replicate-blue)](https://replicate.com/sczhou/codeformer)
+- [**More**](docs/history_changelog.md)
+### TODO
+- [x] Add training code and config files
+- [x] Add checkpoint and script for face inpainting
+- [x] Add checkpoint and script for face colorization
+- [x] ~~Add background image enhancement~~
+#### :panda_face: Try Enhancing Old Photos / Fixing AI-arts
+[<img src="assets/imgsli_1.jpg" height="226px"/>](https://imgsli.com/MTI3NTE2) [<img src="assets/imgsli_2.jpg" height="226px"/>](https://imgsli.com/MTI3NTE1) [<img src="assets/imgsli_3.jpg" height="226px"/>](https://imgsli.com/MTI3NTIw) 
+#### Face Restoration
+<img src="assets/restoration_result1.png" width="400px"/> <img src="assets/restoration_result2.png" width="400px"/>
+<img src="assets/restoration_result3.png" width="400px"/> <img src="assets/restoration_result4.png" width="400px"/>
+#### Face Color Enhancement and Restoration
+<img src="assets/color_enhancement_result1.png" width="400px"/> <img src="assets/color_enhancement_result2.png" width="400px"/>
+#### Face Inpainting
+<img src="assets/inpainting_result1.png" width="400px"/> <img src="assets/inpainting_result2.png" width="400px"/>
+### Dependencies and Installation
+- Pytorch >= 1.7.1
+- CUDA >= 10.1
+- Other required packages in `requirements.txt`
+```
+# git clone this repository
+git clone https://github.com/sczhou/CodeFormer
+cd CodeFormer
+# create new anaconda env
+conda create -n codeformer python=3.8 -y
+conda activate codeformer
+# install python dependencies
+pip3 install -r requirements.txt
+python basicsr/setup.py develop
+conda install -c conda-forge dlib  # only for face detection or cropping with dlib
+```
+<!-- conda install -c conda-forge dlib -->
+### Quick Inference
+#### Download Pre-trained Models:
+Download the facelib and dlib pretrained models from [[Releases](https://github.com/sczhou/CodeFormer/releases/tag/v0.1.0) | [Google Drive](https://drive.google.com/drive/folders/1b_3qwrzY_kTQh0-SnBoGBgOrJ_PLZSKm?usp=sharing) | [OneDrive](https://entuedu-my.sharepoint.com/:f:/g/personal/s200094_e_ntu_edu_sg/EvDxR7FcAbZMp_MA9ouq7aQB8XTppMb3-T0uGZ_2anI2mg?e=DXsJFo)] to the `weights/facelib` folder. You can manually download the pretrained models OR download by running the following command:
+```
+python scripts/download_pretrained_models.py facelib
+python scripts/download_pretrained_models.py dlib  # only for the dlib face detector
+```
+Download the CodeFormer pretrained models from [[Releases](https://github.com/sczhou/CodeFormer/releases/tag/v0.1.0) | [Google Drive](https://drive.google.com/drive/folders/1CNNByjHDFt0b95q54yMVp6Ifo5iuU6QS?usp=sharing) | [OneDrive](https://entuedu-my.sharepoint.com/:f:/g/personal/s200094_e_ntu_edu_sg/EoKFj4wo8cdIn2-TY2IV6CYBhZ0pIG4kUOeHdPR_A5nlbg?e=AO8UN9)] to the `weights/CodeFormer` folder. You can manually download the pretrained models OR download by running the following command:
+```
+python scripts/download_pretrained_models.py CodeFormer
+```
+#### Prepare Testing Data:
+You can put the testing images in the `inputs/TestWhole` folder. If you would like to test on cropped and aligned faces, you can put them in the `inputs/cropped_faces` folder. You can get the cropped and aligned faces by running the following command:
+```
+# you may need to install dlib via: conda install -c conda-forge dlib
+python scripts/crop_align_face.py -i [input folder] -o [output folder]
+```
+#### Testing:
+[Note] If you want to compare CodeFormer in your paper, please run the following command indicating `--has_aligned` (for cropped and aligned face), as the command for the whole image will involve a process of face-background fusion that may damage hair texture on the boundary, which leads to unfair comparison.
+Fidelity weight *w* lies in [0, 1]. Generally, a smaller *w* tends to produce a higher-quality result, while a larger *w* yields a higher-fidelity result. The results will be saved in the `results` folder.
+🧑🏻 Face Restoration (cropped and aligned face)
+```
+# For cropped and aligned faces (512x512)
+python inference_codeformer.py -w 0.5 --has_aligned --input_path [image folder]|[image path]
+```
+:framed_picture: Whole Image Enhancement
+```
+# For whole image
+# Add '--bg_upsampler realesrgan' to enhance the background regions with Real-ESRGAN
+# Add '--face_upsample' to further upsample the restored face with Real-ESRGAN
+python inference_codeformer.py -w 0.7 --input_path [image folder]|[image path]
+```
+:clapper: Video Enhancement
+```
+# For Windows/Mac users, please install ffmpeg first
+conda install -c conda-forge ffmpeg
+```
+```
+# For video clips
+# Video path should end with '.mp4'|'.mov'|'.avi'
+python inference_codeformer.py --bg_upsampler realesrgan --face_upsample -w 1.0 --input_path [video path]
+```
+🌈 Face Colorization (cropped and aligned face)
+```
+# For cropped and aligned faces (512x512)
+# Colorize black and white or faded photo
+python inference_colorization.py --input_path [image folder]|[image path]
+```
+🎨 Face Inpainting (cropped and aligned face)
+```
+# For cropped and aligned faces (512x512)
+# Inputs could be masked by white brush using an image editing app (e.g., Photoshop) 
+# (check out the examples in inputs/masked_faces)
+python inference_inpainting.py --input_path [image folder]|[image path]
+```
+### Training:
+The training commands can be found in the documents: [English](docs/train.md) **|** [简体中文](docs/train_CN.md).
+### Citation
+If our work is useful for your research, please consider citing:
+    @inproceedings{zhou2022codeformer,
+        author = {Zhou, Shangchen and Chan, Kelvin C.K. and Li, Chongyi and Loy, Chen Change},
+        title = {Towards Robust Blind Face Restoration with Codebook Lookup TransFormer},
+        booktitle = {NeurIPS},
+        year = {2022}
+    }
+### License
+This project is licensed under <a rel="license" href="https://github.com/sczhou/CodeFormer/blob/master/LICENSE">NTU S-Lab License 1.0</a>. Redistribution and use should follow this license.
+### Acknowledgement
+This project is based on [BasicSR](https://github.com/XPixelGroup/BasicSR). Some code is borrowed from [Unleashing Transformers](https://github.com/samb-t/unleashing-transformers), [YOLOv5-face](https://github.com/deepcam-cn/yolov5-face), and [FaceXLib](https://github.com/xinntao/facexlib). We also adopt [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) to support background image enhancement. Thanks for their awesome work.
+### Contact
+If you have any questions, please feel free to reach out to me at `shangchenzhou@gmail.com`.
--- a/assets/CodeFormer_logo.png
+++ b/assets/CodeFormer_logo.png
--- a/assets/color_enhancement_result1.png
+++ b/assets/color_enhancement_result1.png
--- a/assets/color_enhancement_result2.png
+++ b/assets/color_enhancement_result2.png
--- a/assets/imgsli_1.jpg
+++ b/assets/imgsli_1.jpg
--- a/assets/imgsli_2.jpg
+++ b/assets/imgsli_2.jpg
--- a/assets/imgsli_3.jpg
+++ b/assets/imgsli_3.jpg
--- a/assets/inpainting_result1.png
+++ b/assets/inpainting_result1.png
--- a/assets/inpainting_result2.png
+++ b/assets/inpainting_result2.png
--- a/assets/network.jpg
+++ b/assets/network.jpg
--- a/assets/restoration_result1.png
+++ b/assets/restoration_result1.png
--- a/assets/restoration_result2.png
+++ b/assets/restoration_result2.png
--- a/assets/restoration_result3.png
+++ b/assets/restoration_result3.png
--- a/assets/restoration_result4.png
+++ b/assets/restoration_result4.png
--- a/basicsr/VERSION
+++ b/basicsr/VERSION
+1.3.2
--- a/basicsr/__init__.py
+++ b/basicsr/__init__.py
+# https://github.com/xinntao/BasicSR
+# flake8: noqa
+from .archs import *
+from .data import *
+from .losses import *
+from .metrics import *
+from .models import *
+from .ops import *
+from .train import *
+from .utils import *
+from .version import __gitsha__, __version__