Commit a8ada82f authored by chenych

First commit

parent 537691da
*.mdb
*.tar
*.ipynb
*.zip
*.eps
*.pdf
### Linux ###
*~
# temporary files which can be created if a process still has a handle open of a deleted file
.fuse_hidden*
# KDE directory preferences
.directory
# Linux trash folder which might appear on any partition or disk
.Trash-*
# .nfs files are created when an open file is removed but is still being accessed
.nfs*
### OSX ###
# General
.DS_Store
.AppleDouble
.LSOverride
# Icon must end with two \r
Icon
# Thumbnails
._*
# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk
### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
.python-version
# celery beat schedule file
celerybeat-schedule
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
### Python Patch ###
.venv/
### Python.VirtualEnv Stack ###
# Virtualenv
# http://iamzed.com/2009/05/07/a-primer-on-virtualenv/
[Bb]in
[Ii]nclude
[Ll]ib64
[Ll]ocal
[Ss]cripts
pyvenv.cfg
pip-selfcheck.json
### Windows ###
# Windows thumbnail cache files
Thumbs.db
ehthumbs.db
ehthumbs_vista.db
# Dump file
*.stackdump
# Folder config file
[Dd]esktop.ini
# Recycle Bin used on file shares
$RECYCLE.BIN/
# Windows Installer files
*.cab
*.msi
*.msix
*.msm
*.msp
# Windows shortcuts
*.lnk
.idea/
.vscode/
output/
exp/
# data/
*.pyc
*.mp4
*.zip
# MaskedDenoising_pytorch
## Paper
[Masked Image Training for Generalizable Deep Image Denoising](https://arxiv.org/abs/2303.13132)
## Model Architecture
The paper makes only small modifications to the model: it builds on the SwinIR architecture and adds an input mask and attention masks.
<div align=center>
<img src="./doc/method.jpg"/>
</div>
## Algorithm
Conventional denoising models work by recognizing the noise itself rather than genuinely understanding the image content. After feature extraction, this model randomly masks a large fraction of the input image (the input mask), e.g., 75%-85% of the pixels, forcing the network to reconstruct the masked content and strengthening its ability to model the distribution of the image itself. Masked training therefore teaches the model to understand and reconstruct image content instead of relying solely on noise characteristics, which yields better generalization.
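As a rough illustration of the input-mask step (a minimal sketch; the actual implementation lives in utils/utils_mask.py and may differ), a per-pixel binary mask with a ratio sampled from the configured range zeroes out most positions:
```python
import torch

def random_input_mask(img, ratio_min=0.75, ratio_max=0.85):
    """Zero out a randomly chosen fraction of pixel positions.

    img: (C, H, W) tensor in [0, 1]. Returns (masked image, keep mask).
    Illustrative sketch only; see utils/utils_mask.py for the real logic.
    """
    ratio = float(torch.empty(1).uniform_(ratio_min, ratio_max))
    keep = (torch.rand(1, img.shape[1], img.shape[2]) >= ratio).float()
    return img * keep, keep

# mask 75%-85% of the pixels of a random 3x64x64 patch
img = torch.rand(3, 64, 64)
masked, keep = random_input_mask(img)
```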
## Environment Setup
### Docker (Method 1)
Adjust the -v paths, docker_name, and imageID to your setup.
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py38-latest
docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
cd /your_code_path/maskeddenoising_pytorch
pip install -r requirement.txt
```
### Dockerfile (Method 2)
Adjust the -v paths, docker_name, and imageID to your setup.
```
cd ./docker
cp ../requirement.txt requirement.txt
docker build --no-cache -t maskeddenoising:latest .
docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
cd /your_code_path/maskeddenoising_pytorch
pip install -r requirement.txt
```
### Anaconda (Method 3)
1. The DCU-specific deep learning libraries required by this project can be downloaded from the 光合 (Hygon) developer community: https://developer.hpccube.com/tool/
```
DTK software stack: dtk23.04.1
python: python3.8
torch: 1.13.1
torchvision: 0.14.1
```
Tip: the DTK stack, python, torch, and other DCU-related tool versions above must match each other exactly.
2. Install the remaining, non-DCU-specific packages from requirement.txt:
```
pip install -r requirement.txt
```
## Datasets
Place the training data (Train400 / DIV2K / Flickr2K) in the trainset folder.
Train400: https://github.com/cszn/DnCNN/tree/master/TrainingCodes/DnCNN_TrainingCodes_v1.0/data
DIV2K official site: https://data.vision.ee.ethz.ch/cvl/DIV2K/
[Train Data (HR images)](http://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_train_HR.zip)
If the official link is unavailable, DIV2K_train_HR.zip can also be downloaded from the public DIV2K dataset on AiStudio: https://aistudio.baidu.com/aistudio/datasetdetail/104667
Flickr2K: https://cv.snu.ac.kr/research/EDSR/Flickr2K.tar
Once the data is organized under the trainset folder, run gen_data.py to produce the preprocessed image folder trainsets/trainH:
```
python gen_data.py
```
Test datasets (already bundled with the project): BSD68, CBSD68, Kodak24, McMaster
https://github.com/cszn/FFDNet/tree/master/testsets
The dataset directory layout is as follows:
```
├── trainset
│ ├── DIV2K
│ ├── Flickr2K
│ └── Train400
├── trainsets
│ └── trainH
├── testsets
│ ├── BSD68
│ ├── CBSD68
│ ├── Kodak24
│ └── McMaster
```
## Training
Edit the config file options/masked_denoising/input_mask_80_90.json to match your actual training setup. The main parameters are:
- "gpu_ids": [0,1,2,3] (GPU indices to train on)
- "dataroot_H": "trainsets/trainH" (training data path)
- input mask: set "if_mask" and "mask1", "mask2" (lines 32-34); the masking ratio is sampled uniformly between mask1 and mask2. See the config sketch below.
- attention mask: set "use_mask" and "mask_ratio1", "mask_ratio2" (lines 68-70); the attention mask ratio can be a range or a fixed value.
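For orientation, the relevant fragment of the JSON config might look like the sketch below. Only the key names come from the list above; the nesting and the example values (mask1/mask2 are inferred from the config file name) are assumptions, so check the actual options file:
```json
{
  "gpu_ids": [0, 1, 2, 3],
  "datasets": {
    "train": { "dataroot_H": "trainsets/trainH", "if_mask": true, "mask1": 0.8, "mask2": 0.9 }
  },
  "netG": { "use_mask": true, "mask_ratio1": 0.4, "mask_ratio2": 0.4 }
}
```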
### Single Node, Multiple GPUs
#### Standard training
```
bash train.sh
```
#### Distributed training
```
bash train_multi.sh
```
## Inference
To use your own model, set:
- --model_path: path to the trained model weights
- --opt: the JSON config file the model was trained with
- --name: results are saved under results/{name}
#### Single-GPU inference
```
bash test.sh
```
## Results
A single-image result on the local test sets:
<div align=center>
<img src="./doc/origin.png"/>
</div>
<div align=center>
<img src="./doc/results.png"/>
</div>
### Accuracy
Single-GPU test results on the test data provided with the project:

| Model | PSNR | SSIM | LPIPS |
| :------: | :------: | :------: | :------: |
| ours | 29.04 | 0.7615 | 0.1294 |
| paper | 30.13 | 0.7981 | 0.1031 |
## Application Scenarios
### Algorithm Category
Image denoising
### Key Application Industries
Education, transportation, public security
## Source Repository & Issue Reporting
http://developer.hpccube.com/codes/modelzoo/maskeddenoising_pytorch.git
## References
https://github.com/haoyuc/MaskedDenoising.git
import random
import numpy as np
import torch.utils.data as data
import utils.utils_image as util
import os
from utils import utils_blindsr as blindsr
class DatasetBlindSR(data.Dataset):
'''
# -----------------------------------------
# dataset for BSRGAN
# -----------------------------------------
'''
def __init__(self, opt):
super(DatasetBlindSR, self).__init__()
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.sf = opt['scale'] if opt['scale'] else 4
self.shuffle_prob = opt['shuffle_prob'] if opt['shuffle_prob'] else 0.1
self.use_sharp = opt['use_sharp'] if opt['use_sharp'] else False
self.degradation_type = opt['degradation_type'] if opt['degradation_type'] else 'bsrgan'
self.lq_patchsize = self.opt['lq_patchsize'] if self.opt['lq_patchsize'] else 64
self.patch_size = self.opt['H_size'] if self.opt['H_size'] else self.lq_patchsize*self.sf
self.paths_H = util.get_image_paths(opt['dataroot_H'])
print(len(self.paths_H))
# for n, v in enumerate(self.paths_H):
# if 'face' in v:
# del self.paths_H[n]
# time.sleep(1)
assert self.paths_H, 'Error: H path is empty.'
self.if_mask = True if opt['if_mask'] else False
def __getitem__(self, index):
L_path = None
# ------------------------------------
# get H image
# ------------------------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
img_name, ext = os.path.splitext(os.path.basename(H_path))
H, W, C = img_H.shape
if H < self.patch_size or W < self.patch_size:
img_H = np.tile(np.random.randint(0, 256, size=[1, 1, self.n_channels], dtype=np.uint8), (self.patch_size, self.patch_size, 1))
# ------------------------------------
# if train, get L/H patch pair
# ------------------------------------
if self.opt['phase'] == 'train':
H, W, C = img_H.shape
rnd_h_H = random.randint(0, max(0, H - self.patch_size))
rnd_w_H = random.randint(0, max(0, W - self.patch_size))
img_H = img_H[rnd_h_H:rnd_h_H + self.patch_size, rnd_w_H:rnd_w_H + self.patch_size, :]
if 'face' in img_name:
mode = random.choice([0, 4])
img_H = util.augment_img(img_H, mode=mode)
else:
mode = random.randint(0, 7)
img_H = util.augment_img(img_H, mode=mode)
img_H = util.uint2single(img_H)
if self.degradation_type == 'bsrgan':
img_L, img_H = blindsr.degradation_bsrgan(img_H, self.sf, lq_patchsize=self.lq_patchsize, isp_model=None)
else:
img_H = util.uint2single(img_H)
if self.degradation_type == 'bsrgan':
img_L, img_H = blindsr.degradation_bsrgan(img_H, self.sf, lq_patchsize=self.lq_patchsize, isp_model=None)
# ------------------------------------
# L/H pairs, HWC to CHW, numpy to tensor
# ------------------------------------
img_H, img_L = util.single2tensor3(img_H), util.single2tensor3(img_L)
if L_path is None:
L_path = H_path
return {'L': img_L, 'H': img_H, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.paths_H)
import os.path
import random
import numpy as np
import torch
import torch.utils.data as data
import utils.utils_image as util
class DatasetDnCNN(data.Dataset):
"""
# -----------------------------------------
# Get L/H for denoising on AWGN with fixed sigma.
# Only dataroot_H is needed.
# -----------------------------------------
# e.g., DnCNN
# -----------------------------------------
"""
def __init__(self, opt):
super(DatasetDnCNN, self).__init__()
print('Dataset: Denoising on AWGN with fixed sigma. Only dataroot_H is needed.')
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.patch_size = opt['H_size'] if opt['H_size'] else 64
self.sigma = opt['sigma'] if opt['sigma'] else 25
self.sigma_test = opt['sigma_test'] if opt['sigma_test'] else self.sigma
# ------------------------------------
# get path of H
# return None if input is None
# ------------------------------------
self.paths_H = util.get_image_paths(opt['dataroot_H'])
def __getitem__(self, index):
# ------------------------------------
# get H image
# ------------------------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
L_path = H_path
if self.opt['phase'] == 'train':
"""
# --------------------------------
# get L/H patch pairs
# --------------------------------
"""
H, W, _ = img_H.shape
# --------------------------------
# randomly crop the patch
# --------------------------------
rnd_h = random.randint(0, max(0, H - self.patch_size))
rnd_w = random.randint(0, max(0, W - self.patch_size))
patch_H = img_H[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size, :]
# --------------------------------
# augmentation - flip, rotate
# --------------------------------
mode = random.randint(0, 7)
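# mode in [0, 7] selects one of the 8 flip/rotation combinations in util.augment_img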
patch_H = util.augment_img(patch_H, mode=mode)
# --------------------------------
# HWC to CHW, numpy(uint) to tensor
# --------------------------------
img_H = util.uint2tensor3(patch_H)
img_L = img_H.clone()
# --------------------------------
# add noise
# --------------------------------
noise = torch.randn(img_L.size()).mul_(self.sigma/255.0)
img_L.add_(noise)
else:
"""
# --------------------------------
# get L/H image pairs
# --------------------------------
"""
img_H = util.uint2single(img_H)
img_L = np.copy(img_H)
# --------------------------------
# add noise
# --------------------------------
np.random.seed(seed=0)
img_L += np.random.normal(0, self.sigma_test/255.0, img_L.shape)
# --------------------------------
# HWC to CHW, numpy to tensor
# --------------------------------
img_L = util.single2tensor3(img_L)
img_H = util.single2tensor3(img_H)
return {'L': img_L, 'H': img_H, 'H_path': H_path, 'L_path': L_path}
def __len__(self):
return len(self.paths_H)
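# Editor's sketch (not part of the original file): minimal usage of
# DatasetDnCNN; the opt keys mirror those read in __init__ above and the
# data path is a placeholder.
if __name__ == '__main__':
    opt = {'phase': 'train',                  # random crop + augment + AWGN
           'n_channels': 1,                   # grayscale
           'H_size': 64,                      # patch size
           'sigma': 25,                       # noise level, scaled by 1/255
           'sigma_test': 25,
           'dataroot_H': 'trainsets/trainH'}  # placeholder path
    sample = DatasetDnCNN(opt)[0]             # {'L': noisy, 'H': clean, ...}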
import random
import numpy as np
import torch
import torch.utils.data as data
import utils.utils_image as util
class DatasetDnPatch(data.Dataset):
"""
# -----------------------------------------
# Get L/H for denoising on AWGN with fixed sigma.
# ****Get all H patches first****
# Only dataroot_H is needed.
# -----------------------------------------
# e.g., DnCNN with BSD400
# -----------------------------------------
"""
def __init__(self, opt):
super(DatasetDnPatch, self).__init__()
print('Get L/H for denoising on AWGN with fixed sigma. Only dataroot_H is needed.')
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.patch_size = opt['H_size'] if opt['H_size'] else 64
self.sigma = opt['sigma'] if opt['sigma'] else 25
self.sigma_test = opt['sigma_test'] if opt['sigma_test'] else self.sigma
self.num_patches_per_image = opt['num_patches_per_image'] if opt['num_patches_per_image'] else 40
self.num_sampled = opt['num_sampled'] if opt['num_sampled'] else 3000
# ------------------------------------
# get paths of H
# ------------------------------------
self.paths_H = util.get_image_paths(opt['dataroot_H'])
assert self.paths_H, 'Error: H path is empty.'
# ------------------------------------
# number of sampled H images
# ------------------------------------
self.num_sampled = min(self.num_sampled, len(self.paths_H))
# ------------------------------------
# reserve space with zeros
# ------------------------------------
self.total_patches = self.num_sampled * self.num_patches_per_image
self.H_data = np.zeros([self.total_patches, self.patch_size, self.patch_size, self.n_channels], dtype=np.uint8)
# ------------------------------------
# update H patches
# ------------------------------------
self.update_data()
def update_data(self):
"""
# ------------------------------------
# update whole H patches
# ------------------------------------
"""
self.index_sampled = random.sample(range(0, len(self.paths_H), 1), self.num_sampled)
n_count = 0
for i in range(len(self.index_sampled)):
H_patches = self.get_patches(self.index_sampled[i])
for H_patch in H_patches:
self.H_data[n_count,:,:,:] = H_patch
n_count += 1
print('Training data updated! Total number of patches: %d\n' % len(self.H_data))
def get_patches(self, index):
"""
# ------------------------------------
# get H patches from an H image
# ------------------------------------
"""
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels) # uint format
H, W = img_H.shape[:2]
H_patches = []
num = self.num_patches_per_image
for _ in range(num):
rnd_h = random.randint(0, max(0, H - self.patch_size))
rnd_w = random.randint(0, max(0, W - self.patch_size))
H_patch = img_H[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size, :]
H_patches.append(H_patch)
return H_patches
def __getitem__(self, index):
H_path = 'toy.png'
if self.opt['phase'] == 'train':
patch_H = self.H_data[index]
# --------------------------------
# augmentation - flip and/or rotate
# --------------------------------
mode = random.randint(0, 7)
patch_H = util.augment_img(patch_H, mode=mode)
patch_H = util.uint2tensor3(patch_H)
patch_L = patch_H.clone()
# ------------------------------------
# add noise
# ------------------------------------
noise = torch.randn(patch_L.size()).mul_(self.sigma/255.0)
patch_L.add_(noise)
else:
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
img_H = util.uint2single(img_H)
img_L = np.copy(img_H)
# ------------------------------------
# add noise
# ------------------------------------
np.random.seed(seed=0)
img_L += np.random.normal(0, self.sigma_test/255.0, img_L.shape)
patch_L, patch_H = util.single2tensor3(img_L), util.single2tensor3(img_H)
L_path = H_path
return {'L': patch_L, 'H': patch_H, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.H_data)
import random
import numpy as np
import torch
import torch.utils.data as data
import utils.utils_image as util
class DatasetDPSR(data.Dataset):
'''
# -----------------------------------------
# Get L/H/M for noisy image SR.
# Only "paths_H" is needed, sythesize bicubicly downsampled L on-the-fly.
# -----------------------------------------
# e.g., SRResNet super-resolver prior for DPSR
# -----------------------------------------
'''
def __init__(self, opt):
super(DatasetDPSR, self).__init__()
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.sf = opt['scale'] if opt['scale'] else 4
self.patch_size = self.opt['H_size'] if self.opt['H_size'] else 96
self.L_size = self.patch_size // self.sf
self.sigma = opt['sigma'] if opt['sigma'] else [0, 50]
self.sigma_min, self.sigma_max = self.sigma[0], self.sigma[1]
self.sigma_test = opt['sigma_test'] if opt['sigma_test'] else 0
# ------------------------------------
# get paths of L/H
# ------------------------------------
self.paths_H = util.get_image_paths(opt['dataroot_H'])
self.paths_L = util.get_image_paths(opt['dataroot_L'])
assert self.paths_H, 'Error: H path is empty.'
def __getitem__(self, index):
# ------------------------------------
# get H image
# ------------------------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
img_H = util.uint2single(img_H)
# ------------------------------------
# modcrop for SR
# ------------------------------------
img_H = util.modcrop(img_H, self.sf)
# ------------------------------------
# synthesize L image via matlab's bicubic
# ------------------------------------
H, W, _ = img_H.shape
img_L = util.imresize_np(img_H, 1 / self.sf, True)
if self.opt['phase'] == 'train':
"""
# --------------------------------
# get L/H patch pairs
# --------------------------------
"""
H, W, C = img_L.shape
# --------------------------------
# randomly crop L patch
# --------------------------------
rnd_h = random.randint(0, max(0, H - self.L_size))
rnd_w = random.randint(0, max(0, W - self.L_size))
img_L = img_L[rnd_h:rnd_h + self.L_size, rnd_w:rnd_w + self.L_size, :]
# --------------------------------
# crop corresponding H patch
# --------------------------------
rnd_h_H, rnd_w_H = int(rnd_h * self.sf), int(rnd_w * self.sf)
img_H = img_H[rnd_h_H:rnd_h_H + self.patch_size, rnd_w_H:rnd_w_H + self.patch_size, :]
# --------------------------------
# augmentation - flip and/or rotate
# --------------------------------
mode = random.randint(0, 7)
img_L, img_H = util.augment_img(img_L, mode=mode), util.augment_img(img_H, mode=mode)
# --------------------------------
# get patch pairs
# --------------------------------
img_H, img_L = util.single2tensor3(img_H), util.single2tensor3(img_L)
# --------------------------------
# select noise level and get Gaussian noise
# --------------------------------
if random.random() < 0.1:
noise_level = torch.zeros(1).float()
else:
noise_level = torch.FloatTensor([np.random.uniform(self.sigma_min, self.sigma_max)])/255.0
# noise_level = torch.rand(1)*50/255.0
# noise_level = torch.min(torch.from_numpy(np.float32([7*np.random.chisquare(2.5)/255.0])),torch.Tensor([50./255.]))
else:
img_H, img_L = util.single2tensor3(img_H), util.single2tensor3(img_L)
noise_level = torch.FloatTensor([self.sigma_test])
# ------------------------------------
# add noise
# ------------------------------------
noise = torch.randn(img_L.size()).mul_(noise_level).float()
img_L.add_(noise)
# ------------------------------------
# get noise level map M
# ------------------------------------
M_vector = noise_level.unsqueeze(1).unsqueeze(1)
M = M_vector.repeat(1, img_L.size()[-2], img_L.size()[-1])
"""
# -------------------------------------
# concat L and noise level map M
# -------------------------------------
"""
img_L = torch.cat((img_L, M), 0)
L_path = H_path
return {'L': img_L, 'H': img_H, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.paths_H)
import random
import numpy as np
import torch
import torch.utils.data as data
import utils.utils_image as util
class DatasetFDnCNN(data.Dataset):
"""
# -----------------------------------------
# Get L/H/M for denoising on AWGN with a range of sigma.
# Only dataroot_H is needed.
# -----------------------------------------
# e.g., FDnCNN, H = f(cat(L, M)), M is noise level map
# -----------------------------------------
"""
def __init__(self, opt):
super(DatasetFDnCNN, self).__init__()
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.patch_size = self.opt['H_size'] if opt['H_size'] else 64
self.sigma = opt['sigma'] if opt['sigma'] else [0, 75]
self.sigma_min, self.sigma_max = self.sigma[0], self.sigma[1]
self.sigma_test = opt['sigma_test'] if opt['sigma_test'] else 25
# -------------------------------------
# get the path of H, return None if input is None
# -------------------------------------
self.paths_H = util.get_image_paths(opt['dataroot_H'])
def __getitem__(self, index):
# -------------------------------------
# get H image
# -------------------------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
L_path = H_path
if self.opt['phase'] == 'train':
"""
# --------------------------------
# get L/H/M patch pairs
# --------------------------------
"""
H, W = img_H.shape[:2]
# ---------------------------------
# randomly crop the patch
# ---------------------------------
rnd_h = random.randint(0, max(0, H - self.patch_size))
rnd_w = random.randint(0, max(0, W - self.patch_size))
patch_H = img_H[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size, :]
# ---------------------------------
# augmentation - flip, rotate
# ---------------------------------
mode = random.randint(0, 7)
patch_H = util.augment_img(patch_H, mode=mode)
# ---------------------------------
# HWC to CHW, numpy(uint) to tensor
# ---------------------------------
img_H = util.uint2tensor3(patch_H)
img_L = img_H.clone()
# ---------------------------------
# get noise level
# ---------------------------------
# noise_level = torch.FloatTensor([np.random.randint(self.sigma_min, self.sigma_max)])/255.0
noise_level = torch.FloatTensor([np.random.uniform(self.sigma_min, self.sigma_max)])/255.0
noise_level_map = torch.ones((1, img_L.size(1), img_L.size(2))).mul_(noise_level).float() # torch.full((1, img_L.size(1), img_L.size(2)), noise_level)
# ---------------------------------
# add noise
# ---------------------------------
noise = torch.randn(img_L.size()).mul_(noise_level).float()
img_L.add_(noise)
else:
"""
# --------------------------------
# get L/H/M image pairs
# --------------------------------
"""
img_H = util.uint2single(img_H)
img_L = np.copy(img_H)
np.random.seed(seed=0)
img_L += np.random.normal(0, self.sigma_test/255.0, img_L.shape)
noise_level_map = torch.ones((1, img_L.shape[0], img_L.shape[1])).mul_(self.sigma_test/255.0).float() # torch.full((1, img_L.size(1), img_L.size(2)), noise_level)
# ---------------------------------
# L/H image pairs
# ---------------------------------
img_H, img_L = util.single2tensor3(img_H), util.single2tensor3(img_L)
"""
# -------------------------------------
# concat L and noise level map M
# -------------------------------------
"""
img_L = torch.cat((img_L, noise_level_map), 0)
return {'L': img_L, 'H': img_H, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.paths_H)
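# Editor's sketch (not part of the original file): the noise level map M
# used above is just the scalar sigma broadcast to L's spatial size and
# stacked onto L as an extra input channel.
if __name__ == '__main__':
    img_L = torch.rand(3, 64, 64)                  # toy low-quality image
    noise_level = torch.FloatTensor([25 / 255.0])  # sigma = 25
    M = noise_level.view(1, 1, 1).expand(1, img_L.size(-2), img_L.size(-1))
    model_input = torch.cat((img_L, M), 0)         # shape (4, 64, 64)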
import random
import numpy as np
import torch
import torch.utils.data as data
import utils.utils_image as util
class DatasetFFDNet(data.Dataset):
"""
# -----------------------------------------
# Get L/H/M for denoising on AWGN with a range of sigma.
# Only dataroot_H is needed.
# -----------------------------------------
# e.g., FFDNet, H = f(L, sigma), sigma is noise level
# -----------------------------------------
"""
def __init__(self, opt):
super(DatasetFFDNet, self).__init__()
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.patch_size = self.opt['H_size'] if opt['H_size'] else 64
self.sigma = opt['sigma'] if opt['sigma'] else [0, 75]
self.sigma_min, self.sigma_max = self.sigma[0], self.sigma[1]
self.sigma_test = opt['sigma_test'] if opt['sigma_test'] else 25
# -------------------------------------
# get the path of H, return None if input is None
# -------------------------------------
self.paths_H = util.get_image_paths(opt['dataroot_H'])
def __getitem__(self, index):
# -------------------------------------
# get H image
# -------------------------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
L_path = H_path
if self.opt['phase'] == 'train':
"""
# --------------------------------
# get L/H/M patch pairs
# --------------------------------
"""
H, W = img_H.shape[:2]
# ---------------------------------
# randomly crop the patch
# ---------------------------------
rnd_h = random.randint(0, max(0, H - self.patch_size))
rnd_w = random.randint(0, max(0, W - self.patch_size))
patch_H = img_H[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size, :]
# ---------------------------------
# augmentation - flip, rotate
# ---------------------------------
mode = random.randint(0, 7)
patch_H = util.augment_img(patch_H, mode=mode)
# ---------------------------------
# HWC to CHW, numpy(uint) to tensor
# ---------------------------------
img_H = util.uint2tensor3(patch_H)
img_L = img_H.clone()
# ---------------------------------
# get noise level
# ---------------------------------
# noise_level = torch.FloatTensor([np.random.randint(self.sigma_min, self.sigma_max)])/255.0
noise_level = torch.FloatTensor([np.random.uniform(self.sigma_min, self.sigma_max)])/255.0
# ---------------------------------
# add noise
# ---------------------------------
noise = torch.randn(img_L.size()).mul_(noise_level).float()
img_L.add_(noise)
else:
"""
# --------------------------------
# get L/H/sigma image pairs
# --------------------------------
"""
img_H = util.uint2single(img_H)
img_L = np.copy(img_H)
np.random.seed(seed=0)
img_L += np.random.normal(0, self.sigma_test/255.0, img_L.shape)
noise_level = torch.FloatTensor([self.sigma_test/255.0])
# ---------------------------------
# L/H image pairs
# ---------------------------------
img_H, img_L = util.single2tensor3(img_H), util.single2tensor3(img_L)
noise_level = noise_level.unsqueeze(1).unsqueeze(1)
return {'L': img_L, 'H': img_H, 'C': noise_level, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.paths_H)
import random
import torch.utils.data as data
import utils.utils_image as util
import cv2
class DatasetJPEG(data.Dataset):
def __init__(self, opt):
super(DatasetJPEG, self).__init__()
print('Dataset: JPEG compression artifact reduction (deblocking) with quality factor. Only dataroot_H is needed.')
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.patch_size = self.opt['H_size'] if opt['H_size'] else 128
self.quality_factor = opt['quality_factor'] if opt['quality_factor'] else 40
self.quality_factor_test = opt['quality_factor_test'] if opt['quality_factor_test'] else 40
self.is_color = opt['is_color'] if opt['is_color'] else False
# -------------------------------------
# get the path of H, return None if input is None
# -------------------------------------
self.paths_H = util.get_image_paths(opt['dataroot_H'])
def __getitem__(self, index):
if self.opt['phase'] == 'train':
# -------------------------------------
# get H image
# -------------------------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, 3)
L_path = H_path
H, W = img_H.shape[:2]
self.patch_size_plus = self.patch_size + 8
# ---------------------------------
# randomly crop a large patch
# ---------------------------------
rnd_h = random.randint(0, max(0, H - self.patch_size_plus))
rnd_w = random.randint(0, max(0, W - self.patch_size_plus))
patch_H = img_H[rnd_h:rnd_h + self.patch_size_plus, rnd_w:rnd_w + self.patch_size_plus, ...]
# ---------------------------------
# augmentation - flip, rotate
# ---------------------------------
mode = random.randint(0, 7)
patch_H = util.augment_img(patch_H, mode=mode)
# ---------------------------------
# HWC to CHW, numpy(uint) to tensor
# ---------------------------------
img_L = patch_H.copy()
# ---------------------------------
# set quality factor
# ---------------------------------
quality_factor = self.quality_factor
if self.is_color: # color image
img_H = img_L.copy()
img_L = cv2.cvtColor(img_L, cv2.COLOR_RGB2BGR)
result, encimg = cv2.imencode('.jpg', img_L, [int(cv2.IMWRITE_JPEG_QUALITY), quality_factor])
img_L = cv2.imdecode(encimg, 1)
img_L = cv2.cvtColor(img_L, cv2.COLOR_BGR2RGB)
else:
if random.random() > 0.5:
img_L = util.rgb2ycbcr(img_L)
else:
img_L = cv2.cvtColor(img_L, cv2.COLOR_RGB2GRAY)
img_H = img_L.copy()
result, encimg = cv2.imencode('.jpg', img_L, [int(cv2.IMWRITE_JPEG_QUALITY), quality_factor])
img_L = cv2.imdecode(encimg, 0)
# ---------------------------------
# randomly crop a patch
# ---------------------------------
H, W = img_H.shape[:2]
if random.random() > 0.5:
rnd_h = random.randint(0, max(0, H - self.patch_size))
rnd_w = random.randint(0, max(0, W - self.patch_size))
else:
rnd_h = 0
rnd_w = 0
img_H = img_H[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size]
img_L = img_L[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size]
else:
H_path = self.paths_H[index]
L_path = H_path
# ---------------------------------
# set quality factor
# ---------------------------------
quality_factor = self.quality_factor_test
if self.is_color: # color JPEG image deblocking
img_H = util.imread_uint(H_path, 3)
img_L = img_H.copy()
img_L = cv2.cvtColor(img_L, cv2.COLOR_RGB2BGR)
result, encimg = cv2.imencode('.jpg', img_L, [int(cv2.IMWRITE_JPEG_QUALITY), quality_factor])
img_L = cv2.imdecode(encimg, 1)
img_L = cv2.cvtColor(img_L, cv2.COLOR_BGR2RGB)
else:
img_H = cv2.imread(H_path, cv2.IMREAD_UNCHANGED)
is_to_ycbcr = img_H.ndim == 3  # img_L does not exist yet here; check the loaded image
if is_to_ycbcr:
img_H = cv2.cvtColor(img_H, cv2.COLOR_BGR2RGB)
img_H = util.rgb2ycbcr(img_H)
result, encimg = cv2.imencode('.jpg', img_H, [int(cv2.IMWRITE_JPEG_QUALITY), quality_factor])
img_L = cv2.imdecode(encimg, 0)
img_L, img_H = util.uint2tensor3(img_L), util.uint2tensor3(img_H)
return {'L': img_L, 'H': img_H, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.paths_H)
import torch.utils.data as data
import utils.utils_image as util
class DatasetL(data.Dataset):
'''
# -----------------------------------------
# Get L in testing.
# Only "dataroot_L" is needed.
# -----------------------------------------
# -----------------------------------------
'''
def __init__(self, opt):
super(DatasetL, self).__init__()
print('Read L in testing. Only "dataroot_L" is needed.')
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
# ------------------------------------
# get the path of L
# ------------------------------------
self.paths_L = util.get_image_paths(opt['dataroot_L'])
assert self.paths_L, 'Error: L paths are empty.'
def __getitem__(self, index):
L_path = None
# ------------------------------------
# get L image
# ------------------------------------
L_path = self.paths_L[index]
img_L = util.imread_uint(L_path, self.n_channels)
# ------------------------------------
# HWC to CHW, numpy to tensor
# ------------------------------------
img_L = util.uint2tensor3(img_L)
return {'L': img_L, 'L_path': L_path}
def __len__(self):
return len(self.paths_L)
import random
import numpy as np
import torch.utils.data as data
import utils.utils_image as util
import os
from utils import utils_mask
class DatasetMaskedDenoising(data.Dataset):
'''
# -----------------------------------------
# dataset for masked denoising training
# -----------------------------------------
'''
def __init__(self, opt):
super(DatasetMaskedDenoising, self).__init__()
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.sf = opt['scale'] if opt['scale'] else 1
self.lq_patchsize = self.opt['lq_patchsize'] if self.opt['lq_patchsize'] else 64
self.patch_size = self.opt['H_size'] if self.opt['H_size'] else self.lq_patchsize*self.sf
self.paths_H = util.get_image_paths(opt['dataroot_H'])
print(f'len(self.paths_H): {len(self.paths_H)}')
assert self.paths_H, 'Error: H path is empty.'
self.if_mask = True if opt['if_mask'] else False
def __getitem__(self, index):
L_path = None
# ------------------------------------
# get H image
# ------------------------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
img_name, ext = os.path.splitext(os.path.basename(H_path))
H, W, C = img_H.shape
if H < self.patch_size or W < self.patch_size:
img_H = np.tile(np.random.randint(0, 256, size=[1, 1, self.n_channels], dtype=np.uint8), (self.patch_size, self.patch_size, 1))
# ------------------------------------
# if train, get L/H patch pair
# ------------------------------------
if self.opt['phase'] == 'train':
H, W, C = img_H.shape
rnd_h_H = random.randint(0, max(0, H - self.patch_size))
rnd_w_H = random.randint(0, max(0, W - self.patch_size))
img_H = img_H[rnd_h_H:rnd_h_H + self.patch_size, rnd_w_H:rnd_w_H + self.patch_size, :]
mode = random.randint(0, 7)
img_H = util.augment_img(img_H, mode=mode)
img_H = util.uint2single(img_H)
img_L, img_H = utils_mask.input_mask_with_noise(img_H,
sf=self.sf,
lq_patchsize=self.lq_patchsize,
noise_level=self.opt['noise_level'],
if_mask=self.if_mask,
mask1=self.opt['mask1'],
mask2=self.opt['mask2'])
else:
img_H = util.uint2single(img_H)
img_L, img_H = utils_mask.input_mask_with_noise(img_H, self.sf, lq_patchsize=self.lq_patchsize)
# ------------------------------------
# L/H pairs, HWC to CHW, numpy to tensor
# ------------------------------------
img_H, img_L = util.single2tensor3(img_H), util.single2tensor3(img_L)
if L_path is None:
L_path = H_path
return {'L': img_L, 'H': img_H, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.paths_H)
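# Editor's sketch (not part of the original file): minimal usage of
# DatasetMaskedDenoising; keys mirror __init__/__getitem__ above, mask1 and
# mask2 follow the input_mask_80_90 config name, other values are placeholders.
if __name__ == '__main__':
    opt = {'phase': 'train',
           'n_channels': 3,
           'scale': 1,
           'lq_patchsize': 64,
           'H_size': 64,
           'dataroot_H': 'trainsets/trainH',  # placeholder path
           'if_mask': True,
           'mask1': 0.8,                      # mask ratio lower bound
           'mask2': 0.9,                      # mask ratio upper bound
           'noise_level': 25}                 # placeholder noise level
    sample = DatasetMaskedDenoising(opt)[0]   # {'L': masked noisy input, 'H': target, ...}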
import random
import numpy as np
import torch.utils.data as data
import utils.utils_image as util
class DatasetPlain(data.Dataset):
'''
# -----------------------------------------
# Get L/H for image-to-image mapping.
# Both "paths_L" and "paths_H" are needed.
# -----------------------------------------
# e.g., train denoiser with L and H
# -----------------------------------------
'''
def __init__(self, opt):
super(DatasetPlain, self).__init__()
print('Get L/H for image-to-image mapping. Both "paths_L" and "paths_H" are needed.')
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.patch_size = self.opt['H_size'] if self.opt['H_size'] else 64
# ------------------------------------
# get the path of L/H
# ------------------------------------
self.paths_H = util.get_image_paths(opt['dataroot_H'])
self.paths_L = util.get_image_paths(opt['dataroot_L'])
assert self.paths_H, 'Error: H path is empty.'
assert self.paths_L, 'Error: L path is empty. Plain dataset assumes both L and H are given!'
if self.paths_L and self.paths_H:
assert len(self.paths_L) == len(self.paths_H), 'L/H mismatch - {}, {}.'.format(len(self.paths_L), len(self.paths_H))
def __getitem__(self, index):
# ------------------------------------
# get H image
# ------------------------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
# ------------------------------------
# get L image
# ------------------------------------
L_path = self.paths_L[index]
img_L = util.imread_uint(L_path, self.n_channels)
# ------------------------------------
# if train, get L/H patch pair
# ------------------------------------
if self.opt['phase'] == 'train':
H, W, _ = img_H.shape
# --------------------------------
# randomly crop the patch
# --------------------------------
rnd_h = random.randint(0, max(0, H - self.patch_size))
rnd_w = random.randint(0, max(0, W - self.patch_size))
patch_L = img_L[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size, :]
patch_H = img_H[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size, :]
# --------------------------------
# augmentation - flip and/or rotate
# --------------------------------
mode = random.randint(0, 7)
patch_L, patch_H = util.augment_img(patch_L, mode=mode), util.augment_img(patch_H, mode=mode)
# --------------------------------
# HWC to CHW, numpy(uint) to tensor
# --------------------------------
img_L, img_H = util.uint2tensor3(patch_L), util.uint2tensor3(patch_H)
else:
# --------------------------------
# HWC to CHW, numpy(uint) to tensor
# --------------------------------
img_L, img_H = util.uint2tensor3(img_L), util.uint2tensor3(img_H)
return {'L': img_L, 'H': img_H, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.paths_H)
import os.path
import random
import numpy as np
import torch.utils.data as data
import utils.utils_image as util
class DatasetPlainPatch(data.Dataset):
'''
# -----------------------------------------
# Get L/H for image-to-image mapping.
# Both "paths_L" and "paths_H" are needed.
# -----------------------------------------
# e.g., train denoiser with L and H patches
# create a large patch dataset first
# -----------------------------------------
'''
def __init__(self, opt):
super(DatasetPlainPatch, self).__init__()
print('Get L/H for image-to-image mapping. Both "paths_L" and "paths_H" are needed.')
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.patch_size = self.opt['H_size'] if self.opt['H_size'] else 64
self.num_patches_per_image = opt['num_patches_per_image'] if opt['num_patches_per_image'] else 40
self.num_sampled = opt['num_sampled'] if opt['num_sampled'] else 3000
# -------------------
# get the path of L/H
# -------------------
self.paths_H = util.get_image_paths(opt['dataroot_H'])
self.paths_L = util.get_image_paths(opt['dataroot_L'])
assert self.paths_H, 'Error: H path is empty.'
assert self.paths_L, 'Error: L path is empty. This dataset uses an L path; for H-only data use dataset_dnpatch.'
if self.paths_L and self.paths_H:
assert len(self.paths_L) == len(self.paths_H), 'H and L datasets have different number of images - {}, {}.'.format(len(self.paths_L), len(self.paths_H))
# ------------------------------------
# number of sampled images
# ------------------------------------
self.num_sampled = min(self.num_sampled, len(self.paths_H))
# ------------------------------------
# reserve space with zeros
# ------------------------------------
self.total_patches = self.num_sampled * self.num_patches_per_image
self.H_data = np.zeros([self.total_patches, self.patch_size, self.patch_size, self.n_channels], dtype=np.uint8)
self.L_data = np.zeros([self.total_patches, self.patch_size, self.patch_size, self.n_channels], dtype=np.uint8)
# ------------------------------------
# update H patches
# ------------------------------------
self.update_data()
def update_data(self):
"""
# ------------------------------------
# update whole L/H patches
# ------------------------------------
"""
self.index_sampled = random.sample(range(0, len(self.paths_H), 1), self.num_sampled)
n_count = 0
for i in range(len(self.index_sampled)):
L_patches, H_patches = self.get_patches(self.index_sampled[i])
for (L_patch, H_patch) in zip(L_patches, H_patches):
self.L_data[n_count,:,:,:] = L_patch
self.H_data[n_count,:,:,:] = H_patch
n_count += 1
print('Training data updated! Total number of patches: %d\n' % len(self.H_data))
def get_patches(self, index):
"""
# ------------------------------------
# get L/H patches from L/H images
# ------------------------------------
"""
L_path = self.paths_L[index]
H_path = self.paths_H[index]
img_L = util.imread_uint(L_path, self.n_channels) # uint format
img_H = util.imread_uint(H_path, self.n_channels) # uint format
H, W = img_H.shape[:2]
L_patches, H_patches = [], []
num = self.num_patches_per_image
for _ in range(num):
rnd_h = random.randint(0, max(0, H - self.patch_size))
rnd_w = random.randint(0, max(0, W - self.patch_size))
L_patch = img_L[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size, :]
H_patch = img_H[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size, :]
L_patches.append(L_patch)
H_patches.append(H_patch)
return L_patches, H_patches
def __getitem__(self, index):
if self.opt['phase'] == 'train':
patch_L, patch_H = self.L_data[index], self.H_data[index]
# --------------------------------
# augmentation - flip and/or rotate
# --------------------------------
mode = random.randint(0, 7)
patch_L = util.augment_img(patch_L, mode=mode)
patch_H = util.augment_img(patch_H, mode=mode)
patch_L, patch_H = util.uint2tensor3(patch_L), util.uint2tensor3(patch_H)
else:
L_path, H_path = self.paths_L[index], self.paths_H[index]
patch_L = util.imread_uint(L_path, self.n_channels)
patch_H = util.imread_uint(H_path, self.n_channels)
patch_L, patch_H = util.uint2tensor3(patch_L), util.uint2tensor3(patch_H)
return {'L': patch_L, 'H': patch_H}
def __len__(self):
return self.total_patches
import random
import numpy as np
import torch.utils.data as data
import utils.utils_image as util
class DatasetSR(data.Dataset):
'''
# -----------------------------------------
# Get L/H for SISR.
# If only "paths_H" is provided, sythesize bicubicly downsampled L on-the-fly.
# -----------------------------------------
# e.g., SRResNet
# -----------------------------------------
'''
def __init__(self, opt):
super(DatasetSR, self).__init__()
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.sf = opt['scale'] if opt['scale'] else 4
self.patch_size = self.opt['H_size'] if self.opt['H_size'] else 96
self.L_size = self.patch_size // self.sf
# ------------------------------------
# get paths of L/H
# ------------------------------------
self.paths_H = util.get_image_paths(opt['dataroot_H'])
self.paths_L = util.get_image_paths(opt['dataroot_L'])
assert self.paths_H, 'Error: H path is empty.'
if self.paths_L and self.paths_H:
assert len(self.paths_L) == len(self.paths_H), 'L/H mismatch - {}, {}.'.format(len(self.paths_L), len(self.paths_H))
def __getitem__(self, index):
L_path = None
# ------------------------------------
# get H image
# ------------------------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
img_H = util.uint2single(img_H)
# ------------------------------------
# modcrop
# ------------------------------------
img_H = util.modcrop(img_H, self.sf)
# ------------------------------------
# get L image
# ------------------------------------
if self.paths_L:
# --------------------------------
# directly load L image
# --------------------------------
L_path = self.paths_L[index]
img_L = util.imread_uint(L_path, self.n_channels)
img_L = util.uint2single(img_L)
else:
# --------------------------------
# synthesize L image via matlab's bicubic
# --------------------------------
H, W = img_H.shape[:2]
img_L = util.imresize_np(img_H, 1 / self.sf, True)
# ------------------------------------
# if train, get L/H patch pair
# ------------------------------------
if self.opt['phase'] == 'train':
H, W, C = img_L.shape
# --------------------------------
# randomly crop the L patch
# --------------------------------
rnd_h = random.randint(0, max(0, H - self.L_size))
rnd_w = random.randint(0, max(0, W - self.L_size))
img_L = img_L[rnd_h:rnd_h + self.L_size, rnd_w:rnd_w + self.L_size, :]
# --------------------------------
# crop corresponding H patch
# --------------------------------
rnd_h_H, rnd_w_H = int(rnd_h * self.sf), int(rnd_w * self.sf)
img_H = img_H[rnd_h_H:rnd_h_H + self.patch_size, rnd_w_H:rnd_w_H + self.patch_size, :]
# --------------------------------
# augmentation - flip and/or rotate
# --------------------------------
mode = random.randint(0, 7)
img_L, img_H = util.augment_img(img_L, mode=mode), util.augment_img(img_H, mode=mode)
# ------------------------------------
# L/H pairs, HWC to CHW, numpy to tensor
# ------------------------------------
img_H, img_L = util.single2tensor3(img_H), util.single2tensor3(img_L)
if L_path is None:
L_path = H_path
return {'L': img_L, 'H': img_H, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.paths_H)
import random
import numpy as np
import torch
import torch.utils.data as data
import utils.utils_image as util
from utils import utils_sisr
import hdf5storage
import os
class DatasetSRMD(data.Dataset):
'''
# -----------------------------------------
# Get L/H/M for noisy image SR with Gaussian kernels.
# Only "paths_H" is needed, sythesize bicubicly downsampled L on-the-fly.
# -----------------------------------------
# e.g., SRMD, H = f(L, kernel, sigma), sigma is noise level
# -----------------------------------------
'''
def __init__(self, opt):
super(DatasetSRMD, self).__init__()
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.sf = opt['scale'] if opt['scale'] else 4
self.patch_size = self.opt['H_size'] if self.opt['H_size'] else 96
self.L_size = self.patch_size // self.sf
self.sigma = opt['sigma'] if opt['sigma'] else [0, 50]
self.sigma_min, self.sigma_max = self.sigma[0], self.sigma[1]
self.sigma_test = opt['sigma_test'] if opt['sigma_test'] else 0
# -------------------------------------
# PCA projection matrix
# -------------------------------------
self.p = hdf5storage.loadmat(os.path.join('kernels', 'srmd_pca_pytorch.mat'))['p']
self.ksize = int(np.sqrt(self.p.shape[-1])) # kernel size
# ------------------------------------
# get paths of L/H
# ------------------------------------
self.paths_H = util.get_image_paths(opt['dataroot_H'])
self.paths_L = util.get_image_paths(opt['dataroot_L'])
def __getitem__(self, index):
# ------------------------------------
# get H image
# ------------------------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
img_H = util.uint2single(img_H)
# ------------------------------------
# modcrop for SR
# ------------------------------------
img_H = util.modcrop(img_H, self.sf)
# ------------------------------------
# kernel
# ------------------------------------
if self.opt['phase'] == 'train':
l_max = 10
theta = np.pi*random.random()
l1 = 0.1+l_max*random.random()
l2 = 0.1+(l1-0.1)*random.random()
kernel = utils_sisr.anisotropic_Gaussian(ksize=self.ksize, theta=theta, l1=l1, l2=l2)
else:
kernel = utils_sisr.anisotropic_Gaussian(ksize=self.ksize, theta=np.pi, l1=0.1, l2=0.1)
k = np.reshape(kernel, (-1), order="F")
k_reduced = np.dot(self.p, k)
k_reduced = torch.from_numpy(k_reduced).float()
# ------------------------------------
# synthesize L image via the specified degradation model
# ------------------------------------
H, W, _ = img_H.shape
img_L = utils_sisr.srmd_degradation(img_H, kernel, self.sf)
img_L = np.float32(img_L)
if self.opt['phase'] == 'train':
"""
# --------------------------------
# get L/H patch pairs
# --------------------------------
"""
H, W, C = img_L.shape
# --------------------------------
# randomly crop L patch
# --------------------------------
rnd_h = random.randint(0, max(0, H - self.L_size))
rnd_w = random.randint(0, max(0, W - self.L_size))
img_L = img_L[rnd_h:rnd_h + self.L_size, rnd_w:rnd_w + self.L_size, :]
# --------------------------------
# crop corresponding H patch
# --------------------------------
rnd_h_H, rnd_w_H = int(rnd_h * self.sf), int(rnd_w * self.sf)
img_H = img_H[rnd_h_H:rnd_h_H + self.patch_size, rnd_w_H:rnd_w_H + self.patch_size, :]
# --------------------------------
# augmentation - flip and/or rotate
# --------------------------------
mode = random.randint(0, 7)
img_L, img_H = util.augment_img(img_L, mode=mode), util.augment_img(img_H, mode=mode)
# --------------------------------
# get patch pairs
# --------------------------------
img_H, img_L = util.single2tensor3(img_H), util.single2tensor3(img_L)
# --------------------------------
# select noise level and get Gaussian noise
# --------------------------------
if random.random() < 0.1:
noise_level = torch.zeros(1).float()
else:
noise_level = torch.FloatTensor([np.random.uniform(self.sigma_min, self.sigma_max)])/255.0
# noise_level = torch.rand(1)*50/255.0
# noise_level = torch.min(torch.from_numpy(np.float32([7*np.random.chisquare(2.5)/255.0])),torch.Tensor([50./255.]))
else:
img_H, img_L = util.single2tensor3(img_H), util.single2tensor3(img_L)
noise_level = torch.FloatTensor([self.sigma_test])
# ------------------------------------
# add noise
# ------------------------------------
noise = torch.randn(img_L.size()).mul_(noise_level).float()
img_L.add_(noise)
# ------------------------------------
# get degradation map M
# ------------------------------------
M_vector = torch.cat((k_reduced, noise_level), 0).unsqueeze(1).unsqueeze(1)
M = M_vector.repeat(1, img_L.size()[-2], img_L.size()[-1])
"""
# -------------------------------------
# concat L and noise level map M
# -------------------------------------
"""
img_L = torch.cat((img_L, M), 0)
L_path = H_path
return {'L': img_L, 'H': img_H, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.paths_H)
import random
import numpy as np
import torch
import torch.utils.data as data
import utils.utils_image as util
from utils import utils_deblur
from utils import utils_sisr
import os
from scipy import ndimage
from scipy.io import loadmat
# import hdf5storage
class DatasetUSRNet(data.Dataset):
'''
# -----------------------------------------
# Get L/k/sf/sigma for USRNet.
# Only "paths_H" and kernel is needed, synthesize L on-the-fly.
# -----------------------------------------
'''
def __init__(self, opt):
super(DatasetUSRNet, self).__init__()
self.opt = opt
self.n_channels = opt['n_channels'] if opt['n_channels'] else 3
self.patch_size = self.opt['H_size'] if self.opt['H_size'] else 96
self.sigma_max = self.opt['sigma_max'] if self.opt['sigma_max'] is not None else 25
self.scales = opt['scales'] if opt['scales'] is not None else [1,2,3,4]
self.sf_validation = opt['sf_validation'] if opt['sf_validation'] is not None else 3
#self.kernels = hdf5storage.loadmat(os.path.join('kernels', 'kernels_12.mat'))['kernels']
self.kernels = loadmat(os.path.join('kernels', 'kernels_12.mat'))['kernels'] # for validation
# -------------------
# get the path of H
# -------------------
self.paths_H = util.get_image_paths(opt['dataroot_H']) # return None if input is None
self.count = 0
def __getitem__(self, index):
# -------------------
# get H image
# -------------------
H_path = self.paths_H[index]
img_H = util.imread_uint(H_path, self.n_channels)
L_path = H_path
if self.opt['phase'] == 'train':
# ---------------------------
# 1) scale factor, ensure each batch only involves one scale factor
# ---------------------------
if self.count % self.opt['dataloader_batch_size'] == 0:
# sf = random.choice([1,2,3,4])
self.sf = random.choice(self.scales)
# self.count = 0 # optional
self.count += 1
H, W, _ = img_H.shape
# ----------------------------
# randomly crop the patch
# ----------------------------
rnd_h = random.randint(0, max(0, H - self.patch_size))
rnd_w = random.randint(0, max(0, W - self.patch_size))
patch_H = img_H[rnd_h:rnd_h + self.patch_size, rnd_w:rnd_w + self.patch_size, :]
# ---------------------------
# augmentation - flip, rotate
# ---------------------------
mode = np.random.randint(0, 8)
patch_H = util.augment_img(patch_H, mode=mode)
# ---------------------------
# 2) kernel
# ---------------------------
r_value = random.randint(0, 7)
if r_value>3:
k = utils_deblur.blurkernel_synthesis(h=25) # motion blur
else:
sf_k = random.choice(self.scales)
k = utils_sisr.gen_kernel(scale_factor=np.array([sf_k, sf_k])) # Gaussian blur
mode_k = random.randint(0, 7)
k = util.augment_img(k, mode=mode_k)
# ---------------------------
# 3) noise level
# ---------------------------
if random.randint(0, 8) == 1:
noise_level = 0/255.0
else:
noise_level = np.random.randint(0, self.sigma_max)/255.0
# ---------------------------
# Low-quality image
# ---------------------------
img_L = ndimage.filters.convolve(patch_H, np.expand_dims(k, axis=2), mode='wrap')
img_L = img_L[0::self.sf, 0::self.sf, ...]
# add Gaussian noise
img_L = util.uint2single(img_L) + np.random.normal(0, noise_level, img_L.shape)
img_H = patch_H
else:
k = self.kernels[0, 0].astype(np.float64) # validation kernel
k /= np.sum(k)
noise_level = 0./255.0 # validation noise level
# ------------------------------------
# modcrop
# ------------------------------------
img_H = util.modcrop(img_H, self.sf_validation)
img_L = ndimage.filters.convolve(img_H, np.expand_dims(k, axis=2), mode='wrap') # blur
img_L = img_L[0::self.sf_validation, 0::self.sf_validation, ...] # downsampling
img_L = util.uint2single(img_L) + np.random.normal(0, noise_level, img_L.shape)
self.sf = self.sf_validation
k = util.single2tensor3(np.expand_dims(np.float32(k), axis=2))
img_H, img_L = util.uint2tensor3(img_H), util.single2tensor3(img_L)
noise_level = torch.FloatTensor([noise_level]).view([1,1,1])
return {'L': img_L, 'H': img_H, 'k': k, 'sigma': noise_level, 'sf': self.sf, 'L_path': L_path, 'H_path': H_path}
def __len__(self):
return len(self.paths_H)
import glob
import torch
from os import path as osp
import torch.utils.data as data
import utils.utils_video as utils_video
class VideoRecurrentTestDataset(data.Dataset):
"""Video test dataset for recurrent architectures, which takes LR video
frames as input and output corresponding HR video frames. Modified from
https://github.com/xinntao/BasicSR/blob/master/basicsr/data/reds_dataset.py
Supported datasets: Vid4, REDS4, REDSofficial.
More generally, it supports testing datasets with the following structure:
dataroot
├── subfolder1
├── frame000
├── frame001
├── ...
├── subfolder2
├── frame000
├── frame001
├── ...
├── ...
For testing datasets, there is no need to prepare LMDB files.
Args:
opt (dict): Config for train dataset. It contains the following keys:
dataroot_gt (str): Data root path for gt.
dataroot_lq (str): Data root path for lq.
io_backend (dict): IO backend type and other kwarg.
cache_data (bool): Whether to cache testing datasets.
name (str): Dataset name.
meta_info_file (str): The path to the file storing the list of test
folders. If not provided, all the folders in the dataroot will
be used.
num_frame (int): Window size for input frames.
padding (str): Padding mode.
"""
def __init__(self, opt):
super(VideoRecurrentTestDataset, self).__init__()
self.opt = opt
self.cache_data = opt['cache_data']
self.gt_root, self.lq_root = opt['dataroot_gt'], opt['dataroot_lq']
self.data_info = {'lq_path': [], 'gt_path': [], 'folder': [], 'idx': [], 'border': []}
self.imgs_lq, self.imgs_gt = {}, {}
if 'meta_info_file' in opt:
with open(opt['meta_info_file'], 'r') as fin:
subfolders = [line.split(' ')[0] for line in fin]
subfolders_lq = [osp.join(self.lq_root, key) for key in subfolders]
subfolders_gt = [osp.join(self.gt_root, key) for key in subfolders]
else:
subfolders_lq = sorted(glob.glob(osp.join(self.lq_root, '*')))
subfolders_gt = sorted(glob.glob(osp.join(self.gt_root, '*')))
for subfolder_lq, subfolder_gt in zip(subfolders_lq, subfolders_gt):
# get frame list for lq and gt
subfolder_name = osp.basename(subfolder_lq)
img_paths_lq = sorted(list(utils_video.scandir(subfolder_lq, full_path=True)))
img_paths_gt = sorted(list(utils_video.scandir(subfolder_gt, full_path=True)))
max_idx = len(img_paths_lq)
assert max_idx == len(img_paths_gt), (f'Different number of images in lq ({max_idx})'
f' and gt folders ({len(img_paths_gt)})')
self.data_info['lq_path'].extend(img_paths_lq)
self.data_info['gt_path'].extend(img_paths_gt)
self.data_info['folder'].extend([subfolder_name] * max_idx)
for i in range(max_idx):
self.data_info['idx'].append(f'{i}/{max_idx}')
border_l = [0] * max_idx
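# flag the first and last num_frame//2 frames of each clip as border frames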
for i in range(self.opt['num_frame'] // 2):
border_l[i] = 1
border_l[max_idx - i - 1] = 1
self.data_info['border'].extend(border_l)
# cache data or save the frame list
if self.cache_data:
print(f'Cache {subfolder_name} for VideoTestDataset...')
self.imgs_lq[subfolder_name] = utils_video.read_img_seq(img_paths_lq)
self.imgs_gt[subfolder_name] = utils_video.read_img_seq(img_paths_gt)
else:
self.imgs_lq[subfolder_name] = img_paths_lq
self.imgs_gt[subfolder_name] = img_paths_gt
# Find unique folder strings
self.folders = sorted(list(set(self.data_info['folder'])))
self.sigma = opt['sigma'] / 255. if 'sigma' in opt else 0 # for non-blind video denoising
def __getitem__(self, index):
folder = self.folders[index]
if self.sigma:
# for non-blind video denoising
if self.cache_data:
imgs_gt = self.imgs_gt[folder]
else:
imgs_gt = utils_video.read_img_seq(self.imgs_gt[folder])
torch.manual_seed(0)
noise_level = torch.ones((1, 1, 1, 1)) * self.sigma
noise = torch.normal(mean=0, std=noise_level.expand_as(imgs_gt))
imgs_lq = imgs_gt + noise
t, _, h, w = imgs_lq.shape
imgs_lq = torch.cat([imgs_lq, noise_level.expand(t, 1, h, w)], 1)
else:
# for video sr and deblurring
if self.cache_data:
imgs_lq = self.imgs_lq[folder]
imgs_gt = self.imgs_gt[folder]
else:
imgs_lq = utils_video.read_img_seq(self.imgs_lq[folder])
imgs_gt = utils_video.read_img_seq(self.imgs_gt[folder])
return {
'L': imgs_lq,
'H': imgs_gt,
'folder': folder,
'lq_path': self.imgs_lq[folder],
}
def __len__(self):
return len(self.folders)
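# Usage sketch (hedged): a minimal opt dict for VideoRecurrentTestDataset.
# Key names follow the docstring above; the placeholder paths, num_frame and
# sigma values below are assumptions, not fixed by this repo.
#
#     opt = {'cache_data': False,
#            'dataroot_gt': '/path/to/testset/GT',
#            'dataroot_lq': '/path/to/testset/GT',  # non-blind denoising: LQ is GT plus synthetic noise
#            'num_frame': 5,
#            'sigma': 50}                            # noise level on the [0, 255] scale
#     test_set = VideoRecurrentTestDataset(opt)
#     sample = test_set[0]
#     # sample['L']: (t, c+1, h, w) noisy frames + noise-level map (since sigma > 0)
#     # sample['H']: (t, c, h, w) ground-truth frames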
class SingleVideoRecurrentTestDataset(data.Dataset):
"""Single ideo test dataset for recurrent architectures, which takes LR video
frames as input and output corresponding HR video frames (only input LQ path).
    More generally, it supports testing datasets with the following structure:
    dataroot
    ├── subfolder1
    │   ├── frame000
    │   ├── frame001
    │   ├── ...
    ├── subfolder2
    │   ├── frame000
    │   ├── frame001
    │   ├── ...
    ├── ...
For testing datasets, there is no need to prepare LMDB files.
Args:
        opt (dict): Config for the test dataset. It contains the following keys:
dataroot_gt (str): Data root path for gt.
dataroot_lq (str): Data root path for lq.
io_backend (dict): IO backend type and other kwarg.
cache_data (bool): Whether to cache testing datasets.
name (str): Dataset name.
meta_info_file (str): The path to the file storing the list of test
folders. If not provided, all the folders in the dataroot will
be used.
num_frame (int): Window size for input frames.
padding (str): Padding mode.
"""
def __init__(self, opt):
super(SingleVideoRecurrentTestDataset, self).__init__()
self.opt = opt
self.cache_data = opt['cache_data']
self.lq_root = opt['dataroot_lq']
self.data_info = {'lq_path': [], 'folder': [], 'idx': [], 'border': []}
self.imgs_lq = {}
if 'meta_info_file' in opt:
with open(opt['meta_info_file'], 'r') as fin:
subfolders = [line.split(' ')[0] for line in fin]
subfolders_lq = [osp.join(self.lq_root, key) for key in subfolders]
else:
subfolders_lq = sorted(glob.glob(osp.join(self.lq_root, '*')))
for subfolder_lq in subfolders_lq:
# get frame list for lq and gt
subfolder_name = osp.basename(subfolder_lq)
img_paths_lq = sorted(list(utils_video.scandir(subfolder_lq, full_path=True)))
max_idx = len(img_paths_lq)
self.data_info['lq_path'].extend(img_paths_lq)
self.data_info['folder'].extend([subfolder_name] * max_idx)
for i in range(max_idx):
self.data_info['idx'].append(f'{i}/{max_idx}')
border_l = [0] * max_idx
for i in range(self.opt['num_frame'] // 2):
border_l[i] = 1
border_l[max_idx - i - 1] = 1
self.data_info['border'].extend(border_l)
# cache data or save the frame list
if self.cache_data:
                print(f'Cache {subfolder_name} for SingleVideoRecurrentTestDataset...')
self.imgs_lq[subfolder_name] = utils_video.read_img_seq(img_paths_lq)
else:
self.imgs_lq[subfolder_name] = img_paths_lq
# Find unique folder strings
self.folders = sorted(list(set(self.data_info['folder'])))
def __getitem__(self, index):
folder = self.folders[index]
if self.cache_data:
imgs_lq = self.imgs_lq[folder]
else:
imgs_lq = utils_video.read_img_seq(self.imgs_lq[folder])
return {
'L': imgs_lq,
'folder': folder,
'lq_path': self.imgs_lq[folder],
}
def __len__(self):
return len(self.folders)
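# Usage sketch (hedged): SingleVideoRecurrentTestDataset only reads the LQ
# root; the path below is a placeholder assumption.
#
#     opt = {'cache_data': False,
#            'dataroot_lq': '/path/to/real_noisy_frames',
#            'num_frame': 5}
#     test_set = SingleVideoRecurrentTestDataset(opt)
#     sample = test_set[0]   # {'L': (t, c, h, w), 'folder': str, 'lq_path': list} -- no 'H' key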
class VideoTestVimeo90KDataset(data.Dataset):
"""Video test dataset for Vimeo90k-Test dataset.
It only keeps the center frame for testing.
For testing datasets, there is no need to prepare LMDB files.
Args:
        opt (dict): Config for the test dataset. It contains the following keys:
dataroot_gt (str): Data root path for gt.
dataroot_lq (str): Data root path for lq.
io_backend (dict): IO backend type and other kwarg.
cache_data (bool): Whether to cache testing datasets.
name (str): Dataset name.
            meta_info_file (str): The path to the file storing the list of test
                folders. Required for this dataset; there is no fallback to
                scanning the dataroot.
num_frame (int): Window size for input frames.
padding (str): Padding mode.
"""
def __init__(self, opt):
super(VideoTestVimeo90KDataset, self).__init__()
self.opt = opt
self.cache_data = opt['cache_data']
if self.cache_data:
raise NotImplementedError('cache_data in Vimeo90K-Test dataset is not implemented.')
self.gt_root, self.lq_root = opt['dataroot_gt'], opt['dataroot_lq']
self.data_info = {'lq_path': [], 'gt_path': [], 'folder': [], 'idx': [], 'border': []}
neighbor_list = [i + (9 - opt['num_frame']) // 2 for i in range(opt['num_frame'])]
with open(opt['meta_info_file'], 'r') as fin:
subfolders = [line.split(' ')[0] for line in fin]
for idx, subfolder in enumerate(subfolders):
gt_path = osp.join(self.gt_root, subfolder, 'im4.png')
self.data_info['gt_path'].append(gt_path)
lq_paths = [osp.join(self.lq_root, subfolder, f'im{i}.png') for i in neighbor_list]
self.data_info['lq_path'].append(lq_paths)
self.data_info['folder'].append('vimeo90k')
self.data_info['idx'].append(f'{idx}/{len(subfolders)}')
self.data_info['border'].append(0)
self.pad_sequence = opt.get('pad_sequence', False)
def __getitem__(self, index):
lq_path = self.data_info['lq_path'][index]
gt_path = self.data_info['gt_path'][index]
imgs_lq = utils_video.read_img_seq(lq_path)
img_gt = utils_video.read_img_seq([gt_path])
img_gt.squeeze_(0)
if self.pad_sequence: # pad the sequence: 7 frames to 8 frames
imgs_lq = torch.cat([imgs_lq, imgs_lq[-1:,...]], dim=0)
return {
'L': imgs_lq, # (t, c, h, w)
'H': img_gt, # (c, h, w)
'folder': self.data_info['folder'][index], # folder name
'idx': self.data_info['idx'][index], # e.g., 0/843
'border': self.data_info['border'][index], # 0 for non-border
'lq_path': lq_path[self.opt['num_frame'] // 2] # center frame
}
def __len__(self):
return len(self.data_info['gt_path'])
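# Usage sketch (hedged): each Vimeo90K sample is a 7-frame septuplet whose
# center frame im4.png is the GT; with num_frame=7 the neighbor_list computed
# above is [1, 2, 3, 4, 5, 6, 7]. Paths are placeholder assumptions.
#
#     opt = {'cache_data': False,                   # must stay False (not implemented here)
#            'dataroot_gt': '/path/to/vimeo90k/GT',
#            'dataroot_lq': '/path/to/vimeo90k/LQ',
#            'meta_info_file': 'meta_info_Vimeo90K_test_GT.txt',
#            'num_frame': 7,
#            'pad_sequence': True}                  # 7 frames -> 8 for recurrent models
#     test_set = VideoTestVimeo90KDataset(opt)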
class SingleVideoRecurrentTestDataset(data.Dataset):
"""Single Video test dataset (only input LQ path).
Supported datasets: Vid4, REDS4, REDSofficial.
    More generally, it supports testing datasets with the following structure:
    dataroot
    ├── subfolder1
    │   ├── frame000
    │   ├── frame001
    │   ├── ...
    ├── subfolder2
    │   ├── frame000
    │   ├── frame001
    │   ├── ...
    ├── ...
For testing datasets, there is no need to prepare LMDB files.
Args:
        opt (dict): Config for the test dataset. It contains the following keys:
dataroot_gt (str): Data root path for gt.
dataroot_lq (str): Data root path for lq.
io_backend (dict): IO backend type and other kwarg.
cache_data (bool): Whether to cache testing datasets.
name (str): Dataset name.
meta_info_file (str): The path to the file storing the list of test
folders. If not provided, all the folders in the dataroot will
be used.
num_frame (int): Window size for input frames.
padding (str): Padding mode.
"""
def __init__(self, opt):
super(SingleVideoRecurrentTestDataset, self).__init__()
self.opt = opt
self.cache_data = opt['cache_data']
self.lq_root = opt['dataroot_lq']
self.data_info = {'lq_path': [], 'folder': [], 'idx': [], 'border': []}
# file client (io backend)
self.file_client = None
self.imgs_lq = {}
if 'meta_info_file' in opt:
with open(opt['meta_info_file'], 'r') as fin:
subfolders = [line.split(' ')[0] for line in fin]
subfolders_lq = [osp.join(self.lq_root, key) for key in subfolders]
else:
subfolders_lq = sorted(glob.glob(osp.join(self.lq_root, '*')))
for subfolder_lq in subfolders_lq:
# get frame list for lq and gt
subfolder_name = osp.basename(subfolder_lq)
img_paths_lq = sorted(list(utils_video.scandir(subfolder_lq, full_path=True)))
max_idx = len(img_paths_lq)
self.data_info['lq_path'].extend(img_paths_lq)
self.data_info['folder'].extend([subfolder_name] * max_idx)
for i in range(max_idx):
self.data_info['idx'].append(f'{i}/{max_idx}')
border_l = [0] * max_idx
for i in range(self.opt['num_frame'] // 2):
border_l[i] = 1
border_l[max_idx - i - 1] = 1
self.data_info['border'].extend(border_l)
# cache data or save the frame list
if self.cache_data:
                print(f'Cache {subfolder_name} for SingleVideoRecurrentTestDataset...')
self.imgs_lq[subfolder_name] = utils_video.read_img_seq(img_paths_lq)
else:
self.imgs_lq[subfolder_name] = img_paths_lq
# Find unique folder strings
self.folders = sorted(list(set(self.data_info['folder'])))
def __getitem__(self, index):
folder = self.folders[index]
if self.cache_data:
imgs_lq = self.imgs_lq[folder]
else:
imgs_lq = utils_video.read_img_seq(self.imgs_lq[folder])
return {
'L': imgs_lq,
'folder': folder,
'lq_path': self.imgs_lq[folder],
}
def __len__(self):
return len(self.folders)
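# DataLoader sketch (hedged): each __getitem__ above returns a whole clip, so
# these recurrent test sets are usually wrapped with batch_size=1; this
# pairing is an assumption, not fixed by this file.
#
#     from torch.utils.data import DataLoader
#     test_loader = DataLoader(test_set, batch_size=1, shuffle=False, num_workers=1)
#     for batch in test_loader:
#         imgs_lq = batch['L']   # (1, t, c, h, w)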
import numpy as np
import random
import torch
from pathlib import Path
import torch.utils.data as data
import utils.utils_video as utils_video
class VideoRecurrentTrainDataset(data.Dataset):
"""Video dataset for training recurrent networks.
The keys are generated from a meta info txt file.
basicsr/data/meta_info/meta_info_XXX_GT.txt
    Each line contains:
    1. subfolder (clip) name; 2. frame number; 3. image shape;
    4. start frame index, separated by a white space.
    Examples:
        720p_240fps_1 100 (720,1280,3) 00000
        720p_240fps_3 100 (720,1280,3) 00000
        ...
    Key examples: "720p_240fps_1/00000000"
GT (gt): Ground-Truth;
LQ (lq): Low-Quality, e.g., low-resolution/blurry/noisy/compressed frames.
Args:
opt (dict): Config for train dataset. It contains the following keys:
dataroot_gt (str): Data root path for gt.
dataroot_lq (str): Data root path for lq.
dataroot_flow (str, optional): Data root path for flow.
meta_info_file (str): Path for meta information file.
val_partition (str): Validation partition types. 'REDS4' or
'official'.
io_backend (dict): IO backend type and other kwarg.
num_frame (int): Window size for input frames.
            gt_size (int): Cropped patch size for gt patches.
interval_list (list): Interval list for temporal augmentation.
random_reverse (bool): Random reverse input frames.
use_hflip (bool): Use horizontal flips.
use_rot (bool): Use rotation (use vertical flip and transposing h
and w for implementation).
            scale (int): Scale factor, which will be added automatically.
"""
def __init__(self, opt):
super(VideoRecurrentTrainDataset, self).__init__()
self.opt = opt
self.scale = opt.get('scale', 4)
self.gt_size = opt.get('gt_size', 256)
self.gt_root, self.lq_root = Path(opt['dataroot_gt']), Path(opt['dataroot_lq'])
self.filename_tmpl = opt.get('filename_tmpl', '08d')
self.filename_ext = opt.get('filename_ext', 'png')
self.num_frame = opt['num_frame']
keys = []
total_num_frames = [] # some clips may not have 100 frames
start_frames = [] # some clips may not start from 00000
with open(opt['meta_info_file'], 'r') as fin:
for line in fin:
folder, frame_num, _, start_frame = line.split(' ')
keys.extend([f'{folder}/{i:{self.filename_tmpl}}' for i in range(int(start_frame), int(start_frame)+int(frame_num))])
total_num_frames.extend([int(frame_num) for i in range(int(frame_num))])
start_frames.extend([int(start_frame) for i in range(int(frame_num))])
# remove the video clips used in validation
if opt['name'] == 'REDS':
if opt['val_partition'] == 'REDS4':
val_partition = ['000', '011', '015', '020']
elif opt['val_partition'] == 'official':
val_partition = [f'{v:03d}' for v in range(240, 270)]
else:
raise ValueError(f'Wrong validation partition {opt["val_partition"]}.'
f"Supported ones are ['official', 'REDS4'].")
else:
val_partition = []
self.keys = []
self.total_num_frames = [] # some clips may not have 100 frames
self.start_frames = []
if opt['test_mode']:
            for i, v in enumerate(keys):
if v.split('/')[0] in val_partition:
self.keys.append(keys[i])
self.total_num_frames.append(total_num_frames[i])
self.start_frames.append(start_frames[i])
else:
            for i, v in enumerate(keys):
if v.split('/')[0] not in val_partition:
self.keys.append(keys[i])
self.total_num_frames.append(total_num_frames[i])
self.start_frames.append(start_frames[i])
# file client (io backend)
self.file_client = None
self.io_backend_opt = opt['io_backend']
self.is_lmdb = False
if self.io_backend_opt['type'] == 'lmdb':
self.is_lmdb = True
if hasattr(self, 'flow_root') and self.flow_root is not None:
self.io_backend_opt['db_paths'] = [self.lq_root, self.gt_root, self.flow_root]
self.io_backend_opt['client_keys'] = ['lq', 'gt', 'flow']
else:
self.io_backend_opt['db_paths'] = [self.lq_root, self.gt_root]
self.io_backend_opt['client_keys'] = ['lq', 'gt']
# temporal augmentation configs
self.interval_list = opt.get('interval_list', [1])
self.random_reverse = opt.get('random_reverse', False)
interval_str = ','.join(str(x) for x in self.interval_list)
print(f'Temporal augmentation interval list: [{interval_str}]; '
f'random reverse is {self.random_reverse}.')
def __getitem__(self, index):
if self.file_client is None:
self.file_client = utils_video.FileClient(self.io_backend_opt.pop('type'), **self.io_backend_opt)
key = self.keys[index]
total_num_frames = self.total_num_frames[index]
start_frames = self.start_frames[index]
clip_name, frame_name = key.split('/') # key example: 000/00000000
# determine the neighboring frames
interval = random.choice(self.interval_list)
# ensure not exceeding the borders
start_frame_idx = int(frame_name)
endmost_start_frame_idx = start_frames + total_num_frames - self.num_frame * interval
if start_frame_idx > endmost_start_frame_idx:
start_frame_idx = random.randint(start_frames, endmost_start_frame_idx)
end_frame_idx = start_frame_idx + self.num_frame * interval
neighbor_list = list(range(start_frame_idx, end_frame_idx, interval))
# random reverse
if self.random_reverse and random.random() < 0.5:
neighbor_list.reverse()
# get the neighboring LQ and GT frames
img_lqs = []
img_gts = []
for neighbor in neighbor_list:
if self.is_lmdb:
img_lq_path = f'{clip_name}/{neighbor:{self.filename_tmpl}}'
img_gt_path = f'{clip_name}/{neighbor:{self.filename_tmpl}}'
else:
img_lq_path = self.lq_root / clip_name / f'{neighbor:{self.filename_tmpl}}.{self.filename_ext}'
img_gt_path = self.gt_root / clip_name / f'{neighbor:{self.filename_tmpl}}.{self.filename_ext}'
# get LQ
img_bytes = self.file_client.get(img_lq_path, 'lq')
img_lq = utils_video.imfrombytes(img_bytes, float32=True)
img_lqs.append(img_lq)
# get GT
img_bytes = self.file_client.get(img_gt_path, 'gt')
img_gt = utils_video.imfrombytes(img_bytes, float32=True)
img_gts.append(img_gt)
# randomly crop
img_gts, img_lqs = utils_video.paired_random_crop(img_gts, img_lqs, self.gt_size, self.scale, img_gt_path)
# augmentation - flip, rotate
img_lqs.extend(img_gts)
img_results = utils_video.augment(img_lqs, self.opt['use_hflip'], self.opt['use_rot'])
img_results = utils_video.img2tensor(img_results)
img_gts = torch.stack(img_results[len(img_lqs) // 2:], dim=0)
img_lqs = torch.stack(img_results[:len(img_lqs) // 2], dim=0)
# img_lqs: (t, c, h, w)
# img_gts: (t, c, h, w)
# key: str
return {'L': img_lqs, 'H': img_gts, 'key': key}
def __len__(self):
return len(self.keys)
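# Meta-info parsing sketch (hedged): __init__ above expects four space-
# separated fields per line; e.g. the DAVIS entry "bear 82 (480,854,3) 00000"
# expands, with the default '08d' filename template, to the keys below.
#
#     line = 'bear 82 (480,854,3) 00000'
#     folder, frame_num, _, start_frame = line.split(' ')
#     keys = [f'{folder}/{i:08d}'
#             for i in range(int(start_frame), int(start_frame) + int(frame_num))]
#     # keys[0] == 'bear/00000000', keys[-1] == 'bear/00000081'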
class VideoRecurrentTrainNonblindDenoisingDataset(VideoRecurrentTrainDataset):
"""Video dataset for training recurrent architectures in non-blind video denoising.
Args:
Same as VideoTestDataset.
"""
def __init__(self, opt):
super(VideoRecurrentTrainNonblindDenoisingDataset, self).__init__(opt)
self.sigma_min = self.opt['sigma_min'] / 255.
self.sigma_max = self.opt['sigma_max'] / 255.
def __getitem__(self, index):
if self.file_client is None:
self.file_client = utils_video.FileClient(self.io_backend_opt.pop('type'), **self.io_backend_opt)
key = self.keys[index]
total_num_frames = self.total_num_frames[index]
start_frames = self.start_frames[index]
clip_name, frame_name = key.split('/') # key example: 000/00000000
# determine the neighboring frames
interval = random.choice(self.interval_list)
# ensure not exceeding the borders
start_frame_idx = int(frame_name)
endmost_start_frame_idx = start_frames + total_num_frames - self.num_frame * interval
if start_frame_idx > endmost_start_frame_idx:
start_frame_idx = random.randint(start_frames, endmost_start_frame_idx)
end_frame_idx = start_frame_idx + self.num_frame * interval
neighbor_list = list(range(start_frame_idx, end_frame_idx, interval))
# random reverse
if self.random_reverse and random.random() < 0.5:
neighbor_list.reverse()
# get the neighboring GT frames
img_gts = []
for neighbor in neighbor_list:
if self.is_lmdb:
img_gt_path = f'{clip_name}/{neighbor:{self.filename_tmpl}}'
else:
img_gt_path = self.gt_root / clip_name / f'{neighbor:{self.filename_tmpl}}.{self.filename_ext}'
# get GT
img_bytes = self.file_client.get(img_gt_path, 'gt')
img_gt = utils_video.imfrombytes(img_bytes, float32=True)
img_gts.append(img_gt)
# randomly crop
img_gts, _ = utils_video.paired_random_crop(img_gts, img_gts, self.gt_size, 1, img_gt_path)
# augmentation - flip, rotate
img_gts = utils_video.augment(img_gts, self.opt['use_hflip'], self.opt['use_rot'])
img_gts = utils_video.img2tensor(img_gts)
img_gts = torch.stack(img_gts, dim=0)
# we add noise in the network
noise_level = torch.empty((1, 1, 1, 1)).uniform_(self.sigma_min, self.sigma_max)
noise = torch.normal(mean=0, std=noise_level.expand_as(img_gts))
img_lqs = img_gts + noise
t, _, h, w = img_lqs.shape
img_lqs = torch.cat([img_lqs, noise_level.expand(t, 1, h, w)], 1)
# img_lqs: (t, c, h, w)
# img_gts: (t, c, h, w)
# key: str
return {'L': img_lqs, 'H': img_gts, 'key': key}
def __len__(self):
return len(self.keys)
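# Noise-synthesis sketch (hedged): mirrors __getitem__ above -- one noise
# level per clip, drawn uniformly from [sigma_min, sigma_max], appended as an
# extra input channel so the network is non-blind. Shapes are assumptions.
#
#     import torch
#     t, c, h, w = 6, 3, 256, 256
#     img_gts = torch.rand(t, c, h, w)
#     noise_level = torch.empty(1, 1, 1, 1).uniform_(0 / 255., 50 / 255.)
#     img_lqs = img_gts + torch.normal(mean=0., std=noise_level.expand_as(img_gts))
#     img_lqs = torch.cat([img_lqs, noise_level.expand(t, 1, h, w)], 1)   # (t, c + 1, h, w)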
class VideoRecurrentTrainVimeoDataset(data.Dataset):
"""Vimeo90K dataset for training recurrent networks.
The keys are generated from a meta info txt file.
basicsr/data/meta_info/meta_info_Vimeo90K_train_GT.txt
Each line contains:
1. clip name; 2. frame number; 3. image shape, separated by a white space.
Examples:
00001/0001 7 (256,448,3)
00001/0002 7 (256,448,3)
Key examples: "00001/0001"
GT (gt): Ground-Truth;
LQ (lq): Low-Quality, e.g., low-resolution/blurry/noisy/compressed frames.
    The neighboring frame list for different num_frame:
        num_frame | frame list
        1         | 4
        3         | 3,4,5
        5         | 2,3,4,5,6
        7         | 1,2,3,4,5,6,7
Args:
opt (dict): Config for train dataset. It contains the following keys:
dataroot_gt (str): Data root path for gt.
dataroot_lq (str): Data root path for lq.
meta_info_file (str): Path for meta information file.
io_backend (dict): IO backend type and other kwarg.
num_frame (int): Window size for input frames.
            gt_size (int): Cropped patch size for gt patches.
random_reverse (bool): Random reverse input frames.
use_hflip (bool): Use horizontal flips.
use_rot (bool): Use rotation (use vertical flip and transposing h
and w for implementation).
            scale (int): Scale factor, which will be added automatically.
"""
def __init__(self, opt):
super(VideoRecurrentTrainVimeoDataset, self).__init__()
self.opt = opt
self.gt_root, self.lq_root = Path(opt['dataroot_gt']), Path(opt['dataroot_lq'])
with open(opt['meta_info_file'], 'r') as fin:
self.keys = [line.split(' ')[0] for line in fin]
# file client (io backend)
self.file_client = None
self.io_backend_opt = opt['io_backend']
self.is_lmdb = False
if self.io_backend_opt['type'] == 'lmdb':
self.is_lmdb = True
self.io_backend_opt['db_paths'] = [self.lq_root, self.gt_root]
self.io_backend_opt['client_keys'] = ['lq', 'gt']
# indices of input images
self.neighbor_list = [i + (9 - opt['num_frame']) // 2 for i in range(opt['num_frame'])]
# temporal augmentation configs
self.random_reverse = opt['random_reverse']
print(f'Random reverse is {self.random_reverse}.')
self.flip_sequence = opt.get('flip_sequence', False)
self.pad_sequence = opt.get('pad_sequence', False)
        self.neighbor_list = [1, 2, 3, 4, 5, 6, 7]  # hardcoded override: always use the full 7-frame septuplet
def __getitem__(self, index):
if self.file_client is None:
self.file_client = utils_video.FileClient(self.io_backend_opt.pop('type'), **self.io_backend_opt)
# random reverse
if self.random_reverse and random.random() < 0.5:
self.neighbor_list.reverse()
scale = self.opt['scale']
gt_size = self.opt['gt_size']
key = self.keys[index]
clip, seq = key.split('/') # key example: 00001/0001
# get the neighboring LQ and GT frames
img_lqs = []
img_gts = []
for neighbor in self.neighbor_list:
if self.is_lmdb:
img_lq_path = f'{clip}/{seq}/im{neighbor}'
img_gt_path = f'{clip}/{seq}/im{neighbor}'
else:
img_lq_path = self.lq_root / clip / seq / f'im{neighbor}.png'
img_gt_path = self.gt_root / clip / seq / f'im{neighbor}.png'
# LQ
img_bytes = self.file_client.get(img_lq_path, 'lq')
img_lq = utils_video.imfrombytes(img_bytes, float32=True)
# GT
img_bytes = self.file_client.get(img_gt_path, 'gt')
img_gt = utils_video.imfrombytes(img_bytes, float32=True)
img_lqs.append(img_lq)
img_gts.append(img_gt)
# randomly crop
img_gts, img_lqs = utils_video.paired_random_crop(img_gts, img_lqs, gt_size, scale, img_gt_path)
# augmentation - flip, rotate
img_lqs.extend(img_gts)
img_results = utils_video.augment(img_lqs, self.opt['use_hflip'], self.opt['use_rot'])
img_results = utils_video.img2tensor(img_results)
img_lqs = torch.stack(img_results[:7], dim=0)
img_gts = torch.stack(img_results[7:], dim=0)
if self.flip_sequence: # flip the sequence: 7 frames to 14 frames
img_lqs = torch.cat([img_lqs, img_lqs.flip(0)], dim=0)
img_gts = torch.cat([img_gts, img_gts.flip(0)], dim=0)
elif self.pad_sequence: # pad the sequence: 7 frames to 8 frames
img_lqs = torch.cat([img_lqs, img_lqs[-1:,...]], dim=0)
img_gts = torch.cat([img_gts, img_gts[-1:,...]], dim=0)
# img_lqs: (t, c, h, w)
        # img_gts: (t, c, h, w)
# key: str
return {'L': img_lqs, 'H': img_gts, 'key': key}
def __len__(self):
return len(self.keys)
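# Usage sketch (hedged): a plausible opt dict for VideoRecurrentTrainVimeoDataset;
# paths and augmentation flags are assumptions. Note the hardcoded
# neighbor_list above: all 7 frames are always used.
#
#     opt = {'dataroot_gt': '/path/to/vimeo90k/GT',
#            'dataroot_lq': '/path/to/vimeo90k/LQ',
#            'meta_info_file': 'meta_info_Vimeo90K_train_GT.txt',
#            'io_backend': {'type': 'disk'},
#            'num_frame': 7, 'gt_size': 256, 'scale': 1,
#            'random_reverse': True, 'use_hflip': True, 'use_rot': True,
#            'flip_sequence': False, 'pad_sequence': False}
#     train_set = VideoRecurrentTrainVimeoDataset(opt)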
bear 82 (480,854,3) 00000
bike-packing 69 (480,854,3) 00000
blackswan 50 (480,854,3) 00000
bmx-bumps 90 (480,854,3) 00000
bmx-trees 80 (480,854,3) 00000
boat 75 (480,854,3) 00000
boxing-fisheye 87 (480,854,3) 00000
breakdance 84 (480,854,3) 00000
breakdance-flare 71 (480,854,3) 00000
bus 80 (480,854,3) 00000
camel 90 (480,854,3) 00000
car-roundabout 75 (480,854,3) 00000
car-shadow 40 (480,854,3) 00000
car-turn 80 (480,854,3) 00000
cat-girl 89 (480,854,3) 00000
classic-car 63 (480,854,3) 00000
color-run 84 (480,854,3) 00000
cows 104 (480,854,3) 00000
crossing 52 (480,854,3) 00000
dance-jump 60 (480,854,3) 00000
dance-twirl 90 (480,854,3) 00000
dancing 62 (480,854,3) 00000
disc-jockey 76 (480,854,3) 00000
dog 60 (480,854,3) 00000
dog-agility 25 (480,854,3) 00000
dog-gooses 86 (480,854,3) 00000
dogs-jump 66 (480,854,3) 00000
dogs-scale 83 (480,854,3) 00000
drift-chicane 52 (480,854,3) 00000
drift-straight 50 (480,854,3) 00000
drift-turn 64 (480,854,3) 00000
drone 91 (480,854,3) 00000
elephant 80 (480,854,3) 00000
flamingo 80 (480,854,3) 00000
goat 90 (480,854,3) 00000
gold-fish 78 (480,854,3) 00000
hike 80 (480,854,3) 00000
hockey 75 (480,854,3) 00000
horsejump-high 50 (480,854,3) 00000
horsejump-low 60 (480,854,3) 00000
india 81 (480,854,3) 00000
judo 34 (480,854,3) 00000
kid-football 68 (480,854,3) 00000
kite-surf 50 (480,854,3) 00000
kite-walk 80 (480,854,3) 00000
koala 100 (480,854,3) 00000
lab-coat 47 (480,854,3) 00000
lady-running 65 (480,854,3) 00000
libby 49 (480,854,3) 00000
lindy-hop 73 (480,854,3) 00000
loading 50 (480,854,3) 00000
longboard 52 (480,854,3) 00000
lucia 70 (480,854,3) 00000
mallard-fly 70 (480,854,3) 00000
mallard-water 80 (480,854,3) 00000
mbike-trick 79 (480,854,3) 00000
miami-surf 70 (480,854,3) 00000
motocross-bumps 60 (480,854,3) 00000
motocross-jump 40 (480,854,3) 00000
motorbike 43 (480,854,3) 00000
night-race 46 (480,854,3) 00000
paragliding 70 (480,854,3) 00000
paragliding-launch 80 (480,854,3) 00000
parkour 100 (480,854,3) 00000
pigs 79 (480,854,3) 00000
planes-water 38 (480,854,3) 00000
rallye 50 (480,854,3) 00000
rhino 90 (480,854,3) 00000
rollerblade 35 (480,854,3) 00000
schoolgirls 80 (480,854,3) 00000
scooter-black 43 (480,854,3) 00000
scooter-board 91 (480,854,3) 00000
scooter-gray 75 (480,854,3) 00000
sheep 68 (480,854,3) 00000
shooting 40 (480,854,3) 00000
skate-park 80 (480,854,3) 00000
snowboard 66 (480,854,3) 00000
soapbox 99 (480,854,3) 00000
soccerball 48 (480,854,3) 00000
stroller 91 (480,854,3) 00000
stunt 71 (480,854,3) 00000
surf 55 (480,854,3) 00000
swing 60 (480,854,3) 00000
tennis 70 (480,854,3) 00000
tractor-sand 76 (480,854,3) 00000
train 80 (480,854,3) 00000
tuk-tuk 59 (480,854,3) 00000
upside-down 65 (480,854,3) 00000
varanus-cage 67 (480,854,3) 00000
walking 72 (480,854,3) 00000