Commit 3bfb5c80 authored by yongshk

Initial commit

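# .gitignore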
debug*
checkpoints/
results/
build/
dist/
torch.egg-info/
*/**/__pycache__
torch/version.py
torch/csrc/generic/TensorMethods.cpp
torch/lib/*.so*
torch/lib/*.dylib*
torch/lib/*.h
torch/lib/build
torch/lib/tmp_install
torch/lib/include
torch/lib/torch_shm_manager
torch/csrc/cudnn/cuDNN.cpp
torch/csrc/nn/THNN.cwrap
torch/csrc/nn/THNN.cpp
torch/csrc/nn/THCUNN.cwrap
torch/csrc/nn/THCUNN.cpp
torch/csrc/nn/THNN_generic.cwrap
torch/csrc/nn/THNN_generic.cpp
torch/csrc/nn/THNN_generic.h
docs/src/**/*
test/data/legacy_modules.t7
test/data/gpu_tensors.pt
test/htmlcov
test/.coverage
*/*.pyc
*/**/*.pyc
*/**/**/*.pyc
*/**/**/**/*.pyc
*/**/**/**/**/*.pyc
*/*.so*
*/**/*.so*
*/**/*.dylib*
test/data/legacy_serialized.pt
*.DS_Store
*~
--------------------------- LICENSE FOR pix2pixHD ----------------
Copyright (C) 2019 NVIDIA Corporation. Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu.
BSD License. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR ANY PARTICULAR PURPOSE.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING
OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
--------------------------- LICENSE FOR pytorch-CycleGAN-and-pix2pix ----------------
Copyright (c) 2017, Jun-Yan Zhu and Taesung Park
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Introduction

Pix2pixHD is an image-to-image translation model: it converts an input image into a particular kind of output image. Models of this kind have a wide range of applications, such as turning sketches into realistic photos, low-resolution images into high-resolution ones, or black-and-white images into color.

Pix2pixHD is an improved version of pix2pix, with gains in resolution, visual quality, visual consistency, and training efficiency. It introduces multi-scale discriminators, which capture global and local information at different scales and thereby improve image quality. In addition, Pix2pixHD uses semantic segmentation to decompose the input image into distinct regions and process them, which improves the model's visual consistency.
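As a rough illustration of the multi-scale discriminator idea, here is a minimal PyTorch sketch (not the project's actual `models/networks.py` implementation; the layer sizes and the number of scales are illustrative assumptions):

```python
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """A small PatchGAN-style discriminator (illustrative layer sizes)."""
    def __init__(self, in_ch=3, ndf=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, ndf, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf, ndf * 2, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf * 2, 1, 4, stride=1, padding=1))  # per-patch real/fake score map

    def forward(self, x):
        return self.net(x)

class MultiScaleDiscriminator(nn.Module):
    """Apply discriminators of the same architecture at several scales:
    the full-resolution copy judges fine detail, while the downsampled
    copies judge increasingly global structure."""
    def __init__(self, num_d=3):
        super().__init__()
        self.discriminators = nn.ModuleList([PatchDiscriminator() for _ in range(num_d)])
        self.downsample = nn.AvgPool2d(3, stride=2, padding=1, count_include_pad=False)

    def forward(self, x):
        outputs = []
        for d in self.discriminators:
            outputs.append(d(x))
            x = self.downsample(x)  # halve the resolution for the next scale
        return outputs  # one score map per scale
```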
# Test Procedure

## Install the toolkit

PyTorch 1.10 [1.10.0a0+git2040069-dtk2210]

## Load environment variables
```
export PATH={PYTHON3_install_dir}/bin:$PATH
export LD_LIBRARY_PATH={PYTHON3_install_dir}/lib:$LD_LIBRARY_PATH
```
## Download the dataset

Dataset: Cityscapes
<https://www.cityscapes-dataset.com/>
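After extraction, the data loader (see `data/aligned_dataset.py` below) derives each input folder from the dataset root, the phase, and a fixed suffix. A small sketch of the expected layout, assuming the dataset is extracted to `./datasets/cityscapes`:

```python
import os

dataroot = './datasets/cityscapes'  # assumed extraction target
phase = 'train'                     # 'test' for inference

# With the default label_nc != 0, AlignedDataset looks for these folders:
label_dir = os.path.join(dataroot, phase + '_label')  # semantic label maps
img_dir = os.path.join(dataroot, phase + '_img')      # real photos (training only)
inst_dir = os.path.join(dataroot, phase + '_inst')    # instance maps (unless --no_instance)

for d in (label_dir, img_dir, inst_dir):
    print(d, 'exists' if os.path.isdir(d) else 'MISSING')
```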
## Edit the configuration file

In options/base_options.py, set the checkpoint directory and the dataset root (the `/../` defaults below are placeholders; since these are argparse options, they can also be overridden on the command line):
```python
self.parser.add_argument('--checkpoints_dir', type=str, default='/../', help='models are saved here')
self.parser.add_argument('--dataroot', type=str, default='/../pix2pixHD/datasets/cityscapes')
```
# Running Commands

## Train the model

Train the model at 1024 x 512 resolution (`bash ./scripts/train_512p.sh`):
```
#!./scripts/train_512p.sh
python train.py --name label2city_512p
```
## Test

- The `datasets` folder contains some example Cityscapes test images.
- Please download the pre-trained Cityscapes model [from here](https://drive.google.com/file/d/1h9SykUnuZul7J3Nbms2QGH1wa85nbN2-/view?usp=sharing) (Google Drive link) and place it under `./checkpoints/label2city_1024p/`.
- Test the model (`bash ./scripts/test_1024p.sh`):
```
#!./scripts/test_1024p.sh
python test.py --name label2city_1024p --netG local --ngf 32 --resize_or_crop none
```
The test results will be saved to an HTML file: `./results/label2city_1024p/test_latest/index.html`

# References
[https://github.com/NVIDIA/pix2pixHD](https://github.com/NVIDIA/pix2pixHD)
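# _config.yml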
theme: jekyll-theme-minimal
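
# data/aligned_dataset.py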
import os.path
from data.base_dataset import BaseDataset, get_params, get_transform, normalize
from data.image_folder import make_dataset
from PIL import Image
class AlignedDataset(BaseDataset):
    def initialize(self, opt):
        self.opt = opt
        self.root = opt.dataroot

        ### input A (label maps)
        dir_A = '_A' if self.opt.label_nc == 0 else '_label'
        self.dir_A = os.path.join(opt.dataroot, opt.phase + dir_A)
        self.A_paths = sorted(make_dataset(self.dir_A))

        ### input B (real images)
        if opt.isTrain or opt.use_encoded_image:
            dir_B = '_B' if self.opt.label_nc == 0 else '_img'
            self.dir_B = os.path.join(opt.dataroot, opt.phase + dir_B)
            self.B_paths = sorted(make_dataset(self.dir_B))

        ### instance maps
        if not opt.no_instance:
            self.dir_inst = os.path.join(opt.dataroot, opt.phase + '_inst')
            self.inst_paths = sorted(make_dataset(self.dir_inst))

        ### load precomputed instance-wise encoded features
        if opt.load_features:
            self.dir_feat = os.path.join(opt.dataroot, opt.phase + '_feat')
            print('----------- loading features from %s ----------' % self.dir_feat)
            self.feat_paths = sorted(make_dataset(self.dir_feat))

        self.dataset_size = len(self.A_paths)

    def __getitem__(self, index):
        ### input A (label maps)
        A_path = self.A_paths[index]
        A = Image.open(A_path)
        # Sample crop/flip parameters once so that A, B, and the instance map
        # all receive identical augmentation.
        params = get_params(self.opt, A.size)
        if self.opt.label_nc == 0:
            transform_A = get_transform(self.opt, params)
            A_tensor = transform_A(A.convert('RGB'))
        else:
            # Label maps are resized with nearest-neighbour interpolation and
            # kept as raw class indices (no normalization; * 255 undoes ToTensor's scaling).
            transform_A = get_transform(self.opt, params, method=Image.NEAREST, normalize=False)
            A_tensor = transform_A(A) * 255.0

        B_tensor = inst_tensor = feat_tensor = 0
        ### input B (real images)
        if self.opt.isTrain or self.opt.use_encoded_image:
            B_path = self.B_paths[index]
            B = Image.open(B_path).convert('RGB')
            transform_B = get_transform(self.opt, params)
            B_tensor = transform_B(B)

        ### if using instance maps
        if not self.opt.no_instance:
            inst_path = self.inst_paths[index]
            inst = Image.open(inst_path)
            inst_tensor = transform_A(inst)

            if self.opt.load_features:
                feat_path = self.feat_paths[index]
                feat = Image.open(feat_path).convert('RGB')
                norm = normalize()
                feat_tensor = norm(transform_A(feat))

        input_dict = {'label': A_tensor, 'inst': inst_tensor, 'image': B_tensor,
                      'feat': feat_tensor, 'path': A_path}
        return input_dict

    def __len__(self):
        # Round down to a multiple of the batch size so every batch is full.
        return len(self.A_paths) // self.opt.batchSize * self.opt.batchSize

    def name(self):
        return 'AlignedDataset'
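
# data/base_data_loader.py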
class BaseDataLoader():
    def __init__(self):
        pass

    def initialize(self, opt):
        self.opt = opt

    def load_data(self):
        return None
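
# data/base_dataset.py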
import torch.utils.data as data
from PIL import Image
import torchvision.transforms as transforms
import numpy as np
import random
class BaseDataset(data.Dataset):
    def __init__(self):
        super(BaseDataset, self).__init__()

    def name(self):
        return 'BaseDataset'

    def initialize(self, opt):
        pass

def get_params(opt, size):
    # Decide the crop position and flip once per sample; AlignedDataset applies
    # the resulting transform to the label map, image, and instance map so that
    # all of them receive identical augmentation.
    w, h = size
    new_h = h
    new_w = w
    if opt.resize_or_crop == 'resize_and_crop':
        new_h = new_w = opt.loadSize
    elif opt.resize_or_crop == 'scale_width_and_crop':
        new_w = opt.loadSize
        new_h = opt.loadSize * h // w

    x = random.randint(0, np.maximum(0, new_w - opt.fineSize))
    y = random.randint(0, np.maximum(0, new_h - opt.fineSize))
    flip = random.random() > 0.5
    return {'crop_pos': (x, y), 'flip': flip}

def get_transform(opt, params, method=Image.BICUBIC, normalize=True):
    transform_list = []
    if 'resize' in opt.resize_or_crop:
        osize = [opt.loadSize, opt.loadSize]
        # transforms.Scale was renamed to Resize in torchvision; Resize is the
        # spelling supported on recent versions.
        transform_list.append(transforms.Resize(osize, method))
    elif 'scale_width' in opt.resize_or_crop:
        transform_list.append(transforms.Lambda(lambda img: __scale_width(img, opt.loadSize, method)))

    if 'crop' in opt.resize_or_crop:
        transform_list.append(transforms.Lambda(lambda img: __crop(img, params['crop_pos'], opt.fineSize)))

    if opt.resize_or_crop == 'none':
        # Even without resizing, snap both sides to a multiple of 2**n so the
        # generator's downsample/upsample path reproduces the input size.
        base = float(2 ** opt.n_downsample_global)
        if opt.netG == 'local':
            base *= (2 ** opt.n_local_enhancers)
        transform_list.append(transforms.Lambda(lambda img: __make_power_2(img, base, method)))

    if opt.isTrain and not opt.no_flip:
        transform_list.append(transforms.Lambda(lambda img: __flip(img, params['flip'])))

    transform_list += [transforms.ToTensor()]
    if normalize:
        transform_list += [transforms.Normalize((0.5, 0.5, 0.5),
                                                (0.5, 0.5, 0.5))]
    return transforms.Compose(transform_list)

def normalize():
    return transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))

def __make_power_2(img, base, method=Image.BICUBIC):
    ow, oh = img.size
    h = int(round(oh / base) * base)
    w = int(round(ow / base) * base)
    if (h == oh) and (w == ow):
        return img
    return img.resize((w, h), method)

def __scale_width(img, target_width, method=Image.BICUBIC):
    ow, oh = img.size
    if (ow == target_width):
        return img
    w = target_width
    h = int(target_width * oh / ow)
    return img.resize((w, h), method)

def __crop(img, pos, size):
    ow, oh = img.size
    x1, y1 = pos
    tw = th = size
    if (ow > tw or oh > th):
        return img.crop((x1, y1, x1 + tw, y1 + th))
    return img

def __flip(img, flip):
    if flip:
        return img.transpose(Image.FLIP_LEFT_RIGHT)
    return img
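
# data/custom_dataset_data_loader.py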
import torch.utils.data
from data.base_data_loader import BaseDataLoader
def CreateDataset(opt):
    dataset = None
    from data.aligned_dataset import AlignedDataset
    dataset = AlignedDataset()

    print("dataset [%s] was created" % (dataset.name()))
    dataset.initialize(opt)
    return dataset

class CustomDatasetDataLoader(BaseDataLoader):
    def name(self):
        return 'CustomDatasetDataLoader'

    def initialize(self, opt):
        BaseDataLoader.initialize(self, opt)
        self.dataset = CreateDataset(opt)
        self.dataloader = torch.utils.data.DataLoader(
            self.dataset,
            batch_size=opt.batchSize,
            shuffle=not opt.serial_batches,
            num_workers=int(opt.nThreads))

    def load_data(self):
        return self.dataloader

    def __len__(self):
        return min(len(self.dataset), self.opt.max_dataset_size)
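
# data/data_loader.py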
def CreateDataLoader(opt):
    from data.custom_dataset_data_loader import CustomDatasetDataLoader
    data_loader = CustomDatasetDataLoader()
    print(data_loader.name())
    data_loader.initialize(opt)
    return data_loader
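
# A hypothetical smoke test (not part of the original file). The real `opt`
# object comes from options/base_options.py; the Namespace below is an assumed
# stand-in covering every attribute the dataset and loader classes read.
if __name__ == '__main__':
    from argparse import Namespace
    opt = Namespace(
        dataroot='./datasets/cityscapes', phase='test', label_nc=35,
        isTrain=False, use_encoded_image=False, no_instance=False,
        load_features=False, batchSize=1, serial_batches=True, nThreads=2,
        max_dataset_size=float('inf'), resize_or_crop='none',
        loadSize=1024, fineSize=512, no_flip=True,
        netG='local', n_downsample_global=4, n_local_enhancers=1)
    loader = CreateDataLoader(opt)
    for data in loader.load_data():
        print(data['label'].shape, data['path'])
        break

# data/image_folder.py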
###############################################################################
# Code from
# https://github.com/pytorch/vision/blob/master/torchvision/datasets/folder.py
# Modified the original code so that it also loads images from the current
# directory as well as the subdirectories
###############################################################################
import torch.utils.data as data
from PIL import Image
import os
IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP', '.tiff'
]

def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)

def make_dataset(dir):
    images = []
    assert os.path.isdir(dir), '%s is not a valid directory' % dir

    for root, _, fnames in sorted(os.walk(dir)):
        for fname in fnames:
            if is_image_file(fname):
                path = os.path.join(root, fname)
                images.append(path)

    return images

def default_loader(path):
    return Image.open(path).convert('RGB')

class ImageFolder(data.Dataset):
    def __init__(self, root, transform=None, return_paths=False,
                 loader=default_loader):
        imgs = make_dataset(root)
        if len(imgs) == 0:
            raise(RuntimeError("Found 0 images in: " + root + "\n"
                               "Supported image extensions are: " +
                               ",".join(IMG_EXTENSIONS)))
        self.root = root
        self.imgs = imgs
        self.transform = transform
        self.return_paths = return_paths
        self.loader = loader

    def __getitem__(self, index):
        path = self.imgs[index]
        img = self.loader(path)
        if self.transform is not None:
            img = self.transform(img)
        if self.return_paths:
            return img, path
        else:
            return img

    def __len__(self):
        return len(self.imgs)