_base_ = [
'../_base_/models/improved_ddpm/ddpm_64x64.py',
'../_base_/datasets/imagenet_noaug_64.py', '../_base_/default_runtime.py'
]
lr_config = None
checkpoint_config = dict(interval=10000, by_epoch=False, max_keep_ckpts=20)
custom_hooks = [
dict(
type='MMGenVisualizationHook',
output_dir='training_samples',
res_name_list=['real_imgs', 'x_0_pred', 'x_t', 'x_t_1'],
padding=1,
interval=1000),
dict(
type='ExponentialMovingAverageHook',
module_keys=('denoising_ema', ),
interval=1,
start_iter=0,
interp_cfg=dict(momentum=0.9999),
priority='VERY_HIGH')
]
# do not evaluate during training because evaluation takes too much time.
evaluation = None
total_iters = 1500000 # 1500k
data = dict(samples_per_gpu=16) # 8x16=128
# use ddp wrapper for faster training
use_ddp_wrapper = True
find_unused_parameters = False
runner = dict(
type='DynamicIterBasedRunner',
is_dynamic_ddp=False, # Note that this flag should be False.
pass_training_status=True)
inception_pkl = './work_dirs/inception_pkl/imagenet_64x64.pkl'
metrics = dict(
fid50k=dict(
type='FID',
num_images=50000,
bgr2rgb=True,
inception_pkl=inception_pkl,
inception_args=dict(type='StyleGAN')))
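Since all of the configs on this page rely on `_base_` inheritance, it can help to inspect the fully merged result. A minimal sketch, assuming an mmgeneration checkout with `mmcv` installed (adjust the config path to your local tree):

```python
# Load a config and let mmcv merge the _base_ files with the overrides above.
from mmcv import Config

cfg = Config.fromfile(
    'configs/improved_ddpm/'
    'ddpm_cosine_hybird_timestep-4k_imagenet1k_64x64_b8x16_1500k.py')
print(cfg.data.samples_per_gpu)          # 16, as set above
print(cfg.total_iters)                   # 1500000
print(cfg.metrics.fid50k.inception_pkl)  # the inception_pkl path set above
```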
Collections:
- Metadata:
Architecture:
- Improved-DDPM
Name: Improved-DDPM
Paper:
- https://arxiv.org/abs/2102.09672
README: configs/improved_ddpm/README.md
Models:
- Config: https://github.com/open-mmlab/mmgeneration/blob/master/configs/improved_ddpm/ddpm_cosine_hybird_timestep-4k_drop0.3_cifar10_32x32_b8x16_500k.py
In Collection: Improved-DDPM
Metadata:
Training Data: CIFAR
Name: ddpm_cosine_hybird_timestep-4k_drop0.3_cifar10_32x32_b8x16_500k
Results:
- Dataset: CIFAR
Metrics:
FID: 3.8848
Task: Denoising Diffusion Probabilistic Models
Weights: https://download.openmmlab.com/mmgen/improved_ddpm/ddpm_cosine_hybird_timestep-4k_drop0.3_cifar10_32x32_b8x16_500k_20220103_222621-2f42f476.pth
- Config: https://github.com/open-mmlab/mmgeneration/blob/master/configs/improved_ddpm/ddpm_cosine_hybird_timestep-4k_imagenet1k_64x64_b8x16_1500k.py
In Collection: Improved-DDPM
Metadata:
Training Data: IMAGENET
Name: ddpm_cosine_hybird_timestep-4k_imagenet1k_64x64_b8x16_1500k
Results:
- Dataset: IMAGENET
Metrics:
FID: 13.5181
Task: Denoising Diffusion Probabilistic Models
Weights: https://download.openmmlab.com/mmgen/improved_ddpm/ddpm_cosine_hybird_timestep-4k_imagenet1k_64x64_b8x16_1500k_20220103_223919-b8f1a310.pth
- Config: https://github.com/open-mmlab/mmgeneration/blob/master/configs/improved_ddpm/ddpm_cosine_hybird_timestep-4k_drop0.3_imagenet1k_64x64_b8x16_1500k.py
In Collection: Improved-DDPM
Metadata:
Training Data: IMAGENET
Name: ddpm_cosine_hybird_timestep-4k_drop0.3_imagenet1k_64x64_b8x16_1500k
Results:
- Dataset: IMAGENET
Metrics:
FID: 13.4094
Task: Denoising Diffusion Probabilistic Models
Weights: https://download.openmmlab.com/mmgen/improved_ddpm/ddpm_cosine_hybird_timestep-4k_drop0.3_imagenet1k_64x64_b8x16_1500k_20220103_224427-7bb55975.pth
# LSGAN
> [Least Squares Generative Adversarial Networks](https://openaccess.thecvf.com/content_iccv_2017/html/Mao_Least_Squares_Generative_ICCV_2017_paper.html)
<!-- [ALGORITHM] -->
## Abstract
<!-- [ABSTRACT] -->
Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome such a problem, we propose in this paper the Least Squares Generative Adversarial Networks (LSGANs) which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN yields minimizing the Pearson χ² divergence. There are two benefits of LSGANs over regular GANs. First, LSGANs are able to generate higher quality images than regular GANs. Second, LSGANs perform more stably during the learning process. We evaluate LSGANs on five scene datasets and the experimental results show that the images generated by LSGANs are of better quality than the ones generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs.
<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/28132635/143052264-afd97b91-5fd1-4134-ad4d-529e364fdcc8.JPG"/>
</div>
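The key change the abstract describes is swapping the sigmoid cross-entropy objective for a least-squares one. A minimal PyTorch sketch of those losses (illustrative only, using the common 0/1 target coding; not MMGeneration's `GANLoss` implementation):

```python
import torch
import torch.nn.functional as F

def lsgan_d_loss(real_logits, fake_logits):
    # Discriminator: pull real outputs toward 1 and fake outputs toward 0.
    return 0.5 * (F.mse_loss(real_logits, torch.ones_like(real_logits)) +
                  F.mse_loss(fake_logits, torch.zeros_like(fake_logits)))

def lsgan_g_loss(fake_logits):
    # Generator: pull the discriminator's outputs on fakes toward 1.
    return F.mse_loss(fake_logits, torch.ones_like(fake_logits))
```

In the configs below, this objective is selected simply via `gan_loss=dict(type='GANLoss', gan_type='lsgan')`.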
## Results and models
<div align="center">
<b> LSGAN 64x64, CelebA-Cropped</b>
<br/>
<img src="https://user-images.githubusercontent.com/22982797/116498716-f4e74200-a8dc-11eb-9c28-5549d96e20a6.png" width="800"/>
</div>
| Models | Dataset | SWD | MS-SSIM | FID | Config | Download |
| :-----------: | :------------: | :-----------------------------: | :-----: | :-----: | :-------------------------------------------------------------: | :---------------------------------------------------------------: |
| LSGAN 64x64 | CelebA-Cropped | 6.16, 6.83, 37.64/16.87 | 0.3216 | 11.9258 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/lsgan/lsgan_dcgan-archi_lr-1e-3_celeba-cropped_64_b128x1_12m.py) | [model](https://download.openmmlab.com/mmgen/lsgan/lsgan_celeba-cropped_dcgan-archi_lr-1e-3_64_b128x1_12m_20210429_144001-92ca1d0d.pth)\| [log](https://download.openmmlab.com/mmgen/lsgan/lsgan_celeba-cropped_dcgan-archi_lr-1e-3_64_b128x1_12m_20210422_131925.log.json) |
| LSGAN 64x64 | LSUN-Bedroom | 5.66, 9.0, 18.6/11.09 | 0.0671 | 30.7390 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/lsgan/lsgan_dcgan-archi_lr-1e-4_lsun-bedroom_64_b128x1_12m.py) | [model](https://download.openmmlab.com/mmgen/lsgan/lsgan_lsun-bedroom_dcgan-archi_lr-1e-4_64_b128x1_12m_20210429_144602-ec4ec6bb.pth)\| [log](https://download.openmmlab.com/mmgen/lsgan/lsgan_lsun-bedroom_dcgan-archi_lr-1e-4_64_b128x1_12m_20210423_005020.log.json) |
| LSGAN 128x128 | CelebA-Cropped | 21.66, 9.83, 16.06, 70.76/29.58 | 0.3691 | 38.3752 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/lsgan/lsgan_dcgan-archi_lr-1e-4_celeba-cropped_128_b64x1_10m.py) | [model](https://download.openmmlab.com/mmgen/lsgan/lsgan_celeba-cropped_dcgan-archi_lr-1e-4_128_b64x1_10m_20210429_144229-01ba67dc.pth)\| [log](https://download.openmmlab.com/mmgen/lsgan/lsgan_celeba-cropped_dcgan-archi_lr-1e-4_128_b64x1_10m_20210423_132126.log.json) |
| LSGAN 128x128 | LSUN-Bedroom | 19.52, 9.99, 7.48, 14.3/12.82 | 0.0612 | 51.5500 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/lsgan/lsgan_lsgan-archi_lr-1e-4_lsun-bedroom_128_b64x1_10m.py) | [model](https://download.openmmlab.com/mmgen/lsgan/lsgan_lsun-bedroom_lsgan-archi_lr-1e-4_128_b64x1_10m_20210429_155605-cf78c0a8.pth)\| [log](https://download.openmmlab.com/mmgen/lsgan/lsgan_lsun-bedroom_lsgan-archi_lr-1e-4_128_b64x1_10m_20210429_142302.log.json) |
## Citation
```latex
@inproceedings{mao2017least,
title={Least squares generative adversarial networks},
author={Mao, Xudong and Li, Qing and Xie, Haoran and Lau, Raymond YK and Wang, Zhen and Paul Smolley, Stephen},
booktitle={Proceedings of the IEEE international conference on computer vision},
pages={2794--2802},
year={2017},
url={https://openaccess.thecvf.com/content_iccv_2017/html/Mao_Least_Squares_Generative_ICCV_2017_paper.html},
}
```
_base_ = [
'../_base_/models/dcgan/dcgan_64x64.py',
'../_base_/datasets/unconditional_imgs_64x64.py',
'../_base_/default_runtime.py'
]
model = dict(gan_loss=dict(type='GANLoss', gan_type='lsgan'))
# define dataset
# you must set `samples_per_gpu` and `imgs_root`
data = dict(
samples_per_gpu=128,
train=dict(imgs_root='./data/celeba-cropped/cropped_images_aligned_png/'))
optimizer = dict(
generator=dict(type='Adam', lr=0.001, betas=(0.5, 0.99)),
discriminator=dict(type='Adam', lr=0.001, betas=(0.5, 0.99)))
# adjust running config
lr_config = None
checkpoint_config = dict(interval=10000, by_epoch=False, max_keep_ckpts=20)
custom_hooks = [
dict(
type='VisualizeUnconditionalSamples',
output_dir='training_samples',
interval=10000)
]
evaluation = dict(
type='GenerativeEvalHook',
interval=10000,
metrics=dict(
type='FID', num_images=50000, inception_pkl=None, bgr2rgb=True),
sample_kwargs=dict(sample_model='orig'))
total_iters = 100000
# use ddp wrapper for faster training
use_ddp_wrapper = True
find_unused_parameters = False
runner = dict(
type='DynamicIterBasedRunner',
is_dynamic_ddp=False, # Note that this flag should be False.
pass_training_status=True)
metrics = dict(
ms_ssim10k=dict(type='MS_SSIM', num_images=10000),
swd16k=dict(type='SWD', num_images=16384, image_shape=(3, 64, 64)),
fid50k=dict(type='FID', num_images=50000, inception_pkl=None))
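As the comment in the config above notes, `samples_per_gpu` and `imgs_root` must match your setup. A hedged sketch of overriding them programmatically instead of editing the file (the config path is the CelebA config linked in the table above; the data directory is a placeholder):

```python
from mmcv import Config

cfg = Config.fromfile(
    'configs/lsgan/lsgan_dcgan-archi_lr-1e-3_celeba-cropped_64_b128x1_12m.py')
# Point the training set at your own image folder and shrink the batch.
cfg.data.train.imgs_root = '/path/to/your/images'
cfg.data.samples_per_gpu = 64
```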
_base_ = [
'../_base_/models/dcgan/dcgan_128x128.py',
'../_base_/datasets/unconditional_imgs_128x128.py',
'../_base_/default_runtime.py'
]
model = dict(
discriminator=dict(output_scale=4, out_channels=1),
gan_loss=dict(type='GANLoss', gan_type='lsgan'))
# define dataset
# you must set `samples_per_gpu` and `imgs_root`
data = dict(
samples_per_gpu=64,
train=dict(imgs_root='./data/celeba-cropped/cropped_images_aligned_png/'))
optimizer = dict(
generator=dict(type='Adam', lr=0.0001, betas=(0.5, 0.99)),
discriminator=dict(type='Adam', lr=0.0001, betas=(0.5, 0.99)))
# adjust running config
lr_config = None
checkpoint_config = dict(interval=10000, by_epoch=False, max_keep_ckpts=20)
custom_hooks = [
dict(
type='VisualizeUnconditionalSamples',
output_dir='training_samples',
interval=10000)
]
evaluation = dict(
type='GenerativeEvalHook',
interval=10000,
metrics=dict(
type='FID', num_images=50000, inception_pkl=None, bgr2rgb=True),
sample_kwargs=dict(sample_model='orig'))
total_iters = 160000
# use ddp wrapper for faster training
use_ddp_wrapper = True
find_unused_parameters = False
runner = dict(
type='DynamicIterBasedRunner',
is_dynamic_ddp=False, # Note that this flag should be False.
pass_training_status=True)
metrics = dict(
ms_ssim10k=dict(type='MS_SSIM', num_images=10000),
swd16k=dict(type='SWD', num_images=16384, image_shape=(3, 128, 128)),
fid50k=dict(type='FID', num_images=50000, inception_pkl=None))
_base_ = [
'../_base_/models/dcgan/dcgan_64x64.py',
'../_base_/datasets/unconditional_imgs_64x64.py',
'../_base_/default_runtime.py'
]
model = dict(
discriminator=dict(output_scale=4, out_channels=1),
gan_loss=dict(type='GANLoss', gan_type='lsgan'))
# define dataset
# you must set `samples_per_gpu` and `imgs_root`
data = dict(
samples_per_gpu=128, train=dict(imgs_root='./data/lsun/bedroom_train'))
optimizer = dict(
generator=dict(type='Adam', lr=0.0001, betas=(0.5, 0.99)),
discriminator=dict(type='Adam', lr=0.0001, betas=(0.5, 0.99)))
# adjust running config
lr_config = None
checkpoint_config = dict(interval=10000, by_epoch=False, max_keep_ckpts=20)
custom_hooks = [
dict(
type='VisualizeUnconditionalSamples',
output_dir='training_samples',
interval=10000)
]
evaluation = dict(
type='GenerativeEvalHook',
interval=10000,
metrics=dict(
type='FID', num_images=50000, inception_pkl=None, bgr2rgb=True),
sample_kwargs=dict(sample_model='orig'))
total_iters = 100000
# use ddp wrapper for faster training
use_ddp_wrapper = True
find_unused_parameters = False
runner = dict(
type='DynamicIterBasedRunner',
is_dynamic_ddp=False, # Note that this flag should be False.
pass_training_status=True)
metrics = dict(
ms_ssim10k=dict(type='MS_SSIM', num_images=10000),
swd16k=dict(type='SWD', num_images=16384, image_shape=(3, 64, 64)),
fid50k=dict(type='FID', num_images=50000, inception_pkl=None))
_base_ = [
'../_base_/models/lsgan/lsgan_128x128.py',
'../_base_/datasets/unconditional_imgs_128x128.py',
'../_base_/default_runtime.py'
]
# define dataset
# you must set `samples_per_gpu` and `imgs_root`
data = dict(
samples_per_gpu=64, train=dict(imgs_root='./data/lsun/bedroom_train'))
optimizer = dict(
generator=dict(type='Adam', lr=0.0001, betas=(0.5, 0.99)),
discriminator=dict(type='Adam', lr=0.0001, betas=(0.5, 0.99)))
# adjust running config
lr_config = None
checkpoint_config = dict(interval=10000, by_epoch=False, max_keep_ckpts=20)
custom_hooks = [
dict(
type='VisualizeUnconditionalSamples',
output_dir='training_samples',
interval=10000)
]
evaluation = dict(
type='GenerativeEvalHook',
interval=10000,
metrics=dict(
type='FID', num_images=50000, inception_pkl=None, bgr2rgb=True),
sample_kwargs=dict(sample_model='orig'))
total_iters = 160000
# use ddp wrapper for faster training
use_ddp_wrapper = True
find_unused_parameters = False
runner = dict(
type='DynamicIterBasedRunner',
is_dynamic_ddp=False, # Note that this flag should be False.
pass_training_status=True)
metrics = dict(
ms_ssim10k=dict(type='MS_SSIM', num_images=10000),
swd16k=dict(type='SWD', num_images=16384, image_shape=(3, 128, 128)),
fid50k=dict(type='FID', num_images=50000, inception_pkl=None))
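Every config above reports `FID`, so a short reminder of what the metric computes may help: the Fréchet distance between Gaussians fitted to Inception features of real and generated images. A minimal NumPy/SciPy sketch of the closed form (not the `FID` metric class used in these configs):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    # ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 * sqrt(S1 @ S2))
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1 @ sigma2).real  # drop tiny imaginary parts
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```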
Collections:
- Metadata:
Architecture:
- LSGAN
Name: LSGAN
Paper:
- https://openaccess.thecvf.com/content_iccv_2017/html/Mao_Least_Squares_Generative_ICCV_2017_paper.html
README: configs/lsgan/README.md
Models:
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/lsgan/lsgan_dcgan-archi_lr-1e-3_celeba-cropped_64_b128x1_12m.py
In Collection: LSGAN
Metadata:
Training Data: CELEBA
Name: lsgan_dcgan-archi_lr-1e-3_celeba-cropped_64_b128x1_12m
Results:
- Dataset: CELEBA
Metrics:
FID: 11.9258
MS-SSIM: 0.3216
SWD: 6.16, 6.83, 37.64/16.87
Task: Unconditional GANs
Weights: https://download.openmmlab.com/mmgen/lsgan/lsgan_celeba-cropped_dcgan-archi_lr-1e-3_64_b128x1_12m_20210429_144001-92ca1d0d.pth
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/lsgan/lsgan_dcgan-archi_lr-1e-4_lsun-bedroom_64_b128x1_12m.py
In Collection: LSGAN
Metadata:
Training Data: LSUN
Name: lsgan_dcgan-archi_lr-1e-4_lsun-bedroom_64_b128x1_12m
Results:
- Dataset: LSUN
Metrics:
FID: 30.739
MS-SSIM: 0.0671
SWD: 5.66, 9.0, 18.6/11.09
Task: Unconditional GANs
Weights: https://download.openmmlab.com/mmgen/lsgan/lsgan_lsun-bedroom_dcgan-archi_lr-1e-4_64_b128x1_12m_20210429_144602-ec4ec6bb.pth
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/lsgan/lsgan_dcgan-archi_lr-1e-4_celeba-cropped_128_b64x1_10m.py
In Collection: LSGAN
Metadata:
Training Data: CELEBA
Name: lsgan_dcgan-archi_lr-1e-4_celeba-cropped_128_b64x1_10m
Results:
- Dataset: CELEBA
Metrics:
FID: 38.3752
MS-SSIM: 0.3691
SWD: 21.66, 9.83, 16.06, 70.76/29.58
Task: Unconditional GANs
Weights: https://download.openmmlab.com/mmgen/lsgan/lsgan_celeba-cropped_dcgan-archi_lr-1e-4_128_b64x1_10m_20210429_144229-01ba67dc.pth
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/lsgan/lsgan_lsgan-archi_lr-1e-4_lsun-bedroom_128_b64x1_10m.py
In Collection: LSGAN
Metadata:
Training Data: LSUN
Name: lsgan_lsgan-archi_lr-1e-4_lsun-bedroom_128_b64x1_10m
Results:
- Dataset: LSUN
Metrics:
FID: 51.55
MS-SSIM: 0.0612
SWD: 19.52, 9.99, 7.48, 14.3/12.82
Task: Unconditional GANs
Weights: https://download.openmmlab.com/mmgen/lsgan/lsgan_lsun-bedroom_lsgan-archi_lr-1e-4_128_b64x1_10m_20210429_155605-cf78c0a8.pth
# PGGAN
> [Progressive Growing of GANs for Improved Quality, Stability, and Variation](https://arxiv.org/abs/1710.10196)
<!-- [ALGORITHM] -->
## Abstract
<!-- [ABSTRACT] -->
We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality, e.g., CelebA images at 1024². We also propose a simple way to increase the variation in generated images, and achieve a record inception score of 8.80 in unsupervised CIFAR10. Additionally, we describe several implementation details that are important for discouraging unhealthy competition between the generator and discriminator. Finally, we suggest a new metric for evaluating GAN results, both in terms of image quality and variation. As an additional contribution, we construct a higher-quality version of the CelebA dataset.
<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/28132635/143053374-c03894c3-6def-49c2-94ed-80c4accee726.JPG" />
</div>
## Results and models
<div align="center">
<b> Results (compressed) from our PGGAN trained on CelebA-HQ@1024</b>
<br/>
<img src="https://user-images.githubusercontent.com/12726765/114009864-1df45400-9896-11eb-9d25-da9eabfe02ce.png" width="800"/>
</div>
| Models | Dataset | MS-SSIM | SWD(xx,xx,xx,xx/avg) | Config | Download |
| :-------------: | :------------: | :-----: | :--------------------------: | :-----------------------------------------------------------------: | :-------------------------------------------------------------------: |
| pggan_128x128 | celeba-cropped | 0.3023 | 3.42, 4.04, 4.78, 20.38/8.15 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/pggan/pggan_celeba-cropped_128_g8_12Mimgs.py) | [model](https://download.openmmlab.com/mmgen/pggan/pggan_celeba-cropped_128_g8_20210408_181931-85a2e72c.pth) |
| pggan_128x128 | lsun-bedroom | 0.0602 | 3.5, 2.96, 2.76, 9.65/4.72 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/pggan/pggan_lsun-bedroom_128_g8_12Mimgs.py) | [model](https://download.openmmlab.com/mmgen/pggan/pggan_lsun-bedroom_128x128_g8_20210408_182033-5e59f45d.pth) |
| pggan_1024x1024 | celeba-hq | 0.3379 | 8.93, 3.98, 3.07, 2.64/4.655 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/pggan/pggan_celeba-hq_1024_g8_12Mimg.py) | [model](https://download.openmmlab.com/mmgen/pggan/pggan_celeba-hq_1024_g8_20210408_181911-f1ef51c3.pth) |
## Citation
```latex
@article{karras2017progressive,
title={Progressive growing of gans for improved quality, stability, and variation},
author={Karras, Tero and Aila, Timo and Laine, Samuli and Lehtinen, Jaakko},
journal={arXiv preprint arXiv:1710.10196},
year={2017},
url={https://arxiv.org/abs/1710.10196},
}
```
Collections:
- Metadata:
Architecture:
- PGGAN
Name: PGGAN
Paper:
- https://arxiv.org/abs/1710.10196
README: configs/pggan/README.md
Models:
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/pggan/pggan_celeba-cropped_128_g8_12Mimgs.py
In Collection: PGGAN
Metadata:
Training Data: CELEBA
Name: pggan_celeba-cropped_128_g8_12Mimgs
Results:
- Dataset: CELEBA
Metrics:
Details: celeba-cropped
MS-SSIM: 0.3023
SWD(xx,xx,xx,xx/avg): 3.42, 4.04, 4.78, 20.38/8.15
Task: Unconditional GANs
Weights: https://download.openmmlab.com/mmgen/pggan/pggan_celeba-cropped_128_g8_20210408_181931-85a2e72c.pth
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/pggan/pggan_lsun-bedroom_128_g8_12Mimgs.py
In Collection: PGGAN
Metadata:
Training Data: LSUN
Name: pggan_lsun-bedroom_128_g8_12Mimgs
Results:
- Dataset: LSUN
Metrics:
Details: lsun-bedroom
MS-SSIM: 0.0602
SWD(xx,xx,xx,xx/avg): 3.5, 2.96, 2.76, 9.65/4.72
Task: Unconditional GANs
Weights: https://download.openmmlab.com/mmgen/pggan/pggan_lsun-bedroom_128x128_g8_20210408_182033-5e59f45d.pth
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/pggan/pggan_celeba-hq_1024_g8_12Mimg.py
In Collection: PGGAN
Metadata:
Training Data: CELEBA
Name: pggan_celeba-hq_1024_g8_12Mimg
Results:
- Dataset: CELEBA
Metrics:
Details: celeba-hq
MS-SSIM: 0.3379
SWD(xx,xx,xx,xx/avg): 8.93, 3.98, 3.07, 2.64/4.655
Task: Unconditional GANs
Weights: https://download.openmmlab.com/mmgen/pggan/pggan_celeba-hq_1024_g8_20210408_181911-f1ef51c3.pth
_base_ = [
'../_base_/models/pggan/pggan_128x128.py',
'../_base_/datasets/grow_scale_imgs_128x128.py',
'../_base_/default_runtime.py'
]
optimizer = None
checkpoint_config = dict(interval=10000, by_epoch=False, max_keep_ckpts=20)
data = dict(
samples_per_gpu=64,
train=dict(
imgs_roots={'128': './data/celeba-cropped/cropped_images_aligned_png'},
gpu_samples_base=4,
# note that this should be adjusted according to the total number of GPUs
gpu_samples_per_scale={
'4': 64,
'8': 32,
'16': 16,
'32': 8,
'64': 4
}))
custom_hooks = [
dict(
type='VisualizeUnconditionalSamples',
output_dir='training_samples',
interval=5000),
dict(type='PGGANFetchDataHook', interval=1),
dict(
type='ExponentialMovingAverageHook',
module_keys=('generator_ema', ),
interval=1,
priority='VERY_HIGH')
]
lr_config = None
total_iters = 280000
metrics = dict(
ms_ssim10k=dict(type='MS_SSIM', num_images=10000),
swd16k=dict(type='SWD', num_images=16384, image_shape=(3, 128, 128)))
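The `gpu_samples_per_scale` mapping above halves the per-GPU batch every time the resolution doubles, keeping memory use roughly flat as the model grows. A tiny helper in the same spirit (illustrative; the real lookup lives in MMGeneration's grow-scale dataset, and treating `gpu_samples_base` as the fallback for unlisted scales is an assumption here):

```python
gpu_samples_per_scale = {'4': 64, '8': 32, '16': 16, '32': 8, '64': 4}
gpu_samples_base = 4  # assumed fallback for scales not listed, e.g. '128'

def samples_for_scale(scale: int) -> int:
    return gpu_samples_per_scale.get(str(scale), gpu_samples_base)

for scale in (4, 8, 16, 32, 64, 128):
    print(scale, samples_for_scale(scale))
```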
_base_ = [
'../_base_/models/pggan/pggan_1024.py',
'../_base_/datasets/grow_scale_imgs_celeba-hq.py',
'../_base_/default_runtime.py'
]
optimizer = None
checkpoint_config = dict(interval=5000, by_epoch=False, max_keep_ckpts=20)
data = dict(
samples_per_gpu=64,
train=dict(
gpu_samples_base=4,
# note that this should be adjusted according to the total number of GPUs
gpu_samples_per_scale={
'4': 64,
'8': 32,
'16': 16,
'32': 8,
'64': 4
},
))
custom_hooks = [
dict(
type='VisualizeUnconditionalSamples',
output_dir='training_samples',
interval=5000),
dict(type='PGGANFetchDataHook', interval=1),
dict(
type='ExponentialMovingAverageHook',
module_keys=('generator_ema', ),
interval=1,
priority='VERY_HIGH')
]
lr_config = None
total_iters = 280000
metrics = dict(
ms_ssim10k=dict(type='MS_SSIM', num_images=10000),
swd16k=dict(type='SWD', num_images=16384, image_shape=(3, 1024, 1024)))
_base_ = [
'../_base_/models/pggan/pggan_128x128.py',
'../_base_/datasets/grow_scale_imgs_128x128.py',
'../_base_/default_runtime.py'
]
optimizer = None
checkpoint_config = dict(interval=10000, by_epoch=False, max_keep_ckpts=20)
data = dict(
samples_per_gpu=64,
train=dict(
imgs_roots={'128': './data/lsun/bedroom_train'},
gpu_samples_base=4,
# note that this should be adjusted according to the total number of GPUs
gpu_samples_per_scale={
'4': 64,
'8': 32,
'16': 16,
'32': 8,
'64': 4
},
))
custom_hooks = [
dict(
type='VisualizeUnconditionalSamples',
output_dir='training_samples',
interval=5000),
dict(type='PGGANFetchDataHook', interval=1),
dict(
type='ExponentialMovingAverageHook',
module_keys=('generator_ema', ),
interval=1,
priority='VERY_HIGH')
]
lr_config = None
total_iters = 280000
metrics = dict(
ms_ssim10k=dict(type='MS_SSIM', num_images=10000),
swd16k=dict(type='SWD', num_images=16384, image_shape=(3, 128, 128)))
# Pix2Pix
> [Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks](https://openaccess.thecvf.com/content_cvpr_2017/html/Isola_Image-To-Image_Translation_With_CVPR_2017_paper.html)
<!-- [ALGORITHM] -->
## Abstract
<!-- [ABSTRACT] -->
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/28132635/143053385-1b03356d-43df-423b-88b2-7a82b73d2edd.JPG"/>
</div>
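The abstract's point that the networks "learn a loss function" boils down to pairing an adversarial term with a reconstruction term in the generator objective. A minimal PyTorch sketch (the classic pix2pix formulation with an L1 term; the weight and names here are illustrative, not taken from the configs below):

```python
import torch
import torch.nn.functional as F

def pix2pix_g_loss(disc_fake_logits, fake_img, real_img, lambda_l1=100.0):
    # Adversarial term: fool the conditional discriminator.
    adv = F.binary_cross_entropy_with_logits(
        disc_fake_logits, torch.ones_like(disc_fake_logits))
    # Reconstruction term: stay close to the paired ground truth.
    rec = F.l1_loss(fake_img, real_img)
    return adv + lambda_l1 * rec
```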
## Results and Models
<div align="center">
<b> Results from Pix2Pix trained by MMGeneration</b>
<br/>
<img src="https://user-images.githubusercontent.com/22982797/114269080-4ff0ec00-9a37-11eb-92c4-1525864e0307.PNG" width="800"/>
</div>
We use `FID` and `IS` metrics to evaluate the generation performance of pix2pix.<sup>1</sup>
| Models | Dataset | FID | IS | Config | Download |
| :----: | :---------: | :------: | :---: | :----------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------: |
| Ours | facades | 124.9773 | 1.620 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/pix2pix/pix2pix_vanilla_unet_bn_facades_b1x1_80k.py) | [model](https://download.openmmlab.com/mmgen/pix2pix/refactor/pix2pix_vanilla_unet_bn_1x1_80k_facades_20210902_170442-c0958d50.pth) \| [log](https://download.openmmlab.com/mmgen/pix2pix/pix2pix_vanilla_unet_bn_1x1_80k_facades_20210317_172625.log.json)<sup>2</sup> |
| Ours | aerial2maps | 122.5856 | 3.137 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/pix2pix/pix2pix_vanilla_unet_bn_aerial2maps_b1x1_220k.py) | [model](https://download.openmmlab.com/mmgen/pix2pix/refactor/pix2pix_vanilla_unet_bn_a2b_1x1_219200_maps_convert-bgr_20210902_170729-59a31517.pth) |
| Ours | maps2aerial | 88.4635 | 3.310 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/pix2pix/pix2pix_vanilla_unet_bn_maps2aerial_b1x1_220k.py) | [model](https://download.openmmlab.com/mmgen/pix2pix/refactor/pix2pix_vanilla_unet_bn_b2a_1x1_219200_maps_convert-bgr_20210902_170814-6d2eac4a.pth) |
| Ours | edges2shoes | 84.3750 | 2.815 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/pix2pix/pix2pix_vanilla_unet_bn_wo_jitter_flip_edges2shoes_b1x4_190k.py) | [model](https://download.openmmlab.com/mmgen/pix2pix/refactor/pix2pix_vanilla_unet_bn_wo_jitter_flip_1x4_186840_edges2shoes_convert-bgr_20210902_170902-0c828552.pth) |
`FID` comparison with the official implementation:
| Dataset | facades | aerial2maps | maps2aerial | edges2shoes | average |
| :------: | :---------: | :----------: | :---------: | :---------: | :----------: |
| official | **119.135** | 149.731 | 102.072 | **75.774** | 111.678 |
| ours | 124.9773 | **122.5856** | **88.4635** | 84.3750 | **105.1003** |
`IS` comparison with the official implementation:
| Dataset | facades | aerial2maps | maps2aerial | edges2shoes | average |
| :------: | :-------: | :---------: | :---------: | :---------: | :--------: |
| official | **1.650** | 2.529 | **3.552** | 2.766 | 2.624 |
| ours | 1.620 | **3.137** | 3.310 | **2.815** | **2.7205** |
Note:
1. We strictly follow the [paper](http://openaccess.thecvf.com/content_cvpr_2017/papers/Isola_Image-To-Image_Translation_With_CVPR_2017_paper.pdf) setting in Section 3.3: "*At inference time, we run the generator net in exactly the same manner as during the training phase. This differs from the usual protocol in that we apply dropout at test time, and we apply batch normalization using the statistics of the test batch, rather than aggregated statistics of the training batch.*" (i.e., we use `model.train()` mode; see the sketch after these notes). This may therefore lead to slightly different inference results on each run.
2. This is the training log before refactoring. Updated logs will be released soon.
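A minimal PyTorch sketch of the inference protocol in note 1 (generic code, not MMGeneration's test script; `generator` stands for any pix2pix generator module):

```python
import torch

@torch.no_grad()
def translate(generator, img):
    # Keep train-mode behaviour at test time: dropout stays active and
    # BatchNorm uses the statistics of the current (test) batch.
    generator.train()
    return generator(img)
```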
## Citation
```latex
@inproceedings{isola2017image,
title={Image-to-image translation with conditional adversarial networks},
author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={1125--1134},
year={2017},
url={https://openaccess.thecvf.com/content_cvpr_2017/html/Isola_Image-To-Image_Translation_With_CVPR_2017_paper.html},
}
```
Collections:
- Metadata:
Architecture:
- Pix2Pix
Name: Pix2Pix
Paper:
- https://openaccess.thecvf.com/content_cvpr_2017/html/Isola_Image-To-Image_Translation_With_CVPR_2017_paper.html
README: configs/pix2pix/README.md
Models:
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/pix2pix/pix2pix_vanilla_unet_bn_facades_b1x1_80k.py
In Collection: Pix2Pix
Metadata:
Training Data: FACADES
Name: pix2pix_vanilla_unet_bn_facades_b1x1_80k
Results:
- Dataset: FACADES
Metrics:
FID: 124.9773
IS: 1.62
Task: Image2Image Translation
Weights: https://download.openmmlab.com/mmgen/pix2pix/refactor/pix2pix_vanilla_unet_bn_1x1_80k_facades_20210902_170442-c0958d50.pth
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/pix2pix/pix2pix_vanilla_unet_bn_aerial2maps_b1x1_220k.py
In Collection: Pix2Pix
Metadata:
Training Data: MAPS
Name: pix2pix_vanilla_unet_bn_aerial2maps_b1x1_220k
Results:
- Dataset: MAPS
Metrics:
FID: 122.5856
IS: 3.137
Task: Image2Image Translation
Weights: https://download.openmmlab.com/mmgen/pix2pix/refactor/pix2pix_vanilla_unet_bn_a2b_1x1_219200_maps_convert-bgr_20210902_170729-59a31517.pth
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/pix2pix/pix2pix_vanilla_unet_bn_maps2aerial_b1x1_220k.py
In Collection: Pix2Pix
Metadata:
Training Data: MAPS
Name: pix2pix_vanilla_unet_bn_maps2aerial_b1x1_220k
Results:
- Dataset: MAPS
Metrics:
FID: 88.4635
IS: 3.31
Task: Image2Image Translation
Weights: https://download.openmmlab.com/mmgen/pix2pix/refactor/pix2pix_vanilla_unet_bn_b2a_1x1_219200_maps_convert-bgr_20210902_170814-6d2eac4a.pth
- Config: https://github.com/open-mmlab/mmgeneration/tree/master/configs/pix2pix/pix2pix_vanilla_unet_bn_wo_jitter_flip_edges2shoes_b1x4_190k.py
In Collection: Pix2Pix
Metadata:
Training Data: EDGES2SHOES
Name: pix2pix_vanilla_unet_bn_wo_jitter_flip_edges2shoes_b1x4_190k
Results:
- Dataset: EDGES2SHOES
Metrics:
FID: 84.375
IS: 2.815
Task: Image2Image Translation
Weights: https://download.openmmlab.com/mmgen/pix2pix/refactor/pix2pix_vanilla_unet_bn_wo_jitter_flip_1x4_186840_edges2shoes_convert-bgr_20210902_170902-0c828552.pth
_base_ = [
'../_base_/models/pix2pix/pix2pix_vanilla_unet_bn.py',
'../_base_/datasets/paired_imgs_256x256_crop.py',
'../_base_/default_runtime.py'
]
source_domain = 'aerial'
target_domain = 'map'
# model settings
model = dict(
default_domain=target_domain,
reachable_domains=[target_domain],
related_domains=[target_domain, source_domain],
gen_auxiliary_loss=dict(
data_info=dict(
pred=f'fake_{target_domain}', target=f'real_{target_domain}')))
# dataset settings
domain_a = source_domain
domain_b = target_domain
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
dict(
type='LoadPairedImageFromFile',
io_backend='disk',
key='pair',
domain_a=domain_a,
domain_b=domain_b,
flag='color'),
dict(
type='Resize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
scale=(286, 286),
interpolation='bicubic'),
dict(
type='FixedCrop',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
crop_size=(256, 256)),
dict(
type='Flip',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
direction='horizontal'),
dict(type='RescaleToZeroOne', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Normalize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
to_rgb=False,
**img_norm_cfg),
dict(type='ImageToTensor', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Collect',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
]
test_pipeline = [
dict(
type='LoadPairedImageFromFile',
io_backend='disk',
key='pair',
domain_a=domain_a,
domain_b=domain_b,
flag='color'),
dict(
type='Resize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
scale=(256, 256),
interpolation='bicubic'),
dict(type='RescaleToZeroOne', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Normalize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
to_rgb=False,
**img_norm_cfg),
dict(type='ImageToTensor', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Collect',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
]
dataroot = 'data/paired/maps'
data = dict(
train=dict(dataroot=dataroot, pipeline=train_pipeline),
val=dict(dataroot=dataroot, pipeline=test_pipeline, testdir='val'),
test=dict(dataroot=dataroot, pipeline=test_pipeline, testdir='val'))
# optimizer
optimizer = dict(
generators=dict(type='Adam', lr=2e-4, betas=(0.5, 0.999)),
discriminators=dict(type='Adam', lr=2e-4, betas=(0.5, 0.999)))
# learning policy
lr_config = None
# checkpoint saving
checkpoint_config = dict(interval=10000, save_optimizer=True, by_epoch=False)
custom_hooks = [
dict(
type='MMGenVisualizationHook',
output_dir='training_samples',
res_name_list=[f'fake_{target_domain}'],
interval=5000)
]
runner = None
use_ddp_wrapper = True
# runtime settings
total_iters = 220000
workflow = [('train', 1)]
exp_name = 'pix2pix_aerial2map'
work_dir = f'./work_dirs/experiments/{exp_name}'
num_images = 1098
metrics = dict(
FID=dict(type='FID', num_images=num_images, image_shape=(3, 256, 256)),
IS=dict(
type='IS',
num_images=num_images,
image_shape=(3, 256, 256),
inception_args=dict(type='pytorch')))
evaluation = dict(
type='TranslationEvalHook',
target_domain=domain_b,
interval=10000,
metrics=[
dict(type='FID', num_images=num_images, bgr2rgb=True),
dict(
type='IS',
num_images=num_images,
inception_args=dict(type='pytorch'))
],
best_metric=['fid', 'is'])
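Note that `source_domain`/`target_domain` in the config above are ordinary Python variables: every f-string key in the pipelines is resolved when the config file is parsed. A quick illustration of what the `Collect` keys expand to for this aerial-to-map config:

```python
source_domain, target_domain = 'aerial', 'map'
domain_a, domain_b = source_domain, target_domain
print([f'img_{domain_a}', f'img_{domain_b}'])  # ['img_aerial', 'img_map']
print(f'fake_{target_domain}')                 # 'fake_map'
```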
_base_ = [
'../_base_/models/pix2pix/pix2pix_vanilla_unet_bn.py',
'../_base_/datasets/paired_imgs_256x256_crop.py',
'../_base_/default_runtime.py'
]
source_domain = 'mask'
target_domain = 'photo'
# model settings
model = dict(
default_domain=target_domain,
reachable_domains=[target_domain],
related_domains=[target_domain, source_domain],
gen_auxiliary_loss=dict(
data_info=dict(
pred=f'fake_{target_domain}', target=f'real_{target_domain}')))
# dataset settings
domain_a = target_domain
domain_b = source_domain
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
dict(
type='LoadPairedImageFromFile',
io_backend='disk',
key='pair',
domain_a=domain_a,
domain_b=domain_b,
flag='color'),
dict(
type='Resize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
scale=(286, 286),
interpolation='bicubic'),
dict(
type='FixedCrop',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
crop_size=(256, 256)),
dict(
type='Flip',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
direction='horizontal'),
dict(type='RescaleToZeroOne', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Normalize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
to_rgb=False,
**img_norm_cfg),
dict(type='ImageToTensor', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Collect',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
]
test_pipeline = [
dict(
type='LoadPairedImageFromFile',
io_backend='disk',
key='pair',
domain_a=domain_a,
domain_b=domain_b,
flag='color'),
dict(
type='Resize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
scale=(256, 256),
interpolation='bicubic'),
dict(type='RescaleToZeroOne', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Normalize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
to_rgb=False,
**img_norm_cfg),
dict(type='ImageToTensor', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Collect',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
]
dataroot = 'data/paired/facades'
data = dict(
train=dict(dataroot=dataroot, pipeline=train_pipeline),
val=dict(dataroot=dataroot, pipeline=test_pipeline),
test=dict(dataroot=dataroot, pipeline=test_pipeline))
# optimizer
optimizer = dict(
generators=dict(type='Adam', lr=2e-4, betas=(0.5, 0.999)),
discriminators=dict(type='Adam', lr=2e-4, betas=(0.5, 0.999)))
# learning policy
lr_config = None
# checkpoint saving
checkpoint_config = dict(interval=10000, save_optimizer=True, by_epoch=False)
custom_hooks = [
dict(
type='MMGenVisualizationHook',
output_dir='training_samples',
res_name_list=[f'fake_{target_domain}'],
interval=5000)
]
runner = None
use_ddp_wrapper = True
# runtime settings
total_iters = 80000
workflow = [('train', 1)]
exp_name = 'pix2pix_facades'
work_dir = f'./work_dirs/experiments/{exp_name}'
num_images = 106
metrics = dict(
FID=dict(type='FID', num_images=num_images, image_shape=(3, 256, 256)),
IS=dict(
type='IS',
num_images=num_images,
image_shape=(3, 256, 256),
inception_args=dict(type='pytorch')))
evaluation = dict(
type='TranslationEvalHook',
target_domain=domain_b,
interval=10000,
metrics=[
dict(type='FID', num_images=num_images, bgr2rgb=True),
dict(
type='IS',
num_images=num_images,
inception_args=dict(type='pytorch'))
],
best_metric=['fid', 'is'])
_base_ = [
'../_base_/models/pix2pix/pix2pix_vanilla_unet_bn.py',
'../_base_/datasets/paired_imgs_256x256_crop.py',
'../_base_/default_runtime.py'
]
source_domain = 'map'
target_domain = 'aerial'
# model settings
model = dict(
default_domain=target_domain,
reachable_domains=[target_domain],
related_domains=[target_domain, source_domain],
gen_auxiliary_loss=dict(
data_info=dict(
pred=f'fake_{target_domain}', target=f'real_{target_domain}')))
# dataset settings
domain_a = target_domain
domain_b = source_domain
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
dict(
type='LoadPairedImageFromFile',
io_backend='disk',
key='pair',
domain_a=domain_a,
domain_b=domain_b,
flag='color'),
dict(
type='Resize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
scale=(286, 286),
interpolation='bicubic'),
dict(
type='FixedCrop',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
crop_size=(256, 256)),
dict(
type='Flip',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
direction='horizontal'),
dict(type='RescaleToZeroOne', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Normalize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
to_rgb=False,
**img_norm_cfg),
dict(type='ImageToTensor', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Collect',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
]
test_pipeline = [
dict(
type='LoadPairedImageFromFile',
io_backend='disk',
key='pair',
domain_a=domain_a,
domain_b=domain_b,
flag='color'),
dict(
type='Resize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
scale=(256, 256),
interpolation='bicubic'),
dict(type='RescaleToZeroOne', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Normalize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
to_rgb=False,
**img_norm_cfg),
dict(type='ImageToTensor', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Collect',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
]
dataroot = 'data/paired/maps'
data = dict(
train=dict(dataroot=dataroot, pipeline=train_pipeline),
val=dict(dataroot=dataroot, pipeline=test_pipeline, testdir='val'),
test=dict(dataroot=dataroot, pipeline=test_pipeline, testdir='val'))
# optimizer
optimizer = dict(
generators=dict(type='Adam', lr=2e-4, betas=(0.5, 0.999)),
discriminators=dict(type='Adam', lr=2e-4, betas=(0.5, 0.999)))
# learning policy
lr_config = None
# checkpoint saving
checkpoint_config = dict(interval=10000, save_optimizer=True, by_epoch=False)
custom_hooks = [
dict(
type='MMGenVisualizationHook',
output_dir='training_samples',
res_name_list=[f'fake_{target_domain}'],
interval=5000)
]
runner = None
use_ddp_wrapper = True
# runtime settings
total_iters = 220000
workflow = [('train', 1)]
exp_name = 'pix2pix_maps2aerial'
work_dir = f'./work_dirs/experiments/{exp_name}'
num_images = 1098
metrics = dict(
FID=dict(type='FID', num_images=num_images, image_shape=(3, 256, 256)),
IS=dict(
type='IS',
num_images=num_images,
image_shape=(3, 256, 256),
inception_args=dict(type='pytorch')))
evaluation = dict(
type='TranslationEvalHook',
target_domain=domain_b,
interval=10000,
metrics=[
dict(type='FID', num_images=num_images, bgr2rgb=True),
dict(
type='IS',
num_images=num_images,
inception_args=dict(type='pytorch'))
],
best_metric=['fid', 'is'])
_base_ = [
'../_base_/models/pix2pix/pix2pix_vanilla_unet_bn.py',
'../_base_/datasets/paired_imgs_256x256.py', '../_base_/default_runtime.py'
]
source_domain = 'edges'
target_domain = 'photo'
# model settings
model = dict(
default_domain=target_domain,
reachable_domains=[target_domain],
related_domains=[target_domain, source_domain],
gen_auxiliary_loss=dict(
data_info=dict(
pred=f'fake_{target_domain}', target=f'real_{target_domain}')))
# dataset settings
domain_a = source_domain
domain_b = target_domain
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
dict(
type='LoadPairedImageFromFile',
io_backend='disk',
key='pair',
domain_a=domain_a,
domain_b=domain_b,
flag='color'),
dict(
type='Resize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
scale=(286, 286),
interpolation='bicubic'),
dict(
type='FixedCrop',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
crop_size=(256, 256)),
dict(
type='Flip',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
direction='horizontal'),
dict(type='RescaleToZeroOne', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Normalize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
to_rgb=False,
**img_norm_cfg),
dict(type='ImageToTensor', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Collect',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
]
test_pipeline = [
dict(
type='LoadPairedImageFromFile',
io_backend='disk',
key='pair',
domain_a=domain_a,
domain_b=domain_b,
flag='color'),
dict(
type='Resize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
scale=(256, 256),
interpolation='bicubic'),
dict(type='RescaleToZeroOne', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Normalize',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
to_rgb=False,
**img_norm_cfg),
dict(type='ImageToTensor', keys=[f'img_{domain_a}', f'img_{domain_b}']),
dict(
type='Collect',
keys=[f'img_{domain_a}', f'img_{domain_b}'],
meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
]
dataroot = 'data/paired/edges2shoes'
data = dict(
train=dict(dataroot=dataroot, pipeline=train_pipeline),
val=dict(dataroot=dataroot, pipeline=test_pipeline, testdir='val'),
test=dict(dataroot=dataroot, pipeline=test_pipeline, testdir='val'))
# optimizer
optimizer = dict(
generators=dict(type='Adam', lr=2e-4, betas=(0.5, 0.999)),
discriminators=dict(type='Adam', lr=2e-4, betas=(0.5, 0.999)))
# learning policy
lr_config = None
# checkpoint saving
checkpoint_config = dict(interval=10000, save_optimizer=True, by_epoch=False)
custom_hooks = [
dict(
type='MMGenVisualizationHook',
output_dir='training_samples',
res_name_list=[f'fake_{target_domain}'],
interval=5000)
]
runner = None
use_ddp_wrapper = True
# runtime settings
total_iters = 190000
workflow = [('train', 1)]
exp_name = 'pix2pix_edges2shoes_wo_jitter_flip'
work_dir = f'./work_dirs/experiments/{exp_name}'
num_images = 200
metrics = dict(
FID=dict(type='FID', num_images=num_images, image_shape=(3, 256, 256)),
IS=dict(
type='IS',
num_images=num_images,
image_shape=(3, 256, 256),
inception_args=dict(type='pytorch')))
evaluation = dict(
type='TranslationEvalHook',
target_domain=domain_b,
interval=10000,
metrics=[
dict(type='FID', num_images=num_images, bgr2rgb=True),
dict(
type='IS',
num_images=num_images,
inception_args=dict(type='pytorch'))
],
best_metric=['fid', 'is'])
# Positional Encoding in GANs
> [Positional Encoding as Spatial Inductive Bias in GANs](https://openaccess.thecvf.com/content/CVPR2021/html/Xu_Positional_Encoding_As_Spatial_Inductive_Bias_in_GANs_CVPR_2021_paper.html)
<!-- [ALGORITHM] -->
## Abstract
<!-- [ABSTRACT] -->
SinGAN shows impressive capability in learning internal patch distribution despite its limited effective receptive field. We are interested in knowing how such a translation-invariant convolutional generator could capture the global structure with just a spatially i.i.d. input. In this work, taking SinGAN and StyleGAN2 as examples, we show that such capability, to a large extent, is brought by the implicit positional encoding when using zero padding in the generators. Such positional encoding is indispensable for generating images with high fidelity. The same phenomenon is observed in other generative architectures such as DCGAN and PGGAN. We further show that zero padding leads to an unbalanced spatial bias with a vague relation between locations. To offer a better spatial inductive bias, we investigate alternative positional encodings and analyze their effects. Based on a more flexible positional encoding explicitly, we propose a new multi-scale training strategy and demonstrate its effectiveness in the state-of-the-art unconditional generator StyleGAN2. Besides, the explicit spatial inductive bias substantially improves SinGAN for more versatile image manipulation.
<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/28132635/143053767-c6a503b2-87ff-434a-a439-d9fb0e98d804.JPG"/>
</div>
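The abstract's central claim, that zero padding acts as an implicit positional encoding, is easy to probe directly: with zero padding, a convolution over a spatially constant input already produces location-dependent responses. A tiny PyTorch sketch (illustrative, not the paper's code):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
nn.init.constant_(conv.weight, 1.0)

x = torch.ones(1, 1, 5, 5)  # constant, spatially i.i.d. input
y = conv(x)
print(y[0, 0])  # border sums (4 or 6) differ from the interior ones (9)
```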
## Results and models for MS-PIE
<div align="center">
<b> 896x896 results generated from a 256 generator using MS-PIE</b>
<br/>
<img src="https://download.openmmlab.com/mmgen/pe_in_gans/mspie_256-896_demo.png" width="800"/>
</div>
| Models | Reference in Paper | Scales | FID50k | P&R10k | Config | Download |
| :--------------------------: | :----------------: | :------------: | :----: | :---------: | :----------------------------------------------------------: | :-------------------------------------------------------------: |
| stylegan2_c2_256_baseline | Tab.5 config-a | 256 | 5.56 | 75.92/51.24 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/stylegan2_c2_ffhq_256_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/stylegan2_c2_config-a_ffhq_256x256_b3x8_1100k_20210406_145127-71d9634b.pth) |
| stylegan2_c2_512_baseline | Tab.5 config-b | 512 | 4.91 | 75.65/54.58 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/stylegan2_c2_ffhq_512_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/stylegan2_c2_config-b_ffhq_512x512_b3x8_1100k_20210406_145142-e85e5cf4.pth) |
| ms-pie_stylegan2_c2_config-c | Tab.5 config-c | 256, 384, 512 | 3.35 | 73.84/55.77 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c2_config-c_ffhq_256-512_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c2_config-c_ffhq_256-512_b3x8_1100k_20210406_144824-9f43b07d.pth) |
| ms-pie_stylegan2_c2_config-d | Tab.5 config-d | 256, 384, 512 | 3.50 | 73.28/56.16 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c2_config-d_ffhq_256-512_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c2_config-d_ffhq_256-512_b3x8_1100k_20210406_144840-dbefacf6.pth) |
| ms-pie_stylegan2_c2_config-e | Tab.5 config-e | 256, 384, 512 | 3.15 | 74.13/56.88 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c2_config-e_ffhq_256-512_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c2_config-e_ffhq_256-512_b3x8_1100k_20210406_144906-98d5a42a.pth) |
| ms-pie_stylegan2_c2_config-f | Tab.5 config-f | 256, 384, 512 | 2.93 | 73.51/57.32 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c2_config-f_ffhq_256-512_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c2_config-f_ffhq_256-512_b3x8_1100k_20210406_144927-4f4d5391.pth) |
| ms-pie_stylegan2_c1_config-g | Tab.5 config-g | 256, 384, 512 | 3.40 | 73.05/56.45 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c1_config-g_ffhq_256-512_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c1_config-g_ffhq_256-512_b3x8_1100k_20210406_144758-2df61752.pth) |
| ms-pie_stylegan2_c2_config-h | Tab.5 config-h | 256, 384, 512 | 4.01 | 72.81/54.35 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c2_config-h_ffhq_256-512_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c2_config-h_ffhq_256-512_b3x8_1100k_20210406_145006-84cf3f48.pth) |
| ms-pie_stylegan2_c2_config-i | Tab.5 config-i | 256, 384, 512 | 3.76 | 73.26/54.71 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c2_config-i_ffhq_256-512_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c2_config-i_ffhq_256-512_b3x8_1100k_20210406_145023-c2b0accf.pth) |
| ms-pie_stylegan2_c2_config-j | Tab.5 config-j | 256, 384, 512 | 4.23 | 73.11/54.63 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c2_config-j_ffhq_256-512_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c2_config-j_ffhq_256-512_b3x8_1100k_20210406_145044-c407481b.pth) |
| ms-pie_stylegan2_c2_config-k | Tab.5 config-k | 256, 384, 512 | 4.17 | 73.05/51.07 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c2_config-k_ffhq_256-512_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c2_config-k_ffhq_256-512_b3x8_1100k_20210406_145105-6d8cc39f.pth) |
| ms-pie_stylegan2_c2_config-f | higher-resolution | 256, 512, 896 | 4.10 | 72.21/50.29 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c2_config-f_ffhq_256-896_b3x8_1100k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c2_config-f_ffhq_256-896_b3x8_1100k_20210406_144943-6c18ad5d.pth) |
| ms-pie_stylegan2_c1_config-f | higher-resolution | 256, 512, 1024 | 6.24 | 71.79/49.92 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/mspie-stylegan2_c1_config-f_ffhq_256-1024_b2x8_1600k.py) | [model](https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c1_config-f_ffhq_256-1024_b2x8_1600k_20210406_144716-81cbdc96.pth) |
Note that we report the FID and P&R metrics (FFHQ dataset) at the largest scale.
## Results and Models for SinGAN
<div align="center">
<b> Positional Encoding in SinGAN</b>
<br/>
<img src="https://nbei.github.io/gan-pos-encoding/teaser-web-singan.png" width="800"/>
</div>
| Model | Data | Num Scales | Config | Download |
| :-----------------------------: | :-------------------------------------------------: | :--------: | :---------------------------------------------------: | :-----------------------------------------------------: |
| SinGAN + no pad | [balloons.png](https://download.openmmlab.com/mmgen/dataset/singan/balloons.png) | 8 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/singan_interp-pad_balloons.py) | [ckpt](https://download.openmmlab.com/mmgen/pe_in_gans/singan_interp-pad_balloons_20210406_180014-96f51555.pth) \| [pkl](https://download.openmmlab.com/mmgen/pe_in_gans/singan_interp-pad_balloons_20210406_180014-96f51555.pkl) |
| SinGAN + no pad + no bn in disc | [balloons.png](https://download.openmmlab.com/mmgen/dataset/singan/balloons.png) | 8 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/singan_interp-pad_disc-nobn_balloons.py) | [ckpt](https://download.openmmlab.com/mmgen/pe_in_gans/singan_interp-pad_disc-nobn_balloons_20210406_180059-7d63e65d.pth) \| [pkl](https://download.openmmlab.com/mmgen/pe_in_gans/singan_interp-pad_disc-nobn_balloons_20210406_180059-7d63e65d.pkl) |
| SinGAN + no pad + no bn in disc | [fish.jpg](https://download.openmmlab.com/mmgen/dataset/singan/fish-crop.jpg) | 10 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/singan_interp-pad_disc-nobn_fish.py) | [ckpt](https://download.openmmlab.com/mmgen/pe_in_gans/singan_interp-pad_disc-nobn_fis_20210406_175720-9428517a.pth) \| [pkl](https://download.openmmlab.com/mmgen/pe_in_gans/singan_interp-pad_disc-nobn_fis_20210406_175720-9428517a.pkl) |
| SinGAN + CSG | [fish.jpg](https://download.openmmlab.com/mmgen/dataset/singan/fish-crop.jpg) | 10 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/singan_csg_fish.py) | [ckpt](https://download.openmmlab.com/mmgen/pe_in_gans/singan_csg_fis_20210406_175532-f0ec7b61.pth) \| [pkl](https://download.openmmlab.com/mmgen/pe_in_gans/singan_csg_fis_20210406_175532-f0ec7b61.pkl) |
| SinGAN + CSG | [bohemian.png](https://download.openmmlab.com/mmgen/dataset/singan/bohemian.png) | 10 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/singan_csg_bohemian.py) | [ckpt](https://download.openmmlab.com/mmgen/pe_in_gans/singan_csg_bohemian_20210407_195455-5ed56db2.pth) \| [pkl](https://download.openmmlab.com/mmgen/pe_in_gans/singan_csg_bohemian_20210407_195455-5ed56db2.pkl) |
| SinGAN + SPE-dim4 | [fish.jpg](https://download.openmmlab.com/mmgen/dataset/singan/fish-crop.jpg) | 10 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/singan_spe-dim4_fish.py) | [ckpt](https://download.openmmlab.com/mmgen/pe_in_gans/singan_spe-dim4_fish_20210406_175933-f483a7e3.pth) \| [pkl](https://download.openmmlab.com/mmgen/pe_in_gans/singan_spe-dim4_fish_20210406_175933-f483a7e3.pkl) |
| SinGAN + SPE-dim4 | [bohemian.png](https://download.openmmlab.com/mmgen/dataset/singan/bohemian.png) | 10 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/singan_spe-dim4_bohemian.py) | [ckpt](https://download.openmmlab.com/mmgen/pe_in_gans/singan_spe-dim4_bohemian_20210406_175820-6e484a35.pth) \| [pkl](https://download.openmmlab.com/mmgen/pe_in_gans/singan_spe-dim4_bohemian_20210406_175820-6e484a35.pkl) |
| SinGAN + SPE-dim8 | [bohemian.png](https://download.openmmlab.com/mmgen/dataset/singan/bohemian.png) | 10 | [config](https://github.com/open-mmlab/mmgeneration/tree/master/configs/positional_encoding_in_gans/singan_spe-dim8_bohemian.py) | [ckpt](https://download.openmmlab.com/mmgen/pe_in_gans/singan_spe-dim8_bohemian_20210406_175858-7faa50f3.pth) \| [pkl](https://download.openmmlab.com/mmgen/pe_in_gans/singan_spe-dim8_bohemian_20210406_175858-7faa50f3.pkl) |
## Citation
```latex
@article{xu2020positional,
title={Positional Encoding as Spatial Inductive Bias in GANs},
author={Xu, Rui and Wang, Xintao and Chen, Kai and Zhou, Bolei and Loy, Chen Change},
journal={arXiv preprint arXiv:2012.05217},
year={2020},
url={https://openaccess.thecvf.com/content/CVPR2021/html/Xu_Positional_Encoding_As_Spatial_Inductive_Bias_in_GANs_CVPR_2021_paper.html},
}
```