Commit 57e0e891 authored by limm
add part mmgeneration code
parent 04e07f48
# Changelog
## v0.1.0 (20/04/2021)
#### Highlights
- MMGeneration is released.
#### Main Features
- High-quality Training Performance: We currently support training on Unconditional GANs (`DCGAN`, `WGAN-GP`, `PGGAN`, `StyleGANV1`, `StyleGANV2`, `Positional Encoding in GANs`), Internal GANs (`SinGAN`), and Image Translation Models (`Pix2Pix`, `CycleGAN`). Support for conditional models will come soon.
- Powerful Application Toolkit: A plentiful toolkit containing multiple applications in GANs is provided to users. GAN interpolation, GAN projection, and GAN manipulations are integrated into our framework. It's time to play with your GANs!
- Efficient Distributed Training for Generative Models: For the highly dynamic training in generative models, we adopt a new way to train dynamic models with `MMDDP`.
- New Modular Design for Flexible Combination: A new design for complex loss modules is proposed for customizing the links between modules, which can achieve flexible combination among different modules.
## v0.2.0 (30/05/2021)
#### Highlights
- Support new methods: LSGAN, GGAN.
- Support mixed-precision training (FP16): official PyTorch Implementation and APEX (#11, #20)
#### New Features
- Add the experiment of MNIST in DCGAN (#24)
- Add support for uploading checkpoints to `Ceph` system (cloud server) (#27)
- Add the functionality of saving the best checkpoint in GenerativeEvalHook (#21)
#### Bug Fixes and Improvements
- Fix loss of `sample-cfg` argument (#13)
- Add `pbar` to offline eval and fix a bug in grayscale image evaluation/saving (#23)
- Fix error when the `data_root` option in `val_cfg` or `test_cfg` is set to `None` (#28)
- Change LaTeX in quick_run.md to SVG URLs and fix the number of checkpoints in modelzoo_statistics.md (#34)
## v0.3.0 (02/08/2021)
#### Highlights
- Support conditional GANs: Projection GAN, SNGAN, SAGAN, and BigGAN
#### New Features
- Add support for persistent_workers in PyTorch >= 1.7.0 #71
- Support warm-up for EMA #55
#### Bug Fixes and Improvements
- Fix failure to build docs #64
- Revise the logic of `num_classes` in basic conditional GAN #69
- Support dynamic eval interval in eval hook #73
## v0.4.0 (03/11/2021)
#### Highlights
- Add more experiments for conditional GANs: SNGAN, SAGAN, and BigGAN
- Refactor Translation Model (#88, #126, #127, #145)
#### New Features
- Use PyTorch Sphinx theme #123
- Support torchserve for unconditional models #131
#### Bug Fixes and Improvements
- Add CI for python3.9 #110
- Add support for PyTorch1.9 #115
- Add pre-commit hook for spell checking #135
## v0.5.0 (12/01/2022)
#### Highlights
- Support BigGAN-style Spectral Norm and update BigGAN with best FID and IS (#159)
- Support import projected latent and export video in interpolation (#167)
- Support Improved-DDPM model (#205)
- A face editing application built upon MMGen is released
#### New Features
- Support evaluation in distributed mode (#151)
- Support `persistent_workers` in validation dataloader (#179)
- Support dockerfile (#200)
- Support `mim` (#176)
#### Bug Fixes and Improvements
- Fix bug in SinGAN dataset (#192)
- Fix SAGAN, SNGAN and BigGAN's default `sn_style` (#199, #213)
## v0.6.0 (07/03/2022)
#### Highlights
- Support StyleGANv3 (#247, #253, #258)
- Support StyleCLIP (#236)
#### New Features
- Support training on CPU (#238)
- Speed up training (#231)
#### Bug Fixes and Improvements
- Fix bug in non-distributed training/testing (#239)
- Fix typos and invalid links (#221, #226, #228, #244, #249)
- Add part of Chinese documentation (#250, #257)
## v0.7.0 (02/04/2022)
#### Highlights
- Support training of StyleGANv3 (#275, #277)
- Support adaptive discriminator augmentation (#276)
#### New Features
- Support passing training arguments in static unconditional gan (#275)
- Support dynamic EMA, now you can define momentum updating policy (#261)
- Add multi machine distribute train (#267)
#### Bug Fixes and Improvements
- Add brief installation steps in README (#270)
- Support random seed for distributed sampler (#271)
- Use hyphen for command line args in apps (#273)
## v0.7.1 (30/04/2022)
#### Bug Fixes and Improvements
- Support train_dataloader, val_dataloader and test_dataloader settings (#281)
- Fix ADA typo (#283)
- Add Chinese application tutorial (#284)
- Add Chinese documentation for DDP training (#286)
## v0.7.2 (12/09/2022)
#### Highlights
- Complete README of StyleGAN-ADA (#391)
#### Bug Fixes and Improvements
- Update the version limit of MMCV (#397)
- Add Circle CI (#431)
- Update Chinese README for `application.md` (#425)
## v0.7.3 (14/04/2023)
#### Bug Fixes and Improvements
- Fix SiLU activation (#447)
- Support Perceptual Loss (#471)
- Fix "tensor and index are not on the same device" error (#476)
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import subprocess
import sys
import pytorch_sphinx_theme
from recommonmark.transform import AutoStructify
sys.path.insert(0, os.path.abspath('../../'))
# -- Project information -----------------------------------------------------
project = 'MMGeneration'
copyright = '2020-2030, OpenMMLab'
author = 'MMGeneration Authors'
version_file = '../../mmgen/version.py'
def get_version():
    with open(version_file, 'r') as f:
        exec(compile(f.read(), version_file, 'exec'))
    return locals()['__version__']
# The full version, including alpha/beta/rc tags
release = get_version()
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.napoleon',
    'sphinx.ext.viewcode',
    'sphinx_markdown_tables',
    'sphinx.ext.autosectionlabel',
    'myst_parser',
    'sphinx_copybutton',
]
autodoc_mock_imports = [
    'matplotlib', 'pycocotools', 'terminaltables', 'mmgen.version', 'mmcv.ops'
]
# Ignore >>> when copying code
copybutton_prompt_text = r'>>> |\.\.\. '
copybutton_prompt_is_regexp = True
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = {
    '.rst': 'restructuredtext',
    '.md': 'markdown',
}
# The master toctree document.
master_doc = 'index'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
# html_theme = 'sphinx_rtd_theme'
html_theme = 'pytorch_sphinx_theme'
html_theme_path = [pytorch_sphinx_theme.get_html_theme_path()]
html_theme_options = {
    'menu': [
        {
            'name': 'GitHub',
            'url': 'https://github.com/open-mmlab/mmgeneration',
        },
    ],
    'menu_lang': 'en',
}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
html_css_files = ['css/readthedocs.css']
myst_enable_extensions = ['colon_fence']
myst_heading_anchors = 3
def builder_inited_handler(app):
    subprocess.run(['./stat.py'])


def setup(app):
    app.connect('builder-inited', builder_inited_handler)
    app.add_transform(AutoStructify)
# FAQ
We list some common issues faced by many users and their corresponding solutions here. Feel free to enrich the list if you find any frequent issues and have ways to help others solve them. If the contents here do not cover your issue, please create an issue using the [provided templates](https://github.com/open-mmlab/mmgeneration/blob/master/.github/ISSUE_TEMPLATE/error-report.md) and make sure you fill in all the required information in the template.
## Installation
- Compatible MMGeneration and MMCV versions are shown below. Please choose the correct version of MMCV to avoid installation issues.
| MMGeneration version | MMCV version |
| :------------------: | :--------------: |
| master | mmcv-full>=1.3.0 |
Note: You need to run `pip uninstall mmcv` first if you have mmcv installed.
If mmcv and mmcv-full are both installed, there will be `ModuleNotFoundError`.
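For example, a clean reinstall following the best practices below might look like this (a sketch, assuming you use MIM):
```shell
pip uninstall mmcv       # remove the lite version first
pip install -U openmim
mim install mmcv-full    # then install the full version
```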
# Prerequisites
In this section we demonstrate how to prepare an environment with PyTorch.
MMGeneration works on Linux, Windows and macOS. It requires Python 3.6+, CUDA 9.2+ and PyTorch 1.5+.
```{note}
If you are experienced with PyTorch and have already installed it, just skip this part and jump to the [next section](#installation). Otherwise, you can follow these steps for the preparation.
```
**Step 0.** Download and install Miniconda from the [official website](https://docs.conda.io/en/latest/miniconda.html).
**Step 1.** Create a conda environment and activate it.
```shell
conda create --name openmmlab python=3.8 -y
conda activate openmmlab
```
**Step 2.** Install PyTorch following [official instructions](https://pytorch.org/get-started/locally/), e.g.
On GPU platforms:
```shell
conda install pytorch torchvision -c pytorch
```
On CPU platforms:
```shell
conda install pytorch torchvision cpuonly -c pytorch
```
# Installation
We recommend that users follow our best practices to install MMGeneration. However, the whole process is highly customizable. See [Customize Installation](#customize-installation) section for more information.
## Best Practices
**Step 0.** Install [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
```shell
pip install -U openmim
mim install mmcv-full
```
**Step 1.** Install MMGeneration.
Case a: If you develop and run mmgen directly, install it from source:
```shell
git clone https://github.com/open-mmlab/mmgeneration.git
cd mmgeneration
pip install -v -e .
# "-v" means verbose, or more output
# "-e" means installing a project in editable mode,
# thus any local modifications made to the code will take effect without reinstallation.
```
Case b: If you use mmgeneration as a dependency or third-party package, install it with pip:
```shell
pip install mmgen
```
## Verify the Installation
To verify whether MMGeneration and the required environment are installed correctly, we can run sample Python code to initialize an unconditional model and use it to generate random samples:
```python
from mmgen.apis import init_model, sample_unconditional_model
config_file = 'configs/styleganv2/stylegan2_c2_lsun-church_256_b4x8_800k.py'
# you can download this checkpoint in advance and use a local file path.
checkpoint_file = 'https://download.openmmlab.com/mmgen/stylegan2/official_weights/stylegan2-church-config-f-official_20210327_172657-1d42b7d1.pth'
device = 'cuda:0'
# init a generative model
model = init_model(config_file, checkpoint_file, device=device)
# sample images
fake_imgs = sample_unconditional_model(model, 4)
```
The above code should run successfully once the installation is complete.
## Customize Installation
### CUDA Version
When installing PyTorch, you need to specify the version of CUDA. If you are not clear on which to choose, follow our recommendations:
- For Ampere-based NVIDIA GPUs, such as GeForce 30 series and NVIDIA A100, CUDA 11 is a must.
- For older NVIDIA GPUs, CUDA 11 is backward compatible, but CUDA 10.2 offers better compatibility and is more lightweight.
Please make sure the GPU driver satisfies the minimum version requirements. See [this table](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-major-component-versions__table-cuda-toolkit-driver-versions) for more information.
```{note}
Installing CUDA runtime libraries is enough if you follow our best practices, because no CUDA code will be compiled locally. However, if you want to compile MMCV from source or develop other CUDA operators, you need to install the complete CUDA toolkit from NVIDIA's [website](https://developer.nvidia.com/cuda-downloads), and its version should match the CUDA version of PyTorch, i.e., the specified version of cudatoolkit in the `conda install` command.
```
### Install MMCV without MIM
MMCV contains C++ and CUDA extensions, thus depending on PyTorch in a complex way. MIM solves such dependencies automatically and makes the installation easier. However, it is not a must.
To install MMCV with pip instead of MIM, please follow [MMCV installation guides](https://mmcv.readthedocs.io/en/latest/get_started/installation.html). This requires manually specifying a find-url based on PyTorch version and its CUDA version.
For example, the following command installs mmcv-full built for PyTorch 1.10.x and CUDA 11.3.
```shell
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html
```
### Using MMGeneration with Docker
We provide a [Dockerfile](https://github.com/open-mmlab/mmgeneration/blob/master/docker/Dockerfile) to build an image. Ensure that your [docker version](https://docs.docker.com/engine/install/) is >= 19.03.
```shell
# build an image with PyTorch 1.8, CUDA 11.1
# If you prefer other versions, just modify the Dockerfile
docker build -t mmgeneration docker/
```
Run it with
```shell
docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/mmgeneration/data mmgeneration
```
## Troubleshooting
If you have some issues during the installation, please first view the [FAQ](faq.md) page.
You may [open an issue](https://github.com/open-mmlab/mmgeneration/issues/new/choose) on GitHub if no solution is found.
# Developing with multiple MMGeneration versions
The train and test scripts already modify the `PYTHONPATH` to ensure that they use the `MMGeneration` in the current directory.
To use the default MMGeneration installed in the environment rather than the one you are working with, you can remove the following line from those scripts:
```shell
PYTHONPATH="$(dirname $0)/..":$PYTHONPATH
```
Welcome to MMGeneration's documentation!
========================================

.. toctree::
   :maxdepth: 2
   :caption: Get Started

   get_started.md
   modelzoo_statistics.md

.. toctree::
   :maxdepth: 2
   :caption: Quick Run

   quick_run.md

.. toctree::
   :maxdepth: 2
   :caption: Tutorials

   tutorials/index.rst

.. toctree::
   :maxdepth: 2
   :caption: Notes

   changelog.md
   faq.md

.. toctree::
   :caption: Switch Language

   switch_language.md

.. toctree::
   :caption: API Reference

   api.rst

Indices and tables
==================

* :ref:`genindex`
* :ref:`search`
@ECHO OFF
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build
if "%1" == "" goto help
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.http://sphinx-doc.org/
exit /b 1
)
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end
:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
:end
popd
# Model Zoo Statistics
- Number of papers: 15
- Number of checkpoints: 91
- [Large Scale GAN Training for High Fidelity Natural Image Synthesis](https://github.com/open-mmlab/mmgeneration/blob/master/configs/biggan) (7 ckpts)
- [CycleGAN: Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks](https://github.com/open-mmlab/mmgeneration/blob/master/configs/cyclegan) (6 ckpts)
- [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://github.com/open-mmlab/mmgeneration/blob/master/configs/dcgan) (3 ckpts)
- [Geometric GAN](https://github.com/open-mmlab/mmgeneration/blob/master/configs/ggan) (3 ckpts)
- [Improved Denoising Diffusion Probabilistic Models](https://github.com/open-mmlab/mmgeneration/blob/master/configs/improved_ddpm) (3 ckpts)
- [Least Squares Generative Adversarial Networks](https://github.com/open-mmlab/mmgeneration/blob/master/configs/lsgan) (4 ckpts)
- [Progressive Growing of GANs for Improved Quality, Stability, and Variation](https://github.com/open-mmlab/mmgeneration/blob/master/configs/pggan) (3 ckpts)
- [Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks](https://github.com/open-mmlab/mmgeneration/blob/master/configs/pix2pix) (4 ckpts)
- [Positional Encoding as Spatial Inductive Bias in GANs (CVPR'2021)](https://github.com/open-mmlab/mmgeneration/blob/master/configs/positional_encoding_in_gans) (21 ckpts)
- [Self-attention generative adversarial networks](https://github.com/open-mmlab/mmgeneration/blob/master/configs/sagan) (9 ckpts)
- [Singan: Learning a Generative Model from a Single Natural Image (ICCV'2019)](https://github.com/open-mmlab/mmgeneration/blob/master/configs/singan) (3 ckpts)
- [Spectral Normalization for Generative Adversarial Networks](https://github.com/open-mmlab/mmgeneration/blob/master/configs/sngan_proj) (10 ckpts)
- [A Style-Based Generator Architecture for Generative Adversarial Networks (CVPR'2019)](https://github.com/open-mmlab/mmgeneration/blob/master/configs/styleganv1) (2 ckpts)
- [Analyzing and Improving the Image Quality of Stylegan (CVPR'2020)](https://github.com/open-mmlab/mmgeneration/blob/master/configs/styleganv2) (11 ckpts)
- [Improved Training of Wasserstein GANs](https://github.com/open-mmlab/mmgeneration/blob/master/configs/wgan-gp) (2 ckpts)
# 1: Inference and train with existing models and standard datasets
Currently, we support various popular generative models, including unconditional GANs, image translation models, and internal GANs. Meanwhile, our framework has been tested on multiple standard datasets, e.g., FFHQ, CelebA, and LSUN. This note will show how to perform common tasks on these existing models and standard datasets, including:
- Use existing models to generate random samples.
- Test existing models on standard datasets.
- Train predefined models on standard datasets.
## Generate samples with existing models
In this section, we will specify how to sample fake images by using our unconditional GANs and image translation models. For model inference, all of the APIs are included in [mmgen/apis/inference.py](https://github.com/open-mmlab/mmgeneration/tree/master/mmgen/apis/inference.py). The most important function is `init_model`, which creates a generative model from a config. Then, calling a sampling function from this file with the generative model will give you the synthesized images.
### Sample images with unconditional GANs
MMGeneration provides high-level APIs for sampling images with unconditional GANs. Here is an example for building StyleGAN2-256 and obtaining the synthesized images.
```python
import mmcv
from mmgen.apis import init_model, sample_unconditional_model
# Specify the path to model config and checkpoint file
config_file = 'configs/styleganv2/stylegan2_c2_ffhq_1024_b4x8.py'
# you can download this checkpoint in advance and use a local file path.
checkpoint_file = 'https://download.openmmlab.com/mmgen/stylegan2/stylegan2_c2_ffhq_1024_b4x8_20210407_150045-618c9024.pth'
device = 'cuda:0'
# init a generative model
model = init_model(config_file, checkpoint_file, device=device)
# sample images
fake_imgs = sample_unconditional_model(model, 4)
```
We also provide a user-friendly demo script. You can use [demo/unconditional_demo.py](https://github.com/open-mmlab/mmgeneration/tree/master/mmgen/demo/unconditional_demo.py) with the following commands:
```shell
python demo/unconditional_demo.py \
${CONFIG_FILE} \
${CHECKPOINT} \
[--save-path ${SAVE_PATH}] \
[--device ${GPU_ID}]
```
Note that more arguments are offered to customize your sampling procedure. Please use `python demo/unconditional_demo.py --help` to check more details.
### Sample images with conditional GANs
MMGeneration provides high-level APIs for sampling images with conditional GANs. Here is an example for building SAGAN-128 and obtaining the synthesized images.
```python
import mmcv
from mmgen.apis import init_model, sample_conditional_model
# Specify the path to model config and checkpoint file
config_file = 'configs/sagan/sagan_128_woReLUinplace_noaug_bigGAN_Glr-1e-4_Dlr-4e-4_ndisc1_imagenet1k_b32x8.py'
# you can download this checkpoint in advance and use a local file path.
checkpoint_file = 'https://download.openmmlab.com/mmgen/sagan/sagan_128_woReLUinplace_noaug_bigGAN_imagenet1k_b32x8_Glr1e-4_Dlr-4e-4_ndisc1_20210818_210232-3f5686af.pth'
device = 'cuda:0'
# init a generative model
model = init_model(config_file, checkpoint_file, device=device)
# sample images with random label
fake_imgs = sample_conditional_model(model, 4)
# sample images with the same label
fake_imgs = sample_conditional_model(model, 4, label=0)
# sample images with specific labels
fake_imgs = sample_conditional_model(model, 4, label=[0, 1, 2, 3])
```
We also provide a user-friendly demo script. You can use [demo/conditional_demo.py](https://github.com/open-mmlab/mmgeneration/tree/master/mmgen/demo/conditional_demo.py) with the following commands:
```shell
python demo/conditional_demo.py \
${CONFIG_FILE} \
${CHECKPOINT} \
[--label ${LABEL}] \
[--samples-per-classes ${SAMPLES_PER_CLASSES}] \
[--sample-all-classes] \
[--save-path ${SAVE_PATH}] \
[--device ${GPU_ID}]
```
If `--label` is not passed, images with random labels will be generated.
If `--label` is passed, `${SAMPLES_PER_CLASSES}` images will be generated for each input label.
If `--sample-all-classes` is set in the command line, `--label` will be ignored and the generator will output images for all categories.
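For example, reusing the SAGAN config and checkpoint from the snippet above, a hypothetical invocation that draws 4 samples for class 1 might look like:
```shell
python demo/conditional_demo.py \
    configs/sagan/sagan_128_woReLUinplace_noaug_bigGAN_Glr-1e-4_Dlr-4e-4_ndisc1_imagenet1k_b32x8.py \
    https://download.openmmlab.com/mmgen/sagan/sagan_128_woReLUinplace_noaug_bigGAN_imagenet1k_b32x8_Glr1e-4_Dlr-4e-4_ndisc1_20210818_210232-3f5686af.pth \
    --label 1 --samples-per-classes 4
```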
Note that more arguments are offered to customize your sampling procedure. Please use `python demo/conditional_demo.py --help` to check more details.
### Sample images with image translation models
MMGeneration provides high-level APIs for translating images by using image translation models. Here is an example of building Pix2Pix and obtaining the translated images.
```python
import mmcv
from mmgen.apis import init_model, sample_img2img_model
# Specify the path to model config and checkpoint file
config_file = 'configs/pix2pix/pix2pix_vanilla_unet_bn_wo_jitter_flip_edges2shoes_b1x4_190k.py'
# you can download this checkpoint in advance and use a local file path.
checkpoint_file = 'https://download.openmmlab.com/mmgen/pix2pix/refactor/pix2pix_vanilla_unet_bn_wo_jitter_flip_1x4_186840_edges2shoes_convert-bgr_20210902_170902-0c828552.pth'
# Specify the path to image you want to translate
image_path = 'tests/data/paired/test/33_AB.jpg'
device = 'cuda:0'
# init a generative model
model = init_model(config_file, checkpoint_file, device=device)
# translate a single image
translated_image = sample_img2img_model(model, image_path, target_domain='photo')
```
We also provide a user-friendly demo script. You can use [demo/translation_demo.py](https://github.com/open-mmlab/mmgeneration/tree/master/mmgen/demo/translation_demo.py) with the following commands:
```shell
python demo/translation_demo.py \
${CONFIG_FILE} \
${CHECKPOINT} \
${IMAGE_PATH} \
[--save-path ${SAVE_PATH}] \
[--device ${GPU_ID}]
```
Note that more arguments are offered to customize your sampling procedure. Please use `python demo/translation_demo.py --help` to check more details.
# 2: Prepare dataset for training and testing
This section details how to prepare datasets for MMGeneration and provides the standard way we use in our default configs. We recommend that users follow the steps below to organize their datasets.
### Datasets for unconditional models
Preparing datasets for unconditional models is straightforward. First, make a directory named `data` in the MMGeneration project. Then, any dataset can be used by creating a symlink (soft link) to it.
```shell
mkdir data
ln -s absolute_path_to_dataset ./data/dataset_name
```
Since unconditional models only need real images for training and testing, all you need to do is link your dataset to the `data` directory. Our dataset class will automatically collect all of the images under the specified path (recursively).
Here, we provide several download links of datasets frequently used in unconditional models: [LSUN](http://dl.yf.io/lsun/), [CelebA](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html), [CelebA-HQ](https://drive.google.com/drive/folders/11Vz0fqHS2rXDb5pprgTjpD7S2BAJhi1P), [FFHQ](https://drive.google.com/drive/folders/1u2xu7bSrWxrbUxk-dT-UvEJq8IjdmNTP).
### Datasets for image translation models
For translation models, we currently offer two dataset settings: paired image datasets and unpaired image datasets.
For a paired image dataset, every image is formed by concatenating two corresponding images from the two domains along the width dimension. You are supposed to make two folders, "train" and "test", filled with images of this format for training and testing. The folder structure is presented below.
```
./data/dataset_name/
├── test
│   └── XXX.jpg
└── train
    └── XXX.jpg
```
For an unpaired image dataset, you are supposed to make two folders, "trainA" and "testA", filled with images from domain A, and two folders, "trainB" and "testB", filled with images from domain B. The folder structure is presented below.
```
./data/dataset_name/
├── testA
│   └── XXX.jpg
├── testB
│   └── XXX.jpg
├── trainA
│   └── XXX.jpg
└── trainB
    └── XXX.jpg
```
As described in the `Datasets for unconditional models` section, use a symlink (soft link) to set up the dataset.
Here, we provide download links of datasets used in [Pix2Pix](http://efrosgans.eecs.berkeley.edu/pix2pix/datasets/) and [CycleGAN](https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/).
# 3: Train existing models
Currently, we have tested all of the models with distributed training. Thus, we highly recommend adopting distributed training with our scripts. The basic usage is as follows:
```shell
sh tools/dist_train.sh ${CONFIG_FILE} ${GPUS_NUMBER} \
--work-dir ./work_dirs/experiments/experiments_name \
[optional arguments]
```
If you are using a Slurm system, the following commands can help you start training:
```shell
sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG} ${WORK_DIR} \
[optional arguments]
```
These two scripts wrap [tools/train.py](https://github.com/open-mmlab/mmgeneration/tree/master/tools/train.py) with a distributed training entrypoint. The `optional arguments` are defined in [tools/train.py](https://github.com/open-mmlab/mmgeneration/tree/master/tools/train.py). Users can also set `random-seed` and `resume-from` with these arguments.
Note that `work_dirs` has already been put into our `.gitignore` file, so users can put any files there without worrying about changing git-related files. Here is an example command that we use to train our `1024x1024 StyleGAN2` model:
```shell
sh tools/slurm_train.sh openmmlab-platform stylegan2-1024 \
configs/styleganv2/stylegan2_c2_ffhq_1024_b4x8.py \
work_dirs/experiments/stylegan2_c2_ffhq_1024_b4x8
```
During training, log files and checkpoints will be saved to the working directory. At the beginning of our development, we evaluated our models after training finished. However, an evaluation hook is now supported to evaluate models during the training procedure. More details can be found in our tutorial for runtime configuration.
## Training with multiple machines
If you launch with multiple machines simply connected with ethernet, you can run the following commands:
On the first machine:
```shell
NNODES=2 NODE_RANK=0 PORT=$MASTER_PORT MASTER_ADDR=$MASTER_ADDR sh tools/dist_train.sh $CONFIG $GPUS
```
On the second machine:
```shell
NNODES=2 NODE_RANK=1 PORT=$MASTER_PORT MASTER_ADDR=$MASTER_ADDR sh tools/dist_train.sh $CONFIG $GPUS
```
Usually it is slow if you do not have high speed networking like InfiniBand.
If you launch with Slurm, the command is the same as for single-machine training described above, but you need to refer to [slurm_train.sh](https://github.com/open-mmlab/mmgeneration/blob/master/tools/slurm_train.sh) to set appropriate parameters and environment variables.
## Training on CPU
The process of training on the CPU is consistent with single GPU training. We just need to disable GPUs before the training process.
```shell
export CUDA_VISIBLE_DEVICES=-1
```
And then run this script.
```shell
python tools/train.py ${CONFIG_FILE} --work-dir ${WORK_DIR}
```
**Note**:
We do not recommend using CPU for training because it is too slow. We support this feature to allow users to debug on machines without GPU for convenience. Also, you cannot train dynamic GANs on CPU. For more details, please refer to [ddp training](docs/en/tutorials/ddp_train_gans.md).
# 4: Test existing models
Currently, we support **6 evaluation metrics**, i.e., MS-SSIM, SWD, IS, FID, Precision&Recall, and PPL. For unconditional GANs, we provide a unified evaluation script in [tools/evaluation.py](https://github.com/open-mmlab/mmgeneration/tree/master/tools/evaluation.py). Additionally, [configs/_base_/default_metrics.py](https://github.com/open-mmlab/mmgeneration/tree/master/configs/_base_/default_metrics.py) offers the commonly used configurations. If you want to evaluate your model with some metrics, you can add the `metrics` to your config file like this:
```python
# at the end of the configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py
metrics = dict(
    fid50k=dict(
        type='FID',
        num_images=50000,
        inception_pkl='work_dirs/inception_pkl/ffhq-256-50k-rgb.pkl',
        bgr2rgb=True))
```
(We will specify how to obtain `inception_pkl` in the [FID](#FID) section.)
Then, users can use the evaluation script with the following command:
```shell
sh eval.sh ${CONFIG_FILE} ${CKPT_FILE} --batch-size 10 --online
```
If you are in slurm environment, please switch to the [tools/slurm_eval.sh](https://github.com/open-mmlab/mmgeneration/tree/master/tools/slurm_eval.sh) by using the following commands:
```shell
sh slurm_eval.sh ${PLATFORM} ${JOBNAME} ${CONFIG_FILE} ${CKPT_FILE} \
--batch-size 10 --online
```
As you can see, we provide two modes for evaluating your models: `online` and `offline`. The `online` mode indicates that the synthesized images will be directly passed to the metrics instead of being saved to the file system. If users have set the `--samples-path` argument, the `offline` mode will save the generated images in this directory so that users can use them for other tasks. Besides, users can use the `offline` mode to sample images:
```shell
# for general envs
sh eval.sh ${CONFIG_FILE} ${CKPT_FILE} --eval none
# for slurm
sh slurm_eval.sh ${PLATFORM} ${JOBNAME} ${CONFIG_FILE} ${CKPT_FILE} \
--eval none
```
We also provide [tools/utils/translation_eval.py](https://github.com/open-mmlab/mmgeneration/blob/master/tools/utils/translation_eval.py) for users to evaluate their translation models. You are supposed to set the `target-domain` of the output images and run the following command:
```shell
python tools/utils/translation_eval.py ${CONFIG_FILE} ${CKPT_FILE} --t ${target-domain}
```
Note that, in the current version of MMGeneration, multi-GPU evaluation and image saving are supported for [FID](#fid) and [IS](#is). You can use the following commands to use this feature:
```shell
# online evaluation
sh dist_eval.sh ${CONFIG_FILE} ${CKPT_FILE} ${GPUS_NUMBER} --batch-size 10 --online
# online evaluation with slurm
sh slurm_eval_multi_gpu.sh ${PLATFORM} ${JOBNAME} ${CONFIG_FILE} ${CKPT_FILE} --batch-size 10 --online
# offline evaluation
sh dist_eval.sh ${CONFIG_FILE} ${CKPT_FILE} ${GPUS_NUMBER}
# offline evaluation with slurm
sh slurm_eval_multi_gpu.sh ${PLATFORM} ${JOBNAME} ${CONFIG_FILE} ${CKPT_FILE}
# image saving
sh dist_eval.sh ${CONFIG_FILE} ${CKPT_FILE} ${GPUS_NUMBER} --eval none --samples-path ${SAMPLES_PATH}
# image saving with slurm
sh slurm_eval_multi_gpu.sh ${PLATFORM} ${JOBNAME} ${CONFIG_FILE} ${CKPT_FILE} --eval none --samples-path ${SAMPLES_PATH}
```
Multi-GPU evaluation for more metrics will be supported in subsequent versions.
Next, we will specify the details of different metrics one by one.
## **FID**
Fréchet Inception Distance is a measure of similarity between two datasets of images. It was shown to correlate well with the human judgment of visual quality and is most often used to evaluate the quality of samples of Generative Adversarial Networks. FID is calculated by computing the Fréchet distance between two Gaussians fitted to feature representations of the Inception network.
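As a minimal illustration of that definition (a sketch, not MMGeneration's implementation), the Fréchet distance between two Gaussians fitted to Inception features can be computed as follows, where `mu1`/`sigma1` and `mu2`/`sigma2` are the feature means and covariances of the real and generated images:
```python
import numpy as np
from scipy import linalg


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # matrix square root of the covariance product
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerics
    return diff @ diff + np.trace(sigma1 + sigma2 - 2 * covmean)
```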
In `MMGeneration`, we provide two versions of the FID calculation. One is the commonly used PyTorch version, and the other is the one used in the StyleGAN paper. Meanwhile, we have compared the difference between these two implementations on the StyleGAN2-FFHQ1024 model (the details can be found [here](https://github.com/open-mmlab/mmgeneration/blob/master/configs/styleganv2/README.md)). Fortunately, there is only a marginal difference in the final results. Thus, we recommend that users adopt the more convenient PyTorch version.
**About PyTorch version and Tero's version:** The commonly used PyTorch version adopts the modified InceptionV3 network to extract features for real and fake images. However, Tero's FID requires a [script module](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/inception-2015-12-05.pt) for Tensorflow InceptionV3. Note that applying this script module needs `PyTorch >= 1.6.0`.
**About extracting real inception data:** For convenience, we always extract the features for real images in advance. In `MMGeneration`, we have provided [tools/utils/inception_stat.py](https://github.com/open-mmlab/mmgeneration/blob/master/tools/utils/inception_stat.py) for users to prepare the real inception data. After running the following command, the extracted features will be saved in a `pkl` file.
```shell
python tools/utils/inception_stat.py --imgsdir ${IMGS_PATH} --pklname ${PKLNAME} --size ${SIZE}
```
In the aforementioned command, the script uses the PyTorch InceptionV3 by default. If you want Tero's InceptionV3, you will need to switch to the script module:
```shell
python tools/utils/inception_stat.py --imgsdir ${IMGS_PATH} --pklname ${PKLNAME} --size ${SIZE} \
--inception-style stylegan --inception-pth ${PATH_SCRIPT_MODULE}
```
For more information about how to extract the inception statistics, please refer to this [doc](https://github.com/open-mmlab/mmgeneration/blob/master/docs/en/tutorials/inception_stat.md).
To use the FID metric, you should add the metric in a config file like this:
```python
metrics = dict(
    fid50k=dict(
        type='FID',
        num_images=50000,
        inception_pkl='work_dirs/inception_pkl/ffhq-256-50k-rgb.pkl',
        bgr2rgb=True))
```
If `inception_pkl` is not set, the metric will calculate the real inception statistics on the fly. If you want to use Tero's InceptionV3, please use the following metric configuration:
```python
metrics = dict(
fid50k=dict(
type='FID',
num_images=50000,
inception_pkl='work_dirs/inception_pkl/ffhq-1024-50k-stylegan.pkl', inception_args=dict(
type='StyleGAN',
inception_path='work_dirs/cache/inception-2015-12-05.pt')))
```
The `inception_path` indicates the path to Tero's script module.
## Precision and Recall
Our `Precision and Recall` implementation follows the version used in StyleGAN2. In this metric, a VGG network is adopted to extract features from the images. Unfortunately, we have not found a PyTorch VGG implementation that leads to results similar to Tero's version used in StyleGAN2. (About the differences, please see this [file](https://github.com/open-mmlab/mmgeneration/blob/master/configs/styleganv2/README.md).) Thus, in our implementation, we adopt [Tero's VGG](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/vgg16.pt) network by default. Importantly, applying this script module needs `PyTorch >= 1.6.0`. With a lower PyTorch version, we will use the official PyTorch VGG network for feature extraction.
To evaluate with `P&R`, please add the following configuration in the config file:
```python
metrics = dict(
    PR=dict(
        type='PR',
        num_images=50000))
```
## IS
Inception score is an objective metric for evaluating the quality of generated images, proposed in [Improved Techniques for Training GANs](https://arxiv.org/pdf/1606.03498.pdf). It uses an InceptionV3 model to predict the class of the generated images, and supposes that 1) if an image is of high quality, it will be categorized into a specific class, and 2) if images are of high diversity, the range of predicted classes will be wide. So the KL divergence between the conditional probability and the marginal probability can indicate the quality and diversity of generated images. You can see the complete implementation in `metrics.py`, which refers to https://github.com/sbarratt/inception-score-pytorch/blob/master/inception_score.py.
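As a minimal sketch of this formula (not the implementation in `metrics.py`), the score can be computed from the InceptionV3 softmax predictions of the generated images:
```python
import numpy as np


def inception_score(probs: np.ndarray) -> float:
    """probs: (N, num_classes) softmax predictions p(y|x) for N images."""
    p_y = probs.mean(axis=0, keepdims=True)  # marginal distribution p(y)
    # KL(p(y|x) || p(y)) per image, then IS = exp(mean KL)
    kl = (probs * (np.log(probs + 1e-10) - np.log(p_y + 1e-10))).sum(axis=1)
    return float(np.exp(kl.mean()))
```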
If you want to evaluate models with `IS` metrics, you can add the `metrics` into your config file like this:
```python
# at the end of the configs/pix2pix/pix2pix_vanilla_unet_bn_facades_b1x1_80k.py
metrics = dict(
    IS=dict(type='IS', num_images=106, image_shape=(3, 256, 256)))
```
You can run the command below to calculate IS.
```shell
python tools/utils/translation_eval.py --t photo \
./configs/pix2pix/pix2pix_vanilla_unet_bn_facades_b1x1_80k.py \
https://download.openmmlab.com/mmgen/pix2pix/refactor/pix2pix_vanilla_unet_bn_1x1_80k_facades_20210902_170442-c0958d50.pth \
--eval IS
```
Note that the choice of the InceptionV3 model and the image resizing method can significantly influence the final IS score. Therefore, we strongly recommend that users download [Tero's script model of Inception V3](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/inception-2015-12-05.pt) (loading this script model requires torch >= 1.6) and use `Bicubic` interpolation with the `Pillow` backend. We provide a template for the [data processing pipeline](https://github.com/open-mmlab/mmgeneration/tree/master/configs/_base_/datasets/Inception_Score.py) as well.
We also performed a survey on the influence of the data loading pipeline and the version of the pretrained Inception V3 on the IS result. All IS scores are evaluated on the same group of images, randomly selected from the ImageNet dataset.
<details> <summary> Show the Comparison Results </summary>
| Code Base | Inception V3 Version | Data Loader Backend | Resize Interpolation Method | IS |
| :-------------------------------------------------------------: | :------------------: | :-----------------: | :-------------------------: | :-------------------: |
| [OpenAI (baseline)](https://github.com/openai/improved-gan) | Tensorflow | Pillow | Pillow Bicubic | **312.255 +/- 4.970** |
| [StyleGAN-Ada](https://github.com/NVlabs/stylegan2-ada-pytorch) | Tero's Script Model | Pillow | Pillow Bicubic | 311.895 +/- 4.844 |
| mmgen (Ours) | Pytorch Pretrained | cv2 | cv2 Bilinear | 322.932 +/- 2.317 |
| mmgen (Ours) | Pytorch Pretrained | cv2 | cv2 Bicubic | 324.604 +/- 5.157 |
| mmgen (Ours) | Pytorch Pretrained | cv2 | Pillow Bicubic | 318.161 +/- 5.330 |
| mmgen (Ours) | Pytorch Pretrained | Pillow | Pillow Bilinear | 313.126 +/- 5.449 |
| mmgen (Ours) | Pytorch Pretrained | Pillow | cv2 Bilinear | 318.021 +/- 3.864 |
| mmgen (Ours) | Pytorch Pretrained | Pillow | Pillow Bicubic | 317.997 +/- 5.350 |
| mmgen (Ours) | Tero's Script Model | cv2 | cv2 Bilinear | 318.879 +/- 2.433 |
| mmgen (Ours) | Tero's Script Model | cv2 | cv2 Bicubic | 316.125 +/- 5.718 |
| mmgen (Ours) | Tero's Script Model | cv2 | Pillow Bicubic | **312.045 +/- 5.440** |
| mmgen (Ours) | Tero's Script Model | Pillow | Pillow Bilinear | 308.645 +/- 5.374 |
| mmgen (Ours) | Tero's Script Model | Pillow | Pillow Bicubic | 311.733 +/- 5.375 |
</details>
## PPL
Perceptual path length measures the difference between consecutive images (their VGG16 embeddings) when interpolating between two random inputs. Drastic changes mean that multiple features have changed together and that they might be entangled. Thus, experiments suggest that a smaller PPL score indicates higher overall image quality. \
As a basis for our metric, we use a perceptually-based pairwise image distance that is calculated as a weighted difference between two VGG16 embeddings, where the weights are fit so that the metric agrees with human perceptual similarity judgments.
If we subdivide a latent space interpolation path into linear segments, we can define the total perceptual length of this segmented path as the sum of perceptual differences over each segment. A natural definition for the perceptual path length would be the limit of this sum under infinitely fine subdivision, but in practice we approximate it using a small subdivision `` $`\epsilon=10^{-4}`$ ``.
The average perceptual path length in latent space `Z`, over all possible endpoints, is therefore
`` $$`L_Z = E[\frac{1}{\epsilon^2}d(G(slerp(z_1,z_2;t)), G(slerp(z_1,z_2;t+\epsilon)))]`$$ ``
Computing the average perceptual path length in latent space `W` is carried out in a similar fashion, except that interpolation happens after the mapping network `` $`f`$ `` and uses linear interpolation:
`` $$`L_W = E[\frac{1}{\epsilon^2}d(g(lerp(f(z_1),f(z_2);t)), g(lerp(f(z_1),f(z_2);t+\epsilon)))]`$$ ``
where `` $`z_1, z_2 \sim P(z)`$ ``, and `` $`t \sim U(0,1)`$ `` if we set `sampling` to full, or `` $`t \in \{0,1\}`$ `` if we set `sampling` to end. `` $`G`$ `` is the generator (i.e., `` $`g \circ f`$ `` for style-based networks), and `` $`d(\cdot,\cdot)`$ `` evaluates the perceptual distance between the resulting images. We compute the expectation by taking 100,000 samples (set `num_images` to 50,000 in our code).
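The following sketch shows how one PPL sample is formed from the definition above; `generator` and `perceptual_distance` are stand-in callables, not MMGeneration's API:
```python
import torch


def slerp(z1, z2, t):
    """Spherical interpolation between two latent codes."""
    omega = torch.acos((z1 * z2).sum() / (z1.norm() * z2.norm()))
    return (torch.sin((1 - t) * omega) * z1 +
            torch.sin(t * omega) * z2) / torch.sin(omega)


def ppl_sample(generator, perceptual_distance, eps=1e-4):
    z1, z2 = torch.randn(512), torch.randn(512)
    t = torch.rand(())  # t ~ U(0, 1), i.e. the `full` sampling mode
    img_a = generator(slerp(z1, z2, t))
    img_b = generator(slerp(z1, z2, t + eps))
    # scaled perceptual difference over one tiny segment
    return perceptual_distance(img_a, img_b) / eps ** 2
```
Averaging `ppl_sample` over many latent pairs approximates `` $`L_Z`$ ``.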
You can find the complete implementation in `metrics.py`, which refers to https://github.com/rosinality/stylegan2-pytorch/blob/master/ppl.py.
If you want to evaluate models with `PPL` metrics, you can add the `metrics` into your config file like this:
```python
# at the end of the configs/styleganv2/stylegan2_c2_ffhq_1024_b4x8.py
metrics = dict(
    ppl_wend=dict(
        type='PPL', space='W', sampling='end', num_images=50000,
        image_shape=(3, 1024, 1024)))
```
You can run the command below to calculate PPL.
```shell
python tools/evaluation.py ./configs/styleganv2/stylegan2_c2_ffhq_1024_b4x8.py \
https://download.openmmlab.com/mmgen/stylegan2/stylegan2_c2_ffhq_1024_b4x8_20210407_150045-618c9024.pth \
--batch-size 2 --online --eval ppl_wend
```
## SWD
Sliced Wasserstein distance is a discrepancy measure for probability distributions; a smaller distance indicates that the generated images look more like the real ones. We obtain the Laplacian pyramid of every image and extract patches from the Laplacian pyramids as descriptors; SWD is then calculated as the sliced Wasserstein distance between the real and fake descriptors.
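A minimal sketch of the sliced Wasserstein distance between two descriptor sets (not MMGeneration's implementation) projects both sets onto random unit directions, sorts the projections, and averages the absolute differences:
```python
import torch


def sliced_wasserstein(real, fake, n_proj=128):
    """real, fake: (N, D) patch descriptors with the same N."""
    dirs = torch.randn(real.size(1), n_proj)
    dirs = dirs / dirs.norm(dim=0, keepdim=True)  # unit projection directions
    proj_real = (real @ dirs).sort(dim=0).values  # sorted 1-D projections
    proj_fake = (fake @ dirs).sort(dim=0).values
    return (proj_real - proj_fake).abs().mean()
```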
You can see the complete implementation in `metrics.py`, which refers to https://github.com/tkarras/progressive_growing_of_gans/blob/master/metrics/sliced_wasserstein.py.
If you want to evaluate models with `SWD` metrics, you can add the `metrics` into your config file like this:
```python
# at the end of the configs/pggan/pggan_celeba-cropped_128_g8_12Mimgs.py
metrics = dict(swd16k=dict(type='SWD', num_images=16384, image_shape=(3, 128, 128)))
```
You can run the command below to calculate SWD.
```shell
python tools/evaluation.py ./configs/pggan/pggan_celeba-cropped_128_g8_12Mimgs.py \
https://download.openmmlab.com/mmgen/pggan/pggan_celeba-cropped_128_g8_20210408_181931-85a2e72c.pth \
--batch-size 64 --online --eval swd16k
```
## MS-SSIM
Multi-scale structural similarity is used to measure the similarity of two images. We use MS-SSIM here to measure the diversity of generated images: a low MS-SSIM score indicates high diversity of the generated images. You can see the complete implementation in `metrics.py`, which refers to https://github.com/tkarras/progressive_growing_of_gans/blob/master/metrics/ms_ssim.py.
If you want to evaluate models with `MS-SSIM` metrics, you can add the `metrics` into your config file like this:
```python
# at the end of the configs/pggan/pggan_celeba-cropped_128_g8_12Mimgs.py
metrics = dict(ms_ssim10k=dict(type='MS_SSIM', num_images=10000))
```
You can run the command below to calculate MS-SSIM.
```shell
python tools/evaluation.py ./configs/pggan/pggan_celeba-cropped_128_g8_12Mimgs.py \
https://download.openmmlab.com/mmgen/pggan/pggan_celeba-cropped_128_g8_20210408_181931-85a2e72c.pth \
--batch-size 64 --online --eval ms_ssim10k
```
# 5: Evaluation during training
In this section, we will discuss how to evaluate generative models, especially GANs, during training. Note that `MMGeneration` only supports distributed training, so the evaluation metric adopted in the training procedure should also run in a distributed style. Currently, only `FID` has been implemented and tested in an efficient distributed version. Other metrics with efficient distributed versions will be supported in the near future. Thus, in the following part, we will specify how to evaluate your models with the `FID` metric during training.
In [eval_hooks.py](https://github.com/open-mmlab/mmgeneration/blob/master/mmgen/core/evaluation/eval_hooks.py), `GenerativeEvalHook` is provided to evaluate generative models during training. The most important argument for this hook is `metrics`. In fact, users can directly copy the configs in the last section to define the evaluation metric. To evaluate the model with `FID` metric, please add the following python codes in your config file:
```python
# define the evaluation keywords, otherwise evaluation will not be
# added in training
evaluation = dict(
    type='GenerativeEvalHook',
    interval=10000,
    metrics=dict(
        type='FID',
        num_images=50000,
        inception_pkl='path_to_inception_pkl',
        bgr2rgb=True),
    sample_kwargs=dict(sample_model='ema'))
```
We also provide `TranslationEvalHook` to evaluate translation models during training. You can use it in almost the same way as `GenerativeEvalHook`. The only difference is that you need to specify the `target_domain` of the evaluated images. To evaluate the model with `FID` metric, please add the following python codes in your config file:
```python
# define the evaluation keywords, otherwise evaluation will not be
# added in training
evaluation = dict(
    type='TranslationEvalHook',
    interval=10000,
    target_domain='target_domain',
    metrics=dict(
        type='FID',
        num_images=50000,
        inception_pkl='path_to_inception_pkl',
        bgr2rgb=True),
    sample_kwargs=dict(sample_model='ema'))
```
For `FID` evaluation, our distributed version takes only about **400 seconds (7 minutes)**. Thus, it will not influence the training time significantly. In addition, users should also provide the `val` dataset, even though this metric will not use the files from this dataset:
```python
data = dict(
    samples_per_gpu=4,
    train=dict(dataset=dict(imgs_root='./data/ffhq/ffhq_imgs/ffhq_256')),
    val=dict(imgs_root='./data/ffhq/ffhq_imgs/ffhq_256'))
```
We highly recommend pre-calculating the inception pickle file in advance, as it reduces the evaluation cost significantly.
#!/usr/bin/env python
import glob
import os.path as osp
import re
url_prefix = 'https://github.com/open-mmlab/mmgeneration/blob/master/'
files = sorted(glob.glob('../../configs/*/README.md'))
stats = []
titles = []
num_ckpts = 0
for f in files:
    url = osp.dirname(f.replace('../../', url_prefix))

    with open(f, 'r') as content_file:
        content = content_file.read()

    title = content.split('\n')[0].replace('# ', '')
    titles.append(title)

    ckpts = set(x.lower().strip()
                for x in re.findall(r'https?://download.(.*?)\.pth', content)
                if 'mmgen' in x)
    num_ckpts += len(ckpts)

    statsmsg = f"""
\t* [{title}]({url}) ({len(ckpts)} ckpts)
"""
    stats.append((title, ckpts, statsmsg))
msglist = '\n'.join(x for _, _, x in stats)
modelzoo = f"""
# Model Zoo Statistics
* Number of papers: {len(titles)}
* Number of checkpoints: {num_ckpts}
{msglist}
"""
with open('modelzoo_statistics.md', 'w') as f:
    f.write(modelzoo)
## <a href='https://mmgeneration.readthedocs.io/en/latest/'>English</a>
## <a href='https://mmgeneration.readthedocs.io/zh_CN/latest/'>简体中文</a>
# Tutorial 8: Applications with Generative Models
## Interpolation
The generative model in the GAN architecture learns to map points in the latent space to generated images. The latent space has no meaning other than that imposed on it by the generative model. To explore the structure of the latent space, we can interpolate a sequence of points between two endpoints in the latent space and inspect the images these points yield. (E.g., features that are absent at either endpoint appearing in the middle of a linear interpolation path is a sign that the latent space is entangled and the factors of variation are not properly separated.)
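Conceptually, the procedure looks like the following sketch, where `generator` is a stand-in that maps latent codes to images (the application script below does this with a real model for you):
```python
import torch

# stand-in generator for illustration only
generator = torch.nn.Linear(512, 3 * 64 * 64)

z1, z2 = torch.randn(1, 512), torch.randn(1, 512)  # two endpoints
for t in torch.linspace(0, 1, steps=8):
    z_t = (1 - t) * z1 + t * z2  # linear interpolation in latent space
    img = generator(z_t).reshape(3, 64, 64)  # inspect the resulting image
```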
We provide an application script for this. You can use [apps/interpolate_sample.py](https://github.com/open-mmlab/mmgeneration/tree/master/apps/interpolate_sample.py) with the following commands for unconditional models' interpolation:
```bash
python apps/interpolate_sample.py \
${CONFIG_FILE} \
${CHECKPOINT} \
[--show-mode ${SHOW_MODE}] \
[--endpoint ${ENDPOINT}] \
[--interval ${INTERVAL}] \
[--space ${SPACE}] \
[--samples-path ${SAMPLES_PATH}] \
[--batch-size ${BATCH_SIZE}]
```
Here, we provide two kinds of `show-mode`: `sequence` and `group`. In `sequence` mode, we sample a sequence of endpoints first and then interpolate points between consecutive endpoints in order; the generated images are saved individually. In `group` mode, we sample several pairs of endpoints and interpolate points between the two endpoints in each pair; the generated images are saved in a single picture. In addition, `space` refers to the latent code space: you can choose 'z' or 'w' (the latter refers to the style space in the StyleGAN series). `endpoint` indicates the number of endpoints you want to sample (it should be set to an even number in `group` mode), and `interval` means the number of points (including endpoints) interpolated between two endpoints.
Note that more arguments are offered to customize your interpolation procedure.
Please use `python apps/interpolate_sample.py --help` to check more details.
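For example, assuming the StyleGAN2-FFHQ checkpoint used earlier in this documentation, a group-mode interpolation in W space might look like:
```bash
python apps/interpolate_sample.py \
    configs/styleganv2/stylegan2_c2_ffhq_1024_b4x8.py \
    https://download.openmmlab.com/mmgen/stylegan2/stylegan2_c2_ffhq_1024_b4x8_20210407_150045-618c9024.pth \
    --show-mode group --endpoint 6 --interval 10 --space w \
    --samples-path ./work_dirs/interp
```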
Similarly, you can use [apps/conditional_interpolate.py](https://github.com/open-mmlab/mmgeneration/tree/master/apps/conditional_interpolate.py) with the following commands for conditional models' interpolation:
```bash
python apps/conditional_interpolate.py \
${CONFIG_FILE} \
${CHECKPOINT} \
[--show-mode ${SHOW_MODE}] \
[--endpoint ${ENDPOINT}] \
[--interval ${INTERVAL}] \
[--embedding-name ${EMBEDDING_NAME}] \
[--fix-z] \
[--fix-y] \
[--samples-path ${SAMPLES_PATH}] \
[--batch-size ${BATCH_SIZE}]
```
Here, unlike unconditional models, you need to provide the name of the embedding layer if the label embedding is shared among conv_blocks. Otherwise, you can set the `embedding-name` to 'NULL'. Considering that conditional models have noise and label as inputs, we provide `fix-z` to fix the noise and `fix-y` to fix the label when performing image interpolation.
## Projection
Inverting the synthesis network g is an interesting problem with many applications. For example, manipulating a given image in the latent feature space requires finding a matching latent code for it first. Generally, you can reconstruct a target image by optimizing over the latent vector, using LPIPS and a pixel-wise loss as the objective function.
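The optimization loop is conceptually simple; here is a sketch in which `generator` and `lpips_loss` are stand-in callables rather than MMGeneration's API:
```python
import torch
import torch.nn.functional as F


def project(generator, lpips_loss, target, steps=1000, lr=0.1):
    """Optimize a latent vector so that generator(w) matches target."""
    w = torch.zeros(1, 512, requires_grad=True)
    optimizer = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        image = generator(w)
        loss = lpips_loss(image, target) + F.mse_loss(image, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return w.detach()
```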
We provide an application script to find the matching latent vector w of a StyleGAN-series synthesis network for given images. You can use [apps/stylegan_projector.py](https://github.com/open-mmlab/mmgeneration/tree/master/apps/stylegan_projector.py) with the following commands:
```bash
python apps/stylegan_projector.py \
${CONFIG_FILE} \
${CHECKPOINT} \
${FILES}
[--results-path ${RESULTS_PATH}]
```
Here, `FILES` refers to the image paths; the projected latents and reconstructed images will be saved in `results-path`.
Note that more arguments are offered to customize your projection procedure.
Please use `python apps/stylegan_projector.py --help` to check more details.
## Manipulation
A general application of StyleGAN-based models is manipulating the latent space to control the attributes of the synthesized images. Here, we provide a simple but popular algorithm based on [SeFa](https://arxiv.org/pdf/2007.06600.pdf). We modify the original version's eigenvector calculation and offer a more flexible interface.
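The core idea of SeFa fits in a few lines; the sketch below (our modified version may differ) treats the eigenvectors of `W^T W`, for a latent-to-feature weight matrix `W`, as editing directions:
```python
import torch


def sefa_directions(weight, k=5):
    """Top-k eigenvectors of W^T W as semantic editing directions."""
    w = weight.flatten(1)  # (out_features, latent_dim)
    eigvals, eigvecs = torch.linalg.eigh(w.t() @ w)  # ascending eigenvalues
    return eigvecs[:, -k:].flip(-1).t()  # (k, latent_dim), largest first


# usage sketch: move a latent code along the i-th direction
# z_edit = z + degree * directions[i]
```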
To manipulate your generator, you can run the script [apps/modified_sefa.py](https://github.com/open-mmlab/mmgeneration/tree/master/apps/modified_sefa.py) with the following command:
```shell
python apps/modified_sefa.py --cfg ${CONFIG} --ckpt ${CKPT} \
-i ${INDEX} -d ${DEGREE} --degree-step ${D_STEP} \
-l ${LAYER_NO} \
[--eigen-vector ${PATH_EIGEN_VEC}]
```
In this script, the eigenvectors of the generator's parameters will be calculated if `eigen-vector` is None. Meanwhile, we will save them in the same directory as the `ckpt` file, so that users can reuse the pre-calculated vectors. The demo of `Positional Encoding as Spatial Inductive Bias in GANs` comes directly from this script. Here is an example for users to obtain results similar to our demo.
The `${INDEX}` indicates which eigenvector we will apply to manipulate the images. In general cases, each index controls one independent attribute, which is guaranteed by the disentangled representation in StyleGAN. We suggest trying different indices to find the one you want. The `--degree` sets the range of the multiplication factor. In our experiments, we observe that an asymmetric range like `[-3, 8]` is very helpful. Thus, we allow setting the lower and upper bound in this argument. `--layer` or `-l` defines the layers to which we apply the eigenvector. Some properties, like lighting, are only related to 1-2 layers in the generator.
Taking the lighting attribute as an example, we adopt the following command on our MS-PIE-StyleGAN2-256 model:
```shell
python apps/modified_sefa.py \
--config configs/positional_encoding_in_gans/mspie-stylegan2_c2_config-f_ffhq_256-512_b3x8_1100k.py \
--ckpt https://download.openmmlab.com/mmgen/pe_in_gans/mspie-stylegan2_c2_config-f_ffhq_256-512_b3x8_1100k_20210406_144927-4f4d5391.pth \
-i 15 -d 8. --degree-step 0.5 -l 8 9 --sample-path ./work_dirs/sefa-exp/ \
--sample-cfg chosen_scale=4 randomize_noise=False
```
Importantly, after setting `chosen_scale=4`, we can manipulate the 512x512 images with a simple 256-scale generator.
# Tutorial 1: Learn about Configs
We incorporate modular and inheritance design into our config system, which is convenient to conduct various experiments.
If you wish to inspect the config file, you may run `python tools/misc/print_config.py /PATH/TO/CONFIG` to see the complete config.
## Modify config through script arguments
When submitting jobs using "tools/train.py" or "tools/evaluation.py", you may specify `--cfg-options` to in-place modify the config.
- Update config keys of dict chains.
The config options can be specified following the order of the dict keys in the original config.
For example, `--cfg-options test_cfg.use_ema=False` changes the default sampling model to the original generator.
- Update keys inside a list of configs.
Some config dicts are composed as a list in your config. For example, the training pipeline `data.train.pipeline` is normally a list
e.g. `[dict(type='LoadImageFromFile'), ...]`. If you want to change `'LoadImageFromFile'` to `'LoadImageFromWebcam'` in the pipeline,
you may specify `--cfg-options data.train.pipeline.0.type=LoadImageFromWebcam`.
- Update values of list/tuples.
If the value to be updated is a list or a tuple, quote it. For example, the config file normally sets `workflow=[('train', 1)]`. If you want to
change this key, you may specify `--cfg-options workflow="[(train,1),(val,1)]"`. Note that the quotation mark " is necessary to
support list/tuple data types, and that **NO** white space is allowed inside the quotation marks in the specified value.
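Putting these together, a hedged sketch of a full command (the config path is one that appears later in this document; the option values are taken from the examples above):
```shell
python tools/train.py configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py \
    --cfg-options test_cfg.use_ema=False \
    data.train.pipeline.0.type=LoadImageFromWebcam \
    workflow="[(train,1),(val,1)]"
```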
## Config File Structure
There are 4 basic component types under `configs/_base_`: dataset, model, default_metrics, and default_runtime.
Many methods, such as StyleGAN2, CycleGAN, and SinGAN, can be easily constructed with one of each.
Configs consisting of components from `_base_` are called _primitive_.
For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum inheritance level is 3.
For easy understanding, we recommend contributors to inherit from existing methods.
For example, if some modification is made based on StyleGAN2, users may first inherit the basic StyleGAN2 structure by specifying `_base_ = ../styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py`, then modify the necessary fields in the config file.
If you are building an entirely new method that does not share its structure with any existing method, you may create a folder `xxxgan` under `configs`.
Please refer to [mmcv](https://mmcv.readthedocs.io/en/latest/utils.html#config) for detailed documentation.
## Config Name Style
We follow the style below to name config files. Contributors are advised to follow the same style.
```
{model}_[model setting]_{dataset}_[batch_per_gpu x gpu]_{schedule}
```
`{xxx}` is a required field and `[yyy]` is optional.
- `{model}`: model type like `stylegan`, `dcgan`, etc.
- `[model setting]`: specific setting for some model, like `c2` for `stylegan2`, etc.
- `{dataset}`: dataset like `ffhq`, `lsun-car`, `celeba-hq`.
- `[batch_per_gpu x gpu]`: samples per GPU and number of GPUs; `b4x8` is used by default for StyleGAN2.
- `{schedule}`: training schedule. Following Tero's convention, we recommend using the number of images shown to the discriminator, like `5M` or `800k`. Of course, you can also use `5e` for 5 epochs or `80k-iters` for 80k iterations.
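For example, here is one possible reading of `stylegan2_c2_ffhq_256_b4x8_800k.py`, a config name that appears later in this document (the `256` resolution tag attaches to the dataset part here):
```
stylegan2_c2_ffhq_256_b4x8_800k.py
# {model}          -> stylegan2
# [model setting]  -> c2
# {dataset}        -> ffhq (at 256x256 resolution)
# [batch x gpu]    -> b4x8 (4 samples per GPU, 8 GPUs)
# {schedule}       -> 800k (800k images shown to the discriminator)
```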
## An Example of StyleGAN2
To help users get a basic idea of a complete config and the modules in a modern generative framework,
we make brief comments on the config of StyleGAN2 at the 256x256 scale.
For more detailed usage and the corresponding alternative for each module, please refer to the API documentation and the [tutorial in MMDetection](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/tutorials/config.md).
```python
_base_ = [
'../_base_/datasets/ffhq_flip.py', '../_base_/models/stylegan/stylegan2_base.py',
'../_base_/default_runtime.py', '../_base_/default_metrics.py'
] # base config file which we build new config file on.
model = dict(generator=dict(out_size=256), discriminator=dict(in_size=256)) # update the `out_size` and `in_size` arguments.
data = dict(
samples_per_gpu=4, # specify the number of samples on each GPU
train=dict(dataset=dict(imgs_root='./data/ffhq/ffhq_imgs/ffhq_256'))) # provide root path for dataset
ema_half_life = 10. # G_smoothing_kimg
custom_hooks = [ # add customized hooks for training
dict(
type='VisualizeUnconditionalSamples', # visualize training samples for GANs
output_dir='training_samples', # define output path
interval=5000), # the interval of calling this hook
dict(
type='ExponentialMovingAverageHook', # EMA hook for better generator
        module_keys=('generator_ema', ),  # track `generator_ema`; the original module must be named `generator`
interval=1, # the interval of calling this hook
interp_cfg=dict(momentum=0.5**(32. / (ema_half_life * 1000.))), # args for updating params for ema model
priority='VERY_HIGH') # define the priority of this hook
]
metrics = dict( # metrics we used to test this model
fid50k=dict(
inception_pkl='work_dirs/inception_pkl/ffhq-256-50k-rgb.pkl', # provide the inception pkl for FID
bgr2rgb=True)) # change the order of the image channel when extracting inception features
checkpoint_config = dict(interval=10000, by_epoch=False, max_keep_ckpts=30) # define checkpoint hook
lr_config = None # remove lr scheduler
log_config = dict( # define log hook
interval=100,
hooks=[
dict(type='TextLoggerHook'),
# dict(type='TensorboardLoggerHook'),
])
total_iters = 800002 # define the total number of iterations
```
## FAQ
### Ignore some fields in the base configs
Sometimes, you may set `_delete_=True` to ignore some of the fields in the base configs.
You may refer to [mmcv](https://mmcv.readthedocs.io/en/latest/utils.html#inherit-from-base-config-with-ignored-fields) for a simple illustration.
You may also take a careful look at [this tutorial](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/tutorials/config.md) for a better understanding of this feature.
### Use intermediate variables in configs
Some intermediate variables are used in the config files, like `train_pipeline`/`test_pipeline` in datasets.
It's worth noting that when modifying intermediate variables in the children configs, users need to pass the intermediate variables into the corresponding fields again, as in the sketch below. An intuitive example can also be found in [this tutorial](https://github.com/open-mmlab/mmdetection/blob/master/docs/tutorials/config.md).
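As a minimal sketch (the parent config path and pipeline contents are placeholders, not a verified recipe), re-passing a modified intermediate variable looks like this:
```python
_base_ = './stylegan2_c2_ffhq_256_b4x8_800k.py'  # hypothetical parent config

# redefine the intermediate variable ...
train_pipeline = [
    dict(type='LoadImageFromFile', key='real_img', io_backend='disk'),
    dict(type='ImageToTensor', keys=['real_img']),
    dict(type='Collect', keys=['real_img'], meta_keys=['real_img_path'])
]
# ... and pass it into the corresponding field again; otherwise the
# parent's pipeline is still used
data = dict(train=dict(dataset=dict(pipeline=train_pipeline)))
```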
# Tutorial 2: Customize Datasets
In this section, we detail how to prepare data and adopt the proper datasets in our repo for different methods.
## Datasets for unconditional models
**Data preparation for unconditional models** is simple. All you need to do is download the images and put them into a directory. Next, you should set a symlink in the `data` directory (a minimal sketch is given right after the config below). For standard unconditional GANs with static architectures, like DCGAN and StyleGAN2, `UnconditionalImageDataset` is designed to train such unconditional models. Here is an example config for the FFHQ dataset:
```python
dataset_type = 'UnconditionalImageDataset'
train_pipeline = [
dict(
type='LoadImageFromFile',
key='real_img',
io_backend='disk',
),
dict(type='Flip', keys=['real_img'], direction='horizontal'),
dict(
type='Normalize',
keys=['real_img'],
mean=[127.5] * 3,
std=[127.5] * 3,
to_rgb=False),
dict(type='ImageToTensor', keys=['real_img']),
dict(type='Collect', keys=['real_img'], meta_keys=['real_img_path'])
]
# `samples_per_gpu` and `imgs_root` need to be set.
data = dict(
samples_per_gpu=4,
workers_per_gpu=4,
train=dict(
type='RepeatDataset',
times=100,
dataset=dict(
type=dataset_type,
imgs_root='data/ffhq/images',
pipeline=train_pipeline)))
```
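For the `imgs_root` above, a minimal sketch of the symlink setup (assuming your FFHQ images live at `/data1/ffhq/images`, a placeholder path):
```shell
mkdir -p data/ffhq
ln -s /data1/ffhq/images data/ffhq/images
```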
Here, we adopt `RepeatDataset` to avoid frequent dataloader reloading, which accelerates the training procedure. As shown in the example, `pipeline` defines the data pipeline for processing images, including loading from the file system, resizing, cropping, and converting to `torch.Tensor`. All supported data pipelines can be found in `mmgen/datasets/pipelines`.
For unconditional GANs with dynamic architectures, like PGGAN and StyleGANv1, `GrowScaleImgDataset` is recommended for training. Since such dynamic architectures need real images at different scales, directly adopting `UnconditionalImageDataset` would bring heavy I/O costs from loading multiple high-resolution images. Here is an example we use for training PGGAN on the CelebA-HQ dataset:
```python
dataset_type = 'GrowScaleImgDataset'
train_pipeline = [
dict(
type='LoadImageFromFile',
key='real_img',
io_backend='disk',
),
dict(type='Flip', keys=['real_img'], direction='horizontal'),
dict(
type='Normalize',
keys=['real_img'],
mean=[127.5] * 3,
std=[127.5] * 3,
to_rgb=False),
dict(type='ImageToTensor', keys=['real_img']),
dict(type='Collect', keys=['real_img'], meta_keys=['real_img_path'])
]
# `samples_per_gpu` and `imgs_root` need to be set.
data = dict(
# samples per gpu should be the same as the first scale, e.g. '4': 64
# in this case
samples_per_gpu=64,
workers_per_gpu=4,
train=dict(
type=dataset_type,
# just an example
imgs_roots={
'64': './data/celebahq/imgs_64',
'256': './data/celebahq/imgs_256',
'512': './data/celebahq/imgs_512',
'1024': './data/celebahq/imgs_1024'
},
pipeline=train_pipeline,
gpu_samples_base=4,
# note that this should be changed with total gpu number
gpu_samples_per_scale={
'4': 64,
'8': 32,
'16': 16,
'32': 8,
'64': 4
},
len_per_stage=300000))
```
In this dataset, you should provide a dictionary of image paths to `imgs_roots`, which means you should resize the images in advance (a sketch follows below). For the resizing in the data pre-processing, we adopt bilinear interpolation in all of the experiments studied in MMGeneration.
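As a hedged sketch of preparing those multi-scale copies in advance (assuming Pillow is installed, the source images are PNGs, and the paths follow the config above):
```python
from pathlib import Path

from PIL import Image

src = Path('./data/celebahq/imgs_1024')
for size in (64, 256, 512):
    dst = Path(f'./data/celebahq/imgs_{size}')
    dst.mkdir(parents=True, exist_ok=True)
    for img_path in src.glob('*.png'):
        # bilinear interpolation, as adopted in our experiments
        img = Image.open(img_path).resize((size, size), Image.BILINEAR)
        img.save(dst / img_path.name)
```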
Note that this dataset should be used with `PGGANFetchDataHook`, which should be added to the customized hooks in the config file, as shown below.
```python
custom_hooks = [
dict(
type='VisualizeUnconditionalSamples',
output_dir='training_samples',
interval=5000),
dict(type='PGGANFetchDataHook', interval=1),
dict(
type='ExponentialMovingAverageHook',
module_keys=('generator_ema', ),
interval=1,
priority='VERY_HIGH')
]
```
This data-fetching hook helps the dataloader update the dataset status, so that the data source and batch size change as training progresses.
## Datasets for image translation models
**Data preparation for translation models** needs a little attention. You should organize the files as described in `quick_run.md`. Fortunately, most official datasets, like facades and summer2winter_yosemite, already have the right format. You should also set a symlink in the `data` directory. For translation models trained with paired data, like Pix2Pix, `PairedImageDataset` is designed for training. Here is an example config for the facades dataset:
```python
train_dataset_type = 'PairedImageDataset'
val_dataset_type = 'PairedImageDataset'
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
dict(
type='LoadPairedImageFromFile',
io_backend='disk',
key='pair',
flag='color'),
dict(
type='Resize',
keys=['img_a', 'img_b'],
scale=(286, 286),
interpolation='bicubic'),
dict(type='FixedCrop', keys=['img_a', 'img_b'], crop_size=(256, 256)),
dict(type='Flip', keys=['img_a', 'img_b'], direction='horizontal'),
dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
dict(
type='Normalize',
keys=['img_a', 'img_b'],
to_rgb=False,
**img_norm_cfg),
dict(type='ImageToTensor', keys=['img_a', 'img_b']),
dict(
type='Collect',
keys=['img_a', 'img_b'],
meta_keys=['img_a_path', 'img_b_path'])
]
test_pipeline = [
dict(
type='LoadPairedImageFromFile',
io_backend='disk',
key='pair',
flag='color'),
dict(
type='Resize',
keys=['img_a', 'img_b'],
scale=(256, 256),
interpolation='bicubic'),
dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
dict(
type='Normalize',
keys=['img_a', 'img_b'],
to_rgb=False,
**img_norm_cfg),
dict(type='ImageToTensor', keys=['img_a', 'img_b']),
dict(
type='Collect',
keys=['img_a', 'img_b'],
meta_keys=['img_a_path', 'img_b_path'])
]
dataroot = 'data/paired/facades'
data = dict(
samples_per_gpu=1,
workers_per_gpu=4,
drop_last=True,
train=dict(
type=train_dataset_type,
dataroot=dataroot,
pipeline=train_pipeline,
test_mode=False),
val=dict(
type=val_dataset_type,
dataroot=dataroot,
pipeline=test_pipeline,
test_mode=True),
test=dict(
type=val_dataset_type,
dataroot=dataroot,
pipeline=test_pipeline,
test_mode=True))
```
Here, we adopt `LoadPairedImageFromFile` to load a paired image as a common loader does, and then crop
it into two images of the same shape from different domains. As shown in the example, `pipeline` defines the data pipeline for processing images, including loading from the file system, resizing, cropping, flipping, and converting to `torch.Tensor`. All supported data pipelines can be found in `mmgen/datasets/pipelines`.
For translation models trained with unpaired data, like CycleGAN, `UnpairedImageDataset` is designed for training. Here is an example config for the horse2zebra dataset:
```python
train_dataset_type = 'UnpairedImageDataset'
val_dataset_type = 'UnpairedImageDataset'
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
dict(
type='LoadImageFromFile', io_backend='disk', key='img_a',
flag='color'),
dict(
type='LoadImageFromFile', io_backend='disk', key='img_b',
flag='color'),
dict(
type='Resize',
keys=['img_a', 'img_b'],
scale=(286, 286),
interpolation='bicubic'),
dict(
type='Crop',
keys=['img_a', 'img_b'],
crop_size=(256, 256),
random_crop=True),
dict(type='Flip', keys=['img_a'], direction='horizontal'),
dict(type='Flip', keys=['img_b'], direction='horizontal'),
dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
dict(
type='Normalize',
keys=['img_a', 'img_b'],
to_rgb=False,
mean=[0.5, 0.5, 0.5],
std=[0.5, 0.5, 0.5]),
dict(type='ImageToTensor', keys=['img_a', 'img_b']),
dict(
type='Collect',
keys=['img_a', 'img_b'],
meta_keys=['img_a_path', 'img_b_path'])
]
test_pipeline = [
dict(
type='LoadImageFromFile', io_backend='disk', key='img_a',
flag='color'),
dict(
type='LoadImageFromFile', io_backend='disk', key='img_b',
flag='color'),
dict(
type='Resize',
keys=['img_a', 'img_b'],
scale=(256, 256),
interpolation='bicubic'),
dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
dict(
type='Normalize',
keys=['img_a', 'img_b'],
to_rgb=False,
mean=[0.5, 0.5, 0.5],
std=[0.5, 0.5, 0.5]),
dict(type='ImageToTensor', keys=['img_a', 'img_b']),
dict(
type='Collect',
keys=['img_a', 'img_b'],
meta_keys=['img_a_path', 'img_b_path'])
]
data_root = './data/horse2zebra/'
data = dict(
samples_per_gpu=1,
workers_per_gpu=4,
drop_last=True,
train=dict(
type=train_dataset_type,
dataroot=data_root,
pipeline=train_pipeline,
test_mode=False),
val=dict(
type=val_dataset_type,
dataroot=data_root,
pipeline=test_pipeline,
test_mode=True),
test=dict(
type=val_dataset_type,
dataroot=data_root,
pipeline=test_pipeline,
test_mode=True))
```
Here, `UnpairedImageDataset` loads the two images (domain A and domain B) from different paths and transforms them at the same time.
# Tutorial 4: Design of Our Loss Modules
As shown in the last tutorial for customizing models, `losses` are regarded and registered as `MODULES` in `MMGeneration`. Customizing losses is similar to customizing any other model. This section mainly clarifies the design of the loss modules in our repo. Importantly, when writing your own loss modules, you should follow the same design, so that the new loss module can be adopted in our framework without extra effort.
## Design of loss modules
In general, to implement a loss module, we write a function implementation and then wrap it with a class implementation. In addition, in `MMGeneration`, we provide a unified interface, `data_info`, for users to define the mapping between the input arguments and the data items.
```python
import torch.nn as nn

from mmgen.models import MODULES
from mmgen.models.losses.utils import weighted_loss

@weighted_loss
def disc_shift_loss(pred):
    return pred**2

@MODULES.register_module()
class DiscShiftLoss(nn.Module):

    def __init__(self, loss_weight=1.0, data_info=None):
        super(DiscShiftLoss, self).__init__()
        # codes can be found in ``mmgen/models/losses/disc_auxiliary_loss.py``

    def forward(self, *args, **kwargs):
        # codes can be found in ``mmgen/models/losses/disc_auxiliary_loss.py``
        pass
```
The goal of this design is to allow the loss module to be used automatically in the generative models (`MODELS`), without extra code defining the mapping between data and keyword arguments. Thus, different from other `OpenMMLab` frameworks, our loss modules contain a special keyword, `data_info`, a dictionary defining the mapping between the input arguments and the data from the generative model. Taking `DiscShiftLoss` as an example, when writing the config file, users may use this loss as follows:
```python
dict(type='DiscShiftLoss',
    loss_weight=0.001 * 0.5,
    data_info=dict(pred='disc_pred_real'))
```
The information in `data_info` tells the module to use the `disc_pred_real` data as the input tensor for the `pred` argument. When `data_info` is not `None`, our loss module will automatically build up the computational graph.
```python
@MODULES.register_module()
class DiscShiftLoss(nn.Module):
def __init__(self, loss_weight=1.0, data_info=None):
super(DiscShiftLoss, self).__init__()
self.loss_weight = loss_weight
self.data_info = data_info
def forward(self, *args, **kwargs):
# use data_info to build computational path
if self.data_info is not None:
# parse the args and kwargs
if len(args) == 1:
assert isinstance(args[0], dict), (
'You should offer a dictionary containing network outputs '
'for building up computational graph of this loss module.')
outputs_dict = args[0]
elif 'outputs_dict' in kwargs:
assert len(args) == 0, (
                    'If the outputs dict is given in keyword arguments, no'
                    ' further non-keyword arguments should be offered.')
outputs_dict = kwargs.pop('outputs_dict')
else:
raise NotImplementedError(
                    'Cannot parse the arguments passed to this loss module.'
                    ' Please check the usage of this module.')
# link the outputs with loss input args according to self.data_info
loss_input_dict = {
k: outputs_dict[v]
for k, v in self.data_info.items()
}
kwargs.update(loss_input_dict)
kwargs.update(dict(weight=self.loss_weight))
return disc_shift_loss(**kwargs)
else:
            # if you have not defined how to build the computational graph,
            # this module will just directly return the loss as usual.
return disc_shift_loss(*args, weight=self.loss_weight, **kwargs)
@staticmethod
def loss_name():
return 'loss_disc_shift'
```
As shown in this code, once users set `data_info`, the loss module receives a dictionary containing all of the necessary data and modules, which is provided by the `MODELS` in the training procedure. If this dictionary is given as a non-keyword argument, it must be the first argument. If you use a keyword argument, please name it `outputs_dict`.
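In other words, both calling conventions below reach the same code path; `data_dict_` stands for whatever dictionary the model has collected:
```python
loss = loss_module(data_dict_)               # dict as the first positional argument
loss = loss_module(outputs_dict=data_dict_)  # dict as an explicit keyword argument
```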
To build the computational graph, the generative models have to provide a dictionary containing all kinds of data. Taking a close look at any generative model, you will find that we collect all kinds of features and modules into a dictionary. The following code comes from our `ProgressiveGrowingGAN`:
```python
def train_step(self,
               data_batch,
               optimizer,
               ddp_reducer=None,
               running_status=None):
# ...
# get data dict to compute losses for disc
data_dict_ = dict(
iteration=curr_iter,
gen=self.generator,
disc=self.discriminator,
disc_pred_fake=disc_pred_fake,
disc_pred_real=disc_pred_real,
fake_imgs=fake_imgs,
real_imgs=real_imgs,
curr_scale=self.curr_scale[0],
transition_weight=transition_weight,
gen_partial=partial(
self.generator,
curr_scale=self.curr_scale[0],
transition_weight=transition_weight),
disc_partial=partial(
self.discriminator,
curr_scale=self.curr_scale[0],
transition_weight=transition_weight))
loss_disc, log_vars_disc = self._get_disc_loss(data_dict_)
# ...
```
Here, the `_get_disc_loss` defined in [BaseGAN](https://github.com/open-mmlab/mmgeneration/tree/master/mmgen/models/gans/base_gan.py) will help to combine all kinds of losses automatically.
```python
def _get_disc_loss(self, outputs_dict):
# Construct losses dict. If you hope some items to be included in the
# computational graph, you have to add 'loss' in its name. Otherwise,
# items without 'loss' in their name will just be used to print
# information.
losses_dict = {}
# gan loss
losses_dict['loss_disc_fake'] = self.gan_loss(
outputs_dict['disc_pred_fake'], target_is_real=False, is_disc=True)
losses_dict['loss_disc_real'] = self.gan_loss(
outputs_dict['disc_pred_real'], target_is_real=True, is_disc=True)
# disc auxiliary loss
if self.with_disc_auxiliary_loss:
for loss_module in self.disc_auxiliary_losses:
loss_ = loss_module(outputs_dict)
if loss_ is None:
continue
            # the `loss_name()` function returns a name like 'loss_xxx'
if loss_module.loss_name() in losses_dict:
losses_dict[loss_module.loss_name(
)] = losses_dict[loss_module.loss_name()] + loss_
else:
losses_dict[loss_module.loss_name()] = loss_
loss, log_var = self._parse_losses(losses_dict)
return loss, log_var
```
Therefore, as long as users design their loss modules following the same rules, any kind of loss can be inserted into the training of generative models without further modification of the model code. All you need to do is define the `data_info` in the config files.
# Tutorial 3: Customize Models
We basically categorize our supported models into 3 main streams according to tasks:
- Unconditional GANs:
- Static architectures: DCGAN, StyleGANv2
- Dynamic architectures: PGGAN, StyleGANv1
- Image Translation Models: Pix2Pix, CycleGAN
- Internal Learning (Single Image Model): SinGAN
Of course, some methods that focus on the design of loss functions or training schedules, like WGAN-GP, may be adopted by multiple generative models. Different from `MMDetection`, only two basic categories, `MODULES` and `MODELS`, exist in our repo. In other words, each module will be registered as either `MODULES` or `MODELS`.
`MODELS` contains only the topmost wrappers for generative models. It supports the commonly used `train_step` and other sampling interfaces, which can be directly called during training. For static architectures in unconditional GANs, `StaticUnconditionalGAN` is the model you can use to train your generator and discriminator.
All of the other modules in `MMGeneration` will be registered as `MODULES`, including components of loss functions, generators, and discriminators, as sketched in the config below.
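As a hedged sketch of how the two registries meet in a config (the generator/discriminator types and arguments below are illustrative, not a verified recipe):
```python
model = dict(
    type='StaticUnconditionalGAN',  # topmost wrapper, registered in MODELS
    generator=dict(type='DCGANGenerator', output_scale=64),  # registered in MODULES
    discriminator=dict(type='DCGANDiscriminator', input_scale=64),  # MODULES
    gan_loss=dict(type='GANLoss', gan_type='vanilla'))  # MODULES
```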
## Develop new components
In all of the related repos in OpenMMLab, users may follow similar steps to build up a new component:
- Implement a class
- Decorate the class with one of the registers (`MODELS` or `MODULES` in our repo)
- Import this component in related `__init__.py` files
- Modify the configs and train your models
In the following part, we will show how to add a new generator in `MMGeneration`.
### Implement a class
Here is a standard template for defining a new component with PyTorch. Users may insert their own code to define their generator or other components.
```python
import torch.nn as nn


class NewGenerator(nn.Module):

    def __init__(self, *args, **kwargs):
        super(NewGenerator, self).__init__()
        # insert your codes

    def forward(self, x):
        # insert your codes
        pass
```
### Decorate the new class with a register
In this step, users should import the proper register from `MMGeneration` and decorate their new modules with the `register_module` function.
```python
import torch.nn as nn

from mmgen.models import MODULES


@MODULES.register_module()
class NewGenerator(nn.Module):

    def __init__(self, *args, **kwargs):
        super(NewGenerator, self).__init__()
        # insert your codes

    def forward(self, x):
        # insert your codes
        pass
```
### Import new component in `__init__.py`
Only decorating the new class will **NOT** register it into our registry. You must also explicitly import the class in the related `__init__.py` files.
```python
from .new_generators import NewGenerator
__all__ = ['NewGenerator']
```
If you have already imported the module in an `__init__.py` file but the code still raises a `cannot import` error, try importing the module in the parent package's `__init__.py` file instead.
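As a hedged sketch (the `architectures` sub-package path below is assumed for illustration only):
```python
# in mmgen/models/__init__.py, assuming the new module lives in
# mmgen/models/architectures/new_generators.py (hypothetical path)
from .architectures.new_generators import NewGenerator
```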
### Modify config file to use new model
As discussed in the [tutorial for our config system](https://github.com/open-mmlab/mmgeneration/blob/master/docs/en/tutorials/config.md), users are recommended to create a new config file based on existing standard configs. Here, we show how to modify the [StyleGAN2 model](https://github.com/open-mmlab/mmgeneration/blob/master/configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py) with our new generator:
```python
_base_ = ['configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py']
model = dict(generator=dict(type='NewGenerator'))
```
Defining the new config file in this way helps us switch the generator to our new architecture while keeping other configurations unchanged. However, if you do not want to inherit the other arguments defined in the `_base_` config file, you can apply the `_delete_` keyword in this way:
```python
_base_ = ['configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py']
model = dict(generator=dict(_delete_=True, type='NewGenerator'))
```
With `_delete_=True`, MMCV will automatically discard all of the `generator` settings coming from the `_base_` config file. That is, your generator is built just from `dict(generator=dict(type='NewGenerator'))`.
# Tutorial 7: Customize Runtime Settings
## Customize optimization settings
### Customize optimizer supported by PyTorch
We already support the use of all the optimizers implemented by PyTorch; the only modification needed is to change the `optimizer` field of the config files.
For example, if you want to use `Adam` (note that the performance could drop a lot), the modification could be as follows.
```python
optimizer = dict(type='Adam', lr=0.0003, weight_decay=0.0001)
```
To modify the learning rate of the model, the users only need to modify the `lr` in the config of optimizer. The users can directly set arguments following the [API doc](https://pytorch.org/docs/stable/optim.html?highlight=optim#module-torch.optim) of PyTorch.
### Customize self-implemented optimizer
#### 1. Define a new optimizer
A customized optimizer could be defined as follows.
Assume you want to add an optimizer named `MyOptimizer`, which has arguments `a`, `b`, and `c`.
You need to create a new directory named `mmgen/core/optimizer`.
And then implement the new optimizer in a file, e.g., in `mmgen/core/optimizer/my_optimizer.py`:
```python
from .registry import OPTIMIZERS
from torch.optim import Optimizer


@OPTIMIZERS.register_module()
class MyOptimizer(Optimizer):

    def __init__(self, a, b, c):
        pass
```
#### 2. Add the optimizer to registry
To find the `MyOptimizer` module defined above, it should first be imported into the main namespace. There are two options to achieve this.
- Modify `mmgen/core/optimizer/__init__.py` to import it.
The newly defined module should be imported in `mmgen/core/optimizer/__init__.py` so that the registry will
find the new module and add it:
```python
from .my_optimizer import MyOptimizer
```
- Use `custom_imports` in the config to manually import it
```python
custom_imports = dict(imports=['mmgen.core.optimizer.my_optimizer'], allow_failed_imports=False)
```
The module `mmgen.core.optimizer.my_optimizer` will be imported at the beginning of the program and the class `MyOptimizer` is then automatically registered.
Note that only the package containing the class `MyOptimizer` should be imported.
`mmgen.core.optimizer.my_optimizer.MyOptimizer` **cannot** be imported directly.
Actually, users can use a totally different file directory structure using this importing method, as long as the module root can be located in `PYTHONPATH`.
#### 3. Specify the optimizer in the config file
Then you can use `MyOptimizer` in `optimizer` field of config files.
In the configs, the optimizers are defined by the field `optimizer` like the following:
```python
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
```
To use your own optimizer, the field can be changed to
```python
optimizer = dict(type='MyOptimizer', a=a_value, b=b_value, c=c_value)
```
### Customize optimizer constructor
Some models may have parameter-specific settings for optimization, e.g., weight decay for BatchNorm layers.
Users can do such fine-grained parameter tuning by customizing the optimizer constructor.
```python
from mmcv.utils import build_from_cfg
from mmcv.runner.optimizer import OPTIMIZER_BUILDERS, OPTIMIZERS

from mmgen.utils import get_root_logger
from .my_optimizer import MyOptimizer


@OPTIMIZER_BUILDERS.register_module()
class MyOptimizerConstructor(object):

    def __init__(self, optimizer_cfg, paramwise_cfg=None):
        pass

    def __call__(self, model):
        # build `my_optimizer` with parameter-specific settings here
        return my_optimizer
```
The default optimizer constructor is implemented [here](https://github.com/open-mmlab/mmcv/blob/9ecd6b0d5ff9d2172c49a182eaa669e9f27bb8e7/mmcv/runner/optimizer/default_constructor.py#L11), which could also serve as a template for new optimizer constructor.
### Additional settings
Tricks not implemented by the optimizer should be implemented through the optimizer constructor (e.g., setting parameter-wise learning rates) or hooks. We list some common settings that could stabilize or accelerate training. Feel free to create a PR or an issue for more settings.
- __Use gradient clip to stabilize training__:
Some models need gradient clipping to stabilize the training process. An example is below:
```python
optimizer_config = dict(
_delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
```
If your config inherits the base config which already sets the `optimizer_config`, you might need `_delete_=True` to override the unnecessary settings. See the [config documentation](https://mmgeneration.readthedocs.io/en/latest/config.html) for more details.
- __Use momentum schedule to accelerate model convergence__:
We support a momentum scheduler that modifies the model's momentum according to the learning rate, which could make the model converge faster.
Momentum schedulers are usually used together with LR schedulers; for example, the following config is used in 3D detection to accelerate convergence.
For more details, please refer to the implementation of [CyclicLrUpdater](https://github.com/open-mmlab/mmcv/blob/f48241a65aebfe07db122e9db320c31b685dc674/mmcv/runner/hooks/lr_updater.py#L327) and [CyclicMomentumUpdater](https://github.com/open-mmlab/mmcv/blob/f48241a65aebfe07db122e9db320c31b685dc674/mmcv/runner/hooks/momentum_updater.py#L130).
```python
lr_config = dict(
policy='cyclic',
target_ratio=(10, 1e-4),
cyclic_times=1,
step_ratio_up=0.4,
)
momentum_config = dict(
policy='cyclic',
target_ratio=(0.85 / 0.95, 1),
cyclic_times=1,
step_ratio_up=0.4,
)
```
## Customize training schedules
By default, we use a step learning rate with the 1x schedule; this calls [`StepLrUpdaterHook`](https://github.com/open-mmlab/mmcv/blob/f48241a65aebfe07db122e9db320c31b685dc674/mmcv/runner/hooks/lr_updater.py#L153) in MMCV.
We support many other learning rate schedules [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py), such as the `CosineAnnealing` and `Poly` schedules. Here are some examples:
- Poly schedule:
```python
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
```
- CosineAnnealing schedule:
```python
lr_config = dict(
policy='CosineAnnealing',
warmup='linear',
warmup_iters=1000,
warmup_ratio=1.0 / 10,
min_lr_ratio=1e-5)
```
## Customize workflow
Workflow is a list of (phase, epochs) to specify the running order and epochs.
By default, it is set to be
```python
workflow = [('train', 1)]
```
which means running 1 epoch for training.
Sometimes users may want to check some metrics (e.g. loss, accuracy) of the model on the validation set.
In such a case, we can set the workflow as
```python
[('train', 1), ('val', 1)]
```
so that 1 epoch for training and 1 epoch for validation will be run iteratively.
**Note**:
1. The parameters of the model will not be updated during the val epoch.
2. The keyword `total_epochs` in the config only controls the number of training epochs and will not affect the validation workflow.
3. Workflows `[('train', 1), ('val', 1)]` and `[('train', 1)]` will not change the behavior of `EvalHook`, because `EvalHook` is called by `after_train_epoch` and the validation workflow only affects hooks that are called through `after_val_epoch`. Therefore, the only difference between `[('train', 1), ('val', 1)]` and `[('train', 1)]` is that the runner will calculate losses on the validation set after each training epoch.
## Customize hooks
### Customize self-implemented hooks
#### 1. Implement a new hook
There are some occasions when users might need to implement a new hook. MMGeneration supports customized hooks in training (#3395) since v2.3.0. Thus users can implement a hook directly in mmgen or their mmgen-based codebases and use it by only modifying the config for training.
Before v2.3.0, users needed to modify the code to get the hook registered before training starts.
Here we give an example of creating a new hook in mmgen and using it in training.
```python
from mmcv.runner import HOOKS, Hook
@HOOKS.register_module()
class MyHook(Hook):
def __init__(self, a, b):
pass
def before_run(self, runner):
pass
def after_run(self, runner):
pass
def before_epoch(self, runner):
pass
def after_epoch(self, runner):
pass
def before_iter(self, runner):
pass
def after_iter(self, runner):
pass
```
Depending on the functionality of the hook, the users need to specify what the hook will do at each stage of the training in `before_run`, `after_run`, `before_epoch`, `after_epoch`, `before_iter`, and `after_iter`.
#### 2. Register the new hook
Then we need to make `MyHook` imported. Assuming the file is `mmgen/core/utils/my_hook.py`, there are two ways to do that:
- Modify `mmgen/core/utils/__init__.py` to import it.
The newly defined module should be imported in `mmgen/core/utils/__init__.py` so that the registry will
find the new module and add it:
```python
from .my_hook import MyHook
```
- Use `custom_imports` in the config to manually import it
```python
custom_imports = dict(imports=['mmgen.core.utils.my_hook'], allow_failed_imports=False)
```
#### 3. Modify the config
```python
custom_hooks = [
dict(type='MyHook', a=a_value, b=b_value)
]
```
You can also set the priority of the hook by adding the key `priority`, set to `'NORMAL'` or `'HIGHEST'`, as below:
```python
custom_hooks = [
dict(type='MyHook', a=a_value, b=b_value, priority='NORMAL')
]
```
By default, the hook's priority is set as `NORMAL` during registration.
### Use hooks implemented in MMCV
If the hook is already implemented in MMCV, you can directly modify the config to use the hook as below
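For example, a minimal sketch using MMCV's `EMAHook` (any hook already registered in MMCV's `HOOKS` registry can be used the same way):
```python
custom_hooks = [
    # implemented and registered in MMCV, so no extra import is needed
    dict(type='EMAHook')
]
```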
### Modify default runtime hooks
Some common hooks are not registered through `custom_hooks`; they are:
- log_config
- checkpoint_config
- evaluation
- lr_config
- optimizer_config
- momentum_config
Among those hooks, only the logger hook `log_config` has the `VERY_LOW` priority; the others have the `NORMAL` priority.
The above-mentioned tutorials already cover how to modify `optimizer_config`, `momentum_config`, and `lr_config`.
Here we show what we can do with `log_config`, `checkpoint_config`, and `evaluation`.
#### Checkpoint config
The MMCV runner will use `checkpoint_config` to initialize [`CheckpointHook`](https://github.com/open-mmlab/mmcv/blob/9ecd6b0d5ff9d2172c49a182eaa669e9f27bb8e7/mmcv/runner/hooks/checkpoint.py#L9).
```python
checkpoint_config = dict(interval=1)
```
Users can set `max_keep_ckpts` to save only a small number of checkpoints, or decide whether to store the optimizer's state dict via `save_optimizer`. More details of the arguments are [here](https://mmcv.readthedocs.io/en/latest/api.html#mmcv.runner.CheckpointHook).
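For instance, a sketch that keeps only the latest checkpoints and skips the optimizer state dict:
```python
# keep at most 10 checkpoints on disk and do not store the optimizer state
checkpoint_config = dict(
    interval=10000, by_epoch=False, max_keep_ckpts=10, save_optimizer=False)
```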
#### Log config
The `log_config` wraps multiple logger hooks and enables setting intervals. Currently, MMCV supports `WandbLoggerHook`, `MlflowLoggerHook`, and `TensorboardLoggerHook`.
The detailed usages can be found in the [doc](https://mmcv.readthedocs.io/en/latest/api.html#mmcv.runner.LoggerHook).
```python
log_config = dict(
interval=50,
hooks=[
dict(type='TextLoggerHook'),
dict(type='TensorboardLoggerHook')
])
```
#### Evaluation config
The config of `evaluation` will be used to initialize the [`EvalHook`](https://github.com/open-mmlab/mmgeneration/blob/7a404a2c000620d52156774a5025070d9e00d918/mmgen/core/evaluation/eval_hooks.py#L8).
Except for the key `interval`, other arguments, such as `metrics`, will be passed to the hook. Note that in current distributed training, only the `FID` metric has been tested.
```python
evaluation = dict(
type='GenerativeEvalHook',
interval=10000,
metrics=dict(
type='FID',
num_images=50000,
inception_pkl='work_dirs/inception_pkl/ffhq-256-50k-rgb.pkl',
bgr2rgb=True),
sample_kwargs=dict(sample_model='ema'))
```
# Tutorial 6: DDP Training in MMGeneration
In this section, we will discuss the `DDP` (Distributed Data-Parallel) training for generative models, especially for GANs.
## Summary of ways for DDP Training
| DDP Model | find_unused_parameters | Static GANs | Dynamic GANs |
| :--------------------------------: | :--------------------: | :---------: | :----------: |
| MMDDP/PyTorch DDP | False | Error | Error |
| MMDDP/PyTorch DDP | True | Error | Error |
| DDP Wrapper | False | **No Bugs** | Error |
| DDP Wrapper | True | **No Bugs** | **No Bugs** |
| MMDDP/PyTorch DDP + Dynamic Runner | True | **No Bugs** | **No Bugs** |
In this table, we summarize the ways of DDP training for GANs. [`MMDDP/PyTorch DDP`](https://github.com/open-mmlab/mmcv/blob/master/mmcv/parallel/distributed.py) denotes directly wrapping the GAN model (containing the generator, discriminator, and loss modules) with `MMDistributedDataParallel`. However, in this way, we cannot train the GAN models with the adversarial training schedule. The main reason is that we always need to back-propagate the losses through only part of the model (only the discriminator or the generator) in the `train_step` function.
Another way to use DDP is to adopt the [DDP Wrapper](https://github.com/open-mmlab/mmgeneration/tree/master/mmgen/core/ddp_wrapper.py) to wrap each component of the GAN model with `MMDDP`, which is widely used in the current literature, e.g., `MMEditing` and [StyleGAN2-ADA-PyTorch](https://github.com/NVlabs/stylegan2-ada-pytorch). In this way, there is an important argument, `find_unused_parameters`. As shown in the table, users must set this argument to `True` for training dynamic architectures, like PGGAN and StyleGANv1. However, once `find_unused_parameters` is set to `True`, the model will rebuild the bucket for synchronizing gradients and information after each forward pass. This step helps the backward procedure track which tensors are needed in the current computational graph.
In `MMGeneration`, we design another way for users to adopt `DDP` training, i.e., `MMDDP/PyTorch DDP + Dynamic Runner`. Before specifying the details of this new design, we first clarify why users should switch to it. Although `DDP Wrapper` does achieve training of dynamic GANs, we still spot some inconveniences and disadvantages:
- `DDP Wrapper` prevents users from directly calling functions or accessing attributes of the components in GANs, e.g., the generator and discriminator. After adopting `DDP Wrapper`, if we want to call a function of `generator`, we have to use `generator.module.xxx()`.
- `DDP Wrapper` causes redundant bucket rebuilding. The reason `DDP Wrapper` avoids DDP errors is that each component in the GAN model rebuilds the bucket for backward right after calling its `forward` function. However, as is well known in the GAN literature, there are many cases in which we need not build a bucket for backward, e.g., building the bucket for the generator when updating the discriminator.
To solve these issues, we sought a way to directly adopt `MMDDP` while supporting dynamic GAN training. In `MMGeneration`, `DynamicIterBasedRunner` helps us achieve this. Importantly, fewer than 10 lines of modification solve the problem.
## MMDDP/PyTorch DDP + Dynamic Runner
The key point of adopting DDP in static/dynamic GAN training is to construct (or check) the bucket used for backward before each backward pass (discriminator backward and generator backward), since the parameters that need gradients in these two passes come from different parts of the GAN model. Thus, our solution is simply to rebuild the bucket explicitly right before each backward pass.
In [mmgen/core/runners/dynamic_iterbased_runner.py](https://github.com/open-mmlab/mmgeneration/tree/master/mmgen/core/runners/dynamic_iterbased_runner.py), we obtain the `reducer` by using a **private PyTorch API**:
```python
if self.is_dynamic_ddp:
kwargs.update(dict(ddp_reducer=self.model.reducer))
outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
```
The reducer helps us rebuild the bucket for the current backward path; we just add this line to the `train_step` function:
```python
if ddp_reducer is not None:
ddp_reducer.prepare_for_backward(_find_tensors(loss_disc))
```
A complete use case is:
```python
loss_disc, log_vars_disc = self._get_disc_loss(data_dict_)
# prepare for backward in ddp. If you do not call this function before
# back propagation, the ddp will not dynamically find the used params
# in current computation.
if ddp_reducer is not None:
ddp_reducer.prepare_for_backward(_find_tensors(loss_disc))
loss_disc.backward()
```
That is, users should add the reducer preparation between the loss calculation and the loss backward.
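The generator update follows the same pattern. A hedged sketch, assuming a `_get_gen_loss` helper analogous to the `_get_disc_loss` shown earlier:
```python
loss_gen, log_vars_gen = self._get_gen_loss(data_dict_)
# rebuild the bucket for the generator's backward path as well
if ddp_reducer is not None:
    ddp_reducer.prepare_for_backward(_find_tensors(loss_gen))
loss_gen.backward()
```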
In `MMGeneration`, this feature is adopted as the default way to train DDP models. In the configs, users only need to add the following configuration to use the dynamic DDP runner:
```python
# use dynamic runner
runner = dict(
type='DynamicIterBasedRunner',
is_dynamic_ddp=True,
pass_training_status=True)
```
*We have to admit that this implementation uses a private interface in PyTorch, and we will keep maintaining this feature.*
## DDP Wrapper
Of course, we still support using the `DDP Wrapper` to train your GANs. If you want to switch to the `DDP Wrapper`, you should modify the config file like this:
```python
# use ddp wrapper for faster training
use_ddp_wrapper = True
find_unused_parameters = True # True for dynamic model, False for static model
runner = dict(
type='DynamicIterBasedRunner',
is_dynamic_ddp=False, # Note that this flag should be False.
pass_training_status=True)
```
In [`dcgan config file`](https://github.com/open-mmlab/mmgeneration/tree/master/configs/dcgan/dcgan_celeba-cropped_64_b128x1_300k.py), we have already provided an example for using `DDPWrapper` in MMGeneration.
# Tutorial 5: How to Extract Inception State for FID Evaluation
In MMGeneration, we provide a [script](https://github.com/open-mmlab/mmgeneration/blob/master/tools/utils/inception_stat.py) to extract the inception state of the dataset. In this doc, we provide a brief introduction on how to use this script.
<!-- TOC -->
- [Load images](#load-images)
- [Load from directory](#load-from-directory)
- [Load with dataset config](#load-with-dataset-config)
- [Define the version of Inception Net](#define-the-version-of-inception-net)
- [Control number of images to calculate inception state](#control-number-of-images-to-calculate-inception-state)
- [Control the shuffle operation in data loading](#control-the-shuffle-operation-in-data-loading)
- [Note on inception state extraction between various code bases](#note-on-inception-state-extraction-between-various-code-bases)
<!-- TOC -->
## Load Images
We provide two ways to load real data: passing the path of a directory containing real images, or passing the dataset config file you want to use.
### Load from Directory
If you want to pass the path of real images, you can use the `--imgsdir` argument as in the following command.
```shell
python tools/utils/inception_stat.py --imgsdir ${IMGS_PATH} --pklname ${PKLNAME} --size ${SIZE} --flip ${FLIP}
```
Then a pre-defined pipeline will be used to load images in `${IMGS_PATH}`.
```python
pipeline = [
dict(type='LoadImageFromFile', key='real_img'),
dict(
type='Resize', keys=['real_img'], scale=SIZE,
keep_ratio=False),
dict(
type='Normalize',
keys=['real_img'],
mean=[127.5] * 3,
std=[127.5] * 3,
to_rgb=True), # default to RGB images
dict(type='Collect', keys=['real_img'], meta_keys=[]),
dict(type='ImageToTensor', keys=['real_img'])
]
```
If `${FLIP}` is set to `True`, the following horizontal flip operation will be appended to the end of the pipeline.
```python
dict(type='Flip', keys=['real_img'], direction='horizontal')
```
If you want to use a specific pipeline instead of the pre-defined ones, you can use `--pipeline-cfg` to pass a config file containing the data pipeline you want to use.
```shell
python tools/utils/inception_stat.py --imgsdir ${IMGS_PATH} --pklname ${PKLNAME} --pipeline-cfg ${PIPELINE}
```
Note that the name of the pipeline dict in `${PIPELINE}` must be `inception_pipeline`. For example,
```python
# an example of ${PIPELINE}
inception_pipeline = [
dict(type='LoadImageFromFile', key='real_img'),
...
]
```
### Load with Dataset Config
If you want to use a dataset config, you can use the `--data-config` argument as in the following command.
```shell
python tools/utils/inception_stat.py --data-config ${CONFIG} --pklname ${PKLNAME} --subset ${SUBSET}
```
Then a dataset will be instantiated following `${SUBSET}` in the config, which defaults to `test`. Take the following dataset config as an example:
```python
# from `imagenet_128x128_inception_stat.py`
data = dict(
samples_per_gpu=None,
workers_per_gpu=2,
train=dict(
type=dataset_type,
data_prefix='data/imagenet/train',
pipeline=train_pipeline),
val=dict(
type=dataset_type,
data_prefix='data/imagenet/val',
ann_file='data/imagenet/meta/val.txt',
pipeline=test_pipeline),
test=dict(
type=dataset_type,
data_prefix='data/imagenet/val',
ann_file='data/imagenet/meta/val.txt',
pipeline=test_pipeline))
```
If `--subset` is not specified, the config in `data['test']` is used in the data loading process. If you want to extract the inception state of the training set, you can set `--subset train` in the command. The dataset will then be built under the guidance of the config in `data['train']`, so images under `data/imagenet/train` and the `train_pipeline` processing pipeline will be used.
## Define the Version of Inception Net
In the aforementioned commands, the script takes the [PyTorch InceptionV3](https://github.com/pytorch/vision/blob/main/torchvision/models/inception.py) by default. If you want [Tero's InceptionV3](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/inception-2015-12-05.pt), you need to switch to the script module:
```shell
python tools/utils/inception_stat.py --imgsdir ${IMGS_PATH} --pklname ${PKLNAME} --size ${SIZE} \
--inception-style stylegan --inception-pth ${PATH_SCRIPT_MODULE}
```
## Control Number of Images to Calculate Inception State
In `inception_stat.py`, we provide the `--num-samples` argument to control the number of images used to calculate the inception state.
```shell
python tools/utils/inception_stat.py --data-config ${CONFIG} --pklname ${PKLNAME} --num-samples ${NUMS}
```
If `${NUMS}` is set as `-1`, all images in the defined dataset would be used.
## Control the Shuffle Operation in Data Loading
In `inception_stat.py`, we provide the `--no-shuffle` argument to skip the shuffle operation in the image loading process. For example, you can use the following command:
```shell
python tools/utils/inception_stat.py --data-config ${CONFIG} --pklname ${PKLNAME} --no-shuffle
```
## Note on Inception State Extraction between Various Code Bases
For FID evaluation, the differences between [PyTorch Studio GAN](https://github.com/POSTECH-CVLab/PyTorch-StudioGAN) and ours lie mainly in the selection of real samples. In MMGen, we follow the pipeline of [BigGAN](https://github.com/ajbrock/BigGAN-PyTorch), where the whole training set is adopted to extract inception statistics. Besides, we also use [Tero's Inception](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/inception-2015-12-05.pt) for feature extraction.
You can download the preprocessed inception states from the following URLs:
- [CIFAR10](https://download.openmmlab.com/mmgen/evaluation/fid_inception_pkl/cifar10.pkl)
- [ImageNet1k](https://download.openmmlab.com/mmgen/evaluation/fid_inception_pkl/imagenet.pkl)
- [ImageNet1k-64x64](https://download.openmmlab.com/mmgen/evaluation/fid_inception_pkl/imagenet_64x64.pkl)
You can also use the following commands to extract those inception states yourself.
```shell
# For CIFAR10
python tools/utils/inception_stat.py --data-cfg configs/_base_/datasets/cifar10_inception_stat.py --pklname cifar10.pkl --no-shuffle --inception-style stylegan --num-samples -1 --subset train
# For ImageNet1k
python tools/utils/inception_stat.py --data-cfg configs/_base_/datasets/imagenet_128x128_inception_stat.py --pklname imagenet.pkl --no-shuffle --inception-style stylegan --num-samples -1 --subset train
# For ImageNet1k-64x64
python tools/utils/inception_stat.py --data-cfg configs/_base_/datasets/imagenet_64x64_inception_stat.py --pklname imagenet_64x64.pkl --no-shuffle --inception-style stylegan --num-samples -1 --subset train
```
.. toctree::
:maxdepth: 2
config.md
customize_dataset.md
customize_models.md
customize_losses.md
inception_stat.md
ddp_train_gans.md
customize_runtime.md
applications.md
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)