Commit e83c5363 authored by Patrick von Platen's avatar Patrick von Platen
Browse files
parents f09defd3 e6c4c72e
# Diffusers
## Definitions
**Models**: Single neural network that models p_θ(x_t-1|x_t) and is trained to “denoise” to image
*Examples: UNet, Conditioned UNet, 3D UNet, Transformer UNet*
![model_diff_1_50](https://user-images.githubusercontent.com/23423619/171610307-dab0cd8b-75da-4d4e-9f5a-5922072e2bb5.png)
**Samplers**: Algorithm to *train* and *sample* from **Model**. Defines alpha and beta schedule, timesteps, etc..
*Example: Vanilla DDPM, DDIM, PMLS, DEIN*
![sampling](https://user-images.githubusercontent.com/23423619/171608981-3ad05953-a684-4c82-89f8-62a459147a07.png)
![training](https://user-images.githubusercontent.com/23423619/171608964-b3260cce-e6b4-4841-959d-7d8ba4b8d1b2.png)
**Diffusion Pipeline**: End-to-end pipeline that includes multiple diffusion models, possible text encoders, CLIP
*Example: GLIDE,CompVis/Latent-Diffusion, Imagen, DALL-E*
![imagen](https://user-images.githubusercontent.com/23423619/171609001-c3f2c1c9-f597-4a16-9843-749bf3f9431c.png)
## Library structure:
```
├── models
│   ├── dalle2
│   │   ├── modeling_dalle2.py
│   │   ├── README.md
│   │   └── run_dalle2.py
│   ├── ddpm
│   │   ├── modeling_ddpm.py
│   │   ├── README.md
│   │   └── run_ddpm.py
│   ├── glide
│   │   ├── modeling_glide.py
│   │   ├── README.md
│   │   └── run_dalle2.py
│   ├── imagen
│   │   ├── modeling_dalle2.py
│   │   ├── README.md
│   │   └── run_dalle2.py
│   └── latent_diffusion
│   ├── modeling_latent_diffusion.py
│   ├── README.md
│   └── run_latent_diffusion.py
│   ├── audio
│   │   └── fastdiff
│   │   ├── modeling_fastdiff.py
│   │   ├── README.md
│   │   └── run_fastdiff.py
│   └── vision
│   ├── dalle2
│   │   ├── modeling_dalle2.py
│   │   ├── README.md
│   │   └── run_dalle2.py
│   ├── ddpm
│   │   ├── modeling_ddpm.py
│   │   ├── README.md
│   │   └── run_ddpm.py
│   ├── glide
│   │   ├── modeling_glide.py
│   │   ├── README.md
│   │   └── run_dalle2.py
│   ├── imagen
│   │   ├── modeling_dalle2.py
│   │   ├── README.md
│   │   └── run_dalle2.py
│   └── latent_diffusion
│   ├── modeling_latent_diffusion.py
│   ├── README.md
│   └── run_latent_diffusion.py
├── src
│   └── diffusers
│   ├── configuration_utils.py
......@@ -38,7 +63,14 @@
│   └── test_modeling_utils.py
```
## Dummy Example
## 1. `diffusers` as a central modular diffusion and sampler library
`diffusers` should be more modularized than `transformers` so that parts of it can be easily used in other libraries.
It could become a central place for all kinds of models, samplers, training utils and processors required when using diffusion models in audio, vision, ...
One should be able to save both models and samplers as well as load them from the Hub.
Example:
```python
from diffusers import UNetModel, GaussianDiffusion
import torch
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment