Merge branch 'main' of https://github.com/huggingface/diffusers

e83c5363 · Patrick von Platen · f09defd3 · e6c4c72e · e83c5363
Commit e83c5363 authored Jun 02, 2022 by Patrick von Platen

Showing with 53 additions and 21 deletions

README.md +53 -21

--- a/README.md
+++ b/README.md
 # Diffusers

+## Definitions
+
+**Models**: A single neural network that models p_θ(x_{t-1} | x_t) and is trained to “denoise” an image.
+*Examples: UNet, Conditioned UNet, 3D UNet, Transformer UNet*
+
+![model_diff_1_50](https://user-images.githubusercontent.com/23423619/171610307-dab0cd8b-75da-4d4e-9f5a-5922072e2bb5.png)
+
+**Samplers**: Algorithm to *train* a **Model** and *sample* from it. Defines the alpha and beta schedules, timesteps, etc.
+*Examples: Vanilla DDPM, DDIM, PLMS, DEIN*
+
+![sampling](https://user-images.githubusercontent.com/23423619/171608981-3ad05953-a684-4c82-89f8-62a459147a07.png)
+![training](https://user-images.githubusercontent.com/23423619/171608964-b3260cce-e6b4-4841-959d-7d8ba4b8d1b2.png)
+
+**Diffusion Pipeline**: End-to-end pipeline that includes multiple diffusion models and, possibly, text encoders and CLIP.
+*Examples: GLIDE, CompVis/Latent-Diffusion, Imagen, DALL-E*
+
+![imagen](https://user-images.githubusercontent.com/23423619/171609001-c3f2c1c9-f597-4a16-9843-749bf3f9431c.png)
+
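As an illustration of what a sampler "defines" in the definitions above, here is a minimal sketch of a linear beta schedule, the cumulative alphas derived from it, and the forward noising step used during training. It is not part of the commit and not the `diffusers` API: the function names are hypothetical, and the defaults (1000 timesteps, betas from 1e-4 to 0.02) are assumed from the vanilla DDPM setup. Plain Python lists stand in for tensors.

```python
import math
import random


def linear_beta_schedule(num_timesteps, beta_start=1e-4, beta_end=0.02):
    """Return betas linearly spaced from beta_start to beta_end (assumed DDPM defaults)."""
    if num_timesteps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_timesteps - 1)
    return [beta_start + i * step for i in range(num_timesteps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t = prod_{s <= t} (1 - beta_s) for every timestep t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + sigma * n for x, n in zip(x0, noise)]
    # The noise is returned as well because it is the usual regression target
    # for the model during training.
    return x_t, noise


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]
x_t, noise = add_noise(x0, t=500, alpha_bars=alpha_bars, rng=rng)
```

With this schedule, `alpha_bars` decreases monotonically toward zero, so late timesteps are almost pure noise. The model's job, in the sense of the **Models** definition, is to learn the reverse direction from `x_t` back toward `x0`.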
 ## Library structure:

 ```
 ├── models
-│   ├── dalle2
-│   │   ├── modeling_dalle2.py
-│   │   ├── README.md
-│   │   └── run_dalle2.py
-│   ├── ddpm
-│   │   ├── modeling_ddpm.py
-│   │   ├── README.md
-│   │   └── run_ddpm.py
-│   ├── glide
-│   │   ├── modeling_glide.py
-│   │   ├── README.md
-│   │   └── run_dalle2.py
-│   ├── imagen
-│   │   ├── modeling_dalle2.py
-│   │   ├── README.md
-│   │   └── run_dalle2.py
-│   └── latent_diffusion
-│       ├── modeling_latent_diffusion.py
-│       ├── README.md
-│       └── run_latent_diffusion.py
+│   ├── audio
+│   │   └── fastdiff
+│   │       ├── modeling_fastdiff.py
+│   │       ├── README.md
+│   │       └── run_fastdiff.py
+│   └── vision
+│       ├── dalle2
+│       │   ├── modeling_dalle2.py
+│       │   ├── README.md
+│       │   └── run_dalle2.py
+│       ├── ddpm
+│       │   ├── modeling_ddpm.py
+│       │   ├── README.md
+│       │   └── run_ddpm.py
+│       ├── glide
+│       │   ├── modeling_glide.py
+│       │   ├── README.md
+│       │   └── run_dalle2.py
+│       ├── imagen
+│       │   ├── modeling_dalle2.py
+│       │   ├── README.md
+│       │   └── run_dalle2.py
+│       └── latent_diffusion
+│           ├── modeling_latent_diffusion.py
+│           ├── README.md
+│           └── run_latent_diffusion.py
+
 ├── src
 │   └── diffusers
 │       ├── configuration_utils.py
@@ -38,7 +63,14 @@
 │   └── test_modeling_utils.py
 ```

-## Dummy Example
+## 1. `diffusers` as a central modular diffusion and sampler library
+
+`diffusers` should be more modular than `transformers` so that parts of it can easily be used in other libraries.
+It could become a central place for all kinds of models, samplers, training utils and processors required when using diffusion models in audio, vision, ...
+It should be possible to save both models and samplers and to load them from the Hub.
+
+Example:
+
 ```python
 from diffusers import UNetModel, GaussianDiffusion
 import torch