Unverified Commit 8aac1f99 authored by apolinario, committed by GitHub

v1-5 docs updates (#921)



* Update README.md

Additionally add FLAX so the model card can be slimmer and point to this page

* Find and replace all

* v-1-5 -> v1-5

* revert test changes

* Update README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/quicktour.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update README.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/quicktour.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update README.md
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Revert certain references to v1-5

* Docs changes

* Apply suggestions from code review
Co-authored-by: apolinario <joaopaulo.passos+multimodal@gmail.com>
Co-authored-by: anton-l <anton@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
parent 2c82e0c4
@@ -64,44 +64,54 @@ In order to get started, we recommend taking a look at two notebooks:
- The [Training a diffusers model](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) notebook summarizes diffusion model training methods. This notebook takes a step-by-step approach to training your
diffusion models on an image dataset, with explanatory graphics.
## Stable Diffusion is fully compatible with `diffusers`!
Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/), [LAION](https://laion.ai/) and [RunwayML](https://runwayml.com/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 4GB VRAM.
See the [model card](https://huggingface.co/CompVis/stable-diffusion) for more information.
You need to accept the model license before downloading or using the Stable Diffusion weights. Please visit the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license carefully and tick the checkbox if you agree. You have to be a registered user on the 🤗 Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section](https://huggingface.co/docs/hub/security-tokens) of the documentation.
### Text-to-Image generation with Stable Diffusion
First, let's install the required libraries:
```bash
pip install --upgrade diffusers transformers scipy
```
Run the following command to log in with your HF Hub token if you haven't done so before (you can skip this step if you prefer to run the model locally; in that case, follow [these instructions](#running-the-model-locally) instead):
```bash
huggingface-cli login
```
We recommend using the model in [half-precision (`fp16`)](https://pytorch.org/blog/accelerating-training-on-nvidia-gpus-with-pytorch-automatic-mixed-precision/) as it almost always gives the same results as full
precision while being roughly twice as fast and requiring half the amount of GPU RAM.
```python
# make sure you're logged in with `huggingface-cli login`
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, revision="fp16")
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
```
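The result is a standard PIL image, so you can save it directly (the filename is just an example):

```python
image.save("astronaut_rides_horse.png")
```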
#### Running the model locally

If you don't want to log in to Hugging Face, you can also simply download the model folder
(after having [accepted the license](https://huggingface.co/runwayml/stable-diffusion-v1-5)) and pass
the path to the local folder to the `StableDiffusionPipeline`.
```
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
```
Assuming the folder is stored locally under `./stable-diffusion-v1-5`, you can also run Stable Diffusion
without requiring an authentication token:
```python
pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-5")
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
@@ -114,7 +124,7 @@ The following snippet should result in less than 4GB VRAM.
```python
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="fp16",
    torch_dtype=torch.float16,
)
@@ -125,7 +135,7 @@ pipe.enable_attention_slicing()
image = pipe(prompt).images[0]
```
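If you want to verify the memory claim on your own hardware, you can read back PyTorch's peak allocation after a generation; a small sketch using only standard `torch` calls:

```python
import torch

torch.cuda.reset_peak_memory_stats()
image = pipe(prompt).images[0]
# peak allocation across the whole generation, in GiB
print(f"peak GPU memory: {torch.cuda.max_memory_allocated() / 2**30:.2f} GiB")
```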
If you wish to use a different scheduler (e.g. DDIM, LMS, PNDM/PLMS), you can instantiate
it before the pipeline and pass it to `from_pretrained`.
```python
@@ -138,7 +148,7 @@ lms = LMSDiscreteScheduler(
)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="fp16",
    torch_dtype=torch.float16,
    scheduler=lms,
@@ -158,7 +168,7 @@ please run the model in the default *full-precision* setting:
# make sure you're logged in with `huggingface-cli login`
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# disable the following line if you run on CPU
pipe = pipe.to("cuda")
@@ -169,6 +179,75 @@ image = pipe(prompt).images[0]
image.save("astronaut_rides_horse.png")
```
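For reproducible outputs you can additionally pass a seeded `torch.Generator` through the pipeline's `generator` argument; a minimal sketch (use a `"cpu"` generator if you run on CPU):

```python
import torch

# same seed -> same image for a given prompt and pipeline configuration
generator = torch.Generator(device="cuda").manual_seed(0)
image = pipe(prompt, generator=generator).images[0]
```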
### JAX/Flax
To use Stable Diffusion on TPUs and GPUs for faster inference, you can leverage JAX/Flax.
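Note that this assumes a working JAX installation for your accelerator (the matching `jaxlib` build for GPU, or a Cloud TPU VM setup) as well as the `flax` package; see the JAX installation guide for details.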
Running the pipeline with the default PNDMScheduler:
```python
import jax
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", revision="flax", dtype=jax.numpy.bfloat16
)
prompt = "a photo of an astronaut riding a horse on mars"
prng_seed = jax.random.PRNGKey(0)
num_inference_steps = 50
num_samples = jax.device_count()
prompt = num_samples * [prompt]
prompt_ids = pipeline.prepare_inputs(prompt)
# shard inputs and rng
params = replicate(params)
prng_seed = jax.random.split(prng_seed, jax.device_count())
prompt_ids = shard(prompt_ids)
images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
```
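`numpy_to_pil` returns an ordinary list of PIL images, one per device in this setup, so saving them is straightforward:

```python
for i, img in enumerate(images):
    img.save(f"astronaut_rides_horse_{i}.png")
```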
**Note**:
If you are limited by TPU memory, please make sure to load the `FlaxStableDiffusionPipeline` in `bfloat16` precision instead of the default `float32` precision as done above. You can do so by telling diffusers to load the weights from the "bf16" branch.
```python
import jax
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jax.numpy.bfloat16
)
prompt = "a photo of an astronaut riding a horse on mars"
prng_seed = jax.random.PRNGKey(0)
num_inference_steps = 50
num_samples = jax.device_count()
prompt = num_samples * [prompt]
prompt_ids = pipeline.prepare_inputs(prompt)
# shard inputs and rng
params = replicate(params)
prng_seed = jax.random.split(prng_seed, jax.device_count())
prompt_ids = shard(prompt_ids)
images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
```
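Note that with `jit=True` the first call compiles the computation and is therefore noticeably slow; subsequent calls with inputs of the same shape reuse the compiled program and run at full speed.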
### Image-to-Image text-guided generation with Stable Diffusion

The `StableDiffusionImg2ImgPipeline` lets you pass a text prompt and an initial image to condition the generation of new images.
@@ -183,14 +262,14 @@ from diffusers import StableDiffusionImg2ImgPipeline
# load the pipeline
device = "cuda"
model_id_or_path = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    model_id_or_path,
    revision="fp16",
    torch_dtype=torch.float16,
)
# or download via git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
# and pass `model_id_or_path="./stable-diffusion-v1-5"`.
pipe = pipe.to(device)

# let's download an initial image
......
@@ -67,8 +67,8 @@ Diffusion models often consist of multiple independently-trained models or other
Each model has been trained independently on a different task and the scheduler can easily be swapped out and replaced with a different one.
During inference, however, we want to be able to easily load all components and use them in inference - even if one component, *e.g.* CLIP's text encoder, originates from a different library, such as [Transformers](https://github.com/huggingface/transformers). To that end, all pipelines provide the following functionality:
- [`from_pretrained` method](../diffusion_pipeline) that accepts a Hugging Face Hub repository id, *e.g.* [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) or a path to a local directory, *e.g.*
"./stable-diffusion". To correctly retrieve which models and components should be loaded, one has to provide a `model_index.json` file, *e.g.* [runwayml/stable-diffusion-v1-5/model_index.json](https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/model_index.json), which defines all components that should be
loaded into the pipelines. More specifically, for each model/component one needs to define the format `<name>: ["<library>", "<class name>"]`. `<name>` is the attribute name given to the loaded instance of `<class name>` which can be found in the library or pipeline folder called `"<library>"`.
- [`save_pretrained`](../diffusion_pipeline) that accepts a local path, *e.g.* `./stable-diffusion` under which all models/components of the pipeline will be saved. For each component/model a folder is created inside the local path that is named after the given attribute name, *e.g.* `./stable_diffusion/unet`.
In addition, a `model_index.json` file is created at the root of the local path, *e.g.* `./stable_diffusion/model_index.json` so that the complete pipeline can again be instantiated
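As a minimal sketch of that round trip (using the checkpoint from above; the local folder name is arbitrary):

```python
from diffusers import StableDiffusionPipeline

# downloads all components and writes them out together with model_index.json
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.save_pretrained("./stable-diffusion-v1-5")

# later, the complete pipeline can be re-instantiated from the local folder
pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-5")
```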
@@ -100,7 +100,7 @@ logic including pre-processing, an unrolled diffusion loop, and post-processing
# make sure you're logged in with `huggingface-cli login`
from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
@@ -123,7 +123,7 @@ from diffusers import StableDiffusionImg2ImgPipeline
# load the pipeline
device = "cuda"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16
).to(device)

# let's download an initial image
......
@@ -56,7 +56,7 @@ If you use a CUDA GPU, you can take advantage of `torch.autocast` to perform inf
from torch import autocast
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
@@ -72,7 +72,7 @@ To save more GPU memory and get even more speed, you can load and run the model
```python
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="fp16",
    torch_dtype=torch.float16,
)
@@ -97,7 +97,7 @@ import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="fp16",
    torch_dtype=torch.float16,
)
@@ -152,7 +152,7 @@ def generate_inputs():
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")
@@ -216,7 +216,7 @@ class UNet2DConditionOutput:
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")
......
@@ -31,7 +31,7 @@ We recommend to "prime" the pipeline using an additional one-time pass through i
# make sure you're logged in with `huggingface-cli login`
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("mps")

prompt = "a photo of an astronaut riding a horse on mars"
......
@@ -28,7 +28,7 @@ The snippet below demonstrates how to use the ONNX runtime. You need to use `Sta
from diffusers import StableDiffusionOnnxPipeline

pipe = StableDiffusionOnnxPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="onnx",
    provider="CUDAExecutionProvider",
)
......
@@ -68,8 +68,7 @@ You can save the image by simply calling:
More advanced models, like [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion), require you to accept a [license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) before running the model.
This is due to the improved image generation capabilities of the model and the potentially harmful content that could be produced with it.
Please head over to your stable diffusion model of choice, *e.g.* [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license carefully and tick the checkbox if you agree.
You have to be a registered user on the 🤗 Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).

Having "click-accepted" the license, you can save your token:
@@ -77,13 +76,13 @@ Having "click-accepted" the license, you can save your token:
AUTH_TOKEN = "<please-fill-with-your-token>"
```
You can then load [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5)
just as we did before, except that now you need to pass your `AUTH_TOKEN`:
```python
>>> from diffusers import DiffusionPipeline

>>> generator = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", use_auth_token=AUTH_TOKEN)
```
If you do not pass your authentication token, you will see that the diffusion system will not be correctly
@@ -95,15 +94,15 @@ the weights locally via:
```
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
```
and then load locally saved weights into the pipeline. This way, you do not need to pass an authentication
token. Assuming that `"./stable-diffusion-v1-5"` is the local path to the cloned stable-diffusion-v1-5 repo,
you can also load the pipeline as follows:
```python
>>> generator = DiffusionPipeline.from_pretrained("./stable-diffusion-v1-5")
```
Running the pipeline is then identical to the code above as it's the same model architecture.
@@ -125,7 +124,7 @@ you could use it as follows:
>>> scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")

>>> generator = StableDiffusionPipeline.from_pretrained(
...     "runwayml/stable-diffusion-v1-5", scheduler=scheduler, use_auth_token=AUTH_TOKEN
... )
```
......
@@ -64,7 +64,7 @@ accelerate config
### Cat toy example
You need to accept the model license before downloading or using the weights. In this example we'll use model version `v1-5`, so you'll need to visit [its card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license and tick the checkbox if you agree.
You have to be a registered user on the 🤗 Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).
@@ -83,7 +83,7 @@ Now let's get our dataset. Download 3-4 images from [here](https://drive.google.c
And launch the training using
```bash
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export DATA_DIR="path-to-dir-containing-images"

accelerate launch textual_inversion.py \
......
@@ -58,7 +58,7 @@ feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_model_id)
clip_model = CLIPModel.from_pretrained(clip_model_id)

pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="clip_guided_stable_diffusion",
    clip_model=clip_model,
    feature_extractor=feature_extractor,
......
@@ -25,7 +25,7 @@ from diffusers import StableDiffusionImg2ImgPipeline
# load the pipeline
device = "cuda"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16
).to(device)

# let's download an initial image
......
@@ -18,7 +18,7 @@ If a community pipeline doesn't work as expected, please open an issue and ping the autho
To load a custom pipeline you just need to pass the `custom_pipeline` argument to `DiffusionPipeline`, as one of the files in `diffusers/examples/community`. Feel free to send a PR with your own pipelines; we will merge them quickly.
```py
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", custom_pipeline="filename_in_the_community_folder")
```
## Example usages
@@ -41,7 +41,7 @@ clip_model = CLIPModel.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K",
guided_pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="clip_guided_stable_diffusion",
    clip_model=clip_model,
    feature_extractor=feature_extractor,
@@ -141,7 +141,7 @@ def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", custom_pipeline="stable_diffusion_mega", torch_dtype=torch.float16, revision="fp16")
pipe.to("cuda")
pipe.enable_attention_slicing()
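Assuming your copy of the mega pipeline exposes the task-specific entry points described in the community folder's README (`text2img`, `img2img`, `inpaint`; check the pipeline file you loaded, as these names are not part of the core API), text-to-image would then look like:

```python
# text-to-image through the mega pipeline (entry-point name per the community README)
images = pipe.text2img("An astronaut riding a horse").images
```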
......
@@ -46,7 +46,7 @@ class StableDiffusionMegaPipeline(DiffusionPipeline):
            [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`].
        safety_checker ([`StableDiffusionMegaSafetyChecker`]):
            Classification module that estimates whether generated images could be considered offensive or harmful.
            Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
        feature_extractor ([`CLIPFeatureExtractor`]):
            Model that extracts features from generated images to be used as inputs for the `safety_checker`.
    """
......
@@ -101,7 +101,7 @@ class ExamplesTestsAccelerate(unittest.TestCase):
        with tempfile.TemporaryDirectory() as tmpdir:
            test_args = f"""
                examples/textual_inversion/textual_inversion.py
                --pretrained_model_name_or_path runwayml/stable-diffusion-v1-5
                --train_data_dir docs/source/imgs
                --learnable_property object
                --placeholder_token <cat-toy>
......
@@ -48,7 +48,7 @@ Now let's get our dataset. Download 3-4 images from [here](https://drive.google.c
And launch the training using
```bash
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export DATA_DIR="path-to-dir-containing-images"

accelerate launch textual_inversion.py \
......
@@ -105,14 +105,14 @@ class FlaxModelMixin:
>>> from diffusers import FlaxUNet2DConditionModel

>>> # load model
>>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5")
>>> # By default, the model parameters will be in fp32 precision, to cast these to bfloat16 precision
>>> params = model.to_bf16(params)
>>> # If you don't want to cast certain parameters (for example layer norm bias and scale)
>>> # then pass the mask as follows
>>> from flax import traverse_util

>>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5")
>>> flat_params = traverse_util.flatten_dict(params)
>>> mask = {
...     path: (path[-2:] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale"))
@@ -141,7 +141,7 @@ class FlaxModelMixin:
>>> from diffusers import FlaxUNet2DConditionModel

>>> # Download model and configuration from huggingface.co
>>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5")
>>> # By default, the model params will be in fp32, to illustrate the use of this method,
>>> # we'll first cast to fp16 and back to fp32
>>> params = model.to_fp16(params)
@@ -171,14 +171,14 @@ class FlaxModelMixin:
>>> from diffusers import FlaxUNet2DConditionModel

>>> # load model
>>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5")
>>> # By default, the model params will be in fp32, to cast these to float16
>>> params = model.to_fp16(params)
>>> # If you don't want to cast certain parameters (for example layer norm bias and scale)
>>> # then pass the mask as follows
>>> from flax import traverse_util

>>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5")
>>> flat_params = traverse_util.flatten_dict(params)
>>> mask = {
...     path: (path[-2:] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale"))
@@ -216,7 +216,7 @@ class FlaxModelMixin:
    - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
      Valid model ids are namespaced under a user or organization name, like
      `runwayml/stable-diffusion-v1-5`.
    - A path to a *directory* containing model weights saved using [`~ModelMixin.save_pretrained`],
      e.g., `./my_model_directory/`.
dtype (`jax.numpy.dtype`, *optional*, defaults to `jax.numpy.float32`):
@@ -273,7 +273,7 @@ class FlaxModelMixin:
>>> from diffusers import FlaxUNet2DConditionModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model, params = FlaxUNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5")
>>> # Model was saved using *save_pretrained('./test/saved_model/')* (for example purposes, not runnable).
>>> model, params = FlaxUNet2DConditionModel.from_pretrained("./test/saved_model/")
```"""
......
@@ -244,7 +244,7 @@ class FlaxDiffusionPipeline(ConfigMixin):
<Tip>

It is required to be logged in (`huggingface-cli login`) when you want to use private or [gated
models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"runwayml/stable-diffusion-v1-5"`

</Tip>
@@ -266,13 +266,13 @@ class FlaxDiffusionPipeline(ConfigMixin):
>>> # Download pipeline that requires an authorization token
>>> # For more information on access tokens, please refer to [this section
>>> # of the documentation](https://huggingface.co/docs/hub/security-tokens)
>>> pipeline = FlaxDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

>>> # Download pipeline, but overwrite scheduler
>>> from diffusers import LMSDiscreteScheduler

>>> scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")
>>> pipeline = FlaxDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", scheduler=scheduler)
```
"""
cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE)
......
@@ -317,7 +317,7 @@ class DiffusionPipeline(ConfigMixin):
<Tip>

It is required to be logged in (`huggingface-cli login`) when you want to use private or [gated
models](https://huggingface.co/docs/hub/models-gated#gated-models), *e.g.* `"runwayml/stable-diffusion-v1-5"`

</Tip>
@@ -339,13 +339,13 @@ class DiffusionPipeline(ConfigMixin):
>>> # Download pipeline that requires an authorization token
>>> # For more information on access tokens, please refer to [this section
>>> # of the documentation](https://huggingface.co/docs/hub/security-tokens)
>>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

>>> # Download pipeline, but overwrite scheduler
>>> from diffusers import LMSDiscreteScheduler

>>> scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")
>>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", scheduler=scheduler)
```
"""
cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE)
......
@@ -55,8 +55,8 @@ Diffusion models often consist of multiple independently-trained models or other
Each model has been trained independently on a different task and the scheduler can easily be swapped out and replaced with a different one.
During inference, however, we want to be able to easily load all components and use them in inference - even if one component, *e.g.* CLIP's text encoder, originates from a different library, such as [Transformers](https://github.com/huggingface/transformers). To that end, all pipelines provide the following functionality:
- [`from_pretrained` method](https://github.com/huggingface/diffusers/blob/5cbed8e0d157f65d3ddc2420dfd09f2df630e978/src/diffusers/pipeline_utils.py#L139) that accepts a Hugging Face Hub repository id, *e.g.* [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) or a path to a local directory, *e.g.*
"./stable-diffusion". To correctly retrieve which models and components should be loaded, one has to provide a `model_index.json` file, *e.g.* [runwayml/stable-diffusion-v1-5/model_index.json](https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/model_index.json), which defines all components that should be
loaded into the pipelines. More specifically, for each model/component one needs to define the format `<name>: ["<library>", "<class name>"]`. `<name>` is the attribute name given to the loaded instance of `<class name>` which can be found in the library or pipeline folder called `"<library>"`.
- [`save_pretrained`](https://github.com/huggingface/diffusers/blob/5cbed8e0d157f65d3ddc2420dfd09f2df630e978/src/diffusers/pipeline_utils.py#L90) that accepts a local path, *e.g.* `./stable-diffusion` under which all models/components of the pipeline will be saved. For each component/model a folder is created inside the local path that is named after the given attribute name, *e.g.* `./stable_diffusion/unet`.
In addition, a `model_index.json` file is created at the root of the local path, *e.g.* `./stable_diffusion/model_index.json` so that the complete pipeline can again be instantiated
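For illustration, the `model_index.json` of a Stable Diffusion checkpoint maps component names to (library, class) pairs roughly as in the following hand-written sketch (an abbreviated excerpt; consult the actual file on the Hub for the authoritative contents):

```python
import json

# illustrative subset of a Stable Diffusion model_index.json
model_index = {
    "unet": ["diffusers", "UNet2DConditionModel"],
    "vae": ["diffusers", "AutoencoderKL"],
    "text_encoder": ["transformers", "CLIPTextModel"],
    "tokenizer": ["transformers", "CLIPTokenizer"],
    "scheduler": ["diffusers", "PNDMScheduler"],
}
print(json.dumps(model_index, indent=2))
```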
@@ -88,7 +88,7 @@ logic including pre-processing, an unrolled diffusion loop, and post-processing
# make sure you're logged in with `huggingface-cli login`
from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
@@ -111,7 +111,7 @@ from diffusers import StableDiffusionImg2ImgPipeline
# load the pipeline
device = "cuda"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="fp16",
    torch_dtype=torch.float16,
).to(device)
......
@@ -13,7 +13,7 @@ The summary of the model is the following:
- Stable Diffusion has the same architecture as [Latent Diffusion](https://arxiv.org/abs/2112.10752) but uses a frozen CLIP Text Encoder instead of training the text encoder jointly with the diffusion model.
- An in-depth explanation of the Stable Diffusion model can be found under [Stable Diffusion with 🧨 Diffusers](https://huggingface.co/blog/stable_diffusion).
- If you don't want to rely on the Hugging Face Hub and have to pass an authentication token, you can
download the weights with `git lfs install; git clone https://huggingface.co/runwayml/stable-diffusion-v1-5` and instead pass the local path to the cloned folder to `from_pretrained` as shown below.
- Stable Diffusion can work with a variety of different samplers as is shown below.
## Available Pipelines:
@@ -33,14 +33,14 @@ If you want to download the model weights using a single Python line, you need t
```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
```
This, however, can make it difficult to build applications on top of `diffusers` as you will always have to pass the token around. A potential way to solve this issue is by downloading the weights to a local path `"./stable-diffusion-v1-5"`:
```
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
```

and simply passing the local path to `from_pretrained`:
@@ -48,7 +48,7 @@ and simply passing the local path to `from_pretrained`:
```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-5")
```
### Text-to-Image with default PLMS scheduler
@@ -57,7 +57,7 @@ pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-4")
# make sure you're logged in with `huggingface-cli login`
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
@@ -75,7 +75,7 @@ from diffusers import StableDiffusionPipeline, DDIMScheduler
scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", clip_sample=False, set_alpha_to_one=False)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    scheduler=scheduler,
).to("cuda")
@@ -98,7 +98,7 @@ lms = LMSDiscreteScheduler(
)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    scheduler=lms,
).to("cuda")
......
@@ -45,7 +45,7 @@ class FlaxStableDiffusionPipeline(FlaxDiffusionPipeline):
            [`FlaxDDIMScheduler`], [`FlaxLMSDiscreteScheduler`], or [`FlaxPNDMScheduler`].
        safety_checker ([`FlaxStableDiffusionSafetyChecker`]):
            Classification module that estimates whether generated images could be considered offensive or harmful.
            Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
        feature_extractor ([`CLIPFeatureExtractor`]):
            Model that extracts features from generated images to be used as inputs for the `safety_checker`.
    """
......
@@ -50,7 +50,7 @@ class OnnxStableDiffusionImg2ImgPipeline(DiffusionPipeline):
            [`DDIMScheduler`], [`LMSDiscreteScheduler`], or [`PNDMScheduler`].
        safety_checker ([`StableDiffusionSafetyChecker`]):
            Classification module that estimates whether generated images could be considered offensive or harmful.
            Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
        feature_extractor ([`CLIPFeatureExtractor`]):
            Model that extracts features from generated images to be used as inputs for the `safety_checker`.
    """
......