Unverified Commit ae82a3eb authored by Steven Liu's avatar Steven Liu Committed by GitHub

[docs] AutoPipeline tutorial (#4273)



* first draft

* tidy api

* apply feedback

* mdx to md

* apply feedback

* Apply suggestions from code review
Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>

---------
Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
parent 816ca004
@@ -13,6 +13,8 @@
    title: Overview
  - local: using-diffusers/write_own_pipeline
    title: Understanding models and schedulers
  - local: tutorials/autopipeline
    title: AutoPipeline
  - local: tutorials/basic_training
    title: Train a diffusion model
  title: Tutorials
@@ -12,35 +12,41 @@ specific language governing permissions and limitations under the License.

# AutoPipeline

In many cases, one checkpoint can be used for multiple tasks. For example, you may be able to use the same checkpoint for Text-to-Image, Image-to-Image, and Inpainting. However, you'll need to know the pipeline class names linked to your checkpoint. `AutoPipeline` is designed to:

1. make it easy for you to load a checkpoint for a task without knowing the specific pipeline class to use
2. use multiple pipelines in your workflow

Based on the task, the `AutoPipeline` class automatically retrieves the relevant pipeline given the name or path to the pretrained weights with the `from_pretrained()` method.

To seamlessly switch between tasks with the same checkpoint without reallocating additional memory, use the `from_pipe()` method to transfer the components from the original pipeline to the new one.

```py
from diffusers import AutoPipelineForText2Image
import torch

pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
).to("cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipeline(prompt, num_inference_steps=25).images[0]
```

<Tip>

Check out the [AutoPipeline](/tutorials/autopipeline) tutorial to learn how to use this API!

</Tip>

`AutoPipeline` supports text-to-image, image-to-image, and inpainting for the following diffusion models:

- [Stable Diffusion](./stable_diffusion)
- [ControlNet](./api/pipelines/controlnet)
- [Stable Diffusion XL (SDXL)](./stable_diffusion/stable_diffusion_xl)
- [DeepFloyd IF](./if)
- [Kandinsky](./kandinsky)
- [Kandinsky 2.2](./kandinsky#kandinsky-22)

## AutoPipelineForText2Image
# AutoPipeline
🤗 Diffusers can perform many different tasks, and you can often reuse the same pretrained weights for multiple tasks such as text-to-image, image-to-image, and inpainting. If you're new to the library and diffusion models, though, it may be difficult to know which pipeline to use for a task. For example, if you're using the [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) checkpoint for text-to-image, you might not know that you could also use it for image-to-image and inpainting by loading the checkpoint with the [`StableDiffusionImg2ImgPipeline`] and [`StableDiffusionInpaintPipeline`] classes respectively.
The `AutoPipeline` class is designed to simplify the variety of pipelines in 🤗 Diffusers. It is a generic, *task-first* pipeline that lets you focus on the task. The `AutoPipeline` automatically detects the correct pipeline class to use, which makes it easier to load a checkpoint for a task without knowing the specific pipeline class name.
<Tip>
Take a look at the [AutoPipeline](./pipelines/auto_pipeline) reference to see which tasks are supported. Currently, it supports text-to-image, image-to-image, and inpainting.
</Tip>
This tutorial shows you how to use an `AutoPipeline` to automatically infer the pipeline class to load for a specific task, given the pretrained weights.
## Choose an AutoPipeline for your task
Start by picking a checkpoint. For example, if you're interested in text-to-image with the [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) checkpoint, use [`AutoPipelineForText2Image`]:
```py
from diffusers import AutoPipelineForText2Image
import torch
pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
).to("cuda")
prompt = "peasant and dragon combat, wood cutting style, viking era, bevel with rune"
image = pipeline(prompt, num_inference_steps=25).images[0]
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/autopipeline-text2img.png" alt="generated image of peasant fighting dragon in wood cutting style"/>
</div>
Under the hood, [`AutoPipelineForText2Image`]:
1. automatically detects a `"stable-diffusion"` class from the [`model_index.json`](https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/model_index.json) file
2. loads the corresponding text-to-image [`StableDiffusionPipeline`] based on the `"stable-diffusion"` class name
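The lookup hinges on the `_class_name` field in the checkpoint's config. As a rough illustration (the JSON below is a trimmed, illustrative excerpt, not the full file):

```python
import json

# Trimmed excerpt of the fields in runwayml/stable-diffusion-v1-5's
# model_index.json; the real file also lists every component
# (unet, vae, text_encoder, scheduler, ...).
model_index = json.loads("""
{
    "_class_name": "StableDiffusionPipeline",
    "_diffusers_version": "0.6.0"
}
""")

# AutoPipelineForText2Image keys off this name to pick the concrete
# text-to-image pipeline class.
print(model_index["_class_name"])  # StableDiffusionPipeline
```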
Likewise, for image-to-image, [`AutoPipelineForImage2Image`] detects a `"stable-diffusion"` checkpoint from the `model_index.json` file and it'll load the corresponding [`StableDiffusionImg2ImgPipeline`] behind the scenes. You can also pass any additional arguments specific to the pipeline class such as `strength`, which determines the amount of noise or variation added to an input image:
```py
from diffusers import AutoPipelineForImage2Image
import requests
import torch
from io import BytesIO
from PIL import Image

pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    use_safetensors=True,
).to("cuda")
prompt = "a portrait of a dog wearing a pearl earring"
url = "https://upload.wikimedia.org/wikipedia/commons/thumb/0/0f/1665_Girl_with_a_Pearl_Earring.jpg/800px-1665_Girl_with_a_Pearl_Earring.jpg"
response = requests.get(url)
image = Image.open(BytesIO(response.content)).convert("RGB")
image.thumbnail((768, 768))
image = pipeline(prompt, image, num_inference_steps=200, strength=0.75, guidance_scale=10.5).images[0]
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/autopipeline-img2img.png" alt="generated image of a vermeer portrait of a dog wearing a pearl earring"/>
</div>
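To see how `strength` interacts with `num_inference_steps`, here is a sketch of the timestep truncation the Stable Diffusion img2img pipeline performs (the variable names are illustrative):

```python
# With strength < 1.0, img2img does not run the full schedule: the input
# image is noised to an intermediate timestep and only the remaining
# steps are denoised. Sketch of that arithmetic for the call above:
num_inference_steps = 200
strength = 0.75

init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
t_start = max(num_inference_steps - init_timestep, 0)

print(init_timestep)  # 150 denoising steps actually run
print(t_start)        # the first 50 schedule steps are skipped
```

A higher `strength` therefore both adds more noise and runs more denoising steps, moving the result further from the input image.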
And if you want to do inpainting, then [`AutoPipelineForInpainting`] loads the underlying [`StableDiffusionInpaintPipeline`] class in the same way:
```py
from diffusers import AutoPipelineForInpainting
from diffusers.utils import load_image
pipeline = AutoPipelineForInpainting.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, use_safetensors=True
).to("cuda")
img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
init_image = load_image(img_url).convert("RGB")
mask_image = load_image(mask_url).convert("RGB")
prompt = "A majestic tiger sitting on a bench"
image = pipeline(prompt, image=init_image, mask_image=mask_image, num_inference_steps=50, strength=0.80).images[0]
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/autopipeline-inpaint.png" alt="generated image of a tiger sitting on a bench"/>
</div>
If you try to load an unsupported checkpoint, it'll throw an error:
```py
from diffusers import AutoPipelineForImage2Image
import torch
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "openai/shap-e-img2img", torch_dtype=torch.float16, use_safetensors=True
)
"ValueError: AutoPipeline can't find a pipeline linked to ShapEImg2ImgPipeline for None"
```
## Use multiple pipelines
For some workflows or if you're loading many pipelines, it is more memory-efficient to reuse the same components from a checkpoint instead of reloading them which would unnecessarily consume additional memory. For example, if you're using a checkpoint for text-to-image and you want to use it again for image-to-image, use the [`~AutoPipelineForImage2Image.from_pipe`] method. This method creates a new pipeline from the components of a previously loaded pipeline at no additional memory cost.
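The memory saving comes from sharing, not copying. As a minimal, framework-free sketch (the class names below are stand-ins, not diffusers APIs), the new pipeline holds references to the same component objects as the original:

```python
class UNetStandIn:
    """Stand-in for a heavyweight component (UNet, VAE, text encoder)."""
    pass

class TextToImageSketch:
    def __init__(self, unet):
        self.unet = unet

class ImageToImageSketch:
    @classmethod
    def from_pipe(cls, other):
        # Reuse the existing component objects; no weights are copied.
        new = cls.__new__(cls)
        new.unet = other.unet
        return new

t2i = TextToImageSketch(UNetStandIn())
i2i = ImageToImageSketch.from_pipe(t2i)
print(i2i.unet is t2i.unet)  # True: one copy of the component in memory
```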
The [`~AutoPipelineForImage2Image.from_pipe`] method detects the original pipeline class and maps it to the new pipeline class corresponding to the task you want to do. For example, if you load a `"stable-diffusion"` class pipeline for text-to-image:
```py
from diffusers import AutoPipelineForText2Image, AutoPipelineForImage2Image
pipeline_text2img = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
)
print(type(pipeline_text2img))
"<class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'>"
```
Then [`~AutoPipelineForImage2Image.from_pipe`] maps the original `"stable-diffusion"` pipeline class to [`StableDiffusionImg2ImgPipeline`]:
```py
pipeline_img2img = AutoPipelineForImage2Image.from_pipe(pipeline_text2img)
print(type(pipeline_img2img))
"<class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_img2img.StableDiffusionImg2ImgPipeline'>"
```
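The mapping is essentially name-based. A hypothetical sketch of the idea (the table below is illustrative, not the library's internal registry):

```python
# Illustrative task table for a few model families; diffusers keeps
# similar mappings internally for every supported architecture.
IMG2IMG_MAPPING = {
    "StableDiffusionPipeline": "StableDiffusionImg2ImgPipeline",
    "StableDiffusionXLPipeline": "StableDiffusionXLImg2ImgPipeline",
    "KandinskyPipeline": "KandinskyImg2ImgPipeline",
}

def map_to_img2img(class_name: str) -> str:
    try:
        return IMG2IMG_MAPPING[class_name]
    except KeyError:
        raise ValueError(f"AutoPipeline can't find a pipeline linked to {class_name}")

print(map_to_img2img("StableDiffusionPipeline"))  # StableDiffusionImg2ImgPipeline
```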
If you passed an optional argument - like disabling the safety checker - to the original pipeline, this argument is also passed on to the new pipeline:
```py
from diffusers import AutoPipelineForText2Image, AutoPipelineForImage2Image
pipeline_text2img = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    use_safetensors=True,
    requires_safety_checker=False,
).to("cuda")
pipeline_img2img = AutoPipelineForImage2Image.from_pipe(pipeline_text2img)
print(pipeline_img2img.config.requires_safety_checker)
"False"
```
You can overwrite any of the arguments and even the configuration from the original pipeline if you want to change the behavior of the new pipeline. For example, to turn the safety checker back on and add the `strength` argument:
```py
pipeline_img2img = AutoPipelineForImage2Image.from_pipe(pipeline_text2img, requires_safety_checker=True, strength=0.3)
```
@@ -158,16 +158,11 @@ def _get_signature_keys(obj):

class AutoPipelineForText2Image(ConfigMixin):
    r"""

    [`AutoPipelineForText2Image`] is a generic pipeline class that instantiates a text-to-image pipeline class. The
    specific underlying pipeline class is automatically selected from either the
    [`~AutoPipelineForText2Image.from_pretrained`] or [`~AutoPipelineForText2Image.from_pipe`] methods.

    This class cannot be instantiated using `__init__()` (throws an error).

    Class attributes:
@@ -297,7 +292,7 @@ class AutoPipelineForText2Image(ConfigMixin):

        >>> from diffusers import AutoPipelineForText2Image

        >>> pipeline = AutoPipelineForText2Image.from_pretrained("runwayml/stable-diffusion-v1-5")
        >>> image = pipeline(prompt).images[0]
        ```
        """
        config = cls.load_config(pretrained_model_or_path)
@@ -328,13 +323,14 @@ class AutoPipelineForText2Image(ConfigMixin):

            an instantiated `DiffusionPipeline` object

        ```py
        >>> from diffusers import AutoPipelineForText2Image, AutoPipelineForImage2Image

        >>> pipe_i2i = AutoPipelineForImage2Image.from_pretrained(
        ...     "runwayml/stable-diffusion-v1-5", requires_safety_checker=False
        ... )

        >>> pipe_t2i = AutoPipelineForText2Image.from_pipe(pipe_i2i)
        >>> image = pipe_t2i(prompt).images[0]
        ```
        """
@@ -401,16 +397,11 @@ class AutoPipelineForText2Image(ConfigMixin):

class AutoPipelineForImage2Image(ConfigMixin):
    r"""

    [`AutoPipelineForImage2Image`] is a generic pipeline class that instantiates an image-to-image pipeline class. The
    specific underlying pipeline class is automatically selected from either the
    [`~AutoPipelineForImage2Image.from_pretrained`] or [`~AutoPipelineForImage2Image.from_pipe`] methods.

    This class cannot be instantiated using `__init__()` (throws an error).

    Class attributes:
@@ -438,7 +429,8 @@ class AutoPipelineForImage2Image(ConfigMixin):

        2. Find the image-to-image pipeline linked to the pipeline class using pattern matching on pipeline class
           name.

        If a `controlnet` argument is passed, it will instantiate a [`StableDiffusionControlNetImg2ImgPipeline`]
        object.

        The pipeline is set in evaluation mode (`model.eval()`) by default.
@@ -537,10 +529,10 @@ class AutoPipelineForImage2Image(ConfigMixin):

        Examples:

        ```py
        >>> from diffusers import AutoPipelineForImage2Image

        >>> pipeline = AutoPipelineForImage2Image.from_pretrained("runwayml/stable-diffusion-v1-5")
        >>> image = pipeline(prompt, image).images[0]
        ```
        """
        config = cls.load_config(pretrained_model_or_path)
@@ -573,13 +565,14 @@ class AutoPipelineForImage2Image(ConfigMixin):

        Examples:

        ```py
        >>> from diffusers import AutoPipelineForText2Image, AutoPipelineForImage2Image

        >>> pipe_t2i = AutoPipelineForText2Image.from_pretrained(
        ...     "runwayml/stable-diffusion-v1-5", requires_safety_checker=False
        ... )

        >>> pipe_i2i = AutoPipelineForImage2Image.from_pipe(pipe_t2i)
        >>> image = pipe_i2i(prompt, image).images[0]
        ```
        """
@@ -646,16 +639,11 @@ class AutoPipelineForImage2Image(ConfigMixin):

class AutoPipelineForInpainting(ConfigMixin):
    r"""

    [`AutoPipelineForInpainting`] is a generic pipeline class that instantiates an inpainting pipeline class. The
    specific underlying pipeline class is automatically selected from either the
    [`~AutoPipelineForInpainting.from_pretrained`] or [`~AutoPipelineForInpainting.from_pipe`] methods.

    This class cannot be instantiated using `__init__()` (throws an error).

    Class attributes:
@@ -682,7 +670,8 @@ class AutoPipelineForInpainting(ConfigMixin):

            config object

        2. Find the inpainting pipeline linked to the pipeline class using pattern matching on pipeline class name.

        If a `controlnet` argument is passed, it will instantiate a [`StableDiffusionControlNetInpaintPipeline`]
        object.

        The pipeline is set in evaluation mode (`model.eval()`) by default.
@@ -781,10 +770,10 @@ class AutoPipelineForInpainting(ConfigMixin):

        Examples:

        ```py
        >>> from diffusers import AutoPipelineForInpainting

        >>> pipeline = AutoPipelineForInpainting.from_pretrained("runwayml/stable-diffusion-v1-5")
        >>> image = pipeline(prompt, image=init_image, mask_image=mask_image).images[0]
        ```
        """
        config = cls.load_config(pretrained_model_or_path)
@@ -824,6 +813,7 @@ class AutoPipelineForInpainting(ConfigMixin):

        ...     )

        >>> pipe_inpaint = AutoPipelineForInpainting.from_pipe(pipe_t2i)
        >>> image = pipe_inpaint(prompt, image=init_image, mask_image=mask_image).images[0]
        ```
        """
        original_config = dict(pipeline.config)