<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Load pipelines

[[open-in-colab]]

Diffusion systems consist of multiple components like parameterized models and schedulers that interact in complex ways. That is why we designed the [`DiffusionPipeline`] to wrap the complexity of the entire diffusion system into an easy-to-use API. At the same time, the [`DiffusionPipeline`] is entirely customizable so you can modify each component to build a diffusion system for your use case.

This guide will show you how to load:

- pipelines from the Hub and locally
- different components into a pipeline
- multiple pipelines without increasing memory usage
- checkpoint variants such as different floating point types or non-exponential mean averaged (EMA) weights

## Load a pipeline

> [!TIP]
> Skip to the [DiffusionPipeline explained](#diffusionpipeline-explained) section if you're interested in an explanation about how the [`DiffusionPipeline`] class works.

There are two ways to load a pipeline for a task:

1. Load the generic [`DiffusionPipeline`] class and allow it to automatically detect the correct pipeline class from the checkpoint.
2. Load a specific pipeline class for a specific task.

<hfoptions id="pipelines">
<hfoption id="generic pipeline">

The [`DiffusionPipeline`] class is a simple and generic way to load the latest trending diffusion model from the [Hub](https://huggingface.co/models?library=diffusers&sort=trending). It uses the [`~DiffusionPipeline.from_pretrained`] method to automatically detect the correct pipeline class for a task from the checkpoint, downloads and caches all the required configuration and weight files, and returns a pipeline ready for inference.

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True)
```

This same checkpoint can also be used for an image-to-image task. The [`DiffusionPipeline`] class can handle any task as long as you provide the appropriate inputs. For example, for an image-to-image task, you need to pass an initial image to the pipeline.

```py
from diffusers import DiffusionPipeline
from diffusers.utils import load_image

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True)

init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/img2img-init.png")
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipeline(prompt, image=init_image).images[0]
```

</hfoption>
<hfoption id="specific pipeline">

Checkpoints can be loaded by their specific pipeline class if you already know it. For example, to load a Stable Diffusion model, use the [`StableDiffusionPipeline`] class.

```python
from diffusers import StableDiffusionPipeline

pipeline = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True)
```

This same checkpoint may also be used for another task like image-to-image. To differentiate what task you want to use the checkpoint for, you have to use the corresponding task-specific pipeline class. For example, to use the same checkpoint for image-to-image, use the [`StableDiffusionImg2ImgPipeline`] class.

```py
from diffusers import StableDiffusionImg2ImgPipeline

pipeline = StableDiffusionImg2ImgPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True)
```

</hfoption>
</hfoptions>

Use the Space below to gauge a pipeline's memory requirements before you download it, so you can check whether it runs on your hardware.

<div class="block dark:hidden">
    <iframe
        src="https://diffusers-compute-pipeline-size.hf.space?__theme=light"
        width="850"
        height="1600"
    ></iframe>
</div>
<div class="hidden dark:block">
    <iframe
        src="https://diffusers-compute-pipeline-size.hf.space?__theme=dark"
        width="850"
        height="1600"
    ></iframe>
</div>

### Specify component-specific data types

You can customize the data types for individual sub-models by passing a dictionary to the `torch_dtype` parameter. This allows you to load different components of a pipeline in different floating point precisions. For instance, if you want to load the transformer with `torch.bfloat16` and all other components with `torch.float16`, you can pass a dictionary mapping:

```python
from diffusers import HunyuanVideoPipeline
import torch

pipe = HunyuanVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo",
    torch_dtype={"transformer": torch.bfloat16, "default": torch.float16},
)
print(pipe.transformer.dtype, pipe.vae.dtype)  # (torch.bfloat16, torch.float16)
```

If a component is not explicitly specified in the dictionary and no `default` is provided, it will be loaded with `torch.float32`.
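
To double-check what was loaded, you can inspect the `components` dictionary on the pipeline. A minimal sketch (components without weights, like the tokenizer and scheduler, have no `dtype` and are skipped):

```python
# Sketch: print the dtype of every component that carries weights.
for name, component in pipe.components.items():
    if hasattr(component, "dtype"):
        print(f"{name}: {component.dtype}")
```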

### Local pipeline

To load a pipeline locally, use [git-lfs](https://git-lfs.github.com/) to manually download a checkpoint to your local disk.

```bash
git-lfs install
git clone https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5
```

This creates a local folder, `./stable-diffusion-v1-5`, on your disk and you should pass its path to [`~DiffusionPipeline.from_pretrained`].

```python
from diffusers import DiffusionPipeline

stable_diffusion = DiffusionPipeline.from_pretrained("./stable-diffusion-v1-5", use_safetensors=True)
```

The [`~DiffusionPipeline.from_pretrained`] method won't download files from the Hub when it detects a local path, but this also means it won't download and cache the latest changes to a checkpoint.
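
If you want the same guarantee when loading by repository id, you can pass `local_files_only=True` so files are resolved only from your local cache. A minimal sketch:

```python
# Sketch: resolve every file from the local cache; raises an error
# instead of downloading anything from the Hub if a file is missing.
pipeline = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True, local_files_only=True
)
```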

## Customize a pipeline

You can customize a pipeline by loading different components into it. This is important because you can:

- change to a scheduler with faster generation speed or higher generation quality depending on your needs (check the `scheduler.compatibles` attribute on your pipeline to see compatible schedulers, as shown below)
- change a default pipeline component to a newer and better performing one
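
A quick way to list the schedulers you can swap into a loaded pipeline (a minimal sketch):

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True)
# Prints the scheduler classes that are compatible with this pipeline.
print(pipeline.scheduler.compatibles)
```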

For example, let's customize the default [stabilityai/stable-diffusion-xl-base-1.0](https://hf.co/stabilityai/stable-diffusion-xl-base-1.0) checkpoint with:

- The [`HeunDiscreteScheduler`] to generate higher quality images at the expense of slower generation speed. You must pass the `subfolder="scheduler"` parameter in [`~HeunDiscreteScheduler.from_pretrained`] to load the scheduler configuration from the correct [subfolder](https://hf.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main/scheduler) of the pipeline repository.
- A more stable VAE that runs in fp16.

```py
from diffusers import StableDiffusionXLPipeline, HeunDiscreteScheduler, AutoencoderKL
import torch

scheduler = HeunDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16, use_safetensors=True)
```

Now pass the new scheduler and VAE to the [`StableDiffusionXLPipeline`].

```py
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    scheduler=scheduler,
    vae=vae,
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")
```

## Reuse a pipeline

When you load multiple pipelines that share the same model components, it makes sense to reuse the shared components instead of reloading everything into memory again, especially if your hardware is memory-constrained. For example:

1. You generated an image with the [`StableDiffusionPipeline`] but you want to improve its quality with the [`StableDiffusionSAGPipeline`]. Both of these pipelines share the same pretrained model, so it'd be a waste of memory to load the same model twice.
2. You want to add a model component, like a [`MotionAdapter`](../api/pipelines/animatediff#animatediffpipeline), to [`AnimateDiffPipeline`] which was instantiated from an existing [`StableDiffusionPipeline`]. Again, both pipelines share the same pretrained model, so it'd be a waste of memory to load an entirely new pipeline again.

With the [`DiffusionPipeline.from_pipe`] API, you can switch between multiple pipelines to take advantage of their different features without increasing memory usage. It is similar to turning a feature on and off in your pipeline.

> [!TIP]
> To switch between tasks (rather than features), use the [`~DiffusionPipeline.from_pipe`] method with the [AutoPipeline](../api/pipelines/auto_pipeline) class, which automatically identifies the pipeline class based on the task (learn more in the [AutoPipeline](../tutorials/autopipeline) tutorial).

Let's start with a [`StableDiffusionPipeline`] and then reuse the loaded model components to create a [`StableDiffusionSAGPipeline`] to increase generation quality. You'll use the [`StableDiffusionPipeline`] with an [IP-Adapter](./ip_adapter) to generate a bear eating pizza.

```python
import torch
from diffusers import DiffusionPipeline, StableDiffusionSAGPipeline
from diffusers.utils import load_image

image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png")

pipe_sd = DiffusionPipeline.from_pretrained("SG161222/Realistic_Vision_V6.0_B1_noVAE", torch_dtype=torch.float16)
pipe_sd.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe_sd.set_ip_adapter_scale(0.6)
pipe_sd.to("cuda")

generator = torch.Generator(device="cpu").manual_seed(33)
out_sd = pipe_sd(
    prompt="bear eats pizza",
    negative_prompt="wrong white balance, dark, sketches, worst quality, low quality",
    ip_adapter_image=image,
    num_inference_steps=50,
    generator=generator,
).images[0]
out_sd
```

<div class="flex justify-center">
  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_sd_0.png"/>
</div>

For reference, you can check how much memory this process consumed.

```python
def bytes_to_giga_bytes(bytes):
    return bytes / 1024 / 1024 / 1024

print(f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB")
"Max memory allocated: 4.406213283538818 GB"
```

Now, reuse the same pipeline components from [`StableDiffusionPipeline`] in [`StableDiffusionSAGPipeline`] with the [`~DiffusionPipeline.from_pipe`] method.

> [!WARNING]
> Some pipeline methods may not function properly on new pipelines created with [`~DiffusionPipeline.from_pipe`]. For instance, the [`~DiffusionPipeline.enable_model_cpu_offload`] method installs hooks on the model components based on a unique offloading sequence for each pipeline. If the models are executed in a different order in the new pipeline, the CPU offloading may not work correctly.
>
> To ensure everything works as expected, we recommend re-applying a pipeline method on a new pipeline created with [`~DiffusionPipeline.from_pipe`].
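
For example, if you rely on [`~DiffusionPipeline.enable_model_cpu_offload`], call it again on the new pipeline rather than relying on hooks installed by the source pipeline. A minimal sketch, using a hypothetical `pipe_new` created with [`~DiffusionPipeline.from_pipe`]:

```python
# Hypothetical sketch: re-apply offloading on the new pipeline so its
# hooks match the order in which this pipeline runs its models.
pipe_new = StableDiffusionSAGPipeline.from_pipe(pipe_sd)
pipe_new.enable_model_cpu_offload()
```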

```python
pipe_sag = StableDiffusionSAGPipeline.from_pipe(pipe_sd)

generator = torch.Generator(device="cpu").manual_seed(33)
out_sag = pipe_sag(
    prompt="bear eats pizza",
    negative_prompt="wrong white balance, dark, sketches, worst quality, low quality",
    ip_adapter_image=image,
    num_inference_steps=50,
    generator=generator,
    guidance_scale=1.0,
    sag_scale=0.75,
).images[0]
out_sag
```

<div class="flex justify-center">
  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_sag_1.png"/>
</div>

If you check the memory usage, you'll see it remains the same as before because [`StableDiffusionPipeline`] and [`StableDiffusionSAGPipeline`] are sharing the same pipeline components. This allows you to use them interchangeably without any additional memory overhead.

```py
print(f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB")
"Max memory allocated: 4.406213283538818 GB"
```

Let's animate the image with the [`AnimateDiffPipeline`] and also add a [`MotionAdapter`] module to the pipeline. For the [`AnimateDiffPipeline`], you need to unload the IP-Adapter first and reload it *after* you've created your new pipeline (this only applies to the [`AnimateDiffPipeline`]).

```py
from diffusers import AnimateDiffPipeline, MotionAdapter, DDIMScheduler
from diffusers.utils import export_to_gif

pipe_sag.unload_ip_adapter()
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16)

pipe_animate = AnimateDiffPipeline.from_pipe(pipe_sd, motion_adapter=adapter)
pipe_animate.scheduler = DDIMScheduler.from_config(pipe_animate.scheduler.config, beta_schedule="linear")
# load IP-Adapter and LoRA weights again
pipe_animate.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe_animate.load_lora_weights("guoyww/animatediff-motion-lora-zoom-out", adapter_name="zoom-out")
pipe_animate.to("cuda")

generator = torch.Generator(device="cpu").manual_seed(33)
pipe_animate.set_adapters("zoom-out", adapter_weights=0.75)
out = pipe_animate(
    prompt="bear eats pizza",
    num_frames=16,
    num_inference_steps=50,
    ip_adapter_image=image,
    generator=generator,
).frames[0]
export_to_gif(out, "out_animate.gif")
```

<div class="flex justify-center">
  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_animate_3.gif"/>
</div>

The [`AnimateDiffPipeline`] is more memory-intensive and consumes 15GB of memory (see the [Memory usage of from_pipe](#memory-usage-of-from_pipe) section to learn what this means for your memory usage).

```py
print(f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB")
"Max memory allocated: 15.178664207458496 GB"
```

### Modify from_pipe components

Pipelines loaded with [`~DiffusionPipeline.from_pipe`] can be customized with different model components or methods. However, whenever you modify the *state* of the model components, it affects all the other pipelines that share the same components. For example, if you call [`~diffusers.loaders.IPAdapterMixin.unload_ip_adapter`] on the [`StableDiffusionSAGPipeline`], you won't be able to use IP-Adapter with the [`StableDiffusionPipeline`] because it's been removed from their shared components.

```py
pipe_sag.unload_ip_adapter()

generator = torch.Generator(device="cpu").manual_seed(33)
out_sd = pipe_sd(
    prompt="bear eats pizza",
    negative_prompt="wrong white balance, dark, sketches, worst quality, low quality",
    ip_adapter_image=image,
    num_inference_steps=50,
    generator=generator,
).images[0]
"AttributeError: 'NoneType' object has no attribute 'image_projection_layers'"
```

### Memory usage of from_pipe

The memory requirement of loading multiple pipelines with [`~DiffusionPipeline.from_pipe`] is determined by the pipeline with the highest memory usage, regardless of the number of pipelines you create.

| Pipeline | Memory usage (GB) |
|---|---|
| StableDiffusionPipeline | 4.400 |
| StableDiffusionSAGPipeline | 4.400 |
| AnimateDiffPipeline | 15.178 |

The [`AnimateDiffPipeline`] has the highest memory requirement, so the *total memory usage* is based only on the [`AnimateDiffPipeline`]. Your memory usage will not increase if you create additional pipelines, as long as their memory requirements don't exceed that of the [`AnimateDiffPipeline`]. Each pipeline can be used interchangeably without any additional memory overhead.
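
The numbers above accumulate in CUDA's peak-memory counter. To measure each pipeline independently, one option is to reset the counter between runs (a minimal sketch):

```python
# Sketch: reset the peak-memory counter, run one pipeline, then read
# the new peak to attribute memory to that pipeline alone.
torch.cuda.reset_peak_memory_stats()
out = pipe_sag("bear eats pizza", num_inference_steps=50).images[0]
print(f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB")
```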

## Safety checker

Diffusers implements a [safety checker](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/safety_checker.py) for Stable Diffusion models, which can generate harmful content. The safety checker screens the generated output against known hardcoded not-safe-for-work (NSFW) content. If for whatever reason you'd like to disable the safety checker, pass `safety_checker=None` to the [`~DiffusionPipeline.from_pretrained`] method.

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", safety_checker=None, use_safetensors=True)
"""
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide by the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend keeping the safety filter enabled in all public-facing circumstances, disabling it only for use cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
"""
```

## Checkpoint variants

A checkpoint variant is usually a checkpoint whose weights are:

- Stored in a different floating point type, such as [torch.float16](https://pytorch.org/docs/stable/tensors.html#data-types), because it only requires half the bandwidth and storage to download. You can't use this variant if you're continuing training or using a CPU.
- Non-exponential mean averaged (EMA) weights, which shouldn't be used for inference. You should use this variant to continue finetuning a model.

> [!TIP]
> When checkpoints have identical model structures but were trained on different datasets and with a different training setup, they should be stored in separate repositories. For example, [stabilityai/stable-diffusion-2](https://hf.co/stabilityai/stable-diffusion-2) and [stabilityai/stable-diffusion-2-1](https://hf.co/stabilityai/stable-diffusion-2-1) are stored in separate repositories.

Otherwise, a variant is **identical** to the original checkpoint. They have exactly the same serialization format (like [safetensors](./using_safetensors)), model structure, and their weights have identical tensor shapes.

| **checkpoint type** | **weight name**                             | **argument for loading weights** |
|---------------------|---------------------------------------------|----------------------------------|
| original            | diffusion_pytorch_model.safetensors         |                                  |
| floating point      | diffusion_pytorch_model.fp16.safetensors    | `variant`, `torch_dtype`         |
| non-EMA             | diffusion_pytorch_model.non_ema.safetensors | `variant`                        |

There are two important arguments for loading variants:

- `torch_dtype` specifies the floating point precision of the loaded checkpoint. For example, if you want to save bandwidth by loading a fp16 variant, you should set `variant="fp16"` and `torch_dtype=torch.float16` to *convert the weights* to fp16. Otherwise, the fp16 weights are converted to the default fp32 precision.

  If you only set `torch_dtype=torch.float16`, the default fp32 weights are downloaded first and then converted to fp16.

- `variant` specifies which files should be loaded from the repository. For example, if you want to load a non-EMA variant of a UNet from [stable-diffusion-v1-5/stable-diffusion-v1-5](https://hf.co/stable-diffusion-v1-5/stable-diffusion-v1-5/tree/main/unet), set `variant="non_ema"` to download the `non_ema` file.

<hfoptions id="variants">
<hfoption id="fp16">

```py
from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16, use_safetensors=True
)
```

</hfoption>
<hfoption id="non-EMA">

```py
pipeline = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", variant="non_ema", use_safetensors=True
)
```

</hfoption>
</hfoptions>

Use the `variant` parameter in the [`DiffusionPipeline.save_pretrained`] method to save a checkpoint as a different floating point type or as a non-EMA variant. You should try to save a variant to the same folder as the original checkpoint, so you have the option of loading both from the same folder.

<hfoptions id="save">
<hfoption id="fp16">

```python
from diffusers import DiffusionPipeline

pipeline.save_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", variant="fp16")
```

</hfoption>
<hfoption id="non_ema">

```py
pipeline.save_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", variant="non_ema")
```

</hfoption>
</hfoptions>

If you don't save the variant to an existing folder, you must specify the `variant` argument; otherwise it throws an `Exception` because it can't find the original checkpoint.

```python
# 👎 this won't work
pipeline = DiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
)
# 👍 this works
pipeline = DiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16, use_safetensors=True
)
```

## DiffusionPipeline explained

As a class method, [`DiffusionPipeline.from_pretrained`] is responsible for two things:

- Download the latest version of the folder structure required for inference and cache it. If the latest folder structure is available in the local cache, [`DiffusionPipeline.from_pretrained`] reuses the cache and won't redownload the files.
- Load the cached weights into the correct pipeline [class](../api/pipelines/overview#diffusers-summary) - retrieved from the `model_index.json` file - and return an instance of it.

The pipelines' underlying folder structure corresponds directly with their class instances. For example, the [`StableDiffusionPipeline`] corresponds to the folder structure in [`stable-diffusion-v1-5/stable-diffusion-v1-5`](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5).

```python
from diffusers import DiffusionPipeline

repo_id = "stable-diffusion-v1-5/stable-diffusion-v1-5"
pipeline = DiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
print(pipeline)
```

You'll see `pipeline` is an instance of [`StableDiffusionPipeline`], which consists of seven components:

- `"feature_extractor"`: a [`~transformers.CLIPImageProcessor`] from 🤗 Transformers.
- `"safety_checker"`: a [component](https://github.com/huggingface/diffusers/blob/e55687e1e15407f60f32242027b7bb8170e58266/src/diffusers/pipelines/stable_diffusion/safety_checker.py#L32) for screening against harmful content.
- `"scheduler"`: an instance of [`PNDMScheduler`].
- `"text_encoder"`: a [`~transformers.CLIPTextModel`] from 🤗 Transformers.
- `"tokenizer"`: a [`~transformers.CLIPTokenizer`] from 🤗 Transformers.
- `"unet"`: an instance of [`UNet2DConditionModel`].
- `"vae"`: an instance of [`AutoencoderKL`].

```json
StableDiffusionPipeline {
  "feature_extractor": [
    "transformers",
    "CLIPImageProcessor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
```

Compare the components of the pipeline instance to the [`stable-diffusion-v1-5/stable-diffusion-v1-5`](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/tree/main) folder structure, and you'll see there is a separate folder for each of the components in the repository:

```
.
├── feature_extractor
│   └── preprocessor_config.json
├── model_index.json
├── safety_checker
│   ├── config.json
│   ├── model.fp16.safetensors
│   ├── model.safetensors
│   ├── pytorch_model.bin
│   └── pytorch_model.fp16.bin
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
│   ├── model.fp16.safetensors
│   ├── model.safetensors
│   ├── pytorch_model.bin
│   └── pytorch_model.fp16.bin
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── unet
│   ├── config.json
│   ├── diffusion_pytorch_model.bin
│   ├── diffusion_pytorch_model.fp16.bin
│   ├── diffusion_pytorch_model.fp16.safetensors
│   ├── diffusion_pytorch_model.non_ema.bin
│   ├── diffusion_pytorch_model.non_ema.safetensors
│   └── diffusion_pytorch_model.safetensors
└── vae
    ├── config.json
    ├── diffusion_pytorch_model.bin
    ├── diffusion_pytorch_model.fp16.bin
    ├── diffusion_pytorch_model.fp16.safetensors
    └── diffusion_pytorch_model.safetensors
```

You can access each of the components of the pipeline as an attribute to view its configuration:

```py
pipeline.tokenizer
CLIPTokenizer(
    name_or_path="/root/.cache/huggingface/hub/models--runwayml--stable-diffusion-v1-5/snapshots/39593d5650112b4cc580433f6b0435385882d819/tokenizer",
    vocab_size=49408,
    model_max_length=77,
    is_fast=False,
    padding_side="right",
    truncation_side="right",
    special_tokens={
        "bos_token": AddedToken("<|startoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "eos_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "unk_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "pad_token": "<|endoftext|>",
    },
    clean_up_tokenization_spaces=True
)
```

Every pipeline expects a [`model_index.json`](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/blob/main/model_index.json) file that tells the [`DiffusionPipeline`]:

- which pipeline class to load from `_class_name`
- which version of 🧨 Diffusers was used to create the model in `_diffusers_version`
- what components from which library are stored in the subfolders (`name` corresponds to the component and subfolder name, `library` corresponds to the name of the library to load the class from, and `class` corresponds to the class name)

```json
{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.6.0",
  "feature_extractor": [
    "transformers",
    "CLIPImageProcessor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
```