<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Load pipelines

[[open-in-colab]]

Diffusion systems consist of multiple components like parameterized models and schedulers that interact in complex ways. That is why we designed the [`DiffusionPipeline`] to wrap the complexity of the entire diffusion system into an easy-to-use API. At the same time, the [`DiffusionPipeline`] is entirely customizable so you can modify each component to build a diffusion system for your use case.

This guide will show you how to load:

- pipelines from the Hub and locally
- different components into a pipeline
- multiple pipelines without increasing memory usage
- checkpoint variants such as different floating point types or non-exponential mean averaged (non-EMA) weights

## Load a pipeline

> [!TIP]
> Skip to the [DiffusionPipeline explained](#diffusionpipeline-explained) section if you're interested in an explanation about how the [`DiffusionPipeline`] class works.

There are two ways to load a pipeline for a task:

1. Load the generic [`DiffusionPipeline`] class and allow it to automatically detect the correct pipeline class from the checkpoint.
2. Load a specific pipeline class for a specific task.

<hfoptions id="pipelines">
<hfoption id="generic pipeline">

The [`DiffusionPipeline`] class is a simple and generic way to load the latest trending diffusion model from the [Hub](https://huggingface.co/models?library=diffusers&sort=trending). It uses the [`~DiffusionPipeline.from_pretrained`] method to automatically detect the correct pipeline class for a task from the checkpoint, downloads and caches all the required configuration and weight files, and returns a pipeline ready for inference.

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True)
```

This same checkpoint can also be used for an image-to-image task. The [`DiffusionPipeline`] class can handle any task as long as you provide the appropriate inputs. For example, for an image-to-image task, you need to pass an initial image to the pipeline.

```py
from diffusers import DiffusionPipeline
from diffusers.utils import load_image

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True)

init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/img2img-init.png")
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipeline(prompt, image=init_image).images[0]
```

</hfoption>
<hfoption id="specific pipeline">

Checkpoints can be loaded by their specific pipeline class if you already know it. For example, to load a Stable Diffusion model, use the [`StableDiffusionPipeline`] class.

```python
from diffusers import StableDiffusionPipeline

pipeline = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True)
```

This same checkpoint may also be used for another task like image-to-image. To differentiate what task you want to use the checkpoint for, you have to use the corresponding task-specific pipeline class. For example, to use the same checkpoint for image-to-image, use the [`StableDiffusionImg2ImgPipeline`] class.

```py
from diffusers import StableDiffusionImg2ImgPipeline

pipeline = StableDiffusionImg2ImgPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True)
```

</hfoption>
</hfoptions>

Use the Space below to gauge a pipeline's memory requirements before you download and load it, so you can check whether it will run on your hardware.

<div class="block dark:hidden">
	<iframe
        src="https://diffusers-compute-pipeline-size.hf.space?__theme=light"
        width="850"
        height="1600"
    ></iframe>
</div>
<div class="hidden dark:block">
    <iframe
        src="https://diffusers-compute-pipeline-size.hf.space?__theme=dark"
        width="850"
        height="1600"
    ></iframe>
</div>

### Specify component-specific data types

You can customize the data types for individual sub-models by passing a dictionary to the `torch_dtype` parameter. This allows you to load different components of a pipeline in different floating point precisions. For instance, if you want to load the transformer with `torch.bfloat16` and all other components with `torch.float16`, you can pass a dictionary mapping:

```python
from diffusers import HunyuanVideoPipeline
import torch

pipe = HunyuanVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo",
    torch_dtype={"transformer": torch.bfloat16, "default": torch.float16},
)
print(pipe.transformer.dtype, pipe.vae.dtype)  # (torch.bfloat16, torch.float16)
```

If a component is not explicitly specified in the dictionary and no `default` is provided, it will be loaded with `torch.float32`.
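
For example, a minimal sketch of this fallback behavior, reusing the HunyuanVideo checkpoint from above but omitting the `default` key:

```python
from diffusers import HunyuanVideoPipeline
import torch

pipe = HunyuanVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo",
    torch_dtype={"transformer": torch.bfloat16},  # no "default" key
)
# components without an entry fall back to torch.float32
print(pipe.transformer.dtype, pipe.vae.dtype)  # (torch.bfloat16, torch.float32)
```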

### Parallel loading

Large models are often [sharded](../training/distributed_inference#model-sharding) into smaller files so that they are easier to load. Diffusers supports loading shards in parallel to speed up the loading process.

Set the environment variables below to enable parallel loading.

- Set `HF_ENABLE_PARALLEL_LOADING` to `"YES"` to enable parallel loading of shards.
- Set `HF_PARALLEL_LOADING_WORKERS` to configure the number of parallel threads to use when loading shards. More workers load a model faster but use more memory.

The `device_map` argument should be set to `"cuda"` to pre-allocate a large chunk of memory based on the model size. This substantially reduces model load time because warming up the memory allocator up front avoids many smaller allocation calls later.

```py
import os
import torch
from diffusers import DiffusionPipeline

os.environ["HF_ENABLE_PARALLEL_LOADING"] = "YES"
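# optionally tune the worker count; "8" here is just an illustrative value
os.environ["HF_PARALLEL_LOADING_WORKERS"] = "8"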
pipeline = DiffusionPipeline.from_pretrained(
    "Wan-AI/Wan2.2-I2V-A14B-Diffusers",
    torch_dtype=torch.bfloat16,
    device_map="cuda"
)
```

### Local pipeline

To load a pipeline locally, use [git-lfs](https://git-lfs.github.com/) to manually download a checkpoint to your local disk.

```bash
git-lfs install
git clone https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5
```

This creates a local folder, `./stable-diffusion-v1-5`, on your disk, and you should pass its path to [`~DiffusionPipeline.from_pretrained`].

```python
from diffusers import DiffusionPipeline

stable_diffusion = DiffusionPipeline.from_pretrained("./stable-diffusion-v1-5", use_safetensors=True)
```

The [`~DiffusionPipeline.from_pretrained`] method won't download files from the Hub when it detects a local path, but this also means it won't download and cache the latest changes to a checkpoint.

## Customize a pipeline

You can customize a pipeline by loading different components into it. This is important because you can:

- change to a scheduler with faster generation speed or higher generation quality depending on your needs (inspect the `scheduler.compatibles` attribute on your pipeline to see compatible schedulers, as shown in the sketch after this list)
- change a default pipeline component to a newer and better performing one
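
For example, a minimal sketch of checking which schedulers a loaded pipeline supports:

```py
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True)
# lists every scheduler class that can be swapped into this pipeline
print(pipeline.scheduler.compatibles)
```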

For example, let's customize the default [stabilityai/stable-diffusion-xl-base-1.0](https://hf.co/stabilityai/stable-diffusion-xl-base-1.0) checkpoint with:

- The [`HeunDiscreteScheduler`] to generate higher quality images at the expense of slower generation speed. You must pass the `subfolder="scheduler"` parameter in [`~HeunDiscreteScheduler.from_pretrained`] to load the scheduler configuration from the correct [subfolder](https://hf.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main/scheduler) of the pipeline repository.
- A more stable VAE that runs in fp16.

```py
from diffusers import StableDiffusionXLPipeline, HeunDiscreteScheduler, AutoencoderKL
import torch

scheduler = HeunDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16, use_safetensors=True)
```

Now pass the new scheduler and VAE to the [`StableDiffusionXLPipeline`].

```py
pipeline = StableDiffusionXLPipeline.from_pretrained(
  "stabilityai/stable-diffusion-xl-base-1.0",
  scheduler=scheduler,
  vae=vae,
  torch_dtype=torch.float16,
  variant="fp16",
  use_safetensors=True
).to("cuda")
```

## Reuse a pipeline

When you load multiple pipelines that share the same model components, it makes sense to reuse the shared components instead of reloading everything into memory again, especially if your hardware is memory-constrained. For example:

1. You generated an image with the [`StableDiffusionPipeline`] but you want to improve its quality with the [`StableDiffusionSAGPipeline`]. Both of these pipelines share the same pretrained model, so it'd be a waste of memory to load the same model twice.
2. You want to add a model component, like a [`MotionAdapter`](../api/pipelines/animatediff#animatediffpipeline), to [`AnimateDiffPipeline`] which was instantiated from an existing [`StableDiffusionPipeline`]. Again, both pipelines share the same pretrained model, so it'd be a waste of memory to load an entirely new pipeline again.

With the [`DiffusionPipeline.from_pipe`] API, you can switch between multiple pipelines to take advantage of their different features without increasing memory usage. It is similar to turning a feature in your pipeline on and off.

> [!TIP]
> To switch between tasks (rather than features), use the [`~DiffusionPipeline.from_pipe`] method with the [AutoPipeline](../api/pipelines/auto_pipeline) class, which automatically identifies the pipeline class based on the task (learn more in the [AutoPipeline](../tutorials/autopipeline) tutorial).
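
As a minimal sketch of that kind of task switch (the checkpoint and dtype here are illustrative), an [AutoPipeline](../api/pipelines/auto_pipeline) class can reuse the components of an already-loaded pipeline for a different task:

```py
import torch
from diffusers import AutoPipelineForText2Image, AutoPipelineForImage2Image

pipeline_text2img = AutoPipelineForText2Image.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
)
# reuse the already-loaded components for image-to-image instead of reloading them
pipeline_img2img = AutoPipelineForImage2Image.from_pipe(pipeline_text2img)
```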

Let's start with a [`StableDiffusionPipeline`] and then reuse the loaded model components to create a [`StableDiffusionSAGPipeline`] to increase generation quality. You'll use the [`StableDiffusionPipeline`] with an [IP-Adapter](./ip_adapter) to generate a bear eating pizza.

```python
from diffusers import DiffusionPipeline, StableDiffusionSAGPipeline
import torch
from diffusers.utils import load_image

image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png")

pipe_sd = DiffusionPipeline.from_pretrained("SG161222/Realistic_Vision_V6.0_B1_noVAE", torch_dtype=torch.float16)
pipe_sd.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe_sd.set_ip_adapter_scale(0.6)
pipe_sd.to("cuda")

generator = torch.Generator(device="cpu").manual_seed(33)
out_sd = pipe_sd(
    prompt="bear eats pizza",
    negative_prompt="wrong white balance, dark, sketches,worst quality,low quality",
    ip_adapter_image=image,
    num_inference_steps=50,
    generator=generator,
).images[0]
out_sd
```

<div class="flex justify-center">
  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_sd_0.png"/>
</div>

For reference, you can check how much memory this process consumed.

```python
def bytes_to_giga_bytes(bytes):
    return bytes / 1024 / 1024 / 1024
print(f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB")
"Max memory allocated: 4.406213283538818 GB"
```

Now, reuse the same pipeline components from [`StableDiffusionPipeline`] in [`StableDiffusionSAGPipeline`] with the [`~DiffusionPipeline.from_pipe`] method.

> [!WARNING]
> Some pipeline methods may not function properly on new pipelines created with [`~DiffusionPipeline.from_pipe`]. For instance, the [`~DiffusionPipeline.enable_model_cpu_offload`] method installs hooks on the model components based on a unique offloading sequence for each pipeline. If the models are executed in a different order in the new pipeline, the CPU offloading may not work correctly.
>
> To ensure everything works as expected, we recommend re-applying a pipeline method on a new pipeline created with [`~DiffusionPipeline.from_pipe`].
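
For instance, a minimal sketch (separate from the running example, assuming `pipe_sd` from above) of re-applying offloading on a pipeline created with [`~DiffusionPipeline.from_pipe`]:

```py
pipe_offloaded = StableDiffusionSAGPipeline.from_pipe(pipe_sd)
# re-install the offloading hooks so they match this pipeline's own execution order
pipe_offloaded.enable_model_cpu_offload()
```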

```python
pipe_sag = StableDiffusionSAGPipeline.from_pipe(
    pipe_sd
)

generator = torch.Generator(device="cpu").manual_seed(33)
out_sag = pipe_sag(
    prompt="bear eats pizza",
    negative_prompt="wrong white balance, dark, sketches,worst quality,low quality",
    ip_adapter_image=image,
    num_inference_steps=50,
    generator=generator,
    guidance_scale=1.0,
    sag_scale=0.75
).images[0]
out_sag
```

<div class="flex justify-center">
  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_sag_1.png"/>
</div>

If you check the memory usage, you'll see it remains the same as before because [`StableDiffusionPipeline`] and [`StableDiffusionSAGPipeline`] are sharing the same pipeline components. This allows you to use them interchangeably without any additional memory overhead.

```py
print(f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB")
"Max memory allocated: 4.406213283538818 GB"
```

Let's animate the image with the [`AnimateDiffPipeline`] and also add a [`MotionAdapter`] module to the pipeline. For the [`AnimateDiffPipeline`], you need to unload the IP-Adapter first and reload it *after* you've created your new pipeline (this only applies to the [`AnimateDiffPipeline`]).

```py
from diffusers import AnimateDiffPipeline, MotionAdapter, DDIMScheduler
from diffusers.utils import export_to_gif

pipe_sag.unload_ip_adapter()
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16)

pipe_animate = AnimateDiffPipeline.from_pipe(pipe_sd, motion_adapter=adapter)
pipe_animate.scheduler = DDIMScheduler.from_config(pipe_animate.scheduler.config, beta_schedule="linear")
# load IP-Adapter and LoRA weights again
pipe_animate.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe_animate.load_lora_weights("guoyww/animatediff-motion-lora-zoom-out", adapter_name="zoom-out")
pipe_animate.to("cuda")

generator = torch.Generator(device="cpu").manual_seed(33)
pipe_animate.set_adapters("zoom-out", adapter_weights=0.75)
out = pipe_animate(
    prompt="bear eats pizza",
    num_frames=16,
    num_inference_steps=50,
    ip_adapter_image=image,
    generator=generator,
).frames[0]
export_to_gif(out, "out_animate.gif")
```

<div class="flex justify-center">
  <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/from_pipe_out_animate_3.gif"/>
</div>

The [`AnimateDiffPipeline`] is more memory-intensive and consumes 15GB of memory (see the [Memory usage of from_pipe](#memory-usage-of-from_pipe) section to learn what this means for your memory usage).

```py
print(f"Max memory allocated: {bytes_to_giga_bytes(torch.cuda.max_memory_allocated())} GB")
"Max memory allocated: 15.178664207458496 GB"
```

### Modify from_pipe components

Pipelines loaded with [`~DiffusionPipeline.from_pipe`] can be customized with different model components or methods. However, whenever you modify the *state* of the model components, it affects all the other pipelines that share the same components. For example, if you call [`~diffusers.loaders.IPAdapterMixin.unload_ip_adapter`] on the [`StableDiffusionSAGPipeline`], you won't be able to use IP-Adapter with the [`StableDiffusionPipeline`] because it's been removed from their shared components.

```py
pipe_sag.unload_ip_adapter()

generator = torch.Generator(device="cpu").manual_seed(33)
out_sd = pipe_sd(
    prompt="bear eats pizza",
    negative_prompt="wrong white balance, dark, sketches,worst quality,low quality",
    ip_adapter_image=image,
    num_inference_steps=50,
    generator=generator,
).images[0]
"AttributeError: 'NoneType' object has no attribute 'image_projection_layers'"
```

### Memory usage of from_pipe

The memory requirement of loading multiple pipelines with [`~DiffusionPipeline.from_pipe`] is determined by the pipeline with the highest memory usage, regardless of the number of pipelines you create.

| Pipeline | Memory usage (GB) |
|---|---|
| StableDiffusionPipeline | 4.400 |
| StableDiffusionSAGPipeline | 4.400 |
| AnimateDiffPipeline | 15.178 |

The [`AnimateDiffPipeline`] has the highest memory requirement, so the *total memory usage* is based only on the [`AnimateDiffPipeline`]. Your memory usage will not increase if you create additional pipelines as long as their memory requirements don't exceed that of the [`AnimateDiffPipeline`]. Each pipeline can be used interchangeably without any additional memory overhead.

## Safety checker

Diffusers implements a [safety checker](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/safety_checker.py) for Stable Diffusion models which can generate harmful content. The safety checker screens the generated output against known hardcoded not-safe-for-work (NSFW) content. If for whatever reason you'd like to disable the safety checker, pass `safety_checker=None` to the [`~DiffusionPipeline.from_pretrained`] method.

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", safety_checker=None, use_safetensors=True)
"""
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide by the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend keeping the safety filter enabled in all public-facing circumstances, disabling it only for use cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
"""
```

## Checkpoint variants

A checkpoint variant is usually a checkpoint whose weights are:

- Stored in a different floating point type, such as [torch.float16](https://pytorch.org/docs/stable/tensors.html#data-types), because it only requires half the bandwidth and storage to download. You can't use this variant if you're continuing training or using a CPU.
- Non-exponential mean averaged (non-EMA) weights, which shouldn't be used for inference. You should use this variant to continue finetuning a model.

> [!TIP]
> When the checkpoints have identical model structures, but they were trained on different datasets and with a different training setup, they should be stored in separate repositories. For example, [stabilityai/stable-diffusion-2](https://hf.co/stabilityai/stable-diffusion-2) and [stabilityai/stable-diffusion-2-1](https://hf.co/stabilityai/stable-diffusion-2-1) are stored in separate repositories.

Otherwise, a variant is **identical** to the original checkpoint. They have exactly the same serialization format (like [safetensors](./using_safetensors)), model structure, and their weights have identical tensor shapes.

| **checkpoint type** | **weight name**                             | **argument for loading weights** |
|---------------------|---------------------------------------------|----------------------------------|
| original            | diffusion_pytorch_model.safetensors         |                                  |
| floating point      | diffusion_pytorch_model.fp16.safetensors    | `variant`, `torch_dtype`         |
| non-EMA             | diffusion_pytorch_model.non_ema.safetensors | `variant`                        |

There are two important arguments for loading variants:

- `torch_dtype` specifies the floating point precision of the loaded checkpoint. For example, if you want to save bandwidth by loading an fp16 variant, you should set `variant="fp16"` and `torch_dtype=torch.float16` to *keep the weights* in fp16. Otherwise, the fp16 weights are converted to the default fp32 precision.

  If you only set `torch_dtype=torch.float16`, the default fp32 weights are downloaded first and then converted to fp16.

- `variant` specifies which files should be loaded from the repository. For example, if you want to load a non-EMA variant of a UNet from [stable-diffusion-v1-5/stable-diffusion-v1-5](https://hf.co/stable-diffusion-v1-5/stable-diffusion-v1-5/tree/main/unet), set `variant="non_ema"` to download the `non_ema` file.

<hfoptions id="variants">
<hfoption id="fp16">

```py
from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16, use_safetensors=True
)
```

</hfoption>
<hfoption id="non-EMA">

```py
pipeline = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", variant="non_ema", use_safetensors=True
)
```

</hfoption>
</hfoptions>

Use the `variant` parameter in the [`DiffusionPipeline.save_pretrained`] method to save a checkpoint as a different floating point type or as a non-EMA variant. You should try to save a variant to the same folder as the original checkpoint, so you have the option of loading both from the same folder.

<hfoptions id="save">
<hfoption id="fp16">

```python
from diffusers import DiffusionPipeline

pipeline.save_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", variant="fp16")
```

</hfoption>
<hfoption id="non_ema">

```py
pipeline.save_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", variant="non_ema")
```

</hfoption>
</hfoptions>

If you don't save the variant to the same folder as the original checkpoint, you must specify the `variant` argument; otherwise, it throws an `Exception` because it can't find the original checkpoint.

```python
# 👎 this won't work
pipeline = DiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
)
# 👍 this works
pipeline = DiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16, use_safetensors=True
)
```

## DiffusionPipeline explained

As a class method, [`DiffusionPipeline.from_pretrained`] is responsible for two things:

- Download the latest version of the folder structure required for inference and cache it. If the latest folder structure is available in the local cache, [`DiffusionPipeline.from_pretrained`] reuses the cache and won't redownload the files.
- Load the cached weights into the correct pipeline [class](../api/pipelines/overview#diffusers-summary) - retrieved from the `model_index.json` file - and return an instance of it.

The pipelines' underlying folder structure corresponds directly with their class instances. For example, the [`StableDiffusionPipeline`] corresponds to the folder structure in [`stable-diffusion-v1-5/stable-diffusion-v1-5`](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5).

```python
from diffusers import DiffusionPipeline

repo_id = "stable-diffusion-v1-5/stable-diffusion-v1-5"
pipeline = DiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
print(pipeline)
```

You'll see `pipeline` is an instance of [`StableDiffusionPipeline`], which consists of seven components:

- `"feature_extractor"`: a [`~transformers.CLIPImageProcessor`] from 🤗 Transformers.
- `"safety_checker"`: a [component](https://github.com/huggingface/diffusers/blob/e55687e1e15407f60f32242027b7bb8170e58266/src/diffusers/pipelines/stable_diffusion/safety_checker.py#L32) for screening against harmful content.
- `"scheduler"`: an instance of [`PNDMScheduler`].
- `"text_encoder"`: a [`~transformers.CLIPTextModel`] from 🤗 Transformers.
- `"tokenizer"`: a [`~transformers.CLIPTokenizer`] from 🤗 Transformers.
- `"unet"`: an instance of [`UNet2DConditionModel`].
- `"vae"`: an instance of [`AutoencoderKL`].

```json
StableDiffusionPipeline {
  "feature_extractor": [
    "transformers",
    "CLIPImageProcessor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
```

Compare the components of the pipeline instance to the [`stable-diffusion-v1-5/stable-diffusion-v1-5`](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/tree/main) folder structure, and you'll see there is a separate folder for each of the components in the repository:

```
.
├── feature_extractor
│   └── preprocessor_config.json
├── model_index.json
├── safety_checker
│   ├── config.json
│   ├── model.fp16.safetensors
│   ├── model.safetensors
│   ├── pytorch_model.bin
│   └── pytorch_model.fp16.bin
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
│   ├── model.fp16.safetensors
│   ├── model.safetensors
│   ├── pytorch_model.bin
│   └── pytorch_model.fp16.bin
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── unet
│   ├── config.json
│   ├── diffusion_pytorch_model.bin
│   ├── diffusion_pytorch_model.fp16.bin
│   ├── diffusion_pytorch_model.fp16.safetensors
│   ├── diffusion_pytorch_model.non_ema.bin
│   ├── diffusion_pytorch_model.non_ema.safetensors
│   └── diffusion_pytorch_model.safetensors
└── vae
    ├── config.json
    ├── diffusion_pytorch_model.bin
    ├── diffusion_pytorch_model.fp16.bin
    ├── diffusion_pytorch_model.fp16.safetensors
    └── diffusion_pytorch_model.safetensors
```

You can access each of the components of the pipeline as an attribute to view its configuration:

```py
pipeline.tokenizer
CLIPTokenizer(
    name_or_path="/root/.cache/huggingface/hub/models--runwayml--stable-diffusion-v1-5/snapshots/39593d5650112b4cc580433f6b0435385882d819/tokenizer",
    vocab_size=49408,
    model_max_length=77,
    is_fast=False,
    padding_side="right",
    truncation_side="right",
    special_tokens={
        "bos_token": AddedToken("<|startoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "eos_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "unk_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "pad_token": "<|endoftext|>",
    },
    clean_up_tokenization_spaces=True
)
```

Every pipeline expects a [`model_index.json`](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/blob/main/model_index.json) file that tells the [`DiffusionPipeline`]:

- which pipeline class to load from `_class_name`
- which version of 🧨 Diffusers was used to create the model in `_diffusers_version`
- what components from which library are stored in the subfolders (`name` corresponds to the component and subfolder name, `library` corresponds to the name of the library to load the class from, and `class` corresponds to the class name)

```json
{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.6.0",
  "feature_extractor": [
    "transformers",
    "CLIPImageProcessor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
```