<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Load pipelines, models, and schedulers

Having an easy way to use a diffusion system for inference is essential to 🧨 Diffusers. Diffusion systems often consist of multiple components like parameterized models, tokenizers, and schedulers that interact in complex ways. That is why we designed the [`DiffusionPipeline`] to wrap the complexity of the entire diffusion system into an easy-to-use API, while remaining flexible enough to be adapted for other use cases, such as loading each component individually as building blocks to assemble your own diffusion system.

Everything you need for inference or training is accessible with the `from_pretrained()` method.

This guide will show you how to load:

- pipelines from the Hub and locally
- different components into a pipeline
- checkpoint variants such as different floating point types or non-exponential moving average (non-EMA) weights
- models and schedulers

## Diffusion Pipeline

<Tip>

💡 Skip to the [DiffusionPipeline explained](#diffusionpipeline-explained) section if you are interested in learning in more detail about how the [`DiffusionPipeline`] class works.

</Tip>

The [`DiffusionPipeline`] class is the simplest and most generic way to load any diffusion model from the [Hub](https://huggingface.co/models?library=diffusers). The [`DiffusionPipeline.from_pretrained`] method automatically detects the correct pipeline class from the checkpoint, downloads and caches all the required configuration and weight files, and returns a pipeline instance ready for inference.

```python
from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
pipe = DiffusionPipeline.from_pretrained(repo_id)
```

You can also load a checkpoint with its specific pipeline class. The example above loaded a Stable Diffusion model; to get the same result, use the [`StableDiffusionPipeline`] class:

```python
from diffusers import StableDiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(repo_id)
```

A checkpoint (such as [`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) or [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5)) may also be used for more than one task, like text-to-image or image-to-image. To specify which task you want to use the checkpoint for, load it directly with its corresponding task-specific pipeline class:

```python
from diffusers import StableDiffusionImg2ImgPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(repo_id)
```

### Local pipeline

To load a diffusion pipeline locally, use [`git-lfs`](https://git-lfs.github.com/) to manually download the checkpoint (in this case, [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5)) to your local disk. This creates a local folder, `./stable-diffusion-v1-5`, on your disk:

```bash
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
```

Then pass the local path to [`~DiffusionPipeline.from_pretrained`]:

```python
from diffusers import DiffusionPipeline

repo_id = "./stable-diffusion-v1-5"
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id)
```

The [`~DiffusionPipeline.from_pretrained`] method won't download any files from the Hub when it detects a local path, but this also means it won't download and cache the latest changes to a checkpoint.
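
The local-versus-Hub decision described above can be pictured with a small sketch. It assumes the loader only checks whether the argument is an existing directory; `resolve_source` is a hypothetical helper written for illustration and is not part of 🧨 Diffusers.

```python
import os
import tempfile


def resolve_source(name_or_path: str) -> str:
    """Hypothetical helper: classify the argument as a local folder or a Hub repository id."""
    # An existing directory is treated as a local checkpoint, so nothing would be downloaded.
    if os.path.isdir(name_or_path):
        return "local"
    # Anything else is assumed to be a Hub repository id such as "namespace/name".
    return "hub"


# A temporary directory stands in for a cloned checkpoint folder.
with tempfile.TemporaryDirectory() as local_checkpoint:
    local_result = resolve_source(local_checkpoint)

hub_result = resolve_source("runwayml/stable-diffusion-v1-5")
print(local_result, hub_result)
```

Under this assumption, a stale local clone is never refreshed automatically; you would need to update the folder yourself (for example with `git pull`) to pick up newer checkpoint files.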

### Swap components in a pipeline

You can customize the default components of any pipeline with another compatible component. Customization is important because:

- Changing the scheduler is important for exploring the trade-off between generation speed and quality.
- Different components of a model are typically trained independently and you can swap out a component with a better-performing one.
- During finetuning, usually only some components - like the UNet or text encoder - are trained.

To find out which schedulers are compatible for customization, you can use the scheduler's `compatibles` attribute:

```py
from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id)
stable_diffusion.scheduler.compatibles
```

Let's use the [`SchedulerMixin.from_pretrained`] method to replace the default [`PNDMScheduler`] with a more performant scheduler, [`EulerDiscreteScheduler`]. The `subfolder="scheduler"` argument is required to load the scheduler configuration from the correct [subfolder](https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main/scheduler) of the pipeline repository.

Then you can pass the new [`EulerDiscreteScheduler`] instance to the `scheduler` argument in [`DiffusionPipeline`]:

```python
from diffusers import DiffusionPipeline, EulerDiscreteScheduler

repo_id = "runwayml/stable-diffusion-v1-5"

# load the scheduler configuration from the `scheduler` subfolder of the pipeline repository
scheduler = EulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")

# pass the new scheduler in place of the default one
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler)
```

### Safety checker

Diffusion models like Stable Diffusion can generate harmful content, which is why 🧨 Diffusers has a [safety checker](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/safety_checker.py) to check generated outputs against known hardcoded NSFW content. If you'd like to disable the safety checker for whatever reason, pass `None` to the `safety_checker` argument:

```python
from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, safety_checker=None)
```

### Reuse components across pipelines

You can also reuse the same components in multiple pipelines without loading the weights into RAM twice. Use the [`DiffusionPipeline.components`] property to save the components in `components`:

```python
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

model_id = "runwayml/stable-diffusion-v1-5"
stable_diffusion_txt2img = StableDiffusionPipeline.from_pretrained(model_id)

components = stable_diffusion_txt2img.components
```

Then you can pass the `components` to another pipeline without reloading the weights into RAM:

```py
stable_diffusion_img2img = StableDiffusionImg2ImgPipeline(**components)
```
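
Why this avoids a second copy of the weights can be shown with a toy sketch. The two classes below are hypothetical stand-ins, not the real pipeline classes, and plain `object()` instances stand in for the loaded weights; the only point is that unpacking `components` hands over references rather than copies.

```python
class ToyTextToImagePipeline:
    """Hypothetical stand-in for a text-to-image pipeline."""

    def __init__(self, unet, vae, scheduler):
        self.unet = unet
        self.vae = vae
        self.scheduler = scheduler

    @property
    def components(self):
        # Return references to the already-loaded components, not copies.
        return {"unet": self.unet, "vae": self.vae, "scheduler": self.scheduler}


class ToyImageToImagePipeline:
    """Hypothetical stand-in for an image-to-image pipeline that takes the same components."""

    def __init__(self, unet, vae, scheduler):
        self.unet = unet
        self.vae = vae
        self.scheduler = scheduler


# Plain objects stand in for large weight tensors.
txt2img = ToyTextToImagePipeline(unet=object(), vae=object(), scheduler=object())
img2img = ToyImageToImagePipeline(**txt2img.components)

# Both pipelines point at the very same objects, so memory is not duplicated.
shares_unet = img2img.unet is txt2img.unet
print(shares_unet)
```

One consequence of sharing by reference, at least in this toy model, is that modifying a shared component through one pipeline is visible in the other.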

## Checkpoint variants

A checkpoint variant is usually a checkpoint whose weights are:

- Stored in a lower-precision floating point type, such as [`torch.float16`](https://pytorch.org/docs/stable/tensors.html#data-types), which requires only half the bandwidth and storage to download. You can't use this variant if you're continuing training or using a CPU.
- Non-exponential moving average (non-EMA) weights, which shouldn't be used for inference. You should use these to continue finetuning a model.

<Tip>

💡 When the checkpoints have identical model structures, but they were trained on different datasets and with a different training setup, they should be stored in separate repositories instead of as variants (for example, [`stable-diffusion-v1-4`] and [`stable-diffusion-v1-5`]).

</Tip>

Otherwise, a variant is **identical** to the original checkpoint. They have exactly the same serialization format (like [Safetensors](./using-diffusers/using_safetensors)), the same model structure, and weights with identical tensor shapes.

| **checkpoint type** | **weight name**                     | **argument for loading weights** |
|---------------------|-------------------------------------|----------------------------------|
| original            | diffusion_pytorch_model.bin         |                                  |
| floating point      | diffusion_pytorch_model.fp16.bin    | `variant`, `torch_dtype`         |
| non-EMA             | diffusion_pytorch_model.non_ema.bin | `variant`                        |

There are two important arguments to know for loading variants:

- `torch_dtype` defines the floating point precision of the loaded checkpoints. For example, if you want to save bandwidth by loading a `fp16` variant, you should specify `torch_dtype=torch.float16` to *convert the weights* to `fp16`. Otherwise, the `fp16` weights are converted to the default `fp32` precision. You can also load the original checkpoint without defining the `variant` argument, and convert it to `fp16` with `torch_dtype=torch.float16`. In this case, the default `fp32` weights are downloaded first, and then they're converted to `fp16` after loading.

- `variant` defines which files should be loaded from the repository. For example, if you want to load a `non_ema` variant from the [`diffusers/stable-diffusion-variants`](https://huggingface.co/diffusers/stable-diffusion-variants/tree/main/unet) repository, you should specify `variant="non_ema"` to download the `non_ema` files.

```python
import torch
from diffusers import DiffusionPipeline

# load fp16 variant
stable_diffusion = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16
)
# load non_ema variant
stable_diffusion = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", variant="non_ema")
```
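
The weight names in the table above follow a visible pattern: the variant name sits between the file stem and its extension. A minimal sketch of that naming rule, assuming it holds for every weight file, is shown below; `add_variant` is a hypothetical helper introduced here for illustration.

```python
from typing import Optional


def add_variant(weights_name: str, variant: Optional[str] = None) -> str:
    """Hypothetical helper: insert the variant name before the file extension."""
    if variant is None:
        # No variant requested, so the original file name is used unchanged.
        return weights_name
    stem, extension = weights_name.rsplit(".", 1)
    return f"{stem}.{variant}.{extension}"


original = add_variant("diffusion_pytorch_model.bin")
fp16 = add_variant("diffusion_pytorch_model.bin", variant="fp16")
non_ema = add_variant("diffusion_pytorch_model.bin", variant="non_ema")
print(original, fp16, non_ema)
```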

To save a checkpoint stored in a different floating point type or as a non-EMA variant, use the [`DiffusionPipeline.save_pretrained`] method and specify the `variant` argument. Try to save a variant to the same folder as the original checkpoint, so you can load both from the same folder:

```python
from diffusers import DiffusionPipeline

# save as fp16 variant
stable_diffusion.save_pretrained("runwayml/stable-diffusion-v1-5", variant="fp16")
# save as non-ema variant
stable_diffusion.save_pretrained("runwayml/stable-diffusion-v1-5", variant="non_ema")
```

If you don't save the variant to an existing folder, you must specify the `variant` argument when loading; otherwise, loading throws an `Exception` because it can't find the original checkpoint:

```python
import torch
from diffusers import DiffusionPipeline

# 👎 this won't work
stable_diffusion = DiffusionPipeline.from_pretrained("./stable-diffusion-v1-5", torch_dtype=torch.float16)
# 👍 this works
stable_diffusion = DiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16
)
```
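
The failure above can be reproduced in miniature without downloading anything. The sketch below assumes a folder that holds only an `fp16` weight file; `find_weights` is a hypothetical lookup written for illustration, and raising `FileNotFoundError` is an assumption about the kind of exception, not a description of what 🧨 Diffusers raises.

```python
import os
import tempfile
from typing import Optional


def find_weights(folder: str, variant: Optional[str] = None) -> str:
    """Hypothetical lookup: return the weight file for `variant`, or raise if it is missing."""
    if variant is None:
        name = "diffusion_pytorch_model.bin"
    else:
        name = f"diffusion_pytorch_model.{variant}.bin"
    path = os.path.join(folder, name)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{name} not found in {folder}")
    return path


with tempfile.TemporaryDirectory() as folder:
    # The folder holds only the fp16 variant, not the original checkpoint.
    open(os.path.join(folder, "diffusion_pytorch_model.fp16.bin"), "wb").close()

    found_with_variant = os.path.basename(find_weights(folder, variant="fp16"))
    try:
        find_weights(folder)
        failed_without_variant = False
    except FileNotFoundError:
        failed_without_variant = True

print(found_with_variant, failed_without_variant)
```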

<!--
TODO(Patrick) - Make sure to uncomment this part as soon as things are deprecated.

#### Using `revision` to load pipeline variants is deprecated

Previously, the `revision` argument of [`DiffusionPipeline.from_pretrained`] was heavily used to
load model variants, for example:

```python
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", revision="fp16")
```

However, this behavior is now deprecated because the `revision` argument should, as on GitHub, be used to load model checkpoints from a specific commit or branch in development.

The above example is therefore deprecated and won't be supported anymore for `diffusers >= 1.0.0`.

<Tip warning={true}>

If you load diffusers pipelines or models with `revision="fp16"` or `revision="non_ema"`,
please make sure to update your code and use `variant="fp16"` or `variant="non_ema"` respectively
instead.

</Tip>
-->

## Models

Models are loaded from the [`ModelMixin.from_pretrained`] method, which downloads and caches the latest version of the model weights and configurations. If the latest files are available in the local cache, [`~ModelMixin.from_pretrained`] reuses files in the cache instead of redownloading them.

Models can be loaded from a subfolder with the `subfolder` argument. For example, the UNet weights for `runwayml/stable-diffusion-v1-5` are stored in the [`unet`](https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main/unet) subfolder:

```python
from diffusers import UNet2DConditionModel

repo_id = "runwayml/stable-diffusion-v1-5"
model = UNet2DConditionModel.from_pretrained(repo_id, subfolder="unet")
```

Or directly from a repository's [directory](https://huggingface.co/google/ddpm-cifar10-32/tree/main):

```python
from diffusers import UNet2DModel

repo_id = "google/ddpm-cifar10-32"
model = UNet2DModel.from_pretrained(repo_id)
```

You can also load and save model variants by specifying the `variant` argument in [`ModelMixin.from_pretrained`] and [`ModelMixin.save_pretrained`]:

```python
from diffusers import UNet2DConditionModel

# the variant name must match the weight file name (diffusion_pytorch_model.non_ema.bin)
model = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet", variant="non_ema")
model.save_pretrained("./local-unet", variant="non_ema")
```

## Schedulers

Schedulers are loaded from the [`SchedulerMixin.from_pretrained`] method, and unlike models, schedulers are **not parameterized** or **trained**; they are defined by a configuration file.

Loading schedulers does not consume any significant amount of memory and the same configuration file can be used for a variety of different schedulers.
For example, the following schedulers are compatible with [`StableDiffusionPipeline`] which means you can load the same scheduler configuration file in any of these classes:

```python
from diffusers import StableDiffusionPipeline
from diffusers import (
    DDPMScheduler,
    DDIMScheduler,
    PNDMScheduler,
    LMSDiscreteScheduler,
    EulerDiscreteScheduler,
    EulerAncestralDiscreteScheduler,
    DPMSolverMultistepScheduler,
)

repo_id = "runwayml/stable-diffusion-v1-5"

ddpm = DDPMScheduler.from_pretrained(repo_id, subfolder="scheduler")
ddim = DDIMScheduler.from_pretrained(repo_id, subfolder="scheduler")
pndm = PNDMScheduler.from_pretrained(repo_id, subfolder="scheduler")
lms = LMSDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
euler_anc = EulerAncestralDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
euler = EulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
dpm = DPMSolverMultistepScheduler.from_pretrained(repo_id, subfolder="scheduler")

# replace `dpm` with any of `ddpm`, `ddim`, `pndm`, `lms`, `euler_anc`, `euler`
pipeline = StableDiffusionPipeline.from_pretrained(repo_id, scheduler=dpm)
```
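
One way to picture how a single configuration file can serve several scheduler classes is sketched below. It assumes each class keeps only the configuration keys its constructor accepts and ignores the rest; whether 🧨 Diffusers does exactly this is an assumption. The `Toy*` classes, the `from_config` helper, and the configuration values are all hypothetical and chosen only for illustration.

```python
import inspect

# Illustrative values, not copied from a real scheduler_config.json.
scheduler_config = {
    "num_train_timesteps": 1000,
    "beta_start": 0.00085,
    "beta_end": 0.012,
    "skip_prk_steps": True,
}


class ToyPNDMScheduler:
    def __init__(self, num_train_timesteps=1000, beta_start=0.0001, beta_end=0.02, skip_prk_steps=False):
        self.num_train_timesteps = num_train_timesteps
        self.beta_start = beta_start
        self.beta_end = beta_end
        self.skip_prk_steps = skip_prk_steps


class ToyEulerScheduler:
    def __init__(self, num_train_timesteps=1000, beta_start=0.0001, beta_end=0.02):
        self.num_train_timesteps = num_train_timesteps
        self.beta_start = beta_start
        self.beta_end = beta_end


def from_config(scheduler_class, config):
    """Build a scheduler from the subset of `config` that its constructor accepts."""
    accepted = set(inspect.signature(scheduler_class.__init__).parameters) - {"self"}
    kwargs = {key: value for key, value in config.items() if key in accepted}
    return scheduler_class(**kwargs)


# The same configuration dictionary instantiates both classes.
pndm = from_config(ToyPNDMScheduler, scheduler_config)
euler = from_config(ToyEulerScheduler, scheduler_config)
print(pndm.skip_prk_steps, euler.beta_start)
```

Because no weights are involved, building a scheduler this way costs only a few small Python objects, which matches the note above that loading schedulers uses no significant memory.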

## DiffusionPipeline explained

As a class method, [`DiffusionPipeline.from_pretrained`] is responsible for two things:

- Download the latest version of the folder structure required for inference and cache it. If the latest folder structure is available in the local cache, [`DiffusionPipeline.from_pretrained`] reuses the cache and won't redownload the files.
- Load the cached weights into the correct pipeline [class](./api/pipelines/overview#diffusers-summary) - retrieved from the `model_index.json` file - and return an instance of it.

A pipeline's underlying folder structure corresponds directly to its class instance. For example, the [`StableDiffusionPipeline`] corresponds to the folder structure in [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5).

```python
from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
pipeline = DiffusionPipeline.from_pretrained(repo_id)
print(pipeline)
```

You'll see pipeline is an instance of [`StableDiffusionPipeline`], which consists of seven components:

- `"feature_extractor"`: a [`~transformers.CLIPImageProcessor`] from 🤗 Transformers.
- `"safety_checker"`: a [component](https://github.com/huggingface/diffusers/blob/e55687e1e15407f60f32242027b7bb8170e58266/src/diffusers/pipelines/stable_diffusion/safety_checker.py#L32) for screening against harmful content.
- `"scheduler"`: an instance of [`PNDMScheduler`].
- `"text_encoder"`: a [`~transformers.CLIPTextModel`] from 🤗 Transformers.
- `"tokenizer"`: a [`~transformers.CLIPTokenizer`] from 🤗 Transformers.
- `"unet"`: an instance of [`UNet2DConditionModel`].
- `"vae"`: an instance of [`AutoencoderKL`].

```json
StableDiffusionPipeline {
  "feature_extractor": [
    "transformers",
    "CLIPImageProcessor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
```

Compare the components of the pipeline instance to the [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) folder structure, and you'll see there is a separate folder for each of the components in the repository:

```
.
├── feature_extractor
│   └── preprocessor_config.json
├── model_index.json
├── safety_checker
│   ├── config.json
│   └── pytorch_model.bin
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
│   └── pytorch_model.bin
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── unet
│   ├── config.json
│   └── diffusion_pytorch_model.bin
└── vae
    ├── config.json
    └── diffusion_pytorch_model.bin
```
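
If you have a local copy of a checkpoint, you could check it against this layout by listing its subfolders and comparing them with the component names. The sketch below is only an illustration: a temporary directory with empty folders stands in for a real clone, and no weights are read.

```python
import tempfile
from pathlib import Path

# Component names as printed for the pipeline above.
component_names = [
    "feature_extractor",
    "safety_checker",
    "scheduler",
    "text_encoder",
    "tokenizer",
    "unet",
    "vae",
]

with tempfile.TemporaryDirectory() as tmp:
    checkpoint = Path(tmp)
    # Recreate only the top level of the layout shown above (empty folders, no weights).
    (checkpoint / "model_index.json").write_text("{}")
    for name in component_names:
        (checkpoint / name).mkdir()

    # Every directory at the top level should correspond to one component.
    subfolders = sorted(p.name for p in checkpoint.iterdir() if p.is_dir())
    missing = [name for name in component_names if name not in subfolders]

print(subfolders)
print(missing)
```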

You can access each of the components of the pipeline as an attribute to view its configuration:

```py
pipeline.tokenizer
CLIPTokenizer(
    name_or_path="/root/.cache/huggingface/hub/models--runwayml--stable-diffusion-v1-5/snapshots/39593d5650112b4cc580433f6b0435385882d819/tokenizer",
    vocab_size=49408,
    model_max_length=77,
    is_fast=False,
    padding_side="right",
    truncation_side="right",
    special_tokens={
        "bos_token": AddedToken("<|startoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "eos_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "unk_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "pad_token": "<|endoftext|>",
    },
)
```

Every pipeline expects a `model_index.json` file that tells the [`DiffusionPipeline`]:

- which pipeline class to load from `_class_name`
- which version of 🧨 Diffusers was used to create the model in `_diffusers_version`
- what components from which library are stored in the subfolders (`name` corresponds to the component and subfolder name, `library` corresponds to the name of the library to load the class from, and `class` corresponds to the class name)

```json
{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.6.0",
  "feature_extractor": [
    "transformers",
    "CLIPImageProcessor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
```
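
Reading this file by hand follows the three points listed above. The sketch below parses an abbreviated copy of the JSON (three of the seven components, to keep it short) and separates the metadata keys from the component entries. Treating every key that starts with an underscore as metadata is an assumption made for this illustration.

```python
import json

# Abbreviated copy of the model_index.json shown above.
model_index = json.loads(
    """
    {
      "_class_name": "StableDiffusionPipeline",
      "_diffusers_version": "0.6.0",
      "scheduler": ["diffusers", "PNDMScheduler"],
      "unet": ["diffusers", "UNet2DConditionModel"],
      "vae": ["diffusers", "AutoencoderKL"]
    }
    """
)

# The pipeline class to instantiate.
pipeline_class = model_index["_class_name"]

# Keys that start with an underscore are treated as metadata; every other key names a component.
components = {
    name: tuple(value) for name, value in model_index.items() if not name.startswith("_")
}

for name, (library, class_name) in components.items():
    print(f"{name}: load {class_name} from {library} (subfolder ./{name})")
```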