<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Pipeline callbacks

A callback is a function, executed at the end of each denoising step, that modifies [`DiffusionPipeline`] behavior. Changes made by a callback are propagated to subsequent steps in the denoising process. Callbacks are useful for adjusting pipeline attributes or tensor variables to support new features without rewriting the underlying pipeline code.

Diffusers provides several callbacks in the pipeline [overview](../api/pipelines/overview#callbacks).

To enable a callback, configure the point during denoising at which it activates with one of the following arguments.
- `cutoff_step_ratio` specifies when a callback activates as a fraction of the total denoising steps.
- `cutoff_step_index` specifies the exact step number at which a callback activates.
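To make the relationship between the two arguments concrete, here is a minimal sketch of how a ratio can be resolved to a step index. The `resolve_cutoff_step` helper is illustrative, not a Diffusers API; the bundled callbacks implement similar logic internally.

```py
def resolve_cutoff_step(num_inference_steps, cutoff_step_ratio=None, cutoff_step_index=None):
    """Resolve the step at which a cutoff callback activates."""
    # An explicit step index takes precedence over a ratio.
    if cutoff_step_index is not None:
        return cutoff_step_index
    return int(num_inference_steps * cutoff_step_ratio)

print(resolve_cutoff_step(25, cutoff_step_ratio=0.4))  # 10: 40% of 25 steps
print(resolve_cutoff_step(25, cutoff_step_index=10))   # 10: explicit index
```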
The example below uses `cutoff_step_ratio=0.4`, which means the callback is activated once denoising reaches 40% of the total inference steps. [`~callbacks.SDXLCFGCutoffCallback`] disables classifier-free guidance (CFG) after a certain number of steps, which can help save compute without significantly affecting performance.
Define a callback with either of the `cutoff` arguments and pass it to the `callback_on_step_end` parameter in the pipeline.

```py
import torch
from diffusers import DPMSolverMultistepScheduler, StableDiffusionXLPipeline
from diffusers.callbacks import SDXLCFGCutoffCallback

callback = SDXLCFGCutoffCallback(cutoff_step_ratio=0.4)
# if using cutoff_step_index
# callback = SDXLCFGCutoffCallback(cutoff_step_ratio=None, cutoff_step_index=10)

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    device_map="cuda"
)
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, use_karras_sigmas=True)

prompt = "a sports car at the road, best quality, high quality, high detail, 8k resolution"
# seed chosen for reproducibility
generator = torch.Generator(device="cuda").manual_seed(2628670641)

output = pipeline(
    prompt=prompt,
    negative_prompt="",
    guidance_scale=6.5,
    num_inference_steps=25,
    generator=generator,
    callback_on_step_end=callback,
)
```

If you want to add a new official callback, feel free to open a [feature request](https://github.com/huggingface/diffusers/issues/new/choose) or [submit a PR](https://huggingface.co/docs/diffusers/main/en/conceptual/contribution#how-to-open-a-pr). Otherwise, you can also create your own callback as shown below.
## Early stopping
Early stopping is useful if you aren't happy with the intermediate results during generation. The callback below defines a hardcoded stop step; once the pipeline reaches it, the callback terminates generation by setting the pipeline's `_interrupt` attribute to `True`.

```py
from diffusers import StableDiffusionPipeline

def interrupt_callback(pipeline, i, t, callback_kwargs):
    stop_idx = 10
    if i == stop_idx:
        pipeline._interrupt = True

    return callback_kwargs

pipeline = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5"
)
num_inference_steps = 50

pipeline(
    "A photo of a cat",
    num_inference_steps=num_inference_steps,
    callback_on_step_end=interrupt_callback,
)
```
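The stop condition does not have to be a fixed step index. As a sketch of the same `_interrupt` mechanism with a different trigger (the `make_timeout_callback` helper below is hypothetical, not part of Diffusers), generation can be interrupted once a wall-clock budget is exhausted:

```py
import time

def make_timeout_callback(max_seconds):
    """Build a callback that interrupts the pipeline after a wall-clock budget."""
    start = time.monotonic()

    def timeout_callback(pipeline, i, t, callback_kwargs):
        # Once the budget is exceeded, ask the pipeline to stop denoising.
        if time.monotonic() - start > max_seconds:
            pipeline._interrupt = True
        return callback_kwargs

    return timeout_callback

# Usage: pipeline(..., callback_on_step_end=make_timeout_callback(30.0))
```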
## Display intermediate images
Visualizing the intermediate images is useful for progress monitoring and assessing the quality of the generated content. This callback decodes the latent tensors at each step and converts them to images.
[Convert](https://huggingface.co/blog/TimothyAlexisVass/explaining-the-sdxl-latent-space) the Stable Diffusion XL latents from 4 channels to 3-channel RGB tensors.

```py
import torch
from PIL import Image

def latents_to_rgb(latents):
    weights = (
        (60, -60, 25, -70),
        (60,  -5, 15, -50),
        (60,  10, -5, -35),
    )

    weights_tensor = torch.t(torch.tensor(weights, dtype=latents.dtype).to(latents.device))
    biases_tensor = torch.tensor((150, 140, 130), dtype=latents.dtype).to(latents.device)
    rgb_tensor = torch.einsum("...lxy,lr -> ...rxy", latents, weights_tensor) + biases_tensor.unsqueeze(-1).unsqueeze(-1)
    image_array = rgb_tensor.clamp(0, 255).byte().cpu().numpy().transpose(1, 2, 0)

    return Image.fromarray(image_array)
```
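The `einsum` above applies the same 3×4 affine map at every spatial location. A dependency-free sketch of that arithmetic for a single latent pixel (the `project_pixel` helper is purely illustrative):

```py
WEIGHTS = ((60, -60, 25, -70), (60, -5, 15, -50), (60, 10, -5, -35))
BIASES = (150, 140, 130)

def project_pixel(latent_pixel, weights=WEIGHTS, biases=BIASES):
    """Map one 4-value latent pixel to one RGB pixel, clamped to [0, 255]."""
    rgb = []
    for row, bias in zip(weights, biases):
        # Dot product of one weight row with the latent channels, plus bias.
        value = sum(w * c for w, c in zip(row, latent_pixel)) + bias
        rgb.append(max(0, min(255, int(value))))
    return rgb

print(project_pixel((0.5, -0.2, 0.1, 0.3)))  # [173, 157, 147]
```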

Extract the latents and convert the first image in the batch to RGB. Save the image as a PNG file with the step number.

```py
def decode_tensors(pipe, step, timestep, callback_kwargs):
    latents = callback_kwargs["latents"]

    image = latents_to_rgb(latents[0])
    image.save(f"{step}.png")

    return callback_kwargs
```

Use the `callback_on_step_end_tensor_inputs` parameter to specify which tensor inputs the callback receives, which in this case are the latents.

```py
import torch
from PIL import Image
from diffusers import AutoPipelineForText2Image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    device_map="cuda"
)

image = pipeline(
    prompt="A croissant shaped like a cute bear.",
    negative_prompt="Deformed, ugly, bad anatomy",
    callback_on_step_end=decode_tensors,
    callback_on_step_end_tensor_inputs=["latents"],
).images[0]
```
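Callbacks can also modify the tensors they receive: values returned in `callback_kwargs` replace the pipeline's copies for the next denoising step. A minimal sketch (the rescaling factor here is arbitrary and purely illustrative):

```py
def scale_latents(pipe, step, timestep, callback_kwargs):
    # Tensors listed in callback_on_step_end_tensor_inputs arrive in
    # callback_kwargs; returning a modified value feeds it back into the
    # next denoising step.
    callback_kwargs["latents"] = callback_kwargs["latents"] * 0.99
    return callback_kwargs

# Usage: pipeline(..., callback_on_step_end=scale_latents,
#                 callback_on_step_end_tensor_inputs=["latents"])
```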