<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Flux

Flux is a series of text-to-image generation models based on diffusion transformers. To learn more about Flux, check out the original [blog post](https://blackforestlabs.ai/announcing-black-forest-labs/) by its creators, Black Forest Labs.

Original model checkpoints for Flux can be found [here](https://huggingface.co/black-forest-labs). Original inference code can be found [here](https://github.com/black-forest-labs/flux).

<Tip>

Flux can be quite expensive to run on consumer hardware. However, you can perform a suite of optimizations to run it faster and in a more memory-friendly manner. Check out [this section](https://huggingface.co/blog/sd3#memory-optimizations-for-sd3) for more details. Additionally, Flux can benefit from quantization for memory efficiency, with a trade-off in inference latency. Refer to [this blog post](https://huggingface.co/blog/quanto-diffusers) to learn more; a minimal quantization sketch follows this tip.

</Tip>
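
As a concrete illustration of the quantization route mentioned in the tip above, you can quantize the Flux transformer before running inference. The following is a minimal sketch, assuming `optimum-quanto` is installed; see the linked blog post for the full recipe and the latency trade-offs.

```python
import torch
from diffusers import FluxPipeline
from optimum.quanto import freeze, qfloat8, quantize

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)

# Quantize the transformer (the largest component of the pipeline) to FP8
# weights, then freeze it so the quantized weights are used during inference.
quantize(pipe.transformer, weights=qfloat8)
freeze(pipe.transformer)

pipe.enable_model_cpu_offload()
```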

Flux comes in two variants:

* Timestep-distilled (`black-forest-labs/FLUX.1-schnell`)
* Guidance-distilled (`black-forest-labs/FLUX.1-dev`)

Both checkpoints have slightly different usage, which we detail below.

### Timestep-distilled

* `max_sequence_length` cannot be more than 256.
* `guidance_scale` needs to be 0.
* As this is a timestep-distilled model, it benefits from fewer sampling steps.

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

prompt = "A cat holding a sign that says hello world"
out = pipe(
    prompt=prompt,
    guidance_scale=0.,
    height=768,
    width=1360,
    num_inference_steps=4,
    max_sequence_length=256,
).images[0]
out.save("image.png")
```

### Guidance-distilled

* The guidance-distilled variant takes about 50 sampling steps for good-quality generation.
* It doesn't have any limitations around the `max_sequence_length`.

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

prompt = "a tiny astronaut hatching from an egg on the moon"
out = pipe(
    prompt=prompt,
    guidance_scale=3.5,
    height=768,
    width=1360,
    num_inference_steps=50,
).images[0]
out.save("image.png")
```

## FluxPipeline

[[autodoc]] FluxPipeline
	- all
	- __call__