<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Community pipelines

[[open-in-colab]]

<Tip>

For more context about the design choices behind community pipelines, please have a look at [this issue](https://github.com/huggingface/diffusers/issues/841).

</Tip>

Community pipelines allow you to get creative and build your own unique pipelines to share with the community. You can find all community pipelines in the [diffusers/examples/community](https://github.com/huggingface/diffusers/tree/main/examples/community) folder along with inference and training examples for how to use them. This guide showcases some of the community pipelines and hopefully it'll inspire you to create your own (feel free to open a PR with your own pipeline and we will merge it!).

To load a community pipeline, use the `custom_pipeline` argument in [`DiffusionPipeline`] to specify one of the files in [diffusers/examples/community](https://github.com/huggingface/diffusers/tree/main/examples/community):

```py
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", custom_pipeline="filename_in_the_community_folder", use_safetensors=True
)
```

If a community pipeline doesn't work as expected, please open a GitHub issue and mention the author.

To learn more about community pipelines, see the guides on how to [load community pipelines](custom_pipeline_overview) and how to [contribute a community pipeline](contribute_pipeline).

## Multilingual Stable Diffusion

The multilingual Stable Diffusion pipeline uses a pretrained [XLM-RoBERTa](https://huggingface.co/papluca/xlm-roberta-base-language-detection) model to identify the language of a prompt and the [mBART-large-50](https://huggingface.co/facebook/mbart-large-50-many-to-one-mmt) model to translate it into English. This allows you to generate images from text in 20 languages.

```py
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import make_image_grid
from transformers import (
    pipeline,
    MBart50TokenizerFast,
    MBartForConditionalGeneration,
)

device = "cuda" if torch.cuda.is_available() else "cpu"
# transformers pipelines take a device index: 0 for the first GPU, -1 for CPU
device_dict = {"cuda": 0, "cpu": -1}

# add language detection pipeline
language_detection_model_ckpt = "papluca/xlm-roberta-base-language-detection"
language_detection_pipeline = pipeline("text-classification",
                                       model=language_detection_model_ckpt,
                                       device=device_dict[device])

# add model for language translation
translation_tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-one-mmt")
translation_model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-one-mmt").to(device)

# pass the detection and translation components to the community pipeline
diffuser_pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="multilingual_stable_diffusion",
    detection_pipeline=language_detection_pipeline,
    translation_model=translation_model,
    translation_tokenizer=translation_tokenizer,
    torch_dtype=torch.float16,
)

diffuser_pipeline.enable_attention_slicing()
diffuser_pipeline = diffuser_pipeline.to(device)

# prompts in English, Spanish, German, and French
prompt = ["a photograph of an astronaut riding a horse",
          "Una casa en la playa",
          "Ein Hund, der Orange isst",
          "Un restaurant parisien"]

images = diffuser_pipeline(prompt).images
make_image_grid(images, rows=2, cols=2)
```

<div class="flex justify-center">
    <img src="https://user-images.githubusercontent.com/4313860/198328706-295824a4-9856-4ce5-8e66-278ceb42fd29.png"/>
</div>
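
The detect-then-translate flow described in this section can be sketched without loading any models. The snippet below is only an illustration of the idea, not the community pipeline's actual code: `to_english`, `toy_detect`, and `toy_translate` are hypothetical names, and the toy lookups stand in for the XLM-RoBERTa detector and the mBART translator. It assumes that prompts detected as English are passed through unchanged and that all other prompts are translated before reaching Stable Diffusion.

```python
from typing import Callable, List


def to_english(
    prompts: List[str],
    detect: Callable[[str], str],
    translate: Callable[[str, str], str],
) -> List[str]:
    """Return prompts with every non-English entry translated to English."""
    english_prompts = []
    for text in prompts:
        language = detect(text)
        if language == "en":
            # Assumption: English prompts need no translation step.
            english_prompts.append(text)
        else:
            english_prompts.append(translate(text, language))
    return english_prompts


# Hypothetical stand-ins for the language detection and translation models.
TOY_LANGUAGES = {"Una casa en la playa": "es"}
TOY_TRANSLATIONS = {"Una casa en la playa": "A house on the beach"}


def toy_detect(text: str) -> str:
    # Anything not in the toy table is treated as English.
    return TOY_LANGUAGES.get(text, "en")


def toy_translate(text: str, language: str) -> str:
    return TOY_TRANSLATIONS[text]


english_prompts = to_english(
    ["a photograph of an astronaut riding a horse", "Una casa en la playa"],
    toy_detect,
    toy_translate,
)
print(english_prompts)
```

In the real pipeline, the two callables would wrap the `detection_pipeline`, `translation_model`, and `translation_tokenizer` components passed to `from_pretrained`.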

## MagicMix

[MagicMix](https://huggingface.co/papers/2210.16056) is a pipeline that can mix an image and text prompt to generate a new image that preserves the image structure. The `mix_factor` determines how much influence the prompt has on the layout generation, `kmin` controls the number of steps during the content generation process, and `kmax` determines how much information is kept in the layout of the original image.

```py
from diffusers import DiffusionPipeline, DDIMScheduler
from diffusers.utils import load_image, make_image_grid

pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="magic_mix",
    scheduler=DDIMScheduler.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="scheduler"),
).to('cuda')

img = load_image("https://user-images.githubusercontent.com/59410571/209578593-141467c7-d831-4792-8b9a-b17dc5e47816.jpg")
mix_img = pipeline(img, prompt="bed", kmin=0.3, kmax=0.5, mix_factor=0.5)
make_image_grid([img, mix_img], rows=1, cols=2)
```

<div class="flex gap-4">
  <div>
    <img class="rounded-xl" src="https://user-images.githubusercontent.com/59410571/209578593-141467c7-d831-4792-8b9a-b17dc5e47816.jpg" />
    <figcaption class="mt-2 text-center text-sm text-gray-500">original image</figcaption>
  </div>
  <div>
    <img class="rounded-xl" src="https://user-images.githubusercontent.com/59410571/209578602-70f323fa-05b7-4dd6-b055-e40683e37914.jpg" />
    <figcaption class="mt-2 text-center text-sm text-gray-500">image and text prompt mix</figcaption>
  </div>
</div>
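
To build some intuition for how `kmin`, `kmax`, and `mix_factor` interact, here is a rough sketch in plain Python. It is an illustration under stated assumptions, not the pipeline's implementation: it assumes `kmin` and `kmax` are fractions of the total number of denoising steps, that the layout phase covers the steps between the two resulting boundaries, and that mixing is a linear interpolation between prompt-driven latents and the noised original image. The helpers `split_phases` and `mix` are hypothetical names, and the exact boundary handling is a guess.

```python
from typing import List, Tuple


def split_phases(steps: int, kmin: float, kmax: float) -> Tuple[List[int], List[int]]:
    """Split step indices into (layout_steps, content_steps).

    Assumption: kmin and kmax are fractions of `steps`, so the last
    int(kmin * steps) steps are content generation and the steps between
    the kmax and kmin boundaries are layout generation.
    """
    tmin = steps - int(kmin * steps)
    tmax = steps - int(kmax * steps)
    layout_steps = [i for i in range(steps) if tmax < i < tmin]
    content_steps = [i for i in range(steps) if i >= tmin]
    return layout_steps, content_steps


def mix(prompt_latents: List[float], image_latents: List[float], mix_factor: float) -> List[float]:
    """Linearly interpolate between prompt-driven and image-driven values.

    Assumption: a larger mix_factor gives the prompt more influence.
    """
    return [
        mix_factor * p + (1 - mix_factor) * q
        for p, q in zip(prompt_latents, image_latents)
    ]


layout_steps, content_steps = split_phases(steps=50, kmin=0.3, kmax=0.5)
print(len(layout_steps), len(content_steps))

mixed = mix([1.0, 0.0], [0.0, 1.0], mix_factor=0.5)
print(mixed)
```

Under these assumptions, raising `kmin` lengthens the content generation phase, and raising `mix_factor` shifts the layout phase toward the prompt and away from the original image.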