<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Community pipelines

[[open-in-colab]]

<Tip>

For more context about the design choices behind community pipelines, please have a look at [this issue](https://github.com/huggingface/diffusers/issues/841).

</Tip>

Community pipelines allow you to get creative and build your own unique pipelines to share with the community. You can find all community pipelines in the [diffusers/examples/community](https://github.com/huggingface/diffusers/tree/main/examples/community) folder along with inference and training examples for how to use them. This guide showcases some of the community pipelines and hopefully it'll inspire you to create your own (feel free to open a PR with your own pipeline and we will merge it!).

To load a community pipeline, use the `custom_pipeline` argument in [`DiffusionPipeline`] to specify one of the files in [diffusers/examples/community](https://github.com/huggingface/diffusers/tree/main/examples/community):

```py
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", custom_pipeline="filename_in_the_community_folder", use_safetensors=True
)
```

If a community pipeline doesn't work as expected, please open a GitHub issue and mention the author.

To learn more about community pipelines, see the guides on how to [load community pipelines](custom_pipeline_overview) and how to [contribute a community pipeline](contribute_pipeline).

## Multilingual Stable Diffusion

The multilingual Stable Diffusion pipeline uses a pretrained [XLM-RoBERTa](https://huggingface.co/papluca/xlm-roberta-base-language-detection) model to detect the language of each prompt and the [mBART-large-50](https://huggingface.co/facebook/mbart-large-50-many-to-one-mmt) model to translate it into English. This allows you to generate images from text in 20 languages.

```py
from PIL import Image

import torch
from diffusers import DiffusionPipeline
from diffusers.utils import make_image_grid
from transformers import (
    pipeline,
    MBart50TokenizerFast,
    MBartForConditionalGeneration,
)

device = "cuda" if torch.cuda.is_available() else "cpu"
device_dict = {"cuda": 0, "cpu": -1}

# add language detection pipeline
language_detection_model_ckpt = "papluca/xlm-roberta-base-language-detection"
language_detection_pipeline = pipeline("text-classification",
                                       model=language_detection_model_ckpt,
                                       device=device_dict[device])

# add model for language translation
trans_tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-one-mmt")
trans_model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-one-mmt").to(device)

diffuser_pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="multilingual_stable_diffusion",
    detection_pipeline=language_detection_pipeline,
    translation_model=trans_model,
    translation_tokenizer=trans_tokenizer,
    torch_dtype=torch.float16,
)

diffuser_pipeline.enable_attention_slicing()
diffuser_pipeline = diffuser_pipeline.to(device)

prompt = ["a photograph of an astronaut riding a horse",
          "Una casa en la playa",
          "Ein Hund, der Orange isst",
          "Un restaurant parisien"]

images = diffuser_pipeline(prompt).images
grid = make_image_grid(images, rows=2, cols=2)
grid
```

<div class="flex justify-center">
    <img src="https://user-images.githubusercontent.com/4313860/198328706-295824a4-9856-4ce5-8e66-278ceb42fd29.png"/>
</div>
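
The detect-then-translate step described above can be pictured with a small routing function. This is only a sketch of the idea, not the community pipeline's actual code: `to_english`, `toy_detect`, and `toy_translate` are hypothetical names, and the toy stand-ins take the place of the XLM-RoBERTa detector and the mBART translator so the example runs without downloading any models.

```py
from typing import Callable, List


def to_english(
    prompts: List[str],
    detect: Callable[[str], str],
    translate: Callable[[str, str], str],
    target: str = "en",
) -> List[str]:
    """Translate every prompt that is not already in `target`; pass the rest through."""
    english = []
    for text in prompts:
        lang = detect(text)
        english.append(text if lang == target else translate(text, lang))
    return english


# Toy stand-ins (assumptions for illustration only).
def toy_detect(text: str) -> str:
    return "es" if text.startswith("Una") else "en"


def toy_translate(text: str, source_lang: str) -> str:
    lookup = {("es", "Una casa en la playa"): "A house on the beach"}
    return lookup[(source_lang, text)]


english_prompts = to_english(
    ["a photograph of an astronaut riding a horse", "Una casa en la playa"],
    toy_detect,
    toy_translate,
)
print(english_prompts)
```

Passing the detector and translator in as callables mirrors how the pipeline above receives `detection_pipeline`, `translation_model`, and `translation_tokenizer` as arguments, so either component could be swapped for a different model.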

## MagicMix

[MagicMix](https://huggingface.co/papers/2210.16056) is a pipeline that can mix an image and text prompt to generate a new image that preserves the image structure. The `mix_factor` determines how much influence the prompt has on the layout generation, `kmin` controls the number of steps during the content generation process, and `kmax` determines how much information is kept in the layout of the original image.

```py
from diffusers import DiffusionPipeline, DDIMScheduler
from diffusers.utils import load_image

pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="magic_mix",
    scheduler=DDIMScheduler.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="scheduler"),
).to("cuda")

img = load_image("https://user-images.githubusercontent.com/59410571/209578593-141467c7-d831-4792-8b9a-b17dc5e47816.jpg")
mix_img = pipeline(img, prompt="bed", kmin=0.3, kmax=0.5, mix_factor=0.5)
mix_img
```

<div class="flex gap-4">
  <div>
    <img class="rounded-xl" src="https://user-images.githubusercontent.com/59410571/209578593-141467c7-d831-4792-8b9a-b17dc5e47816.jpg" />
    <figcaption class="mt-2 text-center text-sm text-gray-500">image prompt</figcaption>
  </div>
  <div>
    <img class="rounded-xl" src="https://user-images.githubusercontent.com/59410571/209578602-70f323fa-05b7-4dd6-b055-e40683e37914.jpg" />
    <figcaption class="mt-2 text-center text-sm text-gray-500">image and text prompt mix</figcaption>
  </div>
</div>
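
To make the roles of `kmin`, `kmax`, and `mix_factor` more concrete, the sketch below shows one way to read them, based on the description in the MagicMix paper. It is an illustration under stated assumptions rather than the community pipeline's implementation: `phase_boundaries` and `mix_latents` are hypothetical helpers, the latents are plain Python lists instead of tensors, and the step arithmetic assumes `kmin` and `kmax` are fractions of the total number of denoising steps.

```py
from typing import List, Tuple


def phase_boundaries(num_steps: int, kmin: float, kmax: float) -> Tuple[int, int]:
    """Return (layout_start, content_start) as step indices.

    Assumption: layout generation runs from `layout_start` up to `content_start`,
    and content generation runs from `content_start` to the final step.
    """
    if not 0.0 <= kmin <= kmax <= 1.0:
        raise ValueError("expected 0 <= kmin <= kmax <= 1")
    layout_start = num_steps - int(kmax * num_steps)
    content_start = num_steps - int(kmin * num_steps)
    return layout_start, content_start


def mix_latents(denoised: List[float], layout: List[float], mix_factor: float) -> List[float]:
    """Blend the prompt-conditioned latents with the noised layout latents.

    A larger `mix_factor` gives the text prompt more influence on the layout.
    """
    return [mix_factor * d + (1.0 - mix_factor) * l for d, l in zip(denoised, layout)]


layout_start, content_start = phase_boundaries(num_steps=50, kmin=0.3, kmax=0.5)
print(layout_start, content_start)
print(mix_latents([1.0, 2.0], [3.0, 4.0], mix_factor=0.5))
```

Under these assumptions, a smaller `kmax` starts from a less noisy copy of the input image, so more of its layout survives, while a larger `kmin` leaves more steps for the prompt-driven content generation phase.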