<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Textual Inversion

[Textual Inversion](https://huggingface.co/papers/2208.01618) is a method for generating personalized images of a concept. It works by fine-tuning a model's word embeddings on 3-5 images of the concept (for example, pixel art), which is associated with a unique token (`<sks>`). You can then include the `<sks>` token in your prompt to trigger the model to generate pixel art images.
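
The idea can be illustrated with a toy sketch. It is not how Diffusers or the paper implement training: the squared-error objective, the two-dimensional vectors, and the helper names `add_placeholder_token` and `train_step` are assumptions made for illustration (real Textual Inversion optimizes the diffusion model's denoising loss). It only shows the key property: one new embedding row is added for the placeholder token and updated, while every existing row stays frozen.

```py
def add_placeholder_token(vocab, embeddings, token, init_from):
    """Register `token` and give it a new embedding row copied from `init_from`."""
    if token in vocab:
        raise ValueError(f"{token!r} is already in the vocabulary")
    vocab[token] = len(embeddings)
    embeddings.append(list(embeddings[vocab[init_from]]))
    return vocab[token]


def train_step(embeddings, trainable_index, target, lr=0.1):
    """Take one gradient step on a squared-error loss for a single row.

    Only the row at `trainable_index` is modified; all other rows are frozen.
    """
    row = embeddings[trainable_index]
    for i, value in enumerate(row):
        row[i] = value - lr * 2 * (value - target[i])


# Toy vocabulary with 2-dimensional embeddings (hypothetical values).
vocab = {"a": 0, "painting": 1}
embeddings = [[0.0, 0.0], [1.0, 1.0]]

# Add the placeholder token, initialised from a related word.
sks_index = add_placeholder_token(vocab, embeddings, "<sks>", init_from="painting")

# Hypothetical stand-in for what the 3-5 concept images should map to.
concept_target = [3.0, -1.0]
for _ in range(50):
    train_step(embeddings, sks_index, concept_target)

print(embeddings[:2])         # existing rows are unchanged
print(embeddings[sks_index])  # the new row has moved close to the target
```

Because only the new row changes, the result of training can be shipped as that row alone, which is why the base model has to be loaded separately.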

Textual Inversion weights are lightweight, typically only a few KB, because they contain only word embeddings. They are not a complete model on their own, so load a model with [`~DiffusionPipeline.from_pretrained`] first and then load the word embeddings into it.

```py
import torch
from diffusers import AutoPipelineForText2Image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")
```
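
As a rough check on the "few KB" figure, the arithmetic can be worked out directly. The numbers below are assumptions about this checkpoint's text encoder rather than values reported by Diffusers: a 768-dimensional token embedding and a vocabulary of 49,408 tokens, stored as 32-bit floats. `embedding_size_bytes` is a hypothetical helper.

```py
def embedding_size_bytes(num_vectors, hidden_size, bytes_per_value=4):
    """Return the raw storage size of `num_vectors` embedding vectors."""
    return num_vectors * hidden_size * bytes_per_value


HIDDEN_SIZE = 768   # assumed embedding width of the text encoder
VOCAB_SIZE = 49408  # assumed size of the tokenizer vocabulary

one_token = embedding_size_bytes(1, HIDDEN_SIZE)
full_table = embedding_size_bytes(VOCAB_SIZE, HIDDEN_SIZE)

print(f"one learned token: {one_token / 1024:.0f} KiB")
print(f"full embedding table: {full_table / 1024**2:.2f} MiB")
```

Under these assumptions a single learned vector is about 3 KiB of raw data, compared with roughly 145 MiB for the text encoder's full embedding table.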

Load the word embeddings with [`~loaders.TextualInversionLoaderMixin.load_textual_inversion`] and include the unique token in the prompt to activate the learned concept.

```py
pipeline.load_textual_inversion("sd-concepts-library/gta5-artwork")
prompt = "A cute brown bear eating a slice of pizza, stunning color scheme, masterpiece, illustration, <gta5-artwork> style"
pipeline(prompt).images[0]
```

<div class="flex justify-center">
    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_txt_embed.png" />
</div>
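
If you use several concepts, a small wrapper can keep each repository id together with its token. This is a sketch, not part of Diffusers: `load_concepts` and `style_prompt` are hypothetical helpers, and the only library call is [`~loaders.TextualInversionLoaderMixin.load_textual_inversion`] with its `token` argument, which sets the token the embedding is registered under.

```py
def load_concepts(pipeline, concepts):
    """Load each (repo_id, token) pair into `pipeline` and return the tokens.

    `pipeline` is any object that exposes `load_textual_inversion`, such as
    the pipeline created earlier in this guide.
    """
    tokens = []
    for repo_id, token in concepts:
        pipeline.load_textual_inversion(repo_id, token=token)
        tokens.append(token)
    return tokens


def style_prompt(subject, tokens):
    """Append every concept token to `subject` so each embedding is triggered."""
    return ", ".join([subject] + [f"{token} style" for token in tokens])
```

With the pipeline from above, `load_concepts(pipeline, [("sd-concepts-library/gta5-artwork", "<gta5-artwork>")])` returns `["<gta5-artwork>"]`, and passing that list to `style_prompt` with a subject produces a prompt ending in `<gta5-artwork> style`.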

Textual Inversion can also be trained to learn *negative embeddings*, which steer generation away from unwanted characteristics such as "blurry" or "ugly". Negative embeddings are useful for improving image quality.

EasyNegative is a widely used negative embedding that contains multiple learned negative concepts. Load the negative embedding by specifying its file name (`weight_name`) and the token to associate with it (`token`). Then pass the token to `negative_prompt` in your pipeline to activate it.

```py
import torch
from diffusers import AutoPipelineForText2Image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")
pipeline.load_textual_inversion(
    "EvilEngine/easynegative",
    weight_name="easynegative.safetensors",
    token="easynegative"
)
prompt = "A cute brown bear eating a slice of pizza, stunning color scheme, masterpiece, illustration"
negative_prompt = "easynegative"
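# Pass negative_prompt by keyword; the second positional argument of the pipeline call is not the negative prompt.
pipeline(prompt, negative_prompt=negative_prompt).images[0]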
```

<div class="flex justify-center">
    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png" />
</div>
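
An embedding file can hold more than one vector under a single token name. Diffusers appears to handle this by registering extra tokens with numeric suffixes and expanding the token inside the prompt. Treat that description, the helper `expand_multi_vector_token`, and the vector count of 3 used below as illustrative assumptions rather than the library's actual code or the real size of EasyNegative.

```py
def expand_multi_vector_token(prompt, token, num_vectors):
    """Replace `token` with `token token_1 ... token_{n-1}` for n vectors."""
    if num_vectors < 1:
        raise ValueError("num_vectors must be at least 1")
    parts = [token] + [f"{token}_{i}" for i in range(1, num_vectors)]
    return prompt.replace(token, " ".join(parts))


# Hypothetical case: a negative embedding backed by three vectors.
expanded = expand_multi_vector_token("easynegative, blurry", "easynegative", 3)
print(expanded)
```

In practice you only write the base token in `negative_prompt`, as in the example above, and leave any expansion to the pipeline.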