"git@developer.sourcefind.cn:renzhc/diffusers_dcu.git" did not exist on "00d8d46e23b8b741a451db5e711969c46af127fd"
Commit a4b233e5, authored by Patrick von Platen, committed by GitHub

Finish docs textual inversion (#3068)



* Finish docs textual inversion

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
parent 524535b5
@@ -157,24 +157,61 @@ If you're interested in following along with your model training progress, you c
 ## Inference
-Once you have trained a model, you can use it for inference with the [`StableDiffusionPipeline`]. Make sure you include the `placeholder_token` in your prompt, in this case, it is `<cat-toy>`.
+Once you have trained a model, you can use it for inference with the [`StableDiffusionPipeline`].
+The textual inversion script will by default only save the textual inversion embedding vector(s) that have
+been added to the text encoder embedding matrix and consequently been trained.
 <frameworkcontent>
 <pt>
+<Tip>
+💡 The community has created a large library of different textual inversion embedding vectors, called [sd-concepts-library](https://huggingface.co/sd-concepts-library).
+Instead of training textual inversion embeddings from scratch you can also see whether a fitting textual inversion embedding has already been added to the library.
+</Tip>
+To load the textual inversion embeddings you first need to load the base model that was used when training
+your textual inversion embedding vectors. Here we assume that [`runwayml/stable-diffusion-v1-5`](runwayml/stable-diffusion-v1-5)
+was used as a base model so we load it first:
 ```python
 from diffusers import StableDiffusionPipeline
+import torch
-model_id = "path-to-your-trained-model"
+model_id = "runwayml/stable-diffusion-v1-5"
 pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
+```
-prompt = "A <cat-toy> backpack"
+Next, we need to load the textual inversion embedding vector which can be done via the [`TextualInversionLoaderMixin.load_textual_inversion`]
+function. Here we'll load the embeddings of the "<cat-toy>" example from before.
+```python
+pipe.load_textual_inversion("sd-concepts-library/cat-toy")
+```
-image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
+Now we can run the pipeline making sure that the placeholder token `<cat-toy>` is used in our prompt.
+```python
+prompt = "A <cat-toy> backpack"
+image = pipe(prompt, num_inference_steps=50).images[0]
 image.save("cat-backpack.png")
 ```
+The function [`TextualInversionLoaderMixin.load_textual_inversion`] can not only
+load textual embedding vectors saved in Diffusers' format, but also embedding vectors
+saved in [Automatic1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui) format.
+To do so, you can first download an embedding vector from [civitAI](https://civitai.com/models/3036?modelVersionId=8387)
+and then load it locally:
+```python
+pipe.load_textual_inversion("./charturnerv2.pt")
+```
 </pt>
 <jax>
+Currently there is no `load_textual_inversion` function for Flax so one has to make sure the textual inversion
+embedding vector is saved as part of the model after training.
+The model can then be run just like any other Flax model:
 ```python
 import jax
 import numpy as np
...
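The revised docs in the hunk above stress that the placeholder token (here `<cat-toy>`) has to appear in the prompt for the learned embedding to have any effect. A minimal sketch of a guard that could run before calling the pipeline is given below. `has_placeholder` is a hypothetical helper introduced only for illustration; it is not part of diffusers and is not added by this commit.

```python
def has_placeholder(prompt: str, placeholder_token: str) -> bool:
    """Return True if the learned placeholder token occurs in the prompt.

    Hypothetical helper: a plain substring check, assuming the token is
    written in the prompt exactly as it was registered (e.g. "<cat-toy>").
    """
    return placeholder_token in prompt


prompt = "A <cat-toy> backpack"
if not has_placeholder(prompt, "<cat-toy>"):
    raise ValueError("Prompt does not mention the placeholder token <cat-toy>.")
```

A check like this only catches a missing token; it does not verify that the embedding was actually loaded into the pipeline.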
@@ -368,7 +368,7 @@ class TextualInversionLoaderMixin:
 ):
 r"""
 Load textual inversion embeddings into the text encoder of stable diffusion pipelines. Both `diffusers` and
-`Automatic1111` formats are supported.
+`Automatic1111` formats are supported (see example below).
 <Tip warning={true}>
@@ -427,6 +427,42 @@ class TextualInversionLoaderMixin:
 models](https://huggingface.co/docs/hub/models-gated#gated-models).
 </Tip>
+Example:
+To load a textual inversion embedding vector in `diffusers` format:
+```py
+from diffusers import StableDiffusionPipeline
+import torch
+model_id = "runwayml/stable-diffusion-v1-5"
+pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
+pipe.load_textual_inversion("sd-concepts-library/cat-toy")
+prompt = "A <cat-toy> backpack"
+image = pipe(prompt, num_inference_steps=50).images[0]
+image.save("cat-backpack.png")
+```
+To load a textual inversion embedding vector in Automatic1111 format, make sure to first download the vector,
+e.g. from [civitAI](https://civitai.com/models/3036?modelVersionId=9857) and then load the vector locally:
+```py
+from diffusers import StableDiffusionPipeline
+import torch
+model_id = "runwayml/stable-diffusion-v1-5"
+pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
+pipe.load_textual_inversion("./charturnerv2.pt")
+prompt = "charturnerv2, multiple views of the same character in the same outfit, a character turnaround of a woman wearing a black jacket and red shirt, best quality, intricate details."
+image = pipe(prompt, num_inference_steps=50).images[0]
+image.save("character.png")
+```
 """
 if not hasattr(self, "tokenizer") or not isinstance(self.tokenizer, PreTrainedTokenizer):
 raise ValueError(
...
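The docstring in the hunk above states that both `diffusers` and Automatic1111 embedding files are supported. As a rough illustration of how the two layouts could be told apart, the sketch below makes two assumptions that this diff does not confirm: that a `diffusers`-format file holds a single token-to-vector entry, and that an Automatic1111 file keeps the vector under `string_to_param["*"]` with the token under `name`. `split_embedding` is a hypothetical name, and plain lists stand in for tensors so the example runs without torch.

```python
from typing import Any, Mapping, Tuple


def split_embedding(state: Mapping[str, Any]) -> Tuple[str, Any]:
    """Return (token, vector) from an already-loaded embedding mapping.

    Assumed layouts (hypothetical, for illustration only):
      - Automatic1111: {"name": token, "string_to_param": {"*": vector}, ...}
      - diffusers:     {token: vector}
    """
    if "string_to_param" in state:
        # Assumed Automatic1111 layout: token and vector live in separate keys.
        return state["name"], state["string_to_param"]["*"]
    if len(state) != 1:
        raise ValueError("Expected exactly one token-to-vector entry.")
    # Assumed diffusers layout: the only key is the placeholder token.
    token, vector = next(iter(state.items()))
    return token, vector


# Lists stand in for the real embedding tensors.
diffusers_style = {"<cat-toy>": [0.1, 0.2]}
a1111_style = {"name": "charturnerv2", "string_to_param": {"*": [[0.3, 0.4]]}}
print(split_embedding(diffusers_style)[0])
print(split_embedding(a1111_style)[0])
```

In both cases the returned token is the string that then has to appear in the prompt, which matches the two docstring examples (`<cat-toy>` and `charturnerv2`).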