<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Text-guided depth-to-image generation

[[open-in-colab]]

The [`StableDiffusionDepth2ImgPipeline`] lets you pass a text prompt and an initial image to condition the generation of new images. You can also pass a `depth_map` to preserve the image structure. If no `depth_map` is provided, the pipeline automatically predicts the depth with an integrated [depth-estimation model](https://github.com/isl-org/MiDaS).

Start by creating an instance of the [`StableDiffusionDepth2ImgPipeline`]:

```python
import torch
from diffusers import StableDiffusionDepth2ImgPipeline
from diffusers.utils import load_image, make_image_grid

pipeline = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
    use_safetensors=True,
).to("cuda")
```

Now pass your prompt and the initial image to the pipeline. You can also pass a `negative_prompt` to prevent certain words from guiding how an image is generated:

```python
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
init_image = load_image(url)
prompt = "two tigers"
negative_prompt = "bad, deformed, ugly, bad anatomy"
image = pipeline(prompt=prompt, image=init_image, negative_prompt=negative_prompt, strength=0.7).images[0]
make_image_grid([init_image, image], rows=1, cols=2)
```
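
The call above sets `strength=0.7` without saying what it controls. As a rough sketch, assuming this pipeline follows the usual diffusers image-to-image convention and a default of 50 inference steps (both are assumptions, not stated on this page), `strength` determines how many denoising steps actually run on the noised input image. The helper `denoising_steps` is hypothetical and exists only for illustration:

```python
def denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Estimate how many denoising steps run for a given strength.

    Assumption: the pipeline skips the first (1 - strength) share of the
    schedule, as image-to-image pipelines in diffusers commonly do.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")
    return min(int(num_inference_steps * strength), num_inference_steps)


# Compare a few strength values for an assumed 50-step schedule.
for strength in (0.3, 0.7, 1.0):
    steps = denoising_steps(50, strength)
    print(f"strength={strength}: {steps} of 50 steps")
```

Under that assumption, a lower `strength` keeps the output closer to the initial image, while `strength=1.0` runs the full schedule and relies mostly on the prompt and the depth conditioning.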

| Input                                                                           | Output                                                                                                                                |
|---------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------|
| <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/coco-cats.png" width="500"/> | <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/depth2img-tigers.png" width="500"/> |
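
The example above lets the pipeline estimate depth on its own. To supply a depth map yourself, one option is to build it as a tensor. The sketch below assumes `depth_map` is accepted as a `torch.Tensor` of shape `(batch, height, width)` and that the pipeline rescales it internally, so treat the shape and value range as assumptions to check against the API reference. The helper `make_depth_map` and the 480x640 size are hypothetical, and the gradient is a synthetic stand-in for real depth data:

```python
import torch


def make_depth_map(height: int, width: int) -> torch.Tensor:
    """Build a synthetic depth map: a left-to-right ramp from 0 to 1.

    Returns a tensor of shape (1, height, width); the leading dimension is
    the batch dimension (assumed layout).
    """
    ramp = torch.linspace(0.0, 1.0, width)   # shape: (width,)
    depth = ramp.expand(height, width)       # shape: (height, width)
    return depth.unsqueeze(0).contiguous()   # shape: (1, height, width)


depth_map = make_depth_map(480, 640)
print(depth_map.shape)
```

With a pipeline already loaded, such a tensor could then be passed as `depth_map=depth_map` alongside `prompt` and `image`. In practice the depth map would come from a sensor or a separate depth estimator rather than a synthetic ramp.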