# Stable Diffusion

## Paper
`High-Resolution Image Synthesis with Latent Diffusion Models`
- https://arxiv.org/abs/2112.10752

## Model Architecture
The LDM is conditioned either by concatenation or through a more general cross-attention mechanism.

![img](./doc/arch.png)
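
To make the cross-attention conditioning more concrete, here is a minimal pure-Python sketch of scaled dot-product attention in which the queries come from latent image positions and the keys and values come from conditioning tokens. It is an illustration only: the function names and toy vectors are assumptions, and the learned linear projections used in the real model are omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention.

    queries: one vector per latent position (image side)
    keys, values: one vector per conditioning token (e.g. text side)
    Returns one output vector per latent position.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this latent position to every conditioning token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


# Toy example: 2 latent positions attend over 3 conditioning tokens.
latent_q = [[1.0, 0.0], [0.0, 1.0]]
text_k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_v = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
attended = cross_attention(latent_q, text_k, text_v)
```

Because the toy value vectors are one-hot, each output row equals the attention weights for that latent position and sums to 1.
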
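## Algorithm
By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. To enable DM training on limited computational resources while retaining quality and flexibility, the authors apply DMs in the latent space of powerful pretrained autoencoders. Training diffusion models on such a representation makes it possible, for the first time, to reach a near-optimal point between complexity reduction and spatial downsampling, which improves visual fidelity. Introducing cross-attention layers into the model architecture turns diffusion models into powerful and flexible generators for general conditioning inputs such as text or bounding boxes, and high-resolution synthesis becomes possible in a convolutional manner. The resulting latent diffusion models (LDMs) achieve highly competitive performance on a range of tasks, including unconditional image generation, inpainting, and super-resolution, while significantly reducing computational requirements compared with pixel-based DMs.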
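
For reference, the conditional training objective given in the paper linked above can be written as follows. The notation is the paper's, not this repository's: $\mathcal{E}$ is the pretrained encoder, $z_t$ is the noised latent at timestep $t$, $y$ is the conditioning input, $\tau_\theta$ is the conditioning encoder, and $\epsilon_\theta$ is the denoising network.

$$
L_{LDM} := \mathbb{E}_{\mathcal{E}(x),\, y,\, \epsilon \sim \mathcal{N}(0, 1),\, t}\left[ \left\lVert \epsilon - \epsilon_\theta\big(z_t, t, \tau_\theta(y)\big) \right\rVert_2^2 \right]
$$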

![img](./doc/algo.png)
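
The saving from working in latent space can be made concrete with a little arithmetic. The sketch below assumes the values commonly used with Stable Diffusion v1 (4 latent channels and a downsampling factor of 8, which correspond to the `--C` and `--f` options of `scripts/txt2img.py`). The helper names are illustrative and not part of this repository.

```python
def latent_shape(height, width, channels=4, factor=8):
    """Return the (C, H // f, W // f) latent shape for an image of the given size."""
    return (channels, height // factor, width // factor)


def compression_ratio(height, width, channels=4, factor=8, image_channels=3):
    """How many times fewer values the latent holds than the RGB image."""
    c, h, w = latent_shape(height, width, channels, factor)
    return (image_channels * height * width) / (c * h * w)


shape = latent_shape(512, 512)        # (4, 64, 64)
ratio = compression_ratio(512, 512)   # 48.0
```

Under these assumptions, a 512x512 RGB image is denoised as a 4x64x64 latent, which holds 48 times fewer values.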

## Environment Setup

### Docker (Option 1)
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:stablediffusion
# Replace <your IMAGE ID> with the ID of the image pulled above
docker run --rm --shm-size 10g --network=host --name=stablediffussion --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v path_to_stablediffussion:/home/sd -it <your IMAGE ID> bash
```
### Dockerfile (Option 2)
```
cd stablediffussion/docker
docker build --no-cache -t stablediffussion:test .
docker run --rm --shm-size 10g --network=host --name=stablediffussion --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v path_to_stablediffussion:/home/sd -it stablediffussion:test bash
```
### Download the Stable Diffusion Model
```
cd stablediffussion
# Download a checkpoint model
git lfs install
git clone https://huggingface.co/CompVis/stable-diffusion-v-1-1-original
git clone https://huggingface.co/CompVis/stable-diffusion-v-1-2-original
git clone https://huggingface.co/CompVis/stable-diffusion-v-1-3-original
git clone https://huggingface.co/CompVis/stable-diffusion-v-1-4-original

# Download a diffusers-format model
git clone https://huggingface.co/CompVis/stable-diffusion-v1-1
git clone https://huggingface.co/CompVis/stable-diffusion-v1-2
git clone https://huggingface.co/CompVis/stable-diffusion-v1-3
git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
# Pick any one of the 9 models above

# Download the safety checker
git clone https://huggingface.co/CompVis/stable-diffusion-safety-checker
```

## Dataset


## Inference

### 1. Run the checkpoint version example

After downloading a `stable-diffusion-*-original` model, link its checkpoint into the expected directory:
```
mkdir -p models/ldm/stable-diffusion-v1/
ln -s <path/to/model.ckpt> models/ldm/stable-diffusion-v1/model.ckpt
```
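
The same linking step can be scripted. The following is a minimal sketch using only the standard library; `find_checkpoints` and `link_checkpoint` are hypothetical helpers, not part of this repository, and the default target path simply mirrors the `ln -s` command above.

```python
from pathlib import Path


def find_checkpoints(model_dir):
    """Return all *.ckpt files directly inside model_dir, sorted by name."""
    return sorted(Path(model_dir).glob("*.ckpt"))


def link_checkpoint(ckpt_path, target="models/ldm/stable-diffusion-v1/model.ckpt"):
    """Create a symlink at target pointing to ckpt_path.

    An existing symlink is replaced; an existing regular file is left
    untouched and reported as an error.
    """
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    if target.is_symlink():
        target.unlink()
    elif target.exists():
        raise FileExistsError(f"{target} exists and is not a symlink")
    target.symlink_to(Path(ckpt_path).resolve())
    return target
```

A typical call would be `link_checkpoint(find_checkpoints("stable-diffusion-v-1-4-original")[0])`, assuming the cloned directory contains at least one `.ckpt` file.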

Run:
```
python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
```

The script accepts the following options:

```
usage: txt2img.py [-h] [--prompt [PROMPT]] [--outdir [OUTDIR]] [--skip_grid] [--skip_save] [--ddim_steps DDIM_STEPS] [--plms] [--laion400m] [--fixed_code] [--ddim_eta DDIM_ETA]
                  [--n_iter N_ITER] [--H H] [--W W] [--C C] [--f F] [--n_samples N_SAMPLES] [--n_rows N_ROWS] [--scale SCALE] [--from-file FROM_FILE] [--config CONFIG] [--ckpt CKPT]
                  [--seed SEED] [--precision {full,autocast}]

optional arguments:
  -h, --help            show this help message and exit
  --prompt [PROMPT]     the prompt to render
  --outdir [OUTDIR]     dir to write results to
  --skip_grid           do not save a grid, only individual samples. Helpful when evaluating lots of samples
  --skip_save           do not save individual samples. For speed measurements.
  --ddim_steps DDIM_STEPS
                        number of ddim sampling steps
  --plms                use plms sampling
  --laion400m           uses the LAION400M model
  --fixed_code          if enabled, uses the same starting code across samples
  --ddim_eta DDIM_ETA   ddim eta (eta=0.0 corresponds to deterministic sampling
  --n_iter N_ITER       sample this often
  --H H                 image height, in pixel space
  --W W                 image width, in pixel space
  --C C                 latent channels
  --f F                 downsampling factor
  --n_samples N_SAMPLES
                        how many samples to produce for each given prompt. A.k.a. batch size
  --n_rows N_ROWS       rows in the grid (default: n_samples)
  --scale SCALE         unconditional guidance scale: eps = eps(x, empty) + scale * (eps(x, cond) - eps(x, empty))
  --from-file FROM_FILE
                        if specified, load prompts from this file
  --config CONFIG       path to config which constructs model
  --ckpt CKPT           path to checkpoint of model
  --seed SEED           the seed (for reproducible sampling)
  --precision {full,autocast}
                        evaluate at this precision
```
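
The formula in the `--scale` help text can be tried out on toy numbers. The sketch below applies it element-wise to plain Python lists; `guided_eps` and the sample values are illustrative assumptions, whereas the real script applies the same combination to the model's noise-prediction tensors.

```python
def guided_eps(eps_uncond, eps_cond, scale):
    """Combine unconditional and conditional noise predictions element-wise.

    Implements eps = eps_uncond + scale * (eps_cond - eps_uncond).
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


eps_empty = [0.0, 1.0, -1.0]  # toy prediction for the empty prompt
eps_text = [1.0, 1.0, 0.0]    # toy prediction for the actual prompt

print(guided_eps(eps_empty, eps_text, 0.0))  # scale 0 ignores the prompt
print(guided_eps(eps_empty, eps_text, 1.0))  # scale 1 returns the conditional prediction
print(guided_eps(eps_empty, eps_text, 7.5))  # larger scales push further toward the prompt
```

With `scale` above 1 the result extrapolates past the conditional prediction, which is why higher values follow the prompt more strongly.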

### 2. Run the Diffusers example

Download a diffusers version model, then run `python Diffusers.py`:
```
# Logging in with `huggingface-cli login` is only needed when loading from
# the Hugging Face Hub; a local clone from the download step needs no token.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-4",  # path to the cloned diffusers version model
).to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with torch.autocast("cuda"):
    # Recent diffusers releases return the images in `.images`;
    # very old releases used pipe(prompt)["sample"][0] instead.
    image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
```
## Result
![img](./doc/result.png)

## Accuracy


## Application Scenarios
### Algorithm Category
`Text-to-image`

### Key Application Industries
`Painting, animation`

## Source Repository and Issue Feedback
http://developer.hpccube.com/codes/modelzoo/stablediffussion.git

## References
https://github.com/CompVis/stable-diffusion