# Stable Diffusion Version 2

## Paper

https://arxiv.org/pdf/2010.02502

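## Model Architecture
The text-to-image task feeds a text prompt into the SD model; after a set number of iterations, the model outputs an image that matches the prompt.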

<img src=./sd_model.png style="zoom:100%;" align=middle>

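## Algorithm

The CLIP text encoder encodes the input text into a matching Text Embeddings feature matrix.
Alongside the text input, a random function generates a Gaussian noise matrix that stands in for the Latent Feature (latent-space feature) and is fed into the SD model's "image optimization module".
The image optimization module consists of a U-Net and a Schedule algorithm. After the module's iterative refinement, the Latent Feature is passed to the image decoder (VAE Decoder), which reconstructs it into a pixel-level image.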
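
The flow above can be sketched in a few lines of NumPy. This is only an illustration, not the code used by this repository: `generate`, `ddim_step`, and the three callables `text_encoder`, `unet`, and `vae_decoder` are hypothetical stand-ins, the latent shape `(1, 4, 64, 64)` is an assumption for a 512x512 output, and the update rule is the deterministic DDIM step (eta = 0) from the paper linked above, which may differ from the scheduler the example script actually uses.

```python
import numpy as np


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) from step t to the previous step."""
    # Estimate the clean latent implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    # Re-noise that estimate to the lower noise level of the previous step.
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred


def generate(text_encoder, unet, vae_decoder, prompt, alpha_bars, seed=0,
             latent_shape=(1, 4, 64, 64)):
    """Text -> embeddings -> iterative denoising -> decoded image.

    alpha_bars[i] is the cumulative signal level at step i; index 0 is the
    least noisy step and the last index is the noisiest (hypothetical layout).
    """
    rng = np.random.default_rng(seed)
    text_embeddings = text_encoder(prompt)       # 1. encode the prompt
    latent = rng.standard_normal(latent_shape)   # 2. Gaussian noise as the initial latent
    # 3. U-Net + scheduler loop, from the noisiest step down to the cleanest.
    for i in range(len(alpha_bars) - 1, -1, -1):
        eps_pred = unet(latent, i, text_embeddings)
        alpha_bar_prev = alpha_bars[i - 1] if i > 0 else 1.0
        latent = ddim_step(latent, eps_pred, alpha_bars[i], alpha_bar_prev)
    return vae_decoder(latent)                   # 4. latent -> pixel-level image
```

In a real run the three callables would wrap the compiled text encoder, U-Net, and VAE decoder models; here they can be any functions with matching shapes.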

<img src=./text_encoder.png style="zoom:100%;" align=middle>

<img src=./unet.png style="zoom:100%;" align=middle>

<img src=./vae.png style="zoom:100%;" align=middle>

## Environment Setup

### Docker (Option 1)

Pull the image:

```shell
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:stablediffusion-migraphx-ubuntu20.04-dtk24.04.3-py310
```

Create and start the container:

```shell
docker run --shm-size 16g --network=host --name=sd2.1_migraphx --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/sd2.1_migraphx:/home/sd2.1_migraphx -v /opt/hyhal:/opt/hyhal:ro -it <Your Image ID> /bin/bash

# Activate DTK
source /opt/dtk/env.sh
```

### Dockerfile (Option 2)

```shell
cd ./docker
docker build --no-cache -t sd2.1_migraphx:2.0 .

docker run --shm-size 16g --network=host --name=sd2.1_migraphx --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/sd2.1_migraphx:/home/sd2.1_migraphx -v /opt/hyhal:/opt/hyhal:ro -it <Your Image ID> /bin/bash

# Activate DTK
source /opt/dtk/env.sh
```

## Dataset



## Inference

**Model download**
[stabilityai/stable-diffusion-2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base)
[stabilityai/stable-diffusion-3.5-large](https://huggingface.co/stabilityai/stable-diffusion-3.5-large)

In the example command below, set the `onnx-model-path` argument to the directory of the downloaded ONNX model.
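
Before launching inference, it can help to confirm that the directory really contains ONNX files. The helper below is a minimal sketch using only the standard library; `list_onnx_files` is a hypothetical name, and it makes no assumption about which files or subdirectories the example script expects, it only lists what is there.

```python
from pathlib import Path


def list_onnx_files(model_dir):
    """Return the .onnx files found under model_dir as sorted relative paths."""
    root = Path(model_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"model directory not found: {root}")
    # Search recursively, since exported models are often split into subfolders.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.onnx"))
```

An empty list means the path is probably wrong or the download is incomplete.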

**Set environment variables**
```shell
export PYTHONPATH=/opt/dtk/lib:$PYTHONPATH
```
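
A quick way to check that the `PYTHONPATH` setting took effect is to ask Python whether it can locate the binding. This is a sketch under one assumption: that the MIGraphX Python binding is exposed as a module named `migraphx`. The helper `is_importable` is a hypothetical name.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate module_name on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the binding is a top-level module named "migraphx".
    print("migraphx importable:", is_importable("migraphx"))
```

If this prints `False`, re-check that `/opt/dtk/lib` exists inside the container and that the `export` was run in the same shell.
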
**Install dependencies**
```shell
# Enter the Python example directory
cd <path_to_sd2.1_migraphx>

# Install dependencies
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
```

### Run the Example

The inference example program for the stablediffusion_v2.1 model is `Diffusion_test_offload_false.py`. Run it with the following command:

```shell
python Diffusion_test_offload_false.py --prompt "a photograph of an astronaut riding a horse" --seed 13 --output astro_horse.jpg --steps 50 --fp16 all --onnx-model-path stablediffusion_v2.1_migraphx
```
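
To generate several images from one prompt, the same command can be driven from Python. The sketch below only reuses the flags shown in the command above; `build_command` and `run_seeds` are hypothetical helper names, and the `<stem>_<seed>.jpg` output naming is an assumption made for illustration.

```python
import subprocess
import sys


def build_command(prompt, seed, output, steps=50, fp16="all",
                  onnx_model_path="stablediffusion_v2.1_migraphx",
                  script="Diffusion_test_offload_false.py"):
    """Build the argument list for one run of the inference script."""
    return [
        sys.executable, script,
        "--prompt", prompt,
        "--seed", str(seed),
        "--output", output,
        "--steps", str(steps),
        "--fp16", fp16,
        "--onnx-model-path", onnx_model_path,
    ]


def run_seeds(prompt, seeds, stem="astro_horse"):
    """Run the script once per seed, writing <stem>_<seed>.jpg each time."""
    for seed in seeds:
        # check=True stops the loop as soon as one run fails.
        subprocess.run(build_command(prompt, seed, f"{stem}_{seed}.jpg"), check=True)
```

For example, `run_seeds("a photograph of an astronaut riding a horse", [13, 14, 15])` would launch three runs from the example directory. Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces.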

## Result

Inference result:

When the Python program finishes, the generated image is saved in the current directory.

<img src="./astro_horse.jpg" alt="Result" style="zoom: 50%;" />


### Accuracy



## Application Scenarios

### Algorithm Category

`Text-to-image`

### Key Application Industries

`Painting`, `Animation`, `Media`

## Source Repository and Issue Feedback

https://developer.sourcefind.cn/codes/modelzoo/stablediffusion_v2.1_migraphx

## References

https://github.com/Stability-AI/stablediffusion