# Stable Diffusion

## Model Introduction

Stable Diffusion is a latent text-to-image diffusion model, trained on 512x512 images from a subset of the LAION-5B database. Similar to Google's Imagen, it uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and can run on a GPU with at least 10GB VRAM. See the sections below and the model card for details.

## Environment Setup

1. Using the Docker image: an inference Docker image can be pulled from SourceFind (光源) as follows (a hedged sketch of a typical `docker run` invocation is given in the appendix at the end of this README):

```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:stable-diffusion
```

2. Using a conda environment: a suitable [conda](https://conda.io/) environment named `ldm` can be created and activated with:

```
conda env create -f environment.yaml
conda activate ldm
```

## Download the Stable Diffusion Model

### checkpoint version

```
git lfs install
git clone https://huggingface.co/CompVis/stable-diffusion-v-1-1-original
```

### diffusers version

```
git clone https://huggingface.co/CompVis/stable-diffusion-v1-1
```

## Running the Examples

### Running the checkpoint version example

After downloading the `stable-diffusion-*-original` model, link the checkpoint into the directory the scripts expect:

```
mkdir -p models/ldm/stable-diffusion-v1/
ln -s <path/to/model.ckpt> models/ldm/stable-diffusion-v1/model.ckpt
```

Run:

```
python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
```

```commandline
usage: txt2img.py [-h] [--prompt [PROMPT]] [--outdir [OUTDIR]] [--skip_grid] [--skip_save] [--ddim_steps DDIM_STEPS]
                  [--plms] [--laion400m] [--fixed_code] [--ddim_eta DDIM_ETA] [--n_iter N_ITER] [--H H] [--W W]
                  [--C C] [--f F] [--n_samples N_SAMPLES] [--n_rows N_ROWS] [--scale SCALE] [--from-file FROM_FILE]
                  [--config CONFIG] [--ckpt CKPT] [--seed SEED] [--precision {full,autocast}]

optional arguments:
  -h, --help            show this help message and exit
  --prompt [PROMPT]     the prompt to render
  --outdir [OUTDIR]     dir to write results to
  --skip_grid           do not save a grid, only individual samples. Helpful when evaluating lots of samples
  --skip_save           do not save individual samples. For speed measurements.
  --ddim_steps DDIM_STEPS
                        number of ddim sampling steps
  --plms                use plms sampling
  --laion400m           uses the LAION400M model
  --fixed_code          if enabled, uses the same starting code across samples
  --ddim_eta DDIM_ETA   ddim eta (eta=0.0 corresponds to deterministic sampling)
  --n_iter N_ITER       sample this often
  --H H                 image height, in pixel space
  --W W                 image width, in pixel space
  --C C                 latent channels
  --f F                 downsampling factor
  --n_samples N_SAMPLES
                        how many samples to produce for each given prompt. A.k.a. batch size
  --n_rows N_ROWS       rows in the grid (default: n_samples)
  --scale SCALE         unconditional guidance scale: eps = eps(x, empty) + scale * (eps(x, cond) - eps(x, empty))
  --from-file FROM_FILE
                        if specified, load prompts from this file
  --config CONFIG       path to config which constructs model
  --ckpt CKPT           path to checkpoint of model
  --seed SEED           the seed (for reproducible sampling)
  --precision {full,autocast}
                        evaluate at this precision
```

The guidance formula quoted in the `--scale` help text is illustrated with a small sketch in the appendix.

### Running the Diffusers example

`Diffusers.py`:

```py
# make sure you're logged in with `huggingface-cli login`
from torch import autocast
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    use_auth_token=True
).to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
    image = pipe(prompt)["sample"][0]

image.save("astronaut_rides_horse.png")
```

Note that this uses the early diffusers API; a seeded, reproducible variant against a newer diffusers release (the analogue of the `--seed` flag above) is sketched in the appendix.

## Source Repository and Issue Feedback

http://developer.hpccube.com/codes/modelzoo/stablediffussion.git

## References

https://github.com/CompVis/stable-diffusion
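
## Appendix: Additional Sketches

### Running the Docker container

The environment section above only shows the `docker pull` command. The invocation below is a minimal sketch of how a DCU image of this kind is typically started; the ROCm-style device nodes (`/dev/kfd`, `/dev/dri`), the `--shm-size` value, and the container name `sd-infer` are assumptions, so consult the image documentation on SourceFind for the authoritative command.

```
docker run -it --name sd-infer \
    --device=/dev/kfd --device=/dev/dri \
    --security-opt seccomp=unconfined \
    --shm-size 16G \
    image.sourcefind.cn:5000/dcu/admin/base/custom:stable-diffusion /bin/bash
```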
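
### Unconditional guidance scale

The `--scale` help text above quotes the classifier-free guidance formula `eps = eps(x, empty) + scale * (eps(x, cond) - eps(x, empty))`. The sketch below only demonstrates that arithmetic; `noise_pred` is a hypothetical stand-in for the UNet, not the actual model code from `txt2img.py`.

```py
import torch

def noise_pred(x, cond):
    # Hypothetical stand-in for the 860M UNet's noise estimate,
    # depending on the latent x and the text conditioning cond.
    return 0.1 * x + cond.mean() * torch.ones_like(x)

scale = 7.5                           # txt2img.py's default guidance scale
x = torch.randn(1, 4, 64, 64)         # latent: C=4 channels, 512/f = 64 with f=8
empty_cond = torch.zeros(1, 77, 768)  # embedding of the empty prompt ""
text_cond = torch.randn(1, 77, 768)   # embedding of the actual prompt

eps_uncond = noise_pred(x, empty_cond)
eps_cond = noise_pred(x, text_cond)

# Classifier-free guidance: move the noise prediction away from the
# unconditional estimate and towards the text-conditioned one.
eps = eps_uncond + scale * (eps_cond - eps_uncond)
```

At `scale = 1.0` this reduces to the purely conditional prediction; larger values trade sample diversity for closer adherence to the prompt.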
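
### Reproducible sampling with Diffusers

As a counterpart to the `--seed` flag of `txt2img.py`, the sketch below fixes the random generator for the Diffusers pipeline. It assumes a newer diffusers release, where the pipeline call returns `.images` instead of `["sample"]` and `autocast` is no longer needed; treat it as a sketch rather than a drop-in replacement for the example above.

```py
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,  # rough analogue of --precision autocast
).to("cuda")

# Fixing the generator seed makes a given prompt reproduce the same image.
generator = torch.Generator(device="cuda").manual_seed(42)

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt, guidance_scale=7.5, generator=generator).images[0]
image.save("astronaut_rides_horse_seed42.png")
```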