
## Load and run a pipeline the generic way

```bash
python tools/run_pipe.py -m /path/to/models -p "the ocean in dream"
```

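Script parameters:

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `-m` / `--model-dir` | **Required.** Path to the pipeline models | str | None |
| `--force-compile` | Optional. Force the models to be recompiled | bool | False |
| `--num-images-per-prompt` | Optional. Number of images generated per prompt in one run | int | 1 |
| `--img-size` | Optional. Size of the generated image; if unset, each pipeline's default image size is used | int | None |
| `-p` / `--prompt` | **Required.** Prompt describing the image content, style, generation requirements, etc. | str | None |
| `-n` / `--negative-prompt` | Optional. Negative prompt, e.g. "ugly" | str | None |
| `-t` / `--num-inference-steps` | Optional. Number of iteration steps used to generate an image | int | 50 |
| `--seed` | Optional. Random seed | int | 42 |
| `--save-prefix` | Optional. Prefix for the saved image files | str | None |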
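
For readers who want to script against the same options, the table can be mirrored with `argparse`. This is only a sketch: `build_parser` is a hypothetical helper, not code taken from `tools/run_pipe.py`, and treating `--force-compile` as an on/off switch is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the parameter table above."""
    parser = argparse.ArgumentParser(description="Run a diffusion pipeline")
    parser.add_argument("-m", "--model-dir", type=str, required=True,
                        help="path to the pipeline models")
    # Assumption: the boolean option is a plain on/off switch.
    parser.add_argument("--force-compile", action="store_true",
                        help="force the models to be recompiled")
    parser.add_argument("--num-images-per-prompt", type=int, default=1)
    parser.add_argument("--img-size", type=int, default=None)
    parser.add_argument("-p", "--prompt", type=str, required=True)
    parser.add_argument("-n", "--negative-prompt", type=str, default=None)
    parser.add_argument("-t", "--num-inference-steps", type=int, default=50)
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--save-prefix", type=str, default=None)
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["-m", "/path/to/models", "-p", "the ocean in dream"])
print(args.model_dir, args.num_inference_steps, args.seed, args.img_size)
```

Passing an explicit list to `parse_args` keeps the sketch usable from a notebook or a test, where `sys.argv` belongs to another program.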


## Load and run a pipeline with custom components

> reference: [https://huggingface.co/docs/diffusers/using-diffusers/custom_pipeline_overview#community-components](https://huggingface.co/docs/diffusers/using-diffusers/custom_pipeline_overview#community-components)  
> Community components allow users to build pipelines that may have customized components that are not a part of Diffusers. If your pipeline has custom components that Diffusers doesn’t already support, you need to provide their implementations as Python modules. These customized components could be a VAE, UNet, and scheduler. In most cases, the text encoder is imported from the Transformers library. The pipeline code itself can also be customized.

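In this project, components such as text_encoder, unet, and vae_decoder are reimplemented with MIGraphX as the inference backend. Besides the generic loading method `DiffusionPipeline.from_pretrained`, you can also load each custom component first and then create the pipeline instance. Taking sdxl as an example: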

```bash
# Run sdxl
python tools/run_sdxl_with_custom_components.py -m /path/to/sdxl_models -p "the ocean in dream"
```

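Script parameters:

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `-m` / `--model-dir` | **Required.** Path to the sdxl models | str | None |
| `--force-compile` | Optional. Force the models to be recompiled | bool | False |
| `--num-images-per-prompt` | Optional. Number of images generated per prompt in one run | int | 1 |
| `--img-size` | Optional. Size of the generated image | int | 1024 |
| `-p` / `--prompt` | **Required.** Prompt describing the image content, style, generation requirements, etc. | str | None |
| `-n` / `--negative-prompt` | Optional. Negative prompt, e.g. "ugly" | str | None |
| `-t` / `--num-inference-steps` | Optional. Number of iteration steps used to generate an image | int | 50 |
| `--seed` | Optional. Random seed | int | 42 |
| `--save-prefix` | Optional. Prefix for the saved image files | str | None |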


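## Batch image generation

You can put multiple prompts and negative prompts on different themes into a single JSON file, then use `tools/run_examples.py` to generate the images in one batch.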

```bash
# Run sdxl
python tools/run_examples.py \
-m /path/to/sdxl_models \
--examples-json examples/prompts_and_negative_prompts.json \
--output-dir examples/sdxl-images-1024
```

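Script parameters:

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `-m` / `--model-dir` | **Required.** Path to the sdxl models | str | None |
| `--force-compile` | Optional. Force the models to be recompiled | bool | False |
| `--num-images-per-prompt` | Optional. Number of images generated per prompt in one run | int | 1 |
| `--img-size` | Optional. Size of the generated image; if unset, each pipeline's default image size is used | int | None |
| `-t` / `--num-inference-steps` | Optional. Number of iteration steps used to generate an image | int | 50 |
| `--seed` | Optional. Random seed | int | 42 |
| `--examples-json` | Optional. File containing the prompts and negative prompts | str | examples/prompts_and_negative_prompts.json |
| `--output-dir` | Optional. Directory where the generated images are saved | str | None |

The prompt and negative-prompt file has the following format: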
```json
[
    {
      "theme": "theme0 name",
      "examples": [
        {
          "prompt": "prompt0 text here",
          "negative_prompt": "negative_prompt0 text here"
        },
        {
          "prompt": "prompt1 text here",
          "negative_prompt": "negative_prompt1 text here"
        },
        ...
      ]
    },
    {
      "theme": "theme1 name",
      "examples": [
        {
          "prompt": "prompt0 text here",
          "negative_prompt": "negative_prompt0 text here"
        },
        {
          "prompt": "prompt1 text here",
          "negative_prompt": "negative_prompt1 text here"
        },
        ...
      ]
    },
    ...
]
```

Example: [../examples/prompts_and_negative_prompts.json](../examples/prompts_and_negative_prompts.json)
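
A file in this format can be flattened into one job per prompt with a few lines of Python. The sketch below uses an inline string in place of the real file; `iter_examples` is a hypothetical helper, and treating a missing `negative_prompt` as `None` is an assumption rather than documented behaviour of `tools/run_examples.py`.

```python
import json

# Inline stand-in for the contents of the examples JSON file.
EXAMPLES_JSON = """
[
  {"theme": "theme0 name",
   "examples": [
     {"prompt": "prompt0 text here", "negative_prompt": "negative_prompt0 text here"},
     {"prompt": "prompt1 text here", "negative_prompt": "negative_prompt1 text here"}
   ]},
  {"theme": "theme1 name",
   "examples": [
     {"prompt": "prompt0 text here", "negative_prompt": "negative_prompt0 text here"}
   ]}
]
"""


def iter_examples(themes):
    """Yield (theme, index, prompt, negative_prompt) for every example."""
    for theme in themes:
        for index, example in enumerate(theme["examples"]):
            # Assumption: a missing negative prompt means "no negative prompt".
            yield theme["theme"], index, example["prompt"], example.get("negative_prompt")


themes = json.loads(EXAMPLES_JSON)  # for a real file, use json.load on an open file object
jobs = list(iter_examples(themes))
for theme, index, prompt, negative_prompt in jobs:
    print(f"[{theme}] #{index}: {prompt!r} / {negative_prompt!r}")
```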


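## Measuring per-module latency

This project supports measuring the time spent in each module through non-intrusive instrumentation. The rough steps are:
1. Create a timer;
2. Register the functions or methods to be timed with the timer;
3. Start the timer;
4. Run the functions or methods to be timed;
5. Print the statistics.

A simple usage example: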
```python
import random
import time
from migraphx_diffusers import AutoTimer

def sleep_func(sleep_seconds=1):
    time.sleep(sleep_seconds)

class SleepClass:
    def __init__(self):
        self.min_seconds = 1
        self.max_seconds = 5

    def random_sleep(self):
        time.sleep(random.randint(self.min_seconds, self.max_seconds))

    def __call__(self, sleep_seconds=1):
        time.sleep(sleep_seconds)

obj = SleepClass()

t = AutoTimer()  # step1

# step2
t.add_target(sleep_func, key="sleep_func")
t.add_target(obj.random_sleep, key="random_sleep")
t.add_target(obj, key="__call__")

t.start_work() # step3

# step4
for i in range(10):
    sleep_func()
    obj()
    if i % 3 == 0:
        obj.random_sleep()

t.summary()  # step5
```

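Running it prints the following (the column headers are, from left to right: module, number of runs, max latency (ms), min latency (ms), mean latency (ms), mean throughput (fps)):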
```
+--------------------------------------------------------------------------------------+
|                                     Test Latency                                     |
+--------------+----------+--------------+--------------+--------------+---------------+
|     模块     | 运行次数  | 最长耗时(ms) | 最短耗时(ms) | 平均耗时(ms) | 平均性能(fps) |
+--------------+----------+--------------+--------------+--------------+---------------+
|  sleep_func  |    10    |   1001.06    |   1001.02    |   1001.04    |      1.0      |
|   __call__   |    10    |   1001.07    |   1000.06    |   1000.94    |      1.0      |
| random_sleep |    4     |    4004.1    |   1001.05    |   2252.33    |      0.44     |
+--------------+----------+--------------+--------------+--------------+---------------+
```
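
To give a feel for how this kind of non-intrusive timing can work, here is a toy sketch that swaps an attribute for a timing wrapper. It is not the implementation of `AutoTimer`; `SketchTimer`, `patch`, and `Worker` are hypothetical names. The throughput is computed as 1000 / mean latency in ms, which appears consistent with the fps column above (for example 1000 / 2252.33 ≈ 0.44).

```python
import functools
import time
from collections import defaultdict


class SketchTimer:
    """Toy stand-in for illustration only; it is not migraphx_diffusers.AutoTimer."""

    def __init__(self):
        self.records = defaultdict(list)  # key -> list of latencies in ms

    def wrap(self, func, key):
        """Return a wrapper that records the wall-clock latency of each call."""
        @functools.wraps(func)
        def timed(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                self.records[key].append((time.perf_counter() - start) * 1000.0)
        return timed

    def patch(self, owner, name, key=None):
        """Replace owner.<name> with a timed wrapper, leaving its source untouched."""
        setattr(owner, name, self.wrap(getattr(owner, name), key or name))

    def summary(self):
        """Return {key: (runs, max_ms, min_ms, mean_ms, fps)}."""
        stats = {}
        for key, values in self.records.items():
            mean_ms = sum(values) / len(values)
            stats[key] = (len(values), max(values), min(values), mean_ms, 1000.0 / mean_ms)
        return stats


class Worker:
    def step(self, seconds=0.01):
        time.sleep(seconds)
        return "done"


worker = Worker()
timer = SketchTimer()
timer.patch(worker, "step")  # the instance attribute now shadows the class method
for _ in range(3):
    worker.step()
stats = timer.summary()
print(stats["step"])
```

Because the wrapper is installed from the outside, the timed code itself needs no changes, which is what makes the approach non-intrusive.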

To collect end-to-end and per-component performance data for sdxl or sd2.1:
```bash
python tools/time_count.py -m /path/to/sdxl_models
```

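Script parameters:

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `-m` / `--model-dir` | **Required.** Path to the sdxl models | str | None |
| `--force-compile` | Optional. Force the models to be recompiled | bool | False |
| `--num-images-per-prompt` | Optional. Number of images generated per prompt in one run | int | 1 |
| `--img-size` | Optional. Size of the generated image; if unset, each pipeline's default image size is used | int | None |
| `-t` / `--num-inference-steps` | Optional. Number of iteration steps used to generate an image | int | 50 |
| `--num-warmup-loops` | Optional. Number of warmup iterations | int | 1 |
| `--num-count-loops` | Optional. Number of iterations used for the performance statistics | int | 100 |
| `--out-csv-file` | Optional. Path of the CSV file where the performance data is saved | str | ./perf-{date}-{time}.csv |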


## SD2.1 end-to-end performance test

```bash
python tools/run_sd2_1.py /path/to/sd2.1_models
```

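Script parameters:

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| `model-dir` | **Positional argument.** Path to the sd2.1 models | str | None |
| `--result-dir` | Optional. Directory where the generated images are stored | str | ./results |

The test scenarios are:

+ batchsize: 1, 2, 4, 8
+ image_size: 512
+ num_inference_steps: 20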
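
A sweep over these settings could be structured roughly as follows. This is a sketch only: `fake_pipeline` is a placeholder standing in for a real pipeline call, and nothing here is taken from `tools/run_sd2_1.py`.

```python
import itertools
import time

BATCH_SIZES = [1, 2, 4, 8]
IMAGE_SIZES = [512]
NUM_INFERENCE_STEPS = [20]


def fake_pipeline(prompts, height, width, num_inference_steps):
    """Placeholder for a real pipeline call; returns one dummy 'image' per prompt."""
    return [f"{width}x{height}-image-{i}" for i in range(len(prompts))]


results = []
for batch_size, image_size, steps in itertools.product(
        BATCH_SIZES, IMAGE_SIZES, NUM_INFERENCE_STEPS):
    prompts = ["the ocean in dream"] * batch_size
    start = time.perf_counter()
    images = fake_pipeline(prompts, height=image_size, width=image_size,
                           num_inference_steps=steps)
    elapsed_s = time.perf_counter() - start
    results.append({
        "batch_size": batch_size,
        "image_size": image_size,
        "steps": steps,
        "num_images": len(images),
        "latency_s": elapsed_s,
    })

for row in results:
    print(row)
```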


## Model accuracy evaluation

Text-to-image tasks are commonly evaluated with the CLIP score. First, prepare the dataset and the multimodal model:
```bash
# Download the dataset
wget https://raw.githubusercontent.com/google-research/parti/main/PartiPrompts.tsv --no-check-certificate

# Download the model
mkdir ./openai
huggingface-cli download openai/clip-vit-base-patch16 --local-dir ./openai/clip-vit-base-patch16 --local-dir-use-symlinks False
```

Generate images from the prompts in the dataset:
```bash
python tools/gen_p2_images.py -m /path/to/models --num-images-per-prompt 4 -p ./PartiPrompts.tsv --save-dir ./p2_images
```

Evaluate the generated results:
```bash
python tools/evaluate.py -m ./openai/clip-vit-base-patch16 -d ./p2_images
```
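
For intuition, the CLIP score is commonly defined as a scaled cosine similarity between an image embedding and a text embedding, floored at zero. The sketch below works on toy vectors in pure Python; the scale of 100 is one common convention and an assumption here, and `tools/evaluate.py` may compute or aggregate the score differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_score(image_embedding, text_embedding, scale=100.0):
    """max(scale * cos(image, text), 0); scale=100 is an assumed convention."""
    return max(scale * cosine_similarity(image_embedding, text_embedding), 0.0)


# Toy 3-d vectors standing in for real CLIP image/text embeddings.
image_vec = [1.0, 0.0, 0.0]
aligned_text = [2.0, 0.0, 0.0]
orthogonal_text = [0.0, 1.0, 0.0]
opposite_text = [-1.0, 0.0, 0.0]

scores = [clip_score(image_vec, text)
          for text in (aligned_text, orthogonal_text, opposite_text)]
print(scores)
```

A higher score means the image embedding points in a direction closer to the prompt embedding; in a real evaluation the embeddings would come from the CLIP model downloaded above.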