init

Commit 6f3f2c81 authored Sep 12, 2025 by wangwf
20 changed files
--- a/examples/sdxl-images-1024/6-DailyLife/theme_6_example_3_image_0.png
+++ b/examples/sdxl-images-1024/6-DailyLife/theme_6_example_3_image_0.png
--- a/examples/sdxl-images-1024/6-DailyLife/theme_6_example_4_image_0.png
+++ b/examples/sdxl-images-1024/6-DailyLife/theme_6_example_4_image_0.png
--- a/examples/sdxl-images-1024/7-HistoryAndRetro/theme_7_example_0_image_0.png
+++ b/examples/sdxl-images-1024/7-HistoryAndRetro/theme_7_example_0_image_0.png
--- a/examples/sdxl-images-1024/7-HistoryAndRetro/theme_7_example_1_image_0.png
+++ b/examples/sdxl-images-1024/7-HistoryAndRetro/theme_7_example_1_image_0.png
--- a/examples/sdxl-images-1024/7-HistoryAndRetro/theme_7_example_2_image_0.png
+++ b/examples/sdxl-images-1024/7-HistoryAndRetro/theme_7_example_2_image_0.png
--- a/examples/sdxl-images-1024/7-HistoryAndRetro/theme_7_example_3_image_0.png
+++ b/examples/sdxl-images-1024/7-HistoryAndRetro/theme_7_example_3_image_0.png
--- a/examples/sdxl-images-1024/7-HistoryAndRetro/theme_7_example_4_image_0.png
+++ b/examples/sdxl-images-1024/7-HistoryAndRetro/theme_7_example_4_image_0.png
--- a/examples/sdxl-images-1024/8-DarkAndGrotesque/theme_8_example_0_image_0.png
+++ b/examples/sdxl-images-1024/8-DarkAndGrotesque/theme_8_example_0_image_0.png
--- a/examples/sdxl-images-1024/8-DarkAndGrotesque/theme_8_example_1_image_0.png
+++ b/examples/sdxl-images-1024/8-DarkAndGrotesque/theme_8_example_1_image_0.png
--- a/examples/sdxl-images-1024/8-DarkAndGrotesque/theme_8_example_2_image_0.png
+++ b/examples/sdxl-images-1024/8-DarkAndGrotesque/theme_8_example_2_image_0.png
--- a/examples/sdxl-images-1024/8-DarkAndGrotesque/theme_8_example_3_image_0.png
+++ b/examples/sdxl-images-1024/8-DarkAndGrotesque/theme_8_example_3_image_0.png
--- a/examples/sdxl-images-1024/8-DarkAndGrotesque/theme_8_example_4_image_0.png
+++ b/examples/sdxl-images-1024/8-DarkAndGrotesque/theme_8_example_4_image_0.png
--- a/examples/sdxl-images-1024/9-TechnologyAndDigital/theme_9_example_0_image_0.png
+++ b/examples/sdxl-images-1024/9-TechnologyAndDigital/theme_9_example_0_image_0.png
--- a/examples/sdxl-images-1024/9-TechnologyAndDigital/theme_9_example_1_image_0.png
+++ b/examples/sdxl-images-1024/9-TechnologyAndDigital/theme_9_example_1_image_0.png
--- a/examples/sdxl-images-1024/9-TechnologyAndDigital/theme_9_example_2_image_0.png
+++ b/examples/sdxl-images-1024/9-TechnologyAndDigital/theme_9_example_2_image_0.png
--- a/examples/sdxl-images-1024/9-TechnologyAndDigital/theme_9_example_3_image_0.png
+++ b/examples/sdxl-images-1024/9-TechnologyAndDigital/theme_9_example_3_image_0.png
--- a/examples/sdxl-images-1024/9-TechnologyAndDigital/theme_9_example_4_image_0.png
+++ b/examples/sdxl-images-1024/9-TechnologyAndDigital/theme_9_example_4_image_0.png
--- a/requirements.txt
+++ b/requirements.txt
+accelerate>=0.20.0
+diffusers>=0.34.0
+huggingface-hub<1.0,>=0.34.0
+numpy>=1.24.3,<2.0.0
+onnx
+onnxruntime>=1.22.1
+pillow
+prettytable
+tokenizers<0.22,>=0.21
+torch>=2.4.1
+transformers>=4.54.1
--- a/set_env.sh
+++ b/set_env.sh
+rocblas_lib_path=$1
+
+export MIGRAPHX_ENABLE_MIOPEN_GROUPNORM=1
+export MIGRAPHX_ENABLE_NHWC=1
+export MIGRAPHX_ENABLE_MIOPEN_CONCAT=1
+export MIGRAPHX_STABLEDIFFUSION_OPT=1
+export MIGRAPHX_ENABLE_MIOPEN_GN_LN=1
+export MIGRAPHX_ENABLE_LAYERNORM_FUSION=1
+# export PADDING_MALLOC=0  # run on KME
+
+# export HIP_VISIBLE_DEVICES=6
+export LD_LIBRARY_PATH=${rocblas_lib_path}:$LD_LIBRARY_PATH
--- a/tools/README.md
+++ b/tools/README.md
+## Loading and running a pipeline the generic way
+
+```bash
+# Run sdxl or sd2.1
+python tools/run_pipe.py -m /path/to/sdxl_or_sd2.1_models -p "the ocean in bream"
+```
+
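+Script arguments:
+
+| Argument | Description | Type | Default |
+| --- | --- | --- | --- |
+| `-m` / `--model-dir` | **Required.** Path to the pipeline model directory | str | None |
+| `--force-compile` | Optional. Force the models to be recompiled | bool | False |
+| `--num-images-per-prompt` | Optional. Number of images generated per prompt in one run | int | 1 |
+| `--img-size` | Optional. Size of the generated image; if unset, each pipeline's default image size is used | int | None |
+| `-p` / `--prompt` | **Required.** Prompt describing the image content, style, and generation requirements | str | None |
+| `-n` / `--negative-prompt` | Optional. Negative prompt, e.g. "ugly" | str | None |
+| `-t` / `--num-inference-steps` | Optional. Number of inference steps used to generate an image | int | 50 |
+| `--seed` | Optional. Random seed | int | 42 |
+| `--save-prefix` | Optional. Filename prefix for saved images | str | None |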
+
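For orientation, the argument table above maps onto an `argparse` parser roughly like the one below. This is a sketch reconstructed from the table alone, not the actual contents of `tools/run_pipe.py`. Treating `--force-compile` as a `store_true` flag and the function name `build_parser` are assumptions.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the CLI from the argument table;
    # the real script may declare these options differently.
    parser = argparse.ArgumentParser(description="Run an sdxl / sd2.1 pipeline")
    parser.add_argument("-m", "--model-dir", type=str, required=True)
    parser.add_argument("--force-compile", action="store_true")  # assumed flag style
    parser.add_argument("--num-images-per-prompt", type=int, default=1)
    parser.add_argument("--img-size", type=int, default=None)
    parser.add_argument("-p", "--prompt", type=str, required=True)
    parser.add_argument("-n", "--negative-prompt", type=str, default=None)
    parser.add_argument("-t", "--num-inference-steps", type=int, default=50)
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--save-prefix", type=str, default=None)
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["-m", "/path/to/models", "-p", "the ocean"])
print(args.num_inference_steps, args.seed, args.img_size)
```

Leaving `--img-size` at `None` is what lets each pipeline fall back to its own default image size, as the table describes.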
+
+## Loading and running a pipeline with custom components
+
+> reference: [https://huggingface.co/docs/diffusers/using-diffusers/custom_pipeline_overview#community-components](https://huggingface.co/docs/diffusers/using-diffusers/custom_pipeline_overview#community-components)  
+> Community components allow users to build pipelines that may have customized components that are not a part of Diffusers. If your pipeline has custom components that Diffusers doesn’t already support, you need to provide their implementations as Python modules. These customized components could be a VAE, UNet, and scheduler. In most cases, the text encoder is imported from the Transformers library. The pipeline code itself can also be customized.
+
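+In this project, components such as text_encoder, unet, and vae_decoder are reimplemented with MIGraphX as the inference backend. Besides the generic loading method `DiffusionPipeline.from_pretrained`, you can also load each custom component first and then create the pipeline instance from them. Taking sdxl as an example: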
+
+```bash
+# Run sdxl
+python tools/run_sdxl_with_custom_components.py -m /path/to/sdxl_models -p "the ocean in bream"
+```
+
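+Script arguments:
+
+| Argument | Description | Type | Default |
+| --- | --- | --- | --- |
+| `-m` / `--model-dir` | **Required.** Path to the sdxl model directory | str | None |
+| `--force-compile` | Optional. Force the models to be recompiled | bool | False |
+| `--num-images-per-prompt` | Optional. Number of images generated per prompt in one run | int | 1 |
+| `--img-size` | Optional. Size of the generated image | int | 1024 |
+| `-p` / `--prompt` | **Required.** Prompt describing the image content, style, and generation requirements | str | None |
+| `-n` / `--negative-prompt` | Optional. Negative prompt, e.g. "ugly" | str | None |
+| `-t` / `--num-inference-steps` | Optional. Number of inference steps used to generate an image | int | 50 |
+| `--seed` | Optional. Random seed | int | 42 |
+| `--save-prefix` | Optional. Filename prefix for saved images | str | None |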
+
+
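+## Generating images in batch
+
+Multiple prompts and negative prompts, grouped by theme, can be placed in a single JSON file and then rendered in batch with `tools/run_examples.py`.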
+
+```bash
+# Run sdxl
+python tools/run_examples.py \
+-m /path/to/sdxl_models \
+--examples-json examples/prompts_and_negative_prompts.json \
+--output-dir examples/sdxl-images-1024
+```
+
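+Script arguments:
+
+| Argument | Description | Type | Default |
+| --- | --- | --- | --- |
+| `-m` / `--model-dir` | **Required.** Path to the sdxl model directory | str | None |
+| `--force-compile` | Optional. Force the models to be recompiled | bool | False |
+| `--num-images-per-prompt` | Optional. Number of images generated per prompt in one run | int | 1 |
+| `--img-size` | Optional. Size of the generated image; if unset, each pipeline's default image size is used | int | None |
+| `-t` / `--num-inference-steps` | Optional. Number of inference steps used to generate an image | int | 50 |
+| `--seed` | Optional. Random seed | int | 42 |
+| `--examples-json` | Optional. File containing the prompts and negative prompts | str | examples/prompts_and_negative_prompts.json |
+| `--output-dir` | Optional. Directory in which the generated images are saved | str | None |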
+
+The prompt and negative-prompt file has the following format:
+```json
+[
+    {
+      "theme": "theme0 name",
+      "examples": [
+        {
+          "prompt": "prompt0 text here",
+          "negative_prompt": "negative_prompt0 text here"
+        },
+        {
+          "prompt": "prompt1 text here",
+          "negative_prompt": "negative_prompt1 text here"
+        },
+        ...
+      ]
+    },
+    {
+      "theme": "theme1 name",
+      "examples": [
+        {
+          "prompt": "prompt0 text here",
+          "negative_prompt": "negative_prompt0 text here"
+        },
+        {
+          "prompt": "prompt1 text here",
+          "negative_prompt": "negative_prompt1 text here"
+        },
+        ...
+      ]
+    },
+    ...
+]
+```
+
+Example: [../examples/prompts_and_negative_prompts.json](../examples/prompts_and_negative_prompts.json)
+
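A minimal sketch of how a file in this format could be walked to enumerate the images to generate. The function name `iter_jobs` and the sample data are hypothetical, and the file-name pattern is only inferred from the sample images committed under `examples/sdxl-images-1024` (for instance `theme_6_example_3_image_0.png`); `tools/run_examples.py` may do this differently.

```python
import json


def iter_jobs(themes, num_images_per_prompt=1):
    """Yield (file_name, prompt, negative_prompt) for every image to generate."""
    for t_idx, theme in enumerate(themes):
        for e_idx, example in enumerate(theme["examples"]):
            for i_idx in range(num_images_per_prompt):
                # Naming pattern inferred from the committed sample images (assumption).
                name = f"theme_{t_idx}_example_{e_idx}_image_{i_idx}.png"
                yield name, example["prompt"], example.get("negative_prompt")


# Hypothetical sample data in the documented format.
sample = json.loads("""
[
  {"theme": "DailyLife",
   "examples": [{"prompt": "a cat on a sofa", "negative_prompt": "blurry"},
                {"prompt": "a cup of tea", "negative_prompt": "ugly"}]}
]
""")

jobs = list(iter_jobs(sample))
for job in jobs:
    print(job)
```

Using `example.get("negative_prompt")` lets an entry omit the negative prompt; whether the real script tolerates that is not stated in this README.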
+
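+## Measuring per-module latency
+
+This project supports measuring the time spent in each module through non-intrusive instrumentation. The rough steps are:
+1. Create a timer.
+2. Register the functions or methods to be timed with the timer.
+3. Start the timer.
+4. Run the functions or methods being timed.
+5. Print the statistics.
+
+A simple usage example: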
+```python
+import random
+import time
+from migraphx_diffusers import AutoTimer
+
+def sleep_func(sleep_seconds=1):
+    time.sleep(sleep_seconds)
+
+class SleepClass:
+    def __init__(self):
+        self.min_seconds = 1
+        self.max_seconds = 5
+
+    def random_sleep(self):
+        time.sleep(random.randint(self.min_seconds, self.max_seconds))
+
+    def __call__(self, sleep_seconds=1):
+        time.sleep(sleep_seconds)
+
+obj = SleepClass()
+
+t = AutoTimer()  # step1
+
+# step2
+t.add_target(sleep_func, key="sleep_func")
+t.add_target(obj.random_sleep, key="random_sleep")
+t.add_target(obj, key="__call__")
+
+t.start_work()  # step3
+
+# step4
+for i in range(10):
+    sleep_func()
+    obj()
+    if i % 3 == 0:
+        obj.random_sleep()
+
+t.summary()  # step5
+```
+
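+Output (the column headers are printed in Chinese and read: module, number of calls, longest latency (ms), shortest latency (ms), mean latency (ms), mean throughput (fps)):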
+```
+--------------------------------------------------------------------------------------+
+|                                     Test Latency                                     |
+--------------+----------+--------------+--------------+--------------+---------------+
+|     模块     | 运行次数  | 最长耗时(ms) | 最短耗时(ms) | 平均耗时(ms) | 平均性能(fps) |
+--------------+----------+--------------+--------------+--------------+---------------+
+|  sleep_func  |    10    |   1001.06    |   1001.02    |   1001.04    |      1.0      |
+|   __call__   |    10    |   1001.07    |   1000.06    |   1000.94    |      1.0      |
+| random_sleep |    4     |    4004.1    |   1001.05    |   2252.33    |      0.44     |
+--------------+----------+--------------+--------------+--------------+---------------+
+```
+
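To make the idea of non-intrusive instrumentation concrete, here is a minimal sketch of one way such a timer could work: it swaps a method on an object for a wrapper that records wall-clock time, so the calling code stays unchanged. `TinyTimer` and `Worker` are hypothetical names; this is not how `AutoTimer` is implemented, and its `add_target` signature differs. The fps figure is computed as 1000 divided by the mean latency in milliseconds, which agrees with the numbers in the table above.

```python
import time
from collections import defaultdict


class TinyTimer:
    """Hypothetical stand-in for a non-intrusive timer (not AutoTimer itself)."""

    def __init__(self):
        self.records = defaultdict(list)  # key -> list of latencies in ms

    def add_target(self, owner, name, key=None):
        # Replace owner.<name> with a wrapper that records wall-clock time.
        original = getattr(owner, name)
        key = key or name

        def timed(*args, **kwargs):
            start = time.perf_counter()
            try:
                return original(*args, **kwargs)
            finally:
                self.records[key].append((time.perf_counter() - start) * 1000.0)

        setattr(owner, name, timed)

    def summary(self):
        # Per key: (calls, max ms, min ms, mean ms, fps).
        rows = {}
        for key, ms in self.records.items():
            mean = sum(ms) / len(ms)
            rows[key] = (len(ms), max(ms), min(ms), mean, 1000.0 / mean)
        return rows


class Worker:
    def step(self, seconds=0.01):
        time.sleep(seconds)


worker = Worker()
timer = TinyTimer()
timer.add_target(worker, "step")  # patch the method on this instance only
for _ in range(3):
    worker.step()                 # caller code is unchanged
rows = timer.summary()
print(rows["step"])
```

Patching a single instance keeps other objects of the same class untimed; a real implementation would also need a way to restore the original method.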
+Measure end-to-end performance and per-component performance for sdxl or sd2.1:
+```bash
+python tools/time_count.py -m /path/to/sdxl_models
+```
+
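+Script arguments:
+
+| Argument | Description | Type | Default |
+| --- | --- | --- | --- |
+| `-m` / `--model-dir` | **Required.** Path to the sdxl model directory | str | None |
+| `--force-compile` | Optional. Force the models to be recompiled | bool | False |
+| `--num-images-per-prompt` | Optional. Number of images generated per prompt in one run | int | 1 |
+| `--img-size` | Optional. Size of the generated image; if unset, each pipeline's default image size is used | int | None |
+| `-t` / `--num-inference-steps` | Optional. Number of inference steps used to generate an image | int | 50 |
+| `--num-warmup-loops` | Optional. Number of warmup iterations | int | 1 |
+| `--num-count-loops` | Optional. Number of iterations used for the performance statistics | int | 100 |
+| `--out-csv-file` | Optional. Path of the CSV file in which the performance data is saved | str | ./perf-{date}-{time}.csv |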