first

Commit c93161f8 authored Apr 09, 2026 by raojy

Showing with 156 additions and 0 deletions

README.md +90 -0

diffusers +1 -0

doc/1.png +0 -0

run.py +65 -0

--- a/README.md
+++ b/README.md
 # ernie_image_pytorch

+## Paper
+[ernie_image]()
+
+## Model Overview
+
+1111111111
+
+<div align=center>
+    <img src="./doc/qwen3.5_397b_a17b_infra.jpg"/>
+</div>
+
+## Environment Dependencies
+| Software |                  Version                  |
+| :------: |:-----------------------------------------:|
+| DTK |                   26.04                   |
+| python |                  3.10.12                  |
+| transformers |                5.5.0                 |
+| torch | 2.9.0+das.opt1.dtk2604.20260206.g275d08c2 |
+| torchvision | 0.24.0+das.opt1.dtk2604.20260210.gf0277aff |
+| pillow | 12.1.1 |
+| accelerate | 1.12.0 |
+
+Currently recommended image: harbor.sourcefind.cn:5443/dcu/admin/base/custom:vllm0.15.1-ubuntu22.04-dtk26.04-0130-py3.10-20260220
+
+- Adjust the `-v` mount paths to match where your model and data are actually stored
+```bash
+docker run -it \
+    --shm-size 200g \
+    --network=host \
+    --name erinie \
+    --privileged \
+    --device=/dev/kfd \
+    --device=/dev/dri \
+    --device=/dev/mkfd \
+    --group-add video \
+    --cap-add=SYS_PTRACE \
+    --security-opt seccomp=unconfined \
+    -u root \
+    -v /opt/hyhal/:/opt/hyhal/:ro \
+    -v /path/your_code_data/:/path/your_code_data/ \
+    harbor.sourcefind.cn:5443/dcu/admin/base/custom:vllm0.15.1-ubuntu22.04-dtk26.04-0130-py3.10-20260220 bash
+```
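+More images are available for download from [光源 (SourceFind)](https://sourcefind.cn/#/service-list).
+
+The DCU-specific deep learning libraries this project needs can be downloaded from the [光合 developer community](https://developer.sourcefind.cn/tool/). The numpy and transformers libraries must be replaced; the commands below pin transformers and install the bundled diffusers submodule in editable mode:
+```bash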
+pip install transformers==5.5.0
+cd diffusers
+pip install -e .
+```
+
+## Dataset
+None yet
+
+## Training
+None yet
+
+## Inference
+### torch
+#### Single-node inference
+```bash
+python run.py
+```
+### vllm
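+#### Single-node inference
+
+**Note**: when launching the service on `K100 AI`, add the `--disable-custom-all-reduce` flag; when launching the service with a W8A8 model, add `-cc.mode=3` and `-cc.inductor_compile_config='{"combo_kernels": false, "benchmark_combo_kernel": false}'`.
+
+```bash
+# (the vllm launch command is not included in this commit)
+```
+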
+## Results
+<div align=center>
+    <img src="./doc/1.png"/>
+</div>
+
+### Accuracy
+Accuracy on DCU matches GPU; inference framework: vllm.
+
+## Pretrained Weights
+| Model name | Weight size | DCU model | Minimum cards required | Download link |
+|:------:|:----:|:----------:|:------:|:---------------------:|
+
+
+## Source Repository and Issue Feedback
+- https://developer.sourcefind.cn/codes/modelzoo/
+
+## References
+- https://github.com//
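Review note on the environment table in README.md: the pinned versions are easy to drift from once `pip install` has been run inside the container. The following is a minimal sketch, not part of the commit, of how the pure-Python pins could be checked with the standard library. The `REQUIRED` dict copies only the `transformers`, `pillow`, and `accelerate` rows from the table; the helper names `find_mismatches` and `installed_versions` are hypothetical. The vendor builds of `torch` and `torchvision` are left out because their local version suffixes may differ between images.

```python
from importlib import metadata

# Pins copied from the README environment table (pure-Python packages only).
REQUIRED = {
    "transformers": "5.5.0",
    "pillow": "12.1.1",
    "accelerate": "1.12.0",
}


def find_mismatches(required, installed):
    """Return {package: (wanted, found)} for every package whose version differs."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)  # None means "not installed"
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Look up installed versions; packages that are missing map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = find_mismatches(REQUIRED, installed_versions(REQUIRED))
    for name, (wanted, found) in report.items():
        print(f"{name}: expected {wanted}, found {found}")
```

Running the script inside the container prints one line per package that deviates from the table and prints nothing when all three pins match.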
--- a/diffusers @ b360596f
+++ b/diffusers @ b360596f
+Subproject commit b360596fa8933d59abf4edc91f036807ee6bbe61
--- a/doc/1.png
+++ b/doc/1.png
--- a/run.py
+++ b/run.py
+import os
+import random
+import numpy as np
+import torch
+from diffusers import ErnieImagePipeline
+from tqdm import tqdm
+
+# Set global random seeds for reproducibility (uncomment the fixed seed to repeat a run)
+# seed = 42
+seed = random.randint(0, 100000)
+print(f"seed: {seed}")
+random.seed(seed)
+np.random.seed(seed)
+torch.manual_seed(seed)
+# On DCU, torch.cuda.manual_seed_all is automatically mapped to the underlying hipRAND
+torch.cuda.manual_seed_all(seed)
+
+# Force deterministic kernels and disable auto-tuning; on DCU, MIOpen takes over these cudnn settings
+torch.backends.cudnn.deterministic = True
+torch.backends.cudnn.benchmark = False
+
+# Load the pipeline
+# Note: if your DCU (e.g. some older models) has poor bfloat16 support, try torch.float16 instead
+pipe = ErnieImagePipeline.from_pretrained(
+    "/public/home/raojy/project/baidu/ERNIE-Image",
+    torch_dtype=torch.bfloat16,
+)
+
+# The DCU build of PyTorch automatically maps "cuda" to the DCU device
+pipe = pipe.to("cuda")
+
+pipe.transformer.eval()
+pipe.vae.eval()
+pipe.text_encoder.eval()
+pipe.pe.eval()
+
+# Enable CPU offload if device memory is insufficient
+# pipe.enable_model_cpu_offload()
+
+# Create the seeded generator; "cuda" here is likewise mapped to the DCU
+generator = torch.Generator(device="cuda").manual_seed(seed)
+
+# Make sure the output directory exists
+os.makedirs("../tests", exist_ok=True)
+
+# Generate the images
+prompt_list = [
+    "A highly detailed biological pathway diagram in BioRender style. Depicting the viral infection process of human immune cells. Showing a virus particle attaching to a T-cell receptor, viral RNA replicating inside the cell nucleus, and the cell transforming into a malignant tumor cell. Includes molecular signaling pathways, proteins, and epigenetic modification symbols. Scientific flat vector style, soft pastel medical color palette, clean white background, educational graphic, crisp lines, professional scientific journal illustration. --ar 16:9 --v 6.0",
+    "建筑坐落在住宅小区道路旁，被高大浓密的绿色乔木包围，树冠形成自然遮荫空间，严格保持图中所有元素的一致性。 午后自然阳光透过树叶形成斑驳光影（dappled sunlight），光线柔和且具有层次，地面呈现清晰树影，空气通透，微风感 建筑下方为开放式咖啡空间，人群自然分布，轻松社交状态，室内外边界模糊，空间通透流动 摄影级真实渲染，超高动态范围（HDR），自然曝光，真实反射与折射，玻璃微反射环境，极致细节。SANAA建筑风格，极简主义，轻盈漂浮感，日式当代建筑语言，自然主义建筑融合 高级建筑摄影风格，类似Dezeen / ArchDaily封面级别，纪实但理想化 电影级光影（cinematic natural lighting），柔和高光不过曝，阴影细腻。tree canopy filtered sunlight, soft shadows, volumetric light subtle, natural ambient occlusion 阳光穿过树叶产生细碎光斑，地面光影随机分布，光影边界柔和不过硬 色反射光（green bounce light）轻微影响建筑底部色温。广角镜头 24mm，低机位轻微仰视，前景草地，中景建筑，背景树林 画面有树枝作为前景遮挡（frame foreground leaves），增加空间层次 景深适中，整体清晰但有空气透视。--quality 2 --style raw --ar 3:4 --lighting natural --render photorealistic --detail ultra",
+    "A photograph of the Straw Hat Pirates drawn on a glass whiteboard with a faded green marker, front view, 4K resolution."
+]
+
+for idx, prompt in enumerate(prompt_list):
+    output = pipe(
+        prompt=prompt,
+        height=1024,
+        width=1024,
+        num_inference_steps=50,
+        guidance_scale=5.0,
+        generator=generator,
+    )
+    revised_prompt = output.revised_prompts
+    images = output.images
+    images[0].save(f"../tests/hf_output{idx+1}.png")
+    print(f"Prompt {idx+1} revised: {revised_prompt}")
\ No newline at end of file
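
Review note on seeding in run.py: the script draws a random seed and shares one `generator` across all three prompts, so the second and third images depend on the generator state left by the earlier calls. The following is a minimal sketch, not part of the commit, of one possible alternative. The seed can be fixed through an environment variable, and each prompt gets its own derived seed. The variable name `ERNIE_SEED` and the helpers `resolve_seed` and `per_prompt_seeds` are hypothetical. The sketch uses only the standard library, so creating the actual `torch.Generator` is left as a comment.

```python
import os
import random


def resolve_seed(env=os.environ, var="ERNIE_SEED"):
    """Use the integer in env[var] when set; otherwise draw a fresh seed as run.py does."""
    raw = env.get(var)
    if raw is not None:
        return int(raw)
    return random.randint(0, 100000)


def per_prompt_seeds(base_seed, num_prompts):
    """Give prompt i the seed base_seed + i, so each image depends only on its own seed."""
    return [base_seed + i for i in range(num_prompts)]


if __name__ == "__main__":
    seed = resolve_seed()
    print(f"seed: {seed}")
    for idx, prompt_seed in enumerate(per_prompt_seeds(seed, 3)):
        # In run.py, this is where a fresh
        # torch.Generator(device="cuda").manual_seed(prompt_seed)
        # could be created and passed as generator= for prompt idx.
        print(f"prompt {idx + 1} -> seed {prompt_seed}")
```

With this layout, a single image could be regenerated from the printed base seed and its index without rerunning the earlier prompts, assuming the pipeline itself is deterministic on the target device.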