Commit 9ed6153f authored by lijian6

Update README.md


Signed-off-by: lijian <lijian6@sugon.com>
parent a87f0843
Pipeline #1366 canceled with stages
@@ -10,13 +10,13 @@
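The model is based on a `transformer decoder` structure. Building on `DiT`, it redesigns how the `Time Embedding` and `positional Embedding` are added; the `Text Prompt` is encoded by two `text encoder`s, and everything else is the same as in DiT.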
-![alt text](readme_imgs/model_structure.png)
+![alt text](docs/model_structure.png)
## Algorithm
`self-attention` captures the structural information within the image, and `cross attention` aligns the text with the image.
-![alt text](readme_imgs/alg.png)
+![alt text](docs/alg.png)
## Environment Setup
@@ -41,7 +41,7 @@
### Model Download
-Download the diffusers model from the Hugging Face mirror site: [HunyuanDiT-Diffusers](https://hf-mirror.com/Tencent-Hunyuan/HunyuanDiT-Diffusers)
+Download the diffusers model from the Hugging Face mirror site: [HunyuanDiT-Diffusers](https://hf-mirror.com/Tencent-Hunyuan/HunyuanDiT-Diffusers)
### Run
......
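
Note on the README text in the first hunk: it says self-attention captures structure inside the image while cross attention aligns text with the image. The difference can be illustrated with a minimal pure-Python sketch of scaled dot-product attention. This is not the HunyuanDiT implementation: the helper names `softmax` and `attention`, the toy token values, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention (hypothetical helper, no learned projections).

    Each output row is a weighted average of `values`, with weights taken
    from the similarity between one query and every key.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        out.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return out


# Toy embeddings (made-up numbers): 3 image tokens and 2 text tokens, width 2.
image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_tokens = [[0.5, 0.5], [1.0, -1.0]]

# Self-attention: image tokens attend to other image tokens.
self_out = attention(image_tokens, image_tokens, image_tokens)

# Cross attention: image tokens are the queries, text tokens supply keys and values.
cross_out = attention(image_tokens, text_tokens, text_tokens)
```

In both calls the output has one row per image token; only the source of the keys and values changes, which is what lets the text prompt condition the image representation.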