# hunyuan-dit

> A high-performance implementation of the HunyuanDiT model for text-to-image generation.  
> This project provides environment setup, dependency installation, and usage instructions for reproducing and running the model with Docker and optimized hardware libraries.


## 🔥 Reproduction Guide

### 1. Prepare Environment

Pull the required Docker image:

```bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.1-rc5-rocblas101839-0811-das1.6-py3.10-20250908-rc1
```

### 2. Create Container

Run a Docker container with proper configurations:

```bash
docker run -it \
  --network=host \
  --hostname=localhost \
  --name=HUNYUAN \
  -v /opt/hyhal:/opt/hyhal:ro \
  -v "$PWD":/workspace \
  --ipc=host \
  --device=/dev/kfd \
  --device=/dev/mkfd \
  --device=/dev/dri \
  --shm-size=512G \
  --privileged \
  --group-add video \
  --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined \
  image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.1-rc5-rocblas101839-0811-das1.6-py3.10-20250908-rc1 \
  /bin/bash
```
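The `docker run` command above passes three device paths through to the container. Before launching, it can help to confirm they exist on the host. The following is a minimal sketch of such a check; `missing_devices` is a hypothetical helper, not part of this repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): report which of the
# given device nodes are missing on the host.
missing_devices() {
  local dev status=0
  for dev in "$@"; do
    if [ ! -e "$dev" ]; then
      echo "missing: $dev"
      status=1
    fi
  done
  return "$status"
}

# The three device paths passed to `docker run` above.
if missing_devices /dev/kfd /dev/mkfd /dev/dri; then
  echo "all device nodes present"
else
  echo "some device nodes are missing; the container may not see the accelerator" >&2
fi
```

If a path is reported missing, the host driver stack is probably not installed or not loaded, and the container is unlikely to find the accelerator.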

### 3. Clone Repository

```bash
git clone http://developer.sourcefind.cn/codes/bw_bestperf/hunyuan-dit.git
cd hunyuan-dit
```

### 4. Download & Install Dependencies

Download required custom wheels:

```bash
# Apex
curl -f -C - -o apex-1.5.0+das.opt1.dtk25041-cp310-cp310-linux_x86_64.whl https://ksefile.hpccube.com:65241/efile/s/d/amVycnJycnk=/e759f4e7fbb64b10

# Lightop
curl -f -C - -o lightop-0.5.0+das.dtk25041.unknown-cp310-cp310-linux_x86_64.whl https://ksefile.hpccube.com:65241/efile/s/d/amVycnJycnk=/3ca9654a8fc1b0b5

# Deepspeed
wget https://download.sourcefind.cn:65024/directlink/4/deepspeed/DAS1.6/deepspeed-0.14.2+das.opt1.dtk25041-cp310-cp310-manylinux_2_28_x86_64.whl
```

Install the wheels and requirements:

```bash
pip install apex-1.5.0+das.opt1.dtk25041-cp310-cp310-linux_x86_64.whl
pip install lightop-0.5.0+das.dtk25041.unknown-cp310-cp310-linux_x86_64.whl
pip install deepspeed-0.14.2+das.opt1.dtk25041-cp310-cp310-manylinux_2_28_x86_64.whl
pip install -r requirements.txt -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
```
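All three wheels carry the `cp310` tag, which matches the `py3.10` in the Docker image name. If `pip` rejects a wheel as unsupported on this platform, a mismatch between that tag and the active interpreter is a likely cause. The sketch below reads the tag from a wheel filename; `wheel_python_tag` is a hypothetical helper introduced only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract the Python tag from a wheel filename.
# Wheel names end in <python tag>-<abi tag>-<platform tag>.whl, so the
# Python tag is the third field from the end when splitting on "-".
wheel_python_tag() {
  local name="${1##*/}"   # drop any leading directory
  name="${name%.whl}"     # drop the extension
  echo "$name" | awk -F- '{ print $(NF-2) }'
}

for whl in \
  apex-1.5.0+das.opt1.dtk25041-cp310-cp310-linux_x86_64.whl \
  lightop-0.5.0+das.dtk25041.unknown-cp310-cp310-linux_x86_64.whl \
  deepspeed-0.14.2+das.opt1.dtk25041-cp310-cp310-manylinux_2_28_x86_64.whl
do
  echo "$whl -> $(wheel_python_tag "$whl")"
done
```

Compare the printed tag with the interpreter inside the container, for example via `python --version`.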

### 5. Download Optimization Packages

```bash
curl -f -C - -o hipblaslt-install0925.tar.gz https://ksefile.hpccube.com:65241/efile/s/d/amVycnJycnk=/5857030947151012
curl -f -C - -o package_0915_ubuntu.tar.gz https://ksefile.hpccube.com:65241/efile/s/d/amVycnJycnk=/0c80d0e60b9af80d
```

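Extract both archives. The test command below adds `hipblaslt-install/lib/` and `package/miopen/lib/` directories to `LD_LIBRARY_PATH`, so either extract the archives where those paths resolve or adjust the exports to match your layout.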
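The internal layout of the two archives is not documented here, so the following is only a sketch of one way to unpack them and locate their library directories. `extract_and_find_libs` is a hypothetical helper, and extracting into the current directory is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: unpack a .tar.gz into a target directory and print
# every directory named "lib" found under it.
extract_and_find_libs() {
  local archive="$1" dest="$2"
  mkdir -p "$dest" || return 1
  tar -xzf "$archive" -C "$dest" || return 1
  find "$dest" -type d -name lib
}

# Only runs when the archives from this step are present.
for archive in hipblaslt-install0925.tar.gz package_0915_ubuntu.tar.gz; do
  if [ -f "$archive" ]; then
    extract_and_find_libs "$archive" .
  fi
done
```

The printed `lib` directories are candidates for the `LD_LIBRARY_PATH` entries used when running inference.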

### 6. Download Model

Refer to the model page on ModelScope:  
https://modelscope.cn/models/dengcao/HunyuanDiT-v1.2

Commands to download and prepare:

```bash
pip install modelscope

modelscope download --model dengcao/HunyuanDiT-v1.2 --local_dir ./HunyuanDiT-v1.2

cd HunyuanDiT-v1.2

wget https://dit.hunyuan.tencent.com/download/HunyuanDiT/tokenizer.zip
wget https://dit.hunyuan.tencent.com/download/HunyuanDiT/sdxl-vae-fp16-fix.zip
wget https://dit.hunyuan.tencent.com/download/HunyuanDiT/clip_text_encoder.zip
```

Model directory structure after download:

<p align="center">
  <img src="19115934112c36d5d67394265d1498e2.png" height="300" alt="Model Directory Structure">
</p>

---

## Test Command

Set library paths and run inference:

```bash
export LD_LIBRARY_PATH=/workspace/OEM_ADVTG_TEST/hunyuan/hipblaslt-install/lib/:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/workspace/OEM_ADVTG_TEST/hunyuan/package/miopen/lib/:$LD_LIBRARY_PATH

python sample_t2i_dcu.py \
  --model-root /workspace/OEM_ADVTG_TEST/hunyuan/HunyuanDiT-v1.2/ \
  --batch-size 4 \
  --infer-mode fa \
  --prompt "青花瓷风格,一只可爱的哈士奇" \
  --no-enhance \
  --load-key module \
  --image-size 1024 1024 \
  --infer-steps 20
```
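A note on the two `export` lines: if `LD_LIBRARY_PATH` starts out empty, the form `dir:$LD_LIBRARY_PATH` leaves a trailing empty entry, which the dynamic loader treats as the current working directory. Re-running the exports in the same shell also duplicates entries. The sketch below avoids both; `prepend_ld_path` is a hypothetical helper and `ROOT` is an assumed variable for the directory that holds the extracted packages.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH without
# duplicating it and without leaving an empty entry when the variable
# starts out unset or empty.
prepend_ld_path() {
  local dir="${1%/}"   # normalise away a trailing slash
  case ":${LD_LIBRARY_PATH:-}:" in
    *":$dir:"*) ;;     # already present, nothing to do
    *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
  esac
  export LD_LIBRARY_PATH
}

# ROOT is an assumption: the directory that holds the extracted packages.
ROOT="${ROOT:-/workspace/OEM_ADVTG_TEST/hunyuan}"
prepend_ld_path "$ROOT/hipblaslt-install/lib"
prepend_ld_path "$ROOT/package/miopen/lib"
echo "$LD_LIBRARY_PATH"
```

Setting `ROOT` to your own extraction directory before sourcing the snippet keeps the rest of the command unchanged.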

---

## Configuration Options

| Option       | Description                                 | Default / Example                  |
|--------------|---------------------------------------------|----------------------------------|
| `--model-root` | Path to the downloaded model directory     | `/workspace/OEM_ADVTG_TEST/hunyuan/HunyuanDiT-v1.2/` |
| `--batch-size` | Batch size for inference                     | 4                                |
| `--infer-mode` | Inference mode; `fa` selects flash attention  | `fa`                            |
| `--prompt`     | Text prompt for image generation             | `"青花瓷风格,一只可爱的哈士奇"`   |
| `--no-enhance` | Disable prompt enhancement                    | Flag                            |
| `--load-key`   | Key for loading model weights                 | `module`                        |
| `--image-size` | Output image size `[height] [width]`          | `1024 1024`                     |
| `--infer-steps`| Number of inference steps                      | 20                             |
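
To vary a few of these options between runs without retyping the whole command, it can be assembled as a bash array. This is a sketch only: the environment variable names (`MODEL_ROOT`, `BATCH_SIZE`, `INFER_STEPS`, `PROMPT`, `RUN`) are assumptions introduced here, and the flags are exactly those from the test command.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: build the sample_t2i_dcu.py command as an array so
# individual options can be overridden from the environment.
MODEL_ROOT="${MODEL_ROOT:-/workspace/OEM_ADVTG_TEST/hunyuan/HunyuanDiT-v1.2/}"
BATCH_SIZE="${BATCH_SIZE:-4}"
INFER_STEPS="${INFER_STEPS:-20}"
PROMPT="${PROMPT:-青花瓷风格,一只可爱的哈士奇}"

cmd=(
  python sample_t2i_dcu.py
  --model-root "$MODEL_ROOT"
  --batch-size "$BATCH_SIZE"
  --infer-mode fa
  --prompt "$PROMPT"
  --no-enhance
  --load-key module
  --image-size 1024 1024
  --infer-steps "$INFER_STEPS"
)

# Print the command, shell-quoted, for inspection; set RUN=1 to execute it.
printf '%q ' "${cmd[@]}"; echo
if [ "${RUN:-0}" = "1" ]; then
  "${cmd[@]}"
fi
```

For example, saving this as a script and invoking it with `INFER_STEPS=50 RUN=1` would run the same command with 50 inference steps. Keeping the arguments in an array preserves the quoting of the prompt.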

---

## Contributing

We welcome contributions! Please follow the steps below to contribute:

1. Fork the repository.
2. Create a feature branch: `git checkout -b feature-name`.
3. Make your changes and commit with clear messages.
4. Ensure the code passes tests and adheres to the project style.
5. Open a Pull Request describing your changes.

Please report issues and suggest improvements via the issue tracker.

---

## License

This project is licensed under the **[MIT License](./LICENSE)**.  
Feel free to use, modify, and distribute under the terms of this license.

---

## Contact

For any questions or support, please contact the maintainers via the repository issue page.

---

Thank you for using **hunyuan-dit**! Enjoy exploring the power of text-to-image models.