# hunyuan-dit

> A high-performance implementation of the HunyuanDiT model for text-to-image generation.  
> This project provides environment setup, dependency installation, and usage instructions for reproducing and running the model with Docker and hardware-optimized libraries.

## 🔥 Reproduction Guide

### 1. Prepare Environment

Pull the required Docker image:

```bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.1-rc5-rocblas101839-0811-das1.6-py3.10-20250908-rc1
```

### 2. Create Container

Start a container with the required device mappings and volume mounts:

```bash
docker run -it \
  --network=host \
  --hostname=localhost \
  --name=HUNYUAN \
  -v /opt/hyhal:/opt/hyhal:ro \
  -v "$PWD":/workspace \
  --ipc=host \
  --device=/dev/kfd \
  --device=/dev/mkfd \
  --device=/dev/dri \
  --shm-size=512G \
  --privileged \
  --group-add video \
  --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined \
  image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.1-rc5-rocblas101839-0811-das1.6-py3.10-20250908-rc1 \
  /bin/bash
```
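
Once inside the container, it can help to confirm that the device nodes passed with `--device` are actually visible. The following is a minimal sketch; `check_devices` is a hypothetical helper name and is not part of this repository.

```bash
# Hypothetical helper: report which of the given device paths exist.
# Returns 0 if all are present, 1 otherwise.
check_devices() {
  local missing=0
  local dev
  for dev in "$@"; do
    if [ -e "$dev" ]; then
      echo "ok:      $dev"
    else
      echo "missing: $dev"
      missing=1
    fi
  done
  return "$missing"
}
```

For the container above, this would be called as `check_devices /dev/kfd /dev/mkfd /dev/dri`.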

### 3. Clone Repository

```bash
git clone http://developer.sourcefind.cn/codes/bw_bestperf/hunyuan-dit.git
cd hunyuan-dit
```

### 4. Download & Install Dependencies

Download the required custom wheels:

```bash
# Apex
curl -f -C - -o apex-1.5.0+das.opt1.dtk25041-cp310-cp310-linux_x86_64.whl https://ksefile.hpccube.com:65241/efile/s/d/amVycnJycnk=/e759f4e7fbb64b10

# Lightop
curl -f -C - -o lightop-0.5.0+das.dtk25041.unknown-cp310-cp310-linux_x86_64.whl https://ksefile.hpccube.com:65241/efile/s/d/amVycnJycnk=/3ca9654a8fc1b0b5

# Deepspeed
wget https://download.sourcefind.cn:65024/directlink/4/deepspeed/DAS1.6/deepspeed-0.14.2+das.opt1.dtk25041-cp310-cp310-manylinux_2_28_x86_64.whl
```

Install the wheels and requirements:

```bash
pip install apex-1.5.0+das.opt1.dtk25041-cp310-cp310-linux_x86_64.whl
pip install lightop-0.5.0+das.dtk25041.unknown-cp310-cp310-linux_x86_64.whl
pip install deepspeed-0.14.2+das.opt1.dtk25041-cp310-cp310-manylinux_2_28_x86_64.whl
pip install -r requirements.txt -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
```
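
After installation, a quick import check can catch a wheel that installed but does not load. This is a sketch under assumptions: `verify_imports` is a hypothetical helper, and the module names are assumed to match the package names (`apex`, `lightop`, `deepspeed`).

```bash
# Hypothetical helper: try importing each named module and report failures.
# The interpreter can be overridden through the PYTHON variable.
verify_imports() {
  local py="${PYTHON:-python}"
  local mod failed=0
  for mod in "$@"; do
    if "$py" -c "import $mod" >/dev/null 2>&1; then
      echo "import ok:     $mod"
    else
      echo "import failed: $mod"
      failed=1
    fi
  done
  return "$failed"
}
```

A typical call would be `verify_imports apex lightop deepspeed`; a non-zero exit status means at least one import failed.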

### 5. Download Optimization Packages

```bash
curl -f -C - -o hipblaslt-install0925.tar.gz https://ksefile.hpccube.com:65241/efile/s/d/amVycnJycnk=/5857030947151012
curl -f -C - -o package_0915_ubuntu.tar.gz https://ksefile.hpccube.com:65241/efile/s/d/amVycnJycnk=/0c80d0e60b9af80d
```

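Extract both archives. The test command below expects a `hipblaslt-install/lib` and a `package/miopen/lib` directory under `/workspace/OEM_ADVTG_TEST/hunyuan/`; if you extract elsewhere, adjust `LD_LIBRARY_PATH` accordingly.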
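
One possible way to unpack the archives, assuming they are gzip-compressed tarballs as the `.tar.gz` suffix suggests. `extract_to` is a hypothetical helper, and the internal layout of the archives is not documented here, so inspect the result before relying on it.

```bash
# Hypothetical helper: unpack a .tar.gz archive into a target directory.
extract_to() {
  local archive="$1" dest="$2"
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  # Create the destination if needed, then extract into it.
  mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

For example, `extract_to hipblaslt-install0925.tar.gz /workspace/OEM_ADVTG_TEST/hunyuan` followed by `ls /workspace/OEM_ADVTG_TEST/hunyuan` shows what the archive actually contains.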

### 6. Download Model

Refer to the model page on ModelScope:  
https://modelscope.cn/models/dengcao/HunyuanDiT-v1.2

Download the model weights, then fetch the tokenizer, VAE, and CLIP text encoder archives into the model directory:

```bash
pip install modelscope

modelscope download --model dengcao/HunyuanDiT-v1.2 --local_dir ./HunyuanDiT-v1.2

cd HunyuanDiT-v1.2

wget https://dit.hunyuan.tencent.com/download/HunyuanDiT/tokenizer.zip
wget https://dit.hunyuan.tencent.com/download/HunyuanDiT/sdxl-vae-fp16-fix.zip
wget https://dit.hunyuan.tencent.com/download/HunyuanDiT/clip_text_encoder.zip
```
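
The three `.zip` files still need to be unpacked. A minimal sketch, assuming `unzip` is installed and that the archives should be extracted in place inside `HunyuanDiT-v1.2`; `unpack_zips` is a hypothetical helper, and the target location should be checked against the directory structure shown below.

```bash
# Hypothetical helper: unpack each named .zip archive in the current directory.
# Stops at the first missing archive or failed extraction.
unpack_zips() {
  local name
  for name in "$@"; do
    if [ ! -f "$name.zip" ]; then
      echo "archive not found: $name.zip" >&2
      return 1
    fi
    # -q: quiet, -o: overwrite existing files without prompting
    unzip -q -o "$name.zip" || return 1
  done
}
```

From inside the model directory, this would be called as `unpack_zips tokenizer sdxl-vae-fp16-fix clip_text_encoder`.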

Model directory structure after download:

<p align="center">
  <img src="19115934112c36d5d67394265d1498e2.png" height="300" alt="Model Directory Structure">
</p>

---

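## Test Command

Add the hipBLASLt and MIOpen libraries from step 5 to `LD_LIBRARY_PATH`, then run inference. The paths below assume everything was placed under `/workspace/OEM_ADVTG_TEST/hunyuan`; adjust them to your layout.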

```bash
export LD_LIBRARY_PATH=/workspace/OEM_ADVTG_TEST/hunyuan/hipblaslt-install/lib/:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/workspace/OEM_ADVTG_TEST/hunyuan/package/miopen/lib/:$LD_LIBRARY_PATH

python sample_t2i_dcu.py \
  --model-root /workspace/OEM_ADVTG_TEST/hunyuan/HunyuanDiT-v1.2/ \
  --batch-size 4 \
  --infer-mode fa \
  --prompt "青花瓷风格，一只可爱的哈士奇" \
  --no-enhance \
  --load-key module \
  --image-size 1024 1024 \
  --infer-steps 20
```
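
If your files live somewhere other than `/workspace/OEM_ADVTG_TEST/hunyuan`, one option is to derive all three paths from a single variable. This is a sketch only: `HUNYUAN_ROOT`, `MODEL_ROOT`, and `prepend_ld_path` are hypothetical names introduced for illustration.

```bash
# Hypothetical variable: base directory holding the model and optimization packages.
HUNYUAN_ROOT="${HUNYUAN_ROOT:-/workspace/OEM_ADVTG_TEST/hunyuan}"

# Hypothetical helper: put a directory at the front of LD_LIBRARY_PATH,
# avoiding a trailing colon when the variable was previously empty.
prepend_ld_path() {
  export LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Same order as the exports in the test command above.
prepend_ld_path "$HUNYUAN_ROOT/hipblaslt-install/lib"
prepend_ld_path "$HUNYUAN_ROOT/package/miopen/lib"

# The model root to pass to sample_t2i_dcu.py via --model-root.
MODEL_ROOT="$HUNYUAN_ROOT/HunyuanDiT-v1.2"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
echo "MODEL_ROOT=$MODEL_ROOT"
```

The inference command then takes `--model-root "$MODEL_ROOT"` with the remaining flags unchanged.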

---

## Configuration Options

| Option          | Description                                   | Example value (from the test command)                |
|-----------------|-----------------------------------------------|------------------------------------------------------|
| `--model-root`  | Path to the downloaded model directory        | `/workspace/OEM_ADVTG_TEST/hunyuan/HunyuanDiT-v1.2/` |
| `--batch-size`  | Batch size for inference                      | `4`                                                  |
| `--infer-mode`  | Inference mode                                | `fa`                                                 |
| `--prompt`      | Text prompt for image generation              | `"青花瓷风格，一只可爱的哈士奇"`                     |
| `--no-enhance`  | Disable prompt enhancement                    | Flag (no value)                                      |
| `--load-key`    | Key selecting which model weights to load     | `module`                                             |
| `--image-size`  | Output image size as `[height] [width]`       | `1024 1024`                                          |
| `--infer-steps` | Number of inference steps                     | `20`                                                 |

---

## Contributing

We welcome contributions! Please follow the steps below to contribute:

1. Fork the repository.
2. Create a feature branch: `git checkout -b feature-name`.
3. Make your changes and commit with clear messages.
4. Make sure the code passes tests and follows the project style.
5. Open a Pull Request describing your changes.

Please report issues and suggest improvements via the issue tracker.

---

## License

This project is licensed under the **[MIT License](./LICENSE)**.  
Feel free to use, modify, and distribute under the terms of this license.

---

## Contact

For any questions or support, please contact the maintainers via the repository issue page.

---

Thank you for using **hunyuan-dit**! Enjoy exploring the power of text-to-image models.