
# Kolors

## Paper

`Kolors: Effective Training of Diffusion Model for Photorealistic Text-to-Image Synthesis`

* https://github.com/Kwai-Kolors/Kolors/blob/master/imgs/Kolors_paper.pdf 

## Model Architecture

The model is based on `SDXL` and uses `ChatGLM-6B-Base` as its `text-encoder`.

![alt text](readme_imgs/arch.png)

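## Algorithm

The algorithm uses a bilingual language model as the `text-encoder`, relies on carefully annotated `image-text` training data, and adopts a two-stage training strategy that follows the DDPM training objective.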

![alt text](readme_imgs/alg.png)
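
For readers unfamiliar with the DDPM training objective mentioned above, the following is a minimal pure-Python sketch of the standard noise-prediction loss. It is not taken from the Kolors code base: the function names, the linear beta schedule, and its start and end values are illustrative assumptions, and text conditioning is omitted for brevity.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (illustrative values)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out

def add_noise(x0, noise, alpha_bar_t):
    """Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    a = math.sqrt(alpha_bar_t)
    s = math.sqrt(1.0 - alpha_bar_t)
    return [a * x + s * n for x, n in zip(x0, noise)]

def ddpm_loss(predict_noise, x0, t, alpha_bars, rng):
    """Mean squared error between the sampled noise and the model's prediction."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = add_noise(x0, noise, alpha_bars[t])
    pred = predict_noise(x_t, t)  # hypothetical model: (x_t, t) -> predicted noise
    return sum((n - p) ** 2 for n, p in zip(noise, pred)) / len(x0)
```

In an actual training loop, `predict_noise` would be the U-Net conditioned on the text-encoder output, and `x0` would be a VAE latent rather than a short list of floats.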

## Environment Setup

### Docker (Method 1)
    
    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10

    docker run --shm-size 50g --network=host --name=kolors --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash

    pip install -r requirements.txt

    python setup.py install


### Dockerfile (Method 2)

    docker build -t <IMAGE_NAME>:<TAG> .

    docker run --shm-size 50g --network=host --name=kolors --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash

    pip install -r requirements.txt

    python setup.py install


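### Anaconda (Method 3)

1. The DCU-specific deep learning libraries required by this project can be downloaded and installed from the 光合 (Guanghe) developer community:
https://developer.sourcefind.cn/tool/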

    DTK driver: dtk24.04.1
    python: python3.10
    torch: 2.1.0
    torchvision: 0.16.0
    deepspeed: 0.12.3
    xformers: 0.0.25
    triton: 2.1.0

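Tip: the versions of the DTK driver, python, torch, and the other DCU-related tools listed above must match each other exactly.

2. Install the remaining general-purpose libraries from requirements.txt: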

    pip install -r requirements.txt

    pip install accelerate==0.31.0

    python setup.py install
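
Because the versions must match exactly, it can help to check the installed packages before running anything. The following is a minimal sketch using the standard-library `importlib.metadata`; the `check_versions` helper is hypothetical, and the prefix comparison assumes that DCU builds may append a local suffix to the version string.

```python
from importlib import metadata

# Versions copied from the list in this section. Only the leading part of the
# installed version is compared (assumption: DCU wheels may add a suffix).
EXPECTED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "deepspeed": "0.12.3",
    "xformers": "0.0.25",
    "triton": "2.1.0",
}

def check_versions(expected, get_version=metadata.version):
    """Return {package: (expected, found)} for each mismatch; found is None if missing."""
    problems = {}
    for name, want in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found is None or not found.startswith(want):
            problems[name] = (want, found)
    return problems
```

Calling `check_versions(EXPECTED)` inside the prepared environment returns an empty dictionary when every listed package is installed at the expected version.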

## Dataset

None

## Training

None

## Inference

    python scripts/sample.py <prompt>

    # webui
    python scripts/sampleui.py
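
To generate images for several prompts in a row, the command-line script can be driven from Python. This is a minimal sketch that assumes `scripts/sample.py` takes the prompt as its only positional argument, as in the command shown in this section; `build_sample_command` and `run_prompts` are hypothetical helper names.

```python
import subprocess
import sys

def build_sample_command(prompt, script="scripts/sample.py"):
    """Argument list for one run; passing a list avoids shell quoting problems."""
    return [sys.executable, script, prompt]

def run_prompts(prompts, runner=subprocess.run):
    """Run the sampling script once per prompt and stop at the first failure."""
    for prompt in prompts:
        runner(build_sample_command(prompt), check=True)
```

For example, `run_prompts(["Oriental Pearl Tower, Cyberpunk style."])` launches one sampling run from the project root.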

## Results

|||||
|:---:|:---|:---:|:---:|
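|prompt|一只熊猫坐在湖边看夕阳，湖边有一片竹林 (A panda sitting by a lake watching the sunset, with a bamboo grove by the lake)|Oriental Pearl Tower, Cyberpunk style.|一张瓢虫的照片，微距，变焦，高质量，电影，拿着一个牌子，写着“可图” (A photo of a ladybug, macro, zoom, high quality, cinematic, holding a sign that reads “可图”)|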
|output|![alt text](readme_imgs/r1.png)|![alt text](readme_imgs/r2.png)|![alt text](readme_imgs/r3.png)|

### Accuracy

None

## Application Scenarios

### Algorithm Category

`AIGC`

### Key Application Industries

`Retail, broadcast media, education`

## Pretrained Weights

[Hugging Face](https://huggingface.co/Kwai-Kolors/Kolors/tree/main) | [ModelScope](https://modelscope.cn/models/Kwai-Kolors/Kolors/files)

Weight file structure:
    
    weights
    └── Kolors
        ├── imgs
        │   └── head_final3.png
        ├── model_index.json
        ├── MODEL_LICENSE
        ├── README.md
        ├── scheduler
        │   └── scheduler_config.json
        ├── text_encoder
        │   ├── config.json
        │   ├── configuration_chatglm.py
        │   ├── modeling_chatglm.py
        │   ├── __pycache__
        │   │   ├── configuration_chatglm.cpython-311.pyc
        │   │   ├── configuration_chatglm.cpython-37.pyc
        │   │   ├── configuration_chatglm.cpython-38.pyc
        │   │   ├── configuration_chatglm.cpython-39.pyc
        │   │   ├── modeling_chatglm.cpython-38.pyc
        │   │   ├── modeling_chatglm.cpython-39.pyc
        │   │   ├── tokenization_chatglm.cpython-38.pyc
        │   │   └── tokenization_chatglm.cpython-39.pyc
        │   ├── pytorch_model-00001-of-00007.bin
        │   ├── pytorch_model-00002-of-00007.bin
        │   ├── pytorch_model-00003-of-00007.bin
        │   ├── pytorch_model-00004-of-00007.bin
        │   ├── pytorch_model-00005-of-00007.bin
        │   ├── pytorch_model-00006-of-00007.bin
        │   ├── pytorch_model-00007-of-00007.bin
        │   ├── pytorch_model.bin.index.json
        │   ├── quantization.py
        │   ├── tokenization_chatglm.py
        │   ├── tokenizer_config.json
        │   ├── tokenizer.model
        │   └── vocab.txt
        ├── tokenizer
        │   ├── tokenization_chatglm.py
        │   ├── tokenizer_config.json
        │   ├── tokenizer.model
        │   └── vocab.txt
        ├── unet
        │   ├── config.json
        │   ├── diffusion_pytorch_model.fp16.safetensors
        │   └── diffusion_pytorch_model.safetensors
        └── vae
            ├── config.json
            ├── diffusion_pytorch_model.bin
            ├── diffusion_pytorch_model.fp16.bin
            ├── diffusion_pytorch_model.fp16.safetensors
            └── diffusion_pytorch_model.safetensors
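
After downloading, a quick check can confirm that the directory matches the layout above. The sketch below uses only `pathlib`; the `missing_weight_files` helper is hypothetical, and the choice of which files count as required is an assumption based on the tree, not on the loading code.

```python
from pathlib import Path

# A subset of the files from the tree above (assumed to be the essential ones).
REQUIRED_FILES = [
    "model_index.json",
    "scheduler/scheduler_config.json",
    "text_encoder/config.json",
    "text_encoder/pytorch_model.bin.index.json",
    "tokenizer/tokenizer.model",
    "unet/config.json",
    "vae/config.json",
]

def missing_weight_files(root, required=REQUIRED_FILES):
    """Return the relative paths from `required` that are not files under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_file()]
```

Calling `missing_weight_files("weights/Kolors")` returns an empty list when all of the listed files are present.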

## Source Repository and Issue Reporting

* https://developer.sourcefind.cn/codes/modelzoo/kolors_pytorch

## References

* https://github.com/Kwai-Kolors/Kolors.git