# IDM-VTON

## Paper

**Improving Diffusion Models for Virtual Try-on**

* https://arxiv.org/abs/2403.05139

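## Model Architecture

The model is based on `SDXL`. It uses `IP-Adapter` and `GarmentNet` (a `UNet`) to extract garment features and inject them into the main network.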

![alt text](readme_imgs/model_structure.png)

## Algorithm

`self-attention` fuses low-level image features, and `cross attention` fuses high-level semantic features.
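
To make the two fusion paths concrete, here is a toy, pure-Python sketch of scaled dot-product attention. It is not the project's code. The token values are made up, the learned projections are omitted, and the wiring (person and garment tokens are concatenated for self-attention and only the person half is kept, then the result queries the semantic tokens) is an illustrative assumption.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-width vectors."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

# Hypothetical toy tokens: 2 person tokens, 2 garment tokens, 1 semantic token.
person = [[1.0, 0.0], [0.0, 1.0]]
garment = [[0.5, 0.5], [1.0, 1.0]]
semantic = [[0.2, 0.8]]

# Self-attention path (low-level features): attend over person + garment
# tokens together, then keep only the outputs at the person positions.
joint = person + garment
fused_low = attention(joint, joint, joint)[:len(person)]

# Cross-attention path (high-level semantics): the fused tokens act as
# queries, the semantic tokens act as keys and values.
fused_high = attention(fused_low, semantic, semantic)
```

With a single semantic token, every cross-attention weight is 1, so each row of `fused_high` equals that token. Real models use many tokens and learned query, key, and value projections.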

![alt text](readme_imgs/alg.png)


## Environment Setup

### Docker (Method 1)
    
    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-centos7.6-dtk24.04-py310

    docker run --shm-size 10g --network=host --name=idmvton --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute/path/to/project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash

    pip install -r requirements.txt

    pip install bitsandbytes-0.42.0-py3-none-any.whl  # the wheel is in the whl folder

    cd BasicSR && python setup.py develop


### Dockerfile (Method 2)

    docker build -t <IMAGE_NAME>:<TAG> .

    docker run --shm-size 10g --network=host --name=idmvton --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute/path/to/project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash

    pip install -r requirements.txt

    pip install bitsandbytes-0.42.0-py3-none-any.whl  # the wheel is in the whl folder

    cd BasicSR && python setup.py develop


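### Anaconda (Method 3)

Step 1: the DCU-specific deep learning libraries required by this project can be downloaded from the Guanghe (光合) developer community:
https://developer.hpccube.com/tool/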

    DTK driver: dtk24.04
    python: python3.10
    torch: 2.1.0
    torchvision: 0.16.0
    onnx: 1.15.0
    flash-attn: 2.0.4

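Tip: the versions of the DTK driver, python, torch, and the other DCU-related tools listed above must match one another exactly.

Step 2: install the remaining, non-DCU-specific libraries from requirements.txt: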

    pip install -r requirements.txt

    pip install bitsandbytes-0.42.0-py3-none-any.whl  # the wheel is in the whl folder

    cd BasicSR && python setup.py develop
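
Because the pinned versions must line up exactly, a quick check after installation can save debugging time. The following is a minimal sketch, not part of the project. It assumes the installed distribution names match the names in the list above, and that DCU builds may append a local suffix to the version string, which is why only the prefix is compared.

```python
import sys
from importlib import metadata

# Pins copied from the Anaconda section. The distribution names are an
# assumption; adjust them if your wheels are published under other names.
PINNED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "onnx": "1.15.0",
    "flash-attn": "2.0.4",
}

def mismatches(installed, pinned=PINNED):
    """Return {name: (found, wanted)} for every package that breaks its pin."""
    bad = {}
    for name, wanted in pinned.items():
        found = installed.get(name)
        # Prefix comparison tolerates a possible build suffix (assumption).
        if found is None or not found.startswith(wanted):
            bad[name] = (found, wanted)
    return bad

def installed_versions(names):
    """Look up installed versions; packages that are not installed map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

if __name__ == "__main__":
    if sys.version_info[:2] != (3, 10):
        print(f"python: found {sys.version_info.major}.{sys.version_info.minor}, wanted 3.10")
    for name, (found, wanted) in mismatches(installed_versions(PINNED)).items():
        print(f"{name}: found {found}, wanted {wanted}")
```

Running the script inside the prepared environment prints one line per mismatch and nothing when every pin is satisfied.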

## Datasets

|Name|Link|
|:---|:---|
|VITON-HD| https://github.com/shadow2496/VITON-HD |
|Dress Code|https://github.com/aimagelab/dress-code|

Besides the original data available for download, this project provides a dataset for testing, stored in `datasets`.

VITON-HD

    train
    |-- ...

    test
    |-- image
    |-- image-densepose
    |-- agnostic-mask
    |-- cloth
    |-- vitonhd_test_tagged.json

DressCode

    |-- dresses
        |-- images
        |-- image-densepose
        |-- dc_caption.txt
        |-- ...
    |-- lower_body
        |-- images
        |-- image-densepose
        |-- dc_caption.txt
        |-- ...
    |-- upper_body
        |-- images
        |-- image-densepose
        |-- dc_caption.txt
        |-- ...

Note: `image-densepose` is generated with [detectron2](https://github.com/facebookresearch/detectron2); see https://github.com/sangyun884/HR-VITON/issues/45 for details. It can also be downloaded directly from the [original link](https://kaistackr-my.sharepoint.com/:u:/g/personal/cpis7_kaist_ac_kr/EaIPRG-aiRRIopz9i002FOwBDa-0-BHUKVZ7Ia5yAVVG3A?e=YxkAip).
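
Before launching inference, it can help to confirm that a dataset directory matches the trees shown above. This is a small sketch rather than project code: the entry names come from those trees, while the function names and any path passed in are hypothetical.

```python
from pathlib import Path

# Entries taken from the VITON-HD test tree shown above.
VITON_HD_TEST = ["image", "image-densepose", "agnostic-mask", "cloth",
                 "vitonhd_test_tagged.json"]

# Categories and per-category entries taken from the DressCode tree.
DRESS_CODE_CATEGORIES = ["dresses", "lower_body", "upper_body"]
DRESS_CODE_ENTRIES = ["images", "image-densepose", "dc_caption.txt"]

def missing_viton_hd(test_root):
    """Return the expected VITON-HD test entries that are absent."""
    root = Path(test_root)
    return [name for name in VITON_HD_TEST if not (root / name).exists()]

def missing_dress_code(root):
    """Return 'category/entry' strings for absent DressCode entries."""
    root = Path(root)
    return [f"{cat}/{name}"
            for cat in DRESS_CODE_CATEGORIES
            for name in DRESS_CODE_ENTRIES
            if not (root / cat / name).exists()]
```

For example, `missing_viton_hd("<path/to/datasets/viton-hd>/test")` returns an empty list when all five entries are present.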


## Inference

### Model Download

[hugging-face](https://hf-mirror.com/yisol/IDM-VTON) | [SCNet](http://113.200.138.88:18080/aimodels/IDM-VTON)

[hugging-face](https://hf-mirror.com/yisol/IDM-VTON-DC) | [SCNet]()

    ckpt/
    ├── densepose
    │   └── model_final_162be9.pkl
    ├── humanparsing
    │   ├── parsing_atr.onnx
    │   └── parsing_lip.onnx
    └── openpose
        └── ckpts
            └── body_pose_model.pth
    

    pretrained_models/
    ├── dc
    │   ├── config.json
    │   └── diffusion_pytorch_model.bin
    ├── image_encoder
    │   ├── config.json
    │   └── model.safetensors
    ├── model_index.json
    ├── scheduler
    │   └── scheduler_config.json
    ├── text_encoder
    │   ├── config.json
    │   └── model.safetensors
    ├── text_encoder_2
    │   ├── config.json
    │   └── model.safetensors
    ├── tokenizer
    │   ├── merges.txt
    │   ├── special_tokens_map.json
    │   ├── tokenizer_config.json
    │   └── vocab.json
    ├── tokenizer_2
    │   ├── merges.txt
    │   ├── special_tokens_map.json
    │   ├── tokenizer_config.json
    │   └── vocab.json
    ├── unet
    │   ├── config.json
    │   └── diffusion_pytorch_model.bin
    ├── unet_encoder
    │   ├── config.json
    │   └── diffusion_pytorch_model.safetensors
    └── vae
        ├── config.json
        └── diffusion_pytorch_model.safetensors

### Command Line

#### VITON-HD

    accelerate launch inference.py \
    --width 768 --height 1024 --num_inference_steps 30 \
    --pretrained_model_name_or_path <path/to/pretrained_models> \
    --output_dir "result" \
    --unpaired \
    --data_dir <path/to/datasets/viton-hd> \
    --seed 42 \
    --test_batch_size 1 \
    --guidance_scale 2.0

#### DressCode

    accelerate launch inference_dc.py \
    --width 768 --height 1024 --num_inference_steps 30 \
    --pretrained_model_name_or_path <path/to/pretrained_models> \
    --unet_path <path/to/pretrained_models/dc> \
    --output_dir "result" \
    --unpaired \
    --data_dir <path/to/datasets/dress-code> \
    --seed 42 \
    --test_batch_size 1 \
    --guidance_scale 2.0 \
    --category "upper_body"

Note: the commands above run multi-GPU inference by default. Use `export HIP_VISIBLE_DEVICES=<device IDs>` to restrict which devices are used.
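
When the same inference run needs to be launched from a script, the command can be assembled as an argument list. The sketch below is one possible wrapper, not something shipped with the project. It reuses only the flags shown in the commands above, and the helper names `build_command` and `run_on_devices` are hypothetical.

```python
import os
import subprocess

def build_command(script, model_dir, data_dir, extra=()):
    """Assemble the accelerate launch argv using the flags shown above."""
    return [
        "accelerate", "launch", script,
        "--width", "768", "--height", "1024",
        "--num_inference_steps", "30",
        "--pretrained_model_name_or_path", model_dir,
        "--output_dir", "result",
        "--unpaired",
        "--data_dir", data_dir,
        "--seed", "42",
        "--test_batch_size", "1",
        "--guidance_scale", "2.0",
        *extra,
    ]

def run_on_devices(cmd, devices):
    """Run cmd with HIP_VISIBLE_DEVICES limited to the given device IDs."""
    visible = ",".join(str(d) for d in devices)
    env = dict(os.environ, HIP_VISIBLE_DEVICES=visible)
    return subprocess.run(cmd, env=env, check=True)
```

For DressCode, pass the additional flags through `extra`, for example `extra=["--unet_path", "<path/to/pretrained_models/dc>", "--category", "upper_body"]` with `script="inference_dc.py"`. Passing a list to `subprocess.run` avoids shell quoting problems and line-continuation mistakes.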

### webui

    python gradio_demo/app.py

Note: before launching, set `base_path='path/to/pretrained_models'` in `gradio_demo/app.py`.

## result

|model|image|cloth|prompt|result|
|:---|:---:|:---:|:---:|:---:|
|dc|![alt txt](readme_imgs/020716_0.jpg)|![alt txt](readme_imgs/020717_1.jpg)|Short Sleeves Round Neck Knit Dress|![alt txt](readme_imgs/020716_0_r.jpg)|
||![alt txt](readme_imgs/00006_00.jpg)|![alt txt](readme_imgs/00013_00.jpg)|a photo of Short Sleeve Round Neck T-shirts|![alt txt](readme_imgs/00006_00_r.jpg)|


### Accuracy



## Application Scenarios

### Algorithm Category

`AIGC`

### Key Application Industries

`Retail, advertising and media, e-commerce`

## Source Repository and Issue Feedback

* https://developer.hpccube.com/codes/modelzoo/idm-vton_pytorch

## References

* https://github.com/yisol/IDM-VTON