# MetaPortrait

## Paper

MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation

https://browse.arxiv.org/pdf/2212.08062.pdf

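## Model Architecture

Overall pipeline
![Alt text](imgs/image.png)

(a) $I_s$ is the source image, $I_d$ is the driving image (one frame of the driving video), and $I_s^{ldmk}$ and $I_d^{ldmk}$ are their dense landmarks; (b) $x_{in} = Concat(I_s, I_s^{ldmk}, I_d^{ldmk})$, i.e. the input $I_s$ from stage (a) together with its two outputs, and $E_w$ is a CNN encoder; (c) $E_{id}$ is a pre-trained face recognition model, FiLM stands for Feature-wise Linear Modulation, and AdaIN (Adaptive Instance Normalization) is a style-transfer technique.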
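
The operations named in the caption can be sketched with NumPy. This is only an illustration of the textbook definitions of channel concatenation, FiLM and AdaIN, not the repository's layers; the 3-channel landmark images, the 256x256 size and all function names are assumptions made for the example.

```python
import numpy as np

def concat_inputs(i_s, ldmk_s, ldmk_d):
    """Stack the source image and both landmark images along the channel axis."""
    return np.concatenate([i_s, ldmk_s, ldmk_d], axis=0)

def film(features, gamma, beta):
    """FiLM: per-channel scale (gamma) and shift (beta); features has shape (C, H, W)."""
    return gamma[:, None, None] * features + beta[:, None, None]

def adain(content, style, eps=1e-5):
    """AdaIN: renormalize each content channel to the style channel's mean and std."""
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean

rng = np.random.default_rng(0)
i_s = rng.random((3, 256, 256))     # source image (assumed RGB)
ldmk_s = rng.random((3, 256, 256))  # source landmark image (assumed 3 channels)
ldmk_d = rng.random((3, 256, 256))  # driving landmark image (assumed 3 channels)

x_in = concat_inputs(i_s, ldmk_s, ldmk_d)
print(x_in.shape)  # (9, 256, 256)
```

In the actual model the FiLM and AdaIN parameters would be predicted by learned layers; here they are plain arrays so the arithmetic stays visible.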

warping network

![Alt text](imgs/image-1.png)

$F_r$

![Alt text](imgs/image-2.png)

$F_{3d}$

![Alt text](imgs/image-3.png)                 

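## Algorithm

Purpose: the algorithm generates one-shot talking-head videos.

How it works:

1. Dense landmarks provide a geometry-aware estimate of the warping field, and the source identity is adaptively fused to better preserve the key characteristics of the portrait.

2. Meta-learning speeds up fine-tuning (adaptation) of the model.
![Alt text](imgs/image-4.png)

3. A temporally consistent super-resolution network raises the image resolution.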
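
To give an intuition for why meta-learning speeds up fine-tuning, here is a toy sketch using a Reptile-style first-order update. This README does not say which meta-learning algorithm the project uses, so Reptile is an assumption, and the scalar "model", the quadratic loss and every name below are hypothetical.

```python
def inner_adapt(theta, target, lr=0.1, steps=5):
    """A few SGD steps on the toy per-task loss (theta - target)**2."""
    phi = theta
    for _ in range(steps):
        grad = 2.0 * (phi - target)
        phi -= lr * grad
    return phi

def reptile(task_targets, theta=5.0, meta_lr=0.1, iterations=300):
    """Outer loop: nudge the shared initialization toward each task's adapted weights."""
    for i in range(iterations):
        target = task_targets[i % len(task_targets)]
        phi = inner_adapt(theta, target)
        theta += meta_lr * (phi - theta)
    return theta

tasks = [-1.0, 0.5, 2.0]   # each "task" (person) has its own optimum
theta0 = reptile(tasks)    # meta-learned initialization

# With the same 5 inner steps, starting from theta0 lands closer to a task
# optimum than starting from the arbitrary value 5.0.
err_meta = abs(inner_adapt(theta0, 2.0) - 2.0)
err_naive = abs(inner_adapt(5.0, 2.0) - 2.0)
print(err_meta < err_naive)  # True
```

The point of the sketch is only that a well-placed initialization needs fewer update steps per new identity.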



## Environment Setup


### Docker (Option 1)

    docker build --no-cache -t metaportrait:latest .
    docker run --rm --shm-size 10g --network=host --name=metaportrait --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -it <Image ID> bash
    cd sr_model/Basicsr
    pip uninstall basicsr
    python setup.py develop
    pip install urllib3==1.26.15
    # If building the environment from the Dockerfile takes too long, comment out the pip install steps in it and install the Python packages after starting the container: pip install -r requirements.txt

### Docker (Option 2)

    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04-py37-latest
    docker run --rm --shm-size 10g --network=host --name=metaportrait --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to the project>:/home/ -it <Image ID> bash
    pip install -r requirements.txt
    cd sr_model/Basicsr
    pip uninstall basicsr
    python setup.py develop
    pip install urllib3==1.26.15

### Anaconda (Option 3)
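Step 1. The DCU-specific deep learning libraries required by this project can be downloaded from the 光合开发者社区 (Guanghe Developer Community) and installed: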
https://developer.hpccube.com/tool/

    DTK driver: dtk23.04
    python: python3.7
    torch: 1.13.1
    torchvision: 0.14.1
    torchaudio: 0.13.1
    deepspeed: 0.9.2
    apex: 0.1
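
After installation, the versions listed above can be compared with what is actually installed. The following is a minimal sketch, not part of the project: the helper names are hypothetical, and the prefix comparison assumes that vendor builds may append a local suffix to the version string.

```python
REQUIRED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "torchaudio": "0.13.1",
    "deepspeed": "0.9.2",
    "apex": "0.1",
}

def check_versions(required, get_version):
    """Return {package: (wanted, found)} for every missing or mismatched package."""
    problems = {}
    for name, wanted in required.items():
        found = get_version(name)  # None when the package is not installed
        # Compare by prefix so a build suffix after the version still matches (assumption).
        if found is None or not found.startswith(wanted):
            problems[name] = (wanted, found)
    return problems

def installed_version(name):
    """Look up the installed version of a distribution, or None if it is absent."""
    try:
        from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
    except ImportError:
        # On Python 3.7 this needs the importlib_metadata backport (assumed installed).
        from importlib_metadata import version, PackageNotFoundError
    try:
        return version(name)
    except PackageNotFoundError:
        return None

if __name__ == "__main__":
    for name, (wanted, found) in check_versions(REQUIRED, installed_version).items():
        print(f"{name}: want {wanted}, found {found}")
```

Passing the lookup function as an argument keeps `check_versions` easy to try with a plain dictionary, for example `check_versions(REQUIRED, {"torch": "1.13.1"}.get)`.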

Step 2. Create and activate a virtual environment

    conda create -n meta_portrait_base python=3.7
    conda activate meta_portrait_base 

    pip install -r requirements.txt

    cd sr_model/Basicsr
    pip uninstall basicsr
    python setup.py develop

## Dataset

Download link:
https://drive.google.com/file/d/166eNbabM6TeJVy7hxol2gL1kUGKHi3Do/view?usp=share_link

```
base_model
    data
    ├── 0
    │   ├── imgs
    │   │   ├── 00000000.png
    │   │   ├── ...
    │   ├── ldmks
    │   │   ├── 00000000_ldmk.npy
    │   │   ├── ...
    │   └── thetas
    │       ├── 00000000_theta.npy
    │       ├── ...
    ├── src_0_id.npy  # identity embedding; can be obtained with a face recognition model
    ├── src_0_ldmk.npy  # landmarks
    ├── src_0.png
    ├── src_0_theta.npy  # transformation matrix that aligns the face to the image center
    └── src_map_dict.pkl
```
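
Before training, it can help to confirm that the extracted data matches the tree above. The sketch below uses only the file names shown in that tree; the function names are hypothetical, and it assumes every frame in `imgs` should have a matching landmark and theta file.

```python
from pathlib import Path

def check_sequence(seq_dir):
    """List landmark/theta files that are missing for frames found in seq_dir/imgs."""
    seq_dir = Path(seq_dir)
    missing = []
    for img in sorted((seq_dir / "imgs").glob("*.png")):
        stem = img.stem  # e.g. "00000000"
        if not (seq_dir / "ldmks" / f"{stem}_ldmk.npy").is_file():
            missing.append(f"ldmks/{stem}_ldmk.npy")
        if not (seq_dir / "thetas" / f"{stem}_theta.npy").is_file():
            missing.append(f"thetas/{stem}_theta.npy")
    return missing

def check_source(data_dir, index=0):
    """List the per-source files (src_<index>*) that are missing from data_dir."""
    data_dir = Path(data_dir)
    names = [
        f"src_{index}.png",
        f"src_{index}_id.npy",
        f"src_{index}_ldmk.npy",
        f"src_{index}_theta.npy",
        "src_map_dict.pkl",
    ]
    return [name for name in names if not (data_dir / name).is_file()]

if __name__ == "__main__":
    data = Path("base_model/data")
    if data.is_dir():
        print("missing source files:", check_source(data))
        print("missing frame files:", check_sequence(data / "0"))
```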

Download links:

(Model) https://github.com/Meta-Portrait/MetaPortrait/releases/download/v0.0.1/temporal_gfpgan.pth

(Model) https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth

(Dataset) https://hkustconnect-my.sharepoint.com/personal/cqiaa_connect_ust_hk/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fcqiaa%5Fconnect%5Fust%5Fhk%2FDocuments%2Ftalking%20head%2Frelease%2Fdata%2FHDTF%5Fwarprefine&ga=1

```
sr_model
    pretrained_ckpt
    ├── temporal_gfpgan.pth
    ├── GFPGANv1.3.pth
    ...
    data
    ├── HDTF_warprefine
    │   ├── gt
    │   ├── lq
    │   ├── ...
```



## Training

Step 1. Train the warping network

    cd base_model
    CUDA_VISIBLE_DEVICES=0 python main.py --config config/meta_portrait_256_pretrain_warp.yaml --fp16 --stage Warp --task Pretrain

Step 2. Jointly train the warping network and the refinement network. The `warp_ckpt` entry in `config/meta_portrait_256_pretrain_full.yaml` must be updated first.

    CUDA_VISIBLE_DEVICES=0 python main.py --config config/meta_portrait_256_pretrain_full.yaml --fp16 --stage Full --task Pretrain
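
The `warp_ckpt` change can be made by hand in any editor. If it needs to be scripted, a minimal sketch could look like the following; it assumes the key appears exactly once in the file as `warp_ckpt: <value>`, and the example keys and the checkpoint path are hypothetical placeholders, not values from the repository.

```python
import re

def set_warp_ckpt(yaml_text, ckpt_path):
    """Replace the value of the `warp_ckpt` key, leaving indentation and other lines intact.

    Anything after the colon on that line (including a trailing comment) is overwritten.
    """
    pattern = re.compile(r"^([ \t]*warp_ckpt[ \t]*:).*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda m: f"{m.group(1)} {ckpt_path}", yaml_text)
    if count != 1:
        raise ValueError(f"expected exactly one warp_ckpt entry, found {count}")
    return new_text

# Hypothetical config fragment, for illustration only.
example = "model:\n  warp_ckpt: null\n  lr: 0.0002\n"
updated = set_warp_ckpt(example, "checkpoint/warp.pth.tar")
print(updated)
```

A plain text substitution is used instead of a YAML parser so that comments and key order elsewhere in the file are left untouched.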

Step 3. Train the SR (super-resolution) model (run from the `sr_model` directory)

    CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 --master_port=4321 Experimental_root/train.py -opt options/train/train_sr_hdtf.yml --launcher pytorch


## Inference

Step 1. Generate 256x256 images

    cd base_model
    CUDA_VISIBLE_DEVICES=0 python inference.py --save_dir result --config config/meta_portrait_256_eval.yaml --ckpt checkpoint/ckpt_base.pth.tar

Step 2. Increase the image resolution

    cd sr_model
    CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 --master_port=4321 Experimental_root/test.py -opt options/test/same_id_demo.yml --launcher pytorch


## Results

1.
![Alt text](imgs/2500.png)


2.
![Alt text](imgs/image-9.png)

## Accuracy

|PSNR|LPIPS|$E_{warp}$|
|:---:|:---:|:---:|
|26.916|0.1514|0.0244|
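
For reference, PSNR is commonly defined as $10 \log_{10}(MAX^2 / MSE)$. The sketch below implements that standard definition on flat pixel lists; it is not the project's evaluation script, and the function name and sample values are made up for illustration.

```python
import math

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_val ** 2 / mse)

# Pixels in [0, 1] with a uniform error of 0.1 give an MSE of about 0.01.
value = psnr([0.0, 0.0], [0.1, 0.1], max_val=1.0)
print(round(value, 3))  # 20.0
```

Higher PSNR is better, while lower is better for LPIPS and the warping error.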

## Application Scenarios

### Algorithm Category

`Computer vision`

### Key Application Industries

`Face recognition, anti-fraud, beauty effects`

## Source Repository and Issue Feedback

https://developer.hpccube.com/codes/modelzoo/metaportrait_pytorch

## References

https://github.com/Meta-Portrait/MetaPortrait

https://github.com/Meta-Portrait/MetaPortrait/issues/4

https://github.com/1adrianb/face-alignment

https://datahacker.rs/010-how-to-align-faces-with-opencv-in-python/