# MetaPortrait

## Paper

MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation

https://browse.arxiv.org/pdf/2212.08062.pdf

## Model Architecture

Overall pipeline
![Alt text](imgs/image.png)

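(a) $`I_s`$ is the source image and $`I_d`$ is the driving image (one frame of the driving video); $`I_s^{ldmk}`$ and $`I_d^{ldmk}`$ are their respective dense landmarks. (b) $`x_{in} = Concat(I_s, I_s^{ldmk}, I_d^{ldmk})`$, i.e. the input $`I_s`$ of stage (a) together with its two outputs; $`E_w`$ is a CNN encoder. (c) $`E_{id}`$ is a pretrained face recognition model, FILM stands for Feature-wise Linear Modulation, and AdaIN (adaptive instance normalization) is a style transfer method.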
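
The three operations named in the caption can be illustrated with a minimal NumPy sketch. It is not taken from the repository: the function names, the channel-first layout, and the assumption that each landmark map is rendered as a 3-channel 256x256 image are all illustrative.

```python
import numpy as np


def build_warp_input(src_img, src_ldmk, drv_ldmk):
    """x_in = Concat(I_s, I_s^ldmk, I_d^ldmk) along the channel axis (CHW layout assumed)."""
    return np.concatenate([src_img, src_ldmk, drv_ldmk], axis=0)


def film(feat, gamma, beta):
    """FiLM: per-channel affine modulation, gamma * feat + beta."""
    return gamma[:, None, None] * feat + beta[:, None, None]


def adain(content, style, eps=1e-5):
    """AdaIN: normalize each content channel, then apply the style's per-channel mean/std."""
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean


# Random stand-ins for the source image and the two rendered landmark maps (assumed shapes).
rng = np.random.default_rng(0)
I_s = rng.random((3, 256, 256))
ldmk_s = rng.random((3, 256, 256))
ldmk_d = rng.random((3, 256, 256))

x_in = build_warp_input(I_s, ldmk_s, ldmk_d)
print(x_in.shape)  # (9, 256, 256)
```

In the real model, `gamma` and `beta` for FiLM would be predicted from the identity embedding produced by $`E_{id}`$; here they are plain arrays passed in by the caller.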

warping network

![Alt text](imgs/image-1.png)

$`F_r`$

![Alt text](imgs/image-2.png)

$`F_{3d}`$

![Alt text](imgs/image-3.png)                 

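## Algorithm Principles

Purpose: the algorithm generates one-shot talking-head videos.

Principles:

1. Dense landmarks provide a geometry-aware estimate of the warping field, and the source identity is fused adaptively to better preserve key portrait characteristics.

2. Meta-learning speeds up fine-tuning (personalized adaptation) of the model.
![Alt text](imgs/image-4.png)

3. A temporally consistent super-resolution network increases the image resolution.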
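
To give an intuition for point 2, the toy sketch below uses a first-order, Reptile-style update on scalar quadratic "tasks". This is an assumption chosen for illustration, not the training code of this repository, and the README does not state which meta-learning algorithm the authors use. The idea shown is only that meta-training looks for an initialization from which a few gradient steps already fit a new task well.

```python
def inner_adapt(w, center, lr=0.1, steps=5):
    """A few gradient steps on the per-task loss (w - center)^2."""
    for _ in range(steps):
        grad = 2.0 * (w - center)
        w = w - lr * grad
    return w


def meta_train(task_centers, meta_lr=0.5, epochs=200, w0=0.0):
    """Batched first-order meta-update: move the init toward the average adapted weight."""
    w = w0
    for _ in range(epochs):
        deltas = [inner_adapt(w, c) - w for c in task_centers]
        w = w + meta_lr * sum(deltas) / len(deltas)
    return w


# Toy "identities": each task wants a different optimum.
centers = [-1.0, 0.0, 4.0]
w_meta = meta_train(centers)

# Adapting to an unseen task with the same small step budget.
new_center = 1.5
loss_from_meta = (inner_adapt(w_meta, new_center) - new_center) ** 2
loss_from_far = (inner_adapt(10.0, new_center) - new_center) ** 2
print(loss_from_meta < loss_from_far)
```

For these quadratic tasks the learned initialization settles at the mean of the task optima, which is why the same five adaptation steps end much closer to the new optimum than they do from a distant starting point.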



## Environment Setup

### Docker (Method 1)

    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04-py37-latest
    docker run --shm-size 10g --network=host --name=metaportrait --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to project>:/home/ -it <Image ID> bash
    pip install -r requirements.txt
    cd sr_model/Basicsr
    pip uninstall basicsr
    python setup.py develop
    pip install urllib3==1.26.15

### Dockerfile (Method 2)

    docker build --no-cache -t metaportrait:latest .
    docker run --shm-size 10g --network=host --name=metaportrait --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -it <Image ID> bash
    cd sr_model/Basicsr
    pip uninstall basicsr
    python setup.py develop
    pip install urllib3==1.26.15
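    # If building the environment from the Dockerfile takes too long, comment out the pip install step in it, then install the Python libraries after starting the container: pip install -r requirements.txt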


### Anaconda (Method 3)
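1. The DCU-specific deep learning libraries this project requires can be downloaded and installed from the Guanghe developer community (光合开发者社区):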
https://developer.hpccube.com/tool/

    DTK driver: dtk23.04
    python:python3.7
    torch:1.13.1
    torchvision:0.14.1
    torchaudio:0.13.1
    deepspeed:0.9.2
    apex:0.1
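
Once the packages are installed, a short check such as the following can confirm that the environment matches the versions listed above. This is a sketch: it assumes each package exposes a `__version__` attribute (some builds, apex being a likely case, may not), and it tolerates local build suffixes such as `+git...`. The helper names are illustrative.

```python
import importlib

# Versions copied from the list above.
EXPECTED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "torchaudio": "0.13.1",
    "deepspeed": "0.9.2",
    "apex": "0.1",
}


def installed_version(module_name):
    """Return the module's __version__, or None if it is missing or cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:  # broken installs can raise more than ImportError
        return None
    return getattr(module, "__version__", None)


def version_matches(found, wanted):
    """True if `found` equals `wanted`, ignoring a '+local' suffix or extra trailing components."""
    base = str(found).split("+", 1)[0]
    return base == wanted or base.startswith(wanted + ".")


def check_versions(expected):
    """Map each package name to (wanted, found, ok)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        ok = found is not None and version_matches(found, wanted)
        report[name] = (wanted, found, ok)
    return report


def print_report(report):
    for name, (wanted, found, ok) in report.items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: wanted {wanted}, found {found or 'unknown'} [{status}]")
```

Calling `print_report(check_versions(EXPECTED))` inside the activated environment prints one line per package.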

2. Create and activate the virtual environment

    conda create -n meta_portrait_base python=3.7
    conda activate meta_portrait_base 

    pip install -r requirements.txt

    cd sr_model/Basicsr
    pip uninstall basicsr
    python setup.py develop

## Dataset

Download link:
https://drive.google.com/file/d/166eNbabM6TeJVy7hxol2gL1kUGKHi3Do/view?usp=share_link

```
base_model
data
├── 0
│   ├── imgs
│   │   ├── 00000000.png
│   │   ├── ...
│   ├── ldmks
│   │   ├── 00000000_ldmk.npy
│   │   ├── ...
│   └── thetas
│       ├── 00000000_theta.npy
│       ├── ...
├── src_0_id.npy  # identity embedding; can be obtained with a face recognition model
├── src_0_ldmk.npy  # landmarks
├── src_0.png 
├── src_0_theta.npy  # transformation matrix that aligns the face to the image center
└── src_map_dict.pkl
```
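
The layout above can be read with a small helper like the one below. It is a sketch rather than the repository's data loader: `load_source` and the returned dictionary keys are illustrative names, the array shapes are whatever the `.npy` files contain, and `src_map_dict.pkl` is left untouched because its contents are not described here.

```python
from pathlib import Path

import numpy as np


def load_source(data_root, idx=0):
    """Load the per-source arrays and pair each driving frame with its landmark/theta files."""
    root = Path(data_root)
    sample = {
        "image_path": root / f"src_{idx}.png",
        "id": np.load(root / f"src_{idx}_id.npy"),        # identity embedding
        "ldmk": np.load(root / f"src_{idx}_ldmk.npy"),    # landmarks
        "theta": np.load(root / f"src_{idx}_theta.npy"),  # face alignment transform
    }
    frame_dir = root / str(idx)
    frames = []
    for img in sorted((frame_dir / "imgs").glob("*.png")):
        frames.append({
            "img": img,
            "ldmk": frame_dir / "ldmks" / f"{img.stem}_ldmk.npy",
            "theta": frame_dir / "thetas" / f"{img.stem}_theta.npy",
        })
    sample["frames"] = frames
    return sample
```

For example, `load_source("base_model/data", 0)` would return the arrays for `src_0` plus one entry per frame under `0/imgs`, assuming the data was unpacked to that path.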

Download links:

(model) https://github.com/Meta-Portrait/MetaPortrait/releases/download/v0.0.1/temporal_gfpgan.pth

(model) https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth

(dataset) https://hkustconnect-my.sharepoint.com/personal/cqiaa_connect_ust_hk/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fcqiaa%5Fconnect%5Fust%5Fhk%2FDocuments%2Ftalking%20head%2Frelease%2Fdata%2FHDTF%5Fwarprefine&ga=1

```
sr_model
    pretrained_ckpt
    ├── temporal_gfpgan.pth
    ├── GFPGANv1.3.pth
    ...
    data
    ├── HDTF_warprefine
    │   ├── gt
    │   ├── lq
    │   ├── ...
```



## Training

1. Train the warping network

    cd base_model
    CUDA_VISIBLE_DEVICES=0 python main.py --config config/meta_portrait_256_pretrain_warp.yaml --fp16 --stage Warp --task Pretrain

2. Jointly train the warping network and the refinement network. This requires updating `warp_ckpt` in config/meta_portrait_256_pretrain_full.yaml.

    CUDA_VISIBLE_DEVICES=0 python main.py --config config/meta_portrait_256_pretrain_full.yaml --fp16 --stage Full --task Pretrain
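
The `warp_ckpt` edit in step 2 can be scripted instead of done by hand. The sketch below rewrites the value with a regular expression so that the rest of the file is left as it is. It assumes the key appears on a single line as `warp_ckpt: <value>`; any trailing comment on that line is dropped, and the checkpoint path in the usage note is a hypothetical placeholder.

```python
import re


def set_warp_ckpt(yaml_text, ckpt_path):
    """Return yaml_text with the value of every `warp_ckpt:` line replaced by ckpt_path."""
    pattern = re.compile(r"^([ \t]*warp_ckpt[ \t]*:[ \t]*).*$", flags=re.MULTILINE)
    # A function replacement avoids backslash escapes in ckpt_path being interpreted.
    new_text, count = pattern.subn(lambda m: m.group(1) + ckpt_path, yaml_text)
    if count == 0:
        raise KeyError("warp_ckpt not found in the config text")
    return new_text
```

Typical use would be to read config/meta_portrait_256_pretrain_full.yaml, call `set_warp_ckpt(text, "path/to/warp_checkpoint.pth.tar")` with the checkpoint saved by step 1, and write the result back before launching the command above.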

3. Train the SR model

    cd ../sr_model
    CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 --master_port=4321 Experimental_root/train.py -opt options/train/train_sr_hdtf.yml --launcher pytorch


## Inference

1. Generate 256x256 images

Download the model: https://drive.google.com/file/d/1Kmdv3w6N_we7W7lIt6LBzqRHwwy1dBxD/view (place it in the checkpoint folder)

    cd base_model
    CUDA_VISIBLE_DEVICES=0 python inference.py --save_dir result --config config/meta_portrait_256_eval.yaml --ckpt checkpoint/ckpt_base.pth.tar

2. Increase the image resolution

    cd ../sr_model
    CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 --master_port=4321 Experimental_root/test.py -opt options/test/same_id_demo.yml --launcher pytorch

Note: if the folder contains anything other than subfolders, such as xxx.mp4, move it out before running the command above.
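
A quick way to find such stray entries is sketched below. The note does not name the folder, so the path is passed in by the caller (presumably the input directory configured in options/test/same_id_demo.yml). The function names and the destination directory are illustrative.

```python
import shutil
from pathlib import Path


def stray_files(folder):
    """Names of entries in `folder` that are not directories (e.g. xxx.mp4)."""
    return sorted(p.name for p in Path(folder).iterdir() if not p.is_dir())


def move_strays(folder, dest):
    """Move every non-directory entry of `folder` into `dest` and return the moved names."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    moved = stray_files(folder)
    for name in moved:
        shutil.move(str(Path(folder) / name), str(dest_dir / name))
    return moved
```

Running `stray_files(...)` first shows what would be moved; `move_strays(...)` then relocates those files so that only subfolders remain.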

## Results

1.
![Alt text](imgs/2500.png)


2.
![Alt text](imgs/image-9.png)

## Accuracy

|PSNR|LPIPS|Ewarp|
|:---:|:---:|:---:|
|26.916|0.1514|0.0244|
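
For reference, PSNR follows the standard definition 10 * log10(MAX^2 / MSE). The sketch below computes it with NumPy for 8-bit images. It is not the evaluation script that produced the numbers above, and the repository's own metric code may differ in details such as colour space or cropping.

```python
import numpy as np


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    ref = np.asarray(reference, dtype=np.float64)
    gen = np.asarray(generated, dtype=np.float64)
    mse = np.mean((ref - gen) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher PSNR is better, while LPIPS and Ewarp (a warping-error measure of temporal consistency) are lower-is-better metrics.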

## Application Scenarios

### Algorithm Category

`Style transfer`

### Key Application Industries

`Scientific research, education, broadcast media`

## Source Repository and Issue Feedback

https://developer.hpccube.com/codes/modelzoo/metaportrait_pytorch

## References

https://github.com/Meta-Portrait/MetaPortrait

https://github.com/Meta-Portrait/MetaPortrait/issues/4

https://github.com/1adrianb/face-alignment

https://datahacker.rs/010-how-to-align-faces-with-opencv-in-python/