# MetaPortrait

## Paper

MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation

https://browse.arxiv.org/pdf/2212.08062.pdf

## Model Architecture

Overall pipeline
![Alt text](imgs/image.png)

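(a) $`I_s`$ denotes the input source image, $`I_d`$ denotes the driving image (a single frame of the driving video), and $`I_s^{ldmk}`$ and $`I_d^{ldmk}`$ denote their respective dense landmarks; (b) $`x_{in} = Concat(I_s, I_s^{ldmk}, I_d^{ldmk})`$, that is, the input $`I_s`$ of stage (a) together with its two outputs, and $`E_w`$ denotes a CNN encoder; (c) $`E_{id}`$ is a pretrained face recognition model, FILM stands for Feature-wise Linear Modulation, and AdaIN (adaptive instance normalization) is a style transfer method.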
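
To make the two conditioning operations named in (c) concrete, here is a minimal pure-Python sketch that works on per-channel lists of values. It is not the repository's implementation: the function names `film` and `adain` and the list-based layout are assumptions chosen for illustration. FiLM scales and shifts each feature channel by one (gamma, beta) pair, while AdaIN normalizes each content channel and then applies the mean and standard deviation of the matching style channel.

```python
import math


def film(features, gamma, beta):
    """Feature-wise linear modulation: scale and shift every channel.

    features: list of channels, each a flat list of activations.
    gamma, beta: one scale and one shift value per channel.
    """
    return [[g * v + b for v in channel]
            for channel, g, b in zip(features, gamma, beta)]


def _mean_std(values, eps=1e-5):
    # Per-channel statistics; eps keeps the division in adain() stable.
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var + eps)


def adain(content, style):
    """Adaptive instance normalization, applied channel by channel."""
    out = []
    for c_chan, s_chan in zip(content, style):
        c_mean, c_std = _mean_std(c_chan)
        s_mean, s_std = _mean_std(s_chan)
        # Normalize the content channel, then re-scale it with style statistics.
        out.append([s_std * (v - c_mean) / c_std + s_mean for v in c_chan])
    return out
```

In a real network the gamma, beta, and style statistics would be predicted from an embedding such as the output of $`E_{id}`$; here they are passed in directly to keep the sketch self-contained.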

warping network

![Alt text](imgs/image-1.png)

$`F_r`$

![Alt text](imgs/image-2.png)

$`F_{3d}`$

![Alt text](imgs/image-3.png)                 

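## Algorithm Principles

Purpose: the algorithm generates one-shot talking-head videos.

Principles:

1. Dense landmarks provide a geometry-aware estimate of the warping field, and the source identity is fused adaptively to better preserve the key characteristics of the portrait.

2. Meta-learning speeds up fine-tuning (adaptation) of the model.
![Alt text](imgs/image-4.png)

3. A temporally consistent super-resolution network increases the image resolution.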
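
The idea behind point 2 can be illustrated with a toy example. The sketch below uses a Reptile-style first-order update purely because it is short; the paper's actual meta-learning procedure may differ. Everything in it is hypothetical: a single scalar weight stands in for the model, each "identity" is the regression task `y = slope * x`, and the names `sgd_adapt`, `task_loss`, and `reptile` are illustrative only.

```python
def sgd_adapt(w, slope, steps, lr=0.1, xs=(1.0, 2.0, 3.0)):
    """Fine-tune the scalar weight w on the task y = slope * x with plain SGD."""
    for _ in range(steps):
        # Gradient of the mean of 0.5 * (w*x - slope*x)^2 over the samples xs.
        grad = sum((w * x - slope * x) * x for x in xs) / len(xs)
        w -= lr * grad
    return w


def task_loss(w, slope, xs=(1.0, 2.0, 3.0)):
    """Mean squared-error loss of weight w on the task y = slope * x."""
    return sum(0.5 * (w * x - slope * x) ** 2 for x in xs) / len(xs)


def reptile(task_slopes, w0=0.0, meta_steps=200, inner_steps=3, meta_lr=0.5):
    """Learn an initialization that adapts quickly to each task."""
    w = w0
    for step in range(meta_steps):
        slope = task_slopes[step % len(task_slopes)]
        w_task = sgd_adapt(w, slope, inner_steps)
        # Move the shared initialization toward the task-adapted weight.
        w += meta_lr * (w_task - w)
    return w


train_tasks = [1.8, 2.0, 2.2]  # toy "identities" used for meta-training
new_task = 2.1                 # unseen "identity" to personalize on
w_meta = reptile(train_tasks)

# Compare one fine-tuning step from the meta-learned start and from scratch.
loss_meta = task_loss(sgd_adapt(w_meta, new_task, steps=1), new_task)
loss_scratch = task_loss(sgd_adapt(0.0, new_task, steps=1), new_task)
```

After a single adaptation step, the loss from the meta-learned initialization is lower than the loss from the zero initialization, which is the sense in which meta-learning shortens personalization.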



## Environment Setup

When using dtk24.04.1, set the following environment variables:

    export MIOPEN_DEBUG_HIPNN_CONV_S773=0
    export MIOPEN_DEBUG_HIPNN_CONV_S771=0

### Docker (Method 1)

    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
    docker run --shm-size 10g --network=host --name=metaportrait --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v /opt/hyhal:/opt/hyhal:ro -v <absolute path to the project>:/home/ -it <Image ID> bash
    pip install -r requirements.txt
    cd sr_model/Basicsr
    pip uninstall basicsr
    python setup.py develop
    pip install urllib3==1.26.15

### Dockerfile (Method 2)

    docker build --no-cache -t metaportrait:latest .
    docker run --shm-size 10g --network=host --name=metaportrait --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -it <Image ID> bash
    cd sr_model/Basicsr
    pip uninstall basicsr
    python setup.py develop
    pip install urllib3==1.26.15
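    # If building the environment through the Dockerfile takes too long, comment out the pip installation inside it and install the Python libraries after starting the container: pip install -r requirements.txt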

### Anaconda (Method 3)

Step 1: The DCU-specific deep learning libraries required by this project can be downloaded and installed from the Guanghe (光合) developer community:
https://developer.sourcefind.cn/tool/

    DTK driver: dtk24.04.1
    python:python3.10
    torch:2.1.0
    torchvision:0.16.0
    torchaudio:2.1.2
    deepspeed:0.12.3
    apex:1.1.0

Step 2: Create and activate the virtual environment

    conda create -n meta_portrait_base python=3.10
    conda activate meta_portrait_base 

    pip install -r requirements.txt

    cd sr_model/Basicsr
    pip uninstall basicsr
    python setup.py develop
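
Once the packages are installed, the versions listed under Method 3 can be compared with what the environment actually contains. The script below is a minimal sketch, not part of the repository. It assumes that the pip distribution names match the names in that list and that DCU builds may append a local suffix to the version string, so it compares prefixes only; `EXPECTED`, `installed_version`, and `version_mismatches` are hypothetical names.

```python
from importlib import metadata

# Versions copied from the list above; keys are assumed pip distribution names.
EXPECTED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "torchaudio": "2.1.2",
    "deepspeed": "0.12.3",
    "apex": "1.1.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map each package to (expected, found) when the version prefix differs."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Prefix comparison tolerates build suffixes (an assumption, see above).
        if found is None or not found.startswith(wanted):
            report[name] = (wanted, found)
    return report


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty report means every listed package was found with a matching version prefix.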

## Dataset

HDTF_warprefine download link:
https://drive.google.com/file/d/166eNbabM6TeJVy7hxol2gL1kUGKHi3Do/view?usp=share_link

```
base_model
data
├── 0
│   ├── imgs
│   │   ├── 00000000.png
│   │   ├── ...
│   ├── ldmks
│   │   ├── 00000000_ldmk.npy
│   │   ├── ...
│   └── thetas
│       ├── 00000000_theta.npy
│       ├── ...
├── src_0_id.npy  # identity embedding; can be obtained with a face recognition model
├── src_0_ldmk.npy  # landmarks
├── src_0.png
├── src_0_theta.npy  # transformation matrix that aligns the face to the image center
└── src_map_dict.pkl
```
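
Before training or inference it can help to confirm that the data directory matches the tree above. The helper below is a minimal sketch that checks only the entries shown in that tree. It assumes that the folder name (`0`) and the `src_0_*` prefix share the same index for other identities as well, which the tree does not state; `expected_source_files` and `missing_entries` are hypothetical names.

```python
from pathlib import Path


def expected_source_files(index):
    """Top-level files that the tree lists for source identity `index`."""
    return [
        f"src_{index}_id.npy",
        f"src_{index}_ldmk.npy",
        f"src_{index}.png",
        f"src_{index}_theta.npy",
    ]


def missing_entries(data_root, index=0):
    """Return the documented relative paths that do not exist under data_root."""
    root = Path(data_root)
    required = [
        f"{index}/imgs",
        f"{index}/ldmks",
        f"{index}/thetas",
        "src_map_dict.pkl",
    ]
    required += expected_source_files(index)
    return [rel for rel in required if not (root / rel).exists()]
```

For example, `missing_entries("base_model/data")` returns an empty list when every documented entry is present.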

Download links:

(model) https://github.com/Meta-Portrait/MetaPortrait/releases/download/v0.0.1/temporal_gfpgan.pth

(model) https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth

```
sr_model
    pretrained_ckpt
    ├── temporal_gfpgan.pth
    ├── GFPGANv1.3.pth
    ...
    data
    ├── HDTF_warprefine
    │   ├── gt
    │   ├── lq
    │   ├── ...
```



## Training

Step 1: Train the warping network

    cd base_model
    CUDA_VISIBLE_DEVICES=0 python main.py --config config/meta_portrait_256_pretrain_warp.yaml --fp16 --stage Warp --task Pretrain

Step 2: Jointly train the warping network and the refinement network; first update `warp_ckpt` in `config/meta_portrait_256_pretrain_full.yaml`

    CUDA_VISIBLE_DEVICES=0 python main.py --config config/meta_portrait_256_pretrain_full.yaml --fp16 --stage Full --task Pretrain

Step 3: Train the SR model

    cd ../sr_model
    CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 --master_port=4321 Experimental_root/train.py -opt options/train/train_sr_hdtf.yml --launcher pytorch


## Inference

Step 1: Generate 256x256 images

Download the model: https://drive.google.com/file/d/1Kmdv3w6N_we7W7lIt6LBzqRHwwy1dBxD/view (place it in the `checkpoint` folder)

    cd base_model
    CUDA_VISIBLE_DEVICES=0 python inference.py --save_dir result --config config/meta_portrait_256_eval.yaml --ckpt checkpoint/ckpt_base.pth.tar

Step 2: Increase the image resolution

    cd ../sr_model
    CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 --master_port=4321 Experimental_root/test.py -opt options/test/same_id_demo.yml --launcher pytorch

Note: if the folder contains anything other than subfolders, such as xxx.mp4, move those items out before running the command above.
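
The cleanup described in the note can be scripted. The helper below is a minimal sketch: it moves every entry that is not a directory out of a given folder into a separate holding directory. The name `move_loose_files` and both path arguments are hypothetical, because the README does not say which folder the test script reads.

```python
import shutil
from pathlib import Path


def move_loose_files(folder, holding_dir):
    """Move every non-directory entry of `folder` into `holding_dir`.

    Returns the names of the moved entries in sorted order.
    """
    folder, holding_dir = Path(folder), Path(holding_dir)
    holding_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before anything is moved.
    for entry in sorted(folder.iterdir()):
        if not entry.is_dir():
            shutil.move(str(entry), str(holding_dir / entry.name))
            moved.append(entry.name)
    return moved
```

The moved files can be copied back from the holding directory after the super-resolution step has finished.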

## Results

1.
![Alt text](imgs/2500.png)


2.
![Alt text](imgs/image-9.png)

## Accuracy

|PSNR|LPIPS|$`E_{warp}`$|
|:---:|:---:|:---:|
|26.916|0.1514|0.0244|
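
For readers unfamiliar with the first metric, PSNR is defined as $`10 \log_{10}(MAX^2 / MSE)`$. The function below is a minimal pure-Python sketch of that definition over flat pixel lists. The README does not state the data range or color space used for the value in the table, so `max_value=255.0` and the function name `psnr` are assumptions.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(generated):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means the generated frame is closer to the reference, whereas LPIPS and $`E_{warp}`$ are error measures for which lower is better.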

## Application Scenarios

### Algorithm Category

`Style transfer`

### Key Application Industries

`Scientific research, education, broadcast media`

## Source Repository and Issue Feedback

https://developer.sourcefind.cn/codes/modelzoo/metaportrait_pytorch

## References

https://github.com/Meta-Portrait/MetaPortrait

https://github.com/Meta-Portrait/MetaPortrait/issues/4

https://github.com/1adrianb/face-alignment

https://datahacker.rs/010-how-to-align-faces-with-opencv-in-python/