# Inpaint-Anything
Edit and inpaint any object with SAM.

## Paper
`Inpaint Anything: Segment Anything Meets Image Inpainting`
- https://arxiv.org/abs/2304.06790

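## Model Architecture
<!-- A one-sentence summary of the model architecture goes here -->
Inpaint-Anything builds on the Segment Anything Model (SAM) to edit and inpaint images. SAM is a Vision Transformer (ViT)-based model.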

<div align=center>
    <img src="./doc/SAM.jpg"/>
    <div >SAM</div>
</div>
<div align=center>
    <img src="./doc/ViT.jpg"/>
    <div >ViT</div>
</div>

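## Algorithm
The core idea of Inpaint-Anything is to combine the strengths of different models into a powerful, user-friendly pipeline for image inpainting tasks. SAM segments an arbitrary object and produces its mask, and models such as LaMa and SD (Stable Diffusion) then edit the masked region. This enables object removal, object replacement, and background replacement.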

<div align=center>
    <img src="./doc/MainFramework.png"/>
    <div >Inpaint-Anything</div>
</div>
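
The division of labor described above can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions, not the repository's code: `region_to_regenerate` and `composite` are hypothetical helper names, images are plain nested lists, and the `generated` image stands in for whatever LaMa or SD would produce. The sketch assumes that removal and object replacement rewrite the pixels inside the SAM mask, while background replacement rewrites the pixels outside it.

```python
def region_to_regenerate(object_mask, task):
    """Return a binary mask of the pixels the editing model is allowed to rewrite.

    object_mask: nested lists of 0/1, where 1 marks the object selected by SAM.
    task: "remove", "fill" (object replacement) or "replace" (background replacement).
    """
    if task in ("remove", "fill"):
        # Edit the object itself: the editable region is the mask.
        return [row[:] for row in object_mask]
    if task == "replace":
        # Keep the object and edit everything else: invert the mask.
        return [[1 - value for value in row] for row in object_mask]
    raise ValueError(f"unknown task: {task!r}")


def composite(original, generated, region):
    """Take generated pixels inside the region and original pixels elsewhere."""
    return [
        [g if r else o for o, g, r in zip(o_row, g_row, r_row)]
        for o_row, g_row, r_row in zip(original, generated, region)
    ]


# Tiny 2x3 grayscale example; 0 stands in for model output.
object_mask = [[0, 1, 1], [0, 1, 0]]
original = [[10, 20, 30], [40, 50, 60]]
generated = [[0, 0, 0], [0, 0, 0]]

removed = composite(original, generated, region_to_regenerate(object_mask, "remove"))
replaced = composite(original, generated, region_to_regenerate(object_mask, "replace"))
print(removed)
print(replaced)
```

In the first result only the object pixels change, and in the second only the background pixels change, which is the distinction between the removal and background-replacement modes.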


## Environment Setup
```
mv inpaint-anything_pytorch inpaint-anything # Drop the framework-name suffix
# Adjust the docker -v paths, docker_name, and imageID to match your setup
```
### Docker (Method 1)
<!-- Provide the address for pulling the docker image from [光源](https://www.sourcefind.cn/#/service-details) and the usage steps here -->
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.2-py3.10 # The imageID of this image is 2f1f619d0182
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=16G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --network=host --name docker_name imageID bash
cd /your_code_path/inpaint-anything
pip install -e segment_anything
pip install transformers accelerate scipy safetensors
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/
# If pip downloads are slow, try switching to another mirror (the same applies below)
```
### Dockerfile (Method 2)
<!-- Describe how to use the dockerfile here -->
```
cd /your_code_path/inpaint-anything/docker
docker build --no-cache -t codestral:latest .
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=16G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
cd /your_code_path/inpaint-anything
pip install -e segment_anything
pip install transformers accelerate scipy safetensors
pip install -r requirements.txt
```
### Anaconda (Method 3)
<!-- Provide detailed steps for local configuration and compilation here, for example: -->

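The special deep learning libraries this project requires for DCU GPUs can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community.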
```
DTK driver: dtk24.04.2
python: python3.10
pytorch: 2.1.0
```
`Tips: the versions of the DTK driver, python, pytorch, and other DCU-related tools listed above must match each other exactly`
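
Because the versions must line up exactly, a quick check before installing the remaining packages can save time. The following is a minimal sketch, not part of the project: `version_mismatches` is a hypothetical helper, it only inspects the Python interpreter and the installed `torch` distribution, and it assumes a DCU build of PyTorch may carry a suffix after `2.1.0`, so it compares by prefix. The DTK driver version is not checked here.

```python
import sys
from importlib import metadata

# Versions taken from the table above.
EXPECTED_PYTHON = (3, 10)
EXPECTED_TORCH = "2.1.0"


def version_mismatches(expected_python=EXPECTED_PYTHON, expected_torch=EXPECTED_TORCH):
    """Return human-readable descriptions of every version that does not match."""
    problems = []

    found_python = tuple(sys.version_info[:2])
    if found_python != tuple(expected_python):
        problems.append(
            "python: expected {}.{}, found {}.{}".format(*expected_python, *found_python)
        )

    try:
        # Reads package metadata without importing torch itself.
        found_torch = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("pytorch: not installed")
    else:
        # Assumption: vendor builds may append a suffix, so compare by prefix.
        if not found_torch.startswith(expected_torch):
            problems.append(f"pytorch: expected {expected_torch}, found {found_torch}")

    return problems


for line in version_mismatches() or ["all checked versions match"]:
    print(line)
```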

Install the other, non-deep-learning libraries according to requirements.txt:
```
pip install -e segment_anything
pip install transformers accelerate scipy safetensors
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/
```
## Dataset


## Training


## Inference
Download the model weights [SAM (sam_vit_h_4b8939.pth)](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth) and [LaMa](https://disk.yandex.ru/d/ouP6l8VJ0HpMZg), or download both from https://drive.google.com/drive/folders/1ST0aRbDRZGli0r7OVVOQvXwtadMCuWXg?usp=sharing , and place them under ./pretrained_models.
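
Before running inference it may help to confirm that the weights landed where the scripts look for them. The sketch below is an assumption-laden convenience, not part of the repository: `missing_weights` is a hypothetical helper, and the two expected names are taken from the `--sam_ckpt` and `--lama_ckpt` paths used in the inference commands in this README.

```python
from pathlib import Path

# Names taken from the --sam_ckpt and --lama_ckpt arguments in this README.
EXPECTED_WEIGHTS = ("sam_vit_h_4b8939.pth", "big-lama")


def missing_weights(root="./pretrained_models", expected=EXPECTED_WEIGHTS):
    """Return the expected weight files or directories that are absent under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


missing = missing_weights()
print("missing:", ", ".join(missing) if missing else "none")
```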

Note: if huggingface is unreachable, set a mirror site:
```
export HF_ENDPOINT=https://hf-mirror.com
```
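
The same setting can be applied from inside a Python session, for example in a notebook where exporting a shell variable is inconvenient. This is a hedged sketch: `use_hf_mirror` is a hypothetical helper, and it assumes the Hugging Face libraries read `HF_ENDPOINT` when they are first imported, so it should run before those imports.

```python
import os

HF_MIRROR = "https://hf-mirror.com"


def use_hf_mirror(endpoint=HF_MIRROR, environ=os.environ):
    """Set HF_ENDPOINT unless it is already configured, and return the value in effect."""
    return environ.setdefault("HF_ENDPOINT", endpoint)


# Call this before importing transformers, diffusers, or huggingface_hub.
print("HF_ENDPOINT =", use_hf_mirror())
```

Using `setdefault` keeps any endpoint that was already exported in the shell instead of overwriting it.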

```
# Object removal
python remove_anything.py \
    --input_img ./example/remove-anything/dog.jpg \
    --coords_type key_in \
    --point_coords 200 450 \
    --point_labels 1 \
    --dilate_kernel_size 15 \
    --output_dir ./results \
    --sam_model_type "vit_h" \
    --sam_ckpt ./pretrained_models/sam_vit_h_4b8939.pth \
    --lama_config ./lama/configs/prediction/default.yaml \
    --lama_ckpt ./pretrained_models/big-lama
# Make sure the model file paths are correct
```
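
The `--dilate_kernel_size 15` flag in the command above suggests that the SAM mask is grown before inpainting, so that the object's edges and nearby artifacts are also covered. The following dependency-free sketch shows what a binary dilation with a square kernel does. It is an illustration only: the function below is written for this README, the actual script may rely on an image library such as OpenCV, and only odd kernel sizes are handled.

```python
def dilate_mask(mask, kernel_size):
    """Binary dilation of a 0/1 mask (nested lists) with a square kernel.

    Each foreground pixel marks the kernel_size x kernel_size window centered
    on it, so the mask grows by kernel_size // 2 pixels in every direction.
    """
    if kernel_size % 2 != 1:
        raise ValueError("this sketch only supports odd kernel sizes")
    height, width = len(mask), len(mask[0])
    radius = kernel_size // 2
    grown = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # Clamp the window to the image borders.
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    grown[yy][xx] = 1
    return grown


# A single foreground pixel in a 5x5 mask grows into a 3x3 block.
mask = [[0] * 5 for _ in range(5)]
mask[2][2] = 1
grown = dilate_mask(mask, 3)
for row in grown:
    print(row)
```

A larger kernel gives the inpainting model more context to overwrite at the cost of touching more of the surrounding image.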

```
# Object replacement
python fill_anything.py \
    --input_img ./example/fill-anything/sample1.png \
    --coords_type key_in \
    --point_coords 750 500 \
    --point_labels 1 \
    --text_prompt "a teddy bear on a bench" \
    --dilate_kernel_size 50 \
    --output_dir ./results \
    --sam_model_type "vit_h" \
    --sam_ckpt ./pretrained_models/sam_vit_h_4b8939.pth
```

```
# Background replacement
python replace_anything.py \
    --input_img ./example/replace-anything/dog.png \
    --coords_type key_in \
    --point_coords 750 500 \
    --point_labels 1 \
    --text_prompt "sit on the swing" \
    --output_dir ./results \
    --sam_model_type "vit_h" \
    --sam_ckpt ./pretrained_models/sam_vit_h_4b8939.pth
```
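
All three commands above select the object with `--point_coords` and `--point_labels`. The sketch below shows one plausible way such flags could be parsed and paired up. It is an assumption, not the scripts' actual argument handling: `build_parser` and `to_prompts` are hypothetical names, coordinates are assumed to be given as flat `x y` pairs, and a label of 1 is assumed to mark a foreground point, following SAM's point-prompt convention.

```python
import argparse


def build_parser():
    """Parser for just the two point-prompt flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Point-prompt parsing sketch")
    parser.add_argument("--point_coords", type=float, nargs="+", required=True,
                        help="Flat list of x y values, one pair per point")
    parser.add_argument("--point_labels", type=int, nargs="+", required=True,
                        help="One label per point (1 = foreground, 0 = background)")
    return parser


def to_prompts(point_coords, point_labels):
    """Group the flat coordinate list into ((x, y), label) tuples."""
    if len(point_coords) != 2 * len(point_labels):
        raise ValueError("expected exactly one x y pair per label")
    pairs = zip(point_coords[0::2], point_coords[1::2])
    return list(zip(pairs, point_labels))


args = build_parser().parse_args(["--point_coords", "200", "450", "--point_labels", "1"])
prompts = to_prompts(args.point_coords, args.point_labels)
print(prompts)
```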
## Result
<!-- Put the algorithm's test images (including inputs and outputs) here -->

Inference results

<center class="half">
<img src="./results/cat/with_points.png" width=300/>
<img src="./results/cat/inpainted_with_mask_2.png" width=300/>
<div >Object removal</div>
</center>

<center class="half">
<img src="./results/sample3/with_points.png" width=300/>
<img src="./results/sample3/filled_with_mask_2.png" width=300/>
<div >Object replacement</div>
</center>

<center class="half">
<img src="./results/bus/with_points.png" width=300/>
<img src="./results/bus/replaced_with_mask_1.png" width=300/>
<div >Background replacement</div>
</center>


### Accuracy




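## Application Scenarios
### Algorithm Category

<!-- Category names beyond the classification above can also follow the ones used on this site: https://huggingface.co/ \ -->
`AIGC`

### Key Application Industries
<!-- Filling in the application industries requires extensive research so that users get professional, comprehensive recommendations; except for special algorithms, recommend at least 3. -->
`Inference, retail, manufacturing, e-commerce, healthcare, education`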

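<!-- ## Pretrained Weights -->
<!-- - Fill in the company-internal download address of the pretrained weights here (the pretrained weight repository is [SCNet AIModels](http://113.200.138.88:18080/aimodels); give the specific address for each pretrained weight the model uses). Very small weight files can be bundled into the project.
- Fill in the official public download address of the pretrained weights here (optional). -->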

## Source Repository and Issue Feedback
<!-- - Fill in this project's gitlab address here -->
- https://developer.sourcefind.cn/codes/modelzoo/inpaint-anything_pytorch

## References
- https://github.com/geekyutao/Inpaint-Anything
- https://github.com/facebookresearch/segment-anything
- https://github.com/advimman/lama
- https://github.com/CompVis/stable-diffusion