# Inpaint-Anything
Edit and inpaint any object with SAM.

## Paper
`Inpaint Anything: Segment Anything Meets Image Inpainting`
- https://arxiv.org/abs/2304.06790

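## Model Architecture
<!-- One-sentence overview of the model architecture -->
Inpaint-Anything builds on the Segment Anything Model (SAM) to edit and inpaint images. SAM uses a Vision Transformer (ViT) as its image encoder.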

<div align=center>
    <img src="./doc/SAM.jpg"/>
    <div >SAM</div>
</div>
<div align=center>
    <img src="./doc/ViT.jpg"/>
    <div >ViT</div>
</div>

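## Algorithm
The core idea of Inpaint-Anything is to combine the strengths of different models into a powerful, user-friendly pipeline for image inpainting tasks. SAM generates a mask for any selected object, and models such as LaMa and Stable Diffusion (SD) then edit the image using that mask, which enables object removal, object replacement, and background replacement.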

<div align=center>
    <img src="./doc/MainFramework.png"/>
    <div >Inpaint-Anything</div>
</div>
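
As a minimal sketch of the mask-then-edit idea, the NumPy snippet below shows how a binary mask can decide which pixels come from a model's output and which are kept from the input. The function `composite`, its `edit_background` flag, and the toy arrays are hypothetical and written only for illustration; they are not part of the Inpaint-Anything code, where LaMa or Stable Diffusion produce the edited pixels. It also assumes that background replacement edits the complement of the object mask.

```python
import numpy as np


def composite(original: np.ndarray, generated: np.ndarray, mask: np.ndarray,
              edit_background: bool = False) -> np.ndarray:
    """Blend two H x W x C images using an H x W boolean mask.

    Pixels inside the edit region come from `generated`;
    all other pixels are copied from `original`.
    """
    # Object edits use the mask itself; background edits use its complement.
    region = ~mask if edit_background else mask
    # Add a trailing axis so the H x W mask broadcasts over the channels.
    return np.where(region[..., None], generated, original)


# Toy 2 x 2 RGB images: the original is all zeros, the generated one all 255.
original = np.zeros((2, 2, 3), dtype=np.uint8)
generated = np.full((2, 2, 3), 255, dtype=np.uint8)
mask = np.array([[True, False], [False, False]])  # object at the top-left pixel

removed = composite(original, generated, mask)                          # edit the object
replaced = composite(original, generated, mask, edit_background=True)   # edit everything else
```

In this sketch, object removal and object replacement differ only in which model supplies `generated`, while background replacement flips the region being edited.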


## Environment Setup
```
mv inpaint-anything_pytorch inpaint-anything # drop the framework suffix from the directory name
# Adjust the docker -v paths, docker_name, and imageID to match your setup
```
### Docker (Option 1)
<!-- Provide the address for pulling the docker image from [光源](https://www.sourcefind.cn/#/service-details) and the usage steps here -->
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.2-py3.10 # imageID of this image: 2f1f619d0182
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=16G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --network=host --name docker_name imageID bash
cd /your_code_path/inpaint-anything
pip install -e segment_anything
pip install transformers accelerate scipy safetensors
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/
# If pip downloads are slow, try a different mirror (the same applies below)
```
### Dockerfile (Option 2)
<!-- Provide the dockerfile usage steps here -->
```
cd /your_code_path/inpaint-anything/docker
docker build --no-cache -t inpaint-anything:latest .
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=16G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
cd /your_code_path/inpaint-anything
pip install -e segment_anything
pip install transformers accelerate scipy safetensors
pip install -r requirements.txt
```
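### Anaconda (Option 3)
<!-- Provide detailed steps for local setup and compilation here, for example: -->

The DCU-specific deep learning libraries required by this project can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community.
```
DTK driver: dtk24.04.2
python: python3.10
pytorch: 2.1.0
```
`Tips: the versions of the DCU-related tools above (DTK driver, python, pytorch) must match one another exactly`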
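
One way to confirm the Python and PyTorch versions before running anything is a small check like the sketch below. The helper names `version_matches` and `installed_version` are hypothetical, and accepting a `+suffix` build tag on the PyTorch version is an assumption about how DCU builds may be labeled. The DTK driver version is not checked here because this README does not name a tool for querying it.

```python
import sys
from importlib import metadata
from typing import Optional

# Versions listed in this README.
REQUIRED_PYTHON = (3, 10)
REQUIRED_TORCH = "2.1.0"


def version_matches(installed: str, required: str) -> bool:
    """True if `installed` equals `required` or only adds a '+build' suffix.

    Accepting the suffix form (e.g. '2.1.0+abc') is an assumption.
    """
    return installed == required or installed.startswith(required + "+")


def installed_version(package: str) -> Optional[str]:
    """Look up an installed distribution's version without importing it."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


python_ok = sys.version_info[:2] == REQUIRED_PYTHON
torch_version = installed_version("torch")
torch_ok = torch_version is not None and version_matches(torch_version, REQUIRED_TORCH)

print(f"python {sys.version_info.major}.{sys.version_info.minor}:",
      "ok" if python_ok else "mismatch")
print(f"torch {torch_version}:", "ok" if torch_ok else "mismatch or missing")
```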

Install the remaining non-deep-learning libraries according to requirements.txt:
```
pip install -e segment_anything
pip install transformers accelerate scipy safetensors
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/
```
## Dataset
None

## Training
None

## Inference
Download the model weights [SAM (sam_vit_h_4b8939.pth)](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth) and [LaMa](https://disk.yandex.ru/d/ouP6l8VJ0HpMZg), or download both from https://drive.google.com/drive/folders/1ST0aRbDRZGli0r7OVVOQvXwtadMCuWXg?usp=sharing, and place them under ./pretrained_models.
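
Before running the scripts, it can help to confirm that the weights are where the commands expect them. The sketch below is a hypothetical helper, not part of the repository; the two entry names are taken from the `--sam_ckpt` and `--lama_ckpt` arguments used in this README and may differ if the downloads are unpacked under other names.

```python
from pathlib import Path

# Entry names taken from the commands in this README.
EXPECTED = ("sam_vit_h_4b8939.pth", "big-lama")


def missing_weights(root: str = "./pretrained_models") -> list[str]:
    """Return the expected entries that are absent under `root`."""
    base = Path(root)
    return [name for name in EXPECTED if not (base / name).exists()]


missing = missing_weights()
if missing:
    print("Missing under ./pretrained_models:", ", ".join(missing))
else:
    print("All expected weights found.")
```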

Note: if huggingface is unreachable, set a mirror endpoint:
```
export HF_ENDPOINT=https://hf-mirror.com
```
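
When the scripts are launched from Python (for example from a notebook) rather than from a shell, the same variable can be set in code. This is a sketch under the assumption that the Hugging Face libraries read `HF_ENDPOINT` when they are first imported, so the assignment has to happen before those imports.

```python
import os

# Keep any endpoint the caller already chose; otherwise use the mirror.
# Run this before importing libraries that read HF_ENDPOINT on import.
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")

print(os.environ["HF_ENDPOINT"])
```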

```
# Object removal
python remove_anything.py \
    --input_img ./example/remove-anything/dog.jpg \
    --coords_type key_in \
    --point_coords 200 450 \
    --point_labels 1 \
    --dilate_kernel_size 15 \
    --output_dir ./results \
    --sam_model_type "vit_h" \
    --sam_ckpt ./pretrained_models/sam_vit_h_4b8939.pth \
    --lama_config ./lama/configs/prediction/default.yaml \
    --lama_ckpt ./pretrained_models/big-lama
# Make sure the model file paths are correct
```
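
The `--dilate_kernel_size` argument suggests that the SAM mask is grown before inpainting so that the object's edges are covered as well. The sketch below illustrates binary dilation with a square kernel in plain NumPy, assuming that is what the flag controls; `dilate_mask` here is an illustrative stand-in rather than the repository's implementation, and image libraries such as OpenCV offer optimized equivalents.

```python
import numpy as np


def dilate_mask(mask: np.ndarray, kernel_size: int) -> np.ndarray:
    """Binary dilation of an H x W boolean mask with a square kernel.

    An output pixel is set if any input pixel inside the
    kernel_size x kernel_size window around it is set. For even sizes
    the window is offset by one pixel rather than exactly centred.
    """
    if kernel_size <= 1:
        return mask.copy()
    before = kernel_size // 2          # padding on the top/left
    after = kernel_size - 1 - before   # padding on the bottom/right
    padded = np.pad(mask, ((before, after), (before, after)),
                    constant_values=False)
    height, width = mask.shape
    out = np.zeros_like(mask)
    # OR together every shifted copy of the mask covered by the kernel.
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            out |= padded[dy:dy + height, dx:dx + width]
    return out


mask = np.zeros((7, 7), dtype=bool)
mask[3, 3] = True               # a single foreground pixel
grown = dilate_mask(mask, 3)    # grows into a 3 x 3 block
```

A larger kernel widens the edited region, which is consistent with the object replacement example using a larger value (50) than the removal example (15).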

```
# Object replacement
python fill_anything.py \
    --input_img ./example/fill-anything/sample1.png \
    --coords_type key_in \
    --point_coords 750 500 \
    --point_labels 1 \
    --text_prompt "a teddy bear on a bench" \
    --dilate_kernel_size 50 \
    --output_dir ./results \
    --sam_model_type "vit_h" \
    --sam_ckpt ./pretrained_models/sam_vit_h_4b8939.pth
```

```
# Background replacement
python replace_anything.py \
    --input_img ./example/replace-anything/dog.png \
    --coords_type key_in \
    --point_coords 750 500 \
    --point_labels 1 \
    --text_prompt "sit on the swing" \
    --output_dir ./results \
    --sam_model_type "vit_h" \
    --sam_ckpt ./pretrained_models/sam_vit_h_4b8939.pth
```
## Result
<!-- Add test images showing the algorithm's results here (including input and output) -->

Inference results

<center class="half">
<img src="./results/cat/with_points.png" width=300/>
<img src="./results/cat/inpainted_with_mask_2.png" width=300/>
<div >Object removal</div>
</center>

<center class="half">
<img src="./results/sample3/with_points.png" width=300/>
<img src="./results/sample3/filled_with_mask_2.png" width=300/>
<div >Object replacement</div>
</center>

<center class="half">
<img src="./results/bus/with_points.png" width=300/>
<img src="./results/bus/replaced_with_mask_1.png" width=300/>
<div >Background replacement</div>
</center>


### Accuracy
None


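## Application Scenarios
### Algorithm Category

<!-- For categories beyond those listed above, category names can also follow the ones used at https://huggingface.co/ \ -->
`AIGC`

### Popular Application Industries
<!-- Filling in application industries requires substantial research so that users get professional, comprehensive recommendations; except for special algorithms, at least 3 are usually recommended. -->
`Inference, retail, manufacturing, e-commerce, healthcare, education`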

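<!-- ## Pretrained Weights -->
<!-- - Fill in the company-internal download address of the pretrained weights here (the pretrained weights are hosted at [SCNet AIModels](http://113.200.138.88:18080/aimodels); list the specific address for each pretrained weight the model uses). Very small weight files can be bundled into the project.
- Fill in the official public download address of the pretrained weights here (optional). -->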

## Source Repository and Issue Feedback
<!-- - Fill in this project's gitlab address here -->
- https://developer.sourcefind.cn/codes/modelzoo/inpaint-anything_pytorch

## References
- https://github.com/geekyutao/Inpaint-Anything
- https://github.com/facebookresearch/segment-anything
- https://github.com/advimman/lama
- https://github.com/CompVis/stable-diffusion