README.md 12.5 KB
Newer Older
1
2
3
4
5
6
7
8
<p align="center">
  <img src="https://photo-maker.github.io/assets/logo.png" height=100>

</p>

<!-- ## <div align="center"><b>PhotoMaker</b></div> -->
<div align="center">
  
Zhen Li's avatar
Zhen Li committed
9
## PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding  [![Paper page](https://huggingface.co/datasets/huggingface/badges/resolve/main/paper-page-md-dark.svg)](https://huggingface.co/papers/2312.04461)
10
11
[[Paper](https://huggingface.co/papers/2312.04461)] &emsp; [[Project Page](https://photo-maker.github.io)] &emsp; [[Model Card](https://huggingface.co/TencentARC/PhotoMaker)] <br>

12
13
14
[[🤗 Demo (Realistic)](https://huggingface.co/spaces/TencentARC/PhotoMaker)] &emsp; [[🤗 Demo (Stylization)](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style)] <br>

[[Replicate Demo (Realistic)](https://replicate.com/jd7h/photomaker)] &emsp; [[Replicate Demo (Stylization)](https://replicate.com/yorickvp/photomaker-style)] <be>
15

Zhen Li's avatar
Zhen Li committed
16
If the ID fidelity is not enough for you, please try our [stylization application](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style), you may be pleasantly surprised.
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
</div>


---

Official implementation of **[PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding](https://huggingface.co/papers/2312.04461)**.

### 🌠  **Key Features:**

1. Rapid customization **within seconds**, with no additional LoRA training.
2. Ensures impressive ID fidelity, offering diversity, promising text controllability, and high-quality generation.
3. Can serve as an **Adapter** to collaborate with other Base Models alongside LoRA modules in community.

---

Zhen Li's avatar
Zhen Li committed
32
❗❗ Note: If there are any PhotoMaker based resources and applications, please leave them in the [discussion](https://github.com/TencentARC/PhotoMaker/discussions/36) and we will list them in the [Related Resources](https://github.com/TencentARC/PhotoMaker?tab=readme-ov-file#related-resources) section in README file.
33
Now we know the implementation of **Replicate**, **Windows**, **ComfyUI**, and **WebUI**. Thank you all! 
Zhen Li's avatar
Zhen Li committed
34

Paper99's avatar
Paper99 committed
35
<div align="center">
Paper99's avatar
Paper99 committed
36

Paper99's avatar
Paper99 committed
37
38
![photomaker_demo_fast](https://github.com/TencentARC/PhotoMaker/assets/21050959/e72cbf4d-938f-417d-b308-55e76a4bc5c8)
</div>
Paper99's avatar
Paper99 committed
39
40


41
## 🚩 **New Features/Updates**
Zhen Li's avatar
Zhen Li committed
42
- ✅ Jan. 20, 2024. An **important** note: For those GPUs that do not support bfloat16, please change [this line](https://github.com/TencentARC/PhotoMaker/blob/6ec44fc13909d64a65c635b9e3b6f238eb1de9fe/gradio_demo/app.py#L39) to `torch_dtype = torch.float16`, the speed will be **greatly improved** (1min/img (before) vs. 14s/img (after) on V100). The minimum GPU memory requirement for PhotoMaker is **15G**.
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
- ✅ Jan. 15, 2024. We release PhotoMaker.

---

## 🔥 **Examples**


### Realistic generation 

- [![Huggingface PhotoMaker](https://img.shields.io/static/v1?label=Demo&message=Huggingface%20Gradio&color=orange)](https://huggingface.co/spaces/TencentARC/PhotoMaker)
- [**PhotoMaker notebook demo**](photomaker_demo.ipynb)

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/6285a9133ab6642179158944/BYBZNyfmN4jBKBxxt4uxz.jpeg" height=450>
</p>

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/6285a9133ab6642179158944/9KYqoDxfbNVLzVKZzSzwo.jpeg" height=450>
</p>

### Stylization generation 

Note: only change the base model and add the LoRA modules for better stylization

- [![Huggingface PhotoMaker-Style](https://img.shields.io/static/v1?label=Demo&message=Huggingface%20Gradio&color=orange)](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style)
- [**PhotoMaker-Style notebook demo**](photomaker_style_demo.ipynb) 

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/6285a9133ab6642179158944/du884lcjpqqjnJIxpATM2.jpeg" height=450>
</p>
  
<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/6285a9133ab6642179158944/-AC7Hr5YL4yW1zXGe_Izl.jpeg" height=450>
</p>

78
79
80
81
82
83
84
85
86
87
### Note about the output aspect ratio

The underlying generative model has billions of parameters. And, whether You believe it or not, **all** of them do affect the output. The width/height (i.e. the "Aspect Ratio") of the output are just a few of these parameters.

For example, using the exact same positive/negative prompts and seed value, only changing the aspect ratio (i.e. the width/height) of the output image, it is possible to generate e.g. following results:

<p align="center">
  <img src="https://cdn.cleverest.eu/attachment/github.com/TencentARC/PhotoMaker/pull/120/aspect-ratio-matters.png" height=725>
</p>

88
89
90
91
92
# 🔧 Dependencies and Installation

- Python >= 3.8 (Recommend to use [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
- [PyTorch >= 2.0.0](https://pytorch.org/)
```bash
93
conda create --name photomaker python=3.10
Zhen Li's avatar
Zhen Li committed
94
conda activate photomaker
95
96
97
pip install -U pip

# Install requirements
98
pip install -r requirements.txt
99
100
101
102
103
104
105
106

# Install photomaker
pip install git+https://github.com/TencentARC/PhotoMaker.git
```

Then you can run the following command to use it
```python
from photomaker import PhotoMakerStableDiffusionXLPipeline
107
108
109
```

# ⏬ Download Models 
emmanuel's avatar
emmanuel committed
110
The model will be automatically downloaded through the following two lines:
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128

```python
from huggingface_hub import hf_hub_download
photomaker_path = hf_hub_download(repo_id="TencentARC/PhotoMaker", filename="photomaker-v1.bin", repo_type="model")
```

You can also choose to download manually from this [url](https://huggingface.co/TencentARC/PhotoMaker).

# 💻 How to Test

## Use like [diffusers](https://github.com/huggingface/diffusers)

- Dependency
```py
import torch
import os
from diffusers.utils import load_image
from diffusers import EulerDiscreteScheduler
129
from photomaker import PhotoMakerStableDiffusionXLPipeline
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205

### Load base model
pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(
    base_model_path,  # can change to any base model based on SDXL
    torch_dtype=torch.bfloat16, 
    use_safetensors=True, 
    variant="fp16"
).to(device)

### Load PhotoMaker checkpoint
pipe.load_photomaker_adapter(
    os.path.dirname(photomaker_path),
    subfolder="",
    weight_name=os.path.basename(photomaker_path),
    trigger_word="img"  # define the trigger word
)     

pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

### Also can cooperate with other LoRA modules
# pipe.load_lora_weights(os.path.dirname(lora_path), weight_name=lora_model_name, adapter_name="xl_more_art-full")
# pipe.set_adapters(["photomaker", "xl_more_art-full"], adapter_weights=[1.0, 0.5])

pipe.fuse_lora()
```

- Input ID Images
```py
### define the input ID images
input_folder_name = './examples/newton_man'
image_basename_list = os.listdir(input_folder_name)
image_path_list = sorted([os.path.join(input_folder_name, basename) for basename in image_basename_list])

input_id_images = []
for image_path in image_path_list:
    input_id_images.append(load_image(image_path))
```

<div align="center">

<a href="https://github.com/TencentARC/PhotoMaker/assets/21050959/01d53dfa-7528-4f09-a1a5-96b349ae7800" align="center"><img style="margin:0;padding:0;" src="https://github.com/TencentARC/PhotoMaker/assets/21050959/01d53dfa-7528-4f09-a1a5-96b349ae7800"/></a>
</div>

- Generation
```py
# Note that the trigger word `img` must follow the class word for personalization
prompt = "a half-body portrait of a man img wearing the sunglasses in Iron man suit, best quality"
negative_prompt = "(asymmetry, worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), open mouth, grayscale"
generator = torch.Generator(device=device).manual_seed(42)
images = pipe(
    prompt=prompt,
    input_id_images=input_id_images,
    negative_prompt=negative_prompt,
    num_images_per_prompt=1,
    num_inference_steps=num_steps,
    start_merge_step=10,
    generator=generator,
).images[0]
gen_images.save('out_photomaker.png')
```

<div align="center">

<a href="https://github.com/TencentARC/PhotoMaker/assets/21050959/703c00e1-5e50-4c19-899e-25ee682d2c06" align="center"><img width=400 style="margin:0;padding:0;" src="https://github.com/TencentARC/PhotoMaker/assets/21050959/703c00e1-5e50-4c19-899e-25ee682d2c06"/></a>

</div>

## Start a local gradio demo
Run the following command:

```python
python gradio_demo/app.py
```

You could customize this script in [this file](gradio_demo/app.py).

206
207
If you want to run it on MAC, you should follow [this Instruction](MacGPUEnv.md) and then run the app.py.

208
## Usage Tips:
sayan-0's avatar
sayan-0 committed
209
- Upload more photos of the person to be customized to improve ID fidelity. If the input is Asian face(s), maybe consider adding 'Asian' before the class word, e.g., `Asian woman img`
210
- When stylizing, does the generated face look too realistic? Adjust the Style strength to 30-50, the larger the number, the less ID fidelity, but the stylization ability will be better. You could also try out other base models or LoRAs with good stylization effects.
sayan-0's avatar
sayan-0 committed
211
- Reduce the number of generated images and sampling steps for faster speed. However, please keep in mind that reducing the sampling steps may compromise the ID fidelity.
212

Zhen Li's avatar
Zhen Li committed
213
# Related Resources
214
### Replicate demo of PhotoMaker: 
Zhen Li's avatar
Zhen Li committed
215
1. [Demo link](https://replicate.com/jd7h/photomaker), run PhotoMaker on replicate, provided by [@yorickvP](https://github.com/yorickvP).
216
217
2. [Demo link (style version)](https://replicate.com/yorickvp/photomaker-style).

218
219
220
### Windows version of PhotoMaker: 
1. [bmaltais/PhotoMaker](https://github.com/bmaltais/PhotoMaker/tree/v1.0.1) by [@bmaltais](https://github.com/bmaltais), easy to deploy PhotoMaker on Windows. The description can be found in [this link](https://github.com/TencentARC/PhotoMaker/discussions/36#discussioncomment-8156199).
2. [sdbds/PhotoMaker-for-windows](https://github.com/sdbds/PhotoMaker-for-windows/tree/windows) by [@sdbds](https://github.com/bmaltais).
Zhen Li's avatar
Zhen Li committed
221
   
222
### ComfyUI:
223
224
225
226
1. 🔥 **Official Implementation by [ComfyUI](https://github.com/comfyanonymous/ComfyUI)**: https://github.com/comfyanonymous/ComfyUI/commit/d1533d9c0f1dde192f738ef1b745b15f49f41e02
2. https://github.com/ZHO-ZHO-ZHO/ComfyUI-PhotoMaker
3. https://github.com/StartHua/Comfyui-Mine-PhotoMaker
4. https://github.com/shiimizu/ComfyUI-PhotoMaker
Zhen Li's avatar
Zhen Li committed
227

Zhen Li's avatar
Zhen Li committed
228
### Other Applications / Web Demos
229
230
1. **Wisemodel 始智 (Easy to use in China)** https://wisemodel.cn/space/gradio/photomaker 
2. **OpenXLab (Easy to use in China)**: https://openxlab.org.cn/apps/detail/camenduru/PhotoMaker
Zhen Li's avatar
Zhen Li committed
231
232
 [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/app-center/openxlab_app.svg)](https://openxlab.org.cn/apps/detail/camenduru/PhotoMaker)
by [@camenduru](https://github.com/camenduru).
233
3. **Fooocus App**: [Fooocus-inswapper](https://github.com/machineminded/Fooocus-inswapper) provided by [@machineminded](https://github.com/machineminded)
Zhen Li's avatar
Zhen Li committed
234
235
236
4. **Colab**: https://github.com/camenduru/PhotoMaker-colab by [@camenduru](https://github.com/camenduru)
5. **Monster API**: https://monsterapi.ai/playground?model=photo-maker
6. **Pinokio**: https://pinokio.computer/item?uri=https://github.com/cocktailpeanutlabs/photomaker
Zhen Li's avatar
Zhen Li committed
237

238
239
### Graido demo in 45 lines
Provided by [@Gradio](https://twitter.com/Gradio/status/1747683500495691942)
Zhen Li's avatar
Zhen Li committed
240

Zhen Li's avatar
Zhen Li committed
241

242
# 🤗 Acknowledgements
243
- PhotoMaker is co-hosted by Tencent ARC Lab and Nankai University [MCG-NKU](https://mmcheng.net/cmm/).
sayan-0's avatar
sayan-0 committed
244
245
246
- Inspired from many excellent demos and repos, including [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter), [multimodalart/Ip-Adapter-FaceID](https://huggingface.co/spaces/multimodalart/Ip-Adapter-FaceID), [FastComposer](https://github.com/mit-han-lab/fastcomposer), and [T2I-Adapter](https://github.com/TencentARC/T2I-Adapter). Thanks for their great work!
- Thanks to the Venus team in Tencent PCG for their feedback and suggestions.
- Thanks to the HuggingFace team for their generous support! 
247
248

# Disclaimer
sayan-0's avatar
sayan-0 committed
249
This project strives to impact the domain of AI-driven image generation positively. Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and utilize it responsibly. The developers do not assume any responsibility for potential misuse by users.
250
251
252
253

# BibTeX
If you find PhotoMaker useful for your research and applications, please cite using this BibTeX:

sayan-0's avatar
sayan-0 committed
254
```BibTeX
255
256
@article{li2023photomaker,
  title={PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding},
Zhen Li's avatar
Zhen Li committed
257
  author={Li, Zhen and Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Cheng, Ming-Ming and Shan, Ying},
258
259
  booktitle={arXiv preprint arxiv:2312.04461},
  year={2023}
Zhen Li's avatar
Zhen Li committed
260
}