Commit fd35689f authored by Shauray Singh, committed by GitHub

[WIP] Add Fabric (#4201)



* empty PR

* init

* changes

* starting with the pipeline

* stable diff

* prev

* more things, getting started

* more functions

* makeing it more readable

* almost done testing

* var changes

* testing

* device

* device support

* maybe

* device malfunctions

* new new

* register

* testing

* exec does not work

* float

* change info

* change of architecture

* might work

* testing with colab

* more attn atuff

* stupid additions

* documenting and testing

* writing tests

* more docs

* tests and docs

* remove test

* change cross attention

* revert back

* tests

* reverting back to orig

* changes

* test passing

* pipeline changes

* before quality

* quality checks pass

* remove print statements

* doc fixes

* __init__ error something

* update docs, working on dim

* working on encoding

* doc fix

* more fixes

* no more dependent on 512*512

* update docs

* fixes

* test passing

* remove comment

* fixes and migration

* simpler tests

* doc changes

* green CI

* changes

* more docs

* changes

* new images

* to community examples

* selete

* more fixes

* changes

* fix

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
parent e8c9069d
@@ -41,6 +41,7 @@ Unless otherwise mentioned, these are techniques that work with existing models
13. [Model Editing](#model-editing)
14. [DiffEdit](#diffedit)
15. [T2I-Adapter](#t2i-adapter)
16. [FABRIC](#fabric)
For convenience, we provide a table to denote which methods are inference-only and which require fine-tuning/training.
@@ -61,7 +62,7 @@ For convenience, we provide a table to denote which methods are inference-only a
| [Model Editing](#model-editing) | ✅ | ❌ | |
| [DiffEdit](#diffedit) | ✅ | ❌ | |
| [T2I-Adapter](#t2i-adapter) | ✅ | ❌ | |
| [Fabric](#fabric) | ✅ | ❌ | |
## Instruct Pix2Pix
[Paper](https://arxiv.org/abs/2211.09800)
@@ -230,3 +231,14 @@ There are 8 canonical pre-trained adapters trained on different conditionings su
depth maps, and semantic segmentations.
See [here](../api/pipelines/stable_diffusion/adapter) for more information on how to use it.
## Fabric
[Paper](https://arxiv.org/abs/2307.10159)
[Fabric](../api/pipelines/fabric) is a training-free
approach applicable to a wide range of popular diffusion models. It exploits
the self-attention layers present in the most widely used architectures to condition
the diffusion process on a set of feedback images.
To learn more, check out the [official doc](../api/pipelines/fabric).
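The idea of conditioning through self-attention can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the pipeline's code: it uses identity query/key/value projections and toy 2-D tokens, and the names `weighted_attention`, `feedback_self_attention`, and `feedback_weight` are hypothetical. Tokens of the image being denoised attend over their own keys and values plus those of a feedback image, and a per-key weight controls how strongly the feedback is felt.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_attention(queries, keys, values, key_weights):
    """Scaled dot-product attention with a per-key weight applied after softmax."""
    scale = math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        probs = [p * w for p, w in zip(softmax(scores), key_weights)]
        norm = sum(probs)  # renormalize so the attention weights sum to 1 again
        probs = [p / norm for p in probs]
        outputs.append(
            [sum(p * v[j] for p, v in zip(probs, values)) for j in range(len(values[0]))]
        )
    return outputs


def feedback_self_attention(tokens, feedback_tokens, feedback_weight=1.0):
    """Self-attention whose keys/values also include feedback-image tokens.

    Assumption: identity Q/K/V projections, to keep the sketch small.
    """
    keys = tokens + feedback_tokens
    key_weights = [1.0] * len(tokens) + [feedback_weight] * len(feedback_tokens)
    return weighted_attention(tokens, keys, keys, key_weights)


tokens = [[1.0, 0.0], [0.0, 1.0]]  # toy features of the image being denoised
feedback = [[5.0, 5.0]]            # toy features of one liked image
plain = feedback_self_attention(tokens, feedback, feedback_weight=0.0)
guided = feedback_self_attention(tokens, feedback, feedback_weight=1.0)
```

With `feedback_weight=0.0` the result matches ordinary self-attention over `tokens` alone; raising the weight pulls each output token toward the feedback features.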
@@ -41,6 +41,7 @@ If a community doesn't work as expected, please open an issue and ping the autho
| IADB Pipeline | Implementation of [Iterative α-(de)Blending: a Minimalist Deterministic Diffusion Model](https://arxiv.org/abs/2305.03486) | [IADB Pipeline](#iadb-pipeline) | - | [Thomas Chambon](https://github.com/tchambon) |
| Zero1to3 Pipeline | Implementation of [Zero-1-to-3: Zero-shot One Image to 3D Object](https://arxiv.org/abs/2303.11328) | [Zero1to3 Pipeline](#Zero1to3-pipeline) | - | [Xin Kong](https://github.com/kxhit) |
| Stable Diffusion XL Long Weighted Prompt Pipeline | A pipeline that supports prompts and negative prompts of unlimited length, using A1111-style prompt weighting | [Stable Diffusion XL Long Weighted Prompt Pipeline](#stable-diffusion-xl-long-weighted-prompt-pipeline) | - | [Andrew Zhu](https://xhinker.medium.com/) |
| FABRIC - Stable Diffusion with feedback Pipeline | Pipeline that supports feedback from liked and disliked images | [Stable Diffusion Fabric Pipeline](#stable-diffusion-fabric-pipeline) | - | [Shauray Singh](https://shauray8.github.io/about_shauray/) |
To load a custom pipeline you just need to pass the `custom_pipeline` argument to `DiffusionPipeline`, as one of the files in `diffusers/examples/community`. Feel free to send a PR with your own pipelines; we will merge them quickly.
@@ -1955,3 +1956,69 @@ Output Image
`reference_attn=True, reference_adain=True, num_inference_steps=20`
![output_image](https://github.com/huggingface/diffusers/assets/34944964/9b2f1aca-886f-49c3-89ec-d2031c8e3670)
### Stable Diffusion Fabric Pipeline
FABRIC is an approach applicable to a wide range of popular diffusion models. It exploits
the self-attention layers present in the most widely used architectures to condition
the diffusion process on a set of feedback images.
```python
import requests
import torch
from PIL import Image
from io import BytesIO

from diffusers import DiffusionPipeline

# load the pipeline
# make sure you're logged in with `huggingface-cli login`
model_id_or_path = "runwayml/stable-diffusion-v1-5"
# can also be used with dreamlike-art/dreamlike-photoreal-2.0
pipe = DiffusionPipeline.from_pretrained(
    model_id_or_path, torch_dtype=torch.float16, custom_pipeline="pipeline_fabric"
).to("cuda")

# let's specify a prompt
prompt = "An astronaut riding an elephant"
negative_prompt = "lowres, cropped"

# call the pipeline without any feedback
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=20,
    generator=torch.manual_seed(12),
).images[0]

image.save("horse_to_elephant.jpg")

# let's try another example with feedback
url = "https://raw.githubusercontent.com/ChenWu98/cycle-diffusion/main/data/dalle2/A%20black%20colored%20car.png"
response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")

prompt = "photo, A blue colored car, fish eye"
liked = [init_image]
# the same goes for disliked images

# call the pipeline with the liked image as feedback
torch.manual_seed(0)
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    liked=liked,
    num_inference_steps=20,
).images[0]

image.save("black_to_blue.png")
```
*With enough feedback images, you can generate very similar, high-quality images.*
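Collecting feedback over several rounds could look like the sketch below. The helper `run_feedback_rounds` and its `generate` and `rate` callables are hypothetical and not part of the pipeline. In real use, `generate` is assumed to wrap a call such as `pipe(prompt=..., liked=...).images[0]`, and the `disliked` keyword is assumed to mirror `liked`. Stand-in functions are used here so the example runs without a GPU or a model download.

```python
def run_feedback_rounds(generate, rate, prompt, rounds=3):
    """Generate one image per round and sort it into liked/disliked lists."""
    liked, disliked = [], []
    history = []
    for _ in range(rounds):
        # Pass copies so the callee sees only the feedback collected so far.
        image = generate(prompt=prompt, liked=list(liked), disliked=list(disliked))
        history.append(image)
        if rate(image):
            liked.append(image)
        else:
            disliked.append(image)
    return history, liked, disliked


# Stand-in for the pipeline call (assumption: the real wrapper returns an image).
def fake_generate(prompt, liked, disliked):
    return {"prompt": prompt, "n_liked": len(liked), "n_disliked": len(disliked)}


# Stand-in for a human rating: like only images produced without liked feedback.
def fake_rate(image):
    return image["n_liked"] == 0


history, liked, disliked = run_feedback_rounds(fake_generate, fake_rate, "a blue car")
```

Keeping the rating step as a separate callable makes it easy to swap a human rater for an automatic scorer without touching the loop.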
The original codebase can be found at [sd-fabric/fabric](https://github.com/sd-fabric/fabric), and available checkpoints are [dreamlike-art/dreamlike-photoreal-2.0](https://huggingface.co/dreamlike-art/dreamlike-photoreal-2.0), [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5), and [stabilityai/stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1) (may give unexpected results).
Let's have a look at the images (*512x512*):
| Without Feedback | With Feedback (1st image) |
|---------------------|---------------------|
| ![Image 1](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/fabric_wo_feedback.jpg) | ![Feedback Image 1](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/fabric_w_feedback.png) |