1. 28 Mar, 2023 3 commits
  2. 24 Mar, 2023 2 commits
  3. 23 Mar, 2023 6 commits
    • Add AudioLDM (#2232) · b94880e5
      Sanchit Gandhi authored
      
      
      * Add AudioLDM
      
      * up
      
      * add vocoder
      
      * start unet
      
      * unconditional unet
      
      * clap, vocoder and vae
      
      * clean-up: conversion scripts
      
      * fix: conversion script token_type_ids
      
      * clean-up: pipeline docstring
      
      * tests: from SD
      
      * clean-up: cpu offload vocoder instead of safety checker
      
      * feat: adapt tests to audioldm
      
      * feat: add docs
      
      * clean-up: amend pipeline docstrings
      
      * clean-up: make style
      
      * clean-up: make fix-copies
      
      * fix: add doc path to toctree
      
      * clean-up: args for conversion script
      
      * clean-up: paths to checkpoints
      
      * fix: use conditional unet
      
      * clean-up: make style
      
      * fix: type hints for UNet
      
      * clean-up: docstring for UNet
      
      * clean-up: make style
      
      * clean-up: remove duplicate in docstring
      
      * clean-up: make style
      
      * clean-up: make fix-copies
      
      * clean-up: move imports to start in code snippet
      
      * fix: pass cross_attention_dim as a list/tuple to unet
      
      * clean-up: make fix-copies
      
      * fix: update checkpoint path
      
      * fix: unet cross_attention_dim in tests
      
      * film embeddings -> class embeddings
      
      * Apply suggestions from code review
      Co-authored-by: Will Berman <wlbberman@gmail.com>
      
      * fix: unet film embed to use existing args
      
      * fix: unet tests to use existing args
      
      * fix: make style
      
      * fix: transformers import and version in init
      
      * clean-up: make style
      
      * Revert "clean-up: make style"
      
      This reverts commit 5d6d1f8b324f5583e7805dc01e2c86e493660d66.
      
      * clean-up: make style
      
      * clean-up: use pipeline tester mixin tests where poss
      
      * clean-up: skip attn slicing test
      
      * fix: add torch dtype to docs
      
      * fix: remove conversion script out of src
      
      * fix: remove .detach from 1d waveform
      
      * fix: reduce default num inf steps
      
      * fix: swap height/width -> audio_length_in_s
      
      * clean-up: make style
      
      * fix: remove nightly tests
      
      * fix: imports in conversion script
      
      * clean-up: slim-down to two slow tests
      
      * clean-up: slim-down fast tests
      
      * fix: batch consistent tests
      
      * clean-up: make style
      
      * clean-up: remove vae slicing fast test
      
      * clean-up: propagate changes to doc
      
      * fix: increase test tol to 1e-2
      
      * clean-up: finish docs
      
      * clean-up: make style
      
      * feat: vocoder / VAE compatibility check
      
      * feat: possibly expand / cut audio waveform
      
      * fix: pipeline call signature test
      
      * fix: slow tests output len
      
      * clean-up: make style
      
      * make style
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: William Berman <WLBberman@gmail.com>
    • Flax controlnet (#2727) · df91c447
      YiYi Xu authored
      
      
      * add controlnet flax
      
      ---------
      Co-authored-by: yiyixuxu <yixu310@gmail.com>
    • [Docs] small fixes to the text to video doc. (#2787) · 0d7aac3e
      Sayak Paul authored
      * small fixes to the text to video doc.
      
      * add: Spaces link.
      
      * add: warning on research-only model.
    • Music Spectrogram diffusion pipeline (#1044) · 2ef9bdd7
      Kashif Rasul authored
      
      
      * initial TokenEncoder and ContinuousEncoder
      
      * initial modules
      
      * added ContinuousContextTransformer
      
      * fix copy paste error
      
      * use numpy for get_sequence_length
      
      * initial terminal relative positional encodings
      
      * fix weights keys
      
      * fix assert
      
      * cross attend style: concat encodings
      
      * make style
      
      * concat once
      
      * fix formatting
      
      * Initial SpectrogramPipeline
      
      * fix input_tokens
      
      * make style
      
      * added mel output
      
      * ignore weights for config
      
      * move mel to numpy
      
      * import pipeline
      
      * fix class names and import
      
      * moved models to models folder
      
      * import ContinuousContextTransformer and SpectrogramDiffusionPipeline
      
      * initial spec diffusion conversion script
      
      * renamed config to t5config
      
      * added weight loading
      
      * use arguments instead of t5config
      
      * broadcast noise time to batch dim
      
      * fix call
      
      * added scale_to_features
      
      * fix weights
      
      * transpose layernorm weight
      
      * scale is a vector
      
      * scale the query outputs
      
      * added comment
      
      * undo scaling
      
      * undo depth_scaling
      
      * initial get_extended_attention_mask
      
      * attention_mask is none in self-attention
      
      * cleanup
      
      * manually invert attention
      
      * nn.Linear needs bias=False
      
      * added T5LayerFFCond
      
      * remove to fix conflict
      
      * make style and dummy
      
      * remove unused variables
      
      * remove predict_epsilon
      
      * Move accelerate to a soft-dependency (#1134)
      
      * finish
      
      * finish
      
      * Update src/diffusers/modeling_utils.py
      
      * Update src/diffusers/pipeline_utils.py
      Co-authored-by: Anton Lozhkov <anton@huggingface.co>
      
      * more fixes
      
      * fix
      Co-authored-by: Anton Lozhkov <anton@huggingface.co>
      
      * fix order
      
      * added initial midi to note token data pipeline
      
      * added int to int tokenizer
      
      * remove duplicate
      
      * added logic for segments
      
      * add melgan to pipeline
      
      * move autoregressive gen into pipeline
      
      * added note_representation_processor_chain
      
      * fix dtypes
      
      * remove immutabledict req
      
      * initial doc
      
      * use np.where
      
      * require note_seq
      
      * fix typo
      
      * update dependency
      
      * added note-seq to test
      
      * added is_note_seq_available
      
      * fix import
      
      * added toc
      
      * added example usage
      
      * undo for now
      
      * moved docs
      
      * fix merge
      
      * fix imports
      
      * predict first segment
      
      * avoid un-needed copy to and from cpu
      
      * make style
      
      * Copyright
      
      * fix style
      
      * add test and fix inference steps
      
      * remove bogus files
      
      * reorder models
      
      * up
      
      * remove transformers dependency
      
      * make work with diffusers cross attention
      
      * clean more
      
      * remove @
      
      * improve further
      
      * up
      
      * uP
      
      * Apply suggestions from code review
      
      * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
      
      * loop over all tokens
      
      * make style
      
      * Added a section on the model
      
      * fix formatting
      
      * grammar
      
      * formatting
      
      * make fix-copies
      
      * Update src/diffusers/pipelines/__init__.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/spectrogram_diffusion/pipeline_spectrogram_diffusion.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * added callback and optional ONNX
      
      * do not squeeze batch dim
      
      * clean up more
      
      * upload
      
      * convert jax to numpy
      
      * make style
      
      * fix warning
      
      * make fix-copies
      
      * fix warning
      
      * add initial fast tests
      
      * add initial pipeline_params
      
      * eval mode due to dropout
      
      * skip batch tests as pipeline runs on a single file
      
      * make style
      
      * fix relative path
      
      * fix doc tests
      
      * Update src/diffusers/models/t5_film_transformer.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

      * Update src/diffusers/models/t5_film_transformer.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

      * Update docs/source/en/api/pipelines/spectrogram_diffusion.mdx
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

      * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

      * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

      * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

      * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * add MidiProcessor
      
      * format
      
      * fix org
      
      * Apply suggestions from code review
      
      * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
      
      * make style
      
      * pin protobuf to <4
      
      * fix formatting
      
      * white space
      
      * tensorboard needs protobuf
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Anton Lozhkov <anton@huggingface.co>
    • Rename 'CLIPFeatureExtractor' class to 'CLIPImageProcessor' (#2732) · 14e3a28c
      Naoki Ainoya authored
      The 'CLIPFeatureExtractor' class name has been renamed to 'CLIPImageProcessor' in order to comply with future deprecation. This commit includes the necessary changes to the affected files.
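The rename is mechanical: `CLIPImageProcessor` is transformers' drop-in replacement for the deprecated `CLIPFeatureExtractor` name, so only imports and type references change:

```python
# was: from transformers import CLIPFeatureExtractor
from transformers import CLIPImageProcessor

# default-constructed here for illustration; pipelines normally load it from
# a checkpoint with CLIPImageProcessor.from_pretrained(...)
feature_extractor = CLIPImageProcessor()
```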
    • add: section on multiple controlnets. (#2762) · c681ad1a
      Sayak Paul authored
      
      
      * add: section on multiple controlnets.
      Co-authored-by: William Berman <WLBberman@gmail.com>
      
      * fix: docs.
      
      * fix: docs.
      
      ---------
      Co-authored-by: William Berman <WLBberman@gmail.com>
  4. 22 Mar, 2023 1 commit
    • [MS Text To Video] Add first text to video (#2738) · ca1a2229
      Patrick von Platen authored
      
      
      * [MS Text To Video] Add first text to video
      
      * upload
      
      * make first model example
      
      * match unet3d params
      
      * make sure weights are correctly converted
      
      * improve
      
      * forward pass works, but diff result
      
      * make forward work
      
      * fix more
      
      * finish
      
      * refactor video output class.
      
      * feat: add support for a video export utility.
      
      * fix: opencv availability check.
      
      * run make fix-copies.
      
      * add: docs for the model components.
      
      * add: standalone pipeline doc.
      
      * edit docstring of the pipeline.
      
      * add: right path to TransformerTempModel
      
      * add: first set of tests.
      
      * complete fast tests for text to video.
      
      * fix bug
      
      * up
      
      * three fast tests failing.
      
      * add: note on slow tests
      
      * make work with all schedulers
      
      * apply styling.
      
      * add slow tests
      
      * change file name
      
      * update
      
      * more correction
      
      * more fixes
      
      * finish
      
      * up
      
      * Apply suggestions from code review
      
      * up
      
      * finish
      
      * make copies
      
      * fix pipeline tests
      
      * fix more tests
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * apply suggestions
      
      * up
      
      * revert
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
  5. 15 Mar, 2023 1 commit
  6. 09 Mar, 2023 1 commit
  7. 06 Mar, 2023 1 commit
  8. 03 Mar, 2023 2 commits
  9. 02 Mar, 2023 2 commits
    • 8k Stable Diffusion with tiled VAE (#1441) · 80148484
      Ilmari Heikkinen authored
      
      
      * Tiled VAE for high-res text2img and img2img
      
      * vae tiling, fix formatting
      
      * enable_vae_tiling API and tests
      
      * tiled vae docs, disable tiling for images that would have only one tile
      
      * tiled vae tests, use channels_last memory format
      
      * tiled vae tests, use smaller test image
      
      * tiled vae tests, remove tiling test from fast tests
      
      * up
      
      * up
      
      * make style
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * make style
      
      * improve naming
      
      * finish
      
      * apply suggestions
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * up
      
      ---------
      Co-authored-by: Ilmari Heikkinen <ilmari@fhtr.org>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    • Add a ControlNet model & pipeline (#2407) · 8dfff7c0
      Takuma Mori authored
      
      
      * add scaffold
      - copied convert_controlnet_to_diffusers.py from
      convert_original_stable_diffusion_to_diffusers.py
      
      * Add support to load ControlNet (WIP)
      - this makes Missing Key error on ControlNetModel
      
      * Update to convert ControlNet without error msg
      - init impl for StableDiffusionControlNetPipeline
      - init impl for ControlNetModel
      
      * cleanup of commented out
      
      * split create_controlnet_diffusers_config()
      from create_unet_diffusers_config()
      
      - add config: hint_channels
      
      * Add input_hint_block, input_zero_conv and
      middle_block_out
      - this makes missing key error on loading model
      
      * add unet_2d_blocks_controlnet.py
      - copied from unet_2d_blocks.py as impl CrossAttnDownBlock2D,DownBlock2D
      - this makes missing key error on loading model
      
      * Add loading for input_hint_block, zero_convs
      and middle_block_out
      
      - this makes no error message on model loading
      
      * Copy from UNet2DConditionModel except __init__
      
      * Add ultra primitive test for ControlNetModel
      inference
      
      * Support ControlNetModel inference
      - without exceptions
      
      * copy forward() from UNet2DConditionModel
      
      * Impl ControlledUNet2DConditionModel inference
      - test_controlled_unet_inference passed
      
      * Frozen weight & biases for training
      
      * Minimized version of ControlNet/ControlledUnet
      - test_modules_controllnet.py passed
      
      * make style
      
      * Add support model loading for minimized ver
      
      * Remove all previous version files
      
      * from_pretrained and inference test passed
      
      * copied from pipeline_stable_diffusion.py
      except `__init__()`
      
      * Impl pipeline, pixel match test (almost) passed.
      
      * make style
      
      * make fix-copies
      
      * Fix to add import ControlNet blocks
      for `make fix-copies`
      
      * Remove einops dependency
      
      * Support np.ndarray, PIL.Image for controlnet_hint
      
      * set default config file as lllyasviel's
      
      * Add support grayscale (hw) numpy array
      
      * Add and update docstrings
      
      * add control_net.mdx
      
      * add control_net.mdx to toctree
      
      * Update copyright year
      
      * Fix to add PIL.Image RGB->BGR conversion
      - thanks @Mystfit
      
      * make fix-copies
      
      * add basic fast test for controlnet
      
      * add slow test for controlnet/unet
      
      * Ignore down/up_block len check on ControlNet
      
      * add a copy from test_stable_diffusion.py
      
      * Accept controlnet_hint is None
      
      * merge pipeline_stable_diffusion.py diff
      
      * Update class name to SDControlNetPipeline
      
      * make style
      
      * Baseline fast test almost passed (w long desc)
      
      * still needs investigation.
      
      The following didn't pass, as described in the TODO comment:
      - test_stable_diffusion_long_prompt
      - test_stable_diffusion_no_safety_checker
      
      The following didn't pass, same as stable_diffusion_pipeline:
      - test_attention_slicing_forward_pass
      - test_inference_batch_single_identical
      - test_xformers_attention_forwardGenerator_pass
      These seem to come from calculation accuracy.
      
      * Add note comment related vae_scale_factor
      
      * add test_stable_diffusion_controlnet_ddim
      
      * add assertion for vae_scale_factor != 8
      
      * slow test of pipeline almost passed
      Failed: test_stable_diffusion_pipeline_with_model_offloading
      - ImportError: `enable_model_offload` requires `accelerate v0.17.0` or higher
      
      but currently latest version == 0.16.0
      
      * test_stable_diffusion_long_prompt passed
      
      * test_stable_diffusion_no_safety_checker passed
      
      - due to its model size, move to slow test
      
      * remove PoC test files
      
      * fix num_of_image / prompt length issue and add test
      
      * add support List[PIL.Image] for controlnet_hint
      
      * wip
      
      * all slow test passed
      
      * make style
      
      * update for slow test
      
      * RGB(PIL)->BGR(ctrlnet) conversion
      
      * fixes
      
      * remove manual num_images_per_prompt test
      
      * add document
      
      * add `image` argument docstring
      
      * make style
      
      * Add line to correct conversion
      
      * add controlnet_conditioning_scale (aka control_scales
      strength)
      
      * rgb channel ordering by default
      
      * image batching logic
      
      * Add control image descriptions for each checkpoint
      
      * Only save controlnet model in conversion script
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
      
      typo
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * add generated image example
      
      * a depth mask -> a depth map
      
      * rename control_net.mdx to controlnet.mdx
      
      * fix toc title
      
      * add ControlNet abstract and link
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
      Co-authored-by: dqueue <dbyqin@gmail.com>
      
      * remove controlnet constructor arguments re: @patrickvonplaten
      
      * [integration tests] test canny
      
      * test_canny fixes
      
      * [integration tests] test_depth
      
      * [integration tests] test_hed
      
      * [integration tests] test_mlsd
      
      * add channel order config to controlnet
      
      * [integration tests] test normal
      
      * [integration tests] test_openpose test_scribble
      
      * change height and width to default to conditioning image
      
      * [integration tests] test seg
      
      * style
      
      * test_depth fix
      
      * [integration tests] size fixes
      
      * [integration tests] cpu offloading
      
      * style
      
      * generalize controlnet embedding
      
      * fix conversion script
      
      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Style adapted to the documentation of pix2pix
      
      * merge main by hand
      
      * style
      
      * [docs] controlling generation doc nits
      
      * correct some things
      
      * add: controlnetmodel to autodoc.
      
      * finish docs
      
      * finish
      
      * finish 2
      
      * correct images
      
      * finish controlnet
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * uP
      
      * upload model
      
      * up
      
      * up
      
      ---------
      Co-authored-by: William Berman <WLBberman@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      Co-authored-by: dqueue <dbyqin@gmail.com>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  10. 01 Mar, 2023 1 commit
  11. 28 Feb, 2023 1 commit
  12. 21 Feb, 2023 1 commit
  13. 20 Feb, 2023 3 commits
  14. 17 Feb, 2023 6 commits
  15. 16 Feb, 2023 4 commits
    • Attend and excite 2 (#2369) · 2e7a2865
      YiYi Xu authored
      
      
      * attend and excite pipeline
      
      * update
      
      update docstring example
      
      remove visualization
      
      remove the base class attention control
      
      remove dependency on stable diffusion pipeline
      
      always apply gaussian filter with default setting
      
      remove run_standard_sd argument
      
      hardcode attention_res and scale_range (related to step size)
      
      Update docs/source/en/api/pipelines/stable_diffusion/attend_and_excite.mdx
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update tests/pipelines/stable_diffusion_2/test_stable_diffusion_attend_and_excite.py
      Co-authored-by: Will Berman <wlbberman@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
      Co-authored-by: Will Berman <wlbberman@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
      Co-authored-by: Will Berman <wlbberman@gmail.com>
      
      revert test_float16_inference
      
      revert change to the batch related tests
      
      fix test_float16_inference
      
      handle batch
      
      remove the deprecation message
      
      remove None check, step_size
      
      remove debugging logging
      
      add slow test
      
      indices_to_alter -> indices
      
      add check_input
      
      * skip mps
      
      * style
      
      * Apply suggestions from code review
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * indices -> token_indices
      ---------
      Co-authored-by: evin <evinpinarornek@gmail.com>
      Co-authored-by: yiyixuxu <yixu310@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • add the UniPC scheduler (#2373) · aaaec064
      Wenliang Zhao authored
      
      
      * add UniPC scheduler
      
      * add the return type to the functions
      
      * code quality check
      
      * add tests
      
      * finish docs
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • Add Self-Attention-Guided (SAG) Stable Diffusion pipeline (#2193) · fa35750d
      Susung Hong authored
      
      
      * Add Stable Diffusion w/ Self-Attention Guidance
      
      * Modify __init__.py
      
      * Register attention storing processor
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Editing default value
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update dummy_torch_and_transformers_objects.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Create test_stable_diffusion_sag.py
      
      * Create self_attention_guidance.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update test_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Rename self_attention_guidance.py to self_attention_guidance.mdx
      
      * Update self_attention_guidance.mdx
      
      * Update self_attention_guidance.mdx
      
      * Update _toctree.yml
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Fixing order
      
      * Update pipeline_stable_diffusion_sag.py
      
      * fixing import order
      
      * fix order
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Naming change
      
      * Noting pred_x0
      
      * Adding some fast tests
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update test_stable_diffusion_sag.py
      
      * Update test_stable_diffusion_sag.py
      
      * Update test_stable_diffusion_sag.py
      
      * Update docs/source/en/api/pipelines/stable_diffusion/self_attention_guidance.mdx
      
      * implement gaussian_blur
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      * fix tests
      
      * Update pipeline_stable_diffusion_sag.py
      
      * Update pipeline_stable_diffusion_sag.py
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Will Berman <wlbberman@gmail.com>
      fa35750d
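Editor's note: the `implement gaussian_blur` step in the SAG commit above refers to blurring inside the pipeline. As a rough standalone illustration (a hypothetical `gaussian_blur_2d` helper, not the pipeline's actual code), a separable Gaussian blur can be sketched as:

```python
import numpy as np

def gaussian_blur_2d(img, kernel_size=9, sigma=1.0):
    """Blur a 2D array with a separable Gaussian kernel (hedged sketch)."""
    # Build a normalized 1D Gaussian kernel.
    ax = np.arange(kernel_size) - (kernel_size - 1) / 2.0
    kernel = np.exp(-0.5 * (ax / sigma) ** 2)
    kernel /= kernel.sum()
    pad = kernel_size // 2
    padded = np.pad(img, pad, mode="reflect")
    # Horizontal pass, then vertical pass (separability of the Gaussian).
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    out = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)
    return out
```

Applying the two 1D passes is equivalent to a full 2D Gaussian convolution but cheaper, which is why blur implementations (including, presumably, the one in the pipeline) are usually written this way.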
    • [Pipelines] Adds pix2pix zero (#2334) · fd3d5502
      Sayak Paul authored
      * add: support for BLIP generation.
      
      * add: support for editing synthetic images.
      
      * remove unnecessary comments.
      
      * add inits and run make fix-copies.
      
      * version change of diffusers.
      
      * fix: condition for loading the captioner.
      
      * default conditions_input_image to False.
      
      * guidance_amount -> cross_attention_guidance_amount
      
      * fix inputs to check_inputs()
      
      * fix: attribute.
      
      * fix: prepare_attention_mask() call.
      
      * debugging.
      
      * better placement of references.
      
      * remove torch.no_grad() decorations.
      
      * put torch.no_grad() context before the first denoising loop.
      
      * detach() latents before decoding them.
      
      * put decoding in a torch.no_grad() context.
      
      * add reconstructed image for debugging.
      
      * no_grad()
      
      * apply formatting.
      
      * address one-off suggestions from the draft PR.
      
      * back to torch.no_grad() and add more elaborate comments.
      
      * refactor prepare_unet() per Patrick's suggestions.
      
      * more elaborate description for .
      
      * formatting.
      
      * add docstrings to the methods specific to pix2pix zero.
      
      * suspecting a redundant noise prediction.
      
      * needed for gradient computation chain.
      
      * less hacks.
      
      * fix: attention mask handling within the processor.
      
      * remove attention reference map computation.
      
      * fix: cross attn args.
      
      * fix: processor.
      
      * store attention maps.
      
      * fix: attention processor.
      
      * update docs and better treatment to xa args.
      
      * update the final noise computation call.
      
      * change xa args call.
      
      * remove xa args option from the pipeline.
      
      * add: docs.
      
      * first test.
      
      * fix: url call.
      
      * fix: argument call.
      
      * remove image conditioning for now.
      
      * 🚨 add: fast tests.
      
      * explicit placement of the xa attn weights.
      
      * add: slow tests 🐢
      
      * fix: tests.
      
      * edited direction embedding should be on the same device as prompt_embeds.
      
      * debugging message.
      
      * debugging.
      
      * add pix2pix zero pipeline for a non-deterministic test.
      
      * debugging.
      
      * remove debugging message.
      
      * make caption generation _
      
      * address comments (part I).
      
      * address PR comments (part II)
      
      * fix: DDPM test assertion.
      
      * refactor doc.
      
      * address PR comments (part III).
      
      * fix: type annotation for the scheduler.
      
      * apply styling.
      
      * skip_mps and add note on embeddings in the docs.
      fd3d5502
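Editor's note: several bullets above (`store attention maps.`, `fix: attention processor.`) describe pix2pix zero's core mechanism of recording cross-attention maps through a custom attention processor so they can be used for guidance. A minimal hypothetical stand-in (plain NumPy, not the library's `AttnProcessor` API) looks like:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class AttnMapStore:
    """Hypothetical minimal attention processor: computes scaled dot-product
    attention and records each softmaxed attention map for later use."""

    def __init__(self):
        self.maps = []

    def __call__(self, q, k, v):
        # q: (tokens_q, d), k/v: (tokens_k, d)
        scale = 1.0 / np.sqrt(q.shape[-1])
        attn = softmax(q @ k.T * scale)   # (tokens_q, tokens_k) attention map
        self.maps.append(attn)            # stored for the guidance objective
        return attn @ v
```

In the real pipeline the stored maps from the reference denoising pass are compared against the maps of the edited pass; the sketch only shows the storage side.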
  16. 15 Feb, 2023 1 commit
  17. 14 Feb, 2023 1 commit
  18. 07 Feb, 2023 1 commit
    • Stable Diffusion Latent Upscaler (#2059) · 1051ca81
      YiYi Xu authored
      
      
      * Modify UNet2DConditionModel
      
      - allow skipping mid_block
      
      - adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`
      
      - allow user to set dimension for the timestep embedding (`time_embed_dim`)
      
      - the kernel_size for `conv_in` and `conv_out` is now configurable
      
      - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`
      
      - allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label)`
      
      - added 2 arguments `attn1_types` and `attn2_types`
      
        * currently we have the argument `only_cross_attention`: when it's set to `True`, the
      `BasicTransformerBlock` is built with 2 cross-attention layers; otherwise we
      get a self-attention followed by a cross-attention. In the k-upscaler, we need blocks that include just one cross-attention, or self-attention -> cross-attention;
      so I added `attn1_types` and `attn2_types` to the unet's argument list to allow the user to specify the attention types for the 2 positions in each block; note that I still kept
      the `only_cross_attention` argument for the unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passed down to the down blocks
      
      - the position of downsample layer and upsample layer is now configurable
      
      - in the k-upscaler unet, there is only one skip connection per up/down block (instead of per layer as in the stable diffusion unet); added `skip_freq = "block"` to support
      this use case
      
      - if the user passes an attention_mask to the unet, it will prepare the mask and pass a flag to the cross attention processor to skip the `prepare_attention_mask` step
      inside the cross attention block
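Editor's note: among the UNet changes above, the new Fourier-feature option for `time_proj` is the most formula-like. A hedged NumPy sketch of the random-Fourier-feature idea behind `GaussianFourierProjection` (the library's layer draws its random weights once at init; names and defaults here are illustrative):

```python
import numpy as np

def gaussian_fourier_projection(timesteps, embedding_size=16, scale=16.0, seed=0):
    """Map timesteps to sin/cos features of random frequencies (sketch)."""
    rng = np.random.default_rng(seed)
    # Fixed random frequencies, scaled; drawn once in the real layer.
    W = rng.normal(size=embedding_size) * scale
    x = timesteps[:, None] * W[None, :] * 2 * np.pi
    # Concatenate sin and cos features -> embedding of size 2 * embedding_size.
    return np.concatenate([np.sin(x), np.cos(x)], axis=-1)
```

The resulting embedding then feeds the timestep MLP in place of the usual sinusoidal position encoding.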
      
      add up/down blocks for k-upscaler
      
      modify CrossAttention class
      
      - make the `dropout` layer in `to_out` optional
      
      - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible; note that when it's used to do cross
      attention, `to_k` and `to_v` have to be linear because the `encoder_hidden_states` is not 2d
      
      - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states
      
      - `attention_dropout`: add an optional dropout on attention score
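Editor's note: the `cross_attention_norm` option above adds a layernorm over `encoder_hidden_states` before they are projected to keys and values. A minimal hypothetical NumPy version of that normalization step:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize over the last (feature) dimension, as a layernorm without
    learned affine parameters (hedged sketch of the optional norm)."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

# With cross_attention_norm enabled, conceptually:
#   k = layer_norm(encoder_hidden_states) @ W_k
#   v = layer_norm(encoder_hidden_states) @ W_v
# (W_k, W_v are the projection weights; the real layer also learns a
# per-feature scale and bias.)
```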
      
      adapt BasicTransformerBlock
      
      - add an ada groupnorm layer to condition the attention input on the timestep embedding
      
      - allow skipping the FeedForward layer in between the attentions
      
      - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration
      
      update timestep embedding: add new act_fn gelu and an optional act_2
      
      modified ResnetBlock2D
      
      - refactored with AdaGroupNorm class (the timestep scale shift normalization)
      
      - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv
      
      - add option to use AdaGroupNorm on the input instead of groupnorm
      
      - add options to add a dropout layer after each conv
      
      - allow user to set the bias in conv_shortcut (needed for k-upscaler)
      
      - add gelu
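Editor's note: the `AdaGroupNorm` refactor mentioned above (timestep scale-shift normalization) group-normalizes the features and then modulates them with a scale and shift derived from the timestep embedding. A hypothetical NumPy sketch, with the scale/shift passed in directly (in the real model a linear layer produces them from the embedding):

```python
import numpy as np

def ada_group_norm(x, scale, shift, num_groups=4, eps=1e-5):
    """Group-normalize x: (C, H, W), then apply a timestep-conditioned
    (1 + scale) * x + shift modulation per channel (hedged sketch)."""
    c, h, w = x.shape
    g = x.reshape(num_groups, -1)
    g = (g - g.mean(axis=1, keepdims=True)) / np.sqrt(g.var(axis=1, keepdims=True) + eps)
    g = g.reshape(c, h, w)
    # scale, shift: (C,) vectors computed from the timestep embedding.
    return g * (1 + scale[:, None, None]) + shift[:, None, None]
```

With `scale = shift = 0` this reduces to a plain group norm, which is why it can replace the earlier timestep scale-shift code path in `ResnetBlock2D`.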
      
      adding conversion script for k-upscaler unet
      
      add pipeline
      
      * fix attention mask
      
      * fix a typo
      
      * fix a bug
      
      * make sure model can be used with GPU
      
      * make pipeline work with fp16
      
      * fix an error in BasicTransformerBlock
      
      * make style
      
      * fix typo
      
      * some more fixes
      
      * uP
      
      * up
      
      * correct more
      
      * some clean-up
      
      * clean time proj
      
      * up
      
      * uP
      
      * more changes
      
      * remove the upcast_attention=True from unet config
      
      * remove attn1_types, attn2_types etc
      
      * fix
      
      * revert incorrect changes up/down samplers
      
      * make style
      
      * remove outdated files
      
      * Apply suggestions from code review
      
      * attention refactor
      
      * refactor cross attention
      
      * Apply suggestions from code review
      
      * update
      
      * up
      
      * update
      
      * Apply suggestions from code review
      
      * finish
      
      * Update src/diffusers/models/cross_attention.py
      
      * more fixes
      
      * up
      
      * up
      
      * up
      
      * finish
      
      * more corrections of conversion state
      
      * act_2 -> act_2_fn
      
      * remove dropout_after_conv from ResnetBlock2D
      
      * make style
      
      * simplify KAttentionBlock
      
      * add fast test for latent upscaler pipeline
      
      * add slow test
      
      * slow test fp16
      
      * make style
      
      * add doc string for pipeline_stable_diffusion_latent_upscale
      
      * add api doc page for latent upscaler pipeline
      
      * deprecate attention mask
      
      * clean up embeddings
      
      * simplify resnet
      
      * up
      
      * clean up resnet
      
      * up
      
      * correct more
      
      * up
      
      * up
      
      * improve a bit more
      
      * correct more
      
      * more clean-ups
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

      * add docstrings for new unet config

      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * # Copied from
      
      * encode the image if not latent
      
      * remove force casting vae to fp32
      
      * fix
      
      * add comments about preconditioning parameters from k-diffusion paper
      
      * attn1_type, attn2_type -> add_self_attention
      
      * clean up get_down_block and get_up_block
      
      * fix
      
      * fixed a typo(?) in ada group norm
      
      * update slice attention processor for cross attention
      
      * update slice
      
      * fix fast test
      
      * update the checkpoint
      
      * finish tests
      
      * fix-copies
      
      * fix-copy for modeling_text_unet.py
      
      * make style
      
      * make style
      
      * fix f-string
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix import
      
      * correct changes
      
      * fix resnet
      
      * make fix-copies
      
      * correct euler scheduler
      
      * add missing #copied from for preprocess
      
      * revert
      
      * fix
      
      * fix copies
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update src/diffusers/models/cross_attention.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * clean up conversion script
      
      * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D
      
      * more
      
      * Update src/diffusers/models/unet_2d_condition.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * remove prepare_extra_step_kwargs

      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix a typo in timestep embedding
      
      * remove num_image_per_prompt
      
      * fix fasttest
      
      * make style + fix-copies
      
      * fix
      
      * fix xformer test
      
      * fix style
      
      * doc string
      
      * make style
      
      * fix-copies
      
      * docstring for time_embedding_norm
      
      * make style
      
      * final finishes
      
      * make fix-copies
      
      * fix tests
      
      ---------
      Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      1051ca81
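Editor's note: the bullet `add comments about preconditioning parameters from k-diffusion paper` refers to the Karras et al. (2022) preconditioning used by the latent upscaler. As a hedged reference sketch (the pipeline's actual constants and wiring may differ), the standard coefficients are:

```python
import numpy as np

def karras_preconditioning(sigma, sigma_data=0.5):
    """Karras-style preconditioning coefficients (illustrative sketch):
    the model output is combined as c_skip * x + c_out * F(c_in * x)."""
    c_skip = sigma_data**2 / (sigma**2 + sigma_data**2)
    c_out = sigma * sigma_data / np.sqrt(sigma**2 + sigma_data**2)
    c_in = 1.0 / np.sqrt(sigma**2 + sigma_data**2)
    return c_skip, c_out, c_in
```

At low noise (`sigma -> 0`) the skip path dominates (`c_skip -> 1`, `c_out -> 0`), so the network only needs to predict a small correction, which is the point of the preconditioning.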
  19. 31 Jan, 2023 1 commit
  20. 27 Jan, 2023 1 commit