Commits · 7bc2fff1a552ea16de1bdfccdf5d865613f6a63f · renzhc / diffusers_dcu

27 Mar, 2023 1 commit

Ruff: apply same rules as in transformers (#2827) · 1d7b4b60

Pedro Cuenca authored Mar 27, 2023

* Apply same ruff settings as in transformers

See https://github.com/huggingface/transformers/blob/main/pyproject.toml

Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>

* Apply new style rules

* Style
Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>

* style

* remove list, ruff wouldn't auto fix.

---------
Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>

1d7b4b60

23 Mar, 2023 4 commits

Add AudioLDM (#2232) · b94880e5

Sanchit Gandhi authored Mar 23, 2023



* Add AudioLDM

* up

* add vocoder

* start unet

* unconditional unet

* clap, vocoder and vae

* clean-up: conversion scripts

* fix: conversion script token_type_ids

* clean-up: pipeline docstring

* tests: from SD

* clean-up: cpu offload vocoder instead of safety checker

* feat: adapt tests to audioldm

* feat: add docs

* clean-up: amend pipeline docstrings

* clean-up: make style

* clean-up: make fix-copies

* fix: add doc path to toctree

* clean-up: args for conversion script

* clean-up: paths to checkpoints

* fix: use conditional unet

* clean-up: make style

* fix: type hints for UNet

* clean-up: docstring for UNet

* clean-up: make style

* clean-up: remove duplicate in docstring

* clean-up: make style

* clean-up: make fix-copies

* clean-up: move imports to start in code snippet

* fix: pass cross_attention_dim as a list/tuple to unet

* clean-up: make fix-copies

* fix: update checkpoint path

* fix: unet cross_attention_dim in tests

* film embeddings -> class embeddings

* Apply suggestions from code review
Co-authored-by: Will Berman <wlbberman@gmail.com>

* fix: unet film embed to use existing args

* fix: unet tests to use existing args

* fix: make style

* fix: transformers import and version in init

* clean-up: make style

* Revert "clean-up: make style"

This reverts commit 5d6d1f8b324f5583e7805dc01e2c86e493660d66.

* clean-up: make style

* clean-up: use pipeline tester mixin tests where poss

* clean-up: skip attn slicing test

* fix: add torch dtype to docs

* fix: remove conversion script out of src

* fix: remove .detach from 1d waveform

* fix: reduce default num inf steps

* fix: swap height/width -> audio_length_in_s

* clean-up: make style

* fix: remove nightly tests

* fix: imports in conversion script

* clean-up: slim-down to two slow tests

* clean-up: slim-down fast tests

* fix: batch consistent tests

* clean-up: make style

* clean-up: remove vae slicing fast test

* clean-up: propagate changes to doc

* fix: increase test tol to 1e-2

* clean-up: finish docs

* clean-up: make style

* feat: vocoder / VAE compatibility check

* feat: possibly expand / cut audio waveform

* fix: pipeline call signature test

* fix: slow tests output len

* clean-up: make style

* make style

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>

b94880e5

Flax controlnet (#2727) · df91c447

YiYi Xu authored Mar 23, 2023



* add controlnet flax

---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>

df91c447

Music Spectrogram diffusion pipeline (#1044) · 2ef9bdd7

Kashif Rasul authored Mar 23, 2023



* initial TokenEncoder and ContinuousEncoder

* initial modules

* added ContinuousContextTransformer

* fix copy paste error

* use numpy for get_sequence_length

* initial terminal relative positional encodings

* fix weights keys

* fix assert

* cross attend style: concat encodings

* make style

* concat once

* fix formatting

* Initial SpectrogramPipeline

* fix input_tokens

* make style

* added mel output

* ignore weights for config

* move mel to numpy

* import pipeline

* fix class names and import

* moved models to models folder

* import ContinuousContextTransformer and SpectrogramDiffusionPipeline

* initial spec diffusion conversion script

* renamed config to t5config

* added weight loading

* use arguments instead of t5config

* broadcast noise time to batch dim

* fix call

* added scale_to_features

* fix weights

* transpose layernorm weight

* scale is a vector

* scale the query outputs

* added comment

* undo scaling

* undo depth_scaling

* initial get_extended_attention_mask

* attention_mask is none in self-attention

* cleanup

* manually invert attention

* nn.Linear needs bias=False

* added T5LayerFFCond

* remove to fix conflict

* make style and dummy

* remove unused variables

* remove predict_epsilon

* Move accelerate to a soft-dependency (#1134)

* finish

* finish

* Update src/diffusers/modeling_utils.py

* Update src/diffusers/pipeline_utils.py
Co-authored-by: Anton Lozhkov <anton@huggingface.co>

* more fixes

* fix
Co-authored-by: Anton Lozhkov <anton@huggingface.co>

* fix order

* added initial midi to note token data pipeline

* added int to int tokenizer

* remove duplicate

* added logic for segments

* add melgan to pipeline

* move autoregressive gen into pipeline

* added note_representation_processor_chain

* fix dtypes

* remove immutabledict req

* initial doc

* use np.where

* require note_seq

* fix typo

* update dependency

* added note-seq to test

* added is_note_seq_available

* fix import

* added toc

* added example usage

* undo for now

* moved docs

* fix merge

* fix imports

* predict first segment

* avoid un-needed copy to and from cpu

* make style

* Copyright

* fix style

* add test and fix inference steps

* remove bogus files

* reorder models

* up

* remove transformers dependency

* make work with diffusers cross attention

* clean more

* remove @

* improve further

* up

* uP

* Apply suggestions from code review

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

* loop over all tokens

* make style

* Added a section on the model

* fix formatting

* grammar

* formatting

* make fix-copies

* Update src/diffusers/pipelines/__init__.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/spectrogram_diffusion/pipeline_spectrogram_diffusion.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* added callback and optional onnx

* do not squeeze batch dim

* clean up more

* upload

* convert jax to numpy

* make style

* fix warning

* make fix-copies

* fix warning

* add initial fast tests

* add initial pipeline_params

* eval mode due to dropout

* skip batch tests as pipeline runs on a single file

* make style

* fix relative path

* fix doc tests

* Update src/diffusers/models/t5_film_transformer.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/t5_film_transformer.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/api/pipelines/spectrogram_diffusion.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add MidiProcessor

* format

* fix org

* Apply suggestions from code review

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

* make style

* pin protobuf to <4

* fix formatting

* white space

* tensorboard needs protobuf

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>

2ef9bdd7

[UNet3DModel] Fix with attn processor (#2790) · a8315ce1
Patrick von Platen authored Mar 23, 2023
```
* [UNet3DModel] Fix attn processor

* make style
```
a8315ce1

22 Mar, 2023 1 commit

[MS Text To Video] Add first text to video (#2738) · ca1a2229

Patrick von Platen authored Mar 22, 2023



* [MS Text To Video] Add first text to video

* upload

* make first model example

* match unet3d params

* make sure weights are correctly converted

* improve

* forward pass works, but diff result

* make forward work

* fix more

* finish

* refactor video output class.

* feat: add support for a video export utility.

* fix: opencv availability check.

* run make fix-copies.

* add: docs for the model components.

* add: standalone pipeline doc.

* edit docstring of the pipeline.

* add: right path to TransformerTempModel

* add: first set of tests.

* complete fast tests for text to video.

* fix bug

* up

* three fast tests failing.

* add: note on slow tests

* make work with all schedulers

* apply styling.

* add slow tests

* change file name

* update

* more correction

* more fixes

* finish

* up

* Apply suggestions from code review

* up

* finish

* make copies

* fix pipeline tests

* fix more tests

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* apply suggestions

* up

* revert

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

ca1a2229

21 Mar, 2023 1 commit

Fix typos (#2715) · f024e003
Alexander Pivovarov authored Mar 21, 2023
```
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
```
f024e003

17 Mar, 2023 1 commit

Enabling gradient checkpointing for VAE (#2536) · 116f70cb

Andy authored Mar 17, 2023



* updated black format

* update black format

* make style format

* updated line endings

* update code formatting

* Update examples/research_projects/onnxruntime/text_to_image/train_text_to_image.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/vae.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/vae.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* added vae gradient checkpointing test

* make style

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Will Berman <wlbberman@gmail.com>

116f70cb

16 Mar, 2023 2 commits

Improve deprecation error message when using cross_attention import (#2710) · a41850a2
Patrick von Platen authored Mar 17, 2023
```
Improve error message
```
a41850a2

Adding `use_safetensors` argument to give more control to users (#2123) · d9227cf7

Nicolas Patry authored Mar 16, 2023



* Adding `use_safetensors` argument to give more control to users

about which weights they use.

* Doc style.

* Rebased (not functional).

* Rebased and functional with tests.

* Style.

* Apply suggestions from code review

* Style.

* Addressing comments.

* Update tests/test_pipelines.py
Co-authored-by: Will Berman <wlbberman@gmail.com>

* Black ???

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Will Berman <wlbberman@gmail.com>

d9227cf7

15 Mar, 2023 3 commits

Rename attention (#2691) · e8282327

Patrick von Platen authored Mar 16, 2023

* rename file

* rename attention

* fix more

* rename more

* up

* more deprecation imports

* fixes

e8282327

T5Attention support for cross-attention (#2654) · cf4227cd

Kashif Rasul authored Mar 15, 2023



* fix AttnProcessor2_0

Fix use of AttnProcessor2_0 for cross attention with mask

* added scale_qk and out_bias flags

* fixed for xformers

* check if it has scale argument

* Update cross_attention.py

* check torch version

* fix sliced attn

* style

* set scale

* fix test

* fixed addedKV processor

* revert back AttnProcessor2_0

* if missing if

* fix inner_dim

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

cf4227cd

Controlnet training (#2545) · 79eb3d07

Henrik Forstén authored Mar 15, 2023

* Controlnet training code initial commit

Works with circle dataset: https://github.com/lllyasviel/ControlNet/blob/main/docs/train.md



* Script for adding a controlnet to existing model

* Fix control image transform

Control image should be in 0..1 range.

* Add license header and remove more unused configs

* controlnet training readme

* Allow nonlocal model in add_controlnet.py

* Formatting

* Remove unused code

* Code quality

* Initialize controlnet in training script

* Formatting

* Address review comments

* doc style

* explicit constructor args and submodule names

* hub dataset

NOTE -  not tested

* empty prompts

* add conditioning image

* rename

* remove instance data dir

* image_transforms -> -1,1 . conditioning_image_transformers -> 0, 1

* nits

* remove local rank config

I think this isn't necessary in any of our training scripts

* validation images

* proportion_empty_prompts typo

* weight copying to controlnet bug

* call log validation fix

* fix

* gitignore wandb

* fix progress bar and resume from checkpoint iteration

* initial step fix

* log multiple images

* fix

* fixes

* tracker project name configurable

* misc

* add controlnet requirements.txt

* update docs

* image labels

* small fixes

* log validation using existing models for pipeline

* fix for deepspeed saving

* memory usage docs

* Update examples/controlnet/train_controlnet.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/controlnet/train_controlnet.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/controlnet/README.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/controlnet/README.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/controlnet/README.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/controlnet/README.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/controlnet/README.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/controlnet/README.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/controlnet/README.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/controlnet/README.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* remove extra is main process check

* link to dataset in intro paragraph

* remove unnecessary paragraph

* note on deepspeed

* Update examples/controlnet/README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* assert -> value error

* weights and biases note

* move images out of git

* remove .gitignore

---------
Co-authored-by: William Berman <WLBberman@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

79eb3d07
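
The commit body above distinguishes two value ranges: training images are scaled to -1..1, while conditioning (control) images stay in 0..1. A minimal sketch of the two mappings for a single 8-bit pixel value follows; the function names are hypothetical and this is not the training script's code, which is not shown here.

```python
def to_unit_range(pixel):
    """Map an 8-bit pixel value (0..255) to 0..1, the range noted for conditioning images."""
    return pixel / 255.0


def to_signed_range(pixel):
    """Map an 8-bit pixel value (0..255) to -1..1, the range noted for training images."""
    return pixel / 255.0 * 2.0 - 1.0


# Hypothetical sample values, for illustration only.
control_value = to_unit_range(255)
image_value = to_signed_range(0)
```

Mixing the two ranges up shifts the conditioning signal, which is presumably why the commit fixes the control image transform separately.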

14 Mar, 2023 2 commits

AutoencoderKL: clamp indices of blend_h and blend_v to input size (#2660) · a7cc468f
Ilmari Heikkinen authored Mar 15, 2023

a7cc468f
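
The commit above carries only a title, so the following is a hedged one-dimensional sketch of the idea it names: clamping the blend extent to the input size so that a cross-fade never indexes past either input. The function `blend_1d` and its list-based inputs are assumptions for illustration, not the `blend_h` / `blend_v` code from the commit.

```python
def blend_1d(a, b, blend_extent):
    """Cross-fade the tail of `a` into the head of `b` over `blend_extent` samples."""
    # Clamp the extent so the loop never reads or writes outside either input.
    blend_extent = min(len(a), len(b), blend_extent)
    out = list(b)
    for i in range(blend_extent):
        weight = i / blend_extent  # 0 at the seam, approaching 1 further into `b`
        out[i] = a[len(a) - blend_extent + i] * (1 - weight) + b[i] * weight
    return out


# The requested extent (8) exceeds both inputs, so it is clamped to 2.
blended = blend_1d([1.0, 1.0], [0.0, 0.0, 0.0], blend_extent=8)
```

Without the clamp, the same call would index beyond the ends of both lists.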

fix the in-place modification in unet condition when using controlnet (#2586) · e2d9a9be

Haiwen Huang authored Mar 14, 2023



* fix the in-place modification in unet condition when using controlnet, which will cause backprop errors when training

* add clone to mid block

* fix-copies

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>

e2d9a9be

13 Mar, 2023 2 commits

Add support for Multi-ControlNet to StableDiffusionControlNetPipeline (#2627) · d9b8adc4

Takuma Mori authored Mar 14, 2023



* support for List[ControlNetModel] on init()

* Add support for multiple ControlNetCondition

* rename conditioning_scale to scale

* scaling bugfix

* Manually merge `MultiControlNet` #2621
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* cleanups
- don't expose ControlNetCondition
- move scaling to ControlNetModel

* make style error correct

* remove ControlNetCondition to reduce code diff

* refactoring image/cond_scale

* add explanation for `images`

* Add docstrings

* all fast-test passed

* Add a slow test

* nit

* Apply suggestions from code review

* small precision fix

* nits

MultiControlNet -> MultiControlNetModel - Matches existing naming a bit
closer

MultiControlNetModel inherit from model utils class - Don't have to
re-write fp16 test

Skip tests that save multi controlnet pipeline - Clearer than changing
test body

Don't auto-batch the number of input images to the number of controlnets.
We generally like to require the user to pass the expected number of
inputs. This simplifies the processing code a bit more

Use existing image pre-processing code a bit more. We can rely on the
existing image pre-processing code and keep the inference loop a bit
simpler.
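
The design note above (require the expected number of inputs rather than auto-batching) can be sketched as a small validation step. The helper name `check_control_inputs` and the string stand-ins for images and models are hypothetical; the pipeline's real checks are not shown in this log.

```python
def check_control_inputs(images, controlnets):
    """Pair each controlnet with its conditioning image, raising on a count mismatch."""
    if not isinstance(images, (list, tuple)):
        raise TypeError("expected a list of conditioning images, one per controlnet")
    if len(images) != len(controlnets):
        # No silent repetition of images to match the number of controlnets.
        raise ValueError(
            f"got {len(images)} conditioning image(s) for {len(controlnets)} controlnet(s); "
            "pass exactly one per controlnet"
        )
    return list(zip(controlnets, images))


# Placeholder names standing in for real images and models.
pairs = check_control_inputs(["canny.png", "pose.png"], ["canny_net", "pose_net"])
```

Failing early keeps the later processing code free of special cases for broadcasting a single image.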

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>

d9b8adc4

[attention] Fix attention (#2656) · 4ae54b37
Patrick von Platen authored Mar 13, 2023
```
* [attention] Fix attention

* fix

* correct
```
4ae54b37

10 Mar, 2023 1 commit

[From pretrained] Speed-up loading from cache (#2515) · d761b58b

Patrick von Platen authored Mar 10, 2023



* [From pretrained] Speed-up loading from cache

* up

* Fix more

* fix one more bug

* make style

* bigger refactor

* factor out function

* Improve more

* better

* deprecate return cache folder

* clean up

* improve tests

* up

* upload

* add nice tests

* simplify

* finish

* correct

* fix version

* rename

* Apply suggestions from code review
Co-authored-by: Lucain <lucainp@gmail.com>

* rename

* correct doc string

* correct more

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* apply code suggestions

* finish

---------
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

d761b58b

09 Mar, 2023 1 commit

Up version at which we deprecate "revision='fp16'" since `transformers` is not released yet (#2623) · 6a7a5467
Patrick von Platen authored Mar 09, 2023
```
* improve error message

* upload
```
6a7a5467

07 Mar, 2023 1 commit

Improve dynamic thresholding and extend to DDPM and DDIM Schedulers (#2528) · 55660cfb

clarencechen authored Mar 07, 2023



* Improve dynamic threshold

* Update code

* Add dynamic threshold to ddim and ddpm

* Encapsulate and leverage code copy mechanism

Update style

* Clean up DDPM/DDIM constructor arguments

* add test

* also add to unipc

---------
Co-authored-by: Peter Lin <peterlin9863@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

55660cfb

06 Mar, 2023 1 commit

[Unet1d] correct docs (#2565) · ec021923
Patrick von Platen authored Mar 06, 2023

ec021923

03 Mar, 2023 1 commit

Bug Fix: Remove explicit message argument in deprecate (#2421) · e75eae37
alvanli authored Mar 03, 2023
```
Remove explicit message argument
```
e75eae37

02 Mar, 2023 2 commits

8k Stable Diffusion with tiled VAE (#1441) · 80148484

Ilmari Heikkinen authored Mar 03, 2023



* Tiled VAE for high-res text2img and img2img

* vae tiling, fix formatting

* enable_vae_tiling API and tests

* tiled vae docs, disable tiling for images that would have only one tile

* tiled vae tests, use channels_last memory format

* tiled vae tests, use smaller test image

* tiled vae tests, remove tiling test from fast tests

* up

* up

* make style

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* make style

* improve naming

* finish

* apply suggestions

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* up

---------
Co-authored-by: Ilmari Heikkinen <ilmari@fhtr.org>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

80148484

Add a ControlNet model & pipeline (#2407) · 8dfff7c0

Takuma Mori authored Mar 02, 2023



* add scaffold
- copied convert_controlnet_to_diffusers.py from
convert_original_stable_diffusion_to_diffusers.py

* Add support to load ControlNet (WIP)
- this makes Missing Key error on ControlNetModel

* Update to convert ControlNet without error msg
- init impl for StableDiffusionControlNetPipeline
- init impl for ControlNetModel

* cleanup of commented out

* split create_controlnet_diffusers_config()
from create_unet_diffusers_config()

- add config: hint_channels

* Add input_hint_block, input_zero_conv and
middle_block_out
- this makes missing key error on loading model

* add unet_2d_blocks_controlnet.py
- copied from unet_2d_blocks.py as impl CrossAttnDownBlock2D,DownBlock2D
- this makes missing key error on loading model

* Add loading for input_hint_block, zero_convs
and middle_block_out

- this makes no error message on model loading

* Copy from UNet2DConditionalModel except __init__

* Add ultra primitive test for ControlNetModel
inference

* Support ControlNetModel inference
- without exceptions

* copy forward() from UNet2DConditionModel

* Impl ControlledUNet2DConditionModel inference
- test_controlled_unet_inference passed

* Frozen weight & biases for training

* Minimized version of ControlNet/ControlledUnet
- test_modules_controllnet.py passed

* make style

* Add support model loading for minimized ver

* Remove all previous version files

* from_pretrained and inference test passed

* copied from pipeline_stable_diffusion.py
except `__init__()`

* Impl pipeline, pixel match test (almost) passed.

* make style

* make fix-copies

* Fix to add import ControlNet blocks
for `make fix-copies`

* Remove einops dependency

* Support  np.ndarray, PIL.Image for controlnet_hint

* set default config file as lllyasviel's

* Add support grayscale (hw) numpy array

* Add and update docstrings

* add control_net.mdx

* add control_net.mdx to toctree

* Update copyright year

* Fix to add PIL.Image RGB->BGR conversion
- thanks @Mystfit

* make fix-copies

* add basic fast test for controlnet

* add slow test for controlnet/unet

* Ignore down/up_block len check on ControlNet

* add a copy from test_stable_diffusion.py

* Accept controlnet_hint is None

* merge pipeline_stable_diffusion.py diff

* Update class name to SDControlNetPipeline

* make style

* Baseline fast test almost passed (w long desc)

* still needs investigation.

The following did not pass, as described in the TODO comment:
- test_stable_diffusion_long_prompt
- test_stable_diffusion_no_safety_checker

The following did not pass, same as in stable_diffusion_pipeline:
- test_attention_slicing_forward_pass
- test_inference_batch_single_identical
- test_xformers_attention_forwardGenerator_pass
these seem to come from calculation accuracy.

* Add note comment related vae_scale_factor

* add test_stable_diffusion_controlnet_ddim

* add assertion for vae_scale_factor != 8

* slow test of pipeline almost passed
Failed: test_stable_diffusion_pipeline_with_model_offloading
- ImportError: `enable_model_offload` requires `accelerate v0.17.0` or higher

but currently latest version == 0.16.0

* test_stable_diffusion_long_prompt passed

* test_stable_diffusion_no_safety_checker passed

- due to its model size, move to slow test

* remove PoC test files

* fix num_of_image, prompt length issue and add test

* add support List[PIL.Image] for controlnet_hint

* wip

* all slow test passed

* make style

* update for slow test

* RGB(PIL)->BGR(ctrlnet) conversion

* fixes

* remove manual num_images_per_prompt test

* add document

* add `image` argument docstring

* make style

* Add line to correct conversion

* add controlnet_conditioning_scale (aka control_scales
strength)

* rgb channel ordering by default

* image batching logic

* Add control image descriptions for each checkpoint

* Only save controlnet model in conversion script

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py

typo
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add generated image example

* a depth mask -> a depth map

* rename control_net.mdx to controlnet.mdx

* fix toc title

* add ControlNet abstract and link

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
Co-authored-by: dqueue <dbyqin@gmail.com>

* remove controlnet constructor arguments re: @patrickvonplaten

* [integration tests] test canny

* test_canny fixes

* [integration tests] test_depth

* [integration tests] test_hed

* [integration tests] test_mlsd

* add channel order config to controlnet

* [integration tests] test normal

* [integration tests] test_openpose test_scribble

* change height and width to default to conditioning image

* [integration tests] test seg

* style

* test_depth fix

* [integration tests] size fixes

* [integration tests] cpu offloading

* style

* generalize controlnet embedding

* fix conversion script

* Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Style adapted to the documentation of pix2pix

* merge main by hand

* style

* [docs] controlling generation doc nits

* correct some things

* add: controlnetmodel to autodoc.

* finish docs

* finish

* finish 2

* correct images

* finish controlnet

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* uP

* upload model

* up

* up

---------
Co-authored-by: William Berman <WLBberman@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: dqueue <dbyqin@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

8dfff7c0

01 Mar, 2023 2 commits

Bring Flax attention naming in sync with PyTorch (#2511) · e4a9fb3b
Pedro Cuenca authored Mar 01, 2023
```
Bring flax attention naming in sync with PyTorch.
```
e4a9fb3b

[Copyright] 2023 (#2524) · eadf0e25
Patrick von Platen authored Mar 01, 2023

eadf0e25

27 Feb, 2023 1 commit

[Safetensors] Make sure metadata is saved (#2506) · 0e975e5f
Patrick von Platen authored Feb 27, 2023
```
* [Safetensors] Make sure metadata is saved

* make style
```
0e975e5f

17 Feb, 2023 2 commits

Fix typo in AttnProcessor2_0 symbol (#2404) · 780b3a4f
Pedro Cuenca authored Feb 17, 2023
```
Fix typo in AttnProcessor2_0 symbol.
```
780b3a4f

Torch2.0 scaled_dot_product_attention processor (#2303) · 0c0bb085

Suraj Patil authored Feb 17, 2023



* add sdpa processor

* don't use it by default

* add some checks and style

* typo

* support torch sdpa in dreambooth example

* use torch attn proc by default when available

* typo

* add attn mask

* fix naming

* being doc

* doc

* Apply suggestions from code review

* polish

* toctree

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* better name

* style

* add benchmark table

* Update docs/source/en/optimization/torch2.0.mdx

* up

* fix example

* check if processor is None

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add fp32 benchmark

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

0c0bb085

16 Feb, 2023 3 commits

Replace torch.concat calls by torch.cat (#2378) · d32e9391
fxmarty authored Feb 16, 2023
```
replace torch.concat by torch.cat
```
d32e9391

`enable_model_cpu_offload` (#2285) · 2777264e

Pedro Cuenca authored Feb 16, 2023

* enable_model_offload PoC

It's surprisingly more involved than expected, see comments in the PR.

* Rename final_offload_hook

* Invoke the vae forward hook manually.

* Completely remove decoder.

* Style

* apply_forward_hook decorator

* Rename method.

* Style

* Copy enable_model_cpu_offload

* Fix copies.

* Remove comment.

* Fix copies

* Missing import

* Fix doc-builder style.

* Merge main and fix again.

* Add docs

* Fix docs.

* Add a couple of tests.

* style

2777264e

[Variant] Add "variant" as input kwarg so to have better UX when downloading no_ema or fp16 weights (#2305) · e5810e68

Patrick von Platen authored Feb 16, 2023

* [Variant] Add variant loading mechanism

* clean

* improve further

* up

* add tests

* add some first tests

* up

* up

* use path splitext

* add deprecate

* deprecation warnings

* improve docs

* up

* up

* up

* fix tests

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* correct code format

* fix warning

* finish

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update docs/source/en/using-diffusers/loading.mdx
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Apply suggestions from code review
Co-authored-by: Will Berman <wlbberman@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* correct loading docs

* finish

---------
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Will Berman <wlbberman@gmail.com>

e5810e68
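
A minimal sketch of how a `variant` such as `fp16` or `no_ema` could be mapped to a weights filename, assuming the variant is placed between the file stem and its extension. The helper `add_variant` and the example filenames are illustrative assumptions, not the function added in the PR.

```python
import os


def add_variant(weights_name, variant=None):
    """Insert `variant` before the file extension (hypothetical helper)."""
    if variant is None:
        return weights_name
    stem, ext = os.path.splitext(weights_name)
    return f"{stem}.{variant}{ext}"


fp16_name = add_variant("diffusion_pytorch_model.bin", "fp16")
no_ema_name = add_variant("diffusion_pytorch_model.safetensors", "no_ema")
default_name = add_variant("diffusion_pytorch_model.bin")
```

With a scheme like this, a call such as `from_pretrained(..., variant="fp16")` only has to look for the variant filename instead of requiring a separate branch or repository revision.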

14 Feb, 2023 2 commits

unCLIP variant (#2297) · 62b3c9e0

Will Berman authored Feb 14, 2023



* pipeline_variant

* Add docs for when clip_stats_path is specified

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip_img2img.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip_img2img.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* prepare_latents # Copied from re: @patrickvonplaten

* NoiseAugmentor->ImageNormalizer

* stable_unclip_prior default to None re: @patrickvonplaten

* prepare_prior_extra_step_kwargs

* prior denoising scale model input

* {DDIM,DDPM}Scheduler -> KarrasDiffusionSchedulers re: @patrickvonplaten

* docs

* Update docs/source/en/api/pipelines/stable_unclip.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

62b3c9e0

unet check length inputs (#2327) · e55687e1

Will Berman authored Feb 13, 2023



* unet check length input

* prep test file for changes

* correct all tests

* clean up

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

e55687e1
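
The commit message does not spell out which inputs are checked. The sketch below assumes the check compares the lengths of per-block configuration arguments (`down_block_types`, `up_block_types`, `block_out_channels`) and fails early with a descriptive error. The function name and messages are hypothetical.

```python
def check_unet_config_lengths(down_block_types, up_block_types, block_out_channels):
    """Raise early if per-block settings disagree in length (hypothetical helper)."""
    if len(down_block_types) != len(up_block_types):
        raise ValueError(
            "Must provide the same number of `down_block_types` as `up_block_types`: "
            f"got {len(down_block_types)} and {len(up_block_types)}."
        )
    if len(block_out_channels) != len(down_block_types):
        raise ValueError(
            "Must provide the same number of `block_out_channels` as `down_block_types`: "
            f"got {len(block_out_channels)} and {len(down_block_types)}."
        )


# A consistent configuration passes silently.
check_unet_config_lengths(
    ("DownBlock2D", "CrossAttnDownBlock2D"),
    ("CrossAttnUpBlock2D", "UpBlock2D"),
    (32, 64),
)

# An inconsistent one is rejected before any layer is built.
try:
    check_unet_config_lengths(("DownBlock2D",), ("UpBlock2D",), (32, 64))
    error_message = None
except ValueError as err:
    error_message = str(err)
```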

13 Feb, 2023 1 commit

Fix running LoRA with xformers (#2286) · 5d4f59ee

bddppq authored Feb 13, 2023

* Fix running LoRA with xformers

* support disabling xformers

* reformat

* Add test

5d4f59ee

07 Feb, 2023 3 commits

Replace flake8 with ruff and update black (#2279) · a7ca03aa

Patrick von Platen authored Feb 08, 2023

* before running make style

* remove left overs from flake8

* finish

* make fix-copies

* final fix

* more fixes

a7ca03aa

mps cross-attention hack: don't crash on fp16 (#2258) · e619db24

Pedro Cuenca authored Feb 07, 2023

* mps cross-attention hack: don't crash on fp16

* Make conversion explicit.

e619db24

Stable Diffusion Latent Upscaler (#2059) · 1051ca81

YiYi Xu authored Feb 06, 2023



* Modify UNet2DConditionModel

- allow skipping mid_block

- adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`

- allow user to set dimension for the timestep embedding (`time_embed_dim`)

- the kernel_size for `conv_in` and `conv_out` is now configurable

- add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`

- allow user to add the time and class embeddings before passing them through the projection layer together - `time_embedding(t_emb + class_label)`

- added 2 arguments `attn1_types` and `attn2_types`

  * currently we have the argument `only_cross_attention`: when it's set to `True`, we get a `BasicTransformerBlock` with 2 cross-attentions, otherwise we
get a self-attention followed by a cross-attention; in k-upscaler, we need blocks that include just one cross-attention, or self-attention -> cross-attention;
so I added `attn1_types` and `attn2_types` to the unet's argument list to allow the user to specify the attention types for the 2 positions in each block; note that I still kept
the `only_cross_attention` argument for unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passing down to the down blocks

- the position of downsample layer and upsample layer is now configurable

- in k-upscaler unet, there is only one skip connection per up/down block (instead of one per layer as in the stable diffusion unet); added `skip_freq = "block"` to support
this use case

- if the user passes attention_mask to unet, it will prepare the mask and pass a flag to the cross attention processor to skip the `prepare_attention_mask` step
inside the cross attention block

add up/down blocks for k-upscaler

modify CrossAttention class

- make the `dropout` layer in `to_out` optional

- `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. Note that when it's used for cross
attention, `to_k` and `to_v` have to be linear because the `encoder_hidden_states` is not 2d

- `cross_attention_norm` - add an optional layernorm on encoder_hidden_states

- `attention_dropout`: add an optional dropout on attention score

adapt BasicTransformerBlock

- add an ada groupnorm layer to condition the attention input on the timestep embedding

- allow skipping the FeedForward layer in between the attentions

- replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration

update timestep embedding: add new act_fn gelu and an optional act_2

modified ResnetBlock2D

- refactored with AdaGroupNorm class (the timestep scale shift normalization)

- add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv

- add option to use AdaGroupNorm on the input instead of groupnorm

- add options to add a dropout layer after each conv

- allow user to set the bias in conv_shortcut (needed for k-upscaler)

- add gelu

adding conversion script for k-upscaler unet

add pipeline

* fix attention mask

* fix a typo

* fix a bug

* make sure model can be used with GPU

* make pipeline work with fp16

* fix an error in BasicTransformerBlock

* make style

* fix typo

* some more fixes

* uP

* up

* correct more

* some clean-up

* clean time proj

* up

* uP

* more changes

* remove the upcast_attention=True from unet config

* remove attn1_types, attn2_types etc

* fix

* revert incorrect changes up/down samplers

* make style

* remove outdated files

* Apply suggestions from code review

* attention refactor

* refactor cross attention

* Apply suggestions from code review

* update

* up

* update

* Apply suggestions from code review

* finish

* Update src/diffusers/models/cross_attention.py

* more fixes

* up

* up

* up

* finish

* more corrections of conversion state

* act_2 -> act_2_fn

* remove dropout_after_conv from ResnetBlock2D

* make style

* simplify KAttentionBlock

* add fast test for latent upscaler pipeline

* add slow test

* slow test fp16

* make style

* add doc string for pipeline_stable_diffusion_latent_upscale

* add api doc page for latent upscaler pipeline

* deprecate attention mask

* clean up embeddings

* simplify resnet

* up

* clean up resnet

* up

* correct more

* up

* up

* improve a bit more

* correct more

* more clean-ups

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add docstrings for new unet config

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* # Copied from

* encode the image if not latent

* remove force casting vae to fp32

* fix

* add comments about preconditioning parameters from k-diffusion paper

* attn1_type, attn2_type -> add_self_attention

* clean up get_down_block and get_up_block

* fix

* fixed a typo(?) in ada group norm

* update slice attention processor for cross attention

* update slice

* fix fast test

* update the checkpoint

* finish tests

* fix-copies

* fix-copy for modeling_text_unet.py

* make style

* make style

* fix f-string

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix import

* correct changes

* fix resnet

* make fix-copies

* correct euler scheduler

* add missing #copied from for preprocess

* revert

* fix

* fix copies

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/models/cross_attention.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* clean up conversion script

* KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D

* more

* Update src/diffusers/models/unet_2d_condition.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* remove prepare_extra_step_kwargs

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix a typo in timestep embedding

* remove num_image_per_prompt

* fix fasttest

* make style + fix-copies

* fix

* fix xformer test

* fix style

* doc string

* make style

* fix-copies

* docstring for time_embedding_norm

* make style

* final finishes

* make fix-copies

* fix tests

---------
Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

1051ca81
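
The latent upscaler commit above lists a random Fourier feature layer (`GaussianFourierProjection`) for `time_proj`. A minimal, framework-free sketch of that kind of embedding follows, assuming the usual formulation: fixed frequencies drawn once from a Gaussian, then the sine and cosine of `2π · w · t` concatenated. The class name `GaussianFourierFeatures`, its parameters, and the use of plain Python lists instead of tensors are illustrative assumptions, not the library's implementation.

```python
import math
import random


class GaussianFourierFeatures:
    """Toy random Fourier feature embedding for a scalar timestep (hypothetical)."""

    def __init__(self, embedding_size=4, scale=1.0, seed=0):
        rng = random.Random(seed)
        # Fixed (non-trainable) frequencies drawn once from N(0, scale^2).
        self.weight = [rng.gauss(0.0, 1.0) * scale for _ in range(embedding_size)]

    def __call__(self, t):
        proj = [2.0 * math.pi * w * t for w in self.weight]
        # Output has 2 * embedding_size entries: all sines, then all cosines.
        return [math.sin(p) for p in proj] + [math.cos(p) for p in proj]


time_proj = GaussianFourierFeatures(embedding_size=4, scale=16.0)
emb = time_proj(0.5)
```

Each frequency contributes one sine and one cosine entry, so every such pair lies on the unit circle regardless of the timestep value.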

03 Feb, 2023 1 commit

Fixes LoRAXFormersCrossAttnProcessor (#2207) · 58c416ab

Jorge C. Gomes authored Feb 03, 2023

Related to #2124
The current implementation throws a shape mismatch error, which makes sense, as this line is obviously missing compared to XFormersCrossAttnProcessor and LoRACrossAttnProcessor.

I don't have formal tests, but I compared `LoRACrossAttnProcessor` and `LoRAXFormersCrossAttnProcessor` ad-hoc, and they produce the same results with this fix.

58c416ab

01 Feb, 2023 1 commit

[Loading] Better error message on missing keys (#2198) · 8267c784

Patrick von Platen authored Feb 01, 2023

* up

* finish

8267c784
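
The commit body gives no detail, so the following is only a sketch of what a clearer missing-keys error could look like: compare the parameter names the model expects with those found in the checkpoint and name the missing ones explicitly. The helper `check_missing_keys`, its arguments, and the message wording are assumptions.

```python
def check_missing_keys(expected_keys, loaded_keys, model_name="model"):
    """Raise a descriptive error listing checkpoint keys that are absent (hypothetical)."""
    missing = sorted(set(expected_keys) - set(loaded_keys))
    if missing:
        raise ValueError(
            f"Cannot load {model_name}: the checkpoint is missing {len(missing)} key(s): "
            f"{', '.join(missing)}. Make sure the checkpoint matches the model architecture."
        )


expected = ["conv_in.weight", "conv_in.bias", "conv_out.weight"]
loaded = ["conv_in.weight", "conv_out.weight"]

try:
    check_missing_keys(expected, loaded, model_name="unet")
    missing_error = None
except ValueError as err:
    missing_error = str(err)
```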