Commits · a7ca03aa85f94574f06576d2155b3ec061fe8d63 · renzhc / diffusers_dcu

07 Feb, 2023 13 commits

Replace flake8 with ruff and update black (#2279) · a7ca03aa

Patrick von Platen authored Feb 08, 2023

* before running make style

* remove left overs from flake8

* finish

* make fix-copies

* final fix

* more fixes

a7ca03aa

Use `accelerate` save & loading hooks to have better checkpoint structure (#2048) · f5ccffec

Patrick von Platen authored Feb 07, 2023



* better accelerated saving

* up

* finish

* finish

* uP

* up

* up

* fix

* Apply suggestions from code review

* correct ema

* Remove @

* up

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/training/dreambooth.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

f5ccffec

mps cross-attention hack: don't crash on fp16 (#2258) · e619db24
Pedro Cuenca authored Feb 07, 2023
```
* mps cross-attention hack: don't crash on fp16

* Make conversion explicit.
```
e619db24

Fix torchvision.transforms and transforms function naming clash (#2274) · 111228cb

wfng92 authored Feb 08, 2023



* Fix torchvision.transforms and transforms function naming clash

* Update unconditional script for onnx

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

111228cb

[Tests] Fix slow tests (#2271) · bbb46ad3
Patrick von Platen authored Feb 07, 2023

bbb46ad3

Make center crop and random flip as args for unconditional image generation (#2259) · b1dad2e9

wfng92 authored Feb 07, 2023

* Add center crop and horizontal flip to args

* Update command to use center crop and random flip

* Add center crop and horizontal flip to args

* Update command to use center crop and random flip

b1dad2e9

[Examples] Remove datasets import that is not needed (#2267) · cd524755
Patrick von Platen authored Feb 07, 2023
```
* [Examples] Remove datasets import that is not needed

* remove from lora as well
```
cd524755
fix vae pt script · 0f04e799
Patrick von Platen authored Feb 07, 2023

0f04e799

Stable Diffusion Latent Upscaler (#2059) · 1051ca81

YiYi Xu authored Feb 06, 2023



* Modify UNet2DConditionModel

- allow skipping mid_block

- adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`

- allow user to set dimension for the timestep embedding (`time_embed_dim`)

- the kernel_size for `conv_in` and `conv_out` is now configurable

- add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`

- allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label)`

- added 2 arguments `attn1_types` and `attn2_types`

  * currently we have the argument `only_cross_attention`: when it's set to `True`, the `BasicTransformerBlock` has 2 cross-attention layers; otherwise we get a self-attention followed by a cross-attention. In k-upscaler, we need blocks that include just one cross-attention, or self-attention -> cross-attention, so I added `attn1_types` and `attn2_types` to the unet's argument list to let the user specify the attention types for the 2 positions in each block. Note that I still kept the `only_cross_attention` argument for the unet for easy configuration, but it is converted to `attn1_type` and `attn2_type` when passed down to the down blocks

- the position of downsample layer and upsample layer is now configurable

- in the k-upscaler unet, there is only one skip connection per up/down block (instead of one per layer as in the stable diffusion unet); added `skip_freq = "block"` to support this use case

- if the user passes attention_mask to the unet, it will prepare the mask and pass a flag to the cross attention processor to skip the `prepare_attention_mask` step inside the cross attention block
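
The `norm_group_size` rule from the list above (`num_groups = num_channels // norm_group_size`) can be sketched in a few lines. This is a minimal illustration only: the helper name `num_groups_for` and the channel widths are hypothetical, and the divisibility check is an assumption added here, not something the commit message states. Later bullets in this commit rework several of these arguments, so the merged API may differ.

```python
def num_groups_for(num_channels: int, norm_group_size: int) -> int:
    """Return the group count for a group norm layer with `num_channels` channels.

    Hypothetical helper: it mirrors the rule quoted in the commit message,
    num_groups = num_channels // norm_group_size.
    """
    if norm_group_size <= 0:
        raise ValueError("norm_group_size must be positive")
    if num_channels % norm_group_size != 0:
        # Assumption: reject uneven splits so every group has the same size.
        raise ValueError(
            f"num_channels={num_channels} is not divisible by "
            f"norm_group_size={norm_group_size}"
        )
    return num_channels // norm_group_size


# Illustrative channel widths, not taken from a real model config.
for channels in (128, 256, 512):
    print(channels, num_groups_for(channels, norm_group_size=32))
```

Fixing the group size rather than the group count keeps the number of channels per group constant as the blocks get wider.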

add up/down blocks for k-upscaler

modify CrossAttention class

- make the `dropout` layer in `to_out` optional

- `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. Note that when it's used for cross attention, `to_k` and `to_v` have to be linear because the `encoder_hidden_states` is not 2d

- `cross_attention_norm` - add an optional layernorm on encoder_hidden_states

- `attention_dropout`: add an optional dropout on attention score

adapt BasicTransformerBlock

- add an ada groupnorm layer to condition the attention input on the timestep embedding

- allow skipping the FeedForward layer in between the attentions

- replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration

update timestep embedding: add new act_fn gelu and an optional act_2

modified ResnetBlock2D

- refactored with AdaGroupNorm class (the timestep scale shift normalization)

- add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv

- add option to use AdaGroupNorm on the input instead of groupnorm

- add options to add a dropout layer after each conv

- allow user to set the bias in conv_shortcut (needed for k-upscaler)

- add gelu

adding conversion script for k-upscaler unet

add pipeline

* fix attention mask

* fix a typo

* fix a bug

* make sure model can be used with GPU

* make pipeline work with fp16

* fix an error in BasicTransformerBlock

* make style

* fix typo

* some more fixes

* uP

* up

* correct more

* some clean-up

* clean time proj

* up

* uP

* more changes

* remove the upcast_attention=True from unet config

* remove attn1_types, attn2_types etc

* fix

* revert incorrect changes up/down samplers

* make style

* remove outdated files

* Apply suggestions from code review

* attention refactor

* refactor cross attention

* Apply suggestions from code review

* update

* up

* update

* Apply suggestions from code review

* finish

* Update src/diffusers/models/cross_attention.py

* more fixes

* up

* up

* up

* finish

* more corrections of conversion state

* act_2 -> act_2_fn

* remove dropout_after_conv from ResnetBlock2D

* make style

* simplify KAttentionBlock

* add fast test for latent upscaler pipeline

* add slow test

* slow test fp16

* make style

* add doc string for pipeline_stable_diffusion_latent_upscale

* add api doc page for latent upscaler pipeline

* deprecate attention mask

* clean up embeddings

* simplify resnet

* up

* clean up resnet

* up

* correct more

* up

* up

* improve a bit more

* correct more

* more clean-ups

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add docstrings for new unet config

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* # Copied from

* encode the image if not latent

* remove force casting vae to fp32

* fix

* add comments about preconditioning parameters from k-diffusion paper

* attn1_type, attn2_type -> add_self_attention

* clean up get_down_block and get_up_block

* fix

* fixed a typo(?) in ada group norm

* update slice attention processor for cross attention

* update slice

* fix fast test

* update the checkpoint

* finish tests

* fix-copies

* fix-copy for modeling_text_unet.py

* make style

* make style

* fix f-string

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix import

* correct changes

* fix resnet

* make fix-copies

* correct euler scheduler

* add missing #copied from for preprocess

* revert

* fix

* fix copies

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/models/cross_attention.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* clean up conversion script

* KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D

* more

* Update src/diffusers/models/unet_2d_condition.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* remove prepare_extra_step_kwargs

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix a typo in timestep embedding

* remove num_image_per_prompt

* fix fasttest

* make style + fix-copies

* fix

* fix xformer test

* fix style

* doc string

* make style

* fix-copies

* docstring for time_embedding_norm

* make style

* final finishes

* make fix-copies

* fix tests

---------
Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

1051ca81

make style · 3b66cc0f
Patrick von Platen authored Feb 07, 2023

3b66cc0f

Create convert_vae_pt_to_diffusers.py (#2215) · 717a956a

chavinlo authored Feb 07, 2023

* Create convert_vae_pt_to_diffusers.py

Just a simple script to convert VAE.pt files to diffusers format
Tested with: https://huggingface.co/WarriorMama777/OrangeMixs/blob/main/VAEs/orangemix.vae.pt



* Update convert_vae_pt_to_diffusers.py

Forgot to add the function call

* make style

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: chavinlo <example@example.com>

717a956a

Fixes prompt input checks in StableDiffusion img2img pipeline (#2206) · d43972ae

Jorge C. Gomes authored Feb 07, 2023

* Fixes prompt input checks in img2img

Allows providing prompt_embeds instead of the prompt, which is not currently possible as the first check fails.
This becomes the same as the function found in https://github.com/huggingface/diffusers/blob/8267c7844504b55366525169187767ef92d1f499/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L393

* Continues the fix

This also needs to be fixed. Becomes consistent with https://github.com/huggingface/diffusers/blob/8267c7844504b55366525169187767ef92d1f499/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L558

I've now tested this implementation, and it produces the expected results.
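
The kind of check this fix describes can be sketched as follows. This is a simplified illustration, not the pipeline's actual method: the function name `check_prompt_inputs` and the error messages are hypothetical. It only shows the rule that exactly one of `prompt` and `prompt_embeds` must be supplied, and that the type check on `prompt` runs only when a prompt was actually passed.

```python
from typing import Any, List, Optional, Union


def check_prompt_inputs(
    prompt: Optional[Union[str, List[str]]] = None,
    prompt_embeds: Optional[Any] = None,
) -> None:
    """Validate that exactly one of `prompt` / `prompt_embeds` is given.

    Hypothetical stand-in for a pipeline input check.
    """
    if prompt is not None and prompt_embeds is not None:
        raise ValueError("Pass either `prompt` or `prompt_embeds`, not both.")
    if prompt is None and prompt_embeds is None:
        raise ValueError("Pass one of `prompt` or `prompt_embeds`.")
    # Only type-check the prompt when it was provided; checking it
    # unconditionally would reject callers who pass `prompt_embeds` alone.
    if prompt is not None and not isinstance(prompt, (str, list)):
        raise ValueError(
            f"`prompt` has to be of type `str` or `list` but is {type(prompt)}"
        )


# Any non-None object stands in for precomputed embeddings here.
check_prompt_inputs(prompt_embeds=[[0.1, 0.2]])
check_prompt_inputs(prompt="a photo of a cat")
```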

d43972ae

fix distributed init twice (#2252) · ffed2420
Fazzie-Maqianli authored Feb 07, 2023
```
fix colossalai dreambooth
```
ffed2420

06 Feb, 2023 2 commits
- Mention training problems with xFormers 0.0.16 (#2254) · 8178c840
  Pedro Cuenca authored Feb 06, 2023
  
  8178c840
- Fix a typo: bfloa16 -> bfloat16 (#2243) · 3a0d3da6
  nickkolok authored Feb 06, 2023
  
  3a0d3da6
05 Feb, 2023 1 commit
- Fix k_dpm_2 & k_dpm_2_a on MPS (#2241) · 22c1ba56
  psychedelicious authored Feb 06, 2023
```
Needed to convert `timesteps` to `float32` a bit sooner.

Fixes #1537
```
  22c1ba56
04 Feb, 2023 2 commits
- Show error when loading safety_checker `from_flax` (#2187) · 7386e773
  Pedro Cuenca authored Feb 04, 2023
```
* Show error when loading safety_checker `from_flax`

* fix style
```
  7386e773
- [Flax DDPM] Make `key` optional so default pipelines don't fail (#2176) · 154a7865
  Pedro Cuenca authored Feb 04, 2023
```
Make `key` optional so default pipelines don't fail.
```
  154a7865
03 Feb, 2023 11 commits

Fix typo in StableDiffusionInpaintPipeline (#2197) · 9baa29e9

Robin Hutmacher authored Feb 03, 2023



* Fix typo in StableDiffusionInpaintPipeline

* Add embedded prompt handling

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

9baa29e9

Fixes LoRAXFormersCrossAttnProcessor (#2207) · 58c416ab

Jorge C. Gomes authored Feb 03, 2023

Related to #2124
The current implementation throws a shape mismatch error, which makes sense, as this line is obviously missing compared to XFormersCrossAttnProcessor and LoRACrossAttnProcessor.

I don't have formal tests, but I compared `LoRACrossAttnProcessor` and `LoRAXFormersCrossAttnProcessor` ad-hoc, and they produce the same results with this fix.

58c416ab

Hotfix textual inv logging (#2183) · d46d78c5
Isamu Isozaki authored Feb 04, 2023

d46d78c5
make style · 05168e5d
Patrick von Platen authored Feb 03, 2023

05168e5d

fix: flagged_images implementation (#1947) · 948022e1

Justin Merrell authored Feb 03, 2023



Flagged images would be set to the blank image instead of the original image that contained the NSFW concept for optional viewing.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

948022e1

[LoRA] Make sure validation works in multi GPU setup (#2172) · 2f9a70aa
Patrick von Platen authored Feb 03, 2023
```
* [LoRA] Make sure validation works in multi GPU setup

* more fixes

* up
```
2f9a70aa
removes `~`s in favor of full-fledged links. (#2229) · e43e206d
Sayak Paul authored Feb 03, 2023
```
remove ~ in favor of full-fledged links.
```
e43e206d
[nit] negative_prompt typo (#2227) · 99c39b40
Will Berman authored Feb 03, 2023
```
* negative_prompt typo

* fix
```
99c39b40

Fix timestep dtype in legacy inpaint (#2120) · 7547f9b4

dymil authored Feb 03, 2023

* Fix timestep dtype in legacy inpaint

This matches the structure in the text2img, img2img, and inpaint ONNX pipelines

* Fix style in dtype patch

7547f9b4

refactor onnxruntime integration (#2042) · a87e87fc

Prathik Rao authored Feb 03, 2023



* refactor onnxruntime integration

* fix requirements.txt bug

* make style

* add support for textual_inversion

* make style

* add readme

* cleanup README files

* 1/27/2023 update to training scripts

* make style

* 1/30 update to train_unconditional

* style with black-22.8.0

---------
Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
Co-authored-by: anton- <anton@huggingface.co>

a87e87fc

[Bug] scheduling_ddpm: fix variance in the case of learned_range type. (#2090) · ecadcdef

Dudu Moshe authored Feb 03, 2023

scheduling_ddpm: fix variance in the case of learned_range type.

In the case of the learned_range variance type, the logs and the
exponent are missing compared to the theory (see "Improved Denoising
Diffusion Probabilistic Models", section 3.1, equation 15:
https://arxiv.org/pdf/2102.09672.pdf).
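
For reference, the cited equation interpolates between the two variance bounds in log space and then exponentiates. The sketch below is a plain-Python illustration under stated assumptions, not the scheduler's code: the function name `learned_range_variance` and the sample values are hypothetical, and `v` is assumed to already lie in [0, 1] (an implementation may first rescale a raw network output into that range, which is omitted here).

```python
import math


def learned_range_variance(beta_t: float, beta_tilde_t: float, v: float) -> float:
    """Variance for the learned_range case: exp(v*log(beta_t) + (1-v)*log(beta_tilde_t)).

    beta_t is the upper bound, beta_tilde_t the lower (posterior) bound,
    and v in [0, 1] is the interpolation weight predicted by the model.
    """
    log_variance = v * math.log(beta_t) + (1.0 - v) * math.log(beta_tilde_t)
    return math.exp(log_variance)


# Illustrative values only.
beta_t, beta_tilde_t = 0.02, 0.01
for v in (0.0, 0.5, 1.0):
    print(v, learned_range_variance(beta_t, beta_tilde_t, v))
```

Dropping the logs and the exponent would turn this into a linear blend of the two variances, which gives a different value for any `v` strictly between 0 and 1.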

ecadcdef

02 Feb, 2023 3 commits

Docs: short section on changing the scheduler in Flax (#2181) · 2bbd5329

Pedro Cuenca authored Feb 02, 2023



* Short doc on changing the scheduler in Flax.

* Apply fix from @patil-suraj
Co-authored-by: Suraj Patil <surajp815@gmail.com>

---------
Co-authored-by: Suraj Patil <surajp815@gmail.com>

2bbd5329

Create train_dreambooth_inpaint_lora.py (#2205) · 68ef0666

Adalberto authored Feb 02, 2023

* Create train_dreambooth_inpaint_lora.py

* Update train_dreambooth_inpaint_lora.py

* Update train_dreambooth_inpaint_lora.py

* Update train_dreambooth_inpaint_lora.py

* Update train_dreambooth_inpaint_lora.py

68ef0666

add CITATION.cff (#2211) · 7ac95703
Kashif Rasul authored Feb 02, 2023
```
add citation.cff
```
7ac95703

01 Feb, 2023 5 commits

Update xFormers docs (#2208) · 3816c9ad
Pedro Cuenca authored Feb 01, 2023
```
Update xFormers docs.
```
3816c9ad
[Loading] Better error message on missing keys (#2198) · 8267c784
Patrick von Platen authored Feb 01, 2023
```
* up

* finish
```
8267c784
Fix a dimension bug in Transform2d (#2144) · 4fc70848
Muyang Li authored Feb 01, 2023
```
The dimension does not match when `inner_dim` is not equal to `in_channels`.
```
4fc70848

add: guide on kerascv conversion tool. (#2169) · 9213d81b

Sayak Paul authored Feb 01, 2023



* add: guide on kerascv conversion tool.

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* address additional suggestions from review.

* change links to documentation-images.

* add separate links for training and inference goodies from diffusers.

* address Patrick's comments.

---------
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

9213d81b

Pass LoRA rank to LoRALinearLayer (#2191) · dd3cae33
Asad Memon authored Feb 01, 2023

dd3cae33

31 Jan, 2023 3 commits

[Docs] remove license (#2188) · f73d0b6b
Patrick von Platen authored Jan 31, 2023

f73d0b6b
[Docs] Add components to docs (#2175) · d0d7ffff
Patrick von Platen authored Jan 31, 2023

d0d7ffff

Use `requests` instead of `wget` in `convert_from_ckpt.py` (#2168) · 87cf88ed

Abhishek Varma authored Jan 31, 2023



-- This commit adopts `requests` in place of `wget` to fetch config `.yaml`
   files as part of the `load_pipeline_from_original_stable_diffusion_ckpt` API.
-- This was done because in Windows PowerShell one needs to explicitly ensure
   that the `wget` binary is on the PATH. If it is not present, the code
   cannot download the `.yaml` config file.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
Co-authored-by: Abhishek Varma <abhishek@nod-labs.com>
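
A minimal sketch of fetching a config file with `requests` instead of shelling out to `wget`. The helper name `fetch_config_yaml`, the `session` parameter, and the timeout default are assumptions made for illustration and are not part of the commit; `requests` is a third-party package and is imported lazily so the function can also be exercised with a stand-in object.

```python
from typing import Any, Optional


def fetch_config_yaml(url: str, session: Optional[Any] = None, timeout: float = 30.0) -> bytes:
    """Download a YAML config over HTTP and return its raw bytes.

    `session` is any object with a requests-style `.get(url, timeout=...)`;
    when omitted, the `requests` module itself is used.
    """
    if session is None:
        import requests  # third-party dependency, imported only when needed

        session = requests
    response = session.get(url, timeout=timeout)
    # Fail loudly on HTTP errors instead of returning an error page as config.
    response.raise_for_status()
    return response.content
```

The returned bytes can be wrapped in `io.BytesIO` and handed to a YAML loader, so no external binary has to be on the PATH.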

87cf88ed