Commits · 6782b70dd376a86a4e13ff37441cd8fb8b8da5b9 · renzhc / diffusers_dcu

13 Feb, 2023 11 commits

github issue forum link (#2335) · 6782b70d

Will Berman authored Feb 13, 2023



* github issue forum link

* Update .github/ISSUE_TEMPLATE/config.yml
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


karlo image variation use kakaobrain upload (#2338) · f190714e
Will Berman authored Feb 13, 2023

[Tests] Remove unnecessary tests (#2337) · 6cbd7b8b
Patrick von Platen authored Feb 13, 2023

[Latent Upscaling] Remove unused noise (#2298) · bc0cee9d
Patrick von Platen authored Feb 13, 2023

[Versatile Diffusion] Fix tests (#2336) · 1f5f17c5
Patrick von Platen authored Feb 13, 2023

[Docs] Fix ethical guidelines docs (#2333) · 98c1a8e7
Patrick von Platen authored Feb 13, 2023

Fix typo in load_pipeline_from_original_stable_diffusion_ckpt() method (#2320) · 0850b88f
Plat authored Feb 13, 2023
```
fix typo
```

Fix running LoRA with xformers (#2286) · 5d4f59ee

bddppq authored Feb 13, 2023

* Fix running LoRA with xformers

* support disabling xformers

* reformat

* Add test


Add ethical guidelines (#2330) · f2eae168

Giada Pistilli authored Feb 13, 2023



* add ethical guidelines

* update file name

* edit file name

* update toctree

* Update docs/source/en/conceptual/ethical_guidelines.mdx

* Update docs/source/en/conceptual/ethical_guidelines.mdx

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


[Tests] Refactor push tests (#2329) · 120844aa
Patrick von Platen authored Feb 13, 2023
```
* [Tests] Refactor push tests

* correct
```

[Community Pipeline] UnCLIP Text Interpolation Pipeline (#2257) · a688c7bd

Naga Sai Abhinay authored Feb 13, 2023



* UnCLIP Text Interpolation Pipeline

* Formatter fixes

* Changes based on feedback

* Formatting fix

* Formatting fix

* isort formatting fix(?)

* Remove duplicate code

* Formatting fix

* Refactor __call__ and change example in readme.

* Update examples/community/unclip_text_interpolation.py

Refactor to linter formatting
Co-authored-by: Will Berman <wlbberman@gmail.com>

---------
Co-authored-by: Will Berman <wlbberman@gmail.com>


10 Feb, 2023 5 commits

convert ckpt script docstring fixes (#2293) · 1e7f9654

Will Berman authored Feb 10, 2023



* convert ckpt script docstring fixes

* Update src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>


remove ddpm test_full_inference (#2291) · beb59abf
Will Berman authored Feb 10, 2023
```
* remove ddpm test_full_inference

* style
```

Correct fast tests (#2314) · 96c2279b

Patrick von Platen authored Feb 10, 2023

* correct some

* Apply suggestions from code review

* correct

* Update tests/pipelines/altdiffusion/test_alt_diffusion_img2img.py

* Final


Fast CPU tests should also run on main (#2313) · 716286f1
Patrick von Platen authored Feb 10, 2023
```
add fast tests
```
make style · e83b4361
Patrick von Platen authored Feb 10, 2023


09 Feb, 2023 3 commits

[LoRA] Freezing the model weights (#2245) · 1be7df02

erkams authored Feb 09, 2023



* [LoRA] Freezing the model weights

Freeze the model weights since we don't need to calculate grads for them.

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Apply suggestions from code review

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>


make style · 62a15cec
Patrick von Platen authored Feb 09, 2023


Run same number of DDPM steps in inference as training (#2263) · f3c84838

Ben Evans authored Feb 09, 2023

Resolves ValueError: `num_inference_steps`: 1000 cannot be larger than `self.config.train_timesteps`: 50 as the unet model trained with this scheduler can only handle maximal 50 timesteps.
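The error quoted above comes from a guard of roughly the following shape. This is a minimal sketch, not the scheduler's actual code: `check_inference_steps` and its parameter names are hypothetical, and only the comparison between the two step counts is taken from the error message.

```python
def check_inference_steps(num_inference_steps: int, num_train_timesteps: int) -> int:
    """Hypothetical guard: reject more inference steps than training timesteps."""
    if num_inference_steps > num_train_timesteps:
        raise ValueError(
            f"`num_inference_steps`: {num_inference_steps} cannot be larger than "
            f"the number of training timesteps: {num_train_timesteps}."
        )
    return num_inference_steps


# Reusing the training step count for inference always satisfies the guard.
num_train_timesteps = 50
steps = check_inference_steps(num_train_timesteps, num_train_timesteps)
```

Running inference with the same number of steps as training, as the commit title describes, keeps the two values equal, so the guard cannot fire.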


08 Feb, 2023 6 commits

misc fixes (#2282) · fd5c3c09
Will Berman authored Feb 08, 2023
```
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
```
fix pix2pix docs (#2290) · 648090e2
Patrick von Platen authored Feb 08, 2023

[Examples] Test all examples on CPU (#2289) · 1ed6b777
Patrick von Platen authored Feb 08, 2023
```
* [Examples] Test all examples on CPU

* add

* correct

* Apply suggestions from code review
```

EMA: fix `state_dict()` and `load_state_dict()` & add `cur_decay_value` (#2146) · 9d0d0709

Chenguo Lin authored Feb 08, 2023

* EMA: fix `state_dict()` & add `cur_decay_value`

* EMA: fix a bug in `load_state_dict()`

'float' object (`state_dict["power"]`) has no attribute 'get'.

* del train_unconditional_ort.py
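The `load_state_dict()` error quoted above is the kind that appears when `.get` is called on a value fetched from the dict rather than on the dict itself. A minimal sketch under that assumption; the `load_power_*` helpers and the default value are hypothetical and are not the library's code:

```python
def load_power_buggy(state_dict: dict) -> float:
    # state_dict["power"] is already a float, so calling .get on it raises
    # AttributeError: 'float' object has no attribute 'get'.
    return state_dict["power"].get("power", 1.0)


def load_power_fixed(state_dict: dict, default: float = 1.0) -> float:
    # Call .get on the mapping itself; fall back to a default if the key is absent.
    return state_dict.get("power", default)


state = {"power": 0.75}
power = load_power_fixed(state)
```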


Textual inv save log memory (#2184) · c1971a53

Isamu Isozaki authored Feb 08, 2023



* Quality check and adding tokenizer

* Adapted stable diffusion to mixed precision+finished up style fixes

* Fixed based on Patrick's review

* Fixed OOM from number of validation images

* Removed unnecessary np.array conversion

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


correct tests · 41db2dbf
Patrick von Platen authored Feb 08, 2023


07 Feb, 2023 13 commits

Replace flake8 with ruff and update black (#2279) · a7ca03aa

Patrick von Platen authored Feb 08, 2023

* before running make style

* remove left overs from flake8

* finish

* make fix-copies

* final fix

* more fixes


Use `accelerate` save & loading hooks to have better checkpoint structure (#2048) · f5ccffec

Patrick von Platen authored Feb 07, 2023



* better accelerated saving

* up

* finish

* finish

* uP

* up

* up

* fix

* Apply suggestions from code review

* correct ema

* Remove @

* up

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/training/dreambooth.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>


mps cross-attention hack: don't crash on fp16 (#2258) · e619db24
Pedro Cuenca authored Feb 07, 2023
```
* mps cross-attention hack: don't crash on fp16

* Make conversion explicit.
```

Fix torchvision.transforms and transforms function naming clash (#2274) · 111228cb

wfng92 authored Feb 08, 2023



* Fix torchvision.transforms and transforms function naming clash

* Update unconditional script for onnx

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>


[Tests] Fix slow tests (#2271) · bbb46ad3
Patrick von Platen authored Feb 07, 2023


Make center crop and random flip as args for unconditional image generation (#2259) · b1dad2e9

wfng92 authored Feb 07, 2023

* Add center crop and horizontal flip to args

* Update command to use center crop and random flip


[Examples] Remove datasets import that is not needed (#2267) · cd524755
Patrick von Platen authored Feb 07, 2023
```
* [Examples] Remove datasets import that is not needed

* remove from lora as well
```
fix vae pt script · 0f04e799
Patrick von Platen authored Feb 07, 2023


Stable Diffusion Latent Upscaler (#2059) · 1051ca81

YiYi Xu authored Feb 06, 2023



* Modify UNet2DConditionModel

- allow skipping mid_block

- adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`

- allow user to set dimension for the timestep embedding (`time_embed_dim`)

- the kernel_size for `conv_in` and `conv_out` is now configurable

- add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`

- allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label)`

- added 2 arguments `attn1_types` and `attn2_types`

  * currently we have the argument `only_cross_attention`: when it's set to `True`, the `BasicTransformerBlock` has 2 cross-attention layers; otherwise we get a self-attention followed by a cross-attention. In k-upscaler, we need blocks that include just one cross-attention, or self-attention -> cross-attention, so I added `attn1_types` and `attn2_types` to the unet's argument list to let the user specify the attention types for the 2 positions in each block. Note that I still kept the `only_cross_attention` argument for the unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passed down to the down blocks

- the position of downsample layer and upsample layer is now configurable

- in the k-upscaler unet, there is only one skip connection per up/down block (instead of one per layer as in the stable diffusion unet); added `skip_freq = "block"` to support this use case

- if the user passes attention_mask to the unet, it will prepare the mask and pass a flag to the cross attention processor to skip the `prepare_attention_mask` step inside the cross attention block

add up/down blocks for k-upscaler

modify CrossAttention class

- make the `dropout` layer in `to_out` optional

- `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. Note that when it's used for cross attention, `to_k` and `to_v` have to be linear because the `encoder_hidden_states` is not 2d

- `cross_attention_norm` - add an optional layernorm on encoder_hidden_states

- `attention_dropout`: add an optional dropout on attention score

adapt BasicTransformerBlock

- add an ada groupnorm layer to condition the attention input on the timestep embedding

- allow skipping the FeedForward layer in between the attentions

- replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration

update timestep embedding: add new act_fn gelu and an optional act_2

modified ResnetBlock2D

- refactored with AdaGroupNorm class (the timestep scale shift normalization)

- add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv

- add option to use AdaGroupNorm on the input instead of groupnorm

- add options to add a dropout layer after each conv

- allow user to set the bias in conv_shortcut (needed for k-upscaler)

- add gelu

adding conversion script for k-upscaler unet

add pipeline

* fix attention mask

* fix a typo

* fix a bug

* make sure model can be used with GPU

* make pipeline work with fp16

* fix an error in BasicTransformerBlock

* make style

* fix typo

* some more fixes

* uP

* up

* correct more

* some clean-up

* clean time proj

* up

* uP

* more changes

* remove the upcast_attention=True from unet config

* remove attn1_types, attn2_types etc

* fix

* revert incorrect changes up/down samplers

* make style

* remove outdated files

* Apply suggestions from code review

* attention refactor

* refactor cross attention

* Apply suggestions from code review

* update

* up

* update

* Apply suggestions from code review

* finish

* Update src/diffusers/models/cross_attention.py

* more fixes

* up

* up

* up

* finish

* more corrections of conversion state

* act_2 -> act_2_fn

* remove dropout_after_conv from ResnetBlock2D

* make style

* simplify KAttentionBlock

* add fast test for latent upscaler pipeline

* add slow test

* slow test fp16

* make style

* add doc string for pipeline_stable_diffusion_latent_upscale

* add api doc page for latent upscaler pipeline

* deprecate attention mask

* clean up embeddings

* simplify resnet

* up

* clean up resnet

* up

* correct more

* up

* up

* improve a bit more

* correct more

* more clean-ups

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add docstrings for new unet config

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* # Copied from

* encode the image if not latent

* remove force casting vae to fp32

* fix

* add comments about preconditioning parameters from k-diffusion paper

* attn1_type, attn2_type -> add_self_attention

* clean up get_down_block and get_up_block

* fix

* fixed a typo(?) in ada group norm

* update slice attention processor for cross attention

* update slice

* fix fast test

* update the checkpoint

* finish tests

* fix-copies

* fix-copy for modeling_text_unet.py

* make style

* make style

* fix f-string

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix import

* correct changes

* fix resnet

* make fix-copies

* correct euler scheduler

* add missing #copied from for preprocess

* revert

* fix

* fix copies

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/models/cross_attention.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* clean up conversion script

* KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D

* more

* Update src/diffusers/models/unet_2d_condition.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* remove prepare_extra_step_kwargs

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix a typo in timestep embedding

* remove num_image_per_prompt

* fix fasttest

* make style + fix-copies

* fix

* fix xformer test

* fix style

* doc string

* make style

* fix-copies

* docstring for time_embedding_norm

* make style

* final finishes

* make fix-copies

* fix tests

---------
Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
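
The `norm_group_size` item in the commit message above derives the group count for group norm from the channel count. A small sketch of that arithmetic only; the helper name `num_groups_for` is hypothetical, and the divisibility check is an added assumption (group normalization needs the channel count to be divisible by the number of groups):

```python
def num_groups_for(num_channels: int, norm_group_size: int) -> int:
    """Hypothetical helper: num_groups = num_channels // norm_group_size."""
    if norm_group_size <= 0:
        raise ValueError("norm_group_size must be positive")
    if num_channels % norm_group_size != 0:
        raise ValueError(
            f"{num_channels} channels do not split evenly into groups of {norm_group_size}"
        )
    return num_channels // norm_group_size


# 320 channels with 32 channels per group gives 10 groups.
groups = num_groups_for(num_channels=320, norm_group_size=32)
```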


make style · 3b66cc0f
Patrick von Platen authored Feb 07, 2023


Create convert_vae_pt_to_diffusers.py (#2215) · 717a956a

chavinlo authored Feb 07, 2023

* Create convert_vae_pt_to_diffusers.py

Just a simple script to convert VAE.pt files to diffusers format
Tested with: https://huggingface.co/WarriorMama777/OrangeMixs/blob/main/VAEs/orangemix.vae.pt



* Update convert_vae_pt_to_diffusers.py

Forgot to add the function call

* make style

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: chavinlo <example@example.com>


Fixes prompt input checks in StableDiffusion img2img pipeline (#2206) · d43972ae

Jorge C. Gomes authored Feb 07, 2023

* Fixes prompt input checks in img2img

Allows providing prompt_embeds instead of the prompt, which is not currently possible as the first check fails.
This becomes the same as the function found in https://github.com/huggingface/diffusers/blob/8267c7844504b55366525169187767ef92d1f499/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L393

* Continues the fix

This also needs to be fixed. Becomes consistent with https://github.com/huggingface/diffusers/blob/8267c7844504b55366525169187767ef92d1f499/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L558

I've now tested this implementation, and it produces the expected results.
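
The input check described in this commit can be sketched as follows. This is an illustration only: `check_prompt_inputs` is a hypothetical standalone function, the error wording is made up, and the real pipeline method takes further arguments. Rejecting the case where both `prompt` and `prompt_embeds` are given is an assumption; the commit message itself only says that `prompt_embeds` may be provided instead of `prompt`.

```python
from typing import Any, List, Optional, Union


def check_prompt_inputs(
    prompt: Optional[Union[str, List[str]]] = None,
    prompt_embeds: Optional[Any] = None,
) -> None:
    """Hypothetical check: accept a prompt or precomputed embeddings."""
    if prompt is not None and prompt_embeds is not None:
        # Assumption: passing both is ambiguous, so it is rejected.
        raise ValueError("Pass either `prompt` or `prompt_embeds`, not both.")
    if prompt is None and prompt_embeds is None:
        raise ValueError("Provide one of `prompt` or `prompt_embeds`.")
    if prompt is not None and not isinstance(prompt, (str, list)):
        raise ValueError(f"`prompt` must be a str or list, got {type(prompt)}.")


# Both call styles pass the check.
check_prompt_inputs(prompt="a photo of a cat")
check_prompt_inputs(prompt_embeds=[[0.1, 0.2]])
```

Checking the type of `prompt` only when it is not `None` is what lets the embeddings-only call through.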


fix distributed init twice (#2252) · ffed2420
Fazzie-Maqianli authored Feb 07, 2023
```
fix colossalai dreambooth
```

06 Feb, 2023 2 commits

Mention training problems with xFormers 0.0.16 (#2254) · 8178c840
Pedro Cuenca authored Feb 06, 2023

Fix a typo: bfloa16 -> bfloat16 (#2243) · 3a0d3da6
nickkolok authored Feb 06, 2023