Commits · 1a6fa69ab610586dad912c2b8d72bef9e3f209ee · OpenDAS / diffusers

14 Feb, 2023 2 commits

Will Berman authored Feb 14, 2023



* pipeline_variant

* Add docs for when clip_stats_path is specified

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip_img2img.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip_img2img.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* prepare_latents # Copied from re: @patrickvonplaten

* NoiseAugmentor->ImageNormalizer

* stable_unclip_prior default to None re: @patrickvonplaten

* prepare_prior_extra_step_kwargs

* prior denoising scale model input

* {DDIM,DDPM}Scheduler -> KarrasDiffusionSchedulers re: @patrickvonplaten

* docs

* Update docs/source/en/api/pipelines/stable_unclip.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

62b3c9e0

unet check length inputs (#2327) · e55687e1

Will Berman authored Feb 13, 2023



* unet check length input

* prep test file for changes

* correct all tests

* clean up

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

e55687e1

07 Feb, 2023 1 commit

Stable Diffusion Latent Upscaler (#2059) · 1051ca81

YiYi Xu authored Feb 06, 2023



* Modify UNet2DConditionModel

- allow skipping mid_block

- adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`

- allow user to set dimension for the timestep embedding (`time_embed_dim`)

- the kernel_size for `conv_in` and `conv_out` is now configurable

- add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`

- allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label))`

- added 2 arguments `attn1_types` and `attn2_types`

  * currently we have argument `only_cross_attention`: when it's set to `True`, we will have a to the
`BasicTransformerBlock` block with 2 cross-attention , otherwise we
get a self-attention followed by a cross-attention; in k-upscaler, we need to have blocks that include just one cross-attention, or self-attention -> cross-attention;
so I added `attn1_types` and `attn2_types` to the unet's argument list to allow user specify the attention types for the 2 positions in each block;  note that I stil kept
the `only_cross_attention` argument for unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passing down to the down blocks

- the position of downsample layer and upsample layer is now configurable

- in k-upscaler unet, there is only one skip connection per each up/down block (instead of each layer in stable diffusion unet), added `skip_freq = "block"` to support
this use case

- if user passes attention_mask to unet, it will prepare the mask and pass a flag to cross attention processer to skip the `prepare_attention_mask` step
inside cross attention block

add up/down blocks for k-upscaler

modify CrossAttention class

- make the `dropout` layer in `to_out` optional

- `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. note that when it's used to do cross
attention, to_k, to_v has to be linear because the `encoder_hidden_states` is not 2d

- `cross_attention_norm` - add an optional layernorm on encoder_hidden_states

- `attention_dropout`: add an optional dropout on attention score

adapt BasicTransformerBlock

- add an ada groupnorm layer  to conditioning attention input with timestep embedding

- allow skipping the FeedForward layer in between the attentions

- replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration

update timestep embedding: add new act_fn  gelu and an optional act_2

modified ResnetBlock2D

- refactored with AdaGroupNorm class (the timestep scale shift normalization)

- add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv

- add option to use input AdaGroupNorm on the input instead of groupnorm

- add options to add a dropout layer after each conv

- allow user to set the bias in conv_shortcut (needed for k-upscaler)

- add gelu

adding conversion script for k-upscaler unet

add pipeline

* fix attention mask

* fix a typo

* fix a bug

* make sure model can be used with GPU

* make pipeline work with fp16

* fix an error in BasicTransfomerBlock

* make style

* fix typo

* some more fixes

* uP

* up

* correct more

* some clean-up

* clean time proj

* up

* uP

* more changes

* remove the upcast_attention=True from unet config

* remove attn1_types, attn2_types etc

* fix

* revert incorrect changes up/down samplers

* make style

* remove outdated files

* Apply suggestions from code review

* attention refactor

* refactor cross attention

* Apply suggestions from code review

* update

* up

* update

* Apply suggestions from code review

* finish

* Update src/diffusers/models/cross_attention.py

* more fixes

* up

* up

* up

* finish

* more corrections of conversion state

* act_2 -> act_2_fn

* remove dropout_after_conv from ResnetBlock2D

* make style

* simplify KAttentionBlock

* add fast test for latent upscaler pipeline

* add slow test

* slow test fp16

* make style

* add doc string for pipeline_stable_diffusion_latent_upscale

* add api doc page for latent upscaler pipeline

* deprecate attention mask

* clean up embeddings

* simplify resnet

* up

* clean up resnet

* up

* correct more

* up

* up

* improve a bit more

* correct more

* more clean-ups

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add docstrings for new unet config

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* # Copied from

* encode the image if not latent

* remove force casting vae to fp32

* fix

* add comments about preconditioning parameters from k-diffusion paper

* attn1_type, attn2_type -> add_self_attention

* clean up get_down_block and get_up_block

* fix

* fixed a typo(?) in ada group norm

* update slice attention processer for cross attention

* update slice

* fix fast test

* update the checkpoint

* finish tests

* fix-copies

* fix-copy for modeling_text_unet.py

* make style

* make style

* fix f-string

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix import

* correct changes

* fix resnet

* make fix-copies

* correct euler scheduler

* add missing #copied from for preprocess

* revert

* fix

* fix copies

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/models/cross_attention.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* clean up conversion script

* KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D

* more

* Update src/diffusers/models/unet_2d_condition.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* remove prepare_extra_step_kwargs

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix a typo in timestep embedding

* remove num_image_per_prompt

* fix fasttest

* make style + fix-copies

* fix

* fix xformer test

* fix style

* doc string

* make style

* fix-copies

* docstring for time_embedding_norm

* make style

* final finishes

* make fix-copies

* fix tests

---------
Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

1051ca81

27 Jan, 2023 1 commit

Allow lora from pipeline (#2129) · 0c39f53c

Patrick von Platen authored Jan 27, 2023

* [LoRA] All to use in inference with pipeline

* [LoRA] allow cross attention kwargs passed to pipeline

* finish

0c39f53c

26 Jan, 2023 1 commit

Allow `UNet2DModel` to use arbitrary class embeddings (#2080) · 915a5636

Pedro Cuenca authored Jan 26, 2023

* Allow `UNet2DModel` to use arbitrary class embeddings.

We can currently use class conditioning in `UNet2DConditionModel`, but
not in `UNet2DModel`. However, `UNet2DConditionModel` requires text
conditioning too, which is unrelated to other types of conditioning.
This commit makes it possible for `UNet2DModel` to be conditioned on
entities other than timesteps. This is useful for training /
research purposes. We can currently train models to perform
unconditional image generation or text-to-image generation, but it's not
straightforward to train a model to perform class-conditioned image
generation, if text conditioning is not required.

We could potentiall use `UNet2DConditionModel` for class-conditioning
without text embeddings by using down/up blocks without
cross-conditioning. However:
- The mid block currently requires cross attention.
- We are required to provide `encoder_hidden_states` to `forward`.

* Style

* Align class conditioning, add docstring for `num_class_embeds`.

* Copy docstring to versatile_diffusion UNetFlatConditionModel

915a5636

18 Jan, 2023 1 commit

[LoRA] Add LoRA training script (#1884) · ed616bd8

Patrick von Platen authored Jan 18, 2023



* [Lora] first upload

* add first lora version

* upload

* more

* first training

* up

* correct

* improve

* finish loaders and inference

* up

* up

* fix more

* up

* finish more

* finish more

* up

* up

* change year

* revert year change

* Change lines

* Add cloneofsimo as co-author.
Co-authored-by: Simo Ryu <cloneofsimo@gmail.com>

* finish

* fix docs

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* upload

* finish
Co-authored-by: Simo Ryu <cloneofsimo@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

ed616bd8

01 Jan, 2023 1 commit
- [Attention] Finish refactor attention file (#1879) · 21bbc633
  Patrick von Platen authored Jan 01, 2023
```
* [Attention] Finish refactor attention file

* correct more

* fix

* more fixes

* correct

* up
```
  21bbc633
30 Dec, 2022 1 commit

Make repo structure consistent (#1862) · 29b2c93c

Patrick von Platen authored Dec 30, 2022



* move files a bit

* more refactors

* fix more

* more fixes

* fix more onnx

* make style

* upload

* fix

* up

* fix more

* up again

* up

* small fix

* Update src/diffusers/__init__.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* correct
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

29b2c93c

20 Dec, 2022 1 commit

Refactor cross attention and allow mechanism to tweak cross attention function (#1639) · 4125756e

Patrick von Platen authored Dec 20, 2022



* first proposal

* rename

* up

* Apply suggestions from code review

* better

* up

* finish

* up

* rename

* correct versatile

* up

* up

* up

* up

* fix

* Apply suggestions from code review

* make style

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add error message
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

4125756e

19 Dec, 2022 2 commits
- Add attention mask to uclip (#1756) · 429e5449
  Patrick von Platen authored Dec 19, 2022
```
* Remove bogus file

* [Unclip] Add efficient attention

* [Unclip] Add efficient attention
```
  429e5449
- Add resnet_time_scale_shift to VD layers (#1757) · dc7cd893
  Anton Lozhkov authored Dec 19, 2022
  
  dc7cd893
18 Dec, 2022 1 commit

kakaobrain unCLIP (#1428) · 2dcf64b7

Will Berman authored Dec 18, 2022



* [wip] attention block updates

* [wip] unCLIP unet decoder and super res

* [wip] unCLIP prior transformer

* [wip] scheduler changes

* [wip] text proj utility class

* [wip] UnCLIPPipeline

* [wip] kakaobrain unCLIP convert script

* [unCLIP pipeline] fixes re: @patrickvonplaten

remove callbacks

move denoising loops into call function

* UNCLIPScheduler re: @patrickvonplaten

Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
DDPM scheduler with changes to support karlo

* mask -> attention_mask re: @patrickvonplaten

* [DDPMScheduler] remove leftover change

* [docs] PriorTransformer

* [docs] UNet2DConditionModel and UNet2DModel

* [nit] UNCLIPScheduler -> UnCLIPScheduler

matches existing unclip naming better

* [docs] SchedulingUnCLIP

* [docs] UnCLIPTextProjModel

* refactor

* finish licenses

* rename all to attention_mask and prep in models

* more renaming

* don't expose unused configs

* final renaming fixes

* remove x attn mask when not necessary

* configure kakao script to use new class embedding config

* fix copies

* [tests] UnCLIPScheduler

* finish x attn

* finish

* remove more

* rename condition blocks

* clean more

* Apply suggestions from code review

* up

* fix

* [tests] UnCLIPPipelineFastTests

* remove unused imports

* [tests] UnCLIPPipelineIntegrationTests

* correct

* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

2dcf64b7

08 Dec, 2022 1 commit
- [Versatile Diffusion] add upcast_attention (#1605) · a934e5bc
  Suraj Patil authored Dec 08, 2022
```
add upcast_attention arg
```
  a934e5bc
07 Dec, 2022 2 commits
- Make cross-attention check more robust (#1560) · 5e036921
  Pedro Cuenca authored Dec 07, 2022
```
* Make cross-attention check more robust.

* Fix copies.
```
  5e036921
- [UNet2DConditionModel] add an option to upcast attention to fp32 (#1590) · 170ebd28
  Suraj Patil authored Dec 07, 2022
```
upcast attention
```
  170ebd28
05 Dec, 2022 3 commits

[Docs] Correct docs (#1554) · 62b497c4
Patrick von Platen authored Dec 05, 2022

62b497c4
Research folder (#1553) · 459b8ca8
Patrick von Platen authored Dec 05, 2022
```
* Research folder

* Update examples/research_projects/README.md

* up
```
459b8ca8

[refactor] make set_attention_slice recursive (#1532) · bce65cd1

Suraj Patil authored Dec 05, 2022



* make attn slice recursive

* remove set_attention_slice from blocks

* fix copies

* make enable_attention_slicing base class method of DiffusionPipeline

* fix set_attention_slice

* fix set_attention_slice

* fix copies

* add tests

* up

* up

* up

* update

* up

* uP
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

bce65cd1

02 Dec, 2022 3 commits

fix tests · cf4664e8
Patrick von Platen authored Dec 02, 2022

cf4664e8

Do not use torch.long in mps (#1488) · 3ceaa280

Pedro Cuenca authored Dec 02, 2022



* Do not use torch.long in mps

Addresses #1056.

* Use torch.int instead of float.

* Propagate changes.

* Do not silently change float -> int.

* Propagate changes.

* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>

3ceaa280

[refactor] Making the xformers mem-efficient attention activation recursive (#1493) · a816a87a

Benjamin Lefaudeux authored Dec 02, 2022



* Moving the mem efficiient attention activation to the top + recursive

* black, too bad there's no pre-commit ?
Co-authored-by: Benjamin Lefaudeux <benjamin@photoroom.com>

a816a87a

25 Nov, 2022 1 commit
- Allow to set config params directly in init (#1419) · 8faa822d
  Patrick von Platen authored Nov 25, 2022
```
* fix

* fix deprecated kwargs logic

* add tests

* finish
```
  8faa822d
24 Nov, 2022 3 commits

Support SD2 attention slicing (#1397) · d50e3217

Anton Lozhkov authored Nov 24, 2022

* Support SD2 attention slicing

* Support SD2 attention slicing

* Add more copies

* Use attn_num_head_channels in blocks

* fix-copies

* Update tests

* fix imports

d50e3217

Add the new SD2 attention params to the VD text unet (#1400) · bb2c64a0
Anton Lozhkov authored Nov 24, 2022

bb2c64a0

Adapt UNet2D for supre-resolution (#1385) · cecdd8bd

Suraj Patil authored Nov 24, 2022

* allow disabling self attention

* add class_embedding

* fix copies

* fix condition

* fix copies

* do_self_attention -> only_cross_attention

* fix copies

* num_classes -> num_class_embeds

* fix default value

cecdd8bd

23 Nov, 2022 2 commits

update unet2d (#1376) · f07a16e0

Suraj Patil authored Nov 23, 2022

* boom boom

* remove duplicate arg

* add use_linear_proj arg

* fix copies

* style

* add fast tests

* use_linear_proj -> use_linear_projection

f07a16e0

[Versatile Diffusion] Add versatile diffusion model (#1283) · 2625fb59

Patrick von Platen authored Nov 23, 2022



* up

* convert dual unet

* revert dual attn

* adapt for vd-official

* test the full pipeline

* mixed inference

* mixed inference for text2img

* add image prompting

* fix clip norm

* split text2img and img2img

* fix format

* refactor text2img

* mega pipeline

* add optimus

* refactor image var

* wip text_unet

* text unet end to end

* update tests

* reshape

* fix image to text

* add some first docs

* dual guided pipeline

* fix token ratio

* propose change

* dual transformer as a native module

* DualTransformer(nn.Module)

* DualTransformer(nn.Module)

* correct unconditional image

* save-load with mega pipeline

* remove image to text

* up

* uP

* fix

* up

* final fix

* remove_unused_weights

* test updates

* save progress

* uP

* fix dual prompts

* some fixes

* finish

* style

* finish renaming

* up

* fix

* fix

* fix

* finish
Co-authored-by: anton-l <anton@huggingface.co>

2625fb59