- 20 Jul, 2023 1 commit
-
-
Allen Zhang authored
fix inner_dim
-
- 09 Jul, 2023 1 commit
-
-
Will Berman authored
* refactor to support patching LoRA into T5 instantiate the lora linear layer on the same device as the regular linear layer get lora rank from state dict tests fmt can create lora layer in float32 even when rest of model is float16 fix loading model hook remove load_lora_weights_ and T5 dispatching remove Unet#attn_processors_state_dict docstrings * text encoder monkeypatch class method * fix test --------- Co-authored-by:Patrick von Platen <patrick.v.platen@gmail.com>
-
- 06 Jul, 2023 1 commit
-
-
Patrick von Platen authored
* Add new text encoder * add transformers depth * More * Correct conversion script * Fix more * Fix more * Correct more * correct text encoder * Finish all * proof that in works in run local xl * clean up * Get refiner to work * Add red castle * Fix batch size * Improve pipelines more * Finish text2image tests * Add img2img test * Fix more * fix import * Fix embeddings for classic models (#3888) Fix embeddings for classic SD models. * Allow multiple prompts to be passed to the refiner (#3895) * finish more * Apply suggestions from code review * add watermarker * Model offload (#3889) * Model offload. * Model offload for refiner / img2img * Hardcode encoder offload on img2img vae encode Saves some GPU RAM in img2img / refiner tasks so it remains below 8 GB. --------- Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * correct * fix * clean print * Update install warning for `invisible-watermark` * add: missing docstrings. * fix and simplify the usage example in img2img. * fix setup for watermarking. * Revert "fix setup for watermarking." This reverts commit 491bc9f5a640bbf46a97a8e52d6eff7e70eb8e4b. * fix: watermarking setup. * fix: op. * run make fix-copies. * make sure tests pass * improve convert * make tests pass * make tests pass * better error message * fiinsh * finish * Fix final test --------- Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com>
-
- 16 Jun, 2023 1 commit
-
-
Will Berman authored
-
- 06 Jun, 2023 1 commit
-
-
Sayak Paul authored
* feat: add lora attention processor for pt 2.0. * explicit context manager for SDPA. * switch to flash attention * make shapes compatible to work optimally with SDPA. * fix: circular import problem. * explicitly specify the flash attention kernel in sdpa * fall back to efficient attention context manager. * remove explicit dispatch. * fix: removed processor. * fix: remove optional from type annotation. * feat: make changes regarding LoRAAttnProcessor2_0. * remove confusing warning. * formatting. * relax tolerance for PT 2.0 * fix: loading message. * remove unnecessary logging. * add: entry to the docs. * add: network_alpha argument. * relax tolerance.
-
- 02 Jun, 2023 1 commit
-
-
Takuma Mori authored
* add _convert_kohya_lora_to_diffusers * make style * add scaffold * match result: unet attention only * fix monkey-patch for text_encoder * with CLIPAttention While the terrible images are no longer produced, the results do not match those from the hook ver. This may be due to not setting the network_alpha value. * add to support network_alpha * generate diff image * fix monkey-patch for text_encoder * add test_text_encoder_lora_monkey_patch() * verify that it's okay to release the attn_procs * fix closure version * add comment * Revert "fix monkey-patch for text_encoder" This reverts commit bb9c61e6faecc1935c9c4319c77065837655d616. * Fix to reuse utility functions * make LoRAAttnProcessor targets to self_attn * fix LoRAAttnProcessor target * make style * fix split key * Update src/diffusers/loaders.py * remove TEXT_ENCODER_TARGET_MODULES loop * add print memory usage * remove test_kohya_loras_scaffold.py * add: doc on LoRA civitai * remove print statement and refactor in the doc. * fix state_dict test for kohya-ss style lora * Apply suggestions from code review Co-authored-by:
Takuma Mori <takuma104@gmail.com> --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com>
-
- 30 May, 2023 2 commits
-
-
Patrick von Platen authored
Make sure we also change the config when setting `encoder_hid_dim_type=="text_proj"` and allow xformers (#3615) * fix if * make style * make style * add tests for xformers * make style * update
-
Patrick von Platen authored
* Fix temb attention * Apply suggestions from code review * make style * Add tests and fix docker * Apply suggestions from code review
-
- 29 May, 2023 1 commit
-
-
Sayak Paul authored
-
- 26 May, 2023 1 commit
-
-
Steven Liu authored
* add attnprocessor to docs * fix path to class * create separate page for attnprocessors * fix path * fix path for real * fill in docstrings * apply feedback * apply feedback
-
- 25 May, 2023 1 commit
-
-
YiYi Xu authored
add kandinsky2.1 --------- Co-authored-by:
yiyixuxu <yixu310@gmail,com> Co-authored-by:
Ayush Mangal <43698245+ayushtues@users.noreply.github.com> Co-authored-by:
ayushmangal <ayushmangal@microsoft.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com>
-
- 23 May, 2023 1 commit
-
-
Will Berman authored
-
- 22 May, 2023 1 commit
-
-
Birch-san authored
* Cross-attention masks prefer qualified symbol, fix accidental Optional prefer qualified symbol in AttentionProcessor prefer qualified symbol in embeddings.py qualified symbol in transformed_2d qualify FloatTensor in unet_2d_blocks move new transformer_2d params attention_mask, encoder_attention_mask to the end of the section which is assumed (e.g. by functions such as checkpoint()) to have a stable positional param interface. regard return_dict as a special-case which is assumed to be injected separately from positional params (e.g. by create_custom_forward()). move new encoder_attention_mask param to end of CrossAttn block interfaces and Unet2DCondition interface, to maintain positional param interface. regenerate modeling_text_unet.py remove unused import unet_2d_condition encoder_attention_mask docs Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> versatile_diffusion/modeling_text_unet.py encoder_attention_mask docs Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> transformer_2d encoder_attention_mask docs Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> unet_2d_blocks.py: add parameter name comments Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> revert description. bool-to-bias treatment happens in unet_2d_condition only. comment parameter names fix copies, style * encoder_attention_mask for SimpleCrossAttnDownBlock2D, SimpleCrossAttnUpBlock2D * encoder_attention_mask for UNetMidBlock2DSimpleCrossAttn * support attention_mask, encoder_attention_mask in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D, KAttentionBlock. fix binding of attention_mask, cross_attention_kwargs params in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D checkpoint invocations. * fix mistake made during merge conflict resolution * regenerate versatile_diffusion * pass time embedding into checkpointed attention invocation * always assume encoder_attention_mask is a mask (i.e. not a bias). * style, fix-copies * add tests for cross-attention masks * add test for padding of attention mask * explain mask's query_tokens dim. fix explanation about broadcasting over channels; we actually broadcast over query tokens * support both masks and biases in Transformer2DModel#forward. document behaviour * fix-copies * delete attention_mask docs on the basis I never tested self-attention masking myself. not comfortable explaining it, since I don't actually understand how a self-attn mask can work in its current form: the key length will be different in every ResBlock (we don't downsample the mask when we downsample the image). * review feedback: the standard Unet blocks shouldn't pass temb to attn (only to resnet). remove from KCrossAttnDownBlock2D,KCrossAttnUpBlock2D#forward. * remove encoder_attention_mask param from SimpleCrossAttn{Up,Down}Block2D,UNetMidBlock2DSimpleCrossAttn, and mask-choice in those blocks' #forward, on the basis that they only do one type of attention, so the consumer can pass whichever type of attention_mask is appropriate. * put attention mask padding back to how it was (since the SD use-case it enabled wasn't important, and it breaks the original unclip use-case). disable the test which was added. * fix-copies * style * fix-copies * put encoder_attention_mask param back into Simple block forward interfaces, to ensure consistency of forward interface. * restore passing of emb to KAttentionBlock#forward, on the basis that removal caused test failures. restore also the passing of emb to checkpointed calls to KAttentionBlock#forward. * make simple unet2d blocks use encoder_attention_mask, but only when attention_mask is None. this should fix UnCLIP compatibility. * fix copies
-
- 21 May, 2023 1 commit
-
-
Sayak Paul authored
* add: debugging to enabling memory efficient processing * add: better warning message.
-
- 17 May, 2023 1 commit
-
-
cmdr2 authored
Release large tensors in attention (as soon as they're no longer required). Reduces peak VRAM by nearly 2 GB for 1024x1024 (even after slicing), and the savings scale up with image size.
-
- 12 May, 2023 1 commit
-
-
Will Berman authored
* Replace `AttentionBlock` with `Attention` * use _from_deprecated_attn_block check re: @patrickvonplaten
-
- 10 May, 2023 1 commit
-
-
Sayak Paul authored
* add: a warning message when using xformers in a PT 2.0 env. * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> --------- Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
- 09 May, 2023 1 commit
-
-
Will Berman authored
* update IF stage I pipelines add fixed variance schedulers and lora loading * added kv lora attn processor * allow loading into alternative lora attn processor * make vae optional * throw away predicted variance * allow loading into added kv lora layer * allow load T5 * allow pre compute text embeddings * set new variance type in schedulers * fix copies * refactor all prompt embedding code class prompts are now included in pre-encoding code max tokenizer length is now configurable embedding attention mask is now configurable * fix for when variance type is not defined on scheduler * do not pre compute validation prompt if not present * add example test for if lora dreambooth * add check for train text encoder and pre compute text embeddings
-
- 01 May, 2023 1 commit
-
-
Patrick von Platen authored
* fix more * Fix more * fix more * Apply suggestions from code review * fix * make style * make fix-copies * fix * make sure torch compile * Clean * fix test
-
- 20 Apr, 2023 1 commit
-
-
nupurkmr9 authored
* diffusers==0.14.0 update * custom diffusion update * custom diffusion update * custom diffusion update * custom diffusion update * custom diffusion update * custom diffusion update * custom diffusion * custom diffusion * custom diffusion * custom diffusion * custom diffusion * apply formatting and get rid of bare except. * refactor readme and other minor changes. * misc refactor. * fix: repo_id issue and loaders logging bug. * fix: save_model_card. * fix: save_model_card. * fix: save_model_card. * add: doc entry. * refactor doc,. * custom diffusion * custom diffusion * custom diffusion * apply style. * remove tralining whitespace. * fix: toctree entry. * remove unnecessary print. * custom diffusion * custom diffusion * custom diffusion test * custom diffusion xformer update * custom diffusion xformer update * custom diffusion xformer update --------- Co-authored-by:
Nupur Kumari <nupurkumari@Nupurs-MacBook-Pro.local> Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Nupur Kumari <nupurkumari@nupurs-mbp.wifi.local.cmu.edu>
-
- 11 Apr, 2023 4 commits
-
-
Will Berman authored
add AttnAddedKVProcessor2_0 block
-
Will Berman authored
add group norm type to attention processor cross attention norm This lets the cross attention norm use both a group norm block and a layer norm block. The group norm operates along the channels dimension and requires input shape (batch size, channels, *) where as the layer norm with a single `normalized_shape` dimension only operates over the least significant dimension i.e. (*, channels). The channels we want to normalize are the hidden dimension of the encoder hidden states. By convention, the encoder hidden states are always passed as (batch size, sequence length, hidden states). This means the layer norm can operate on the tensor without modification, but the group norm requires flipping the last two dimensions to operate on (batch size, hidden states, sequence length). All existing attention processors will have the same logic and we can consolidate it in a helper function `prepare_encoder_hidden_states` prepare_encoder_hidden_states -> norm_encoder_hidden_states re: @patrickvonplaten move norm_cross defined check to outside norm_encoder_hidden_states add missing attn.norm_cross check
-
Will Berman authored
* add only cross attention to simple attention blocks * add test for only_cross_attention re: @patrickvonplaten * mid_block_only_cross_attention better default allow mid_block_only_cross_attention to default to `only_cross_attention` when `only_cross_attention` is given as a single boolean
-
Will Berman authored
* `AttentionProcessor.group_norm` num_channels should be `query_dim` The group_norm on the attention processor should really norm the number of channels in the query _not_ the inner dim. This wasn't caught before because the group_norm is only used by the added kv attention processors and the added kv attention processors are only used by the karlo models which are configured such that the inner dim is the same as the query dim. * add_{k,v}_proj should be projecting to inner_dim
-
- 10 Apr, 2023 2 commits
-
-
William Berman authored
-
William Berman authored
-
- 15 Mar, 2023 2 commits
-
-
Patrick von Platen authored
* rename file * rename attention * fix more * rename more * up * more deprecation imports * fixes
-
Kashif Rasul authored
* fix AttnProcessor2_0 Fix use of AttnProcessor2_0 for cross attention with mask * added scale_qk and out_bias flags * fixed for xformers * check if it has scale argument * Update cross_attention.py * check torch version * fix sliced attn * style * set scale * fix test * fixed addedKV processor * revert back AttnProcessor2_0 * if missing if * fix inner_dim --------- Co-authored-by:Patrick von Platen <patrick.v.platen@gmail.com>
-
- 03 Mar, 2023 1 commit
-
-
alvanli authored
Remove explicit message argument
-
- 01 Mar, 2023 1 commit
-
-
Patrick von Platen authored
-
- 17 Feb, 2023 2 commits
-
-
Pedro Cuenca authored
Fix typo in AttnProcessor2_0 symbol.
-
Suraj Patil authored
* add sdpa processor * don't use it by default * add some checks and style * typo * support torch sdpa in dreambooth example * use torch attn proc by default when available * typo * add attn mask * fix naming * being doc * doc * Apply suggestions from code review * polish * torctree * Apply suggestions from code review Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * better name * style * add benchamrk table * Update docs/source/en/optimization/torch2.0.mdx * up * fix example * check if processor is None * Apply suggestions from code review Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * add fp32 benchmakr * Apply suggestions from code review Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Pedro Cuenca <pedro@huggingface.co>
-
- 16 Feb, 2023 1 commit
-
-
fxmarty authored
replace torch.concat by torch.cat
-
- 13 Feb, 2023 1 commit
-
-
bddppq authored
* Fix running LoRA with xformers * support disabling xformers * reformat * Add test
-
- 07 Feb, 2023 2 commits
-
-
Pedro Cuenca authored
* mps cross-attention hack: don't crash on fp16 * Make conversion explicit.
-
YiYi Xu authored
* Modify UNet2DConditionModel - allow skipping mid_block - adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size` - allow user to set dimension for the timestep embedding (`time_embed_dim`) - the kernel_size for `conv_in` and `conv_out` is now configurable - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj` - allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label))` - added 2 arguments `attn1_types` and `attn2_types` * currently we have argument `only_cross_attention`: when it's set to `True`, we will have a to the `BasicTransformerBlock` block with 2 cross-attention , otherwise we get a self-attention followed by a cross-attention; in k-upscaler, we need to have blocks that include just one cross-attention, or self-attention -> cross-attention; so I added `attn1_types` and `attn2_types` to the unet's argument list to allow user specify the attention types for the 2 positions in each block; note that I stil kept the `only_cross_attention` argument for unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passing down to the down blocks - the position of downsample layer and upsample layer is now configurable - in k-upscaler unet, there is only one skip connection per each up/down block (instead of each layer in stable diffusion unet), added `skip_freq = "block"` to support this use case - if user passes attention_mask to unet, it will prepare the mask and pass a flag to cross attention processer to skip the `prepare_attention_mask` step inside cross attention block add up/down blocks for k-upscaler modify CrossAttention class - make the `dropout` layer in `to_out` optional - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. note that when it's used to do cross attention, to_k, to_v has to be linear because the `encoder_hidden_states` is not 2d - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states - `attention_dropout`: add an optional dropout on attention score adapt BasicTransformerBlock - add an ada groupnorm layer to conditioning attention input with timestep embedding - allow skipping the FeedForward layer in between the attentions - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration update timestep embedding: add new act_fn gelu and an optional act_2 modified ResnetBlock2D - refactored with AdaGroupNorm class (the timestep scale shift normalization) - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv - add option to use input AdaGroupNorm on the input instead of groupnorm - add options to add a dropout layer after each conv - allow user to set the bias in conv_shortcut (needed for k-upscaler) - add gelu adding conversion script for k-upscaler unet add pipeline * fix attention mask * fix a typo * fix a bug * make sure model can be used with GPU * make pipeline work with fp16 * fix an error in BasicTransfomerBlock * make style * fix typo * some more fixes * uP * up * correct more * some clean-up * clean time proj * up * uP * more changes * remove the upcast_attention=True from unet config * remove attn1_types, attn2_types etc * fix * revert incorrect changes up/down samplers * make style * remove outdated files * Apply suggestions from code review * attention refactor * refactor cross attention * Apply suggestions from code review * update * up * update * Apply suggestions from code review * finish * Update src/diffusers/models/cross_attention.py * more fixes * up * up * up * finish * more corrections of conversion state * act_2 -> act_2_fn * remove dropout_after_conv from ResnetBlock2D * make style * simplify KAttentionBlock * add fast test for latent upscaler pipeline * add slow test * slow test fp16 * make style * add doc string for pipeline_stable_diffusion_latent_upscale * add api doc page for latent upscaler pipeline * deprecate attention mask * clean up embeddings * simplify resnet * up * clean up resnet * up * correct more * up * up * improve a bit more * correct more * more clean-ups * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * add docstrings for new unet config * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * # Copied from * encode the image if not latent * remove force casting vae to fp32 * fix * add comments about preconditioning parameters from k-diffusion paper * attn1_type, attn2_type -> add_self_attention * clean up get_down_block and get_up_block * fix * fixed a typo(?) in ada group norm * update slice attention processer for cross attention * update slice * fix fast test * update the checkpoint * finish tests * fix-copies * fix-copy for modeling_text_unet.py * make style * make style * fix f-string * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * fix import * correct changes * fix resnet * make fix-copies * correct euler scheduler * add missing #copied from for preprocess * revert * fix * fix copies * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/models/cross_attention.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * clean up conversion script * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D * more * Update src/diffusers/models/unet_2d_condition.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * remove prepare_extra_step_kwargs * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * fix a typo in timestep embedding * remove num_image_per_prompt * fix fasttest * make style + fix-copies * fix * fix xformer test * fix style * doc string * make style * fix-copies * docstring for time_embedding_norm * make style * final finishes * make fix-copies * fix tests --------- Co-authored-by:
yiyixuxu <yixu@yis-macbook-pro.lan> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Pedro Cuenca <pedro@huggingface.co>
-
- 03 Feb, 2023 1 commit
-
-
Jorge C. Gomes authored
Related to #2124 The current implementation is throwing a shape mismatch error. Which makes sense, as this line is obviously missing, comparing to XFormersCrossAttnProcessor and LoRACrossAttnProcessor. I don't have formal tests, but I compared `LoRACrossAttnProcessor` and `LoRAXFormersCrossAttnProcessor` ad-hoc, and they produce the same results with this fix.
-
- 01 Feb, 2023 1 commit
-
-
Asad Memon authored
-
- 27 Jan, 2023 2 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
-