1. 21 May, 2023 1 commit
  2. 17 May, 2023 1 commit
  3. 12 May, 2023 1 commit
  4. 10 May, 2023 1 commit
  5. 09 May, 2023 1 commit
    • if dreambooth lora (#3360) · a757b2db
      Will Berman authored
      * update IF stage I pipelines
      
      add fixed variance schedulers and lora loading
      
      * added kv lora attn processor
      
      * allow loading into alternative lora attn processor
      
      * make vae optional
      
      * throw away predicted variance
      
      * allow loading into added kv lora layer
      
      * allow load T5
      
      * allow pre compute text embeddings
      
      * set new variance type in schedulers
      
      * fix copies
      
      * refactor all prompt embedding code
      
      class prompts are now included in pre-encoding code
      max tokenizer length is now configurable
      embedding attention mask is now configurable
      
      * fix for when variance type is not defined on scheduler
      
      * do not pre compute validation prompt if not present
      
      * add example test for if lora dreambooth
      
      * add check for train text encoder and pre compute text embeddings
      a757b2db
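The "throw away predicted variance" step above can be illustrated with a minimal sketch. It assumes the model output stacks the noise prediction and a learned variance along the channel dimension, so the output has twice as many channels as the input. The helper name `drop_predicted_variance` and the tensor sizes are hypothetical and not taken from the commit.

```python
import torch


def drop_predicted_variance(model_pred: torch.Tensor, in_channels: int) -> torch.Tensor:
    """Keep only the noise prediction when the model also predicts a variance.

    Assumption: the output is laid out as [noise, variance] along dim=1.
    """
    if model_pred.shape[1] == 2 * in_channels:
        # Split the channel dimension in half and discard the second half.
        model_pred, _ = torch.chunk(model_pred, 2, dim=1)
    return model_pred


noisy_input = torch.randn(2, 3, 64, 64)  # (batch, channels, height, width)
raw_output = torch.randn(2, 6, 64, 64)   # hypothetical output with learned variance
noise_pred = drop_predicted_variance(raw_output, noisy_input.shape[1])
```

With the variance removed, the prediction has the same shape as the noise target, so a plain regression loss can be applied to it.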
  6. 01 May, 2023 1 commit
    • Torch compile graph fix (#3286) · 0e82fb19
      Patrick von Platen authored
      * fix more
      
      * Fix more
      
      * fix more
      
      * Apply suggestions from code review
      
      * fix
      
      * make style
      
      * make fix-copies
      
      * fix
      
      * make sure torch compile
      
      * Clean
      
      * fix test
      0e82fb19
  7. 20 Apr, 2023 1 commit
    • adding custom diffusion training to diffusers examples (#3031) · 3979aac9
      nupurkmr9 authored
      
      
      * diffusers==0.14.0 update
      
      * custom diffusion update
      
      * custom diffusion update
      
      * custom diffusion update
      
      * custom diffusion update
      
      * custom diffusion update
      
      * custom diffusion update
      
      * custom diffusion
      
      * custom diffusion
      
      * custom diffusion
      
      * custom diffusion
      
      * custom diffusion
      
      * apply formatting and get rid of bare except.
      
      * refactor readme and other minor changes.
      
      * misc refactor.
      
      * fix: repo_id issue and loaders logging bug.
      
      * fix: save_model_card.
      
      * fix: save_model_card.
      
      * fix: save_model_card.
      
      * add: doc entry.
      
      * refactor doc.
      
      * custom diffusion
      
      * custom diffusion
      
      * custom diffusion
      
      * apply style.
      
      * remove trailing whitespace.
      
      * fix: toctree entry.
      
      * remove unnecessary print.
      
      * custom diffusion
      
      * custom diffusion
      
      * custom diffusion test
      
      * custom diffusion xformer update
      
      * custom diffusion xformer update
      
      * custom diffusion xformer update
      
      ---------
      Co-authored-by: Nupur Kumari <nupurkumari@Nupurs-MacBook-Pro.local>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Nupur Kumari <nupurkumari@nupurs-mbp.wifi.local.cmu.edu>
      3979aac9
  8. 11 Apr, 2023 4 commits
    • Attn added kv processor torch 2.0 block (#3023) · ea39cd7e
      Will Berman authored
      add AttnAddedKVProcessor2_0 block
      ea39cd7e
    • Attention processor cross attention norm group norm (#3021) · 98c5e5da
      Will Berman authored
      add group norm type to attention processor cross attention norm
      
      This lets the cross attention norm use both a group norm block and a
      layer norm block.
      
      The group norm operates along the channels dimension
      and requires input shape (batch size, channels, *) whereas the layer norm with a single
      `normalized_shape` dimension only operates over the least significant
      dimension i.e. (*, channels).
      
      The channels we want to normalize are the hidden dimension of the encoder hidden states.
      
      By convention, the encoder hidden states are always passed as (batch size, sequence
      length, hidden states).
      
      This means the layer norm can operate on the tensor without modification, but the group
      norm requires flipping the last two dimensions to operate on (batch size, hidden states, sequence length).
      
      All existing attention processors will have the same logic and we can
      consolidate it in a helper function `prepare_encoder_hidden_states`
      
      prepare_encoder_hidden_states -> norm_encoder_hidden_states re: @patrickvonplaten
      
      move norm_cross defined check to outside norm_encoder_hidden_states
      
      add missing attn.norm_cross check
      98c5e5da
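The shape handling described in this commit message can be sketched as follows. This is a minimal illustration, not the library's implementation: the function reuses the helper name `norm_encoder_hidden_states` from the message, but its signature, the tensor sizes, and `num_groups=8` are assumptions chosen for the example.

```python
import torch
from torch import nn

batch_size, seq_len, hidden_dim = 2, 77, 64
# By convention: (batch size, sequence length, hidden states).
encoder_hidden_states = torch.randn(batch_size, seq_len, hidden_dim)

layer_norm = nn.LayerNorm(hidden_dim)
group_norm = nn.GroupNorm(num_groups=8, num_channels=hidden_dim)


def norm_encoder_hidden_states(states: torch.Tensor, norm: nn.Module) -> torch.Tensor:
    """Normalize (batch, seq_len, hidden_dim) states over the hidden dimension."""
    if isinstance(norm, nn.LayerNorm):
        # LayerNorm acts on the last dimension, so no reshaping is needed.
        return norm(states)
    if isinstance(norm, nn.GroupNorm):
        # GroupNorm expects channels in dim 1: (batch, hidden_dim, seq_len).
        # Swap the last two dimensions, normalize, then swap back.
        return norm(states.transpose(1, 2)).transpose(1, 2)
    raise TypeError(f"unsupported norm type: {type(norm).__name__}")


out_layer = norm_encoder_hidden_states(encoder_hidden_states, layer_norm)
out_group = norm_encoder_hidden_states(encoder_hidden_states, group_norm)
```

Both paths return a tensor in the original (batch size, sequence length, hidden states) layout, so callers do not need to know which norm type is configured.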
    • add only cross attention to simple attention blocks (#3011) · c6180a31
      Will Berman authored
      * add only cross attention to simple attention blocks
      
      * add test for only_cross_attention re: @patrickvonplaten
      
      * mid_block_only_cross_attention better default
      
      allow mid_block_only_cross_attention to default to
      `only_cross_attention` when `only_cross_attention` is given
      as a single boolean
      c6180a31
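The defaulting rule for `mid_block_only_cross_attention` can be sketched in plain Python. The helper name `resolve_only_cross_attention` and its arguments are hypothetical. The fallback to `False` when per-block values are given is an assumption, since the commit message describes only the single-boolean case.

```python
from typing import List, Optional, Sequence, Tuple, Union


def resolve_only_cross_attention(
    only_cross_attention: Union[bool, Sequence[bool]],
    mid_block_only_cross_attention: Optional[bool],
    num_down_blocks: int,
) -> Tuple[List[bool], bool]:
    """Return (per-down-block flags, mid-block flag)."""
    if isinstance(only_cross_attention, bool):
        # A single boolean also becomes the mid-block default when unset.
        if mid_block_only_cross_attention is None:
            mid_block_only_cross_attention = only_cross_attention
        per_block = [only_cross_attention] * num_down_blocks
    else:
        per_block = list(only_cross_attention)

    if mid_block_only_cross_attention is None:
        # Assumption: no single value to inherit from, so default to False.
        mid_block_only_cross_attention = False
    return per_block, mid_block_only_cross_attention
```

An explicit `mid_block_only_cross_attention` value always wins; the single boolean is used only to fill in a missing one.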
    • `AttentionProcessor.group_norm` num_channels should be `query_dim` (#3046) · 8c6b47cf
      Will Berman authored
      * `AttentionProcessor.group_norm` num_channels should be `query_dim`
      
      The group_norm on the attention processor should really norm the number
      of channels in the query _not_ the inner dim. This wasn't caught before
      because the group_norm is only used by the added kv attention processors
      and the added kv attention processors are only used by the karlo models
      which are configured such that the inner dim is the same as the query
      dim.
      
      * add_{k,v}_proj should be projecting to inner_dim
      8c6b47cf
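A small sketch of why the distinction matters once the inner dim differs from the query dim. All sizes here are hypothetical, as is the name `added_kv_proj_dim` for the input width of the added key projection; the sketch only shows the shape constraints, not the processor code itself.

```python
import torch
from torch import nn

batch_size, seq_len = 2, 16
query_dim, inner_dim = 32, 64  # hypothetical sizes with inner_dim != query_dim

# Hidden states reach the group norm with query_dim channels: (batch, query_dim, seq_len).
hidden_states = torch.randn(batch_size, query_dim, seq_len)

# Sized by query_dim, the group norm matches the incoming channels.
correct_norm = nn.GroupNorm(num_groups=8, num_channels=query_dim)
normed = correct_norm(hidden_states)

# Sized by inner_dim, the affine parameters no longer match the input channels.
wrong_norm = nn.GroupNorm(num_groups=8, num_channels=inner_dim)
try:
    wrong_norm(hidden_states)
    mismatch_detected = False
except (RuntimeError, ValueError):
    mismatch_detected = True

# The added key/value projections map encoder states to inner_dim.
added_kv_proj_dim = 48
add_k_proj = nn.Linear(added_kv_proj_dim, inner_dim)
encoder_hidden_states = torch.randn(batch_size, 10, added_kv_proj_dim)
added_keys = add_k_proj(encoder_hidden_states)
```

When `inner_dim == query_dim`, as in the karlo configuration mentioned above, both group norms have the same size, which is why the bug went unnoticed.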
  9. 10 Apr, 2023 2 commits
  10. 15 Mar, 2023 2 commits
  11. 03 Mar, 2023 1 commit
  12. 01 Mar, 2023 1 commit
  13. 17 Feb, 2023 2 commits
  14. 16 Feb, 2023 1 commit
  15. 13 Feb, 2023 1 commit
  16. 07 Feb, 2023 2 commits
    • mps cross-attention hack: don't crash on fp16 (#2258) · e619db24
      Pedro Cuenca authored
      * mps cross-attention hack: don't crash on fp16
      
      * Make conversion explicit.
      e619db24
    • Stable Diffusion Latent Upscaler (#2059) · 1051ca81
      YiYi Xu authored
      
      
      * Modify UNet2DConditionModel
      
      - allow skipping mid_block
      
      - adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`
      
      - allow user to set dimension for the timestep embedding (`time_embed_dim`)
      
      - the kernel_size for `conv_in` and `conv_out` is now configurable
      
      - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`
      
      - allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label)`
      
      - added 2 arguments `attn1_types` and `attn2_types`
      
        * currently we have the argument `only_cross_attention`: when it's set to `True`, we get a
      `BasicTransformerBlock` with 2 cross-attention layers; otherwise we
      get a self-attention followed by a cross-attention. In k-upscaler, we need blocks that include either just one cross-attention or self-attention -> cross-attention,
      so I added `attn1_types` and `attn2_types` to the unet's argument list to let the user specify the attention types for the 2 positions in each block. Note that I still kept
      the `only_cross_attention` argument on the unet for easy configuration, but it is converted to `attn1_type` and `attn2_type` when passed down to the down blocks
      
      - the position of downsample layer and upsample layer is now configurable
      
      - in k-upscaler unet, there is only one skip connection per up/down block (instead of one per layer as in the stable diffusion unet); added `skip_freq = "block"` to support
      this use case
      
      - if user passes attention_mask to unet, it will prepare the mask and pass a flag to cross attention processor to skip the `prepare_attention_mask` step
      inside cross attention block
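The `norm_group_size` rule from the list above (`num_groups` set to `num_channels//norm_group_size`) amounts to simple integer arithmetic. The sketch below is illustrative only: the helper name `num_groups_from_group_size` and the divisibility check are assumptions, since the commit message states only the integer division.

```python
def num_groups_from_group_size(num_channels: int, norm_group_size: int) -> int:
    """Derive the group count for a group norm from a fixed group size."""
    if norm_group_size <= 0:
        raise ValueError("norm_group_size must be positive")
    if num_channels % norm_group_size != 0:
        # Assumption: reject sizes that would leave groups of unequal width.
        raise ValueError(
            f"num_channels ({num_channels}) is not divisible by "
            f"norm_group_size ({norm_group_size})"
        )
    return num_channels // norm_group_size


groups_320 = num_groups_from_group_size(320, 32)
groups_640 = num_groups_from_group_size(640, 32)
```

Fixing the group size rather than the group count keeps the number of channels per group constant as the channel count grows across blocks.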
      
      add up/down blocks for k-upscaler
      
      modify CrossAttention class
      
      - make the `dropout` layer in `to_out` optional
      
      - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. note that when it's used to do cross
      attention, to_k, to_v has to be linear because the `encoder_hidden_states` is not 2d
      
      - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states
      
      - `attention_dropout`: add an optional dropout on attention score
      
      adapt BasicTransformerBlock
      
      - add an ada groupnorm layer to condition the attention input on the timestep embedding
      
      - allow skipping the FeedForward layer in between the attentions
      
      - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration
      
      update timestep embedding: add new act_fn gelu and an optional act_2
      
      modified ResnetBlock2D
      
      - refactored with AdaGroupNorm class (the timestep scale shift normalization)
      
      - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv
      
      - add option to use AdaGroupNorm on the input instead of groupnorm
      
      - add options to add a dropout layer after each conv
      
      - allow user to set the bias in conv_shortcut (needed for k-upscaler)
      
      - add gelu
      
      adding conversion script for k-upscaler unet
      
      add pipeline
      
      * fix attention mask
      
      * fix a typo
      
      * fix a bug
      
      * make sure model can be used with GPU
      
      * make pipeline work with fp16
      
      * fix an error in BasicTransformerBlock
      
      * make style
      
      * fix typo
      
      * some more fixes
      
      * uP
      
      * up
      
      * correct more
      
      * some clean-up
      
      * clean time proj
      
      * up
      
      * uP
      
      * more changes
      
      * remove the upcast_attention=True from unet config
      
      * remove attn1_types, attn2_types etc
      
      * fix
      
      * revert incorrect changes up/down samplers
      
      * make style
      
      * remove outdated files
      
      * Apply suggestions from code review
      
      * attention refactor
      
      * refactor cross attention
      
      * Apply suggestions from code review
      
      * update
      
      * up
      
      * update
      
      * Apply suggestions from code review
      
      * finish
      
      * Update src/diffusers/models/cross_attention.py
      
      * more fixes
      
      * up
      
      * up
      
      * up
      
      * finish
      
      * more corrections of conversion state
      
      * act_2 -> act_2_fn
      
      * remove dropout_after_conv from ResnetBlock2D
      
      * make style
      
      * simplify KAttentionBlock
      
      * add fast test for latent upscaler pipeline
      
      * add slow test
      
      * slow test fp16
      
      * make style
      
      * add doc string for pipeline_stable_diffusion_latent_upscale
      
      * add api doc page for latent upscaler pipeline
      
      * deprecate attention mask
      
      * clean up embeddings
      
      * simplify resnet
      
      * up
      
      * clean up resnet
      
      * up
      
      * correct more
      
      * up
      
      * up
      
      * improve a bit more
      
      * correct more
      
      * more clean-ups
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * add docstrings for new unet config
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * # Copied from
      
      * encode the image if not latent
      
      * remove force casting vae to fp32
      
      * fix
      
      * add comments about preconditioning parameters from k-diffusion paper
      
      * attn1_type, attn2_type -> add_self_attention
      
      * clean up get_down_block and get_up_block
      
      * fix
      
      * fixed a typo(?) in ada group norm
      
      * update slice attention processor for cross attention
      
      * update slice
      
      * fix fast test
      
      * update the checkpoint
      
      * finish tests
      
      * fix-copies
      
      * fix-copy for modeling_text_unet.py
      
      * make style
      
      * make style
      
      * fix f-string
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix import
      
      * correct changes
      
      * fix resnet
      
      * make fix-copies
      
      * correct euler scheduler
      
      * add missing #copied from for preprocess
      
      * revert
      
      * fix
      
      * fix copies
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/models/cross_attention.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * clean up conversion script
      
      * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D
      
      * more
      
      * Update src/diffusers/models/unet_2d_condition.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * remove prepare_extra_step_kwargs
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix a typo in timestep embedding
      
      * remove num_image_per_prompt
      
      * fix fasttest
      
      * make style + fix-copies
      
      * fix
      
      * fix xformer test
      
      * fix style
      
      * doc string
      
      * make style
      
      * fix-copies
      
      * docstring for time_embedding_norm
      
      * make style
      
      * final finishes
      
      * make fix-copies
      
      * fix tests
      
      ---------
      Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      1051ca81
  17. 03 Feb, 2023 1 commit
    • Fixes LoRAXFormersCrossAttnProcessor (#2207) · 58c416ab
      Jorge C. Gomes authored
      Related to #2124
      The current implementation throws a shape mismatch error, which makes sense, as this line is obviously missing compared to `XFormersCrossAttnProcessor` and `LoRACrossAttnProcessor`.
      
      I don't have formal tests, but I compared `LoRACrossAttnProcessor` and `LoRAXFormersCrossAttnProcessor` ad-hoc, and they produce the same results with this fix.
      58c416ab
  18. 01 Feb, 2023 1 commit
  19. 27 Jan, 2023 3 commits
  20. 26 Jan, 2023 2 commits
  21. 24 Jan, 2023 1 commit
  22. 18 Jan, 2023 1 commit
  23. 16 Jan, 2023 2 commits
  24. 20 Dec, 2022 1 commit