1. 24 Oct, 2023 1 commit
  2. 13 Oct, 2023 1 commit
  3. 11 Oct, 2023 1 commit
  4. 11 Sep, 2023 1 commit
    • Lazy Import for Diffusers (#4829) · b6e0b016
      Dhruv Nair authored
      
      
      * initial commit
      
      * move modules to import struct
      
      * add dummy objects and _LazyModule
      
      * add lazy import to schedulers
      
      * clean up unused imports
      
      * lazy import on models module
      
      * lazy import for schedulers module
      
      * add lazy import to pipelines module
      
      * lazy import altdiffusion
      
      * lazy import audio diffusion
      
      * lazy import audioldm
      
      * lazy import consistency model
      
      * lazy import controlnet
      
      * lazy import dance diffusion ddim ddpm
      
      * lazy import deepfloyd
      
      * lazy import kandinsky
      
      * lazy imports
      
      * lazy import semantic diffusion
      
      * lazy imports
      
      * lazy import stable diffusion
      
      * move sd output to its own module
      
      * clean up
      
      * lazy import t2iadapter
      
      * lazy import unclip
      
      * lazy import versatile and vq diffusion
      
      * lazy import vq diffusion
      
      * helper to fetch objects from modules
      
      * lazy import sdxl
      
      * lazy import txt2vid
      
      * lazy import stochastic karras
      
      * fix model imports
      
      * fix bug
      
      * lazy import
      
      * clean up
      
      * clean up
      
      * fixes for tests
      
      * fixes for tests
      
      * clean up
      
      * remove import of torch_utils from utils module
      
      * clean up
      
      * clean up
      
      * fix mistake import statement
      
      * dedicated modules for exporting and loading
      
      * remove testing utils from utils module
      
      * fixes from merge conflicts
      
      * Update src/diffusers/pipelines/kandinsky2_2/__init__.py
      
      * fix docs
      
      * fix alt diffusion copied from
      
      * fix check dummies
      
      * fix more docs
      
      * remove accelerate import from utils module
      
      * add type checking
      
      * make style
      
      * fix check dummies
      
      * remove torch import from xformers check
      
      * clean up error message
      
      * fixes after upstream merges
      
      * dummy objects fix
      
      * fix tests
      
      * remove unused module import
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      b6e0b016
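      A minimal sketch of the lazy-import idea this commit rolls out across the package __init__.py files, using PEP 562 module-level __getattr__. The real diffusers implementation relies on its own _LazyModule helper plus dummy objects for missing backends; the entries below are illustrative only.

          # __init__.py — minimal lazy-import sketch (illustrative, not the actual diffusers code)
          import importlib

          # Map each public symbol to the submodule that defines it (example entries).
          _import_structure = {
              "UNet2DConditionModel": "models",
              "DDIMScheduler": "schedulers",
          }

          __all__ = list(_import_structure)


          def __getattr__(name):
              # Import the owning submodule only when one of its symbols is first
              # requested, so `import package` stays cheap (heavy deps load lazily).
              if name in _import_structure:
                  module = importlib.import_module(f".{_import_structure[name]}", __name__)
                  return getattr(module, name)
              raise AttributeError(f"module {__name__!r} has no attribute {name!r}")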
  5. 04 Sep, 2023 1 commit
    • [Core] LoRA improvements pt. 3 (#4842) · c81a88b2
      Sayak Paul authored
      
      
      * throw warning when more than one lora is attempted to be fused.
      
      * introduce support of lora scale during fusion.
      
      * change test name
      
      * changes
      
      * change to _lora_scale
      
      * lora_scale to call whenever applicable.
      
      * debugging
      
      * lora_scale additional.
      
      * cross_attention_kwargs
      
      * lora_scale -> scale.
      
      * lora_scale fix
      
      * lora_scale in patched projection.
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * styling.
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * remove unneeded prints.
      
      * remove unneeded prints.
      
      * assign cross_attention_kwargs.
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * clean up.
      
      * refactor scale retrieval logic a bit.
      
      * fix nonetype
      
      * fix: tests
      
      * add more tests
      
      * more fixes.
      
      * figure out a way to pass lora_scale.
      
      * Apply suggestions from code review
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * unify the retrieval logic of lora_scale.
      
      * move adjust_lora_scale_text_encoder to lora.py.
      
      * introduce dynamic adjustment lora scale support to sd
      
      * fix up copies
      
      * Empty-Commit
      
      * add: test to check fusion equivalence on different scales.
      
      * handle lora fusion warning.
      
      * make lora smaller
      
      * make lora smaller
      
      * make lora smaller
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      c81a88b2
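      What the scale plumbing above enables at the call site: a LoRA scale can be passed per call through cross_attention_kwargs, or baked in when fusing. A usage sketch; the checkpoint id and LoRA path are placeholders, and exact keyword names may differ across diffusers versions.

          import torch
          from diffusers import StableDiffusionPipeline

          pipe = StableDiffusionPipeline.from_pretrained(
              "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
          ).to("cuda")
          pipe.load_lora_weights("path/to/lora")  # placeholder LoRA weights

          # Dynamic scaling at inference time: the scale is routed via
          # cross_attention_kwargs down to the attention layers / text encoder.
          image = pipe("a pixel-art castle", cross_attention_kwargs={"scale": 0.5}).images[0]

          # Or fuse the LoRA into the base weights at a chosen scale; fusing more
          # than one LoRA at a time triggers the warning added in this PR.
          pipe.fuse_lora(lora_scale=0.7)
          image = pipe("a pixel-art castle").images[0]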
  6. 01 Sep, 2023 1 commit
    • Add GLIGEN Text Image implementation (#4777) · 38466c36
      Nguyễn Công Tú Anh authored
      * Add GLIGEN Text Image implementation
      
      * add style transfer from image
      
      * fix check_repository_consistency
      
      * add convert script GLIGEN model to Diffusers
      
      * rename attention type
      
      * fix style code
      
      * remove PositionNetTextImage
      
      * Revert "fix check_repository_consistency"
      
      This reverts commit 15f098c96e00bb9e67b831161615b30a2d28d815.
      
      * change attention type name
      
      * update docs for GLIGEN
      
      * change examples with hf-document-image
      
      * fix style
      
      * add CLIPImageProjection for GLIGEN
      
      * Add new encode_prompt, load project matrix in pipe init
      
      * move CLIPImageProjection to stable_diffusion
      
      * add comment
      38466c36
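      A rough usage sketch of the text-image grounded pipeline added here. The checkpoint id, image URL, and exact keyword names are assumptions based on the docs of the time; treat them as placeholders.

          import torch
          from diffusers import StableDiffusionGLIGENTextImagePipeline
          from diffusers.utils import load_image

          # Checkpoint id is an assumption/placeholder.
          pipe = StableDiffusionGLIGENTextImagePipeline.from_pretrained(
              "anhnct/Gligen_Text_Image", torch_dtype=torch.float16
          ).to("cuda")

          # Ground part of the prompt on a reference image placed inside a bounding
          # box; boxes are normalized [x0, y0, x1, y1] coordinates.
          reference = load_image("https://example.com/backpack.png")  # placeholder URL
          image = pipe(
              prompt="a backpack on a park bench",
              gligen_phrases=["a backpack"],
              gligen_images=[reference],
              gligen_boxes=[[0.3, 0.4, 0.7, 0.8]],
              gligen_scheduled_sampling_beta=1.0,
              num_inference_steps=50,
          ).images[0]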
  7. 16 Aug, 2023 1 commit
    • Add GLIGEN implementation (#4441) · da5ab51d
      nikhil-masterful authored
      * Add GLIGEN implementation
      
      * GLIGEN: Fix code quality check failures
      
      * GLIGEN: Fix Import block un-sorted or un-formatted failures
      
      * GLIGEN: Fix check_repository_consistency failures
      
      * GLIGEN: Add 'PositionNet' to versatile_diffusion/modeling_text_unet.py
      
      * GLIGEN: check_repository_consistency: fix 'copy does not match' error
      
      * GLIGEN: Fix review comments (1)
      
      * GLIGEN: Fix E721 Do not compare types, use `isinstance()` failures
      
      * GLIGEN : Ensure _encode_prompt() copy matches to StableDiffusionPipeline
      
      * GLIGEN: Fix ruff E721 failure in unidiffuser/test_unidiffuser.py
      
      * GLIGEN: doc_builder: restyle pipeline_stable_diffusion_gligen.py
      
      * GLIGEN: reset files unrelated to gligen
      
      * GLIGEN: Fix documentation comments (1)
      
      * GLIGEN: Fix review comments (2)
      
      * GLIGEN: Added FastTest
      
      * GLIGEN: Fix review comments (3)
      da5ab51d
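      For the text-grounded variant, each phrase in the prompt is tied to a bounding box. A usage sketch; the checkpoint id is an assumption of the one documented at the time.

          import torch
          from diffusers import StableDiffusionGLIGENPipeline

          pipe = StableDiffusionGLIGENPipeline.from_pretrained(
              "masterful/gligen-1-4-generation-text-box", torch_dtype=torch.float16
          ).to("cuda")

          # Each phrase is grounded in its normalized [x0, y0, x1, y1] bounding box.
          image = pipe(
              prompt="a birthday cake and a teapot on a wooden table",
              gligen_phrases=["a birthday cake", "a teapot"],
              gligen_boxes=[[0.1, 0.5, 0.45, 0.9], [0.55, 0.5, 0.9, 0.9]],
              gligen_scheduled_sampling_beta=1.0,
              num_inference_steps=50,
          ).images[0]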
  8. 25 Jul, 2023 1 commit
  9. 03 Jul, 2023 1 commit
  10. 05 Jun, 2023 1 commit
  11. 22 May, 2023 1 commit
    • Support for cross-attention bias / mask (#2634) · 64bf5d33
      Birch-san authored
      
      
      * Cross-attention masks
      
      prefer qualified symbol, fix accidental Optional
      
      prefer qualified symbol in AttentionProcessor
      
      prefer qualified symbol in embeddings.py
      
      qualified symbol in transformer_2d
      
      qualify FloatTensor in unet_2d_blocks
      
      move new transformer_2d params attention_mask, encoder_attention_mask to the end of the section which is assumed (e.g. by functions such as checkpoint()) to have a stable positional param interface. regard return_dict as a special-case which is assumed to be injected separately from positional params (e.g. by create_custom_forward()).
      
      move new encoder_attention_mask param to end of CrossAttn block interfaces and Unet2DCondition interface, to maintain positional param interface.
      
      regenerate modeling_text_unet.py
      
      remove unused import
      
      unet_2d_condition encoder_attention_mask docs
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      versatile_diffusion/modeling_text_unet.py encoder_attention_mask docs
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      transformer_2d encoder_attention_mask docs
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      unet_2d_blocks.py: add parameter name comments
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      revert description. bool-to-bias treatment happens in unet_2d_condition only.
      
      comment parameter names
      
      fix copies, style
      
      * encoder_attention_mask for SimpleCrossAttnDownBlock2D, SimpleCrossAttnUpBlock2D
      
      * encoder_attention_mask for UNetMidBlock2DSimpleCrossAttn
      
      * support attention_mask, encoder_attention_mask in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D, KAttentionBlock. fix binding of attention_mask, cross_attention_kwargs params in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D checkpoint invocations.
      
      * fix mistake made during merge conflict resolution
      
      * regenerate versatile_diffusion
      
      * pass time embedding into checkpointed attention invocation
      
      * always assume encoder_attention_mask is a mask (i.e. not a bias).
      
      * style, fix-copies
      
      * add tests for cross-attention masks
      
      * add test for padding of attention mask
      
      * explain mask's query_tokens dim. fix explanation about broadcasting over channels; we actually broadcast over query tokens
      
      * support both masks and biases in Transformer2DModel#forward. document behaviour
      
      * fix-copies
      
      * delete attention_mask docs on the basis I never tested self-attention masking myself. not comfortable explaining it, since I don't actually understand how a self-attn mask can work in its current form: the key length will be different in every ResBlock (we don't downsample the mask when we downsample the image).
      
      * review feedback: the standard Unet blocks shouldn't pass temb to attn (only to resnet). remove from KCrossAttnDownBlock2D,KCrossAttnUpBlock2D#forward.
      
      * remove encoder_attention_mask param from SimpleCrossAttn{Up,Down}Block2D,UNetMidBlock2DSimpleCrossAttn, and mask-choice in those blocks' #forward, on the basis that they only do one type of attention, so the consumer can pass whichever type of attention_mask is appropriate.
      
      * put attention mask padding back to how it was (since the SD use-case it enabled wasn't important, and it breaks the original unclip use-case). disable the test which was added.
      
      * fix-copies
      
      * style
      
      * fix-copies
      
      * put encoder_attention_mask param back into Simple block forward interfaces, to ensure consistency of forward interface.
      
      * restore passing of emb to KAttentionBlock#forward, on the basis that removal caused test failures. restore also the passing of emb to checkpointed calls to KAttentionBlock#forward.
      
      * make simple unet2d blocks use encoder_attention_mask, but only when attention_mask is None. this should fix UnCLIP compatibility.
      
      * fix copies
      64bf5d33
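      A minimal sketch of the new encoder_attention_mask argument on the conditional UNet's forward pass. Shapes are illustrative for an SD-1.x-sized UNet and the checkpoint id is only an example.

          import torch
          from diffusers import UNet2DConditionModel

          unet = UNet2DConditionModel.from_pretrained(
              "runwayml/stable-diffusion-v1-5", subfolder="unet"
          )

          sample = torch.randn(1, 4, 64, 64)               # noisy latents
          timestep = torch.tensor([10])
          encoder_hidden_states = torch.randn(1, 77, 768)  # padded text embeddings

          # Boolean mask over the 77 text tokens: True = attend, False = ignore padding.
          # The UNet converts it to an additive bias on the cross-attention scores.
          encoder_attention_mask = torch.zeros(1, 77, dtype=torch.bool)
          encoder_attention_mask[:, :20] = True

          with torch.no_grad():
              out = unet(
                  sample,
                  timestep,
                  encoder_hidden_states=encoder_hidden_states,
                  encoder_attention_mask=encoder_attention_mask,
              ).sample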
  12. 12 May, 2023 1 commit
  13. 01 May, 2023 1 commit
    • Torch compile graph fix (#3286) · 0e82fb19
      Patrick von Platen authored
      * fix more
      
      * Fix more
      
      * fix more
      
      * Apply suggestions from code review
      
      * fix
      
      * make style
      
      * make fix-copies
      
      * fix
      
      * make sure torch compile
      
      * Clean
      
      * fix test
      0e82fb19
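      The kind of usage this fix targets: compiling the UNet with torch.compile without graph breaks. The checkpoint id is illustrative.

          import torch
          from diffusers import StableDiffusionPipeline

          pipe = StableDiffusionPipeline.from_pretrained(
              "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
          ).to("cuda")

          # fullgraph=True fails loudly on graph breaks, which is what this PR cleans up.
          pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

          image = pipe("an astronaut riding a horse on mars").images[0]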
  14. 28 Apr, 2023 1 commit
  15. 24 Apr, 2023 1 commit
  16. 22 Apr, 2023 1 commit
  17. 11 Apr, 2023 1 commit
    • Fix typo and format BasicTransformerBlock attributes (#2953) · 52c4d32d
      Chanchana Sornsoontorn authored
      * chore(train_controlnet) fix typo in logger message
      
      * chore(models) refactor module order; make it match the calling order
      
      When printing the BasicTransformerBlock to stdout, it's important that the attributes are shown in the proper order. Also, the "3. Feed Forward" comment previously made no sense: it should sit next to self.ff, but it was instead placed next to self.norm3.
      
      * correct many tests
      
      * remove bogus file
      
      * make style
      
      * correct more tests
      
      * finish tests
      
      * fix one more
      
      * make style
      
      * make unclip deterministic
      
      * chore(models/attention) reorganize comments in BasicTransformerBlock class
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      52c4d32d
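      Why the attribute order matters: printing the block lists submodules in definition order, so it should mirror the call order (norm1 -> attn1 -> norm2 -> attn2 -> norm3 -> ff). A small sketch; the constructor arguments and import path are illustrative and may differ between diffusers versions.

          from diffusers.models.attention import BasicTransformerBlock

          block = BasicTransformerBlock(
              dim=320, num_attention_heads=8, attention_head_dim=40, cross_attention_dim=768
          )
          print(block)  # submodules now appear in the order they are used in forward()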
  18. 22 Mar, 2023 1 commit
    • [MS Text To Video] Add first text to video (#2738) · ca1a2229
      Patrick von Platen authored
      
      
      * [MS Text To Video] Add first text to video
      
      * upload
      
      * make first model example
      
      * match unet3d params
      
      * make sure weights are correctly converted
      
      * improve
      
      * forward pass works, but diff result
      
      * make forward work
      
      * fix more
      
      * finish
      
      * refactor video output class.
      
      * feat: add support for a video export utility.
      
      * fix: opencv availability check.
      
      * run make fix-copies.
      
      * add: docs for the model components.
      
      * add: standalone pipeline doc.
      
      * edit docstring of the pipeline.
      
      * add: right path to TransformerTempModel
      
      * add: first set of tests.
      
      * complete fast tests for text to video.
      
      * fix bug
      
      * up
      
      * three fast tests failing.
      
      * add: note on slow tests
      
      * make work with all schedulers
      
      * apply styling.
      
      * add slow tests
      
      * change file name
      
      * update
      
      * more correction
      
      * more fixes
      
      * finish
      
      * up
      
      * Apply suggestions from code review
      
      * up
      
      * finish
      
      * make copies
      
      * fix pipeline tests
      
      * fix more tests
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * apply suggestions
      
      * up
      
      * revert
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      ca1a2229
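      A usage sketch of the text-to-video pipeline and the video export utility added here; the checkpoint id is the one the docs referenced at the time, and the output handling may differ in later versions.

          import torch
          from diffusers import DiffusionPipeline
          from diffusers.utils import export_to_video

          pipe = DiffusionPipeline.from_pretrained(
              "damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16, variant="fp16"
          ).to("cuda")

          video_frames = pipe("a panda playing guitar", num_inference_steps=25).frames
          video_path = export_to_video(video_frames)  # writes an .mp4 (requires opencv)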
  19. 21 Mar, 2023 1 commit
  20. 15 Mar, 2023 1 commit
  21. 13 Mar, 2023 1 commit
  22. 01 Mar, 2023 1 commit
  23. 07 Feb, 2023 1 commit
    • Stable Diffusion Latent Upscaler (#2059) · 1051ca81
      YiYi Xu authored
      
      
      * Modify UNet2DConditionModel
      
      - allow skipping mid_block
      
      - adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`
      
      - allow user to set dimension for the timestep embedding (`time_embed_dim`)
      
      - the kernel_size for `conv_in` and `conv_out` is now configurable
      
      - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`
      
      - allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label))`
      
      - added 2 arguments `attn1_types` and `attn2_types`
      
        * currently we have the argument `only_cross_attention`: when it's set to `True`, the
      `BasicTransformerBlock` contains 2 cross-attention layers; otherwise we
      get a self-attention followed by a cross-attention. In the k-upscaler we need blocks that include just one cross-attention, or self-attention -> cross-attention,
      so I added `attn1_types` and `attn2_types` to the unet's argument list to let the user specify the attention types for the 2 positions in each block; note that I still kept
      the `only_cross_attention` argument for the unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passed down to the down blocks
      
      - the position of downsample layer and upsample layer is now configurable
      
      - in the k-upscaler unet, there is only one skip connection per up/down block (instead of per layer as in the stable diffusion unet); added `skip_freq = "block"` to support
      this use case
      
      - if user passes attention_mask to unet, it will prepare the mask and pass a flag to cross attention processer to skip the `prepare_attention_mask` step
      inside cross attention block
      
      add up/down blocks for k-upscaler
      
      modify CrossAttention class
      
      - make the `dropout` layer in `to_out` optional
      
      - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. note that when it's used for cross
      attention, to_k and to_v have to be linear because the `encoder_hidden_states` is not 2d
      
      - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states
      
      - `attention_dropout`: add an optional dropout on attention score
      
      adapt BasicTransformerBlock
      
      - add an ada groupnorm layer to condition the attention input on the timestep embedding
      
      - allow skipping the FeedForward layer in between the attentions
      
      - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration
      
      update timestep embedding: add new act_fn  gelu and an optional act_2
      
      modified ResnetBlock2D
      
      - refactored with AdaGroupNorm class (the timestep scale shift normalization)
      
      - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv
      
      - add option to use input AdaGroupNorm on the input instead of groupnorm
      
      - add options to add a dropout layer after each conv
      
      - allow user to set the bias in conv_shortcut (needed for k-upscaler)
      
      - add gelu
      
      adding conversion script for k-upscaler unet
      
      add pipeline
      
      * fix attention mask
      
      * fix a typo
      
      * fix a bug
      
      * make sure model can be used with GPU
      
      * make pipeline work with fp16
      
      * fix an error in BasicTransformerBlock
      
      * make style
      
      * fix typo
      
      * some more fixes
      
      * uP
      
      * up
      
      * correct more
      
      * some clean-up
      
      * clean time proj
      
      * up
      
      * uP
      
      * more changes
      
      * remove the upcast_attention=True from unet config
      
      * remove attn1_types, attn2_types etc
      
      * fix
      
      * revert incorrect changes up/down samplers
      
      * make style
      
      * remove outdated files
      
      * Apply suggestions from code review
      
      * attention refactor
      
      * refactor cross attention
      
      * Apply suggestions from code review
      
      * update
      
      * up
      
      * update
      
      * Apply suggestions from code review
      
      * finish
      
      * Update src/diffusers/models/cross_attention.py
      
      * more fixes
      
      * up
      
      * up
      
      * up
      
      * finish
      
      * more corrections of conversion state
      
      * act_2 -> act_2_fn
      
      * remove dropout_after_conv from ResnetBlock2D
      
      * make style
      
      * simplify KAttentionBlock
      
      * add fast test for latent upscaler pipeline
      
      * add slow test
      
      * slow test fp16
      
      * make style
      
      * add doc string for pipeline_stable_diffusion_latent_upscale
      
      * add api doc page for latent upscaler pipeline
      
      * deprecate attention mask
      
      * clean up embeddings
      
      * simplify resnet
      
      * up
      
      * clean up resnet
      
      * up
      
      * correct more
      
      * up
      
      * up
      
      * improve a bit more
      
      * correct more
      
      * more clean-ups
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * add docstrings for new unet config
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * # Copied from
      
      * encode the image if not latent
      
      * remove force casting vae to fp32
      
      * fix
      
      * add comments about preconditioning parameters from k-diffusion paper
      
      * attn1_type, attn2_type -> add_self_attention
      
      * clean up get_down_block and get_up_block
      
      * fix
      
      * fixed a typo(?) in ada group norm
      
      * update slice attention processer for cross attention
      
      * update slice
      
      * fix fast test
      
      * update the checkpoint
      
      * finish tests
      
      * fix-copies
      
      * fix-copy for modeling_text_unet.py
      
      * make style
      
      * make style
      
      * fix f-string
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix import
      
      * correct changes
      
      * fix resnet
      
      * make fix-copies
      
      * correct euler scheduler
      
      * add missing #copied from for preprocess
      
      * revert
      
      * fix
      
      * fix copies
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/models/cross_attention.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * clean up conversion script
      
      * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D
      
      * more
      
      * Update src/diffusers/models/unet_2d_condition.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * remove prepare_extra_step_kwargs
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix a typo in timestep embedding
      
      * remove num_image_per_prompt
      
      * fix fasttest
      
      * make style + fix-copies
      
      * fix
      
      * fix xformer test
      
      * fix style
      
      * doc string
      
      * make style
      
      * fix-copies
      
      * docstring for time_embedding_norm
      
      * make style
      
      * final finishes
      
      * make fix-copies
      
      * fix tests
      
      ---------
      Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      1051ca81
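      A usage sketch of the latent upscaler chained after a base Stable Diffusion pipeline, keeping the intermediate result in latent space; checkpoint ids are the ones documented at the time.

          import torch
          from diffusers import StableDiffusionPipeline, StableDiffusionLatentUpscalePipeline

          pipe = StableDiffusionPipeline.from_pretrained(
              "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
          ).to("cuda")
          upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
              "stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16
          ).to("cuda")

          prompt = "a photo of an astronaut riding a horse on mars"
          # Keep the base output as latents and feed them straight to the upscaler.
          low_res_latents = pipe(prompt, output_type="latent").images
          image = upscaler(
              prompt=prompt, image=low_res_latents, num_inference_steps=20, guidance_scale=0
          ).images[0]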
  24. 27 Jan, 2023 2 commits
  25. 24 Jan, 2023 1 commit
  26. 17 Jan, 2023 1 commit
    • DiT Pipeline (#1806) · 37d113cc
      Kashif Rasul authored
      
      
      * added dit model
      
      * import
      
      * initial pipeline
      
      * initial convert script
      
      * initial pipeline
      
      * make style
      
      * raise valueerror
      
      * single function
      
      * rename classes
      
      * use DDIMScheduler
      
      * timesteps embedder
      
      * samples to cpu
      
      * fix var names
      
      * fix numpy type
      
      * use timesteps class for proj
      
      * fix typo
      
      * fix arg name
      
      * flip_sin_to_cos and better var names
      
      * fix C shape cal
      
      * make style
      
      * remove unused imports
      
      * cleanup
      
      * add back patch_size
      
      * initial dit doc
      
      * typo
      
      * Update docs/source/api/pipelines/dit.mdx
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * added copyright license headers
      
      * added example usage and toc
      
      * fix variable names asserts
      
      * remove comment
      
      * added docs
      
      * fix typo
      
      * upstream changes
      
      * set proper device for drop_ids
      
      * added initial dit pipeline test
      
      * update docs
      
      * fix imports
      
      * make fix-copies
      
      * isort
      
      * fix imports
      
      * get rid of more magic numbers
      
      * fix code when guidance is off
      
      * remove block_kwargs
      
      * cleanup script
      
      * removed to_2tuple
      
      * use FeedForward class instead of another MLP
      
      * style
      
      * work on merging DiTBlock with BasicTransformerBlock
      
      * added missing final_dropout and args to BasicTransformerBlock
      
      * use norm from block
      
      * fix arg
      
      * remove unused arg
      
      * fix call to class_embedder
      
      * use timesteps
      
      * make style
      
      * attn_output gets multiplied
      
      * removed commented code
      
      * use Transformer2D
      
      * use self.is_input_patches
      
      * fix flags
      
      * fixed conversion to use Transformer2DModel
      
      * fixes for pipeline
      
      * remove dit.py
      
      * fix timesteps device
      
      * use randn_tensor and fix fp16 inf.
      
      * timesteps_emb already the right dtype
      
      * fix dit test class
      
      * fix test and style
      
      * fix norm2 usage in vq-diffusion
      
      * added author names to pipeline and ImageNet labels link
      
      * fix tests
      
      * use norm_type as string
      
      * rename dit to transformer
      
      * fix name
      
      * fix test
      
      * set norm_type = "layer" by default
      
      * fix tests
      
      * do not skip common tests
      
      * Update src/diffusers/models/attention.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * revert AdaLayerNorm API
      
      * fix norm_type name
      
      * make sure all components are in eval mode
      
      * revert norm2 API
      
      * compact
      
      * finish deprecation
      
      * add slow tests
      
      * remove @
      
      * refactor some stuff
      
      * upload
      
      * Update src/diffusers/pipelines/dit/pipeline_dit.py
      
      * finish more
      
      * finish docs
      
      * improve docs
      
      * finish docs
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: William Berman <WLBberman@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      37d113cc
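      A usage sketch of the class-conditional DiT pipeline; the checkpoint id and label names follow the docs of the time.

          import torch
          from diffusers import DiTPipeline, DPMSolverMultistepScheduler

          pipe = DiTPipeline.from_pretrained("facebook/DiT-XL-2-256", torch_dtype=torch.float16).to("cuda")
          pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

          # DiT is conditioned on ImageNet classes; map label names to class ids first.
          class_ids = pipe.get_label_ids(["white shark", "golden retriever"])
          images = pipe(class_labels=class_ids, num_inference_steps=25).images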
  27. 16 Jan, 2023 1 commit
  28. 01 Jan, 2023 1 commit
  29. 30 Dec, 2022 1 commit
  30. 28 Dec, 2022 1 commit
  31. 27 Dec, 2022 1 commit
  32. 20 Dec, 2022 3 commits
  33. 19 Dec, 2022 3 commits
  34. 18 Dec, 2022 1 commit
    • kakaobrain unCLIP (#1428) · 2dcf64b7
      Will Berman authored
      
      
      * [wip] attention block updates
      
      * [wip] unCLIP unet decoder and super res
      
      * [wip] unCLIP prior transformer
      
      * [wip] scheduler changes
      
      * [wip] text proj utility class
      
      * [wip] UnCLIPPipeline
      
      * [wip] kakaobrain unCLIP convert script
      
      * [unCLIP pipeline] fixes re: @patrickvonplaten
      
      remove callbacks
      
      move denoising loops into call function
      
      * UNCLIPScheduler re: @patrickvonplaten
      
      Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
      DDPM scheduler with changes to support karlo
      
      * mask -> attention_mask re: @patrickvonplaten
      
      * [DDPMScheduler] remove leftover change
      
      * [docs] PriorTransformer
      
      * [docs] UNet2DConditionModel and UNet2DModel
      
      * [nit] UNCLIPScheduler -> UnCLIPScheduler
      
      matches existing unclip naming better
      
      * [docs] SchedulingUnCLIP
      
      * [docs] UnCLIPTextProjModel
      
      * refactor
      
      * finish licenses
      
      * rename all to attention_mask and prep in models
      
      * more renaming
      
      * don't expose unused configs
      
      * final renaming fixes
      
      * remove x attn mask when not necessary
      
      * configure kakao script to use new class embedding config
      
      * fix copies
      
      * [tests] UnCLIPScheduler
      
      * finish x attn
      
      * finish
      
      * remove more
      
      * rename condition blocks
      
      * clean more
      
      * Apply suggestions from code review
      
      * up
      
      * fix
      
      * [tests] UnCLIPPipelineFastTests
      
      * remove unused imports
      
      * [tests] UnCLIPPipelineIntegrationTests
      
      * correct
      
      * make style
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      2dcf64b7
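      A usage sketch of the unCLIP (Karlo) pipeline added here; the checkpoint id is the kakaobrain release the PR targets, and the step counts are illustrative.

          import torch
          from diffusers import UnCLIPPipeline

          pipe = UnCLIPPipeline.from_pretrained(
              "kakaobrain/karlo-v1-alpha", torch_dtype=torch.float16
          ).to("cuda")

          image = pipe(
              "a high-resolution photograph of a big red frog on a green leaf",
              prior_num_inference_steps=25,
              decoder_num_inference_steps=25,
              super_res_num_inference_steps=7,
          ).images[0]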
  35. 09 Dec, 2022 1 commit