1. 11 Apr, 2023 1 commit
    • Patrick von Platen's avatar
      Fix config prints and save, load of pipelines (#2849) · 8b451eb6
      Patrick von Platen authored
      * [Config] Fix config prints and save, load
      
      * Only use potential nn.Modules for dtype and device
      
      * Correct vae image processor
      
      * make sure in_channels is not accessed directly
      
      * make sure in channels is only accessed via config
      
      * Make sure schedulers only access config attributes
      
      * Make sure to access config in SAG
      
      * Fix vae processor and make style
      
      * add tests
      
      * uP
      
      * make style
      
      * Fix more naming issues
      
      * Final fix with vae config
      
      * change more
      8b451eb6
  2. 10 Apr, 2023 3 commits
  3. 27 Mar, 2023 1 commit
  4. 23 Mar, 2023 1 commit
    • Sanchit Gandhi's avatar
      Add AudioLDM (#2232) · b94880e5
      Sanchit Gandhi authored
      
      
      * Add AudioLDM
      
      * up
      
      * add vocoder
      
      * start unet
      
      * unconditional unet
      
      * clap, vocoder and vae
      
      * clean-up: conversion scripts
      
      * fix: conversion script token_type_ids
      
      * clean-up: pipeline docstring
      
      * tests: from SD
      
      * clean-up: cpu offload vocoder instead of safety checker
      
      * feat: adapt tests to audioldm
      
      * feat: add docs
      
      * clean-up: amend pipeline docstrings
      
      * clean-up: make style
      
      * clean-up: make fix-copies
      
      * fix: add doc path to toctree
      
      * clean-up: args for conversion script
      
      * clean-up: paths to checkpoints
      
      * fix: use conditional unet
      
      * clean-up: make style
      
      * fix: type hints for UNet
      
      * clean-up: docstring for UNet
      
      * clean-up: make style
      
      * clean-up: remove duplicate in docstring
      
      * clean-up: make style
      
      * clean-up: make fix-copies
      
      * clean-up: move imports to start in code snippet
      
      * fix: pass cross_attention_dim as a list/tuple to unet
      
      * clean-up: make fix-copies
      
      * fix: update checkpoint path
      
      * fix: unet cross_attention_dim in tests
      
      * film embeddings -> class embeddings
      
      * Apply suggestions from code review
      Co-authored-by: default avatarWill Berman <wlbberman@gmail.com>
      
      * fix: unet film embed to use existing args
      
      * fix: unet tests to use existing args
      
      * fix: make style
      
      * fix: transformers import and version in init
      
      * clean-up: make style
      
      * Revert "clean-up: make style"
      
      This reverts commit 5d6d1f8b324f5583e7805dc01e2c86e493660d66.
      
      * clean-up: make style
      
      * clean-up: use pipeline tester mixin tests where poss
      
      * clean-up: skip attn slicing test
      
      * fix: add torch dtype to docs
      
      * fix: remove conversion script out of src
      
      * fix: remove .detach from 1d waveform
      
      * fix: reduce default num inf steps
      
      * fix: swap height/width -> audio_length_in_s
      
      * clean-up: make style
      
      * fix: remove nightly tests
      
      * fix: imports in conversion script
      
      * clean-up: slim-down to two slow tests
      
      * clean-up: slim-down fast tests
      
      * fix: batch consistent tests
      
      * clean-up: make style
      
      * clean-up: remove vae slicing fast test
      
      * clean-up: propagate changes to doc
      
      * fix: increase test tol to 1e-2
      
      * clean-up: finish docs
      
      * clean-up: make style
      
      * feat: vocoder / VAE compatibility check
      
      * feat: possibly expand / cut audio waveform
      
      * fix: pipeline call signature test
      
      * fix: slow tests output len
      
      * clean-up: make style
      
      * make style
      
      ---------
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarWilliam Berman <WLBberman@gmail.com>
      b94880e5
  5. 21 Mar, 2023 1 commit
  6. 15 Mar, 2023 1 commit
  7. 14 Mar, 2023 1 commit
  8. 07 Mar, 2023 1 commit
  9. 02 Mar, 2023 1 commit
    • Takuma Mori's avatar
      Add a ControlNet model & pipeline (#2407) · 8dfff7c0
      Takuma Mori authored
      
      
      * add scaffold
      - copied convert_controlnet_to_diffusers.py from
      convert_original_stable_diffusion_to_diffusers.py
      
      * Add support to load ControlNet (WIP)
      - this makes Missking Key error on ControlNetModel
      
      * Update to convert ControlNet without error msg
      - init impl for StableDiffusionControlNetPipeline
      - init impl for ControlNetModel
      
      * cleanup of commented out
      
      * split create_controlnet_diffusers_config()
      from create_unet_diffusers_config()
      
      - add config: hint_channels
      
      * Add input_hint_block, input_zero_conv and
      middle_block_out
      - this makes missing key error on loading model
      
      * add unet_2d_blocks_controlnet.py
      - copied from unet_2d_blocks.py as impl CrossAttnDownBlock2D,DownBlock2D
      - this makes missing key error on loading model
      
      * Add loading for input_hint_block, zero_convs
      and middle_block_out
      
      - this makes no error message on model loading
      
      * Copy from UNet2DConditionalModel except __init__
      
      * Add ultra primitive test for ControlNetModel
      inference
      
      * Support ControlNetModel inference
      - without exceptions
      
      * copy forward() from UNet2DConditionModel
      
      * Impl ControlledUNet2DConditionModel inference
      - test_controlled_unet_inference passed
      
      * Frozen weight & biases for training
      
      * Minimized version of ControlNet/ControlledUnet
      - test_modules_controllnet.py passed
      
      * make style
      
      * Add support model loading for minimized ver
      
      * Remove all previous version files
      
      * from_pretrained and inference test passed
      
      * copied from pipeline_stable_diffusion.py
      except `__init__()`
      
      * Impl pipeline, pixel match test (almost) passed.
      
      * make style
      
      * make fix-copies
      
      * Fix to add import ControlNet blocks
      for `make fix-copies`
      
      * Remove einops dependency
      
      * Support  np.ndarray, PIL.Image for controlnet_hint
      
      * set default config file as lllyasviel's
      
      * Add support grayscale (hw) numpy array
      
      * Add and update docstrings
      
      * add control_net.mdx
      
      * add control_net.mdx to toctree
      
      * Update copyright year
      
      * Fix to add PIL.Image RGB->BGR conversion
      - thanks @Mystfit
      
      * make fix-copies
      
      * add basic fast test for controlnet
      
      * add slow test for controlnet/unet
      
      * Ignore down/up_block len check on ControlNet
      
      * add a copy from test_stable_diffusion.py
      
      * Accept controlnet_hint is None
      
      * merge pipeline_stable_diffusion.py diff
      
      * Update class name to SDControlNetPipeline
      
      * make style
      
      * Baseline fast test almost passed (w long desc)
      
      * still needs investigate.
      
      Following didn't passed descriped in TODO comment:
      - test_stable_diffusion_long_prompt
      - test_stable_diffusion_no_safety_checker
      
      Following didn't passed same as stable_diffusion_pipeline:
      - test_attention_slicing_forward_pass
      - test_inference_batch_single_identical
      - test_xformers_attention_forwardGenerator_pass
      these seems come from calc accuracy.
      
      * Add note comment related vae_scale_factor
      
      * add test_stable_diffusion_controlnet_ddim
      
      * add assertion for vae_scale_factor != 8
      
      * slow test of pipeline almost passed
      Failed: test_stable_diffusion_pipeline_with_model_offloading
      - ImportError: `enable_model_offload` requires `accelerate v0.17.0` or higher
      
      but currently latest version == 0.16.0
      
      * test_stable_diffusion_long_prompt passed
      
      * test_stable_diffusion_no_safety_checker passed
      
      - due to its model size, move to slow test
      
      * remove PoC test files
      
      * fix num_of_image, prompt length issue add add test
      
      * add support List[PIL.Image] for controlnet_hint
      
      * wip
      
      * all slow test passed
      
      * make style
      
      * update for slow test
      
      * RGB(PIL)->BGR(ctrlnet) conversion
      
      * fixes
      
      * remove manual num_images_per_prompt test
      
      * add document
      
      * add `image` argument docstring
      
      * make style
      
      * Add line to correct conversion
      
      * add controlnet_conditioning_scale (aka control_scales
      strength)
      
      * rgb channel ordering by default
      
      * image batching logic
      
      * Add control image descriptions for each checkpoint
      
      * Only save controlnet model in conversion script
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
      
      typo
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * add gerated image example
      
      * a depth mask -> a depth map
      
      * rename control_net.mdx to controlnet.mdx
      
      * fix toc title
      
      * add ControlNet abstruct and link
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
      Co-authored-by: default avatardqueue <dbyqin@gmail.com>
      
      * remove controlnet constructor arguments re: @patrickvonplaten
      
      * [integration tests] test canny
      
      * test_canny fixes
      
      * [integration tests] test_depth
      
      * [integration tests] test_hed
      
      * [integration tests] test_mlsd
      
      * add channel order config to controlnet
      
      * [integration tests] test normal
      
      * [integration tests] test_openpose test_scribble
      
      * change height and width to default to conditioning image
      
      * [integration tests] test seg
      
      * style
      
      * test_depth fix
      
      * [integration tests] size fixes
      
      * [integration tests] cpu offloading
      
      * style
      
      * generalize controlnet embedding
      
      * fix conversion script
      
      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      
      * Style adapted to the documentation of pix2pix
      
      * merge main by hand
      
      * style
      
      * [docs] controlling generation doc nits
      
      * correct some things
      
      * add: controlnetmodel to autodoc.
      
      * finish docs
      
      * finish
      
      * finish 2
      
      * correct images
      
      * finish controlnet
      
      * Apply suggestions from code review
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * uP
      
      * upload model
      
      * up
      
      * up
      
      ---------
      Co-authored-by: default avatarWilliam Berman <WLBberman@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      Co-authored-by: default avatardqueue <dbyqin@gmail.com>
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      8dfff7c0
  10. 01 Mar, 2023 1 commit
  11. 14 Feb, 2023 2 commits
  12. 07 Feb, 2023 1 commit
    • YiYi Xu's avatar
      Stable Diffusion Latent Upscaler (#2059) · 1051ca81
      YiYi Xu authored
      
      
      * Modify UNet2DConditionModel
      
      - allow skipping mid_block
      
      - adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`
      
      - allow user to set dimension for the timestep embedding (`time_embed_dim`)
      
      - the kernel_size for `conv_in` and `conv_out` is now configurable
      
      - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`
      
      - allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label))`
      
      - added 2 arguments `attn1_types` and `attn2_types`
      
        * currently we have argument `only_cross_attention`: when it's set to `True`, we will have a to the
      `BasicTransformerBlock` block with 2 cross-attention , otherwise we
      get a self-attention followed by a cross-attention; in k-upscaler, we need to have blocks that include just one cross-attention, or self-attention -> cross-attention;
      so I added `attn1_types` and `attn2_types` to the unet's argument list to allow user specify the attention types for the 2 positions in each block;  note that I stil kept
      the `only_cross_attention` argument for unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passing down to the down blocks
      
      - the position of downsample layer and upsample layer is now configurable
      
      - in k-upscaler unet, there is only one skip connection per each up/down block (instead of each layer in stable diffusion unet), added `skip_freq = "block"` to support
      this use case
      
      - if user passes attention_mask to unet, it will prepare the mask and pass a flag to cross attention processer to skip the `prepare_attention_mask` step
      inside cross attention block
      
      add up/down blocks for k-upscaler
      
      modify CrossAttention class
      
      - make the `dropout` layer in `to_out` optional
      
      - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. note that when it's used to do cross
      attention, to_k, to_v has to be linear because the `encoder_hidden_states` is not 2d
      
      - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states
      
      - `attention_dropout`: add an optional dropout on attention score
      
      adapt BasicTransformerBlock
      
      - add an ada groupnorm layer  to conditioning attention input with timestep embedding
      
      - allow skipping the FeedForward layer in between the attentions
      
      - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration
      
      update timestep embedding: add new act_fn  gelu and an optional act_2
      
      modified ResnetBlock2D
      
      - refactored with AdaGroupNorm class (the timestep scale shift normalization)
      
      - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv
      
      - add option to use input AdaGroupNorm on the input instead of groupnorm
      
      - add options to add a dropout layer after each conv
      
      - allow user to set the bias in conv_shortcut (needed for k-upscaler)
      
      - add gelu
      
      adding conversion script for k-upscaler unet
      
      add pipeline
      
      * fix attention mask
      
      * fix a typo
      
      * fix a bug
      
      * make sure model can be used with GPU
      
      * make pipeline work with fp16
      
      * fix an error in BasicTransfomerBlock
      
      * make style
      
      * fix typo
      
      * some more fixes
      
      * uP
      
      * up
      
      * correct more
      
      * some clean-up
      
      * clean time proj
      
      * up
      
      * uP
      
      * more changes
      
      * remove the upcast_attention=True from unet config
      
      * remove attn1_types, attn2_types etc
      
      * fix
      
      * revert incorrect changes up/down samplers
      
      * make style
      
      * remove outdated files
      
      * Apply suggestions from code review
      
      * attention refactor
      
      * refactor cross attention
      
      * Apply suggestions from code review
      
      * update
      
      * up
      
      * update
      
      * Apply suggestions from code review
      
      * finish
      
      * Update src/diffusers/models/cross_attention.py
      
      * more fixes
      
      * up
      
      * up
      
      * up
      
      * finish
      
      * more corrections of conversion state
      
      * act_2 -> act_2_fn
      
      * remove dropout_after_conv from ResnetBlock2D
      
      * make style
      
      * simplify KAttentionBlock
      
      * add fast test for latent upscaler pipeline
      
      * add slow test
      
      * slow test fp16
      
      * make style
      
      * add doc string for pipeline_stable_diffusion_latent_upscale
      
      * add api doc page for latent upscaler pipeline
      
      * deprecate attention mask
      
      * clean up embeddings
      
      * simplify resnet
      
      * up
      
      * clean up resnet
      
      * up
      
      * correct more
      
      * up
      
      * up
      
      * improve a bit more
      
      * correct more
      
      * more clean-ups
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * add docstrings for new unet config
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * # Copied from
      
      * encode the image if not latent
      
      * remove force casting vae to fp32
      
      * fix
      
      * add comments about preconditioning parameters from k-diffusion paper
      
      * attn1_type, attn2_type -> add_self_attention
      
      * clean up get_down_block and get_up_block
      
      * fix
      
      * fixed a typo(?) in ada group norm
      
      * update slice attention processer for cross attention
      
      * update slice
      
      * fix fast test
      
      * update the checkpoint
      
      * finish tests
      
      * fix-copies
      
      * fix-copy for modeling_text_unet.py
      
      * make style
      
      * make style
      
      * fix f-string
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * fix import
      
      * correct changes
      
      * fix resnet
      
      * make fix-copies
      
      * correct euler scheduler
      
      * add missing #copied from for preprocess
      
      * revert
      
      * fix
      
      * fix copies
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/models/cross_attention.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * clean up conversion script
      
      * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D
      
      * more
      
      * Update src/diffusers/models/unet_2d_condition.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * remove prepare_extra_step_kwargs
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * fix a typo in timestep embedding
      
      * remove num_image_per_prompt
      
      * fix fasttest
      
      * make style + fix-copies
      
      * fix
      
      * fix xformer test
      
      * fix style
      
      * doc string
      
      * make style
      
      * fix-copies
      
      * docstring for time_embedding_norm
      
      * make style
      
      * final finishes
      
      * make fix-copies
      
      * fix tests
      
      ---------
      Co-authored-by: default avataryiyixuxu <yixu@yis-macbook-pro.lan>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      1051ca81
  13. 27 Jan, 2023 1 commit
  14. 26 Jan, 2023 1 commit
    • Pedro Cuenca's avatar
      Allow `UNet2DModel` to use arbitrary class embeddings (#2080) · 915a5636
      Pedro Cuenca authored
      * Allow `UNet2DModel` to use arbitrary class embeddings.
      
      We can currently use class conditioning in `UNet2DConditionModel`, but
      not in `UNet2DModel`. However, `UNet2DConditionModel` requires text
      conditioning too, which is unrelated to other types of conditioning.
      This commit makes it possible for `UNet2DModel` to be conditioned on
      entities other than timesteps. This is useful for training /
      research purposes. We can currently train models to perform
      unconditional image generation or text-to-image generation, but it's not
      straightforward to train a model to perform class-conditioned image
      generation, if text conditioning is not required.
      
      We could potentiall use `UNet2DConditionModel` for class-conditioning
      without text embeddings by using down/up blocks without
      cross-conditioning. However:
      - The mid block currently requires cross attention.
      - We are required to provide `encoder_hidden_states` to `forward`.
      
      * Style
      
      * Align class conditioning, add docstring for `num_class_embeds`.
      
      * Copy docstring to versatile_diffusion UNetFlatConditionModel
      915a5636
  15. 18 Jan, 2023 1 commit
  16. 30 Dec, 2022 1 commit
  17. 20 Dec, 2022 1 commit
  18. 18 Dec, 2022 1 commit
    • Will Berman's avatar
      kakaobrain unCLIP (#1428) · 2dcf64b7
      Will Berman authored
      
      
      * [wip] attention block updates
      
      * [wip] unCLIP unet decoder and super res
      
      * [wip] unCLIP prior transformer
      
      * [wip] scheduler changes
      
      * [wip] text proj utility class
      
      * [wip] UnCLIPPipeline
      
      * [wip] kakaobrain unCLIP convert script
      
      * [unCLIP pipeline] fixes re: @patrickvonplaten
      
      remove callbacks
      
      move denoising loops into call function
      
      * UNCLIPScheduler re: @patrickvonplaten
      
      Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
      DDPM scheduler with changes to support karlo
      
      * mask -> attention_mask re: @patrickvonplaten
      
      * [DDPMScheduler] remove leftover change
      
      * [docs] PriorTransformer
      
      * [docs] UNet2DConditionModel and UNet2DModel
      
      * [nit] UNCLIPScheduler -> UnCLIPScheduler
      
      matches existing unclip naming better
      
      * [docs] SchedulingUnCLIP
      
      * [docs] UnCLIPTextProjModel
      
      * refactor
      
      * finish licenses
      
      * rename all to attention_mask and prep in models
      
      * more renaming
      
      * don't expose unused configs
      
      * final renaming fixes
      
      * remove x attn mask when not necessary
      
      * configure kakao script to use new class embedding config
      
      * fix copies
      
      * [tests] UnCLIPScheduler
      
      * finish x attn
      
      * finish
      
      * remove more
      
      * rename condition blocks
      
      * clean more
      
      * Apply suggestions from code review
      
      * up
      
      * fix
      
      * [tests] UnCLIPPipelineFastTests
      
      * remove unused imports
      
      * [tests] UnCLIPPipelineIntegrationTests
      
      * correct
      
      * make style
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      2dcf64b7
  19. 07 Dec, 2022 2 commits
  20. 05 Dec, 2022 2 commits
  21. 02 Dec, 2022 3 commits
  22. 24 Nov, 2022 2 commits
    • Anton Lozhkov's avatar
      Support SD2 attention slicing (#1397) · d50e3217
      Anton Lozhkov authored
      * Support SD2 attention slicing
      
      * Support SD2 attention slicing
      
      * Add more copies
      
      * Use attn_num_head_channels in blocks
      
      * fix-copies
      
      * Update tests
      
      * fix imports
      d50e3217
    • Suraj Patil's avatar
      Adapt UNet2D for supre-resolution (#1385) · cecdd8bd
      Suraj Patil authored
      * allow disabling self attention
      
      * add class_embedding
      
      * fix copies
      
      * fix condition
      
      * fix copies
      
      * do_self_attention -> only_cross_attention
      
      * fix copies
      
      * num_classes -> num_class_embeds
      
      * fix default value
      cecdd8bd
  23. 23 Nov, 2022 3 commits
    • Suraj Patil's avatar
      update unet2d (#1376) · f07a16e0
      Suraj Patil authored
      * boom boom
      
      * remove duplicate arg
      
      * add use_linear_proj arg
      
      * fix copies
      
      * style
      
      * add fast tests
      
      * use_linear_proj -> use_linear_projection
      f07a16e0
    • Patrick von Platen's avatar
      [Versatile Diffusion] Add versatile diffusion model (#1283) · 2625fb59
      Patrick von Platen authored
      
      
      * up
      
      * convert dual unet
      
      * revert dual attn
      
      * adapt for vd-official
      
      * test the full pipeline
      
      * mixed inference
      
      * mixed inference for text2img
      
      * add image prompting
      
      * fix clip norm
      
      * split text2img and img2img
      
      * fix format
      
      * refactor text2img
      
      * mega pipeline
      
      * add optimus
      
      * refactor image var
      
      * wip text_unet
      
      * text unet end to end
      
      * update tests
      
      * reshape
      
      * fix image to text
      
      * add some first docs
      
      * dual guided pipeline
      
      * fix token ratio
      
      * propose change
      
      * dual transformer as a native module
      
      * DualTransformer(nn.Module)
      
      * DualTransformer(nn.Module)
      
      * correct unconditional image
      
      * save-load with mega pipeline
      
      * remove image to text
      
      * up
      
      * uP
      
      * fix
      
      * up
      
      * final fix
      
      * remove_unused_weights
      
      * test updates
      
      * save progress
      
      * uP
      
      * fix dual prompts
      
      * some fixes
      
      * finish
      
      * style
      
      * finish renaming
      
      * up
      
      * fix
      
      * fix
      
      * fix
      
      * finish
      Co-authored-by: default avataranton-l <anton@huggingface.co>
      2625fb59
    • Penn's avatar
      Fix using non-square images with UNet2DModel and DDIM/DDPM pipelines (#1289) · 8fd3a743
      Penn authored
      
      
      * fix non square images with UNet2DModel and DDIM/DDPM pipelines
      
      * fix unet_2d `sample_size` docstring
      
      * update pipeline tests for unet uncond
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      8fd3a743
  24. 16 Nov, 2022 1 commit
  25. 14 Nov, 2022 1 commit
  26. 02 Nov, 2022 1 commit
    • MatthieuTPHR's avatar
      Up to 2x speedup on GPUs using memory efficient attention (#532) · 98c42134
      MatthieuTPHR authored
      
      
      * 2x speedup using memory efficient attention
      
      * remove einops dependency
      
      * Swap K, M in op instantiation
      
      * Simplify code, remove unnecessary maybe_init call and function, remove unused self.scale parameter
      
      * make xformers a soft dependency
      
      * remove one-liner functions
      
      * change one letter variable to appropriate names
      
      * Remove Env variable dependency, remove MemoryEfficientCrossAttention class and use enable_xformers_memory_efficient_attention method
      
      * Add memory efficient attention toggle to img2img and inpaint pipelines
      
      * Clearer management of xformers' availability
      
      * update optimizations markdown to add info about memory efficient attention
      
      * add benchmarks for TITAN RTX
      
      * More detailed explanation of how the mem eff benchmark were ran
      
      * Removing autocast from optimization markdown
      
      * import_utils: import torch only if is available
      Co-authored-by: default avatarNouamane Tazi <nouamane98@gmail.com>
      98c42134
  27. 25 Oct, 2022 1 commit
  28. 12 Oct, 2022 1 commit
  29. 10 Oct, 2022 1 commit
  30. 05 Oct, 2022 1 commit