1. 01 Nov, 2023 1 commit
  2. 26 Jul, 2023 1 commit
  3. 21 Jul, 2023 1 commit
    • [docs] Clean up pipeline apis (#3905) · a69754bb
      Steven Liu authored
      * start with stable diffusion
      
      * fix
      
      * finish stable diffusion pipelines
      
      * fix path to pipeline output
      
      * fix flax paths
      
      * fix copies
      
      * add up to score sde ve
      
      * finish first pass of pipelines
      
      * fix copies
      
      * second review
      
      * align doc titles
      
      * more review fixes
      
      * final review
  4. 20 Jun, 2023 1 commit
  5. 28 Apr, 2023 1 commit
    • Diffedit Zero-Shot Inpainting Pipeline (#2837) · be0bfcec
      clarencechen authored
      * Update Pix2PixZero Auto-correlation Loss
      
      * Add Stable Diffusion DiffEdit pipeline
      
      * Add draft documentation and import code
      
      * Bugfixes and refactoring
      
      * Add option to not decode latents in the inversion process
      
      * Harmonize preprocessing
      
      * Revert "Update Pix2PixZero Auto-correlation Loss"
      
      This reverts commit b218062fed08d6cc164206d6cb852b2b7b00847a.
      
      * Update annotations
      
      * rename `compute_mask` to `generate_mask`
      
      * Update documentation
      
      * Update docs
      
      * Update Docs
      
      * Fix copy
      
      * Change shape of output latents to batch first
      
      * Update docs
      
      * Add first draft for tests
      
      * Bugfix and update tests
      
      * Add `cross_attention_kwargs` support for all pipeline methods
      
      * Fix Copies
      
      * Add support for PIL image latents
      
      Add support for mask broadcasting
      
      Update docs and tests
      
      Align `mask` argument to `mask_image`
      
      Remove height and width arguments
      
      * Enable MPS Tests
      
      * Move example docstrings
      
      * Fix test
      
      * Fix test
      
      * fix pipeline inheritance
      
      * Harmonize `prepare_image_latents` with StableDiffusionPix2PixZeroPipeline
      
      * Register modules set to `None` in config for `test_save_load_optional_components`
      
      * Move fixed logic to specific test class
      
      * Clean changes to other pipelines
      
      * Update new tests to coordinate with #2953
      
      * Update slow tests for better results
      
      * Safety to avoid potential problems with torch.inference_mode
      
      * Add reference in SD Pipeline Overview
      
      * Fix tests again
      
      * Enforce determinism in noise for generate_mask
      
      * Fix copies
      
      * Widen test tolerance for fp16 based on `test_stable_diffusion_upscale_pipeline_fp16`
      
      * Add LoraLoaderMixin and update `prepare_image_latents`
      
      * clean up repeat and reg
      
      * bugfix
      
      * Remove invalid args from docs
      
      Suppress spurious warning by repeating image before latent to mask gen
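The "mask broadcasting" support added in the DiffEdit commit above can be sketched in a few lines: a single mask is expanded so `mask_image` lines up with a whole latent batch. This is an illustrative numpy sketch under assumed shapes, not the diffusers implementation; `broadcast_mask` is a hypothetical helper name.

```python
import numpy as np

def broadcast_mask(mask, batch_size):
    """Repeat a single mask across a batch, as in DiffEdit-style
    inpainting where one mask can serve every latent in the batch.
    Illustrative sketch only, not the diffusers implementation."""
    mask = np.asarray(mask)
    if mask.ndim == 2:                      # (H, W) -> (1, 1, H, W)
        mask = mask[None, None]
    if mask.shape[0] == 1 and batch_size > 1:
        mask = np.repeat(mask, batch_size, axis=0)
    if mask.shape[0] != batch_size:
        raise ValueError("mask batch size does not match latent batch size")
    return mask
```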
  6. 24 Mar, 2023 1 commit
    • Add ModelEditing pipeline (#2721) · 37a44bb2
      Bahjat Kawar authored
      
      
      * TIME first commit
      
      * styling.
      
      * styling 2.
      
      * fixes; tests
      
      * apply styling and doc fix.
      
      * remove sups.
      
      * fixes
      
      * remove temp file
      
      * move augmentations to const
      
      * added doc entry
      
      * code quality
      
      * customize augmentations
      
      * quality
      
      * quality
      
      ---------
       Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
  7. 01 Mar, 2023 1 commit
  8. 16 Feb, 2023 2 commits
    • Attend and excite 2 (#2369) · 2e7a2865
      YiYi Xu authored
      
      
      * attend and excite pipeline
      
      * update
      
      update docstring example
      
      remove visualization
      
      remove the base class attention control
      
      remove dependency on stable diffusion pipeline
      
      always apply gaussian filter with default setting
      
      remove run_standard_sd argument
      
      hardcode attention_res and scale_range (related to step size)
      
      Update docs/source/en/api/pipelines/stable_diffusion/attend_and_excite.mdx
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      Update tests/pipelines/stable_diffusion_2/test_stable_diffusion_attend_and_excite.py
       Co-authored-by: Will Berman <wlbberman@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
       Co-authored-by: Will Berman <wlbberman@gmail.com>
      
      Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
       Co-authored-by: Will Berman <wlbberman@gmail.com>
      
      revert test_float16_inference
      
      revert change to the batch related tests
      
      fix test_float16_inference
      
      handle batch
      
      remove the deprecation message
      
      remove None check, step_size
      
      remove debugging logging
      
      add slow test
      
      indices_to_alter -> indices
      
      add check_input
      
      * skip mps
      
      * style
      
      * Apply suggestions from code review
       Co-authored-by: Suraj Patil <surajp815@gmail.com>
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * indices -> token_indices
      ---------
       Co-authored-by: evin <evinpinarornek@gmail.com>
       Co-authored-by: yiyixuxu <yixu310@gmail.com>
       Co-authored-by: Suraj Patil <surajp815@gmail.com>
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
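The objective these Attend-and-Excite commits wire up can be shown as a toy function: for each token in `token_indices`, take the maximum attention it receives from any image patch, and report how far the worst token is from being fully "excited". This is a simplified sketch with an assumed `(num_patches, num_tokens)` attention layout, not the pipeline's internals.

```python
import numpy as np

def attend_and_excite_loss(attn_map, token_indices):
    """Toy Attend-and-Excite objective (illustrative only).

    attn_map: (num_patches, num_tokens) cross-attention weights.
    token_indices: indices of the tokens that must be "excited".
    Returns the worst-case shortfall: 1 minus the smallest, over target
    tokens, of the maximum attention that token gets from any patch.
    """
    per_token_max = [float(attn_map[:, i].max()) for i in token_indices]
    return 1.0 - min(per_token_max)
```

During sampling, the real pipeline backpropagates a loss like this into the latents at each denoising step to push attention toward the target tokens.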
    • [Pipelines] Adds pix2pix zero (#2334) · fd3d5502
      Sayak Paul authored
      * add: support for BLIP generation.
      
      * add: support for editing synthetic images.
      
      * remove unnecessary comments.
      
      * add inits and run make fix-copies.
      
      * version change of diffusers.
      
      * fix: condition for loading the captioner.
      
      * default conditions_input_image to False.
      
      * guidance_amount -> cross_attention_guidance_amount
      
      * fix inputs to check_inputs()
      
      * fix: attribute.
      
      * fix: prepare_attention_mask() call.
      
      * debugging.
      
      * better placement of references.
      
      * remove torch.no_grad() decorations.
      
      * put torch.no_grad() context before the first denoising loop.
      
      * detach() latents before decoding them.
      
       * put decoding in a torch.no_grad() context.
      
      * add reconstructed image for debugging.
      
       * no_grad()
      
      * apply formatting.
      
      * address one-off suggestions from the draft PR.
      
      * back to torch.no_grad() and add more elaborate comments.
      
      * refactor prepare_unet() per Patrick's suggestions.
      
      * more elaborate description for .
      
      * formatting.
      
      * add docstrings to the methods specific to pix2pix zero.
      
      * suspecting a redundant noise prediction.
      
      * needed for gradient computation chain.
      
      * less hacks.
      
      * fix: attention mask handling within the processor.
      
      * remove attention reference map computation.
      
      * fix: cross attn args.
      
       * fix: processor.
      
      * store attention maps.
      
      * fix: attention processor.
      
      * update docs and better treatment to xa args.
      
      * update the final noise computation call.
      
      * change xa args call.
      
      * remove xa args option from the pipeline.
      
      * add: docs.
      
      * first test.
      
      * fix: url call.
      
      * fix: argument call.
      
      * remove image conditioning for now.
      
      * 🚨 add: fast tests.
      
      * explicit placement of the xa attn weights.
      
      * add: slow tests 🐢
      
      * fix: tests.
      
      * edited direction embedding should be on the same device as prompt_embeds.
      
      * debugging message.
      
      * debugging.
      
      * add pix2pix zero pipeline for a non-deterministic test.
      
       * debugging.
      
      * remove debugging message.
      
      * make caption generation _
      
      * address comments (part I).
      
      * address PR comments (part II)
      
      * fix: DDPM test assertion.
      
      * refactor doc.
      
      * address PR comments (part III).
      
      * fix: type annotation for the scheduler.
      
      * apply styling.
      
      * skip_mps and add note on embeddings in the docs.
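The pix2pix-zero editing scheme above relies on an edit direction in text-embedding space: average the embeddings of many source captions and many target captions, then take the difference. A minimal sketch, assuming per-caption embeddings are already computed; `construct_edit_direction` is an illustrative name, not the diffusers API.

```python
import numpy as np

def construct_edit_direction(source_embeds, target_embeds):
    """Sketch of the pix2pix-zero edit direction: mean target embedding
    minus mean source embedding. Each input is a stack of per-caption
    embedding vectors, shape (num_captions, embed_dim). Illustrative."""
    src = np.asarray(source_embeds, dtype=np.float64).mean(axis=0)
    tgt = np.asarray(target_embeds, dtype=np.float64).mean(axis=0)
    return tgt - src
```

At inference, this direction is added to the prompt embedding so the denoising trajectory moves from the source concept toward the target one while cross-attention guidance preserves structure.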
  9. 07 Feb, 2023 1 commit
    • Stable Diffusion Latent Upscaler (#2059) · 1051ca81
      YiYi Xu authored
      
      
      * Modify UNet2DConditionModel
      
      - allow skipping mid_block
      
      - adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`
      
      - allow user to set dimension for the timestep embedding (`time_embed_dim`)
      
      - the kernel_size for `conv_in` and `conv_out` is now configurable
      
      - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`
      
       - allow the user to add the time and class embeddings together before passing them through the projection layer - `time_embedding(t_emb + class_label)`
      
      - added 2 arguments `attn1_types` and `attn2_types`
      
         * currently we have the argument `only_cross_attention`: when it's set to `True`, the
       `BasicTransformerBlock` contains two cross-attention layers; otherwise we
       get a self-attention followed by a cross-attention. In the k-upscaler we need blocks that contain just one cross-attention, or self-attention -> cross-attention,
       so I added `attn1_types` and `attn2_types` to the unet's argument list to let the user specify the attention types for the two positions in each block; note that I still kept
       the `only_cross_attention` argument on the unet for easy configuration, but it is converted to `attn1_type` and `attn2_type` when passed down to the down blocks
      
      - the position of downsample layer and upsample layer is now configurable
      
       - in the k-upscaler unet there is only one skip connection per up/down block (instead of per layer as in the stable diffusion unet); added `skip_freq = "block"` to support
       this use case
      
       - if the user passes an attention_mask to the unet, it will prepare the mask and pass a flag to the cross attention processor to skip the `prepare_attention_mask` step
       inside the cross attention block
      
      add up/down blocks for k-upscaler
      
      modify CrossAttention class
      
      - make the `dropout` layer in `to_out` optional
      
       - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. Note that when it's used for cross
       attention, `to_k` and `to_v` have to be linear because the `encoder_hidden_states` is not 2d
      
      - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states
      
      - `attention_dropout`: add an optional dropout on attention score
      
      adapt BasicTransformerBlock
      
       - add an ada groupnorm layer to condition the attention input with the timestep embedding
      
      - allow skipping the FeedForward layer in between the attentions
      
      - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration
      
      update timestep embedding: add new act_fn  gelu and an optional act_2
      
      modified ResnetBlock2D
      
      - refactored with AdaGroupNorm class (the timestep scale shift normalization)
      
      - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv
      
       - add option to use AdaGroupNorm on the input instead of groupnorm
      
      - add options to add a dropout layer after each conv
      
      - allow user to set the bias in conv_shortcut (needed for k-upscaler)
      
      - add gelu
      
      adding conversion script for k-upscaler unet
      
      add pipeline
      
      * fix attention mask
      
      * fix a typo
      
      * fix a bug
      
      * make sure model can be used with GPU
      
      * make pipeline work with fp16
      
       * fix an error in BasicTransformerBlock
      
      * make style
      
      * fix typo
      
      * some more fixes
      
      * uP
      
      * up
      
      * correct more
      
      * some clean-up
      
      * clean time proj
      
      * up
      
      * uP
      
      * more changes
      
      * remove the upcast_attention=True from unet config
      
      * remove attn1_types, attn2_types etc
      
      * fix
      
      * revert incorrect changes up/down samplers
      
      * make style
      
      * remove outdated files
      
      * Apply suggestions from code review
      
      * attention refactor
      
      * refactor cross attention
      
      * Apply suggestions from code review
      
      * update
      
      * up
      
      * update
      
      * Apply suggestions from code review
      
      * finish
      
      * Update src/diffusers/models/cross_attention.py
      
      * more fixes
      
      * up
      
      * up
      
      * up
      
      * finish
      
      * more corrections of conversion state
      
      * act_2 -> act_2_fn
      
      * remove dropout_after_conv from ResnetBlock2D
      
      * make style
      
      * simplify KAttentionBlock
      
      * add fast test for latent upscaler pipeline
      
      * add slow test
      
      * slow test fp16
      
      * make style
      
      * add doc string for pipeline_stable_diffusion_latent_upscale
      
      * add api doc page for latent upscaler pipeline
      
      * deprecate attention mask
      
      * clean up embeddings
      
      * simplify resnet
      
      * up
      
      * clean up resnet
      
      * up
      
      * correct more
      
      * up
      
      * up
      
      * improve a bit more
      
      * correct more
      
      * more clean-ups
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * add docstrings for new unet config
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * # Copied from
      
      * encode the image if not latent
      
      * remove force casting vae to fp32
      
      * fix
      
      * add comments about preconditioning parameters from k-diffusion paper
      
      * attn1_type, attn2_type -> add_self_attention
      
      * clean up get_down_block and get_up_block
      
      * fix
      
      * fixed a typo(?) in ada group norm
      
       * update slice attention processor for cross attention
      
      * update slice
      
      * fix fast test
      
      * update the checkpoint
      
      * finish tests
      
      * fix-copies
      
      * fix-copy for modeling_text_unet.py
      
      * make style
      
      * make style
      
      * fix f-string
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix import
      
      * correct changes
      
      * fix resnet
      
      * make fix-copies
      
      * correct euler scheduler
      
      * add missing #copied from for preprocess
      
      * revert
      
      * fix
      
      * fix copies
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/models/cross_attention.py
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * clean up conversion script
      
      * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D
      
      * more
      
      * Update src/diffusers/models/unet_2d_condition.py
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * remove prepare_extra_step_kwargs
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix a typo in timestep embedding
      
      * remove num_image_per_prompt
      
      * fix fasttest
      
      * make style + fix-copies
      
      * fix
      
      * fix xformer test
      
      * fix style
      
      * doc string
      
      * make style
      
      * fix-copies
      
      * docstring for time_embedding_norm
      
      * make style
      
      * final finishes
      
      * make fix-copies
      
      * fix tests
      
      ---------
       Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
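The `norm_group_size` argument described in the commit above derives GroupNorm's `num_groups` from the channel count, so every group holds a fixed number of channels. A minimal sketch of that arithmetic (the helper name is hypothetical, not the diffusers API):

```python
def num_groups_from_group_size(num_channels, norm_group_size):
    """Derive GroupNorm's num_groups as num_channels // norm_group_size,
    so each group contains exactly norm_group_size channels.
    Illustrative sketch of the commit's idea, not library code."""
    if num_channels % norm_group_size != 0:
        raise ValueError("num_channels must be divisible by norm_group_size")
    return num_channels // norm_group_size
```

The resulting value would be passed as `num_groups` to a group-norm layer (e.g. `torch.nn.GroupNorm(num_groups, num_channels)`), keeping the per-group channel count constant as block widths change.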
  10. 20 Jan, 2023 1 commit
  11. 04 Jan, 2023 1 commit
    • Init for korean docs (#1910) · 75d53cc8
      Chanran Kim authored
      * init for korean docs
      
      * edit build yml file for multi language docs
      
      * edit one more build yml file for multi language docs
      
      * add title for get_frontmatter error
  12. 03 Jan, 2023 1 commit
  13. 02 Jan, 2023 1 commit
  14. 30 Dec, 2022 1 commit
  15. 08 Dec, 2022 1 commit
    • StableDiffusionDepth2ImgPipeline (#1531) · 5383188c
      Suraj Patil authored
      
      
      * begin depth pipeline
      
      * add depth estimation model
      
      * fix prepare_depth_mask
      
      * add a comment about autocast
      
      * copied from, quality, cleanup
      
      * begin tests
      
      * handle tensors
      
      * norm image tensor
      
      * fix batch size
      
      * fix tests
      
      * fix enable_sequential_cpu_offload
      
      * fix save load
      
      * fix test_save_load_float16
      
      * fix test_save_load_optional_components
      
      * fix test_float16_inference
      
      * fix test_cpu_offload_forward_pass
      
      * fix test_dict_tuple_outputs_equivalent
      
      * up
      
      * fix fast tests
      
      * fix test_stable_diffusion_img2img_multiple_init_images
      
      * fix few more fast tests
      
      * don't use device map for DPT
      
      * fix test_stable_diffusion_pipeline_with_sequential_cpu_offloading
      
      * accept external depth maps
      
      * prepare_depth_mask -> prepare_depth_map
      
      * fix file name
      
      * fix file name
      
      * quality
      
      * check transformers version
      
      * fix test names
      
      * use skipif
      
      * fix import
      
      * add docs
      
      * skip tests on mps
      
      * correct version
      
      * uP
      
      * Update docs/source/api/pipelines/stable_diffusion_2.mdx
      
      * fix fix-copies
      
      * fix fix-copies
       Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
       Co-authored-by: anton- <anton@huggingface.co>
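The `prepare_depth_map` step in the commits above boils down to scaling a raw depth estimate into the range the unet expects before it is concatenated with the latents. A minimal numpy sketch, assuming a [-1, 1] target range; the real method also handles resizing, dtype, and batching.

```python
import numpy as np

def prepare_depth_map(depth):
    """Normalize a raw depth map to [-1, 1] via min-max scaling.
    Illustrative sketch of the normalization step only."""
    depth = np.asarray(depth, dtype=np.float64)
    d_min, d_max = depth.min(), depth.max()
    return 2.0 * (depth - d_min) / (d_max - d_min) - 1.0
```

This is also the spot where the "accept external depth maps" commit plugs in: a user-supplied map would skip the DPT estimator and go straight through this normalization.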
  16. 30 Nov, 2022 1 commit
  17. 29 Nov, 2022 1 commit
    • StableDiffusion: Decode latents separately to run larger batches (#1150) · c28d3c82
      Ilmari Heikkinen authored
      
      
      * StableDiffusion: Decode latents separately to run larger batches
      
      * Move VAE sliced decode under enable_vae_sliced_decode and vae.enable_sliced_decode
      
      * Rename sliced_decode to slicing
      
      * fix whitespace
      
      * fix quality check and repository consistency
      
      * VAE slicing tests and documentation
      
      * API doc hooks for VAE slicing
      
      * reformat vae slicing tests
      
      * Skip VAE slicing for one-image batches
      
      * Documentation tweaks for VAE slicing
       Co-authored-by: Ilmari Heikkinen <ilmari@fhtr.org>
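The sliced decode introduced above can be sketched as: decode the latent batch one sample at a time and concatenate the results, trading a little speed for a much smaller peak-memory footprint; one-image batches skip slicing, per the "Skip VAE slicing for one-image batches" commit. `decode_latents_sliced` and `decode_fn` are illustrative names, not the library's internals.

```python
def decode_latents_sliced(decode_fn, latents):
    """Decode a batch of latents slice by slice to cap peak memory.
    decode_fn maps a latent batch to a list of decoded images.
    Single-image batches fall through to one direct call. Sketch only."""
    if len(latents) <= 1:
        return decode_fn(latents)
    images = []
    for i in range(len(latents)):
        images.extend(decode_fn(latents[i:i + 1]))
    return images
```

In diffusers itself this surface landed as `enable_vae_slicing()` on the pipeline after the "Rename sliced_decode to slicing" commit above.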
  18. 25 Nov, 2022 2 commits
  19. 23 Nov, 2022 1 commit
    • StableDiffusionImageVariationPipeline (#1365) · 0eb507f2
      Suraj Patil authored
      
      
      * add StableDiffusionImageVariationPipeline
      
      * add ini init
      
      * use CLIPVisionModelWithProjection
      
      * fix _encode_image
      
      * add copied from
      
      * fix copies
      
      * add doc
      
      * handle tensor in _encode_image
      
      * add tests
      
      * correct model_id
      
      * remove copied from in enable_sequential_cpu_offload
      
      * fix tests
      
      * make slow tests pass
      
      * update slow tests
      
      * use temp model for now
      
      * fix test_stable_diffusion_img_variation_intermediate_state
      
      * fix test_stable_diffusion_img_variation_intermediate_state
      
      * check for torch.Tensor
      
      * quality
      
      * fix name
      
      * fix slow tests
      
      * install transformers from source
      
      * fix install
      
      * fix install
      
      * Apply suggestions from code review
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * input_image -> image
      
       * remove deprecation warnings
      
      * fix test_stable_diffusion_img_variation_multiple_images
      
      * make flake happy
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
  20. 17 Nov, 2022 2 commits
  21. 15 Nov, 2022 1 commit
  22. 02 Nov, 2022 1 commit
  23. 24 Oct, 2022 1 commit
  24. 20 Oct, 2022 1 commit
  25. 12 Sep, 2022 1 commit
  26. 08 Sep, 2022 3 commits
  27. 07 Sep, 2022 2 commits
  28. 13 Jul, 2022 1 commit
    • Docs (#45) · c3d78cd3
      Nathan Lambert authored
      * first pass at docs structure
      
      * minor reformatting, add github actions for docs
      
      * populate docs (primarily from README, some writing)