1. 16 Feb, 2023 1 commit
    • [Pipelines] Adds pix2pix zero (#2334) · fd3d5502
      Sayak Paul authored
      * add: support for BLIP generation.
      
      * add: support for editing synthetic images.
      
      * remove unnecessary comments.
      
      * add inits and run make fix-copies.
      
      * version change of diffusers.
      
      * fix: condition for loading the captioner.
      
      * default conditions_input_image to False.
      
      * guidance_amount -> cross_attention_guidance_amount
      
      * fix inputs to check_inputs()
      
      * fix: attribute.
      
      * fix: prepare_attention_mask() call.
      
      * debugging.
      
      * better placement of references.
      
      * remove torch.no_grad() decorations.
      
      * put torch.no_grad() context before the first denoising loop.
      
      * detach() latents before decoding them.
      
      * put decoding in a torch.no_grad() context.
      
      * add reconstructed image for debugging.
      
      * no_grad().
      
      * apply formatting.
      
      * address one-off suggestions from the draft PR.
      
      * back to torch.no_grad() and add more elaborate comments.
      
      * refactor prepare_unet() per Patrick's suggestions.
      
      * more elaborate description for .
      
      * formatting.
      
      * add docstrings to the methods specific to pix2pix zero.
      
      * suspecting a redundant noise prediction.
      
      * needed for gradient computation chain.
      
      * less hacks.
      
      * fix: attention mask handling within the processor.
      
      * remove attention reference map computation.
      
      * fix: cross attn args.
      
      * fix: processor.
      
      * store attention maps.
      
      * fix: attention processor.
      
      * update docs and better treatment to xa args.
      
      * update the final noise computation call.
      
      * change xa args call.
      
      * remove xa args option from the pipeline.
      
      * add: docs.
      
      * first test.
      
      * fix: url call.
      
      * fix: argument call.
      
      * remove image conditioning for now.
      
      * 🚨 add: fast tests.
      
      * explicit placement of the xa attn weights.
      
      * add: slow tests 🐢
      
      * fix: tests.
      
      * edited direction embedding should be on the same device as prompt_embeds.
      
      * debugging message.
      
      * debugging.
      
      * add pix2pix zero pipeline for a non-deterministic test.
      
      * debugging.
      
      * remove debugging message.
      
      * make caption generation _
      
      * address comments (part I).
      
      * address PR comments (part II)
      
      * fix: DDPM test assertion.
      
      * refactor doc.
      
      * address PR comments (part III).
      
      * fix: type annotation for the scheduler.
      
      * apply styling.
      
      * skip_mps and add note on embeddings in the docs.
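
      For reference, a minimal usage sketch of the new pipeline, assuming the call signature the commits above describe (`source_embeds`/`target_embeds` for the edit direction, `cross_attention_guidance_amount`); the checkpoint id and embedding files are illustrative assumptions, not part of the PR:

      ```python
      import torch
      from diffusers import DDIMScheduler, StableDiffusionPix2PixZeroPipeline

      pipe = StableDiffusionPix2PixZeroPipeline.from_pretrained(
          "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16  # assumed checkpoint
      )
      pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
      pipe.to("cuda")

      # source/target embeddings define the edit direction (e.g. cat -> dog);
      # these are hypothetical files standing in for pre-computed caption embeddings.
      source_embeds = torch.load("cat_embeds.pt")
      target_embeds = torch.load("dog_embeds.pt")

      image = pipe(
          "a high resolution painting of a cat in the style of van gogh",
          source_embeds=source_embeds,
          target_embeds=target_embeds,
          num_inference_steps=50,
          cross_attention_guidance_amount=0.15,  # renamed from guidance_amount above
      ).images[0]
      ```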
  2. 14 Feb, 2023 1 commit
  3. 07 Feb, 2023 1 commit
    • Stable Diffusion Latent Upscaler (#2059) · 1051ca81
      YiYi Xu authored
      
      
      * Modify UNet2DConditionModel
      
      - allow skipping mid_block
      
      - adding a `norm_group_size` argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size` (see the sketch after this list)
      
      - allow user to set dimension for the timestep embedding (`time_embed_dim`)
      
      - the kernel_size for `conv_in` and `conv_out` is now configurable
      
      - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`
      
      - allow the user to add the time and class embeddings together before passing them through the projection layer - `time_embedding(t_emb + class_label)`
      
      - added 2 arguments `attn1_types` and `attn2_types`
      
        * currently we have the argument `only_cross_attention`: when it's set to `True`, the
      `BasicTransformerBlock` is built with 2 cross-attention layers; otherwise we
      get a self-attention followed by a cross-attention. In the k-upscaler, we need blocks that include just one cross-attention, or self-attention -> cross-attention;
      so I added `attn1_types` and `attn2_types` to the unet's argument list to allow users to specify the attention types for the 2 positions in each block; note that I still kept
      the `only_cross_attention` argument for the unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passed down to the down blocks
      
      - the position of downsample layer and upsample layer is now configurable
      
      - in the k-upscaler unet, there is only one skip connection per up/down block (instead of one per layer as in the stable diffusion unet); added `skip_freq = "block"` to support
      this use case
      
      - if the user passes attention_mask to the unet, it will prepare the mask and pass a flag to the cross attention processor to skip the `prepare_attention_mask` step
      inside the cross attention block
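
      A minimal sketch of the `norm_group_size` idea from the list above (the channel count and group size are hypothetical values, not taken from the PR):

      ```python
      import torch
      import torch.nn as nn

      num_channels = 512     # hypothetical channel count
      norm_group_size = 32   # the new config argument

      # num_groups is derived from the channel count instead of being hard-coded
      norm = nn.GroupNorm(num_groups=num_channels // norm_group_size, num_channels=num_channels)
      out = norm(torch.randn(1, num_channels, 8, 8))
      ```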
      
      add up/down blocks for k-upscaler
      
      modify CrossAttention class
      
      - make the `dropout` layer in `to_out` optional
      
      - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. note that when it's used to do cross
      attention, `to_k` and `to_v` have to be linear because the `encoder_hidden_states` is not 2d
      
      - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states
      
      - `attention_dropout`: add an optional dropout on attention score
      
      adapt BasicTransformerBlock
      
      - add an ada groupnorm layer to condition the attention input on the timestep embedding
      
      - allow skipping the FeedForward layer in between the attentions
      
      - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration
      
      update timestep embedding: add a new act_fn `gelu` and an optional `act_2`
      
      modified ResnetBlock2D
      
      - refactored with AdaGroupNorm class (the timestep scale shift normalization)
      
      - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv
      
      - add option to use AdaGroupNorm on the input instead of groupnorm
      
      - add options to add a dropout layer after each conv
      
      - allow user to set the bias in conv_shortcut (needed for k-upscaler)
      
      - add gelu
      
      adding conversion script for k-upscaler unet
      
      add pipeline
      
      * fix attention mask
      
      * fix a typo
      
      * fix a bug
      
      * make sure model can be used with GPU
      
      * make pipeline work with fp16
      
      * fix an error in BasicTransformerBlock
      
      * make style
      
      * fix typo
      
      * some more fixes
      
      * uP
      
      * up
      
      * correct more
      
      * some clean-up
      
      * clean time proj
      
      * up
      
      * uP
      
      * more changes
      
      * remove the upcast_attention=True from unet config
      
      * remove attn1_types, attn2_types etc
      
      * fix
      
      * revert incorrect changes up/down samplers
      
      * make style
      
      * remove outdated files
      
      * Apply suggestions from code review
      
      * attention refactor
      
      * refactor cross attention
      
      * Apply suggestions from code review
      
      * update
      
      * up
      
      * update
      
      * Apply suggestions from code review
      
      * finish
      
      * Update src/diffusers/models/cross_attention.py
      
      * more fixes
      
      * up
      
      * up
      
      * up
      
      * finish
      
      * more corrections of conversion state
      
      * act_2 -> act_2_fn
      
      * remove dropout_after_conv from ResnetBlock2D
      
      * make style
      
      * simplify KAttentionBlock
      
      * add fast test for latent upscaler pipeline
      
      * add slow test
      
      * slow test fp16
      
      * make style
      
      * add doc string for pipeline_stable_diffusion_latent_upscale
      
      * add api doc page for latent upscaler pipeline
      
      * deprecate attention mask
      
      * clean up embeddings
      
      * simplify resnet
      
      * up
      
      * clean up resnet
      
      * up
      
      * correct more
      
      * up
      
      * up
      
      * improve a bit more
      
      * correct more
      
      * more clean-ups
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * add docstrings for new unet config
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * # Copied from
      
      * encode the image if not latent
      
      * remove force casting vae to fp32
      
      * fix
      
      * add comments about preconditioning parameters from k-diffusion paper
      
      * attn1_type, attn2_type -> add_self_attention
      
      * clean up get_down_block and get_up_block
      
      * fix
      
      * fixed a typo(?) in ada group norm
      
      * update slice attention processer for cross attention
      
      * update slice
      
      * fix fast test
      
      * update the checkpoint
      
      * finish tests
      
      * fix-copies
      
      * fix-copy for modeling_text_unet.py
      
      * make style
      
      * make style
      
      * fix f-string
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix import
      
      * correct changes
      
      * fix resnet
      
      * make fix-copies
      
      * correct euler scheduler
      
      * add missing #copied from for preprocess
      
      * revert
      
      * fix
      
      * fix copies
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/models/cross_attention.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * clean up conversion script
      
      * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D
      
      * more
      
      * Update src/diffusers/models/unet_2d_condition.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * remove prepare_extra_step_kwargs
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix a typo in timestep embedding
      
      * remove num_image_per_prompt
      
      * fix fasttest
      
      * make style + fix-copies
      
      * fix
      
      * fix xformer test
      
      * fix style
      
      * doc string
      
      * make style
      
      * fix-copies
      
      * docstring for time_embedding_norm
      
      * make style
      
      * final finishes
      
      * make fix-copies
      
      * fix tests
      
      ---------
      Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
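
      For reference, a usage sketch of the finished two-stage flow — generate low-resolution latents, then upscale them 2x in latent space (the upscaler checkpoint id is an assumption based on the publicly released weights):

      ```python
      import torch
      from diffusers import StableDiffusionPipeline, StableDiffusionLatentUpscalePipeline

      pipe = StableDiffusionPipeline.from_pretrained(
          "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
      ).to("cuda")
      upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
          "stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16  # assumed model id
      ).to("cuda")

      prompt = "a photo of an astronaut riding a horse"
      generator = torch.manual_seed(33)

      # keep the first stage's output in latent space instead of decoding it
      low_res_latents = pipe(prompt, generator=generator, output_type="latent").images
      image = upscaler(
          prompt=prompt,
          image=low_res_latents,
          num_inference_steps=20,
          guidance_scale=0,
          generator=generator,
      ).images[0]
      ```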
  4. 20 Jan, 2023 1 commit
  5. 17 Jan, 2023 2 commits
    • DiT Pipeline (#1806) · 37d113cc
      Kashif Rasul authored
      
      
      * added dit model
      
      * import
      
      * initial pipeline
      
      * initial convert script
      
      * initial pipeline
      
      * make style
      
      * raise valueerror
      
      * single function
      
      * rename classes
      
      * use DDIMScheduler
      
      * timesteps embedder
      
      * samples to cpu
      
      * fix var names
      
      * fix numpy type
      
      * use timesteps class for proj
      
      * fix typo
      
      * fix arg name
      
      * flip_sin_to_cos and better var names
      
      * fix C shape cal
      
      * make style
      
      * remove unused imports
      
      * cleanup
      
      * add back patch_size
      
      * initial dit doc
      
      * typo
      
      * Update docs/source/api/pipelines/dit.mdx
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * added copyright license headers
      
      * added example usage and toc
      
      * fix variable names asserts
      
      * remove comment
      
      * added docs
      
      * fix typo
      
      * upstream changes
      
      * set proper device for drop_ids
      
      * added initial dit pipeline test
      
      * update docs
      
      * fix imports
      
      * make fix-copies
      
      * isort
      
      * fix imports
      
      * get rid of more magic numbers
      
      * fix code when guidance is off
      
      * remove block_kwargs
      
      * cleanup script
      
      * removed to_2tuple
      
      * use FeedForward class instead of another MLP
      
      * style
      
      * work on merging DiTBlock with BasicTransformerBlock
      
      * added missing final_dropout and args to BasicTransformerBlock
      
      * use norm from block
      
      * fix arg
      
      * remove unused arg
      
      * fix call to class_embedder
      
      * use timesteps
      
      * make style
      
      * attn_output gets multiplied
      
      * removed commented code
      
      * use Transformer2D
      
      * use self.is_input_patches
      
      * fix flags
      
      * fixed conversion to use Transformer2DModel
      
      * fixes for pipeline
      
      * remove dit.py
      
      * fix timesteps device
      
      * use randn_tensor and fix fp16 inf.
      
      * timesteps_emb already the right dtype
      
      * fix dit test class
      
      * fix test and style
      
      * fix norm2 usage in vq-diffusion
      
      * added author names to pipeline and imagenet labels link
      
      * fix tests
      
      * use norm_type as string
      
      * rename dit to transformer
      
      * fix name
      
      * fix test
      
      * set  norm_type = "layer" by default
      
      * fix tests
      
      * do not skip common tests
      
      * Update src/diffusers/models/attention.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * revert AdaLayerNorm API
      
      * fix norm_type name
      
      * make sure all components are in eval mode
      
      * revert norm2 API
      
      * compact
      
      * finish deprecation
      
      * add slow tests
      
      * remove @
      
      * refactor some stuff
      
      * upload
      
      * Update src/diffusers/pipelines/dit/pipeline_dit.py
      
      * finish more
      
      * finish docs
      
      * improve docs
      
      * finish docs
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: William Berman <WLBberman@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
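
      For reference, a usage sketch of the class-conditional DiT pipeline (checkpoint id assumed from the released DiT-XL weights):

      ```python
      import torch
      from diffusers import DiTPipeline

      pipe = DiTPipeline.from_pretrained("facebook/DiT-XL-2-256", torch_dtype=torch.float16)
      pipe = pipe.to("cuda")

      # DiT is class-conditional on ImageNet; map human-readable labels to class ids
      class_ids = pipe.get_label_ids(["white shark", "golden retriever"])

      generator = torch.manual_seed(33)
      images = pipe(class_labels=class_ids, num_inference_steps=25, generator=generator).images
      ```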
    • [Flax] Add Flax inpainting impl (#1966) · a43bdd01
      Jerry Jiarui XU authored
      * [Flax] Add Flax inpainting impl
      
      * fixed copies, add README.md
      
      * fixed README.md
      
      * add test
      
      * format
      
      * update README.md
  6. 30 Dec, 2022 1 commit
  7. 28 Dec, 2022 1 commit
    • unCLIP image variation (#1781) · 53c8147a
      Will Berman authored
      * unCLIP image variation
      
      * remove prior comment re: @pcuenca
      
      * stable diffusion -> unCLIP re: @pcuenca
      
      * add copy froms re: @patil-suraj
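
      A minimal usage sketch of the image-variation pipeline (the checkpoint id and input URL are assumptions for illustration):

      ```python
      from diffusers import UnCLIPImageVariationPipeline
      from diffusers.utils import load_image

      # assumed checkpoint id for the Karlo image-variation weights
      pipe = UnCLIPImageVariationPipeline.from_pretrained("kakaobrain/karlo-v1-alpha-image-variations")

      init_image = load_image("https://example.com/input.png")  # hypothetical input
      image = pipe(image=init_image).images[0]
      ```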
  8. 20 Dec, 2022 1 commit
    • Add Flax stable diffusion img2img pipeline (#1355) · a9190bad
      Dhruv Naik authored
      
      
      * add flax img2img pipeline
      
      * update pipeline
      
      * black format file
      
      * remove arg from get_timesteps
      
      * update get_timesteps
      
      * fix bug: make use of timesteps in the for-loop
      
      * black file
      
      * black, isort, flake8
      
      * update docstring
      
      * update readme
      
      * update flax img2img readme
      
      * update sd pipeline init
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion_img2img.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_flax_stable_diffusion_img2img.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * update inits
      
      * revert change
      
      * update var name to image, typo
      
      * update readme
      
      * return new t_start instead of modified timestep
      
      * black format
      
      * isort files
      
      * update docs
      
      * fix-copies
      
      * update prng_seed typing
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
  9. 18 Dec, 2022 1 commit
    • kakaobrain unCLIP (#1428) · 2dcf64b7
      Will Berman authored
      
      
      * [wip] attention block updates
      
      * [wip] unCLIP unet decoder and super res
      
      * [wip] unCLIP prior transformer
      
      * [wip] scheduler changes
      
      * [wip] text proj utility class
      
      * [wip] UnCLIPPipeline
      
      * [wip] kakaobrain unCLIP convert script
      
      * [unCLIP pipeline] fixes re: @patrickvonplaten
      
      remove callbacks
      
      move denoising loops into call function
      
      * UNCLIPScheduler re: @patrickvonplaten
      
      Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
      DDPM scheduler with changes to support karlo
      
      * mask -> attention_mask re: @patrickvonplaten
      
      * [DDPMScheduler] remove leftover change
      
      * [docs] PriorTransformer
      
      * [docs] UNet2DConditionModel and UNet2DModel
      
      * [nit] UNCLIPScheduler -> UnCLIPScheduler
      
      matches existing unclip naming better
      
      * [docs] SchedulingUnCLIP
      
      * [docs] UnCLIPTextProjModel
      
      * refactor
      
      * finish licenses
      
      * rename all to attention_mask and prep in models
      
      * more renaming
      
      * don't expose unused configs
      
      * final renaming fixes
      
      * remove x attn mask when not necessary
      
      * configure kakao script to use new class embedding config
      
      * fix copies
      
      * [tests] UnCLIPScheduler
      
      * finish x attn
      
      * finish
      
      * remove more
      
      * rename condition blocks
      
      * clean more
      
      * Apply suggestions from code review
      
      * up
      
      * fix
      
      * [tests] UnCLIPPipelineFastTests
      
      * remove unused imports
      
      * [tests] UnCLIPPipelineIntegrationTests
      
      * correct
      
      * make style
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
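
      For reference, a minimal usage sketch of the text-to-image unCLIP pipeline (checkpoint id assumed from the released Karlo weights):

      ```python
      import torch
      from diffusers import UnCLIPPipeline

      pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha", torch_dtype=torch.float16)
      pipe = pipe.to("cuda")

      # prior -> decoder -> super-resolution stages all run inside one call
      image = pipe("a high-resolution photograph of a big red frog on a green leaf").images[0]
      ```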
  10. 08 Dec, 2022 3 commits
  11. 07 Dec, 2022 1 commit
    • Add paint by example (#1533) · 896c98a2
      Patrick von Platen authored
      
      
      * add paint by example
      
      * make loading possible
      
      * up
      
      * Update src/diffusers/models/attention.py
      
      * up
      
      * finalize weight structure
      
      * make example work
      
      * make it work
      
      * up
      
      * up
      
      * fix
      
      * del
      
      * add
      
      * update
      
      * Apply suggestions from code review
      
      * correct transformer 2d
      
      * finish
      
      * up
      
      * up
      
      * up
      
      * up
      
      * fix
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Apply suggestions from code review
      
      * up
      
      * finish
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
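
      A minimal usage sketch: the pipeline edits a masked region of a scene to match a reference example image (the checkpoint id and input files are assumptions):

      ```python
      import torch
      from diffusers import PaintByExamplePipeline
      from diffusers.utils import load_image

      pipe = PaintByExamplePipeline.from_pretrained(
          "Fantasy-Studio/Paint-by-Example", torch_dtype=torch.float16  # assumed model id
      ).to("cuda")

      # hypothetical inputs: scene to edit, mask of the region, and reference example
      init_image = load_image("scene.png")
      mask_image = load_image("mask.png")
      example_image = load_image("reference.png")

      image = pipe(image=init_image, mask_image=mask_image, example_image=example_image).images[0]
      ```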
  12. 05 Dec, 2022 1 commit
    • add AudioDiffusionPipeline and LatentAudioDiffusionPipeline #1334 (#1426) · 48d0123f
      Robert Dargavel Smith authored
      
      
      * add AudioDiffusionPipeline and LatentAudioDiffusionPipeline
      
      * add docs to toc
      
      * fix tests
      
      * fix tests
      
      * fix tests
      
      * fix tests
      
      * fix tests
      
      * Update pr_tests.yml
      
      Fix tests
      
      * add colab notebook
      
      [Flax] Fix loading scheduler from subfolder (#1319)
      
      [FLAX] Fix loading scheduler from subfolder
      
      Fix/Enable all schedulers for in-painting (#1331)
      
      * inpaint fix k lms
      
      * onnx as well
      
      * up
      
      Correct path to scheduler (#1322)
      
      * [Examples] Correct path
      
      * uP
      
      Avoid nested fix-copies (#1332)
      
      * Avoid nested `# Copied from` statements during `make fix-copies`
      
      * style
      
      Fix img2img speed with LMS-Discrete Scheduler (#896)
      
      Casting `self.sigmas` into a different dtype (the one of original_samples) is not advisable. In my img2img pipeline this leads to a long running time in the `integrate.quad` call later on; by long I mean more than 10x slower.
      Co-authored-by: Anton Lozhkov <anton@huggingface.co>
      
      Fix the order of casts for onnx inpainting (#1338)
      
      Legacy Inpainting Pipeline for Onnx Models (#1237)
      
      * Add legacy inpainting pipeline compatibility for onnx
      
      * remove commented out line
      
      * Add onnx legacy inpainting test
      
      * Fix slow decorators
      
      * pep8 styling
      
      * isort styling
      
      * dummy object
      
      * ordering consistency
      
      * style
      
      * docstring styles
      
      * Refactor common prompt encoding pattern
      
      * Update tests to permanent repository home
      
      * support all available schedulers until ONNX IO binding is available
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * updated styling from PR suggested feedback
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      Jax infer support negative prompt (#1337)
      
      * support negative prompts in sd jax pipeline
      
      * pass batched neg_prompt
      
      * only encode when negative prompt is None
      Co-authored-by: Juan Acevedo <jfacevedo@google.com>
      
      Update README.md: Minor change to Imagic code snippet, missing dir error (#1347)
      
      Minor change to Imagic Readme
      
      Missing dir causes an error when running the example code.
      
      make style
      
      change the sample model (#1352)
      
      * Update alt_diffusion.mdx
      
      * Update alt_diffusion.mdx
      
      Add bit diffusion [WIP] (#971)
      
      * Create bit_diffusion.py
      
      Bit diffusion based on the paper, arXiv:2208.04202, Chen2022AnalogBG
      
      * adding bit diffusion to new branch
      
      ran tests
      
      * tests
      
      * tests
      
      * tests
      
      * tests
      
      * removed test folders + added to README
      
      * Update README.md
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * move Mel to module in pipeline construction, make librosa optional
      
      * fix imports
      
      * fix copy & paste error in comment
      
      * fix style
      
      * add missing register_to_config
      
      * fix class docstrings
      
      * fix class docstrings
      
      * tweak docstrings
      
      * tweak docstrings
      
      * update slow test
      
      * put trailing commas back
      
      * respect alphabetical order
      
      * remove LatentAudioDiffusion, make vqvae optional
      
      * move Mel from models back to pipelines :-)
      
      * allow loading of pretrained audiodiffusion models
      
      * fix tests
      
      * fix dummies
      
      * remove reference to latent_audio_diffusion in docs
      
      * unused import
      
      * inherit from SchedulerMixin to make loadable
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
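
      A minimal usage sketch of unconditional audio generation via mel spectrograms (the checkpoint id and output attribute names follow the pipeline docs; treat the details as illustrative):

      ```python
      from diffusers import AudioDiffusionPipeline

      # assumed checkpoint id from the PR author's hub account
      pipe = AudioDiffusionPipeline.from_pretrained("teticio/audio-diffusion-256")

      output = pipe()
      image = output.images[0]   # the generated mel spectrogram rendered as an image
      audio = output.audios[0]   # the waveform reconstructed from the spectrogram
      ```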
  13. 25 Nov, 2022 1 commit
    • StableDiffusionUpscalePipeline (#1396) · 9ec5084a
      Suraj Patil authored
      
      
      * StableDiffusionUpscalePipeline
      
      * fix a few things
      
      * make it better
      
      * fix image batching
      
      * run vae in fp32
      
      * fix docstr
      
      * resize to mul of 64
      
      * doc
      
      * remove safety_checker
      
      * add max_noise_level
      
      * fix Copied
      
      * begin tests
      
      * slow tests
      
      * default max_noise_level
      
      * remove kwargs
      
      * doc
      
      * fix
      
      * fix fast tests
      
      * fix fast tests
      
      * no sf
      
      * don't offload vae
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
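
      For reference, a usage sketch of the 4x upscaler (checkpoint id assumed from the released weights; the input file is hypothetical):

      ```python
      import torch
      from diffusers import StableDiffusionUpscalePipeline
      from diffusers.utils import load_image

      pipe = StableDiffusionUpscalePipeline.from_pretrained(
          "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
      ).to("cuda")

      low_res_img = load_image("low_res_cat.png").resize((128, 128))  # hypothetical input
      # noise_level (capped by the max_noise_level config mentioned above) controls
      # how much noise is added to the conditioning image
      image = pipe(prompt="a white cat", image=low_res_img, noise_level=20).images[0]
      ```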
  14. 23 Nov, 2022 2 commits
    • [Versatile Diffusion] Add versatile diffusion model (#1283) · 2625fb59
      Patrick von Platen authored
      
      
      * up
      
      * convert dual unet
      
      * revert dual attn
      
      * adapt for vd-official
      
      * test the full pipeline
      
      * mixed inference
      
      * mixed inference for text2img
      
      * add image prompting
      
      * fix clip norm
      
      * split text2img and img2img
      
      * fix format
      
      * refactor text2img
      
      * mega pipeline
      
      * add optimus
      
      * refactor image var
      
      * wip text_unet
      
      * text unet end to end
      
      * update tests
      
      * reshape
      
      * fix image to text
      
      * add some first docs
      
      * dual guided pipeline
      
      * fix token ratio
      
      * propose change
      
      * dual transformer as a native module
      
      * DualTransformer(nn.Module)
      
      * DualTransformer(nn.Module)
      
      * correct unconditional image
      
      * save-load with mega pipeline
      
      * remove image to text
      
      * up
      
      * uP
      
      * fix
      
      * up
      
      * final fix
      
      * remove_unused_weights
      
      * test updates
      
      * save progress
      
      * uP
      
      * fix dual prompts
      
      * some fixes
      
      * finish
      
      * style
      
      * finish renaming
      
      * up
      
      * fix
      
      * fix
      
      * fix
      
      * finish
      Co-authored-by: anton-l <anton@huggingface.co>
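
      A minimal usage sketch of the mega pipeline mentioned above, which exposes several tasks from one set of weights (checkpoint id and method name are assumptions based on the released model):

      ```python
      import torch
      from diffusers import VersatileDiffusionPipeline

      pipe = VersatileDiffusionPipeline.from_pretrained(
          "shi-labs/versatile-diffusion", torch_dtype=torch.float16  # assumed model id
      ).to("cuda")

      # the same weights also back image variation and dual-guided generation
      image = pipe.text_to_image("an astronaut riding a horse on mars").images[0]
      ```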
    • StableDiffusionImageVariationPipeline (#1365) · 0eb507f2
      Suraj Patil authored
      
      
      * add StableDiffusionImageVariationPipeline
      
      * add init
      
      * use CLIPVisionModelWithProjection
      
      * fix _encode_image
      
      * add copied from
      
      * fix copies
      
      * add doc
      
      * handle tensor in _encode_image
      
      * add tests
      
      * correct model_id
      
      * remove copied from in enable_sequential_cpu_offload
      
      * fix tests
      
      * make slow tests pass
      
      * update slow tests
      
      * use temp model for now
      
      * fix test_stable_diffusion_img_variation_intermediate_state
      
      * fix test_stable_diffusion_img_variation_intermediate_state
      
      * check for torch.Tensor
      
      * quality
      
      * fix name
      
      * fix slow tests
      
      * install transformers from source
      
      * fix install
      
      * fix install
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * input_image -> image
      
      * remove deprecation warnings
      
      * fix test_stable_diffusion_img_variation_multiple_images
      
      * make flake happy
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
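
      A minimal usage sketch: the pipeline conditions on a CLIP image embedding instead of a text prompt (checkpoint id and input file are assumptions):

      ```python
      from diffusers import StableDiffusionImageVariationPipeline
      from diffusers.utils import load_image

      # assumed checkpoint id for the Lambda Labs image-variation weights
      pipe = StableDiffusionImageVariationPipeline.from_pretrained(
          "lambdalabs/sd-image-variations-diffusers"
      )

      init_image = load_image("input.png")  # hypothetical input image
      image = pipe(image=init_image, guidance_scale=3.0).images[0]
      ```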
  15. 22 Nov, 2022 1 commit
    • Add Safe Stable Diffusion Pipeline (#1244) · e50c25d8
      Manuel Brack authored
      
      
      * Add pipeline_stable_diffusion_safe.py to pipelines
      
      * Fix repository consistency
      
      Ran make fix-copies after adding new pipeline
      
      * Add Paper/Equation reference for parameters to doc string
      
      * Ensure code style and quality
      
      * Perform code refactoring
      
      * Fix copies inherited from merge with huggingface/main
      
      * Add docs
      
      * Fix code style
      
      * Fix errors in documentation
      
      * Fix refactoring error
      
      * remove debugging print statement
      
      * added Safe Latent Diffusion tests
      
      * Fix style
      
      * Fix style
      
      * Add pre-defined safety configurations
      
      * Fix line-break
      
      * fix some tests
      
      * finish
      
      * Change safety checker
      
      * Add missing safety_checker.py file
      
      * Remove unused imports
      Co-authored-by: PatrickSchrML <patrick_schramowski@hotmail.de>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
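
      A minimal usage sketch showing the pre-defined safety configurations added above (checkpoint id and import path are assumptions based on the released pipeline):

      ```python
      from diffusers import StableDiffusionPipelineSafe
      from diffusers.pipelines.stable_diffusion_safe import SafetyConfig

      pipe = StableDiffusionPipelineSafe.from_pretrained("AIML-TUDA/stable-diffusion-safe")

      prompt = "a photograph of an astronaut riding a horse"
      # SafetyConfig bundles the paper's hyperparameters at several strength levels
      image = pipe(prompt=prompt, **SafetyConfig.MEDIUM).images[0]
      ```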
  16. 18 Nov, 2022 1 commit
    • Legacy Inpainting Pipeline for Onnx Models (#1237) · 30220905
      Clayton Sims authored
      
      
      * Add legacy inpainting pipeline compatibility for onnx
      
      * remove commented out line
      
      * Add onnx legacy inpainting test
      
      * Fix slow decorators
      
      * pep8 styling
      
      * isort styling
      
      * dummy object
      
      * ordering consistency
      
      * style
      
      * docstring styles
      
      * Refactor common prompt encoding pattern
      
      * Update tests to permanent repository home
      
      * support all available schedulers until ONNX IO binding is available
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * updated styling from PR suggested feedback
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
  17. 15 Nov, 2022 1 commit
    • Add AltDiffusion (#1299) · 8a730645
      Patrick von Platen authored
      
      
      * add conversion script for vae
      
      * up
      
      * up
      
      * some fixes
      
      * add text model
      
      * use the correct config
      
      * add docs
      
      * move model in it's own file
      
      * move model in its own file
      
      * pass attention mask to text encoder
      
      * pass attn mask to uncond inputs
      
      * quality
      
      * fix image2image
      
      * add img2img in init
      
      * fix import
      
      * fix one more import
      
      * fix import, dummy objects
      
      * fix copied from
      
      * up
      
      * finish
      Co-authored-by: patil-suraj <surajp815@gmail.com>
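
      A minimal usage sketch (checkpoint id assumed from the released AltDiffusion weights):

      ```python
      import torch
      from diffusers import AltDiffusionPipeline

      pipe = AltDiffusionPipeline.from_pretrained("BAAI/AltDiffusion-m9", torch_dtype=torch.float16)
      pipe = pipe.to("cuda")

      # AltDiffusion swaps in a multilingual text encoder, so non-English prompts also work
      image = pipe("dark elf princess, highly detailed, fantasy art").images[0]
      ```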
  18. 09 Nov, 2022 1 commit
  19. 04 Nov, 2022 1 commit
    • Add CycleDiffusion pipeline using Stable Diffusion (#888) · 9d8943b7
      Chen Wu (吴尘) authored
      
      
      * Add CycleDiffusion pipeline for Stable Diffusion
      
      * Add the option of passing noise to DDIMScheduler
      
      Add the option of providing the noise itself to DDIMScheduler, instead of the random seed generator.
      
      * Update README.md
      
      * Update README.md
      
      * Update pipeline_stable_diffusion_cycle_diffusion.py
      
      * Update pipeline_stable_diffusion_cycle_diffusion.py
      
      * Update pipeline_stable_diffusion_cycle_diffusion.py
      
      * Update pipeline_stable_diffusion_cycle_diffusion.py
      
      * Update scheduling_ddim.py
      
      * Update import format
      
      * Update pipeline_stable_diffusion_cycle_diffusion.py
      
      * Update scheduling_ddim.py
      
      * Update src/diffusers/schedulers/scheduling_ddim.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/schedulers/scheduling_ddim.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/schedulers/scheduling_ddim.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/schedulers/scheduling_ddim.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/schedulers/scheduling_ddim.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update scheduling_ddim.py
      
      * Update scheduling_ddim.py
      
      * Update scheduling_ddim.py
      
      * add two tests
      
      * Update pipeline_stable_diffusion_cycle_diffusion.py
      
      * Update pipeline_stable_diffusion_cycle_diffusion.py
      
      * Update README.md
      
      * Rename pipeline name as suggested in the latest reviewer comment
      
      * Update test_pipelines.py
      
      * Update test_pipelines.py
      
      * Update test_pipelines.py
      
      * Update pipeline_stable_diffusion_cycle_diffusion.py
      
      * Remove the generator
      
      This generator does not control all randomness during sampling, which can be misleading.
      
      * Update optimal hyperparameters
      
      * Update src/diffusers/pipelines/stable_diffusion/README.md
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/README.md
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/README.md
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Apply suggestions from code review
      
      * uP
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_cycle_diffusion.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * up
      
      * up
      
      * Replace assert with ValueError
      
      * finish docs
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
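
      A minimal usage sketch: CycleDiffusion edits a real image by pairing a source prompt with a target prompt under a DDIM scheduler (parameter names and the input file are assumptions based on the pipeline docs):

      ```python
      import torch
      from diffusers import CycleDiffusionPipeline, DDIMScheduler
      from diffusers.utils import load_image

      model_id = "CompVis/stable-diffusion-v1-4"
      scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
      pipe = CycleDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler).to("cuda")

      init_image = load_image("horse.png")  # hypothetical source image

      image = pipe(
          prompt="An astronaut riding an elephant",
          source_prompt="An astronaut riding a horse",
          image=init_image,
          strength=0.8,
          guidance_scale=2,
          source_guidance_scale=1,
      ).images[0]
      ```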
  20. 03 Nov, 2022 2 commits
    • VQ-diffusion (#658) · ef2ea33c
      Will Berman authored
      
      
      * Changes for VQ-diffusion VQVAE
      
      Add specify dimension of embeddings to VQModel:
      `VQModel` will by default set the dimension of embeddings to the number
      of latent channels. The VQ-diffusion VQVAE has a smaller
      embedding dimension, 128, than number of latent channels, 256.
      
      Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
      unet block helpers. VQ-diffusion's VQVAE uses those two block types.
      
      * Changes for VQ-diffusion transformer
      
      Modify attention.py so SpatialTransformer can be used for
      VQ-diffusion's transformer.
      
      SpatialTransformer:
      - Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
      - `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
      - modified forward pass to take optional timestep embeddings
      
      ImagePositionalEmbeddings:
      - added to provide positional embeddings to discrete inputs for latent pixels
      
      BasicTransformerBlock:
      - norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
      - modified forward pass to take optional timestep embeddings
      
      CrossAttention:
      - now may optionally take a bias parameter for its query, key, and value linear layers
      
      FeedForward:
      - Internal layers are now configurable
      
      ApproximateGELU:
      - Activation function in VQ-diffusion's feedforward layer
      
      AdaLayerNorm:
      - Norm layer modified to incorporate timestep embeddings
      
      * Add VQ-diffusion scheduler
      
      * Add VQ-diffusion pipeline
      
      * Add VQ-diffusion convert script to diffusers
      
      * Add VQ-diffusion dummy objects
      
      * Add VQ-diffusion markdown docs
      
      * Add VQ-diffusion tests
      
      * some renaming
      
      * some fixes
      
      * more renaming
      
      * correct
      
      * fix typo
      
      * correct weights
      
      * finalize
      
      * fix tests
      
      * Apply suggestions from code review
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * finish
      
      * finish
      
      * up
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
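
      A minimal usage sketch of the finished pipeline (checkpoint id assumed from the released ITHQ weights):

      ```python
      import torch
      from diffusers import VQDiffusionPipeline

      pipe = VQDiffusionPipeline.from_pretrained(
          "microsoft/vq-diffusion-ithq", torch_dtype=torch.float16
      )
      pipe = pipe.to("cuda")

      # VQ-diffusion denoises discrete latent codes, then decodes them with the VQVAE
      image = pipe("teddy bear playing in the pool").images[0]
      ```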
    • feat: add repaint (#974) · d38c8043
      Revist authored
      
      
      * feat: add repaint
      
      * fix: fix quality check with `make fix-copies`
      
      * fix: remove old unnecessary arg
      
      * chore: change default to DDPM (looks better in experiments)
      
      * ".to(device)" changed to "device="
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * make generator device-specific
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * make generator device-specific and change shape
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * fix: add preprocessing for image and mask
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * fix: update test
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Update src/diffusers/pipelines/repaint/pipeline_repaint.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Add docs and examples
      
      * Fix toctree
      Co-authored-by: fja <fja@zurich.ibm.com>
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Anton Lozhkov <anton@huggingface.co>
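
      A minimal usage sketch: RePaint reuses an unconditional DDPM checkpoint for inpainting via its resampling schedule (the checkpoint id, input files, and hyperparameter values are assumptions based on the RePaint paper and pipeline docs):

      ```python
      from diffusers import RePaintPipeline, RePaintScheduler
      from diffusers.utils import load_image

      model_id = "google/ddpm-ema-celebahq-256"  # assumed unconditional DDPM checkpoint
      scheduler = RePaintScheduler.from_pretrained(model_id)
      pipe = RePaintPipeline.from_pretrained(model_id, scheduler=scheduler).to("cuda")

      original_image = load_image("face.png")  # hypothetical 256x256 input
      mask_image = load_image("mask.png")      # binary mask (see pipeline docs for the convention)

      output = pipe(
          image=original_image,
          mask_image=mask_image,
          num_inference_steps=250,
          jump_length=10,     # resampling schedule from the RePaint paper
          jump_n_sample=10,
      )
      image = output.images[0]
      ```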
  21. 25 Oct, 2022 1 commit
  22. 19 Oct, 2022 1 commit
  23. 18 Oct, 2022 2 commits
  24. 06 Oct, 2022 2 commits
  25. 03 Oct, 2022 1 commit
    • Fix import with Flax but without PyTorch (#688) · 688031c5
      Pedro Cuenca authored
      * Don't use `load_state_dict` if torch is not installed.
      
      * Define `SchedulerOutput` to use torch or flax arrays.
      
      * Don't import LMSDiscreteScheduler without torch.
      
      * Create distinct FlaxSchedulerOutput.
      
      * Additional changes required for FlaxSchedulerMixin
      
      * Do not import torch pipelines in Flax.
      
      * Revert "Define `SchedulerOutput` to use torch or flax arrays."
      
      This reverts commit f653140134b74d9ffec46d970eb46925fe3a409d.
      
      * Prefix Flax scheduler outputs for consistency.
      
      * make style
      
      * FlaxSchedulerOutput is now a dataclass.
      
      * Don't use f-string without placeholders.
      
      * Add blank line.
      
      * Style (docstrings)
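
      The general pattern behind these changes is to guard framework-specific imports behind availability checks; a sketch using the library's own helpers (their use here is illustrative, not the PR's exact diff):

      ```python
      from diffusers.utils import is_flax_available, is_torch_available

      if is_torch_available():
          import torch  # torch-only schedulers and pipelines are defined behind this guard

      if is_flax_available():
          import jax.numpy as jnp  # Flax scheduler outputs use jnp arrays instead
      ```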
  26. 20 Sep, 2022 1 commit
  27. 08 Sep, 2022 1 commit
  28. 07 Sep, 2022 1 commit
  29. 01 Sep, 2022 1 commit
  30. 30 Aug, 2022 1 commit
  31. 17 Aug, 2022 1 commit
  32. 14 Aug, 2022 1 commit
  33. 09 Aug, 2022 1 commit