1. 12 May, 2023 1 commit
  2. 11 May, 2023 1 commit
    • Sayak Paul's avatar
      [Tests] better determinism (#3374) · 90f5f3c4
      Sayak Paul authored
      * enable deterministic pytorch and cuda operations.
      
      * disable manual seeding.
      
      * make style && make quality for unet_2d tests.
      
      * enable determinism for the unet2dconditional model.
      
      * add CUBLAS_WORKSPACE_CONFIG for better reproducibility.
      
      * relax tolerance (very weird issue, though).
      
      * revert to torch manual_seed() where needed.
      
      * relax more tolerance.
      
      * better placement of the cuda variable and relax more tolerance.
      
      * enable determinism for 3d condition model.
      
      * relax tolerance.
      
      * add: determinism to alt_diffusion.
      
      * relax tolerance for alt diffusion.
      
      * dance diffusion.
      
      * dance diffusion is flaky.
      
      * test_dict_tuple_outputs_equivalent edit.
      
      * fix two more tests.
      
      * fix more ddim tests.
      
      * fix: argument.
      
      * change to diff in place of difference.
      
      * fix: test_save_load call.
      
      * test_save_load_float16 call.
      
      * fix: expected_max_diff
      
      * fix: paint by example.
      
      * relax tolerance.
      
      * add determinism to 1d unet model.
      
      * torch 2.0 regressions seem to be brutal
      
      * determinism to vae.
      
      * add reason to skipping.
      
      * up tolerance.
      
      * determinism to vq.
      
      * determinism to cuda.
      
      * determinism to the generic test pipeline file.
      
      * refactor general pipelines testing a bit.
      
      * determinism to alt diffusion i2i
      
      * up tolerance for alt diff i2i and audio diff
      
      * up tolerance.
      
      * determinism to audioldm
      
      * increase tolerance for audioldm lms.
      
      * increase tolerance for paint by paint.
      
      * increase tolerance for repaint.
      
      * determinism to cycle diffusion and sd 1.
      
      * relax tol for cycle diffusion 🚲
      
      * relax tol for sd 1.0
      
      * relax tol for controlnet.
      
      * determinism to img var.
      
      * relax tol for img variation.
      
      * tolerance to i2i sd
      
      * make style
      
      * determinism to inpaint.
      
      * relax tolerance for inpaiting.
      
      * determinism for inpainting legacy
      
      * relax tolerance.
      
      * determinism to instruct pix2pix
      
      * determinism to model editing.
      
      * model editing tolerance.
      
      * panorama determinism
      
      * determinism to pix2pix zero.
      
      * determinism to sag.
      
      * sd 2. determinism
      
      * sd. tolerance
      
      * disallow tf32 matmul.
      
      * relax tolerance is all you need.
      
      * make style and determinism to sd 2 depth
      
      * relax tolerance for depth.
      
      * tolerance to diffedit.
      
      * tolerance to sd 2 inpaint.
      
      * up tolerance.
      
      * determinism in upscaling.
      
      * tolerance in upscaler.
      
      * more tolerance relaxation.
      
      * determinism to v pred.
      
      * up tol for v_pred
      
      * unclip determinism
      
      * determinism to unclip img2img
      
      * determinism to text to video.
      
      * determinism to last set of tests
      
      * up tol.
      
      * vq cumsum doesn't have a deterministic kernel
      
      * relax tol
      
      * relax tol
      90f5f3c4
  3. 22 Apr, 2023 1 commit
  4. 20 Apr, 2023 2 commits
  5. 17 Apr, 2023 2 commits
  6. 13 Apr, 2023 1 commit
  7. 12 Apr, 2023 1 commit
  8. 11 Apr, 2023 2 commits
    • Will Berman's avatar
      add only cross attention to simple attention blocks (#3011) · c6180a31
      Will Berman authored
      * add only cross attention to simple attention blocks
      
      * add test for only_cross_attention re: @patrickvonplaten
      
      * mid_block_only_cross_attention better default
      
      allow mid_block_only_cross_attention to default to
      `only_cross_attention` when `only_cross_attention` is given
      as a single boolean
      c6180a31
    • Patrick von Platen's avatar
      Fix config prints and save, load of pipelines (#2849) · 8b451eb6
      Patrick von Platen authored
      * [Config] Fix config prints and save, load
      
      * Only use potential nn.Modules for dtype and device
      
      * Correct vae image processor
      
      * make sure in_channels is not accessed directly
      
      * make sure in channels is only accessed via config
      
      * Make sure schedulers only access config attributes
      
      * Make sure to access config in SAG
      
      * Fix vae processor and make style
      
      * add tests
      
      * uP
      
      * make style
      
      * Fix more naming issues
      
      * Final fix with vae config
      
      * change more
      8b451eb6
  9. 04 Apr, 2023 1 commit
  10. 31 Mar, 2023 1 commit
  11. 27 Mar, 2023 1 commit
  12. 23 Mar, 2023 2 commits
    • Sanchit Gandhi's avatar
      Add AudioLDM (#2232) · b94880e5
      Sanchit Gandhi authored
      
      
      * Add AudioLDM
      
      * up
      
      * add vocoder
      
      * start unet
      
      * unconditional unet
      
      * clap, vocoder and vae
      
      * clean-up: conversion scripts
      
      * fix: conversion script token_type_ids
      
      * clean-up: pipeline docstring
      
      * tests: from SD
      
      * clean-up: cpu offload vocoder instead of safety checker
      
      * feat: adapt tests to audioldm
      
      * feat: add docs
      
      * clean-up: amend pipeline docstrings
      
      * clean-up: make style
      
      * clean-up: make fix-copies
      
      * fix: add doc path to toctree
      
      * clean-up: args for conversion script
      
      * clean-up: paths to checkpoints
      
      * fix: use conditional unet
      
      * clean-up: make style
      
      * fix: type hints for UNet
      
      * clean-up: docstring for UNet
      
      * clean-up: make style
      
      * clean-up: remove duplicate in docstring
      
      * clean-up: make style
      
      * clean-up: make fix-copies
      
      * clean-up: move imports to start in code snippet
      
      * fix: pass cross_attention_dim as a list/tuple to unet
      
      * clean-up: make fix-copies
      
      * fix: update checkpoint path
      
      * fix: unet cross_attention_dim in tests
      
      * film embeddings -> class embeddings
      
      * Apply suggestions from code review
      Co-authored-by: default avatarWill Berman <wlbberman@gmail.com>
      
      * fix: unet film embed to use existing args
      
      * fix: unet tests to use existing args
      
      * fix: make style
      
      * fix: transformers import and version in init
      
      * clean-up: make style
      
      * Revert "clean-up: make style"
      
      This reverts commit 5d6d1f8b324f5583e7805dc01e2c86e493660d66.
      
      * clean-up: make style
      
      * clean-up: use pipeline tester mixin tests where poss
      
      * clean-up: skip attn slicing test
      
      * fix: add torch dtype to docs
      
      * fix: remove conversion script out of src
      
      * fix: remove .detach from 1d waveform
      
      * fix: reduce default num inf steps
      
      * fix: swap height/width -> audio_length_in_s
      
      * clean-up: make style
      
      * fix: remove nightly tests
      
      * fix: imports in conversion script
      
      * clean-up: slim-down to two slow tests
      
      * clean-up: slim-down fast tests
      
      * fix: batch consistent tests
      
      * clean-up: make style
      
      * clean-up: remove vae slicing fast test
      
      * clean-up: propagate changes to doc
      
      * fix: increase test tol to 1e-2
      
      * clean-up: finish docs
      
      * clean-up: make style
      
      * feat: vocoder / VAE compatibility check
      
      * feat: possibly expand / cut audio waveform
      
      * fix: pipeline call signature test
      
      * fix: slow tests output len
      
      * clean-up: make style
      
      * make style
      
      ---------
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarWilliam Berman <WLBberman@gmail.com>
      b94880e5
    • Pedro Cuenca's avatar
      Skip `mps` in text-to-video tests (#2792) · aa0531fa
      Pedro Cuenca authored
      * Skip mps in text-to-video tests.
      
      * style
      
      * Skip UNet3D mps tests.
      aa0531fa
  13. 22 Mar, 2023 2 commits
    • Pedro Cuenca's avatar
      `mps`: remove warmup passes (#2771) · 92e1164e
      Pedro Cuenca authored
      * Remove warmup passes in mps tests.
      
      * Update mps docs: no warmup pass in PyTorch 2
      
      * Update imports.
      92e1164e
    • Patrick von Platen's avatar
      [MS Text To Video] Add first text to video (#2738) · ca1a2229
      Patrick von Platen authored
      
      
      * [MS Text To Video} Add first text to video
      
      * upload
      
      * make first model example
      
      * match unet3d params
      
      * make sure weights are correcctly converted
      
      * improve
      
      * forward pass works, but diff result
      
      * make forward work
      
      * fix more
      
      * finish
      
      * refactor video output class.
      
      * feat: add support for a video export utility.
      
      * fix: opencv availability check.
      
      * run make fix-copies.
      
      * add: docs for the model components.
      
      * add: standalone pipeline doc.
      
      * edit docstring of the pipeline.
      
      * add: right path to TransformerTempModel
      
      * add: first set of tests.
      
      * complete fast tests for text to video.
      
      * fix bug
      
      * up
      
      * three fast tests failing.
      
      * add: note on slow tests
      
      * make work with all schedulers
      
      * apply styling.
      
      * add slow tests
      
      * change file name
      
      * update
      
      * more correction
      
      * more fixes
      
      * finish
      
      * up
      
      * Apply suggestions from code review
      
      * up
      
      * finish
      
      * make copies
      
      * fix pipeline tests
      
      * fix more tests
      
      * Apply suggestions from code review
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * apply suggestions
      
      * up
      
      * revert
      
      ---------
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      ca1a2229
  14. 21 Mar, 2023 1 commit
  15. 18 Mar, 2023 1 commit
  16. 17 Mar, 2023 1 commit
  17. 16 Mar, 2023 1 commit
  18. 15 Mar, 2023 2 commits
  19. 04 Mar, 2023 1 commit
  20. 03 Mar, 2023 1 commit
  21. 01 Mar, 2023 1 commit
  22. 24 Feb, 2023 1 commit
    • Pedro Cuenca's avatar
      `mps` test fixes (#2470) · 54bc882d
      Pedro Cuenca authored
      * Skip variant tests (UNet1d, UNetRL) on mps.
      
      mish op not yet supported.
      
      * Exclude a couple of panorama tests on mps
      
      They are too slow for fast CI.
      
      * Exclude mps panorama from more tests.
      
      * mps: exclude all fast panorama tests as they keep failing.
      54bc882d
  23. 13 Feb, 2023 1 commit
  24. 07 Feb, 2023 2 commits
    • Patrick von Platen's avatar
      Replace flake8 with ruff and update black (#2279) · a7ca03aa
      Patrick von Platen authored
      * before running make style
      
      * remove left overs from flake8
      
      * finish
      
      * make fix-copies
      
      * final fix
      
      * more fixes
      a7ca03aa
    • YiYi Xu's avatar
      Stable Diffusion Latent Upscaler (#2059) · 1051ca81
      YiYi Xu authored
      
      
      * Modify UNet2DConditionModel
      
      - allow skipping mid_block
      
      - adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`
      
      - allow user to set dimension for the timestep embedding (`time_embed_dim`)
      
      - the kernel_size for `conv_in` and `conv_out` is now configurable
      
      - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`
      
      - allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label))`
      
      - added 2 arguments `attn1_types` and `attn2_types`
      
        * currently we have argument `only_cross_attention`: when it's set to `True`, we will have a to the
      `BasicTransformerBlock` block with 2 cross-attention , otherwise we
      get a self-attention followed by a cross-attention; in k-upscaler, we need to have blocks that include just one cross-attention, or self-attention -> cross-attention;
      so I added `attn1_types` and `attn2_types` to the unet's argument list to allow user specify the attention types for the 2 positions in each block;  note that I stil kept
      the `only_cross_attention` argument for unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passing down to the down blocks
      
      - the position of downsample layer and upsample layer is now configurable
      
      - in k-upscaler unet, there is only one skip connection per each up/down block (instead of each layer in stable diffusion unet), added `skip_freq = "block"` to support
      this use case
      
      - if user passes attention_mask to unet, it will prepare the mask and pass a flag to cross attention processer to skip the `prepare_attention_mask` step
      inside cross attention block
      
      add up/down blocks for k-upscaler
      
      modify CrossAttention class
      
      - make the `dropout` layer in `to_out` optional
      
      - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. note that when it's used to do cross
      attention, to_k, to_v has to be linear because the `encoder_hidden_states` is not 2d
      
      - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states
      
      - `attention_dropout`: add an optional dropout on attention score
      
      adapt BasicTransformerBlock
      
      - add an ada groupnorm layer  to conditioning attention input with timestep embedding
      
      - allow skipping the FeedForward layer in between the attentions
      
      - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration
      
      update timestep embedding: add new act_fn  gelu and an optional act_2
      
      modified ResnetBlock2D
      
      - refactored with AdaGroupNorm class (the timestep scale shift normalization)
      
      - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv
      
      - add option to use input AdaGroupNorm on the input instead of groupnorm
      
      - add options to add a dropout layer after each conv
      
      - allow user to set the bias in conv_shortcut (needed for k-upscaler)
      
      - add gelu
      
      adding conversion script for k-upscaler unet
      
      add pipeline
      
      * fix attention mask
      
      * fix a typo
      
      * fix a bug
      
      * make sure model can be used with GPU
      
      * make pipeline work with fp16
      
      * fix an error in BasicTransfomerBlock
      
      * make style
      
      * fix typo
      
      * some more fixes
      
      * uP
      
      * up
      
      * correct more
      
      * some clean-up
      
      * clean time proj
      
      * up
      
      * uP
      
      * more changes
      
      * remove the upcast_attention=True from unet config
      
      * remove attn1_types, attn2_types etc
      
      * fix
      
      * revert incorrect changes up/down samplers
      
      * make style
      
      * remove outdated files
      
      * Apply suggestions from code review
      
      * attention refactor
      
      * refactor cross attention
      
      * Apply suggestions from code review
      
      * update
      
      * up
      
      * update
      
      * Apply suggestions from code review
      
      * finish
      
      * Update src/diffusers/models/cross_attention.py
      
      * more fixes
      
      * up
      
      * up
      
      * up
      
      * finish
      
      * more corrections of conversion state
      
      * act_2 -> act_2_fn
      
      * remove dropout_after_conv from ResnetBlock2D
      
      * make style
      
      * simplify KAttentionBlock
      
      * add fast test for latent upscaler pipeline
      
      * add slow test
      
      * slow test fp16
      
      * make style
      
      * add doc string for pipeline_stable_diffusion_latent_upscale
      
      * add api doc page for latent upscaler pipeline
      
      * deprecate attention mask
      
      * clean up embeddings
      
      * simplify resnet
      
      * up
      
      * clean up resnet
      
      * up
      
      * correct more
      
      * up
      
      * up
      
      * improve a bit more
      
      * correct more
      
      * more clean-ups
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * add docstrings for new unet config
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * # Copied from
      
      * encode the image if not latent
      
      * remove force casting vae to fp32
      
      * fix
      
      * add comments about preconditioning parameters from k-diffusion paper
      
      * attn1_type, attn2_type -> add_self_attention
      
      * clean up get_down_block and get_up_block
      
      * fix
      
      * fixed a typo(?) in ada group norm
      
      * update slice attention processer for cross attention
      
      * update slice
      
      * fix fast test
      
      * update the checkpoint
      
      * finish tests
      
      * fix-copies
      
      * fix-copy for modeling_text_unet.py
      
      * make style
      
      * make style
      
      * fix f-string
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * fix import
      
      * correct changes
      
      * fix resnet
      
      * make fix-copies
      
      * correct euler scheduler
      
      * add missing #copied from for preprocess
      
      * revert
      
      * fix
      
      * fix copies
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/models/cross_attention.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * clean up conversion script
      
      * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D
      
      * more
      
      * Update src/diffusers/models/unet_2d_condition.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * remove prepare_extra_step_kwargs
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * fix a typo in timestep embedding
      
      * remove num_image_per_prompt
      
      * fix fasttest
      
      * make style + fix-copies
      
      * fix
      
      * fix xformer test
      
      * fix style
      
      * doc string
      
      * make style
      
      * fix-copies
      
      * docstring for time_embedding_norm
      
      * make style
      
      * final finishes
      
      * make fix-copies
      
      * fix tests
      
      ---------
      Co-authored-by: default avataryiyixuxu <yixu@yis-macbook-pro.lan>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      1051ca81
  25. 26 Jan, 2023 1 commit
  26. 25 Jan, 2023 1 commit
    • Patrick von Platen's avatar
      Reproducibility 3/3 (#1924) · 6ba2231d
      Patrick von Platen authored
      
      
      * make tests deterministic
      
      * run slow tests
      
      * prepare for testing
      
      * finish
      
      * refactor
      
      * add print statements
      
      * finish more
      
      * correct some test failures
      
      * more fixes
      
      * set up to correct tests
      
      * more corrections
      
      * up
      
      * fix more
      
      * more prints
      
      * add
      
      * up
      
      * up
      
      * up
      
      * uP
      
      * uP
      
      * more fixes
      
      * uP
      
      * up
      
      * up
      
      * up
      
      * up
      
      * fix more
      
      * up
      
      * up
      
      * clean tests
      
      * up
      
      * up
      
      * up
      
      * more fixes
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSuraj Patil <surajp815@gmail.com>
      
      * make
      
      * correct
      
      * finish
      
      * finish
      Co-authored-by: default avatarSuraj Patil <surajp815@gmail.com>
      6ba2231d
  27. 18 Jan, 2023 1 commit
  28. 30 Dec, 2022 1 commit
  29. 20 Dec, 2022 1 commit
  30. 09 Dec, 2022 1 commit
  31. 05 Dec, 2022 2 commits
  32. 30 Nov, 2022 1 commit