1. 30 Jul, 2024 2 commits
    • Fix Stable Audio repository id (#9016) · ea1b4ea7
      Yoach Lacombe authored
      Fix Stable Audio repo id
    • Stable Audio integration (#8716) · 69e72b1d
      Yoach Lacombe authored

      * WIP modeling code and pipeline
      
      * add custom attention processor + custom activation + add to init
      
      * correct ProjectionModel forward
      
      * add stable audio to __init__
      
      * add autoencoder and update pipeline and modeling code
      
      * add half RoPE
      
      * add partial rotary v2
      
      * add temporary modifications to scheduler
      
      * add EDM DPM Solver
      
      * remove TODOs
      
      * clean GLU
      
      * move attn.group_norm to attn processor
      
      * revert back src/diffusers/schedulers/scheduling_dpmsolver_multistep.py
      
      * refactor GLU -> SwiGLU
      
      * remove redundant args
      
      * add channel multiples in autoencoder docstrings
      
      * changes in docstrings and copyright headers
      
      * clean pipeline
      
      * further cleaning
      
      * remove peft, lora, and FromOriginalModelMixin
      
      * Delete src/diffusers/pipelines/stable_audio/diffusers.code-workspace
      
      * make style
      
      * dummy models
      
      * fix copied from
      
      * add fast oobleck tests
      
      * add brownian tree
      
      * oobleck autoencoder slow tests
      
      * remove TODO
      
      * fast stable audio pipeline tests
      
      * add slow tests
      
      * make style
      
      * add first version of docs
      
      * gate the scheduler behind is_torchsde_available
      
      * fix slow test
      
      * test with input waveform
      
      * add input waveform
      
      * remove some todos
      
      * create stableaudio gaussian projection + make style
      
      * add pipeline to toctree
      
      * fix copied from
      
      * make quality
      
      * refactor timestep_features->time_proj
      
      * refactor joint_attention_kwargs->cross_attention_kwargs
      
      * remove forward_chunk
      
      * move StableAudioDiTModel to transformers folder
      
      * correct convert + remove partial rotary embed
      
      * apply suggestions from yiyixuxu -> removing attn.kv_heads
      
      * remove temb
      
      * remove cross_attention_kwargs
      
      * further removal of cross_attention_kwargs
      
      * remove text encoder autocast to fp16
      
      * continue removing autocast
      
      * make style
      
      * refactor how text and audio are embedded
      
      * add paper
      
      * update example code
      
      * make style
      
      * unify projection model forward + fix device placement
      
      * make style
      
      * remove fuse qkv
      
      * apply suggestions from review
      
      * Update src/diffusers/pipelines/stable_audio/pipeline_stable_audio.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * make style
      
      * smaller models in fast tests
      
      * pass sequential offloading fast tests
      
      * add docs for vae and autoencoder
      
      * make style and update example
      
      * remove useless import
      
      * add cosine scheduler
      
      * dummy classes
      
      * cosine scheduler docs
      
      * better description of scheduler
      
      ---------
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
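
      For orientation, a minimal usage sketch of the pipeline this PR adds, following the example in the diffusers docs (the checkpoint id is the one the follow-up fix #9016 points at):

      ```python
      import soundfile as sf
      import torch

      from diffusers import StableAudioPipeline

      # Load the pipeline added by this PR; fp16 keeps GPU memory modest.
      pipe = StableAudioPipeline.from_pretrained(
          "stabilityai/stable-audio-open-1.0", torch_dtype=torch.float16
      ).to("cuda")

      # Generate ~10 seconds of audio from a text prompt.
      audio = pipe(
          "The sound of a hammer hitting a wooden surface.",
          negative_prompt="Low quality.",
          num_inference_steps=200,
          audio_end_in_s=10.0,
      ).audios

      # audios is (batch, channels, samples); soundfile expects (samples, channels).
      sf.write("hammer.wav", audio[0].T.float().cpu().numpy(), pipe.vae.sampling_rate)
      ```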
  2. 24 Jul, 2024 1 commit
  3. 23 Jul, 2024 1 commit
  4. 22 Jul, 2024 1 commit
  5. 17 Jul, 2024 1 commit
  6. 11 Jul, 2024 1 commit
  7. 09 Jul, 2024 1 commit
  8. 08 Jul, 2024 1 commit
  9. 06 Jul, 2024 1 commit
  10. 04 Jul, 2024 1 commit
  11. 27 Jun, 2024 1 commit
    • Motion Model / Adapter versatility (#8301) · 3e0d128d
      Mathis Koroglu authored
      * Motion Model / Adapter versatility
      
      - allow a different number of layers per block
      - allow a different number of transformers per layer per block
      - allow a different number of motion attention heads per block
      - use the dropout argument in get_down/up_block in 3D blocks
      
      * Motion Model added arguments renamed & refactoring
      
      * Add test for asymmetric UNetMotionModel
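
      A sketch of the asymmetry this PR enables. `MotionAdapter`, `motion_layers_per_block`, and `motion_num_attention_heads` are existing diffusers names; the per-block tuple support is what the PR adds, and the concrete numbers are illustrative:

      ```python
      from diffusers import MotionAdapter

      # Per-block tuples instead of a single shared int; before this PR every
      # block had to use the same layer and head counts.
      adapter = MotionAdapter(
          block_out_channels=(320, 640, 1280, 1280),
          motion_layers_per_block=(1, 2, 2, 2),      # motion layers per block
          motion_num_attention_heads=(4, 4, 8, 8),   # attention heads per block
      )
      ```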
  12. 26 Jun, 2024 2 commits
  13. 25 Jun, 2024 1 commit
  14. 24 Jun, 2024 1 commit
  15. 21 Jun, 2024 1 commit
  16. 18 Jun, 2024 1 commit
  17. 12 Jun, 2024 1 commit
  18. 07 Jun, 2024 1 commit
    • [Core] support saving and loading of sharded checkpoints (#7830) · 7d887118
      Sayak Paul authored

      * feat: support saving a model in sharded checkpoints.
      
      * feat: make loading of sharded checkpoints work.
      
      * add tests
      
      * cleanse the loading logic a bit more.
      
      * more resilience while loading from the Hub.
      
      * parallelize shard downloads by using snapshot_download()
      
      * default to a shard size.
      
      * more fix
      
      * Empty-Commit
      
      * debug
      
      * fix
      
      * quality
      
      * more debugging
      
      * fix more
      
      * initial comments from Benjamin
      
      * move certain methods to loading_utils
      
      * add test to check if the correct number of shards are present.
      
      * add a test to check if loading of sharded checkpoints from the Hub is okay
      
      * clarify the unit when passed as an int.
      
      * use hf_hub for sharding.
      
      * remove unnecessary code
      
      * remove unnecessary function
      
      * lucain's comments.
      
      * fixes
      
      * address high-level comments.
      
      * fix test
      
      * subfolder shenanigans.
      
      * Update src/diffusers/utils/hub_utils.py
      Co-authored-by: Lucain <lucainp@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: Lucain <lucainp@gmail.com>
      
      * remove _huggingface_hub_version as not needed.
      
      * address more feedback.
      
      * add a test for local_files_only=True.
      
      * need hf hub to be at least 0.23.2
      
      * style
      
      * final comment.
      
      * clean up subfolder.
      
      * deal with suffixes in code.
      
      * _add_variant default.
      
      * use weights_name_pattern
      
      * remove add_suffix_keyword
      
      * clean up downloading of sharded ckpts.
      
      * don't return something special when using index.json
      
      * fix more
      
      * don't use bare except
      
      * remove comments and catch the errors better
      
      * fix a couple of things when using is_file()
      
      * empty
      
      ---------
      Co-authored-by: Lucain <lucainp@gmail.com>
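
      The user-facing surface of this PR, as a minimal sketch; `max_shard_size` takes a string with a unit or an int in bytes, per the clarification commit above:

      ```python
      from diffusers import UNet2DConditionModel

      unet = UNet2DConditionModel.from_pretrained(
          "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet"
      )

      # Weights above the cap are split into numbered .safetensors shards
      # plus an index json listing which tensor lives in which shard.
      unet.save_pretrained("./sdxl-unet-sharded", max_shard_size="2GB")

      # from_pretrained discovers the index and reassembles the shards.
      unet = UNet2DConditionModel.from_pretrained("./sdxl-unet-sharded")
      ```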
  19. 05 Jun, 2024 1 commit
    • [LoRA] Remove legacy LoRA code and related adjustments (#8316) · a0542c19
      Sayak Paul authored
      * remove legacy code from load_attn_procs.
      
      * finish first draft
      
      * fix more.
      
      * fix more
      
      * add test
      
      * add serialization support.
      
      * fix-copies
      
      * require peft backend for lora tests
      
      * style
      
      * fix test
      
      * fix loading.
      
      * empty
      
      * address benjamin's feedback.
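
      With the legacy path gone, LoRA loading goes through the PEFT backend only; a minimal sketch (the adapter repo id is a placeholder):

      ```python
      import torch

      from diffusers import StableDiffusionPipeline

      # Requires `pip install peft`; the non-peft code path no longer exists.
      pipe = StableDiffusionPipeline.from_pretrained(
          "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
      ).to("cuda")

      pipe.load_lora_weights("some-user/some-lora", adapter_name="style")  # placeholder id
      pipe.set_adapters(["style"], adapter_weights=[0.8])
      image = pipe("a pixel-art castle").images[0]
      ```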
  20. 31 May, 2024 1 commit
    • [Core] Introduce class variants for `Transformer2DModel` (#7647) · 983dec3b
      Sayak Paul authored
      * init for patches
      
      * finish patched model.
      
      * continuous transformer
      
      * vectorized transformer2d.
      
      * style.
      
      * inits.
      
      * fix-copies.
      
      * introduce DiTTransformer2DModel.
      
      * fixes
      
      * use REMAPPING as suggested by @DN6
      
      * better logging.
      
      * add pixart transformer model.
      
      * inits.
      
      * caption_channels.
      
      * attention masking.
      
      * fix use_additional_conditions.
      
      * remove print.
      
      * debug
      
      * flatten
      
      * fix: assertion for sigma
      
      * handle remapping for modeling_utils
      
      * add tests for dit transformer2d
      
      * quality
      
      * placeholder for pixart tests
      
      * pixart tests
      
      * add _no_split_modules
      
      * add docs.
      
      * check
      
      * check
      
      * check
      
      * check
      
      * fix tests
      
      * fix tests
      
      * move Transformer output to modeling_output
      
      * move errors better and bring back use_additional_conditions attribute.
      
      * remove unnecessary things from DiT.
      
      * clean up pixart
      
      * fix remapping
      
      * fix device_map things in pixart2d.
      
      * replace Transformer2DModel with appropriate classes in dit, pixart tests
      
      * empty
      
      * legacy mixin classes.
      
      * use a remapping dict for fetching class names.
      
      * change to specific model types in the pipeline implementations.
      
      * move _fetch_remapped_cls_from_config to modeling_loading_utils.py
      
      * fix dependency problems.
      
      * add deprecation note.
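
      The remapping in practice, sketched; the checkpoint id is illustrative and the exact deprecation behavior should be checked against the final diff:

      ```python
      from diffusers import Transformer2DModel

      # A DiT-style checkpoint saved as Transformer2DModel is remapped on load
      # via the REMAPPING dict to the new specialized class.
      model = Transformer2DModel.from_pretrained(
          "facebook/DiT-XL-2-256", subfolder="transformer"  # illustrative checkpoint
      )
      print(type(model).__name__)  # DiTTransformer2DModel
      ```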
  21. 29 May, 2024 1 commit
  22. 22 May, 2024 1 commit
  23. 15 May, 2024 1 commit
    • Adding VQGAN Training script (#5483) · d27e996c
      Isamu Isozaki authored

      * Init commit
      
      * Removed einops
      
      * Added default movq config for training
      
      * Update explanation of prompts
      
      * Fixed inheritance of discriminator and init_tracker
      
      * Fixed incompatible API between muse and this codebase
      
      * Fixed output
      
      * Setup init training
      
      * Basic structure done
      
      * Removed attention for quick tests
      
      * Style fixes
      
      * Fixed vae/vqgan styles
      
      * Removed redefinition of wandb
      
      * Fixed log_validation and tqdm
      
      * Nothing commit
      
      * Added commit loss to lookup_from_codebook
      
      * Update src/diffusers/models/vq_model.py
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Adding preliminary README
      
      * Fixed one typo
      
      * Local changes
      
      * Fixed main issues
      
      * Merging
      
      * Update src/diffusers/models/vq_model.py
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Testing+Fixed bugs in training script
      
      * Some style fixes
      
      * Added wandb to docs
      
      * Fixed timm test
      
      * get testing suite ready.
      
      * remove return loss
      
      * remove return_loss
      
      * Remove diffs
      
      * Remove diffs
      
      * fix ruff format
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
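
      For orientation, a toy round-trip through the `VQModel` the script trains; the config is a tiny illustrative one, not the movq default mentioned above, and the real script adds the commit and adversarial losses on top of reconstruction:

      ```python
      import torch
      import torch.nn.functional as F

      from diffusers import VQModel

      # Tiny illustrative config: one encoder/decoder block, small codebook.
      model = VQModel(block_out_channels=(64,), num_vq_embeddings=128)

      x = torch.randn(1, 3, 32, 32)
      recon = model(x).sample       # encode -> vector-quantize -> decode
      loss = F.mse_loss(recon, x)   # reconstruction term only, as a sketch
      ```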
  24. 09 May, 2024 1 commit
    • [Refactor] Better align `from_single_file` logic with `from_pretrained` (#7496) · cb0f3b49
      Dhruv Nair authored

      * refactor unet single file loading a bit.
      
      * retrieve the unet from create_diffusers_unet_model_from_ldm
      
      * update (×45)
      
      * tests
      
      * update (×3)
      
      * Update docs/source/en/api/single_file.md (×2)
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * update (×13)
      
      * Update docs/source/en/api/loaders/single_file.md
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/loaders/single_file.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update docs/source/en/api/loaders/single_file.md (×4)
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * update (×31)
      
      ---------
      Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
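
      The aligned API, sketched with the SDXL base checkpoint URL used in the diffusers docs; after the refactor, configs are resolved from the Hub the way `from_pretrained` does it, rather than through hand-maintained conversion mappings:

      ```python
      import torch

      from diffusers import StableDiffusionXLPipeline

      pipe = StableDiffusionXLPipeline.from_single_file(
          "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0.safetensors",
          torch_dtype=torch.float16,
      ).to("cuda")
      ```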
  25. 03 May, 2024 1 commit
  26. 01 May, 2024 1 commit
  27. 30 Apr, 2024 1 commit
    • [Core] introduce _no_split_modules to `ModelMixin` (#6396) · 3fd31eef
      Sayak Paul authored
      * introduce _no_split_modules.
      
      * unnecessary spaces.
      
      * remove unnecessary kwargs and style
      
      * fix: accelerate imports.
      
      * change to _determine_device_map
      
      * add the blocks that have residual connections.
      
      * add: CrossAttnUpBlock2D
      
      * add: testing
      
      * style
      
      * line-spaces
      
      * quality
      
      * add disk offload test without safetensors.
      
      * checking disk offloading percentages.
      
      * change model split
      
      * add: utility for checking multi-gpu requirement.
      
      * model parallelism test
      
      * splits.
      
      * splits.
      
      * splits
      
      * splits.
      
      * splits.
      
      * splits.
      
      * offload folder to test_disk_offload_with_safetensors
      
      * add _no_split_modules
      
      * fix-copies
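
      What `_no_split_modules` unlocks, sketched, assuming `device_map="auto"` is accepted for bare models as the model-parallelism test above suggests:

      ```python
      from diffusers import UNet2DConditionModel

      # Blocks listed in _no_split_modules (e.g. CrossAttnUpBlock2D, per the
      # commits above) are kept whole on one device when accelerate computes
      # the placement, so residual connections never straddle devices.
      unet = UNet2DConditionModel.from_pretrained(
          "stabilityai/stable-diffusion-xl-base-1.0",
          subfolder="unet",
          device_map="auto",  # requires accelerate
      )
      print(unet.hf_device_map)
      ```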
  28. 25 Apr, 2024 1 commit
  29. 24 Apr, 2024 1 commit
  30. 19 Apr, 2024 2 commits
  31. 16 Apr, 2024 1 commit
    • Fixing implementation of ControlNet-XS (#6772) · fda1531d
      UmerHA authored

      * CheckIn - created DownSubBlocks
      
      * Added extra channels, implemented subblock fwd
      
      * Fixed connection sizes
      
      * checkin
      
      * Removed iter, next in forward
      
      * Models for SD21 & SDXL run through
      
      * Added back pipelines, cleared up connections
      
      * Cleaned up connection creation
      
      * added debug logs
      
      * updated logs
      
      * logs: added input loading
      
      * Update umer_debug_logger.py
      
      * log: Loading hint
      
      * Update umer_debug_logger.py
      
      * added logs
      
      * Changed debug logging
      
      * debug: added more logs
      
      * Fixed num_norm_groups
      
      * Debug: Logging all of SDXL input
      
      * Update umer_debug_logger.py
      
      * debug: updated logs
      
      * checkin
      
      * Readded tests
      
      * Removed debug logs
      
      * Fixed Slow Tests
      
      * Added value checks | Updated model_cpu_offload_seq
      
      * accelerate-offloading works ; fast tests work
      
      * Made unet & addon explicit in controlnet
      
      * Updated slow tests
      
      * Added dtype/device to ControlNetXS
      
      * Filled in test model paths
      
      * Added image_encoder/feature_extractor to XL pipe
      
      * Fixed fast tests
      
      * Added comments and docstrings
      
      * Fixed copies
      
      * Added docs ; Updates slow tests
      
      * Moved changes to UNetMidBlock2DCrossAttn
      
      * tiny cleanups
      
      * Removed stray prints
      
      * Removed ip adapters + freeU
      
      - Removed ip adapters + freeU as they don't make sense for ControlNet-XS
      - Fixed imports of UNet components
      
      * Fixed test_save_load_float16
      
      * Make style, quality, fix-copies
      
      * Changed loading/saving API for ControlNetXS
      
      - Changed loading/saving API for ControlNetXS
      - other small fixes
      
      * Removed ControlNet-XS from research examples
      
      * Make style, quality, fix-copies
      
      * Small fixes
      
      - deleted ControlNetXSModel.init_original
      - added time_embedding_mix to StableDiffusionControlNetXSPipeline .from_pretrained / StableDiffusionXLControlNetXSPipeline.from_pretrained
      - fixed copy hints
      
      * checkin May 11 '23
      
      * CheckIn Mar 12 '24
      
      * Fixed tests for SD
      
      * Added tests for UNetControlNetXSModel
      
      * Fixed SDXL tests
      
      * cleanup
      
      * Delete Pipfile
      
      * CheckIn Mar 20
      
      Started replacing sub-blocks with `ControlNetXSCrossAttnDownBlock2D` and `ControlNetXSCrossAttnUpBlock2D`
      
      * check-in Mar 23
      
      * checkin 24 Mar
      
      * Created init for UNetCnxs and CnxsAddon
      
      * CheckIn
      
      * Made from_modules, from_unet and no_control work
      
      * make style,quality,fix-copies & small changes
      
      * Fixed freezing
      
      * Added gradient ckpt'ing; fixed tests
      
      * Fix slow tests(+compile) ; clear naming confusion
      
      * Don't create UNet in init ; removed class_emb
      
      * Incorporated review feedback
      
      - Deleted get_base_pipeline /  get_controlnet_addon for pipes
      - Pipes inherit from StableDiffusionXLPipeline
      - Made module dicts for cnxs-addon's down/mid/up classes
      - Added support for qkv fusion and freeU
      
      * Make style, quality, fix-copies
      
      * Implemented review feedback
      
      * Removed compatibility check for vae/ctrl embedding
      
      * make style, quality, fix-copies
      
      * Delete Pipfile
      
      * Integrated review feedback
      
      - Importing ControlNetConditioningEmbedding now
      - get_down/mid/up_block_addon now outside class
      - renamed `do_control` to `apply_control`
      
      * Reduced size of test tensors
      
      For this, added `norm_num_groups` as parameter everywhere
      
      * Renamed cnxs-`Addon` to cnxs-`Adapter`
      
      - `ControlNetXSAddon` -> `ControlNetXSAdapter`
      - `ControlNetXSAddonDownBlockComponents` -> `DownBlockControlNetXSAdapter`, and similarly for mid/up
      - `get_mid_block_addon` -> `get_mid_block_adapter`, and similarly for mid/up
      
      * Fixed save_pretrained/from_pretrained bug
      
      * Removed redundant code
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
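
      The reworked API after the Addon -> Adapter rename, sketched; the adapter checkpoint id and the control image URL are illustrative placeholders:

      ```python
      import torch

      from diffusers import ControlNetXSAdapter, StableDiffusionXLControlNetXSPipeline
      from diffusers.utils import load_image

      controlnet = ControlNetXSAdapter.from_pretrained(
          "UmerHA/Testing-ConrolNetXS-SDXL-canny", torch_dtype=torch.float16  # illustrative id
      )
      pipe = StableDiffusionXLControlNetXSPipeline.from_pretrained(
          "stabilityai/stable-diffusion-xl-base-1.0",
          controlnet=controlnet,
          torch_dtype=torch.float16,
      ).to("cuda")

      canny = load_image("https://example.com/canny-edges.png")  # placeholder URL
      image = pipe("an astronaut riding a horse", image=canny).images[0]
      ```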
  32. 09 Apr, 2024 1 commit
  33. 05 Apr, 2024 1 commit
    • [Tests] reduce block sizes of UNet and VAE tests (#7560) · 1c60e094
      Sayak Paul authored
      * reduce block sizes for unet1d.
      
      * reduce blocks for unet_2d.
      
      * reduce block size for unet_motion
      
      * increase channels.
      
      * correctly increase channels.
      
      * reduce number of layers in unet2dconditionmodel tests.
      
      * reduce block sizes for unet2dconditionmodel tests
      
      * reduce block sizes for unet3dconditionmodel.
      
      * fix: test_feed_forward_chunking
      
      * fix: test_forward_with_norm_groups
      
      * skip spatiotemporal tests on MPS.
      
      * reduce block size in AutoencoderKL.
      
      * reduce block sizes for vqmodel.
      
      * further reduce block size.
      
      * make style.
      
      * Empty-Commit
      
      * reduce sizes for ConsistencyDecoderVAETests
      
      * further reduction.
      
      * further block reductions in AutoencoderKL and AsymmetricAutoencoderKL.
      
      * massively reduce the block size in unet2dconditionmodel.
      
      * reduce sizes for unet3d
      
      * fix tests in unet3d.
      
      * reduce blocks further in motion unet.
      
      * fix: output shape
      
      * add attention_head_dim to the test configuration.
      
      * remove unexpected keyword arg
      
      * up a bit.
      
      * groups.
      
      * up again
      
      * fix
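
      The pattern behind these commits, sketched: tiny block widths keep a full forward pass cheap in CI; the concrete values are illustrative, not the committed ones:

      ```python
      import torch

      from diffusers import UNet2DConditionModel

      # Tiny config: two blocks, 16/32 channels, norm groups sized to match.
      unet = UNet2DConditionModel(
          sample_size=16,
          in_channels=4,
          out_channels=4,
          down_block_types=("DownBlock2D", "CrossAttnDownBlock2D"),
          up_block_types=("CrossAttnUpBlock2D", "UpBlock2D"),
          block_out_channels=(16, 32),
          norm_num_groups=16,
          layers_per_block=1,
          cross_attention_dim=8,
          attention_head_dim=2,
      )
      out = unet(
          torch.randn(1, 4, 16, 16),
          timestep=1,
          encoder_hidden_states=torch.randn(1, 4, 8),
      ).sample
      assert out.shape == (1, 4, 16, 16)
      ```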
  34. 29 Mar, 2024 2 commits
  35. 26 Mar, 2024 2 commits