1. 19 May, 2025 2 commits
  2. 15 May, 2025 2 commits
  3. 13 May, 2025 1 commit
    • LTX Video 0.9.7 (#11516) · 06fee551
      Aryan authored
      
      
      * add upsampling pipeline
      
      * ltx upsample pipeline conversion; pipeline fixes
      
      * make fix-copies
      
      * remove print
      
      * add vae convenience methods
      
      * update
      
      * add tests
      
      * support denoising strength for upscaling & video-to-video
      
      * update docs
      
      * update doc checkpoints
      
      * update docs
      
      * fix
      
      ---------
      Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
      06fee551
  4. 12 May, 2025 2 commits
    • Add VisualCloze (#11377) · 4f438de3
      Zhong-Yu Li authored
      * VisualCloze
      
      * style quality
      
      * add docs
      
      * add docs
      
      * typo
      
      * Update docs/source/en/api/pipelines/visualcloze.md
      
      * delete einops
      
      * style quality
      
      * Update src/diffusers/pipelines/visualcloze/pipeline_visualcloze.py
      
      * reorg
      
      * refine doc
      
      * style quality
      
      * typo
      
      * typo
      
      * Update src/diffusers/image_processor.py
      
      * add comment
      
      * test
      
      * style
      
      * Modified based on review
      
      * style
      
      * restore image_processor
      
      * update example url
      
      * style
      
      * fix-copies
      
      * VisualClozeGenerationPipeline
      
      * combine
      
      * tests docs
      
      * remove VisualClozeUpsamplingPipeline
      
      * style
      
      * quality
      
      * test examples
      
      * quality style
      
      * typo
      
      * make fix-copies
      
      * fix test_callback_cfg and test_save_load_dduf in VisualClozePipelineFastTests
      
      * add EXAMPLE_DOC_STRING to VisualClozeGenerationPipeline
      
      * delete maybe_free_model_hooks from pipeline_visualcloze_combined
      
      * Apply suggestions from code review
      
      * fix test_save_load_local test; add reason for skipping cfg test
      
      * more save_load test fixes
      
      * fix tests in generation pipeline tests
      4f438de3
    • Hunyuan Video Framepack F1 (#11534) · e48f6aee
      Aryan authored
      * support framepack f1
      
      * update docs
      
      * update toctree
      
      * remove typo
      e48f6aee
  5. 09 May, 2025 1 commit
    • feat: pipeline-level quantization config (#11130) · 599c8871
      Sayak Paul authored
      
      
      * feat: pipeline-level quant config.
      Co-authored-by: SunMarc <marc.sun@hotmail.fr>
      
      condition better.
      
      support mapping.
      
      improvements.
      
      [Quantization] Add Quanto backend (#10756)
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * Update docs/source/en/quantization/quanto.md
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * Update src/diffusers/quantizers/quanto/utils.py
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * update
      
      * update
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      [Single File] Add single file loading for SANA Transformer (#10947)
      
      * added support for from_single_file
      
      * added diffusers mapping script
      
      * added testcase
      
      * bug fix
      
      * updated tests
      
      * corrected code quality
      
      * corrected code quality
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      [LoRA] Improve warning messages when LoRA loading becomes a no-op (#10187)
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * notebooks revert
      
      * fix-copies.
      
      * seeing
      
      * fix
      
      * revert
      
      * fixes
      
      * fixes
      
      * fixes
      
      * remove print
      
      * fix
      
      * conflicts ii.
      
      * updates
      
      * fixes
      
      * better filtering of prefix.
      
      ---------
      Co-authored-by: hlky <hlky@hlky.ac>
      
      [LoRA] CogView4 (#10981)
      
      * update
      
      * make fix-copies
      
      * update
      
      [Tests] improve quantization tests by additionally measuring the inference memory savings (#11021)
      
      * memory usage tests
      
      * fixes
      
      * gguf
      
      [`Research Project`] Add AnyText: Multilingual Visual Text Generation And Editing (#8998)
      
      * Add initial template
      
      * Second template
      
      * feat: Add TextEmbeddingModule to AnyTextPipeline
      
      * feat: Add AuxiliaryLatentModule template to AnyTextPipeline
      
      * Add bert tokenizer from the anytext repo for now
      
      * feat: Update AnyTextPipeline's modify_prompt method
      
      This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe.
      
      * Fill in the `forward` pass of `AuxiliaryLatentModule`
      
      * `make style && make quality`
      
      * `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library`
      
      * Update error handling to raise and logging
      
      * Add `create_glyph_lines` function into `TextEmbeddingModule`
      
      * make style
      
      * Up
      
      * Up
      
      * Up
      
      * Up
      
      * Remove several comments
      
      * refactor: Remove ControlNetConditioningEmbedding and update code accordingly
      
      * Up
      
      * Up
      
      * up
      
      * refactor: Update AnyTextPipeline to include new optional parameters
      
      * up
      
      * feat: Add OCR model and its components
      
      * chore: Update `TextEmbeddingModule` to include OCR model components and dependencies
      
      * chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task
      
      * `make style`
      
      * refactor: Update `AnyTextPipeline`'s docstring
      
      * Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once
      
      * simplify
      
      * `make style`
      
      * Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function
      
      * Simplify for now
      
      * `make style`
      
      * Up
      
      * feat: Add scripts to convert AnyText controlnet to diffusers
      
      * `make style`
      
      * Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule`
      
      * make style
      
      * Up
      
      * Simplify
      
      * Up
      
      * feat: Add safetensors module for loading model file
      
      * Fix device issues
      
      * Up
      
      * Up
      
      * refactor: Simplify
      
      * refactor: Simplify code for loading models and handling data types
      
      * `make style`
      
      * refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule
      
      * refactor: Update dtype in embedding_manager.py to match proj.weight
      
      * Up
      
      * Add attribution and adaptation information to pipeline_anytext.py
      
      * Update usage example
      
      * Will refactor `controlnet_cond_embedding` initialization
      
      * Add `AnyTextControlNetConditioningEmbedding` template
      
      * Refactor organization
      
      * style
      
      * style
      
      * Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding`
      
      * Follow one-file policy
      
      * style
      
      * [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel
      
      * [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py
      
      * [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py
      
      * Refactor AnyTextControlNet to use configurable conditioning embedding channels
      
      * Complete control net conditioning embedding in AnyTextControlNetModel
      
      * up
      
      * [FIX] Ensure embeddings use correct device in AnyTextControlNetModel
      
      * up
      
      * up
      
      * style
      
      * [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline
      
      * [UPDATE] Update example code in anytext.py to use correct font file and improve clarity
      
      * down
      
      * [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing
      
      * update pillow
      
      * [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity
      
      * [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file
      
      * [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency
      
      * 🆙
      
      
      
      * style
      
      * [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py
      
      * style
      
      * Update examples/research_projects/anytext/README.md
      Co-authored-by: Aryan <contact.aryanvs@gmail.com>
      
      * Remove commented-out image preparation code in AnyTextPipeline
      
      * Remove unnecessary blank line in README.md
      
      [Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6  (#11018)
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      fix: mixture tiling sdxl pipeline - adjust generating time_ids & embeddings  (#11012)
      
      small fix on generating time_ids & embeddings
      
      [LoRA] support wan i2v loras from the world. (#11025)
      
      * support wan i2v loras from the world.
      
      * remove copied from.
      
      * updates
      
      * add lora.
      
      Fix SD3 IPAdapter feature extractor (#11027)
      
      chore: fix help messages in advanced diffusion examples (#10923)
      
      Fix missing **kwargs in lora_pipeline.py (#11011)
      
      * Update lora_pipeline.py
      
      * Apply style fixes
      
      * fix-copies
      
      ---------
      Co-authored-by: hlky <hlky@hlky.ac>
      Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
      
      Fix for multi-GPU WAN inference (#10997)
      
      Ensure that hidden_state and shift/scale are on the same device when running with multiple GPUs
      
      Co-authored-by: Jimmy <39@🇺🇸.com>
      
      [Refactor] Clean up import utils boilerplate (#11026)
      
      * update
      
      * update
      
      * update
      
      Use `output_size` in `repeat_interleave` (#11030)
      
      [hybrid inference 🍯🐝] Add VAE encode (#11017)
      
      * [hybrid inference 🍯🐝] Add VAE encode
      
      * _toctree: add vae encode
      
      * Add endpoints, tests
      
      * vae_encode docs
      
      * vae encode benchmarks
      
      * api reference
      
      * changelog
      
      * Update docs/source/en/hybrid_inference/overview.md
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * update
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      Wan Pipeline scaling fix, type hint warning, multi generator fix (#11007)
      
      * Wan Pipeline scaling fix, type hint warning, multi generator fix
      
      * Apply suggestions from code review
      
      [LoRA] change to warning from info when notifying the users about a LoRA no-op (#11044)
      
      * move to warning.
      
      * test related changes.
      
      Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline (#10827)
      
      * Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline
      
      ---------
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      making `formatted_images` initialization compact (#10801)
      
      compact writing
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      Fix aclnnRepeatInterleaveIntWithDim error on NPU for get_1d_rotary_pos_embed (#10820)
      
      * get_1d_rotary_pos_embed support npu
      
      * Update src/diffusers/models/embeddings.py
      
      ---------
      Co-authored-by: Kai zheng <kaizheng@KaideMacBook-Pro.local>
      Co-authored-by: hlky <hlky@hlky.ac>
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      [Tests] restrict memory tests for quanto for certain schemes. (#11052)
      
      * restrict memory tests for quanto for certain schemes.
      
      * Apply suggestions from code review
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      * fixes
      
      * style
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      [LoRA] feat: support non-diffusers wan t2v loras. (#11059)
      
      feat: support non-diffusers wan t2v loras.
      
      [examples/controlnet/train_controlnet_sd3.py] Fixes #11050 - Cast prompt_embeds and pooled_prompt_embeds to weight_dtype to prevent dtype mismatch (#11051)
      
      Fix: dtype mismatch of prompt embeddings in sd3 controlnet training
      Co-authored-by: Andreas Jörg <andreasjoerg@MacBook-Pro-von-Andreas-2.fritz.box>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      reverts accidental change that removes attn_mask in attn. Improves fl… (#11065)
      
      reverts accidental change that removes attn_mask in attn. Improves flux ptxla by using flash block sizes. Moves encoding outside the for loop.
      Co-authored-by: Juan Acevedo <jfacevedo@google.com>
      
      Fix deterministic issue when getting pipeline dtype and device (#10696)
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      [Tests] add requires peft decorator. (#11037)
      
      * add requires peft decorator.
      
      * install peft conditionally.
      
      * conditional deps.
      Co-authored-by: DN6 <dhruv.nair@gmail.com>
      
      ---------
      Co-authored-by: DN6 <dhruv.nair@gmail.com>
      
      CogView4 Control Block (#10809)
      
      * cogview4 control training
      
      ---------
      Co-authored-by: OleehyO <leehy0357@gmail.com>
      Co-authored-by: yiyixuxu <yixu310@gmail.com>
      
      [CI] pin transformers version for benchmarking. (#11067)
      
      pin transformers version for benchmarking.
      
      updates
      
      Fix Wan I2V Quality (#11087)
      
      * fix_wan_i2v_quality
      
      * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update pipeline_wan_i2v.py
      
      ---------
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      Co-authored-by: hlky <hlky@hlky.ac>
      
      LTX 0.9.5 (#10968)
      
      * update
      
      ---------
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      Co-authored-by: hlky <hlky@hlky.ac>
      
      make PR GPU tests conditioned on styling. (#11099)
      
      Group offloading improvements (#11094)
      
      update
      
      Fix pipeline_flux_controlnet.py (#11095)
      
      * Fix pipeline_flux_controlnet.py
      
      * Fix style
      
      update readme instructions. (#11096)
      Co-authored-by: Juan Acevedo <jfacevedo@google.com>
      
      Resolve stride mismatch in UNet's ResNet to support Torch DDP (#11098)
      
      Modify UNet's ResNet implementation to resolve stride mismatch in Torch's DDP
      
      Fix Group offloading behaviour when using streams (#11097)
      
      * update
      
      * update
      
      Quality options in `export_to_video` (#11090)
      
      * Quality options in `export_to_video`
      
      * make style
      
      improve more.
      
      add placeholders for docstrings.
      
      formatting.
      
      smol fix.
      
      solidify validation and annotation
      
      * Revert "feat: pipeline-level quant config."
      
      This reverts commit 316ff46b7648bfa24525ac02c284afcf440404aa.
      
      * feat: implement pipeline-level quantization config
      Co-authored-by: SunMarc <marc@huggingface.co>
      
      * update
      
      * fixes
      
      * fix validation.
      
      * add tests and other improvements.
      
      * add tests
      
      * import quality
      
      * remove prints.
      
      * add docs.
      
      * fixes to docs.
      
      * doc fixes.
      
      * doc fixes.
      
      * add validation to the input quantization_config.
      
      * clarify recommendations.
      
      * docs
      
      * add to ci.
      
      * todo.
      
      ---------
      Co-authored-by: SunMarc <marc@huggingface.co>
      599c8871
  6. 07 May, 2025 1 commit
    • Cosmos (#10660) · 7b904941
      Aryan authored
      
      
      * begin transformer conversion
      
      * refactor
      
      * refactor
      
      * refactor
      
      * refactor
      
      * refactor
      
      * refactor
      
      * update
      
      * add conversion script
      
      * add pipeline
      
      * make fix-copies
      
      * remove einops
      
      * update docs
      
      * gradient checkpointing
      
      * add transformer test
      
      * update
      
      * debug
      
      * remove prints
      
      * match sigmas
      
      * add vae pt. 1
      
      * finish CV* vae
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * make fix-copies
      
      * update
      
      * make fix-copies
      
      * fix
      
      * update
      
      * update
      
      * make fix-copies
      
      * update
      
      * update tests
      
      * handle device and dtype for safety checker; required in latest diffusers
      
      * remove enable_gqa and use repeat_interleave instead
      
      * enforce safety checker; use dummy checker in fast tests
      
      * add review suggestion for ONNX export
      Co-Authored-By: Asfiya Baig <asfiyab@nvidia.com>
      
      * fix safety_checker issues when not passed explicitly
      
      We could either do what's done in this commit, or update the Cosmos examples to explicitly pass the safety checker
      
      * use cosmos guardrail package
      
      * auto format docs
      
      * update conversion script to support 14B models
      
      * update name CosmosPipeline -> CosmosTextToWorldPipeline
      
      * update docs
      
      * fix docs
      
      * fix group offload test failing for vae
      
      ---------
      Co-authored-by: Asfiya Baig <asfiyab@nvidia.com>
      7b904941
  7. 06 May, 2025 3 commits
  8. 02 May, 2025 1 commit
  9. 01 May, 2025 2 commits
  10. 24 Apr, 2025 2 commits
  11. 22 Apr, 2025 1 commit
    • [LoRA] add LoRA support to HiDream and fine-tuning script (#11281) · e30d3bf5
      Linoy Tsaban authored
      
      
      * initial commit
      
      * initial commit
      
      * initial commit
      
      * initial commit
      
      * initial commit
      
      * initial commit
      
      * Update examples/dreambooth/train_dreambooth_lora_hidream.py
      Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>
      
      * move prompt embeds, pooled embeds outside
      
      * Update examples/dreambooth/train_dreambooth_lora_hidream.py
      Co-authored-by: hlky <hlky@hlky.ac>
      
      * Update examples/dreambooth/train_dreambooth_lora_hidream.py
      Co-authored-by: hlky <hlky@hlky.ac>
      
      * fix import
      
      * fix import and tokenizer 4, text encoder 4 loading
      
      * te
      
      * prompt embeds
      
      * fix naming
      
      * shapes
      
      * initial commit to add HiDreamImageLoraLoaderMixin
      
      * fix init
      
      * add tests
      
      * loader
      
      * fix model input
      
      * add code example to readme
      
      * fix default max length of text encoders
      
      * prints
      
      * nullify training cond in unpatchify for temp fix to incompatible shaping of transformer output during training
      
      * smol fix
      
      * unpatchify
      
      * unpatchify
      
      * fix validation
      
      * flip pred and loss
      
      * fix shift!!!
      
      * revert unpatchify changes (for now)
      
      * smol fix
      
      * Apply style fixes
      
      * workaround moe training
      
      * workaround moe training
      
      * remove prints
      
      * to reduce some memory, keep vae in `weight_dtype` same as we have for flux (as it's the same vae)
      https://github.com/huggingface/diffusers/blob/bbd0c161b55ba2234304f1e6325832dd69c60565/examples/dreambooth/train_dreambooth_lora_flux.py#L1207
      
      
      
      * refactor to align with HiDream refactor
      
      * refactor to align with HiDream refactor
      
      * refactor to align with HiDream refactor
      
      * add support for cpu offloading of text encoders
      
      * Apply style fixes
      
      * adjust lr and rank for train example
      
      * fix copies
      
      * Apply style fixes
      
      * update README
      
      * update README
      
      * update README
      
      * fix license
      
      * keep prompt2,3,4 as None in validation
      
      * remove reverse ode comment
      
      * Update examples/dreambooth/train_dreambooth_lora_hidream.py
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Update examples/dreambooth/train_dreambooth_lora_hidream.py
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * vae offload change
      
      * fix text encoder offloading
      
      * Apply style fixes
      
      * cleaner to_kwargs
      
      * fix module name in copied from
      
      * add requirements
      
      * fix offloading
      
      * fix offloading
      
      * fix offloading
      
      * update transformers version in reqs
      
      * try AutoTokenizer
      
      * try AutoTokenizer
      
      * Apply style fixes
      
      * empty commit
      
      * Delete tests/lora/test_lora_layers_hidream.py
      
      * change tokenizer_4 to load with AutoTokenizer as well
      
      * make text_encoder_four and tokenizer_four configurable
      
      * save model card
      
      * save model card
      
      * revert T5
      
      * fix test
      
      * remove non diffusers lumina2 conversion
      
      ---------
      Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>
      Co-authored-by: hlky <hlky@hlky.ac>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
      e30d3bf5
  12. 18 Apr, 2025 1 commit
  13. 17 Apr, 2025 2 commits
  14. 16 Apr, 2025 2 commits
  15. 15 Apr, 2025 2 commits
  16. 14 Apr, 2025 1 commit
  17. 13 Apr, 2025 2 commits
    • [ControlNet] Adds controlnet for SanaTransformer (#11040) · f1f38ffb
      Ishan Modi authored
      
      
      * added controlnet for sana transformer
      
      * improve code quality
      
      * addressed PR comments
      
      * bug fixes
      
      * added test cases
      
      * update
      
      * added dummy objects
      
      * addressed PR comments
      
      * update
      
      * Forcing update
      
      * add to docs
      
      * code quality
      
      * addressed PR comments
      
      * addressed PR comments
      
      * update
      
      * addressed PR comments
      
      * added proper styling
      
      * update
      
      * Revert "added proper styling"
      
      This reverts commit 344ee8a7014ada095b295034ef84341f03b0e359.
      
      * manually ordered
      
      * Apply suggestions from code review
      
      ---------
      Co-authored-by: Aryan <contact.aryanvs@gmail.com>
      f1f38ffb
    • Update autoencoderkl_allegro.md (#11303) · ed41db85
      Adrien B authored
      Correction typo
      ed41db85
  18. 11 Apr, 2025 1 commit
  19. 09 Apr, 2025 3 commits
  20. 08 Apr, 2025 3 commits
    • [feat] implement `record_stream` when using CUDA streams during group offloading (#11081) · 4b27c4a4
      Sayak Paul authored
      
      
      * implement record_stream for better performance.
      
      * fix
      
      * style.
      
      * merge #11097
      
      * Update src/diffusers/hooks/group_offloading.py
      Co-authored-by: Aryan <aryan@huggingface.co>
      
      * fixes
      
      * docstring.
      
      * remaining todos in low_cpu_mem_usage
      
      * tests
      
      * updates to docs.
      
      ---------
      Co-authored-by: Aryan <aryan@huggingface.co>
      4b27c4a4
    • [LoRA] Implement hot-swapping of LoRA (#9453) · fb544996
      Benjamin Bossan authored
      * [WIP][LoRA] Implement hot-swapping of LoRA
      
      This PR adds the possibility to hot-swap LoRA adapters. It is WIP.
      
      Description
      
      As of now, users can already load multiple LoRA adapters. They can
      offload existing adapters or they can unload them (i.e. delete them).
      However, they cannot "hotswap" adapters yet, i.e. substitute the weights
      from one LoRA adapter with the weights of another, without the need to
      create a separate LoRA adapter.
      
      Generally, hot-swapping may not appear super useful, but when the
      model is compiled, it is necessary to prevent recompilation. See #9279
      for more context.
      
      Caveats
      
      To hot-swap a LoRA adapter for another, these two adapters should target
      exactly the same layers and the "hyper-parameters" of the two adapters
      should be identical. For instance, the LoRA alpha has to be the same:
      Given that we keep the alpha from the first adapter, the LoRA scaling
      would be incorrect for the second adapter otherwise.
      
      Theoretically, we could override the scaling dict with the alpha values
      derived from the second adapter's config, but changing the dict will
      trigger a guard for recompilation, defeating the main purpose of the
      feature.
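
The alpha caveat above can be illustrated with a minimal sketch. It assumes the usual LoRA convention that an adapter contributes `(alpha / rank) * B @ A` to the weight. `LoraSlot`, `matmul`, and `lora_delta` are hypothetical stand-ins written in plain Python for illustration; they are not the diffusers or PEFT API.

```python
# Illustrative sketch only: hypothetical helpers, not the diffusers/PEFT API.


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(lora_a, lora_b, alpha, rank):
    """Weight update an adapter is meant to contribute: (alpha / rank) * B @ A."""
    scale = alpha / rank
    return [[scale * v for v in row] for row in matmul(lora_b, lora_a)]


class LoraSlot:
    """Hypothetical holder for one adapter's weights and its scaling."""

    def __init__(self, lora_a, lora_b, alpha, rank):
        self.lora_a = lora_a
        self.lora_b = lora_b
        self.scale = alpha / rank  # fixed when the first adapter is loaded

    def hotswap(self, lora_a, lora_b):
        # Overwrite the existing weight storage in place; the scale is kept.
        for dst, src in zip(self.lora_a, lora_a):
            dst[:] = src
        for dst, src in zip(self.lora_b, lora_b):
            dst[:] = src

    def delta(self):
        return [[self.scale * v for v in row] for row in matmul(self.lora_b, self.lora_a)]


# First adapter: rank 2, alpha 4, so scale = 2.
slot = LoraSlot([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]], alpha=4, rank=2)

# Second adapter: same shapes, but alpha 8, so its own scale would be 4.
a2 = [[0.5, 0.0], [0.0, 0.5]]
b2 = [[2.0, 0.0], [0.0, 2.0]]

slot.hotswap(a2, b2)
applied = slot.delta()                            # uses the first adapter's scale
intended = lora_delta(a2, b2, alpha=8, rank=2)    # uses the second adapter's scale
mismatch = applied != intended                    # True when the alphas differ
```

In this sketch the swap reuses the same storage objects, which is the property that matters for avoiding recompilation, while the stale scale shows why the two adapters need matching alpha values.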
      
      I also found that compilation flags can have an impact on whether this
      works or not. E.g. when passing "reduce-overhead", there will be errors
      of the type:
      
      > input name: arg861_1. data pointer changed from 139647332027392 to 139647331054592
      
      I don't know enough about compilation to determine whether this is
      problematic or not.
      
      Current state
      
      This is obviously WIP right now to collect feedback and discuss which
      direction to take this. If this PR turns out to be useful, the
      hot-swapping functions will be added to PEFT itself and can be imported
      here (or there is a separate copy in diffusers to avoid the need for a
      min PEFT version to use this feature).
      
      Moreover, more tests need to be added to better cover this feature,
      although we don't necessarily need tests for the hot-swapping
      functionality itself, since those tests will be added to PEFT.
      
      Furthermore, as of now, this is only implemented for the unet. Other
      pipeline components have yet to implement this feature.
      
      Finally, it should be properly documented.
      
      I would like to collect feedback on the current state of the PR before
      putting more time into finalizing it.
      
      * Reviewer feedback
      
      * Reviewer feedback, adjust test
      
      * Fix, doc
      
      * Make fix
      
      * Fix for possible g++ error
      
      * Add test for recompilation w/o hotswapping
      
      * Make hotswap work
      
      Requires https://github.com/huggingface/peft/pull/2366
      
      More changes to make hotswapping work. Together with the mentioned PEFT
      PR, the tests pass for me locally.
      
      List of changes:
      
      - docstring for hotswap
      - remove code copied from PEFT, import from PEFT now
      - adjustments to PeftAdapterMixin.load_lora_adapter (unfortunately, some
        state dict renaming was necessary, LMK if there is a better solution)
      - adjustments to UNet2DConditionLoadersMixin._process_lora: LMK if this
        is even necessary or not, I'm unsure what the overall relationship is
        between this and PeftAdapterMixin.load_lora_adapter
      - also in UNet2DConditionLoadersMixin._process_lora, I saw that there is
        no LoRA unloading when loading the adapter fails, so I added it
        there (in line with what happens in PeftAdapterMixin.load_lora_adapter)
      - rewritten tests to avoid shelling out, make the test more precise by
        making sure that the outputs align, parametrize it
      - also checked the pipeline code mentioned in this comment:
        https://github.com/huggingface/diffusers/pull/9453#issuecomment-2418508871;
        when running this inside the with
        torch._dynamo.config.patch(error_on_recompile=True) context, there is
        no error, so I think hotswapping is now working with pipelines.
      
      * Address reviewer feedback:
      
      - Revert deprecated method
      - Fix PEFT doc link to main
      - Don't use private function
      - Clarify magic numbers
      - Add pipeline test
      
      Moreover:
      - Extend docstrings
      - Extend existing test for outputs != 0
      - Extend existing test for wrong adapter name
      
      * Change order of test decorators
      
      parameterized.expand seems to ignore skip decorators if added in last
      place (i.e. innermost decorator).
      
      * Split model and pipeline tests
      
      Also increase test coverage by also targeting conv2d layers (support of
      which was added recently on the PEFT PR).
      
      * Reviewer feedback: Move decorator to test classes
      
      ... instead of having them on each test method.
      
      * Apply suggestions from code review
      Co-authored-by: hlky <hlky@hlky.ac>
      
      * Reviewer feedback: version check, TODO comment
      
      * Add enable_lora_hotswap method
      
      * Reviewer feedback: check _lora_loadable_modules
      
      * Revert changes in unet.py
      
      * Add possibility to ignore enabled at wrong time
      
      * Fix docstrings
      
      * Log possible PEFT error, test
      
      * Raise helpful error if hotswap not supported
      
      I.e. for the text encoder
      
      * Formatting
      
      * More linter
      
      * More ruff
      
      * Doc-builder complaint
      
      * Update docstring:
      
      - mention no text encoder support yet
      - make it clear that LoRA is meant
      - mention that same adapter name should be passed
      
      * Fix error in docstring
      
      * Update more methods with hotswap argument
      
      - SDXL
      - SD3
      - Flux
      
      No changes were made to load_lora_into_transformer.
      
      * Add hotswap argument to load_lora_into_transformer
      
      For SD3 and Flux. Use shorter docstring for brevity.
      
      * Extend docstrings
      
      * Add version guards to tests
      
      * Formatting
      
      * Fix LoRA loading call to add prefix=None
      
      See:
      https://github.com/huggingface/diffusers/pull/10187#issuecomment-2717571064
      
      
      
      * Run make fix-copies
      
      * Add hot swap documentation to the docs
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: hlky <hlky@hlky.ac>
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      fb544996
    • [docs] MPS update (#11212) · fc7a867a
      Steven Liu authored
      mps
      fc7a867a
  21. 04 Apr, 2025 1 commit
    • [LTX0.9.5] Refactor `LTXConditionPipeline` for text-only conditioning (#11174) · 13e48492
      Tolga Cangöz authored
      * Refactor `LTXConditionPipeline` to add text-only conditioning
      
      * style
      
      * up
      
      * Refactor `LTXConditionPipeline` to streamline condition handling and improve clarity
      
      * Improve condition checks
      
      * Simplify latents handling based on conditioning type
      
      * Refactor rope_interpolation_scale preparation for clarity and efficiency
      
      * Update LTXConditionPipeline docstring to clarify supported input types
      
      * Add LTX Video 0.9.5 model to documentation
      
      * Clarify documentation to indicate support for text-only conditioning without passing `conditions`
      
      * refactor: comment out unused parameters in LTXConditionPipeline
      
      * fix: restore previously commented parameters in LTXConditionPipeline
      
      * fix: remove unused parameters from LTXConditionPipeline
      
      * refactor: remove unnecessary lines in LTXConditionPipeline
      13e48492
  22. 02 Apr, 2025 1 commit
  23. 01 Apr, 2025 1 commit
    • [WIP] Add Wan Video2Video (#11053) · df1d7b01
      Dhruv Nair authored
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      df1d7b01
  24. 31 Mar, 2025 1 commit
  25. 28 Mar, 2025 1 commit