1. 05 Dec, 2025 1 commit
  2. 03 Dec, 2025 1 commit
  3. 25 Nov, 2025 1 commit
    • Jerry Wu's avatar
      Add Support for Z-Image Series (#12703) · 4088e8a8
      Jerry Wu authored
      
      
      * Add Support for Z-Image.
      
      * Reformatting with make style, black & isort.
      
      * Remove init, Modify import utils, Merge forward in transformers block, Remove once func in pipeline.
      
      * modified main model forward, freqs_cis left
      
      * refactored to add B dim
      
      * fixed stack issue
      
      * fixed modulation bug
      
      * fixed modulation bug
      
      * fix bug
      
      * remove value_from_time_aware_config
      
      * styling
      
      * Fix neg embed and devide / bug; Reuse pad zero tensor; Turn cat -> repeat; Add hint for attn processor.
      
      * Replace padding with pad_sequence; Add gradient checkpointing.
      
      * Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that.
      
      * Fix Docstring and Make Style.
      
      * Revert "Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that."
      
      This reverts commit fbf26b7ed11d55146103c97740bad4a5f91744e0.
      
      * update z-image docstring
      
      * Revert attention dispatcher
      
      * update z-image docstring
      
      * styling
      
      * Recover attention_dispatch.py with its origin impl, later would special commit for fa3 compatibility.
      
      * Fix prev bug, and support for prompt_embeds pass in args after prompt pre-encode as List of torch Tensor.
      
      * Remove einop dependency.
      
      * remove redundant imports & make fix-copies
      
      * fix import
      
      ---------
      Co-authored-by: default avatarliudongyang <liudongyang0114@gmail.com>
      4088e8a8
  4. 07 Nov, 2025 1 commit
  5. 24 Oct, 2025 1 commit
  6. 08 Oct, 2025 1 commit
  7. 24 Sep, 2025 1 commit
    • Aryan's avatar
      Context Parallel w/ Ring & Ulysses & Unified Attention (#11941) · dcb6dd9b
      Aryan authored
      
      
      * update
      
      * update
      
      * add coauthor
      Co-Authored-By: default avatarDhruv Nair <dhruv.nair@gmail.com>
      
      * improve test
      
      * handle ip adapter params correctly
      
      * fix chroma qkv fusion test
      
      * fix fastercache implementation
      
      * fix more tests
      
      * fight more tests
      
      * add back set_attention_backend
      
      * update
      
      * update
      
      * make style
      
      * make fix-copies
      
      * make ip adapter processor compatible with attention dispatcher
      
      * refactor chroma as well
      
      * remove rmsnorm assert
      
      * minify and deprecate npu/xla processors
      
      * update
      
      * refactor
      
      * refactor; support flash attention 2 with cp
      
      * fix
      
      * support sage attention with cp
      
      * make torch compile compatible
      
      * update
      
      * refactor
      
      * update
      
      * refactor
      
      * refactor
      
      * add ulysses backward
      
      * try to make dreambooth script work; accelerator backward not playing well
      
      * Revert "try to make dreambooth script work; accelerator backward not playing well"
      
      This reverts commit 768d0ea6fa6a305d12df1feda2afae3ec80aa449.
      
      * workaround compilation problems with triton when doing all-to-all
      
      * support wan
      
      * handle backward correctly
      
      * support qwen
      
      * support ltx
      
      * make fix-copies
      
      * Update src/diffusers/models/modeling_utils.py
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      
      * apply review suggestions
      
      * update docs
      
      * add explanation
      
      * make fix-copies
      
      * add docstrings
      
      * support passing parallel_config to from_pretrained
      
      * apply review suggestions
      
      * make style
      
      * update
      
      * Update docs/source/en/api/parallel.md
      Co-authored-by: default avatarAryan <aryan@huggingface.co>
      
      * up
      
      ---------
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: default avatarsayakpaul <spsayakpaul@gmail.com>
      dcb6dd9b
  8. 08 Sep, 2025 1 commit
  9. 03 Sep, 2025 1 commit
  10. 20 Aug, 2025 1 commit
    • galbria's avatar
      Bria 3 2 pipeline (#12010) · 7993be9e
      galbria authored
      
      
      * Add Bria model and pipeline to diffusers
      
      - Introduced `BriaTransformer2DModel` and `BriaPipeline` for enhanced image generation capabilities.
      - Updated import structures across various modules to include the new Bria components.
      - Added utility functions and output classes specific to the Bria pipeline.
      - Implemented tests for the Bria pipeline to ensure functionality and output integrity.
      
      * with working tests
      
      * style and quality pass
      
      * adding docs
      
      * add to overview
      
      * fixes from "make fix-copies"
      
      * Refactor transformer_bria.py and pipeline_bria.py: Introduce new EmbedND class for rotary position embedding, and enhance Timestep and TimestepProjEmbeddings classes. Add utility functions for handling negative prompts and generating original sigmas in pipeline_bria.py.
      
      * remove redundent and duplicates tests and fix bf16
      slow test
      
      * style fixes
      
      * small doc update
      
      * Enhance Bria 3.2 documentation and implementation
      
      - Updated the GitHub repository link for Bria 3.2.
      - Added usage instructions for the gated model access.
      - Introduced the BriaTransformerBlock and BriaAttention classes to the model architecture.
      - Refactored existing classes to integrate Bria-specific components, including BriaEmbedND and BriaPipeline.
      - Updated the pipeline output class to reflect Bria-specific functionality.
      - Adjusted test cases to align with the new Bria model structure.
      
      * Refactor Bria model components and update documentation
      
      - Removed outdated inference example from Bria 3.2 documentation.
      - Introduced the BriaTransformerBlock class to enhance model architecture.
      - Updated attention handling to use `attention_kwargs` instead of `joint_attention_kwargs`.
      - Improved import structure in the Bria pipeline to handle optional dependencies.
      - Adjusted test cases to reflect changes in model dtype assertions.
      
      * Update Bria model reference in documentation to reflect new file naming convention
      
      * Update docs/source/en/_toctree.yml
      
      * Refactor BriaPipeline to inherit from DiffusionPipeline instead of FluxPipeline, updating imports accordingly.
      
      * move the __call__ func to the end of file
      
      * Update BriaPipeline example to use bfloat16 for precision sensitivity for better result
      
      * make style && make quality &&  make fix-copiessource
      
      ---------
      Co-authored-by: default avatarLinoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
      Co-authored-by: default avatarAryan <contact.aryanvs@gmail.com>
      7993be9e
  11. 06 Aug, 2025 3 commits
  12. 03 Aug, 2025 1 commit
    • naykun's avatar
      Qwen-Image (#12055) · 8e53cd95
      naykun authored
      
      
      * (feat): qwen-image integration
      
      * fix(qwen-image):
      - remove unused logics related to controlnet/ip-adapter
      
      * fix(qwen-image):
      - compatible with attention dispatcher
      - cond cache support
      
      * fix(qwen-image):
      - cond cache registry
      - attention backend argument
      - fix copies
      
      * fix(qwen-image):
      - remove local test
      
      * Update src/diffusers/models/transformers/transformer_qwenimage.py
      
      ---------
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      8e53cd95
  13. 29 Jul, 2025 2 commits
  14. 25 Jul, 2025 1 commit
  15. 23 Jul, 2025 1 commit
  16. 17 Jul, 2025 1 commit
  17. 10 Jul, 2025 1 commit
  18. 09 Jul, 2025 1 commit
  19. 08 Jul, 2025 1 commit
    • Aryan's avatar
      First Block Cache (#11180) · 0454fbb3
      Aryan authored
      
      
      * update
      
      * modify flux single blocks to make compatible with cache techniques (without too much model-specific intrusion code)
      
      * remove debug logs
      
      * update
      
      * cache context for different batches of data
      
      * fix hs residual bug for single return outputs; support ltx
      
      * fix controlnet flux
      
      * support flux, ltx i2v, ltx condition
      
      * update
      
      * update
      
      * Update docs/source/en/api/cache.md
      
      * Update src/diffusers/hooks/hooks.py
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      
      * address review comments pt. 1
      
      * address review comments pt. 2
      
      * cache context refacotr; address review pt. 3
      
      * address review comments
      
      * metadata registration with decorators instead of centralized
      
      * support cogvideox
      
      * support mochi
      
      * fix
      
      * remove unused function
      
      * remove central registry based on review
      
      * update
      
      ---------
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      0454fbb3
  20. 27 Jun, 2025 1 commit
  21. 26 Jun, 2025 1 commit
  22. 24 Jun, 2025 1 commit
  23. 19 Jun, 2025 2 commits
  24. 30 May, 2025 1 commit
  25. 27 May, 2025 1 commit
  26. 01 May, 2025 1 commit
  27. 30 Apr, 2025 2 commits
  28. 23 Apr, 2025 1 commit
  29. 08 Apr, 2025 1 commit
  30. 24 Mar, 2025 1 commit
  31. 21 Mar, 2025 1 commit
    • Aryan's avatar
      [core] FasterCache (#10163) · 844221ae
      Aryan authored
      
      
      * init
      
      * update
      
      * update
      
      * update
      
      * make style
      
      * update
      
      * fix
      
      * make it work with guidance distilled models
      
      * update
      
      * make fix-copies
      
      * add tests
      
      * update
      
      * apply_faster_cache -> apply_fastercache
      
      * fix
      
      * reorder
      
      * update
      
      * refactor
      
      * update docs
      
      * add fastercache to CacheMixin
      
      * update tests
      
      * Apply suggestions from code review
      
      * make style
      
      * try to fix partial import error
      
      * Apply style fixes
      
      * raise warning
      
      * update
      
      ---------
      Co-authored-by: default avatargithub-actions[bot] <github-actions[bot]@users.noreply.github.com>
      844221ae
  32. 20 Mar, 2025 1 commit
  33. 18 Mar, 2025 2 commits
  34. 14 Feb, 2025 1 commit
    • Aryan's avatar
      Module Group Offloading (#10503) · 9a147b82
      Aryan authored
      
      
      * update
      
      * fix
      
      * non_blocking; handle parameters and buffers
      
      * update
      
      * Group offloading with cuda stream prefetching (#10516)
      
      * cuda stream prefetch
      
      * remove breakpoints
      
      * update
      
      * copy model hook implementation from pab
      
      * update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite
      
      * more workarounds to make it actually work
      
      * cleanup
      
      * rewrite
      
      * update
      
      * make sure to sync current stream before overwriting with pinned params
      
      not doing so will lead to erroneous computations on the GPU and cause bad results
      
      * better check
      
      * update
      
      * remove hook implementation to not deal with merge conflict
      
      * re-add hook changes
      
      * why use more memory when less memory do trick
      
      * why still use slightly more memory when less memory do trick
      
      * optimise
      
      * add model tests
      
      * add pipeline tests
      
      * update docs
      
      * add layernorm and groupnorm
      
      * address review comments
      
      * improve tests; add docs
      
      * improve docs
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSteven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * apply suggestions from code review
      
      * update tests
      
      * apply suggestions from review
      
      * enable_group_offloading -> enable_group_offload for naming consistency
      
      * raise errors if multiple offloading strategies used; add relevant tests
      
      * handle .to() when group offload applied
      
      * refactor some repeated code
      
      * remove unintentional change from merge conflict
      
      * handle .cuda()
      
      ---------
      Co-authored-by: default avatarSteven Liu <59462357+stevhliu@users.noreply.github.com>
      9a147b82