- 05 Dec, 2025 1 commit
-
-
swappy authored
* fix: group offloading to support standalone computational layers in block-level offloading * test: for models with standalone and deeply nested layers in block-level offloading * feat: support for block-level offloading in group offloading config * fix: group offload block modules to AutoencoderKL and AutoencoderKLWan * fix: update group offloading tests to use AutoencoderKL and adjust input dimensions * refactor: streamline block offloading logic * Apply style fixes * update tests * update * fix for failing tests * clean up * revert to use skip_keys * clean up --------- Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
- 03 Dec, 2025 1 commit
-
-
Kimbing Ng authored
* Fixes #12673. Wrong default_stream is used, leading to wrong execution order when record_stream is enabled. * update * Update test --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
- 25 Nov, 2025 1 commit
-
-
Jerry Wu authored
* Add Support for Z-Image. * Reformatting with make style, black & isort. * Remove init, Modify import utils, Merge forward in transformers block, Remove once func in pipeline. * modified main model forward, freqs_cis left * refactored to add B dim * fixed stack issue * fixed modulation bug * fixed modulation bug * fix bug * remove value_from_time_aware_config * styling * Fix neg embed and divide / bug; Reuse pad zero tensor; Turn cat -> repeat; Add hint for attn processor. * Replace padding with pad_sequence; Add gradient checkpointing. * Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that. * Fix Docstring and Make Style. * Revert "Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that." This reverts commit fbf26b7ed11d55146103c97740bad4a5f91744e0. * update z-image docstring * Revert attention dispatcher * update z-image docstring * styling * Recover attention_dispatch.py with its origin impl, later would special commit for fa3 compatibility. * Fix prev bug, and support for prompt_embeds pass in args after prompt pre-encode as List of torch Tensor. * Remove einops dependency. * remove redundant imports & make fix-copies * fix import --------- Co-authored-by: liudongyang <liudongyang0114@gmail.com>
-
- 07 Nov, 2025 1 commit
-
-
Wang, Yi authored
* fix the crash in Wan-AI/Wan2.2-TI2V-5B-Diffusers if CP is enabled Signed-off-by:
Wang, Yi <yi.a.wang@intel.com> * address review comment Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com> * refine Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com> --------- Signed-off-by:
Wang, Yi <yi.a.wang@intel.com> Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com>
-
- 24 Oct, 2025 1 commit
-
-
YiYi Xu authored
* add hunyuanimage2.1 --------- Co-authored-by:Sayak Paul <spsayakpaul@gmail.com>
-
- 08 Oct, 2025 1 commit
-
-
Sayak Paul authored
* up * unguard.
-
- 24 Sep, 2025 1 commit
-
-
Aryan authored
* update * update * add coauthor Co-Authored-By:
Dhruv Nair <dhruv.nair@gmail.com> * improve test * handle ip adapter params correctly * fix chroma qkv fusion test * fix fastercache implementation * fix more tests * fight more tests * add back set_attention_backend * update * update * make style * make fix-copies * make ip adapter processor compatible with attention dispatcher * refactor chroma as well * remove rmsnorm assert * minify and deprecate npu/xla processors * update * refactor * refactor; support flash attention 2 with cp * fix * support sage attention with cp * make torch compile compatible * update * refactor * update * refactor * refactor * add ulysses backward * try to make dreambooth script work; accelerator backward not playing well * Revert "try to make dreambooth script work; accelerator backward not playing well" This reverts commit 768d0ea6fa6a305d12df1feda2afae3ec80aa449. * workaround compilation problems with triton when doing all-to-all * support wan * handle backward correctly * support qwen * support ltx * make fix-copies * Update src/diffusers/models/modeling_utils.py Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * apply review suggestions * update docs * add explanation * make fix-copies * add docstrings * support passing parallel_config to from_pretrained * apply review suggestions * make style * update * Update docs/source/en/api/parallel.md Co-authored-by:
Aryan <aryan@huggingface.co> * up --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by:
sayakpaul <spsayakpaul@gmail.com>
-
- 08 Sep, 2025 1 commit
-
-
YiYi Xu authored
* add qwen modular
-
- 03 Sep, 2025 1 commit
-
-
co63oc authored
Signed-off-by:co63oc <co63oc@users.noreply.github.com>
-
- 20 Aug, 2025 1 commit
-
-
galbria authored
* Add Bria model and pipeline to diffusers - Introduced `BriaTransformer2DModel` and `BriaPipeline` for enhanced image generation capabilities. - Updated import structures across various modules to include the new Bria components. - Added utility functions and output classes specific to the Bria pipeline. - Implemented tests for the Bria pipeline to ensure functionality and output integrity. * with working tests * style and quality pass * adding docs * add to overview * fixes from "make fix-copies" * Refactor transformer_bria.py and pipeline_bria.py: Introduce new EmbedND class for rotary position embedding, and enhance Timestep and TimestepProjEmbeddings classes. Add utility functions for handling negative prompts and generating original sigmas in pipeline_bria.py. * remove redundant and duplicate tests and fix bf16 slow test * style fixes * small doc update * Enhance Bria 3.2 documentation and implementation - Updated the GitHub repository link for Bria 3.2. - Added usage instructions for the gated model access. - Introduced the BriaTransformerBlock and BriaAttention classes to the model architecture. - Refactored existing classes to integrate Bria-specific components, including BriaEmbedND and BriaPipeline. - Updated the pipeline output class to reflect Bria-specific functionality. - Adjusted test cases to align with the new Bria model structure. * Refactor Bria model components and update documentation - Removed outdated inference example from Bria 3.2 documentation. - Introduced the BriaTransformerBlock class to enhance model architecture. - Updated attention handling to use `attention_kwargs` instead of `joint_attention_kwargs`. - Improved import structure in the Bria pipeline to handle optional dependencies. - Adjusted test cases to reflect changes in model dtype assertions.
* Update Bria model reference in documentation to reflect new file naming convention * Update docs/source/en/_toctree.yml * Refactor BriaPipeline to inherit from DiffusionPipeline instead of FluxPipeline, updating imports accordingly. * move the __call__ func to the end of file * Update BriaPipeline example to use bfloat16 for precision sensitivity for better result * make style && make quality && make fix-copies --------- Co-authored-by:
Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com> Co-authored-by:
Aryan <contact.aryanvs@gmail.com>
-
- 06 Aug, 2025 3 commits
-
-
Aryan authored
update Co-authored-by:Álvaro Somoza <asomoza@users.noreply.github.com>
-
Aryan authored
* update * update * refactor * fuck yeah * make style * Update src/diffusers/hooks/group_offloading.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * Update src/diffusers/hooks/group_offloading.py --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com>
-
Aryan authored
* update * update * refactor * add test * address review comment * nit
-
- 03 Aug, 2025 1 commit
-
-
naykun authored
* (feat): qwen-image integration * fix(qwen-image): - remove unused logics related to controlnet/ip-adapter * fix(qwen-image): - compatible with attention dispatcher - cond cache support * fix(qwen-image): - cond cache registry - attention backend argument - fix copies * fix(qwen-image): - remove local test * Update src/diffusers/models/transformers/transformer_qwenimage.py --------- Co-authored-by:YiYi Xu <yixu310@gmail.com>
-
- 29 Jul, 2025 2 commits
-
-
Sayak Paul authored
* start flux. * more * up * up * up * up * get back the deleted files. * up * empathy
-
Aryan authored
* update * try test fix * add missing link * fix tests * Update src/diffusers/hooks/first_block_cache.py * make style
-
- 25 Jul, 2025 1 commit
-
-
Aryan authored
* update * update
-
- 23 Jul, 2025 1 commit
-
-
Aryan authored
* update
-
- 17 Jul, 2025 1 commit
-
-
Aryan authored
* update * update * add coauthor Co-Authored-By:
Dhruv Nair <dhruv.nair@gmail.com> * improve test * handle ip adapter params correctly * fix chroma qkv fusion test * fix fastercache implementation * fix more tests * fight more tests * add back set_attention_backend * update * update * make style * make fix-copies * make ip adapter processor compatible with attention dispatcher * refactor chroma as well * remove rmsnorm assert * minify and deprecate npu/xla processors --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
- 10 Jul, 2025 1 commit
-
-
YiYi Xu authored
adding modular diffusers as experimental feature --------- Co-authored-by:
hlky <hlky@hlky.ac> Co-authored-by:
Álvaro Somoza <asomoza@users.noreply.github.com> Co-authored-by:
Aryan <aryan@huggingface.co> Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com>
-
- 09 Jul, 2025 1 commit
-
-
Sayak Paul authored
* fix memory address problem * add more tests * updates * updates * update * _group_id = group_id * update * Apply suggestions from code review Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * update * update * update * fix --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
- 08 Jul, 2025 1 commit
-
-
Aryan authored
* update * modify flux single blocks to make compatible with cache techniques (without too much model-specific intrusion code) * remove debug logs * update * cache context for different batches of data * fix hs residual bug for single return outputs; support ltx * fix controlnet flux * support flux, ltx i2v, ltx condition * update * update * Update docs/source/en/api/cache.md * Update src/diffusers/hooks/hooks.py Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * address review comments pt. 1 * address review comments pt. 2 * cache context refactor; address review pt. 3 * address review comments * metadata registration with decorators instead of centralized * support cogvideox * support mochi * fix * remove unused function * remove central registry based on review * update --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
- 27 Jun, 2025 1 commit
-
-
Aryan authored
* update * add test * address review comments * update * fixes * change decorator order to fix tests * try fix * fight tests
-
- 26 Jun, 2025 1 commit
-
-
Dhruv Nair authored
* update * update * update --------- Co-authored-by:Sayak Paul <spsayakpaul@gmail.com>
-
- 24 Jun, 2025 1 commit
-
-
Sayak Paul authored
* raise as early as possible in group offloading * remove check from ModuleGroup
-
- 19 Jun, 2025 2 commits
-
-
Sayak Paul authored
* start implementing disk offloading in group. * delete diff file. * updates.patch * offload_to_disk_path * check if safetensors already exist. * add test and clarify. * updates * update todos. * update more docs. * update docs
-
Aryan authored
update
-
- 30 May, 2025 1 commit
-
-
co63oc authored
* Fix typos in strings and comments Signed-off-by:
co63oc <co63oc@users.noreply.github.com> * Update src/diffusers/hooks/hooks.py Co-authored-by:
Aryan <contact.aryanvs@gmail.com> * Update src/diffusers/hooks/hooks.py Co-authored-by:
Aryan <contact.aryanvs@gmail.com> * Update layerwise_casting.py * Apply style fixes * update --------- Signed-off-by:
co63oc <co63oc@users.noreply.github.com> Co-authored-by:
Aryan <contact.aryanvs@gmail.com> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 27 May, 2025 1 commit
-
-
Sayak Paul authored
wip: check if we can make go compile compat
-
- 01 May, 2025 1 commit
-
-
co63oc authored
* Fix typos in docs and comments * Apply style fixes --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 30 Apr, 2025 2 commits
-
-
Yao Matrix authored
* make autoencoders, controlnet_flux and wan_transformer3d_single_file pass on XPU Signed-off-by:
Yao Matrix <matrix.yao@intel.com> * Apply style fixes --------- Signed-off-by:
Yao Matrix <matrix.yao@intel.com> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by:
Aryan <aryan@huggingface.co>
-
Aryan authored
raise warning instead of error
-
- 23 Apr, 2025 1 commit
-
-
Aryan authored
* fix * add tests * add message check
-
- 08 Apr, 2025 1 commit
-
-
Sayak Paul authored
* implement record_stream for better performance. * fix * style. * merge #11097 * Update src/diffusers/hooks/group_offloading.py Co-authored-by:
Aryan <aryan@huggingface.co> * fixes * docstring. * remaining todos in low_cpu_mem_usage * tests * updates to docs. --------- Co-authored-by:
Aryan <aryan@huggingface.co>
-
- 24 Mar, 2025 1 commit
-
-
Aryan authored
* update * Update docs/source/en/optimization/memory.md * Apply suggestions from code review Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * apply review suggestions * update --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
- 21 Mar, 2025 1 commit
-
-
Aryan authored
* init * update * update * update * make style * update * fix * make it work with guidance distilled models * update * make fix-copies * add tests * update * apply_faster_cache -> apply_fastercache * fix * reorder * update * refactor * update docs * add fastercache to CacheMixin * update tests * Apply suggestions from code review * make style * try to fix partial import error * Apply style fixes * raise warning * update --------- Co-authored-by:github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 20 Mar, 2025 1 commit
-
-
Dhruv Nair authored
* update * update * clean up
-
- 18 Mar, 2025 2 commits
- 14 Feb, 2025 1 commit
-
-
Aryan authored
* update * fix * non_blocking; handle parameters and buffers * update * Group offloading with cuda stream prefetching (#10516) * cuda stream prefetch * remove breakpoints * update * copy model hook implementation from pab * update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite * more workarounds to make it actually work * cleanup * rewrite * update * make sure to sync current stream before overwriting with pinned params not doing so will lead to erroneous computations on the GPU and cause bad results * better check * update * remove hook implementation to not deal with merge conflict * re-add hook changes * why use more memory when less memory do trick * why still use slightly more memory when less memory do trick * optimise * add model tests * add pipeline tests * update docs * add layernorm and groupnorm * address review comments * improve tests; add docs * improve docs * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * apply suggestions from code review * update tests * apply suggestions from review * enable_group_offloading -> enable_group_offload for naming consistency * raise errors if multiple offloading strategies used; add relevant tests * handle .to() when group offload applied * refactor some repeated code * remove unintentional change from merge conflict * handle .cuda() --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-