- 19 Jun, 2025 2 commits
-
-
Aryan authored
update
-
Sayak Paul authored
add is_compileable property to quantizers.
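A property like the `is_compileable` mentioned above might look something like this (an illustrative sketch only, not the actual diffusers implementation; the class names and the `False` default are assumptions):

```python
class DiffusersQuantizer:
    """Stand-in base class for quantizer backends (hypothetical)."""

    @property
    def is_compileable(self) -> bool:
        # Whether models quantized by this backend are expected to work
        # under torch.compile; backends override this to opt in.
        return False


class TorchAoQuantizer(DiffusersQuantizer):
    """Stand-in for a backend that supports compilation."""

    @property
    def is_compileable(self) -> bool:
        return True
```

Callers can then gate compilation on the property instead of hard-coding backend checks.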
-
- 18 Jun, 2025 5 commits
-
-
Dhruv Nair authored
* update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update
-
Sayak Paul authored
change to 2025 licensing for remaining
-
Sayak Paul authored
* device_map tests for all models.
* updates
* Update tests/models/test_modeling_common.py
  Co-authored-by: Aryan <aryan@huggingface.co>
* fix device_map in test
---------
Co-authored-by: Aryan <aryan@huggingface.co>
-
Leo Jiang authored
* [training] add ds support to lora hidream
* Apply style fixes
---------
Co-authored-by: J石页 <jiangshuo9@h-partners.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
Saurabh Misra authored
* ⚡️ Speed up method `AutoencoderKLWan.clear_cache` by 886%
  **Key optimizations:**
  - Compute the number of `WanCausalConv3d` modules in each model (`encoder`/`decoder`) **only once during initialization** and store the counts in `self._cached_conv_counts`. This removes repeated module-tree traversals on every `clear_cache` call, which profiling showed was the main bottleneck.
  - The internal helper `_count_conv3d_fast` is optimized via a generator expression passed to `sum`.
  All comments from the original code are preserved, except for docstrings/comments updated or removed on changed lines. **Function signatures and outputs remain unchanged.**
* Apply style fixes
* Apply suggestions from code review
  Co-authored-by: Aryan <contact.aryanvs@gmail.com>
* Apply style fixes
---------
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aseem Saxena <aseem.bits@gmail.com>
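The caching pattern described in the commit above can be sketched in plain Python (illustrative stand-ins, not the actual diffusers classes; `Module`, `CausalConv3d`, and `Encoder` here are hypothetical):

```python
class Module:
    """Minimal stand-in for torch.nn.Module's tree traversal."""

    def __init__(self, children=()):
        self.children = list(children)

    def modules(self):
        # Yield self and every descendant, depth-first.
        yield self
        for child in self.children:
            yield from child.modules()


class CausalConv3d(Module):
    """Stand-in for WanCausalConv3d."""


def _count_conv3d_fast(model):
    # Single traversal; a generator expression avoids building a list.
    return sum(1 for m in model.modules() if isinstance(m, CausalConv3d))


class Encoder(Module):
    def __init__(self, children=()):
        super().__init__(children)
        # Counted once at construction time, so clear_cache() can read a
        # cached integer instead of re-walking the module tree on every call.
        self._cached_conv_count = _count_conv3d_fast(self)

    def clear_cache(self):
        return self._cached_conv_count  # cheap: no traversal here
```

Moving the count from call time to init time is what turns a per-call tree walk into a constant-time attribute read.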
-
- 17 Jun, 2025 2 commits
-
-
Linoy Tsaban authored
* lora alpha
* Apply style fixes
* Update examples/advanced_diffusion_training/README_flux.md
  Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* fix readme format
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
Aryan authored
update
-
- 16 Jun, 2025 4 commits
-
-
David Berenstein authored
* Add Pruna optimization framework documentation
  - Introduced a new section for Pruna in the table of contents.
  - Added comprehensive documentation for Pruna, detailing its optimization techniques, installation instructions, and examples for optimizing and evaluating models.
* Enhance Pruna documentation with image alt text and code block formatting
  - Added alt text to images for better accessibility and context.
  - Changed code block syntax from diff to python for improved clarity.
* Add installation section to Pruna documentation
  - Introduced a new installation section in the Pruna documentation to guide users on how to install the framework.
  - Enhanced the overall clarity and usability of the documentation for new users.
* Update pruna.md
* Update pruna.md
* Update Pruna documentation for model optimization and evaluation
  - Changed section titles for consistency and clarity, from "Optimizing models" to "Optimize models" and "Evaluating and benchmarking optimized models" to "Evaluate and benchmark models".
  - Enhanced descriptions to clarify the use of `diffusers` models and the evaluation process.
  - Added a new example for evaluating standalone `diffusers` models.
  - Updated references and links for better navigation within the documentation.
* Refactor Pruna documentation for clarity and consistency
  - Removed outdated references to FLUX-juiced and streamlined the explanation of benchmarking.
  - Enhanced the description of evaluating standalone `diffusers` models.
  - Cleaned up code examples by removing unnecessary imports and comments for better readability.
* Apply suggestions from code review
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Enhance Pruna documentation with new examples and clarifications
  - Added an image to illustrate the optimization process.
  - Updated the explanation for sharing and loading optimized models on the Hugging Face Hub.
  - Clarified the evaluation process for optimized models using the EvaluationAgent.
  - Improved descriptions for defining metrics and evaluating standalone diffusers models.
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Carl Thomé authored
-
Sayak Paul authored
* show how metadata stuff should be incorporated in training scripts.
* typing
* fix
---------
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
-
Sayak Paul authored
* fix flux lora loader when return_metadata is true for non-diffusers
* remove annotation
-
- 14 Jun, 2025 1 commit
-
-
Edna authored
* working state from hameerabbasi and iddl
* working state from hameerabbasi and iddl (transformer)
* working state (normalization)
* working state (embeddings)
* add chroma loader
* add chroma to mappings
* add chroma to transformer init
* take out variant stuff
* get decently far in changing variant stuff
* add chroma init
* make chroma output class
* add chroma transformer to dummy tp
* add chroma to init
* add chroma to init
* fix single file
* update
* update
* add chroma to auto pipeline
* add chroma to pipeline init
* change to chroma transformer
* take out variant from blocks
* swap embedder location
* remove prompt_2
* work on swapping text encoders
* remove mask function
* dont modify mask (for now)
* wrap attn mask
* no attn mask (can't get it to work)
* remove pooled prompt embeds
* change to my own unpooled embedder
* fix load
* take pooled projections out of transformer
* ensure correct dtype for chroma embeddings
* update
* use dn6 attn mask + fix true_cfg_scale
* use chroma pipeline output
* use DN6 embeddings
* remove guidance
* remove guidance embed (pipeline)
* remove guidance from embeddings
* don't return length
* dont change dtype
* remove unused stuff, fix up docs
* add chroma autodoc
* add .md (oops)
* initial chroma docs
* undo don't change dtype
* undo arxiv change, unsure why that happened
* fix hf papers regression in more places
* Update docs/source/en/api/pipelines/chroma.md
  Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* do_cfg -> self.do_classifier_free_guidance
* Update docs/source/en/api/models/chroma_transformer.md
  Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* Update chroma.md
* Move chroma layers into transformer
* Remove pruned AdaLayerNorms
* Add chroma fast tests
* (untested) batch cond and uncond
* Add # Copied from for shift
* Update # Copied from statements
* update norm imports
* Revert cond + uncond batching
* Add transformer tests
* move chroma test (oops)
* chroma init
* fix chroma pipeline fast tests
* Update src/diffusers/models/transformers/transformer_chroma.py
  Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* Move Approximator and Embeddings
* Fix auto pipeline + make style, quality
* make style
* Apply style fixes
* switch to new input ids
* fix # Copied from error
* remove # Copied from on protected members
* try to fix import
* fix import
* make fix-copies
* revert style fix
* update chroma transformer params
* update chroma transformer approximator init params
* update to pad tokens
* fix batch inference
* Make more pipeline tests work
* Make most transformer tests work
* fix docs
* make style, make quality
* skip batch tests
* fix test skipping
* fix test skipping again
* fix for tests
* Fix all pipeline tests
* update
* push local changes, fix docs
* add encoder test, remove pooled dim
* default proj dim
* fix tests
* fix equal size list input
* update
* push local changes, fix docs
* add encoder test, remove pooled dim
* default proj dim
* fix tests
* fix equal size list input
* Revert "fix equal size list input"
  This reverts commit 3fe4ad67d58d83715bc238f8654f5e90bfc5653c.
* update
* update
* update
* update
* update
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 13 Jun, 2025 5 commits
-
-
Aryan authored
* support text-to-image
* update example
* make fix-copies
* support use_flow_sigmas in EDM scheduler instead of maintaining a cosmos-specific scheduler
* support video-to-world
* update
* rename text2image pipeline
* make fix-copies
* add t2i test
* add test for v2w pipeline
* support edm dpmsolver multistep
* update
* update
* update
* update tests
* fix tests
* safety checker
* make conversion script work without guardrail
-
Sayak Paul authored
* feat: parse metadata from lora state dicts.
* tests
* fix tests
* key renaming
* fix
* smol update
* smol updates
* load metadata.
* automatically save metadata in save_lora_adapter.
* propagate changes.
* changes
* add test to models too.
* tighter tests.
* updates
* fixes
* rename tests.
* sorted.
* Update src/diffusers/loaders/lora_base.py
  Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* review suggestions.
* removeprefix.
* propagate changes.
* fix-copies
* sd
* docs.
* fixes
* get review ready.
* one more test to catch error.
* change to a different approach.
* fix-copies.
* todo
* sd3
* update
* revert changes in get_peft_kwargs.
* update
* fixes
* fixes
* simplify _load_sft_state_dict_metadata
* update
* style fix
* update
* update
* update
* empty commit
* _pack_dict_with_prefix
* update
* TODO 1.
* todo: 2.
* todo: 3.
* update
* update
* Apply suggestions from code review
  Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* reraise.
* move argument.
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
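A helper like the `_pack_dict_with_prefix` named in the commit above could be sketched as follows (a guess at the shape only, since the log does not show the implementation; the signature and the `.` separator are assumptions):

```python
def _pack_dict_with_prefix(metadata, prefix):
    # Namespace every key under `prefix` so metadata from several
    # components can live in one flat dict without key collisions.
    return {f"{prefix}.{key}": value for key, value in metadata.items()}


# Hypothetical usage: tag LoRA metadata with the component it belongs to.
packed = _pack_dict_with_prefix({"rank": 4, "alpha": 8}, "transformer.lora_metadata")
```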
-
Aryan authored
* update
* make style
* Update src/diffusers/loaders/lora_conversion_utils.py
* add note explaining threshold
-
Sayak Paul authored
* mention fp8 benefits on supported hardware.
* Update docs/source/en/quantization/torchao.md
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Sayak Paul authored
-
- 12 Jun, 2025 1 commit
-
-
Sayak Paul authored
* add compilation bits to the bitsandbytes docs.
* Apply suggestions from code review
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* finish
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 11 Jun, 2025 9 commits
-
-
Tolga Cangöz authored
* fix: remove redundant indexing
* style
-
Joel Schlosser authored
-
rasmi authored
-
Aryan authored
* improve docstrings for wan
* Apply suggestions from code review
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* make style
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Sayak Paul authored
* add clarity in documentation for device_map
* docs
* fix how compiler tester mixins are used.
* propagate
* more
* typo.
* fix tests
* fix order of decorators.
* clarify more.
* more test cases.
* fix doc
* fix device_map docstring in pipeline_utils.
* more examples
* more
* update
* remove code for stuff that is already supported.
* fix stuff.
-
Sayak Paul authored
* start adding compilation tests for quantization.
* fixes
* make common utility.
* modularize.
* add group offloading+compile
* xfail
* update
* Update tests/quantization/test_torch_compile_utils.py
  Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* fixes
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
-
Yao Matrix authored
* enable torchao cases on XPU
  Signed-off-by: Matrix YAO <matrix.yao@intel.com>
* device agnostic APIs
  Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* more
  Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* fix style
  Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* enable test_torch_compile_recompilation_and_graph_break on XPU
  Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* resolve comments
  Signed-off-by: YAO Matrix <matrix.yao@intel.com>
---------
Signed-off-by: Matrix YAO <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
-
Tolga Cangöz authored
* fix: vae sampling mode
* fix a typo
-
Sayak Paul authored
support Flux Control LoRA with bnb 8bit.
-
- 10 Jun, 2025 2 commits
-
-
Akash Haridas authored
* allow loading from repo with dot in name
* put new arg at the end to avoid breaking compatibility
* add test for loading repo with dot in name
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
Meatfucker authored
* Update pipeline_flux_inpaint.py to fix padding_mask_crop returning only the inpainted area and not the entire image.
* Apply style fixes
* Update src/diffusers/pipelines/flux/pipeline_flux_inpaint.py
-
- 09 Jun, 2025 3 commits
-
-
Philip Brown authored
* Add community class StableDiffusionXL_T5Pipeline. Will be used with the base model opendiffusionai/stablediffusionxl_t5.
* Changed pooled_embeds to use projection instead of slice
* "make style" tweaks
* Added comments to top of code
* Apply style fixes
-
Dhruv Nair authored
* update * update * update * update * update * update * update
-
Sayak Paul authored
* fix how compiler tester mixins are used.
* propagate
* more
-
- 08 Jun, 2025 1 commit
-
-
Valeriy Sofin authored
-
- 06 Jun, 2025 3 commits
-
-
Aryan authored
* initial support
* make fix-copies
* fix no split modules
* add conversion script
* refactor
* add pipeline test
* refactor
* fix bug with mask
* fix for reference images
* remove print
* update docs
* update slices
* update
* update
* update example
-
Sayak Paul authored
* add a test for group offloading + compilation.
* tests
-
jiqing-feng authored
* use deterministic to get stable result
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* add deterministic for int8 test
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
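The principle behind these commits, pinning randomness so the int8 tests produce stable results, can be illustrated with a stdlib stand-in (the actual change configures torch's deterministic behavior; `noisy_inference` here is hypothetical):

```python
import random


def noisy_inference(values, seed):
    # A fixed seed makes the pseudo-random "noise" reproducible, so a test
    # can assert exact outputs instead of relying on a loose tolerance.
    rng = random.Random(seed)
    return [v + rng.uniform(-0.01, 0.01) for v in values]


# Two runs with the same seed yield identical results.
run_a = noisy_inference([1.0, 2.0, 3.0], seed=42)
run_b = noisy_inference([1.0, 2.0, 3.0], seed=42)
```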
-
- 05 Jun, 2025 2 commits
-
-
Markus Pobitzer authored
[examples] flux-control: use num_training_steps_for_scheduler in get_scheduler instead of args.max_train_steps * accelerator.num_processes
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
Sayak Paul authored
bring PipelineQuantizationConfig at the top of the import chain.
-