Commits · 62cce3045d6710e9cd68869ac8391867e326928b · renzhc / diffusers_dcu

18 Jun, 2025 2 commits

[chore] change to 2025 licensing for remaining (#11741) · 62cce304
Sayak Paul authored Jun 18, 2025
```
change to 2025 licensing for remaining
```
62cce304

⚡️ Speed up method `AutoencoderKLWan.clear_cache` by 886% (#11665) · 5ce4814a

Saurabh Misra authored Jun 17, 2025

* ⚡️ Speed up method `AutoencoderKLWan.clear_cache` by 886%

**Key optimizations:**
- Compute the number of `WanCausalConv3d` modules in each model (`encoder`/`decoder`) **only once during initialization**, store in `self._cached_conv_counts`. This removes unnecessary repeated tree traversals at every `clear_cache` call, which was the main bottleneck (from profiling).
- The internal helper `_count_conv3d_fast` is optimized via a generator expression with `sum` for efficiency.

All comments from the original code are preserved, except for updated or removed local docstrings/comments relevant to changed lines.  
**Function signatures and outputs remain unchanged.**
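
The caching idea described above can be illustrated with a minimal, dependency-free sketch. It is not the code from the commit: `Module`, `CausalConv3d`, and `TinyAutoencoder` are hypothetical stand-ins for the real PyTorch classes, and only the names `_cached_conv_counts`, `_count_conv3d_fast`, and `clear_cache` are taken from the message.

```python
class Module:
    """Tiny stand-in for a module tree (hypothetical, not torch.nn.Module)."""

    def __init__(self, *children):
        self.children = list(children)

    def modules(self):
        # Depth-first walk yielding this node and all of its descendants.
        yield self
        for child in self.children:
            yield from child.modules()


class CausalConv3d(Module):
    """Placeholder for the layer type being counted."""


class TinyAutoencoder:
    """Hypothetical model that resets per-layer feature caches."""

    def __init__(self, encoder, decoder):
        self.encoder = encoder
        self.decoder = decoder
        # Walk each tree once, at construction time, and remember the result.
        self._cached_conv_counts = {
            "encoder": self._count_conv3d_fast(encoder),
            "decoder": self._count_conv3d_fast(decoder),
        }
        self.clear_cache()

    @staticmethod
    def _count_conv3d_fast(model):
        # Generator expression fed to sum: no intermediate list is built.
        return sum(isinstance(m, CausalConv3d) for m in model.modules())

    def clear_cache(self):
        # Reuse the cached counts instead of traversing the trees again.
        self._enc_feat_map = [None] * self._cached_conv_counts["encoder"]
        self._dec_feat_map = [None] * self._cached_conv_counts["decoder"]


encoder = Module(CausalConv3d(), Module(CausalConv3d()))
decoder = Module(CausalConv3d(), CausalConv3d(), Module(CausalConv3d()))
vae = TinyAutoencoder(encoder, decoder)
```

One trade-off of this pattern is that the cached counts go stale if layers are added or removed after construction, so it suits models whose structure is fixed once they are built.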

* Apply style fixes

* Apply suggestions from code review
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Apply style fixes

---------
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aseem Saxena <aseem.bits@gmail.com>

5ce4814a

14 Jun, 2025 1 commit

Chroma Pipeline (#11698) · 8adc6003

Edna authored Jun 13, 2025



* working state from hameerabbasi and iddl

* working state from hameerabbasi and iddl (transformer)

* working state (normalization)

* working state (embeddings)

* add chroma loader

* add chroma to mappings

* add chroma to transformer init

* take out variant stuff

* get decently far in changing variant stuff

* add chroma init

* make chroma output class

* add chroma transformer to dummy tp

* add chroma to init

* add chroma to init

* fix single file

* update

* update

* add chroma to auto pipeline

* add chroma to pipeline init

* change to chroma transformer

* take out variant from blocks

* swap embedder location

* remove prompt_2

* work on swapping text encoders

* remove mask function

* dont modify mask (for now)

* wrap attn mask

* no attn mask (can't get it to work)

* remove pooled prompt embeds

* change to my own unpooled embedder

* fix load

* take pooled projections out of transformer

* ensure correct dtype for chroma embeddings

* update

* use dn6 attn mask + fix true_cfg_scale

* use chroma pipeline output

* use DN6 embeddings

* remove guidance

* remove guidance embed (pipeline)

* remove guidance from embeddings

* don't return length

* dont change dtype

* remove unused stuff, fix up docs

* add chroma autodoc

* add .md (oops)

* initial chroma docs

* undo don't change dtype

* undo arxiv change

unsure why that happened

* fix hf papers regression in more places

* Update docs/source/en/api/pipelines/chroma.md
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* do_cfg -> self.do_classifier_free_guidance

* Update docs/source/en/api/models/chroma_transformer.md
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update chroma.md

* Move chroma layers into transformer

* Remove pruned AdaLayerNorms

* Add chroma fast tests

* (untested) batch cond and uncond

* Add # Copied from for shift

* Update # Copied from statements

* update norm imports

* Revert cond + uncond batching

* Add transformer tests

* move chroma test (oops)

* chroma init

* fix chroma pipeline fast tests

* Update src/diffusers/models/transformers/transformer_chroma.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Move Approximator and Embeddings

* Fix auto pipeline + make style, quality

* make style

* Apply style fixes

* switch to new input ids

* fix # Copied from error

* remove # Copied from on protected members

* try to fix import

* fix import

* make fix-copies

* revert style fix

* update chroma transformer params

* update chroma transformer approximator init params

* update to pad tokens

* fix batch inference

* Make more pipeline tests work

* Make most transformer tests work

* fix docs

* make style, make quality

* skip batch tests

* fix test skipping

* fix test skipping again

* fix for tests

* Fix all pipeline test

* update

* push local changes, fix docs

* add encoder test, remove pooled dim

* default proj dim

* fix tests

* fix equal size list input

* update

* push local changes, fix docs

* add encoder test, remove pooled dim

* default proj dim

* fix tests

* fix equal size list input

* Revert "fix equal size list input"

This reverts commit 3fe4ad67d58d83715bc238f8654f5e90bfc5653c.

* update

* update

* update

* update

* update

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

8adc6003

13 Jun, 2025 1 commit

Cosmos Predict2 (#11695) · 9f91305f

Aryan authored Jun 14, 2025

* support text-to-image

* update example

* make fix-copies

* support use_flow_sigmas in EDM scheduler instead of maintain cosmos-specific scheduler

* support video-to-world

* update

* rename text2image pipeline

* make fix-copies

* add t2i test

* add test for v2w pipeline

* support edm dpmsolver multistep

* update

* update

* update

* update tests

* fix tests

* safety checker

* make conversion script work without guardrail

9f91305f

11 Jun, 2025 2 commits

Apply Occam's Razor in position embedding calculation (#11562) · 47ef7946
Tolga Cangöz authored Jun 12, 2025
```
* fix: remove redundant indexing

* style
```
47ef7946

[tests] model-level `device_map` clarifications (#11681) · 91545666

Sayak Paul authored Jun 11, 2025

* add clarity in documentation for device_map

* docs

* fix how compiler tester mixins are used.

* propagate

* more

* typo.

* fix tests

* fix order of decorators.

* clarify more.

* more test cases.

* fix doc

* fix device_map docstring in pipeline_utils.

* more examples

* more

* update

* remove code for stuff that is already supported.

* fix stuff.

91545666

08 Jun, 2025 1 commit
- fixed axes_dims_rope init (huggingface#11641) (#11678) · f46abfe4
  Valeriy Sofin authored Jun 08, 2025
  
  f46abfe4
06 Jun, 2025 1 commit

Wan VACE (#11582) · 73a9d585

Aryan authored Jun 06, 2025

* initial support

* make fix-copies

* fix no split modules

* add conversion script

* refactor

* add pipeline test

* refactor

* fix bug with mask

* fix for reference images

* remove print

* update docs

* update slices

* update

* update

* update example

73a9d585

02 Jun, 2025 1 commit
- Use float32 RoPE freqs in Wan with MPS backends (#11643) · 3a31b291
  Roy Hvaara authored Jun 02, 2025
```
Use float32 for RoPE on MPS in Wan
```
  3a31b291
30 May, 2025 2 commits

Fix typos in strings and comments (#11476) · 8183d0f1

co63oc authored May 30, 2025



* Fix typos in strings and comments
Signed-off-by: co63oc <co63oc@users.noreply.github.com>

* Update src/diffusers/hooks/hooks.py
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update src/diffusers/hooks/hooks.py
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update layerwise_casting.py

* Apply style fixes

* update

---------
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

8183d0f1

removing unnecessary else statement (#11624) · 3651bdb7
Yaniv Galron authored May 30, 2025
```
Co-authored-by: Aryan <aryan@huggingface.co>
```
3651bdb7

26 May, 2025 1 commit

[Feature] AutoModel can load components using model_index.json (#11401) · f64fa949

Ishan Modi authored May 26, 2025



* update

* update

* update

* update

* addressed PR comments

* update

* addressed PR comments

* added tests

* addressed PR comments

* updates

* update

* addressed PR comments

* update

* fix style

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

f64fa949

19 May, 2025 1 commit

Use HF Papers (#11567) · c8bb1ff5

Quentin Gallouédec authored May 19, 2025



* Use HF Papers

* Apply style fixes

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

c8bb1ff5

15 May, 2025 1 commit
- [Single File] GGUF/Single File Support for HiDream (#11550) · 4267d8f4
  Dhruv Nair authored May 15, 2025
```
* update

* update

* update

* update

* update

* update

* update
```
  4267d8f4
13 May, 2025 1 commit
- fix: remove `torch_dtype="auto"` option from docstrings (#11513) · f8d4a1e2
  johannaSommer authored May 13, 2025
```
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
```
  f8d4a1e2
11 May, 2025 1 commit
- [tests] add tests for framepack transformer model. (#11520) · 01abfc87
  Sayak Paul authored May 11, 2025
```
* start.

* add tests for framepack transformer model.

* merge conflicts.

* make to square.

* fixes
```
  01abfc87
09 May, 2025 1 commit
- Change Framepack transformer layer initialization order (#11535) · 92fe689f
  Aryan authored May 09, 2025
```
update
```
  92fe689f
08 May, 2025 1 commit
- Conditionally import torchvision in Cosmos transformer (#11524) · 6674a515
  Aryan authored May 08, 2025
```
fix
```
  6674a515
07 May, 2025 1 commit

Cosmos (#10660) · 7b904941

Aryan authored May 07, 2025



* begin transformer conversion

* refactor

* refactor

* refactor

* refactor

* refactor

* refactor

* update

* add conversion script

* add pipeline

* make fix-copies

* remove einops

* update docs

* gradient checkpointing

* add transformer test

* update

* debug

* remove prints

* match sigmas

* add vae pt. 1

* finish CV* vae

* update

* update

* update

* update

* update

* update

* make fix-copies

* update

* make fix-copies

* fix

* update

* update

* make fix-copies

* update

* update tests

* handle device and dtype for safety checker; required in latest diffusers

* remove enable_gqa and use repeat_interleave instead

* enforce safety checker; use dummy checker in fast tests

* add review suggestion for ONNX export
Co-Authored-By: Asfiya Baig <asfiyab@nvidia.com>

* fix safety_checker issues when not passed explicitly

We could either do what's done in this commit, or update the Cosmos examples to explicitly pass the safety checker

* use cosmos guardrail package

* auto format docs

* update conversion script to support 14B models

* update name CosmosPipeline -> CosmosTextToWorldPipeline

* update docs

* fix docs

* fix group offload test failing for vae

---------
Co-authored-by: Asfiya Baig <asfiyab@nvidia.com>

7b904941

06 May, 2025 1 commit

Hunyuan Video Framepack (#11428) · d7ffe601

Aryan authored May 06, 2025

* add transformer

* add pipeline

* fixes

* make fix-copies

* update

* add flux mu shift

* update example snippet

* debug

* cleanup

* batch_size=1 optimization

* add pipeline test

* fix for model cpu offloading

* add last_image support; credits: https://github.com/lllyasviel/FramePack/pull/167

* update example with flf2v

* update penguin url

* fix test

* address review comment: https://github.com/huggingface/diffusers/pull/11428#discussion_r2071032371

* address review comment: https://github.com/huggingface/diffusers/pull/11428#discussion_r2071087689



* Update src/diffusers/pipelines/hunyuan_video/pipeline_hunyuan_video_framepack.py

---------
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>

d7ffe601

05 May, 2025 1 commit
- [Feature] Implement tiled VAE encoding/decoding for Wan model. (#11414) · 8520d496
  Connector Switch authored May 05, 2025
```
* implement tiled encode/decode

* address review comments
```
  8520d496
01 May, 2025 2 commits

Fix typos in docs and comments (#11416) · 86294d3c

co63oc authored May 01, 2025



* Fix typos in docs and comments

* Apply style fixes

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

86294d3c

[WAN] fix recompilation issues (#11475) · d70f8ee1

Sayak Paul authored May 01, 2025



* [tests] Add torch.compile() test for WanTransformer3DModel

* fix wan recompilation issues.

* style

---------
Co-authored-by: tongyu0924 <winnie920924@gmail.com>

d70f8ee1

30 Apr, 2025 1 commit
- `torch.compile` fullgraph compatibility for Hunyuan Video (#11457) · c8651158
  Aryan authored Apr 30, 2025
```
update
```
  c8651158
24 Apr, 2025 1 commit
- Fix typos in strings and comments (#11407) · f00a9957
  co63oc authored Apr 25, 2025
  
  f00a9957
22 Apr, 2025 3 commits

[HiDream] move deprecation to 0.35.0 (#11384) · 448c72a2
YiYi Xu authored Apr 22, 2025
```
up
```
448c72a2
Update modeling imports (#11129) · f108ad88
Aryan authored Apr 22, 2025
```
update
```
f108ad88

[LoRA] add LoRA support to HiDream and fine-tuning script (#11281) · e30d3bf5

Linoy Tsaban authored Apr 22, 2025



* initial commit

* initial commit

* initial commit

* initial commit

* initial commit

* initial commit

* Update examples/dreambooth/train_dreambooth_lora_hidream.py
Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>

* move prompt embeds, pooled embeds outside

* Update examples/dreambooth/train_dreambooth_lora_hidream.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update examples/dreambooth/train_dreambooth_lora_hidream.py
Co-authored-by: hlky <hlky@hlky.ac>

* fix import

* fix import and tokenizer 4, text encoder 4 loading

* te

* prompt embeds

* fix naming

* shapes

* initial commit to add HiDreamImageLoraLoaderMixin

* fix init

* add tests

* loader

* fix model input

* add code example to readme

* fix default max length of text encoders

* prints

* nullify training cond in unpatchify for temp fix to incompatible shaping of transformer output during training

* smol fix

* unpatchify

* unpatchify

* fix validation

* flip pred and loss

* fix shift!!!

* revert unpatchify changes (for now)

* smol fix

* Apply style fixes

* workaround moe training

* workaround moe training

* remove prints

* to reduce some memory, keep vae in `weight_dtype` same as we have for flux (as it's the same vae)
https://github.com/huggingface/diffusers/blob/bbd0c161b55ba2234304f1e6325832dd69c60565/examples/dreambooth/train_dreambooth_lora_flux.py#L1207



* refactor to align with HiDream refactor

* refactor to align with HiDream refactor

* refactor to align with HiDream refactor

* add support for cpu offloading of text encoders

* Apply style fixes

* adjust lr and rank for train example

* fix copies

* Apply style fixes

* update README

* update README

* update README

* fix license

* keep prompt2,3,4 as None in validation

* remove reverse ode comment

* Update examples/dreambooth/train_dreambooth_lora_hidream.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/train_dreambooth_lora_hidream.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* vae offload change

* fix text encoder offloading

* Apply style fixes

* cleaner to_kwargs

* fix module name in copied from

* add requirements

* fix offloading

* fix offloading

* fix offloading

* update transformers version in reqs

* try AutoTokenizer

* try AutoTokenizer

* Apply style fixes

* empty commit

* Delete tests/lora/test_lora_layers_hidream.py

* change tokenizer_4 to load with AutoTokenizer as well

* make text_encoder_four and tokenizer_four configurable

* save model card

* save model card

* revert T5

* fix test

* remove non diffusers lumina2 conversion

---------
Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

e30d3bf5

21 Apr, 2025 1 commit

[cogview4][feat] Support attention mechanism with variable-length support and... · 0434db9a

OleehyO authored Apr 22, 2025


[cogview4][feat] Support attention mechanism with variable-length support and batch packing (#11349)

* [cogview4] Enhance attention mechanism with variable-length support and batch packing

---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

0434db9a

19 Apr, 2025 1 commit
- update output for Hidream transformer (#11366) · 5a2e0f71
  YiYi Xu authored Apr 18, 2025
```
up
```
  5a2e0f71
18 Apr, 2025 1 commit

support Wan-FLF2V (#11353) · 0021bfa1

YiYi Xu authored Apr 18, 2025



* update transformer

---------
Co-authored-by: Aryan <aryan@huggingface.co>

0021bfa1

17 Apr, 2025 2 commits
- Update controlnet_flux.py (#11350) · ee6ad51d
  Frank (Haofan) Wang authored Apr 18, 2025
  
  ee6ad51d
- [Hi Dream] follow-up (#11296) · 05679329
  YiYi Xu authored Apr 17, 2025
```
* add
```
  05679329
15 Apr, 2025 3 commits

Rewrite AuraFlowPatchEmbed.pe_selection_index_based_on_dim to be torch.compile compatible (#11297) · b6156aaf

AstraliteHeart authored Apr 15, 2025



* Update pe_selection_index_based_on_dim

* Make pe_selection_index_based_on_dim work with torch.compile

* Fix AuraFlowTransformer2DModel's docstring default values

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

b6156aaf

Fix vae.Decoder prev_output_channel (#11280) · 6e80d240
hlky authored Apr 15, 2025

6e80d240

[LoRA] Add LoRA support to AuraFlow (#10216) · 9352a5ca

Hameer Abbasi authored Apr 15, 2025



* Add AuraFlowLoraLoaderMixin

* Add comments, remove qkv fusion

* Add Tests

* Add AuraFlowLoraLoaderMixin to documentation

* Add Suggested changes

* Change attention_kwargs->joint_attention_kwargs

* Rebasing derp.

* fix

* fix

* Quality fixes.

* make style

* `make fix-copies`

* `ruff check --fix`

* Attempt 1 to fix tests.

* Attempt 2 to fix tests.

* Attempt 3 to fix tests.

* Address review comments.

* Rebasing derp.

* Get more tests passing by copying from Flux. Address review comments.

* `joint_attention_kwargs`->`attention_kwargs`

* Add `lora_scale` property for te LoRAs.

* Make test better.

* Remove useless property.

* Skip TE-only tests for AuraFlow.

* Support LoRA for non-CLIP TEs.

* Restore LoRA tests.

* Undo adding LoRA support for non-CLIP TEs.

* Undo support for TE in AuraFlow LoRA.

* `make fix-copies`

* Sync with upstream changes.

* Remove unneeded stuff.

* Mirror `Lumina2`.

* Skip for MPS.

* Address review comments.

* Remove duplicated code.

* Remove unnecessary code.

* Remove repeated docs.

* Propagate attention.

* Fix TE target modules.

* MPS fix for LoRA tests.

* Unrelated TE LoRA tests fix.

* Fix AuraFlow LoRA tests by applying to the right denoiser layers.
Co-authored-by: AstraliteHeart <81396681+AstraliteHeart@users.noreply.github.com>

* Apply style fixes

* empty commit

* Fix the repo consistency issues.

* Remove unrelated changes.

* Style.

* Fix `test_lora_fuse_nan`.

* fix quality issues.

* `pytest.xfail` -> `ValueError`.

* Add back `skip_mps`.

* Apply style fixes

* `make fix-copies`

---------
Co-authored-by: Warlord-K <warlordk28@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: AstraliteHeart <81396681+AstraliteHeart@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

9352a5ca

14 Apr, 2025 1 commit
- Use float32 on mps or npu in transformer_hidream_image's rope (#11316) · dcf836cf
  hlky authored Apr 14, 2025
  
  dcf836cf
13 Apr, 2025 3 commits

[ControlNet] Adds controlnet for SanaTransformer (#11040) · f1f38ffb

Ishan Modi authored Apr 13, 2025



* added controlnet for sana transformer

* improve code quality

* addressed PR comments

* bug fixes

* added test cases

* update

* added dummy objects

* addressed PR comments

* update

* Forcing update

* add to docs

* code quality

* addressed PR comments

* addressed PR comments

* update

* addressed PR comments

* added proper styling

* update

* Revert "added proper styling"

This reverts commit 344ee8a7014ada095b295034ef84341f03b0e359.

* manually ordered

* Apply suggestions from code review

---------
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

f1f38ffb

Fix incorrect tile_latent_min_width calculations (#11305) · 36538e11
Tuna Tuncer authored Apr 13, 2025

36538e11

Hidream refactoring follow ups (#11299) · 97e0ef4d

Aryan authored Apr 13, 2025



* HiDream Image

* update

* -einops

* py3.8

* fix -einops

* mixins, offload_seq, option_components

* docs

* Apply style fixes

* trigger tests

* Apply suggestions from code review
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* joint_attention_kwargs -> attention_kwargs, fixes

* fast tests

* -_init_weights

* style tests

* move reshape logic

* update slice 😴

* supports_dduf

* 🤷🏻‍♂️

* Update src/diffusers/models/transformers/transformer_hidream_image.py
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* address review comments

* update tests

* doc updates

* update

* Update src/diffusers/models/transformers/transformer_hidream_image.py

* Apply style fixes

---------
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

97e0ef4d