Commits · c90352754a802757f1c4ff3a9a5272f010a1dc43 · renzhc / diffusers_dcu

11 Jul, 2025 1 commit

(#11904) · c9035275

Aryan authored Jul 11, 2025



* update

* update

* update

* pin accelerate version

* add comment explanations

* update docstring

* make style

* non_blocking does not matter for dtype cast

* _empty_cache -> clear_cache

* update

* Update src/diffusers/models/model_loading_utils.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/diffusers/models/model_loading_utils.py

---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

c9035275

10 Jul, 2025 2 commits
- Avoid creating tensor in CosmosAttnProcessor2_0 (#11761) (#11763) · 941b7fc0
  chenxiao authored Jul 11, 2025
```
* Avoid creating tensor in CosmosAttnProcessor2_0 (#11761)

* up

---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>
```
  941b7fc0
- [ControlnetUnion] Multiple Fixes (#11888) · 76a62ac9
  Álvaro Somoza authored Jul 10, 2025
```
fixes

---------
Co-authored-by: hlky <hlky@hlky.ac>
```
  76a62ac9
08 Jul, 2025 1 commit

First Block Cache (#11180) · 0454fbb3

Aryan authored Jul 09, 2025



* update

* modify flux single blocks to make compatible with cache techniques (without too much model-specific intrusion code)

* remove debug logs

* update

* cache context for different batches of data

* fix hs residual bug for single return outputs; support ltx

* fix controlnet flux

* support flux, ltx i2v, ltx condition

* update

* update

* Update docs/source/en/api/cache.md

* Update src/diffusers/hooks/hooks.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* address review comments pt. 1

* address review comments pt. 2

* cache context refacotr; address review pt. 3

* address review comments

* metadata registration with decorators instead of centralized

* support cogvideox

* support mochi

* fix

* remove unused function

* remove central registry based on review

* update

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

0454fbb3

01 Jul, 2025 2 commits

Use real-valued instead of complex tensors in Wan2.1 RoPE (#11649) · 62e847db

Mikko Tukiainen authored Jul 02, 2025



* use real instead of complex tensors in Wan2.1 RoPE

* remove the redundant type conversion

* unpack rotary_emb

* register rotary embedding frequencies as non-persistent buffers

* Apply style fixes

---------
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

62e847db

[single file] Cosmos (#11801) · a79c3af6
Aryan authored Jul 01, 2025
```
* update

* update

* update docs
```
a79c3af6

26 Jun, 2025 1 commit

[rfc][compile] compile method for DiffusionPipeline (#11705) · d93381cd

Animesh Jain authored Jun 25, 2025



* [rfc][compile] compile method for DiffusionPipeline

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Apply style fixes

* Update docs/source/en/optimization/fp16.md

* check

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

d93381cd

24 Jun, 2025 1 commit
- [tests] Fix group offloading and layerwise casting test interaction (#11796) · 5df02fc1
  Aryan authored Jun 24, 2025
```
* update

* update

* update
```
  5df02fc1
21 Jun, 2025 1 commit
- Fix dimensionalities in `apply_rotary_emb` functions' comments (#11717) · 7fc53b5d
  Tolga Cangöz authored Jun 22, 2025
```
Fix dimensionality in `apply_rotary_emb` functions' comments.
```
  7fc53b5d
19 Jun, 2025 2 commits

make group offloading work with disk/nvme transfers (#11682) · 85a916bb

Sayak Paul authored Jun 19, 2025

* start implementing disk offloading in group.

* delete diff file.

* updates.patch

* offload_to_disk_path

* check if safetensors already exist.

* add test and clarify.

* updates

* update todos.

* update more docs.

* update docs

85a916bb

Update more licenses to 2025 (#11746) · a4df8dbc
Aryan authored Jun 19, 2025
```
update
```
a4df8dbc

18 Jun, 2025 3 commits

Chroma Follow Up (#11725) · 66394bf6

Dhruv Nair authored Jun 18, 2025

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* updte

* update

* update

* update

66394bf6

[chore] change to 2025 licensing for remaining (#11741) · 62cce304
Sayak Paul authored Jun 18, 2025
```
change to 2025 licensing for remaining
```
62cce304

⚡

️ Speed up method `AutoencoderKLWan.clear_cache` by 886% (#11665) · 5ce4814a

Saurabh Misra authored Jun 17, 2025

* ⚡

️ Speed up method `AutoencoderKLWan.clear_cache` by 886%

**Key optimizations:**
- Compute the number of `WanCausalConv3d` modules in each model (`encoder`/`decoder`) **only once during initialization**, store in `self._cached_conv_counts`. This removes unnecessary repeated tree traversals at every `clear_cache` call, which was the main bottleneck (from profiling).
- The internal helper `_count_conv3d_fast` is optimized via a generator expression with `sum` for efficiency.

All comments from the original code are preserved, except for updated or removed local docstrings/comments relevant to changed lines.  
**Function signatures and outputs remain unchanged.**

* Apply style fixes

* Apply suggestions from code review
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Apply style fixes

---------
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aseem Saxena <aseem.bits@gmail.com>

5ce4814a

14 Jun, 2025 1 commit

Chroma Pipeline (#11698) · 8adc6003

Edna authored Jun 13, 2025



* working state from hameerabbasi and iddl

* working state form hameerabbasi and iddl (transformer)

* working state (normalization)

* working state (embeddings)

* add chroma loader

* add chroma to mappings

* add chroma to transformer init

* take out variant stuff

* get decently far in changing variant stuff

* add chroma init

* make chroma output class

* add chroma transformer to dummy tp

* add chroma to init

* add chroma to init

* fix single file

* update

* update

* add chroma to auto pipeline

* add chroma to pipeline init

* change to chroma transformer

* take out variant from blocks

* swap embedder location

* remove prompt_2

* work on swapping text encoders

* remove mask function

* dont modify mask (for now)

* wrap attn mask

* no attn mask (can't get it to work)

* remove pooled prompt embeds

* change to my own unpooled embeddeer

* fix load

* take pooled projections out of transformer

* ensure correct dtype for chroma embeddings

* update

* use dn6 attn mask + fix true_cfg_scale

* use chroma pipeline output

* use DN6 embeddings

* remove guidance

* remove guidance embed (pipeline)

* remove guidance from embeddings

* don't return length

* dont change dtype

* remove unused stuff, fix up docs

* add chroma autodoc

* add .md (oops)

* initial chroma docs

* undo don't change dtype

* undo arxiv change

unsure why that happened

* fix hf papers regression in more places

* Update docs/source/en/api/pipelines/chroma.md
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* do_cfg -> self.do_classifier_free_guidance

* Update docs/source/en/api/models/chroma_transformer.md
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update chroma.md

* Move chroma layers into transformer

* Remove pruned AdaLayerNorms

* Add chroma fast tests

* (untested) batch cond and uncond

* Add # Copied from for shift

* Update # Copied from statements

* update norm imports

* Revert cond + uncond batching

* Add transformer tests

* move chroma test (oops)

* chroma init

* fix chroma pipeline fast tests

* Update src/diffusers/models/transformers/transformer_chroma.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Move Approximator and Embeddings

* Fix auto pipeline + make style, quality

* make style

* Apply style fixes

* switch to new input ids

* fix # Copied from error

* remove # Copied from on protected members

* try to fix import

* fix import

* make fix-copes

* revert style fix

* update chroma transformer params

* update chroma transformer approximator init params

* update to pad tokens

* fix batch inference

* Make more pipeline tests work

* Make most transformer tests work

* fix docs

* make style, make quality

* skip batch tests

* fix test skipping

* fix test skipping again

* fix for tests

* Fix all pipeline test

* update

* push local changes, fix docs

* add encoder test, remove pooled dim

* default proj dim

* fix tests

* fix equal size list input

* update

* push local changes, fix docs

* add encoder test, remove pooled dim

* default proj dim

* fix tests

* fix equal size list input

* Revert "fix equal size list input"

This reverts commit 3fe4ad67d58d83715bc238f8654f5e90bfc5653c.

* update

* update

* update

* update

* update

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

8adc6003

13 Jun, 2025 1 commit

Cosmos Predict2 (#11695) · 9f91305f

Aryan authored Jun 14, 2025

* support text-to-image

* update example

* make fix-copies

* support use_flow_sigmas in EDM scheduler instead of maintain cosmos-specific scheduler

* support video-to-world

* update

* rename text2image pipeline

* make fix-copies

* add t2i test

* add test for v2w pipeline

* support edm dpmsolver multistep

* update

* update

* update

* update tests

* fix tests

* safety checker

* make conversion script work without guardrail

9f91305f

11 Jun, 2025 2 commits

Apply Occam's Razor in position embedding calculation (#11562) · 47ef7946
Tolga Cangöz authored Jun 12, 2025
```
* fix: remove redundant indexing

* style
```
47ef7946

[tests] model-level `device_map` clarifications (#11681) · 91545666

Sayak Paul authored Jun 11, 2025

* add clarity in documentation for device_map

* docs

* fix how compiler tester mixins are used.

* propagate

* more

* typo.

* fix tests

* fix order of decroators.

* clarify more.

* more test cases.

* fix doc

* fix device_map docstring in pipeline_utils.

* more examples

* more

* update

* remove code for stuff that is already supported.

* fix stuff.

91545666

08 Jun, 2025 1 commit
- fixed axes_dims_rope init (huggingface#11641) (#11678) · f46abfe4
  Valeriy Sofin authored Jun 08, 2025
  
  f46abfe4
06 Jun, 2025 1 commit

Wan VACE (#11582) · 73a9d585

Aryan authored Jun 06, 2025

* initial support

* make fix-copies

* fix no split modules

* add conversion script

* refactor

* add pipeline test

* refactor

* fix bug with mask

* fix for reference images

* remove print

* update docs

* update slices

* update

* update

* update example

73a9d585

02 Jun, 2025 1 commit
- Use float32 RoPE freqs in Wan with MPS backends (#11643) · 3a31b291
  Roy Hvaara authored Jun 02, 2025
```
Use float32 for RoPE on MPS in Wan
```
  3a31b291
30 May, 2025 2 commits

Fix typos in strings and comments (#11476) · 8183d0f1

co63oc authored May 30, 2025



* Fix typos in strings and comments
Signed-off-by: co63oc <co63oc@users.noreply.github.com>

* Update src/diffusers/hooks/hooks.py
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update src/diffusers/hooks/hooks.py
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update layerwise_casting.py

* Apply style fixes

* update

---------
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

8183d0f1

removing unnecessary else statement (#11624) · 3651bdb7
Yaniv Galron authored May 30, 2025
```
Co-authored-by: Aryan <aryan@huggingface.co>
```
3651bdb7

26 May, 2025 1 commit

[Feature] AutoModel can load components using model_index.json (#11401) · f64fa949

Ishan Modi authored May 26, 2025



* update

* update

* update

* update

* addressed PR comments

* update

* addressed PR comments

* added tests

* addressed PR comments

* updates

* update

* addressed PR comments

* update

* fix style

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

f64fa949

19 May, 2025 1 commit

Use HF Papers (#11567) · c8bb1ff5

Quentin Gallouédec authored May 19, 2025



* Use HF Papers

* Apply style fixes

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

c8bb1ff5

15 May, 2025 1 commit
- [Single File] GGUF/Single File Support for HiDream (#11550) · 4267d8f4
  Dhruv Nair authored May 15, 2025
```
* update

* update

* update

* update

* update

* update

* update
```
  4267d8f4
13 May, 2025 1 commit
- fix: remove `torch_dtype="auto"` option from docstrings (#11513) · f8d4a1e2
  johannaSommer authored May 13, 2025
```
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
```
  f8d4a1e2
11 May, 2025 1 commit
- [tests] add tests for framepack transformer model. (#11520) · 01abfc87
  Sayak Paul authored May 11, 2025
```
* start.

* add tests for framepack transformer model.

* merge conflicts.

* make to square.

* fixes
```
  01abfc87
09 May, 2025 1 commit
- Change Framepack transformer layer initialization order (#11535) · 92fe689f
  Aryan authored May 09, 2025
```
update
```
  92fe689f
08 May, 2025 1 commit
- Conditionally import torchvision in Cosmos transformer (#11524) · 6674a515
  Aryan authored May 08, 2025
```
fix
```
  6674a515
07 May, 2025 1 commit

Cosmos (#10660) · 7b904941

Aryan authored May 07, 2025



* begin transformer conversion

* refactor

* refactor

* refactor

* refactor

* refactor

* refactor

* update

* add conversion script

* add pipeline

* make fix-copies

* remove einops

* update docs

* gradient checkpointing

* add transformer test

* update

* debug

* remove prints

* match sigmas

* add vae pt. 1

* finish CV* vae

* update

* update

* update

* update

* update

* update

* make fix-copies

* update

* make fix-copies

* fix

* update

* update

* make fix-copies

* update

* update tests

* handle device and dtype for safety checker; required in latest diffusers

* remove enable_gqa and use repeat_interleave instead

* enforce safety checker; use dummy checker in fast tests

* add review suggestion for ONNX export
Co-Authored-By: Asfiya Baig <asfiyab@nvidia.com>

* fix safety_checker issues when not passed explicitly

We could either do what's done in this commit, or update the Cosmos examples to explicitly pass the safety checker

* use cosmos guardrail package

* auto format docs

* update conversion script to support 14B models

* update name CosmosPipeline -> CosmosTextToWorldPipeline

* update docs

* fix docs

* fix group offload test failing for vae

---------
Co-authored-by: Asfiya Baig <asfiyab@nvidia.com>

7b904941

06 May, 2025 1 commit

Hunyuan Video Framepack (#11428) · d7ffe601

Aryan authored May 06, 2025

* add transformer

* add pipeline

* fixes

* make fix-copies

* update

* add flux mu shift

* update example snippet

* debug

* cleanup

* batch_size=1 optimization

* add pipeline test

* fix for model cpu offloading'

* add last_image support; credits: https://github.com/lllyasviel/FramePack/pull/167

* update example with flf2v

* update penguin url

* fix test

* address review comment: https://github.com/huggingface/diffusers/pull/11428#discussion_r2071032371

* address review comment: https://github.com/huggingface/diffusers/pull/11428#discussion_r2071087689



* Update src/diffusers/pipelines/hunyuan_video/pipeline_hunyuan_video_framepack.py

---------
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>

d7ffe601

05 May, 2025 1 commit
- [Feature] Implement tiled VAE encoding/decoding for Wan model. (#11414) · 8520d496
  Connector Switch authored May 05, 2025
```
* implement tiled encode/decode

* address review comments
```
  8520d496
01 May, 2025 2 commits

Fix typos in docs and comments (#11416) · 86294d3c

co63oc authored May 01, 2025



* Fix typos in docs and comments

* Apply style fixes

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

86294d3c

[WAN] fix recompilation issues (#11475) · d70f8ee1

Sayak Paul authored May 01, 2025



* [tests] Add torch.compile() test for WanTransformer3DModel

* fix wan recompilation issues.

* style

---------
Co-authored-by: tongyu0924 <winnie920924@gmail.com>

d70f8ee1

30 Apr, 2025 1 commit
- `torch.compile` fullgraph compatibility for Hunyuan Video (#11457) · c8651158
  Aryan authored Apr 30, 2025
```
udpate
```
  c8651158
24 Apr, 2025 1 commit
- Fix typos in strings and comments (#11407) · f00a9957
  co63oc authored Apr 25, 2025
  
  f00a9957
22 Apr, 2025 3 commits

[HiDream] move deprecation to 0.35.0 (#11384) · 448c72a2
YiYi Xu authored Apr 22, 2025
```
up
```
448c72a2
Update modeling imports (#11129) · f108ad88
Aryan authored Apr 22, 2025
```
update
```
f108ad88

[LoRA] add LoRA support to HiDream and fine-tuning script (#11281) · e30d3bf5

Linoy Tsaban authored Apr 22, 2025



* initial commit

* initial commit

* initial commit

* initial commit

* initial commit

* initial commit

* Update examples/dreambooth/train_dreambooth_lora_hidream.py
Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>

* move prompt embeds, pooled embeds outside

* Update examples/dreambooth/train_dreambooth_lora_hidream.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update examples/dreambooth/train_dreambooth_lora_hidream.py
Co-authored-by: hlky <hlky@hlky.ac>

* fix import

* fix import and tokenizer 4, text encoder 4 loading

* te

* prompt embeds

* fix naming

* shapes

* initial commit to add HiDreamImageLoraLoaderMixin

* fix init

* add tests

* loader

* fix model input

* add code example to readme

* fix default max length of text encoders

* prints

* nullify training cond in unpatchify for temp fix to incompatible shaping of transformer output during training

* smol fix

* unpatchify

* unpatchify

* fix validation

* flip pred and loss

* fix shift!!!

* revert unpatchify changes (for now)

* smol fix

* Apply style fixes

* workaround moe training

* workaround moe training

* remove prints

* to reduce some memory, keep vae in `weight_dtype` same as we have for flux (as it's the same vae)
https://github.com/huggingface/diffusers/blob/bbd0c161b55ba2234304f1e6325832dd69c60565/examples/dreambooth/train_dreambooth_lora_flux.py#L1207



* refactor to align with HiDream refactor

* refactor to align with HiDream refactor

* refactor to align with HiDream refactor

* add support for cpu offloading of text encoders

* Apply style fixes

* adjust lr and rank for train example

* fix copies

* Apply style fixes

* update README

* update README

* update README

* fix license

* keep prompt2,3,4 as None in validation

* remove reverse ode comment

* Update examples/dreambooth/train_dreambooth_lora_hidream.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/train_dreambooth_lora_hidream.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* vae offload change

* fix text encoder offloading

* Apply style fixes

* cleaner to_kwargs

* fix module name in copied from

* add requirements

* fix offloading

* fix offloading

* fix offloading

* update transformers version in reqs

* try AutoTokenizer

* try AutoTokenizer

* Apply style fixes

* empty commit

* Delete tests/lora/test_lora_layers_hidream.py

* change tokenizer_4 to load with AutoTokenizer as well

* make text_encoder_four and tokenizer_four configurable

* save model card

* save model card

* revert T5

* fix test

* remove non diffusers lumina2 conversion

---------
Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

e30d3bf5