- 22 Apr, 2025 3 commits
-
YiYi Xu authored
up
-
Aryan authored
update
-
Linoy Tsaban authored
* initial commit
* initial commit
* initial commit
* initial commit
* initial commit
* initial commit
* Update examples/dreambooth/train_dreambooth_lora_hidream.py (Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>)
* move prompt embeds, pooled embeds outside
* Update examples/dreambooth/train_dreambooth_lora_hidream.py (Co-authored-by: hlky <hlky@hlky.ac>)
* Update examples/dreambooth/train_dreambooth_lora_hidream.py (Co-authored-by: hlky <hlky@hlky.ac>)
* fix import
* fix import and tokenizer 4, text encoder 4 loading
* te
* prompt embeds
* fix naming
* shapes
* initial commit to add HiDreamImageLoraLoaderMixin
* fix init
* add tests
* loader
* fix model input
* add code example to readme
* fix default max length of text encoders
* prints
* nullify training cond in unpatchify for temp fix to incompatible shaping of transformer output during training
* smol fix
* unpatchify
* unpatchify
* fix validation
* flip pred and loss
* fix shift!!!
* revert unpatchify changes (for now)
* smol fix
* Apply style fixes
* workaround moe training
* workaround moe training
* remove prints
* to reduce some memory, keep vae in `weight_dtype` same as we have for flux (as it's the same vae) https://github.com/huggingface/diffusers/blob/bbd0c161b55ba2234304f1e6325832dd69c60565/examples/dreambooth/train_dreambooth_lora_flux.py#L1207
* refactor to align with HiDream refactor
* refactor to align with HiDream refactor
* refactor to align with HiDream refactor
* add support for cpu offloading of text encoders
* Apply style fixes
* adjust lr and rank for train example
* fix copies
* Apply style fixes
* update README
* update README
* update README
* fix license
* keep prompt2,3,4 as None in validation
* remove reverse ode comment
* Update examples/dreambooth/train_dreambooth_lora_hidream.py (Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>)
* Update examples/dreambooth/train_dreambooth_lora_hidream.py (Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>)
* vae offload change
* fix text encoder offloading
* Apply style fixes
* cleaner to_kwargs
* fix module name in copied from
* add requirements
* fix offloading
* fix offloading
* fix offloading
* update transformers version in reqs
* try AutoTokenizer
* try AutoTokenizer
* Apply style fixes
* empty commit
* Delete tests/lora/test_lora_layers_hidream.py
* change tokenizer_4 to load with AutoTokenizer as well
* make text_encoder_four and tokenizer_four configurable
* save model card
* save model card
* revert T5
* fix test
* remove non diffusers lumina2 conversion
---------
Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
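For context, a minimal sketch of consuming a LoRA trained with train_dreambooth_lora_hidream.py; the LoRA repo id is a placeholder, and the Llama text-encoder setup follows the HiDream pipeline's documented pattern:

```python
import torch
from transformers import LlamaForCausalLM, PreTrainedTokenizerFast
from diffusers import HiDreamImagePipeline

# Llama-3.1 serves as tokenizer_4/text_encoder_4 for HiDream.
tokenizer_4 = PreTrainedTokenizerFast.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
text_encoder_4 = LlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    output_hidden_states=True,
    output_attentions=True,
    torch_dtype=torch.bfloat16,
)

pipe = HiDreamImagePipeline.from_pretrained(
    "HiDream-ai/HiDream-I1-Full",
    tokenizer_4=tokenizer_4,
    text_encoder_4=text_encoder_4,
    torch_dtype=torch.bfloat16,
).to("cuda")

# Placeholder repo id for weights produced by the training script.
pipe.load_lora_weights("your-username/hidream-dreambooth-lora")
image = pipe("a photo of sks dog in a bucket").images[0]
```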
-
- 21 Apr, 2025 1 commit
-
OleehyO authored
[cogview4][feat] Support attention mechanism with variable-length support and batch packing (#11349)
* [cogview4] Enhance attention mechanism with variable-length support and batch packing
---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
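To illustrate the idea (not the PR's actual implementation): batch packing concatenates variable-length sequences with no padding and restricts attention to per-sequence segments, e.g.:

```python
import torch
import torch.nn.functional as F

# Two prompts of different lengths, packed without padding tokens.
seqs = [torch.randn(5, 64), torch.randn(9, 64)]
packed = torch.cat(seqs, dim=0)                 # (14, 64)
cu_seqlens = torch.tensor([0, 5, 14])           # cumulative sequence boundaries

# Each sequence attends only within its own segment of the packed batch.
outputs = []
for start, end in zip(cu_seqlens[:-1], cu_seqlens[1:]):
    q = k = v = packed[start:end].unsqueeze(0)  # (1, seq_len, 64)
    outputs.append(F.scaled_dot_product_attention(q, k, v).squeeze(0))
```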
-
- 19 Apr, 2025 1 commit
-
YiYi Xu authored
up
-
- 18 Apr, 2025 1 commit
-
YiYi Xu authored
* update transformer
---------
Co-authored-by: Aryan <aryan@huggingface.co>
-
- 17 Apr, 2025 2 commits
-
Frank (Haofan) Wang authored
-
YiYi Xu authored
* add
-
- 15 Apr, 2025 3 commits
-
AstraliteHeart authored
* Update pe_selection_index_based_on_dim
* Make pe_selection_index_based_on_dim work with torch.compile
* Fix AuraFlowTransformer2DModel's docstring default values
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
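A sketch of what the torch.compile fix enables (usage assumed from standard diffusers patterns, not taken from the PR):

```python
import torch
from diffusers import AuraFlowPipeline

pipe = AuraFlowPipeline.from_pretrained("fal/AuraFlow", torch_dtype=torch.float16).to("cuda")
# pe_selection_index_based_on_dim runs inside the transformer's forward, so per
# this commit the model can now be compiled.
pipe.transformer = torch.compile(pipe.transformer, fullgraph=True)
image = pipe("a cat holding a sign that says hello").images[0]
```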
-
hlky authored
-
Hameer Abbasi authored
* Add AuraFlowLoraLoaderMixin
* Add comments, remove qkv fusion
* Add Tests
* Add AuraFlowLoraLoaderMixin to documentation
* Add Suggested changes
* Change attention_kwargs->joint_attention_kwargs
* Rebasing derp.
* fix
* fix
* Quality fixes.
* make style
* `make fix-copies`
* `ruff check --fix`
* Attempt 1 to fix tests.
* Attempt 2 to fix tests.
* Attempt 3 to fix tests.
* Address review comments.
* Rebasing derp.
* Get more tests passing by copying from Flux. Address review comments.
* `joint_attention_kwargs`->`attention_kwargs`
* Add `lora_scale` property for te LoRAs.
* Make test better.
* Remove useless property.
* Skip TE-only tests for AuraFlow.
* Support LoRA for non-CLIP TEs.
* Restore LoRA tests.
* Undo adding LoRA support for non-CLIP TEs.
* Undo support for TE in AuraFlow LoRA.
* `make fix-copies`
* Sync with upstream changes.
* Remove unneeded stuff.
* Mirror `Lumina2`.
* Skip for MPS.
* Address review comments.
* Remove duplicated code.
* Remove unnecessary code.
* Remove repeated docs.
* Propagate attention.
* Fix TE target modules.
* MPS fix for LoRA tests.
* Unrelated TE LoRA tests fix.
* Fix AuraFlow LoRA tests by applying to the right denoiser layers. (Co-authored-by: AstraliteHeart <81396681+AstraliteHeart@users.noreply.github.com>)
* Apply style fixes
* empty commit
* Fix the repo consistency issues.
* Remove unrelated changes.
* Style.
* Fix `test_lora_fuse_nan`.
* fix quality issues.
* `pytest.xfail` -> `ValueError`.
* Add back `skip_mps`.
* Apply style fixes
* `make fix-copies`
---------
Co-authored-by: Warlord-K <warlordk28@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: AstraliteHeart <81396681+AstraliteHeart@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
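With AuraFlowLoraLoaderMixin in place, loading LoRA weights should follow the usual diffusers pattern; a sketch (the LoRA repo id is a placeholder, and passing the scale through attention_kwargs is an assumption based on this PR's rename):

```python
import torch
from diffusers import AuraFlowPipeline

pipe = AuraFlowPipeline.from_pretrained("fal/AuraFlow", torch_dtype=torch.float16).to("cuda")
pipe.load_lora_weights("your-username/auraflow-lora")   # placeholder repo id
# Per the joint_attention_kwargs -> attention_kwargs rename, the LoRA scale
# is assumed to travel through attention_kwargs.
image = pipe("a cat", attention_kwargs={"scale": 0.8}).images[0]
```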
-
- 14 Apr, 2025 1 commit
-
hlky authored
-
- 13 Apr, 2025 3 commits
-
Ishan Modi authored
* added controlnet for sana transformer
* improve code quality
* addressed PR comments
* bug fixes
* added test cases
* update
* added dummy objects
* addressed PR comments
* update
* Forcing update
* add to docs
* code quality
* addressed PR comments
* addressed PR comments
* update
* addressed PR comments
* added proper styling
* update
* Revert "added proper styling" (this reverts commit 344ee8a7014ada095b295034ef84341f03b0e359)
* manually ordered
* Apply suggestions from code review
---------
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
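A hypothetical usage sketch based on this PR's additions; the class names mirror other diffusers ControlNet integrations, while the checkpoint ids, the control image URL, and the control_image argument name are all assumptions:

```python
import torch
from diffusers import SanaControlNetModel, SanaControlNetPipeline
from diffusers.utils import load_image

controlnet = SanaControlNetModel.from_pretrained(
    "your-org/sana-controlnet", torch_dtype=torch.bfloat16   # placeholder repo id
)
pipe = SanaControlNetPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_600M_1024px_diffusers",
    controlnet=controlnet,
    torch_dtype=torch.bfloat16,
).to("cuda")
control = load_image("https://example.com/edge_map.png")     # placeholder URL
image = pipe("a futuristic city at night", control_image=control).images[0]
```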
-
Tuna Tuncer authored
-
Aryan authored
* HiDream Image
* update
* -einops
* py3.8
* fix -einops
* mixins, offload_seq, option_components
* docs
* Apply style fixes
* trigger tests
* Apply suggestions from code review (Co-authored-by: Aryan <contact.aryanvs@gmail.com>)
* joint_attention_kwargs -> attention_kwargs, fixes
* fast tests
* -_init_weights
* style tests
* move reshape logic
* update slice 😴
* supports_dduf
* 🤷🏻‍♂️
* Update src/diffusers/models/transformers/transformer_hidream_image.py (Co-authored-by: Aryan <contact.aryanvs@gmail.com>)
* address review comments
* update tests
* doc updates
* update
* Update src/diffusers/models/transformers/transformer_hidream_image.py
* Apply style fixes
---------
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 11 Apr, 2025 2 commits
-
hlky authored
* HiDream Image
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
-
Tuna Tuncer authored
-
- 09 Apr, 2025 3 commits
-
Ilya Drobyshevskiy authored
Before this change, if txt_ids was a 3D tensor, the line with txt_ids[:1] concatenated txt_ids along the batch dimension. Now we first check that txt_ids is a 2D tensor (or take the first batch element), and then concatenate along the token dimension.
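A simplified sketch of the corrected logic (an illustration, not the pipeline's exact code):

```python
import torch

txt_ids = torch.zeros(2, 512, 3)    # an accidentally batched (3D) ids tensor
if txt_ids.ndim == 3:
    txt_ids = txt_ids[0]            # take the first batch element -> (512, 3)
img_ids = torch.zeros(4096, 3)
ids = torch.cat((txt_ids, img_ids), dim=0)   # concat along the token dim, not batch
```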
-
Dhruv Nair authored
* update
* update
* update
* update
-
hlky authored
* AutoModel
* ...
* lol
* ...
* add test
* update
* make fix-copies
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
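A sketch of the new entry point; the checkpoint id is illustrative:

```python
from diffusers import AutoModel

# AutoModel resolves the concrete model class (here a UNet) from the config.
unet = AutoModel.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="unet"
)
```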
-
- 08 Apr, 2025 1 commit
-
Sayak Paul authored
* implement record_stream for better performance.
* fix
* style.
* merge #11097
* Update src/diffusers/hooks/group_offloading.py (Co-authored-by: Aryan <aryan@huggingface.co>)
* fixes
* docstring.
* remaining todos in low_cpu_mem_usage
* tests
* updates to docs.
---------
Co-authored-by: Aryan <aryan@huggingface.co>
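A sketch of enabling the new flag, assuming the group-offloading API as documented around this change:

```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.hooks import apply_group_offloading

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
apply_group_offloading(
    pipe.transformer,
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="leaf_level",
    use_stream=True,
    record_stream=True,   # the option this commit adds for better performance
)
```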
-
- 05 Apr, 2025 1 commit
-
Mikko Tukiainen authored
* Add missing 'gradient_checkpointing = False' attr
* Add (limited) tests for Mochi autoencoder
* Apply style fixes
* pass 'conv_cache' as arg instead of kwarg
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 02 Apr, 2025 4 commits
-
Dhruv Nair authored
* update
* update
* update
-
hlky authored
-
hlky authored
* allow models to run with a user-provided dtype map instead of a single dtype
* make style
* Add warning, change `_` to `default`
* make style
* add test
* handle shared tensors
* remove warning
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
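A sketch of the resulting API; per the bullets above, "default" covers components not listed explicitly (checkpoint id illustrative):

```python
import torch
from diffusers import HunyuanVideoPipeline

pipe = HunyuanVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo",
    torch_dtype={"transformer": torch.bfloat16, "default": torch.float16},
)
```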
-
Bruno Magalhaes authored
* rewrite memory count without implicitly using dimensions by @ic-synth
* replace F.pad by built-in padding in Conv3D
* in-place sums to reduce memory allocations
* fixed trailing whitespace
* file reformatted
* in-place sums
* simpler in-place expressions
* removed in-place sum, may affect backward propagation logic
* removed in-place sum, may affect backward propagation logic
* removed in-place sum, may affect backward propagation logic
* reverted change
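On the F.pad point, a generic PyTorch illustration (not the PR's code): fusing padding into Conv3d avoids materializing a padded copy of the input first:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 8, 16, 32, 32)

conv = nn.Conv3d(8, 8, kernel_size=3, padding=1)      # padding fused into the op
y_fused = conv(x)

conv_nopad = nn.Conv3d(8, 8, kernel_size=3)
y_padded = conv_nopad(F.pad(x, (1, 1, 1, 1, 1, 1)))   # allocates a padded copy first

assert y_fused.shape == y_padded.shape                # same output geometry
```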
-
- 29 Mar, 2025 1 commit
-
hlky authored
-
- 25 Mar, 2025 1 commit
-
Junsong Chen authored
-
- 24 Mar, 2025 1 commit
-
Aryan authored
* update
* update
* update
* add tests
* update docs
* raise value error
* warning for true cfg and guidance scale
* fix test
-
- 21 Mar, 2025 3 commits
-
hlky authored
* Don't use `torch_dtype` when `quantization_config` is set
* up
* djkajka
* Apply suggestions from code review
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
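The upshot, sketched (checkpoint id illustrative): pass quantization_config without also passing torch_dtype, since the quantizer governs dtypes itself:

```python
import torch
from diffusers import BitsAndBytesConfig, FluxTransformer2DModel

quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,   # no torch_dtype alongside this
)
```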
-
YiYi Xu authored
* add sana-sprint
---------
Co-authored-by: Junsong Chen <cjs1020440147@icloud.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
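A sketch of the new pipeline; the checkpoint id is a best guess at the released repo and may differ:

```python
import torch
from diffusers import SanaSprintPipeline

pipe = SanaSprintPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_Sprint_1.6B_1024px_diffusers",
    torch_dtype=torch.bfloat16,
).to("cuda")
# SANA-Sprint is distilled for few-step sampling.
image = pipe("a tiny astronaut hatching from an egg", num_inference_steps=2).images[0]
```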
-
Aryan authored
* init
* update
* update
* update
* make style
* update
* fix
* make it work with guidance distilled models
* update
* make fix-copies
* add tests
* update
* apply_faster_cache -> apply_fastercache
* fix
* reorder
* update
* refactor
* update docs
* add fastercache to CacheMixin
* update tests
* Apply suggestions from code review
* make style
* try to fix partial import error
* Apply style fixes
* raise warning
* update
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
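A sketch using the CacheMixin hook this PR wires up; the config values are illustrative, adapted from the FasterCache documentation:

```python
import torch
from diffusers import CogVideoXPipeline, FasterCacheConfig

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
pipe.transformer.enable_cache(
    FasterCacheConfig(
        spatial_attention_block_skip_range=2,
        spatial_attention_timestep_skip_range=(-1, 681),
        current_timestep_callback=lambda: pipe.current_timestep,
        attention_weight_callback=lambda _: 0.3,
    )
)
```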
-
- 20 Mar, 2025 2 commits
-
YiYi Xu authored
up
-
Dhruv Nair authored
* update
* update
* clean up
-
- 18 Mar, 2025 2 commits
-
Cheng Jin authored
Modify UNet's ResNet implementation to resolve stride mismatch in Torch's DDP
-
Aryan authored
* update
---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
-
- 15 Mar, 2025 1 commit
-
Yuxuan Zhang authored
* cogview4 control training
---------
Co-authored-by: OleehyO <leehy0357@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
-
- 14 Mar, 2025 1 commit
-
Juan Acevedo authored
Reverts an accidental change that removed attn_mask in attention. Improves Flux PTXLA by using flash block sizes. Moves encoding outside the for loop.
Co-authored-by: Juan Acevedo <jfacevedo@google.com>
-
- 13 Mar, 2025 1 commit
-
ZhengKai91 authored
* get_1d_rotary_pos_embed support npu
* Update src/diffusers/models/embeddings.py
---------
Co-authored-by: Kai zheng <kaizheng@KaideMacBook-Pro.local>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
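For reference, a minimal call to the touched helper (signature as in diffusers.models.embeddings); after this change the positions tensor can live on an NPU device as well:

```python
import torch
from diffusers.models.embeddings import get_1d_rotary_pos_embed

# use_real=True returns separate cos/sin tables, each of shape (16, 64) here.
cos, sin = get_1d_rotary_pos_embed(dim=64, pos=torch.arange(16), use_real=True)
```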
-
- 12 Mar, 2025 1 commit
-
hlky authored
-