Commits · 5ffb73d4aeac9eaef8366d7b21872d64009bd1c7 · renzhc / diffusers_dcu

25 Nov, 2025 1 commit

(#12711) · 5ffb73d4

Sayak Paul authored Nov 25, 2025



* add vae

* Initial commit for Flux 2 Transformer implementation

* add pipeline part

* small edits to the pipeline and conversion

* update conversion script

* fix

* up up

* finish pipeline

* Remove Flux IP Adapter logic for now

* Remove deprecated 3D id logic

* Remove ControlNet logic for now

* Add link to ViT-22B paper as reference for parallel transformer blocks such as the Flux 2 single stream block

* update pipeline

* Don't use biases for input projs and output AdaNorm

* up

* Remove bias for double stream block text QKV projections

* Add script to convert Flux 2 transformer to diffusers

* make style and make quality

* fix a few things.

* allow sft files to go.

* fix image processor

* fix batch

* style a bit

* Fix some bugs in Flux 2 transformer implementation

* Fix dummy input preparation and fix some test bugs

* fix dtype casting in timestep guidance module.

* resolve conflicts.

* remove ip adapter stuff.

* Fix Flux 2 transformer consistency test

* Fix bug in Flux2TransformerBlock (double stream block)

* Get remaining Flux 2 transformer tests passing

* make style; make quality; make fix-copies

* remove stuff.

* fix type annotation.

* remove unneeded stuff from tests

* tests

* up

* up

* add sf support

* Remove unused IP Adapter and ControlNet logic from transformer (#9)

* copied from

* Apply suggestions from code review
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>

* up

* up

* up

* up

* up

* Refactor Flux2Attention into separate classes for double stream and single stream attention

* Add _supports_qkv_fusion to AttentionModuleMixin to allow subclasses to disable QKV fusion

* Have Flux2ParallelSelfAttention inherit from AttentionModuleMixin with _supports_qkv_fusion=False

* Log debug message when calling fuse_projections on an AttentionModuleMixin subclass that does not support QKV fusion

* Address review comments

* Update src/diffusers/pipelines/flux2/pipeline_flux2.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* up

* Remove maybe_allow_in_graph decorators for Flux 2 transformer blocks (#12)

* up

* support ostris loras. (#13)

* up

* update schedule

* up

* up (#17)

* add training scripts (#16)

* add training scripts
Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>

* model cpu offload in validation.

* add flux.2 readme

* add img2img and tests

* cpu offload in log validation

* Apply suggestions from code review

* fix

* up

* fixes

* remove i2i training tests for now.

---------
Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
Co-authored-by: linoytsaban <linoy@huggingface.co>

* up

---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Daniel Gu <dgu8957@gmail.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-10-53-87-203.ec2.internal>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
Co-authored-by: linoytsaban <linoy@huggingface.co>

5ffb73d4
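The `_supports_qkv_fusion` items in the commit above describe a class-level opt-out flag. The following is a minimal sketch of that pattern, not the diffusers implementation: the mixin here is a simplified stand-in, the two subclass names are hypothetical, and the `fused_projections` attribute is an assumption used only to make the effect visible.

```python
import logging

logger = logging.getLogger(__name__)


class AttentionModuleMixin:
    """Simplified stand-in for illustration; not the real diffusers class."""

    # Subclasses set this to False to opt out of QKV fusion.
    _supports_qkv_fusion = True

    def __init__(self):
        # Assumed bookkeeping attribute, used here only to observe the outcome.
        self.fused_projections = False

    def fuse_projections(self):
        if not self._supports_qkv_fusion:
            # Skip quietly and leave a debug trace instead of raising.
            logger.debug("%s does not support QKV fusion; skipping.", type(self).__name__)
            return
        # A real implementation would concatenate the Q, K and V weights here.
        self.fused_projections = True


class DoubleStreamAttention(AttentionModuleMixin):  # hypothetical name
    pass


class ParallelSelfAttention(AttentionModuleMixin):  # hypothetical name
    _supports_qkv_fusion = False


double_attn = DoubleStreamAttention()
parallel_attn = ParallelSelfAttention()
double_attn.fuse_projections()
parallel_attn.fuse_projections()
```

Keeping the flag on the class lets a caller invoke `fuse_projections` on every attention module without checking types first; modules that opt out simply remain unfused.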

17 Nov, 2025 1 commit
- Revert `AutoencoderKLWan`'s `dim_mult` default value back to list (#12640) · 67dc65e2
  dg845 authored Nov 17, 2025
```
Revert dim_mult back to list and fix type annotation
```
  67dc65e2
12 Nov, 2025 1 commit

ArXiv -> HF Papers (#12583) · f3db38c1

Quentin Gallouédec authored Nov 12, 2025

* Update pipeline_skyreels_v2_i2v.py

* Update README.md

* Update torch_utils.py

* Update torch_utils.py

* Update guider_utils.py

* Update pipeline_ltx.py

* Update pipeline_bria.py

* Apply suggestion from @qgallouedec

* Update autoencoder_kl_qwenimage.py

* Update pipeline_prx.py

* Update pipeline_wan_vace.py

* Update pipeline_skyreels_v2.py

* Update pipeline_skyreels_v2_diffusion_forcing.py

* Update pipeline_bria_fibo.py

* Update pipeline_skyreels_v2_diffusion_forcing_i2v.py

* Update pipeline_ltx_condition.py

* Update pipeline_ltx_image2video.py

* Update regional_prompting_stable_diffusion.py

* make style

* style

* style

f3db38c1

10 Nov, 2025 1 commit

Fix: update type hints for Tuple parameters across multiple files to support... · 5a47442f

Cesaryuan authored Nov 11, 2025


Fix: update type hints for Tuple parameters across multiple files to support variable-length tuples (#12544)

* Fix: update type hints for Tuple parameters across multiple files to support variable-length tuples

* Apply style fixes

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

5a47442f
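For context on the type-hint fix above: in Python's `typing` module, `Tuple[int]` describes a tuple of exactly one `int`, while `Tuple[int, ...]` describes a tuple of any length. A small sketch follows; the function and parameter names are hypothetical and not taken from the changed files.

```python
from typing import Tuple, get_args


def describe_channels(block_out_channels: Tuple[int, ...] = (128, 256, 512, 512)) -> str:
    """Accept a tuple of ints of any length (hypothetical example function)."""
    return " -> ".join(str(c) for c in block_out_channels)


# Inspect what each annotation actually declares.
fixed_len = get_args(Tuple[int])          # one element of type int
variable_len = get_args(Tuple[int, ...])  # int followed by Ellipsis: any length

summary = describe_channels((64, 128))
```

A static type checker would flag `(64, 128)` against a `Tuple[int]` annotation, which is why the variable-length form is the appropriate hint for parameters such as channel lists.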

30 Oct, 2025 1 commit

Avoiding graph break by changing the way we infer dtype in vae.decoder (#12512) · 9f3c0fdc

Pavle Padjin authored Oct 30, 2025

* Changing the way we infer dtype to avoid force evaluation of lazy tensors

* changing way to infer dtype to ensure type consistency

* more robust inferring of dtype

* removing the upscale dtype entirely

9f3c0fdc

28 Oct, 2025 1 commit

fix crash if tiling mode is enabled (#12521) · dc622a95

Wang, Yi authored Oct 28, 2025



* fix crash if tiling mode is enabled
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* fmt
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

dc622a95

24 Oct, 2025 1 commit

HunyuanImage21 (#12333) · a138d71e

YiYi Xu authored Oct 23, 2025



* add hunyuanimage2.1


---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

a138d71e

22 Oct, 2025 2 commits
- Fix: Add _skip_keys for AutoencoderKLWan (#12523) · bec2d8ea
  YiYi Xu authored Oct 22, 2025
```
add
```
  bec2d8ea
- [core] `AutoencoderMixin` to abstract common methods (#12473) · a5a0ccf8
  Sayak Paul authored Oct 22, 2025
```
* up

* correct wording.

* up

* up

* up
```
  a5a0ccf8
15 Oct, 2025 1 commit
- remove unneeded checkpoint imports. (#12488) · 53a10518
  Sayak Paul authored Oct 15, 2025
  
  53a10518
30 Sep, 2025 1 commit
- [docs] Migrate syntax (#12390) · cc5b31ff
  Steven Liu authored Sep 30, 2025
```
* change syntax

* make style
```
  cc5b31ff
22 Sep, 2025 1 commit
- Fix bug with VAE slicing in autoencoder_dc.py (#12343) · d83d35c1
  Chen Mingyi authored Sep 22, 2025
  
  d83d35c1
16 Sep, 2025 1 commit

Fix autoencoder_kl_wan.py bugs for Wan2.2 VAE (#12335) · d06750a5

Zijian Zhou authored Sep 17, 2025

* Update autoencoder_kl_wan.py

When using the Wan2.2 VAE, the spatial compression ratio calculated here is incorrect. It should be 16 instead of 8. Pass it in directly via the config to ensure it’s correct here.

* Update autoencoder_kl_wan.py

d06750a5
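The commit message above says the derived spatial compression ratio came out as 8 where the Wan2.2 VAE needs 16, and that the value is now passed in through the config. A minimal sketch of that "explicit config value wins over a derived one" idea follows. The config class, its field names, and the derivation rule (2 raised to the number of downsampling stages) are all assumptions made for illustration, not the code in `autoencoder_kl_wan.py`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VaeConfig:  # hypothetical config object
    num_downsample_stages: int = 3               # hypothetical field
    scale_factor_spatial: Optional[int] = None   # hypothetical explicit override


def spatial_compression_ratio(config: VaeConfig) -> int:
    """Prefer the explicitly configured ratio; otherwise fall back to a derived one."""
    if config.scale_factor_spatial is not None:
        return config.scale_factor_spatial
    # Assumed derivation: each downsampling stage halves the spatial size.
    return 2 ** config.num_downsample_stages


derived = spatial_compression_ratio(VaeConfig())
explicit = spatial_compression_ratio(VaeConfig(scale_factor_spatial=16))
```

With these assumed defaults the derived value is 8, and supplying 16 through the config overrides it without changing the derivation for other checkpoints.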

18 Aug, 2025 1 commit
- Minor modification to support DC-AE-turbo (#12169) · 85cbe589
  Junyu Chen authored Aug 18, 2025
```
* minor modification to support dc-ae-turbo

* minor
```
  85cbe589
04 Aug, 2025 2 commits

fix(qwen-image): update vae license (#12063) · 69a9828f

naykun authored Aug 04, 2025



* fix(qwen-image):
- update vae license

* Apply style fixes

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>

69a9828f

tests + minor refactor for QwenImage (#12057) · 9a38fab5
Aryan authored Aug 04, 2025
```
* update

* update

* update

* add docs
```
9a38fab5

03 Aug, 2025 1 commit

Qwen-Image (#12055) · 8e53cd95

naykun authored Aug 04, 2025



* (feat): qwen-image integration

* fix(qwen-image):
- remove unused logics related to controlnet/ip-adapter

* fix(qwen-image):
- compatible with attention dispatcher
- cond cache support

* fix(qwen-image):
- cond cache registry
- attention backend argument
- fix copies

* fix(qwen-image):
- remove local test

* Update src/diffusers/models/transformers/transformer_qwenimage.py

---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>

8e53cd95

02 Aug, 2025 2 commits

Update autoencoder_kl_cosmos.py (#12045) · 359b605f

Tanuj Rai authored Aug 02, 2025



* Update autoencoder_kl_cosmos.py

* Apply style fixes

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>

359b605f

Fix type of force_upcast to bool (#12046) · 6febc08b
Bernd Doser authored Aug 02, 2025

6febc08b

01 Aug, 2025 1 commit
- [wan2.2] fix vae patches (#12041) · 58d2b10a
  YiYi Xu authored Jul 31, 2025
```
up
```
  58d2b10a
28 Jul, 2025 1 commit

[WIP] Wan2.2 (#12004) · a6d9f6a1

YiYi Xu authored Jul 28, 2025



* support wan 2.2 i2v

* add t2v + vae2.2

* add conversion script for vae 2.2

* add

* add 5b t2v

* conversion script

* refactor out rearrange

* remove a copied from in skyreels

* Apply suggestions from code review
Co-authored-by: bagheera <59658056+bghira@users.noreply.github.com>

* Update src/diffusers/models/transformers/transformer_wan.py

* fix fast tests

* style

---------
Co-authored-by: bagheera <59658056+bghira@users.noreply.github.com>

a6d9f6a1

24 Jun, 2025 1 commit
- [tests] Fix group offloading and layerwise casting test interaction (#11796) · 5df02fc1
  Aryan authored Jun 24, 2025
```
* update

* update

* update
```
  5df02fc1
19 Jun, 2025 1 commit
- Update more licenses to 2025 (#11746) · a4df8dbc
  Aryan authored Jun 19, 2025
```
update
```
  a4df8dbc
18 Jun, 2025 1 commit

⚡️ Speed up method `AutoencoderKLWan.clear_cache` by 886% (#11665) · 5ce4814a

Saurabh Misra authored Jun 17, 2025

* ⚡️ Speed up method `AutoencoderKLWan.clear_cache` by 886%

**Key optimizations:**
- Compute the number of `WanCausalConv3d` modules in each model (`encoder`/`decoder`) **only once during initialization**, store in `self._cached_conv_counts`. This removes unnecessary repeated tree traversals at every `clear_cache` call, which was the main bottleneck (from profiling).
- The internal helper `_count_conv3d_fast` is optimized via a generator expression with `sum` for efficiency.

All comments from the original code are preserved, except for updated or removed local docstrings/comments relevant to changed lines.  
**Function signatures and outputs remain unchanged.**

* Apply style fixes

* Apply suggestions from code review
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Apply style fixes

---------
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aseem Saxena <aseem.bits@gmail.com>

5ce4814a
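The optimization described in the commit above (count the causal conv modules once at initialization, then reuse the count on every `clear_cache` call) can be sketched in plain Python. Only `_cached_conv_counts` and `_count_conv3d_fast` are names from the commit message; the toy `Module` tree, the `CausalConv3d` class, and the feature-map attributes are stand-ins assumed for illustration.

```python
class Module:
    """Toy module tree standing in for a torch.nn.Module hierarchy."""

    def __init__(self, *children):
        self.children = list(children)

    def modules(self):
        # Depth-first traversal that yields this module and all descendants.
        yield self
        for child in self.children:
            yield from child.modules()


class CausalConv3d(Module):  # stand-in for WanCausalConv3d
    pass


class Autoencoder:
    def __init__(self, encoder, decoder):
        self.encoder = encoder
        self.decoder = decoder
        # Traverse each tree once here instead of on every clear_cache call.
        self._cached_conv_counts = {
            "encoder": self._count_conv3d_fast(encoder),
            "decoder": self._count_conv3d_fast(decoder),
        }
        self.clear_cache()

    @staticmethod
    def _count_conv3d_fast(model):
        # Generator expression with sum, as the commit message describes.
        return sum(isinstance(m, CausalConv3d) for m in model.modules())

    def clear_cache(self):
        # Assumed cache layout: one slot per causal conv in each sub-model.
        self._enc_feat_map = [None] * self._cached_conv_counts["encoder"]
        self._feat_map = [None] * self._cached_conv_counts["decoder"]


encoder = Module(CausalConv3d(), Module(CausalConv3d()))
decoder = Module(CausalConv3d(), CausalConv3d(), Module(CausalConv3d()))
vae = Autoencoder(encoder, decoder)
vae.clear_cache()
```

The cached counts are only valid while the module structure stays fixed after construction, which is the implicit assumption behind computing them once.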

30 May, 2025 1 commit

Fix typos in strings and comments (#11476) · 8183d0f1

co63oc authored May 30, 2025



* Fix typos in strings and comments
Signed-off-by: co63oc <co63oc@users.noreply.github.com>

* Update src/diffusers/hooks/hooks.py
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update src/diffusers/hooks/hooks.py
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update layerwise_casting.py

* Apply style fixes

* update

---------
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

8183d0f1

19 May, 2025 1 commit

Use HF Papers (#11567) · c8bb1ff5

Quentin Gallouédec authored May 19, 2025



* Use HF Papers

* Apply style fixes

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

c8bb1ff5

07 May, 2025 1 commit

Cosmos (#10660) · 7b904941

Aryan authored May 07, 2025



* begin transformer conversion

* refactor

* refactor

* refactor

* refactor

* refactor

* refactor

* update

* add conversion script

* add pipeline

* make fix-copies

* remove einops

* update docs

* gradient checkpointing

* add transformer test

* update

* debug

* remove prints

* match sigmas

* add vae pt. 1

* finish CV* vae

* update

* update

* update

* update

* update

* update

* make fix-copies

* update

* make fix-copies

* fix

* update

* update

* make fix-copies

* update

* update tests

* handle device and dtype for safety checker; required in latest diffusers

* remove enable_gqa and use repeat_interleave instead

* enforce safety checker; use dummy checker in fast tests

* add review suggestion for ONNX export
Co-Authored-By: Asfiya Baig <asfiyab@nvidia.com>

* fix safety_checker issues when not passed explicitly

We could either do what's done in this commit, or update the Cosmos examples to explicitly pass the safety checker

* use cosmos guardrail package

* auto format docs

* update conversion script to support 14B models

* update name CosmosPipeline -> CosmosTextToWorldPipeline

* update docs

* fix docs

* fix group offload test failing for vae

---------
Co-authored-by: Asfiya Baig <asfiyab@nvidia.com>

7b904941

05 May, 2025 1 commit
- [Feature] Implement tiled VAE encoding/decoding for Wan model. (#11414) · 8520d496
  Connector Switch authored May 05, 2025
```
* implement tiled encode/decode

* address review comments
```
  8520d496
15 Apr, 2025 1 commit
- Fix vae.Decoder prev_output_channel (#11280) · 6e80d240
  hlky authored Apr 15, 2025
  
  6e80d240
13 Apr, 2025 1 commit
- Fix incorrect tile_latent_min_width calculations (#11305) · 36538e11
  Tuna Tuncer authored Apr 13, 2025
  
  36538e11
11 Apr, 2025 1 commit
- Fix incorrect tile_latent_min_width calculation in AutoencoderKLMochi (#11294) · bc261058
  Tuna Tuncer authored Apr 11, 2025
  
  bc261058
05 Apr, 2025 1 commit

Add missing MochiEncoder3D.gradient_checkpointing attribute (#11146) · 8ad68c13

Mikko Tukiainen authored Apr 06, 2025



* Add missing 'gradient_checkpointing = False' attr

* Add (limited) tests for Mochi autoencoder

* Apply style fixes

* pass 'conv_cache' as arg instead of kwarg

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

8ad68c13

02 Apr, 2025 1 commit

remove unnecessary call to `F.pad` (#10620) · fe2b3974

Bruno Magalhaes authored Apr 02, 2025

* rewrite memory count without implicitly using dimensions by @ic-synth

* replace F.pad by built-in padding in Conv3D

* in-place sums to reduce memory allocations

* fixed trailing whitespace

* file reformatted

* in-place sums

* simpler in-place expressions

* removed in-place sum, may affect backward propagation logic

* removed in-place sum, may affect backward propagation logic

* removed in-place sum, may affect backward propagation logic

* reverted change

fe2b3974

18 Mar, 2025 1 commit

LTX 0.9.5 (#10968) · 2e83cbbb

Aryan authored Mar 18, 2025



* update


---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>

2e83cbbb

12 Mar, 2025 1 commit
- Use `output_size` in `repeat_interleave` (#11030) · 8b4f8ba7
  hlky authored Mar 12, 2025
  
  8b4f8ba7
07 Mar, 2025 2 commits
- [Single File] Add single file support for Wan T2V/I2V (#10991) · 1357931d
  Dhruv Nair authored Mar 07, 2025
```
* update

* update

* update

* update

* update

* update

* update
```
  1357931d
- Wan VAE move scaling to pipeline (#10998) · 363d1ab7
  hlky authored Mar 07, 2025
  
  363d1ab7
03 Mar, 2025 1 commit

Add EasyAnimateV5.1 text-to-video, image-to-video, control-to-video generation model (#10626) · 5e3b7d2d

Bubbliiiing authored Mar 03, 2025



* Update EasyAnimate V5.1

* Add docs && add tests && Fix comments problems in transformer3d and vae

* delete comments and remove useless import

* delete process

* Update EXAMPLE_DOC_STRING

* rename transformer file

* make fix-copies

* make style

* refactor pt. 1

* update toctree.yml

* add model tests

* Update layer_norm for norm_added_q and norm_added_k in Attention

* Fix processor problem

* refactor vae

* Fix problem in comments

* refactor tiling; remove einops dependency

* fix docs path

* make fix-copies

* Update src/diffusers/pipelines/easyanimate/pipeline_easyanimate_control.py

* update _toctree.yml

* fix test

* update

* update

* update

* make fix-copies

* fix tests

---------
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

5e3b7d2d

02 Mar, 2025 1 commit

[Alibaba Wan Team] continue on #10921 Wan2.1 (#10922) · 2d8a41ca

YiYi Xu authored Mar 02, 2025

* Add wanx pipeline, model and example

* wanx_merged_v1

* change WanX into Wan

* fix i2v fp32 oom error

Link: https://code.alibaba-inc.com/open_wanx2/diffusers/codereview/20607813



* support t2v load fp32 ckpt

* add example

* final merge v1

* Update autoencoder_kl_wan.py

* up

* update middle, test up_block

* up up

* one less nn.sequential

* up more

* up

* more

* [refactor] [wip] Wan transformer/pipeline (#10926)

* update

* update

* refactor rope

* refactor pipeline

* make fix-copies

* add transformer test

* update

* update

* make style

* update tests

* tests

* conversion script

* conversion script

* update

* docs

* remove unused code

* fix _toctree.yml

* update dtype

* fix test

* fix tests: scale

* up

* more

* Apply suggestions from code review

* Apply suggestions from code review

* style

* Update scripts/convert_wan_to_diffusers.py

* update docs

* fix

---------
Co-authored-by: Yitong Huang <huangyitong.hyt@alibaba-inc.com>
Co-authored-by: 亚森 <wangjiayu.wjy@alibaba-inc.com>
Co-authored-by: Aryan <aryan@huggingface.co>

2d8a41ca

14 Feb, 2025 1 commit

Module Group Offloading (#10503) · 9a147b82

Aryan authored Feb 14, 2025



* update

* fix

* non_blocking; handle parameters and buffers

* update

* Group offloading with cuda stream prefetching (#10516)

* cuda stream prefetch

* remove breakpoints

* update

* copy model hook implementation from pab

* update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite

* more workarounds to make it actually work

* cleanup

* rewrite

* update

* make sure to sync current stream before overwriting with pinned params

not doing so will lead to erroneous computations on the GPU and cause bad results

* better check

* update

* remove hook implementation to not deal with merge conflict

* re-add hook changes

* why use more memory when less memory do trick

* why still use slightly more memory when less memory do trick

* optimise

* add model tests

* add pipeline tests

* update docs

* add layernorm and groupnorm

* address review comments

* improve tests; add docs

* improve docs

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggestions from code review

* update tests

* apply suggestions from review

* enable_group_offloading -> enable_group_offload for naming consistency

* raise errors if multiple offloading strategies used; add relevant tests

* handle .to() when group offload applied

* refactor some repeated code

* remove unintentional change from merge conflict

* handle .cuda()

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

9a147b82