Commits · f1f38ffbeed793d684684e00e6e1213bcaf494d6 · renzhc / diffusers_dcu

13 Apr, 2025 1 commit

[ControlNet] Adds controlnet for SanaTransformer (#11040) · f1f38ffb

Ishan Modi authored Apr 13, 2025



* added controlnet for sana transformer

* improve code quality

* addressed PR comments

* bug fixes

* added test cases

* update

* added dummy objects

* addressed PR comments

* update

* Forcing update

* add to docs

* code quality

* addressed PR comments

* addressed PR comments

* update

* addressed PR comments

* added proper styling

* update

* Revert "added proper styling"

This reverts commit 344ee8a7014ada095b295034ef84341f03b0e359.

* manually ordered

* Apply suggestions from code review

---------
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

f1f38ffb

11 Apr, 2025 1 commit

HiDream Image (#11231) · 0ef29355

hlky authored Apr 11, 2025



* HiDream Image


---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>

0ef29355

10 Apr, 2025 1 commit

add onnxruntime-qnn & onnxruntime-cann (#11269) · 0efdf411

xieofxie authored Apr 10, 2025

Co-authored-by: hualxie <hualxie@microsoft.com>

0efdf411

09 Apr, 2025 4 commits

fix consisid imports (#11254) · 5b27f8ab

Sayak Paul authored Apr 09, 2025

* fix consisid imports

* fix opencv import

* fix

5b27f8ab

Update Ruff to latest Version (#10919) · edc154da

Dhruv Nair authored Apr 09, 2025

* update

* update

* update

* update

edc154da

AutoModel (#11115) · 437cb36c

hlky authored Apr 09, 2025



* AutoModel

* ...

* lol

* ...

* add test

* update

* make fix-copies

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

437cb36c

[LoRA] support more comyui loras for Flux 🚨 (#10985) · 6bfacf04

Sayak Paul authored Apr 09, 2025

* support more comyui loras.

* fix

* fixes

* revert changes in LoRA base.

* no position_embedding

* 🚨 introduce a breaking change to let peft handle module ambiguity

* styling

* remove position embeddings.

* improvements.

* style

* make info instead of NotImplementedError

* Update src/diffusers/loaders/peft.py
Co-authored-by: hlky <hlky@hlky.ac>

* add example.

* robust checks

* updates

---------
Co-authored-by: hlky <hlky@hlky.ac>

6bfacf04

08 Apr, 2025 1 commit

introduce compute arch specific expectations and fix test_sd3_img2img_inference failure (#11227) · c51b6bd8

Yao Matrix authored Apr 08, 2025



* add arch specific expectations support, to support different arch's numerical characteristics
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix typo
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Apply suggestions from code review

* Apply style fixes

* Update src/diffusers/utils/testing_utils.py

---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

c51b6bd8
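
The idea behind the commit above, choosing a reference value per compute architecture so that numerical differences between devices do not fail a test, can be sketched roughly as follows. This is a minimal illustration, not the helper added to `src/diffusers/utils/testing_utils.py`: the function name `pick_expectation`, the `(device_type, major_version)` key shape, the fallback order, and the sample numbers are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple, TypeVar

T = TypeVar("T")

# Key shape is an assumption for this sketch: (device type, major version or None).
ArchKey = Tuple[str, Optional[int]]


def pick_expectation(table: Dict[ArchKey, T], device: str, major: Optional[int]) -> T:
    """Return the most specific expected value for the running architecture.

    Lookup order: exact (device, major) match, then a device-wide entry
    (device, None), then a global default stored under ("default", None).
    """
    for key in ((device, major), (device, None), ("default", None)):
        if key in table:
            return table[key]
    raise KeyError(f"no expectation registered for {device!r} (major={major})")


# Hypothetical reference slices; real tests would store values measured on each backend.
expected_slices = {
    ("cuda", 8): [0.5435, 0.4673, 0.5732],
    ("xpu", None): [0.5439, 0.4670, 0.5730],
    ("default", None): [0.5400, 0.4700, 0.5700],
}
```

A test would then compare its output against `pick_expectation(expected_slices, device, major)` instead of a single hard-coded array, so adding a new backend means adding one table entry rather than loosening tolerances everywhere.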

04 Apr, 2025 1 commit

Fixed requests.get function call by adding timeout parameter. (#11156) · f10775b1

Kenneth Gerald Hamilton authored Apr 04, 2025



* Fixed requests.get function call by adding timeout parameter.

* declare DIFFUSERS_REQUEST_TIMEOUT in constants and import when needed

* remove unneeded os import

* Apply style fixes

---------
Co-authored-by: Sai-Suraj-27 <sai.suraj.27.729@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

f10775b1
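
The pattern described in the commit above, passing an explicit timeout to every `requests.get` call and keeping the value in one shared constant, might look roughly like the sketch below. The constant name comes from the commit message, but its value here is a placeholder, and `fetch` and `http_get` are hypothetical names introduced only for illustration. The HTTP function is passed in as an argument so the sketch runs without a network library installed.

```python
from typing import Any, Callable

# Name taken from the commit message; the value here is a placeholder for the sketch.
DIFFUSERS_REQUEST_TIMEOUT = 60  # seconds


def fetch(url: str, http_get: Callable[..., Any], timeout: float = DIFFUSERS_REQUEST_TIMEOUT) -> Any:
    """Issue a GET through `http_get`, always passing an explicit timeout.

    Without a timeout, a stalled server can block the caller indefinitely.
    `http_get` is injected so the sketch does not depend on a network library;
    `requests.get` fits this shape because it accepts a `timeout` keyword.
    """
    return http_get(url, timeout=timeout)
```

In real code the call would be `fetch(url, requests.get)`, or more simply `requests.get(url, timeout=DIFFUSERS_REQUEST_TIMEOUT)` at each call site.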

02 Apr, 2025 2 commits

Update import_utils.py (#10329) · b0ff822e

lakshay sharma authored Apr 02, 2025

added onnxruntime-vitisai for custom build onnxruntime pkg

b0ff822e

map BACKEND_RESET_MAX_MEMORY_ALLOCATED to reset_peak_memory_stats on XPU (#11191) · a7f07c1e

Yao Matrix authored Apr 02, 2025

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

a7f07c1e

01 Apr, 2025 1 commit

[WIP] Add Wan Video2Video (#11053) · df1d7b01

Dhruv Nair authored Apr 01, 2025

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

df1d7b01

21 Mar, 2025 2 commits

add sana-sprint (#11074) · 8a63aa5e

YiYi Xu authored Mar 21, 2025



* add sana-sprint




---------
Co-authored-by: Junsong Chen <cjs1020440147@icloud.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>

8a63aa5e

[core] FasterCache (#10163) · 844221ae

Aryan authored Mar 21, 2025



* init

* update

* update

* update

* make style

* update

* fix

* make it work with guidance distilled models

* update

* make fix-copies

* add tests

* update

* apply_faster_cache -> apply_fastercache

* fix

* reorder

* update

* refactor

* update docs

* add fastercache to CacheMixin

* update tests

* Apply suggestions from code review

* make style

* try to fix partial import error

* Apply style fixes

* raise warning

* update

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

844221ae

20 Mar, 2025 2 commits

[tests] make cuda only tests device-agnostic (#11058) · 15ad97f7

Fanli Lin authored Mar 20, 2025

* enable bnb on xpu

* add 2 more cases

* add missing change

* add missing change

* add one more

* enable cuda only tests on xpu

* enable big gpu cases

15ad97f7

Flux with Remote Encode (#11091) · 9f2d5c9e

hlky authored Mar 20, 2025

* Flux img2img remote encode

* Flux inpaint

* -copied from

9f2d5c9e

19 Mar, 2025 1 commit

[tests] enable bnb tests on xpu (#11001) · 56f74005

Fanli Lin authored Mar 20, 2025

* enable bnb on xpu

* add 2 more cases

* add missing change

* add missing change

* add one more

56f74005

18 Mar, 2025 2 commits

Quality options in `export_to_video` (#11090) · 0ab8fe49

hlky authored Mar 18, 2025

* Quality options in `export_to_video`

* make style

0ab8fe49

LTX 0.9.5 (#10968) · 2e83cbbb

Aryan authored Mar 18, 2025



* update


---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>

2e83cbbb

15 Mar, 2025 1 commit

CogView4 Control Block (#10809) · 82188cef

Yuxuan Zhang authored Mar 16, 2025




* cogview4 control training


---------
Co-authored-by: OleehyO <leehy0357@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>

82188cef

14 Mar, 2025 1 commit

[Tests] restrict memory tests for quanto for certain schemes. (#11052) · 2f0f281b

Sayak Paul authored Mar 14, 2025



* restrict memory tests for quanto for certain schemes.

* Apply suggestions from code review
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fixes

* style

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

2f0f281b

13 Mar, 2025 1 commit

Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline (#10827) · 5551506b

hlky authored Mar 13, 2025



* Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline


---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>

5551506b

12 Mar, 2025 2 commits

[hybrid inference 🍯🐝] Add VAE encode (#11017) · 733b44ac

hlky authored Mar 12, 2025

* [hybrid inference 🍯🐝] Add VAE encode

* _toctree: add vae encode

* Add endpoints, tests

* vae_encode docs

* vae encode benchmarks

* api reference

* changelog

* Update docs/source/en/hybrid_inference/overview.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

733b44ac

[Refactor] Clean up import utils boilerplate (#11026) · 54280464

Dhruv Nair authored Mar 12, 2025

* update

* update

* update

54280464

11 Mar, 2025 1 commit

[Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6 (#11018) · 9add0715

Dhruv Nair authored Mar 11, 2025

* update

* update

* update

* update

* update

* update

* update

* update

* update

9add0715

10 Mar, 2025 1 commit

[Quantization] Add Quanto backend (#10756) · f5edaa78

Dhruv Nair authored Mar 10, 2025



* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/quantization/quanto.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update src/diffusers/quantizers/quanto/utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

f5edaa78

07 Mar, 2025 1 commit

Hunyuan I2V (#10983) · 2e5203be

Aryan authored Mar 07, 2025

* update

* update

* update

* add tests

* update

* add model tests

* update docs

* update

* update example

* fix defaults

* update

2e5203be

03 Mar, 2025 1 commit

Add EasyAnimateV5.1 text-to-video, image-to-video, control-to-video generation model (#10626) · 5e3b7d2d

Bubbliiiing authored Mar 03, 2025



* Update EasyAnimate V5.1

* Add docs && add tests && Fix comments problems in transformer3d and vae

* delete comments and remove useless import

* delete process

* Update EXAMPLE_DOC_STRING

* rename transformer file

* make fix-copies

* make style

* refactor pt. 1

* update toctree.yml

* add model tests

* Update layer_norm for norm_added_q and norm_added_k in Attention

* Fix processor problem

* refactor vae

* Fix problem in comments

* refactor tiling; remove einops dependency

* fix docs path

* make fix-copies

* Update src/diffusers/pipelines/easyanimate/pipeline_easyanimate_control.py

* update _toctree.yml

* fix test

* update

* update

* update

* make fix-copies

* fix tests

---------
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

5e3b7d2d

02 Mar, 2025 2 commits

Add `remote_decode` to `remote_utils` (#10898) · fc4229a0

hlky authored Mar 02, 2025



* Add `remote_decode` to `remote_utils`

* test dependency

* test dependency

* dependency

* dependency

* dependency

* docstrings

* changes

* make style

* apply

* revert, add new options

* Apply style fixes

* deprecate base64, headers not needed

* address comments

* add license header

* init test_remote_decode

* more

* more test

* more test

* skeleton for xl, flux

* more test

* flux test

* flux packed

* no scaling

* -save

* hunyuanvideo test

* Apply style fixes

* init docs

* Update src/diffusers/utils/remote_utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* comments

* Apply style fixes

* comments

* hybrid_inference/vae_decode

* fix

* tip?

* tip

* api reference autodoc

* install tip

---------
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

fc4229a0

[Alibaba Wan Team] continue on #10921 Wan2.1 (#10922) · 2d8a41ca

YiYi Xu authored Mar 02, 2025

* Add wanx pipeline, model and example

* wanx_merged_v1

* change WanX into Wan

* fix i2v fp32 oom error

Link: https://code.alibaba-inc.com/open_wanx2/diffusers/codereview/20607813



* support t2v load fp32 ckpt

* add example

* final merge v1

* Update autoencoder_kl_wan.py

* up

* update middle, test up_block

* up up

* one less nn.sequential

* up more

* up

* more

* [refactor] [wip] Wan transformer/pipeline (#10926)

* update

* update

* refactor rope

* refactor pipeline

* make fix-copies

* add transformer test

* update

* update

* make style

* update tests

* tests

* conversion script

* conversion script

* update

* docs

* remove unused code

* fix _toctree.yml

* update dtype

* fix test

* fix tests: scale

* up

* more

* Apply suggestions from code review

* Apply suggestions from code review

* style

* Update scripts/convert_wan_to_diffusers.py

* update docs

* fix

---------
Co-authored-by: Yitong Huang <huangyitong.hyt@alibaba-inc.com>
Co-authored-by: 亚森 <wangjiayu.wjy@alibaba-inc.com>
Co-authored-by: Aryan <aryan@huggingface.co>

2d8a41ca

26 Feb, 2025 1 commit

Marigold Update: v1-1 models, Intrinsic Image Decomposition pipeline, documentation (#10884) · 3fab6624

Anton Obukhov authored Feb 26, 2025



* minor documentation fixes of the depth and normals pipelines

* update license headers

* update model checkpoints in examples
fix missing prediction_type in register_to_config in the normals pipeline

* add initial marigold intrinsics pipeline
update comments about num_inference_steps and ensemble_size
minor fixes in comments of marigold normals and depth pipelines

* update uncertainty visualization to work with intrinsics

* integrate iid


---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

3fab6624

25 Feb, 2025 1 commit

Multi IP-Adapter for Flux pipelines (#10867) · 1450c2ac

Daniel Regado authored Feb 25, 2025



* Initial implementation of Flux multi IP-Adapter

* Update src/diffusers/pipelines/flux/pipeline_flux.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/flux/pipeline_flux.py
Co-authored-by: hlky <hlky@hlky.ac>

* Changes for ipa image embeds

* Update src/diffusers/pipelines/flux/pipeline_flux.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/flux/pipeline_flux.py
Co-authored-by: hlky <hlky@hlky.ac>

* make style && make quality

* Updated ip_adapter test

* Created typing_utils.py

---------
Co-authored-by: hlky <hlky@hlky.ac>

1450c2ac

24 Feb, 2025 1 commit

Add SD3 ControlNet to AutoPipeline (#10888) · aba4a579

hlky authored Feb 24, 2025

Co-authored-by: puhuk <wetr235@gmail.com>

aba4a579

21 Feb, 2025 2 commits

`device_map` in `load_model_dict_into_meta` (#10851) · d75ea3c7

hlky authored Feb 21, 2025

* `device_map` in `load_model_dict_into_meta`

* _LOW_CPU_MEM_USAGE_DEFAULT

* fix is_peft_version is_bitsandbytes_version

d75ea3c7

SkyReels Hunyuan T2V & I2V (#10837) · e3bc4aab

Aryan authored Feb 21, 2025



* update

* make fix-copies

* update

* tests

* update

* update

* add co-author
Co-Authored-By: Langdx <82783347+Langdx@users.noreply.github.com>

* add co-author
Co-Authored-By: howe <howezhang2018@gmail.com>

* update

---------
Co-authored-by: Langdx <82783347+Langdx@users.noreply.github.com>
Co-authored-by: howe <howezhang2018@gmail.com>

e3bc4aab

20 Feb, 2025 1 commit

[tests] test `encode_prompt()` in isolation (#10438) · b2ca39c8

Sayak Paul authored Feb 20, 2025

* poc encode_prompt() tests

* fix

* updates.

* fixes

* fixes

* updates

* updates

* updates

* revert

* updates

* updates

* updates

* updates

* remove SDXLOptionalComponentsTesterMixin.

* remove tests that directly leveraged encode_prompt() in some way or the other.

* fix imports.

* remove _save_load

* fixes

* fixes

* fixes

* fixes

b2ca39c8

19 Feb, 2025 1 commit

[FEAT] Model loading refactor (#10604) · f5929e03

Marc Sun authored Feb 19, 2025



* first draft model loading refactor

* revert name change

* fix bnb

* revert name

* fix dduf

* fix huanyan

* style

* Update src/diffusers/models/model_loading_utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* suggestions from reviews

* Update src/diffusers/models/modeling_utils.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove safetensors check

* fix default value

* more fix from suggestions

* revert logic for single file

* style

* typing + fix couple of issues

* improve speed

* Update src/diffusers/models/modeling_utils.py
Co-authored-by: Aryan <aryan@huggingface.co>

* fp8 dtype

* add tests

* rename resolved_archive_file to resolved_model_file

* format

* map_location default cpu

* add utility function

* switch to smaller model + test inference

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* rm comment

* add log

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add decorator

* cosine sim instead

* fix use_keep_in_fp32_modules

* comm

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>

f5929e03

15 Feb, 2025 1 commit

CogView4 (supports different length c and uc) (#10649) · d90cd362

Yuxuan Zhang authored Feb 16, 2025



* init

* encode with glm

* draft schedule

* feat(scheduler): Add CogView scheduler implementation

* feat(embeddings): add CogView 2D rotary positional embedding

* 1

* Update pipeline_cogview4.py

* fix the timestep init and sigma

* update latent

* draft patch(not work)

* fix

* [WIP][cogview4]: implement initial CogView4 pipeline

Implement the basic CogView4 pipeline structure with the following changes:
- Add CogView4 pipeline implementation
- Implement DDIM scheduler for CogView4
- Add CogView3Plus transformer architecture
- Update embedding models

Current limitations:
- CFG implementation uses padding for sequence length alignment
- Need to verify transformer inference alignment with Megatron

TODO:
- Consider separate forward passes for condition/uncondition
  instead of padding approach

* [WIP][cogview4][refactor]: Split condition/uncondition forward pass in CogView4 pipeline

Split the forward pass for conditional and unconditional predictions in the CogView4 pipeline to match the original implementation. The noise prediction is now done separately for each case before combining them for guidance. However, the results still need improvement.

This is a work in progress as the generated images are not yet matching expected quality.

* use with -2 hidden state

* remove text_projector

* 1

* [WIP] Add tensor-reload to align input from transformer block

* [WIP] for older glm

* use with cogview4 transformers forward twice of u and uc

* Update convert_cogview4_to_diffusers.py

* remove this

* use main example

* change back

* reset

* setback

* back

* back 4

* Fix qkv conversion logic for CogView4 to Diffusers format

* back5

* revert to sat to cogview4 version

* update a new convert from megatron

* [WIP][cogview4]: implement CogView4 attention processor

Add CogView4AttnProcessor class for implementing scaled dot-product attention
with rotary embeddings for the CogVideoX model. This processor concatenates
encoder and hidden states, applies QKV projections and RoPE, but does not
include spatial normalization.

TODO:
- Fix incorrect QKV projection weights
- Resolve ~25% error in RoPE implementation compared to Megatron

* [cogview4] implement CogView4 transformer block

Implement CogView4 transformer block following the Megatron architecture:
- Add multi-modulate and multi-gate mechanisms for adaptive layer normalization
- Implement dual-stream attention with encoder-decoder structure
- Add feed-forward network with GELU activation
- Support rotary position embeddings for image tokens

The implementation follows the original CogView4 architecture while adapting
it to work within the diffusers framework.

* with new attn

* [bugfix] fix dimension mismatch in CogView4 attention

* [cogview4][WIP]: update final normalization in CogView4 transformer

Refactored the final normalization layer in CogView4 transformer to use separate layernorm and AdaLN operations instead of combined AdaLayerNormContinuous. This matches the original implementation but needs validation.

Needs verification against reference implementation.

* 1

* put back

* Update transformer_cogview4.py

* change time_shift

* Update pipeline_cogview4.py

* change timesteps

* fix

* change text_encoder_id

* [cogview4][rope] align RoPE implementation with Megatron

- Implement apply_rope method in attention processor to match Megatron's implementation
- Update position embeddings to ensure compatibility with Megatron-style rotary embeddings
- Ensure consistent rotary position encoding across attention layers

This change improves compatibility with Megatron-based models and provides
better alignment with the original implementation's positional encoding approach.

* [cogview4][bugfix] apply silu activation to time embeddings in CogView4

Applied silu activation to time embeddings before splitting into conditional
and unconditional parts in CogView4Transformer2DModel. This matches the
original implementation and helps ensure correct time conditioning behavior.

* [cogview4][chore] clean up pipeline code

- Remove commented out code and debug statements
- Remove unused retrieve_timesteps function
- Clean up code formatting and documentation

This commit focuses on code cleanup in the CogView4 pipeline implementation, removing unnecessary commented code and improving readability without changing functionality.

* [cogview4][scheduler] Implement CogView4 scheduler and pipeline

* now It work

* add timestep

* batch

* change convert script

* refactor pt. 1; make style

* refactor pt. 2

* refactor pt. 3

* add tests

* make fix-copies

* update toctree.yml

* use flow match scheduler instead of custom

* remove scheduling_cogview.py

* add tiktoken to test dependencies

* Update src/diffusers/models/embeddings.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* apply suggestions from review

* use diffusers apply_rotary_emb

* update flow match scheduler to accept timesteps

* fix comment

* apply review suggestions

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------
Co-authored-by: 三洋三洋 <1258009915@qq.com>
Co-authored-by: OleehyO <leehy0357@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

d90cd362
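
The CogView4 commit notes above describe replacing a padding-based guidance step with separate forward passes for the conditional and unconditional prompts, whose predictions are then combined. A minimal sketch of that general technique (standard classifier-free guidance with two passes) is shown below. It is not the pipeline's code: `guided_noise` and `toy_denoise` are hypothetical names, the model is a toy stand-in, and plain Python lists replace tensors so the example runs without dependencies.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def guided_noise(
    denoise: Callable[[Vector, Sequence[int]], Vector],
    latents: Vector,
    cond_tokens: Sequence[int],
    uncond_tokens: Sequence[int],
    guidance_scale: float,
) -> Vector:
    """Classifier-free guidance with one forward pass per prompt.

    Because each prompt gets its own call, the conditional and unconditional
    token sequences may have different lengths and need no padding.
    """
    noise_cond = denoise(latents, cond_tokens)
    noise_uncond = denoise(latents, uncond_tokens)
    # Standard CFG combination: uncond + scale * (cond - uncond).
    return [u + guidance_scale * (c - u) for c, u in zip(noise_cond, noise_uncond)]


def toy_denoise(latents: Vector, tokens: Sequence[int]) -> Vector:
    """Hypothetical stand-in model: shifts each latent by the prompt length."""
    return [x + len(tokens) for x in latents]
```

With `guidance_scale` set to 1.0 the result equals the conditional prediction, and with 0.0 it equals the unconditional one. The trade-off of this approach is two model calls per step instead of one batched call.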

12 Feb, 2025 1 commit

Faster set_adapters (#10777) · 067eab1b

Thanh Le authored Feb 12, 2025



* Update peft_utils.py

* Update peft_utils.py

* Update peft_utils.py

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

067eab1b

11 Feb, 2025 1 commit

Add support for lumina2 (#10642) · 81440fd4

Le Zhuo authored Feb 12, 2025



* Add support for lumina2


---------
Co-authored-by: csuhan <hanjiaming@whu.edu.cn>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: hlky <hlky@hlky.ac>

81440fd4