Commits · 5ffb73d4aeac9eaef8366d7b21872d64009bd1c7 · renzhc / diffusers_dcu

25 Nov, 2025 1 commit

(#12711) · 5ffb73d4

Sayak Paul authored Nov 25, 2025



* add vae

* Initial commit for Flux 2 Transformer implementation

* add pipeline part

* small edits to the pipeline and conversion

* update conversion script

* fix

* up up

* finish pipeline

* Remove Flux IP Adapter logic for now

* Remove deprecated 3D id logic

* Remove ControlNet logic for now

* Add link to ViT-22B paper as reference for parallel transformer blocks such as the Flux 2 single stream block

* update pipeline

* Don't use biases for input projs and output AdaNorm

* up

* Remove bias for double stream block text QKV projections

* Add script to convert Flux 2 transformer to diffusers

* make style and make quality

* fix a few things.

* allow sft files to go.

* fix image processor

* fix batch

* style a bit

* Fix some bugs in Flux 2 transformer implementation

* Fix dummy input preparation and fix some test bugs

* fix dtype casting in timestep guidance module.

* resolve conflicts.

* remove ip adapter stuff.

* Fix Flux 2 transformer consistency test

* Fix bug in Flux2TransformerBlock (double stream block)

* Get remaining Flux 2 transformer tests passing

* make style; make quality; make fix-copies

* remove stuff.

* fix type annotation.

* remove unneeded stuff from tests

* tests

* up

* up

* add sf support

* Remove unused IP Adapter and ControlNet logic from transformer (#9)

* copied from

* Apply suggestions from code review
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>

* up

* up

* up

* up

* up

* Refactor Flux2Attention into separate classes for double stream and single stream attention

* Add _supports_qkv_fusion to AttentionModuleMixin to allow subclasses to disable QKV fusion

* Have Flux2ParallelSelfAttention inherit from AttentionModuleMixin with _supports_qkv_fusion=False

* Log debug message when calling fuse_projections on an AttentionModuleMixin subclass that does not support QKV fusion

* Address review comments

* Update src/diffusers/pipelines/flux2/pipeline_flux2.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* up

* Remove maybe_allow_in_graph decorators for Flux 2 transformer blocks (#12)

* up

* support ostris loras. (#13)

* up

* update schedule

* up

* up (#17)

* add training scripts (#16)

* add training scripts
Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>

* model cpu offload in validation.

* add flux.2 readme

* add img2img and tests

* cpu offload in log validation

* Apply suggestions from code review

* fix

* up

* fixes

* remove i2i training tests for now.

---------
Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
Co-authored-by: linoytsaban <linoy@huggingface.co>

* up

---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Daniel Gu <dgu8957@gmail.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-10-53-87-203.ec2.internal>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
Co-authored-by: linoytsaban <linoy@huggingface.co>

5ffb73d4

28 Aug, 2025 1 commit

[Refactor] Move testing utils out of src (#12238) · 7aa6af11

Dhruv Nair authored Aug 28, 2025

* update

* update

* update

* update

* update

* merge main

* Revert "merge main"

This reverts commit 65efbcead58644b31596ed2d714f7cee0e0238d3.

7aa6af11

17 Jul, 2025 1 commit

[refactor] Flux/Chroma single file implementation + Attention Dispatcher (#11916) · 18c8f10f

Aryan authored Jul 17, 2025



* update

* update

* add coauthor
Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

* improve test

* handle ip adapter params correctly

* fix chroma qkv fusion test

* fix fastercache implementation

* fix more tests

* fight more tests

* add back set_attention_backend

* update

* update

* make style

* make fix-copies

* make ip adapter processor compatible with attention dispatcher

* refactor chroma as well

* remove rmsnorm assert

* minify and deprecate npu/xla processors

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

18c8f10f

14 Jun, 2025 1 commit

Chroma Pipeline (#11698) · 8adc6003

Edna authored Jun 13, 2025



* working state from hameerabbasi and iddl

* working state from hameerabbasi and iddl (transformer)

* working state (normalization)

* working state (embeddings)

* add chroma loader

* add chroma to mappings

* add chroma to transformer init

* take out variant stuff

* get decently far in changing variant stuff

* add chroma init

* make chroma output class

* add chroma transformer to dummy tp

* add chroma to init

* add chroma to init

* fix single file

* update

* update

* add chroma to auto pipeline

* add chroma to pipeline init

* change to chroma transformer

* take out variant from blocks

* swap embedder location

* remove prompt_2

* work on swapping text encoders

* remove mask function

* dont modify mask (for now)

* wrap attn mask

* no attn mask (can't get it to work)

* remove pooled prompt embeds

* change to my own unpooled embedder

* fix load

* take pooled projections out of transformer

* ensure correct dtype for chroma embeddings

* update

* use dn6 attn mask + fix true_cfg_scale

* use chroma pipeline output

* use DN6 embeddings

* remove guidance

* remove guidance embed (pipeline)

* remove guidance from embeddings

* don't return length

* dont change dtype

* remove unused stuff, fix up docs

* add chroma autodoc

* add .md (oops)

* initial chroma docs

* undo don't change dtype

* undo arxiv change

unsure why that happened

* fix hf papers regression in more places

* Update docs/source/en/api/pipelines/chroma.md
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* do_cfg -> self.do_classifier_free_guidance

* Update docs/source/en/api/models/chroma_transformer.md
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update chroma.md

* Move chroma layers into transformer

* Remove pruned AdaLayerNorms

* Add chroma fast tests

* (untested) batch cond and uncond

* Add # Copied from for shift

* Update # Copied from statements

* update norm imports

* Revert cond + uncond batching

* Add transformer tests

* move chroma test (oops)

* chroma init

* fix chroma pipeline fast tests

* Update src/diffusers/models/transformers/transformer_chroma.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Move Approximator and Embeddings

* Fix auto pipeline + make style, quality

* make style

* Apply style fixes

* switch to new input ids

* fix # Copied from error

* remove # Copied from on protected members

* try to fix import

* fix import

* make fix-copies

* revert style fix

* update chroma transformer params

* update chroma transformer approximator init params

* update to pad tokens

* fix batch inference

* Make more pipeline tests work

* Make most transformer tests work

* fix docs

* make style, make quality

* skip batch tests

* fix test skipping

* fix test skipping again

* fix for tests

* Fix all pipeline test

* update

* push local changes, fix docs

* add encoder test, remove pooled dim

* default proj dim

* fix tests

* fix equal size list input

* update

* push local changes, fix docs

* add encoder test, remove pooled dim

* default proj dim

* fix tests

* fix equal size list input

* Revert "fix equal size list input"

This reverts commit 3fe4ad67d58d83715bc238f8654f5e90bfc5653c.

* update

* update

* update

* update

* update

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

8adc6003

09 Apr, 2025 1 commit

Update Ruff to latest Version (#10919) · edc154da

Dhruv Nair authored Apr 09, 2025

* update

* update

* update

* update

edc154da

03 Mar, 2025 1 commit

[Tests] Remove more encode prompts tests (#10942) · 7513162b

Sayak Paul authored Mar 03, 2025

* fix-copies went uncaught it seems.

* remove more unneeded encode_prompt() tests

* Revert "fix-copies went uncaught it seems."

This reverts commit eefb302791172a4fb8ef008e400f94878de2c6c9.

* empty

7513162b

14 Feb, 2025 1 commit

Module Group Offloading (#10503) · 9a147b82

Aryan authored Feb 14, 2025



* update

* fix

* non_blocking; handle parameters and buffers

* update

* Group offloading with cuda stream prefetching (#10516)

* cuda stream prefetch

* remove breakpoints

* update

* copy model hook implementation from pab

* update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite

* more workarounds to make it actually work

* cleanup

* rewrite

* update

* make sure to sync current stream before overwriting with pinned params

not doing so will lead to erroneous computations on the GPU and cause bad results

* better check

* update

* remove hook implementation to not deal with merge conflict

* re-add hook changes

* why use more memory when less memory do trick

* why still use slightly more memory when less memory do trick

* optimise

* add model tests

* add pipeline tests

* update docs

* add layernorm and groupnorm

* address review comments

* improve tests; add docs

* improve docs

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggestions from code review

* update tests

* apply suggestions from review

* enable_group_offloading -> enable_group_offload for naming consistency

* raise errors if multiple offloading strategies used; add relevant tests

* handle .to() when group offload applied

* refactor some repeated code

* remove unintentional change from merge conflict

* handle .cuda()

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

9a147b82

22 Jan, 2025 1 commit

[core] Layerwise Upcasting (#10347) · beacaa55

Aryan authored Jan 22, 2025



* update

* update

* make style

* remove dynamo disable

* add coauthor
Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

* update

* update

* update

* update mixin

* add some basic tests

* update

* update

* non_blocking

* improvements

* update

* norm.* -> norm

* apply suggestions from review

* add example

* update hook implementation to the latest changes from pyramid attention broadcast

* deinitialize should raise an error

* update doc page

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* update

* refactor

* fix _always_upcast_modules for asym ae and vq_model

* fix lumina embedding forward to not depend on weight dtype

* refactor tests

* add simple lora inference tests

* _always_upcast_modules -> _precision_sensitive_module_patterns

* remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case

* check layer dtypes in lora test

* fix UNet1DModelTests::test_layerwise_upcasting_inference

* _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback

* skip test in NCSNppModelTests

* skip tests for AutoencoderTinyTests

* skip tests for AutoencoderOobleckTests

* skip tests for UNet1DModelTests - unsupported pytorch operations

* layerwise_upcasting -> layerwise_casting

* skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support

* add layerwise fp8 pipeline test

* use xfail

* Apply suggestions from code review
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass)

* add note about memory consumption on tesla CI runner for failing test

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

beacaa55

23 Nov, 2024 1 commit

Flux Fill, Canny, Depth, Redux (#9985) · 7ac6e286

Aryan authored Nov 23, 2024



* update

---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

7ac6e286

20 Nov, 2024 1 commit

Flux latents fix (#9929) · f6f7afa1

Dhruv Nair authored Nov 20, 2024



* update

* update

* update

* update

* update

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

f6f7afa1

31 Oct, 2024 1 commit

[CI] add a big GPU marker to run memory-intensive tests separately on CI (#9691) · ff182ad6

Sayak Paul authored Oct 31, 2024



* add a marker for big gpu tests

* update

* trigger on PRs temporarily.

* onnx

* fix

* total memory

* fixes

* reduce memory threshold.

* bigger gpu

* empty

* g6e

* Apply suggestions from code review

* address comments.

* fix

* fix

* fix

* fix

* fix

* okay

* further reduce.

* updates

* remove

* updates

* updates

* updates

* updates

* fixes

* fixes

* updates.

* fix

* workflow fixes.

---------
Co-authored-by: Aryan <aryan@huggingface.co>

ff182ad6

17 Sep, 2024 1 commit

Feature flux controlnet img2img and inpaint pipeline (#9408) · bb1b0fa1

Subho Ghosh authored Sep 18, 2024

* Implemented FLUX controlnet support to Img2Img pipeline

bb1b0fa1

02 Sep, 2024 1 commit

[CI] More fixes for Fast GPU Tests on main (#9300) · 007ad0e2

Dhruv Nair authored Sep 02, 2024

update

007ad0e2

23 Aug, 2024 1 commit

[Core] fuse_qkv_projection() to Flux (#9185) · 2d9ccf39

Sayak Paul authored Aug 23, 2024

* start fusing flux.

* test

* finish fusion

* fix-copies

2d9ccf39

02 Aug, 2024 1 commit

[Flux] allow tests to run (#9050) · 0e460675

Sayak Paul authored Aug 02, 2024

* fix tests

* fix

* float64 skip

* remove sample_size.

* remove

* remove more

* default_sample_size.

* credit black forest for flux model.

* skip

* fix: tests

* remove OriginalModelMixin

* add transformer model test

* add: transformer model tests

0e460675

01 Aug, 2024 1 commit

Flux pipeline (#9043) · 27637a54

Sayak Paul authored Aug 02, 2024



add flux!
Signed-off-by: Adrien <adrien@huggingface.co>
Co-authored-by: Adrien <adrien.69740@gmail.com>
Co-authored-by: Anatoly Belikov <abelikov@singularitynet.io>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>

27637a54

24 Jul, 2024 1 commit

[Core] fix QKV fusion for attention (#8829) · 50d21f7c

Sayak Paul authored Jul 24, 2024

* start debugging the problem,

* start

* fix

* fix

* fix imports.

* handle hunyuan

* remove residuals.

* add a check for making sure there's appropriate procs.

* add more rigor to the tests.

* fix test

* remove redundant check

* fix-copies

* move check_qkv_fusion_matches_attn_procs_length and check_qkv_fusion_processors_exist.

50d21f7c

12 Jun, 2024 1 commit

Add Stable Diffusion 3 (#8483) · 04717fd8

Dhruv Nair authored Jun 13, 2024



* up

* add sd3

* update

* update

* add tests

* fix copies

* fix docs

* update

* add dreambooth lora

* add LoRA

* update

* update

* update

* update

* import fix

* update

* Update src/diffusers/pipelines/stable_diffusion_3/pipeline_stable_diffusion_3.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* import fix 2

* update

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* update

* update

* update

* fix ckpt id

* fix more ids

* update

* missing doc

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_3.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_3.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* fix

* update

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

* note on gated access.

* requirements

* licensing

---------
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

04717fd8