1. 25 Nov, 2025 1 commit
    • let's go Flux2 🚀 (#12711) · 5ffb73d4
      Sayak Paul authored
      
      
      * add vae
      
      * Initial commit for Flux 2 Transformer implementation
      
      * add pipeline part
      
      * small edits to the pipeline and conversion
      
      * update conversion script
      
      * fix
      
      * up up
      
      * finish pipeline
      
      * Remove Flux IP Adapter logic for now
      
      * Remove deprecated 3D id logic
      
      * Remove ControlNet logic for now
      
      * Add link to ViT-22B paper as reference for parallel transformer blocks such as the Flux 2 single stream block
      
      * update pipeline
      
      * Don't use biases for input projs and output AdaNorm
      
      * up
      
      * Remove bias for double stream block text QKV projections
      
      * Add script to convert Flux 2 transformer to diffusers
      
      * make style and make quality
      
      * fix a few things.
      
      * allow sft files to go.
      
      * fix image processor
      
      * fix batch
      
      * style a bit
      
      * Fix some bugs in Flux 2 transformer implementation
      
      * Fix dummy input preparation and fix some test bugs
      
      * fix dtype casting in timestep guidance module.
      
      * resolve conflicts.
      
      * remove ip adapter stuff.
      
      * Fix Flux 2 transformer consistency test
      
      * Fix bug in Flux2TransformerBlock (double stream block)
      
      * Get remaining Flux 2 transformer tests passing
      
      * make style; make quality; make fix-copies
      
      * remove stuff.
      
      * fix type annotation.
      
      * remove unneeded stuff from tests
      
      * tests
      
      * up
      
      * up
      
      * add sf support
      
      * Remove unused IP Adapter and ControlNet logic from transformer (#9)
      
      * copied from
      
      * Apply suggestions from code review
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
      
      * up
      
      * up
      
      * up
      
      * up
      
      * up
      
      * Refactor Flux2Attention into separate classes for double stream and single stream attention
      
      * Add _supports_qkv_fusion to AttentionModuleMixin to allow subclasses to disable QKV fusion
      
      * Have Flux2ParallelSelfAttention inherit from AttentionModuleMixin with _supports_qkv_fusion=False
      
      * Log debug message when calling fuse_projections on an AttentionModuleMixin subclass that does not support QKV fusion
      
      * Address review comments
      
      * Update src/diffusers/pipelines/flux2/pipeline_flux2.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * up
      
      * Remove maybe_allow_in_graph decorators for Flux 2 transformer blocks (#12)
      
      * up
      
      * support ostris loras. (#13)
      
      * up
      
      * update schedule
      
      * up
      
      * up (#17)
      
      * add training scripts (#16)
      
      * add training scripts
      Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
      
      * model cpu offload in validation.
      
      * add flux.2 readme
      
      * add img2img and tests
      
      * cpu offload in log validation
      
      * Apply suggestions from code review
      
      * fix
      
      * up
      
      * fixes
      
      * remove i2i training tests for now.
      
      ---------
      Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
      Co-authored-by: linoytsaban <linoy@huggingface.co>
      
      * up
      
      ---------
      Co-authored-by: yiyixuxu <yixu310@gmail.com>
      Co-authored-by: Daniel Gu <dgu8957@gmail.com>
      Co-authored-by: yiyi@huggingface.co <yiyi@ip-10-53-87-203.ec2.internal>
      Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
      Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
      Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
      Co-authored-by: linoytsaban <linoy@huggingface.co>
  2. 28 Aug, 2025 1 commit
  3. 17 Jul, 2025 1 commit
  4. 14 Jun, 2025 1 commit
    • Chroma Pipeline (#11698) · 8adc6003
      Edna authored
      
      
      * working state from hameerabbasi and iddl
      
      * working state from hameerabbasi and iddl (transformer)
      
      * working state (normalization)
      
      * working state (embeddings)
      
      * add chroma loader
      
      * add chroma to mappings
      
      * add chroma to transformer init
      
      * take out variant stuff
      
      * get decently far in changing variant stuff
      
      * add chroma init
      
      * make chroma output class
      
      * add chroma transformer to dummy tp
      
      * add chroma to init
      
      * add chroma to init
      
      * fix single file
      
      * update
      
      * update
      
      * add chroma to auto pipeline
      
      * add chroma to pipeline init
      
      * change to chroma transformer
      
      * take out variant from blocks
      
      * swap embedder location
      
      * remove prompt_2
      
      * work on swapping text encoders
      
      * remove mask function
      
      * dont modify mask (for now)
      
      * wrap attn mask
      
      * no attn mask (can't get it to work)
      
      * remove pooled prompt embeds
      
      * change to my own unpooled embedder
      
      * fix load
      
      * take pooled projections out of transformer
      
      * ensure correct dtype for chroma embeddings
      
      * update
      
      * use dn6 attn mask + fix true_cfg_scale
      
      * use chroma pipeline output
      
      * use DN6 embeddings
      
      * remove guidance
      
      * remove guidance embed (pipeline)
      
      * remove guidance from embeddings
      
      * don't return length
      
      * dont change dtype
      
      * remove unused stuff, fix up docs
      
      * add chroma autodoc
      
      * add .md (oops)
      
      * initial chroma docs
      
      * undo don't change dtype
      
      * undo arxiv change
      
      unsure why that happened
      
      * fix hf papers regression in more places
      
      * Update docs/source/en/api/pipelines/chroma.md
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      * do_cfg -> self.do_classifier_free_guidance
      
      * Update docs/source/en/api/models/chroma_transformer.md
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      * Update chroma.md
      
      * Move chroma layers into transformer
      
      * Remove pruned AdaLayerNorms
      
      * Add chroma fast tests
      
      * (untested) batch cond and uncond
      
      * Add # Copied from for shift
      
      * Update # Copied from statements
      
      * update norm imports
      
      * Revert cond + uncond batching
      
      * Add transformer tests
      
      * move chroma test (oops)
      
      * chroma init
      
      * fix chroma pipeline fast tests
      
      * Update src/diffusers/models/transformers/transformer_chroma.py
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      * Move Approximator and Embeddings
      
      * Fix auto pipeline + make style, quality
      
      * make style
      
      * Apply style fixes
      
      * switch to new input ids
      
      * fix # Copied from error
      
      * remove # Copied from on protected members
      
      * try to fix import
      
      * fix import
      
      * make fix-copies
      
      * revert style fix
      
      * update chroma transformer params
      
      * update chroma transformer approximator init params
      
      * update to pad tokens
      
      * fix batch inference
      
      * Make more pipeline tests work
      
      * Make most transformer tests work
      
      * fix docs
      
      * make style, make quality
      
      * skip batch tests
      
      * fix test skipping
      
      * fix test skipping again
      
      * fix for tests
      
      * Fix all pipeline tests
      
      * update
      
      * push local changes, fix docs
      
      * add encoder test, remove pooled dim
      
      * default proj dim
      
      * fix tests
      
      * fix equal size list input
      
      * update
      
      * push local changes, fix docs
      
      * add encoder test, remove pooled dim
      
      * default proj dim
      
      * fix tests
      
      * fix equal size list input
      
      * Revert "fix equal size list input"
      
      This reverts commit 3fe4ad67d58d83715bc238f8654f5e90bfc5653c.
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
  5. 09 Apr, 2025 1 commit
  6. 03 Mar, 2025 1 commit
    • [Tests] Remove more encode prompts tests (#10942) · 7513162b
      Sayak Paul authored
      * fix-copies went uncaught it seems.
      
      * remove more unneeded encode_prompt() tests
      
      * Revert "fix-copies went uncaught it seems."
      
      This reverts commit eefb302791172a4fb8ef008e400f94878de2c6c9.
      
      * empty
  7. 14 Feb, 2025 1 commit
    • Module Group Offloading (#10503) · 9a147b82
      Aryan authored
      
      
      * update
      
      * fix
      
      * non_blocking; handle parameters and buffers
      
      * update
      
      * Group offloading with cuda stream prefetching (#10516)
      
      * cuda stream prefetch
      
      * remove breakpoints
      
      * update
      
      * copy model hook implementation from pab
      
      * update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite
      
      * more workarounds to make it actually work
      
      * cleanup
      
      * rewrite
      
      * update
      
      * make sure to sync current stream before overwriting with pinned params
      
      not doing so will lead to erroneous computations on the GPU and cause bad results
      
      * better check
      
      * update
      
      * remove hook implementation to not deal with merge conflict
      
      * re-add hook changes
      
      * why use more memory when less memory do trick
      
      * why still use slightly more memory when less memory do trick
      
      * optimise
      
      * add model tests
      
      * add pipeline tests
      
      * update docs
      
      * add layernorm and groupnorm
      
      * address review comments
      
      * improve tests; add docs
      
      * improve docs
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * apply suggestions from code review
      
      * update tests
      
      * apply suggestions from review
      
      * enable_group_offloading -> enable_group_offload for naming consistency
      
      * raise errors if multiple offloading strategies used; add relevant tests
      
      * handle .to() when group offload applied
      
      * refactor some repeated code
      
      * remove unintentional change from merge conflict
      
      * handle .cuda()
      
      ---------
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
  8. 22 Jan, 2025 1 commit
    • [core] Layerwise Upcasting (#10347) · beacaa55
      Aryan authored
      
      
      * update
      
      * update
      
      * make style
      
      * remove dynamo disable
      
      * add coauthor
      Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>
      
      * update
      
      * update
      
      * update
      
      * update mixin
      
      * add some basic tests
      
      * update
      
      * update
      
      * non_blocking
      
      * improvements
      
      * update
      
      * norm.* -> norm
      
      * apply suggestions from review
      
      * add example
      
      * update hook implementation to the latest changes from pyramid attention broadcast
      
      * deinitialize should raise an error
      
      * update doc page
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * update docs
      
      * update
      
      * refactor
      
      * fix _always_upcast_modules for asym ae and vq_model
      
      * fix lumina embedding forward to not depend on weight dtype
      
      * refactor tests
      
      * add simple lora inference tests
      
      * _always_upcast_modules -> _precision_sensitive_module_patterns
      
      * remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case
      
      * check layer dtypes in lora test
      
      * fix UNet1DModelTests::test_layerwise_upcasting_inference
      
      * _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback
      
      * skip test in NCSNppModelTests
      
      * skip tests for AutoencoderTinyTests
      
      * skip tests for AutoencoderOobleckTests
      
      * skip tests for UNet1DModelTests - unsupported pytorch operations
      
      * layerwise_upcasting -> layerwise_casting
      
      * skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support
      
      * add layerwise fp8 pipeline test
      
      * use xfail
      
      * Apply suggestions from code review
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      * add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass)
      
      * add note about memory consumption on tesla CI runner for failing test
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
  9. 23 Nov, 2024 1 commit
  10. 20 Nov, 2024 1 commit
  11. 31 Oct, 2024 1 commit
    • [CI] add a big GPU marker to run memory-intensive tests separately on CI (#9691) · ff182ad6
      Sayak Paul authored
      
      
      * add a marker for big gpu tests
      
      * update
      
      * trigger on PRs temporarily.
      
      * onnx
      
      * fix
      
      * total memory
      
      * fixes
      
      * reduce memory threshold.
      
      * bigger gpu
      
      * empty
      
      * g6e
      
      * Apply suggestions from code review
      
      * address comments.
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * okay
      
      * further reduce.
      
      * updates
      
      * remove
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * fixes
      
      * fixes
      
      * updates.
      
      * fix
      
      * workflow fixes.
      
      ---------
      Co-authored-by: Aryan <aryan@huggingface.co>
  12. 17 Sep, 2024 1 commit
  13. 02 Sep, 2024 1 commit
  14. 23 Aug, 2024 1 commit
  15. 02 Aug, 2024 1 commit
    • [Flux] allow tests to run (#9050) · 0e460675
      Sayak Paul authored
      * fix tests
      
      * fix
      
      * float64 skip
      
      * remove sample_size.
      
      * remove
      
      * remove more
      
      * default_sample_size.
      
      * credit black forest for flux model.
      
      * skip
      
      * fix: tests
      
      * remove OriginalModelMixin
      
      * add transformer model test
      
      * add: transformer model tests
  16. 01 Aug, 2024 1 commit
  17. 24 Jul, 2024 1 commit
    • [Core] fix QKV fusion for attention (#8829) · 50d21f7c
      Sayak Paul authored
      * start debugging the problem
      
      * start
      
      * fix
      
      * fix
      
      * fix imports.
      
      * handle hunyuan
      
      * remove residuals.
      
      * add a check for making sure there's appropriate procs.
      
      * add more rigor to the tests.
      
      * fix test
      
      * remove redundant check
      
      * fix-copies
      
      * move check_qkv_fusion_matches_attn_procs_length and check_qkv_fusion_processors_exist.
  18. 12 Jun, 2024 1 commit