1. 08 Jul, 2025 2 commits
    • First Block Cache (#11180) · 0454fbb3
      Aryan authored
      
      
      * update
      
      * modify flux single blocks to make compatible with cache techniques (without too much model-specific intrusion code)
      
      * remove debug logs
      
      * update
      
      * cache context for different batches of data
      
      * fix hs residual bug for single return outputs; support ltx
      
      * fix controlnet flux
      
      * support flux, ltx i2v, ltx condition
      
      * update
      
      * update
      
      * Update docs/source/en/api/cache.md
      
      * Update src/diffusers/hooks/hooks.py
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      * address review comments pt. 1
      
      * address review comments pt. 2
      
      * cache context refactor; address review pt. 3
      
      * address review comments
      
      * metadata registration with decorators instead of centralized
      
      * support cogvideox
      
      * support mochi
      
      * fix
      
      * remove unused function
      
      * remove central registry based on review
      
      * update
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
    • [CI] Fix big GPU test marker (#11786) · cbc8ced2
      Dhruv Nair authored
      * update
      
      * update
  2. 11 Jun, 2025 1 commit
  3. 09 Apr, 2025 1 commit
  4. 21 Mar, 2025 1 commit
    • [core] FasterCache (#10163) · 844221ae
      Aryan authored
      
      
      * init
      
      * update
      
      * update
      
      * update
      
      * make style
      
      * update
      
      * fix
      
      * make it work with guidance distilled models
      
      * update
      
      * make fix-copies
      
      * add tests
      
      * update
      
      * apply_faster_cache -> apply_fastercache
      
      * fix
      
      * reorder
      
      * update
      
      * refactor
      
      * update docs
      
      * add fastercache to CacheMixin
      
      * update tests
      
      * Apply suggestions from code review
      
      * make style
      
      * try to fix partial import error
      
      * Apply style fixes
      
      * raise warning
      
      * update
      
      ---------
      Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
  5. 20 Mar, 2025 1 commit
  6. 04 Mar, 2025 1 commit
    • [tests] make tests device-agnostic (part 4) (#10508) · 7855ac59
      Fanli Lin authored
      
      
      * initial commit
      
      * fix empty cache
      
      * fix one more
      
      * fix style
      
      * update device functions
      
      * update
      
      * update
      
      * Update src/diffusers/utils/testing_utils.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Update src/diffusers/utils/testing_utils.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Update src/diffusers/utils/testing_utils.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Update tests/pipelines/controlnet/test_controlnet.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Update src/diffusers/utils/testing_utils.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Update src/diffusers/utils/testing_utils.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Update tests/pipelines/controlnet/test_controlnet.py
      Co-authored-by: hlky <hlky@hlky.ac>
      
      * with gc.collect
      
      * update
      
      * make style
      
      * check_torch_dependencies
      
      * add mps empty cache
      
      * add changes
      
      * bug fix
      
      * enable on xpu
      
      * update more cases
      
      * revert
      
      * revert back
      
      * Update test_stable_diffusion_xl.py
      
      * Update tests/pipelines/stable_diffusion/test_stable_diffusion.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Update tests/pipelines/stable_diffusion/test_stable_diffusion.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py
      Co-authored-by: hlky <hlky@hlky.ac>

      * Apply suggestions from code review
      Co-authored-by: hlky <hlky@hlky.ac>
      
      * add test marker
      
      ---------
      Co-authored-by: hlky <hlky@hlky.ac>
  7. 03 Mar, 2025 1 commit
    • [Tests] Remove more encode prompts tests (#10942) · 7513162b
      Sayak Paul authored
      * fix-copies went uncaught it seems.
      
      * remove more unneeded encode_prompt() tests
      
      * Revert "fix-copies went uncaught it seems."
      
      This reverts commit eefb302791172a4fb8ef008e400f94878de2c6c9.
      
      * empty
  8. 14 Feb, 2025 1 commit
    • Module Group Offloading (#10503) · 9a147b82
      Aryan authored
      
      
      * update
      
      * fix
      
      * non_blocking; handle parameters and buffers
      
      * update
      
      * Group offloading with cuda stream prefetching (#10516)
      
      * cuda stream prefetch
      
      * remove breakpoints
      
      * update
      
      * copy model hook implementation from pab
      
      * update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite
      
      * more workarounds to make it actually work
      
      * cleanup
      
      * rewrite
      
      * update
      
      * make sure to sync current stream before overwriting with pinned params
      
      not doing so will lead to erroneous computations on the GPU and cause bad results
      
      * better check
      
      * update
      
      * remove hook implementation to not deal with merge conflict
      
      * re-add hook changes
      
      * why use more memory when less memory do trick
      
      * why still use slightly more memory when less memory do trick
      
      * optimise
      
      * add model tests
      
      * add pipeline tests
      
      * update docs
      
      * add layernorm and groupnorm
      
      * address review comments
      
      * improve tests; add docs
      
      * improve docs
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * apply suggestions from code review
      
      * update tests
      
      * apply suggestions from review
      
      * enable_group_offloading -> enable_group_offload for naming consistency
      
      * raise errors if multiple offloading strategies used; add relevant tests
      
      * handle .to() when group offload applied
      
      * refactor some repeated code
      
      * remove unintentional change from merge conflict
      
      * handle .cuda()
      
      ---------
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
  9. 27 Jan, 2025 1 commit
  10. 22 Jan, 2025 1 commit
    • [core] Layerwise Upcasting (#10347) · beacaa55
      Aryan authored
      
      
      * update
      
      * update
      
      * make style
      
      * remove dynamo disable
      
      * add coauthor
      Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>
      
      * update
      
      * update
      
      * update
      
      * update mixin
      
      * add some basic tests
      
      * update
      
      * update
      
      * non_blocking
      
      * improvements
      
      * update
      
      * norm.* -> norm
      
      * apply suggestions from review
      
      * add example
      
      * update hook implementation to the latest changes from pyramid attention broadcast
      
      * deinitialize should raise an error
      
      * update doc page
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * update docs
      
      * update
      
      * refactor
      
      * fix _always_upcast_modules for asym ae and vq_model
      
      * fix lumina embedding forward to not depend on weight dtype
      
      * refactor tests
      
      * add simple lora inference tests
      
      * _always_upcast_modules -> _precision_sensitive_module_patterns
      
      * remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case
      
      * check layer dtypes in lora test
      
      * fix UNet1DModelTests::test_layerwise_upcasting_inference
      
      * _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback
      
      * skip test in NCSNppModelTests
      
      * skip tests for AutoencoderTinyTests
      
      * skip tests for AutoencoderOobleckTests
      
      * skip tests for UNet1DModelTests - unsupported pytorch operations
      
      * layerwise_upcasting -> layerwise_casting
      
      * skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support
      
      * add layerwise fp8 pipeline test
      
      * use xfail
      
      * Apply suggestions from code review
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      * add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass)
      
      * add note about memory consumption on tesla CI runner for failing test
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
  11. 12 Jan, 2025 1 commit
  12. 10 Jan, 2025 1 commit
    • [LoRA] allow big CUDA tests to run properly for LoRA (and others) (#9845) · a6f043a8
      Sayak Paul authored
      
      
      * allow big lora tests to run on the CI.
      
      * print
      
      * print.
      
      * print
      
      * print
      
      * print
      
      * print
      
      * more
      
      * print
      
      * remove print.
      
      * remove print
      
      * directly place on cuda.
      
      * remove pipeline.
      
      * remove
      
      * fix
      
      * fix
      
      * spaces
      
      * quality
      
      * updates
      
      * directly place flux controlnet pipeline on cuda.
      
      * torch_device instead of cuda.
      
      * style
      
      * device placement.
      
      * fixes
      
      * add big gpu marker for mochi; rename test correctly
      
      * address feedback
      
      * fix
      
      ---------
      Co-authored-by: Aryan <aryan@huggingface.co>
  13. 21 Dec, 2024 1 commit
    • Support Flux IP Adapter (#10261) · be207099
      hlky authored
      
      
      * Flux IP-Adapter
      
      * test cfg
      
      * make style
      
      * temp remove copied from
      
      * fix test
      
      * fix test
      
      * v2
      
      * fix
      
      * make style
      
      * temp remove copied from
      
      * Apply suggestions from code review
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Move encoder_hid_proj to inside FluxTransformer2DModel
      
      * merge
      
      * separate encode_prompt, add copied from, image_encoder offload
      
      * make
      
      * fix test
      
      * fix
      
      * Update src/diffusers/pipelines/flux/pipeline_flux.py
      
      * test_flux_prompt_embeds change not needed
      
      * true_cfg -> true_cfg_scale
      
      * fix merge conflict
      
      * test_flux_ip_adapter_inference
      
      * add fast test
      
      * FluxIPAdapterMixin not test mixin
      
      * Update pipeline_flux.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      ---------
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
  14. 20 Nov, 2024 1 commit
  15. 31 Oct, 2024 3 commits
  16. 02 Sep, 2024 1 commit
  17. 23 Aug, 2024 1 commit
  18. 02 Aug, 2024 1 commit
    • [Flux] allow tests to run (#9050) · 0e460675
      Sayak Paul authored
      * fix tests
      
      * fix
      
      * float64 skip
      
      * remove sample_size.
      
      * remove
      
      * remove more
      
      * default_sample_size.
      
      * credit black forest for flux model.
      
      * skip
      
      * fix: tests
      
      * remove OriginalModelMixin
      
      * add transformer model test
      
      * add: transformer model tests
  19. 01 Aug, 2024 1 commit
  20. 24 Jul, 2024 1 commit
    • [Core] fix QKV fusion for attention (#8829) · 50d21f7c
      Sayak Paul authored
      * start debugging the problem,
      
      * start
      
      * fix
      
      * fix
      
      * fix imports.
      
      * handle hunyuan
      
      * remove residuals.
      
      * add a check for making sure there's appropriate procs.
      
      * add more rigor to the tests.
      
      * fix test
      
      * remove redundant check
      
      * fix-copies
      
      * move check_qkv_fusion_matches_attn_procs_length and check_qkv_fusion_processors_exist.
  21. 12 Jun, 2024 1 commit