  1. 30 Sep, 2025 1 commit
  2. 26 Sep, 2025 1 commit
  3. 24 Sep, 2025 1 commit
    • Introduce cache-dit to community optimization (#12366) · 310fdaf5
      DefTruth authored
      * docs: introduce cache-dit to diffusers
      
      * misc: update examples link
      
      * Refine documentation for CacheDiT features
      
      Updated the wording for clarity and consistency in the documentation. Adjusted sections on cache acceleration, automatic block adapter, patch functor, and hybrid cache configuration.
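The cache acceleration this commit documents is based on skipping expensive transformer blocks when their inputs barely change between denoising steps. A minimal pure-Python sketch of that residual-diff caching idea — `CachedBlocks`, `probe`, and `expensive` are illustrative names, not the cache-dit API:

```python
class CachedBlocks:
    """Toy sketch of residual-diff caching: run a cheap 'probe' block each
    step; if its residual barely changed since the cached step, reuse the
    cached output of the expensive blocks instead of recomputing them."""

    def __init__(self, probe, expensive, threshold=0.05):
        self.probe = probe              # cheap first block, used as a signal
        self.expensive = expensive      # deep blocks we want to skip
        self.threshold = threshold      # max residual change for a cache hit
        self.prev_residual = None
        self.cached_out = None
        self.recomputes = 0

    def __call__(self, x):
        residual = self.probe(x) - x
        if (self.prev_residual is not None
                and abs(residual - self.prev_residual) < self.threshold):
            out = self.cached_out       # cache hit: skip the deep blocks
        else:
            out = x + residual
            for block in self.expensive:
                out = block(out)
            self.recomputes += 1
            self.cached_out = out
        self.prev_residual = residual
        return out
```

Calling this with near-identical inputs on consecutive steps reuses the cached output, so only genuinely different steps pay for the deep blocks.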
  4. 23 Sep, 2025 1 commit
  5. 10 Sep, 2025 1 commit
  6. 25 Aug, 2025 2 commits
  7. 18 Jul, 2025 1 commit
  8. 11 Jul, 2025 1 commit
  9. 26 Jun, 2025 2 commits
  10. 20 Jun, 2025 2 commits
  11. 19 Jun, 2025 2 commits
  12. 16 Jun, 2025 1 commit
    • Add Pruna optimization framework documentation (#11688) · 9b834f87
      David Berenstein authored
      
      
      * Add Pruna optimization framework documentation
      
      - Introduced a new section for Pruna in the table of contents.
      - Added comprehensive documentation for Pruna, detailing its optimization techniques, installation instructions, and examples for optimizing and evaluating models
      
      * Enhance Pruna documentation with image alt text and code block formatting
      
      - Added alt text to images for better accessibility and context.
      - Changed code block syntax from diff to python for improved clarity.
      
      * Add installation section to Pruna documentation
      
      - Introduced a new installation section in the Pruna documentation to guide users on how to install the framework.
      - Enhanced the overall clarity and usability of the documentation for new users.
      
      * Update pruna.md
      
      * Update pruna.md
      
      * Update Pruna documentation for model optimization and evaluation
      
      - Changed section titles for consistency and clarity, from "Optimizing models" to "Optimize models" and "Evaluating and benchmarking optimized models" to "Evaluate and benchmark models".
      - Enhanced descriptions to clarify the use of `diffusers` models and the evaluation process.
      - Added a new example for evaluating standalone `diffusers` models.
      - Updated references and links for better navigation within the documentation.
      
      * Refactor Pruna documentation for clarity and consistency
      
      - Removed outdated references to FLUX-juiced and streamlined the explanation of benchmarking.
      - Enhanced the description of evaluating standalone `diffusers` models.
      - Cleaned up code examples by removing unnecessary imports and comments for better readability.
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Enhance Pruna documentation with new examples and clarifications
      
      - Added an image to illustrate the optimization process.
      - Updated the explanation for sharing and loading optimized models on the Hugging Face Hub.
      - Clarified the evaluation process for optimized models using the EvaluationAgent.
      - Improved descriptions for defining metrics and evaluating standalone diffusers models.
      
      ---------
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
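The Pruna workflow documented here revolves around declaring optimizations in a config and applying them in one call. A toy, config-driven sketch of that pattern — the algorithm names and registry below are hypothetical stand-ins, not Pruna's real algorithms or internals:

```python
class SmashConfig(dict):
    """Toy stand-in for a Pruna-style config: maps an algorithm *slot*
    (e.g. "quantizer", "cacher") to the algorithm chosen for it."""

# Hypothetical registry; a real framework ships its own implementations.
ALGORITHMS = {
    ("quantizer", "half"): lambda m: dict(m, dtype="float16"),
    ("cacher", "step_cache"): lambda m: dict(m, cached=True),
}

def smash(model: dict, smash_config: SmashConfig) -> dict:
    """Apply every algorithm selected in the config to the model, in order."""
    for slot, algo in smash_config.items():
        if (slot, algo) not in ALGORITHMS:
            raise KeyError(f"unknown algorithm {algo!r} for slot {slot!r}")
        model = ALGORITHMS[(slot, algo)](model)
    return model
```

The design point the docs emphasize is that the optimizations compose: each slot transforms the model independently, so a quantizer and a cacher can be stacked from one config.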
  13. 02 Jun, 2025 1 commit
  14. 28 May, 2025 1 commit
  15. 23 May, 2025 1 commit
  16. 19 May, 2025 2 commits
  17. 15 May, 2025 1 commit
  18. 01 May, 2025 1 commit
  19. 08 Apr, 2025 2 commits
  20. 24 Mar, 2025 1 commit
  21. 14 Feb, 2025 1 commit
    • Module Group Offloading (#10503) · 9a147b82
      Aryan authored
      
      
      * update
      
      * fix
      
      * non_blocking; handle parameters and buffers
      
      * update
      
      * Group offloading with cuda stream prefetching (#10516)
      
      * cuda stream prefetch
      
      * remove breakpoints
      
      * update
      
      * copy model hook implementation from pab
      
      * update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite
      
      * more workarounds to make it actually work
      
      * cleanup
      
      * rewrite
      
      * update
      
      * make sure to sync current stream before overwriting with pinned params
      
      not doing so will lead to erroneous computations on the GPU and cause bad results
      
      * better check
      
      * update
      
      * remove hook implementation to not deal with merge conflict
      
      * re-add hook changes
      
      * why use more memory when less memory do trick
      
      * why still use slightly more memory when less memory do trick
      
      * optimise
      
      * add model tests
      
      * add pipeline tests
      
      * update docs
      
      * add layernorm and groupnorm
      
      * address review comments
      
      * improve tests; add docs
      
      * improve docs
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * apply suggestions from code review
      
      * update tests
      
      * apply suggestions from review
      
      * enable_group_offloading -> enable_group_offload for naming consistency
      
      * raise errors if multiple offloading strategies used; add relevant tests
      
      * handle .to() when group offload applied
      
      * refactor some repeated code
      
      * remove unintentional change from merge conflict
      
      * handle .cuda()
      
      ---------
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
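The scheduling idea behind this PR — keep only a small window of module groups on the accelerator, and onload the next group while the current one computes — can be sketched in plain Python. This is a toy scheduler, not the diffusers implementation; real group offloading moves weights between devices with CUDA streams rather than tracking indices in a deque:

```python
from collections import deque

class GroupOffloader:
    """Toy sketch: keep at most `window` groups 'on device'; request the
    next group before running the current one, mimicking stream prefetch."""

    def __init__(self, groups, window=2):
        self.groups = groups        # list of groups; each group is a list of layers
        self.window = window        # max groups resident at once
        self.on_device = deque()    # indices of resident groups, oldest first
        self.log = []               # (event, group_index) pairs, for inspection

    def _onload(self, idx):
        if idx < len(self.groups) and idx not in self.on_device:
            self.on_device.append(idx)
            self.log.append(("onload", idx))
        while len(self.on_device) > self.window:
            evicted = self.on_device.popleft()      # offload oldest group
            self.log.append(("offload", evicted))

    def run(self, x):
        self._onload(0)
        for i, group in enumerate(self.groups):
            self._onload(i + 1)     # prefetch the next group "concurrently"
            for layer in group:     # compute with the current group
                x = layer(x)
        return x
```

The eviction-while-prefetching pattern is why the real feature needs the current stream synchronized before overwriting pinned parameters, as one of the commits above notes.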
  22. 23 Jan, 2025 1 commit
  23. 22 Jan, 2025 1 commit
    • [core] Layerwise Upcasting (#10347) · beacaa55
      Aryan authored
      
      
      * update
      
      * update
      
      * make style
      
      * remove dynamo disable
      
      * add coauthor
      Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>
      
      * update
      
      * update
      
      * update
      
      * update mixin
      
      * add some basic tests
      
      * update
      
      * update
      
      * non_blocking
      
      * improvements
      
      * update
      
      * norm.* -> norm
      
      * apply suggestions from review
      
      * add example
      
      * update hook implementation to the latest changes from pyramid attention broadcast
      
      * deinitialize should raise an error
      
      * update doc page
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * update docs
      
      * update
      
      * refactor
      
      * fix _always_upcast_modules for asym ae and vq_model
      
      * fix lumina embedding forward to not depend on weight dtype
      
      * refactor tests
      
      * add simple lora inference tests
      
      * _always_upcast_modules -> _precision_sensitive_module_patterns
      
      * remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case
      
      * check layer dtypes in lora test
      
      * fix UNet1DModelTests::test_layerwise_upcasting_inference
      
      * _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback
      
      * skip test in NCSNppModelTests
      
      * skip tests for AutoencoderTinyTests
      
      * skip tests for AutoencoderOobleckTests
      
      * skip tests for UNet1DModelTests - unsupported pytorch operations
      
      * layerwise_upcasting -> layerwise_casting
      
      * skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support
      
      * add layerwise fp8 pipeline test
      
      * use xfail
      
      * Apply suggestions from code review
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      * add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass)
      
      * add note about memory consumption on tesla CI runner for failing test
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
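Layerwise casting stores weights in a low-precision dtype (fp8 in this PR) and upcasts them to the compute dtype only inside each layer's forward pass. A minimal stdlib-only sketch, using a half-precision round-trip as a stand-in for the lossy storage dtype — `Linear1D` and `to_half` are illustrative names, not diffusers code:

```python
import struct

def to_half(x: float) -> float:
    """Round-trip through IEEE-754 half precision, standing in for the
    low-precision storage dtype (fp8 in the actual commit)."""
    return struct.unpack('e', struct.pack('e', x))[0]

class Linear1D:
    """Toy layer with layerwise casting: the weight is *stored* in low
    precision and upcast to full precision only inside forward()."""

    def __init__(self, weight: float):
        self.stored = to_half(weight)   # storage dtype (lossy)

    def forward(self, x: float) -> float:
        w = float(self.stored)          # upcast to compute dtype
        return w * x                    # compute in full precision
```

This also illustrates why the PR skips precision-sensitive modules (the `_skip_layerwise_casting_patterns` rename above): values not exactly representable in the storage dtype come back slightly perturbed, which some layers, like normalization, tolerate poorly.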
  24. 16 Jan, 2025 1 commit
  25. 25 Oct, 2024 1 commit
  26. 12 Oct, 2024 1 commit
  27. 23 Sep, 2024 1 commit
  28. 16 Sep, 2024 1 commit
  29. 09 Sep, 2024 1 commit
  30. 08 Aug, 2024 1 commit
  31. 05 Jun, 2024 1 commit
    • Errata (#8322) · 98730c5d
      Tolga Cangöz authored
      * Fix typos
      
      * Trim trailing whitespaces
      
      * Remove a trailing whitespace
      
      * chore: Update MarigoldDepthPipeline checkpoint to prs-eth/marigold-lcm-v1-0
      
      * Revert "chore: Update MarigoldDepthPipeline checkpoint to prs-eth/marigold-lcm-v1-0"
      
      This reverts commit fd742b30b4258106008a6af4d0dd4664904f8595.
      
      * pokemon -> naruto
      
      * `DPMSolverMultistep` -> `DPMSolverMultistepScheduler`
      
      * Improve Markdown stylization
      
      * Improve style
      
      * Improve style
      
      * Refactor pipeline variable names for consistency
      
      * up style
  32. 24 May, 2024 1 commit
  33. 20 May, 2024 1 commit
  34. 10 May, 2024 1 commit
    • #7535 Update FloatTensor type hints to Tensor (#7883) · be4afa0b
      Mark Van Aken authored
      * find & replace all FloatTensors to Tensor
      
      * apply formatting
      
      * Update torch.FloatTensor to torch.Tensor in the remaining files
      
      * formatting
      
      * Fix the rest of the places where FloatTensor is used as well as in documentation
      
      * formatting
      
      * Update new file from FloatTensor to Tensor
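The find-and-replace described above — `torch.FloatTensor` is a legacy alias restricted to float32 CPU tensors, while `torch.Tensor` covers all dtypes and devices — can be sketched as a small annotation rewriter. The helper name is illustrative; the actual PR was a repo-wide edit, not a script shipped with the library:

```python
import re

def modernize_hints(source: str) -> str:
    """Replace the legacy float32-only alias with the general tensor type.
    Word boundaries keep other identifiers (and the plural form) untouched."""
    return re.sub(r"\btorch\.FloatTensor\b", "torch.Tensor", source)

before = "def step(self, sample: torch.FloatTensor) -> torch.FloatTensor:"
after = modernize_hints(before)
```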