- 04 Dec, 2025 1 commit
-
-
Sayak Paul authored
-
- 03 Dec, 2025 1 commit
-
-
Sayak Paul authored
* start varlen variants for attn backend kernels. * maybe unflatten heads. * updates * remove unused function. * doc * up
-
- 24 Nov, 2025 1 commit
-
-
Sayak Paul authored
* up * support automatic dispatch. * disable compile support for now. * up * flash too. * document. * up * up * up * up
-
- 27 Oct, 2025 1 commit
-
-
Mikko Lauri authored
* add aiter attention backend * Apply style fixes --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 16 Oct, 2025 1 commit
-
-
Steven Liu authored
* checks * feedback --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
- 30 Sep, 2025 1 commit
-
-
Steven Liu authored
* change syntax * make style
-
- 26 Sep, 2025 1 commit
-
-
Sayak Paul authored
* slight edits to the attention backends docs. * Update docs/source/en/optimization/attention_backends.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 24 Sep, 2025 1 commit
-
-
DefTruth authored
* docs: introduce cache-dit to diffusers * docs: introduce cache-dit to diffusers * docs: introduce cache-dit to diffusers * docs: introduce cache-dit to diffusers * docs: introduce cache-dit to diffusers * docs: introduce cache-dit to diffusers * docs: introduce cache-dit to diffusers * misc: update examples link * misc: update examples link * docs: introduce cache-dit to diffusers * docs: introduce cache-dit to diffusers * docs: introduce cache-dit to diffusers * docs: introduce cache-dit to diffusers * docs: introduce cache-dit to diffusers * Refine documentation for CacheDiT features Updated the wording for clarity and consistency in the documentation. Adjusted sections on cache acceleration, automatic block adapter, patch functor, and hybrid cache configuration.
-
- 23 Sep, 2025 1 commit
-
-
Steven Liu authored
* init * feedback * update * feedback * fixes
-
- 10 Sep, 2025 1 commit
-
-
Sayak Paul authored
* feat: support group offloading at the pipeline level. * add tests * up * [docs] Pipeline group offloading (#12286) init Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 25 Aug, 2025 2 commits
-
-
Manith Ratnayake authored
[docs] typo: corrected 'compile regions' to 'compile_regions'
-
Sayak Paul authored
* up * up * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 18 Jul, 2025 1 commit
-
-
Sayak Paul authored
* include bp link. * Update docs/source/en/optimization/fp16.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * resources. --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 11 Jul, 2025 1 commit
-
-
Steven Liu authored
* add blog post * feedback * feedback
-
- 26 Jun, 2025 2 commits
-
-
Sayak Paul authored
* add test for checking compile on different shapes. * update * update * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Animesh Jain authored
* [rfc][compile] compile method for DiffusionPipeline * Apply suggestions from code review Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * Apply style fixes * Update docs/source/en/optimization/fp16.md * check --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 20 Jun, 2025 2 commits
-
-
Steven Liu authored
draft Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
Steven Liu authored
* draft * feedback * update * feedback * fix * feedback * feedback * fix * feedback
-
- 19 Jun, 2025 2 commits
-
-
Sayak Paul authored
* start implementing disk offloading in group. * delete diff file. * updates.patch * offload_to_disk_path * check if safetensors already exist. * add test and clarify. * updates * update todos. * update more docs. * update docs
-
Aryan authored
update
-
- 16 Jun, 2025 1 commit
-
-
David Berenstein authored
* Add Pruna optimization framework documentation - Introduced a new section for Pruna in the table of contents. - Added comprehensive documentation for Pruna, detailing its optimization techniques, installation instructions, and examples for optimizing and evaluating models * Enhance Pruna documentation with image alt text and code block formatting - Added alt text to images for better accessibility and context. - Changed code block syntax from diff to python for improved clarity. * Add installation section to Pruna documentation - Introduced a new installation section in the Pruna documentation to guide users on how to install the framework. - Enhanced the overall clarity and usability of the documentation for new users. * Update pruna.md * Update pruna.md * Update Pruna documentation for model optimization and evaluation - Changed section titles for consistency and clarity, from "Optimizing models" to "Optimize models" and "Evaluating and benchmarking optimized models" to "Evaluate and benchmark models". - Enhanced descriptions to clarify the use of `diffusers` models and the evaluation process. - Added a new example for evaluating standalone `diffusers` models. - Updated references and links for better navigation within the documentation. * Refactor Pruna documentation for clarity and consistency - Removed outdated references to FLUX-juiced and streamlined the explanation of benchmarking. - Enhanced the description of evaluating standalone `diffusers` models. - Cleaned up code examples by removing unnecessary imports and comments for better readability. * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Enhance Pruna documentation with new examples and clarifications - Added an image to illustrate the optimization process. - Updated the explanation for sharing and loading optimized models on the Hugging Face Hub. - Clarified the evaluation process for optimized models using the EvaluationAgent. - Improved descriptions for defining metrics and evaluating standalone diffusers models. --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 02 Jun, 2025 1 commit
-
-
Steven Liu authored
* cache * feedback
-
- 28 May, 2025 1 commit
-
-
Steven Liu authored
* combine * Update docs/source/en/optimization/fp16.md Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com>
-
- 23 May, 2025 1 commit
-
-
regisss authored
Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 19 May, 2025 2 commits
-
-
Quentin Gallouédec authored
* Use HF Papers * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
Sayak Paul authored
* tip for group offloading + quantization Co-authored-by:
Aryan VS <contact.aryanvs@gmail.com> * Apply suggestions from code review Co-authored-by:
Aryan <aryan@huggingface.co> --------- Co-authored-by:
Aryan VS <contact.aryanvs@gmail.com> Co-authored-by:
Aryan <aryan@huggingface.co>
-
- 15 May, 2025 1 commit
-
-
Sayak Paul authored
* add regional compilation docs. * minor. * reviewer feedback. * Update docs/source/en/optimization/torch2.0.md Co-authored-by:
Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com> --------- Co-authored-by:
Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
-
- 01 May, 2025 1 commit
-
-
Steven Liu authored
* reformat * initial * fin * review * inference * feedback * feedback * feedback
-
- 08 Apr, 2025 2 commits
-
-
Sayak Paul authored
* implement record_stream for better performance. * fix * style. * merge #11097 * Update src/diffusers/hooks/group_offloading.py Co-authored-by:
Aryan <aryan@huggingface.co> * fixes * docstring. * remaining todos in low_cpu_mem_usage * tests * updates to docs. --------- Co-authored-by:
Aryan <aryan@huggingface.co>
-
Steven Liu authored
mps
-
- 24 Mar, 2025 1 commit
-
-
Aryan authored
* update * Update docs/source/en/optimization/memory.md * Apply suggestions from code review Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * apply review suggestions * update --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
- 14 Feb, 2025 1 commit
-
-
Aryan authored
* update * fix * non_blocking; handle parameters and buffers * update * Group offloading with cuda stream prefetching (#10516) * cuda stream prefetch * remove breakpoints * update * copy model hook implementation from pab * update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite * more workarounds to make it actually work * cleanup * rewrite * update * make sure to sync current stream before overwriting with pinned params not doing so will lead to erroneous computations on the GPU and cause bad results * better check * update * remove hook implementation to not deal with merge conflict * re-add hook changes * why use more memory when less memory do trick * why still use slightly more memory when less memory do trick * optimise * add model tests * add pipeline tests * update docs * add layernorm and groupnorm * address review comments * improve tests; add docs * improve docs * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * apply suggestions from code review * update tests * apply suggestions from review * enable_group_offloading -> enable_group_offload for naming consistency * raise errors if multiple offloading strategies used; add relevant tests * handle .to() when group offload applied * refactor some repeated code * remove unintentional change from merge conflict * handle .cuda() --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 23 Jan, 2025 1 commit
-
-
Sayak Paul authored
fix image path in para attention docs
-
- 22 Jan, 2025 1 commit
-
-
Aryan authored
* update * update * make style * remove dynamo disable * add coauthor Co-Authored-By:
Dhruv Nair <dhruv.nair@gmail.com> * update * update * update * update mixin * add some basic tests * update * update * non_blocking * improvements * update * norm.* -> norm * apply suggestions from review * add example * update hook implementation to the latest changes from pyramid attention broadcast * deinitialize should raise an error * update doc page * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * update docs * update * refactor * fix _always_upcast_modules for asym ae and vq_model * fix lumina embedding forward to not depend on weight dtype * refactor tests * add simple lora inference tests * _always_upcast_modules -> _precision_sensitive_module_patterns * remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case * check layer dtypes in lora test * fix UNet1DModelTests::test_layerwise_upcasting_inference * _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback * skip test in NCSNppModelTests * skip tests for AutoencoderTinyTests * skip tests for AutoencoderOobleckTests * skip tests for UNet1DModelTests - unsupported pytorch operations * layerwise_upcasting -> layerwise_casting * skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support * add layerwise fp8 pipeline test * use xfail * Apply suggestions from code review Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass) * add note about memory consumption on tesla CI runner for failing test --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 16 Jan, 2025 1 commit
-
-
C authored
* add para_attn_flux.md and para_attn_hunyuan_video.md * add enable_sequential_cpu_offload in para_attn_hunyuan_video.md * add comment * refactor * fix * fix * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix * update links * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/para_attn.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 25 Oct, 2024 1 commit
-
-
Jingya HUANG authored
* start draft * add doc * Update docs/source/en/optimization/neuron.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/neuron.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/neuron.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/neuron.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/neuron.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/neuron.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/optimization/neuron.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * brief intro of ON * Update docs/source/en/optimization/neuron.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 12 Oct, 2024 1 commit
-
-
Jinzhe Pan authored
* docs: fix xDiT doc image damage * doc: move xdit images to hf dataset --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
- 23 Sep, 2024 1 commit
-
-
LukeLin authored
* Fix bug * import imageio
-
- 16 Sep, 2024 1 commit
-
-
suzukimain authored
* [docs] Replace runwayml/stable-diffusion-v1-5 with Lykon/dreamshaper-8 Updated documentation as runwayml/stable-diffusion-v1-5 has been removed from Huggingface. * Update docs/source/en/using-diffusers/inpaint.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Replace with stable-diffusion-v1-5/stable-diffusion-v1-5 * Update inpaint.md --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 09 Sep, 2024 1 commit
-
-
Jinzhe Pan authored
* docs: add xDiT to optimization methods * fix: picture layout problem * docs: add more introduction about xdit & apply suggestions * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-