- 05 Dec, 2025 1 commit
swappy authored
* fix: group offloading to support standalone computational layers in block-level offloading
* test: for models with standalone and deeply nested layers in block-level offloading
* feat: support for block-level offloading in group offloading config
* fix: group offload block modules to AutoencoderKL and AutoencoderKLWan
* fix: update group offloading tests to use AutoencoderKL and adjust input dimensions
* refactor: streamline block offloading logic
* Apply style fixes
* update tests
* update
* fix for failing tests
* clean up
* revert to use skip_keys
* clean up

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
- 28 Aug, 2025 1 commit
Dhruv Nair authored
* update
* update
* update
* update
* update
* merge main
* Revert "merge main"

This reverts commit 65efbcead58644b31596ed2d714f7cee0e0238d3.
- 06 Aug, 2025 1 commit
Aryan authored
* update
* update
* refactor
* add test
* address review comment
* nit
- 18 Jun, 2025 1 commit
Sayak Paul authored
change to 2025 licensing for remaining
- 30 May, 2025 1 commit
Yao Matrix authored
* enable group_offloading and PipelineDeviceAndDtypeStabilityTests on XPU, all passed
  Signed-off-by: Matrix YAO <matrix.yao@intel.com>
* fix style
  Signed-off-by: Matrix YAO <matrix.yao@intel.com>
* fix
  Signed-off-by: Matrix YAO <matrix.yao@intel.com>

---------

Signed-off-by: Matrix YAO <matrix.yao@intel.com>
Co-authored-by: Aryan <aryan@huggingface.co>
- 23 Apr, 2025 1 commit
Aryan authored
* fix
* add tests
* add message check
- 14 Feb, 2025 1 commit
Aryan authored
* update
* fix
* non_blocking; handle parameters and buffers
* update
* Group offloading with cuda stream prefetching (#10516)
* cuda stream prefetch
* remove breakpoints
* update
* copy model hook implementation from pab
* update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite
* more workarounds to make it actually work
* cleanup
* rewrite
* update
* make sure to sync current stream before overwriting with pinned params; not doing so will lead to erroneous computations on the GPU and cause bad results
* better check
* update
* remove hook implementation to not deal with merge conflict
* re-add hook changes
* why use more memory when less memory do trick
* why still use slightly more memory when less memory do trick
* optimise
* add model tests
* add pipeline tests
* update docs
* add layernorm and groupnorm
* address review comments
* improve tests; add docs
* improve docs
* Apply suggestions from code review
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* apply suggestions from code review
* update tests
* apply suggestions from review
* enable_group_offloading -> enable_group_offload for naming consistency
* raise errors if multiple offloading strategies used; add relevant tests
* handle .to() when group offload applied
* refactor some repeated code
* remove unintentional change from merge conflict
* handle .cuda()

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
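The hook-based group offloading this commit series introduces can be sketched as follows. This is a simplified illustration under stated assumptions, not the diffusers implementation: `GroupOffloadHook` and `apply_block_level_offload` are hypothetical names, both devices are set to `"cpu"` so the demo runs anywhere, and the real code additionally uses non_blocking transfers, pinned memory, and CUDA stream prefetching as the messages above describe. The core idea is that blocks are partitioned into groups, each group is moved to the onload device just before its first block runs, and moved back to the offload device after its last block finishes.

```python
# Minimal sketch of block-level group offloading via forward hooks
# (hypothetical names; not the diffusers implementation).
import torch
import torch.nn as nn


class GroupOffloadHook:
    """Moves a group of modules onto/off a device around its forward pass."""

    def __init__(self, group, onload_device, offload_device, log):
        self.group = group
        self.onload_device = onload_device
        self.offload_device = offload_device
        self.log = log  # shared event log, used here to make the behavior visible

    def pre_forward(self, module, args):
        # Onload the whole group before the group's first block executes.
        for m in self.group:
            m.to(self.onload_device)
        self.log.append("onload")
        return args

    def post_forward(self, module, args, output):
        # Offload the whole group after the group's last block executes.
        for m in self.group:
            m.to(self.offload_device)
        self.log.append("offload")
        return output


def apply_block_level_offload(blocks, num_blocks_per_group, onload_device, offload_device):
    """Partition `blocks` into groups and attach onload/offload hooks.

    Returns the shared event log so callers can observe the transfer order.
    """
    log = []
    for i in range(0, len(blocks), num_blocks_per_group):
        group = blocks[i:i + num_blocks_per_group]
        hook = GroupOffloadHook(group, onload_device, offload_device, log)
        # First block in the group triggers onload; last triggers offload.
        group[0].register_forward_pre_hook(hook.pre_forward)
        group[-1].register_forward_hook(hook.post_forward)
    return log


# Demo: four blocks, grouped two at a time.
blocks = [nn.Linear(4, 4) for _ in range(4)]
log = apply_block_level_offload(blocks, num_blocks_per_group=2,
                                onload_device="cpu", offload_device="cpu")
x = torch.randn(1, 4)
for block in blocks:
    x = block(x)
```

With two blocks per group, the log alternates one onload and one offload per group, which is the memory-saving property the feature targets: at most one group of blocks is resident on the onload device at a time. The "standalone computational layers" fix in the newest entry above handles layers that do not belong to any block group and would otherwise be skipped by this partitioning.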