Commits · 9a147b82f72e5df4553cb0f845bb957be3aa6028 · renzhc / diffusers_dcu

14 Feb, 2025 1 commit

Module Group Offloading (#10503) · 9a147b82

Aryan authored Feb 14, 2025



* update

* fix

* non_blocking; handle parameters and buffers

* update

* Group offloading with cuda stream prefetching (#10516)

* cuda stream prefetch

* remove breakpoints

* update

* copy model hook implementation from pab

* update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite

* more workarounds to make it actually work

* cleanup

* rewrite

* update

* make sure to sync current stream before overwriting with pinned params

not doing so will lead to erroneous computations on the GPU and cause bad results

* better check

* update

* remove hook implementation to not deal with merge conflict

* re-add hook changes

* why use more memory when less memory do trick

* why still use slightly more memory when less memory do trick

* optimise

* add model tests

* add pipeline tests

* update docs

* add layernorm and groupnorm

* address review comments

* improve tests; add docs

* improve docs

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggestions from code review

* update tests

* apply suggestions from review

* enable_group_offloading -> enable_group_offload for naming consistency

* raise errors if multiple offloading strategies used; add relevant tests

* handle .to() when group offload applied

* refactor some repeated code

* remove unintentional change from merge conflict

* handle .cuda()

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

9a147b82

13 Feb, 2025 4 commits
- Refactor CogVideoX transformer forward (#10789) · ab428207
  Aryan authored Feb 14, 2025
```
update
```
  ab428207
- Update FlowMatch docstrings to mention correct output classes (#10788) · 8d081de8
  Aryan authored Feb 14, 2025
```
update
```
  8d081de8
- Disable PEFT input autocast when using fp8 layerwise casting (#10685) · a0c22997
  Aryan authored Feb 13, 2025
```
* disable peft input autocast

* use new peft method name; only disable peft input autocast if submodule layerwise casting active

* add test; reference PeftInputAutocastDisableHook in peft docs

* add load_lora_weights test

* casted -> cast

* Update tests/lora/utils.py
```
  a0c22997
- make tensors contiguous before passing to safetensors (#10761) · 97abdd22
  Fanli Lin authored Feb 13, 2025
```
fix contiguous bug
```
  97abdd22
12 Feb, 2025 5 commits

`MultiControlNetUnionModel` on SDXL (#10747) · 5105b5a8
Daniel Regado authored Feb 12, 2025
```
* SDXL with MultiControlNetUnionModel



---------
Co-authored-by: hlky <hlky@hlky.ac>
```
5105b5a8

Fix `use_lu_lambdas` and `use_karras_sigmas` with... · ca6330dc

hlky authored Feb 12, 2025

Fix `use_lu_lambdas` and `use_karras_sigmas` with `beta_schedule=squaredcos_cap_v2` in `DPMSolverMultistepScheduler` (#10740)

ca6330dc

[Single File] Add Single File support for Lumina Image 2.0 Transformer (#10781) · 28f48f40
Dhruv Nair authored Feb 12, 2025
```
* update

* update
```
28f48f40

Faster set_adapters (#10777) · 067eab1b

Thanh Le authored Feb 12, 2025



* Update peft_utils.py

* Update peft_utils.py

* Update peft_utils.py

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

067eab1b

Refactor OmniGen (#10771) · 57ac6738

Aryan authored Feb 12, 2025



* OmniGen model.py

* update OmniGenTransformerModel

* omnigen pipeline

* omnigen pipeline

* update omnigen_pipeline

* test case for omnigen

* update omnigenpipeline

* update docs

* update docs

* offload_transformer

* enable_transformer_block_cpu_offload

* update docs

* reformat

* reformat

* reformat

* update docs

* update docs

* make style

* make style

* Update docs/source/en/api/models/omnigen_transformer.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* revert changes to examples/

* update OmniGen2DModel

* make style

* update test cases

* Update docs/source/en/api/pipelines/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* typo

* Update src/diffusers/models/embeddings.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/attention.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/omnigen/test_pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/omnigen/test_pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* consistent attention processor

* updata

* update

* check_inputs

* make style

* update testpipeline

* update testpipeline

* refactor omnigen

* more updates

* apply review suggestion

---------
Co-authored-by: shitao <2906698981@qq.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>

57ac6738

11 Feb, 2025 3 commits

Add support for lumina2 (#10642) · 81440fd4

Le Zhuo authored Feb 12, 2025



* Add support for lumina2


---------
Co-authored-by: csuhan <hanjiaming@whu.edu.cn>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: hlky <hlky@hlky.ac>

81440fd4

Add OmniGen (#10148) · 798e1718

Shitao Xiao authored Feb 12, 2025



* OmniGen model.py

* update OmniGenTransformerModel

* omnigen pipeline

* omnigen pipeline

* update omnigen_pipeline

* test case for omnigen

* update omnigenpipeline

* update docs

* update docs

* offload_transformer

* enable_transformer_block_cpu_offload

* update docs

* reformat

* reformat

* reformat

* update docs

* update docs

* make style

* make style

* Update docs/source/en/api/models/omnigen_transformer.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* revert changes to examples/

* update OmniGen2DModel

* make style

* update test cases

* Update docs/source/en/api/pipelines/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* typo

* Update src/diffusers/models/embeddings.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/attention.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/omnigen/test_pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/omnigen/test_pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py
Co-authored-by: hlky <hlky@hlky.ac>

* consistent attention processor

* updata

* update

* check_inputs

* make style

* update testpipeline

* update testpipeline

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Aryan <aryan@huggingface.co>

798e1718

speedup hunyuan encoder causal mask generation (#10764) · 8ae8008b
Mathias Parger authored Feb 11, 2025
```
* speedup causal mask generation

* fixing hunyuan attn mask test case
```
8ae8008b

10 Feb, 2025 2 commits
- Add `Self` type hint to `ModelMixin`'s `from_pretrained` (#10742) · 7fb481f8
  hlky authored Feb 10, 2025
  
  7fb481f8
- [LoRA] fix peft state dict parsing (#10532) · 9f5ad1db
  Sayak Paul authored Feb 10, 2025
```
* fix peft state dict parsing

* updates
```
  9f5ad1db
07 Feb, 2025 1 commit
- EDMEulerScheduler accept sigmas, add final_sigmas_type (#10734) · 464374fb
  hlky authored Feb 07, 2025
  
  464374fb
06 Feb, 2025 1 commit
- Quantized Flux with IP-Adapter (#10728) · d43ce14e
  hlky authored Feb 06, 2025
  
  d43ce14e
05 Feb, 2025 1 commit
- add provider_options in from_pretrained (#10719) · 23bc56a0
  xieofxie authored Feb 06, 2025
```
Co-authored-by: hualxie <hualxie@microsoft.com>
```
  23bc56a0
04 Feb, 2025 2 commits

[Fix] Type Hint in from_pretrained() to Ensure Correct Type Inference (#10714) · 5b1dcd15

SahilCarterr authored Feb 05, 2025



* Update pipeline_utils.py

Added Self to the from_pretrained method so type inference correctly recognizes the pipeline class

* Use typing_extensions

---------
Co-authored-by: hlky <hlky@hlky.ac>
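
A minimal sketch of the idea behind this change: annotating a classmethod constructor with `Self` lets a type checker infer the subclass it was called on. `BasePipeline`, `MyPipeline`, and the `name` argument are hypothetical stand-ins, not the diffusers classes. The `Self` import is guarded by `TYPE_CHECKING`, which is an assumption made here so the snippet runs without `typing_extensions` installed.

```python
from __future__ import annotations  # annotations are stored as strings, not evaluated at runtime

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only the type checker needs this import (assumption: typing_extensions is available to it).
    from typing_extensions import Self


class BasePipeline:
    """Hypothetical stand-in for a pipeline base class."""

    def __init__(self, name: str) -> None:
        self.name = name

    @classmethod
    def from_pretrained(cls, name: str) -> Self:
        # With a Self return annotation, a call made on a subclass is
        # inferred as that subclass rather than as BasePipeline.
        return cls(name)


class MyPipeline(BasePipeline):
    """Hypothetical subclass with a method the base class lacks."""

    def run(self) -> str:
        return f"running {self.name}"


pipe = MyPipeline.from_pretrained("demo")
result = pipe.run()  # accepted by a type checker because pipe is inferred as MyPipeline
```

If the return type were annotated as the base class instead, a type checker would be expected to reject `pipe.run()`, since `run` exists only on the subclass.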

5b1dcd15

[bitsandbytes] Simplify bnb int8 dequant (#10401) · 5e8e6cb4

Sayak Paul authored Feb 04, 2025

* fix dequantization for latest bnb.

* smol fixes.

* fix type annotation

* update peft link

* updates

5e8e6cb4

01 Feb, 2025 1 commit

feat(training-utils): support device and dtype params in... · 9f28f1ab

Vedat Baday authored Feb 02, 2025


feat(training-utils): support device and dtype params in compute_density_for_timestep_sampling (#10699)

* feat(training-utils): support device and dtype params in compute_density_for_timestep_sampling

* chore: update type hint

* refactor: use union for type hint

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

9f28f1ab

31 Jan, 2025 1 commit

Fix enable memory efficient attention on ROCm (#10564) · 1ae9b059

Max Podkorytov authored Jan 31, 2025

* fix enable memory efficient attention on ROCm

while calling CK implementation

* Update attention_processor.py

refactor of picking a set element

1ae9b059

29 Jan, 2025 3 commits
- [FIX] check_inputs function in Auraflow Pipeline (#10678) · aad69ac2
  SahilCarterr authored Jan 30, 2025
```
fix_shape_error
```
  aad69ac2
- fix(hunyuan-video): typo in height and width input check (#10684) · ea76880b
  Vedat Baday authored Jan 30, 2025
  
  ea76880b
- support StableDiffusionAdapterPipeline.from_single_file (#10552) · 33f93615
  Teriks authored Jan 29, 2025
```
* support StableDiffusionAdapterPipeline.from_single_file

* make style

---------
Co-authored-by: Teriks <Teriks@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
```
  33f93615
28 Jan, 2025 3 commits

[Tests] conditionally check `fp8_e4m3_bf16_max_memory < fp8_e4m3_fp32_max_memory` (#10669) · 7b100ce5
Sayak Paul authored Jan 28, 2025
```
* conditionally check if compute capability is met.

* log info.

* fix condition.

* updates

* updates

* updates

* updates
```
7b100ce5

Refactor gradient checkpointing (#10611) · c4d4ac21

Aryan authored Jan 28, 2025

* update

* remove unused fn

* apply suggestions based on review

* update + cleanup 🧹

* more cleanup 🧹

* make fix-copies

* update test

c4d4ac21

[fix] refer use_framewise_encoding on AutoencoderKLHunyuanVideo._encode (#10600) · f295e2ee

Hanch Han authored Jan 28, 2025



* fix: refer to use_framewise_encoding on AutoencoderKLHunyuanVideo._encode

* fix: comment about tile_sample_min_num_frames

---------
Co-authored-by: Aryan <aryan@huggingface.co>

f295e2ee

27 Jan, 2025 6 commits

[core] Pyramid Attention Broadcast (#9562) · 658e24e8

Aryan authored Jan 28, 2025



* start pyramid attention broadcast

* add coauthor
Co-Authored-By: Xuanlei Zhao <43881818+oahzxl@users.noreply.github.com>

* update

* make style

* update

* make style

* add docs

* add tests

* update

* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Pyramid Attention Broadcast rewrite + introduce hooks (#9826)

* rewrite implementation with hooks

* make style

* update

* merge pyramid-attention-rewrite-2

* make style

* remove changes from latte transformer

* revert docs changes

* better debug message

* add todos for future

* update tests

* make style

* cleanup

* fix

* improve log message; fix latte test

* refactor

* update

* update

* update

* revert changes to tests

* update docs

* update tests

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update

* fix flux test

* reorder

* refactor

* make fix-copies

* update docs

* fixes

* more fixes

* make style

* update tests

* update code example

* make fix-copies

* refactor based on reviews

* use maybe_free_model_hooks

* CacheMixin

* make style

* update

* add current_timestep property; update docs

* make fix-copies

* update

* improve tests

* try circular import fix

* apply suggestions from review

* address review comments

* Apply suggestions from code review

* refactor hook implementation

* add test suite for hooks

* PAB Refactor (#10667)

* update

* update

* update

---------
Co-authored-by: DN6 <dhruv.nair@gmail.com>

* update

* fix remove hook behaviour

---------
Co-authored-by: Xuanlei Zhao <43881818+oahzxl@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: DN6 <dhruv.nair@gmail.com>

658e24e8

Revert RePaint scheduler 'fix' (#10644) · fb420664
Giuseppe Catalano authored Jan 27, 2025
```
Co-authored-by: Giuseppe Catalano <giuseppelorenzo.catalano@unito.it>
```
fb420664

SDXL ControlNet Union pipelines, make control_image argument immutible (#10663) · e89ab5bc

Teriks authored Jan 27, 2025



controlnet union XL, make control_image immutible

when this argument is passed a list, __call__
modifies its content, since it is pass by reference
the list passed by the caller gets its content
modified unexpectedly

make a copy at method intro so this does not happen
Co-authored-by: Teriks <Teriks@users.noreply.github.com>
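
The problem described above can be sketched in plain Python. This is an illustration under stated assumptions, not the pipeline's code: `call_mutating` and `call_copying` are hypothetical functions, and strings stand in for the control images.

```python
from typing import List, Union


def call_mutating(control_image: Union[str, List[str]]) -> List[str]:
    # Problematic pattern: when a list is passed, it is modified in place,
    # so the caller's list changes too.
    if not isinstance(control_image, list):
        control_image = [control_image]
    for i, image in enumerate(control_image):
        control_image[i] = f"preprocessed({image})"
    return control_image


def call_copying(control_image: Union[str, List[str]]) -> List[str]:
    # Defensive pattern: take a shallow copy on entry, then work on the copy.
    if isinstance(control_image, list):
        images = list(control_image)
    else:
        images = [control_image]
    for i, image in enumerate(images):
        images[i] = f"preprocessed({image})"
    return images


caller_list_a = ["depth", "canny"]
call_mutating(caller_list_a)  # caller_list_a now holds the preprocessed values

caller_list_b = ["depth", "canny"]
copied_result = call_copying(caller_list_b)  # caller_list_b is left unchanged
```

A shallow copy is enough in this sketch because only the list slots are reassigned; the elements themselves are never modified in place.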

e89ab5bc

fix check_inputs func in LuminaText2ImgPipeline (#10651) · 8ceec90d
victolee0 authored Jan 28, 2025

8ceec90d
Add provider_options to OnnxRuntimeModel (#10661) · 158c5c4d
hlky authored Jan 27, 2025

158c5c4d
ControlNet Union controlnet_conditioning_scale for multiple control inputs (#10666) · 18f7d1d9
hlky authored Jan 27, 2025

18f7d1d9

26 Jan, 2025 1 commit
- Add sigmoid scheduler in `scheduling_ddpm.py` docs (#10648) · 4f3ec536
  Jacob Helwig authored Jan 26, 2025
```
Sigmoid scheduler in scheduling_ddpm.py docs
```
  4f3ec536
24 Jan, 2025 1 commit

NPU Adaption for Sanna (#10409) · 07860f99

Leo Jiang authored Jan 24, 2025



* NPU Adaption for Sanna


---------
Co-authored-by: J石页 <jiangshuo9@h-partners.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

07860f99

23 Jan, 2025 2 commits
- width and height are mixed-up (#10629) · 9684c52a
  Raul Ciotescu authored Jan 23, 2025
```
vars mixed-up
```
  9684c52a
- add onnxruntime-migraphx as part of check for onnxruntime in import_utils.py (#10624) · 04d40920
  kahmed10 authored Jan 22, 2025
```
add onnxruntime-migraphx to import_utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
```
  04d40920
22 Jan, 2025 2 commits

Improve TorchAO error message (#10627) · ca60ad8e
Aryan authored Jan 22, 2025
```
improve error message
```
ca60ad8e

[core] Layerwise Upcasting (#10347) · beacaa55

Aryan authored Jan 22, 2025



* update

* update

* make style

* remove dynamo disable

* add coauthor
Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

* update

* update

* update

* update mixin

* add some basic tests

* update

* update

* non_blocking

* improvements

* update

* norm.* -> norm

* apply suggestions from review

* add example

* update hook implementation to the latest changes from pyramid attention broadcast

* deinitialize should raise an error

* update doc page

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* update

* refactor

* fix _always_upcast_modules for asym ae and vq_model

* fix lumina embedding forward to not depend on weight dtype

* refactor tests

* add simple lora inference tests

* _always_upcast_modules -> _precision_sensitive_module_patterns

* remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case

* check layer dtypes in lora test

* fix UNet1DModelTests::test_layerwise_upcasting_inference

* _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback

* skip test in NCSNppModelTests

* skip tests for AutoencoderTinyTests

* skip tests for AutoencoderOobleckTests

* skip tests for UNet1DModelTests - unsupported pytorch operations

* layerwise_upcasting -> layerwise_casting

* skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support

* add layerwise fp8 pipeline test

* use xfail

* Apply suggestions from code review
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass)

* add note about memory consumption on tesla CI runner for failing test

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

beacaa55