Commits · 48eae6f4204dbdca26e6c1f0c8dc64caa0e48f08 · renzhc / diffusers_dcu

11 Jun, 2025 3 commits

[tests] tests for compilation + quantization (bnb) (#11672) · b6f79330

Sayak Paul authored Jun 11, 2025



* start adding compilation tests for quantization.

* fixes

* make common utility.

* modularize.

* add group offloading+compile

* xfail

* update

* Update tests/quantization/test_torch_compile_utils.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fixes

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>


enable torchao test cases on XPU and switch to device agnostic APIs for test cases (#11654) · 33e636ce

Yao Matrix authored Jun 11, 2025



* enable torchao cases on XPU
Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* device agnostic APIs
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* more
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enable test_torch_compile_recompilation_and_graph_break on XPU
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* resolve comments
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------
Signed-off-by: Matrix YAO <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>


[LoRA] support Flux Control LoRA with bnb 8bit. (#11655) · 8e88495d
Sayak Paul authored Jun 11, 2025
```
support Flux Control LoRA with bnb 8bit.
```

06 Jun, 2025 1 commit

use deterministic to get stable result (#11663) · 0f91f2f6

jiqing-feng authored Jun 06, 2025



* use deterministic to get stable result
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add deterministic for int8 test
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>


02 Jun, 2025 1 commit
- [chore] misc changes in the bnb tests for consistency. (#11355) · d4dc4d76
  Sayak Paul authored Jun 02, 2025
```
misc changes in the bnb tests for consistency.
```
30 May, 2025 1 commit

Fix typos in strings and comments (#11476) · 8183d0f1

co63oc authored May 30, 2025



* Fix typos in strings and comments
Signed-off-by: co63oc <co63oc@users.noreply.github.com>

* Update src/diffusers/hooks/hooks.py
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update src/diffusers/hooks/hooks.py
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update layerwise_casting.py

* Apply style fixes

* update

---------
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>


15 May, 2025 1 commit
- [Single File] GGUF/Single File Support for HiDream (#11550) · 4267d8f4
  Dhruv Nair authored May 15, 2025
```
* update

* update

* update

* update

* update

* update

* update
```
09 May, 2025 2 commits

enable 7 cases on XPU (#11503) · 2d380895

Yao Matrix authored May 09, 2025



* enable 7 cases on XPU
Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* calibrate A100 expectations
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>


feat: pipeline-level quantization config (#11130) · 599c8871

Sayak Paul authored May 09, 2025



* feat: pipeline-level quant config.
Co-authored-by: SunMarc <marc.sun@hotmail.fr>

condition better.

support mapping.

improvements.

[Quantization] Add Quanto backend (#10756)

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/quantization/quanto.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update src/diffusers/quantizers/quanto/utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

[Single File] Add single file loading for SANA Transformer (#10947)

* added support for from_single_file

* added diffusers mapping script

* added testcase

* bug fix

* updated tests

* corrected code quality

* corrected code quality

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

[LoRA] Improve warning messages when LoRA loading becomes a no-op (#10187)

* updates

* updates

* updates

* updates

* notebooks revert

* fix-copies.

* seeing

* fix

* revert

* fixes

* fixes

* fixes

* remove print

* fix

* conflicts ii.

* updates

* fixes

* better filtering of prefix.

---------
Co-authored-by: hlky <hlky@hlky.ac>

[LoRA] CogView4 (#10981)

* update

* make fix-copies

* update

[Tests] improve quantization tests by additionally measuring the inference memory savings (#11021)

* memory usage tests

* fixes

* gguf

[`Research Project`] Add AnyText: Multilingual Visual Text Generation And Editing (#8998)

* Add initial template

* Second template

* feat: Add TextEmbeddingModule to AnyTextPipeline

* feat: Add AuxiliaryLatentModule template to AnyTextPipeline

* Add bert tokenizer from the anytext repo for now

* feat: Update AnyTextPipeline's modify_prompt method

This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe.

* Fill in the `forward` pass of `AuxiliaryLatentModule`

* `make style && make quality`

* `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library`

* Update error handling to raise and logging

* Add `create_glyph_lines` function into `TextEmbeddingModule`

* make style

* Up

* Up

* Up

* Up

* Remove several comments

* refactor: Remove ControlNetConditioningEmbedding and update code accordingly

* Up

* Up

* up

* refactor: Update AnyTextPipeline to include new optional parameters

* up

* feat: Add OCR model and its components

* chore: Update `TextEmbeddingModule` to include OCR model components and dependencies

* chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task

* `make style`

* refactor: Update `AnyTextPipeline`'s docstring

* Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once

* simplify

* `make style`

* Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function

* Simplify for now

* `make style`

* Up

* feat: Add scripts to convert AnyText controlnet to diffusers

* `make style`

* Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule`

* make style

* Up

* Simplify

* Up

* feat: Add safetensors module for loading model file

* Fix device issues

* Up

* Up

* refactor: Simplify

* refactor: Simplify code for loading models and handling data types

* `make style`

* refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule

* refactor: Update dtype in embedding_manager.py to match proj.weight

* Up

* Add attribution and adaptation information to pipeline_anytext.py

* Update usage example

* Will refactor `controlnet_cond_embedding` initialization

* Add `AnyTextControlNetConditioningEmbedding` template

* Refactor organization

* style

* style

* Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding`

* Follow one-file policy

* style

* [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel

* [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py

* [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py

* Refactor AnyTextControlNet to use configurable conditioning embedding channels

* Complete control net conditioning embedding in AnyTextControlNetModel

* up

* [FIX] Ensure embeddings use correct device in AnyTextControlNetModel

* up

* up

* style

* [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline

* [UPDATE] Update example code in anytext.py to use correct font file and improve clarity

* down

* [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing

* update pillow

* [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity

* [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file

* [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency

* 🆙



* style

* [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py

* style

* Update examples/research_projects/anytext/README.md
Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Remove commented-out image preparation code in AnyTextPipeline

* Remove unnecessary blank line in README.md

[Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6 (#11018)

* update

* update

* update

* update

* update

* update

* update

* update

* update

fix: mixture tiling sdxl pipeline - adjust generating time_ids & embeddings (#11012)

small fix on generating time_ids & embeddings

[LoRA] support wan i2v loras from the world. (#11025)

* support wan i2v loras from the world.

* remove copied from.

* updates

* add lora.

Fix SD3 IPAdapter feature extractor (#11027)

chore: fix help messages in advanced diffusion examples (#10923)

Fix missing **kwargs in lora_pipeline.py (#11011)

* Update lora_pipeline.py

* Apply style fixes

* fix-copies

---------
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Fix for multi-GPU WAN inference (#10997)

Ensure that hidden_state and shift/scale are on the same device when running with multiple GPUs

Co-authored-by: Jimmy <39@🇺🇸.com>

[Refactor] Clean up import utils boilerplate (#11026)

* update

* update

* update

Use `output_size` in `repeat_interleave` (#11030)

[hybrid inference 🍯🐝] Add VAE encode (#11017)

* [hybrid inference 🍯🐝] Add VAE encode

* _toctree: add vae encode

* Add endpoints, tests

* vae_encode docs

* vae encode benchmarks

* api reference

* changelog

* Update docs/source/en/hybrid_inference/overview.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Wan Pipeline scaling fix, type hint warning, multi generator fix (#11007)

* Wan Pipeline scaling fix, type hint warning, multi generator fix

* Apply suggestions from code review

[LoRA] change to warning from info when notifying the users about a LoRA no-op (#11044)

* move to warning.

* test related changes.

Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline (#10827)

* Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline

---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>

making ```formatted_images``` initialization compact (#10801)

compact writing
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

Fix aclnnRepeatInterleaveIntWithDim error on NPU for get_1d_rotary_pos_embed (#10820)

* get_1d_rotary_pos_embed support npu

* Update src/diffusers/models/embeddings.py

---------
Co-authored-by: Kai zheng <kaizheng@KaideMacBook-Pro.local>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

[Tests] restrict memory tests for quanto for certain schemes. (#11052)

* restrict memory tests for quanto for certain schemes.

* Apply suggestions from code review
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fixes

* style

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

[LoRA] feat: support non-diffusers wan t2v loras. (#11059)

feat: support non-diffusers wan t2v loras.

[examples/controlnet/train_controlnet_sd3.py] Fixes #11050 - Cast prompt_embeds and pooled_prompt_embeds to weight_dtype to prevent dtype mismatch (#11051)

Fix: dtype mismatch of prompt embeddings in sd3 controlnet training
Co-authored-by: Andreas Jörg <andreasjoerg@MacBook-Pro-von-Andreas-2.fritz.box>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

reverts accidental change that removes attn_mask in attn. Improves fl… (#11065)

reverts accidental change that removes attn_mask in attn. Improves flux ptxla by using flash block sizes. Moves encoding outside the for loop.
Co-authored-by: Juan Acevedo <jfacevedo@google.com>

Fix deterministic issue when getting pipeline dtype and device (#10696)
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

[Tests] add requires peft decorator. (#11037)

* add requires peft decorator.

* install peft conditionally.

* conditional deps.
Co-authored-by: DN6 <dhruv.nair@gmail.com>

---------
Co-authored-by: DN6 <dhruv.nair@gmail.com>

CogView4 Control Block (#10809)

* cogview4 control training

---------
Co-authored-by: OleehyO <leehy0357@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>

[CI] pin transformers version for benchmarking. (#11067)

pin transformers version for benchmarking.

updates

Fix Wan I2V Quality (#11087)

* fix_wan_i2v_quality

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update pipeline_wan_i2v.py

---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>

LTX 0.9.5 (#10968)

* update

---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>

make PR GPU tests conditioned on styling. (#11099)

Group offloading improvements (#11094)

update

Fix pipeline_flux_controlnet.py (#11095)

* Fix pipeline_flux_controlnet.py

* Fix style

update readme instructions. (#11096)
Co-authored-by: Juan Acevedo <jfacevedo@google.com>

Resolve stride mismatch in UNet's ResNet to support Torch DDP (#11098)

Modify UNet's ResNet implementation to resolve stride mismatch in Torch's DDP

Fix Group offloading behaviour when using streams (#11097)

* update

* update

Quality options in `export_to_video` (#11090)

* Quality options in `export_to_video`

* make style

improve more.

add placeholders for docstrings.

formatting.

smol fix.

solidify validation and annotation

* Revert "feat: pipeline-level quant config."

This reverts commit 316ff46b7648bfa24525ac02c284afcf440404aa.

* feat: implement pipeline-level quantization config
Co-authored-by: SunMarc <marc@huggingface.co>

* update

* fixes

* fix validation.

* add tests and other improvements.

* add tests

* import quality

* remove prints.

* add docs.

* fixes to docs.

* doc fixes.

* doc fixes.

* add validation to the input quantization_config.

* clarify recommendations.

* docs

* add to ci.

* todo.

---------
Co-authored-by: SunMarc <marc@huggingface.co>


28 Apr, 2025 2 commits

enable 28 GGUF test cases on XPU (#11404) · 7567adfc

Yao Matrix authored Apr 29, 2025



* enable gguf test cases on XPU
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* make SD35LargeGGUFSingleFileTests::test_pipeline_inference pass
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* make FluxControlLoRAGGUFTests::test_lora_loading pass
Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* polish code
Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* Apply style fixes

---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>


enable group_offload cases and quanto cases on XPU (#11405) · 9ce89e2e

Yao Matrix authored Apr 28, 2025



* enable group_offload cases and quanto cases on XPU
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* use backend APIs
Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style
Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: Yao Matrix <matrix.yao@intel.com>


17 Apr, 2025 1 commit

enable 2 test cases on XPU (#11332) · eef3d659

Yao Matrix authored Apr 18, 2025



* enable 2 test cases on XPU
Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Apply style fixes

---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>


08 Apr, 2025 2 commits

[bitsandbytes] improve replacement warnings for bnb (#11132) · 1a048124
Sayak Paul authored Apr 08, 2025
```
* improve replacement warnings for bnb

* updates to docs.
```

Flux quantized with lora (#10990) · 5d49b3e8

hlky authored Apr 08, 2025



* Flux quantized with lora

* fix

* changes

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Apply style fixes

* enable model cpu offload()

* Update src/diffusers/loaders/lora_pipeline.py
Co-authored-by: hlky <hlky@hlky.ac>

* update

* Apply suggestions from code review

* update

* add peft as an additional dependency for gguf

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>


07 Apr, 2025 1 commit

enable 1 case on XPU (#11219) · 506f39af

Yao Matrix authored Apr 07, 2025



enable case on XPU: 1. tests/quantization/bnb/test_mixed_int8.py::BnB8bitTrainingTests::test_training
Signed-off-by: YAO Matrix <matrix.yao@intel.com>


02 Apr, 2025 1 commit
- fix autocast (#11190) · 4d5a96e4
  jiqing-feng authored Apr 02, 2025
```
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
```
01 Apr, 2025 1 commit
- [tests] no hard-coded cuda (#11186) · 5a6edac0
  Fanli Lin authored Apr 01, 2025
```
no cuda only
```
26 Mar, 2025 1 commit
- [Quantization] dtype fix for GGUF + fix BnB tests (#11159) · 7dc52ea7
  Dhruv Nair authored Mar 26, 2025
```
* update

* update

* update

* update
```
19 Mar, 2025 1 commit

[tests] enable bnb tests on xpu (#11001) · 56f74005

Fanli Lin authored Mar 20, 2025

* enable bnb on xpu

* add 2 more cases

* add missing change

* add missing change

* add one more


15 Mar, 2025 1 commit

[Tests] add requires peft decorator. (#11037) · cc19726f

Sayak Paul authored Mar 15, 2025



* add requires peft decorator.

* install peft conditionally.

* conditional deps.
Co-authored-by: DN6 <dhruv.nair@gmail.com>

---------
Co-authored-by: DN6 <dhruv.nair@gmail.com>


14 Mar, 2025 1 commit

[Tests] restrict memory tests for quanto for certain schemes. (#11052) · 2f0f281b

Sayak Paul authored Mar 14, 2025



* restrict memory tests for quanto for certain schemes.

* Apply suggestions from code review
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fixes

* style

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>


10 Mar, 2025 2 commits

[Tests] improve quantization tests by additionally measuring the inference memory savings (#11021) · e7e6d852
Sayak Paul authored Mar 10, 2025
```
* memory usage tests

* fixes

* gguf
```

[Quantization] Add Quanto backend (#10756) · f5edaa78

Dhruv Nair authored Mar 10, 2025



* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/quantization/quanto.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update src/diffusers/quantizers/quanto/utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>


04 Mar, 2025 1 commit

[Quantization] support pass MappingType for TorchAoConfig (#10927) · 11d8e3ce

a120092009 authored Mar 04, 2025



* [Quantization] support pass MappingType for TorchAoConfig

* Apply style fixes

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>


19 Feb, 2025 1 commit

[FEAT] Model loading refactor (#10604) · f5929e03

Marc Sun authored Feb 19, 2025



* first draft model loading refactor

* revert name change

* fix bnb

* revert name

* fix dduf

* fix hunyuan

* style

* Update src/diffusers/models/model_loading_utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* suggestions from reviews

* Update src/diffusers/models/modeling_utils.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove safetensors check

* fix default value

* more fix from suggestions

* revert logic for single file

* style

* typing + fix couple of issues

* improve speed

* Update src/diffusers/models/modeling_utils.py
Co-authored-by: Aryan <aryan@huggingface.co>

* fp8 dtype

* add tests

* rename resolved_archive_file to resolved_model_file

* format

* map_location default cpu

* add utility function

* switch to smaller model + test inference

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* rm comment

* add log

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add decorator

* cosine sim instead

* fix use_keep_in_fp32_modules

* comm

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>


16 Jan, 2025 1 commit
- Move buffers to device (#10523) · 0b065c09
  hlky authored Jan 16, 2025
```
* Move buffers to device

* add test

* named_buffers
```
15 Jan, 2025 2 commits

[Tests] add: test to check 8bit bnb quantized models work with lora loading. (#10576) · bba59fb8

Sayak Paul authored Jan 15, 2025



* add: test to check 8bit bnb quantized models work with lora loading.

* Update tests/quantization/bnb/test_mixed_int8.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>


[LoRA] feat: support loading loras into 4bit quantized Flux models. (#10578) · 2432f80c
Sayak Paul authored Jan 15, 2025
```
* feat: support loading loras into 4bit quantized models.

* updates

* update

* remove weight check.
```

14 Jan, 2025 1 commit
- Test sequential cpu offload for torchao quantization (#10506) · aa79d7da
  Aryan authored Jan 14, 2025
```
test sequential cpu offload
```
10 Jan, 2025 1 commit
- [CI] Match remaining assertions from big runner (#10521) · 9f06a0d1
  Sayak Paul authored Jan 10, 2025
```
* print

* remove print.

* print

* update slice.

* empty
```
08 Jan, 2025 1 commit

Add AuraFlow GGUF support (#10463) · cb342b74

AstraliteHeart authored Jan 07, 2025

* Add support for loading AuraFlow models from GGUF

https://huggingface.co/city96/AuraFlow-v0.3-gguf



* Update AuraFlow documentation for GGUF, add GGUF tests and model detection.

* Address code review comments.

* Remove unused config.

---------
Co-authored-by: hlky <hlky@hlky.ac>


25 Dec, 2024 1 commit

Aryan authored Dec 25, 2024

* Revert "Add support for sharded models when TorchAO quantization is enabled (#10256)"

This reverts commit 41ba8c0b.

* update tests

* update

* update

* update

* update device map tests

* apply review suggestions

* update

* make style

* fix

* update docs

* update tests

* update workflow

* update

* improve tests

* allclose tolerance

* Update src/diffusers/models/modeling_utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update tests/quantization/torchao/test_torchao.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* improve tests

* fix

* update correct slices

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

cd991d1e

23 Dec, 2024 2 commits
- [tests] Refactor TorchAO serialization fast tests (#10271) · 02c777c0
  Aryan authored Dec 23, 2024
```
refactor
```
- Bump minimum TorchAO version to 0.7.0 (#10293) · ffc0eaab
  Aryan authored Dec 23, 2024
```
* bump min torchao version to 0.7.0

* update
```
20 Dec, 2024 1 commit
- Add support for sharded models when TorchAO quantization is enabled (#10256) · 41ba8c0b
  Aryan authored Dec 20, 2024
```
* add sharded + device_map check
```
17 Dec, 2024 2 commits

[tests] Remove/rename unsupported quantization torchao type (#10263) · 1524781b
Aryan authored Dec 17, 2024
```
update
```

[Single File] Add GGUF support (#9964) · e24941b2

Dhruv Nair authored Dec 17, 2024



* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update src/diffusers/quantizers/gguf/utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/quantization/gguf.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update

* update

* update

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


16 Dec, 2024 1 commit

[core] TorchAO Quantizer (#10009) · 9f00c617

Aryan authored Dec 17, 2024



* torchao quantizer


---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


04 Dec, 2024 1 commit

[bitsandbytes] allow directly CUDA placements of pipelines loaded with bnb components (#9840) · e8da75df

Sayak Paul authored Dec 04, 2024



* allow device placement when using bnb quantization.

* warning.

* tests

* fixes

* docs.

* require accelerate version.

* remove print.

* revert to()

* tests

* fixes

* fix: missing AutoencoderKL lora adapter (#9807)

* fix: missing AutoencoderKL lora adapter

* fix

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* fixes

* fix condition test

* updates

* updates

* remove is_offloaded.

* fixes

* better

* empty

---------
Co-authored-by: Emmanuel Benazera <emmanuel.benazera@jolibrain.com>


02 Dec, 2024 1 commit

[CI] Add quantization (#9832) · 827b6c25

Sayak Paul authored Dec 02, 2024

* add quantization to nightly CI.

* prep.

* fix lib name.

* remove deps that are not needed.

* fix slice.
