- 11 Jun, 2025 3 commits
-
-
Sayak Paul authored
* start adding compilation tests for quantization. * fixes * make common utility. * modularize. * add group offloading+compile * xfail * update * Update tests/quantization/test_torch_compile_utils.py Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * fixes --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
Yao Matrix authored
* enable torchao cases on XPU Signed-off-by:
Matrix YAO <matrix.yao@intel.com> * device agnostic APIs Signed-off-by:
YAO Matrix <matrix.yao@intel.com> * more Signed-off-by:
YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by:
YAO Matrix <matrix.yao@intel.com> * enable test_torch_compile_recompilation_and_graph_break on XPU Signed-off-by:
YAO Matrix <matrix.yao@intel.com> * resolve comments Signed-off-by:
YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by:
Matrix YAO <matrix.yao@intel.com> Signed-off-by:
YAO Matrix <matrix.yao@intel.com>
-
Sayak Paul authored
support Flux Control LoRA with bnb 8bit.
-
- 06 Jun, 2025 1 commit
-
-
jiqing-feng authored
* use deterministic to get stable result Signed-off-by:
jiqing-feng <jiqing.feng@intel.com> * add deterministic for int8 test Signed-off-by:
jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by:
jiqing-feng <jiqing.feng@intel.com>
-
- 02 Jun, 2025 1 commit
-
-
Sayak Paul authored
misc changes in the bnb tests for consistency.
-
- 30 May, 2025 1 commit
-
-
co63oc authored
* Fix typos in strings and comments Signed-off-by:
co63oc <co63oc@users.noreply.github.com> * Update src/diffusers/hooks/hooks.py Co-authored-by:
Aryan <contact.aryanvs@gmail.com> * Update src/diffusers/hooks/hooks.py Co-authored-by:
Aryan <contact.aryanvs@gmail.com> * Update layerwise_casting.py * Apply style fixes * update --------- Signed-off-by:
co63oc <co63oc@users.noreply.github.com> Co-authored-by:
Aryan <contact.aryanvs@gmail.com> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 15 May, 2025 1 commit
-
-
Dhruv Nair authored
* update * update * update * update * update * update * update
-
- 09 May, 2025 2 commits
-
-
Yao Matrix authored
* enable 7 cases on XPU Signed-off-by:
Yao Matrix <matrix.yao@intel.com> * calibrate A100 expectations Signed-off-by:
YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by:
Yao Matrix <matrix.yao@intel.com> Signed-off-by:
YAO Matrix <matrix.yao@intel.com>
-
Sayak Paul authored
* feat: pipeline-level quant config. Co-authored-by:
SunMarc <marc.sun@hotmail.fr> condition better. support mapping. improvements. [Quantization] Add Quanto backend (#10756) * update * update * update * update * update * update * update * update * update * update * update * update * Update docs/source/en/quantization/quanto.md Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * Update src/diffusers/quantizers/quanto/utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * update * update --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> [Single File] Add single file loading for SANA Transformer (#10947) * added support for from_single_file * added diffusers mapping script * added testcase * bug fix * updated tests * corrected code quality * corrected code quality --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> [LoRA] Improve warning messages when LoRA loading becomes a no-op (#10187) * updates * updates * updates * updates * notebooks revert * fix-copies. * seeing * fix * revert * fixes * fixes * fixes * remove print * fix * conflicts ii. * updates * fixes * better filtering of prefix. --------- Co-authored-by:
hlky <hlky@hlky.ac> [LoRA] CogView4 (#10981) * update * make fix-copies * update [Tests] improve quantization tests by additionally measuring the inference memory savings (#11021) * memory usage tests * fixes * gguf [`Research Project`] Add AnyText: Multilingual Visual Text Generation And Editing (#8998) * Add initial template * Second template * feat: Add TextEmbeddingModule to AnyTextPipeline * feat: Add AuxiliaryLatentModule template to AnyTextPipeline * Add bert tokenizer from the anytext repo for now * feat: Update AnyTextPipeline's modify_prompt method This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe. * Fill in the `forward` pass of `AuxiliaryLatentModule` * `make style && make quality` * `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library` * Update error handling to raise and logging * Add `create_glyph_lines` function into `TextEmbeddingModule` * make style * Up * Up * Up * Up * Remove several comments * refactor: Remove ControlNetConditioningEmbedding and update code accordingly * Up * Up * up * refactor: Update AnyTextPipeline to include new optional parameters * up * feat: Add OCR model and its components * chore: Update `TextEmbeddingModule` to include OCR model components and dependencies * chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task * `make style` * refactor: Update `AnyTextPipeline`'s docstring * Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once * simplify * `make style` * Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function * Simplify for now * `make style` * Up * feat: Add scripts to convert AnyText controlnet to diffusers * `make style` * Fix: 
Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule` * make style * Up * Simplify * Up * feat: Add safetensors module for loading model file * Fix device issues * Up * Up * refactor: Simplify * refactor: Simplify code for loading models and handling data types * `make style` * refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule * refactor: Update dtype in embedding_manager.py to match proj.weight * Up * Add attribution and adaptation information to pipeline_anytext.py * Update usage example * Will refactor `controlnet_cond_embedding` initialization * Add `AnyTextControlNetConditioningEmbedding` template * Refactor organization * style * style * Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding` * Follow one-file policy * style * [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel * [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py * [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py * Refactor AnyTextControlNet to use configurable conditioning embedding channels * Complete control net conditioning embedding in AnyTextControlNetModel * up * [FIX] Ensure embeddings use correct device in AnyTextControlNetModel * up * up * style * [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline * [UPDATE] Update example code in anytext.py to use correct font file and improve clarity * down * [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing * update pillow * [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity * [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file * [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency *
🆙 * style * [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py * style * Update examples/research_projects/anytext/README.md Co-authored-by: Aryan <contact.aryanvs@gmail.com> * Remove commented-out image preparation code in AnyTextPipeline * Remove unnecessary blank line in README.md [Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6 (#11018) * update * update * update * update * update * update * update * update * update fix: mixture tiling sdxl pipeline - adjust generating time_ids & embeddings (#11012) small fix on generating time_ids & embeddings [LoRA] support wan i2v loras from the world. (#11025) * support wan i2v loras from the world. * remove copied from. * updates * add lora. Fix SD3 IPAdapter feature extractor (#11027) chore: fix help messages in advanced diffusion examples (#10923) Fix missing **kwargs in lora_pipeline.py (#11011) * Update lora_pipeline.py * Apply style fixes * fix-copies --------- Co-authored-by:
hlky <hlky@hlky.ac> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com> Fix for multi-GPU WAN inference (#10997) Ensure that hidden_state and shift/scale are on the same device when running with multiple GPUs Co-authored-by: Jimmy <39@
🇺🇸 .com> [Refactor] Clean up import utils boilerplate (#11026) * update * update * update Use `output_size` in `repeat_interleave` (#11030) [hybrid inference🍯 🐝 ] Add VAE encode (#11017) * [hybrid inference🍯 🐝 ] Add VAE encode * _toctree: add vae encode * Add endpoints, tests * vae_encode docs * vae encode benchmarks * api reference * changelog * Update docs/source/en/hybrid_inference/overview.md Co-authored-by:Sayak Paul <spsayakpaul@gmail.com> * update --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Wan Pipeline scaling fix, type hint warning, multi generator fix (#11007) * Wan Pipeline scaling fix, type hint warning, multi generator fix * Apply suggestions from code review [LoRA] change to warning from info when notifying the users about a LoRA no-op (#11044) * move to warning. * test related changes. Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline (#10827) * Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline --------- Co-authored-by:
YiYi Xu <yixu310@gmail.com> making ```formatted_images``` initialization compact (#10801) compact writing Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
YiYi Xu <yixu310@gmail.com> Fix aclnnRepeatInterleaveIntWithDim error on NPU for get_1d_rotary_pos_embed (#10820) * get_1d_rotary_pos_embed support npu * Update src/diffusers/models/embeddings.py --------- Co-authored-by:
Kai zheng <kaizheng@KaideMacBook-Pro.local> Co-authored-by:
hlky <hlky@hlky.ac> Co-authored-by:
YiYi Xu <yixu310@gmail.com> [Tests] restrict memory tests for quanto for certain schemes. (#11052) * restrict memory tests for quanto for certain schemes. * Apply suggestions from code review Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * fixes * style --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> [LoRA] feat: support non-diffusers wan t2v loras. (#11059) feat: support non-diffusers wan t2v loras. [examples/controlnet/train_controlnet_sd3.py] Fixes #11050 - Cast prompt_embeds and pooled_prompt_embeds to weight_dtype to prevent dtype mismatch (#11051) Fix: dtype mismatch of prompt embeddings in sd3 controlnet training Co-authored-by:
Andreas Jörg <andreasjoerg@MacBook-Pro-von-Andreas-2.fritz.box> Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> reverts accidental change that removes attn_mask in attn. Improves fl… (#11065) reverts accidental change that removes attn_mask in attn. Improves flux ptxla by using flash block sizes. Moves encoding outside the for loop. Co-authored-by:
Juan Acevedo <jfacevedo@google.com> Fix deterministic issue when getting pipeline dtype and device (#10696) Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> [Tests] add requires peft decorator. (#11037) * add requires peft decorator. * install peft conditionally. * conditional deps. Co-authored-by:
DN6 <dhruv.nair@gmail.com> --------- Co-authored-by:
DN6 <dhruv.nair@gmail.com> CogView4 Control Block (#10809) * cogview4 control training --------- Co-authored-by:
OleehyO <leehy0357@gmail.com> Co-authored-by:
yiyixuxu <yixu310@gmail.com> [CI] pin transformers version for benchmarking. (#11067) pin transformers version for benchmarking. updates Fix Wan I2V Quality (#11087) * fix_wan_i2v_quality * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py Co-authored-by:
YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py Co-authored-by:
YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py Co-authored-by:
YiYi Xu <yixu310@gmail.com> * Update pipeline_wan_i2v.py --------- Co-authored-by:
YiYi Xu <yixu310@gmail.com> Co-authored-by:
hlky <hlky@hlky.ac> LTX 0.9.5 (#10968) * update --------- Co-authored-by:
YiYi Xu <yixu310@gmail.com> Co-authored-by:
hlky <hlky@hlky.ac> make PR GPU tests conditioned on styling. (#11099) Group offloading improvements (#11094) update Fix pipeline_flux_controlnet.py (#11095) * Fix pipeline_flux_controlnet.py * Fix style update readme instructions. (#11096) Co-authored-by:
Juan Acevedo <jfacevedo@google.com> Resolve stride mismatch in UNet's ResNet to support Torch DDP (#11098) Modify UNet's ResNet implementation to resolve stride mismatch in Torch's DDP Fix Group offloading behaviour when using streams (#11097) * update * update Quality options in `export_to_video` (#11090) * Quality options in `export_to_video` * make style improve more. add placeholders for docstrings. formatting. smol fix. solidify validation and annotation * Revert "feat: pipeline-level quant config." This reverts commit 316ff46b7648bfa24525ac02c284afcf440404aa. * feat: implement pipeline-level quantization config Co-authored-by:
SunMarc <marc@huggingface.co> * update * fixes * fix validation. * add tests and other improvements. * add tests * import quality * remove prints. * add docs. * fixes to docs. * doc fixes. * doc fixes. * add validation to the input quantization_config. * clarify recommendations. * docs * add to ci. * todo. --------- Co-authored-by:
SunMarc <marc@huggingface.co>
-
- 28 Apr, 2025 2 commits
-
-
Yao Matrix authored
* enable gguf test cases on XPU Signed-off-by:
YAO Matrix <matrix.yao@intel.com> * make SD35LargeGGUFSingleFileTests::test_pipeline_inference pass Signed-off-by:
root <root@a4bf01945cfe.jf.intel.com> * make FluxControlLoRAGGUFTests::test_lora_loading pass Signed-off-by:
Yao Matrix <matrix.yao@intel.com> * polish code Signed-off-by:
Yao Matrix <matrix.yao@intel.com> * Apply style fixes --------- Signed-off-by:
YAO Matrix <matrix.yao@intel.com> Signed-off-by:
root <root@a4bf01945cfe.jf.intel.com> Signed-off-by:
Yao Matrix <matrix.yao@intel.com> Co-authored-by:
root <root@a4bf01945cfe.jf.intel.com> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
Yao Matrix authored
* enable group_offload cases and quanto cases on XPU Signed-off-by:
YAO Matrix <matrix.yao@intel.com> * use backend APIs Signed-off-by:
Yao Matrix <matrix.yao@intel.com> * fix style Signed-off-by:
Yao Matrix <matrix.yao@intel.com> --------- Signed-off-by:
YAO Matrix <matrix.yao@intel.com> Signed-off-by:
Yao Matrix <matrix.yao@intel.com>
-
- 17 Apr, 2025 1 commit
-
-
Yao Matrix authored
* enable 2 test cases on XPU Signed-off-by:
YAO Matrix <matrix.yao@intel.com> * Apply style fixes --------- Signed-off-by:
YAO Matrix <matrix.yao@intel.com> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
- 08 Apr, 2025 2 commits
-
-
Sayak Paul authored
* improve replacement warnings for bnb * updates to docs.
-
hlky authored
* Flux quantized with lora * fix * changes * Apply suggestions from code review Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * Apply style fixes * enable model cpu offload() * Update src/diffusers/loaders/lora_pipeline.py Co-authored-by:
hlky <hlky@hlky.ac> * update * Apply suggestions from code review * update * add peft as an additional dependency for gguf --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
- 07 Apr, 2025 1 commit
-
-
Yao Matrix authored
enable case on XPU: 1. tests/quantization/bnb/test_mixed_int8.py::BnB8bitTrainingTests::test_training Signed-off-by: YAO Matrix <matrix.yao@intel.com>
-
- 02 Apr, 2025 1 commit
-
-
jiqing-feng authored
Signed-off-by:jiqing-feng <jiqing.feng@intel.com>
-
- 01 Apr, 2025 1 commit
-
-
Fanli Lin authored
no cuda only
-
- 26 Mar, 2025 1 commit
-
-
Dhruv Nair authored
* update * update * update * update
-
- 19 Mar, 2025 1 commit
-
-
Fanli Lin authored
* enable bnb on xpu * add 2 more cases * add missing change * add missing change * add one more
-
- 15 Mar, 2025 1 commit
-
-
Sayak Paul authored
* add requires peft decorator. * install peft conditionally. * conditional deps. Co-authored-by:
DN6 <dhruv.nair@gmail.com> --------- Co-authored-by:
DN6 <dhruv.nair@gmail.com>
-
- 14 Mar, 2025 1 commit
-
-
Sayak Paul authored
* restrict memory tests for quanto for certain schemes. * Apply suggestions from code review Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * fixes * style --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
- 10 Mar, 2025 2 commits
-
-
Sayak Paul authored
* memory usage tests * fixes * gguf
-
Dhruv Nair authored
* update * update * update * update * update * update * update * update * update * update * update * update * Update docs/source/en/quantization/quanto.md Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * Update src/diffusers/quantizers/quanto/utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * update * update --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com>
-
- 04 Mar, 2025 1 commit
-
-
a120092009 authored
* [Quantization] support pass MappingType for TorchAoConfig * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 19 Feb, 2025 1 commit
-
-
Marc Sun authored
* first draft model loading refactor * revert name change * fix bnb * revert name * fix dduf * fix huanyan * style * Update src/diffusers/models/model_loading_utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * suggestions from reviews * Update src/diffusers/models/modeling_utils.py Co-authored-by:
YiYi Xu <yixu310@gmail.com> * remove safetensors check * fix default value * more fix from suggestions * revert logic for single file * style * typing + fix couple of issues * improve speed * Update src/diffusers/models/modeling_utils.py Co-authored-by:
Aryan <aryan@huggingface.co> * fp8 dtype * add tests * rename resolved_archive_file to resolved_model_file * format * map_location default cpu * add utility function * switch to smaller model + test inference * Apply suggestions from code review Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * rm comment * add log * Apply suggestions from code review Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * add decorator * cosine sim instead * fix use_keep_in_fp32_modules * comm --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
YiYi Xu <yixu310@gmail.com> Co-authored-by:
Aryan <aryan@huggingface.co>
-
- 16 Jan, 2025 1 commit
-
-
hlky authored
* Move buffers to device * add test * named_buffers
-
- 15 Jan, 2025 2 commits
-
-
Sayak Paul authored
* add: test to check 8bit bnb quantized models work with lora loading. * Update tests/quantization/bnb/test_mixed_int8.py Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
Sayak Paul authored
* feat: support loading loras into 4bit quantized models. * updates * update * remove weight check.
-
- 14 Jan, 2025 1 commit
-
-
Aryan authored
test sequential cpu offload
-
- 10 Jan, 2025 1 commit
-
-
Sayak Paul authored
* print * remove print. * print * update slice. * empty
-
- 08 Jan, 2025 1 commit
-
-
AstraliteHeart authored
* Add support for loading AuraFlow models from GGUF https://huggingface.co/city96/AuraFlow-v0.3-gguf * Update AuraFlow documentation for GGUF, add GGUF tests and model detection. * Address code review comments. * Remove unused config. --------- Co-authored-by:
hlky <hlky@hlky.ac>
-
- 25 Dec, 2024 1 commit
-
-
Aryan authored
* Revert "Add support for sharded models when TorchAO quantization is enabled (#10256)" This reverts commit 41ba8c0b. * update tests * update * update * update * update device map tests * apply review suggestions * update * make style * fix * update docs * update tests * update workflow * update * improve tests * allclose tolerance * Update src/diffusers/models/modeling_utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * Update tests/quantization/torchao/test_torchao.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * improve tests * fix * update correct slices --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com>
-
- 23 Dec, 2024 2 commits
- 20 Dec, 2024 1 commit
-
-
Aryan authored
* add sharded + device_map check
-
- 17 Dec, 2024 2 commits
-
-
Aryan authored
update
-
Dhruv Nair authored
* update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * Update src/diffusers/quantizers/gguf/utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * update * update * update * update * update * update * update * update * update * update * Update docs/source/en/quantization/gguf.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * update * update * update * update --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 16 Dec, 2024 1 commit
-
-
Aryan authored
* torchao quantizer --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 04 Dec, 2024 1 commit
-
-
Sayak Paul authored
* allow device placement when using bnb quantization. * warning. * tests * fixes * docs. * require accelerate version. * remove print. * revert to() * tests * fixes * fix: missing AutoencoderKL lora adapter (#9807) * fix: missing AutoencoderKL lora adapter * fix --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * fixes * fix condition test * updates * updates * remove is_offloaded. * fixes * better * empty --------- Co-authored-by:
Emmanuel Benazera <emmanuel.benazera@jolibrain.com>
-
- 02 Dec, 2024 1 commit
-
-
Sayak Paul authored
* add quantization to nightly CI. * prep. * fix lib name. * remove deps that are not needed. * fix slice.
-