- 19 Jun, 2025 1 commit
-
-
Sayak Paul authored
* start implementing disk offloading in group. * delete diff file. * updates.patch * offload_to_disk_path * check if safetensors already exist. * add test and clarify. * updates * update todos. * update more docs. * update docs
-
- 11 Jun, 2025 1 commit
-
-
Sayak Paul authored
* add clarity in documentation for device_map * docs * fix how compiler tester mixins are used. * propagate * more * typo. * fix tests * fix order of decroators. * clarify more. * more test cases. * fix doc * fix device_map docstring in pipeline_utils. * more examples * more * update * remove code for stuff that is already supported. * fix stuff.
-
- 13 May, 2025 1 commit
-
-
johannaSommer authored
Co-authored-by:Dhruv Nair <dhruv.nair@gmail.com>
-
- 08 Apr, 2025 1 commit
-
-
Sayak Paul authored
* implement record_stream for better performance. * fix * style. * merge #11097 * Update src/diffusers/hooks/group_offloading.py Co-authored-by:
Aryan <aryan@huggingface.co> * fixes * docstring. * remaining todos in low_cpu_mem_usage * tests * updates to docs. --------- Co-authored-by:
Aryan <aryan@huggingface.co>
-
- 02 Apr, 2025 2 commits
-
-
hlky authored
-
hlky authored
* allow models to run with a user-provided dtype map instead of a single dtype * make style * Add warning, change `_` to `default` * make style * add test * handle shared tensors * remove warning --------- Co-authored-by:Sayak Paul <spsayakpaul@gmail.com>
-
- 21 Mar, 2025 2 commits
-
-
hlky authored
* Don't use `torch_dtype` when `quantization_config` is set * up * djkajka * Apply suggestions from code review --------- Co-authored-by:Sayak Paul <spsayakpaul@gmail.com>
-
Aryan authored
* init * update * update * update * make style * update * fix * make it work with guidance distilled models * update * make fix-copies * add tests * update * apply_faster_cache -> apply_fastercache * fix * reorder * update * refactor * update docs * add fastercache to CacheMixin * update tests * Apply suggestions from code review * make style * try to fix partial import error * Apply style fixes * raise warning * update --------- Co-authored-by:github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 20 Mar, 2025 1 commit
-
-
Dhruv Nair authored
* update * update * clean up
-
- 02 Mar, 2025 1 commit
-
-
YiYi Xu authored
* Add wanx pipeline, model and example * wanx_merged_v1 * change WanX into Wan * fix i2v fp32 oom error Link: https://code.alibaba-inc.com/open_wanx2/diffusers/codereview/20607813 * support t2v load fp32 ckpt * add example * final merge v1 * Update autoencoder_kl_wan.py * up * update middle, test up_block * up up * one less nn.sequential * up more * up * more * [refactor] [wip] Wan transformer/pipeline (#10926) * update * update * refactor rope * refactor pipeline * make fix-copies * add transformer test * update * update * make style * update tests * tests * conversion script * conversion script * update * docs * remove unused code * fix _toctree.yml * update dtype * fix test * fix tests: scale * up * more * Apply suggestions from code review * Apply suggestions from code review * style * Update scripts/convert_wan_to_diffusers.py * update docs * fix --------- Co-authored-by:
Yitong Huang <huangyitong.hyt@alibaba-inc.com> Co-authored-by:
亚森 <wangjiayu.wjy@alibaba-inc.com> Co-authored-by:
Aryan <aryan@huggingface.co>
-
- 24 Feb, 2025 1 commit
-
-
hlky authored
* Fix `torch_dtype` in Kolors text encoder with `transformers` v4.49 * Default torch_dtype and warning
-
- 20 Feb, 2025 1 commit
-
-
Aryan authored
remove prints
-
- 19 Feb, 2025 1 commit
-
-
Marc Sun authored
* first draft model loading refactor * revert name change * fix bnb * revert name * fix dduf * fix huanyan * style * Update src/diffusers/models/model_loading_utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * suggestions from reviews * Update src/diffusers/models/modeling_utils.py Co-authored-by:
YiYi Xu <yixu310@gmail.com> * remove safetensors check * fix default value * more fix from suggestions * revert logic for single file * style * typing + fix couple of issues * improve speed * Update src/diffusers/models/modeling_utils.py Co-authored-by:
Aryan <aryan@huggingface.co> * fp8 dtype * add tests * rename resolved_archive_file to resolved_model_file * format * map_location default cpu * add utility function * switch to smaller model + test inference * Apply suggestions from code review Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * rm comment * add log * Apply suggestions from code review Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * add decorator * cosine sim instead * fix use_keep_in_fp32_modules * comm --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
YiYi Xu <yixu310@gmail.com> Co-authored-by:
Aryan <aryan@huggingface.co>
-
- 14 Feb, 2025 1 commit
-
-
Aryan authored
* update * fix * non_blocking; handle parameters and buffers * update * Group offloading with cuda stream prefetching (#10516) * cuda stream prefetch * remove breakpoints * update * copy model hook implementation from pab * update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite * more workarounds to make it actually work * cleanup * rewrite * update * make sure to sync current stream before overwriting with pinned params not doing so will lead to erroneous computations on the GPU and cause bad results * better check * update * remove hook implementation to not deal with merge conflict * re-add hook changes * why use more memory when less memory do trick * why still use slightly more memory when less memory do trick * optimise * add model tests * add pipeline tests * update docs * add layernorm and groupnorm * address review comments * improve tests; add docs * improve docs * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * apply suggestions from code review * update tests * apply suggestions from review * enable_group_offloading -> enable_group_offload for naming consistency * raise errors if multiple offloading strategies used; add relevant tests * handle .to() when group offload applied * refactor some repeated code * remove unintentional change from merge conflict * handle .cuda() --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 13 Feb, 2025 1 commit
-
-
Fanli Lin authored
fix contiguous bug
-
- 10 Feb, 2025 1 commit
-
-
hlky authored
-
- 28 Jan, 2025 1 commit
-
-
Aryan authored
* update * remove unused fn * apply suggestions based on review * update + cleanup 🧹 * more cleanup 🧹 * make fix-copies * update test
-
- 22 Jan, 2025 1 commit
-
-
Aryan authored
* update * update * make style * remove dynamo disable * add coauthor Co-Authored-By:
Dhruv Nair <dhruv.nair@gmail.com> * update * update * update * update mixin * add some basic tests * update * update * non_blocking * improvements * update * norm.* -> norm * apply suggestions from review * add example * update hook implementation to the latest changes from pyramid attention broadcast * deinitialize should raise an error * update doc page * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * update docs * update * refactor * fix _always_upcast_modules for asym ae and vq_model * fix lumina embedding forward to not depend on weight dtype * refactor tests * add simple lora inference tests * _always_upcast_modules -> _precision_sensitive_module_patterns * remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case * check layer dtypes in lora test * fix UNet1DModelTests::test_layerwise_upcasting_inference * _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback * skip test in NCSNppModelTests * skip tests for AutoencoderTinyTests * skip tests for AutoencoderOobleckTests * skip tests for UNet1DModelTests - unsupported pytorch operations * layerwise_upcasting -> layerwise_casting * skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support * add layerwise fp8 pipeline test * use xfail * Apply suggestions from code review Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass) * add note about memory consumption on tesla CI runner for failing test --------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 21 Jan, 2025 1 commit
-
-
Sayak Paul authored
change licensing to 2025 from 2024.
-
- 16 Jan, 2025 2 commits
-
-
Juan Acevedo authored
* implementing flux on TPUs with ptxla * add xla flux attention class * run make style/quality * Update src/diffusers/models/attention_processor.py Co-authored-by:
YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/attention_processor.py Co-authored-by:
YiYi Xu <yixu310@gmail.com> * run style and quality --------- Co-authored-by:
Juan Acevedo <jfacevedo@google.com> Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
YiYi Xu <yixu310@gmail.com>
-
hlky authored
* Move buffers to device * add test * named_buffers
-
- 14 Jan, 2025 1 commit
-
-
Marc Sun authored
* load and save dduf archive * style * switch to zip uncompressed * updates * Update src/diffusers/pipelines/pipeline_utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * Update src/diffusers/pipelines/pipeline_utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * first draft * remove print * switch to dduf_file for consistency * switch to huggingface hub api * fix log * add a basic test * Update src/diffusers/configuration_utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * Update src/diffusers/pipelines/pipeline_utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * Update src/diffusers/pipelines/pipeline_utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * fix * fix variant * change saving logic * DDUF - Load transformers components manually (#10171) * update hfh version * Load transformers components manually * load encoder from_pretrained with state_dict * working version with transformers and tokenizer ! * add generation_config case * fix tests * remove saving for now * typing * need next version from transformers * Update src/diffusers/configuration_utils.py Co-authored-by:
Lucain <lucain@huggingface.co> * check path corectly * Apply suggestions from code review Co-authored-by:
Lucain <lucain@huggingface.co> * udapte * typing * remove check for subfolder * quality * revert setup changes * oups * more readable condition * add loading from the hub test * add basic docs. * Apply suggestions from code review Co-authored-by:
Lucain <lucain@huggingface.co> * add example * add * make functions private * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * minor. * fixes * fix * change the precdence of parameterized. * error out when custom pipeline is passed with dduf_file. * updates * fix * updates * fixes * updates * fix xfail condition. * fix xfail * fixes * sharded checkpoint compat * add test for sharded checkpoint * add suggestions * Update src/diffusers/models/model_loading_utils.py Co-authored-by:
YiYi Xu <yixu310@gmail.com> * from suggestions * add class attributes to flag dduf tests * last one * fix logic * remove comment * revert changes --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Lucain <lucain@huggingface.co> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
YiYi Xu <yixu310@gmail.com>
-
- 10 Jan, 2025 1 commit
-
-
Daniel Hipke authored
Add a `disable_mmap` option to the `from_single_file` loader to improve load performance on network mounts (#10305) * Add no_mmap arg. * Fix arg parsing. * Update another method to force no mmap. * logging * logging2 * propagate no_mmap * logging3 * propagate no_mmap * logging4 * fix open call * clean up logging * cleanup * fix missing arg * update logging and comments * Rename to disable_mmap and update other references. * [Docs] Update ltx_video.md to remove generator from `from_pretrained()` (#10316) Update ltx_video.md to remove generator from `from_pretrained()` * docs: fix a mistake in docstring (#10319) Update pipeline_hunyuan_video.py docs: fix a mistake * [BUG FIX] [Stable Audio Pipeline] Resolve torch.Tensor.new_zeros() TypeError in function prepare_latents caused by audio_vae_length (#10306) [BUG FIX] [Stable Audio Pipeline] TypeError: new_zeros(): argument 'size' failed to unpack the object at pos 3 with error "type must be tuple of ints,but got float" torch.Tensor.new_zeros() takes a single argument size (int...) – a list, tuple, or torch.Size of integers defining the shape of the output tensor. in function prepare_latents: audio_vae_length = self.transformer.config.sample_size * self.vae.hop_length audio_shape = (batch_size // num_waveforms_per_prompt, audio_channels, audio_vae_length) ... audio = initial_audio_waveforms.new_zeros(audio_shape) audio_vae_length evaluates to float because self.transformer.config.sample_size returns a float Co-authored-by:
hlky <hlky@hlky.ac> * [docs] Fix quantization links (#10323) Update overview.md * [Sana]add 2K related model for Sana (#10322) add 2K related model for Sana * Update src/diffusers/loaders/single_file_model.py Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * Update src/diffusers/loaders/single_file.py Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> * make style --------- Co-authored-by:
hlky <hlky@hlky.ac> Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Leojc <liao_junchao@outlook.com> Co-authored-by:
Aditya Raj <syntaxticsugr@gmail.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
Junsong Chen <cjs1020440147@icloud.com> Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com>
-
- 08 Jan, 2025 2 commits
-
-
Marc Sun authored
* fix device issue in single gpu case * Update src/diffusers/pipelines/pipeline_utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com>
-
Aryan authored
* set supports gradient checkpointing to true where necessary; add missing no split modules * fix cogvideox tests * update --------- Co-authored-by:Dhruv Nair <dhruv.nair@gmail.com>
-
- 25 Dec, 2024 1 commit
-
-
Aryan authored
* Revert "Add support for sharded models when TorchAO quantization is enabled (#10256)" This reverts commit 41ba8c0b . * update tests * udpate * update * update * update device map tests * apply review suggestions * update * make style * fix * update docs * update tests * update workflow * update * improve tests * allclose tolerance * Update src/diffusers/models/modeling_utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * Update tests/quantization/torchao/test_torchao.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * improve tests * fix * update correct slices --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com>
-
- 23 Dec, 2024 1 commit
-
-
YiYi Xu authored
add: q
-
- 20 Dec, 2024 1 commit
-
-
Aryan authored
* add sharded + device_map check
-
- 17 Dec, 2024 1 commit
-
-
Dhruv Nair authored
* update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * Update src/diffusers/quantizers/gguf/utils.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * update * update * update * update * update * update * update * update * update * update * Update docs/source/en/quantization/gguf.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * update * update * update * update --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 16 Dec, 2024 1 commit
-
-
Aryan authored
* torchao quantizer --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 06 Dec, 2024 1 commit
-
-
Juan Acevedo authored
* update ptxla example --------- Co-authored-by:
Juan Acevedo <jfacevedo@google.com> Co-authored-by:
Pei Zhang <zpcore@gmail.com> Co-authored-by:
Pei Zhang <piz@google.com> Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
Pei Zhang <pei@Peis-MacBook-Pro.local> Co-authored-by:
hlky <hlky@hlky.ac>
-
- 05 Dec, 2024 1 commit
-
-
Aryan authored
* update * apply review suggestion --------- Co-authored-by:Sayak Paul <spsayakpaul@gmail.com>
-
- 03 Dec, 2024 1 commit
-
-
Lucain authored
-
- 21 Oct, 2024 1 commit
-
-
Sayak Paul authored
* quantization config. * fix-copies * fix * modules_to_not_convert * add bitsandbytes utilities. * make progress. * fixes * quality * up * up rotary embedding refactor 2: update comments, fix dtype for use_real=False (#9312) fix notes and dtype up up * minor * up * up * fix * provide credits where due. * make configurations work. * fixes * fix * update_missing_keys * fix * fix * make it work. * fix * provide credits to transformers. * empty commit * handle to() better. * tests * change to bnb from bitsandbytes * fix tests fix slow quality tests SD3 remark fix complete int4 tests add a readme to the test files. add model cpu offload tests warning test * better safeguard. * change merging status * courtesy to transformers. * move upper. * better * make the unused kwargs warning friendlier. * harmonize changes with https://github.com/huggingface/transformers/pull/33122 * style * trainin tests * feedback part i. * Add Flux inpainting and Flux Img2Img (#9135) --------- Co-authored-by:
yiyixuxu <yixu310@gmail.com> Update `UNet2DConditionModel`'s error messages (#9230) * refactor [CI] Update Single file Nightly Tests (#9357) * update * update feedback. improve README for flux dreambooth lora (#9290) * improve readme * improve readme * improve readme * improve readme fix one uncaught deprecation warning for accessing vae_latent_channels in VaeImagePreprocessor (#9372) deprecation warning vae_latent_channels add mixed int8 tests and more tests to nf4. [core] Freenoise memory improvements (#9262) * update * implement prompt interpolation * make style * resnet memory optimizations * more memory optimizations; todo: refactor * update * update animatediff controlnet with latest changes * refactor chunked inference changes * remove print statements * update * chunk -> split * remove changes from incorrect conflict resolution * remove changes from incorrect conflict resolution * add explanation of SplitInferenceModule * update docs * Revert "update docs" This reverts commit c55a50a271b2cefa8fe340a4f2a3ab9b9d374ec0. * update docstring for freenoise split inference * apply suggestions from review * add tests * apply suggestions from review quantization docs. docs. * Revert "Add Flux inpainting and Flux Img2Img (#9135)" This reverts commit 5799954dd4b3d753c7c1b8d722941350fe4f62ca. * tests * don * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * contribution guide. * changes * empty * fix tests * harmonize with https://github.com/huggingface/transformers/pull/33546 . * numpy_cosine_distance * config_dict modification. * remove if config comment. * note for load_state_dict changes. * float8 check. * quantizer. * raise an error for non-True low_cpu_mem_usage values when using quant. * low_cpu_mem_usage shenanigans when using fp32 modules. * don't re-assign _pre_quantization_type. * make comments clear. * remove comments. * handle mixed types better when moving to cpu. * add tests to check if we're throwing warning rightly. * better check. * fix 8bit test_quality. * handle dtype more robustly. * better message when keep_in_fp32_modules. * handle dtype casting. * fix dtype checks in pipeline. * fix warning message. * Update src/diffusers/models/modeling_utils.py Co-authored-by:
YiYi Xu <yixu310@gmail.com> * mitigate the confusing cpu warning --------- Co-authored-by:
Vishnu V Jaddipal <95531133+Gothos@users.noreply.github.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
YiYi Xu <yixu310@gmail.com>
-
- 28 Sep, 2024 1 commit
-
-
Sayak Paul authored
* fix variant-idenitification. * fix variant * fix sharded variant checkpoint loading. * Apply suggestions from code review * fixes. * more fixes. * remove print. * fixes * fixes * comments * fixes * apply suggestions. * hub_utils.py * fix test * updates * fixes * fixes * Apply suggestions from code review Co-authored-by:
YiYi Xu <yixu310@gmail.com> * updates. * removep patch file. --------- Co-authored-by:
YiYi Xu <yixu310@gmail.com>
-
- 25 Sep, 2024 1 commit
-
-
YiYi Xu authored
* up * Update src/diffusers/models/modeling_utils.py Co-authored-by:
Aryan <aryan@huggingface.co> --------- Co-authored-by:
Aryan <aryan@huggingface.co>
-
- 06 Aug, 2024 1 commit
-
-
Marc Sun authored
* Fix loading sharded checkpoint when we have variant * add test * remote print --------- Co-authored-by:Sayak Paul <spsayakpaul@gmail.com>
-
- 18 Jul, 2024 1 commit
-
-
Sayak Paul authored
* remove resume_download * fix: _fetch_index_file call. * remove resume_download from docs.
-
- 24 Jun, 2024 1 commit
-
-
Tolga Cangöz authored
* Fix typos & improve contributing page * `make style && make quality` * fix typos * Fix typo --------- Co-authored-by:Sayak Paul <spsayakpaul@gmail.com>
-
- 21 Jun, 2024 1 commit
-
-
YiYi Xu authored
fix Co-authored-by:yiyixuxu <yixu310@gmail,com>
-