"...text-generation-inference.git" did not exist on "010508cec8e60e3995f4d0a0f98f30c499c28b59"
  1. 19 Feb, 2025 1 commit
  2. 21 Jan, 2025 1 commit
  3. 16 Jan, 2025 1 commit
  4. 14 Jan, 2025 1 commit
    • [FEAT] DDUF format (#10037) · fbff43ac
      Marc Sun authored
      
      
      * load and save dduf archive
      
      * style
      
      * switch to zip uncompressed
      
      * updates
      
      * Update src/diffusers/pipelines/pipeline_utils.py
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Update src/diffusers/pipelines/pipeline_utils.py
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * first draft
      
      * remove print
      
      * switch to dduf_file for consistency
      
      * switch to huggingface hub api
      
      * fix log
      
      * add a basic test
      
      * Update src/diffusers/configuration_utils.py
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Update src/diffusers/pipelines/pipeline_utils.py
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Update src/diffusers/pipelines/pipeline_utils.py
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * fix
      
      * fix variant
      
      * change saving logic
      
      * DDUF - Load transformers components manually (#10171)
      
      * update hfh version
      
      * Load transformers components manually
      
      * load encoder from_pretrained with state_dict
      
      * working version with transformers and tokenizer!
      
      * add generation_config case
      
      * fix tests
      
      * remove saving for now
      
      * typing
      
      * need next version from transformers
      
      * Update src/diffusers/configuration_utils.py
      Co-authored-by: Lucain <lucain@huggingface.co>
      
      * check path correctly
      
      * Apply suggestions from code review
      Co-authored-by: Lucain <lucain@huggingface.co>
      
      * update
      
      * typing
      
      * remove check for subfolder
      
      * quality
      
      * revert setup changes
      
      * oops
      
      * more readable condition
      
      * add loading from the hub test
      
      * add basic docs.
      
      * Apply suggestions from code review
      Co-authored-by: Lucain <lucain@huggingface.co>
      
      * add example
      
      * add
      
      * make functions private
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * minor.
      
      * fixes
      
      * fix
      
      * change the precedence of parameterized.
      
      * error out when custom pipeline is passed with dduf_file.
      
      * updates
      
      * fix
      
      * updates
      
      * fixes
      
      * updates
      
      * fix xfail condition.
      
      * fix xfail
      
      * fixes
      
      * sharded checkpoint compat
      
      * add test for sharded checkpoint
      
      * add suggestions
      
      * Update src/diffusers/models/model_loading_utils.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * from suggestions
      
      * add class attributes to flag dduf tests
      
      * last one
      
      * fix logic
      
      * remove comment
      
      * revert changes
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Lucain <lucain@huggingface.co>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      fbff43ac
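
      A minimal sketch of the feature this commit lands: DiffusionPipeline.from_pretrained gains a dduf_file argument for loading a whole pipeline from a single DDUF archive. The repo id and file name below are illustrative, not taken from the commit.

      ```python
      import torch
      from diffusers import DiffusionPipeline

      # Load every pipeline component from one .dduf archive (illustrative repo/file).
      pipe = DiffusionPipeline.from_pretrained(
          "DDUF/FLUX.1-dev-DDUF",       # hypothetical Hub repo hosting the archive
          dduf_file="FLUX.1-dev.dduf",  # the DDUF file inside that repo
          torch_dtype=torch.bfloat16,
      ).to("cuda")

      image = pipe("a photo of an astronaut riding a horse").images[0]
      ```

      Note that, per the commit, passing a custom pipeline together with dduf_file raises an error.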
  5. 10 Jan, 2025 1 commit
    • Add a `disable_mmap` option to the `from_single_file` loader to improve load... · 52c05bd4
      Daniel Hipke authored
      
      Add a `disable_mmap` option to the `from_single_file` loader to improve load performance on network mounts (#10305)
      
      * Add no_mmap arg.
      
      * Fix arg parsing.
      
      * Update another method to force no mmap.
      
      * logging
      
      * logging2
      
      * propagate no_mmap
      
      * logging3
      
      * propagate no_mmap
      
      * logging4
      
      * fix open call
      
      * clean up logging
      
      * cleanup
      
      * fix missing arg
      
      * update logging and comments
      
      * Rename to disable_mmap and update other references.
      
      * [Docs] Update ltx_video.md to remove generator from `from_pretrained()` (#10316)
      
      Update ltx_video.md to remove generator from `from_pretrained()`
      
      * docs: fix a mistake in docstring (#10319)
      
      Update pipeline_hunyuan_video.py
      
      docs: fix a mistake
      
      * [BUG FIX] [Stable Audio Pipeline] Resolve torch.Tensor.new_zeros() TypeError in function prepare_latents caused by audio_vae_length (#10306)
      
      [BUG FIX] [Stable Audio Pipeline] TypeError: new_zeros(): argument 'size' failed to unpack the object at pos 3 with error "type must be tuple of ints, but got float"
      
      torch.Tensor.new_zeros() takes a single argument size (int...) – a list, tuple, or torch.Size of integers defining the shape of the output tensor.
      
      in function prepare_latents:
      audio_vae_length = self.transformer.config.sample_size * self.vae.hop_length
      audio_shape = (batch_size // num_waveforms_per_prompt, audio_channels, audio_vae_length)
      ...
      audio = initial_audio_waveforms.new_zeros(audio_shape)
      
      audio_vae_length evaluates to a float because self.transformer.config.sample_size returns a float.
      Co-authored-by: hlky <hlky@hlky.ac>
      
      * [docs] Fix quantization links (#10323)
      
      Update overview.md
      
      * [Sana]add 2K related model for Sana (#10322)
      
      add 2K related model for Sana
      
      * Update src/diffusers/loaders/single_file_model.py
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      * Update src/diffusers/loaders/single_file.py
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      
      * make style
      
      ---------
      Co-authored-by: hlky <hlky@hlky.ac>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Leojc <liao_junchao@outlook.com>
      Co-authored-by: Aditya Raj <syntaxticsugr@gmail.com>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: Junsong Chen <cjs1020440147@icloud.com>
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      52c05bd4
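
      A minimal sketch of the new flag, assuming a pipeline-level from_single_file call; the checkpoint path is illustrative. With disable_mmap=True the file is read into memory up front instead of memory-mapped, which the PR reports is faster on network mounts.

      ```python
      import torch
      from diffusers import StableDiffusionPipeline

      pipe = StableDiffusionPipeline.from_single_file(
          "/mnt/nfs/checkpoints/model.safetensors",  # checkpoint on a network mount (illustrative)
          disable_mmap=True,   # skip mmap; read the whole file instead
          torch_dtype=torch.float16,
      )
      ```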
  6. 23 Dec, 2024 1 commit
  7. 17 Dec, 2024 1 commit
  8. 16 Dec, 2024 1 commit
  9. 05 Dec, 2024 1 commit
  10. 22 Oct, 2024 1 commit
    • [bitsandbytes] follow-ups (#9730) · 60ffa842
      Sayak Paul authored
      * bnb follow ups.
      
      * add a warning when dtypes mismatch.
      
      * fix-copies
      
      * clear cache.
      
      * check_if_quantized_param
      
      * add a check on shape.
      
      * updates
      
      * docs
      
      * improve readability.
      
      * resources.
      
      * fix
      60ffa842
  11. 21 Oct, 2024 1 commit
    • [Quantization] Add quantization support for `bitsandbytes` (#9213) · b821f006
      Sayak Paul authored
      * quantization config.
      
      * fix-copies
      
      * fix
      
      * modules_to_not_convert
      
      * add bitsandbytes utilities.
      
      * make progress.
      
      * fixes
      
      * quality
      
      * up
      
      * up
      
      rotary embedding refactor 2: update comments, fix dtype for use_real=False (#9312)
      
      fix notes and dtype
      
      up
      
      up
      
      * minor
      
      * up
      
      * up
      
      * fix
      
      * provide credits where due.
      
      * make configurations work.
      
      * fixes
      
      * fix
      
      * update_missing_keys
      
      * fix
      
      * fix
      
      * make it work.
      
      * fix
      
      * provide credits to transformers.
      
      * empty commit
      
      * handle to() better.
      
      * tests
      
      * change to bnb from bitsandbytes
      
      * fix tests
      
      fix slow quality tests
      
      SD3 remark
      
      fix
      
      complete int4 tests
      
      add a readme to the test files.
      
      add model cpu offload tests
      
      warning test
      
      * better safeguard.
      
      * change merging status
      
      * courtesy to transformers.
      
      * move upper.
      
      * better
      
      * make the unused kwargs warning friendlier.
      
      * harmonize changes with https://github.com/huggingface/transformers/pull/33122
      
      
      
      * style
      
      * training tests
      
      * feedback part i.
      
      * Add Flux inpainting and Flux Img2Img (#9135)
      
      ---------
      Co-authored-by: yiyixuxu <yixu310@gmail.com>
      
      Update `UNet2DConditionModel`'s error messages (#9230)
      
      * refactor
      
      [CI] Update Single file Nightly Tests (#9357)
      
      * update
      
      * update
      
      feedback.
      
      improve README for flux dreambooth lora (#9290)
      
      * improve readme
      
      * improve readme
      
      * improve readme
      
      * improve readme
      
      fix one uncaught deprecation warning for accessing vae_latent_channels in VaeImagePreprocessor (#9372)
      
      deprecation warning vae_latent_channels
      
      add mixed int8 tests and more tests to nf4.
      
      [core] Freenoise memory improvements (#9262)
      
      * update
      
      * implement prompt interpolation
      
      * make style
      
      * resnet memory optimizations
      
      * more memory optimizations; todo: refactor
      
      * update
      
      * update animatediff controlnet with latest changes
      
      * refactor chunked inference changes
      
      * remove print statements
      
      * update
      
      * chunk -> split
      
      * remove changes from incorrect conflict resolution
      
      * remove changes from incorrect conflict resolution
      
      * add explanation of SplitInferenceModule
      
      * update docs
      
      * Revert "update docs"
      
      This reverts commit c55a50a271b2cefa8fe340a4f2a3ab9b9d374ec0.
      
      * update docstring for freenoise split inference
      
      * apply suggestions from review
      
      * add tests
      
      * apply suggestions from review
      
      quantization docs.
      
      docs.
      
      * Revert "Add Flux inpainting and Flux Img2Img (#9135)"
      
      This reverts commit 5799954dd4b3d753c7c1b8d722941350fe4f62ca.
      
      * tests
      
      * don
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * contribution guide.
      
      * changes
      
      * empty
      
      * fix tests
      
      * harmonize with https://github.com/huggingface/transformers/pull/33546
      
      .
      
      * numpy_cosine_distance
      
      * config_dict modification.
      
      * remove if config comment.
      
      * note for load_state_dict changes.
      
      * float8 check.
      
      * quantizer.
      
      * raise an error for non-True low_cpu_mem_usage values when using quant.
      
      * low_cpu_mem_usage shenanigans when using fp32 modules.
      
      * don't re-assign _pre_quantization_type.
      
      * make comments clear.
      
      * remove comments.
      
      * handle mixed types better when moving to cpu.
      
      * add tests to check if we're throwing warning rightly.
      
      * better check.
      
      * fix 8bit test_quality.
      
      * handle dtype more robustly.
      
      * better message when keep_in_fp32_modules.
      
      * handle dtype casting.
      
      * fix dtype checks in pipeline.
      
      * fix warning message.
      
      * Update src/diffusers/models/modeling_utils.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * mitigate the confusing cpu warning
      
      ---------
      Co-authored-by: Vishnu V Jaddipal <95531133+Gothos@users.noreply.github.com>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      b821f006
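
      A minimal sketch of the bitsandbytes integration, following the NF4 configuration exercised by the PR's tests; the SD3 checkpoint id is illustrative.

      ```python
      import torch
      from diffusers import BitsAndBytesConfig, SD3Transformer2DModel

      quant_config = BitsAndBytesConfig(
          load_in_4bit=True,
          bnb_4bit_quant_type="nf4",
          bnb_4bit_compute_dtype=torch.bfloat16,
      )

      # Quantize the transformer to 4-bit NF4 at load time.
      transformer = SD3Transformer2DModel.from_pretrained(
          "stabilityai/stable-diffusion-3-medium-diffusers",  # illustrative checkpoint
          subfolder="transformer",
          quantization_config=quant_config,
          torch_dtype=torch.bfloat16,
      )
      ```

      Per the commit, low_cpu_mem_usage must stay enabled when a quantization config is passed; non-True values raise an error.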
  12. 28 Sep, 2024 1 commit
    • [Core] fix variant-identification. (#9253) · 11542431
      Sayak Paul authored
      
      
      * fix variant-identification.
      
      * fix variant
      
      * fix sharded variant checkpoint loading.
      
      * Apply suggestions from code review
      
      * fixes.
      
      * more fixes.
      
      * remove print.
      
      * fixes
      
      * fixes
      
      * comments
      
      * fixes
      
      * apply suggestions.
      
      * hub_utils.py
      
      * fix test
      
      * updates
      
      * fixes
      
      * fixes
      
      * Apply suggestions from code review
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * updates.
      
      * remove patch file.
      
      ---------
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      11542431
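
      A minimal sketch of the code path this commit fixes: the variant argument selects alternate weight files (e.g. *.fp16.safetensors), including sharded variant checkpoints. The model id is illustrative.

      ```python
      import torch
      from diffusers import DiffusionPipeline

      pipe = DiffusionPipeline.from_pretrained(
          "stabilityai/stable-diffusion-xl-base-1.0",  # illustrative
          variant="fp16",             # resolve *.fp16.safetensors files, sharded or not
          torch_dtype=torch.float16,
      )
      ```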
  13. 18 Jul, 2024 1 commit
  14. 06 Jul, 2024 1 commit
  15. 07 Jun, 2024 1 commit
    • [Core] support saving and loading of sharded checkpoints (#7830) · 7d887118
      Sayak Paul authored
      
      
      * feat: support saving a model in sharded checkpoints.
      
      * feat: make loading of sharded checkpoints work.
      
      * add tests
      
      * cleanse the loading logic a bit more.
      
      * more resilience while loading from the Hub.
      
      * parallelize shard downloads by using snapshot_download().
      
      * default to a shard size.
      
      * more fix
      
      * Empty-Commit
      
      * debug
      
      * fix
      
      * quality
      
      * more debugging
      
      * fix more
      
      * initial comments from Benjamin
      
      * move certain methods to loading_utils
      
      * add test to check if the correct number of shards are present.
      
      * add a test to check if loading of sharded checkpoints from the Hub is okay
      
      * clarify the unit when passed as an int.
      
      * use hf_hub for sharding.
      
      * remove unnecessary code
      
      * remove unnecessary function
      
      * lucain's comments.
      
      * fixes
      
      * address high-level comments.
      
      * fix test
      
      * subfolder shenanigans.
      
      * Update src/diffusers/utils/hub_utils.py
      Co-authored-by: Lucain <lucainp@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: Lucain <lucainp@gmail.com>
      
      * remove _huggingface_hub_version as not needed.
      
      * address more feedback.
      
      * add a test for local_files_only=True/
      
      * need hf hub to be at least 0.23.2
      
      * style
      
      * final comment.
      
      * clean up subfolder.
      
      * deal with suffixes in code.
      
      * _add_variant default.
      
      * use weights_name_pattern
      
      * remove add_suffix_keyword
      
      * clean up downloading of sharded ckpts.
      
      * don't return something special when using index.json
      
      * fix more
      
      * don't use bare except
      
      * remove comments and catch the errors better
      
      * fix a couple of things when using is_file()
      
      * empty
      
      ---------
      Co-authored-by: Lucain <lucainp@gmail.com>
      7d887118
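
      A minimal sketch of sharded saving and loading as introduced here, with illustrative ids and paths; per the commit, the unit of an integer max_shard_size is clarified and shards are downloaded in parallel via snapshot_download().

      ```python
      from diffusers import UNet2DConditionModel

      unet = UNet2DConditionModel.from_pretrained(
          "runwayml/stable-diffusion-v1-5", subfolder="unet"  # illustrative
      )

      # Split the weights into shards no larger than 5GB, plus an index file.
      unet.save_pretrained("./unet-sharded", max_shard_size="5GB")

      # from_pretrained picks the shards back up via the index file.
      reloaded = UNet2DConditionModel.from_pretrained("./unet-sharded")
      ```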
  16. 04 Jun, 2024 1 commit
  17. 31 May, 2024 1 commit
    • [Core] Introduce class variants for `Transformer2DModel` (#7647) · 983dec3b
      Sayak Paul authored
      * init for patches
      
      * finish patched model.
      
      * continuous transformer
      
      * vectorized transformer2d.
      
      * style.
      
      * inits.
      
      * fix-copies.
      
      * introduce DiTTransformer2DModel.
      
      * fixes
      
      * use REMAPPING as suggested by @DN6
      
      * better logging.
      
      * add pixart transformer model.
      
      * inits.
      
      * caption_channels.
      
      * attention masking.
      
      * fix use_additional_conditions.
      
      * remove print.
      
      * debug
      
      * flatten
      
      * fix: assertion for sigma
      
      * handle remapping for modeling_utils
      
      * add tests for dit transformer2d
      
      * quality
      
      * placeholder for pixart tests
      
      * pixart tests
      
      * add _no_split_modules
      
      * add docs.
      
      * check
      
      * check
      
      * check
      
      * check
      
      * fix tests
      
      * fix tests
      
      * move Transformer output to modeling_output
      
      * move errors better and bring back use_additional_conditions attribute.
      
      * add unnecessary things from DiT.
      
      * clean up pixart
      
      * fix remapping
      
      * fix device_map things in pixart2d.
      
      * replace Transformer2DModel with appropriate classes in dit, pixart tests
      
      * empty
      
      * legacy mixin classes.
      
      * use a remapping dict for fetching class names.
      
      * change to specifc model types in the pipeline implementations.
      
      * move _fetch_remapped_cls_from_config to modeling_loading_utils.py
      
      * fix dependency problems.
      
      * add deprecation note.
      983dec3b
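
      A minimal sketch of the class variants this commit introduces: DiT- and PixArt-style checkpoints that previously loaded through the catch-all Transformer2DModel are remapped to dedicated classes. Model ids are illustrative.

      ```python
      from diffusers import DiTTransformer2DModel, PixArtTransformer2DModel

      # Old Transformer2DModel checkpoints are remapped to the new classes on load.
      dit = DiTTransformer2DModel.from_pretrained(
          "facebook/DiT-XL-2-256", subfolder="transformer"  # illustrative
      )
      pixart = PixArtTransformer2DModel.from_pretrained(
          "PixArt-alpha/PixArt-XL-2-1024-MS", subfolder="transformer"  # illustrative
      )
      ```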
  18. 14 May, 2024 1 commit