1. 28 May, 2025 1 commit
  2. 01 May, 2025 1 commit
  3. 16 Sep, 2024 1 commit
  4. 08 Aug, 2024 1 commit
  5. 20 May, 2024 1 commit
  6. 06 May, 2024 1 commit
  7. 25 Feb, 2024 1 commit
  8. 08 Feb, 2024 1 commit
  9. 09 Nov, 2023 1 commit
    • [`Docs`] Fix typos and update files at Optimization Page (#5674) · 53a8439f
      M. Tolga Cangöz authored
      
      
      * Fix typos, update, trim trailing whitespace
      
      * Trim trailing whitespaces
      
      * Update docs/source/en/optimization/memory.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/optimization/memory.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update _toctree.yml
      
      * Update adapt_a_model.md
      
      * Reverse
      
      * Reverse
      
      * Reverse
      
      * Update dreambooth.md
      
      * Update instructpix2pix.md
      
      * Update lora.md
      
      * Update overview.md
      
      * Update t2i_adapters.md
      
      * Update text2image.md
      
      * Update text_inversion.md
      
      * Update create_dataset.md
      
      * Update create_dataset.md
      
      * Update create_dataset.md
      
      * Update create_dataset.md
      
      * Update coreml.md
      
      * Delete docs/source/en/training/create_dataset.md
      
      * Original create_dataset.md
      
      * Update create_dataset.md
      
      * Delete docs/source/en/training/create_dataset.md
      
      * Add original file
      
      * Delete docs/source/en/training/create_dataset.md
      
      * Add original one
      
      * Delete docs/source/en/training/text2image.md
      
      * Delete docs/source/en/training/instructpix2pix.md
      
      * Delete docs/source/en/training/dreambooth.md
      
      * Add original files
      
      ---------
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
  10. 13 Sep, 2023 1 commit
  11. 10 Aug, 2023 2 commits
  12. 26 Jul, 2023 1 commit
  13. 26 May, 2023 1 commit
  14. 15 May, 2023 1 commit
  15. 27 Apr, 2023 1 commit
  16. 28 Mar, 2023 2 commits
  17. 20 Mar, 2023 1 commit
  18. 02 Mar, 2023 1 commit
  19. 01 Mar, 2023 1 commit
  20. 16 Feb, 2023 1 commit
    • `enable_model_cpu_offload` (#2285) · 2777264e
      Pedro Cuenca authored
      * enable_model_offload PoC
      
      It's surprisingly more involved than expected, see comments in the PR.
      
      * Rename final_offload_hook
      
      * Invoke the vae forward hook manually.
      
      * Completely remove decoder.
      
      * Style
      
      * apply_forward_hook decorator
      
      * Rename method.
      
      * Style
      
      * Copy enable_model_cpu_offload
      
      * Fix copies.
      
      * Remove comment.
      
      * Fix copies
      
      * Missing import
      
      * Fix doc-builder style.
      
      * Merge main and fix again.
      
      * Add docs
      
      * Fix docs.
      
      * Add a couple of tests.
      
      * style
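The mechanism this commit adds (each pipeline sub-model is moved to the accelerator only for its own forward pass, evicting its predecessor, so at most one sub-model occupies GPU memory at a time) can be sketched in plain Python. `Component` and `run_with_model_offload` are hypothetical stand-ins for illustration; the real feature installs accelerate hooks on the pipeline's modules.

```python
class Component:
    """Toy stand-in for a pipeline sub-model (text encoder, unet, vae)."""
    def __init__(self, name):
        self.name = name
        self.device = "cpu"

def run_with_model_offload(components, accelerator="cuda"):
    """Model-level CPU offload: load each component onto the accelerator
    just before it runs and return the previous one to CPU first, so GPU
    memory holds one sub-model at a time. Conceptual sketch only."""
    trace = []
    prev = None
    for comp in components:
        if prev is not None:
            prev.device = "cpu"       # evict the previously used component
        comp.device = accelerator     # load the current one
        trace.append([c.device for c in components])
        prev = comp
    if prev is not None:
        prev.device = "cpu"           # final offload after the last forward
    return trace
```

The trade-off versus keeping everything resident is one host-to-device transfer per component per pipeline call, in exchange for a much lower peak memory footprint.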
  21. 07 Feb, 2023 1 commit
  22. 17 Jan, 2023 1 commit
  23. 12 Jan, 2023 1 commit
  24. 04 Jan, 2023 1 commit
    • Init for korean docs (#1910) · 75d53cc8
      Chanran Kim authored
      * init for korean docs
      
      * edit build yml file for multi language docs
      
      * edit one more build yml file for multi language docs
      
      * add title for get_frontmatter error
  25. 19 Dec, 2022 1 commit
  26. 16 Dec, 2022 1 commit
    • Docs: recommend xformers (#1724) · acd31781
      Pedro Cuenca authored
      * Fix links to flash attention.
      
      * Add xformers installation instructions.
      
      * Make link to xformers install more prominent.
      
      * Link to xformers install from training docs.
  27. 29 Nov, 2022 1 commit
    • StableDiffusion: Decode latents separately to run larger batches (#1150) · c28d3c82
      Ilmari Heikkinen authored
      
      
      * StableDiffusion: Decode latents separately to run larger batches
      
      * Move VAE sliced decode under enable_vae_sliced_decode and vae.enable_sliced_decode
      
      * Rename sliced_decode to slicing
      
      * fix whitespace
      
      * fix quality check and repository consistency
      
      * VAE slicing tests and documentation
      
      * API doc hooks for VAE slicing
      
      * reformat vae slicing tests
      
      * Skip VAE slicing for one-image batches
      
      * Documentation tweaks for VAE slicing
      Co-authored-by: Ilmari Heikkinen <ilmari@fhtr.org>
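The sliced-decode idea in this commit (decode the latent batch a few images at a time instead of all at once, capping peak decoder memory) can be sketched as follows. `decode_sliced`, `decode_one`, and `slice_size` are hypothetical names for illustration, not the diffusers VAE API.

```python
def decode_sliced(latents, decode_one, slice_size=1):
    """Decode a batch of latents slice by slice so that only `slice_size`
    images are materialized by the decoder at once. Conceptual sketch;
    `decode_one` stands in for a VAE decode call."""
    # A batch that fits in one slice gains nothing from slicing, so decode
    # it directly (mirrors "Skip VAE slicing for one-image batches" above).
    if len(latents) <= slice_size:
        return decode_one(latents)
    images = []
    for i in range(0, len(latents), slice_size):
        # Each call materializes at most `slice_size` decoded images,
        # trading extra decoder launches for a lower memory peak.
        images.extend(decode_one(latents[i:i + slice_size]))
    return images
```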
  28. 02 Nov, 2022 1 commit
    • Up to 2x speedup on GPUs using memory efficient attention (#532) · 98c42134
      MatthieuTPHR authored
      
      
      * 2x speedup using memory efficient attention
      
      * remove einops dependency
      
      * Swap K, M in op instantiation
      
      * Simplify code, remove unnecessary maybe_init call and function, remove unused self.scale parameter
      
      * make xformers a soft dependency
      
      * remove one-liner functions
      
      * change one letter variable to appropriate names
      
      * Remove Env variable dependency, remove MemoryEfficientCrossAttention class and use enable_xformers_memory_efficient_attention method
      
      * Add memory efficient attention toggle to img2img and inpaint pipelines
      
      * Clearer management of xformers' availability
      
      * update optimizations markdown to add info about memory efficient attention
      
      * add benchmarks for TITAN RTX
      
      * More detailed explanation of how the memory-efficient attention benchmarks were run
      
      * Removing autocast from optimization markdown
      
      * import_utils: import torch only if is available
      Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
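The core idea behind memory-efficient attention is that the full N×N score matrix `softmax(QK^T / sqrt(d)) V` never needs to be materialized: it can be computed one query (or query block) at a time. A pure-Python sketch of the row-wise variant, with a numerically stabilized softmax; this illustrates the principle only, not the fused xformers kernel.

```python
import math

def attention_rowwise(Q, K, V):
    """Compute softmax(QK^T / sqrt(d)) V one query row at a time, so only
    a single row of attention scores exists in memory at once. Q, K, V are
    lists of vectors (lists of floats); conceptual sketch only."""
    out = []
    for q in Q:
        d = len(q)
        # One row of scores: O(N) memory instead of the O(N^2) full matrix.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)                       # subtract the max for stability
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        # Weighted sum of value rows, normalized by the softmax denominator.
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) / z
                    for j in range(len(V[0]))])
    return out
```

Fused kernels add tiling over keys/values and on-chip accumulation on top of this, which is where the reported up-to-2x speedup comes from.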
  29. 29 Oct, 2022 1 commit
  30. 27 Oct, 2022 1 commit
  31. 24 Oct, 2022 1 commit
  32. 05 Oct, 2022 2 commits
  33. 04 Oct, 2022 1 commit
  34. 30 Sep, 2022 2 commits
    • [docs] fix table in fp16.mdx (#683) · daa22050
      Nouamane Tazi authored
    • Optimize Stable Diffusion (#371) · 9ebaea54
      Nouamane Tazi authored
      * initial commit
      
      * make UNet stream capturable
      
      * try to fix noise_pred value
      
      * remove cuda graph and keep NB
      
      * non blocking unet with PNDMScheduler
      
      * make timesteps np arrays for pndm scheduler
      because lists don't get formatted to tensors in `self.set_format`
      
      * make max async in pndm
      
      * use channel last format in unet
      
      * avoid moving timesteps device in each unet call
      
      * avoid memcpy op in `get_timestep_embedding`
      
      * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`
      
      * update TODO
      
      * replace `channels_last` kwarg with `memory_format` for more generality
      
      * revert the channels_last changes to leave it for another PR
      
      * remove non_blocking when moving input ids to device
      
      * remove blocking from all .to() operations at beginning of pipeline
      
      * fix merging
      
      * fix merging
      
      * model can run in other precisions without autocast
      
      * attn refactoring
      
      * Revert "attn refactoring"
      
      This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb.
      
      * remove restriction to run conv_norm in fp32
      
      * use `baddbmm` instead of `matmul` in attention for better perf
      
      * removing all reshapes to test perf
      
      * Revert "removing all reshapes to test perf"
      
      This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd.
      
      * add shapes comments
      
      * Hardcode what's needed for jitting
      
      * Revert "Hardcode what's needed for jitting"
      
      This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a.
      
      * Revert "remove restriction to run conv_norm in fp32"
      
      This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9.
      
      * revert using baddmm in attention's forward
      
      * cleanup comment
      
      * remove restriction to run conv_norm in fp32. no quality loss was noticed
      
      This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213.
      
      * add more optimizations techniques to docs
      
      * Revert "add shapes comments"
      
      This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4.
      
      * apply suggestions
      
      * make quality
      
      * apply suggestions
      
      * styling
      
      * `scheduler.timesteps` are now arrays, so we don't need .to()
      
      * remove useless .type()
      
      * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`
      
      * move scheduler timestamps to correct device if tensors
      
      * add device to `set_timesteps` in LMSD scheduler
      
      * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it
      
      * quick fix
      
      * styling
      
      * remove kwargs from schedulers `set_timesteps`
      
      * revert to using max in K-LMS inpaint pipeline test
      
      * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"
      
      This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899.
      
      * move timesteps to correct device before loop in SD pipeline
      
      * apply previous fix to other SD pipelines
      
      * UNet now accepts tensor timesteps even on the wrong device, to avoid errors
      - it shouldn't affect performance if timesteps are already on the correct device
      - it does slow down performance if they're on the wrong device
      
      * fix pipeline when timesteps are arrays with strides
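One of the recurring fixes above is hoisting the timestep transfer out of the denoising loop: move `scheduler.timesteps` to the device once before the loop instead of issuing a memcpy on every step. A minimal sketch of the pattern, with hypothetical `to_device` and `step` callables standing in for the tensor transfer and the scheduler/UNet step:

```python
def denoise(timesteps, to_device, step, latents=0.0):
    """Run a denoising loop with the host-to-device transfer hoisted out:
    one copy for the whole timestep array, not one per step. Conceptual
    sketch; `to_device` and `step` are injected stand-ins."""
    device_timesteps = to_device(timesteps)   # single transfer, before the loop
    for t in device_timesteps:
        latents = step(latents, t)            # per-step work sees device data
    return latents
```

The same reasoning motivates the commit's other loop-level changes, such as avoiding the memcpy in `get_timestep_embedding`.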
  35. 08 Sep, 2022 1 commit
  36. 07 Sep, 2022 1 commit