1. 28 May, 2025 1 commit
  2. 01 May, 2025 1 commit
  3. 16 Sep, 2024 1 commit
  4. 08 Aug, 2024 1 commit
  5. 20 May, 2024 1 commit
  6. 06 May, 2024 1 commit
  7. 25 Feb, 2024 1 commit
  8. 08 Feb, 2024 1 commit
  9. 09 Nov, 2023 1 commit
    • [`Docs`] Fix typos and update files at Optimization Page (#5674) · 53a8439f
      M. Tolga Cangöz authored
      
      
      * Fix typos, update, trim trailing whitespace
      
      * Trim trailing whitespaces
      
      * Update docs/source/en/optimization/memory.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/optimization/memory.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update _toctree.yml
      
      * Update adapt_a_model.md
      
      * Reverse
      
      * Reverse
      
      * Reverse
      
      * Update dreambooth.md
      
      * Update instructpix2pix.md
      
      * Update lora.md
      
      * Update overview.md
      
      * Update t2i_adapters.md
      
      * Update text2image.md
      
      * Update text_inversion.md
      
      * Update create_dataset.md
      
      * Update create_dataset.md
      
      * Update create_dataset.md
      
      * Update create_dataset.md
      
      * Update coreml.md
      
      * Delete docs/source/en/training/create_dataset.md
      
      * Original create_dataset.md
      
      * Update create_dataset.md
      
      * Delete docs/source/en/training/create_dataset.md
      
      * Add original file
      
      * Delete docs/source/en/training/create_dataset.md
      
      * Add original one
      
      * Delete docs/source/en/training/text2image.md
      
      * Delete docs/source/en/training/instructpix2pix.md
      
      * Delete docs/source/en/training/dreambooth.md
      
      * Add original files
      
      ---------
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
  10. 13 Sep, 2023 1 commit
  11. 10 Aug, 2023 2 commits
  12. 26 Jul, 2023 1 commit
  13. 26 May, 2023 1 commit
  14. 15 May, 2023 1 commit
  15. 27 Apr, 2023 1 commit
  16. 28 Mar, 2023 2 commits
  17. 20 Mar, 2023 1 commit
  18. 02 Mar, 2023 1 commit
  19. 01 Mar, 2023 1 commit
  20. 16 Feb, 2023 1 commit
    • `enable_model_cpu_offload` (#2285) · 2777264e
      Pedro Cuenca authored
      * enable_model_offload PoC
      
      It's surprisingly more involved than expected, see comments in the PR.
      
      * Rename final_offload_hook
      
      * Invoke the vae forward hook manually.
      
      * Completely remove decoder.
      
      * Style
      
      * apply_forward_hook decorator
      
      * Rename method.
      
      * Style
      
      * Copy enable_model_cpu_offload
      
      * Fix copies.
      
      * Remove comment.
      
      * Fix copies
      
      * Missing import
      
      * Fix doc-builder style.
      
      * Merge main and fix again.
      
      * Add docs
      
      * Fix docs.
      
      * Add a couple of tests.
      
      * style
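The mechanism this commit adds (each pipeline sub-model is moved to the accelerator only for its own forward pass, evicting its predecessor, so at most one sub-model occupies GPU memory at a time) can be sketched in plain Python. `Component` and `run_with_model_offload` are hypothetical stand-ins for illustration; the real feature installs accelerate hooks on the pipeline's modules.

```python
class Component:
    """Toy stand-in for a pipeline sub-model (text encoder, unet, vae)."""
    def __init__(self, name):
        self.name = name
        self.device = "cpu"

def run_with_model_offload(components, accelerator="cuda"):
    """Model-level CPU offload: load each component onto the accelerator
    just before it runs and return the previous one to CPU first, so GPU
    memory holds one sub-model at a time. Conceptual sketch only."""
    trace = []
    prev = None
    for comp in components:
        if prev is not None:
            prev.device = "cpu"       # evict the previously used component
        comp.device = accelerator     # load the current one
        trace.append([c.device for c in components])
        prev = comp
    if prev is not None:
        prev.device = "cpu"           # final offload after the last forward
    return trace
```

The trade-off versus keeping everything resident is one host-to-device transfer per component per pipeline call, in exchange for a much lower peak memory footprint.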
  21. 07 Feb, 2023 1 commit
  22. 17 Jan, 2023 1 commit
  23. 12 Jan, 2023 1 commit
  24. 04 Jan, 2023 1 commit
    • Init for korean docs (#1910) · 75d53cc8
      Chanran Kim authored
      * init for korean docs
      
      * edit build yml file for multi language docs
      
      * edit one more build yml file for multi language docs
      
      * add title for get_frontmatter error
  25. 19 Dec, 2022 1 commit
  26. 16 Dec, 2022 1 commit
    • Docs: recommend xformers (#1724) · acd31781
      Pedro Cuenca authored
      * Fix links to flash attention.
      
      * Add xformers installation instructions.
      
      * Make link to xformers install more prominent.
      
      * Link to xformers install from training docs.
  27. 29 Nov, 2022 1 commit
    • StableDiffusion: Decode latents separately to run larger batches (#1150) · c28d3c82
      Ilmari Heikkinen authored
      
      
      * StableDiffusion: Decode latents separately to run larger batches
      
      * Move VAE sliced decode under enable_vae_sliced_decode and vae.enable_sliced_decode
      
      * Rename sliced_decode to slicing
      
      * fix whitespace
      
      * fix quality check and repository consistency
      
      * VAE slicing tests and documentation
      
      * API doc hooks for VAE slicing
      
      * reformat vae slicing tests
      
      * Skip VAE slicing for one-image batches
      
      * Documentation tweaks for VAE slicing
      Co-authored-by: Ilmari Heikkinen <ilmari@fhtr.org>
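The sliced-decode idea in this commit (decode the latent batch a few images at a time instead of all at once, capping peak decoder memory) can be sketched as follows. `decode_sliced`, `decode_one`, and `slice_size` are hypothetical names for illustration, not the diffusers VAE API.

```python
def decode_sliced(latents, decode_one, slice_size=1):
    """Decode a batch of latents slice by slice so that only `slice_size`
    images are materialized by the decoder at once. Conceptual sketch;
    `decode_one` stands in for a VAE decode call."""
    # A batch that fits in one slice gains nothing from slicing, so decode
    # it directly (mirrors "Skip VAE slicing for one-image batches" above).
    if len(latents) <= slice_size:
        return decode_one(latents)
    images = []
    for i in range(0, len(latents), slice_size):
        # Each call materializes at most `slice_size` decoded images,
        # trading extra decoder launches for a lower memory peak.
        images.extend(decode_one(latents[i:i + slice_size]))
    return images
```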
  28. 02 Nov, 2022 1 commit
    • Up to 2x speedup on GPUs using memory efficient attention (#532) · 98c42134
      MatthieuTPHR authored
      
      
      * 2x speedup using memory efficient attention
      
      * remove einops dependency
      
      * Swap K, M in op instantiation
      
      * Simplify code, remove unnecessary maybe_init call and function, remove unused self.scale parameter
      
      * make xformers a soft dependency
      
      * remove one-liner functions
      
      * change one letter variable to appropriate names
      
      * Remove Env variable dependency, remove MemoryEfficientCrossAttention class and use enable_xformers_memory_efficient_attention method
      
      * Add memory efficient attention toggle to img2img and inpaint pipelines
      
      * Clearer management of xformers' availability
      
      * update optimizations markdown to add info about memory efficient attention
      
      * add benchmarks for TITAN RTX
      
      * More detailed explanation of how the memory-efficient attention benchmarks were run
      
      * Removing autocast from optimization markdown
      
      * import_utils: import torch only if is available
      Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
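The core idea behind memory-efficient attention is that the full N×N score matrix `softmax(QK^T / sqrt(d)) V` never needs to be materialized: it can be computed one query (or query block) at a time. A pure-Python sketch of the row-wise variant, with a numerically stabilized softmax; this illustrates the principle only, not the fused xformers kernel.

```python
import math

def attention_rowwise(Q, K, V):
    """Compute softmax(QK^T / sqrt(d)) V one query row at a time, so only
    a single row of attention scores exists in memory at once. Q, K, V are
    lists of vectors (lists of floats); conceptual sketch only."""
    out = []
    for q in Q:
        d = len(q)
        # One row of scores: O(N) memory instead of the O(N^2) full matrix.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)                       # subtract the max for stability
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        # Weighted sum of value rows, normalized by the softmax denominator.
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) / z
                    for j in range(len(V[0]))])
    return out
```

Fused kernels add tiling over keys/values and on-chip accumulation on top of this, which is where the reported up-to-2x speedup comes from.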
  29. 29 Oct, 2022 1 commit
  30. 27 Oct, 2022 1 commit
  31. 24 Oct, 2022 1 commit
  32. 05 Oct, 2022 2 commits
  33. 04 Oct, 2022 1 commit
  34. 30 Sep, 2022 2 commits
    • [docs] fix table in fp16.mdx (#683) · daa22050
      Nouamane Tazi authored
    • Optimize Stable Diffusion (#371) · 9ebaea54
      Nouamane Tazi authored
      * initial commit
      
      * make UNet stream capturable
      
      * try to fix noise_pred value
      
      * remove cuda graph and keep NB
      
      * non blocking unet with PNDMScheduler
      
      * make timesteps np arrays for pndm scheduler
      because lists don't get formatted to tensors in `self.set_format`
      
      * make max async in pndm
      
      * use channel last format in unet
      
      * avoid moving timesteps device in each unet call
      
      * avoid memcpy op in `get_timestep_embedding`
      
      * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`
      
      * update TODO
      
      * replace `channels_last` kwarg with `memory_format` for more generality
      
      * revert the channels_last changes to leave it for another PR
      
      * remove non_blocking when moving input ids to device
      
      * remove blocking from all .to() operations at beginning of pipeline
      
      * fix merging
      
      * fix merging
      
      * model can run in other precisions without autocast
      
      * attn refactoring
      
      * Revert "attn refactoring"
      
      This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb.
      
      * remove restriction to run conv_norm in fp32
      
      * use `baddbmm` instead of `matmul` in attention for better perf
      
      * removing all reshapes to test perf
      
      * Revert "removing all reshapes to test perf"
      
      This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd.
      
      * add shapes comments
      
      * Hardcode what's needed for jitting
      
      * Revert "Hardcode what's needed for jitting"
      
      This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a.
      
      * Revert "remove restriction to run conv_norm in fp32"
      
      This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9.
      
      * revert using baddmm in attention's forward
      
      * cleanup comment
      
      * remove restriction to run conv_norm in fp32. no quality loss was noticed
      
      This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213.
      
      * add more optimizations techniques to docs
      
      * Revert "add shapes comments"
      
      This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4.
      
      * apply suggestions
      
      * make quality
      
      * apply suggestions
      
      * styling
      
      * `scheduler.timesteps` are now arrays, so we don't need .to()
      
      * remove useless .type()
      
      * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`
      
      * move scheduler timestamps to correct device if tensors
      
      * add device to `set_timesteps` in LMSD scheduler
      
      * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it
      
      * quick fix
      
      * styling
      
      * remove kwargs from schedulers `set_timesteps`
      
      * revert to using max in K-LMS inpaint pipeline test
      
      * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"
      
      This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899.
      
      * move timesteps to correct device before loop in SD pipeline
      
      * apply previous fix to other SD pipelines
      
      * UNet now accepts tensor timesteps even on the wrong device, to avoid errors
      - it shouldn't affect performance if timesteps are already on the correct device
      - it does slow down performance if they're on the wrong device
      
      * fix pipeline when timesteps are arrays with strides
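One of the recurring fixes above is hoisting the timestep transfer out of the denoising loop: move `scheduler.timesteps` to the device once before the loop instead of issuing a memcpy on every step. A minimal sketch of the pattern, with hypothetical `to_device` and `step` callables standing in for the tensor transfer and the scheduler/UNet step:

```python
def denoise(timesteps, to_device, step, latents=0.0):
    """Run a denoising loop with the host-to-device transfer hoisted out:
    one copy for the whole timestep array, not one per step. Conceptual
    sketch; `to_device` and `step` are injected stand-ins."""
    device_timesteps = to_device(timesteps)   # single transfer, before the loop
    for t in device_timesteps:
        latents = step(latents, t)            # per-step work sees device data
    return latents
```

The same reasoning motivates the commit's other loop-level changes, such as avoiding the memcpy in `get_timestep_embedding`.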
  35. 08 Sep, 2022 1 commit
  36. 07 Sep, 2022 1 commit