Commits · 8a4c3e50bdae402482f49cc72c4f97e46ac083ee · renzhc / diffusers_dcu

27 Dec, 2022 1 commit
- Width was typod as weight (#1800) · 8a4c3e50
  William Held authored Dec 27, 2022
```
* Width was typod as weight

* Run Black
```
20 Dec, 2022 3 commits

Refactor cross attention and allow mechanism to tweak cross attention function (#1639) · 4125756e

Patrick von Platen authored Dec 20, 2022



* first proposal

* rename

* up

* Apply suggestions from code review

* better

* up

* finish

* up

* rename

* correct versatile

* up

* up

* up

* up

* fix

* Apply suggestions from code review

* make style

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add error message
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>


make style · e29dc972
Patrick von Platen authored Dec 20, 2022

Only test for xformers when enabling them #1773 (#1776) · 8e4733b3
Ilmari Heikkinen authored Dec 20, 2022
```
* only check for xformers when xformers are enabled

* only test for xformers when enabling them
```

19 Dec, 2022 3 commits
- [Versatile] fix attention mask (#1763) · b267d285
  Patrick von Platen authored Dec 19, 2022
- Support attn2==None for xformers (#1759) · 32a5d70c
  Anton Lozhkov authored Dec 19, 2022
- Add attention mask to unCLIP (#1756) · 429e5449
  Patrick von Platen authored Dec 19, 2022
```
* Remove bogus file

* [Unclip] Add efficient attention

* [Unclip] Add efficient attention
```
18 Dec, 2022 1 commit

kakaobrain unCLIP (#1428) · 2dcf64b7

Will Berman authored Dec 18, 2022



* [wip] attention block updates

* [wip] unCLIP unet decoder and super res

* [wip] unCLIP prior transformer

* [wip] scheduler changes

* [wip] text proj utility class

* [wip] UnCLIPPipeline

* [wip] kakaobrain unCLIP convert script

* [unCLIP pipeline] fixes re: @patrickvonplaten

remove callbacks

move denoising loops into call function

* UNCLIPScheduler re: @patrickvonplaten

Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
DDPM scheduler with changes to support karlo

* mask -> attention_mask re: @patrickvonplaten

* [DDPMScheduler] remove leftover change

* [docs] PriorTransformer

* [docs] UNet2DConditionModel and UNet2DModel

* [nit] UNCLIPScheduler -> UnCLIPScheduler

matches existing unclip naming better

* [docs] SchedulingUnCLIP

* [docs] UnCLIPTextProjModel

* refactor

* finish licenses

* rename all to attention_mask and prep in models

* more renaming

* don't expose unused configs

* final renaming fixes

* remove x attn mask when not necessary

* configure kakao script to use new class embedding config

* fix copies

* [tests] UnCLIPScheduler

* finish x attn

* finish

* remove more

* rename condition blocks

* clean more

* Apply suggestions from code review

* up

* fix

* [tests] UnCLIPPipelineFastTests

* remove unused imports

* [tests] UnCLIPPipelineIntegrationTests

* correct

* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


09 Dec, 2022 1 commit
- do not automatically enable xformers (#1640) · 6b68afd8
  Patrick von Platen authored Dec 09, 2022
```
* do not automatically enable xformers

* uP
```
07 Dec, 2022 3 commits

fix upcast in slice attention (#1591) · ced7c960
Suraj Patil authored Dec 07, 2022
```
* fix upcast in slice attention

* fix dtype

* add test

* fix test
```
[UNet2DConditionModel] add an option to upcast attention to fp32 (#1590) · 170ebd28
Suraj Patil authored Dec 07, 2022
```
upcast attention
```

Add paint by example (#1533) · 896c98a2

Patrick von Platen authored Dec 07, 2022



* add paint by example

* make loading possible

* up

* Update src/diffusers/models/attention.py

* up

* finalize weight structure

* make example work

* make it work

* up

* up

* fix

* del

* add

* update

* Apply suggestions from code review

* correct transformer 2d

* finish

* up

* up

* up

* up

* fix

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Apply suggestions from code review

* up

* finish
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>


05 Dec, 2022 1 commit

[refactor] make set_attention_slice recursive (#1532) · bce65cd1

Suraj Patil authored Dec 05, 2022



* make attn slice recursive

* remove set_attention_slice from blocks

* fix copies

* make enable_attention_slicing base class method of DiffusionPipeline

* fix set_attention_slice

* fix set_attention_slice

* fix copies

* add tests

* up

* up

* up

* update

* up

* uP
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
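
A minimal sketch of the recursive pattern this refactor describes: rather than each block forwarding `set_attention_slice` to its children by hand, one helper walks the module tree and applies the slice size wherever a module exposes that method. The classes `Module` and `Attention` and the helper `set_attention_slice_recursive` are hypothetical stand-ins for illustration, not the diffusers implementation.

```python
class Module:
    """Hypothetical stand-in for a container module with child modules."""

    def __init__(self, *children):
        self.children = list(children)


class Attention(Module):
    """Hypothetical attention layer that stores its slice size."""

    def __init__(self):
        super().__init__()
        self.slice_size = None

    def set_attention_slice(self, slice_size):
        self.slice_size = slice_size


def set_attention_slice_recursive(module, slice_size):
    """Apply slice_size to every descendant exposing set_attention_slice."""
    if hasattr(module, "set_attention_slice"):
        module.set_attention_slice(slice_size)
    for child in module.children:
        set_attention_slice_recursive(child, slice_size)


# A small tree: attention layers at different depths inside plain blocks.
inner = Attention()
outer = Attention()
unet = Module(Module(inner), outer)
set_attention_slice_recursive(unet, 2)
print(inner.slice_size, outer.slice_size)
```

With this shape, intermediate blocks need no slicing-specific code, which is consistent with the "remove set_attention_slice from blocks" step listed above.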


03 Dec, 2022 1 commit

Add xformers attention to VAE (#1507) · daebee09

Ilmari Heikkinen authored Dec 03, 2022



* Add xformers attention to VAE

* Simplify VAE xformers code

* Update src/diffusers/models/attention.py
Co-authored-by: Ilmari Heikkinen <ilmari@fhtr.org>
Co-authored-by: Suraj Patil <surajp815@gmail.com>


02 Dec, 2022 1 commit

[refactor] Making the xformers mem-efficient attention activation recursive (#1493) · a816a87a

Benjamin Lefaudeux authored Dec 02, 2022



* Moving the mem efficient attention activation to the top + recursive

* black, too bad there's no pre-commit?
Co-authored-by: Benjamin Lefaudeux <benjamin@photoroom.com>


01 Dec, 2022 1 commit
- simplify AttentionBlock (#1492) · 2bbf8b67
  Suraj Patil authored Dec 01, 2022
25 Nov, 2022 1 commit

[MPS] call contiguous after permute (#1411) · babfb8a0

Kashif Rasul authored Nov 25, 2022

* call contiguous after permute

Fixes for MPS device

* Fix MPS UserWarning

* make style

* Revert "Fix MPS UserWarning"

This reverts commit b46c32810ee5fdc4c16a8e9224a826490b66cf49.


24 Nov, 2022 1 commit

Adapt UNet2D for super-resolution (#1385) · cecdd8bd

Suraj Patil authored Nov 24, 2022

* allow disabling self attention

* add class_embedding

* fix copies

* fix condition

* fix copies

* do_self_attention -> only_cross_attention

* fix copies

* num_classes -> num_class_embeds

* fix default value


23 Nov, 2022 3 commits

[Transformer2DModel] don't norm twice (#1381) · 15241225
Suraj Patil authored Nov 24, 2022
```
don't norm twice
```

update unet2d (#1376) · f07a16e0

Suraj Patil authored Nov 23, 2022

* boom boom

* remove duplicate arg

* add use_linear_proj arg

* fix copies

* style

* add fast tests

* use_linear_proj -> use_linear_projection


[Versatile Diffusion] Add versatile diffusion model (#1283) · 2625fb59

Patrick von Platen authored Nov 23, 2022



* up

* convert dual unet

* revert dual attn

* adapt for vd-official

* test the full pipeline

* mixed inference

* mixed inference for text2img

* add image prompting

* fix clip norm

* split text2img and img2img

* fix format

* refactor text2img

* mega pipeline

* add optimus

* refactor image var

* wip text_unet

* text unet end to end

* update tests

* reshape

* fix image to text

* add some first docs

* dual guided pipeline

* fix token ratio

* propose change

* dual transformer as a native module

* DualTransformer(nn.Module)

* DualTransformer(nn.Module)

* correct unconditional image

* save-load with mega pipeline

* remove image to text

* up

* uP

* fix

* up

* final fix

* remove_unused_weights

* test updates

* save progress

* uP

* fix dual prompts

* some fixes

* finish

* style

* finish renaming

* up

* fix

* fix

* fix

* finish
Co-authored-by: anton-l <anton@huggingface.co>


22 Nov, 2022 1 commit

use memory_efficient_attention by default (#1354) · 2d6d4edb

Suraj Patil authored Nov 22, 2022



* use memory_efficient_attention by default

* Update src/diffusers/models/attention.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>


21 Nov, 2022 1 commit
- perf: prefer batched matmuls for attention (#1203) · ad935933
  Birch-san authored Nov 21, 2022
```
perf: prefer batched matmuls for attention. added fast-path to Decoder when num_heads=1
```
14 Nov, 2022 1 commit

Edited attention.py for older xformers (#1270) · 33d7e89c

Lime-Cakes authored Nov 14, 2022

Older versions of xformers require query, key, and value to be contiguous; this change calls `.contiguous()` on q/k/v before passing them to xformers.


08 Nov, 2022 1 commit
- handle dtype xformers attention (#1196) · 5786b0e2
  Suraj Patil authored Nov 08, 2022
```
handle dtype xformers
```
03 Nov, 2022 1 commit

VQ-diffusion (#658) · ef2ea33c

Will Berman authored Nov 03, 2022



* Changes for VQ-diffusion VQVAE

Add an option to specify the embedding dimension of VQModel:
`VQModel` will by default set the dimension of embeddings to the number
of latent channels. The VQ-diffusion VQVAE has a smaller
embedding dimension, 128, than its number of latent channels, 256.

Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
unet block helpers. VQ-diffusion's VQVAE uses those two block types.

* Changes for VQ-diffusion transformer

Modify attention.py so SpatialTransformer can be used for
VQ-diffusion's transformer.

SpatialTransformer:
- Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
- `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
- modified forward pass to take optional timestep embeddings

ImagePositionalEmbeddings:
- added to provide positional embeddings to discrete inputs for latent pixels

BasicTransformerBlock:
- norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
- modified forward pass to take optional timestep embeddings

CrossAttention:
- now may optionally take a bias parameter for its query, key, and value linear layers

FeedForward:
- Internal layers are now configurable

ApproximateGELU:
- Activation function in VQ-diffusion's feedforward layer

AdaLayerNorm:
- Norm layer modified to incorporate timestep embeddings

* Add VQ-diffusion scheduler

* Add VQ-diffusion pipeline

* Add VQ-diffusion convert script to diffusers

* Add VQ-diffusion dummy objects

* Add VQ-diffusion markdown docs

* Add VQ-diffusion tests

* some renaming

* some fixes

* more renaming

* correct

* fix typo

* correct weights

* finalize

* fix tests

* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* finish

* finish

* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
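
The `ApproximateGELU` entry above names the activation but not its formula. As a hedged illustration, the sketch below assumes the common sigmoid approximation of GELU, `x * sigmoid(1.702 * x)`, and compares it with the exact definition; the function names are illustrative and the commit message itself does not confirm which approximation is used.

```python
import math


def approximate_gelu(x: float) -> float:
    """Sigmoid approximation of GELU (assumed form): x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


def exact_gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Print both values side by side for a few sample inputs.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  approx={approximate_gelu(x):+.4f}  exact={exact_gelu(x):+.4f}")
```

The approximation replaces the error function with a single sigmoid, which is cheaper to evaluate while staying close to the exact curve on typical activations.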


02 Nov, 2022 2 commits

Fix a small typo of a variable name (#1063) · 1216a3b1
Omiita authored Nov 02, 2022
```
Fix a small typo

fix a typo in `models/attention.py`.
weight -> width
```

Up to 2x speedup on GPUs using memory efficient attention (#532) · 98c42134

MatthieuTPHR authored Nov 02, 2022



* 2x speedup using memory efficient attention

* remove einops dependency

* Swap K, M in op instantiation

* Simplify code, remove unnecessary maybe_init call and function, remove unused self.scale parameter

* make xformers a soft dependency

* remove one-liner functions

* change one letter variable to appropriate names

* Remove Env variable dependency, remove MemoryEfficientCrossAttention class and use enable_xformers_memory_efficient_attention method

* Add memory efficient attention toggle to img2img and inpaint pipelines

* Clearer management of xformers' availability

* update optimizations markdown to add info about memory efficient attention

* add benchmarks for TITAN RTX

* More detailed explanation of how the mem eff benchmarks were run

* Removing autocast from optimization markdown

* import_utils: import torch only if it is available
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
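
The "make xformers a soft dependency" step above can be sketched with the standard library alone: check whether the optional package is importable, and raise only when the feature that needs it is actually requested. The helpers `is_package_available` and `select_attention_backend` are hypothetical names for this sketch and are not claimed to match the functions in diffusers.

```python
import importlib.util


def is_package_available(name: str) -> bool:
    """Return True if the top-level package `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def select_attention_backend(use_memory_efficient: bool) -> str:
    """Pick a backend name; fail only if the optional backend is requested but missing."""
    if not use_memory_efficient:
        return "default"
    if not is_package_available("xformers"):
        raise ModuleNotFoundError(
            "Memory efficient attention was requested but xformers is not installed."
        )
    return "xformers"


# The default path never touches the optional dependency.
print(select_attention_backend(use_memory_efficient=False))
```

Deferring the availability check to the point where the feature is enabled means users without the optional package are unaffected.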


31 Oct, 2022 1 commit
- Remove nn sequential (#1086) · 888468dd
  Patrick von Platen authored Oct 31, 2022
```
* Remove nn sequential

* up
```
29 Oct, 2022 1 commit

Experimental: allow fp16 in `mps` (#961) · 95414bd6

Pedro Cuenca authored Oct 29, 2022

* Docs: refer to pre-RC version of PyTorch 1.13.0.

* Remove temporary workaround for unavailable op.

* Update comment to make it less ambiguous.

* Remove use of contiguous in mps.

It appears to no longer be necessary.

* Special case: use einsum for much better performance in mps

* Update mps docs.

* MPS: make pipeline work in half precision.


25 Oct, 2022 1 commit

mps changes for PyTorch 1.13 (#926) · 3d02c921

Pedro Cuenca authored Oct 25, 2022



* Docs: refer to pre-RC version of PyTorch 1.13.0.

* Remove temporary workaround for unavailable op.

* Update comment to make it less ambiguous.

* Remove use of contiguous in mps.

It appears to no longer be necessary.

* Special case: use einsum for much better performance in mps

* Update mps docs.

* Minor doc update.

* Accept suggestion
Co-authored-by: Anton Lozhkov <anton@huggingface.co>


12 Oct, 2022 1 commit
- add or fix license formatting in models directory (#808) · 5afc2b60
  Nathan Lambert authored Oct 12, 2022
```
* add or fix license formatting

* fix quality
```
30 Sep, 2022 2 commits

Fix slow tests (#689) · b2cfc7a0

Nouamane Tazi authored Sep 30, 2022

* revert using baddbmm in attention
- to fix `test_stable_diffusion_memory_chunking` test

* styling


Optimize Stable Diffusion (#371) · 9ebaea54

Nouamane Tazi authored Sep 30, 2022

* initial commit

* make UNet stream capturable

* try to fix noise_pred value

* remove cuda graph and keep NB

* non blocking unet with PNDMScheduler

* make timesteps np arrays for pndm scheduler
because lists don't get formatted to tensors in `self.set_format`

* make max async in pndm

* use channel last format in unet

* avoid moving timesteps device in each unet call

* avoid memcpy op in `get_timestep_embedding`

* add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`

* update TODO

* replace `channels_last` kwarg with `memory_format` for more generality

* revert the channels_last changes to leave it for another PR

* remove non_blocking when moving input ids to device

* remove blocking from all .to() operations at beginning of pipeline

* fix merging

* fix merging

* model can run in other precisions without autocast

* attn refactoring

* Revert "attn refactoring"

This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb.

* remove restriction to run conv_norm in fp32

* use `baddbmm` instead of `matmul` in attention for better perf

* removing all reshapes to test perf

* Revert "removing all reshapes to test perf"

This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd.

* add shapes comments

* hardcore whats needed for jitting

* Revert "hardcore whats needed for jitting"

This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a.

* Revert "remove restriction to run conv_norm in fp32"

This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9.

* revert using `baddbmm` in attention's forward

* cleanup comment

* remove restriction to run conv_norm in fp32. no quality loss was noticed

This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213.

* add more optimizations techniques to docs

* Revert "add shapes comments"

This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4.

* apply suggestions

* make quality

* apply suggestions

* styling

* `scheduler.timesteps` are now arrays so we don't need .to()

* remove useless .type()

* use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`

* move scheduler timesteps to correct device if tensors

* add device to `set_timesteps` in LMSD scheduler

* `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it

* quick fix

* styling

* remove kwargs from schedulers `set_timesteps`

* revert to using max in K-LMS inpaint pipeline test

* Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"

This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899.

* move timesteps to correct device before loop in SD pipeline

* apply previous fix to other SD pipelines

* UNet now accepts tensor timesteps even on the wrong device, to avoid errors
- it shouldn't affect performance if timesteps are already on the correct device
- it does slow down performance if they're on the wrong device

* fix pipeline when timesteps are arrays with strides


27 Sep, 2022 1 commit

Fix `SpatialTransformer` (#578) · d886e497

Yih-Dar authored Sep 27, 2022



* Fix SpatialTransformer

* Fix SpatialTransformer
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


19 Sep, 2022 3 commits
- Fix `CrossAttention._sliced_attention` (#563) · 84616b5d
  Yih-Dar authored Sep 19, 2022
```
* Fix CrossAttention._sliced_attention
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- revert the accidental commit · 0424615a
  ydshieh authored Sep 19, 2022
- Fix CrossAttention._sliced_attention · 8187865a
  ydshieh authored Sep 19, 2022
15 Sep, 2022 1 commit

[UNet2DConditionModel, UNet2DModel] pass norm_num_groups to all the blocks (#442) · d144c46a

Suraj Patil authored Sep 15, 2022

* pass norm_num_groups to unet blocks and attention

* fix UNet2DConditionModel

* add norm_num_groups arg in vae

* add tests

* remove comment

* Apply suggestions from code review


14 Sep, 2022 1 commit

[CrossAttention] add different method for sliced attention (#446) · 8b450969

Suraj Patil authored Sep 14, 2022



* add different method for sliced attention

* Update src/diffusers/models/attention.py

* Apply suggestions from code review

* Update src/diffusers/models/attention.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
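
For readers unfamiliar with the term, here is a rough pure-Python sketch of sliced attention, assuming the slicing runs over the combined batch-and-head dimension: the entries are processed `slice_size` at a time so that only one slice's attention scores need to exist at once. All function names are illustrative, and this is not the code from the commit.

```python
import math


def attention(q, k, v):
    """Scaled dot-product attention for one head; q, k, v are lists of row vectors."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]  # numerically stable softmax
        total = sum(weights)
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v)) / total
            for j in range(len(v[0]))
        ])
    return out


def full_attention(qs, ks, vs):
    """Process every (batch * head) entry in one pass."""
    return [attention(q, k, v) for q, k, v in zip(qs, ks, vs)]


def sliced_attention(qs, ks, vs, slice_size):
    """Process slice_size entries at a time, writing into a preallocated output."""
    out = [None] * len(qs)
    for start in range(0, len(qs), slice_size):
        end = min(start + slice_size, len(qs))
        out[start:end] = full_attention(qs[start:end], ks[start:end], vs[start:end])
    return out


# Four heads, three tokens each, head dimension two.
heads = [[[float(h + i + j) for j in range(2)] for i in range(3)] for h in range(4)]
sliced = sliced_attention(heads, heads, heads, slice_size=2)
print(sliced == full_attention(heads, heads, heads))
```

In a tensor library the saving comes from never materializing the score matrix for all heads at once; the result is the same as the unsliced computation, at the cost of more sequential steps.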
