Commits · dc7cd893fdee3906b3223a623b0f6d884a2df7c4 · renzhc / diffusers_dcu

18 Dec, 2022 1 commit

Will Berman authored Dec 18, 2022



* [wip] attention block updates

* [wip] unCLIP unet decoder and super res

* [wip] unCLIP prior transformer

* [wip] scheduler changes

* [wip] text proj utility class

* [wip] UnCLIPPipeline

* [wip] kakaobrain unCLIP convert script

* [unCLIP pipeline] fixes re: @patrickvonplaten

remove callbacks

move denoising loops into call function

* UNCLIPScheduler re: @patrickvonplaten

Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
DDPM scheduler with changes to support karlo

* mask -> attention_mask re: @patrickvonplaten

* [DDPMScheduler] remove leftover change

* [docs] PriorTransformer

* [docs] UNet2DConditionModel and UNet2DModel

* [nit] UNCLIPScheduler -> UnCLIPScheduler

matches existing unclip naming better

* [docs] SchedulingUnCLIP

* [docs] UnCLIPTextProjModel

* refactor

* finish licenses

* rename all to attention_mask and prep in models

* more renaming

* don't expose unused configs

* final renaming fixes

* remove x attn mask when not necessary

* configure kakao script to use new class embedding config

* fix copies

* [tests] UnCLIPScheduler

* finish x attn

* finish

* remove more

* rename condition blocks

* clean more

* Apply suggestions from code review

* up

* fix

* [tests] UnCLIPPipelineFastTests

* remove unused imports

* [tests] UnCLIPPipelineIntegrationTests

* correct

* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

2dcf64b7

13 Dec, 2022 1 commit

Make sure all pipelines can run with batched input (#1669) · b345c74d

Patrick von Platen authored Dec 13, 2022



* [SD] Make sure batched input works correctly

* uP

* uP

* up

* up

* uP

* up

* fix mask stuff

* up

* uP

* more up

* up

* uP

* up

* finish

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

b345c74d

09 Dec, 2022 1 commit
- do not automatically enable xformers (#1640) · 6b68afd8
  Patrick von Platen authored Dec 09, 2022
```
* do not automatically enable xformers

* uP
```
  6b68afd8
07 Dec, 2022 4 commits

Make cross-attention check more robust (#1560) · 5e036921
Pedro Cuenca authored Dec 07, 2022
```
* Make cross-attention check more robust.

* Fix copies.
```
5e036921
fix upcast in slice attention (#1591) · ced7c960
Suraj Patil authored Dec 07, 2022
```
* fix upcast in slice attention

* fix dtype

* add test

* fix test
```
ced7c960
[UNet2DConditionModel] add an option to upcast attention to fp32 (#1590) · 170ebd28
Suraj Patil authored Dec 07, 2022
```
upcast attention
```
170ebd28

Add paint by example (#1533) · 896c98a2

Patrick von Platen authored Dec 07, 2022



* add paint by example

* make loading possible

* up

* Update src/diffusers/models/attention.py

* up

* finalize weight structure

* make example work

* make it work

* up

* up

* fix

* del

* add

* update

* Apply suggestions from code review

* correct transformer 2d

* finish

* up

* up

* up

* up

* fix

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Apply suggestions from code review

* up

* finish
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

896c98a2

05 Dec, 2022 3 commits

[Docs] Correct docs (#1554) · 62b497c4
Patrick von Platen authored Dec 05, 2022

62b497c4

[refactor] make set_attention_slice recursive (#1532) · bce65cd1

Suraj Patil authored Dec 05, 2022



* make attn slice recursive

* remove set_attention_slice from blocks

* fix copies

* make enable_attention_slicing base class method of DiffusionPipeline

* fix set_attention_slice

* fix set_attention_slice

* fix copies

* add tests

* up

* up

* up

* update

* up

* uP
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

bce65cd1
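The recursive approach named in the commit above can be pictured as a walk over a module tree. The sketch below is illustrative only and does not reproduce the diffusers code: `Module`, `AttentionLayer`, and `set_attention_slice_recursive` are hypothetical stand-ins, and the real implementation presumably operates on `torch.nn.Module` children.

```python
class Module:
    """Hypothetical stand-in for a container module with child modules."""

    def __init__(self, *children):
        self.children = list(children)


class AttentionLayer(Module):
    """Hypothetical leaf that knows how to slice its attention computation."""

    def __init__(self):
        super().__init__()
        self.slice_size = None

    def set_attention_slice(self, slice_size):
        self.slice_size = slice_size


def set_attention_slice_recursive(module, slice_size):
    """Apply slice_size to every descendant that exposes set_attention_slice."""
    if hasattr(module, "set_attention_slice"):
        module.set_attention_slice(slice_size)
    for child in module.children:
        set_attention_slice_recursive(child, slice_size)


# Usage: two attention layers nested at different depths of the tree.
attn_a, attn_b = AttentionLayer(), AttentionLayer()
model = Module(Module(attn_a), Module(Module(attn_b)))
set_attention_slice_recursive(model, 2)
```

With a walk like this, intermediate blocks no longer need their own forwarding method, which matches the "remove set_attention_slice from blocks" step in the commit message.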

Compute embedding distances with torch.cdist (#1459) · 720dbfc9
Benjamin Lefaudeux authored Dec 05, 2022
```
small but mighty
```
720dbfc9

03 Dec, 2022 1 commit

Add xformers attention to VAE (#1507) · daebee09

Ilmari Heikkinen authored Dec 03, 2022



* Add xformers attention to VAE

* Simplify VAE xformers code

* Update src/diffusers/models/attention.py
Co-authored-by: Ilmari Heikkinen <ilmari@fhtr.org>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

daebee09

02 Dec, 2022 3 commits

fix tests · cf4664e8
Patrick von Platen authored Dec 02, 2022

cf4664e8

Do not use torch.long in mps (#1488) · 3ceaa280

Pedro Cuenca authored Dec 02, 2022



* Do not use torch.long in mps

Addresses #1056.

* Use torch.int instead of float.

* Propagate changes.

* Do not silently change float -> int.

* Propagate changes.

* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <anton@huggingface.co>

3ceaa280

[refactor] Making the xformers mem-efficient attention activation recursive (#1493) · a816a87a

Benjamin Lefaudeux authored Dec 02, 2022



* Moving the mem efficient attention activation to the top + recursive

* black, too bad there's no pre-commit?
Co-authored-by: Benjamin Lefaudeux <benjamin@photoroom.com>

a816a87a

01 Dec, 2022 2 commits
- Fix Flax flip_sin_to_cos (#1369) · a6a25ceb
  Akash Gokul authored Dec 01, 2022
```
* Fix Flax flip_sin_to_cos

* Adding flip_sin_to_cos
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
```
  a6a25ceb
- simplify AttentionBlock (#1492) · 2bbf8b67
  Suraj Patil authored Dec 01, 2022
  
  2bbf8b67
29 Nov, 2022 2 commits

StableDiffusion: Decode latents separately to run larger batches (#1150) · c28d3c82

Ilmari Heikkinen authored Nov 29, 2022



* StableDiffusion: Decode latents separately to run larger batches

* Move VAE sliced decode under enable_vae_sliced_decode and vae.enable_sliced_decode

* Rename sliced_decode to slicing

* fix whitespace

* fix quality check and repository consistency

* VAE slicing tests and documentation

* API doc hooks for VAE slicing

* reformat vae slicing tests

* Skip VAE slicing for one-image batches

* Documentation tweaks for VAE slicing
Co-authored-by: Ilmari Heikkinen <ilmari@fhtr.org>

c28d3c82
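The idea behind the commit above, decoding latents separately so larger batches fit in memory, can be sketched as follows. This is a minimal illustration under stated assumptions, not the diffusers implementation: `decode_in_slices` and `toy_decode` are hypothetical names, and the latents are plain numbers rather than tensors.

```python
def decode_in_slices(latents, decode_fn, use_slicing=True):
    """Decode a batch one latent at a time to bound peak memory.

    decode_fn is a hypothetical decoder that maps a list of latents to a
    list of images of the same length.
    """
    # One-image batches gain nothing from slicing, so decode them directly.
    if not use_slicing or len(latents) <= 1:
        return decode_fn(latents)
    images = []
    for latent in latents:
        images.extend(decode_fn([latent]))
    return images


# Usage with a toy decoder that records the batch sizes it receives.
seen_batch_sizes = []


def toy_decode(batch):
    seen_batch_sizes.append(len(batch))
    return [value * 2 for value in batch]


result = decode_in_slices([1, 2, 3], toy_decode)
```

The trade-off is more decoder calls in exchange for a smaller peak working set per call.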

Flax support for Stable Diffusion 2 (#1423) · 4d1e4e24

Pedro Cuenca authored Nov 29, 2022



* Flax: start adapting to Stable Diffusion 2

* More changes.

* attention_head_dim can be a tuple.

* Fix typos

* Add simple SD 2 integration test.

Slice values taken from my Ampere GPU.

* Add simple UNet integration tests for Flax.

Note that the expected values are taken from the PyTorch results. This
ensures the Flax and PyTorch versions are not too far off.

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Typos and style

* Tests: verify jax is available.

* Style

* Make flake happy

* Remove typo.

* Simple Flax SD 2 pipeline tests.

* Import order

* Remove unused import.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: @camenduru

4d1e4e24

25 Nov, 2022 2 commits

Allow to set config params directly in init (#1419) · 8faa822d
Patrick von Platen authored Nov 25, 2022
```
* fix

* fix deprecated kwargs logic

* add tests

* finish
```
8faa822d

[MPS] call contiguous after permute (#1411) · babfb8a0

Kashif Rasul authored Nov 25, 2022

* call contiguous after permute

Fixes for MPS device

* Fix MPS UserWarning

* make style

* Revert "Fix MPS UserWarning"

This reverts commit b46c32810ee5fdc4c16a8e9224a826490b66cf49.

babfb8a0

24 Nov, 2022 2 commits

Support SD2 attention slicing (#1397) · d50e3217

Anton Lozhkov authored Nov 24, 2022

* Support SD2 attention slicing

* Support SD2 attention slicing

* Add more copies

* Use attn_num_head_channels in blocks

* fix-copies

* Update tests

* fix imports

d50e3217

Adapt UNet2D for super-resolution (#1385) · cecdd8bd

Suraj Patil authored Nov 24, 2022

* allow disabling self attention

* add class_embedding

* fix copies

* fix condition

* fix copies

* do_self_attention -> only_cross_attention

* fix copies

* num_classes -> num_class_embeds

* fix default value

cecdd8bd

23 Nov, 2022 5 commits

[Transformer2DModel] don't norm twice (#1381) · 15241225
Suraj Patil authored Nov 24, 2022
```
don't norm twice
```
15241225

update unet2d (#1376) · f07a16e0

Suraj Patil authored Nov 23, 2022

* boom boom

* remove duplicate arg

* add use_linear_proj arg

* fix copies

* style

* add fast tests

* use_linear_proj -> use_linear_projection

f07a16e0

[Versatile Diffusion] Add versatile diffusion model (#1283) · 2625fb59

Patrick von Platen authored Nov 23, 2022



* up

* convert dual unet

* revert dual attn

* adapt for vd-official

* test the full pipeline

* mixed inference

* mixed inference for text2img

* add image prompting

* fix clip norm

* split text2img and img2img

* fix format

* refactor text2img

* mega pipeline

* add optimus

* refactor image var

* wip text_unet

* text unet end to end

* update tests

* reshape

* fix image to text

* add some first docs

* dual guided pipeline

* fix token ratio

* propose change

* dual transformer as a native module

* DualTransformer(nn.Module)

* DualTransformer(nn.Module)

* correct unconditional image

* save-load with mega pipeline

* remove image to text

* up

* uP

* fix

* up

* final fix

* remove_unused_weights

* test updates

* save progress

* uP

* fix dual prompts

* some fixes

* finish

* style

* finish renaming

* up

* fix

* fix

* fix

* finish
Co-authored-by: anton-l <anton@huggingface.co>

2625fb59

handle fp16 in `UNet2DModel` (#1216) · 9e234d80

Suraj Patil authored Nov 23, 2022



* make sure fp16 runs well

* add fp16 test for super res

* Update src/diffusers/models/unet_2d.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* gen on cuda

* always run fast inference test on cpu

* run on cpu
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

9e234d80

Fix using non-square images with UNet2DModel and DDIM/DDPM pipelines (#1289) · 8fd3a743

Penn authored Nov 23, 2022



* fix non square images with UNet2DModel and DDIM/DDPM pipelines

* fix unet_2d `sample_size` docstring

* update pipeline tests for unet uncond
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

8fd3a743

22 Nov, 2022 1 commit

use memory_efficient_attention by default (#1354) · 2d6d4edb

Suraj Patil authored Nov 22, 2022



* use memory_efficient_attention by default

* Update src/diffusers/models/attention.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

2d6d4edb

21 Nov, 2022 1 commit
- perf: prefer batched matmuls for attention (#1203) · ad935933
  Birch-san authored Nov 21, 2022
```
perf: prefer batched matmuls for attention. added fast-path to Decoder when num_heads=1
```
  ad935933
16 Nov, 2022 1 commit
- doc string args shape fix (#1243) · aa5c4c26
  Kamal Raj authored Nov 16, 2022
```
* doc string args shape fix

* fix styling
```
  aa5c4c26
14 Nov, 2022 3 commits

Fix documentation typo for `UNet2DModel` and `UNet2DConditionModel` (#1275) · 57525bb4
Joshua Lochner authored Nov 14, 2022
```
* Fix documentation typo

* Fix other typo
```
57525bb4

Add UNet 1d for RL model for planning + colab (#105) · 7c5fef81

Nathan Lambert authored Nov 14, 2022



* re-add RL model code

* match model forward api

* add register_to_config, pass training tests

* fix tests, update forward outputs

* remove unused code, some comments

* add to docs

* remove extra embedding code

* unify time embedding

* remove conv1d output sequential

* remove sequential from conv1dblock

* style and deleting duplicated code

* clean files

* remove unused variables

* clean variables

* add 1d resnet block structure for downsample

* rename as unet1d

* fix renaming

* rename files

* add get_block(...) api

* unify args for model1d like model2d

* minor cleaning

* fix docs

* improve 1d resnet blocks

* fix tests, remove permutes

* fix style

* add output activation

* rename flax blocks file

* Add Value Function and corresponding example script to Diffuser implementation (#884)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review
Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* update post merge of scripts

* add midblock / outblock architecture

* Pipeline cleanup (#947)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

* clean up comments

* convert older script to using pipeline and add readme

* rename scripts

* style, update tests

* delete unet rl model file

* remove imports in src
Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* Update src/diffusers/models/unet_1d_blocks.py

* Update tests/test_models_unet.py

* RL Cleanup v2 (#965)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

* clean up comments

* convert older script to using pipeline and add readme

* rename scripts

* style, update tests

* delete unet rl model file

* remove imports in src

* add specific vf block and update tests

* style

* Update tests/test_models_unet.py
Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* fix quality in tests

* fix quality style, split test file

* fix checks / tests

* make timesteps closer to main

* unify block API

* unify forward api

* delete lines in examples

* style

* examples style

* all tests pass

* make style

* make dance_diff test pass

* Refactoring RL PR (#1200)

* init file changes

* add import utils

* finish cleaning files, imports

* remove import flags

* clean examples

* fix imports, tests for merge

* update readmes

* hotfix for tests

* quality

* fix some tests

* change defaults

* more mps test fixes

* unet1d defaults

* do not default import experimental

* defaults for tests

* fix tests

* fix-copies

* fix

* changes per Patrick's comments (#1285)

* changes per Patrick's comments

* update conversion script

* fix renaming

* skip more mps tests

* last test fix

* Update examples/rl/README.md
Co-authored-by: Ben Glickenhaus <benglickenhaus@gmail.com>

7c5fef81

Edited attention.py for older xformers (#1270) · 33d7e89c

Lime-Cakes authored Nov 14, 2022

Older versions of xformers require query, key, and value to be contiguous; this calls .contiguous() on q/k/v before passing them to xformers.

33d7e89c

08 Nov, 2022 1 commit
- handle dtype xformers attention (#1196) · 5786b0e2
  Suraj Patil authored Nov 08, 2022
```
handle dtype xformers
```
  5786b0e2
05 Nov, 2022 1 commit

Flax: Flip sin to cos in time embeddings (#1149) · 08a6dc8a

Pedro Cuenca authored Nov 05, 2022

Flip sin to cos in t embeddings.

This was assumed in the previous implementation, but now the default is
the opposite.

Fixes #1145.

08a6dc8a
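What "flip sin to cos" changes can be seen in a small pure-Python sketch of a sinusoidal timestep embedding. The function name `timestep_embedding`, the `max_period` default, and the frequency spacing are assumptions made for illustration; only the idea that the flag swaps the order of the sine and cosine halves is taken from the commit title.

```python
import math


def timestep_embedding(timestep, dim, flip_sin_to_cos=False, max_period=10000.0):
    """Return a sinusoidal embedding of length dim (dim must be even)."""
    half = dim // 2
    # Geometrically spaced frequencies; the exact spacing is an assumption.
    freqs = [max_period ** (-i / half) for i in range(half)]
    sin_part = [math.sin(timestep * f) for f in freqs]
    cos_part = [math.cos(timestep * f) for f in freqs]
    # The flag only swaps the order of the two halves.
    return cos_part + sin_part if flip_sin_to_cos else sin_part + cos_part


default_order = timestep_embedding(0, 4)
flipped_order = timestep_embedding(0, 4, flip_sin_to_cos=True)
```

Because weights trained with one ordering expect it at inference time, a mismatch between implementations in the default ordering would change model outputs, which is presumably why the Flax side needed an explicit fix.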

04 Nov, 2022 1 commit
- fix the parameter naming in `self.downsamplers` (#1108) · 5b20d3b3
  Chenguo Lin authored Nov 05, 2022
```
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
```
  5b20d3b3
03 Nov, 2022 1 commit

VQ-diffusion (#658) · ef2ea33c

Will Berman authored Nov 03, 2022



* Changes for VQ-diffusion VQVAE

Add an option to specify the dimension of embeddings to VQModel:
`VQModel` will by default set the dimension of embeddings to the number
of latent channels. The VQ-diffusion VQVAE has a smaller
embedding dimension, 128, than its number of latent channels, 256.

Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
unet block helpers. VQ-diffusion's VQVAE uses those two block types.

* Changes for VQ-diffusion transformer

Modify attention.py so SpatialTransformer can be used for
VQ-diffusion's transformer.

SpatialTransformer:
- Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
- `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
- modified forward pass to take optional timestep embeddings

ImagePositionalEmbeddings:
- added to provide positional embeddings to discrete inputs for latent pixels

BasicTransformerBlock:
- norm layers were made configurable so that VQ-diffusion could use AdaLayerNorm with timestep embeddings
- modified forward pass to take optional timestep embeddings

CrossAttention:
- now may optionally take a bias parameter for its query, key, and value linear layers

FeedForward:
- Internal layers are now configurable

ApproximateGELU:
- Activation function in VQ-diffusion's feedforward layer

AdaLayerNorm:
- Norm layer modified to incorporate timestep embeddings

* Add VQ-diffusion scheduler

* Add VQ-diffusion pipeline

* Add VQ-diffusion convert script to diffusers

* Add VQ-diffusion dummy objects

* Add VQ-diffusion markdown docs

* Add VQ-diffusion tests

* some renaming

* some fixes

* more renaming

* correct

* fix typo

* correct weights

* finalize

* fix tests

* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* finish

* finish

* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

ef2ea33c
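The default described in the VQ-diffusion VQVAE notes above (the embedding dimension falls back to the number of latent channels unless it is given explicitly) can be sketched with a small config object. `VQConfig` and its field names are hypothetical and are not claimed to match the `VQModel` constructor; the values 256 and 128 come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VQConfig:
    """Hypothetical config; field names are illustrative, not the VQModel API."""

    latent_channels: int
    embed_dim: Optional[int] = None

    def __post_init__(self):
        # Fall back to the number of latent channels when no dimension is given.
        if self.embed_dim is None:
            self.embed_dim = self.latent_channels


default_cfg = VQConfig(latent_channels=256)
vq_diffusion_cfg = VQConfig(latent_channels=256, embed_dim=128)
```

Keeping the fallback means existing configurations that never set the embedding dimension continue to behave as before.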

02 Nov, 2022 3 commits

[Flax] time embedding (#1081) · 0b61cea3

Kashif Rasul authored Nov 02, 2022

* initial get_sinusoidal_embeddings

* added asserts

* better var name

* fix docs

0b61cea3

Fix a small typo of a variable name (#1063) · 1216a3b1
Omiita authored Nov 02, 2022
```
Fix a small typo

fix a typo in `models/attention.py`.
weight -> width
```
1216a3b1

Up to 2x speedup on GPUs using memory efficient attention (#532) · 98c42134

MatthieuTPHR authored Nov 02, 2022



* 2x speedup using memory efficient attention

* remove einops dependency

* Swap K, M in op instantiation

* Simplify code, remove unnecessary maybe_init call and function, remove unused self.scale parameter

* make xformers a soft dependency

* remove one-liner functions

* change one letter variable to appropriate names

* Remove Env variable dependency, remove MemoryEfficientCrossAttention class and use enable_xformers_memory_efficient_attention method

* Add memory efficient attention toggle to img2img and inpaint pipelines

* Clearer management of xformers' availability

* update optimizations markdown to add info about memory efficient attention

* add benchmarks for TITAN RTX

* More detailed explanation of how the mem eff benchmarks were run

* Removing autocast from optimization markdown

* import_utils: import torch only if it is available
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>

98c42134
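The "soft dependency" step in the commit above can be illustrated with the standard library alone. This is a sketch of the general pattern rather than the code in `import_utils`: `is_package_available` and `pick_attention_backend` are hypothetical helpers, and the backend names are placeholder strings.

```python
import importlib.util


def is_package_available(name):
    """Report whether a top-level package can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_attention_backend(use_memory_efficient=False):
    """Hypothetical guard: use the optional backend only on explicit request."""
    if use_memory_efficient and is_package_available("xformers"):
        return "xformers"
    return "default"


backend = pick_attention_backend(use_memory_efficient=True)
```

Checking availability up front lets the library import cleanly on machines without the optional package, while the opt-in flag keeps the faster path an explicit choice.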