"vscode:/vscode.git/clone" did not exist on "88018fcf20cc375a938c1c9fb786fab3ea8fe20c"
- 21 Nov, 2022 1 commit
-
-
Birch-san authored
perf: prefer batched matmuls for attention; added a fast path to Decoder when num_heads=1
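A minimal sketch of the idea (shapes are illustrative, not the actual diffusers code): fold the head dimension into the batch dimension so a single batched matmul replaces a Python loop over heads.

```python
import torch

b, h, n, d = 2, 8, 64, 40  # batch, heads, tokens, head dim (illustrative)
q = torch.randn(b * h, n, d)
k = torch.randn(b * h, n, d)
v = torch.randn(b * h, n, d)

# One batched matmul over all heads at once; with beta=0 the (uninitialized)
# input tensor is ignored, and alpha applies the attention scale.
scores = torch.baddbmm(
    torch.empty(b * h, n, n), q, k.transpose(1, 2), beta=0, alpha=d ** -0.5
)
out = torch.bmm(scores.softmax(dim=-1), v)  # (b*h, n, d)
```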
-
- 16 Nov, 2022 1 commit
-
-
Kamal Raj authored
* docstring args shape fix * fix styling
-
- 14 Nov, 2022 3 commits
-
-
Joshua Lochner authored
* Fix documentation typo * Fix other typo
-
Nathan Lambert authored
* re-add RL model code * match model forward api * add register_to_config, pass training tests * fix tests, update forward outputs * remove unused code, some comments * add to docs * remove extra embedding code * unify time embedding * remove conv1d output sequential * remove sequential from conv1dblock * style and deleting duplicated code * clean files * remove unused variables * clean variables * add 1d resnet block structure for downsample * rename as unet1d * fix renaming * rename files * add get_block(...) api * unify args for model1d like model2d * minor cleaning * fix docs * improve 1d resnet blocks * fix tests, remove permutes * fix style * add output activation * rename flax blocks file * Add Value Function and corresponding example script to Diffuser implementation (#884) * valuefunction code * start example scripts * missing imports * bug fixes and placeholder example script * add value function scheduler * load value function from hub and get best actions in example * very close to working example * larger batch size for planning * more tests * merge unet1d changes * wandb for debugging, use newer models * success! * turns out we just need more diffusion steps * run on modal * merge and code cleanup * use same api for rl model * fix variance type * wrong normalization function * add tests * style * style and quality * edits based on comments * style and quality * remove unused var * hack unet1d into a value function * add pipeline * fix arg order * add pipeline to core library * community pipeline * fix a couple of shape bugs * style * Apply suggestions from code review Co-authored-by:
Nathan Lambert <nathan@huggingface.co> * update post merge of scripts * add midblock / outblock architecture * Pipeline cleanup (#947) * valuefunction code * start example scripts * missing imports * bug fixes and placeholder example script * add value function scheduler * load value function from hub and get best actions in example * very close to working example * larger batch size for planning * more tests * merge unet1d changes * wandb for debugging, use newer models * success! * turns out we just need more diffusion steps * run on modal * merge and code cleanup * use same api for rl model * fix variance type * wrong normalization function * add tests * style * style and quality * edits based on comments * style and quality * remove unused var * hack unet1d into a value function * add pipeline * fix arg order * add pipeline to core library * community pipeline * fix a couple of shape bugs * style * Apply suggestions from code review * clean up comments * convert older script to use pipeline and add readme * rename scripts * style, update tests * delete unet rl model file * remove imports in src Co-authored-by:
Nathan Lambert <nathan@huggingface.co> * Update src/diffusers/models/unet_1d_blocks.py * Update tests/test_models_unet.py * RL Cleanup v2 (#965) * valuefunction code * start example scripts * missing imports * bug fixes and placeholder example script * add value function scheduler * load value function from hub and get best actions in example * very close to working example * larger batch size for planning * more tests * merge unet1d changes * wandb for debugging, use newer models * success! * turns out we just need more diffusion steps * run on modal * merge and code cleanup * use same api for rl model * fix variance type * wrong normalization function * add tests * style * style and quality * edits based on comments * style and quality * remove unused var * hack unet1d into a value function * add pipeline * fix arg order * add pipeline to core library * community pipeline * fix a couple of shape bugs * style * Apply suggestions from code review * clean up comments * convert older script to use pipeline and add readme * rename scripts * style, update tests * delete unet rl model file * remove imports in src * add specific vf block and update tests * style * Update tests/test_models_unet.py Co-authored-by:
Nathan Lambert <nathan@huggingface.co> * fix quality in tests * fix quality style, split test file * fix checks / tests * make timesteps closer to main * unify block API * unify forward api * delete lines in examples * style * examples style * all tests pass * make style * make dance_diff test pass * Refactoring RL PR (#1200) * init file changes * add import utils * finish cleaning files, imports * remove import flags * clean examples * fix imports, tests for merge * update readmes * hotfix for tests * quality * fix some tests * change defaults * more mps test fixes * unet1d defaults * do not default import experimental * defaults for tests * fix tests * fix-copies * fix * changes per Patrick's comments (#1285) * changes per Patrick's comments * update conversion script * fix renaming * skip more mps tests * last test fix * Update examples/rl/README.md Co-authored-by:
Ben Glickenhaus <benglickenhaus@gmail.com>
-
Lime-Cakes authored
Older versions of xformers require query, key, and value to be contiguous, so this calls .contiguous() on q/k/v before passing them to xformers.
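For context, a sketch of the workaround (assumes a CUDA GPU with xformers installed; shapes are illustrative):

```python
import torch
import xformers.ops

def attention(q, k, v):
    # Reshapes and transposes upstream can silently produce non-contiguous
    # tensors, which older xformers kernels reject, so force contiguity.
    q, k, v = q.contiguous(), k.contiguous(), v.contiguous()
    return xformers.ops.memory_efficient_attention(q, k, v)

q = k = v = torch.randn(2, 4096, 40, device="cuda", dtype=torch.float16)
out = attention(q, k, v)
```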
-
- 08 Nov, 2022 1 commit
-
-
Suraj Patil authored
handle dtype xformers
-
- 05 Nov, 2022 1 commit
-
-
Pedro Cuenca authored
Flip sin to cos in t embeddings. This was assumed in the previous implementation, but now the default is the opposite. Fixes #1145.
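Roughly what the flag controls, as a hedged sketch modeled on a standard sinusoidal timestep embedding (the function name, signature, and default here are illustrative, not the exact diffusers code):

```python
import math
import torch

def timestep_embedding(timesteps, dim, flip_sin_to_cos=False):
    half = dim // 2
    freqs = torch.exp(-math.log(10000) * torch.arange(half) / half)
    args = timesteps[:, None].float() * freqs[None, :]
    parts = [torch.sin(args), torch.cos(args)]
    if flip_sin_to_cos:
        # [cos, sin] ordering, which the previous implementation assumed.
        parts = parts[::-1]
    return torch.cat(parts, dim=-1)

emb = timestep_embedding(torch.arange(4), 32, flip_sin_to_cos=True)
```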
-
- 04 Nov, 2022 1 commit
-
-
Chenguo Lin authored
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
- 03 Nov, 2022 1 commit
-
-
Will Berman authored
* Changes for VQ-diffusion VQVAE Allow specifying the embedding dimension in VQModel: `VQModel` will by default set the dimension of embeddings to the number of latent channels. The VQ-diffusion VQVAE has a smaller embedding dimension (128) than its number of latent channels (256). Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down unet block helpers. VQ-diffusion's VQVAE uses those two block types. * Changes for VQ-diffusion transformer Modify attention.py so SpatialTransformer can be used for VQ-diffusion's transformer. SpatialTransformer: - can now operate over discrete inputs (classes of vector embeddings) as well as continuous - `in_channels` was made optional in the constructor, so two locations where it was passed as a positional arg were moved to kwargs - modified forward pass to take optional timestep embeddings ImagePositionalEmbeddings: - added to provide positional embeddings to discrete inputs for latent pixels BasicTransformerBlock: - norm layers were made configurable so that VQ-diffusion can use AdaLayerNorm with timestep embeddings - modified forward pass to take optional timestep embeddings CrossAttention: - may now optionally take a bias parameter for its query, key, and value linear layers FeedForward: - internal layers are now configurable ApproximateGELU: - activation function in VQ-diffusion's feedforward layer AdaLayerNorm: - norm layer modified to incorporate timestep embeddings * Add VQ-diffusion scheduler * Add VQ-diffusion pipeline * Add VQ-diffusion convert script to diffusers * Add VQ-diffusion dummy objects * Add VQ-diffusion markdown docs * Add VQ-diffusion tests * some renaming * some fixes * more renaming * correct * fix typo * correct weights * finalize * fix tests * Apply suggestions from code review Co-authored-by:
Anton Lozhkov <aglozhkov@gmail.com> * Apply suggestions from code review Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * finish * finish * up Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Anton Lozhkov <aglozhkov@gmail.com> Co-authored-by:
Pedro Cuenca <pedro@huggingface.co>
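A conceptual sketch of the embedding-dimension change described above (names are illustrative, not the exact diffusers API): the VQ codebook can use a smaller embedding dimension than the autoencoder's latent channels, with 1x1 convolutions mapping in and out of codebook space.

```python
import torch
from torch import nn

latent_channels, vq_embed_dim, codebook_size = 256, 128, 16384

quant_conv = nn.Conv2d(latent_channels, vq_embed_dim, 1)       # into codebook space
codebook = nn.Embedding(codebook_size, vq_embed_dim)           # discrete codes
post_quant_conv = nn.Conv2d(vq_embed_dim, latent_channels, 1)  # back to decoder space

z = torch.randn(1, latent_channels, 32, 32)
z_e = quant_conv(z)  # (1, 128, 32, 32): nearest-codebook lookup happens here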
-
- 02 Nov, 2022 3 commits
-
-
Kashif Rasul authored
* initial get_sinusoidal_embeddings * added asserts * better var name * fix docs
-
Omiita authored
Fix a small typo in `models/attention.py`: weight -> width
-
MatthieuTPHR authored
* 2x speedup using memory efficient attention * remove einops dependency * Swap K, M in op instantiation * Simplify code, remove unnecessary maybe_init call and function, remove unused self.scale parameter * make xformers a soft dependency * remove one-liner functions * change one-letter variables to appropriate names * Remove Env variable dependency, remove MemoryEfficientCrossAttention class and use enable_xformers_memory_efficient_attention method * Add memory efficient attention toggle to img2img and inpaint pipelines * Clearer management of xformers' availability * update optimizations markdown to add info about memory efficient attention * add benchmarks for TITAN RTX * More detailed explanation of how the mem eff benchmarks were run * Removing autocast from optimization markdown * import_utils: import torch only if it is available Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
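The opt-in API this introduced, in a typical usage sketch (model id and prompt are placeholders; requires a CUDA GPU with xformers installed):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# xformers is a soft dependency: attention uses the default implementation
# unless memory-efficient attention is explicitly enabled.
pipe.enable_xformers_memory_efficient_attention()
image = pipe("a photo of an astronaut riding a horse").images[0]
```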
-
- 31 Oct, 2022 2 commits
-
-
Laurent Mazare authored
Remove an unused parameter. The `downsample_padding` parameter does not seem to be used in `CrossAttnUpBlock2D` (or by any up block, for that matter), so remove it.
-
Patrick von Platen authored
* Remove nn sequential * up
-
- 29 Oct, 2022 2 commits
-
-
Pedro Cuenca authored
* Docs: refer to pre-RC version of PyTorch 1.13.0. * Remove temporary workaround for unavailable op. * Update comment to make it less ambiguous. * Remove use of contiguous in mps. It appears to no longer be necessary. * Special case: use einsum for much better performance in mps * Update mps docs. * MPS: make pipeline work in half precision.
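A sketch of the einsum special case (shapes are illustrative; the real code only takes this path on the mps backend):

```python
import torch

q = torch.randn(16, 4096, 40)  # (batch*heads, tokens, head_dim)
k = torch.randn(16, 4096, 40)

# Equivalent to torch.bmm(q, k.transpose(1, 2)) * scale, but einsum was
# measured to be much faster on mps at the time of this commit.
scores = torch.einsum("b i d, b j d -> b i j", q, k) * (40 ** -0.5)
```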
-
Nathan Lambert authored
-
- 28 Oct, 2022 1 commit
-
-
Nouamane Tazi authored
* fix `upsample_nearest_nhwc` for large bsz * fix `upsample_nearest_nhwc` for large bsz
-
- 25 Oct, 2022 3 commits
-
-
Patrick von Platen authored
* add in fp16 * up
-
Patrick von Platen authored
* start * add more logic * Update src/diffusers/models/unet_2d_condition_flax.py * match weights * up * make model work * making class more general, fixing missed file rename * small fix * make new conversion work * up * finalize conversion * up * first batch of variable renamings * remove c and c_prev var names * add mid and out block structure * add pipeline * up * finish conversion * finish * upload * more fixes * Apply suggestions from code review * add attr * up * up * up * finish tests * finish * up * finish * fix test * up * naming consistency in tests * Apply suggestions from code review Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> Co-authored-by:
Nathan Lambert <nathan@huggingface.co> Co-authored-by:
Anton Lozhkov <anton@huggingface.co> * remove hardcoded 16 * Remove bogus * fix some stuff * finish * improve logging * docs * upload Co-authored-by:
Nathan Lambert <nol@berkeley.edu> Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> Co-authored-by:
Nathan Lambert <nathan@huggingface.co> Co-authored-by:
Anton Lozhkov <anton@huggingface.co>
-
Pedro Cuenca authored
* Docs: refer to pre-RC version of PyTorch 1.13.0. * Remove temporary workaround for unavailable op. * Update comment to make it less ambiguous. * Remove use of contiguous in mps. It appears to no longer be necessary. * Special case: use einsum for much better performance in mps * Update mps docs. * Minor doc update. * Accept suggestion Co-authored-by:
Anton Lozhkov <anton@huggingface.co> Co-authored-by:
Anton Lozhkov <anton@huggingface.co>
-
- 12 Oct, 2022 1 commit
-
-
Nathan Lambert authored
* add or fix license formatting * fix quality
-
- 11 Oct, 2022 2 commits
-
-
Akash Pannu authored
* pass norm_num_groups param and add tests * set resnet_groups for FlaxUNetMidBlock2D * fixed docstrings * fixed typo * using is_flax_available util and created require_flax decorator
-
Suraj Patil authored
* support bf16 for stable diffusion * fix typo * address review comments
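A minimal sketch of loading the weights in bfloat16, assuming the PyTorch pipeline (model id is illustrative):

```python
import torch
from diffusers import StableDiffusionPipeline

# Load all pipeline weights in bfloat16; inference then runs in bf16
# without any autocast wrapper.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.bfloat16
)
```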
-
- 10 Oct, 2022 2 commits
-
-
Nathan Lambert authored
fix typo in docstring
-
Nathan Lambert authored
* clean up resnet.py * make style and quality * minor formatting
-
- 07 Oct, 2022 1 commit
-
-
Suraj Patil authored
* handle dtype in vae and image2image pipeline * fix inpaint in fp16 * dtype should be handled in add_noise * style * address review comments * add simple fast tests to check fp16 * fix test name * put mask in fp16
-
- 06 Oct, 2022 2 commits
-
-
Anton Lozhkov authored
Temporarily remove Flax modules from the public API
-
- 05 Oct, 2022 1 commit
-
-
Nicolas Patry authored
* Removing `autocast` for `35-25% speedup`. * Quality. * Adding a slow test. * Fixing mps noise generation. * Raising error on wrong device, instead of just casting on behalf of user. * Quality. * fix merge Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
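What running without `autocast` looks like in practice (a sketch; the `fp16` revision and model id follow the conventions of the time):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")

# No `with torch.autocast("cuda"):` wrapper; the fp16 weights run directly,
# which is where the quoted speedup comes from.
image = pipe("a fantasy landscape").images[0]
```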
-
- 04 Oct, 2022 3 commits
-
-
NIKHIL A V authored
* renamed single-letter variables * renamed x to a meaningful variable in resnet.py. Hello @patil-suraj, can you verify it? Thanks * Reformatted using black * renamed x to a meaningful variable in resnet.py. Hello @patil-suraj, can you verify it? Thanks * reformatted the files * fixed the UnboundLocalError on line 374 * removed the 'referenced before assignment' error * renamed single variables x -> hidden_state, p -> pad_value Co-authored-by:
Nikhil A V <nikhilav@Nikhils-MacBook-Pro.local> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Suraj Patil <surajp815@gmail.com>
-
Pedro Cuenca authored
Remove comments that are no longer appropriate. There were casting operations before; they are now gone.
-
Kashif Rasul authored
fix docstring. Fixes #709
-
- 30 Sep, 2022 3 commits
-
-
Nouamane Tazi authored
* revert using baddbmm in attention - to fix `test_stable_diffusion_memory_chunking` test * styling
-
Josh Achiam authored
* Allow resolutions that are not multiples of 64 * ran black * fix bug * add test * more explanation * more comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
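After this change only divisibility by 8 matters, since the VAE downsamples by a total factor of 8. A sketch of the kind of check involved (function and variable names are hypothetical):

```python
def check_resolution(height: int, width: int) -> None:
    # 8 is the VAE's total downsampling factor, so the latent grid of
    # size (height/8, width/8) must consist of whole numbers.
    if height % 8 != 0 or width % 8 != 0:
        raise ValueError(
            f"`height` and `width` must be divisible by 8 but are {height} and {width}."
        )

check_resolution(512, 448)  # fine now, even though 448 is not a multiple of 64
```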
-
Nouamane Tazi authored
* initial commit * make UNet stream capturable * try to fix noise_pred value * remove cuda graph and keep NB * non blocking unet with PNDMScheduler * make timesteps np arrays for pndm scheduler because lists don't get formatted to tensors in `self.set_format` * make max async in pndm * use channel last format in unet * avoid moving timesteps device in each unet call * avoid memcpy op in `get_timestep_embedding` * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained` * update TODO * replace `channels_last` kwarg with `memory_format` for more generality * revert the channels_last changes to leave it for another PR * remove non_blocking when moving input ids to device * remove blocking from all .to() operations at beginning of pipeline * fix merging * fix merging * model can run in other precisions without autocast * attn refactoring * Revert "attn refactoring" This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb. * remove restriction to run conv_norm in fp32 * use `baddbmm` instead of `matmul` in attention for better perf * removing all reshapes to test perf * Revert "removing all reshapes to test perf" This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd. * add shapes comments * hardcode what's needed for jitting * Revert "hardcode what's needed for jitting" This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a. * Revert "remove restriction to run conv_norm in fp32" This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9. * revert using baddbmm in attention's forward * cleanup comment * remove restriction to run conv_norm in fp32. no quality loss was noticed This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213. * add more optimization techniques to docs * Revert "add shapes comments" This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4. * apply suggestions * make quality * apply suggestions * styling * `scheduler.timesteps` are now arrays so we don't need .to() * remove useless .type() * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms` * move scheduler timesteps to correct device if tensors * add device to `set_timesteps` in LMSD scheduler * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it * quick fix * styling * remove kwargs from schedulers `set_timesteps` * revert to using max in K-LMS inpaint pipeline test * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it" This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899. * move timesteps to correct device before loop in SD pipeline * apply previous fix to other SD pipelines * UNet now accepts tensor timesteps even on wrong device, to avoid errors - it shouldn't affect performance if timesteps are already on the correct device - it does slow down performance if they're on the wrong device * fix pipeline when timesteps are arrays with strides
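Two of the standalone optimizations from this PR, sketched with a stand-in module (not the actual pipeline code):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
unet = torch.nn.Conv2d(4, 4, 3, padding=1).to(device)  # stand-in for the UNet
unet.to(memory_format=torch.channels_last)             # channels-last conv weights

latents = torch.randn(1, 4, 64, 64, device=device).to(memory_format=torch.channels_last)

# Move all timesteps to the execution device once, before the denoising loop,
# instead of per step (avoids a host-to-device copy in every UNet call).
timesteps = torch.linspace(999, 0, 50, device=device).long()
for t in timesteps:
    out = unet(latents)  # real call: unet(latents, t, encoder_hidden_states)
```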
-
- 29 Sep, 2022 1 commit
-
-
Partho authored
renamed x to hidden_states
-
- 27 Sep, 2022 1 commit
-
-
Yih-Dar authored
* Fix SpatialTransformer * Fix SpatialTransformer Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 23 Sep, 2022 1 commit
-
-
Younes Belkada authored
* documenting `attention_flax.py` file * documenting `embeddings_flax.py` * documenting `unet_blocks_flax.py` * Add new objs to doc page * document `vae_flax.py` * Apply suggestions from code review * modify `unet_2d_condition_flax.py` * make style * Apply suggestions from code review * make style * Apply suggestions from code review * fix indent * fix typo * fix indent unet * Update src/diffusers/models/vae_flax.py * Apply suggestions from code review Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> Co-authored-by:
Mishig Davaadorj <dmishig@gmail.com> Co-authored-by:
Pedro Cuenca <pedro@huggingface.co>
-
- 22 Sep, 2022 1 commit
-
-
Suraj Patil authored
* add grad ckpt to downsample blocks * make it work * don't pass gradient_checkpointing to upsample block * add tests for UNet2DConditionModel * add test_gradient_checkpointing * add gradient_checkpointing for up and down blocks * add functions to enable and disable grad ckpt * remove the forward argument * better naming * make supports_gradient_checkpointing private
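The enable/disable API this adds, in a short training-style sketch (the default `UNet2DConditionModel()` config and the input shapes are used purely for illustration):

```python
import torch
from diffusers import UNet2DConditionModel

model = UNet2DConditionModel()  # default config, illustration only
model.enable_gradient_checkpointing()
model.train()

sample = torch.randn(1, 4, 32, 32)
timestep = torch.tensor([10])
encoder_hidden_states = torch.randn(1, 77, 1280)  # 1280 = default cross_attention_dim

out = model(sample, timestep, encoder_hidden_states).sample
out.mean().backward()  # block activations are recomputed during backward

model.disable_gradient_checkpointing()
```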
-
- 21 Sep, 2022 1 commit
-
-
Younes Belkada authored
replace `dropout_prob` with `dropout` in `vae`
-