Commits · ca1a22296d02ab44547a33c1bb7e5a4015a2f843 · renzhc / diffusers_dcu

22 Mar, 2023 1 commit

[MS Text To Video] Add first text to video (#2738) · ca1a2229

Patrick von Platen authored Mar 22, 2023



* [MS Text To Video} Add first text to video

* upload

* make first model example

* match unet3d params

* make sure weights are correcctly converted

* improve

* forward pass works, but diff result

* make forward work

* fix more

* finish

* refactor video output class.

* feat: add support for a video export utility.

* fix: opencv availability check.

* run make fix-copies.

* add: docs for the model components.

* add: standalone pipeline doc.

* edit docstring of the pipeline.

* add: right path to TransformerTempModel

* add: first set of tests.

* complete fast tests for text to video.

* fix bug

* up

* three fast tests failing.

* add: note on slow tests

* make work with all schedulers

* apply styling.

* add slow tests

* change file name

* update

* more correction

* more fixes

* finish

* up

* Apply suggestions from code review

* up

* finish

* make copies

* fix pipeline tests

* fix more tests

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* apply suggestions

* up

* revert

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

ca1a2229

02 Mar, 2023 1 commit

Add a ControlNet model & pipeline (#2407) · 8dfff7c0

Takuma Mori authored Mar 02, 2023



* add scaffold
- copied convert_controlnet_to_diffusers.py from
convert_original_stable_diffusion_to_diffusers.py

* Add support to load ControlNet (WIP)
- this makes Missking Key error on ControlNetModel

* Update to convert ControlNet without error msg
- init impl for StableDiffusionControlNetPipeline
- init impl for ControlNetModel

* cleanup of commented out

* split create_controlnet_diffusers_config()
from create_unet_diffusers_config()

- add config: hint_channels

* Add input_hint_block, input_zero_conv and
middle_block_out
- this makes missing key error on loading model

* add unet_2d_blocks_controlnet.py
- copied from unet_2d_blocks.py as impl CrossAttnDownBlock2D,DownBlock2D
- this makes missing key error on loading model

* Add loading for input_hint_block, zero_convs
and middle_block_out

- this makes no error message on model loading

* Copy from UNet2DConditionalModel except __init__

* Add ultra primitive test for ControlNetModel
inference

* Support ControlNetModel inference
- without exceptions

* copy forward() from UNet2DConditionModel

* Impl ControlledUNet2DConditionModel inference
- test_controlled_unet_inference passed

* Frozen weight & biases for training

* Minimized version of ControlNet/ControlledUnet
- test_modules_controllnet.py passed

* make style

* Add support model loading for minimized ver

* Remove all previous version files

* from_pretrained and inference test passed

* copied from pipeline_stable_diffusion.py
except `__init__()`

* Impl pipeline, pixel match test (almost) passed.

* make style

* make fix-copies

* Fix to add import ControlNet blocks
for `make fix-copies`

* Remove einops dependency

* Support  np.ndarray, PIL.Image for controlnet_hint

* set default config file as lllyasviel's

* Add support grayscale (hw) numpy array

* Add and update docstrings

* add control_net.mdx

* add control_net.mdx to toctree

* Update copyright year

* Fix to add PIL.Image RGB->BGR conversion
- thanks @Mystfit

* make fix-copies

* add basic fast test for controlnet

* add slow test for controlnet/unet

* Ignore down/up_block len check on ControlNet

* add a copy from test_stable_diffusion.py

* Accept controlnet_hint is None

* merge pipeline_stable_diffusion.py diff

* Update class name to SDControlNetPipeline

* make style

* Baseline fast test almost passed (w long desc)

* still needs investigate.

Following didn't passed descriped in TODO comment:
- test_stable_diffusion_long_prompt
- test_stable_diffusion_no_safety_checker

Following didn't passed same as stable_diffusion_pipeline:
- test_attention_slicing_forward_pass
- test_inference_batch_single_identical
- test_xformers_attention_forwardGenerator_pass
these seems come from calc accuracy.

* Add note comment related vae_scale_factor

* add test_stable_diffusion_controlnet_ddim

* add assertion for vae_scale_factor != 8

* slow test of pipeline almost passed
Failed: test_stable_diffusion_pipeline_with_model_offloading
- ImportError: `enable_model_offload` requires `accelerate v0.17.0` or higher

but currently latest version == 0.16.0

* test_stable_diffusion_long_prompt passed

* test_stable_diffusion_no_safety_checker passed

- due to its model size, move to slow test

* remove PoC test files

* fix num_of_image, prompt length issue add add test

* add support List[PIL.Image] for controlnet_hint

* wip

* all slow test passed

* make style

* update for slow test

* RGB(PIL)->BGR(ctrlnet) conversion

* fixes

* remove manual num_images_per_prompt test

* add document

* add `image` argument docstring

* make style

* Add line to correct conversion

* add controlnet_conditioning_scale (aka control_scales
strength)

* rgb channel ordering by default

* image batching logic

* Add control image descriptions for each checkpoint

* Only save controlnet model in conversion script

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py

typo
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add gerated image example

* a depth mask -> a depth map

* rename control_net.mdx to controlnet.mdx

* fix toc title

* add ControlNet abstruct and link

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
Co-authored-by: dqueue <dbyqin@gmail.com>

* remove controlnet constructor arguments re: @patrickvonplaten

* [integration tests] test canny

* test_canny fixes

* [integration tests] test_depth

* [integration tests] test_hed

* [integration tests] test_mlsd

* add channel order config to controlnet

* [integration tests] test normal

* [integration tests] test_openpose test_scribble

* change height and width to default to conditioning image

* [integration tests] test seg

* style

* test_depth fix

* [integration tests] size fixes

* [integration tests] cpu offloading

* style

* generalize controlnet embedding

* fix conversion script

* Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Style adapted to the documentation of pix2pix

* merge main by hand

* style

* [docs] controlling generation doc nits

* correct some things

* add: controlnetmodel to autodoc.

* finish docs

* finish

* finish 2

* correct images

* finish controlnet

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* uP

* upload model

* up

* up

---------
Co-authored-by: William Berman <WLBberman@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: dqueue <dbyqin@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

8dfff7c0

01 Mar, 2023 1 commit
- [Copyright] 2023 (#2524) · eadf0e25
  Patrick von Platen authored Mar 01, 2023
  
  eadf0e25
30 Dec, 2022 1 commit

Make repo structure consistent (#1862) · 29b2c93c

Patrick von Platen authored Dec 30, 2022



* move files a bit

* more refactors

* fix more

* more fixes

* fix more onnx

* make style

* upload

* fix

* up

* fix more

* up again

* up

* small fix

* Update src/diffusers/__init__.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* correct
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

29b2c93c

18 Dec, 2022 1 commit

kakaobrain unCLIP (#1428) · 2dcf64b7

Will Berman authored Dec 18, 2022



* [wip] attention block updates

* [wip] unCLIP unet decoder and super res

* [wip] unCLIP prior transformer

* [wip] scheduler changes

* [wip] text proj utility class

* [wip] UnCLIPPipeline

* [wip] kakaobrain unCLIP convert script

* [unCLIP pipeline] fixes re: @patrickvonplaten

remove callbacks

move denoising loops into call function

* UNCLIPScheduler re: @patrickvonplaten

Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
DDPM scheduler with changes to support karlo

* mask -> attention_mask re: @patrickvonplaten

* [DDPMScheduler] remove leftover change

* [docs] PriorTransformer

* [docs] UNet2DConditionModel and UNet2DModel

* [nit] UNCLIPScheduler -> UnCLIPScheduler

matches existing unclip naming better

* [docs] SchedulingUnCLIP

* [docs] UnCLIPTextProjModel

* refactor

* finish licenses

* rename all to attention_mask and prep in models

* more renaming

* don't expose unused configs

* final renaming fixes

* remove x attn mask when not necessary

* configure kakao script to use new class embedding config

* fix copies

* [tests] UnCLIPScheduler

* finish x attn

* finish

* remove more

* rename condition blocks

* clean more

* Apply suggestions from code review

* up

* fix

* [tests] UnCLIPPipelineFastTests

* remove unused imports

* [tests] UnCLIPPipelineIntegrationTests

* correct

* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

2dcf64b7

03 Nov, 2022 1 commit

VQ-diffusion (#658) · ef2ea33c

Will Berman authored Nov 03, 2022



* Changes for VQ-diffusion VQVAE

Add specify dimension of embeddings to VQModel:
`VQModel` will by default set the dimension of embeddings to the number
of latent channels. The VQ-diffusion VQVAE has a smaller
embedding dimension, 128, than number of latent channels, 256.

Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
unet block helpers. VQ-diffusion's VQVAE uses those two block types.

* Changes for VQ-diffusion transformer

Modify attention.py so SpatialTransformer can be used for
VQ-diffusion's transformer.

SpatialTransformer:
- Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
- `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
- modified forward pass to take optional timestep embeddings

ImagePositionalEmbeddings:
- added to provide positional embeddings to discrete inputs for latent pixels

BasicTransformerBlock:
- norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
- modified forward pass to take optional timestep embeddings

CrossAttention:
- now may optionally take a bias parameter for its query, key, and value linear layers

FeedForward:
- Internal layers are now configurable

ApproximateGELU:
- Activation function in VQ-diffusion's feedforward layer

AdaLayerNorm:
- Norm layer modified to incorporate timestep embeddings

* Add VQ-diffusion scheduler

* Add VQ-diffusion pipeline

* Add VQ-diffusion convert script to diffusers

* Add VQ-diffusion dummy objects

* Add VQ-diffusion markdown docs

* Add VQ-diffusion tests

* some renaming

* some fixes

* more renaming

* correct

* fix typo

* correct weights

* finalize

* fix tests

* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* finish

* finish

* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

ef2ea33c

25 Oct, 2022 1 commit

[Dance Diffusion] Add dance diffusion (#803) · 88fa6b7d

Patrick von Platen authored Oct 25, 2022



* start

* add more logic

* Update src/diffusers/models/unet_2d_condition_flax.py

* match weights

* up

* make model work

* making class more general, fixing missed file rename

* small fix

* make new conversion work

* up

* finalize conversion

* up

* first batch of variable renamings

* remove c and c_prev var names

* add mid and out block structure

* add pipeline

* up

* finish conversion

* finish

* upload

* more fixes

* Apply suggestions from code review

* add attr

* up

* uP

* up

* finish tests

* finish

* uP

* finish

* fix test

* up

* naming consistency in tests

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>

* remove hardcoded 16

* Remove bogus

* fix some stuff

* finish

* improve logging

* docs

* upload
Co-authored-by: Nathan Lambert <nol@berkeley.edu>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>

88fa6b7d

06 Oct, 2022 2 commits
- Revert "[v0.4.0] Temporarily remove Flax modules from the public API (#755)" · 970e3060
  anton-l authored Oct 06, 2022
```
This reverts commit 2e209c30.
```
  970e3060
- [v0.4.0] Temporarily remove Flax modules from the public API (#755) · 2e209c30
  Anton Lozhkov authored Oct 06, 2022
```
Temporarily remove Flax modules from the public API
```
  2e209c30
20 Sep, 2022 1 commit

FlaxDiffusionPipeline & FlaxStableDiffusionPipeline (#559) · d934d3d7

Mishig Davaadorj authored Sep 20, 2022



* WIP: flax FlaxDiffusionPipeline & FlaxStableDiffusionPipeline

* todo comment

* Fix imports

* Fix imports

* add dummies

* Fix empty init

* make pipeline work

* up

* Use Flax schedulers (typing, docstring)

* Wrap model imports inside availability checks.

* more updates

* make sure flax is not broken

* make style

* more fixes

* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@latenitesoft.com>

d934d3d7

01 Sep, 2022 1 commit
- Fix flake8 F401 imported but unused (#317) · 3c138a4d
  Anton Lozhkov authored Sep 01, 2022
```
* Fix flake8 F401 '...' imported but unused

* One more F403
```
  3c138a4d
20 Jul, 2022 1 commit

Big Model Renaming (#109) · 9c3820d0

Patrick von Platen authored Jul 21, 2022

* up

* change model name

* renaming

* more changes

* up

* up

* up

* save checkpoint

* finish api / naming

* finish config renaming

* rename all weights

* finish really

9c3820d0

19 Jul, 2022 2 commits
- Get diffusers ready 🚀🚀🚀 (#101) · 8c31925b
  Patrick von Platen authored Jul 19, 2022
```
* big purge

* more fixes

* finish for now
```
  8c31925b
- Finalize ldm (#96) · d5acb411
  Patrick von Platen authored Jul 19, 2022
```
* upload

* make checkpoint work

* finalize
```
  d5acb411
13 Jul, 2022 1 commit
- The big purge -> remove everything except vision for now · 2a69c0b7
  Patrick von Platen authored Jul 13, 2022
  
  2a69c0b7
12 Jul, 2022 1 commit
- Add unconditional image generation (#79) · 06c79730
  Patrick von Platen authored Jul 12, 2022
```
* uP

* finish downsampling layers

* finish major refactor

* remove bugus file
```
  06c79730
29 Jun, 2022 1 commit
- add vae model · 4d1536bb
  patil-suraj authored Jun 29, 2022
  
  4d1536bb
24 Jun, 2022 1 commit
- add score estimation model · ac796924
  Patrick von Platen authored Jun 24, 2022
  
  ac796924
22 Jun, 2022 1 commit
- refactor naming · d0032c60
  Patrick von Platen authored Jun 22, 2022
  
  d0032c60
21 Jun, 2022 1 commit
- make style · 9c82c32b
  anton-l authored Jun 21, 2022
  
  9c82c32b
20 Jun, 2022 1 commit
- add imports for RL UNet · 49718b47
  Nathan Lambert authored Jun 20, 2022
  
  49718b47
15 Jun, 2022 2 commits
- remove torchvision dependency · 17c574a1
  Patrick von Platen authored Jun 15, 2022
  
  17c574a1
- add unet grad tts · 31712dea
  patil-suraj authored Jun 15, 2022
  
  31712dea
13 Jun, 2022 1 commit
- tune ddpm training · 77451b3b
  anton-l authored Jun 13, 2022
  
  77451b3b
09 Jun, 2022 3 commits
- move vqvae in top models dir · d1af9d91
  patil-suraj authored Jun 09, 2022
  
  d1af9d91
- remove CLIPTextModel from src · f23bb3e8
  anton-l authored Jun 09, 2022
  
  f23bb3e8
- fix setup · cbb19ee8
  Patrick von Platen authored Jun 09, 2022
  
  cbb19ee8
08 Jun, 2022 5 commits
- Convert glide upsampling weights · f9cdb4dd
  anton-l authored Jun 08, 2022
  
  f9cdb4dd
- add vqvae · f4ee3498
  patil-suraj authored Jun 08, 2022
  
  f4ee3498
- Style · 07ffe73f
  anton-l authored Jun 08, 2022
  
  07ffe73f
- Classifier-free guidance scheduler + GLIDe pipeline · 1e21f061
  anton-l authored Jun 08, 2022
  
  1e21f061
- add unet ldm in init · 4d53a521
  patil-suraj authored Jun 08, 2022
  
  4d53a521
07 Jun, 2022 1 commit
- Add glide modeling files · 111fa990
  anton-l authored Jun 07, 2022
  
  111fa990
01 Jun, 2022 1 commit
- add pretrained model and pretrained sampler · 8cb5e694
  Patrick von Platen authored Jun 02, 2022
  
  8cb5e694
31 May, 2022 1 commit
- add first template for DDPM forward · e779b250
  Patrick von Platen authored May 31, 2022
  
  e779b250
30 May, 2022 2 commits
- init upload · d8498166
  Patrick von Platen authored May 30, 2022
  
  d8498166
- upload some initial structure · 0bea0268
  Patrick von Platen authored May 30, 2022
  
  0bea0268