Commits · b3d10d6d65a80593627c6738fbeded2f69b5129f · renzhc / diffusers_dcu

27 May, 2024 1 commit

[Pipeline] Marigold depth and normals estimation (#7847) · b3d10d6d

Anton Obukhov authored May 27, 2024



* implement marigold depth and normals pipelines in diffusers core

* remove bibtex

* remove deprecations

* remove save_memory argument

* remove validate_vae

* remove config output

* remove batch_size autodetection

* remove presets logic
move default denoising_steps and processing_resolution into the model config
make default ensemble_size 1

* remove no_grad

* add fp16 to the example usage

* implement is_matplotlib_available
use is_matplotlib_available, is_scipy_available for conditional imports in the marigold depth pipeline

* move colormap, visualize_depth, and visualize_normals into export_utils.py

* make the denoising loop more lucid
fix the outputs to always be 4d tensors or lists of pil images
support a 4d input_image case
attempt to support model_cpu_offload_seq
move check_inputs into a separate function
change default batch_size to 1, remove any logic to make it bigger implicitly

* style

* rename denoising_steps into num_inference_steps

* rename input_image into image

* rename input_latent into latents

* remove decode_image
change decode_prediction to use the AutoencoderKL.decode method

* move clean_latent outside of progress_bar

* refactor marigold-reusable image processing bits into MarigoldImageProcessor class

* clean up the usage example docstring

* make ensemble functions members of the pipelines

* add early checks in check_inputs
rename E into ensemble_size in depth ensembling

* fix vae_scale_factor computation

* better compatibility with torch.compile
better variable naming

* move export_depth_to_png to export_utils

* remove encode_prediction

* improve visualize_depth and visualize_normals to accept multi-dimensional data and lists
remove visualization functions from the pipelines
move exporting depth as 16-bit PNGs functionality from the depth pipeline
update example docstrings

* do not shortcut vae.config variables

* change all asserts to raise ValueError

* rename output_prediction_type to output_type

* better variable names
clean up variable deletion code

* better variable names

* pass desc and leave kwargs into the diffusers progress_bar
implement nested progress bar for images and steps loops

* implement scale_invariant and shift_invariant flags in the ensemble_depth function
add scale_invariant and shift_invariant flags readout from the model config
further refactor ensemble_depth
support ensembling without alignment
add ensemble_depth docstring

* fix generator device placement checks

* move encode_empty_text body into the pipeline call

* minor empty text encoding simplifications

* adjust pipelines' class docstrings to explain the added construction arguments

* improve the scipy failure condition
add comments
improve docstrings
change the default use_full_z_range to True

* make input image values range check configurable in the preprocessor
refactor load_image_canonical in preprocessor to reject unknown types and return the image as a 4D tensor on the right device
support a list of everything as inputs to the pipeline, change type to PipelineImageInput
implement a check that all input list elements have the same dimensions
improve docstrings of pipeline outputs
remove check_input pipeline argument

* remove forgotten print

* add prediction_type model config

* add uncertainty visualization into export utils
fix NaN values in normals uncertainties

* change default of output_uncertainty to False
better handle the case of an attempt to export or visualize none

* fix `output_uncertainty=False`

* remove kwargs
fix check_inputs according to the new inputs of the pipeline

* rename prepare_latent into prepare_latents as in other pipelines
annotate prepare_latents in normals pipeline with "Copied from"
annotate encode_image in normals pipeline with "Copied from"

* move nested-capable `progress_bar` method into the pipelines
revert the original `progress_bar` method in pipeline_utils

* minor message improvement

* fix cpu offloading

* move colormap, visualize_depth, export_depth_to_16bit_png, visualize_normals, visualize_uncertainty to marigold_image_processing.py
update example docstrings

* fix missing comma

* change torch.FloatTensor to torch.Tensor

* fix importing of MarigoldImageProcessor

* fix vae offloading
fix batched image encoding
remove separate encode_image function and use vae.encode instead

* implement marigold's initial tests
relax generator checks in line with other pipelines
implement return_dict __call__ argument in line with other pipelines

* fix num_images computation

* remove MarigoldImageProcessor and outputs from import structure
update tests

* update docstrings

* update init

* update

* style

* fix

* fix

* up

* up

* up

* add simple test

* up

* update expected np input/output to be channel last

* move expand_tensor_or_array into the MarigoldImageProcessor

* rewrite tests to follow conventions - hardcoded slices instead of image artifacts
write more smoke tests

* add basic docs.

* add anton's contribution statement

* remove todos.

* fix assertion values for marigold depth slow tests

* fix assertion values for depth normals.

* remove print

* support AutoencoderTiny in the pipelines

* update documentation page
add Available Pipelines section
add Available Checkpoints section
add warning about num_inference_steps

* fix missing import in docstring
fix wrong value in visualize_depth docstring

* [doc] add marigold to pipelines overview

* [doc] add section "usage examples"

* fix an issue with latents check in the pipelines

* add "Frame-by-frame Video Processing with Consistency" section

* grammarly

* replace tables with images with css-styled images (blindly)

* style

* print

* fix the assertions.

* take from the github runner.

* take the slices from action artifacts

* style.

* update with the slices from the runner.

* remove unnecessary code blocks.

* Revert "[doc] add marigold to pipelines overview"

This reverts commit a505165150afd8dab23c474d1a054ea505a56a5f.

* remove invitation for new modalities

* split out marigold usage examples

* doc cleanup

---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>

b3d10d6d

24 May, 2024 3 commits
- Fix a grammatical error in the `raise` messages (#8272) · db33af06
  Tolga Cangöz authored May 24, 2024
```
Fix grammatical error
```
  db33af06
- Respect `resume_download` deprecation V2 (#8267) · edf5ba6a
  Lucain authored May 24, 2024
```
* Fix resume_download FutureWarning

* only resume download
```
  edf5ba6a
- Use `freedesktop_os_release()` in diffusers cli for Python >=3.10 (#8235) · 370146e4
  Dhruv Nair authored May 24, 2024
```
* update

* update
```
  370146e4
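A minimal sketch of the idea behind #8235, not the code from the PR: `platform.freedesktop_os_release()` exists only on Python 3.10 and later and raises `OSError` when no os-release file can be read, so a caller needs a fallback. The helper name `describe_os` is hypothetical.

```python
import platform
import sys


def describe_os():
    """Return a readable OS description, preferring os-release data when available."""
    # platform.freedesktop_os_release() was added in Python 3.10.
    if sys.version_info >= (3, 10):
        try:
            info = platform.freedesktop_os_release()
            # PRETTY_NAME and NAME are standard os-release keys.
            return info.get("PRETTY_NAME") or info.get("NAME", "Linux")
        except OSError:
            # No os-release file (for example on macOS or Windows).
            pass
    # Fallback that works on every platform and Python version.
    return platform.platform()


print(describe_os())
```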
23 May, 2024 1 commit
- Fix resize issue in SVD pipeline with VideoProcessor (#8229) · 67b3fe0a
  Dhruv Nair authored May 23, 2024
```
update
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
```
  67b3fe0a
22 May, 2024 1 commit
- fix: Attribute error in Logger object (logger.warning) (#8183) · 509741ae
  BootesVoid authored May 21, 2024
  
  509741ae
21 May, 2024 1 commit
- [docs] VideoProcessor (#7965) · fdb1baa0
  Steven Liu authored May 20, 2024
```
* fix?

* fix?

* fix
```
  fdb1baa0
20 May, 2024 5 commits

Make VAE compatible to torch.compile() (#7984) · 6529ee67
Vinh H. Pham authored May 21, 2024
```
make VAE compatible to torch.compile()
Co-authored-by: YiYi Xu <yixu310@gmail.com>
```
6529ee67
fix: Fixed few `docstrings` according to the Google Style Guide (#7717) · df2bc5ef
Sai-Suraj-27 authored May 20, 2024
```
Fixed a few docstrings according to the Google Style Guide.
```
df2bc5ef

Passing `cross_attention_kwargs` to `StableDiffusionInstructPix2PixPipeline` (#7961) · a7bf77fc

Aleksei Zhuravlev authored May 20, 2024

* Update pipeline_stable_diffusion_instruct_pix2pix.py

Add `cross_attention_kwargs` to `__call__` method of `StableDiffusionInstructPix2PixPipeline`, which are passed to UNet.

* Update documentation for pipeline_stable_diffusion_instruct_pix2pix.py

* Update docstring

* Update docstring

* Fix typing import

a7bf77fc

[docs] add doc for PixArtSigmaPipeline (#7857) · 0f0defdb

Junsong Chen authored May 21, 2024



* add doc for PixArtSigmaPipeline

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Guillaume LEGENDRE <glegendre01@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>
Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Hyoungwon Cho <jhw9811@korea.ac.kr>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Tolga Cangöz <46008593+standardAI@users.noreply.github.com>
Co-authored-by: Philip Pham <phillypham@google.com>

0f0defdb

Update pipeline_controlnet_inpaint_sd_xl.py (#7983) · 19df9f3e
Nikita authored May 20, 2024

19df9f3e

16 May, 2024 3 commits

Consistent SDXL Controlnet callback tensor inputs (#7958) · 6c60e430

Álvaro Somoza authored May 16, 2024

* make _callback_tensor_inputs consistent between sdxl pipelines

* forgot this one

* fix failing test

* fix test_components_function

* fix controlnet inpaint tests

6c60e430

Fix the text tokenizer name in logger warning of PixArt pipelines (#7912) · 746f603b
Liang Hou authored May 16, 2024
```
Fix CLIP to T5 in logger warning
```
746f603b

refactor: Refactored code by Merging `isinstance` calls (#7710) · 2afea72d

Sai-Suraj-27 authored May 16, 2024



* Merged isinstance calls to make the code simpler.

* Corrected formatting errors using ruff.

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

2afea72d
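The refactor named in #7710 can be illustrated as follows. This is a generic sketch with hypothetical function names, not code taken from the PR: `isinstance` accepts a tuple of types, so chained calls joined by `or` collapse into one call.

```python
def is_sequence_before(value):
    # Two separate isinstance calls joined with `or`.
    return isinstance(value, list) or isinstance(value, tuple)


def is_sequence_after(value):
    # One call with a tuple of types; behaviour is identical.
    return isinstance(value, (list, tuple))


for sample in ([1, 2], (1, 2), "text", 3):
    assert is_sequence_before(sample) == is_sequence_after(sample)
```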

15 May, 2024 1 commit

Adding VQGAN Training script (#5483) · d27e996c

Isamu Isozaki authored May 15, 2024



* Init commit

* Removed einops

* Added default movq config for training

* Update explanation of prompts

* Fixed inheritance of discriminator and init_tracker

* Fixed incompatible api between muse and here

* Fixed output

* Setup init training

* Basic structure done

* Removed attention for quick tests

* Style fixes

* Fixed vae/vqgan styles

* Removed redefinition of wandb

* Fixed log_validation and tqdm

* Nothing commit

* Added commit loss to lookup_from_codebook

* Update src/diffusers/models/vq_model.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Adding preliminary README

* Fixed one typo

* Local changes

* Fixed main issues

* Merging

* Update src/diffusers/models/vq_model.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Testing+Fixed bugs in training script

* Some style fixes

* Added wandb to docs

* Fixed timm test

* get testing suite ready.

* remove return loss

* remove return_loss

* Remove diffs

* Remove diffs

* fix ruff format

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

d27e996c

14 May, 2024 3 commits

Fix `added_cond_kwargs` when using IP-Adapter in StableDiffusionXLControlNetInpaintPipeline (#7924) · b2140a89

Nikita authored May 14, 2024

Fix `added_cond_kwargs` when using IP-Adapter

Fix error when using IP-Adapter in pipeline and passing `ip_adapter_image_embeds` instead of `ip_adapter_image`
Co-authored-by: YiYi Xu <yixu310@gmail.com>

b2140a89

[Core] separate the loading utilities in modeling similar to pipelines. (#7943) · e0e8c58f
Sayak Paul authored May 14, 2024
```
separate the loading utilities in modeling similar to pipelines.
```
e0e8c58f

Expansion proposal of `diffusers-cli env` (#7403) · a1245c2c

Tolga Cangöz authored May 14, 2024



* Expand `diffusers-cli env`

* SafeTensors -> Safetensors
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Move `safetensors_version = "not installed"` to `else`

* Update `safetensors_version` checking

* Add GPU detection for Linux, Mac OS, and Windows

* Add accelerator detection to environment command

* Add is_peft_version to import_utils

* Update env.py

* Add `huggingface_hub` reference

* Add `transformers` reference

* Add reference for `huggingface_hub`

* Fix print statement in env.py for unusual OS

* Up

* Fix platform information in env.py

* up

* Fix import order in env.py

* ruff

* make style

* Fix platform system check in env.py

* Fix run method return type in env.py

* 🤗



* No need for f-string

* Remove location info

* Remove accelerate config

* Refactor env.py to remove accelerate config

* feat: Add support for `bitsandbytes` library in environment command

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

a1245c2c

13 May, 2024 3 commits
- fix multicontrolnet `save_pretrained` logic for compatibility (#7821) · b41ce1e0
  rebel-kblee authored May 14, 2024
```
fix multicontrolnet save_pretrained logic for compatibility
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
```
  b41ce1e0
- fix AnimateDiff creation with a unet loaded with IP Adapter (#7791) · 44aa9e56
  Fabio Rigano authored May 13, 2024
```
* Fix loading from_pipe

* Fix style

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
```
  44aa9e56
- Official callbacks (#7761) · fdb05f54
  Álvaro Somoza authored May 12, 2024
  
  fdb05f54
12 May, 2024 2 commits
- [Core] fix offload behaviour when device_map is enabled. (#7919) · 5bb38586
  Sayak Paul authored May 12, 2024
```
fix offload behaviour when device_map is enabled.
```
  5bb38586
- add custom sigmas and timesteps for StableDiffusionXLControlNet pipeline (#7913) · e4f8dca9
  momo authored May 12, 2024
```
add custom sigmas and timesteps
```
  e4f8dca9
10 May, 2024 2 commits

#7535 Update FloatTensor type hints to Tensor (#7883) · be4afa0b

Mark Van Aken authored May 10, 2024

* find & replace all FloatTensors to Tensor

* apply formatting

* Update torch.FloatTensor to torch.Tensor in the remaining files

* formatting

* Fix the rest of the places where FloatTensor is used as well as in documentation

* formatting

* Update new file from FloatTensor to Tensor

be4afa0b

[Core] introduce videoprocessor. (#7776) · 04f4bd54

Sayak Paul authored May 10, 2024



* introduce videoprocessor.

* fix quality

* address yiyi's feedback

* fix preprocess_video call.

* video_processor -> image_processor

* fix

* fix more.

* quality

* image_processor -> video_processor

* support List[List[PIL.Image.Image]]

* change to video_processor.

* documentation

* Apply suggestions from code review

* changes

* remove print.

* refactor video processor (part # 7776) (#7861)

* update

* update remove deprecate

* Update src/diffusers/video_processor.py

* update

* Apply suggestions from code review

* deprecate list of 5d for video and list of 4d for image + apply other feedbacks

* up

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add doc.

* tensor2vid -> postprocess_video.

* refactor preprocess with preprocess_video

* set default values.

* empty commit

* more refactoring of prepare_latents in animatediff vid2vid

* checking documentation

* remove documentation for now.

* fix animatediff sdxl

* fix test failure [part of video processor PR] (#7905)

up

* remove preceed_with_frames.

* doc

* fix

* fix

* remove video input as a single-frame video.

---------
Co-authored-by: YiYi Xu <yixu310@gmail.com>

04f4bd54

09 May, 2024 5 commits

[scheduler] support custom `timesteps` and `sigmas` (#7817) · b934215d

YiYi Xu authored May 09, 2024



* support custom sigmas and timesteps, dpm euler

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

b934215d

fix `_optional_components` in `StableCascadeCombinedPipeline` (#7894) · 5ed3abd3
YiYi Xu authored May 09, 2024
```
* fix

* up
```
5ed3abd3

[Tests] fix things after #7013 (#7899) · 305f2b44

Sayak Paul authored May 09, 2024

* debugging

* save the resulting image

* check if order reversing works.

* checking values.

* up

* okay

* checking

* fix

* remove print

305f2b44

[Refactor] Better align `from_single_file` logic with `from_pretrained` (#7496) · cb0f3b49

Dhruv Nair authored May 09, 2024



* refactor unet single file loading a bit.

* retrieve the unet from create_diffusers_unet_model_from_ldm

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* tests

* update

* update

* update

* Update docs/source/en/api/single_file.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/single_file.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/api/loaders/single_file.md
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/loaders/single_file.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update docs/source/en/api/loaders/single_file.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/loaders/single_file.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/loaders/single_file.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/loaders/single_file.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

---------
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

cb0f3b49

Fix several imports (#7712) · caf9e985
Tolga Cangöz authored May 09, 2024
```
Fix imports
```
caf9e985

08 May, 2024 5 commits

Remove dead code and fix f-string issue (#7720) · c1c42698

Tolga Cangöz authored May 09, 2024

* Remove dead code

* Pylance reportGeneralTypeIssues: Strings nested within an f-string cannot use the same quote character as the f-string prior to Python 3.12.

* Remove dead code

c1c42698
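The f-string issue in #7720 can be shown with a small sketch. The variable names are illustrative and not from the PR. Before Python 3.12, reusing the outer quote character inside a replacement field is a syntax error, so the portable form uses a different quote inside the braces.

```python
config = {"name": "unet"}

# Not portable: f"Loading {config["name"]}" only parses on Python 3.12 and later.
# Portable: use a different quote character inside the replacement field.
message = f"Loading {config['name']}"

print(message)
```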

Allow users to save SDXL LoRA weights for only one text encoder (#7607) · 75aab346

Pierre Dulac authored May 08, 2024



SDXL LoRA weights for text encoders should be decoupled on save

The method checks that at least one of the unet, text_encoder, and
text_encoder_2 LoRA weights is passed, which was not reflected in the
implementation.
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

75aab346
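A rough sketch of the behaviour described in #7607, under stated assumptions: the helper `collect_lora_layers`, its parameter names, and the key prefixes are hypothetical and do not reproduce the diffusers implementation. The check requires at least one component, and each component is then handled on its own, so passing only one text encoder works.

```python
def collect_lora_layers(
    unet_lora_layers=None,
    text_encoder_lora_layers=None,
    text_encoder_2_lora_layers=None,
):
    """Hypothetical helper: merge whichever LoRA layer dicts were supplied."""
    groups = {
        "unet": unet_lora_layers,
        "text_encoder": text_encoder_lora_layers,
        "text_encoder_2": text_encoder_2_lora_layers,
    }
    # Require at least one component, not all of them.
    if all(layers is None for layers in groups.values()):
        raise ValueError("Pass at least one set of LoRA layers.")

    state_dict = {}
    for prefix, layers in groups.items():
        if layers is None:
            continue  # each component is optional on its own
        for name, weight in layers.items():
            state_dict[f"{prefix}.{name}"] = weight
    return state_dict


# Saving weights for only the second text encoder is accepted.
print(collect_lora_layers(text_encoder_2_lora_layers={"q_proj.lora_A": [0.1]}))
```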

[Pipeline] AnimateDiff SDXL (#6721) · 818f7607

Aryan authored May 08, 2024



* update conversion script to handle motion adapter sdxl checkpoint

* add animatediff xl

* handle addition_embed_type

* fix output

* update

* add imports

* make fix-copies

* add decode latents

* update docstrings

* add animatediff sdxl to docs

* remove unnecessary lines

* update example

* add test

* revert conv_in conv_out kernel param

* remove unused param addition_embed_type_num_heads

* latest IPAdapter impl

* make fix-copies

* fix return

* add IPAdapterTesterMixin to tests

* fix return

* revert based on suggestion

* add freeinit

* fix test_to_dtype test

* use StableDiffusionMixin instead of different helper methods

* fix progress bar iterations

* apply suggestions from review

* hardcode flip_sin_to_cos and freq_shift

* make fix-copies

* fix ip adapter implementation

* fix last failing test

* make style

* Update docs/source/en/api/pipelines/animatediff.md
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* remove todo

* fix doc-builder errors

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

818f7607

Check shape and remove deprecated APIs in scheduling_ddpm_flax.py (#7703) · f29b9348

Philip Pham authored May 08, 2024

`model_output.shape` may only have rank 1.

There are warnings related to use of random keys.

```
tests/schedulers/test_scheduler_flax.py: 13 warnings
  /Users/phillypham/diffusers/src/diffusers/schedulers/scheduling_ddpm_flax.py:268: FutureWarning: normal accepts a single key, but was given a key array of shape (1, 2) != (). Use jax.vmap for batching. In a future JAX version, this will be an error.
    noise = jax.random.normal(split_key, shape=model_output.shape, dtype=self.dtype)

tests/schedulers/test_scheduler_flax.py::FlaxDDPMSchedulerTest::test_betas
  /Users/phillypham/virtualenv/diffusers/lib/python3.9/site-packages/jax/_src/random.py:731: FutureWarning: uniform accepts a single key, but was given a key array of shape (1,) != (). Use jax.vmap for batching. In a future JAX version, this will be an error.
    u = uniform(key, shape, dtype, lo, hi)  # type: ignore[arg-type]
```

f29b9348

Fix image upcasting (#7858) · d50baf0c

Tolga Cangöz authored May 08, 2024



Fix image's upcasting before `vae.encode()` when using `fp16`
Co-authored-by: YiYi Xu <yixu310@gmail.com>

d50baf0c

07 May, 2024 1 commit
- Fix for "no lora weight found module" with some loras (#7875) · 23e09156
  Álvaro Somoza authored May 07, 2024
```
* return layer weight if not found

* better system and test

* key example and typo
```
  23e09156
03 May, 2024 2 commits

Add Ascend NPU support for SDXL fine-tuning and fix the model saving bug when using DeepSpeed. (#7816) · 58237364

HelloWorldBeginner authored May 04, 2024

* Add Ascend NPU support for SDXL fine-tuning and fix the model saving bug when using DeepSpeed.

* fix check code quality

* Decouple the NPU flash attention and make it an independent module.

* add doc and unit tests for npu flash attention.

---------
Co-authored-by: mhh001 <mahonghao1@huawei.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

58237364

Respect `resume_download` deprecation (#7843) · 6a479588

Lucain authored May 03, 2024



* Deprecate resume_download

* align docstring with transformers

* style

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

6a479588
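The deprecation pattern behind #7843 can be sketched generically. The function `fetch`, its return value, and the warning text are hypothetical and are not the diffusers or huggingface_hub code. The argument defaults to `None`, and a `FutureWarning` is emitted only when a caller still passes it.

```python
import warnings


def fetch(path, resume_download=None):
    """Hypothetical downloader used only to illustrate the deprecation pattern."""
    if resume_download is not None:
        # Warn callers that still pass the deprecated argument.
        warnings.warn(
            "`resume_download` is deprecated and has no effect.",
            FutureWarning,
            stacklevel=2,
        )
    return f"fetched {path}"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fetch("model.bin")                        # no warning
    fetch("model.bin", resume_download=True)  # one FutureWarning
    print(len(caught))
```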

02 May, 2024 1 commit

Fix key error for dictionary with randomized order in convert_ldm_unet_checkpoint (#7680) · c1b2a89e

yunseong Cho authored May 02, 2024



fix key error for different order
Co-authored-by: yunseong <yunseong.cho@superlabs.us>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

c1b2a89e