Commits · 5a196e3d46e87d50a1c993a13d2589d40739dc63 · renzhc / diffusers_dcu

15 Dec, 2024 1 commit

[Sana] Add Sana, including `SanaPipeline`, `SanaPAGPipeline`, `LinearAttentionProcessor`, `Flow-based DPM-solver` and so on. (#9982) · 5a196e3d

Junsong Chen authored Dec 16, 2024

* first add a script for DC-AE;

* DC-AE init

* replace triton with custom implementation

* 1. rename file and remove unused code;

* no longer rely on omegaconf and dataclass

* replace custom activation with diffusers activation

* remove dc_ae attention in attention_processor.py

* inherit from ModelMixin

* inherit from ConfigMixin

* dc-ae reduce to one file

* update downsample and upsample

* clean code

* support DecoderOutput

* remove get_same_padding and val2tuple

* remove autocast and some assert

* update ResBlock

* remove contents within super().__init__

* Update src/diffusers/models/autoencoders/dc_ae.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove opsequential

* update other blocks to support the removal of build_norm

* remove build encoder/decoder project in/out

* remove inheritance of RMSNorm2d from LayerNorm

* remove reset_parameters for RMSNorm2d
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove device and dtype in RMSNorm2d __init__
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove op_list & build_block

* remove build_stage_main

* change file name to autoencoder_dc

* move LiteMLA to attention.py

* align with other vae decode output;

* add DC-AE into init files;

* update

* make quality && make style;

* quick push before dgx disappears again

* update

* make style

* update

* update

* fix

* refactor

* refactor

* refactor

* update

* possibly change to nn.Linear

* refactor

* make fix-copies

* replace vae with ae

* replace get_block_from_block_type to get_block

* replace downsample_block_type from Conv to conv for consistency

* add scaling factors

* incorporate changes for all checkpoints

* make style

* move mla to attention processor file; split qkv conv to linears

* refactor

* add tests

* from original file loader

* add docs

* add standard autoencoder methods

* combine attention processor

* fix tests

* update

* minor fix

* minor fix

* minor fix & in/out shortcut rename

* minor fix

* make style

* fix paper link

* update docs

* update single file loading

* make style

* remove single file loading support; todo for DN6

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add abstract

* 1. add DCAE into diffusers;
2. make style and make quality;

* add DCAE_HF into diffusers;

* bug fixed;

* add SanaPipeline, SanaTransformer2D into diffusers;

* add sanaLinearAttnProcessor2_0;

* first update for SanaTransformer;

* first update for SanaPipeline;

* first successful run of SanaPipeline;

* model output finally matches the original model given the same input;

* code update;

* code update;

* add a flow dpm-solver script

* 🎉[important update]
1. Integrate flow-dpm-solver into diffusers;
2. finally runs successfully on both `FlowMatchEulerDiscreteScheduler` and `FlowDPMSolverMultistepScheduler`;

* 🎉🔧

[important update & fix huge bugs!!]
1. add SanaPAGPipeline & several related Sana linear attention operators;
2. `SanaTransformer2DModel` now supports multi-resolution input;
3. fix the multi-scale HW bugs in SanaPipeline and SanaPAGPipeline;
4. fix the flow-dpm-solver set_timestep() init `model_output` and `lower_order_nums` bugs;

* remove prints;

* add a script to convert the official Sana checkpoint to diffusers-format safetensors.

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/pag/pipeline_pag_sana.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/sana/pipeline_sana.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/sana/pipeline_sana.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update Sana for DC-AE's recent commit;

* make style && make quality

* Add StableDiffusion3PAGImg2Img Pipeline + Fix SD3 Unconditional PAG (#9932)

* fix progress bar updates in SD 1.5 PAG Img2Img pipeline

---------
Co-authored-by: Vinh H. Pham <phamvinh257@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* allow the vae to be None in `__init__` of `SanaPipeline`

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: hlky <hlky@hlky.ac>

* change the ae related code due to the latest update of DCAE branch;

* change the ae related code due to the latest update of DCAE branch;

* 1. change code based on AutoencoderDC;
2. fix the bug of new GLUMBConv;
3. run success;

* update for solving conversation.

* 1. fix bugs and run the convert script successfully;
2. Downloading ckpt from hub automatically;

* make style && make quality;

* 1. remove unused parameters in init;
2. code update;

* remove test file

* refactor; add docs; add tests; update conversion script

* make style

* make fix-copies

* refactor

* update pipelines

* pag tests and refactor

* remove sana pag conversion script

* handle weight casting in conversion script

* update conversion script

* add a processor

* 1. add bf16 pth file path;
2. add complex human instruct in pipeline;

* fix fast tests

* change gemma-2-2b-it ckpt to a non-gated repo;

* fix the pth path bug in conversion script;

* change grad ckpt to original; make style

* fix the complex_human_instruct bug and typo;

* remove dpmsolver flow scheduler

* apply review suggestions

* change the default scheduler from `FlowMatchEulerDiscreteScheduler` to `DPMSolverMultistepScheduler` with flow matching.

* fix the tokenizer.padding_side='right' bug;

* update docs

* make fix-copies

* fix imports

* fix docs

* add integration test

* update docs

* update examples

* fix convert_model_output in schedulers

* fix failing tests

---------
Co-authored-by: Junyu Chen <chenjydl2003@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: chenjy2003 <70215701+chenjy2003@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>

5a196e3d

13 Dec, 2024 1 commit

Use `torch` in `get_2d_sincos_pos_embed` and `get_3d_sincos_pos_embed` (#10156) · 63243406

hlky authored Dec 13, 2024

* Use torch in get_2d_sincos_pos_embed

* Use torch in get_3d_sincos_pos_embed

* get_1d_sincos_pos_embed_from_grid in LatteTransformer3DModel

* deprecate

* move deprecate, make private

63243406

12 Dec, 2024 4 commits

update StableDiffusion3Img2ImgPipeline.add image size validation (#10166) · bdbaea8f

Bios authored Dec 13, 2024



* update StableDiffusion3Img2ImgPipeline.add image size validation

---------
Co-authored-by: hlky <hlky@hlky.ac>

bdbaea8f

refactor StableDiffusionXLControlNetUnion (#10200) · e8b65bff
hlky authored Dec 12, 2024
```
mode
```
e8b65bff
Remove `negative_*` from SDXL callback (#10203) · f2d348d9
hlky authored Dec 12, 2024
```
* Remove `negative_*` from SDXL callback

* Change example and add XL version
```
f2d348d9

[core] LTX Video (#10021) · 96c376a5

Aryan authored Dec 12, 2024



* transformer

* make style & make fix-copies

* transformer

* add transformer tests

* 80% vae

* make style

* make fix-copies

* fix

* undo cogvideox changes

* update

* update

* match vae

* add docs

* t2v pipeline working; scheduler needs to be checked

* docs

* add pipeline test

* update

* update

* make fix-copies

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update

* copy t2v to i2v pipeline

* update

* apply review suggestions

* update

* make style

* remove framewise encoding/decoding

* pack/unpack latents

* image2video

* update

* make fix-copies

* update

* update

* rope scale fix

* debug layerwise code

* remove debug

* Apply suggestions from code review
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* propagate precision changes to i2v pipeline

* remove downcast

* address review comments

* fix comment

* address review comments

* [Single File] LTX support for loading original weights (#10135)

* from original file mixin for ltx

* undo config mapping fn changes

* update

* add single file to pipelines

* update docs

* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py

* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py

* rename classes based on ltx review

* point to original repository for inference

* make style

* resolve conflicts correctly

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

96c376a5

11 Dec, 2024 2 commits
- Add ControlNetUnion (#10131) · 914a585b
  hlky authored Dec 11, 2024
```
* ControlNetUnion model
```
  914a585b
- Added Error when len(gligen_images) is not equal to len(gligen_phrases) in... · d041dd50
  SahilCarterr authored Dec 11, 2024
```
Added Error when len(gligen_images) is not equal to len(gligen_phrases) in StableDiffusionGLIGENTextImagePipeline (#10176)

* added check value error

* fix style
```
  d041dd50
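
The check described in this commit can be sketched as follows. This is only an illustration of the idea, not the code from the PR: the helper name `check_gligen_lengths` and the error wording are assumptions, and only the compared argument names (`gligen_images`, `gligen_phrases`) come from the commit title.

```python
def check_gligen_lengths(gligen_images, gligen_phrases):
    """Hypothetical helper: reject mismatched GLIGEN grounding inputs early."""
    # Each grounding image is expected to pair with exactly one phrase,
    # so a length mismatch is reported before any model work starts.
    if len(gligen_images) != len(gligen_phrases):
        raise ValueError(
            f"Got {len(gligen_images)} gligen_images but "
            f"{len(gligen_phrases)} gligen_phrases; "
            "the two lists must have the same length."
        )


# Matching lengths pass silently.
check_gligen_lengths(["image_0"], ["a cat"])
```

Raising a `ValueError` up front gives the caller a clear message instead of a shape error deep inside the pipeline.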
10 Dec, 2024 2 commits

Add PAG Support for Stable Diffusion Inpaint Pipeline (#9386) · 65b98b5d

Darshil Jariwala authored Dec 11, 2024



* using sd inpaint pipeline and sdxl pag inpaint pipeline to add changes

* using sd inpaint pipeline and sdxl pag inpaint pipeline to add changes

* finished the call function

* added auto pipeline

* merging diffusers

* ready to test

* ready to test

* added copied from and removed unnecessary tests

* make style changes

* doc changes

* updating example doc string

* style fix

* init

* adding imports

* quality

* Update src/diffusers/pipelines/pag/pipeline_pag_sd_inpaint.py

* make

* Update tests/pipelines/pag/test_pag_sd_inpaint.py

* slice and size

* slice

---------
Co-authored-by: Darshil Jariwala <darshiljariwala@Darshils-MacBook-Air.local>
Co-authored-by: Darshil Jariwala <jariwala.darshil2002@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>

65b98b5d

Use `torch` in `get_3d_rotary_pos_embed`/`_allegro` (#10161) · 4c4b323c
hlky authored Dec 10, 2024
```
Use torch in get_3d_rotary_pos_embed/_allegro
```
4c4b323c

06 Dec, 2024 1 commit
- Remove duplicate checks for len(generator) != batch_size when generator is a list (#10134) · 18f9b990
  Aryan authored Dec 06, 2024
```
remove duplicate checks
```
  18f9b990
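
For context, the check named in this commit title might look roughly like the sketch below. The function name `check_generator_list` and the message text are assumptions for illustration; the commit only states that duplicate copies of such a check were removed.

```python
def check_generator_list(generator, batch_size):
    """Hypothetical helper: a list of generators must have one entry per sample."""
    # A single generator (or None) is always acceptable; only a list
    # has a length that needs to agree with the batch size.
    if isinstance(generator, list) and len(generator) != batch_size:
        raise ValueError(
            f"Got a list of {len(generator)} generators for a batch size of "
            f"{batch_size}; the two must match."
        )


# A non-list generator is never rejected by this check.
check_generator_list(None, batch_size=4)
```

Keeping the check in one place means a mismatch is reported once, with one message, rather than from several call sites.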
04 Dec, 2024 5 commits

[Flux Redux] add prompt & multiple image input (#10056) · 04bba387
Linoy Tsaban authored Dec 04, 2024
```
* add multiple prompts to flux redux

---------
Co-authored-by: hlky <hlky@hlky.ac>
```
04bba387
Add `sigmas` to pipelines using FlowMatch (#10116) · a2d424eb
hlky authored Dec 04, 2024

a2d424eb

[bitsandbytes] allow direct CUDA placement of pipelines loaded with bnb components (#9840) · e8da75df

Sayak Paul authored Dec 04, 2024



* allow device placement when using bnb quantization.

* warning.

* tests

* fixes

* docs.

* require accelerate version.

* remove print.

* revert to()

* tests

* fixes

* fix: missing AutoencoderKL lora adapter (#9807)

* fix: missing AutoencoderKL lora adapter

* fix

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* fixes

* fix condition test

* updates

* updates

* remove is_offloaded.

* fixes

* better

* empty

---------
Co-authored-by: Emmanuel Benazera <emmanuel.benazera@jolibrain.com>

e8da75df

Fix `pipeline_stable_audio` formatting (#10114) · 8a450c3d
hlky authored Dec 04, 2024

8a450c3d
add torch_xla support in pipeline_stable_audio.py (#10109) · 9ff72433
fancy45daddy authored Dec 04, 2024
```
Update pipeline_stable_audio.py
```
9ff72433

03 Dec, 2024 4 commits

Fix multi-prompt inference (#10103) · 6a51427b

hlky authored Dec 03, 2024



* Fix multi-prompt inference

Fix generation of multiple images with multiple prompts, e.g. `len(prompts) > 1` and `num_images_per_prompt > 1`

* make

* fix copies

---------
Co-authored-by: Nikita Balabin <nikita@mxl.ru>

6a51427b
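
The commit message does not describe the fix itself. As a rough illustration of the case it names (several prompts, several images per prompt), the sketch below shows the batch ordering in which each prompt's copies sit next to each other. The helper `expand_prompts` is hypothetical and works on plain lists rather than embedding tensors.

```python
def expand_prompts(prompts, num_images_per_prompt):
    """Hypothetical helper: repeat each prompt so its copies are adjacent."""
    # Order is prompt-major: all copies of prompts[0], then all copies of
    # prompts[1], and so on. Other per-prompt inputs need the same order
    # so that row i of every input refers to the same prompt.
    return [p for p in prompts for _ in range(num_images_per_prompt)]


batch = expand_prompts(["a cat", "a dog"], num_images_per_prompt=2)
print(batch)  # ['a cat', 'a cat', 'a dog', 'a dog']
```

If one input were tiled instead (`a cat, a dog, a cat, a dog`) while another used the order above, prompts and their conditioning would be paired incorrectly once both lengths exceed one.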

Avoid compiling a progress bar. (#10098) · 619b9658

lsb authored Dec 03, 2024

* Avoid creating a progress bar when it is disabled.

This is useful when exporting a pipeline, and allows a compiler to avoid trying to compile away tqdm.

* Prevent the PyTorch compiler from compiling progress bars.

* Update pipeline_utils.py

619b9658

Add StableDiffusion3PAGImg2Img Pipeline + Fix SD3 Unconditional PAG (#9932) · 63b631f3

Benjamin Paine authored Dec 03, 2024



* fix progress bar updates in SD 1.5 PAG Img2Img pipeline



---------
Co-authored-by: Vinh H. Pham <phamvinh257@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

63b631f3

Let server decide default repo visibility (#10047) · 0763a7ed
Lucain authored Dec 03, 2024

0763a7ed

02 Dec, 2024 3 commits
- Fix `num_images_per_prompt>1` with Skip Guidance Layers in `StableDiffusion3Pipeline` (#10086) · beb85668
  hlky authored Dec 02, 2024
  
  beb85668
- fix offloading for sd3.5 controlnets (#10072) · cd344393
  YiYi Xu authored Dec 02, 2024
```
* add
```
  cd344393
- Add `sigmas` to Flux pipelines (#10081) · 8d386f79
  hlky authored Dec 02, 2024
  
  8d386f79
27 Nov, 2024 1 commit

Sd35 controlnet (#10020) · 75bd1e83

YiYi Xu authored Nov 27, 2024



* add model/pipeline
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

75bd1e83

23 Nov, 2024 1 commit

Flux Fill, Canny, Depth, Redux (#9985) · 7ac6e286

Aryan authored Nov 23, 2024



* update

---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

7ac6e286

21 Nov, 2024 1 commit
- Fix prepare latent image ids and vae sample generators for flux (#9981) · cd6ca9df
  Aryan authored Nov 21, 2024
```
* fix

* update expected slice
```
  cd6ca9df
20 Nov, 2024 3 commits
- fix controlnet module refactor (#9968) · e564abe2
  YiYi Xu authored Nov 20, 2024
```
* fix
```
  e564abe2
- [LoRA] enable LoRA for Mochi-1 (#9943) · 805aa937
  Sayak Paul authored Nov 21, 2024
```
* feat: add lora support to Mochi-1.
```
  805aa937
- Flux latents fix (#9929) · f6f7afa1
  Dhruv Nair authored Nov 20, 2024
```
* update

* update

* update

* update

* update

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
```
  f6f7afa1
19 Nov, 2024 2 commits

add skip_layers argument to SD3 transformer model class (#9880) · 99c0483b

Bagheera authored Nov 19, 2024



* add skip_layers argument to SD3 transformer model class

* add unit test for skip_layers in stable diffusion 3

* sd3: pipeline should support skip layer guidance

* up

---------
Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>

99c0483b

Make CogVideoX RoPE implementation consistent (#9963) · 0583a8d1
Aryan authored Nov 19, 2024
```
* update cogvideox rope implementation

* apply suggestions from review
```
0583a8d1

18 Nov, 2024 1 commit

CogVideoX 1.5 (#9877) · 3b283061

Yuxuan.Zhang authored Nov 19, 2024



* CogVideoX1_1PatchEmbed test

* 1360 * 768

* refactor

* make style

* update docs

* add modeling tests for cogvideox 1.5

* update

* make fix-copies

* add ofs embed(for convert)

* add ofs embed(for convert)

* more resolution for cogvideox1.5-5b-i2v

* use even number of latent frames only

* update pipeline implementations

* make style

* set patch_size_t as None by default

* #skip frames 0

* refactor

* make style

* update docs

* fix ofs_embed

* update docs

* invert_scale_latents

* update

* fix

* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/cogvideox_transformer_3d.py

* update conversion script

* remove copied from

* fix test

* Update docs/source/en/api/pipelines/cogvideox.md

* Update docs/source/en/api/pipelines/cogvideox.md

* Update docs/source/en/api/pipelines/cogvideox.md

* Update docs/source/en/api/pipelines/cogvideox.md

---------
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

3b283061

17 Nov, 2024 1 commit
- Correct pipeline_output.py to the type Mochi (#9945) · 07d0fbf3
  _ authored Nov 17, 2024
```
Correct pipeline_output.py
```
  07d0fbf3
14 Nov, 2024 2 commits

Update pipeline_flux_img2img.py (#9928) · 5c94937d

Sam authored Nov 14, 2024

* Update pipeline_flux_img2img.py

Added FromSingleFileMixin to this pipeline loader like the other FLUX pipelines.

* Update pipeline_flux_img2img.py

typo

* modified:   src/diffusers/pipelines/flux/pipeline_flux_img2img.py

5c94937d

Fix Progress Bar Updates in SD 1.5 PAG Img2Img pipeline (#9925) · d74483c4
Benjamin Paine authored Nov 14, 2024
```
fix progress bar updates in SD 1.5 PAG Img2Img pipeline
```
d74483c4

08 Nov, 2024 3 commits
- Revert "[Flux] reduce explicit device transfers and typecasting in flux." (#9896) · 8d6dc2be
  Sayak Paul authored Nov 08, 2024
```
Revert "[Flux] reduce explicit device transfers and typecasting in flux. (#9817)"

This reverts commit 5588725e.
```
  8d6dc2be
- Enabling gradient checkpointing in eval() mode (#9878) · 5b972fbd
  Michael Tkachuk authored Nov 08, 2024
```
* refactored
```
  5b972fbd
- Improve downloads of sharded variants (#9869) · 1b392544
  Dhruv Nair authored Nov 08, 2024
```
* update

* update

* update

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
```
  1b392544
07 Nov, 2024 2 commits

[Flux] reduce explicit device transfers and typecasting in flux. (#9817) · 5588725e
Sayak Paul authored Nov 07, 2024
```
reduce explicit device transfers and typecasting in flux.
```
5588725e

[Core] introduce `controlnet` module (#8768) · ded3db16

Sayak Paul authored Nov 07, 2024



* move vae flax module.

* controlnet module.

* prepare for PR.

* revert a commit

* gracefully deprecate controlnet deps.

* fix

* fix doc path

* fix-copies

* fix path

* style

* style

* conflicts

* fix

* fix-copies

* sparsectrl.

* updates

* fix

* updates

* updates

* updates

* fix

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

ded3db16